Join Barton Poulson for an in-depth discussion in this video, Systems of equations, part of Data Science Foundations: Fundamentals.
- [Voiceover] The ability to work with systems of linear equations is an important part of data science. The idea here, specifically, is: how do you work with many unknowns? The problem arises when there's mutual dependence, so for instance, x depends on y but y depends on x. That sounds like it can be super difficult, but what's funny is that systems of linear equations can actually be solved by hand. You can also use linear or matrix algebra, and I'll demonstrate both of these. Let's use a quick example of somebody who's making and selling iPhone cases.
Let's say you sold a thousand cases: some sold for 20 dollars, some sold for five, and you had a total revenue of 5900 dollars. How many cases were sold at each price? Well, we're gonna write these out as equations. For sales, x, the number of cases sold at one price point, plus y, the number sold at the other price point, equals 1000 cases total. The revenue of 5900 dollars is equal to x times 20 dollars plus y times five dollars.
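Written out, the two equations described above look like this. The slides are not reproduced in the transcript, so the layout is an assumption; x is the number of cases sold at 20 dollars and y is the number sold at five dollars, as in the narration.

```latex
\begin{aligned}
x + y &= 1000 && \text{(sales)} \\
20x + 5y &= 5900 && \text{(revenue)}
\end{aligned}
```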
So we have two equations here and we have to find a way to combine them to get a single unique answer. We'll start by looking at sales. Sales, x+y, that's the sales at one price plus the sales at the other price, is equal to 1000, and then there's revenue. Let's focus just on sales right now. What we're going to do is solve this one for x, so we're gonna move the y to the other side. We simply subtract y from both sides; that cancels the y on the left, and we now have x expressed solely in terms of y.
We can now go to the revenue equation and substitute that in for x. Then we multiply through, combine the -20y+5y, which reduces to -15y, and solve for y. We do this by subtracting 20 thousand from each side, first on one side and then on the other, and then we divide both sides by -15 and we get y is equal to 940.
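The substitution steps just described can be written out line by line. This is a reconstruction from the narration, not a copy of the slides:

```latex
\begin{aligned}
x &= 1000 - y \\
20(1000 - y) + 5y &= 5900 \\
20000 - 20y + 5y &= 5900 \\
20000 - 15y &= 5900 \\
-15y &= -14100 \\
y &= 940
\end{aligned}
```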
Now if we go back to our sales equation, x+y together make 1000; we put in the 940 for y, subtract that from each side, and we get x is equal to 60. What that means is that 60 cases were sold at 20 dollars each and 940 cases, a much larger quantity, were sold at five dollars each. What's neat is you can graph this as well. The problem is that these equations are originally expressed with x and y both on the left side and a constant on the right. We need to solve both equations for y, so in the top one we subtract x from both sides, and that gives us y.
In the other one we divide by five all the way through and then we subtract 4x from both sides, which cancels on the left, and now we have two equations that both express y as a function of x. We can graph those. So for instance, this line right here shows us the number of cases sold. It was originally x+y=1000; that simplifies to y=-x+1000, and it represents all the possible combinations of cases that could be sold.
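With both equations solved for y, the two lines can be compared directly. The following is a minimal sketch in Python (the video itself uses R, and none of this code appears in it); the function names `sales_line`, `revenue_line`, and `intersection` are hypothetical, chosen for illustration. It finds the x at which both lines give the same y by setting the two right-hand sides equal.

```python
from fractions import Fraction


def sales_line(x):
    """y = -x + 1000, rearranged from x + y = 1000."""
    return -x + 1000


def revenue_line(x):
    """y = -4x + 1180, from 20x + 5y = 5900 divided through by 5."""
    return -4 * x + 1180


def intersection(m1, b1, m2, b2):
    """Return (x, y) where y = m1*x + b1 crosses y = m2*x + b2.

    Setting m1*x + b1 = m2*x + b2 gives x = (b2 - b1) / (m1 - m2).
    Integer slopes and intercepts are assumed so Fraction stays exact.
    """
    if m1 == m2:
        raise ValueError("parallel lines: no single intersection point")
    x = Fraction(b2 - b1, m1 - m2)
    return x, m1 * x + b1


# Slopes and intercepts taken from the two rearranged equations.
x, y = intersection(-1, 1000, -4, 1180)
print(x, y)
```

Using `Fraction` rather than floating-point division keeps the arithmetic exact, which makes it easy to confirm that the crossing point satisfies both line equations.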
This blue line, however, represents earnings and is 20x+5y=5900. That also simplifies to an equation for y, y=-4x+1180, and it represents every combination of sales that yields 5900 dollars in earnings. What you have, though, is an intersection point, and that intersection gives us the combined solution for the two of these, at x=60 and y=940. Now we can do this in R as well. What I'm going to do here is come down and enter the data. This is how we had it originally: x+y=1000 and 20x+5y=5900.
I can take those coefficients, the ones on the left side, and enter them into a matrix. I'm gonna call it Q for quantity, and when I do that, there's my matrix. Then I can enter the outcomes, or the totals, in a vector. I'm gonna put that right here, and then I'm going to use R's built-in solve function, and we can get some help on that one with this command. You see it over here: it solves a system of equations. I'm just going to solve Q for R, and when I do, you see I get the values of 60 and 940, which is what I got through the calculations that I did by hand earlier. If you wanna check the answers, you can come down here and run them through and see: yes, that one adds up to 1000, and yes, that one adds up to 5900.
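The R code itself is not shown in the transcript. As a rough, language-neutral illustration of the same matrix approach, here is a minimal Python sketch that solves the 2x2 system with Cramer's rule. This is an assumption about how to illustrate the idea, not the method R's `solve` uses internally; the helper `solve_2x2` is hypothetical, and the names `Q` and `R` simply mirror the narration's "solve Q for R".

```python
from fractions import Fraction


def solve_2x2(a, b):
    """Solve a * [x, y] = b for a 2x2 coefficient matrix via Cramer's rule.

    Integer entries are assumed so that Fraction keeps the result exact.
    """
    (a11, a12), (a21, a22) = a
    b1, b2 = b
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular matrix: no unique solution")
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - b1 * a21, det)
    return x, y


# Coefficients from the left-hand sides: x + y and 20x + 5y.
Q = [[1, 1],
     [20, 5]]
# Totals from the right-hand sides: cases sold and revenue.
R = [1000, 5900]

x, y = solve_2x2(Q, R)

# Check the answers against the original equations, as in the narration.
sales_total = x + y
revenue_total = 20 * x + 5 * y
print(x, y, sales_total, revenue_total)
```

Cramer's rule is only practical for very small systems like this one; for larger systems a library routine based on matrix factorization is the usual choice.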
So what can we conclude from this? Number one, systems of linear equations are a vital part of working with data in data science. They're an important method for balancing several different unknowns to find a unique solution. Also, they rely on linear or matrix algebra, although they can often be done by hand, one of the few opportunities to do that in data science.