In this video, the instructor demonstrates the meaning of confidence intervals around a forecast.
- [Instructor] Now we've run our regression and we have coefficients that we can use to make a prediction, but what we don't know yet is how confident we can be in those coefficients. We don't have a way to stress test those coefficients to figure out whether or not the predictions that we're going to make based on them are going to be particularly accurate or not. That's what we're going to talk about in just a moment. I'm in the 0402 folder using the begin financial data file.
If you recall, we've got a variety of different data points: almost 400,000 rows of data and roughly a dozen different variables covering various corporate financial characteristics, along with sales of the firm, and we were trying to predict the sales of a company based on those variables. To that end, in sheet one we ran this regression, and we saw that the regression was reasonable overall. We had a 45% R squared and our coefficients looked highly statistically significant based on the P values.
But what we don't know yet is whether or not those coefficients are particularly precise or accurate. That's where the lower and upper 95% columns come into play. This is what we call the 95% confidence interval. This represents, in essence, the range within which we can be 95% confident the true coefficient on a particular variable lies. So for example, we can be 95% confident that the coefficient on assets falls between 0.295 and 0.299.
What this tells us is that an incremental one dollar in assets leads to between a 29 and a half cent and a 29.9 cent increase in sales. Our coefficient is in the middle of these two values. On average, one additional dollar in assets should increase sales by 29.7 cents. That's reasonably precise, right? When we look at this, .2 cents on either side of that coefficient, that's a fairly precise estimate.
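As a quick check on that arithmetic, here is a minimal Python sketch (Python is an assumption here; the course itself works in Excel) that recovers the midpoint and the margin of the assets interval from the two bounds quoted above. It assumes the interval is symmetric around the coefficient, as it is for a standard regression confidence interval.

```python
# Lower and upper 95% bounds for the assets coefficient, as quoted in the transcript.
assets_lower = 0.295
assets_upper = 0.299

# For a symmetric interval, the point estimate sits at the midpoint
# and the margin is half the interval's width.
assets_estimate = (assets_lower + assets_upper) / 2
assets_margin = (assets_upper - assets_lower) / 2

# Express both in cents of sales per additional dollar of assets.
print(f"Estimate: {assets_estimate * 100:.1f} cents per dollar of assets")
print(f"Margin:   +/- {assets_margin * 100:.1f} cents")
```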
We can be pretty comfortable with that as a 95% confidence interval. In contrast, let's look down here at R&D, research and development. What we see in this case is that a one dollar increase in R&D leads to a 2.914, or about a $2.91, increase in sales. Well, that sounds pretty good, but what's our confidence interval around that? Well, it's much wider. In point of fact, at the lower bound, a one dollar increase in R&D actually leads to a $28.33 fall in sales.
At the upper bound, a one dollar increase in R&D leads to a $34.16 increase in sales. So what this is showing us is that research and development has a much less precise point estimate on it. It's much harder to get a sense for how valuable R&D is than, say, assets. That's simply because R&D has a lot more variation in it in our data. In some cases, R&D expenses result in big sales down the road.
In other cases, R&D expenditures don't pay off at all, and in a few cases, R&D expense probably leads to a new product, but that new product flops and causes sales to actually fall. So it's always important that we go through and look at the 95% confidence interval for any variable that we're interested in. In fact, we can go through and we can build our hedonic pricing model based on not only our coefficients but also the upper and lower 95% confidence intervals.
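To make the contrast between assets and R&D concrete, here is a small Python sketch (an assumption; the instructor makes this comparison by eye in Excel) that takes the bounds quoted in the transcript and reports each interval's width and whether it spans zero. The function name `describe_interval` is hypothetical.

```python
def describe_interval(lower, upper):
    """Summarize a confidence interval given its lower and upper bounds."""
    return {
        "midpoint": (lower + upper) / 2,
        "width": upper - lower,
        # If zero lies inside the interval, the data cannot rule out
        # that the true effect is zero or has the opposite sign.
        "spans_zero": lower <= 0 <= upper,
    }

# 95% bounds as quoted in the transcript.
intervals = {
    "assets": (0.295, 0.299),
    "R&D": (-28.33, 34.16),
}

for name, (lower, upper) in intervals.items():
    summary = describe_interval(lower, upper)
    print(
        f"{name}: midpoint={summary['midpoint']:.3f}, "
        f"width={summary['width']:.3f}, spans_zero={summary['spans_zero']}"
    )
```

The assets interval is narrow and sits entirely above zero, while the R&D interval is wide and includes zero, which is why the R&D point estimate deserves much less trust.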
So let's do that, shall we? Again, we're going to look to predict sales based on all of our other variables. Now I'm adding a column for coefficients, for input, which is our assumption, and for the variable in question, and I'm going to take all of our variables, simply paste them over, and I'm going to do the same thing with our coefficients.
Sales is what we're trying to forecast, of course. So I'll simply put in a Y. Next we need to determine what our inputs are. Well, in the case of the intercept, it's just going to be a one. In the case of assets, we need some kind of a value that we could put in, and that value is probably going to be based on the data that we have collected for our firm or where we hope our firm will get to. If you recall, these data were in millions of dollars. So for example, a value of 2,700 was the equivalent of 2.7 billion dollars in assets.
So I'm simply going to put in 1,000 for assets. Let's say that liabilities is 500. Perhaps 500 million. Our net income might be 100. I'm adding these data ad hoc, but just as Jack and Diane are going to use the values for their particular firms, you'd want to go through and use the value that's associated with your company and with your prediction. If your company expects to spend 300 million on CapEx next year, you'd want to put in 300 as your CapEx value.
I'm going to use 50 for R&D. Let's say that I'm trying to predict sales in the second quarter. My Tobin's Q value I'm going to use based on, again, the data that's out there. So let's say we have a Tobin's Q value of five. My net promoter score, again, I'm going to refer to the data that's available. That net promoter score tends to be a small decimal number, as you can see here. So I'll use 0.001. My standard deviation of Tobin's Q, again, I'm going to use a value based on other variables.
HHI, that's the concentration in the industry. I'm using 0.05, and my year, let's say, is 2016, and for my Altman Z-score I'm going to use 0.70. You probably don't know the values on all of these for your company off the bat, but you can go through and look at what the company is saying its R&D expenses will be or what the CapEx expenses will be.
Once you've done that, determining sales is as easy as multiplying each coefficient by its corresponding input and then summing up the results. And we see here that our expected sales for the firm are 751.53, or put differently, about 751.5 million dollars. That's given all of these inputs. The next question we might ask is, how can we use these upper and lower 95% confidence intervals to stress test this outcome and this sales prediction? We'll take a look at that in the future.
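The multiply-and-sum step can be sketched in Python as follows (in Excel, the same calculation is what `SUMPRODUCT` does). Only the assets and R&D coefficients and the input values come from the transcript; the intercept and liabilities coefficients are made-up placeholders, and the remaining variables are left out, so the result will not match the 751.53 shown in the video.

```python
def predict(coefficients, inputs):
    """Return the sum of coefficient * input over every variable."""
    return sum(coefficients[name] * inputs[name] for name in coefficients)

coefficients = {
    "intercept": 10.0,     # hypothetical placeholder, not from the video
    "assets": 0.297,       # from the transcript
    "liabilities": -0.10,  # hypothetical placeholder, not from the video
    "r_and_d": 2.914,      # from the transcript
}

inputs = {
    "intercept": 1,      # the intercept always gets an input of 1
    "assets": 1000,      # millions of dollars
    "liabilities": 500,  # millions of dollars
    "r_and_d": 50,       # millions of dollars
}

predicted_sales = predict(coefficients, inputs)
print(f"Predicted sales: {predicted_sales:.2f} million dollars")
```

Replacing a coefficient with its lower or upper 95% bound and calling `predict` again is one way to see how far the forecast moves under each scenario.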
Join Professor Michael McDonald and discover how to use predictive analytics to forecast key performance indicators of interest, such as quarterly sales, projected cash flow, or even optimized product pricing. All you need is Microsoft Excel. Michael uses the built-in formulas, functions, and calculations to perform regression analysis, calculate confidence intervals, and stress test your results. You'll walk away from the course able to immediately begin creating forecasts for your own business needs.
- Understanding big data and predictive analytics
- Gathering financial data
- Cleaning up your data
- Calculating key financial metrics
- Using regression analysis for business-specific forecasts
- Performing scenario analysis
- Calculating confidence intervals
- Stress testing