In this video, the instructor interprets the results of a regression forecast.
- [Instructor] Jack and Diane are back! Jack and Diane have been following along with us, learning about financial forecasting as we have. Now they're ready to put what they've learned into practice. Jack wants to do a sales forecast to figure out where his company is going in comparison to their competitors. While Diane wants to go through an optimized pricing, to determine its impact on sales. Now, let's take a look at some data that can help both of them.
I'm in the zero four zero one folder, looking at the begin financial data file. And as you can see we've got a very different dataset, than the employee theft data we worked with before. This dataset has almost 400,000 rows of data. And roughly a dozen different variables. This is essentially all of the data that Excel can handle. If you have any more data than this you'll probably need to move to a professional software package, something like SAS or Stata or R.
I've certainly done projects using those kinds of software. And your company perhaps has enough data that it might make sense to invest in that type of software. But for right now I want to make sure that all of us can understand and apply what we're doing, in the most basic software possible, Excel. So we've got a variety of different variables which I've pulled together and cleaned up for us. Our company ID is an anonymized indicator for a particular firm. So we can see for example in row two, we have company 105920.
And this is for the year 2000, and it's quarter one. Row three then is for exactly the same company, but it's in quarter two of the year 2000. So again if you recall this is time series data. But we also have data on other companies out there. For example, here we are with company 178937. So we have both cross-sectional data and time series data.
What we call panel data. I've gathered a variety of different values based on that, assets, liabilities, net income, capital expenditures, R and D, Tobin's q, which is a measure for valuation of the firm, NPS or the net promoter score, which is a marketing metric, our standard deviation of Tobin's q, HHI, or the Herfindahl Index which tells us how concentrated it is, AltmanZ which is a measure for financial stability, and finally sales, or revenue.
What we're interested in doing right now is helping Jack to run a regression that'll allow him to analyze what his company's sales should be given all of these values. So how will we do this, do you recall? Well to start with, let's go to the data tab, and click data analysis. From here we're going to scroll down, until we get to regression. Now, we need to put in our Y variable.
What was the Y variable? What was the interpretation on that, do you recall? That's right, the Y variable is the variable we're trying to predict. So in this case that's going to be sales. Now how about our X's? So for my X values I'm going to use assets through AltmanZ. For my Y I'm going to use sales. Remember we want to put in labels, so that we can accurately interpret the output.
In this case I'm not going to use the constant as zero, I'm going to allow the constant to vary. And I'm going to hit OK. Now, this'll take a couple of minutes to run, because again we've got so much data. Bear with me for a moment. Alright, now we've got our output. I'm going to go through and resize my rows and columns, so that I can read them effectively. And now I'm going to delete these last two columns, because again they are superfluous data.
And I'm going to clean up my values slightly, to make them easier to read. Now, what we see is that this is a reasonable regression. We're able to explain 45% of the variation in our data, based on these set of variables that we've constructed. That's not a bad regression. Especially in a corporate finance setting.
We're using roughly 400,000 different observations across a variety of firms. All in all this looks like reasonable output. When we look at this we can also determine which factors are the most important in driving sales. And as we can see from our p values, there are a number of factors that are significant in driving sales. In particular the higher our assets are, the higher our sales are. And that's not surprising, right? Bigger companies tend to have more assets, so they should have more sales.
Nonetheless this regression now gives us all of the tools that we need to run the same kind of hedonic model we looked at before to predict sales. And we can also determine which factors have the greatest influence on our expected sales. The next step would be to go through and build a hedonic model based on these coefficients and the output we see here.
Join Professor Michael McDonald and discover how to use predictive analytics to forecast key performance indicators of interest, such as quarterly sales, projected cash flow, or even optimized product pricing. All you need is Microsoft Excel. Michael uses the built-in formulas, functions, and calculations to perform regression analysis, calculate confidence intervals, and stress test your results. You'll walk away from the course able to immediately begin creating forecasts for your own business needs.
- Understanding big data and predictive analytics
- Gathering financial data
- Cleaning up your data
- Calculating key financial metrics
- Using regression analysis for business-specific forecasts
- Performing scenario analysis
- Calculating confidence intervals
- Stress testing