Join Curt Frye for an in-depth discussion in this video Calculating correlation, part of Excel 2007: Business Statistics.
- View Offline
In the previous movie, I showed you one way to measure how two variables tend to change together by using covariance. In this movie, I'll show you another very useful way to measure how two variables tend to change together; that is correlation. correlation is related to covariance, but it uses a different formula to generate its value. The idea is that a correlation coefficient can have a value from negative one to one, and this worksheet that I have here displays those three examples. And these are all perfect examples, meaning that they'd probably never occur in real life, but they give the example of what a correlation of 1, -1, and 0 would look like.
Correlation of one is what's called positive correlation, or perfect positive correlation. That means that whenever one value increases, the other value increases at exactly the same rate. So the data here, we have 1, 2, 3, 4, 5, and that is the data I used to plot the horizontal axis. And then the vertical axis also contains the data 1, 2, 3, 4, 5, so it increases at the same rate. Correlation of negative one means that there's a negative correlation, where one value goes up, the other goes down, and again, in exactly the same proportion.
So we have 1, 2, 3, 4, 5, and Column 2: 5, 4, 3, 2, 1. And you can see how that correlation is represented in the body of the chart. A correlation of zero means that there is absolutely no relationship between the two data columns whatsoever. So, for example, if you look at the first pair, you see that I have 1 and 1, and that's started here. The next row is 2 and 1, so that is here, 3 and 1, and so on. And I have every possible pair of values.
I have a 1 and a 1, a 1 and a 2, and 1 and a 3, and I have similar pairs for 2 and 3 as my starting values. So that means, as you can see in the plot, that you cannot use the value on the horizontal axis to predict anything about the data, or the value, that will occur on the vertical axis. It's completely spread out, so that means that this example has a correlation of zero. Now let's take a look at a business case to show you how you can use correlation in your own work. To do that, we'll switch to the Correlation worksheet.
The data in this Excel table shows how the number of customers varies by day, and the days measure is the number of days after your company has sent out a mailer. So, in other words, you want to see how effective the mailer was, but you also are interested in how much that mailer's effect trailed off over time, and that is what this table measures. If you want to create a correlation formula for this data, you can do so by clicking equal--and I am in cell E1--so it's equal and then C-O-R-R-E-L, that's the correlation function. Left parenthesis. And then you just identify the data you want to use.
So the first column is days, and I selected that column by moving my mouse pointer over the column header. And when the mouse pointer changed to a downward- pointing black arrow, I clicked. When I did, Excel selected that table's column. Type a comma, and I'll select the Customers column. So I have Days and Customers, those are my two arrays. Right parenthesis. Press Return and I get my correlation of -0.9, and a small decimal value after that.
As a general rule, correlation coefficients with absolute values-- that is, the distance from zero--a three or less are considered to be weak, 0.3 to 0.7 are considered moderate, and 0.7 or higher are considered strong. So in this case, a correlation coefficient of -0.9 means that there's a very strong negative correlation between the number of days after you mailed a brochure to your customers and the number of customers who come in because of that brochure. Determining whether two sets of values are correlated is in one sense of mathematical question, but you can't take a correlation coefficient of 0.95 to mean that some invisible factor makes two unrelated data sets change at the same rate.
You must use your common sense and business sense to determine whether the correlation result is true, and--most importantly--discover why or why not.
- Understanding statistical terms
- Creating a basic Excel table
- Auditing formulas
- Creating frequency distributions for qualitative data
- Calculating a running total
- Creating a histogram
- Using PivotTables
- Calculating mean, median, mode, and other numerical data
- Using probability distributions
- Population sampling
- Testing hypotheses
- Developing liner and multiple regression models