When you collect business data, one of the most useful measures is how spread out your data is. In this video, learn how to characterize your data using variance and standard deviations.
- [Instructor] When you gather data over time, you can use statistical measures to get information about that data. In this movie, I will discuss variance and standard deviation, which are two very common measures that you use to see how spread out data is from the overall average of the set. In this movie I'm going to focus on the mechanics of calculating the variance and standard deviation. And then in the next movie, I will apply it to what's called the normal distribution for data, and how knowing the standard deviation especially can help you.
This worksheet contains a set of sales data, and what I want to do is find out how much it varies from the average, and also what its standard deviation is. To calculate the variance, I will click in cell C2, type an equals sign, and then type v-a-r, which is the function that we use to calculate variance. I use v-a-r instead of v-a-r-p for a couple of reasons. V-a-r-p, as you can see from the help text here, finds the variance of an entire population.
In other words, if you're certain that you have data on every possible example of something, such as every living Canadian citizen, then you would have data on the entire population. The problem is that, as soon as someone dies, or someone is born, you no longer have data about the entire population. Using v-a-r provides a slightly more conservative measure, which leads to a slightly higher variance. It's conservative because it gives you a wider range of possible values, given certain probabilities. So, I highly recommended that you use v-a-r instead of v-a-r-p.
With all that said, I will type a left parentheses and then select the values in the cells A2 through A13. Then type a right parentheses and enter, and I get my variance which is over 1.5 million. So that is very large. Especially in relation to sales values that seem to average around 35 hundred to 37 hundred or so. What variance is, is the sum of all the squared errors. So what the program did, was determine how far from the overall average each of the sales value was, either above or below, and then squared the difference, and that's why we have such large values for variance.
The standard deviation is the square root of the variance, but rather than just taking the square root you can calculate it directly. So I'll click in cell D2, type an equals sign, and the function we'll use is s-t-dev. Same story with standard deviation as it was for variance. You want to use s-t-dev instead of s-t-dev-p. So, we use s-t-dev, select cells A2 through A13, right parenthesis, enter, and we get a standard deviation that's much more in line with what we had before of $1,236.62.
And if we take a look at the sales, that seems to make sense. It seems like a reasonable value for a standard deviation. So that's how you calculate variance and standard deviation. In practice, you'll use standard deviation a lot more than variance. In the next movie, I'll describe the normal distribution and how the standard deviation helps you discover information about your data.
- Entering formulas
- Creating, editing, and deleting named ranges
- Using mathematical functions such as SUM and AVERAGE
- Summarizing data
- Analyzing financial data
- Working with dates and times
- Looking up values
- Multiplying arrays