- [Instructor] The values in many datasets fall into a familiar pattern, with many more values close to the overall average and fewer values as you move away from the average. This pattern is called the normal distribution. In this movie, I'll describe the normal distribution and provide an example to make the pattern clear. You've probably seen the normal distribution before; it's also called the normal curve or the Gaussian curve. The idea is that you have an overall average here, where zero is marked on the horizontal axis of the graph, and distances from that average are measured in standard deviations.
A standard deviation is a measure of how spread out your data is. In a perfect normal curve, and no dataset is ever perfect, you'll find about 68% of all of your values within one standard deviation of the mean. For example, if your average, or mean, is 20, and the standard deviation is five, then about 68% of your values will fall between 15, which is minus one standard deviation, and 25, which is plus one standard deviation.
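As a minimal sketch of the example above, the following Python computes the band one standard deviation either side of the mean and counts how many values land inside it. The function name `sd_band` and the sample values are hypothetical, made up purely for illustration; they are not data from the course.

```python
def sd_band(mean, sd, k=1.0):
    """Return the (low, high) range that lies within k standard deviations of the mean."""
    return (mean - k * sd, mean + k * sd)

# Mean of 20 and standard deviation of 5, as in the example above.
low, high = sd_band(20, 5)
print(low, high)  # 15.0 25.0

# Hypothetical sample values, invented for illustration only.
sample = [11, 14, 16, 18, 19, 20, 20, 21, 23, 24, 26, 29]
inside = [x for x in sample if low <= x <= high]
print(f"{len(inside)} of {len(sample)} values fall between {low} and {high}")
```

With real data the share inside the band will rarely be exactly 68%, since no dataset follows the normal curve perfectly.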
Within two standard deviations, you can expect to find about 95% of your values. So, you can see that there's a central clustering within the normal distribution. Most analyses go out to three standard deviations, and if you look at three standard deviations on either side of the mean, you will encompass about 99.7% of your values. And of course, even though nearly all values fall within three standard deviations, on rare occasions there will be values beyond that range.
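The 68%, 95%, and 99.7% figures can be checked with a short sketch that uses `NormalDist` from Python's standard `statistics` module. The helper name `share_within` is an assumption introduced here for illustration.

```python
from statistics import NormalDist

def share_within(k):
    """Share of a perfect normal distribution within k standard deviations of the mean."""
    standard = NormalDist(mu=0.0, sigma=1.0)
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
# within 1 standard deviation(s): 68.3%
# within 2 standard deviation(s): 95.4%
# within 3 standard deviation(s): 99.7%

# Share expected beyond three standard deviations on either side.
print(f"beyond 3 standard deviations: {1 - share_within(3):.2%}")
```

The last line shows why values beyond three standard deviations are rare but not impossible: in a perfect normal curve, roughly 0.27% of values fall out there.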
Now let's take a look at a specific example so that we can get a better idea of what the normal distribution looks like with actual data. In the United States, the average height of men aged 20 to 29 is 5'10", with a standard deviation of three inches. That means about 68% of men are expected to be within one standard deviation of the average in either direction. So, that's the range from 5'7" to 6'1".
That range encompasses 68% of the values, but how many men are at least 5'7"? That's a different question, so we need to take it step by step. The first thing we can realize is that 50% of men are above average height, so everything from the center line to the right encompasses 50% of the men in our survey. Next, we can draw a line at the minus one standard deviation mark, which is 5'7", and realize that half of the range from minus one standard deviation to plus one standard deviation falls to the left of the center line, between 5'7" and 5'10", and that half holds 34% of the men.
That 34% is one half of the 68% that falls within one standard deviation in either direction. And for completeness, with 34% accounted for between 5'7" and the center line, the remaining 16% of men to the left of the center line are shorter than 5'7". Now we can add up the total percentage of individuals who are 5'7" or taller. We add our 34% to the 50%, and get 84%.
So, we can finally answer our question: about 84% of American men aged 20 to 29 will be 5'7" or taller.
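The step-by-step reasoning above can be sketched in Python and compared with the value from an idealized normal curve. Heights are converted to inches (5'10" is 70 inches, 5'7" is 67 inches); the variable names are assumptions chosen for this illustration, and the sketch treats heights as perfectly normal, which real data never is.

```python
from statistics import NormalDist

MEAN_IN = 70.0    # 5'10" expressed in inches
SD_IN = 3.0       # standard deviation of three inches
CUTOFF_IN = 67.0  # 5'7", one standard deviation below the mean

# Step-by-step estimate using the rounded 68% rule:
above_mean = 0.50                # half of all men are above average height
cutoff_to_mean = 0.68 / 2        # half of the 68% band lies between 5'7" and 5'10"
rule_estimate = above_mean + cutoff_to_mean
print(f"rule-of-thumb estimate: {rule_estimate:.0%}")  # 84%

# The same share taken from the normal curve itself.
heights = NormalDist(mu=MEAN_IN, sigma=SD_IN)
curve_share = 1 - heights.cdf(CUTOFF_IN)
print(f"normal-curve share: {curve_share:.1%}")  # 84.1%
```

The two answers agree to within rounding, which is why the 68% shortcut is good enough for quick estimates like this one.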