Suppose we give an exam to a very large class of students. What might the results look like? These exam scores skew high. Some students got low scores, but not many. This one, on the other hand, skews low, with the scores bunched to the left. Most of the exam scores were very low. This distribution is all over the place. Some high scores, some low ones, but no real pattern. Strangely, many things do follow a pattern. In the case of the heights of people, large-scale standardized test scores, or even health data, we often find that the data follows a pattern.
Data will take on this bell shape for its probability distribution. What are we looking at? Often, the mean of the data is centered at the highest point of the curve. The data around the mean is symmetrical. In other words, 50% of the data is above the mean, 50% is below the mean. The farther we get from the mean, the lower the probability of those outcomes. And notice the curve never touches the axis.
It just keeps on going to infinity in either direction. Also remember that the area under the curve is equal to one, which means that the area under the curve accounts for 100% of all the possible outcomes. This is what we call the classic normal curve. It is vital to understanding probabilities, especially when you're told that the data is normally distributed. Essentially, they're saying that the data is taking on the shape of a normal curve.
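The properties just described (symmetry around the mean, probability falling off with distance, and a total area of one) can be checked numerically. Here is a minimal sketch using Python's standard-library `statistics.NormalDist`. The exam mean of 70 and standard deviation of 10 are made-up numbers for illustration, and `area_under_curve` is a hypothetical helper, not something from the course.

```python
from statistics import NormalDist

# Hypothetical exam scores: mean 70, standard deviation 10 (illustrative values only).
exam = NormalDist(mu=70, sigma=10)

# Symmetry: half of the probability lies below the mean, half above it.
below_mean = exam.cdf(70)
above_mean = 1 - below_mean

# The curve is highest at the mean and lower the farther out we go,
# but it never actually reaches zero.
height_at_mean = exam.pdf(70)
height_one_sd_out = exam.pdf(80)
height_three_sd_out = exam.pdf(100)

# The curve has the same height at equal distances on either side of the mean.
height_left = exam.pdf(60)
height_right = exam.pdf(80)


def area_under_curve(dist, lo, hi, steps=10_000):
    """Approximate the area under dist's density between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(dist.pdf(lo + (i + 0.5) * width) for i in range(steps)) * width


# Eight standard deviations on each side captures essentially all of the area.
total_area = area_under_curve(exam, 70 - 80, 70 + 80)

print(f"below mean: {below_mean:.3f}, above mean: {above_mean:.3f}")
print(f"approximate total area: {total_area:.6f}")
```

The area is only approximated here over a wide but finite range, because the true curve extends forever in both directions.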
Not all normal curves are created equal, though. These are both normal curves, but one of them is taller and narrower. The narrower curve has a smaller standard deviation. As you can see, the flatter curve is more accommodating. It has more room under the curve at distances farther away from the mean. Still, you might be asking, why is this so important? Very few data sets are perfectly symmetrical, so normal distributions must be very rare, right?
Well, believe it or not, the normal distribution is more widespread than most would ever imagine.
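The comparison between the tall, narrow curve and the flatter one can also be sketched in a few lines of Python. The two standard deviations below (1 and 2) are assumed values chosen for illustration, and `tail_beyond` is a hypothetical helper rather than anything shown in the video.

```python
from statistics import NormalDist

# Two normal curves with the same mean but different standard deviations
# (both values are illustrative assumptions).
narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)

# The curve with the smaller standard deviation is taller at the mean.
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)


def tail_beyond(dist, distance):
    """Probability of landing more than `distance` from the mean, on either side."""
    return 2 * (1 - dist.cdf(dist.mean + distance))


# The flatter curve leaves more area far away from the mean.
tail_narrow = tail_beyond(narrow, 3)
tail_wide = tail_beyond(wide, 3)

print(f"peak height: narrow={peak_narrow:.3f}, wide={peak_wide:.3f}")
print(f"area more than 3 units from the mean: narrow={tail_narrow:.4f}, wide={tail_wide:.4f}")
```

Both curves still enclose a total area of one. Doubling the standard deviation halves the peak height and spreads that area farther from the mean.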
Released 9/18/2016
Professor Eddie Davila covers statistics basics, like calculating averages, medians, modes, and standard deviations. He shows how to use probability and distribution curves to inform decisions, and how to detect false positives and misleading data. Each concept is covered in simple language, with detailed examples that show how statistics are used in real-world scenarios from the worlds of business, sports, education, entertainment, and more. These techniques will help you understand your data, prove theories, and save time, money, and other valuable resources, all by understanding the numbers.
- Calculate mean and median for specific data sets.
- Explain how the mode is used to assess a data set.
- Identify situations in which standard deviation can be used to investigate individual data points.
- Use mean and standard deviation to find the Z-score for a data point.
- List the three different categories of probability.
- Analyze data to determine if two events are dependent or independent.
- Predict possible outcomes for a situation using basic permutation calculations.
- Give examples of binomial random variables.
Video: The famous bell-shaped curve: Introduction