In this video, get an explanation of predictive analytics, using Amazon as an example. Additionally, receive an explanation of statistical significance and standard deviation.
- Predictive analytics answers the question, what might happen in the future? Amazon has offered a unique spin on that concept with its Recommended For You feature. This personalized recommendation system uses something called a collaborative filtering engine. It looks at what's in your cart, your wish list, and what you've bought recently, and recommends other products based on what other consumers have bought while purchasing those items. The key here is that predictive analytics does not predict the future.
There are too many variables to ever safely know what is going to happen. Rather, predictive analytics attempts to determine which future events are the most likely. Amazon does not know exactly what you'll buy when you put peanut butter into your cart. However, past behavior from other customers suggests it'll probably be jelly. But this is by no means a certainty, which is why Amazon gives you more than one recommended product. They want you to buy at least one, which increases their revenue. So let's add to our list of descriptive statistics.
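The peanut butter and jelly idea can be sketched in a few lines of Python. This is only an illustration of recommending by co-purchase counts, not Amazon's actual engine: the `orders` data, the function names `build_co_purchase_counts` and `recommend`, and the `top_n` parameter are all made up for the example.

```python
from collections import Counter
from itertools import permutations

# Hypothetical past orders; each set is one customer's basket.
orders = [
    {"peanut butter", "jelly", "bread"},
    {"peanut butter", "jelly"},
    {"peanut butter", "bread"},
    {"peanut butter", "jelly", "milk"},
    {"milk", "cereal"},
]

def build_co_purchase_counts(orders):
    """For every item, count how often each other item shared a basket with it."""
    counts = {}
    for basket in orders:
        for item, other in permutations(basket, 2):
            counts.setdefault(item, Counter())[other] += 1
    return counts

def recommend(cart, counts, top_n=3):
    """Rank items not already in the cart by how often they were bought with cart items."""
    scores = Counter()
    for item in cart:
        for other, n in counts.get(item, {}).items():
            if other not in cart:
                scores[other] += n
    return [item for item, _ in scores.most_common(top_n)]

counts = build_co_purchase_counts(orders)
recommendations = recommend({"peanut butter"}, counts)
print(recommendations)  # ['jelly', 'bread', 'milk']
```

Jelly ranks first because it appeared with peanut butter in three of the five baskets, but bread and milk are returned as well. That mirrors the point above: the result is a ranking of likely purchases, not a certainty, so more than one product is suggested.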
Another potentially useful one is the standard deviation, a measure of the spread of the data. The standard deviation tells us how close the overall sample is to the mean. Are the observations clustered near the average, or are many of them far from it? A large standard deviation means that a great deal of the data lies farther away from the mean. A small standard deviation means that the data are close together around the mean. Without getting overly technical, we often assume large datasets exhibit a normal distribution.
You might have heard of the normal distribution as the bell curve. With this assumption, the 68-95-99.7 rule is often used as a way to put standard deviations into perspective. About 68% of the observations are within one standard deviation above or below the mean. About 95% of the observations are within two standard deviations above or below the mean. And about 99.7% of the observations are within three standard deviations above or below the mean. Keenly observing the standard deviation of a dataset can give us insight into the spread of the data and an idea of how we should interpret the statistics.
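To make these numbers concrete, here is a small Python sketch using only the standard library. The two short lists, the simulated dataset (mean 100, standard deviation 15), the fixed random seed, and the helper name `share_within` are all assumptions chosen for illustration.

```python
import random
import statistics

# Two made-up samples with the same mean (50) but very different spread.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]
tight_sd = statistics.pstdev(tight)  # population standard deviation
wide_sd = statistics.pstdev(wide)
print(f"tight: mean={statistics.fmean(tight)}, sd={tight_sd:.2f}")
print(f"wide:  mean={statistics.fmean(wide)}, sd={wide_sd:.2f}")

# Simulate a large, roughly normal dataset to check the 68-95-99.7 rule.
random.seed(0)  # fixed seed so the run is repeatable
data = [random.gauss(100, 15) for _ in range(100_000)]
mean = statistics.fmean(data)
sd = statistics.pstdev(data)

def share_within(data, mean, sd, k):
    """Fraction of observations within k standard deviations of the mean."""
    return sum(abs(x - mean) <= k * sd for x in data) / len(data)

shares = {k: share_within(data, mean, sd, k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

The printed shares should land close to 68%, 95%, and 99.7%, though not exactly on them, because the data are a random sample. For data that are not approximately normal, these percentages can differ substantially.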
- Qualitative vs. quantitative data
- Data analytics success stories
- Making predictions
- Asking the right questions
- Collecting data
- Understanding averages
- Sampling: pros and cons
- Cause and effect