Join Barton Poulson for an in-depth discussion in this video Hypothesis, part of Data Science Foundations: Fundamentals.
- [Voiceover] Hypothesis testing is one of the most common approaches to inferential statistics that you'll find. The idea here is that you want to directly test your theory. There are a few steps that go into this. First off, you calculate the probability of X, whatever result you have: what's the probability of that occurring by chance if randomness is the only explanation? And then, if that probability is low, you reject randomness as a likely explanation for your observed result.
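Those two steps can be sketched in code. Here's a minimal Python illustration, assuming a standard normal (z) null distribution and the conventional 5% cutoff; the function names are my own, not from the video:

```python
import math

def two_sided_p_from_z(z):
    """Step 1: the probability of a result at least this extreme
    occurring by chance, if the standard normal null is true."""
    return math.erfc(abs(z) / math.sqrt(2))

def reject_randomness(z, alpha=0.05):
    """Step 2: if that probability is below alpha, reject randomness
    as a likely explanation for the observed result."""
    return two_sided_p_from_z(z) < alpha

reject_randomness(2.5)  # a fairly extreme result
reject_randomness(1.0)  # well within the range of chance
```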
That's the basic principle of hypothesis testing. You can think of this as being especially useful in a few situations. Hypothesis testing's very common in scientific research, where you're testing a particular theory to see if it's valid. It's also common in diagnostics, where you're trying to figure out how likely a particular outcome is based on the results of a test. And it's basically the general principle behind go/no-go decisions, when you're trying to see whether you pass a particular cut-off or criterion.
When you're doing null hypothesis testing, which is its full name, you have a couple of hypotheses or theories that you're dealing with. The first is the null hypothesis. It's often written as H sub zero. And what it says is that there is no systematic effect, no consistent difference between group means, no association between variables, and that random sampling error is the only explanation for the observed effects in the sample. Contrast that to the alternative hypothesis, which can be written as H sub A, for alternative, or H sub one.
Put simply, it says there is a systematic effect: there is a consistent difference between the group means, there is an association between the variables. But what's important is that this is called null hypothesis testing, so you look directly at the null, which says there's no systematic effect. In fact, you can look at it graphically with a null distribution; the one shown here is for the z test. It sets up the possible range of outcomes if the null is true and only random sampling error can explain the difference between the group means.
And you see that most of the scores are close to the middle, and it tapers out on either side. What you do is you set a region, or regions, of rejection. I've shaded those here in red. And those are the extreme 2.5% on top and 2.5% on the bottom, and the idea here is, those set up a criterion, and if your sample value falls into either of those regions of rejection, then there's a low probability of that occurring by chance from the null distribution.
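For the two-tailed z test shown here, that criterion can be written down directly: the rejection regions begin at roughly ±1.96, the z values that cut off the extreme 2.5% in each tail. A small sketch, with a helper name of my own:

```python
# Critical value for a two-tailed test at alpha = .05: the z score
# that cuts off 2.5% in each tail of the standard normal distribution.
Z_CRIT = 1.96

def in_rejection_region(z, z_crit=Z_CRIT):
    """True if the sample z falls in either shaded tail region."""
    return abs(z) >= z_crit

in_rejection_region(2.3)  # extreme enough to reject the null
in_rejection_region(0.8)  # consistent with random sampling error
```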
What you then have to do is make a decision. The problem, of course, is that when you decide whether there's an effect or not, you could make a mistake. One possibility is a false positive. What that means is that the sample data shows some kind of statistical effect, but it's actually due to randomness. In the scatter plot on the right, you can see that there's a strong negative association between the two variables. As scores go from left to right on x, they go down.
You can see from the regression line that it's a pretty strong pattern. On the other hand, I generated this data at random, using code that actually created uncorrelated variables. The code specified a correlation of zero, and I had to run it four or five times in order to get this strong result at random. So this is a false positive: the sample has a correlation, but the population it came from doesn't. Importantly, a false positive can only occur if you actually reject the null.
That should make sense. It's called a type one error, and the idea here is that in the entire null distribution, yes there will sometimes be extreme values. You can pick the probability that you're comfortable with. The most common is 5%, and that is, there's a 5% chance of a false positive if the null hypothesis is true. On the other hand, you can also have a false negative. This is when the data looks random, but in fact there is a systematic difference between groups or an association.
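You can check that 5% figure by simulation. The sketch below is my own code, not the instructor's: it draws test statistics from the null distribution itself, so every rejection is by definition a false positive.

```python
import random

random.seed(42)  # reproducible draws

trials = 100_000
# Under the null, the test statistic is standard normal; count how
# often it lands in the rejection regions beyond +/-1.96 anyway.
false_positives = sum(abs(random.gauss(0, 1)) >= 1.96
                      for _ in range(trials))
rate = false_positives / trials  # should come out close to 0.05
```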
That's the case here. This scatter plot looks totally flat, or very very close to flat. But in fact the code that generated the data specified a positive correlation of 0.25, which is reasonably strong. And again, I had to run it four or five times in order to get a flat association. But it lets you know that random variation can lead to different impressions than the overall population would give you. Of course, a false negative can only occur when you don't reject the null, when you have a negative result.
It's called a type two error, and unlike the false positive, where you can pick a value that you like, this one is actually calculated based on several factors, including the significance cut-off, the sample size, and the true effect size. Now there are, however, a lot of critiques of hypothesis testing in general. The first is an important one: it's really easy to misinterpret the probability that it gives you as a result. People also take exception to the assumption of a null, or sometimes a "nil," effect, one that is exactly zero. And there can be bias in interpretation that comes from the use of a cut-off for your criterion.
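Those factors can be explored by simulation. The sketch below follows the shape of the instructor's example, with a population correlation of 0.25; the sample size of 50 and the tabled critical value are my own assumptions, not from the video:

```python
import math
import random

random.seed(1)  # reproducible draws

def sample_r(n, rho):
    """Draw n (x, y) pairs with population correlation rho,
    then return the sample correlation."""
    xs = [random.gauss(0, 1) for _ in range(n)]
    ys = [rho * x + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
          for x in xs]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

N, RHO = 50, 0.25
R_CRIT = 0.2787  # tabled critical r, alpha = .05 two-tailed, df = 48

trials = 2_000
# A "miss" is a sample whose correlation looks non-significant even
# though the population correlation really is 0.25: a false negative.
misses = sum(abs(sample_r(N, RHO)) < R_CRIT for _ in range(trials))
beta = misses / trials  # type two error rate; power is 1 - beta
```

With these assumptions, the miss rate comes out above one half: a sample of 50 will fail to detect a true correlation of 0.25 more often than not, which is why the instructor could get a flat-looking plot in just a few tries.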
But perhaps most importantly, it can be argued that it answers the wrong question. It gives you the probability of the data given your hypothesis, when in fact what you really want is the Bayesian alternative: what's the probability of the hypothesis given your data? There are ways around that, but they're not normally included in discussions of hypothesis testing. So our conclusions are these. Hypothesis testing is very common for yes/no, go/no-go decisions. Also, it's very useful despite some of the critiques.
A huge amount of very important research has been conducted using hypothesis testing. And then, Bayesian methods, which can flip around the probabilities, and estimation, or confidence intervals, can be very helpful alternatives or additions to standard hypothesis testing.
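To make that Bayesian flip concrete, here's a small, purely hypothetical diagnostic calculation, in the spirit of the diagnostics example mentioned earlier; none of these numbers come from the video:

```python
# Hypothetical screening test: 5% false-positive rate, 80% sensitivity,
# for a condition with a 1-in-100 base rate. Illustrative numbers only.
p_h = 0.01                # prior: P(hypothesis true)
p_pos_given_h = 0.80      # P(positive result | hypothesis true)
p_pos_given_not_h = 0.05  # P(positive result | hypothesis false)

# Total probability of a positive result, over both possibilities.
p_pos = p_pos_given_h * p_h + p_pos_given_not_h * (1 - p_h)

# Bayes' theorem: the probability of the hypothesis given the data.
p_h_given_pos = p_pos_given_h * p_h / p_pos  # about 0.14
```

Even with a "significant" positive result, the hypothesis is still improbable here, which is exactly why the two conditional probabilities shouldn't be confused.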