Anytime you run a science experiment, you need to think about which hypotheses to test and what type of confidence intervals to use. Learn how to set hypotheses, choose proper intervals, and interpret p-values.
- [Tutor] When you run your experiments, you'll have to formulate hypotheses and test them. There's a lot involved in doing that, but let's cover the basics. I want to go back to our example of women's heights. You've learned that the historical average height is 64 inches, so let's base our hypotheses on this historical parameter. To test whether the average height has changed since the last measurement, we first need to construct a range around that 64. We construct this range so that if the height has not changed, that is, if the mean is still 64, it would be very likely that the mean of our sample would fall inside this range. On the other hand, if the mean of our sample falls outside of our range, we can feel confident in saying that the average height is probably different now than it was in the past. If the sample mean falls within the range, then we don't have sufficient evidence to support the claim that the height has changed.
There are generally two types of hypotheses, the null and the alternative. The null is denoted as H naught, or some people say H sub zero. The alternative is denoted as H sub a; I just say H a. The null hypothesis is a basic statement about our topic, and this statement generally assumes that nothing has changed. For us, that would mean that the average height for women is equal to 64 inches. That's exactly what our null would be. We always start by assuming that the null hypothesis is true. The alternative hypothesis is always the opposite of the null, so in this case, it would be that the mean height for women is not 64 inches. If the null hypothesis isn't directional, the alternative hypothesis shouldn't be directional either. If the null is directional, the alternative is directional too. For example, if the historical data showed that women have always been shorter than or equal to 64 inches, then that would be the null, and the alternative in that case would be that the average height is greater than 64 inches.
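Written out in symbols, the two pairs of hypotheses from the height example might look like the following. This is only a notational sketch: the symbol \mu for the population mean height is an assumption of this note (the lesson speaks the hypotheses aloud rather than writing them), and the `align*` environment assumes the amsmath package.

```latex
\begin{align*}
\text{Non-directional:}\quad & H_0: \mu = 64,   && H_a: \mu \neq 64 \\
\text{Directional:}\quad     & H_0: \mu \le 64, && H_a: \mu > 64
\end{align*}
```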
A hypothesis test only has two outcomes. We either reject the null hypothesis, or we fail to reject it because we don't have sufficient evidence. We can never, ever accept the null hypothesis. That is too risky of a move. It's just that we don't have sufficient evidence to reject it, so we can't. You'll have to be very clear in your language if you plan on writing up results. When you test your hypothesis, you'll have to build a confidence interval around your mean, in our case, 64 inches. Why build this interval? Well, so we know whether we can reject our null or whether we have failed to reject it. For example, imagine that our experimental mean is 65. That's pretty doggone close to 64, or is it? We have to depend on the confidence interval to tell us that. This interval should be drawn so that if the population mean actually still is 64, then 95% of all samples drawn from the population will have averages that fall within that range.
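As a rough illustration of how such a range could be computed, here is a minimal sketch that assumes the sample mean is approximately normally distributed and the population standard deviation is known. The standard deviation of 3 inches and the sample size of 9 are made-up values for illustration, not figures from the lesson, and `acceptance_range` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def acceptance_range(null_mean, sigma, n, confidence=0.95):
    """Range that should contain the sample mean with probability
    `confidence` if the population mean really equals `null_mean`."""
    # Critical z-value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)  # z times the standard error of the mean
    return null_mean - margin, null_mean + margin


# Assumed for illustration only: sigma = 3 inches, n = 9 women sampled.
low, high = acceptance_range(null_mean=64, sigma=3.0, n=9)

for sample_mean in (65, 69):
    inside = low <= sample_mean <= high
    decision = "fail to reject H0" if inside else "reject H0"
    print(f"sample mean {sample_mean}: range ({low:.2f}, {high:.2f}) -> {decision}")
```

Under these assumed numbers, 65 falls inside the range and 69 falls outside it. With a different spread or sample size, the answer for 65 could change, which is why the range, not intuition, has to make the call.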
The width of the confidence interval depends on three things: the confidence level, the sample size, and the standard deviation. The confidence level for us is 95%. If we were to increase it to 98%, then the confidence interval would get wider. That's because we have to make sure that we can be more confident in our estimate. If we increase the sample size, the confidence interval gets narrower. A greater sample size is more representative of our population, so we know we're getting a more precise measurement, hence the narrower confidence interval. As the standard deviation increases, the confidence interval gets wider. The greater the standard deviation, the more imprecision and uncertainty we have, so the confidence interval has to account for that.
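The three effects described above can be checked numerically. The sketch below assumes a z-based interval with a known standard deviation; the baseline values (standard deviation of 3 inches, sample size of 9) are assumptions chosen for illustration, and `interval_width` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(sigma, n, confidence):
    """Full width of a z-based interval: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)


# Baseline (assumed values): sigma = 3 inches, n = 9, 95% confidence.
base = interval_width(sigma=3.0, n=9, confidence=0.95)

# Change one input at a time and compare against the baseline.
higher_confidence = interval_width(sigma=3.0, n=9, confidence=0.98)
larger_sample = interval_width(sigma=3.0, n=36, confidence=0.95)
larger_spread = interval_width(sigma=6.0, n=9, confidence=0.95)

print(f"baseline width:           {base:.2f}")
print(f"98% confidence (wider):   {higher_confidence:.2f}")
print(f"n = 36 (narrower):        {larger_sample:.2f}")
print(f"sigma = 6 (wider):        {larger_spread:.2f}")
```

In this formula the width scales with the standard deviation and with one over the square root of the sample size, so quadrupling the sample halves the width while doubling the standard deviation doubles it.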
So let's imagine that we get a sample mean height of 69 inches. That falls outside of our confidence interval, so we have to reject the null hypothesis. Now, we know we've rejected it, but we may want to know how strong our evidence is. Right now, we're 95% confident that our population parameter for the average height of women has changed. In reality, it could be more than 95%. Imagine if our sample mean had fallen even further outside the range. What if it had been 72? Intuition tells us that we would have even stronger evidence to reject the null hypothesis. To capture this logic and capitalize on our intuition, it's useful to determine how likely it would be for a sample we choose to have a particular mean if the null hypothesis is true. In this case, we would ask: if the null hypothesis were true, how likely would it be for us to choose a sample with a mean height that is at least as far from the historical mean of 64 as 72 is? This measure of likelihood is called the P-value. Suppose our sample mean falls in the rejection region, which it does, since that's outside the 95% range. Then we know that if the null hypothesis is true, the likelihood of obtaining such a sample must be less than 5%, or equivalently, that the sample mean's P-value must be less than .05. So when we take a sample, we reject the null hypothesis if the sample's P-value is less than 5%. The threshold at which we reject the null hypothesis is called the significance level, and it's equal to one minus our confidence level. In this case, with a 95% confidence level, the significance level is 5% or .05, which is why I said earlier that the P-value is less than .05.
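To make the P-value concrete, here is a minimal sketch of a two-sided calculation for the sample means mentioned above. It assumes a z-test with a known population standard deviation; the standard deviation of 3 inches and the sample size of 9 are illustrative assumptions, not values from the lesson, and `two_sided_p_value` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(sample_mean, null_mean, sigma, n):
    """Probability, if the null is true, of a sample mean at least as far
    from `null_mean` (in either direction) as the one observed."""
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # Area in both tails beyond |z| of the standard normal distribution.
    return 2 * NormalDist().cdf(-abs(z))


alpha = 0.05  # significance level = 1 - confidence level (95%)

# Assumed for illustration only: sigma = 3 inches, n = 9.
p_65 = two_sided_p_value(65, null_mean=64, sigma=3.0, n=9)
p_69 = two_sided_p_value(69, null_mean=64, sigma=3.0, n=9)
p_72 = two_sided_p_value(72, null_mean=64, sigma=3.0, n=9)

for mean, p in ((65, p_65), (69, p_69), (72, p_72)):
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"sample mean {mean}: p = {p:.3g} -> {decision}")
```

Under these assumptions, both 69 and 72 lead to rejection, but 72 yields a far smaller P-value, which matches the intuition that a sample mean further from 64 is stronger evidence against the null.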
The P-value does more than simply answer the question of whether or not we can reject the null hypothesis. It also indicates the strength of evidence against the null hypothesis. For example, if the P-value is .049, we barely have enough evidence to reject the null at the 95% confidence level. However, if the P-value is .001, then we have really strong evidence against the null hypothesis, and we can say with more confidence that yes, in fact, our average height has changed from the historical mean.