In the last movie we looked at a procedure to compare the means of two different groups on a scale variable using what's called the Independent Samples T-Test. On the other hand, if you want to compare the means of more than two groups, you would want to use something called the Analysis of Variance or ANOVA. And although you can use ANOVA with two group comparisons, and there's a simple conversion formula between the ANOVA results and the T-Test, it's more common to reserve it for times when you have three or more groups. What the Analysis of Variance does is look for any kind of difference between the means of the various groups.
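As a rough illustration of that conversion between the two tests: with exactly two groups, the F statistic from the Analysis of Variance equals the square of the equal-variances t statistic. Here is a minimal Python sketch of that relationship. The scores and the function names `pooled_t` and `one_way_f` are invented for illustration; they are not from Searches.sav and they are not how SPSS computes anything internally.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Independent-samples t statistic, assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))

def one_way_f(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for two groups (invented for this sketch)
group_a = [1.2, 0.4, 0.9, 1.5, 0.7]
group_b = [0.1, -0.3, 0.5, 0.0, -0.6]

t = pooled_t(group_a, group_b)
f = one_way_f([group_a, group_b])
print(f"t = {t:.4f}, t squared = {t ** 2:.4f}, F = {f:.4f}")
```

The two printed values, t squared and F, should agree up to rounding, which is why the two procedures reach the same conclusion when there are only two groups.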
That might mean that Group A is different from Group B is different from Group C, or it might mean that A and B together are different from Group C or any of several other possible combinations. For this reason you'll want to do a couple of things when you do an Analysis of Variance. First, you'll want to look at the group means, such as with a bar chart of the means to see if any natural groupings emerge. Second, you'll want to do something called a Post Hoc Test. That's for after the fact. That can tell you where the differences specifically are.
We will look at both of these in this example. For this demonstration I am going to use the Google Searches information in Searches.sav, and to get the Analysis of Variance what we need to do is go up to Analyze, to Compare Means, to what's called the One-Way Analysis of Variance. It's called One-Way because we're going to use a single categorical variable or factor to differentiate between the groups. This is because there are other versions of the Analysis of Variance where you can have more than one categorical variable. We have just one, so this is the One-Way Analysis of Variance.
You can check more than one variable at a time by putting them into the Dependent List. These are the outcome variables where you're looking for differences. In this particular case I'm just going to use one and I'm going to use the relative interest in searching for the NFL in Google, and I am going to look for regional differences on that. So I find the regions of the U.S., that's Census Bureau Regions, and I put that under Factor. In the Analysis of Variance the categorical variable is called a factor and the categories within that variable are called levels.
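To make the idea of a factor and its levels concrete, here is a small Python sketch that groups an outcome variable by the levels of one factor and builds the pieces of a one-way ANOVA from them. The rows are invented for illustration and are not the values in Searches.sav; the variable names are assumptions as well.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (region, score) rows; the scores are invented for this sketch
rows = [
    ("Northeast", -0.5), ("Northeast", -0.2), ("Northeast", -0.4),
    ("Midwest", 0.9), ("Midwest", 0.6), ("Midwest", 0.8),
    ("South", -0.1), ("South", 0.2), ("South", -0.3),
    ("West", -0.4), ("West", -0.1), ("West", -0.5),
]

# The factor is region; each distinct region is one level of that factor
by_level = defaultdict(list)
for region, score in rows:
    by_level[region].append(score)

levels = list(by_level)
n_total = len(rows)
grand_mean = mean(score for _, score in rows)

# Between-group and within-group sums of squares
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in by_level.values())
ss_within = sum((x - mean(v)) ** 2 for v in by_level.values() for x in v)

# Degrees of freedom: levels minus one, and cases minus levels
df_between = len(levels) - 1
df_within = n_total - len(levels)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print("levels:", levels)
print("degrees of freedom:", df_between, df_within)
print(f"F = {f_stat:.2f}")
```

With four levels, the between-groups degrees of freedom come out to three, which matches what you would expect to see in the ANOVA table for a four-region factor.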
So we have four groups within the Census Bureau Region, so we will have four levels in the factor of region. Now we come up and we check a few other things. The first possibility is Contrasts. Now, this is something that we can ignore, because it's for specialized comparisons, like changes over time or mathematical combinations of groups, something called planned contrasts, and we're not doing any of that so we can just ignore this one for right now. I will press Cancel. The second one that we want to look at is called Post Hoc, again for after the fact.
Now, we have a lot of choices here. The most common choices are what are called the Bonferroni and the Scheffe Tests. They're common, but statistically speaking, they're not perfect. They tend to be a little overconservative and their output can be a little complicated in SPSS. For that reason, I prefer to use a test called the Tukey test. It's named after John Tukey, the statistician, and its full name is actually the Tukey Honestly Significant Difference Test or HSD Test, which is what you'll see in the output. So I am going to click on the Tukey Test.
Then I will just come down and hit Continue. Now let's take a quick look at the other Options. I click on the Options and I can get Descriptive Statistics, which are helpful for this kind of analysis. I can also get a Means Plot. It's a simple line plot, but it's still helpful for looking at a graphical representation of the differences between the means. So I am going to click on Means Plot and then I will click Continue. Now we're back in the main dialog and I will click OK. Here we have several tables that show up.
The first one is the Descriptive Statistics. It gives me the mean for each of the four groups in this Factor. It tells me, for instance, that the relative interest in searching for the NFL in the Northeast is below average. It's -.36. That means it's about one-third of a standard deviation below the national average for states in relative interest in searches for the NFL. The Midwest, on the other hand, is much higher. It's three quarters of a standard deviation above the mean, with a mean of 0.75.
The South is slightly below 0 at -.07. And the West is, again, about a third of a standard deviation below 0, at -.33. The next column over is the Standard Deviations and they go from about .8 to 1.1, and they're not hugely different, and they feed into the Standard Error, which is used for the inferential tests. But otherwise we can ignore these. Now, this is the Analysis of Variance table or ANOVA table and what it does is on the top corner it tells me that it's looking at the variable NFL and you see that it's statistically significant. In the last column under Sig it has .020.
That's the probability value for these, and the general guideline is if it's under .05, it's statistically significant. Beneath that are the results for the Tukey Post Hoc Test. Now, this first table of Multiple Comparisons is kind of complicated and we can ignore it. Let's go to the one beneath it. This one is called Homogeneous Subsets and what this does is it places the groups like with like, and this tells us that the Northeast and the West and the South are all relatively similar to each other in terms of their searching for the NFL in Google.
You can see they all have negative means. On the other hand, the second group is kind of interesting. Midwest is much higher, so that makes sense. The South is still with it and the reason for that is even though the means for the South and the Midwest are different from each other, the two groups still overlap once you take the standard deviations into account. So they are not significantly different from each other and this becomes clear if we go down one more and look at the Means Plot. Here you can see that the Midwest is much higher, and the South, while it's down lower, is still above the West and the Northeast. So the Northeast, the South, and the West all form a group, but the Midwest and the South actually combine as well.
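For a sense of what sits behind those pairwise comparisons, here is a rough Python sketch of the Tukey-Kramer q statistic for each pair of groups: the difference in means divided by a standard error built from the within-group mean square. The scores are invented for illustration and are not from Searches.sav, and the sketch stops short of a significance decision, because that requires a critical value from the studentized range distribution, which is not computed here.

```python
import math
from itertools import combinations
from statistics import mean

# Hypothetical scores per region; invented for this sketch
groups = {
    "Northeast": [-0.5, -0.2, -0.4, -0.3],
    "Midwest": [0.9, 0.6, 0.8, 0.7],
    "South": [-0.1, 0.2, -0.3, 0.0],
    "West": [-0.4, -0.1, -0.5, -0.3],
}

# Within-group mean square, the error term from the ANOVA table
n_total = sum(len(v) for v in groups.values())
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
ms_within = ss_within / (n_total - len(groups))

def q_statistic(a, b):
    """Tukey-Kramer q: absolute mean difference over its standard error."""
    se = math.sqrt((ms_within / 2) * (1 / len(a) + 1 / len(b)))
    return abs(mean(a) - mean(b)) / se

# One q value for every pair of levels
q_values = {
    (name_a, name_b): q_statistic(groups[name_a], groups[name_b])
    for name_a, name_b in combinations(groups, 2)
}

# Largest differences first
for pair, q in sorted(q_values.items(), key=lambda item: item[1], reverse=True):
    print(pair, round(q, 2))
```

Pairs with small q values are the ones a post hoc test would tend to leave together in the same homogeneous subset, while pairs with large q values are the candidates for a significant difference.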
But the point here is we are able to do a lot of comparisons and get a lot of information from this one test. The Analysis of Variance is a very flexible and useful procedure for comparing the means of several different groups. In combination with a graphical analysis and Post Hoc Tests, you can get a lot of insight in a little bit of time. In the next movie, however, we'll backtrack just a little to look at a variation on the T-Test, one in which you can look at changes over time for a single group of people or look at differences between two different variables using what's called the Paired T-Test.