The last inferential test that we'll look at in this course is a variation on the Analysis of Variance, or ANOVA. As we discussed in the sections on associations, the Analysis of Variance is a very flexible and powerful procedure and there are probably dozens of permutations on it. In this movie we're going to talk about the version that is designed for situations where two categorical variables are used jointly to predict scores on a scaled or quantitative outcome variable. Because categorical variables are generally referred to as factors in the Analysis of Variance and the categories that make them up are called levels, this version of the Analysis of Variance is usually called the Factorial ANOVA, or more colloquially, a Two-Factor ANOVA.
An important thing to note is that when you have two separate factors like gender and educational category and you're looking at levels of discretionary spending, an Analysis of Variance will give you three different results. The first result will let you know whether spending differs by gender, ignoring educational level. The second result will let you know whether spending differs by educational level, ignoring gender. These are both known as main effects, where "effect" has to do with the statistical association and "main" because each factor has an effect on its own.
However, an Analysis of Variance also gives you one more important result. It lets you know whether the two factors interact. That is, it lets you know if for example, women with college degrees spend more than women without college degrees, but for men, their spending is the same with and without a degree. By the way, I'm just making that up. I don't really know what the association between those variables is, but I'm sure that some of you actually do. In some domains, the interactions are particularly interesting and can take precedence over the main effects.
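To make those three results concrete, here is a minimal sketch of how the two main effects and the interaction are separated out. It assumes a balanced design (the same number of cases in every cell), and the spending numbers are invented purely for illustration, just like the example above. The function name `two_way_anova` and the labels `A`, `B`, and `AxB` are hypothetical, and this is not how SPSS computes its table internally.

```python
from statistics import fmean

# Invented spending scores, 3 people per cell (balanced design).
# Keys are (gender, education); none of these numbers are real data.
cells = {
    ("women", "degree"): [60, 62, 64],
    ("women", "no_degree"): [40, 42, 44],
    ("men", "degree"): [50, 52, 54],
    ("men", "no_degree"): [50, 52, 54],
}

def two_way_anova(cells):
    """Sums of squares, df, and F ratios for a balanced two-factor design."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # cases per cell (assumed equal)
    grand = fmean([x for xs in cells.values() for x in xs])
    cell_mean = {k: fmean(v) for k, v in cells.items()}
    # With equal cell sizes, marginal means are averages of cell means.
    a_mean = {a: fmean([cell_mean[(a, b)] for b in b_levels]) for a in a_levels}
    b_mean = {b: fmean([cell_mean[(a, b)] for a in a_levels]) for b in b_levels}
    ss = {
        # Main effect of factor A (gender), ignoring factor B.
        "A": n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels),
        # Main effect of factor B (education), ignoring factor A.
        "B": n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels),
        # Interaction: what is left in the cell means after both main effects.
        "AxB": n * sum(
            (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
            for a in a_levels for b in b_levels
        ),
        # Error: spread of individual scores around their own cell mean.
        "error": sum((x - cell_mean[k]) ** 2 for k, xs in cells.items() for x in xs),
    }
    df = {
        "A": len(a_levels) - 1,
        "B": len(b_levels) - 1,
        "AxB": (len(a_levels) - 1) * (len(b_levels) - 1),
        "error": len(a_levels) * len(b_levels) * (n - 1),
    }
    ms_error = ss["error"] / df["error"]
    f = {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AxB")}
    return ss, df, f

ss, df, f = two_way_anova(cells)
for effect in ("A", "B", "AxB"):
    print(effect, "SS =", ss[effect], "df =", df[effect], "F =", f[effect])
```

In this made-up data the gender main effect is zero, because women and men average the same overall, while both the education main effect and the interaction are large. That is the pattern described above: a degree matters for women but not for men.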
However, it all comes down to interpretability and applicability and that will depend on what you are trying to do with your data. With that in mind, let's see how a Two-Factor ANOVA can work in SPSS. To do the Analysis of Variance, we need to go to Analyze and down to General Linear Model. Now that actually is an interesting term, and the idea here is that all of the procedures that we've done, T-Tests and Regression and Multiple Regression, are all variations on what's called the General Linear Model, a way of predicting scores on a single outcome.
Let's do this one over here, Univariate. Now what we need to do is pick our main dependent variable. That's the outcome variable, the thing that we're trying to predict. In this particular example, I thought I might use interest in NBA as a search term, so I'll put that up in the dependent variable. And then I'm going to use two categorical variables as predictors of interest in searching for NBA. The first one that makes a lot of sense to me is whether a state has an NBA team.
So I'll put that here under Fixed Factor(s). When you have categories that are determined, like yes or no, they have an NBA team, then it's a fixed factor. You can also have what are called random factors in the Analysis of Variance, but in many situations, those are unusual and I've never used them. A covariate there is if you want to throw in another quantitative or scaled variable, but putting covariates into an analysis can complicate the results dramatically. The last one is if you want to weight cases, and we're not going to deal with that. I'm just going to go back and find my second predictor category and that's going to be region of the United States.
And I can just click that one and put it in there. Now it's okay that there are four levels in this category. The Analysis of Variance is able to deal with that just fine. Let's take a quick look at some of the options here. Under Model, I can specify whether I want something called a full factorial model or custom. We don't need to worry about that. We can Cancel. Under Contrasts, I can try to decide if there are special ways I want to compare the results, and I don't need to worry about that. Under Plots, I could get Profile Plots, but these can get a little complicated, so I'm going to cancel that.
Post Hoc lets me look at the differences more effectively. I'm not going to do that on this one. If I want to save the predicted values or if I want to save some other statistics for diagnostics, I could do that, but I'm going to skip it for now. And finally under Options, there are some here that I might want to do. I might want to get what are called descriptive statistics and estimates of effect size. I think those two are really helpful. Then I'm going to press Continue. And I've got it set up the way I need, so I'll just click OK.
And so here are my results. The first thing is I get an indication of what are called the Between-Subject Factors. These are the things that separate one group from another. One factor is whether a state has an NBA team and you can see that 23 of them do and 28 of them don't. The second thing is the Census Bureau region. You see that I have nine states in the Northeast, 12 in the Midwest, and so on. Below that, I have the actual descriptive statistics for the search interest in NBA.
Well, it's breaking it down by whether they have an NBA team and by the Census Bureau region. So the states in the Northeast that do not have NBA teams have a mean of minus .42. That means that they are about half a standard deviation below the rest of the country in relative interest in searching for NBA teams. On the other hand, if you go to the Northeast states that do have NBA teams, you see that they have a score of +.39. That means they're about four-tenths of a standard deviation above the national average in relative interest in searching for NBA on Google.
And then you can run through and see the various combinations there. The next table is the actual analysis of variance table, and what it has is several different results here. The first one that says Corrected Model simply tells me how well the model as a whole works and it predicts rather nicely. You can see that it has a Significance level in the first row of .000. And it also has something called a Partial Eta Squared. Again, it's like a correlation that's squared and it's .492. In fact, if you look at the footnote at the bottom of that table, you'll see it says R Squared = .492.
And what it means is that if we know the region of the country that a state is in and whether that state has an NBA basketball team, then we can accurately predict about 50% of the variance in interest in NBA as a Google search term. So that's the entire model. The next step down on that table is Intercept and that just means that the starting score is not 0 and that's not terribly interesting in and of itself. What's funny here is that it actually is close to 0. The next one is whether a state has an NBA team, has_nba, and you can see there that it's highly significant.
The probability value is .000 and the Partial Eta Squared is .412. And what this lets us know is that most of the interest in NBA as a search term has to do with whether a state has an NBA team. So that's a major predictor. The next one is region. Is there region by region interest? The significance level is .079 and that's above the standard cutoff of .05, so we would say that on the whole, no, the region that a state is in does not make a big difference in terms of their interest. On the other hand, whether they had an NBA team did.
Those are the two main effects that an Analysis of Variance gives us. There is, however, the third thing that I talked about: the statistical interaction. And that is whether the region interacts with whether a state has an NBA team to predict overall interest. And you see that on this one, the significance level in the second to last column, the last entry, is .049, which is just barely beneath the .05 cutoff and that's enough to be considered statistically significant. Now what we're going to need to do is very quickly make a chart to show what these differences look like.
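As a side note on where Partial Eta Squared comes from: it is the effect's sum of squares divided by that same sum of squares plus the error sum of squares. The sketch below shows that arithmetic. The function name `partial_eta_squared` is hypothetical and the sums of squares are made-up placeholders, not the values from the SPSS table shown in the video.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Share of variance tied to one effect, setting the other effects aside.

    partial eta squared = SS_effect / (SS_effect + SS_error)
    """
    return ss_effect / (ss_effect + ss_error)

# Placeholder sums of squares, invented for illustration only.
ss_effect = 12.0
ss_error = 18.0
print(round(partial_eta_squared(ss_effect, ss_error), 3))
```

For the Corrected Model row, the effect sum of squares plus the error sum of squares is the corrected total, so Partial Eta Squared for that row works out to the same number as R Squared. That is why both show up as .492 in the table.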
I'm going to do that really quickly in the graph. Go to Graphs, to Chart Builder, I'll get a Clustered Bar Chart. And from there I'll take interest in NBA as a search term and I'll take whether they have an NBA team, I'll make that the cluster and I'll put the Region on the X axis. And when I do that, you see what's going on here. The bars in green are for states that have an NBA team and you see that in every region where they have NBA teams, there is above-average interest in searching for NBA, and it makes sense.
The states that don't have NBA teams are in blue and they have below-average interest, regardless of the region, except you do see an interesting thing. In the South, the states that have NBA teams, and there are several, are barely above the national average in terms of interest. But in the West, the states that have NBA teams have huge amounts of interest, much higher. And so you can see that the effect of having an NBA team varies according to region.
And that's the idea of a statistical interaction. It's one of the benefits of an Analysis of Variance. And so, for our final inferential test, the Factorial Analysis of Variance, you see this is an excellent way of looking at the association between two categorical predictor variables and a single scaled outcome variable. It lets you look at the statistical effect of each of the categorical variables on its own, as well as the interaction of the two, which can often be more interesting and more important. And with that, we'll conclude our last section on statistical graphing and testing.
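One way to put a number on "the effect of having an NBA team varies according to region" is to take the team-minus-no-team difference in mean interest within each region and compare those differences. The sketch below does that. Only the Northeast means (-.42 and +.39) come from the output described earlier; the South and West values are invented placeholders chosen to mimic the pattern in the chart, the Midwest is left out, and the name `team_effect_by_region` is hypothetical.

```python
# Mean search interest (z-scores) by region and team status.
# Northeast values are the ones quoted from the descriptive statistics;
# South and West are INVENTED placeholders, not the real output.
cell_means = {
    "Northeast": {"no_team": -0.42, "team": 0.39},
    "South": {"no_team": -0.30, "team": 0.05},  # hypothetical
    "West": {"no_team": -0.35, "team": 1.20},   # hypothetical
}

def team_effect_by_region(means):
    """Difference in mean interest (team minus no team) within each region."""
    return {
        region: round(m["team"] - m["no_team"], 2)
        for region, m in means.items()
    }

effects = team_effect_by_region(cell_means)
for region, diff in effects.items():
    print(region, diff)
```

If there were no interaction, those differences would be roughly the same in every region. When they differ a lot from one region to the next, as they do between the South and the West here, that is what the interaction term is picking up.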
In the next and last section, we'll wrap things up a little and talk about how you can get all of your results out of SPSS and format them, so they'll be as clear and as communicative as possible.