# Comparing means with a two-factor ANOVA

## Comparing means with a two-factor ANOVA

The last inferential test that we'll look at in this course is a variation on the Analysis of Variance, or ANOVA. As we discussed in the sections on associations, the Analysis of Variance is a very flexible and powerful procedure, and there are probably dozens of variations on it. In this movie we're going to talk about the version that is designed for situations where two categorical variables are used jointly to predict scores on a scaled or quantitative outcome variable. Because categorical variables are generally referred to as factors in the Analysis of Variance, and the categories that make them up are called levels, this version of the Analysis of Variance is usually called the Factorial ANOVA or, more colloquially, a Two-Factor ANOVA.

An important thing to note is that when you have two separate factors, like gender and educational category, and you're looking at levels of discretionary spending, an Analysis of Variance will give you three different results. The first result will let you know whether spending differs by gender, ignoring educational level. The second result will let you know whether spending differs by educational level, ignoring gender. These are both known as main effects: "effect" has to do with the statistical association, and "main" because each factor has an effect on its own.

However, an Analysis of Variance also gives you one more important result. It lets you know whether the two factors interact. That is, it lets you know if, for example, women with college degrees spend more than women without college degrees, while for men, spending is the same with and without a degree. By the way, I'm just making that up. I don't really know what the association between those variables is, but I'm sure that some of you actually do. In some domains, the interactions are particularly interesting and can take precedence over the main effects.
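The distinction between main effects and an interaction can be sketched with a few lines of Python. The cell means below are invented purely for illustration (they are not real spending data), and the sketch assumes equal numbers of people in every cell so that marginal means are simple averages of cell means.

```python
# Hypothetical cell means for discretionary spending (made-up numbers,
# in the same spirit as the made-up example in the transcript).
cell_means = {
    ("women", "degree"): 600.0,
    ("women", "no degree"): 400.0,
    ("men", "degree"): 500.0,
    ("men", "no degree"): 500.0,
}

genders = ["women", "men"]
educations = ["degree", "no degree"]


def marginal_mean(factor_index, level):
    """Average the cell means over the other factor (equal cell sizes assumed)."""
    values = [m for key, m in cell_means.items() if key[factor_index] == level]
    return sum(values) / len(values)


# Main effect of gender: compare marginal means, ignoring education.
gender_means = {g: marginal_mean(0, g) for g in genders}

# Main effect of education: compare marginal means, ignoring gender.
education_means = {e: marginal_mean(1, e) for e in educations}

# Interaction: does the degree vs. no-degree gap differ between genders?
degree_gap = {
    g: cell_means[(g, "degree")] - cell_means[(g, "no degree")] for g in genders
}
interaction_contrast = degree_gap["women"] - degree_gap["men"]

print("Gender marginal means:", gender_means)
print("Education marginal means:", education_means)
print("Degree gap by gender:", degree_gap)
print("Interaction contrast:", interaction_contrast)
```

With these invented numbers the two gender marginal means are identical, so there is no main effect of gender, yet the degree gap is 200 for women and 0 for men. That nonzero contrast is the pattern an interaction test looks for. Whether such a difference is statistically significant depends on the variability within the cells, which this sketch leaves out.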

However, it all comes down to interpretability and applicability, and that will depend on what you are trying to do with your data. With that in mind, let's see how a Two-Factor ANOVA can work in SPSS. To do the Analysis of Variance, we need to go to Analyze and down to General Linear Model. Now that actually is an interesting term, and the idea here is that all of the procedures that we've done, t-tests and Regression and Multiple Regression, are all variations on what's called the General Linear Model, a way of predicting scores on a single outcome.
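One way to see why these procedures count as variations on a single model: if a two-group factor is coded 0 and 1 and used as the predictor in a simple regression, the fitted slope equals the difference between the two group means, which is the quantity a t-test examines. The following is a minimal sketch with made-up scores; the variable names are illustrative and do not come from the course data.

```python
# Hypothetical scores for two groups (made-up data).
group_a = [4.0, 5.0, 6.0]  # coded 0
group_b = [7.0, 8.0, 9.0]  # coded 1

# Build the dummy-coded predictor and the outcome vector.
x = [0.0] * len(group_a) + [1.0] * len(group_b)
y = group_a + group_b


def mean(values):
    return sum(values) / len(values)


# Ordinary least squares with one predictor:
# slope = sum of cross-products / sum of squares of x.
x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

mean_difference = mean(group_b) - mean(group_a)

print("Regression slope:", slope)
print("Difference in group means:", mean_difference)
print("Intercept (mean of the group coded 0):", intercept)
```

The slope and the mean difference come out the same, and the intercept is the mean of the group coded 0. An ANOVA with more levels or more factors extends the same idea with additional coded predictors.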

Let's do this one over here, Univariate. Now what we need to do is pick our main dependent variable. That's the outcome variable, the thing that we're trying to predict. In this particular example, I thought I might use interest in NBA as a search term, so I'll put that up in the dependent variable. And then I'm going to use two categorical variables as predictors of interest in searching for NBA. The first one that makes a lot of sense to me is whether a state has an NBA team.

So I'll put that here under Fixed Factor(s). When you have categories that are determined, like yes or no, they have an NBA team, then it's a fixed factor. You can also have what are called random factors in the Analysis of Variance, but in many situations those are unusual, and I've never used them. A covariate is there if you want to throw in another quantitative or scaled variable, but putting covariates into the analysis can complicate the results dramatically. The last one is if you want to weight cases, and we're not going to deal with that. I'm just going to go back and find my second predictor category, and that's going to be region of the United States.

And I can just click that one and put it in there. Now it's okay that there are four levels in this category. The Analysis of Variance is able to deal with that just fine. Let's take a quick look at some of the options here. Under Model, I can specify whether I want something called a full factorial model or a custom one. We don't need to worry about that. We can Cancel. Under Contrasts, I can decide if there are special ways I want to compare the results, and I don't need to worry about that. Under Plots, I could get Profile Plots, but these can get a little complicated, so I'm going to cancel that.

Post Hoc lets me look at the differences between specific groups in more detail. I'm not going to do that on this one. If I want to save the predicted values or if I want to save some other statistics for diagnostics, I could do that, but I'm going to skip it for now. And finally, under Options, there are some here that I might want to do. I might want to get what are called descriptive statistics and estimates of effect size. I think those two are really helpful. Then I'm going to press Continue. And I've got it set up the way I need, so I'll just click OK.

And so here are my results. The first thing is I get an indication of what are called the Between-Subjects Factors. These are the things that separate one group from another. One factor is whether a state has an NBA team, and you can see that 23 of them do and 28 of them don't. The second thing is the Census Bureau region. You see that I have nine states in the Northeast, 12 in the Midwest, and so on. Below that, I have the actual descriptive statistics for the search interest in NBA.

Well, it's breaking it down by whether they have an NBA team and by the Census Bureau region. So the states in the Northeast that do not have NBA teams have a mean of -.42. That means that they are a little under half a standard deviation below the rest of the country in relative interest in searching for NBA teams. On the other hand, if you go to the Northeast states that do have NBA teams, you see that they have a score of +.39. That means they're about four-tenths of a standard deviation above the national average in relative interest in searching for NBA on Google.
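Reading a group mean as "so many standard deviations above or below average" works because the outcome is expressed as a standardized (z) score. The sketch below shows that conversion on made-up raw scores. The state names and values are hypothetical, and it assumes the sample standard deviation (n - 1 denominator) is the one used for standardizing.

```python
from statistics import mean, stdev

# Hypothetical raw search-interest scores for five states (made-up values).
raw_scores = {
    "State A": 10.0,
    "State B": 20.0,
    "State C": 30.0,
    "State D": 40.0,
    "State E": 50.0,
}

values = list(raw_scores.values())
overall_mean = mean(values)
overall_sd = stdev(values)  # sample standard deviation (n - 1 denominator)

# A z-score is the distance from the overall mean in standard deviation units.
z_scores = {
    state: (score - overall_mean) / overall_sd
    for state, score in raw_scores.items()
}

# Mean z-score for a hypothetical subgroup of states.
group = ["State A", "State B"]
group_mean_z = mean([z_scores[state] for state in group])

print("z-scores:", {s: round(z, 2) for s, z in z_scores.items()})
print("Group mean z:", round(group_mean_z, 2))
```

After standardizing, the whole set of scores has a mean of 0 and a standard deviation of 1, so a subgroup mean such as -.42 can be read directly as about four-tenths of a standard deviation below the overall average.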

And then you can run through and see the various combinations there. The next table is the actual Analysis of Variance table, and what it has is several different results. The first one, which says Corrected Model, simply tells me how well the model as a whole works, and it predicts rather nicely. You can see that it has a significance level in the first row of .000. And it also has something called a Partial Eta Squared. Again, it's like a correlation that's squared, and it's .492. In fact, if you look at the footnote at the bottom of that table, you'll see it says R Squared = .492.

And what it means is that if we know the region of the country that a state is in and whether that state has an NBA basketball team, then we can account for about 49% of the variance in interest in NBA as a Google search term. So that's the entire model. The next row down on that table is Intercept, which tests whether the starting score differs from 0, and that's not terribly interesting in and of itself. What's funny here is that it actually is close to 0. The next one is whether a state has an NBA team, has_nba, and you can see there that it's highly significant.

The probability value is .000 and the Partial Eta Squared is .412. And what this lets us know is that a good deal of the variation in interest in NBA as a search term has to do with whether a state has an NBA team. So that's a major predictor. The next one is region. Is there region-by-region interest? The significance level is .079, and that's above the standard cutoff of .05, so we would say that, on the whole, no, the region that a state is in does not make a big difference in terms of their interest. On the other hand, whether they had an NBA team did.
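The quantities in that table can be traced back to sums of squares. The sketch below works through a two-factor ANOVA by hand on a tiny, made-up, balanced data set (the same number of scores in every cell). It is not the NBA data and not SPSS's own routine: the course data have unequal cell sizes, where SPSS by default uses Type III sums of squares, and the simple formulas here match that only in the balanced case. The factor names `A` and `B` are placeholders.

```python
# Hypothetical balanced data: 2 levels of A x 2 levels of B, 2 scores per cell.
data = {
    ("a1", "b1"): [1.0, 3.0],
    ("a1", "b2"): [2.0, 4.0],
    ("a2", "b1"): [5.0, 7.0],
    ("a2", "b2"): [10.0, 12.0],
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


a_levels = sorted({a for a, _ in data})
b_levels = sorted({b for _, b in data})
n = len(data[(a_levels[0], b_levels[0])])  # scores per cell (equal by assumption)

all_scores = [y for scores in data.values() for y in scores]
grand = mean(all_scores)
cell = {key: mean(scores) for key, scores in data.items()}
a_mean = {a: mean(cell[(a, b)] for b in b_levels) for a in a_levels}
b_mean = {b: mean(cell[(a, b)] for a in a_levels) for b in b_levels}

# Sums of squares for each main effect, the interaction, and error.
ss = {
    "A": n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels),
    "B": n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels),
    "A*B": n * sum(
        (cell[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    ),
    "error": sum(
        (y - cell[key]) ** 2 for key, scores in data.items() for y in scores
    ),
}
ss_total = sum((y - grand) ** 2 for y in all_scores)

df = {
    "A": len(a_levels) - 1,
    "B": len(b_levels) - 1,
    "A*B": (len(a_levels) - 1) * (len(b_levels) - 1),
    "error": len(all_scores) - len(a_levels) * len(b_levels),
}
ms_error = ss["error"] / df["error"]

f_ratio = {}
partial_eta_sq = {}
for effect in ("A", "B", "A*B"):
    f_ratio[effect] = (ss[effect] / df[effect]) / ms_error
    # Partial eta squared: effect variance relative to effect plus error.
    partial_eta_sq[effect] = ss[effect] / (ss[effect] + ss["error"])
    print(effect, "F =", round(f_ratio[effect], 2),
          "partial eta squared =", round(partial_eta_sq[effect], 3))

# R squared for the whole model: share of total variance not left in error.
r_squared = 1 - ss["error"] / ss_total
print("R squared =", round(r_squared, 3))
```

Because the model sum of squares plus the error sum of squares equals the corrected total, the partial eta squared for the model as a whole is the same number as R squared, which is why the Corrected Model row and the table footnote agree. Turning each F ratio into a significance level requires the F distribution, which this standard-library sketch does not compute.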

Those are the two main effects that an Analysis of Variance gives us. There is, however, the third thing that I talked about: the statistical interaction. And that is whether the region interacts with whether a state has an NBA team to predict overall interest. And you see that on this one, the significance level in the second-to-last column, the last entry, is .049, which is just barely beneath the .05 cutoff, and that's enough to be considered statistically significant. Now what we're going to need to do is very quickly make a chart to show what these differences look like.

I'm going to do that really quickly in the graph. Go to Graphs, to Chart Builder, and I'll get a Clustered Bar Chart. And from there I'll take interest in NBA as a search term, and I'll take whether they have an NBA team, I'll make that the cluster, and I'll put the Region on the X axis. And when I do that, you see what's going on here. The bars in green are for states that have an NBA team, and you see that in every region where they have NBA teams, there is above-average interest in searching for NBA, and it makes sense.

The states that don't have NBA teams are in blue, and they have below-average interest, regardless of the region, except you do see an interesting thing. In the South, the states that have NBA teams, and there are several, are barely above the national average in terms of interest. But in the West, the states that have NBA teams have huge amounts of interest, much higher. And so you can see that the effect of having an NBA team varies according to region.

And that's the idea of a statistical interaction. It's one of the benefits of an Analysis of Variance. And so, for our final inferential test, the Factorial Analysis of Variance, you see this is an excellent way of looking at the association between two categorical predictor variables and a single scaled outcome variable. It lets you look at the statistical effect of each of the categorical variables on its own, as well as the interaction of the two, which can often be more interesting and more important. And with that, we'll conclude our last section on statistical graphing and testing.

In the next and last section, we'll wrap things up a little and talk about how you can get all of your results out of SPSS and format them, so they'll be as clear and as communicative as possible.

#### This video is part of

SPSS Statistics Essential Training (2011)
