In this movie on graphing the association between two variables, we will look at what SPSS calls simple boxplots, which are a series of boxplots for a single scale variable, broken down by the groups in a single categorical variable. One of the main benefits of this particular chart is that it allows you to check for outliers separately for each group. This is important because a variable may not have any outliers when all of the cases are considered together, but can have an outlier when the groups are separated.
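As a rough illustration of that point outside SPSS, here is a minimal Python sketch. It assumes the common 1.5 × IQR fence rule for flagging outliers and uses made-up heights in inches; the function name `tukey_outliers` and the data are hypothetical, and SPSS computes its box edges somewhat differently, so treat this as a sketch of the idea rather than a reproduction of SPSS output.

```python
from statistics import quantiles

def tukey_outliers(values):
    """Return the values lying more than 1.5 * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # default 'exclusive' quartile method
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]

# Made-up heights in inches (76 inches is 6'4"); not real survey data.
men = [68, 69, 70, 70, 71, 71, 72, 73, 74, 76]
women = [62, 63, 63, 64, 64, 65, 65, 66, 66, 76]

print("All cases together:", tukey_outliers(men + women))
print("Men only:", tukey_outliers(men))
print("Women only:", tukey_outliers(women))
```

With these numbers, the 76-inch value is not flagged when everyone is pooled, or among the men, but it is flagged among the women, which is the situation a boxplot broken down by group is meant to reveal.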
For example, enough people in the sample might be 6'4" tall that this height would not be considered an outlier overall, but it almost certainly would be an outlier if you looked at the heights of men and women separately. So, here's how to break boxplots down by various categories. For this example, I am going to be using the Searches database from Google again, Searches.sav, except that in this case I am going to be looking at the relative interest in one variable, Modern Dance as a search term, and breaking it down by region.
To do this, I am going to go up to Graphs, to Chart Builder, and I am going to come down to Boxplot. I am going to take the first one, which is called the Simple Boxplot, and drag it up to the canvas. From there I'm going to get the Region variable, that's this one, Census Bureau region, and drag that down to the X axis. Then I'm going to get the variable that shows the relative interest in Modern Dance as a search term and drag that to the Y axis. From there I'm going to go to Groups/Point ID. This is helpful when you're labeling outliers, which often show up in boxplots.
So I'm going to come down and click on Point ID label, and then I am going to get the State Code from the variable list and drag that over, and that's all I need for right now. So I am going to come down and press OK. And what you find, rather surprisingly, is that Utah is an extraordinarily high outlier on the far right, being four-and-a-half standard deviations above the national average in relative mindshare or interest in Modern Dance as a Google search term.
You might associate Modern Dance with a city like New York and with the Northeast, and you do see that New York is an outlier on the left side, but it is still only about one standard deviation above the mean. You can also see that other regions show much lower interest, and the Midwest is generally below 0, meaning its values are negative. So this is a good way of looking at the relative differences in distributions, and especially in outliers, from one group to another.
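The same idea of grouping cases and labeling each group's outliers with a point ID can be sketched in Python. This is only an illustration under stated assumptions: the rows, group names, and IDs below are invented (they are not taken from Searches.sav), the helper `labeled_outliers` is hypothetical, and the 1.5 × IQR fence rule stands in for whatever SPSS does internally.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical rows: (point ID, group, standardized value). Invented data.
rows = [
    ("A1", "North", -0.4), ("A2", "North", -0.1), ("A3", "North", 0.0),
    ("A4", "North", 0.2), ("A5", "North", 0.3), ("A6", "North", 4.5),
    ("B1", "South", -0.9), ("B2", "South", -0.6), ("B3", "South", -0.5),
    ("B4", "South", -0.3), ("B5", "South", -0.2), ("B6", "South", 0.0),
]

def labeled_outliers(rows):
    """Map each group to the point IDs beyond 1.5 * IQR from its quartiles."""
    by_group = defaultdict(list)
    for point_id, group, value in rows:
        by_group[group].append((point_id, value))

    flagged = {}
    for group, points in by_group.items():
        q1, _, q3 = quantiles([v for _, v in points], n=4)
        spread = 1.5 * (q3 - q1)
        flagged[group] = [
            pid for pid, v in points if v < q1 - spread or v > q3 + spread
        ]
    return flagged

result = labeled_outliers(rows)
for group, ids in result.items():
    print(group, ids)
```

Carrying the ID alongside each value plays the role of the Point ID label in the chart: when a case is flagged, you can see immediately which one it is.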
The Simple Boxplot is a great way to compare the distributions of a single scale variable across the different groups in a categorical variable. And again, it's especially important to identify outliers because they can wreak havoc with statistical procedures, so this is an important step before going on to further analysis, like the inferential statistics for associations that we will cover in the next several movies.