# Creating box plots for quantitative variables

## Video: Creating box plots for quantitative variables

When you're looking at what SPSS calls a scale variable (that's something that can be measured as more or less, like the percentage of critics who gave a favorable rating to a movie, or the budget or the box office earnings for that movie), you should generally make two kinds of charts. The first one, which we did in the last movie, is called a histogram. It can look something like a bell curve, and it's a good way of getting a feel for the overall shape of a distribution. The second kind that you should generally make for a scale variable is called a box plot, and its primary purpose in this context is to check for outlying scores, because they can cause a lot of problems in later statistical analyses.

So you need to be able to identify whether you have outliers and, often, what those outliers are. What I'm going to do now is create a box plot for budget, which we used in the last movie on histograms. I come up to Graphs, to the Chart Builder, and from there I come down in the list to Boxplot. There are several different versions of box plots. I am going to choose the simplest one possible, this one over here, which is called a 1-D Boxplot. It's for charting all of the cases on a single variable.

If I wanted to break down budgets by genre of film, I could do that over here, with what's called a Simple Boxplot, which is grouped, and I will show that in a later movie. But right now I'm simply going to drag the 1-D Boxplot up to the canvas, and then I'm going to bring budget in to the Y axis. This is the general format of a box plot. I will explain more when we look at the finished version. But I am going to do a couple of things. Number one is that I may want to identify points.

If I click on Point ID Label, I can get the movie name and drag it in here, so if I have unusually high or low points, it will actually tell me what the movie is. It makes life easier. I can also put titles on. I will have a title, and I will put Boxplot of Movie Budgets. Then I will press Apply, and for both of these I can now press OK over here. And what comes up is this particular chart. This text is the syntax that runs the command.

This is the name of the command, this is the data set, Movies.sav, and this is the Boxplot of Movie Budgets. What you have here is budgets ranging from 0 (there's actually nothing with 0) up to \$250 million for the movie. This is from a few years ago. This box right here shows the quartiles of the distribution, and this is the minimum value of any movie in the data set. This right here is the highest non-outlying value, and I say non-outlying because we have two outlier movies.

In this particular data set, Spiderman 2 and King Kong both had budgets of approximately \$200 million. On the other hand, this box down here shows you the median: 50% of the movies (there were 61 in this data set, so 30 of them) had budgets beneath this, which is around \$25 or \$30 million, and half of them were above. Now, I am going to show you a few ways to modify this chart that I think will make it a little easier to deal with.
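To make the median statement concrete, here is a minimal Python sketch. The 61 budget values are invented purely for illustration (they are not the values in Movies.sav); the point is only that with 61 cases the median is the 31st value in sorted order, leaving 30 cases below it and 30 above.

```python
import statistics

# Invented budgets for 61 films: 1, 2, ..., 61 million dollars.
# These are NOT the real values from the course data set.
budgets = [1_000_000 * k for k in range(1, 62)]

# With an odd number of cases, the median is the middle value itself.
median_budget = statistics.median(budgets)

# Count how many films fall strictly below and strictly above the median.
below = sum(1 for b in budgets if b < median_budget)
above = sum(1 for b in budgets if b > median_budget)

print(median_budget, below, above)
```

With real data that contains tied values at the median, the two counts can come out smaller than 30.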

As with every chart in SPSS, you modify it by first double-clicking on it to activate it. That brings up the chart in a Chart Editor window, and it brings up a Properties window to the right. Now, one thing that I personally like to do is turn these charts sideways by coming up to the button bar and clicking on the button that says "Transpose the chart coordinate system." The reason I do this is that the other charts we make, like histograms and the scatter plots that we will show later, have their variables listed across the bottom, with the lowest value on the left and the highest value on the right, and I find it helpful to be consistent in this particular way.

I'd like to change the color of the chart. I click on the box, come over here to change the fill, and then I can change the border to another color if I want. I can also change the way these bars work at the end. These are sometimes called whiskers. They go to the lowest and the highest non-outlying values. In case you're wondering, a point counts as an outlier when it lies more than one and a half times the width of this middle box (the interquartile range) above or below the box. What I'm going to do is change the way these whiskers look.
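The outlier rule just described can be sketched in Python. This is an illustration under stated assumptions, not a reproduction of what SPSS computes: the function name `box_plot_summary` is hypothetical, the quartiles come from Python's `statistics.quantiles` with the "inclusive" method (quartile conventions differ between packages, so SPSS may place the box edges slightly differently), and every budget except the two roughly \$200 million films named in the transcript is invented.

```python
import statistics


def box_plot_summary(budgets):
    """Return quartiles, whisker ends, and labelled outliers.

    `budgets` maps a label (here a film title) to a numeric value.
    """
    values = sorted(budgets.values())

    # Assumption: "inclusive" quartiles; other packages may use other rules.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # width of the box

    # Points beyond 1.5 box widths from the box edges count as outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr

    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = {
        name: v
        for name, v in budgets.items()
        if v < lower_fence or v > upper_fence
    }

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers stop at the lowest and highest non-outlying values.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Budgets in millions of dollars. "Film A" to "Film J" are invented;
# the last two use the approximate figure mentioned in the transcript.
example = {
    "Film A": 5, "Film B": 10, "Film C": 15, "Film D": 20, "Film E": 25,
    "Film F": 30, "Film G": 35, "Film H": 40, "Film I": 60, "Film J": 150,
    "Spiderman 2": 200, "King Kong": 200,
}

summary = box_plot_summary(example)
print(summary["whisker_low"], summary["whisker_high"], summary["outliers"])
```

Keeping the labels alongside the values plays the same role as the Point ID Label in the Chart Builder: when a point falls outside the fences, you can see which film it is.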

This is just a preference issue. I click on that, and I come over here to Bar Options, and I am going to change it from a T-bar to what's called a Whisker. It's just a line at the end. And then here, if I want to, I can actually change the way that these look at the end. I have the movie labels there as well. Finally, if I want to change the axis labels here on the bottom, like I did with the histogram where I changed these to millions of dollars, I click on the numbers, and I come over to the Properties window, to Number Format, and for the Scaling Factor here, I'm going to put in millions.

I am going to press Apply, and this now gives me millions of dollars. And I need to change this (it says Budget) to say Budget in Millions. I can close the chart, and now I have a good depiction: the overall distribution is on the low end, because these are movies that included award winners; half of the movies have budgets of 30 million or less, but they go up to about 150 million; and in this particular data set we had two other movies, Spiderman 2 and King Kong, that had unusually large budgets, as is common among summer blockbusters.
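The axis rescaling can be pictured with a short Python sketch. It assumes that a scaling factor simply divides each axis value by that factor before it is displayed; the function name `scale_ticks` and the tick positions are hypothetical, chosen to match the 0 to 250 million range mentioned earlier.

```python
def scale_ticks(tick_values, scaling_factor=1_000_000):
    """Divide raw axis values by a scaling factor and format them as labels."""
    return [f"{value / scaling_factor:g}" for value in tick_values]


# Hypothetical tick positions in raw dollars.
raw_ticks = [0, 50_000_000, 100_000_000, 150_000_000, 200_000_000, 250_000_000]

labels = scale_ticks(raw_ticks)
axis_title = "Budget in Millions"  # the units now live in the axis title

print(axis_title, labels)
```

Because the numbers lose their units once they are rescaled, the axis title has to carry them, which is why the label is changed from Budget to Budget in Millions.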

Anyhow, when you're looking at a scale variable like budget, viewer evaluations, time spent on tasks, or time spent viewing a web site, you want to look at the overall shape of the distribution with a histogram, and you want to check for outliers, and a box plot is an ideal way to do that.

#### This video is part of

SPSS Statistics Essential Training (2011)

52 video lessons · 20159 viewers

### Introduction (2m 58s)

1. Welcome (1m 5s)
2. Using the exercise files (40s)
3. Using a different version of the software (1m 13s)

### 1. Getting Started (19m 0s)

1. Taking a first look at the interface (11m 49s)
2. [title missing] (7m 11s)

### 2. Charts for One Variable (21m 54s)

1. Creating bar charts for categorical variables (7m 18s)
2. Creating pie charts for categorical variables (2m 54s)
3. Creating histograms for quantitative variables (5m 45s)
4. Creating box plots for quantitative variables (5m 57s)

### 3. Modifying Data (33m 10s)

1. Recoding variables (5m 33s)
2. Recoding with visual binning (5m 33s)
3. Recoding by ranking cases (5m 26s)
4. Computing new variables (5m 37s)
5. Combining or excluding outliers (5m 21s)
6. Transforming outliers (5m 40s)

### 4. Working with the Data File (28m 12s)

1. Selecting cases (6m 44s)
2. Using the Split File command (5m 12s)
3. Merging files (5m 33s)
4. Using the Multiple Response command (10m 43s)

### 5. Descriptive Statistics for One Variable (22m 14s)

1. Calculating frequencies (8m 43s)
2. Calculating descriptives (5m 31s)
3. Using the Explore command (8m 0s)

### 6. Inferential Statistics for One Variable (16m 3s)

1. Calculating inferential statistics for a single proportion (6m 6s)
2. Calculating inferential statistics for a single mean (5m 39s)
3. Calculating inferential statistics for a single categorical variable (4m 18s)

### 7. Charts for Two Variables (30m 43s)

1. Creating clustered bar charts (7m 10s)
2. Creating scatterplots (5m 8s)
3. Creating time series (3m 24s)
4. Creating simple bar charts of group means (4m 17s)
5. Creating population pyramids (3m 0s)
6. Creating simple boxplots for groups (3m 3s)
7. Creating side-by-side boxplots (4m 41s)

### 8. Descriptive and Inferential Statistics for Two Variables (45m 28s)

1. Calculating correlations (8m 17s)
2. Computing a bivariate regression (6m 27s)
3. Creating crosstabs for categorical variables (6m 34s)
4. Comparing means with the Means procedure (6m 33s)
5. Comparing means with the t-test (6m 4s)
6. Comparing means with a one-way ANOVA (6m 30s)
7. Comparing paired means (5m 3s)

### 9. Charts for Three or More Variables (24m 30s)

1. Creating clustered bar charts for frequencies (6m 34s)
2. Creating clustered bar charts for means (3m 45s)
3. Creating scatterplots by group (4m 13s)
4. Creating 3-D scatterplots (4m 25s)
5. Creating scatterplot matrices (5m 33s)

### 10. Descriptive Statistics for Three or More Variables (30m 57s)

1. Using Automatic Linear Models (11m 52s)
2. Calculating multiple regression (9m 3s)
3. Comparing means with a two-factor ANOVA (10m 2s)

### 11. Formatting and Exporting Tables and Charts (29m 29s)

1. Formatting descriptive statistics (6m 1s)
2. Formatting correlations (7m 49s)
3. Formatting regression (10m 19s)
4. Exporting charts and tables (5m 20s)

### Conclusion (51s)

1. What's next (51s)
