# Creating scatterplots

## Video: Creating scatterplots

In the last movie we talked about how to chart the relationship between two categorical variables with clustered bar charts. On the other hand, if you have two scale variables, also called quantitative variables or measured variables, then your best choice is almost always a scatter plot. Scatter plots are familiar to most people. There's an x axis across the bottom and a y axis up the side, and each person or case gets a dot to show the combination of their two scores, like height and weight or high school and college GPA. In general you want to put your predictor variable on the bottom, on the x axis, and your outcome variable, the thing you're trying to predict, on the y axis. SPSS makes the whole process very simple.


You can create a scatter plot with the Chart Builder in just a few steps. For this example I'm going to be using the same Google searches information in Searches.sav. I am going to come up to Graphs, to Chart Builder, and then in the Gallery I will choose Scatter and just use a Simple Scatter plot. I will drag that up to the canvas. In this particular example, I'm going to take the relative interest in SPSS as a search term and put it on the x axis, and then I am going to take one that may seem a little peculiar, the search term Totally Lost, and put that on the y axis.

I'm also going to make it possible for me to identify points by clicking on the Point ID label. That brings up a box in the canvas, and I can come up here, take the state code, and drag that in. That should be enough for right now. I'll click OK and here's my general scatter plot. What you see first off is a lot of fuzz, because I have dots and I have the state labels. I am going to take care of those in just a second. But it's clear that there's a very strong linear uphill trend: places that show greater relative interest in SPSS as a search term in Google also, for reasons that may not be totally clear, show greater use of the search term Totally Lost.

Now, I am going to clean up this chart in a few ways. I am going to try to go through it relatively quickly and give you an idea of what's possible. To edit the chart you need to double-click on it, and what I am going to do is turn off all of the state labels by going to Elements and Hide Data Labels. I will bring back just one or two of them for illustration later. There are a few things I want to show you how to clean up. For instance, you can change almost anything by clicking on it. I have selected the data points here, and instead of black circles I can make them red dots by clicking red for the Border and then red for the Fill.

If I want to change the colors of lines, I can do that as well. I can also change the axis down here from 3 decimal places by clicking on Number Format and changing that to 0, clicking Apply, and then doing the same thing on the other axis over here, changing that to 0 and clicking Apply. Now what I am going to do is add a linear regression line. This is also the basis of an inferential procedure, linear regression, that we'll be coming to a little bit later, but right now it's a very simple thing to do. I just come up to the Button bar and click on this one that says Add a Fit Line at Total, and that's a regression line that goes all the way through.
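If you'd like to see what a fit line like this is doing under the hood, here is a minimal Python sketch of an ordinary least-squares line. It is not SPSS code and it does not read Searches.sav. The function name `fit_line` and the four data points are made up for illustration, and I'm assuming the fit line is the usual least-squares regression line.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical values standing in for the two search-interest variables
spss_interest = [1.0, 2.0, 3.0, 4.0]   # predictor, goes on the x axis
totally_lost = [3.0, 5.0, 7.0, 9.0]    # outcome, goes on the y axis

slope, intercept = fit_line(spss_interest, totally_lost)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")  # prints slope=2.00, intercept=1.00
```

A positive slope corresponds to the uphill trend in the chart: as the predictor goes up, the predicted outcome goes up with it.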

Adding the fit line also puts a little bit of information right here that I don't need right now, so I am going to select that and press Delete. And then I've got a very clear, strong, upward trend: higher relative interest in SPSS as a search term goes with higher use of Totally Lost as a search term. The one last thing I'm going to do is add an identifier to the point that's in the top right. We saw what it was earlier, but I am going to add an identifier for just that point. I come over to the left of this button bar and click on the little target, which is the Data Label Mode, and then I come back over and click on the data point I want to identify. We see there that it's Washington D.C., and that's probably enough for this particular chart.

I want you to be aware that there are many other options. For instance, I can add vertical and horizontal reference lines. I can also change the kind of regression line I have running through the data. For instance, this one is called a linear regression line, but if you're interested in growth, like changes in stock prices over time, you might want to use a Quadratic or something called a Cubic. If you want to see whether it's a straight line at all, you can use what's called a smoother, in this case a Loess smoother, in place of the straight regression line. I encourage you to try these alternatives, and it's actually possible to overlay one on top of the other. But for now I am going to leave this with a straight regression line, as it shows the linear pattern most clearly.
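To get a feel for why you might try a Quadratic fit instead of a straight line, here is a small Python sketch that fits both to the same curved data and compares how much error each leaves behind. Again, this is not SPSS code. The helper names `fit_polynomial` and `sum_squared_error` and the data are made up for illustration, and the sketch uses a plain least-squares polynomial fit rather than a Loess smoother.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] for y = c0 + c1*x + c2*x^2 + ..."""
    m = degree + 1
    # Normal equations (X^T X) c = X^T y, where X has columns x^0, x^1, ...
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        known = sum(a[i][k] * coeffs[k] for k in range(i + 1, m))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def sum_squared_error(xs, ys, coeffs):
    """Sum of squared vertical distances between the points and the fitted curve."""
    def predict(x):
        return sum(c * x ** i for i, c in enumerate(coeffs))
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical curved data generated from y = 1 + x^2
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 5.0, 10.0, 17.0]

linear = fit_polynomial(xs, ys, 1)
quadratic = fit_polynomial(xs, ys, 2)
sse_linear = sum_squared_error(xs, ys, linear)
sse_quadratic = sum_squared_error(xs, ys, quadratic)
print(f"linear SSE={sse_linear:.3f}, quadratic SSE={sse_quadratic:.3f}")
```

When the data really do curve, the straight line leaves noticeably more error than the quadratic, which is the same thing you would see by eye when you overlay the two lines on the chart.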

So I am going to close that and close that. The scatter plot can give really good insight into the relationship between two scale variables, and the options that SPSS gives for lines through the data can help you explore how well your data match the assumptions of standard linear regression. In the next movie we'll look at a special kind of scatter plot called the Time Series Plot or Time Plot, where the variable on the bottom is, not surprisingly, time.

#### This video is part of

SPSS Statistics Essential Training (2011)

52 video lessons · 20059 viewers

