
Calculating correlations

From: SPSS Statistics Essential Training (2011)


Whenever you explore your data, you'll find that each step builds on the ones before it. In this course, for example, we started by looking at individual variables before looking at pairs of variables, and that comes before looking at sets of variables. When we looked at individual variables, we started by creating graphic displays for each variable, then computed descriptive statistics for each, and finished with inferential statistics. There is a logical progression to this, and it's one that we will follow here with the associations for pairs of variables and later for sets of variables.


The first procedure that we are going to look at, correlations, is the most general measure of association between pairs of variables. Let's look at how to do correlations in SPSS and how to interpret the results. For this example, I'm going to be using the same dataset I've used in the last few. It's the one about Google searches, Searches.sav. To get correlations, we go up to Analyze, then come down to Correlate, and choose the basic version, called Bivariate, or two-variable, correlations.

All you need to do here is take all the variables that you want to correlate with each other and put them in the variable list on the right. Now if there is one variable in particular that can serve as an outcome variable, it's helpful to put that one in first so it shows up at the very top of the list. In this particular example I thought it might be interesting to look at the relative interest in searching for Facebook. So I am going to put that in first, and then I'll see how that compares with other search terms by selecting all of these, and I might as well put in nearly everything here.

I am going to come down to Median Age, because all of these are either scale or dichotomous. Now I am not going to put in Census Bureau Region, because that has four categories, or Census Bureau Division, because it has even more. However, you can use indicator variables, and what I've done is create three of them: one for whether a state is in the Northeast, another for the Midwest, and a third for the South. That leaves the West implied as the remaining category: a state that scores 0 on all three indicators is in the West.
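To make the indicator coding concrete, here is a minimal sketch in Python rather than SPSS. The state list, the region labels, and the names `INDICATOR_REGIONS` and `make_indicators` are all hypothetical illustrations; the actual coding in Searches.sav may differ.

```python
# Hypothetical region labels; the real Searches.sav values may be coded differently.
INDICATOR_REGIONS = ["Northeast", "Midwest", "South"]  # West is the implied reference


def make_indicators(region):
    """Return a dict of 0/1 indicator variables for one state's region."""
    return {name: int(region == name) for name in INDICATOR_REGIONS}


# A few example states, chosen only for illustration.
states = {"Maine": "Northeast", "Ohio": "Midwest", "Texas": "South", "Utah": "West"}
coded = {state: make_indicators(region) for state, region in states.items()}

for state, row in coded.items():
    print(state, row)
```

A Western state such as Utah gets 0 on all three indicators, which is why a fourth indicator is not needed.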

So I am going to add the three of those and put them over here. Now I have a few options with correlation. I can get three different kinds of correlations. There is the Pearson product-moment correlation coefficient, which is the standard correlation, also sometimes known by its symbol, r. There's Kendall's tau-b, and there is the Spearman rank-order correlation coefficient. Truthfully, I've never had to deal with anything other than the Pearson, and I recommend that you stick with that one. There's also the Test of Significance.
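For readers who want to see what two of these coefficients compute, here is a rough Python sketch using only the standard library. It is not how SPSS implements them; the function names and the small made-up dataset are assumptions for illustration, and Kendall's tau-b is left out to keep the sketch short. Spearman's coefficient is computed here as the Pearson correlation of the ranks.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up data: y rises with x, but along a curve rather than a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print("Pearson r:", round(pearson_r(x, y), 3))
print("Spearman rho:", round(spearman_rho(x, y), 3))
```

Because the made-up relationship is perfectly increasing but curved, the rank-based coefficient reaches 1 while the Pearson coefficient falls slightly short of it.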

You can do what's called a one-tailed test or a two-tailed test. Now this has to do with calculating false positive rates, and I recommend that you always stay with a two-tailed test unless you have some super-compelling reason to go with the one-tailed. Also, we have the option of flagging statistically significant correlations. That's very helpful and I'd leave that on. Let's come over here and take a quick look at the other options. You can also get means and standard deviations for each variable, but we don't need that at this point, because we should have done that already.

You can get what are called cross-product deviations and covariances, and that's a little technical and we don't need that. The other question is whether you want to exclude cases pairwise or listwise. I've mentioned these before. Pairwise means that you might have a different sample size for each pair of variables. If, for instance, everybody has data on two particular variables, but you're missing a lot of information on another variable, you would end up with different sample sizes. This isn't necessarily a problem and I usually leave it at pairwise.
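The difference between the two exclusion rules is easy to see with a small count. This is a minimal Python sketch with made-up data (the variable names `a`, `b`, `c` and the helper functions are hypothetical, and `None` stands in for a missing value); it only counts cases and does not reproduce SPSS output.

```python
# Made-up data with missing values (None); not taken from Searches.sav.
data = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "c": [1.0, None, 2.0, None, 3.0, None],
}


def pairwise_n(data, var1, var2):
    """Count cases where both variables in this pair are present."""
    return sum(
        1 for u, v in zip(data[var1], data[var2]) if u is not None and v is not None
    )


def listwise_n(data):
    """Count cases that have no missing value on any variable."""
    return sum(
        1 for row in zip(*data.values()) if all(value is not None for value in row)
    )


print("pairwise N for a, b:", pairwise_n(data, "a", "b"))
print("pairwise N for a, c:", pairwise_n(data, "a", "c"))
print("listwise N for every pair:", listwise_n(data))
```

With pairwise exclusion the a and b correlation uses all six cases while a and c uses only three; with listwise exclusion every correlation is based on the same three complete cases.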

However, there may be times when you only want to deal with cases with complete information, in which case you would choose listwise. But I am going to leave it at the default for right now. So I'll press Continue and I'll press OK. Now I asked for a lot of variables and so what I get here is a very large table. You can see that it goes down a long way and it goes across a long way. You can also tell that the labels aren't there and when we scroll down it's hard to see. But that's okay, and what you see here is that every variable is listed down this side.

We have Facebook, SPSS, Regression, and so on as Google searches, and we have the same variables listed across the top: Facebook, SPSS, Regression, and so on. Then what you have is a cell that gives information about the association between each pair. In each cell the top number is the Pearson correlation. That's the actual correlation coefficient. Its size goes from 0 to 1, where 0 means no linear relationship and 1 indicates a perfect linear relationship. It can be positive or negative, so the full range is from -1 to +1.

The sign has nothing to do with the strength of the relationship. It only indicates whether it's an uphill or downhill relationship. The second number is labeled Sig. (2-tailed). This is the probability value that's associated with the significance test for the correlation, and the third one is the N, or the number of cases that go into calculating that particular correlation. This dataset has complete data for all 51 cases. That's the 50 states and Washington, D.C. Additionally, you see that down the diagonal we have a series of 1s and blanks and 51s.
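The layout of the table, with the same variables down the side and across the top, can be sketched in a few lines of Python. The numbers below are made up and the variable names are only borrowed from the search terms mentioned here; nothing in this sketch comes from Searches.sav or from SPSS itself.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up values for three variables measured on the same five cases.
data = {
    "facebook": [3.0, 1.0, 4.0, 1.0, 5.0],
    "spss": [2.0, 7.0, 1.0, 8.0, 2.0],
    "regression": [1.0, 6.0, 2.0, 5.0, 3.0],
}
names = list(data)

# One cell per (row, column) pair, exactly like the square table in the output.
matrix = {
    row: {col: pearson_r(data[row], data[col]) for col in names} for row in names
}

for row in names:
    print(f"{row:>10}", "  ".join(f"{matrix[row][col]: .3f}" for col in names))
```

Each variable correlates perfectly with itself, so the diagonal is all 1s, and the table is symmetric: the cell for row A and column B repeats the cell for row B and column A.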

That's because it's each variable correlated with itself which will always be a perfect positive correlation, and truthfully some programs just don't put anything there at all. But let's say I'm interested in the relative interest in each state in searching for Facebook. Then what I want to do is I want to go down this first column. It says Facebook at the top and I want to scroll down and I want to look for statistically significant correlations. Now SPSS makes this easy, because they will put asterisks next to statistically significant correlations.

So you see, for instance, the top is Facebook correlated with itself. That doesn't really mean anything. Facebook and SPSS have a correlation of -.184. It's not a very strong correlation. It's closer to 0 than it is to +1 or -1, and you can tell that its probability value is .196. It's nowhere close to statistically significant. However, we do see that in the next few we have statistically significant negative correlations. The higher a state's interest in Facebook, the lower its interest in searching on Google for "regression" or "statistically significant" or "business intelligence."
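The probability value in each cell can be approximated from just the correlation and the sample size. The sketch below assumes the usual t-test for a Pearson correlation, with t = r * sqrt((n - 2) / (1 - r^2)) on n - 2 degrees of freedom, and integrates the t density numerically with Simpson's rule so that it needs only the Python standard library. The function names are illustrative and this is not SPSS's own routine; it also does not handle a correlation of exactly 1 or -1.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(r, n, steps=2000):
    """Two-tailed p-value for a Pearson r based on n cases.

    Uses Simpson's rule (steps must be even) to integrate the t density
    from 0 to |t|, then converts that area into the two tail areas.
    """
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 1 - 2 * area_0_to_t


# The Facebook and SPSS correlation from the table: r = -.184 with N = 51.
p = two_tailed_p(-0.184, 51)
print("two-tailed p:", round(p, 3))
print("one-tailed p:", round(p / 2, 3))
```

The one-tailed value is simply half of the two-tailed value when the effect is in the predicted direction, which is why a one-tailed test makes it easier to cross a significance threshold.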

We can scroll down and see some more. Similarly, states with a higher interest in Facebook show lower interest in data visualization, and they're also less likely to search for the term "totally lost." On the other hand, states that show a relatively high interest in Facebook also show a relatively high interest in searching for American Idol. That's the correlation of .516, and its probability value of .000 is not actually 0; it means that the value is less than .001 and rounds off to .000. As we scroll down, we see that modern dance correlates with it, and so does the NBA.
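The point about .000 is purely a display issue, and a tiny Python sketch shows it. The function names and the example probability are made up for illustration; the formatting rule (three decimals, no leading zero) simply mirrors how the values are read aloud here.

```python
def spss_style(value):
    """Format to three decimals without the leading zero, e.g. 0.196 -> '.196'."""
    text = f"{value:.3f}"
    if text.startswith("0."):
        return text[1:]
    if text.startswith("-0."):
        return "-" + text[2:]
    return text


def report_p(p):
    """Report tiny p-values as '< .001' instead of the misleading '.000'."""
    return "< .001" if p < 0.001 else spss_style(p)


tiny_p = 0.00012  # made-up value for illustration
print(spss_style(tiny_p))   # what a three-decimal table cell would show
print(report_p(tiny_p))     # a clearer way to write it in a report
print(spss_style(-0.184), spss_style(0.516))
```

When writing up results, reporting such a value as less than .001 avoids implying that the probability is literally zero.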

Interestingly, NFL does not correlate, but the NBA and FIFA do. Also, as we scroll down, we can see that states that have an NFL team show a lower interest in Facebook, and similarly for states with an NBA or MLS team. It's just a whole series of correlations that show things that can be used to predict the level of interest in a particular item. Now the most important thing to remember here is probably that correlations are simply associations. They don't explain why the variables are associated.

A correlation is simply a predictor. The matter of explaining why the variables are correlated is a whole different issue about causation, and something that we need to be careful about. So in summary, correlations are a great way to look at the strength of association between two variables. Correlations are general purpose: they can be used with scale variables, ordinal variables, or dichotomous variables, and they give a good way to compare associations across a number of procedures. For that reason it's a good idea to always include correlations in your analyses.

However, there are also some more specialized procedures that are helpful to use and we will turn to those next.


Barton Poulson
Author

 
