
Formatting regression

From: SPSS Statistics Essential Training (2011)

In the last two movies, we've looked at ways to take output from SPSS and reformat it by pasting it into a spreadsheet and working with it to get it so it's clear, simpler, and easier to communicate. In the first movie, we looked at formatting a table of descriptive statistics. In the second one, we looked at how to deal with a correlation matrix. In this third one, I want to show you how to take the results of a multiple regression and compare them with the results of correlation coefficients, as a way of communicating the different perspectives that these analyses can give you and to make it clearer how to interpret them in a meaningful way.

To do this, I'm going to be using the same data set, Google searches, and the same variables that I used in the last two examples. I need to get a linear regression output. To do this, I come up to Analyze and go to Regression, to Linear. I need to take my dependent variable. That's my outcome variable, or the thing I'm trying to predict. That's SPSS, and I put that into Dependent. Then I take all the variables that I want to use as my predictors, the things that I think will explain interest in SPSS.

And in this case, I'm going to be using the same ones that were used before: searches for Business Intelligence and searches for Data Visualization. And then I'm going to come down to Degree, the percentage of a state's population with a bachelor's degree or more, the Median Age, and then my three dichotomous indicators for Region. Now I've mentioned before that Region has four categories, and the reason we use three indicator variables for this is that the fourth category, which would be West, is implied by 0s in all of these.

In the other analyses, it's okay to have a fourth indicator for West, but in linear regression it's not. That introduces something called multicollinearity, and it can really wreak havoc with the results if you have a variable that can be predicted perfectly from the others. So that's why we don't do that. Now to make this one simple, I'll leave the method as Enter. That means it's going to give me a regression coefficient for all of these at once. I just leave everything at the default and I press OK. And I have a number of statistics here. The one I'm going to go to right now is this one that says Coefficients.
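
To see why the fourth indicator causes trouble, here is a minimal sketch in Python. It is not part of the SPSS workflow, and the region labels other than West, the case list, and the function name are assumptions made up for illustration. With an intercept in the model, four indicators always add up to the constant column, so one column is perfectly predictable from the rest.

```python
# Assumed labels: only "West" is named in the lesson.
REGIONS = ["Northeast", "Midwest", "South", "West"]

def indicators(region, categories):
    """Return one 0/1 indicator per category for a single case."""
    return [1 if region == c else 0 for c in categories]

# Hypothetical cases, just to exercise the coding.
cases = ["South", "West", "Midwest", "Northeast", "West"]

# All four indicators: every row sums to 1, duplicating the intercept column.
full = [indicators(r, REGIONS) for r in cases]
row_sums = [sum(row) for row in full]

# Three indicators (West omitted): West is the row of all 0s, so the sums vary.
reduced = [indicators(r, REGIONS[:3]) for r in cases]
reduced_sums = [sum(row) for row in reduced]

print(row_sums)      # [1, 1, 1, 1, 1]
print(reduced_sums)  # [1, 0, 1, 1, 0]
```

Dropping one category keeps all the information, because the omitted region is identified by 0s on the other three.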

Really there is one column here that's of most interest. It's the one that says Standardized Coefficients Beta. It's third from the right. There's an inferential statistic next to it, the t-test, and then there's a significance value (Sig.) next to that. What I really want is the beta coefficients, because those are the ones that are most comparable to correlation coefficients. And then I'm going to indicate the statistical significance by highlighting the ones that are significant. I'm also going to use some of the information from the two tables above that, the Model Summary and the ANOVA.

I'll show you those in a moment. So what I'm going to do is right-click on my Coefficients table, copy it, and go to the same Excel spreadsheet that I used for modifying the correlation coefficients, except that this time I'm going to start with the second sheet. I'll go to B1 and paste the results in, again because that leaves room for me to put in a column so I can reconstitute the order if I need to. And then I'm going to start getting rid of some information. I don't need this merged cell that says Coefficients on the top.

I don't need this giant merged cell that says Model here on the side. And then I don't need this one that says t and I don't need the Unstandardized Coefficients. So these are the ones in the original metric, but I'm just going to leave those out for right now, because the standardized coefficients, which are also called the Beta Weights, are the ones that are most easily compared with the correlation coefficients. Now the Constant, the Intercept term, doesn't have a standardized regression Beta Weight.
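
For anyone wondering how the two kinds of coefficients relate, here is a small Python sketch under stated assumptions. The function names and the numbers are made up for illustration and are not SPSS output. A standardized coefficient is the unstandardized one rescaled into standard-deviation units, and with a single predictor it works out to the same value as the correlation.

```python
from statistics import mean, stdev

def slope(x, y):
    """Unstandardized least-squares slope for one predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def standardized_beta(b, x, y):
    """Rescale a slope into standard-deviation units: beta = b * sd(x) / sd(y)."""
    return b * stdev(x) / stdev(y)

def pearson_r(x, y):
    """Pearson correlation, written out so the sketch needs no extra packages."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Made-up numbers, only to exercise the functions.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

b = slope(x, y)
beta = standardized_beta(b, x, y)
r = pearson_r(x, y)
print(round(b, 3), round(beta, 3), round(r, 3))
```

In a multiple regression the same rescaling applies to each predictor's slope, but the result no longer has to match that predictor's own correlation with the outcome.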

That's fine, so we can just leave that out. And in fact, what I'm going to do is put Predictor and Beta here, and then I'm going to put p right here, and I don't need a row for the Intercept. That way I can delete these merged cells up here and I have just these ones left. I don't need to worry too much about the formatting of the labels here, because I'm going to use the ones on the other page.

In the last one, I highlighted everything that was statistically significant at the .05 level. I'm also going to highlight the ones here that are statistically significant. An easy way to do that is to come in here to the p values and sort. And so now all the small p values, the ones that are statistically significant, are right here. And then I can highlight those, and then, if all goes well, I can sort them back into the original order. Now I can delete the p values. All I need are these ones, and I'm going to copy those and go to the first page, where I have my correlation coefficients.
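
The sort, highlight, and re-sort routine can be sketched in Python like this. The rows are placeholder values with generic names, not the lesson's actual output, and the asterisk stands in for the yellow highlight. The order column plays the same role as the extra spreadsheet column that lets me reconstitute the original order.

```python
ALPHA = 0.05  # the .05 cutoff used for highlighting

# (original_order, predictor, beta, p): placeholder values for illustration only.
rows = [
    (1, "predictor_a", -0.10, 0.520),
    (2, "predictor_b", 0.85, 0.001),
    (3, "predictor_c", 0.05, 0.730),
    (4, "predictor_d", -0.40, 0.020),
]

# Step 1: sort by p so the significant rows are grouped at the top.
by_p = sorted(rows, key=lambda row: row[3])

# Step 2: flag ("highlight") each row whose p value is below the cutoff.
flagged = [row + (row[3] < ALPHA,) for row in by_p]

# Step 3: sort back into the original order, then drop the p values.
restored = sorted(flagged, key=lambda row: row[0])
table = [(name, beta, sig) for _, name, beta, _, sig in restored]

for name, beta, sig in table:
    print(f"{name:12s} {beta:6.2f} {'*' if sig else ''}")
```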

And I just want to make sure that everything is in the same order. It is. I need to label these as correlations and these as beta coefficients. A beta coefficient is a standardized regression coefficient, and then here I've got Predicting SPSS. And now I'm going to remove the borders that I put in earlier, and I'll get those all centered.

Here's an interesting thing. The correlations and the beta coefficients, I'm going to change the decimal places here, are approximately the same thing. Now what's interesting about putting the correlation coefficients in one column and the beta coefficients next to them is you can see actually that there's a huge contrast between the two of these. In the correlations, we had three variables that individually had high correlations with the relative interest in SPSS as a Google search term. They were Business Intelligence, Data Visualization, and the proportion of a state's population that had degrees.

All three of those are significantly and positively correlated, and the age and the region variables were not. However, when we go over to the regression results, we get a very different pattern. For one thing, Business Intelligence is no longer a significant predictor. It's gone negative, but it's not significant, so we'll treat it as functionally 0. Degree has also gone negative, but it's not significant. Data Visualization, on the other hand, is still statistically significant, and it has actually gone much, much higher.

Beta coefficients are like correlations in that they usually go from 0 to 1, and they can be positive or negative. This is almost as strong as it can be. Data Visualization becomes a huge predictor. And then what's really shocking is that these three region variables, which individually had no correlation with interest in SPSS, have all become statistically significant in the regression. What this lets us know is that region as a whole does matter and, mostly because the three of these are contrasted with the West, we would want to look at the relative interest in SPSS in the four regions.

The other thing to keep in mind is that the correlation coefficients are valid individually. The correlation of Business Intelligence with SPSS of .49 is calculated on its own. The next one down, between Data Visualization and SPSS, where we have a correlation of .60, is also calculated on its own. However, for the regression, the seven beta coefficients are calculated simultaneously. If we removed any one of these, all of the others would change. They're taken as a combination, and their values and their probability values are only valid when taken as a group.
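
A two-predictor sketch in Python shows why the betas depend on each other. It uses the textbook formula for standardized coefficients with two predictors. The .49 and .60 are the correlations quoted above, but the correlation between the two predictors is not given in the lesson, so the .80 here is an assumption for illustration, and the real model has seven predictors rather than two.

```python
def two_predictor_betas(r_y1, r_y2, r_12):
    """Standardized betas for an outcome regressed on two predictors.

    r_y1, r_y2: each predictor's correlation with the outcome.
    r_12: the correlation between the two predictors.
    """
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2

r_bi, r_dv = 0.49, 0.60  # correlations with SPSS quoted in the lesson
r_between = 0.80         # ASSUMED value; not reported in the lesson

beta_bi, beta_dv = two_predictor_betas(r_bi, r_dv, r_between)
uncorrelated = two_predictor_betas(r_bi, r_dv, 0.0)

print(round(beta_bi, 2), round(beta_dv, 2))
print(uncorrelated)
```

When the predictors are uncorrelated, each beta equals its own correlation. Once they overlap, each beta reflects only what that predictor adds beyond the other, which is why changing the set of predictors changes every coefficient.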

And so that's one of the reasons why you can get very different patterns when you put in a linear regression result versus a correlation. Now there's one other thing I want to add for the linear regression. And that is this thing up here, under Model Summary, where it gives the R Squared. That is an indication of the proportion of variance in the outcome variable, which is SPSS searches, that can be predicted by the combination of the other variables. What we have here is an R Squared of .589, and what that means is that nearly 60% of the variance in SPSS searches can be predicted by these other seven variables collectively.
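
As a rough sketch of what R Squared measures, here is the usual calculation in Python: one minus the ratio of leftover (residual) variation to total variation. The function name and the observed and predicted values are placeholders for illustration. Only the .589 comes from the lesson.

```python
def r_squared(observed, predicted):
    """Proportion of variation in the outcome accounted for by the predictions."""
    mean_y = sum(observed) / len(observed)
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total

# Placeholder values, not the Google search data.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 7.0, 7.0]

r2 = r_squared(observed, predicted)
perfect = r_squared(observed, observed)          # predictions match exactly
no_better = r_squared(observed, [5.0] * 4)       # always predicting the mean

# The lesson's value, shown as a percentage the way it appears in the table.
as_percent = f"{0.589:.1%}"
print(r2, perfect, no_better, as_percent)
```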

So I'm going to take that .589, I'm just going to insert a row, and I'll label it R Squared, and I'm going to put down the .589. I'll just round it off right now, and you can actually put that down as a percentage. I'll change that one to a percentage, and I'm going to leave it highlighted in yellow, because it is statistically significant. What that means is it's different from 0, and the way I can tell that is by the result in the next table, the Analysis of Variance table, where the model as a whole has a significance value that's displayed as .000 here, which means less than .001.
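
The F test in the ANOVA table can be tied back to R Squared with a short Python sketch. The .589 and the seven predictors come from the lesson, but the number of cases is not stated here, so n = 50 is an assumption, and the function name is hypothetical. Getting the p value itself needs an F distribution, which is left to SPSS.

```python
def f_from_r_squared(r2, k, n):
    """Overall F statistic for a regression with k predictors and n cases."""
    df_model = k
    df_residual = n - k - 1
    f_stat = (r2 / df_model) / ((1 - r2) / df_residual)
    return f_stat, df_model, df_residual

# n=50 is an ASSUMPTION for illustration; the lesson does not give the case count.
f_stat, df1, df2 = f_from_r_squared(0.589, k=7, n=50)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

A larger R Squared, or more cases for the same R Squared, gives a larger F, which is what makes the model as a whole significant.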

And so I know that that R Squared value of .589 is statistically significant. What I have here is a result that says that those seven variables collectively predict a lot of the interest in SPSS as a Google search term. What's funny about it is that the pattern changes dramatically from the individual correlations to the combined regression coefficients. And it's not the case that one of these is accurate and the other is inaccurate. They are both accurate; they are just very different perspectives on the issue, the individual versus the group predicting.

Anyhow, this can be one step in trying to tell an analytic story about your data. It can get complicated. It can require some insight and some judgment in how best to interpret it. But this is a way of taking a huge amount of numbers and a huge number of tables and boiling them down to a very small, concise way of presenting the results that I think makes it much easier for you to articulate your story, your vision of your data analysis.

SPSS Statistics Essential Training (2011)

52 video lessons · 20121 viewers

Barton Poulson
Author

 
  1. 2m 58s
    1. Welcome
      1m 5s
    2. Using the exercise files
      40s
    3. Using a different version of the software
      1m 13s
  2. 19m 0s
    1. Taking a first look at the interface
      11m 49s
    2. Reading data from a spreadsheet
      7m 11s
  3. 21m 54s
    1. Creating bar charts for categorical variables
      7m 18s
    2. Creating pie charts for categorical variables
      2m 54s
    3. Creating histograms for quantitative variables
      5m 45s
    4. Creating box plots for quantitative variables
      5m 57s
  4. 33m 10s
    1. Recoding variables
      5m 33s
    2. Recoding with visual binning
      5m 33s
    3. Recoding by ranking cases
      5m 26s
    4. Computing new variables
      5m 37s
    5. Combining or excluding outliers
      5m 21s
    6. Transforming outliers
      5m 40s
  5. 28m 12s
    1. Selecting cases
      6m 44s
    2. Using the Split File command
      5m 12s
    3. Merging files
      5m 33s
    4. Using the Multiple Response command
      10m 43s
  6. 22m 14s
    1. Calculating frequencies
      8m 43s
    2. Calculating descriptives
      5m 31s
    3. Using the Explore command
      8m 0s
  7. 16m 3s
    1. Calculating inferential statistics for a single proportion
      6m 6s
    2. Calculating inferential statistics for a single mean
      5m 39s
    3. Calculating inferential statistics for a single categorical variable
      4m 18s
  8. 30m 43s
    1. Creating clustered bar charts
      7m 10s
    2. Creating scatterplots
      5m 8s
    3. Creating time series
      3m 24s
    4. Creating simple bar charts of group means
      4m 17s
    5. Creating population pyramids
      3m 0s
    6. Creating simple boxplots for groups
      3m 3s
    7. Creating side-by-side boxplots
      4m 41s
  9. 45m 28s
    1. Calculating correlations
      8m 17s
    2. Computing a bivariate regression
      6m 27s
    3. Creating crosstabs for categorical variables
      6m 34s
    4. Comparing means with the Means procedure
      6m 33s
    5. Comparing means with the t-test
      6m 4s
    6. Comparing means with a one-way ANOVA
      6m 30s
    7. Comparing paired means
      5m 3s
  10. 24m 30s
    1. Creating clustered bar charts for frequencies
      6m 34s
    2. Creating clustered bar charts for means
      3m 45s
    3. Creating scatterplots by group
      4m 13s
    4. Creating 3-D scatterplots
      4m 25s
    5. Creating scatterplot matrices
      5m 33s
  11. 30m 57s
    1. Using Automatic Linear Models
      11m 52s
    2. Calculating multiple regression
      9m 3s
    3. Comparing means with a two-factor ANOVA
      10m 2s
  12. 29m 29s
    1. Formatting descriptive statistics
      6m 1s
    2. Formatting correlations
      7m 49s
    3. Formatting regression
      10m 19s
    4. Exporting charts and tables
      5m 20s
  13. 51s
    1. What's next
      51s
