
Selecting cases

From: SPSS Statistics Essential Training (2011)

Video: Selecting cases

When you're doing an in-depth investigation of your data, there are times when you'll want to focus on just some of the cases, for example, all of the men over 50 who visited your website, or clients with outstanding payments, or people under 16 who have taken the SAT. Now, one way to deal with this is to sort the data and then delete all the cases that you don't want and save it as a new data file. This is an option, but it can get cumbersome, and you do run the risk of multiplying data files or losing track of what you've got. An easier way is to have SPSS select the cases of interest, and when this happens, the other cases are still in the data set, but are temporarily excluded from the procedures, and you can then switch to different selection criteria or you can return to the entire data set.


It's a more flexible and efficient way of working with interesting subgroups in your data. For this example I am going to be using the data set Searches.sav, which is information about Google searches on a state-by-state basis. The first several searches all have to do with statistical topics, for instance the SPSS Google search term or regression, and then I have some social media ones, and then I have some sports ones. One that's interesting at the right end of the data set--so I am going to scroll over--is an indication of whether a state has an outline for a high school statistics class, and maybe I would want to restrict my analyses temporarily to states that have this to see, for instance, if that's associated with their Google search patterns for statistical topics.

So the way that I am going to do this is to select cases. I go up to the Data menu, and then I come down to the bottom to Select Cases. The dialog box gives me several options. The first one is to simply include all the cases, which is what I have right now. The second one is If condition is satisfied. The idea here is to select cases that have, say, a score on one variable that is equal to a particular value, and the condition can involve more than one variable. This is the one I am going to use. I am going to select on whether states have the statistics education outline. That's going to be statistics_ed = 1.

I will show that to you in just a second. I also have the option of using a random sample of cases. If I have a large data set, sometimes it's a good idea to try an analysis on a small part of it, let's say 20% or 30% or 40%, and then try again with other parts of the data to see if the patterns I found hold there. I can also select based on a time or case range, for instance all the customers from 2009 or from 2007. And the last one is Use a filter variable. What happens is that when I do a selection, SPSS automatically creates an indicator variable at the end of the data set.

So if I have one already, this simply gives me the option of using that existing filter variable. The section below that, Output, is grayed out because I haven't done a selection yet, so I can't use those options. So what I am going to do right now is select If condition is satisfied, and then click on the If button to say what my criteria are for the selection. What I want to use here is the variable about whether a state has a high school curriculum for statistics. That's near the bottom of the variable list on the left.
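The random-sample option mentioned above can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the function name `sample_cases`, the seed, and the made-up case IDs are hypothetical, and the sketch draws an exact fraction, which is not necessarily how SPSS draws its sample. It returns both the sample and the remaining cases, so that a pattern found in one part can be checked against the rest.

```python
import random


def sample_cases(cases, fraction, seed=None):
    """Return (sample, holdout): a random fraction of the cases, and the rest."""
    rng = random.Random(seed)
    k = round(len(cases) * fraction)
    chosen = set(rng.sample(range(len(cases)), k))
    # Preserve the original case order in both parts.
    sample = [c for i, c in enumerate(cases) if i in chosen]
    holdout = [c for i, c in enumerate(cases) if i not in chosen]
    return sample, holdout


# Made-up case IDs standing in for the rows of a large data set.
cases = list(range(1, 51))
sample, holdout = sample_cases(cases, 0.20, seed=1)
print(len(sample), len(holdout))  # 10 40
```

Because the holdout is kept, the same analysis can be rerun on it to see whether the pattern holds there.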

I can simply double-click on that and it puts it up in the Selection box. Now, my selection in this case is very easy. This is a 0/1 variable. It's called a dichotomous indicator variable. It only has two values. And I just want the 1s, so after statistics_ed, which is already there, I am going to type = 1. Once I've got that, I can go to the bottom and click Continue, and the condition shows up under If condition is satisfied in the dialog box. Now, the options at the bottom in Output show up. The first one is to simply filter out the unselected cases. It's the default.

It's what I am going to use here. But I do have two other options that allow me to change the data set. The second one, Copy selected cases to a new data set, does exactly that. It creates a second data set. I have to give a name for that data set. And then if I want to work with just that one, it can be easier. Or I can get rid of the cases that I didn't select. There may be situations in which I want to do that. You can call that destructive editing. I usually just filter out the unselected cases, but it's up to you. So now that I have got my criteria specified by what I am selecting and what I am going to do with the unselected cases, I simply press OK.
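The difference between the three Output options can be sketched in Python. This is a rough analogy, not SPSS's implementation: the function names, the `filter_` flag, and the three rows with placeholder state codes are all made up for illustration. Only the condition statistics_ed = 1 comes from the walkthrough.

```python
import copy


def matches(case):
    # Selection condition from the walkthrough: statistics_ed = 1.
    return case["statistics_ed"] == 1


def filter_out_unselected(cases):
    """Default option: keep every case, but flag which ones are selected."""
    for case in cases:
        case["filter_"] = 1 if matches(case) else 0
    return cases


def copy_selected(cases):
    """Second option: build a separate data set; the original is untouched."""
    return [copy.deepcopy(c) for c in cases if matches(c)]


def delete_unselected(cases):
    """Third option: destructive; unselected cases are removed in place."""
    cases[:] = [c for c in cases if matches(c)]
    return cases


# Made-up rows; the state codes are placeholders, not real data.
data = [
    {"state_code": "AA", "statistics_ed": 1},
    {"state_code": "BB", "statistics_ed": 0},
    {"state_code": "CC", "statistics_ed": 1},
]

filter_out_unselected(data)
print(len(data), [c["filter_"] for c in data])  # 3 [1, 0, 1]

subset = copy_selected(data)
print(len(subset), len(data))  # 2 3

delete_unselected(data)
print(len(data))  # 2
```

Only the last call loses information, which is why filtering is the safer default.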

Now the output file shows me the syntax statements that it has used to create the selection. It doesn't show any charts here, because we don't have them. But if I go to the data file, you can see that on the left the row numbers of a lot of the cases are crossed out, because not too many states have a high school statistics curriculum. Also, on the right side you can see there's a new variable there, Filter_$, that says Selected or Not Selected. That's a 0/1 variable. If I turn off the value labels with the button on the toolbar, you can see that those are 0s and 1s underneath, but I will turn the labels back on now by clicking on the Value Labels button.
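As a loose Python illustration of the filter variable described above: the 0/1 codes are what is stored, the labels are only a display layer, and a procedure that respects the filter uses the rows coded 1. The names `add_filter`, `show`, and `filtered_mean`, the `filter_` key, and the data values are hypothetical.

```python
from statistics import mean

# Display labels for the stored 0/1 codes.
VALUE_LABELS = {0: "Not Selected", 1: "Selected"}


def add_filter(cases, condition):
    """Store a 0/1 indicator on every case; no case is removed."""
    for case in cases:
        case["filter_"] = int(condition(case))
    return cases


def show(cases, use_labels=True):
    """Return the filter column as labels, or as the underlying codes."""
    return [VALUE_LABELS[c["filter_"]] if use_labels else c["filter_"] for c in cases]


def filtered_mean(cases, variable):
    """Average a variable over the selected cases only."""
    return mean(c[variable] for c in cases if c["filter_"] == 1)


# Made-up rows; codes and search scores are placeholders, not real data.
rows = [
    {"state_code": "AA", "statistics_ed": 1, "spss": 40.0},
    {"state_code": "BB", "statistics_ed": 0, "spss": 70.0},
    {"state_code": "CC", "statistics_ed": 1, "spss": 60.0},
]
add_filter(rows, lambda c: c["statistics_ed"] == 1)

print(show(rows))                    # ['Selected', 'Not Selected', 'Selected']
print(show(rows, use_labels=False))  # [1, 0, 1]
print(filtered_mean(rows, "spss"))   # 50.0
```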

So anything I do is going to work only with the cases that I have selected, which in this case are states with a high school statistics education curriculum. I will make a box plot, for example, of their SPSS searches. I click on Graphs, to Chart Builder, and then in the gallery on the bottom I go to Boxplot, and I am simply going to drag the one-dimensional box plot up into the canvas. And from there, I drag in the variable from the list that I want. I am going to take SPSS and drag that into the X axis.

Also, because I may have outliers here, it's nice to have an ID to know what states they are. I can go down to the Group/Point ID tab, I can select Point ID label on the bottom, and then I need to drag in the variable that provides the labels. In this one it's the state code. So I come up to the variable list and drag the state code over, and now I am ready. I click OK. I first get a bunch more code that's the syntax for what I have done. There is the GGRAPH command that gives the data set, and then here is the box plot.
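To make the role of the point ID concrete, here is a minimal Python sketch of how outliers might be picked out and labelled. It assumes the common convention of flagging values more than 1.5 times the interquartile range beyond the quartiles, and it uses Python's default quartile method, so the cutoffs may not match SPSS's exactly. The function name `label_outliers` and all data values are made up.

```python
from statistics import quantiles


def label_outliers(cases, variable, id_variable, k=1.5):
    """Return the IDs of cases lying more than k * IQR outside the quartiles."""
    values = [c[variable] for c in cases]
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [c[id_variable] for c in cases if c[variable] < low or c[variable] > high]


# Made-up search scores with placeholder state codes, not real data.
scores = [10, 12, 11, 13, 12, 14, 11, 40]
codes = ["AA", "BB", "CC", "DD", "EE", "FF", "GG", "HH"]
selected = [{"state_code": s, "spss": v} for s, v in zip(codes, scores)]

print(label_outliers(selected, "spss", "state_code"))  # ['HH']
```

Without the ID variable, the plot would show that an outlier exists but not which case it is.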

This shows the distribution of Google search patterns in terms of how common that particular search is relative to others for several different locations, and you can see we have an outlier, it's Washington, D.C. up at the top, and they search for this term SPSS much more than other states do. So anyhow, what I have here is selection criteria, the ability to temporarily or permanently select a subset of cases for a more thorough analysis, and this is a great feature of SPSS.

It lets you really dive into your data and get the most out of it. In the next movie we'll look at a related procedure called Split File that also lets you work with subsets, but instead of reporting on just one subgroup at a time, it gives the results for all of them so you can make comparisons between the subgroups.


This video is part of

SPSS Statistics Essential Training (2011)

52 video lessons · 19985 viewers

Barton Poulson
Author

 
  1. 2m 58s
    1. Welcome
      1m 5s
    2. Using the exercise files
      40s
    3. Using a different version of the software
      1m 13s
  2. 19m 0s
    1. Taking a first look at the interface
      11m 49s
    2. Reading data from a spreadsheet
      7m 11s
  3. 21m 54s
    1. Creating bar charts for categorical variables
      7m 18s
    2. Creating pie charts for categorical variables
      2m 54s
    3. Creating histograms for quantitative variables
      5m 45s
    4. Creating box plots for quantitative variables
      5m 57s
  4. 33m 10s
    1. Recoding variables
      5m 33s
    2. Recoding with visual binning
      5m 33s
    3. Recoding by ranking cases
      5m 26s
    4. Computing new variables
      5m 37s
    5. Combining or excluding outliers
      5m 21s
    6. Transforming outliers
      5m 40s
  5. 28m 12s
    1. Selecting cases
      6m 44s
    2. Using the Split File command
      5m 12s
    3. Merging files
      5m 33s
    4. Using the Multiple Response command
      10m 43s
  6. 22m 14s
    1. Calculating frequencies
      8m 43s
    2. Calculating descriptives
      5m 31s
    3. Using the Explore command
      8m 0s
  7. 16m 3s
    1. Calculating inferential statistics for a single proportion
      6m 6s
    2. Calculating inferential statistics for a single mean
      5m 39s
    3. Calculating inferential statistics for a single categorical variable
      4m 18s
  8. 30m 43s
    1. Creating clustered bar charts
      7m 10s
    2. Creating scatterplots
      5m 8s
    3. Creating time series
      3m 24s
    4. Creating simple bar charts of group means
      4m 17s
    5. Creating population pyramids
      3m 0s
    6. Creating simple boxplots for groups
      3m 3s
    7. Creating side-by-side boxplots
      4m 41s
  9. 45m 28s
    1. Calculating correlations
      8m 17s
    2. Computing a bivariate regression
      6m 27s
    3. Creating crosstabs for categorical variables
      6m 34s
    4. Comparing means with the Means procedure
      6m 33s
    5. Comparing means with the t-test
      6m 4s
    6. Comparing means with a one-way ANOVA
      6m 30s
    7. Comparing paired means
      5m 3s
  10. 24m 30s
    1. Creating clustered bar charts for frequencies
      6m 34s
    2. Creating clustered bar charts for means
      3m 45s
    3. Creating scatterplots by group
      4m 13s
    4. Creating 3-D scatterplots
      4m 25s
    5. Creating scatterplot matrices
      5m 33s
  11. 30m 57s
    1. Using Automatic Linear Models
      11m 52s
    2. Calculating multiple regression
      9m 3s
    3. Comparing means with a two-factor ANOVA
      10m 2s
  12. 29m 29s
    1. Formatting descriptive statistics
      6m 1s
    2. Formatting correlations
      7m 49s
    3. Formatting regression
      10m 19s
    4. Exporting charts and tables
      5m 20s
  13. 51s
    1. What's next
      51s
