In the last movie, we took a look at a really handy procedure for selecting subgroups of your data for a more focused analysis: the Select Cases command, which works through a filter variable. In this movie, we will explore a related procedure called Split File that also breaks the data down by subgroups, but unlike the Select Cases command, it then gives you the results for all of the subgroups, and it lets you make explicit comparisons between the groups, which can be a really handy feature. Now when we left the data set, I had some of the cases selected and some of them not.
You can tell that this is the case because over on the left a bunch of the rows are crossed out. Also, you see that on the right end of the data set, I have a variable called filter_$, with the values Not Selected and Selected. And at the very bottom right of the screen you see that it says Filter On. This is an indication that the filter, the selection criterion, is active. So before I go on to do a Split File, I need to turn off the selection.
I go back up to the Data menu, to Select Cases, and then at the top of the box I simply click on All Cases. I don't have to erase the criterion. It's okay if it's still there, and I press OK. Then in the output, it tells me that the filter is off and I'm now using all the cases. If I go back to the data, you can see that none of the cases are crossed out and that down here on the bottom right the Filter On is not there anymore. The variable that created the filter is still there if I want to use it later, but now I am going to create a Split File where I can compare several groups.
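To make the idea concrete, here is a small Python sketch of what turning the filter off amounts to. It is only an illustration, not how SPSS works internally: the rows are made up, and the function name active_cases is hypothetical. The one thing taken from the data set is the filter_$ variable, which holds 1 for Selected and 0 for Not Selected.

```python
def active_cases(rows, filter_on, flag="filter_$"):
    """Return the cases an analysis would use.

    With the filter on, only rows flagged 1 (Selected) are used.
    With the filter off, every row is used, and the flag variable
    simply stays in the data for later.
    """
    if not filter_on:
        return list(rows)
    return [row for row in rows if row[flag] == 1]


# Made-up rows for illustration only.
rows = [
    {"state": "State A", "filter_$": 1},
    {"state": "State B", "filter_$": 0},
    {"state": "State C", "filter_$": 1},
]

print(len(active_cases(rows, filter_on=True)))   # cases used with Filter On
print(len(active_cases(rows, filter_on=False)))  # cases used after All Cases
```

Notice that switching to All Cases does not touch the filter_$ values. It only stops them from being used.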
To do this, I am going to go back up to Data and I am going to go down to the bottom to Split File, which is right next to Select Cases. In this dialog box, I have three options for Split File. The first one is Analyze all cases, do not create groups. That's what I have now. That's the default. The next two, Compare groups and Organize output by groups, determine how things will look if I request several procedures, or a procedure that has a lot of output. The first one, Compare groups, puts the results for each step, from each of the groups, right next to each other.
So for instance, if I have tables and charts, I get the tables for group 1, then the tables for group 2, then the chart for group 1 and then the chart for group 2. On the other hand, Organize output by groups would do the tables and the charts for group 1, then the tables and the charts for group 2. I'm going to use Compare groups in this case. It's a personal preference. From time to time, I might use the other one, and it's up to your judgment. I click on Compare groups and then I choose the variable that I'm going to use to split the groups. In this one, I'm going to use the region of the United States.
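The difference between the two options is really just the order in which the pieces of output appear. Here is a minimal Python sketch of that ordering, following the description above. The function name output_order and the mode labels "compare" and "organize" are made up for this illustration and are not SPSS names.

```python
def output_order(groups, procedures, mode):
    """List (group, output) pairs in the order they would appear."""
    if mode == "compare":
        # Compare groups: one kind of output at a time, across all groups.
        return [(g, p) for p in procedures for g in groups]
    if mode == "organize":
        # Organize output by groups: everything for one group, then the next.
        return [(g, p) for g in groups for p in procedures]
    raise ValueError("mode must be 'compare' or 'organize'")


groups = ["group 1", "group 2"]
procedures = ["tables", "chart"]

for mode in ("compare", "organize"):
    print(mode, output_order(groups, procedures, mode))
```

Both modes produce exactly the same pieces of output. Only the sequence changes, which is why the choice comes down to preference.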
So I need to scroll down on my variable list and if I make the box wider, you can see, I have Census Bureau Region. That's the label. The variable name is Region. I will just double-click on that, and there it is, in the Groups Based on box. So I've got the criterion in there, and by the way, you can put more than one in there if you want to split it by two variables, but then things get rather complicated. So I'm just going to press OK now, and now in the Results it tells me that it has sorted the data file by its region and that it's now going to split things by the region.
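The reason the file gets sorted first is that the cases in each group need to sit together before they can be handled one group at a time. As a rough sketch of that sort-then-split idea in Python, assuming made-up rows and a hypothetical function called split_file:

```python
from itertools import groupby
from operator import itemgetter


def split_file(cases, key):
    """Sort the cases by the grouping variable, then split them into groups."""
    # groupby only merges neighbouring rows, so the sort has to come first.
    ordered = sorted(cases, key=itemgetter(key))
    return {value: list(rows) for value, rows in groupby(ordered, key=itemgetter(key))}


# Made-up rows for illustration only.
cases = [
    {"state": "State A", "region": "West"},
    {"state": "State B", "region": "Midwest"},
    {"state": "State C", "region": "West"},
]

for region, rows in split_file(cases, "region").items():
    print(region, [row["state"] for row in rows])
```

To split by two variables, the same idea applies with a combined key, and the number of groups multiplies, which is where things get complicated.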
If I go back to the data set, nothing is crossed out, because I'm using everything, but you can see that Region is sorted here in this column. And if you go to the very bottom right of this screen, you'll see that it says Split by region, so I know that it's going to run whatever I request separately for each group. So what I'm going to do now is request some information. I am just going to do histograms. I go to Graphs and to Chart Builder. Now I am going to come down to Histogram. I am going to drag the basic histogram up into the canvas and then I select the variable.
I will use the SPSS Google Search. So I click on that and drag it to the X axis in the canvas, and from there I can simply press OK. And then in my results what you see is I have several histograms. These are very large, chunky ones because there are not a lot of cases in them, but this is the first one. This is for the Northeast region of the United States. Then this is for the Midwest, and this has more bars because there are more cases. If we come back up, you see there are only nine states in the Northeast region, and this one has 12, and then we have the South and then the West.
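What happened here is that one histogram request was repeated for each region. The Python sketch below mimics that with simple bin counts. The rows, the variable name searches, and the bin width of 10 are all invented for illustration and are not values from this data set.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def histograms_by_group(rows, group_key, value_key, bin_width):
    """Build one histogram per group, as a split file would."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[value_key])
    return {group: histogram(values, bin_width) for group, values in groups.items()}


# Made-up rows for illustration only.
rows = [
    {"region": "Northeast", "searches": 12},
    {"region": "Northeast", "searches": 17},
    {"region": "Midwest", "searches": 23},
    {"region": "Midwest", "searches": 28},
    {"region": "Midwest", "searches": 41},
]

for region, bins in histograms_by_group(rows, "region", "searches", 10).items():
    print(region, bins)
```

A group with only a few cases can fill only a few bins, which is why the histograms for the smaller regions look so chunky.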
So what it's done is run a single procedure, but separately for each of these particular groups. I can run much more complicated procedures, which we'll cover later in the course, and break them down by region or by some other variable, or combination of variables. So the Split File command, along with the Select Cases command, is a great way to focus on subgroups and get a deeper understanding of your data, and by comparing the results for one group to the next, you can see whether the patterns you find hold across groups or whether you should dive even deeper into your data.