Join Barton Poulson for an in-depth discussion in this video Reading data from a spreadsheet, part of SPSS Statistics Essential Training.
- While it's possible to enter data directly into SPSS, or download it in the SPSS.sav format, data sets will usually come to you in other formats such as database files, text files, or frequently as spreadsheets. There are actually advantages to this. These other programs, such as spreadsheets, create files that are easier to share than are SPSS files. And SPSS is well set up to import data from each of these formats. In this movie, I'll show you how to work with spreadsheets in Microsoft Excel format.
At the end of this movie, I'll point you to SPSS's excellent instructions on tutorials on importing data from other sources as well, such as databases, which have been an area of recent progress for SPSS. What I'm showing you right here is a SPSS.sav file. It's one of the sample data sets, and it comes in the SPSS format. What's interesting, though, is SPSS also provides this same data set in a couple of other formats. I'm going to go to the Explorer window, Finder window on a Mac, where the files are located.
As a reminder, the sample files are located in the C drive on your Program Files, IBM, SPSS, Statistics, 22, Samples, English. So it's a long path to get here. Once you see it, you have all of the data sets here in front of you. The one I have open right now is demo. It's actually demo.sav. It's a native SPSS data file. On the other hand, we have the same file available as a Microsoft Access Database, which you'll notice actually is much larger than what we have, or as a Text Document.
I'm going to double-click and open the Text version. This is just a Plain Text Document opening in Notepad, where you see, for instance, looks like we are using tabs to delimit things. There is one row per person and one column. This is, of course, the simplest possible way to enter the data. One that I find works a little better is this right here, the Excel files. In a lot of programs, I would recommend that you not use a straight ahead Excel file, but you would use a .csv file, or comma separated values.
Fortunately, SPSS reads Excel just great, exactly as it is. All you need to know is that if you have more than one tab in Excel, it's going to read the first tab. What I'm going to do right here, I'm going to open up the demo.xls file just for a minute. It's going to open up in Excel. Here you see that we have the same data. I'm going to move this off to the right. And we'll compare it to the data in SPSS. And you can see we have the same information. 55, Married, 12.
Things are set up a little bit differently because over here we don't have labels. Over here, we have the labels shown on the SPSS. Also, the titles are a little different. In the Excel spreadsheet, Column D is labeled Marital Status, and it's two different words. On the other hand, in SPSS, it's just called marital, single word, and if I hover over it, you'll see that then the two word version 'Marital Status' comes up. On the other hand, it's mostly straightforward and its a very easy process to get an Excel file into SPSS.
Let me show you how that works. First thing you need to do is actually close the Excel file. Sometimes it will not import it if the file is open in Excel. I'm going to close that. I'm going to come back to SPSS, and I'm going to go to Open, Data. Now see, I don't have to do anything fancy about Import or anything. I just do Data. And then what I do is right now, it's going to the folder that has all the data sets in it. If I scroll over a little bit, you'll see, for instance, there's the demo.sav file that I have open currently. Now, right now, it is only showing me .sav save files.
That's because I've asked it for File Type. If I click on that, I can pick other things. For instance, if I come down here to Excel, you'll see that it can take .xls, xlsx, or even the .xlsm format. I click on that and the only Excel file that's in this folder is demo.xls. I'm going to double-click on that. It's going to ask me exactly what I want to open. It first needs to know whether I have variable names on the first row of the data. Now it's good form to have that.
And we did, so I just leave that one checked. It's there by default. And then it asks for the range of the data. The first one where it says demo, that's going to be the name of the sheet on the Excel file. And then we have the top left and the bottom right corners of the data. So it goes from A1 over to AB60401, because there's a lot of data here. Then I can simply click OK, and now it has it opened in the SPSS Data Editor window. I'm going to bring this off to the side so you can see that there's a couple of important differences right here from how it is when we opened it in the native SPSS format.
Now the most significant has to do with the variable view. We come here, you'll see that some of these have variable labels already. Let me scroll over just a little bit here. For instance, marital status has a label, but what it's done, is it simply taken the original title of the column in Excel because it has more than one word, it's got a space in it, and it puts that over here as a Label. And if I scroll back here, you'll see it just concatenates it, sticks it together, whereas we have a shorter version over here.
Also the widths and the decimals may be different. There will be no value labels. You'll see that we have a lot of value labels over here, because those have to get set separately. There's no information about missing data. All the columns are 11 characters wide. And the scales are different. Over here, a lot of them needed to be set differently. For instance, let's go to the second variable, Marital Status. If we want this to match what we had in the original file, I need to change that.
It says Nominal, and it actually should be Nominal because it's coded as Yes or No. Are you, are you not married, but if I want it to match what came in the Default one, I just click on that and set it to Scale. Now what's nice is you see we have a whole bunch of other ones down here that are Nominal, you can actually copy that format and then paste it into a punch of other cells. So I can go like this, and there I can paste the format. Same thing would be true if we had any variation on input. On the original data set, they're all listed as Input and it would probably be a good idea to list some of them specifically as Output variables or Target variables.
We haven't done that here. But overall, the import process was really quick and easy, and there is just a little bit of cleanup that needs to be done. Now if you need to work with database files or delimited text files, then what I suggest you do is go to the Help menu, and let's go to Tutorial, and specifially let's go to Tutorial and Reading Data. In here, you're going to have information about reading from a database, reading from a text file, and you can get more information there if that's what you need. The overall message is that spreadsheets are a great way to enter and share data, and SPSS can read Excel format spreadsheets with no trouble.
With the proper preparation, it's a relatively straightforward procedure to import data from a spreadsheet right into SPSS so you can begin exploring your data.
- Build charts, scatterplots, and box plots
- Calculate descriptive statistics such as means and standard deviations
- Use inferential statistics such as t-tests and chi-squares
- Enter and read data
- Create new variables and crosstabulations
- Model associations with correlations, contingency tables, and multiple-regression analysis
- Format and export presentations to share your data
Plus, learn how to extend the power of SPSS with Python and R. This course is ideal for first-time researchers and those who want to make the most of data in their professional and academic work.