In this course, author Barton Poulson takes a practical, visual, and non-mathematical approach to the basics of statistical concepts and data analysis in SPSS, the statistical package for business, government, research, and academic organizations. From importing spreadsheets to creating regression models to exporting presentation graphics, this course covers all the basics, with an emphasis on clarity, interpretation, communicability, and application.
In the last two movies, we looked at two different kinds of displays you can use for categorical variables. We looked at bar charts and we looked at pie charts. On the other hand, you may also have what SPSS calls a scale variable, also called a quantitative, or measured, variable. So for instance the percentage of critics who favorably endorse a movie, or the budget for the movie, or viewer evaluations, these are all measured as more or less quantities, and a bar chart and pie chart won't work for these. Instead, there are generally two kinds of charts that you want to make.
The first one that we're going to do right now is called a histogram, and it's like a bell curve that shows the distribution of scores. Let's look at that one right now. Come to Graphs, to Chart Builder, and from here I come down to Histogram. There are a few variations, but the one that's most informative is the basic one. I grab it out of the gallery and drag it into the chart canvas, and from there I simply need to tell it what variable it is that I want to chart. In this case, I'm going to use Budget.
I'm going to drag that down into the X axis. Now by the way, this is not the real data that SPSS is showing. When it uses a canvas it simply puts in some kind of random data to let you know that it's not producing a pie chart or something. Now I have some options here. One of them is whether I need IDs--I don't think I do--or Titles, and I'm going to put a title on this one. And I'm going to put "Budget for Movies in Movie.sav." And I'll press Apply, and then I'll press OK.
And the Output window first shows the code that produces this one, and you can save that to rerun this later if you want to. It shows the name of the command in SPSS. It's GGraph. It shows the data set that was used to produce this. That's important, especially if you have more than one data set open at a time, and this is the chart as it's produced by default in SPSS. It's called a histogram. You can see we have a whole lot of movies in this data set that have very small budgets. This is $50 million, $100 million, up to a quarter billion there on the scale.
And this tells us that there are about 23 movies with budgets in the lowest range. That makes sense when you consider these are a lot of award-winning movies, like animated shorts that people may not have seen and that don't require a huge budget. On the other hand, this chart is not particularly attractive, and it's got some communication problems. So what I'm going to do is I'm going to double-click on the chart. Then I'm going to take this information right here with the Mean, the Standard Deviation, and Sample Size, and I don't need that in the chart. I may need that information elsewhere, but I don't need it here.
So I'm going to click on it and then I hit Delete. Then I see over here I have frequencies with decimal points on them, and I don't need that there. That's kind of silly. So I can click on that and then come over here to Number Format and I can put it to zero decimal places. Then, here across the bottom, these are millions of dollars and truthfully these numbers are hard to read, because there are so many digits there. What I can do is I can click on that, and I can come to Number Format, and I can go to Scaling Factor here, and I put it as Millions, and I press Apply.
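As a rough illustration of what the Scaling Factor and decimal-place settings do to the axis labels, here is a minimal Python sketch. It is not SPSS code; the function name `format_millions` and the tick values are assumptions made up for this example.

```python
def format_millions(value, decimals=0):
    """Format a raw dollar amount as a label scaled to millions."""
    scaled = value / 1_000_000
    return f"{scaled:.{decimals}f}"

# Illustrative tick positions in raw dollars (assumed, not read from SPSS).
ticks = [0, 50_000_000, 100_000_000, 150_000_000, 200_000_000, 250_000_000]
labels = [format_millions(t) for t in ticks]
print(labels)  # ['0', '50', '100', '150', '200', '250']
```

The same idea applies to the frequency axis: formatting counts with zero decimal places drops the unneeded ".0" from whole numbers.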
And now it's much easier to read, but I need to change this one. It says, Budget. I just click on that and I'll say, Budget in Millions. Now there are two other things I want to do here. Number one is, I find this to be in a very unattractive color, so I'm going to click on it, and since it's money, I might as well use green for my charts. There is a little curiosity here about the fact that we have three bars for every $50 million. Now there are some general guidelines for the number of bins that you should have in a histogram. These are bins, how wide each bar is.
And we've got some gaps here, which means we might need fewer, wider bins to help smooth out the pattern. Again, the idea here is that every chart, including histograms, is meant to be a simplification, an abstraction of the data. It needs to be informative and accurate, but it is a simplification. So sometimes reducing the number of bins can make it easier to see the patterns without getting overwhelmed by the complexities of the real data. So what I'm going to do is I've got these selected already, I'm going to come over to the Properties window, click on Binning, and then I'm going to come down to Custom, Interval Width. And what I'm going to do is I'm going to make it so that there are two bars instead of three for each one of these, so they are each 25 million wide.
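To make the effect of the interval width concrete, here is a small Python sketch of fixed-width binning. It is not what SPSS runs internally; the helper `bin_counts` is hypothetical and the budget values are invented for illustration, not taken from Movie.sav.

```python
from collections import Counter

def bin_counts(values, width):
    """Count values in fixed-width bins, keyed by each bin's lower edge."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical budgets in dollars, invented for this example.
budgets = [2_000_000, 5_000_000, 12_000_000, 20_000_000, 30_000_000,
           45_000_000, 60_000_000, 110_000_000, 150_000_000, 200_000_000]

narrow = bin_counts(budgets, 50_000_000 // 3)  # roughly three bars per $50 million
wide = bin_counts(budgets, 25_000_000)         # two bars per $50 million

# Print a simple text histogram for the 25-million-wide bins.
for edge, n in wide.items():
    print(f"{edge / 1_000_000:>5.0f} | {'#' * n}")
```

With wider bins, the same values land in fewer bars, so the overall pattern is easier to see at the cost of some detail.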
I believe that's 25 million. And now we have just two bars for every $50 million, and it smooths things out a little bit. And what you can see is that most of the movies in this particular data set of award winners and top grossers have budgets between 0 and 25 million. There are some very low-budget movies. Again, these are the short movies and some animated movies, and then we have some very large summer blockbusters with budgets of $150 million or $200 million. It's a good way of seeing what the distribution is like. When I'm done modifying the chart, if I want to, I can come to File and I can save that template and I can use it again later.
I'm not going to do that right now. And then when I'm done editing the chart, I can simply press the X and close the chart, and there's my finished chart that I can export later. And again, a histogram is the first of two charts that you should generally use when you're looking at scale data. The other one, which we'll cover in the next movie, is a box plot, which is ideal for looking at outliers in distributions, which we appear to have in this particular one. But both of these charts are a great way of getting a feel for the shape of the distribution of a scale variable, and they give you a better idea of how well you meet the statistical assumptions of the tests that you're going to be performing later on.
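The Mean, Standard Deviation, and Sample Size that were removed from the chart can be computed separately when they are needed. The following is a minimal Python sketch using the standard library, again with invented budget values rather than the real Movie.sav data. It assumes the sample form of the standard deviation (n - 1 in the denominator), and it adds a mean-versus-median comparison as one informal hint about skew.

```python
import statistics

# Hypothetical budgets in dollars, invented for this example.
budgets = [2_000_000, 5_000_000, 12_000_000, 20_000_000, 30_000_000,
           45_000_000, 60_000_000, 110_000_000, 150_000_000, 200_000_000]

n = len(budgets)
mean = statistics.mean(budgets)
sd = statistics.stdev(budgets)      # sample standard deviation (n - 1)
median = statistics.median(budgets)

print(f"N = {n}")
print(f"Mean = {mean / 1_000_000:.1f} million")
print(f"Std. Dev. = {sd / 1_000_000:.1f} million")

# A mean well above the median suggests a long right tail,
# like the few very large budgets described in the transcript.
print("Right-skewed hint:", mean > median)
```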