# Creating population pyramids

## Video: Creating population pyramids

In the last movie, we looked at how you can create bar charts to show the mean, or maybe the median, for each group on a categorical variable. Sometimes, however, it can be more helpful to see not just a single summary statistic but the entire distribution of scores for each group. One way to do this, provided your categorical variable is a dichotomy, that is, it has just two values, is a variation on the histogram that we looked at back in the section on univariate charts. In this case, what we are going to create is a pair of back-to-back histograms, what SPSS calls a population pyramid.

## Creating population pyramids

For this example, I'm going to be using the Searches.sav data file, and I am going to look at relative interest in NBA as a search term and compare that with whether a state has an NBA team or not. I am going to do this by going up to Graphs, to Chart Builder, and from there I come down to Histogram, because the pyramid plot is a variation on the histogram. This one on the far right, Population Pyramid, I drag up to the canvas. Then I come over to the variable list and scroll down until I find the results for NBA as a Google search term, and I take that over to the distribution variable. We are trying to find out how common that is.

Then I am going to split it by whether the state has an NBA team. That's this variable right here, and I take that up to the split variable, and from there I can just press OK. What we find is that the states that have an NBA team, the ones on the right side in green, tend to have higher scores on relative interest in NBA as a Google search term than the states that don't have NBA teams. For instance, on the right we see that there are two states with relative interest in NBA right around three standard deviations above the mean.
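To make concrete what the chart is doing with the distribution variable and the split variable, here is a minimal sketch in Python rather than SPSS. It assumes the scores are standardized and uses made-up values, not the contents of Searches.sav; the names `pyramid_counts`, `render`, `scores`, and `has_team` are hypothetical and chosen only for illustration. The idea is that both groups are counted against the same set of bins, and the two sets of counts are then drawn on opposite sides of a shared axis.

```python
import math
from collections import Counter


def pyramid_counts(scores, has_team, bin_width=0.5):
    """Count scores per shared bin, separately for each of two groups."""
    left, right = Counter(), Counter()
    for score, flag in zip(scores, has_team):
        # floor() keeps negative scores in the correct bin.
        idx = math.floor(score / bin_width)
        (right if flag else left)[idx] += 1
    used = list(left) + list(right)
    lo, hi = min(used), max(used)
    # One row per bin, highest bin first, like the chart's vertical axis.
    # Each row is (lower edge of bin, count without team, count with team).
    return [(i * bin_width, left[i], right[i]) for i in range(hi, lo - 1, -1)]


def render(rows, width=10):
    """Draw the counts as text bars on either side of the bin labels."""
    lines = []
    for lower, n_left, n_right in rows:
        lines.append(f"{'#' * n_left:>{width}} |{lower:5.1f}| {'#' * n_right}")
    return "\n".join(lines)


# Made-up standardized scores for illustration only (assumption).
scores = [-1.2, -0.9, -0.8, -0.3, 0.1, 0.4, 0.6, 1.1, 2.9, 3.1]
has_team = [False, False, False, False, True, False, True, True, True, True]

rows = pyramid_counts(scores, has_team)
print(render(rows))
```

In this sketch the left-hand bars are the states without a team and the right-hand bars are the states with one. Because the bins are shared, a bar on one side can be compared directly with the bar across from it, which is what makes the back-to-back layout easy to read.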

On the other hand, we see that of the states that don't have NBA teams, a lot of them are below zero, around negative one. So this is a way of looking at two histograms back to back and making the differences between the two groups really obvious. Now if you want to, you can double-click on this chart and change the colors on each side. You can change the bins. You can change the number of decimal places on the side, the same way that we've edited nearly everything else. But this one is probably clear enough as it is.

So a population pyramid, that is, a back-to-back histogram, can be a useful way to compare the distribution of a scale variable across two different groups. Like a regular univariate histogram, it lets you examine the shape of the distribution, lets you check visually for outliers, and lets you identify any possible quirks in the data that might throw off later analyses. In the next movie, we will look at one final display for showing the association between a categorical variable and a scale variable, what's called grouped boxplots.


