## Creating clustered bar charts for means

In the last movie, we looked at how to make a clustered bar chart that shows the association among three categorical variables. In this movie, we'll look at how to show the association between two categorical predictor variables and a single outcome variable that is scale, or quantitative. For example, you may want to show the average purchase price of items bought by men and women in two different retail categories. Surprisingly, this kind of chart is even simpler than the categorical version we just covered, because that one required panels to show all three variables.

With a scale outcome, though, we can use just a single panel. In this example, I'm going to use the General Social Survey data, GSS.sav, again. To make the chart, let's go up to Graphs > Chart Builder. From there, we'll come down to the Gallery, go to Bar, and choose the clustered bar chart. We'll drag that up to the canvas. Our two predictor variables go in two places: one on the X-axis and one in the "Cluster on X: set color" drop zone. The third variable, on the Y-axis, will be our outcome variable.

In this case, I'm going to try to predict family income. That will be my outcome variable, so I'll grab family income and take it over to the Y-axis. I'm going to use two variables to predict it. One is whether a person is male or female; I'll drag that down to the X-axis. The other is whether a person has children or not; I'll bring that over to the cluster drop zone. I think it would also be helpful to put on error bars, so I'll turn those on and click Apply.
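If you'd like to see what the chart is actually computing, here is a minimal sketch in Python rather than SPSS. It groups records by the two categorical predictors and reports the mean of the outcome for each of the four cells, along with a standard error you could use for an error bar. The records are made up for illustration and are not the GSS data, and using one standard error as the bar half-width is my assumption; it may not match the error bar setting used in the movie.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

# Hypothetical records, NOT the GSS data: (sex, has_children, family_income)
records = [
    ("female", "no", 41000), ("female", "no", 36000), ("female", "no", 40000),
    ("female", "yes", 36000), ("female", "yes", 40000), ("female", "yes", 35000),
    ("male", "no", 24000), ("male", "no", 27000), ("male", "no", 24000),
    ("male", "yes", 41000), ("male", "yes", 38000), ("male", "yes", 41000),
]


def cell_summaries(rows):
    """Group by the two categorical predictors and summarize the outcome."""
    cells = defaultdict(list)
    for sex, has_children, income in rows:
        cells[(sex, has_children)].append(income)

    summaries = {}
    for key, values in cells.items():
        # Standard error of the mean: sample SD divided by sqrt(n)
        se = stdev(values) / sqrt(len(values))
        summaries[key] = {"n": len(values), "mean": mean(values), "se": se}
    return summaries


summaries = cell_summaries(records)
for (sex, has_children), s in sorted(summaries.items()):
    print(f"{sex:6s} children={has_children:3s} "
          f"mean={s['mean']:.0f} +/- {s['se']:.0f} (n={s['n']})")
```

Each printed row corresponds to one bar in the clustered chart: the X-axis variable picks the cluster, the second predictor picks the color, and the mean sets the bar height.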

Then I'll come back over here and click OK. A lot of code goes into this, and we can save it and reuse it later if we want. But here's the actual chart. Women are on the left and men are on the right. People who do not have children are in blue and people who do have children are in green, and what's charted on the Y-axis is the mean family income that people reported. What's interesting is that we have an interaction: for women, those who do not have children reported a slightly higher average family income than those who do, although the spread on these, as the error bars show, is pretty big.

For men, on the other hand, the exact opposite is observed: men who have children report a substantially higher family income than those who do not, roughly 40,000 versus 25,000. Now again, all this chart shows is that there is an association between the variables. It doesn't explain why those differences are there. A lot of reasons could go into that, and sorting them out could require some pretty nuanced investigation. Nevertheless, this is a very simple chart that shows how two predictor variables, male/female as one and having children, yes or no, as the other, can be used to predict scores on a third, quantitative or scale variable, in this case family income.
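One way to put a number on the interaction is a difference of differences: take the effect of having children within each sex, then see how much that effect changes from women to men. The sketch below does that in Python. The two values for men are the rough figures quoted above. The two values for women are hypothetical placeholders that I picked only to match "slightly higher without children", so don't read the result as a GSS finding.

```python
# Approximate cell means of family income.
cell_means = {
    ("female", "no"): 39000,   # hypothetical placeholder
    ("female", "yes"): 37000,  # hypothetical placeholder
    ("male", "no"): 25000,     # rough figure read off the chart
    ("male", "yes"): 40000,    # rough figure read off the chart
}


def children_effect(means, sex):
    """Simple effect: mean with children minus mean without, within one sex."""
    return means[(sex, "yes")] - means[(sex, "no")]


def interaction_contrast(means):
    """Difference of differences: how the children effect changes by sex."""
    return children_effect(means, "male") - children_effect(means, "female")


print(children_effect(cell_means, "female"))  # -2000
print(children_effect(cell_means, "male"))    # 15000
print(interaction_contrast(cell_means))       # 17000
```

If there were no interaction, the two simple effects would be about equal and the contrast would be near zero. Here they point in opposite directions, which is the pattern the chart shows. Whether a contrast like this is statistically reliable is a question for an inferential test, not for the chart.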

So clustered bar charts for means are an easy and informative way to show how two categorical predictors are associated with a scale outcome, or with an indicator outcome if you are coding it 0/1. They also give a good idea of what the results of an inferential test would be. This kind of clustered bar chart can be one of the most effective tools you have for exploring, analyzing, and presenting your own data. In the next movie, we'll look at another simple variation, a chart for when you have just one categorical variable and two scale variables: in this case, the scatterplot with group markers.
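As a footnote to the indicator outcome mentioned above: when the outcome is coded 0/1, the mean of each cell is simply the proportion of 1s in that cell, so the same chart of means works as a chart of proportions. Here is a small Python sketch of that idea. The `owns_home` variable and its values are hypothetical, made up purely for illustration.

```python
from statistics import mean

# Hypothetical 0/1 indicator outcome per cell (1 = yes, 0 = no).
# Keys are (sex, has_children); none of this is GSS data.
owns_home = {
    ("female", "no"): [1, 0, 0, 1],
    ("female", "yes"): [1, 1, 1, 0],
    ("male", "no"): [0, 0, 1, 0],
    ("male", "yes"): [1, 1, 0, 1],
}

# The mean of a 0/1 variable is the proportion of cases coded 1.
proportions = {cell: mean(values) for cell, values in owns_home.items()}

for cell, p in sorted(proportions.items()):
    print(cell, p)
```

So a bar height of 0.75 on a chart like this would mean that 75% of the people in that cell were coded 1.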

