- [Instructor] One of the most difficult things to do when you're starting in data visualization is figuring out which charts to use in which situation. Now eventually you're going to want to push the envelope and try different forms, different chart types, really out-there alternative approaches to visualizations. But before that, I'd recommend having a really good handle on the basic charts, and when you might use them. You know the saying, before you can walk you need to learn how to crawl. This video is a high-level overview, but hopefully a really good introduction to the topic of when to use which basic chart forms in which situation.
These charts, by the way, for the most part allow you to easily display one to three variables. So these are the most basic chart types that I'm going to be talking about in this movie. You have bar charts, line charts, area charts, timelines, scatter plots, bubble charts and pie charts. These seven forms are very straightforward and basic and for the most part are forms that people naturally understand. So let's start talking about the bar chart, or you could call it the boring old bar chart. And the reason it's the boring old bar chart is because it is used so widely, because it is so effective.
The fact is that humans have a capacity, a built in capacity to easily parse the differences between these rectangular shapes. We're wired to see this type of chart. So while it is boring and old, the fact of the matter is, it's extremely effective and what I would argue is that anytime you're doing a visualization you should start off by thinking of it as a bar chart and ask yourself, is there a reason that this should not just be a bar chart? It's effective, it's easily understood, you won't be confusing your users, etcetera.
Lack of confusion is a good thing. Now, a bar chart is really great at showing just one or two variables, but you can add more data to it. So here we have what's called a grouped bar chart, so we have two different data points, the gray and the black, and it's very easy to understand still. When you start adding more data points, it can start to get a little bit harder to read. In grouped bar charts, this certainly gets into the category of maybe I want to try a different form. And when you have a whole bunch of them, even if they're separated into different groups in this way, with a lot more spacing etcetera, it can get overwhelming quickly, although it is still decipherable.
If you're trying to emphasize a comparison within groups, if you want to think of the elements of data as part of a whole, like a category, then a stacked bar chart might be a better way to go. Here you can see that the whole bar represents the total value for each group and each segment represents the category value, sort of the proportion within the group of the data. If you want to emphasize not the total value for each but the relative value, so how much each category influences the total value, then what's called a stacked percentage bar might be a good choice.
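To make the difference between the two concrete, here is a minimal sketch of the arithmetic behind them. The group names, category names and values are made up for illustration and are not from the charts in the video. A plain stacked bar draws the raw values, so its height is the group total, whereas a stacked percentage bar rescales each group so its segments add up to 100.

```python
# Hypothetical example data: three groups, each split into three categories.
groups = {
    "Group A": {"red": 30, "green": 50, "blue": 20},
    "Group B": {"red": 10, "green": 10, "blue": 20},
    "Group C": {"red": 45, "green": 15, "blue": 90},
}


def to_percent_segments(categories):
    """Convert raw category values into percentages of the group's total."""
    total = sum(categories.values())
    return {name: 100 * value / total for name, value in categories.items()}


for group, categories in groups.items():
    total = sum(categories.values())          # height of the plain stacked bar
    shares = to_percent_segments(categories)  # segments of the 100% bar
    print(group, "total:", total, "shares:", shares)
```

With data like this, Group B has a much smaller total than Group A, which the stacked bar would show, but its "blue" category makes up a larger share of the group, which only the percentage version makes obvious.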
So here each bar represents 100% of each data point, so each segment within, each color, represents the relative value within the whole as a percentage. It's often easier to see relationships like this in a stacked percentage chart, if I want to show the relative strength of a category within the whole. A bar chart can't convey all types of data. So if you look at these two charts, both are showing changes in values over time.
The problem is that the bar chart really only shows each value at a single point. So for instance, maybe this is telling me the change in value of a stock price every January 1st over X number of years. So I get that sort of snapshot of a moment in time. Whereas the line is really great at showing me the trends over continuous time. Line charts are a great default choice of chart type when you're showing things over time. Information designers telling content-driven stories with time-based elements will often use what I call timelines as a good default paradigm, right? Each one of these dots is on a timeline, at a certain point in time, and might have more information inside of it.
An area chart is like a filled-in line chart. Line charts are often better than filled-in area charts like this because, where the lines cross, the dark gray covers up the lighter gray and the medium gray behind it, and it's hard to tell where the relative values are. I don't know where the bottoms of those troughs are in the light gray. But this is an interesting way of looking at data. There's also what's called a stacked area chart. Here you have the filled-in line charts and they're treated sort of like the stacked bars, where they're on top of each other.
Again, this is good at showing categories of data over time and how they relate to each other. And finally, like the stacked percentage bar chart, we have the stacked percentage area chart, which again, is a really effective way of showing the relative strength of categories across time, or whatever the X axis represents, but as a portion of a whole. So again, the top here is 100%. Another great chart type for showing two variables is called a scatter plot. Scatter plots are great at showing correlation.
So here you can see that as X increases, as things get further to the right, Y also increases. They also tend to go further up. You could have shown this data in a bar chart but you wouldn't as easily see the correlations. This one shows what's called a positive correlation, where as things go up on one axis, they go up on the other axis. This one has a negative correlation. As one increases, the other decreases. Here is a scatter plot with no discernible correlation, but there are some interesting patterns, and certain types of patterns will show up much better in a scatter plot than in a bar chart, for instance.
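If you want a number to go with what the scatter plot shows, one common measure is the Pearson correlation coefficient. The video doesn't name a specific measure, so treat this as one reasonable choice rather than the method behind the slides. The data points below are invented for illustration. A value near +1 corresponds to the upward-sloping cloud of dots, and a value near -1 to the downward-sloping one.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much each varies on its own.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical data: y tends to rise with x in one set and fall in the other.
xs = [1, 2, 3, 4, 5, 6]
rising = [2, 3, 5, 4, 6, 7]
falling = [7, 6, 4, 5, 3, 2]

print("rising: ", pearson_r(xs, rising))   # positive, close to +1
print("falling:", pearson_r(xs, falling))  # negative, close to -1
```

A coefficient near zero only says there is no straight-line relationship. As the scatter plot with no discernible correlation suggests, there can still be patterns worth looking at, which is a good reason to plot the points rather than rely on the number alone.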
Bubble charts are great at showing three variables. It's really just a scatter plot, but now we have a third variable. The size of the dot is representing that third variable. And so we can add more interesting layers to the data by looking at it this way. And once again, in this example, we have correlation. As X goes up, so does the size of the dot, generally. But again it's easy to see the outliers. So we have that one giant dot over on the left hand side, that's something worth investigating, it sort of, it bucks the trend and it's very immediately visible.
Lastly we're going to look at the pie chart, and you can spend a lot of time reading about the intense debate about how worthless the pie chart is. There are plenty of detractors of this form and a few defenders, but really, to my eye, the pie chart has two major problems. One is that it's really hard to parse when there are more than a couple of data points. So here we have two, four, six different pieces of data and it's a little bit hard to parse. I mean I can certainly see the smallest slice and the biggest slice, but three of the medium slices, I can't tell really anything about them.
And that gets to the second point, which is that the pie chart is really bad at showing slight variance between data points. So those two top wedges look almost identical in size. Again, human eyes, human brains, have a hard time parsing circular shapes and arcs, whereas if these were bar charts, I could probably immediately see the difference between those two data points. But with all that said, the pie chart actually is pretty effective at comparing the difference between two data points, especially if it's just one variable.
So if the point is really just to show that this is a lot more than that or that this is pretty much the same as that, then I would say that the pie chart works pretty well. So these are the most basic chart forms. I'm sure you are already familiar with all of them or certainly most of them. Hopefully this video has helped you understand specifically when to use each form.
- Describe the process by which individuals’ interests are incorporated into data visualizations.
- Differentiate the use of the Ws in data visualization.
- Explain techniques involved in defining your narrative when visualizing data.
- Identify the factors that make data visualizations relatable to an audience’s interests and needs.
- Review the appropriate use of charts in data visualizations.
- Define the process involved in applying interactivity to data visualizations.