Join Bill Shander for an in-depth discussion in this video "Explatory", part of Learning Data Visualization.
- So this is probably going to be the fastest video in the course. I'm going to explain a very simple concept, but it's something that I think is really core to all visualization. And it's something that I call explatory. All visualizations, in the end, are what I call explatory. And what I mean by that is that they are all either explanatory or exploratory, and often both. So I'm going to start off by showing a couple of examples, starting with explanatory visualizations: visualizations where you're trying to explain something to the viewer.
So the first example is something I created for the Boston Marathon. Trying to explain really two simple concepts. This was done in 2012 for the 40th anniversary of women being officially allowed to run the Boston Marathon. This was about 65 years into the Boston Marathon's existence and the men had been running all along. But it was thought for a long time that it was too strenuous for women. So the idea for this visualization was to look at the data and think about two things. The first thing was women's running times.
And it's really interesting to see, and you can see from this graphic, where the orange represents the running times for women, that they started out in the three to three-and-a-half-hour range. They very, very quickly dropped off and more or less plateaued around 1985, 1986, and have stayed relatively the same, it would seem, since then, as have the men's. But as you can see, the men's race times took much longer to come down, and they have plateaued too, though they are still steadily decreasing by a little bit. The fastest year for women was 2012, and the gap that year, as you can see here, was just 11 minutes.
So a really interesting look at, and explanation of, the speed of the women over that time frame and how quickly they came up to speed. But the big one, from an explanation standpoint, was this idea that the race was too strenuous for women. So how do we disprove that? We look at the finish rates. In the running world, DNF means did not finish. And as you can see, back in 1975, the men's DNF rate was 21%, and it even went up to 32% in 1980.
And then it came down, down, down, down until now, when the men's DNF rate is about 10%, so about 10% of the men who run the Boston Marathon don't finish. And you can see that women, in the very early years, also had a very high DNF rate, above 50%, probably making some of their critics happy. But as you can see, over the years it very rapidly declined, until today, when the women's DNF rate is really very comparable to the men's. So it's not too strenuous for women. This infographic looks at the data and explains that concept, and it also explains how the field of runners is now nearly half women, and that share has been steadily increasing over time as well.
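As an editorial aside, the DNF rate discussed above is simply the share of starters who did not finish. The sketch below shows that arithmetic in Python. The function name `dnf_rate` and the runner counts are hypothetical and chosen only for illustration; they are not data from the Boston Marathon graphic.

```python
def dnf_rate(starters: int, finishers: int) -> float:
    """Return the did-not-finish rate as a fraction between 0 and 1."""
    if starters <= 0:
        raise ValueError("starters must be positive")
    if not 0 <= finishers <= starters:
        raise ValueError("finishers must be between 0 and starters")
    # Everyone who started but did not finish counts as a DNF.
    return (starters - finishers) / starters


# Hypothetical counts, NOT real race data: 1,000 starters, 900 finishers.
example_rate = dnf_rate(starters=1000, finishers=900)
print(f"DNF rate: {example_rate:.0%}")
```

A 10% DNF rate, in other words, means that 90% of the people who started the race finished it.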
So a very simple explanatory visualization of some fairly straightforward data. The next category is exploratory. I wanted to show two examples, and the first one is an interactive example. I really like this visualization. Keeping with the running theme, this is a look at some data from a half marathon, and as you can see here, it's actually a really good explanatory visualization first, in that I can see men and women, blue and pink dots, and on the X axis here I see running time. And so I can see the very few people at the beginning who were the fastest, you know, the six or seven or eight who are all men, and then the vast majority of people in the middle who finish around two hours.
And then again a few stragglers towards the end who take approaching three hours. So from an explanatory visualization standpoint, it's a very straightforward but interesting view of the overall macro trend of this race. But I can explore the data much further. So I can look at age groups, and take a look at teenagers. There were three teenagers, 0-17, who ran; that's interesting. I can look at the vast majority of runners, who are 18-39, and see what those trends look like. Interestingly, the fastest person in the race was a 40 to 49-year-old.
And you can come down here and see the 60-plus category; not too many of them. Pretty good spread in terms of finish times. That's already an interesting way to explore the data, to break it up by age group. I can also go in and search by name. So I'm going to go in and search by the name Barry, and the reason I'm doing that is because the guy who created this visualization, Mike Barry, was one of the runners. In fact, one of the fastest runners in this race. And as you can see, I can roll over any of these single dots, whether they're lit up or hidden, and literally explore every component of this data.
Get very granular with it. It's a really interesting visualization, and a great example of giving the user the ability to explore the data for themselves. But you don't have to add interactivity to make an infographic or data visualization exploratory. In this case, this is a static infographic looking at data on Nobel Prize winners from 1901 to 2012. And there's just so much data here that it's really exploratory, because I can quickly get a glance at some trends as I'm zoomed out, but if I zoom in on one of the pieces of data here, I can really look at individual components of it and explore the data in much deeper ways.
So just to explain the graphic a little bit, the idea here is that I'm looking at the different prize categories: chemistry, economics, physics, literature, medicine, and peace. And I can see each dot represents a person, one of the winners. A solid dot is a man; a dot with a circle around it is a woman. I can see over here what degrees they attained. And then I also see over here what top schools the various winners went to. And just at a quick glance I can see a few trends, especially the more I explore. So for instance, if I really look carefully at physics, I can see there are only two physics winners who are women over all of these years.
Interestingly, one of the very first was a woman and then no one for a very long time, and no one for a very long time since. Another interesting trend is the fact that literature winners are much more likely to have no degree than to have a master's or PhD or even a bachelor's degree, which is interesting. Also, literature winners haven't gone to any of the top schools that all the other categories of winners have gone to. So, as you can see, you could spend a long time exploring this data, looking at the different categories, looking at gender issues, looking at how old the people were, how far above and below that line represents how much younger or older they are compared to the average.
So a lot of interesting stuff here, and a great example of a static infographic that is in fact exploratory. So a visualization really can be both explanatory and exploratory. As I said, it has to be at least one. When I'm doing visualization I'm always thinking about that. Do I want to allow people to explore the data for themselves, or am I really just trying to explain something? In the end, I think it's best that you do both. That you provide even some minimal level of exploration to your users, because when you empower them to find the ah-has in the data, as I like to say, you're sort of allowing them to turn your data story into their data story.
It's a very powerful thing.
- Channeling your audience
- Understanding your data
- Determining the information hierarchy
- Sketching and wireframing your ideas
- Defining your narrative
- Using typography, color, contrast, and shape to convey meaning
- Making your visualization interactive