Join Bill Shander for an in-depth discussion in this video Explore your data, part of Learning Data Visualization.
This might seem obvious, but you really need to know your data before you can design a visualization of it. Before you bring it into Illustrator, start coding, or open Photoshop, you have to resist the temptation to start until you've really explored the data, really gotten to know it inside and out, gotten intimate with it. And so, as the title of this movie implies, I really think exploring is the right verb. You really want to think of yourself as a data explorer, like this guy. If you do that and really dig deep into your data, you're going to find interesting tidbits, you're going to find patterns and outliers, and you'll start to see the data if you start to play with it visually.
There are tools like Tableau, and QlikView is another one, which allow you to bring in data from a lot of different formats and very quickly visualize it in different ways. There are also programming languages like R. R is a statistical programming language that lets you do statistical analysis, as well as quick visual displays of your data. If there are programmers out there, you're probably familiar with R, and it's a great way to really dig deep into your data. Another tool like Tableau that lets you bring in existing data from a bunch of different formats and just quickly graph it visually is called Gephi.
And it's actually really good at different forms of network visualization, things like nodes and links, showing the relationships between data points. And of course, if you're mapping geographic data, there are tons of mapping tools, whether it's Google Maps, or Mapbox, or CartoDB. There is an almost endless supply of tools out there for you to play with your data. But my guess is that there is one tool in particular that you're all familiar with. I'm sure you've all used Excel, or maybe the Apple equivalent, Numbers, before.
And so today we're going to talk mostly about how to use Excel to explore your data, to look at it visually in different ways and try to see some patterns in it. So what I have here is some data on minimum wage over the years, from 1980 through 2013, and I've also gone and gotten some other contextual data to go along with it, such as the poverty line, or the cost of gas, a loaf of bread, or a dozen eggs, again for those same time periods. And so, I like to just dive in. When it comes to exploring your data, there's no reason to hesitate.
Let's just jump in and see what we can see. That's sort of the point here. So I'm going to graph minimum wage. I'm going to select starting from minimum wage, in that row, and press Cmd+Shift+Right Arrow to select the entire row. Then I'm going to go up here under the Charts tab, click on Line, and generate just a basic line chart. And I can very quickly see that minimum wage has just gone up from 1980 through 2013, sort of in a stepped pattern, right? They raise it, and then it sort of stays for a while, and then it gets raised and stays, and raised and stays.
So, that's interesting. I can see that there's a pattern. Minimum wage has gone up. Let's look at some of these other numbers. Let's look at the price of gasoline over the same time period, the entire time period that we have. I'm going to generate a line chart and actually move back over here so I don't have to move it manually. And here's my price of gas line chart. And again, it stayed pretty steady, a little bit up and down, and then it really started rocketing up over these years. Big drop, and then up again. So, interesting patterns to look at. I don't quite know what they mean yet, but it's getting interesting. If I look at the price of eggs over the same time period and again generate a line chart, I can see eggs have also just gone up.
And let's look at one more. Let's look at CPI. If you're not familiar with CPI, it's the Consumer Price Index. It's sort of like, you take the prices of a whole bunch of things, you average them out, and you create an index value, and this is essentially the cost of living for people in the United States. And again, if I generate a line chart of those values, I can see, interestingly, already, just looking at the data, how steady that rise is. So, if you look at a whole bunch of things at once, prices have just very, very steadily and consistently gone up and up and up over those years.
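To make that "take a bunch of prices, average them out, create an index value" idea concrete, here is a toy sketch in Python. It is only an illustration of the averaging idea, not how the official CPI is calculated (the real index uses a weighted basket of goods and services). The function name, the item names, and the prices are all made-up placeholders.

```python
def simple_price_index(prices_by_item, base=0):
    """Unweighted toy index: average each item's price relative to its
    price in the base period, scaled so the base period equals 100."""
    n_periods = len(next(iter(prices_by_item.values())))
    index = []
    for t in range(n_periods):
        # Price of each item in period t, relative to the base period.
        relatives = [series[t] / series[base] for series in prices_by_item.values()]
        index.append(100 * sum(relatives) / len(relatives))
    return index


# Placeholder prices for three periods (hypothetical, not real data).
sample = {
    "bread": [1.00, 1.10, 1.20],
    "eggs": [2.00, 2.00, 2.40],
}
index = simple_price_index(sample)
print(index)
```

Because every item is expressed relative to its own starting price, a cheap item and an expensive item contribute equally, which is part of why an index line tends to look smoother than any single price line.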
Interesting stuff, but now I can think to myself, okay, inflation, the Consumer Price Index, has gone up very steadily, and these things have gone up. How do they relate to each other? So let's go in and actually do a ratio. What I want to ask is, well, how much gas could I buy on minimum wage? All I've done to do that is take all the values, in this case, let's say, the price of gasoline, and divide the minimum wage by the price of gasoline. And this shows me how many gallons of gas I could buy on minimum wage in 1980, in this case.
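If you'd rather do that ratio outside a spreadsheet, a minimal sketch in Python might look like this. It assumes the wage is an hourly figure and that both lists cover the same years in the same order. The function name is hypothetical, and the numbers are placeholders chosen to keep the arithmetic easy, not the values from my worksheet.

```python
def units_per_hour(wages, prices):
    """For each year, how many units of a good one hour of
    minimum-wage pay would buy: wage divided by price."""
    if len(wages) != len(prices):
        raise ValueError("wages and prices must cover the same years")
    return [wage / price for wage, price in zip(wages, prices)]


# Placeholder values for two years (hypothetical, not real data).
wages = [3.00, 6.00]       # dollars per hour
gas_prices = [1.50, 4.00]  # dollars per gallon

gallons = units_per_hour(wages, gas_prices)
print(gallons)  # [2.0, 1.5]
```

In this made-up example the wage doubled, but the ratio still fell from 2.0 to 1.5 gallons per hour, which is exactly the kind of thing the raw price and wage charts can't show you on their own.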
And so, as you can see down here, I've generated line charts of all these different ratios. I can see that I could buy less gas now on minimum wage than I could in 1980. It sort of went up, and then it went down. Bread has gone fairly consistently down over those years. Eggs, interestingly, have gone up and down and up and down, but ended up a little bit higher. So I can actually buy more eggs now than I could back in 1980. But, you know, I can't generate endless line charts and just look at them like this. Maybe I should just generate one line chart with all of the data.
So I'm actually going to do that. I'm going to select this column here, wages to gas, and if I press Cmd+Shift+Right Arrow, it'll select all of the columns, 1980 through 2013, and then I can press Cmd+Shift+Down Arrow to select all of the rows. And now I have all that data selected. If I generate a line chart now with all of that data, as you'll see, it generates a line for each data series, and it color-codes and labels them, which should make it easy to see what I'm looking at. But as you can see, it's still a little hard to tell exactly what I'm looking at, because the price of electricity is so different from the other values that our scale's a little wacky here.
So instead of looking at a line chart on one scale like this, I'm going to generate a different chart form called sparklines, which is a really great way of looking at a bunch of different line charts that may have very different scales, because it shows them to me as though they're all on the same scale. To do that, I'm going to select just the data this time, since you can't do it with the column names included, and again select all the way down and all the way across. Then I say Insert Sparklines. And I'm going to do line sparklines, not a bar chart.
Click on that, and the data that I've selected as the source data is already in here. Then I have to select where to place the sparklines, which means I just need to select one cell for each row of data that I have. So, one, two, three, four, five, six. Say OK. And now I have these great little line charts. If I zoom in on those, I can see the patterns very clearly. And I can even zoom in a little bit more. So I can see that this data series sort of went up, went back down, and ended a little lower than where it started.
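The reason sparklines work for series with very different scales can be sketched in a few lines of Python. This is a rough text-only imitation, not what Excel does internally. It assumes each series is stretched between its own minimum and maximum, and the function name and sample numbers are made up for illustration.

```python
BARS = "▁▂▃▄▅▆▇█"  # eight block heights, lowest to highest


def sparkline(values):
    """Draw one series as a row of blocks, scaled to that series'
    own minimum and maximum rather than to a shared axis."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no shape to show, so draw it at the lowest level.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[int((v - lo) / span * top)] for v in values)


# Two hypothetical series with the same shape but very different magnitudes.
small = [1, 2, 3]
large = [100, 200, 300]

print(sparkline(small))  # ▁▄█
print(sparkline(large))  # ▁▄█
```

On a shared axis the small series would be squashed flat against the bottom. Scaled individually, both show the same steady rise, which is the trade-off to keep in mind: you can compare shapes this way, but not magnitudes.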
This one, again, steadily down and lower. These two look almost identical, and of course, those are my two CPI data series. In the last movie, I talked about converting your data, how to create indexed data, and why you may want to do that. I'm not going to go into that here, but you'll see that I did create an indexed version of that data and generated line charts of the indexed versions, which look pretty similar to the others. But at this point, I think it's a really good idea to start looking at different visual forms of the data. So in this case, let's say I created a radar chart of all the data.
Sort of like I had a line chart of all the data. And again, real quick, the way you can do that is to select all the data, just like you did for that line chart, then go up here to the Charts tab, click Other, and choose Radar. And it'll generate a radar chart with all of our different data points. As you can see, it's a little bit hard to parse, although I can see the blue line is more toward the center and the red line is more toward the outside. But maybe it's a little bit overwhelming. What if I did a radar chart of just 1980 versus 2013?
To do that, I just select all of these rows for this one column, for 1980. Then I can go to 2013 and Cmd+drag, or Ctrl+click and drag on Windows, and that'll select this column without deselecting the first column. And then, same thing, I'm going to generate a radar chart of just that first year and last year. And so I can see that 1980, which is the blue line, and 2013 look different, right? These values have gone down over time.
These two categories have stayed the same. So again, I'm starting to reveal some interesting things about the data that I could think about visualizing more formally later. Other chart forms, like scatter plots or donut charts, are sometimes very hard to read and don't add much. But you'll be surprised by what you can find when you start playing with the data this way. Here we have a bar chart that is actually an exact duplicate of this radar chart, just in a different form. So comparing the same data side by side in different visual forms is a great experiment to run in Excel, and it's so quick and easy to do.
One of the other great things about Excel is that you can take a chart in Excel and bring it right into Illustrator and start working on it. So if I take this chart, just click on it, press Cmd+C to copy it, go into Illustrator, and paste it, I now have this chart available to me as a vector object in Illustrator. And so I can click and drag and delete the stuff that I don't want, and now I have a data-accurate, to-scale line. I can change the color, or I can put it on a different background, et cetera.
It's a really powerful and great aspect of Excel for data exploration. Data exploration is absolutely essential, whether you're exploring the data in Excel or another tool. Trying it in different forms and looking at it in all sorts of different ways is a very, very productive use of your time. Don't skip it. It doesn't take very long, and it'll lead to great insights that'll inform all of your work.
- Channeling your audience
- Understanding your data
- Determining the information hierarchy
- Sketching and wireframing your ideas
- Defining your narrative
- Using typography, color, contrast, and shape to convey meaning
- Making your visualization interactive