Start communicating ideas and diagramming data in a more interactive way. In this course, author Barton Poulson shows how to read, map, and illustrate data with Processing, an open-source drawing and development environment. On top of a solid introduction to Processing itself, this course investigates methods for obtaining and preparing data, designing for data visualization, and building an interactive experience out of a design. When your visualization is complete, explore the options for sharing your work, whether uploading it to specialized websites, embedding the visualizations in your own web pages, or even creating a desktop or Android app for your work.
If you want to do any data visualization in Processing, the single most important thing to do is to actually get your data into Processing, and that can also be an unusually challenging procedure, because Processing doesn't really have any built-in functions for doing this. On the other hand, a number of people have developed methods to facilitate the integration of external data files into Processing, and I'm going to show you one of them, probably the simplest: reading an external spreadsheet file saved as a tab-separated values, or TSV, file and using those values in Processing by way of a Table class developed by Ben Fry, who is one of the cofounders of Processing.
Now, the good news about version 2.0 of Processing is that the table class will become part of Processing. And it's not there yet, because we're still using an alpha right now, but I can only assume that since it's probably going to be written by the same guy, that it will be very similar to what we have right here. Let's open up our sketch folder, and you see that we've got a few things in there. Number one is the actual Processing sketch, and I can show you that I've got that opened, and that's what that looks like right there. The other one is the table.
That's a class file that contains special instructions for Processing on how to read tabular data. And we're not going to mess with that one. We're just going to say that pretty much, it works. In case you are worried about not being able to find this code again, you should know that Ben Fry's Table class is available in the Built-In Examples in Processing; that's what I am showing you right here. The way you get that is by coming up to File, to Examples, and this window pops up. And the one we're looking for is Book/Visualizing Data/ch03-usmap, and then the second one, which is step01_fig1_red_dots. If you click on that, then you see the same Table class right there.
What I also have in here is a data folder, and I have two versions of the same data. Now, the reason I did that is because I got some data from Google Correlate that compares states on relative interest in search terms. In fact, I am just going to open that up. I've got it in Excel right now. And here's what we have. We've got Alabama down through Wyoming. We also have Washington D.C. in the mix, and that's important because D.C. tends to function unusually in a lot of statistics. We have the state name.
We have the region and the division that they're in. Then we have the percentage of adults in that state with degrees, and we have the median age of the population. And so those are a couple of descriptive statistics, but all the rest of that is Google search terms. And what these signify is the relative interest of that state in that search term compared to other states. So what you see here, for instance--let's find a good one. That's video games, and what you find is that Washington D.C. shows a much lower interest in searching on Google for video games than other places.
On the other hand, you find that Hawaii has an unusually high interest in the show Top Chef. And I'll let you make of that what you will. But these are based on search patterns nationally, and these data are from Google. They're a few months old, but the patterns tend not to change dramatically in short amounts of time. Now, the reason I opened this up in the spreadsheet is because of the way we're going to be working with the data files in Processing: you don't want any of the header information.
Right here I have four rows on the top. The first one numbers the columns starting at 0, because that's how Processing reads things; it starts at 0 and then goes on. Then I have the names of the things that people search for. By the way, what I have is the percent, the degree, the age. Video games is a search term, iPhone is a search term, dance, top chef, nfl, nba, mlb for Major League Baseball and then Major League Soccer. By the way, I included two of those because I am from Utah and we love the NBA more than anybody else in the country, apparently.
We also love dance the most, and we also love Major League Soccer the most. Anyhow, those are little points of regional pride. But I have to prepare this with this information here, and then you have to delete it to make a very clean file. And so what I have is a second file that has the data but does not have any of those headers, but I like to keep the one with them so I can consult it. Let me recommend that you always do that: keep a copy of the data with the headers, and with summary statistics if you want, for your reference purposes, but also have one that's easy to read.
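The idea of keeping a reference copy with headers and deriving a clean, header-free copy from it can be sketched in a few lines. This is only an illustration in plain Java (which Processing is built on), not something shown in the course: the class name HeaderStripper, the method stripHeaders, and every row in the sample data are made up for the example. The only detail taken from the description above is that the reference file has four header rows, the first of which numbers the columns starting at 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeaderStripper {
    // Returns a new list without the first headerRows lines.
    // If there are fewer lines than that, the result is empty.
    static List<String> stripHeaders(List<String> lines, int headerRows) {
        int start = Math.min(Math.max(headerRows, 0), lines.size());
        return new ArrayList<>(lines.subList(start, lines.size()));
    }

    public static void main(String[] args) {
        // Made-up stand-in for the reference file: four header rows, then data.
        List<String> reference = Arrays.asList(
            "0\t1\t2",              // column numbers, starting at 0
            "state\tdegree\tage",   // column names
            "(summary row)\t\t",    // hypothetical extra header row
            "(summary row)\t\t",    // hypothetical extra header row
            "StateA\t30.0\t36.0",   // placeholder data, not real values
            "StateB\t25.0\t28.0");  // placeholder data, not real values

        List<String> clean = stripHeaders(reference, 4);
        System.out.println("data rows kept: " + clean.size());
        for (String row : clean) {
            System.out.println(row);
        }
    }
}
```

With real files, the lines could come from java.nio.file.Files.readAllLines and the clean list could be saved with java.nio.file.Files.write, so the reference copy stays untouched and the clean copy can be regenerated whenever the data changes.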
Now the other thing is this is an Excel .xls file, and while there are libraries that make it possible to read Excel files into Processing, I find they are complicated. It's easier to just go with a standard text file. And so what you need to do is save your file as a text file--actually, a tab-separated values file. Now, let me show you how this works. You come up to File > Save As. We're going to Enable Saving, and then I have a choice here.
I can do stateData and then say it's Workbook, and then I click on that. I get a whole long list of choices. The one that you want is this one right here, Text (Tab delimited). And when you save that it will come out with a .txt extension at the end, and you just click on it and you change it manually to .tsv. You would think that CSV, the comma-separated values format, would be a good one, but I encountered some really significant problems in reading that data into Processing, so I just say stick with TSV for now, and then hopefully the Table class will get improved and be a little more flexible when the full version of Processing 2.0 comes around.
So I am just going to hit Cancel because I have that one already. By the way, if your data is in a Google Docs spreadsheet, you can do Export or Download as Text and it saves it as a TSV file anyhow. So, I am just going to go back to my Processing sketch, and here you see my information. I've got a little palette here. And then this is the class. Table is the class, and then stateData is the object that I'm going to create. That's going to be sort of my data variable.
I'm also creating a variable called rowCount that will hold the number of rows in the file. I have the setup. I've got a window of 600 x 200 pixels. Then I take my stateData object and I read the data into it by saying new--you have to use new when you're dealing with classes--and then Table is the name of the class and then statesData.tsv in quotes and parentheses. That's the name of the file, and it needs to be in the data folder, in the sketch folder.
I take the variable rowCount, and now I initialize it by using one of the Table functions, and that's getRowCount. So I do stateData, because that's now the name of the object I am dealing with, then .getRowCount and then empty parentheses, and that will tell me how many rows there are in the data. And in fact, I've got this little thing here beneath that, a println, to tell me how many rows there are. I know there are 51 because, well, there should be 51; we'll see if it works right. Beneath that, I have just a little bit of code to draw.
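To make it a bit clearer what a table class is doing when you ask it for a row count, here is a stripped-down stand-in written in plain Java. To be clear, this is not Ben Fry's Table class and not the code from the sketch; MiniTable is a hypothetical name, it takes the file contents as a string instead of a file name, and the sample rows are placeholders. It only assumes what was described above: one row per line, tabs between values, and no header rows.

```java
import java.util.ArrayList;
import java.util.List;

class MiniTable {
    private final List<String[]> rows = new ArrayList<>();

    // Builds the table from the full text of a header-free TSV file.
    MiniTable(String tsvText) {
        for (String line : tsvText.split("\\r?\\n")) {
            if (line.trim().isEmpty()) {
                continue;                    // skip blank lines
            }
            rows.add(line.split("\t", -1));  // -1 keeps trailing empty cells
        }
    }

    // Number of data rows that were read.
    int getRowCount() {
        return rows.size();
    }

    // Cell contents as text; rows and columns both start at 0.
    String getString(int row, int col) {
        return rows.get(row)[col];
    }

    // Cell contents parsed as a number.
    float getFloat(int row, int col) {
        return Float.parseFloat(getString(row, col));
    }

    public static void main(String[] args) {
        // Placeholder rows (name, degree, age); not real values.
        String text = "StateA\t30.5\t36.0\nStateB\t25.0\t28.0\n";
        MiniTable stateData = new MiniTable(text);
        int rowCount = stateData.getRowCount();
        System.out.println("rowCount=" + rowCount);
    }
}
```

Printing the row count right after loading, as the sketch does, is a cheap sanity check: if the number does not match the rows you expect from the spreadsheet, something went wrong in the export before any drawing code runs.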
I've got a background for my palette. I've turned on anti-aliasing. Then I have another palette color for some dots that I'll be drawing, but I am making them somewhat transparent. That's the 160. That's the alpha, a bit more than halfway between 0 and 255. And I've also turned off the outlines. Then I have a for loop that does the important sort of heavy lifting here. And what it says is just go into the data file, start at row 0. And remember, I got rid of all the headers in this, so row 0 is a data row.
And it says to then go through one row at a time until you get to the end. And what we want to do is pull out the state name. We're just telling it that that's what's in the first column. We're also going to get age. We're going to get degrees. I am just getting those two variables out right now. The others are just going to be there, and what I am going to do is create a little scatter plot of median age in the state against the percentage of adults with a degree. And what I'm doing down under the ellipse, where I am multiplying things by 12 and 3, that's just a very rough attempt at getting them to spread out across the window.
You'll see that there is actually a big gap on the left, and I can adjust for that manually. But right now this is just a simple version. And also, because once it gets drawn, it's not going to change, or at this point it's not dynamic, I am using the noLoop command to tell Processing it doesn't have to keep going through the draw loop; just do it once and we're done. And let's press Ctrl+R on the PC or Command+R on the Mac and see what we get. We've got two things. Number one, take a look down at the bottom of the console, and you'll see that it says rowCount=51.
That confirms that Processing was able to read 51 rows of data from my sheet, which is right because we have the 50 states and we have Washington D.C. And then what you have in the top is an unlabeled, but you know, slightly picturesque scatter plot that shows the association of median age. That's what goes across the bottom, and that one that's way down at the left is Utah. We've got a median age of only 28 compared to most other states. And then going up is the percentage of the population with college degrees.
And so you can see, if you take out Utah and you take out the other outlier that we've got, there's not a real strong association between these two things. Nevertheless, the purpose of this exercise was not so much to produce this one graph; it was to show you how you can read data into Processing. And the easiest way to do this is to save a very clean TSV file from a spreadsheet with no headers on it and use Ben Fry's excellent Table class to import the data into your sketch.
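As a footnote on the big gap on the left of the plot: multiplying each value by a constant leaves that gap because the smallest median age is nowhere near 0. One common alternative, which is not what the sketch in this movie does, is to rescale each value linearly from the range of the data to the range of the window. Processing has a built-in map() function for this; the plain Java sketch below uses the same formula so it can run on its own. The names ScatterScale and rescale, the margin, and all of the ranges except the 600 x 200 window and the minimum age of 28 are assumptions for the example.

```java
class ScatterScale {
    // Linearly rescales value from [inMin, inMax] to [outMin, outMax].
    static float rescale(float value, float inMin, float inMax,
                         float outMin, float outMax) {
        return outMin + (outMax - outMin) * ((value - inMin) / (inMax - inMin));
    }

    public static void main(String[] args) {
        float width = 600, height = 200;  // window size used in the sketch
        float margin = 20;                // assumed padding around the plot
        float ageMin = 28, ageMax = 44;   // 28 is from the transcript; 44 is assumed
        float degMin = 15, degMax = 50;   // assumed range for percent with degrees

        float age = 36, degree = 30;      // placeholder point, not a real state

        // Age runs left to right across the bottom.
        float x = rescale(age, ageMin, ageMax, margin, width - margin);
        // Screen y grows downward, so the output range is flipped
        // to make higher percentages appear higher in the window.
        float y = rescale(degree, degMin, degMax, height - margin, margin);

        System.out.println("x=" + x + " y=" + y);
    }
}
```

In a real sketch the minimum and maximum would be found by looping over the column once before drawing, so the dots fill the window no matter what the data looks like.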