Start communicating ideas and diagramming data in a more interactive way. In this course, author Barton Poulson shows how to read, map, and illustrate data with Processing, an open-source drawing and development environment. On top of a solid introduction to Processing itself, this course investigates methods for obtaining and preparing data, designing for data visualization, and building an interactive experience out of a design. When your visualization is complete, explore the options for sharing your work, whether uploading it to specialized websites, embedding the visualizations in your own web pages, or even creating a desktop or Android app for your work.
In this movie, I want to talk to you not so much about Processing in and of itself, or even about data visualization per se, but I want to talk to you about some of the principles of design that can help you make more compelling sketches and more informative data visualizations. The first thing you need to know is that informativeness is the major goal, and that is best achieved when you try to keep things as clear as possible. The idea is that complexity is the enemy of comprehension, and the complexity is not a matter of the size of the data set--for instance the number of observations--but has more to do with the number of design elements that you are trying to incorporate into a single sketch or visualization.
And so what you generally want to try to do is keep the design elements to a minimum and try to maximize the size of the data set and try to maximize the attention that your audience is able to give to the data. The less effort they have to spend decoding and simply trying to wrap their heads around your visualization and the more time they can spend focusing on the data and its message, the better you will have achieved your particular goal. Now, one of the interesting ironies here is about the nature of interactivity.
Interactivity is part of what this particular course is about, and interactivity makes a lot of things possible that otherwise you would have to spend days or hours trying to work through. On the other hand, interactive elements do take time and cognitive effort from the user. They can also serve as a distraction. They can lead people away from the data, and have them spend time playing with the interactivity, and so they're just focused on the medium and not on the message per se. Also, there's the problem that if you have an interactive visualization, you can't just print it on a piece of paper and hand it to somebody; you need to give it to them in some sort of digital dynamic way, which can be a logistical challenge.
You want to keep things as simple as possible; include the interactivity that is essential to your storytelling mission, but leave out the rest of it. I like to think about the lessons of one of my favorite pieces of design. It's Apple's old white 2 1/2 button remote. It's a very small piece, and it does all the remote-control functions that I could possibly want with my radio, and it's the only remote I have that I can use without looking at it. They also won a Red Dot Design award for their product in 2006, one of over 60 that Apple has won, which is extraordinarily successful.
And a lot of it has to do with the clarity and the simplicity and also really, the invisibility of the user interface. Back to data visualization. You're trying to be clear. You're trying to do your data justice, but you want to do it in a straightforward manner. So one thing to think about is how much information are you trying to encode? And how many values does each of these pieces of information have? So for instance, a 3D position of an object is very communicative, very easy for people to understand. A two-dimensional positioning is probably the next most communicative, and in a flat environment like a computer screen, that's easy to deal with.
After that, the straight one-dimensional length or height of something is among the most informative and easiest to read. Color and shape are intriguing and people want to do a lot with them, but the trick is, color is actually best used as a qualitative indicator. It's hard to read color as a continuous scale. Even though you can put up a picture of a color spectrum, that's not how people interpret color. We will talk more about this in the next movie, where we specifically focus on color.
And then also you want to make it as easy as possible for people to compare values. So for instance, length or unidimensional measurement is just about the easiest way to compare values. On the other hand, judgments of area and angle, especially along curved lines, are notoriously difficult. And what that tells us, more than anything, is that pie charts are very hard to read, because pie charts require that a person read both an angle and the area of a pie in order to compare with other pies.
And unfortunately, this means that Florence Nightingale's rose diagram that I produced at the beginning of this course is actually one of the hardest-to-read diagrams. It was very persuasive, but it's very difficult to get the exact numbers out of that one because it relies on circular segments that project in and out, so you have to read both angle and area, and that does make it difficult. Another principle is that redundant encodings can make perception easier. So for instance, if you're trying to separate two groups in your picture, you can do them with different colors or you can do them with different shapes or you can do them with simultaneously different colors and shapes.
So you have two methods of encoding the differences. Now, that makes it easier to spot the difference. There is a trade-off though; sometimes it means that you've introduced unnecessary complication; other times it means that you've now lost one potential method of distinguishing other variables, and so there is a cost associated with a redundant encoding. And speaking of encodings, when you have a graph, you need to make sure that you have labeled things as clearly as possible and that the labels and legends are as close as possible to what they describe, so that people don't have to look across the paper, across the screen, to see what they need.
There are a few other factors that come in with just any kind of graphical representation. If you have scales on your diagram, like a scatter plot, try to maintain equal scales. Also, start at 0 and don't have breaks in the order unless there is a compelling reason to do that. If your data have a natural order, try to keep that and work within it. So for instance, I once had some students who were making a bar chart of the number of children in their family, and they thought it was important to put the most common category first and then go to the least common categories.
And so we ended up with the first bar for 2 to 3 children, the next bar for zero to one, the next bar after that for 4 to 5, and it was all out of order. They just felt that the descending order in terms of the height of the bar was useful, but there's a natural order to have in the number of children. You want 0 at one end. You want a whole bunch at the other end. Similarly, with distance traveled, you want to keep those things in the natural order. Also, sometimes there's an implied order, and you have to watch out, particularly with association charts, for an implied cause-and-effect relationship.
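The bar-ordering point can be sketched in plain Java, the language that Processing sketches build on. This is only an illustration under assumptions: the class name, the `Bar` type, and the family counts are all made up for the example, and the only thing it is meant to show is sorting bars by the category itself rather than by bar height.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NaturalOrderBars {
    // One bar: a range of children and how many families fall in that range.
    static final class Bar {
        final int minChildren;
        final int maxChildren;
        final int families;

        Bar(int minChildren, int maxChildren, int families) {
            this.minChildren = minChildren;
            this.maxChildren = maxChildren;
            this.families = families;
        }

        String label() {
            return minChildren + " to " + maxChildren;
        }
    }

    // Return a copy of the bars ordered by the category (fewest children first),
    // regardless of how tall each bar is.
    static List<Bar> inNaturalOrder(List<Bar> bars) {
        List<Bar> sorted = new ArrayList<>(bars);
        sorted.sort(Comparator.comparingInt((Bar b) -> b.minChildren));
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical counts, listed tallest bar first, as the students drew them.
        List<Bar> byHeight = Arrays.asList(
            new Bar(2, 3, 14),
            new Bar(0, 1, 9),
            new Bar(4, 5, 4));

        for (Bar bar : inNaturalOrder(byHeight)) {
            System.out.println(bar.label() + " children: " + bar.families + " families");
        }
    }
}
```

Sorting on the lower bound of each range puts zero at one end and the largest families at the other, whatever the heights turn out to be.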
Now, sometimes you want to be very clear about cause and effect, so for instance, if you're doing a scatter plot, you want to make sure that you put the cause on the bottom going across the X axis, and you want the effect going up the Y. If you flip it around, it can be very, very hard for people to adjust to that. Also, when you have your scatter plot, give it a title and label your axes. Identify the source of your data if possible and when it was gathered. You might want to include what the actual questions were or the methods that were used for gathering the data, because that helps in judging the representativeness of the data and the results.
If there are unusual data points anywhere in your chart, you want to either point them out yourself or make it very easy for people to figure out what they are. Similarly, if you did any transformations or exclusions or other manipulations on your data, you're going to want to make that clear. Next, you want to consider ways to lead the eye of the person looking at your visualization. You want to highlight important bars, dots, and cases, and you want to deemphasize or even delete unimportant information. So for instance, if you do a rollover and you have decimal places, try not to use the number of decimal places that your analysis software can put out. Sometimes I have software that puts out seven or eight decimal places when all you need is one, or maybe even none.
Earlier, I showed that you can use one of Processing's functions, the nf function, which is a way of taking numbers and turning them into a string format where you can specify the number of digits, and that can reduce it to a meaningful amount and keep it from getting overly complicated. Processing has other functions, like casting into an integer or a ceiling function, a floor function, a rounding function that can deal with these things. I also remember when I did tables of correlation coefficients, sometimes I would make a point of not even printing the correlations that were not statistically significant, so they simply didn't appear.
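To make the decimal-place advice concrete, here is a minimal sketch in plain Java rather than in a Processing sketch, so it can run on its own. Inside Processing, the built-in `nf()`, `int()`, `round()`, `floor()`, and `ceil()` play these roles; the class name, the `fixedDecimals` helper, and the sample value below are assumptions made up for the example.

```java
import java.util.Locale;

public class TooltipNumbers {
    // Format a value with a fixed number of decimal places, similar in spirit
    // to Processing's nf(). Locale.US keeps the decimal separator a period.
    static String fixedDecimals(double value, int decimals) {
        return String.format(Locale.US, "%." + decimals + "f", value);
    }

    public static void main(String[] args) {
        double raw = 3.1415927; // the kind of value analysis software might emit

        System.out.println(fixedDecimals(raw, 1)); // one decimal place
        System.out.println(fixedDecimals(raw, 0)); // no decimal places
        System.out.println((int) raw);             // cast: drops the fraction
        System.out.println(Math.round(raw));       // nearest whole number
        System.out.println(Math.floor(raw));       // next whole number down
        System.out.println(Math.ceil(raw));        // next whole number up
    }
}
```

For a rollover label, the formatted string is usually the one to display, since it keeps the number of digits under your control.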
Instead of having to look at significance tests or look for asterisks, the simple presence of a correlation was enough to suggest that it was statistically significant, and then people would go ahead and try to decide whether they felt it was substantively significant. Now, there are some very important critiques of data visualization, mostly by the author Edward Tufte, who spends a lot of time talking about what he calls chartjunk, and that is stuff that's in the chart that is either uninformative or even misleading, so to give you an example, putting a false third dimension on a bar chart. I know it makes it look prettier, but it separates the data from the axis.
It makes it harder to read. Also people start doing things like judging by depth or by volume, which can be very misleading. Similarly, avoid things like drop shadows or gradients or textures or other tricks that provide some sort of flash but tend to distract people and make it harder to read things. Also, don't provide too much detail on the axes. It is okay to break it down and provide occasional reference points on the axes. It's enough. It's also even okay to have nothing on there if interaction is able to provide the data for you.
Then again, I do want to say it's okay to design something that's pretty, that's catchy, in an effort to get somebody's attention, as long as you can also be informative once you have their attention. Now, I've got just a couple words about typography. There is a long and ongoing debate about whether you should use serif fonts or sans serif fonts. It has to do with the ease of reading or the clarity of printing. And I'm not going to get into that one. Suffice it to say there are opinions on both sides. I can tell you, there are a lot of people who really hate the Comic Sans font, and you might want to avoid that, just so people don't try to throw your stuff against the wall.
Also ornamental fonts, shaded fonts can be very hard to read, and you want to use those hardly ever. Also, don't put text on a complex background that's got different shades of colors. This reminds me of an Austin Powers movie where they used this comic effect with subtitles that would selectively disappear and appear. Try to reduce the amount of text as much as possible. Whenever possible, don't even have the text there and have it come up when the person interacts with the diagram. Finally, as far as typography goes, avoid the temptation to write things in all capitals.
It's difficult to read, and it comes across as aggressive. The moral of this particular lesson is not that I am trying to teach you to be an expert designer, but simply to help you avoid some common errors and to give you some recommendations on other places where you can learn more. I think everybody has a pretty good sense of design, and I find one of the best exercises for this is to walk into a grocery store and just go up and down the aisles and look at the product packaging, and you can very quickly identify which ones were well designed, which ones were poorly designed, and which ones people just didn't seem to care about when they were making them.
And you can see that that sort of transfers over to your own work in computer drawing, in your sketching, and in your data visualizations. With that in mind, I want to point you towards a few references that give you much more information and much richer examples, and I strongly suggest them. First, for anybody who is interested in statistical graphics and data visualization, there are the foundational books by Edward Tufte. The first is his book The Visual Display of Quantitative Information, which despite a very dry title, is an astoundingly beautiful book and absolutely worth your time to look at.
He has three other books that are also wonderful and relevant to anybody who's trying to produce graphics that convey information. The next one is Envisioning Information, from 1990. Then Visual Explanations, in 1997, and Beautiful Evidence, in 2006. And I strongly encourage you to take at least a brief look at any of these. Also, the O'Reilly Press has several wonderful books. They have, for instance, Beautiful Data: The Stories Behind Elegant Data Solutions, by Toby Segaran and Jeff Hammerbacher, and Beautiful Visualization: Looking at Data Through the Eyes of Experts, by Julie Steele and Noah Iliinsky.
A third one that's directly relevant to what we are doing is called Designing Data Visualizations, also by Noah Iliinsky and Julie Steele. And then finally, one of my favorites is a website called FlowingData by Nathan Yau, who has a wonderful book called Visualize This. I strongly encourage you to take a look at any and all of those. And then, in closing, I also want to recommend that you take a look at a wide range of courses that lynda.com offers on design principles.
For instance, Shane Snow has several short courses on creating infographics. Nigel French has several on print designs, such as designing a poster, a brochure, a restaurant menu, et cetera, which would be wonderful for layout and color use. Don Burnett has a course on typographic principles. And then of course there are several on color-- I will mention those in the next movie--and then a nearly limitless selection of courses on web graphics and web design, all of which would be relevant and useful for a person creating online interactive data visualizations.
There are currently no FAQs about Interactive Data Visualization with Processing.