Join Barton Poulson for an in-depth discussion in this video, ggvis, part of Data Science Tips, #1.
- [Voiceover] ggvis is an attempt to take the principles of the Grammar of Graphics and apply them to interactive visualizations. More specifically, you want to have graphs that react to the user. Now ggvis is an early experiment still in its Beta stage. But it takes the ggplot ideas about building from essential components and applies those to interactive graphics. It also gives you the ability to share your insights on the web. Now let's take a quick look at how this works by going to this script.
The first thing you're going to want to do is install ggvis and then load it up. We're also going to use a sample dataset from the datasets package, so I'll load that one too. We use the iris data and we'll take a look at just the first six cases of it by running this. Let me zoom in on that one. And you see we have these cases. There are 150 observations total. You can't tell that from this. And there are five variables: four measurements and one categorical variable with species information. We'll come back out here.
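The on-screen script is not part of this transcript, so the following is only a minimal sketch of the setup steps just described. The variable name first_six is made up for illustration, and the install step is guarded so it only runs when ggvis is missing.

```r
# Install ggvis only if it is not already available (assumption: CRAN access).
if (!requireNamespace("ggvis", quietly = TRUE)) {
  install.packages("ggvis")
}

library(ggvis)     # interactive Grammar of Graphics plots
library(datasets)  # ships with R and contains the iris data

# Look at just the first six cases of iris.
first_six <- head(iris)
print(first_six)

# head() alone does not reveal the full size, so check it explicitly.
dim(iris)  # 150 observations, 5 variables
```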
Now we're going to start creating plots here. One important thing to know is that these are not going to show up in the Plots window, which is currently shown on the bottom right here, but they're going to show up in the Viewer pane. That reacts a little differently. So let's start with a basic scatterplot. I'm just going to do iris, and then using the pipes I'm going to feed that into ggvis and tell it I want to look at sepal length and sepal width together. And then again, as a Grammar of Graphics aspect, I use layer_points, and that tells it what I want to actually see.
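A sketch of what the basic scatterplot described here might look like, assuming the standard iris column names Sepal.Length and Sepal.Width. The object name p_scatter is hypothetical, and the plot is only printed in an interactive session, where it should appear in the Viewer pane.

```r
library(ggvis)

# Pipe iris into ggvis, map the two sepal measurements to x and y,
# then add a points layer to say what should actually be drawn.
p_scatter <- iris %>%
  ggvis(~Sepal.Length, ~Sepal.Width) %>%
  layer_points()

# Render only when running interactively (e.g. in RStudio).
if (interactive()) print(p_scatter)
```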
And so it's a very basic graph. We don't need to zoom in on that. Easy to make sense of. Then, in the Grammar of Graphics approach of building things iteratively, I can add, for instance, a smoother onto this one. So I just take this same command here and I do those two things, and then I add one more pipe, and to that I add layer_smooths, the smoother, with se = TRUE, where se is for the standard error of the estimate. And then we can zoom in on that one. I'll actually bring that one up. It's a great graphic and it starts to give us some good insight on what we're dealing with.
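The smoother step can be sketched as one more pipe on the same scatterplot. This is an illustration rather than the exact on-screen code, and p_smooth is a made-up name.

```r
library(ggvis)

# Same scatterplot as before, plus a smoother layer.
# se = TRUE asks for a standard-error band around the smooth line.
p_smooth <- iris %>%
  ggvis(~Sepal.Length, ~Sepal.Width) %>%
  layer_points() %>%
  layer_smooths(se = TRUE)

if (interactive()) print(p_smooth)
```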
Now I can change things around a little bit. I can take out the smoother and I can make a grouped scatterplot. Again I use ggvis and that first part stays the same, but in the next one, layer_points, I set fill to Species. Great, and now it's colored by the three different species of irises we have. And then I'll do a grouped regression right here. Again you can see we're adding components on or moving them around a little bit. We do the ggvis, then the layer_points that colors it.
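A possible version of the grouped scatterplot, assuming the fill property is mapped to the Species column with ggvis's formula syntax. The name p_grouped is hypothetical.

```r
library(ggvis)

# Map the fill colour of each point to the Species factor,
# so the three iris species get different colours.
p_grouped <- iris %>%
  ggvis(~Sepal.Length, ~Sepal.Width) %>%
  layer_points(fill = ~Species)

if (interactive()) print(p_grouped)
```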
Except this time I say group_by, that's a command ggvis makes available, group_by(Species), and then I'm going to add a layer_model_predictions that actually includes the regression information, or more specifically the regression lines. Let's zoom in on that one. And so far this is all stuff that's good, but it's not very different from what we could've done in ggplot2, which is designed for these sorts of things. But the big difference is this next part: ggvis allows us to add interactive components.
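The grouped regression might look like the sketch below. Two assumptions: the model is a linear model ("lm"), since the narration only says "regression lines", and group_by is used as re-exported by ggvis from dplyr. The name p_regression is made up.

```r
library(ggvis)

# Colour the points by species, then group the data by species so that
# layer_model_predictions fits and draws one regression line per group.
p_regression <- iris %>%
  ggvis(~Sepal.Length, ~Sepal.Width) %>%
  layer_points(fill = ~Species) %>%
  group_by(Species) %>%
  layer_model_predictions(model = "lm")  # assumption: linear fit

if (interactive()) print(p_regression)
```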
Now there are some limits on what you can do. At least currently, you can add interaction to the arguments of some of the transforms, and you can add it to properties. However, you cannot add or remove layers, or switch between different datasets, which is something you might want to do but currently aren't able to. So I'm going to create a single interactive graphic here, but with three different kinds of interaction. And I start with the same code. I say use the iris data, and then with the pipe (oh, by the way, it says it's guessing the formula, and that's just fine) I use the same ggvis call where I say use the length and the width.
And then here I'm going to do the layer_points. That's all the same. Now I'm going to add a component that says vary the size of the points with a slider. So previously I just spelled it out. This time I use input_slider, and then I give the range for the slider and I give it a title. And that's fine. Then I'm also going to vary the opacity of the points, how see-through they are. And I'm going to do that with another slider, except this time it varies from zero to one. And then that feeds with a pipe into this one here, where I'm going to vary the kind of fit.
I'm going to draw a line through it. I can either have a straight line, a linear model, or a LOESS smoother. And for this one I use what are called radio buttons. And so I put that in there and I spell out what the choices are, in terms of both how I want them to appear and the R code. And then I give it the default that's selected, and then a title by setting the label, and that's going to produce my graphic. And let's zoom in on that one. This is a great graphic because we have a full picture of the dataset.
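The three kinds of interaction described above (a size slider, an opacity slider, and radio buttons for the model type) can be sketched as follows. The slider range for size, the starting values, and the label texts are assumptions, since the transcript does not state them. The := operator sets a property directly instead of mapping it from a data column.

```r
library(ggvis)

p_interactive <- iris %>%
  ggvis(~Sepal.Length, ~Sepal.Width) %>%
  layer_points(
    # Slider for point size (range and start value are assumptions).
    size := input_slider(10, 200, value = 50, label = "Point size"),
    # Slider for opacity, from fully transparent (0) to fully opaque (1).
    opacity := input_slider(0, 1, value = 0.5, label = "Point opacity")
  ) %>%
  layer_model_predictions(
    # Radio buttons: display names on the left, R model names on the right.
    model = input_radiobuttons(
      choices = c("Linear" = "lm", "LOESS" = "loess"),
      selected = "loess",
      label = "Model type"
    )
  )

# Printing starts an interactive session that keeps R busy until stopped.
if (interactive()) print(p_interactive)
```

Because the controls need a running R process behind them, the console stays occupied while this plot is displayed.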
We have several ways of interacting with it. I can make the points larger, I can make them more transparent, I can make them more opaque. I can also change the line. Obviously you can see a straight line is not a really good model for this. I can do the LOESS smoother. It's a better way of looking at the data. Now something to keep in mind here: this is in the Viewer, and it's constantly going back and forth. R is listening to this one, and so it's not going to do much else while we're working on this. So let me come back here for a moment.
What we have to do is stop it. You can either press Escape or Ctrl+C, or you just hit Stop, and it clears it out and that leaves us with a static image. And that's the general idea of ggvis. It's a basic method, it's in active development, but it's a neat way of taking the principles of the Grammar of Graphics and applying them to interactive visualizations. There are, however, other methods in R that are more common, specifically the Shiny apps that RStudio has developed and made available, which can build on all this and incorporate some of the other principles that we've looked at.