From the course: R Essential Training: Wrangling and Visualizing Data

Introduction to ggplot2

- [Instructor] One of the things you find when you're using R is there's more to R than just R. Specifically, in this course, I've been using things like the tidyverse, a whole collection of packages that not only extend the functionality of R, but actually change the way that you interact with it. In addition, part of the tidyverse, ggplot2, represents a profoundly different way of creating graphics, and it's become for many people the gold standard of R graphics, and I want to introduce you a little bit to the theory behind ggplot2. The first thing is the very peculiar name, ggplot2. The gg part comes from grammar of graphics. It's inspired by this book, "The Grammar of Graphics," second edition, by Leland Wilkinson. This is the basis of the theory of the grammar of graphics, and ggplot and ggplot2 are implementations of that grammar of graphics. The basic idea here is to separate what is graphed, that is, the actual data behind it, from how it is graphed, or the way that it is represented in the graphic. You can see this when you look at the abstract general structure of ggplot2 commands. They don't all include all of these elements. They can be very short, but they give you the possibility of addressing a layered grammar of graphics from several different elements. So, for instance, here at the top, where you call ggplot, you actually specify what data is going into it, and you may actually have some commands that go there in the data statement. Then you go to the geom function, where geom stands for geometric, or geometric object. It can be a histogram, it can be a dot, it can be a line. It can actually be very sophisticated things, but you take that one function, and then you start telling it things like the mapping. How are you going to map the aes, which stands for aesthetic elements, the actual visual things that are depicted that represent the data? You can also do certain statistical transformations right here in terms of how you show the data.
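As a rough sketch of that general structure, here is one way the pieces might fit together in a single command. The dataset (`mpg`, which ships with ggplot2) and the column choices are assumptions made for illustration, not the example from the video. The position, coordinate, and facet lines correspond to the elements described next.

```r
library(ggplot2)

# mpg is a sample dataset bundled with ggplot2; the variables used
# here (class, drv, year) are illustrative choices only.
p <- ggplot(data = mpg) +                   # what is graphed: the data
  geom_bar(                                 # how it is graphed: a geometric object
    mapping  = aes(x = class, fill = drv),  # aesthetics: variables mapped to visual properties
    stat     = "count",                     # statistical transformation: count rows per class
    position = "dodge"                      # position adjustment: bars placed side by side
  ) +
  coord_flip() +                            # coordinate function (coord_polar() would give polar coordinates)
  facet_wrap(~ year)                        # facet function: one panel per model year

print(p)
```

Most commands leave several of these pieces at their defaults. A call as short as `ggplot(mpg) + geom_bar(aes(x = class))` is still a complete plot.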
And you can adjust the position of the object to best match the goals you have in your graphics. After that, you can also specify coordinate functions. Say, for instance, you want to do a polar coordinate system. You could specify that here, and you can also have a facet function, which allows you to include multiple graphs, possibly in rows and columns, to get a broader perspective on what you're working with. Now, ggplot2 also includes something called qplot, which stands for quick plot. These are commands that are quicker to work with. They're easy, they're fast, but they do have less power and control. I use them, and when I'm trying to do something where I don't feel a need to modify anything, I'll use a qplot command. And I'll demonstrate them several times in this course. Now, I want to give you a few other resources for ggplot2. One is the actual ggplot2 page on tidyverse.org, which explains a little bit about how to install it and gives a link to some other information. One thing you might want to look at is this page, which is ggplot2 extensions. These are other packages that build onto and connect with the functionality of ggplot. They allow you to do some impressive things, like animations, or simple things, like modifying where the labels appear. There are so many possibilities, and obviously, this is where you can see the power of ggplot, because it lets you specify things at such a micro level. It enables enormous creativity in the exploration and the presentation of your data. Finally, I want you to be aware of the cheat sheets that are available through RStudio, because the people who have developed ggplot2, Hadley Wickham in particular, work at RStudio. This is a downloadable PDF, which can give you a list of commands, including the over 40 different geometric objects and how you can specify some of the commands for working in ggplot.
So these are resources that are available to you, but in the videos that follow, I'll be showing you some very simple commands, both with qplot and ggplot, as ways of exploring data that are consistent with the tidyverse approach to R, all of which work together to help you better explore, understand, and present your data.
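To make the qplot versus ggplot trade-off concrete, here is a minimal sketch of the same scatterplot written both ways. The `mpg` dataset and the variables `displ`, `hwy`, `class`, and `drv` are assumptions chosen for illustration. Recent ggplot2 releases mark `qplot()` as deprecated, so the quick version may print a warning depending on the installed version.

```r
library(ggplot2)

# Quick version: qplot() picks a geom automatically
# (points, when both x and y are supplied).
quick <- qplot(x = displ, y = hwy, data = mpg)

# Full version: the same scatterplot, with data, mapping,
# and geom stated explicitly.
full <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

# The explicit form is easier to extend, for example by
# colouring points by class and adding one panel per drive type.
detailed <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(colour = class)) +
  facet_wrap(~ drv)

print(quick)
print(full)
print(detailed)
```

The first two objects draw the same points. The third shows the kind of fine-grained control that the longer ggplot form makes easy to add.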
