Join Barton Poulson for an in-depth discussion in this video, Using and managing packages, part of Learning R.
R is a very powerful and flexible program, even with its default installation. The beauty of R, though, is it can go so much further than its base version by adding packages or bundles of code that add functionality to R. At the moment, the Comprehensive R Archive Network or CRAN package repository lists over 4000 packages for R, all of which can be freely downloaded and installed. The creativity and functionality of these packages is astounding, leading many people such as myself to tell others that R can do anything.
In this movie, I want to show you how to find out about packages, how to install them, and how to use them in R. The first thing to do is to find out about packages that are available. On the bottom right of the screen here, and this is one of the nice things about RStudio, you have a list of packages that are already available. We start from the bootstrap functions under boot, to classification, and it goes all the way down to the utilities. These are ones that are installed, but it doesn't mean that they're loaded at the moment. The checkmark means that they're currently loaded, so the utilities and the stats are the ones that are loaded in this particular window.
Let's take a look at what some of the options are. I'm going to go to line 6 in the editor window here, and browseURL; this opens up the URL in a Web browser. I'm just going to run that line, and it's going to open up my default browser, and there you have a large list of categories of packages that are available. And CRAN, again, stands for Comprehensive R Archive Network. You can pick a field that you're interested in; say, for instance, under graphics, and a huge number of choices. One of the most popular, by the way, is this one right over here: ggplot2. That stands for the grammar of graphics.
That's a book, and this was written to be based on that book. It's an incredible package. I'm going to go back to R. That's a list of topics. You can also see what's available by name. In this case, I'm going to go to a specific mirror; the one that's at UCLA. I'm going to run this line. Here we have a very long list of packages. This is going to be the 4000 packages that are available. And pretty much everybody should be able to find something of utility for them in here.
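The script itself is not shown in the transcript, so the following is only a sketch of what the two browseURL lines might look like. The URLs point at the main CRAN site and are an assumption; the video uses a UCLA mirror whose address is not given here.

```r
# URLs assumed for illustration (main CRAN site, not the UCLA mirror from the video).
task_views_url <- "https://cran.r-project.org/web/views/"
by_name_url <- "https://cran.r-project.org/web/packages/available_packages_by_name.html"

# Only open a browser when running interactively, for example in RStudio.
if (interactive()) {
  browseURL(task_views_url)  # packages grouped by topic
  browseURL(by_name_url)     # alphabetical list of all packages
}
```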
The next step in line 9 is to bring up the editor list of the available packages. So, those are going to be the ones that I have already. I'm going to just run that line. What this does is it brings up a text file in an editor window; we see right up here, and this mirrors a lot of what's over on the right, except it does show ones that are invisible, like the base that you couldn't turn on or off if you wanted to. Close that, and say, what about the packages that are currently active? That is, the ones that are already checked. I can do that with search. Just run line number 10, and then in the console, it shows me the packages that are there.
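As a rough sketch of these two steps: calling library() with no arguments opens the listing of installed packages, and search() prints what is currently attached. The version below uses installed.packages() to get the same names as a vector, which is a substitution made here so the result can be inspected in the console; the variable names are illustrative.

```r
# Names of every installed package (library() shows the same information as a listing).
installed <- rownames(installed.packages())
print(head(installed))

# Everything currently on the search path: attached packages plus
# entries such as ".GlobalEnv".
attached <- search()
print(attached)

# Keep only the entries that are packages and strip the "package:" prefix.
attached_pkgs <- sub("^package:", "", grep("^package:", attached, value = TRUE))
print(attached_pkgs)
```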
It's got 11 listed. Again, not all of these show up, because some of them are invisible, like the global environment, but also the ones that are checked off on the right, you'll see in this list. Now, if I want to install a new package, say I found one that I really liked, there are a couple of ways to do this. For instance, you can come up to the menu, to Tools, to Install Packages. It brings up this menu; that's one way to do it. Or, you can use the Packages window here on the right, and just click the one that you want.
But personally, I find it easy to use scripts, and one of the reasons for that is that it makes the procedure repeatable for other people. And also, it means that you can run them in larger source scripts, and they can run automatically. Now, one that I like is called psych. And what I'm going to do is I'm going to run this line on number 18; install.packages. That's the command to download the package. Then you have to put the name of it in parentheses, and quotation marks. I am going to run that line; it's going to download the package.
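Since the point of scripting the install is repeatability, one possible refinement is to install only when the package is missing. The helper name install_if_missing and the repos URL below are assumptions made for this sketch and do not come from the video.

```r
# Hypothetical helper (not from the video): install a package only if it is absent.
# The repos URL is an assumption; any CRAN mirror would do.
install_if_missing <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # Report whether the package is available after the attempt.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this call does not download anything.
install_if_missing("stats")

# The call from the video, for an interactive session:
# install.packages("psych")
```

Calling install_if_missing("psych") at the top of a script would then download psych on the first run and do nothing on later runs.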
You see that's what we have here on the bottom left in the console. There's all this text, and it says it ran the command, it downloaded the package, it's been installed, and in fact, if you go to the Packages list on the right, and come down, you'll see that psych is now installed. It doesn't have a checkmark, because it hasn't been loaded. That's a separate procedure. So, what I'm going to do is I'm going to come to line 20, to library("psych"). Now, please note, the quotation marks in library are not necessary, but Google suggests them as a good format.
It's consistent with installing. You use the command library to make a package available when you're loading it in a script, like I am right now. On the other hand, if you've created a function or a package, sometimes you use instead require. Both of them have the same effect of loading the code that's in the package. I'm just going to use library, because that's the one that I use in scripts. So, I'm going to run that line, and then you see in the console that it ran library("psych"), and then you see in the window on the bottom right that I now have a checkmark next to psych.
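Both commands load a package that is installed, but they behave differently when it is not, which is one reason require tends to show up inside functions. The sketch below uses a made-up package name that is assumed not to exist on your machine.

```r
# A package name assumed not to exist, used only for illustration.
missing_pkg <- "notARealPackage12345"

# require() returns FALSE (with a warning) when the package is unavailable.
ok <- suppressWarnings(
  require(missing_pkg, character.only = TRUE, quietly = TRUE)
)
print(ok)

# library() stops with an error instead; tryCatch() captures it here.
result <- tryCatch(
  {
    library(missing_pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)
print(result)
```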
Require would do the same thing. Now, if you want to see the documentation, you can just come down here. I put library(help = "psych"). That lets it know what I want the help on. I run that line, and it brings up a window in the editor. It has a text description, and it has a lot of the information about what goes into it. It's pretty lengthy. But you can get even more, and in a different format, if you try a different approach. Instead of just doing this one, a lot of programs, and psych is one of them, have what are called vignettes, and these really are just examples of how to use the package.
So, what I'm going to do right here is I'm going to come to line 28, and I'm going to use the command vignette, then I'm going to specify it's for the package psych, so package = "psych". And if I run that, it brings up an editor window with not much in it. But if I do a small modification, and say I want to browse vignettes, that's going to open it up in a browser. It's going to look like this. Now, what I have is PDFs, and R codes, and LaTeX. I can hit on the PDF here, and now I can see a PDF that is nearly 100 pages of documentation on how to use the psych package.
That can be downloaded and saved. It can be searched. It's a wonderful thing. I'm going to go back to R. You can also bring up a list of all of the vignettes that are available in all of the packages that are currently installed in R. That's just vignette(). I'm going to run that line, and here are all the ones; we have displaylist, sharing, matrix, and just as we did with the psych vignettes a moment ago, if you want to have an interactive hyperlinked version of this, you just use browseVignettes(). Now I have the documentation for nearly everything, including, for instance, Sweave.
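The documentation commands from this part of the walkthrough can be sketched together as follows. The example uses grid, which ships with R, as a stand-in so that it runs without psych installed; that swap is an assumption made here, and you would set pkg to "psych" to match the video.

```r
# "grid" ships with R; swap in "psych" once it is installed.
pkg <- "grid"

# List the vignettes for one package as an object instead of opening a window.
v <- vignette(package = pkg)
print(v$results[, c("Item", "Title"), drop = FALSE])

# These open windows or a browser, so run them only in an interactive session.
if (interactive()) {
  library(help = pkg, character.only = TRUE)  # text summary of the package
  browseVignettes(package = pkg)              # hyperlinked list in a browser
}
```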
Now, once you have packages installed, it's important to remember that everything gets updated frequently in R, and so you're going to want to get things updated, including your packages that you use. In RStudio, there are a few different ways to do this. You come up to Tools to Check for Package Updates. You can do it there. You can also come over here, and just click on the green circle to check for updates, or you can just run this command: update.packages(). Run that one, and it lets me know that there are some updates. Cancel those for right now.
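For scripted use, one possible approach is to check what is outdated before updating. The wrapper name update_installed and the repos URL are assumptions for this sketch; old.packages() and update.packages() are the base R functions doing the work.

```r
# Hypothetical wrapper (not from the video) around the base update functions.
# The repos URL is an assumption; any CRAN mirror would do.
update_installed <- function(repos = "https://cloud.r-project.org", ask = FALSE) {
  outdated <- old.packages(repos = repos)  # NULL when everything is current
  if (is.null(outdated)) {
    message("All packages are up to date.")
    return(invisible(character(0)))
  }
  update.packages(repos = repos, ask = ask)
  # Return the names of the packages that were out of date.
  invisible(rownames(outdated))
}

# Run only on request, since it needs network access:
# update_installed()
```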
Then finally, if you have a package that you no longer need, you have the option of simply coming over here to the window, unchecking it, and then clicking on the X to get rid of it if you want, or you can also use this one: detach. That will also remove the package so it's no longer active. I'm just going to run that line. Now you see that the checkmark next to psych has disappeared, and if I want to get rid of it entirely, I just click on the X. Anyhow, that's one way that you can add extra functionality to R, and to give you some more of the flexibility and power to do almost anything that you need to do.
And again, like R itself, these are free, they're open source, and they can make your analytical life much easier, and much more creative.
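The detach step described above can be sketched as a small helper that only detaches a package when it is actually attached. The helper name detach_if_attached is hypothetical, and splines (which ships with R) stands in for psych so the example runs on a fresh installation.

```r
# Hypothetical helper (not from the video): detach a package only if attached.
detach_if_attached <- function(pkg) {
  entry <- paste0("package:", pkg)
  if (entry %in% search()) {
    detach(entry, character.only = TRUE, unload = TRUE)
  }
  # TRUE when the package is no longer on the search path.
  invisible(!(entry %in% search()))
}

# "splines" ships with R, so it stands in for psych here.
library("splines")
detach_if_attached("splines")

# To delete an installed package from disk, which as far as I know
# is what the X button in RStudio does:
# remove.packages("psych")
```

Detaching only undoes library(); the package stays installed until remove.packages() is called.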
The course continues with examples on how to create charts and plots, check statistical assumptions and the reliability of your data, look for data outliers, and use other data analysis tools. Finally, learn how to get charts and tables out of R and share your results with presentations and web pages.
- What is R?
- Installing R
- Creating bar charts for categorical variables
- Building histograms
- Calculating frequencies and descriptives
- Computing new variables
- Creating scatterplots
- Comparing means