From the course: R for Data Science: Lunch Break Lessons

R built-in data sets

- [Presenter] As you learn R, you're going to bump into something called a dataset, or a built-in dataset. All a dataset is, is a convenient way to explore the R language. So let's take a look at what these datasets are and how they work. The first thing you'll want to do is type in library, and then, in parentheses, help, equals, datasets. What you'll see as the result of that command is a list of all of the datasets that are available as part of R. So for example, here I've got something called AirPassengers, followed by a description, Monthly Airline Passenger Numbers from 1949 to 1960. And if we were to look inside of that particular dataset, what we'd see is exactly what's described: the passenger numbers for a series of years.
Now, let's look at another thing here called help files. I'll go back to the console window and type in a question mark followed by data. And you can see, incidentally, that I'm using RStudio, which provides a lot of autofill for me. In this case it says, "Oh, I see you're trying to call up the data command," and it helpfully pops down a menu. If I like what it's telling me, I can hit return, and then return again, and that will execute the command. Executing a question mark followed by the name of any function gives us the help file for that function or package. In this case, the help file is titled Data Sets, and it says the data command will load specified data sets or list the available data sets.
So let's go ahead and try that command. I'm going to go over here to the console again, and I'm going to type in data. Again, RStudio offers some help, so I say, "Yes, that's exactly what I want to do," and up above, you'll see that we have a list again of all of the datasets. AirPassengers, BJsales, BOD, CO2, all of these are datasets available for your use. A dataset that's used a lot is called mtcars, so let's take a look at that.
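
As a rough sketch of the commands described so far, assuming a standard R installation where the datasets package is attached by default: the three interactive calls open a viewer or help pane, so they are guarded to run only in an interactive session. The last lines capture the same listing as an object; the variable name `available` is chosen here for illustration and does not appear in the video.

```r
# Index of everything in the datasets package (opens in a viewer).
if (interactive()) library(help = "datasets")

# Help page for the data() function; equivalent to typing ?data.
if (interactive()) help("data")

# List the datasets available in the currently attached packages.
if (interactive()) data()

# The same listing captured as a character vector instead of a viewer.
# 'available' is an illustrative name, not something from the video.
available <- data(package = "datasets")$results[, "Item"]
head(available)

# Check that the datasets mentioned in the video are in the list.
c("AirPassengers", "mtcars") %in% available
```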
So the first thing we need to do is load it. I'll type in data, then an open parenthesis, and if I type in M-T, you'll see that RStudio is helping me choose which dataset I want to load. I can hit return to accept its suggestion, which is mtcars, and it adds the quote marks for me as well. Now if I hit return again, you'll see a couple of things happen. The most important thing is up here in the global environment, in the upper right-hand corner: you can see that we have something called mtcars listed as a value, and the type of mtcars is listed as a promise. What that means is that we've loaded mtcars but haven't done anything with it yet, so RStudio and R are just telling us, "mtcars is available for your use when you choose to do something with it."
So let's go ahead and do something with it, and you'll see that change again. I'll type in head, H-E-A-D, which is a command that shows us the top of a particular dataset. If I type in an open parenthesis and start typing mtcars, you'll see something just changed here. Let's take a look at that. First of all, RStudio is suggesting that we want to use mtcars, but it has also shown us that mtcars has actually been loaded, and in the upper right-hand corner you'll see mtcars followed by 32 observations of 11 variables. As a side note, an observation is equivalent to a row, and a variable is equivalent to a column. So let's go ahead and look at the top of mtcars. I'll run head of mtcars, and what this shows me is the first six rows of the mtcars dataset.
Now, once you've loaded a dataset, you can go ahead and do some experiments. An easy command is plot, P-L-O-T. What plot will do for us is generate a plot, and we need to give it two variables to plot against each other. We can select mtcars, dollar sign, and let's plot the horsepower, H-P, against mtcars, dollar sign, miles per gallon, M-P-G.
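
The steps just described can be sketched as follows, assuming the standard mtcars dataset that ships with R. The call to `dim` is an addition not shown in the video; it is included only to confirm the row and column counts that RStudio displays.

```r
# Load mtcars into the global environment. In RStudio it shows up as a
# promise until the first time it is actually used.
data("mtcars")

# Show the first six rows of the data frame.
head(mtcars)

# Rows (observations) and columns (variables): 32 and 11.
dim(mtcars)

# Scatter plot with horsepower on the x-axis and miles per gallon on the y-axis.
plot(mtcars$hp, mtcars$mpg)
```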
And now if I hit return on that command, we can ignore the warnings, but you'll see that on the right-hand side a plot has shown up under the Plots tab of RStudio, and it shows us horsepower versus miles per gallon. We're going to talk about plot again in a later video, but for right now it just gives you an example of using a dataset.
There are also things called built-in constants. These are a little bit different from datasets, but you can think of them in much the same way. For example, one built-in constant is called LETTERS, in all caps, and if we type in LETTERS, we can see that it contains the capital letters of the alphabet. So this gives you an idea of what datasets are, as well as built-in constants. And again, datasets are just a convenient way to explore the R language.
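
For reference, here is a short sketch of the built-in constants in base R. Only the alphabet constant is mentioned in the video; the others are listed on the same help page and are shown here as extra examples. Note that lowercase `letters` and uppercase `LETTERS` are two separate constants.

```r
# Lowercase and uppercase alphabets are separate constants.
letters   # "a" "b" "c" ... "z"
LETTERS   # "A" "B" "C" ... "Z"

# Other built-in constants documented under ?Constants.
month.name   # full English month names
month.abb    # three-letter month abbreviations
pi           # the ratio of a circle's circumference to its diameter
```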
