Join Barton Poulson for an in-depth discussion in this video, Taking a first look at the interface, part of R Statistics Essential Training.
In this movie, we're going to take a quick look at what R looks like when you first open it. And we're going to compare its functions to some of the things you could or could not do with spreadsheets or other programs you may be accustomed to. Now, I'm going to be using this in RStudio. When you first open up RStudio, what you have here on the top left is this script window. If it's not there when you first open it, just go to File, New, and then R Script, and then this window here will pop up. This is where you do your actual writing, your composition, your programming, and what I want to do is show you some of the very simple things you can do before we get into full-on data analysis.
You have the option of using the default R environment. Interestingly, if you're a command line type of person and you're using a Mac, you can actually launch R from the terminal by just typing the letter R. If you're in Linux, you type R at the $ prompt. You can also set up R to run from the text editor of your choice by going to the Preferences or Options. Now, what we have here is the Script window. We have the Console window beneath it, which is where we get our output, our text output. And then we have the Workspace and the History on the top right.
And then we have the Files window, the Plots, something about the Packages, which I'll explain more later. And the Help window in the bottom right. In fact, I'm just going to click on Help right now, so you can see it all there. What I want to do right now is show you some of the things you can do that, again, would be similar to what you can do in a spreadsheet, but with some important differences. The file that I'm using right here is one of the files that comes in the lynda.com exercise files. This is for chapter one, movie three, 01_03. If you double-click, you'll have the full text right now.
I'm going to be entering it line-by-line so we can go through it one piece at a time. I want to start by explaining these first three lines that have the pound signs in front of them. That makes them comments, and that means you put a pound sign at the front or a hashtag, if you want to talk Twitter. You put that there and R will ignore the rest of the line after that. Now, one of the sad things about R is that it does not have the built-in capacity for multi-line comments. In other programs there are different symbols that let you type an entire paragraph as a comment. With R you simply have to put a pound sign at the beginning of each line, but there's an easy way to do that.
In fact, if I come right here and I highlight the text and do Shift+Cmd+C, it takes off the commenting, or it puts the commenting back on. I'm going to put that back right there. And now, I'm going to start showing you bits and pieces of what R can do. The first thing I want to show you is basic math. If you type in just 8 plus 5, and hit Cmd or Ctrl+Enter, now that's the command in RStudio to have something go from the script window to show up in the console.
Here in the console, you just hit Enter. So I've got 8 + 5 here and I'm going to press Cmd+Enter. And then at the bottom you see that it repeats the command that I just did. And then we have this funny thing where we have a 1 in brackets, and then a 13. The reason for this is that R works on vectors, which are one-dimensional sets of numbers. And even if you only have a single number, it's a vector of a single number. And what R does is, when it prints the results, it gives you the index number for the first number in a vector in that line.
So the first one is always going to begin with 1. So there's only one number here, 8 + 5 is equal to 13. And so, we have an index number for one but then we have our actual number right there. The next thing, is if you want to print a whole lot of numbers. So for instance, in R, you can just type 1 and then :250, it's sort of the seriation thing, and we're actually going to print up, not to 100 but to 250, so I'm going to select that line and hit Ctrl+Enter. And now you see I've got a lot of lines. I'm going to scroll up.
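As a minimal sketch of the two commands just described (assuming a fresh R session; the comments describe what the console should show):

```r
# A line that starts with # is a comment; R ignores everything after the #.

8 + 5          # prints [1] 13: a vector of length one, labelled with index 1

1:250          # prints the integers 1 through 250 over several lines;
               # each line starts with the bracketed index of its first value

length(1:250)  # prints [1] 250
```

The bracketed number at the start of each printed line is only a position label, not part of the data.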
And then, this demonstrates really what the index number is at the front. You can see, for instance, that it's going to match the first number of each one because it's giving you the index number for each of those. So, the number 1 is our first number, so it's, you know, 1 and it goes all the way through down to 250. Now, this makes the console a little busy, so I'm going to clear the console right now. I'm going to use Ctrl+L and that clears the console. Let's go to the next line in the script. You can print words if you want.
Anytime you're learning a new language, and R counts as a programming language, one of the first things you learn is how to print Hello World!. So, we're going to do a little Hello World! right now. You use the command print and then in the parentheses and quotation marks you put whatever text you want to appear. I'm going to hit Cmd+Enter. This is a single item. Now what we have here is one item in a vector and it's Hello World!, and basically it's equivalent to what you might call a string variable in other programming languages. After that, we're going to talk a little bit about some kinds of variables. If you want to, you can create a variable called x.
And then you can put the numbers 1 through 5 in it. So, let's take a quick look at how this works. First, I put the name of the variable, and that's the x, right over here. Then I have this little thing right over here, which is the less than sign. And a dash that makes an arrow. That is the assignment operator. Now it is possible to use an equal sign. However, that is frowned upon. It's considered poor style in R. The assignment operator in R is this little arrow thing. And then you have the thing that goes into the variable.
By the way, the variable is a vector, and so it can hold more than one value. And then, I just finish. You don't have to finish with a semicolon or anything, it's just more like Python, the end of the line is the end of the command. And so, I'm just going to press Cmd+Enter. And now you see what's happened, in the console beneath, it repeats the command, but it doesn't show the output. The output is over here in the workspace on the top-right. If we go up there, we see that it now shows that I have a variable called x, and that it's got integers in it, and there are five values.
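The print call and the assignment described above can be sketched as follows. The calls to class() and length() are an addition here, used to show in the console what the workspace pane reports:

```r
print("Hello World!")  # prints [1] "Hello World!": a character vector of length one

x <- 1:5     # "x gets 1 through 5"; assignment prints nothing in the console

class(x)     # prints [1] "integer"
length(x)    # prints [1] 5
```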
Now if you want to actually see what's in it, all you gotta do is type the word x. You just type the name of the variable, and now hit Ctrl+Return. And now it shows me that it's got one, two, three, four, five, that's what's in it. You also can just come up here and click on it, and that shows the actual command itself. Our next step here, is I'm going to create another variable called y, and I'm going to manually set the numbers in it by using the concatenate function. That's just a c. So, I have a lowercase y.
And let me point out, R is case sensitive, so a lowercase x and a capital X are different variables. This time I'm creating a lowercase y. I have my assignment operator. By the way, one of the neat things about RStudio is you have a shortcut for the assignment operator. And it's Option+-. It will insert a space, the assignment operator and a space after that. That'll save you just a tiny bit of typing. But the c here is for concatenate, or you can also think of it as combine or collection. And then in parentheses, I put the numbers that I want to go into that, and then I enter it, and you see that a new variable has shown up in my workspace.
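A short sketch of inspecting x, building y with c(), and case sensitivity. The values 6 through 10 are the ones the exercise file uses for y; the capital X variable is hypothetical and added only to illustrate the point about case:

```r
x <- 1:5                  # as created earlier
x                         # prints [1] 1 2 3 4 5

y <- c(6, 7, 8, 9, 10)    # c() combines the listed values into one vector
y                         # prints [1]  6  7  8  9 10

X <- "a different object" # hypothetical: capital X is separate from lowercase x
identical(x, X)           # prints [1] FALSE
```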
Now I come down, I'm going to do a little bit more here. Let's take a look at the contents of y. There it is. Okay. The next thing we're going to do is take a little closer look at assignment here. You want to put the variable on the left, and you put the value on the right with the assignment operator, the arrow, in between. Now, interestingly you can do it the other way if you want. You could put the value on the left as long as the assignment operator is pointing to the variable, it understands what you mean. The assignment operator is often read as gets, so this says on line 18, a gets 1 or the bottom one, a gets 2.
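The two directions of assignment can be sketched like this (a minimal example; the values 1 and 2 follow the narration):

```r
a <- 1   # "a gets 1": variable on the left, value on the right
a        # prints [1] 1

2 -> a   # the arrow reversed; it still points at the variable, so "a gets 2"
a        # prints [1] 2
```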
You can do it backwards, but that's a silly thing to do. And it would just be confusing for other people. Also, you can do multiple assignments. So, for instance, right here, I can assign 3 variables all with the value of 3. I'm going to hit return, and now you see all three of them popped up. And right here, it puts the actual value because there's only a single value in each one. And it's 3 all the way through. And let's take a look at how to do a little bit of math in this. Now this is vector math, so it works a little differently, so I've got my x variable right here.
It's got the 1, 2, 3, 4, 5, and I've also got my y variable, and that one is the 6, 7, 8, 9, 10. And then what we can do is you can actually add the elements in each of these vectors. If they're the same length, then they just correspond with each other. So, what it does is, I'm going to do x plus y here. And, then you see, it's done a 1 plus 6, 2 plus 7, 3 plus 8, and so on. It's taken each element in x and added it to the corresponding element in y. This is a very easy way to do a large set of functions.
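As a sketch of the multiple assignment and the element-by-element addition just described (the name sum_xy is hypothetical, introduced only to hold the result):

```r
x <- 1:5
y <- c(6, 7, 8, 9, 10)

a <- b <- c <- 3   # multiple assignment: a, b, and c each get the value 3

sum_xy <- x + y    # adds 1 + 6, 2 + 7, 3 + 8, and so on
sum_xy             # prints [1]  7  9 11 13 15
```

Because x and y have the same length, each element is paired with the one in the same position.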
I can also do multiplication doing x and the asterisk for times. So x times 2 multiplies each element in x by 2. And there you can see 2, 4, 6, 8, 10. So we've gone through that very quickly. There are just two more things I want to show you, and that is in case you have questions about things like the assignment operator and whether there are any particular recommendations on how to write your code in R, the answer is yes. A convenient place to look is the style guide that Google has actually written for their internal use of R.
And this right here is the URL, and the browseURL function means you can just do the Cmd+Return on this one and it will open up in your default browser. So there is Google's R Style Guide. And this is where we learned, for instance, for assignment, use the assignment operator, don't use equals, even though it's possible to use equals. And then I'm going to show you just one more thing to finish up here, and that's how to clean up. You see that we've got five variables or vectors here in our workspace, a, b, c, x, and y. What you can do is you can remove one thing at a time if you want by using rm or by writing out the word remove.
Either one works. And then in parentheses, put the thing that you're removing, so this first one I'm going to remove the variable x. I hit that, and you see that x is now gone from the workspace on the right. If you want to remove more than one thing at a time, you can just put them in there with a comma between them. Now I'm going to remove a and b. Now they're gone. Or if you want the shortcut and you just want to clear out everything all at once you use this one. It says rm or remove and then you type list is equal to ls. The ls command lists all the objects in the workspace, and with empty parentheses there, that will clear the entire workspace all at once.
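The clean-up steps above can be sketched as follows, assuming a fresh session with only these five objects. Note that the last rm() call clears everything in whatever workspace you run it in:

```r
# Recreate the five objects from the walkthrough.
x <- 1:5
y <- c(6, 7, 8, 9, 10)
a <- b <- c <- 3

rm(x)            # remove a single object; remove(x) does the same thing
rm(a, b)         # remove several objects, separated by commas
ls()             # lists what is left: "c" and "y" in a fresh session

rm(list = ls())  # remove every object that ls() lists
ls()             # prints character(0)
```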
And it's a nice way of making sure that when you get to another script, because you can have multiple scripts open, you don't have variables with the same name or conflicts. It's a good way of keeping track of what you've got. And so, conceptually, R is very simple, and because it's command line based it doesn't use a lot of menus. It can be very helpful to keep a few windows open simultaneously, such as the Console, the Editor and Help. Or you can make it as simple as you want. I find working here in RStudio is very easy. And I find that the simplicity of the commands here makes it very easy to work with my numbers, which we're going to see when we start working with data.
- Installing R on your computer
- Using the built-in datasets
- Importing data
- Creating bar and pie charts for categorical variables
- Creating histograms and box plots for quantitative variables
- Calculating frequencies and descriptives
- Transforming variables
- Coding missing data
- Analyzing by subgroups
- Creating charts for associations
- Calculating correlations
- Creating charts and statistics for three or more variables
- Creating crosstabs for categorical variables