From the course: SQL Server Machine Learning Services: R

Variables in R

- [Instructor] One of the most common data management techniques across all programming languages is to use containers called variables to store values for use later on in a script. In Transact-SQL, the process looks something like what I have on lines two through four. Here, we'll use the declare keyword, and I'll set up a variable called my value. I'm going to set this variable to an integer data type. Then on line number three, we'll set its value equal to 23, and then finally, on line four, we can use that in a select statement. I'm going to highlight lines two through four and execute the statement, so you can see the results here.

Now, creating variables in R is much simpler, and it looks something like what I have on line number 10. Here I'm creating a variable again, I'll use the name my value, and I'll set its value to 23. To assign a value to a variable in R, you use the less than symbol followed by the hyphen character. This does the same thing as the equal sign up here in T-SQL. So, we're creating a variable called my value, and it gets the value of 23. Then we can send that value out to the print function. I'm going to highlight lines seven through 12. And actually, let's go ahead and just get rid of these lines up here at the top instead. That way, I don't have to highlight everything through the rest of this movie. Now, execute the statement. You see the result of the R version of our variable. And we get the output here, and it says 23. So, we've created our variable, and we sent it to the print function.

We can now apply the techniques that we saw in the previous chapter to load multiple values into a single variable. One way is to specify a range with a colon. So, instead of just loading in the number 23, I'll say 23:60. I'll execute the statement again. And this time, we get all the values from 23 through 60. Or we can combine a comma-separated list of values with the c function.
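As a rough sketch of the R side of this walkthrough: the transcript never shows the exact identifier on screen, so `my_value` below is an assumed spelling, and the snippet is plain R for any R session, not the instructor's exact script.

```r
# Assumed name: the transcript only says "my value".
my_value <- 23     # "<-" assigns the value 23 to the variable
print(my_value)

# A colon between two numbers builds the sequence of whole numbers
# from the first through the second, inclusive.
my_value <- 23:60
print(my_value)
```

Reassigning the same name simply replaces the earlier contents, which is why the second `print` shows the whole range instead of the single value.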
Instead of loading in 23 through 60, I'll say c, open a parenthesis, and then the numbers 23, comma, 26, comma, and let's say 32. The c function will combine these three values into a single variable. I'll execute the statement, and then I get the three values down below.

Now that we have multiple values in a single vector saved into a named variable, we can pull out individual elements, using square bracket notation to ask for the index. So, if I wanted to see the value at index position number two, I can print its value here. I can say print my value, open up a square bracket, and the index position two. We'll close the bracket and execute the statement. This time it only returns the value in the second position, 26. I can change this to a three to see the value in the third position, and that returns 32. What happens if I specify an index value that doesn't exist? Let's change this to four. This time when I execute the statement, I get a result of NA, meaning that this value isn't available.

Now, we can also use these index positions to change individual values that are already stored in the variable. Right now the my value variable contains these three values. Let's go ahead and replace one of them. I'm going to say my value at index position number one is going to get reassigned. Let's assign it the new value of 100. Now, we'll print out the entire variable list. We'll go ahead and get rid of this square bracket notation. This time when I execute the statement, we get the numbers 100, 26, and 32. The original value 23 is getting overwritten on line number five.

We can also use this technique to add an index where one didn't already exist, and simply add new data to the vector. I'll come up here and create a new line number six, and we'll say my value at index position four gets the value 96. This time when I execute the statement, we get four numbers: 100, 26, 32, and the new number, 96.
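The indexing steps described here could look roughly like the following in plain R. The name `my_value` is an assumption (the transcript only says "my value"), and the statements are collected into one script, not the instructor's edit-and-rerun sequence.

```r
# Assumed name: the transcript only says "my value".
my_value <- c(23, 26, 32)   # c() combines the values into one vector

print(my_value[2])          # element at index position two
print(my_value[4])          # no fourth element yet, so R reports NA

my_value[1] <- 100          # overwrite the existing first element
my_value[4] <- 96           # assigning to a new index extends the vector
print(my_value)
```

Note that R counts index positions from one, so `my_value[1]` is the first element.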
And finally, you can also perform mathematical operations to load data into new indexes. Let's add a new value at index five. This one, we're going to assign the value of my value at index four plus 100. This time I'll execute the statement, and we get a new number, 196, added to the end of the list.

So, now that we have a vector that contains multiple values saved as a variable, you might be curious to find out what type of data it contains. We can apply the typeof function to the variable to find out its data type. Remember that a vector, like a SQL Server table column, can only contain values of a single data type. In addition to printing out the full list of my values, I'm going to come down and print one more line. This time, I'll print typeof my value. This time I get two lines in the messages output. The first one just lists out the original numbers, and the second one tells me that we're dealing with a set of double-precision floating-point numbers.

If I need to work with these numbers as integers, I can convert them with as.integer. I'll come up here and insert that onto line number nine. We'll apply the as.integer function to my values, and then we'll check the type of that. This time we'll execute the statement. And now, we're working with them as integers. But keep in mind that these values are just integers for this specific print line. If I want to work with these values as integers for everything, I can convert them with as.integer on an earlier line. Let's take all of this out on lines number eight and nine. And I'm going to set my value, and I'll assign the entire set a new value of as.integer, applied to the original my value values. Now, we can print my value again. And we'll also check its type. We'll execute the script. And now, we get the numbers. These are now being treated as integers. So, that's a bunch of different techniques you can apply to working with variables in R.
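A minimal sketch of the arithmetic and type-conversion steps, again assuming the variable is spelled `my_value` and starting from the four values the vector held at this point in the walkthrough:

```r
# Assumed name and starting values, taken from the narration.
my_value <- c(100, 26, 32, 96)

# Compute a new element from an existing one.
my_value[5] <- my_value[4] + 100
print(my_value)

print(typeof(my_value))               # numeric literals are stored as doubles

# Converting inside the call affects only this expression;
# my_value itself is still a double vector afterwards.
print(typeof(as.integer(my_value)))

# Assigning the converted vector back makes the change stick.
my_value <- as.integer(my_value)
print(my_value)
print(typeof(my_value))
```

The difference between the last two steps is the point the narration makes: `as.integer` returns a converted copy, so the stored variable changes only when that result is assigned back to it.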
Variables can hold many values in a one-dimensional vector, which makes them easier to manage as a group.
