From the course: R for Data Science: Lunch Break Lessons

Be careful with transpose

- [Instructor] When you're using outside data, you are inevitably going to run into a data set that is really, really wide or really, really tall, and you want to flip it on its side. You want to transpose it from wide to tall or tall to wide. There are a couple of things you need to watch out for when you start doing that, so let's go through what those are.

First of all, I've created a small bit of code here that creates a dataframe called talldata, so let's take a look at it. Here is talldata, and I'll open it up, and you can see that it's a pretty simple data set. There are 10 rows, with columns for deca, alpha, and month. No surprise there.

Now there are a couple of things you'll want to notice here. First of all, I can use dollar sign addressing: talldata, then a dollar sign, then month, to access the column called month, and you'll see January, February, March, April, May. What's interesting to note is that talldata$month is a factor, and we can check that by typing str, which shows the structure, of talldata. You'll see that month is listed as a factor with 10 levels. That's important to remember, and I'll show you why in just a second.

Now let's make talldata wide. To do that, let's create an object called widedata, and into widedata I'm going to assign t, the transpose function, of talldata. This flips talldata on its side, essentially. So I'm going to run that, and now I have an object called widedata. If I click on that, what you'll see is the same data from talldata, but now it's wide. In talldata, the columns are labeled deca, alpha, and month. In widedata, the rows are labeled deca, alpha, and month. So this looks great, doesn't it? It's exactly what you want. However, there is something you need to find out, so let's look at the structure of widedata with str, the structure command.
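The steps so far can be sketched in R roughly as follows. The transcript does not show the code that builds talldata, so the data frame below is a hypothetical stand-in: the deca values are assumed, alpha holds the letters a through j, and month is created explicitly as a factor so that it has 10 levels.

```r
# Hypothetical stand-in for the lesson's talldata; the real values are not shown.
talldata <- data.frame(
  deca  = seq(10, 100, by = 10),   # assumed numeric values
  alpha = letters[1:10],           # a through j
  month = factor(month.name[1:10], levels = month.name[1:10]),
  stringsAsFactors = FALSE         # keep alpha as character in any R version
)

talldata$month   # dollar-sign addressing returns the month column
str(talldata)    # month is reported as a factor with 10 levels

widedata <- t(talldata)  # transpose: the three columns become three rows
str(widedata)            # a character matrix, no longer a data frame
```

Comparing the two str() calls is the quickest way to spot that the column types did not survive the transpose.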
Now this looks different from what we saw when we ran str on talldata. What you're seeing is that the contents of widedata have been converted from a mix of factors, numbers, and characters to all characters. To see why, let's use the class command to find out what's going on. With class of widedata, we find out that widedata has been turned into a matrix. The class of talldata was a dataframe. This is crucial because, as you'll remember from our earlier weekly sessions, a matrix consists of rows and columns that are all the same type. You cannot mix factors, characters, and numbers in a matrix. What's critical about this is that deca, for example, has been turned into characters.

It's also important because addressing rows and columns has changed. With talldata, for example, we could use the dollar sign and then month, and that would give us all of the months in that column. With widedata, if I try the same thing, there is no column called month, because dollar sign addressing doesn't work on a matrix, and we get an error. So what I need to do instead is use bracket addressing. If I type widedata, a bracket, and ask for the second row in all of the columns, then what I get is the second row, which is alpha: a, b, c, d, e, f, g, h, i, j. If I type talldata, a bracket, and ask for the second column, you'll see that I get the same values. So it's important to understand that if you transpose a data set, flipping it on its side 90 degrees, the transpose command is going to turn it into a matrix, and matrices behave differently than dataframes.
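A minimal sketch of the class check and the two addressing styles might look like this. It rebuilds the same hypothetical talldata (the deca values are assumed, not taken from the lesson), and the helper names dollar_result, wide_alpha, and tall_alpha are made up for illustration.

```r
# Hypothetical stand-in for the lesson's talldata; deca values are assumed.
talldata <- data.frame(
  deca  = seq(10, 100, by = 10),
  alpha = letters[1:10],
  month = factor(month.name[1:10], levels = month.name[1:10]),
  stringsAsFactors = FALSE
)
widedata <- t(talldata)

class(talldata)   # "data.frame"
class(widedata)   # includes "matrix"

# Dollar-sign addressing fails on a matrix; tryCatch captures the error
# message instead of stopping the script.
dollar_result <- tryCatch(widedata$month, error = function(e) conditionMessage(e))
print(dollar_result)

# Bracket addressing: second row, all columns (the alpha values).
wide_alpha <- widedata[2, ]
print(wide_alpha)

# Second column of the data frame holds the same values.
tall_alpha <- talldata[, 2]
print(tall_alpha)
```

Because every cell of widedata is now a character string, the deca row would have to be converted back with something like as.numeric() before doing arithmetic on it.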
