From the course: R for Data Science: Lunch Break Lessons
Be careful with transpose
- [Instructor] When you're using outside data, you are inevitably going to run into a data set that is really wide or really tall, and you'll want to flip it on its side. You want to transpose it from wide to tall or from tall to wide. There are a couple of things you need to watch out for when you do that, so let's go through what those are.

First of all, I've created a small bit of code here that creates a data frame called talldata. Let's take a look at it. Here is talldata; I'll open it up and you can see that it's a pretty simple data set. There are 10 rows and columns for deca, alpha, and month. No surprise there.

Now there are a couple of things you'll want to notice here. First of all, I can use dollar sign addressing: talldata, then a dollar sign, then month, to access the column called month, and you'll see January, February, March, April, May, and so on. What's interesting to note is that talldata$month is a factor. We can check that by typing str, which shows the structure, of talldata. You'll see that month is listed as a factor with 10 levels. That's important to remember, and I'll show you why in just a second.

Now let's make talldata wide. To do that, I'll create an object called widedata, and into widedata I'm going to put the result of t, the transpose function, applied to talldata. This flips talldata on its side, essentially. So I'm going to run that, and now I have an object called widedata. If I click on it, what you'll see is the same data from talldata, but now it's wide. In talldata, the columns are labeled deca, alpha, and month. In widedata, the rows are labeled deca, alpha, and month.

So this looks great, doesn't it? It's pretty much exactly what you want. However, there is something you need to be aware of, so let's look at the structure of widedata: str, which is the structure command, of widedata.
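The steps so far can be sketched roughly as follows. The transcript does not show the code that builds talldata, so the values below are a hypothetical stand-in chosen only to match the description (10 rows, a numeric deca column, a character alpha column, and a month factor with 10 levels).

```r
# Hypothetical stand-in for the instructor's talldata; the actual values
# are not shown in the transcript.
talldata <- data.frame(
  deca  = seq(10, 100, by = 10),                               # numeric
  alpha = letters[1:10],                                       # character
  month = factor(month.name[1:10], levels = month.name[1:10]), # factor
  stringsAsFactors = FALSE
)

talldata$month   # dollar-sign addressing returns the month column
str(talldata)    # month is listed as a factor with 10 levels

# t() is the transpose function: columns of talldata become rows of widedata
widedata <- t(talldata)
str(widedata)    # no longer a data frame; every value is now a character string
```

Comparing the two str() calls side by side is the quickest way to see that something changed besides the orientation.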
Now this looks different from what we saw when we ran str on talldata. What you're seeing here is that the contents of widedata have been converted from a mix of factors, numbers, and characters to all characters. To find out why, let's use the class command. Class of widedata tells us that widedata has been turned into a matrix. Class of talldata was a data frame. This is crucial because, as you'll remember from our earlier weekly sessions, a matrix consists of rows and columns that all hold the same type of value. You cannot mix factors, characters, and numbers in a matrix. What's critical about this is that deca, for example, has been turned into characters, so you can no longer do arithmetic on it without converting it back to numbers.

It's also important because addressing rows and columns has changed. With talldata, for example, we could use the dollar sign and then month, and that would give us all of the months in that column. With widedata, if I try to do the same thing, we get an error, because dollar sign addressing doesn't work on a matrix. What I need to do instead is use bracket addressing. If I type widedata, then a bracket, and ask for the second row in all of the columns, what I get is the second row, which is alpha: a, b, c, d, e, f, g, h, i, j. If I type talldata, then a bracket, and ask for the second column, you'll see that I get the same information.

So it's important to understand that if you flip a data set on its side, 90 degrees, using the transpose command, it's going to be turned into a matrix, and matrices behave differently from data frames.
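A minimal sketch of the class check and the change in addressing, again assuming a hypothetical talldata built to match the description in the transcript (the instructor's actual values are not shown):

```r
# Hypothetical stand-in for the instructor's talldata
talldata <- data.frame(
  deca  = seq(10, 100, by = 10),
  alpha = letters[1:10],
  month = factor(month.name[1:10], levels = month.name[1:10]),
  stringsAsFactors = FALSE
)
widedata <- t(talldata)

class(talldata)   # a data frame
class(widedata)   # a matrix (R 4.0 and later also report "array")

# Dollar-sign addressing fails on a matrix; tryCatch() captures the error
# so the script keeps running.
dollar_result <- tryCatch(widedata$month, error = function(e) "error")

# Bracket addressing works on both objects
wide_alpha <- widedata[2, ]   # second row, all columns
tall_alpha <- talldata[, 2]   # second column, all rows

# Rows of the matrix can also be addressed by name
wide_month <- widedata["month", ]
```

Here wide_alpha and tall_alpha hold the same ten letters, which is the equivalence the instructor demonstrates with bracket addressing.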