Join Mike Chapple for an in-depth discussion in this video, Aggregations in the data set, part of Cleaning Bad Data in R.
- [Instructor] Another issue that can impact the integrity of your dataset is when your source data includes pre-computed aggregations. This situation happens very often when you're dealing with census data. Let's take a look at an example. I'm going to run the code that I have loaded here to load the tidyverse, change my working directory, and then load a data file that includes information about the population of the city of Carpinteria in California. Let's go ahead now and take a look at the dataset that we loaded.
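The setup steps described above could look roughly like the sketch below. The transcript does not give the file name, the working directory, or the column names, so this sketch writes a small stand-in CSV to a temporary file instead; the `category` column and every value in it are hypothetical placeholders, not the course data.

```r
library(tidyverse)

# In the course, a call such as setwd() would point R at the folder holding
# the exercise files. Here a temporary file stands in for the real data file.
csv_path <- tempfile(fileext = ".csv")

# Hypothetical contents: one row per category of person, with a count.
write_lines(
  c("category,population",
    "Total,100",
    "Male,48",
    "Female,52"),
  csv_path
)

# Read the file into a data frame, stating the column types explicitly.
carpinteria <- read_csv(
  csv_path,
  col_types = cols(
    category = col_character(),
    population = col_double()
  )
)
```

Declaring `col_types` up front is optional, but it makes the import fail loudly if the population column ever contains non-numeric text.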
I'm going to use the glimpse function to do this. It looks like each row contains information about a type of person who might be in the city, and the number of people in that category. I could go ahead and compute the total population by simply adding up all the rows using the sum function. I'll just use sum, across the Carpinteria data frame, and the population variable in that data frame. When I do that, I get a result of 40,659.
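The inspection and the naive total narrated here might be written as follows. The data frame name `carpinteria` and the `population` variable follow the narration; the `category` column and the values are hypothetical stand-ins, so the result will not match the figure quoted in the video.

```r
library(tidyverse)

# Hypothetical stand-in for the course data: one row per category of person.
carpinteria <- tibble(
  category = c("Total", "Male", "Female"),
  population = c(100, 48, 52)
)

# glimpse() prints each column with its type and first few values.
glimpse(carpinteria)

# Naive total: add up every row of the population variable.
naive_total <- sum(carpinteria$population)
naive_total
```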
Now, I've been to Carpinteria many times, and I don't think that sounds right.
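Given the topic of this video, a likely cause of an inflated total is that some rows are pre-computed aggregates of other rows, so summing every row counts the same people more than once. The sketch below illustrates that effect with hypothetical categories and values; it is not the instructor's data or necessarily the fix shown later in the course.

```r
library(tidyverse)

# Hypothetical census-style table: a grand total plus two overlapping
# breakdowns (by sex and by age) of the same 100 people.
carpinteria <- tibble(
  category = c("Total", "Male", "Female", "Under 18", "18 and over"),
  population = c(100, 48, 52, 20, 80)
)

# Summing every row counts each person three times.
naive_total <- sum(carpinteria$population)

# Keeping a single, non-overlapping breakdown gives the real total.
corrected_total <- carpinteria %>%
  filter(category %in% c("Male", "Female")) %>%
  summarize(total = sum(population)) %>%
  pull(total)

naive_total      # 300
corrected_total  # 100
```

Which rows to keep depends on how the source file is organized, so it is worth checking the data dictionary before filtering.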
Where possible, instructor Mike Chapple shows how to correct the issues using R, but the same principles can be applied to any statistical programming language.
- Missing data
- Duplicate rows and values
- Converting data
- Formatting data
- Working with tidy data
- Tidying data sets
- Dealing with suspicious data
Skill Level: Beginner
1. Missing Data
2. Duplicated Data
3. Formatting Data
5. Tidy Data
6. Red Flags
What's next?