Join Mike Chapple for an in-depth discussion in this video, Detecting illogical values, part of Cleaning Bad Data in R.
- [Narrator] Outliers aren't the only way that you can detect bad data in your datasets. Sometimes, you'll detect illogical values that break business rules or violate common sense. You can write tests in R that identify these values. For example, consider a dataset containing information about the residents of a town, including their ages, employment status, information about where they live, and whether they own a car. In the exercise files I provided this code to load the dataset that has around 2,000 records.
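As a rough illustration of what such a test could look like, the base R sketch below builds a tiny made-up data frame and flags rows that break a rule. The column names (`age`, `years_at_address`, `owns_car`), the sample values, and both rules are assumptions for illustration; they are not the course's dataset or the checks used in the video.

```r
# Hypothetical sample data; the real residents dataset is not shown here.
residents <- data.frame(
  age              = c(34, 52, -3, 41, 130),
  years_at_address = c(5, 60, 1, 10, 2),
  owns_car         = c(TRUE, TRUE, FALSE, FALSE, TRUE)
)

# Assumed rule 1: an age must be present and fall between 0 and 120.
bad_age <- is.na(residents$age) | residents$age < 0 | residents$age > 120

# Assumed rule 2: nobody can have lived at an address longer than they
# have been alive.
longer_than_life <- residents$years_at_address > residents$age

# Keep the rows that violate at least one rule for manual review.
illogical <- residents[bad_age | longer_than_life, ]
print(illogical)
```

Keeping each rule as its own logical vector makes it easy to count violations per rule before deciding whether to correct or drop the affected rows.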
As with our previous examples, we begin by loading the tidyverse, setting our working directory, and then reading in the residents dataset. I'll begin by taking a look at some summary statistics for this dataset, and looking at the summary, I see that I have records about adults of working age. All of the values for the age variable are between 18 and 65. I also have a field that shows whether the person is employed using a logical value where true means that they are employed.
Then I have similar logical fields…
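The load-and-summarize step described in the transcript can be sketched in a self-contained way. This version uses base R's `read.csv()` on a small temporary file so that it runs without the exercise files; the video loads the tidyverse instead, where readr's `read_csv()` would play the same role. The file contents and the column names `age`, `employed`, and `owns_car` are made up for illustration.

```r
# Write a tiny made-up file so the example runs on its own.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "age,employed,owns_car",
  "18,TRUE,FALSE",
  "42,TRUE,TRUE",
  "65,FALSE,TRUE"
), csv_path)

# read.csv() converts TRUE/FALSE text into logical columns.
residents <- read.csv(csv_path)

# summary() reports min, quartiles, mean, and max for numeric columns
# and TRUE/FALSE counts for logical columns.
print(summary(residents))

# Check that every age falls in the 18 to 65 working-age range.
age_range <- range(residents$age)
all_working_age <- all(residents$age >= 18 & residents$age <= 65)
print(all_working_age)
```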
Where possible, instructor Mike Chapple shows how to correct the issues using R, but the same principles can be applied to any statistical programming language.
- Missing data
- Duplicate rows and values
- Converting data
- Formatting data
- Working with tidy data
- Tidying data sets
- Dealing with suspicious data
Skill Level: Beginner
1. Missing Data
2. Duplicated Data
3. Formatting Data
5. Tidy Data
6. Red Flags
What's next?