Join Mike Chapple for an in-depth discussion in this video, Suspicious values, part of Cleaning Bad Data in R.
- [Instructor] As you perform data cleaning, there are some special suspicious values that you should watch out for. The presence of these values doesn't necessarily mean that your data is incorrect, but if you see them in many places, you should view them with suspicion. I'm going to review a set of common suspicious values developed by the data science community. Many of these come from the Quartz guide to bad data, which has an excellent exploration of data cleaning issues. The first type of suspicious value stems from the way that computers store data.
You probably know that computers store data in binary form, using a sequence of ones and zeros. Each digit in the binary number is called a bit. When you create a numeric variable, you allocate a defined number of bits to store that value. The number of bits that you allocate limits the largest number that you can store in that variable. For example, imagine that we have a two-bit variable. That allows us to have two digits, either one of which may be one or zero.
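The two-bit example can be sketched in a few lines of code. This is a minimal illustration, not the instructor's own material: it is written in Python rather than R, the names `max_unsigned`, `SUSPICIOUS_LIMITS`, and `flag_storage_limits` are hypothetical, and the choice of 8, 16, and 32 bits as "common" widths is an assumption. It lists every pattern a two-bit variable can hold, computes the largest value for a given number of bits, and flags values in a column that sit exactly on one of those limits.

```python
from itertools import product


def max_unsigned(bits: int) -> int:
    """Largest value an unsigned integer with this many bits can hold."""
    return 2 ** bits - 1


# Every pattern a two-bit variable can take: 00, 01, 10, 11.
two_bit_patterns = ["".join(p) for p in product("01", repeat=2)]
two_bit_values = [int(p, 2) for p in two_bit_patterns]

# Assumption: 8, 16, and 32 bits are treated as the widths worth checking,
# plus the signed 32-bit maximum (one bit is used for the sign).
SUSPICIOUS_LIMITS = {max_unsigned(b) for b in (8, 16, 32)} | {2 ** 31 - 1}


def flag_storage_limits(values):
    """Return the values that sit exactly on a storage limit, in order."""
    return [v for v in values if v in SUSPICIOUS_LIMITS]


if __name__ == "__main__":
    print(two_bit_patterns)   # the four two-bit patterns
    print(max_unsigned(2))    # largest value two bits can hold
    print(flag_storage_limits([12, 255, 65535, 70000, 2147483647]))
```

A flagged value is not proof of an error. It only marks a number that may have been capped by the variable that stored it, which is why it deserves a second look when it appears in many places.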
Where possible, instructor Mike Chapple shows how to correct the issues using R, but the same principles can be applied to any statistical programming language.
- Missing data
- Duplicate rows and values
- Converting data
- Formatting data
- Working with tidy data
- Tidying data sets
- Dealing with suspicious data
Skill Level Beginner
1. Missing Data
2. Duplicated Data
3. Formatting Data
5. Tidy Data
6. Red Flags
What's next?