Join Mike Chapple for an in-depth discussion in this video, Text improperly converted to numbers, part of Cleaning Bad Data in R.
- [Man] Data analysis tools are usually very good at guessing the format of data and reading it correctly without having to be explicitly told what variable types to use. However, as you saw in the last video, that doesn't always work out. Another common issue that occurs when reading in data is when text data is improperly read into a numeric data type. The key observation is that numerals aren't always numeric. Sometimes they're just text, and treating them as numeric values can cause real problems.
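To make the point about numerals concrete, here is a minimal sketch in base R. The ZIP codes are an illustrative assumption and are not taken from the course data set; they stand in for any identifier that is written with digits but is really a label.

```r
# ZIP codes are written with digits, but they are labels, not quantities.
# (Illustrative values; not from the course's data.)
zip_text <- c("02134", "10001", "00501")

# Coercing to numeric silently drops the leading zeros.
zip_num <- as.numeric(zip_text)
print(zip_num)

# Arithmetic now runs without complaint, but an "average ZIP code" means nothing.
print(mean(zip_num))

# Converting back to text does not bring the zeros back.
print(as.character(zip_num))

# The zeros can be restored only because the width (5 digits) is known.
zip_fixed <- sprintf("%05d", as.integer(zip_num))
print(identical(zip_fixed, zip_text))
```

Re-padding works here only because every ZIP code has a known, fixed width. For identifiers of varying length the original text cannot be recovered, which is why it is safer to keep such columns as text from the start.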
Now that might sound odd, but let's take a look at a common example. Here I have the code ready to load in the data set containing information about state capitals and their populations. Let's go ahead and load this data set. We'll begin by loading the tidyverse and lubridate. We'll change our working directory and then read in the capitals.csv file. I'm doing this using the default settings of read_csv. When I don't specify the variable types, read_csv provides, in red text, the column specification.
These are the guesses that it made…
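The loading step described above might look roughly like the following sketch. The transcript does not show the contents of capitals.csv, so the file written here is a hypothetical stand-in: the column names (`capital`, `state`, `area_code`, `population`) and all values are assumptions, and the population figures are placeholders. Only readr is loaded, rather than the full tidyverse and lubridate, to keep the example small.

```r
library(readr)

# Hypothetical stand-in for capitals.csv. The real file's columns are not
# shown in the transcript; names and values here are assumptions, and the
# population numbers are placeholders.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "capital,state,area_code,population",
  "Boston,MA,617,100000",
  "Albany,NY,518,200000",
  "Austin,TX,512,300000"
), csv_path)

# Default settings: read_csv guesses each column's type and reports its guesses.
guessed <- read_csv(csv_path)
spec(guessed)                      # the column specification readr settled on
print(class(guessed$area_code))    # guessed as a number

# Stating the types explicitly keeps the digit-only label column as text.
typed <- read_csv(csv_path, col_types = cols(
  capital    = col_character(),
  state      = col_character(),
  area_code  = col_character(),
  population = col_double()
))
print(class(typed$area_code))
```

In this sketch an area code is read as a number under the defaults because it consists only of digits, even though nobody would add or average area codes. Passing `col_types` replaces the guess with an explicit decision.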
Where possible, instructor Mike Chapple shows how to correct the issues using R, but the same principles can be applied to any statistical programming language.
- Missing data
- Duplicate rows and values
- Converting data
- Formatting data
- Working with tidy data
- Tidying data sets
- Dealing with suspicious data
Skill Level Beginner
1. Missing Data
2. Duplicated Data
3. Formatting Data
5. Tidy Data
6. Red Flags
What's next?