Join Mike Chapple for an in-depth discussion in this video, Numbers stored as text, part of Cleaning Bad Data in R.
- Currency values can often cause data quality issues that require cleaning prior to performing analysis. Just as we discussed with units of weight, we need to make sure that any currency values we have in our data set have clearly identified units. If we simply have a column, for example, that says price, how should we interpret that value? Is it US dollars, Canadian dollars, euros, or some other currency? The second issue that we might encounter is that we have problems with the formatting of currency values when we try to read them into R.
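Both issues can be illustrated with a minimal sketch. The price strings and the `price_usd` name below are made up for illustration and assume the values are known to be US dollars; this is one base R approach, not necessarily the one used in the course.

```r
# Hypothetical price strings as they might appear in a raw file.
price_text <- c("$1,234.50", "$99.00", "$12,000")

# The values are text, so arithmetic on them would fail.
class(price_text)

# Strip the dollar sign and thousands separators, then convert to numeric.
price_usd <- as.numeric(gsub("[$,]", "", price_text))

# Putting the unit in the variable name keeps the currency explicit.
class(price_usd)
print(price_usd)
```

Naming the cleaned vector `price_usd` rather than `price` is one simple way to record the unit alongside the data.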
Let's take a look at an example in a real data set. I have the code set up here to load a data set containing information on Medicare expense reimbursement. Let's go ahead and run through this code and load the data file. And then we'll take a look at a summary of this data set. You can see immediately that there are some problems here. Look at these three values: average charges, average total payments, and average Medicare payments.
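The kind of problem described here can be reproduced with a small stand-in for the data file. The sketch below does not use the course's actual file: the inline CSV, its column names, and its values are all hypothetical, chosen only to show how currency-formatted columns look in a summary before and after conversion.

```r
# Hypothetical two-row stand-in for the reimbursement file (made-up values).
raw_csv <- 'provider,average_charges,average_total_payments
A,"$1,000.50","$200.25"
B,"$250.00","$75.00"'

# stringsAsFactors = FALSE keeps text as character on older R versions too.
payments <- read.csv(text = raw_csv, stringsAsFactors = FALSE)

# The currency columns are character, so summary() cannot report
# a minimum, median, or mean for them.
sapply(payments, class)
summary(payments)

# Strip "$" and "," from the currency columns and convert them to numeric.
currency_cols <- c("average_charges", "average_total_payments")
payments[currency_cols] <- lapply(
  payments[currency_cols],
  function(x) as.numeric(gsub("[$,]", "", x))
)

# summary() now reports numeric statistics for those columns.
summary(payments)
```

After the conversion, the same `summary()` call reports quartiles and means for the two currency columns instead of only their length and class.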
I would expect that charges and payment information…
Where possible, instructor Mike Chapple shows how to correct the issues using R, but the same principles can be applied to any statistical programming language.
- Missing data
- Duplicate rows and values
- Converting data
- Formatting data
- Working with tidy data
- Tidying data sets
- Dealing with suspicious data
Skill Level Beginner
1. Missing Data
2. Duplicated Data
3. Formatting Data
5. Tidy Data
6. Red Flags
What's next?