Publicly available data is often of poor quality, with problems such as misspelled words and duplicate entries. In this video, learn how to determine whether a data source is worth investing time in.
- [Instructor] The quality of our data is extremely important to ensure the accuracy of any results that we derive from a model, a metric, or any other measurement tool. Poor methodology and design can cause errors, such as misspellings and duplicates, and all of these errors are easily avoidable by designing tools that collect information in a standardized way. For instance, think about a web form that you might submit. Allowing users to use a text box enables them to enter just about anything they want in that field. However, if the text box is replaced with a dropdown, that ensures that all of the data being submitted will be spelled correctly, and users will only be allowed to choose from the options that were provided. In the long run, that allows businesses to have control over the quality of the data that is being collected, whereas with a text box, the number of variations in information that can be collected is infinite. In the long term, having a text box field …
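The text box versus dropdown comparison in the transcript can be illustrated with a small sketch. This is not from the course: the option list, the sample entries, and the `submit_dropdown` helper are all hypothetical, chosen only to show how free text splits one real-world value into several spellings while a fixed set of options cannot.

```python
from collections import Counter

# Hypothetical dropdown options (an assumption for illustration only).
ALLOWED_STATES = ("California", "New York", "Texas")


def submit_dropdown(choice: str) -> str:
    """Accept a value only if it is one of the predefined options."""
    if choice not in ALLOWED_STATES:
        raise ValueError(f"{choice!r} is not one of the allowed options")
    return choice


# Hypothetical free-text submissions: one state entered five different ways
# (lowercase, a misspelling, an abbreviation, and a trailing space).
free_text_entries = ["California", "california", "Calfornia", "CA", "California "]

# Counting distinct values shows how free text fragments a single category.
free_text_counts = Counter(free_text_entries)
distinct_free_text = len(free_text_counts)

# The same five users picking from a dropdown can only produce one spelling.
dropdown_entries = [submit_dropdown("California") for _ in free_text_entries]
distinct_dropdown = len(set(dropdown_entries))

print("distinct free-text values:", distinct_free_text)
print("distinct dropdown values:", distinct_dropdown)
```

In this sketch the misspelled entry would be rejected by `submit_dropdown` with a `ValueError`, whereas a plain text box would store it as a new, separate value that someone later has to clean up.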
This course was created by Madecraft. We are pleased to host this content in our library.
- Identifying high-quality data sources
- Evaluating data usability, accuracy, and more
- COVID-19 data sources
- Challenges and takeaways