- [Instructor] Let's discuss some tips for exploring your data. Data exploration is an important step in the data science process. It helps you understand the quality of your data, which is important before you start to form hypotheses and draw conclusions from it. It's used to identify missing values, which can occur when there are data quality control problems in the source systems that supply your data. It also helps us find unusual or unexpected values, such as outliers, that don't make sense from a business logic perspective.
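As a rough illustration of the two checks just described, here is a minimal Python sketch that counts missing values and flags outliers. The sample data, the function names `count_missing` and `iqr_outliers`, and the 1.5 × IQR rule are all assumptions made for this example. The transcript does not say which outlier method the course uses.

```python
from statistics import quantiles

# Hypothetical sample: order amounts, with None marking a missing value.
order_amounts = [120.0, 95.5, None, 101.0, 99.0, 5000.0, None, 110.0, 98.5, 104.0]


def count_missing(values):
    """Return how many entries are missing (None)."""
    return sum(1 for v in values if v is None)


def iqr_outliers(values, k=1.5):
    """Return the non-missing values lying more than k * IQR outside the quartiles.

    The k = 1.5 threshold is a common convention, assumed here for illustration.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)  # first, second, and third quartiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in present if v < low or v > high]


missing = count_missing(order_amounts)
outliers = iqr_outliers(order_amounts)

print(f"missing: {missing} of {len(order_amounts)}")
print(f"outliers: {outliers}")
```

A flagged value is only a candidate for review. Whether 5000.0 is an error or a legitimate large order is a business logic question that the code cannot answer.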
Data exploration helps highlight inconsistent data, especially with regard to business rules. It also allows us to understand the distribution of the data, that is, the shape of the data, and to understand subgroups, using histograms. Correlations allow us to see relationships between variables when we use Pearson's correlation coefficient. Exploratory data analysis is a starting point that helps you understand your data and avoid potential problems in data analysis…
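To make the histogram and correlation ideas more concrete, here is a minimal Python sketch using only the standard library. The paired data (`ad_spend`, `sales`), the bin width, and the helper names `histogram` and `pearson` are assumptions for this example and do not come from the course.

```python
from collections import Counter
from math import sqrt

# Hypothetical paired observations, invented for illustration.
ad_spend = [10.0, 12.0, 15.0, 18.0, 20.0, 24.0, 30.0]
sales = [25.0, 29.0, 33.0, 41.0, 44.0, 50.0, 65.0]


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of a bin."""
    return Counter((v // bin_width) * bin_width for v in values)


def pearson(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


bins = histogram(ad_spend, bin_width=10.0)
for edge in sorted(bins):
    # One '#' per observation gives a quick text view of the distribution's shape.
    print(f"[{edge:5.1f}, {edge + 10.0:5.1f}) {'#' * bins[edge]}")

r = pearson(ad_spend, sales)
print(f"Pearson r = {r:.3f}")
```

A coefficient near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. It does not capture non-linear patterns, which is one reason to look at plots as well.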
Released 6/7/2018
- Exploratory data analysis vs. hypothesis-driven statistical analysis
- Performing data quality checks
- Calculating quartiles
- Using box plots to understand the distribution of values
- Using histograms to understand the frequency of values
- Using chi square to understand the correlation between values