Learn how to use SQL to understand the characteristics of data sets destined for data science and machine learning.
- [Dan] Welcome to SQL for Exploratory Data Analysis Essential Training. In this course, you'll learn how to review your data before beginning analysis and data science efforts. You'll learn why it's important to perform preliminary data reviews using queries, statistics, and visualizations. You'll understand how your hypothesis testing can be thrown off if you make incorrect assumptions about your data. In this course, you'll learn ways to check for data quality, correct for missing data, and verify business rules that apply to your data.
You'll learn about distributions of data and why they are important, and you'll learn when to apply visualization tools like histograms and box plots. You'll also learn how to measure correlations between variables and how to use those measures to gain insights into your data. So let's get started with SQL for Exploratory Data Analysis Essential Training.
- Exploratory data analysis vs. hypothesis-driven statistical analysis
- Performing data quality checks
- Calculating quartiles
- Using box plots to understand the distribution of values
- Using histograms to understand the frequency of values
- Using the chi-square test to understand the correlation between values
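
As a rough illustration of two of the topics listed above (data quality checks and histograms), here is a minimal sketch. It runs SQL through Python's built-in sqlite3 module so that it is self-contained. The orders table, its amount column, the sample values, the "no negative amounts" business rule, and the bucket width of 20 are all assumptions made up for this example; they are not taken from the course.

```python
import sqlite3

# Hypothetical table and values, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(10.0,), (20.0,), (None,), (30.0,), (40.0,), (-5.0,), (50.0,), (60.0,)],
)

# Data quality check: total rows, missing values, and rows that break an
# assumed business rule (amounts must not be negative).
quality_sql = """
SELECT COUNT(*) AS total_rows,
       SUM(CASE WHEN amount IS NULL THEN 1 ELSE 0 END) AS missing_amount,
       SUM(CASE WHEN amount < 0 THEN 1 ELSE 0 END) AS negative_amount
FROM orders
"""
total_rows, missing_amount, negative_amount = conn.execute(quality_sql).fetchone()

# Histogram: group the valid amounts into buckets of width 20 and count
# how many values fall into each bucket.
histogram_sql = """
SELECT CAST(amount / 20 AS INTEGER) * 20 AS bucket_start,
       COUNT(*) AS frequency
FROM orders
WHERE amount IS NOT NULL AND amount >= 0
GROUP BY bucket_start
ORDER BY bucket_start
"""
histogram = conn.execute(histogram_sql).fetchall()

print(total_rows, missing_amount, negative_amount)
for bucket_start, frequency in histogram:
    print(f"{bucket_start:>3} to {bucket_start + 19:>3}: {'#' * frequency}")

conn.close()
```

Running the quality check first matters here: the histogram query filters out the missing and negative rows, so the counts tell you how much data the histogram leaves out.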