Understand how to calculate the correlation coefficient in PostgreSQL.
- [Instructor] We can calculate a measure of correlation using a formula called Pearson's correlation coefficient. The formula calculates a number between negative one and positive one. A negative one means the variables are perfectly negatively correlated. A positive one means they are perfectly positively correlated. A zero means there is no linear correlation between them. The formula for Pearson's correlation coefficient is based on the standard deviation of each variable and a measure called the covariance between the two variables.
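To make the relationship between the coefficient, the covariance, and the standard deviations concrete, here is a minimal sketch in PostgreSQL. The sample values are made up for illustration and inlined so the query runs on its own. It assumes the built-in aggregates corr, covar_pop, and stddev_pop, and compares the built-in result with the covariance divided by the product of the two standard deviations.

```sql
-- Made-up sample data, inlined with VALUES so no table is required.
WITH sample (x, y) AS (
    VALUES (1::double precision, 2::double precision),
           (2, 4.1),
           (3, 5.9),
           (4, 8.2),
           (5, 9.8)
)
SELECT
    -- Built-in Pearson correlation coefficient.
    corr(y, x) AS pearson_r,
    -- Same quantity assembled from its parts:
    -- covariance divided by the product of the standard deviations.
    covar_pop(y, x) / (stddev_pop(y) * stddev_pop(x)) AS r_from_parts
FROM sample;
```

The two columns should agree up to floating-point rounding, and because y rises almost in step with x in this sample, both should be close to positive one.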
The details of the formula are beyond the scope of this lesson, but if you are interested, you can find them on the Wikipedia page about Pearson's correlation coefficient. For our purposes, we shall use a SQL function, CORR, or C-O-R-R, in PostgreSQL. It takes two arguments, such as two columns in a table, and returns a value between negative one and one. Let's imagine we have a product sales table with the number of units sold and the total revenue.
We can calculate the correlation…
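The transcript is cut off here, so the following is only a sketch of what such a query might look like, not the instructor's actual code. The table name product_sales, the column names units_sold and total_revenue, and all of the rows are hypothetical, chosen to match the description of a product sales table with units sold and total revenue.

```sql
-- Hypothetical schema; the lesson does not show the real table definition.
CREATE TEMP TABLE product_sales (
    product_id    integer,
    units_sold    integer,
    total_revenue numeric(12, 2)
);

-- Made-up rows for illustration only.
INSERT INTO product_sales (product_id, units_sold, total_revenue) VALUES
    (1, 10,  199.90),
    (2, 25,  480.00),
    (3, 40,  815.50),
    (4, 55, 1090.25),
    (5, 70, 1420.00);

-- corr is an aggregate: it reads the two columns across all rows
-- and returns a single value between -1 and 1.
SELECT corr(total_revenue, units_sold) AS units_revenue_corr
FROM product_sales;
```

Since revenue grows with units sold in these sample rows, the result should be close to positive one. With real sales data the value depends on how consistently the two columns move together.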
Released 6/7/2018
- Exploratory data analysis vs. hypothesis-driven statistical analysis
- Performing data quality checks
- Calculating quartiles
- Using box plots to understand the distribution of values
- Using histograms to understand the frequency of values
- Using chi-square to understand the correlation between values
Video: Calculating correlation with SQL