From the course: SQL for Exploratory Data Analysis Essential Training

Why learn about the distribution of data?

- [Teacher] Let's take a look at the distribution of data, or, in other words, the frequency with which different values appear in our dataset. Let's look at a series of numbers. Now, let's count how many times each of those numbers appears. We have three ones, five twos, six threes, four fours, and three fives. Now let's plot the counts of each of those numbers from one to five. You'll notice that the smallest counts belong to the ones and the fives. Those are the outer boundaries of the list of numbers. The highest count, and the tallest bar, is in the middle at number three: there are six threes in the list. A shape that is low at the edges and rises smoothly to a maximum height in the middle is often called a bell curve. Its technical name is a normal curve. Plotting the counts of values, or the frequency of the values, gives us a picture of the distribution of the data, or a picture of how the data is spread out. Another common kind of distribution has more…
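The counting step described in the transcript can be sketched as a single `GROUP BY` query. This is a minimal illustration, not the course's own exercise: the table name `numbers`, the column name `num`, and the order of the sample values are assumptions (the transcript gives only the counts), and Python's built-in `sqlite3` stands in for PostgreSQL so that the example is self-contained. The query itself uses only standard SQL.

```python
import sqlite3

# Hypothetical series matching the counts read out in the transcript:
# three 1s, five 2s, six 3s, four 4s, three 5s.
# The original order of the series is not given, so this order is an assumption.
values = [1] * 3 + [2] * 5 + [3] * 6 + [4] * 4 + [5] * 3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (num INTEGER)")  # hypothetical table and column
conn.executemany("INSERT INTO numbers (num) VALUES (?)", [(v,) for v in values])

# Count how many times each value appears: the frequency distribution.
query = """
    SELECT num, COUNT(*) AS frequency
    FROM numbers
    GROUP BY num
    ORDER BY num
"""
distribution = conn.execute(query).fetchall()

# Rough text version of the bar chart: one '#' per occurrence.
for num, frequency in distribution:
    print(f"{num} | {'#' * frequency} ({frequency})")

conn.close()
```

Reading the result from top to bottom shows the counts rising to a peak at three and falling again, which is the bell shape the teacher describes.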
