Introduces a grouping shortcut for unique values, called SELECT DISTINCT, and shows how to combine this with numeric functions
- [Narrator] Now, we've been introduced to GROUP BY already, and we used this with count(*) to see which values were the most popular in a field. When we run this query, we see that we have 378 rows returned even though we know that there are 603 rows in the table overall. So what this tells us is that there are 378 unique values. Now sometimes, all you need is a list of those unique values. And if that's what you're after, there's a quicker way of doing it. And that's called SELECT DISTINCT.
So we could say SELECT DISTINCT district, like so. And that's giving us, albeit in a different order, 378 unique values from that one field. Sometimes when you see SELECT DISTINCT written it's followed by parentheses or brackets that contain the fieldname like so. And in MySQL, that works fine with or without. But what doesn't work is when you're trying to select a unique combination of fields which you can do with SELECT DISTINCT.
So if we tried to put SELECT DISTINCT(district,address2) that would throw an error in MySQL but it would work absolutely fine without the brackets, like so. In fact, you can use SELECT DISTINCT on as many fields as you like. You could list every single field in the table, and you would simply get back the same number of rows that there are in the table. Now let's click on the rental table. It has more fields than previous tables we've looked at and an awful lot more rows, 16,044 to be precise.
Now let's see how many staff there are in this table. SELECT DISTINCT, and that can be lowercase by the way as with all these keywords, staff_id. And we see that despite having 16,000 rows, we only have two rather busy staff listed in the rental table. Now, how many customers do we have? And that gives us 599.
Now, if we wanted to know which customers had been served by which staff, we can do that by saying DISTINCT staff_id, customer_id. And in fact, that gives us 1,198 rows. Now, the previous query gave us 599 rows and 1,198 is 599 times two. So what this is telling us is that every customer has been served by both members of staff at some point. It seems a bit unlikely if you ask me, but then, this is a sample database.
So SELECT DISTINCT is basically used to remove duplicate values. Now you can use SELECT DISTINCT with functions too. This is a bit of a contrived example, I grant you, but we could say SELECT DISTINCT CONCAT (first_name,last_update) which is a date column FROM 'actor'. And it is conceivable that you would combine completely different datatypes like this to create some sort of temporary unique reference for a row or a combination of datapoints.
Running SELECT DISTINCT on this combination then gives us all the unique combinations. And we can see that we have 128 of them. And you can use other functions as well, not just string ones. So we could say DISTINCT YEAR(last_update), and there we have just the two. It's a far sight quicker than checking manually. And if all you want is the number of unique values, you can actually use count around that statement to return you a number two.
COUNT DISTINCT works across all the main versions of SQL.
Join Emma Saunders as she shows you how to design and write simple SQL queries for data reporting and analysis. Review the different types of SQL, and then learn how to filter, group, and sort data, using built-in SQL functions to format or calculate results. Learn a bit about data types and database design. Discover how to perform more complex queries, such as joining data together from different database tables. Last but not least, Emma shows how to save your queries as views, so you can run them again and again.
- Using different versions of SQL
- Retrieving data with SELECT statements
- Filtering and sorting your results
- Transforming results with built-in SQL functions
- Grouping SQL results
- Merging data from multiple tables
- Identifying data types, and how to make sense of your database design
- Saving SQL queries