Join Bill Weinman for an in-depth discussion in this video, How aggregates work, part of SQL Essential Training (2014).
Aggregate data is information derived from more than one row at a time. SQL has powerful features for dealing with aggregate data. For this lesson, we're going to use the world.db database. And, if you've been following along, you've seen queries like this before. So the result of this query shows us that there are 239 rows in the Country table. The COUNT function is an aggregate function. It returns a single value from a query that spans many rows of data. Now, what if you want to know how many countries are represented in each region within this table? This is where you start seeing the power of aggregate functions in SQL.
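As a minimal sketch of the kind of query described here, the following runs a COUNT over a small in-memory SQLite table using Python's sqlite3 module. The two-column Country table and its five rows are assumptions made for illustration; they are not the contents of world.db, so the count is 5 rather than 239.

```python
import sqlite3

# In-memory stand-in for the Country table. The columns and rows are
# made-up samples for illustration, not the contents of world.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Country (Name TEXT, Region TEXT)")
conn.executemany(
    "INSERT INTO Country VALUES (?, ?)",
    [
        ("Canada", "North America"),
        ("United States", "North America"),
        ("France", "Western Europe"),
        ("Germany", "Western Europe"),
        ("Japan", "Eastern Asia"),
    ],
)

# COUNT(*) spans every row in the table and returns a single value.
total = conn.execute("SELECT COUNT(*) FROM Country").fetchone()[0]
print(total)
```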
So, we'll start by just splitting this up into separate lines, so that we can do some more with it easily. And we're going to select Region and COUNT(*) from Country. And we're going to say GROUP BY Region. So now you see a result that has a count for each region. And that's because of the GROUP BY clause. The GROUP BY clause groups results before calling the aggregate function. This allows you to apply the aggregate function to groups of data from the query.
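The grouped query can be sketched the same way. The table layout and sample rows below are again assumptions for illustration, so the per-region counts will not match what world.db returns.

```python
import sqlite3

# Made-up sample rows standing in for the Country table in world.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Country (Name TEXT, Region TEXT)")
conn.executemany(
    "INSERT INTO Country VALUES (?, ?)",
    [
        ("Canada", "North America"),
        ("United States", "North America"),
        ("France", "Western Europe"),
        ("Germany", "Western Europe"),
        ("Japan", "Eastern Asia"),
    ],
)

# GROUP BY forms one group per Region value; COUNT(*) then runs once per group.
rows = conn.execute(
    """
    SELECT Region, COUNT(*)
      FROM Country
      GROUP BY Region
    """
).fetchall()

# Without ORDER BY the row order is not guaranteed, so collect into a dict.
counts = dict(rows)
print(counts)
```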
In this case, the GROUP BY clause specifies the Region column. So it collects the rows of the table by region, and then calls the aggregate function for each value of Region. The result is a table with one row per region, with the aggregate value as a column in the resulting table. You can use aggregate values in the rest of the query just as you would any other values. For example, you can order by them. We'll put an AS here to alias the count as Count, like that, so that we can order by it easily.
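Aliasing the aggregate and ordering by it might look like the sketch below. The alias name Count follows the narration; the table contents are made-up samples, not the rows in world.db.

```python
import sqlite3

# Made-up sample rows standing in for the Country table in world.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Country (Name TEXT, Region TEXT)")
conn.executemany(
    "INSERT INTO Country VALUES (?, ?)",
    [
        ("Canada", "North America"),
        ("United States", "North America"),
        ("France", "Western Europe"),
        ("Germany", "Western Europe"),
        ("Japan", "Eastern Asia"),
    ],
)

# The alias gives the aggregate column a name that ORDER BY can refer to.
# Ties on Count are broken alphabetically by Region.
ordered = conn.execute(
    """
    SELECT Region, COUNT(*) AS Count
      FROM Country
      GROUP BY Region
      ORDER BY Count DESC, Region
    """
).fetchall()

for region, count in ordered:
    print(region, count)
```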
We order by Count descending and then by Region. So now we have it sorted by the Count column, which holds the aggregate value. And it's still grouped by Region. This also works with joined queries, so now we're going to switch to the album database. And I'm going to grab this query from the exercise files here. It's this one here. Grab that and paste it into our SQL box here. So Albums and Tracks are in different tables.
So we have a joined query here. We are joining album and track as a and t, and then the GROUP BY clause groups the query by the album ID, and we are getting a count of the tracks in each album. So even though the Track table is separate from the Album table, we're still able to group by the album ID and count the number of tracks in the groups. So when I press Go here, you see these are our albums and these are how many tracks there are in each album.
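A joined aggregate of this shape can be sketched as follows. The exact query from the exercise files is not reproduced in the transcript, so the column names (id, title, artist, album_id, track_number), the aliases Album and Tracks, the ORDER BY clause, and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schemas; the real album database may differ.
conn.execute("CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT, artist TEXT)")
conn.execute(
    "CREATE TABLE track (id INTEGER PRIMARY KEY, album_id INTEGER, track_number INTEGER)"
)
conn.executemany(
    "INSERT INTO album (id, title, artist) VALUES (?, ?, ?)",
    [(1, "Album A", "Artist One"), (2, "Album B", "Artist Two")],
)
# Album 1 gets three tracks, album 2 gets two.
for album_id, n_tracks in [(1, 3), (2, 2)]:
    conn.executemany(
        "INSERT INTO track (album_id, track_number) VALUES (?, ?)",
        [(album_id, n) for n in range(1, n_tracks + 1)],
    )

# Join each track to its album, group by album ID, and count tracks per group.
track_counts = conn.execute(
    """
    SELECT a.title AS Album, COUNT(t.track_number) AS Tracks
      FROM track AS t
      JOIN album AS a ON a.id = t.album_id
      GROUP BY a.id
      ORDER BY Tracks DESC, Album
    """
).fetchall()
print(track_counts)
```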
Now, continuing on, suppose I only want to list the albums with ten or more tracks. I can use a HAVING clause: HAVING Tracks greater than or equal to ten. And I want to put that after the GROUP BY clause, because HAVING relates to groups. So we'll go ahead and run the query. You see now we only get the albums with ten or more tracks. The HAVING clause is different from the WHERE clause, because it works on aggregate data.
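Here is a hedged sketch of the HAVING filter. The schemas and rows are made up, and the query refers to the Tracks alias inside HAVING, which SQLite accepts; other database systems may require repeating the COUNT expression there instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schemas and sample data, not the course's album database.
conn.execute("CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT, artist TEXT)")
conn.execute(
    "CREATE TABLE track (id INTEGER PRIMARY KEY, album_id INTEGER, track_number INTEGER)"
)
conn.executemany(
    "INSERT INTO album (id, title, artist) VALUES (?, ?, ?)",
    [
        (1, "Album A", "Artist One"),
        (2, "Album B", "Artist Two"),
        (3, "Album C", "Artist Three"),
    ],
)
# Albums 1, 2, and 3 get 12, 10, and 4 tracks respectively.
for album_id, n_tracks in [(1, 12), (2, 10), (3, 4)]:
    conn.executemany(
        "INSERT INTO track (album_id, track_number) VALUES (?, ?)",
        [(album_id, n) for n in range(1, n_tracks + 1)],
    )

# HAVING comes after GROUP BY and filters on the aggregated value.
long_albums = conn.execute(
    """
    SELECT a.title AS Album, COUNT(t.track_number) AS Tracks
      FROM track AS t
      JOIN album AS a ON a.id = t.album_id
      GROUP BY a.id
      HAVING Tracks >= 10
      ORDER BY Tracks DESC, Album
    """
).fetchall()
print(long_albums)
```

Album C has only four tracks in this sample, so it is dropped after grouping.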
It's important to have a separate keyword for this, because you may still need the WHERE clause to operate on the non-aggregate data in the same query. You can think of HAVING as being like WHERE, but for aggregate data. Say, for example, you only wanted to count the albums by the Beatles. You can use a WHERE clause and a HAVING clause in the same query, with the WHERE clause selecting just the Beatles albums before the aggregation happens. And so I'm going to put a WHERE clause in here before the GROUP BY clause. And now when I run this query, I only get the one album, Rubber Soul, which is a Beatles album, because that's the only album that meets both of these criteria: it has ten or more tracks and the artist is the Beatles.
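Combining the two filters could look like this sketch. The artist column name, the placeholder album titles, and the track counts are all assumptions for illustration; they are not the rows in the course's album database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schemas and placeholder data.
conn.execute("CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT, artist TEXT)")
conn.execute(
    "CREATE TABLE track (id INTEGER PRIMARY KEY, album_id INTEGER, track_number INTEGER)"
)
conn.executemany(
    "INSERT INTO album (id, title, artist) VALUES (?, ?, ?)",
    [
        (1, "Long Beatles Album", "The Beatles"),
        (2, "Short Beatles Album", "The Beatles"),
        (3, "Long Other Album", "Another Artist"),
    ],
)
# Albums 1, 2, and 3 get 12, 4, and 11 tracks respectively.
for album_id, n_tracks in [(1, 12), (2, 4), (3, 11)]:
    conn.executemany(
        "INSERT INTO track (album_id, track_number) VALUES (?, ?)",
        [(album_id, n) for n in range(1, n_tracks + 1)],
    )

# WHERE filters individual rows before grouping;
# HAVING filters the groups after the aggregate is computed.
beatles_long = conn.execute(
    """
    SELECT a.title AS Album, COUNT(t.track_number) AS Tracks
      FROM track AS t
      JOIN album AS a ON a.id = t.album_id
      WHERE a.artist = 'The Beatles'
      GROUP BY a.id
      HAVING Tracks >= 10
      ORDER BY Tracks DESC, Album
    """
).fetchall()
print(beatles_long)
```

In this sample, album 3 is removed by WHERE (wrong artist) and album 2 is removed by HAVING (too few tracks), which leaves a single row.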
So it's important that the WHERE clause is before the GROUP BY clause, and the HAVING clause is after it. The WHERE selection operates on the data before the aggregation, and the HAVING selection operates after. Just remember, HAVING is for aggregate data, and WHERE is for non-aggregate data. So SQL provides simple and powerful tools for operating on aggregate data. Aggregate functions operate on groups of rows rather than individual rows. And they're useful on joined queries as well as straight queries. The HAVING clause filters the aggregate data, just as the WHERE clause filters the non-aggregate data.
- Understanding SQL terminology and syntax
- Creating new tables and records
- Inserting and updating data
- Writing basic SQL queries
- Sorting and filtering
- Accessing related tables with JOIN
- Working with strings
- Finding the numeric type of a value
- Using aggregate functions and transactions
- Updating a table with triggers
- Creating views
Skill Level: Beginner
Q: For Mac OS X: When I try to start the Apache Web Server from the XAMPP control panel, it doesn't start, and when I open "localhost" in my web browser, I see a white screen that says "It Works!" instead of the XAMPP page.
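A: The Apache server built into OS X is probably already running and holding the port, which is why you see its default "It Works!" page. Stop it by running the following command in Terminal, then start Apache from the XAMPP control panel again:
sudo apachectl stop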
Q: I'm on a Mac, and I get an error in SID that says "attempt to write a read only database." How can I fix this?
A: This usually means that the database folder does not have sufficient permissions for writing by the web user. This can happen if you create the SQL folder from scratch, rather than copying it from the Exercise Files. Here's how to fix this:
- Open a Finder window and navigate to /Applications/XAMPP/htdocs/SQL
- Control-click on the SQL folder and select "Get Info" from the context menu.
- Under "Sharing and Permissions" (you may need to open the disclosure triangle), in the "everyone" row, select "Read & Write." Then you can close the Info window.
- Now repeat the process for the three *.db files inside the folder.