Introduces the method and pitfalls of using UNION to combine rows from two tables into the same columns
- [Instructor] Now we've learned how to merge tables together using JOIN. Joins allow us to merge columns from different tables; it's like a horizontal kind of merge. But we can also merge rows into the same columns using a new keyword called UNION. It's a bit like a vertical merge: we return rows from more than one table, but into the same number of columns. Let's say that for audit reasons we want to see the date when two tables were last updated. We could run two queries, one after the other.
SELECT date(last_update) FROM actor, and we see we have mostly the 15th of February there with one exception. And we could run the same thing for the address table, which we see is mostly the 25th of September. But we can combine these two queries like so: we literally write one SELECT query, and a second one, and separate them with the word UNION.
And if we click Go, we can see something straight away: we've only been returned three rows. By default, UNION behaves like SELECT DISTINCT; that is to say, it only gives us unique rows. That default is pretty useful, but if you want to override it you just say UNION ALL instead, like so, and then we get all of the rows from the first table, followed by all of the rows from the second table. And you can see that's 803 rows now, rather than three.
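To make the difference between UNION and UNION ALL concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tiny actor and address tables and their timestamps are made-up stand-ins, not the course database, so the row counts are much smaller than the ones on screen.

```python
import sqlite3

# Hypothetical toy tables standing in for actor and address (invented data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor   (actor_id INTEGER PRIMARY KEY, last_update TEXT);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, city_id INTEGER, last_update TEXT);
    INSERT INTO actor (last_update) VALUES
        ('2006-02-15 04:34:33'), ('2006-02-15 04:34:33'), ('2006-02-16 09:00:00');
    INSERT INTO address (city_id, last_update) VALUES
        (1, '2014-09-25 22:30:27'), (2, '2014-09-25 22:30:27');
""")

# UNION removes duplicate rows from the combined result.
union_rows = conn.execute("""
    SELECT date(last_update) FROM actor
    UNION
    SELECT date(last_update) FROM address
""").fetchall()

# UNION ALL keeps every row from both queries.
union_all_rows = conn.execute("""
    SELECT date(last_update) FROM actor
    UNION ALL
    SELECT date(last_update) FROM address
""").fetchall()

print(len(union_rows), "distinct rows with UNION")
print(len(union_all_rows), "rows with UNION ALL")
```

With this toy data, UNION collapses the five source rows down to the three distinct dates, while UNION ALL returns all five.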
Now to make it a bit clearer what we're looking at, we can add some synthetic columns in here. We can say SELECT the text value 'actor' AS tbl, and that's going to become our first column, and then here we say SELECT 'address', also AS tbl, as the first column. And if we click Go, it's a bit clearer now which dates correspond to which tables if we were to look through them. Now this is quite a well-formed example.
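The labelled version can be sketched the same way. This again assumes small made-up actor and address tables in an in-memory SQLite database; the alias updated on the date column is an addition for readability and is not part of the query described above.

```python
import sqlite3

# Hypothetical toy tables standing in for actor and address (invented data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor   (actor_id INTEGER PRIMARY KEY, last_update TEXT);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, city_id INTEGER, last_update TEXT);
    INSERT INTO actor (last_update) VALUES
        ('2006-02-15 04:34:33'), ('2006-02-15 04:34:33'), ('2006-02-16 09:00:00');
    INSERT INTO address (city_id, last_update) VALUES
        (1, '2014-09-25 22:30:27'), (2, '2014-09-25 22:30:27');
""")

# A literal string in each SELECT labels which table every row came from.
rows = conn.execute("""
    SELECT 'actor'   AS tbl, date(last_update) AS updated FROM actor
    UNION ALL
    SELECT 'address' AS tbl, date(last_update) AS updated FROM address
""").fetchall()

for tbl, updated in rows:
    print(tbl, updated)
```

Each output row now carries its source table name next to the date, which is what makes a long combined result readable.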
You can see that we're actually returning the same field from each table, and I happen to know it's got the same data type in each table. So in the first column we're returning a string, and in the second we're returning a date. But in fact we can combine two completely different columns using UNION, and they can even have different names. So we could say 'actor' AS tbl, date(last_update), and then we'll say UNION rather than UNION ALL, so that we can actually see a few more of the results, and then when we get to the address query, we'll take date(last_update) out, and we'll say city_id and click Go.
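A minimal sketch of that mixed-column query, once more on made-up toy tables in SQLite. The alias updated is an assumption added here so the column name is predictable. SQLite is lenient about mixing types in one column; other database engines may convert the values to a common type or reject the query.

```python
import sqlite3

# Hypothetical toy tables standing in for actor and address (invented data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor   (actor_id INTEGER PRIMARY KEY, last_update TEXT);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, city_id INTEGER, last_update TEXT);
    INSERT INTO actor (last_update) VALUES
        ('2006-02-15 04:34:33'), ('2006-02-16 09:00:00');
    INSERT INTO address (city_id, last_update) VALUES
        (1, '2014-09-25 22:30:27'), (2, '2014-09-25 22:30:27');
""")

# The second SELECT returns city_id where the first returns a date.
cur = conn.execute("""
    SELECT 'actor' AS tbl, date(last_update) AS updated FROM actor
    UNION
    SELECT 'address', city_id FROM address
""")

# Result column names are taken from the first SELECT only.
column_names = [col[0] for col in cur.description]
rows = cur.fetchall()

print(column_names)
for row in rows:
    print(row)
```

The address rows show up under the heading taken from the first query, even though the values in them are city IDs.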
Now this can be really risky. The second column is called date(last_update), even though most of the rows underneath it aren't dates at all, never mind last_update values; they're actually city_id values. So take some care with UNION and UNION ALL, because they tend to mask the true identity of the second set of data. Now just to point out that you can filter these queries just as you would with any other type of query, and you do it in just the same way.
So we say FROM actor, and this one is FROM address, and here we could say WHERE city_id < 5. And you can see that that's returned us a reduced data set.
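As a sketch of that filter, assuming the same kind of made-up toy tables in SQLite: the WHERE clause belongs to the SELECT it is written in, so here it trims only the address rows and leaves the actor rows alone. The alias updated is again an addition for readability.

```python
import sqlite3

# Hypothetical toy tables standing in for actor and address (invented data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor   (actor_id INTEGER PRIMARY KEY, last_update TEXT);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, city_id INTEGER, last_update TEXT);
    INSERT INTO actor (last_update) VALUES
        ('2006-02-15 04:34:33'), ('2006-02-16 09:00:00');
    INSERT INTO address (city_id, last_update) VALUES
        (1, '2014-09-25 22:30:27'), (3, '2014-09-25 22:30:27'),
        (7, '2014-09-25 22:30:27'), (9, '2014-09-25 22:30:27');
""")

# The WHERE clause filters only the second SELECT.
rows = conn.execute("""
    SELECT 'actor' AS tbl, date(last_update) AS updated FROM actor
    UNION
    SELECT 'address', city_id FROM address WHERE city_id < 5
""").fetchall()

for row in rows:
    print(row)
```

To filter the actor rows as well, the first SELECT would need its own WHERE clause.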
Join Emma Saunders as she shows you how to design and write simple SQL queries for data reporting and analysis. Review the different types of SQL, and then learn how to filter, group, and sort data, using built-in SQL functions to format or calculate results. Learn a bit about data types and database design. Discover how to perform more complex queries, such as joining data together from different database tables. Last but not least, Emma shows how to save your queries as views, so you can run them again and again.
- Using different versions of SQL
- Retrieving data with SELECT statements
- Filtering and sorting your results
- Transforming results with built-in SQL functions
- Grouping SQL results
- Merging data from multiple tables
- Identifying data types, and how to make sense of your database design
- Saving SQL queries