When asking questions that involve data from more than one table, there are a few ways of structuring the question. Explore the different kinds of JOIN commands to understand how your question should be phrased.
- [Instructor] When we use a join to associate data across tables, there are a few ways we can go about it. Let's think about our two tables as being on the left and the right in some kind of workspace. Left and right in this sense are the table named first in the select statement on the left and the table we're joining to on the right. If we just leave a statement without any kind of qualifying parameters, we get back a resulting table that's the size of the left table times the right table. We get back one of every row from the right table for every row of the left table.
This is called a cross join. When we ask to join tables based on some given criteria, that creates overlap between the tables where the criteria match. The state abbreviation CA for example on one table would be equated to CA in the other table, thereby joining those records together and associating data from the right table to data from the left table. It's not a one to one relationship necessarily. We could have one key on the right side match that same key in more than one row on the left, but we also might have a condition where records on the right or the left don't have a corresponding key on the other side.
That's why we have other types of joins. To account for situations like that. This basic join, where we're asking for results from both tables where the state field from the people table is set to match up with the abbreviation field on the states table is actually what's called an inner join. Inner and outer refer to the space where records match or overlap. Here, CA and VA match up but DE and MA don't match up. CA and VA would be considered inner because they're in the overlap space and DE and MA would be considered outer.
The inner join asks for records that overlap. So, that's what we see in the result over here. If we wanted to see all of the data from one table or another and the matches where there's a match happening we can use this outer concept to make that happen. To get back a result for every row in the left table I'd use left join which is the same as left outer join. And here in the result there is a row for every row of the left table. The people table. And wherever there is a match from the right table I have that information too.
But if there is no match on the state abbreviation, for example here with Devin from Massachusetts I get back null values because there's no information about that key in the right table. So my result is the same length as the left table with a left join. If I wanted to make sure that I saw all the values on the right table instead, I could use right join. And in this case I see all the record values from the right table, some of which match, and if there isn't a match I just see the values from the right table with null in the fields for the left table where there's no data.
Delaware here is in the right table but it doesn't have a coordinating row in the left table and Devin from Massachusetts doesn't appear because Massachusetts isn't in the right-most table and there's no match on MA. And we can ask for a row representing each record in both tables with a full outer join. This gives us the matches and also unmatched records from left and right tables. There are more complex kinds of joins that you may come across as you extend your skills with SQL. But, these are the basics.
You should also be aware that not all software supports all of the kinds of joins that are available in SQL. For example, MySQL doesn't support full outer join. And SQLite doesn't support right join or full outer join. We can see some of these joins in action here in the database software. Let's join up our quiz participants, names, and state abbreviations, with the full state name from the states table. I'll write SELECT people.first_name, people.last_name, people.state, and states.state_name.
FROM people JOIN states ON people.state equals states.state_abbrev;. There we go. We can switch up the order that we're joining the tables too and we'll see that we get an different order of records. Instead of saying FROM people I'll say FROM states JOIN people.
This puts states on the left instead of on the right and people on the right instead of the left. I'll press run and I can see a different order of records. And now that the states table is on the right and the people table is on the left let's switch this to a LEFT JOIN so we get all the records from the states table instead of just the ones that match. There is 1,003 rows now rather than the 1,000 we had before.
Scrolling down to the bottom I can start to see why that is. There's at least one record here in the states table that doesn't have corresponding information in the people table. And given that I have 1,003 records, I'm willing to bet that there are a few other states that don't match up with people in our people table. Let's write a query to match up the state abbreviations in the people table with those in the states table so we can see what's missing. I'll change my SELECT clause to show me DISTINCT values from the state field on the people table and the state abbreviations from the states table.
And then, just to keep things clean, I'll ORDER them BY people.state. I'll run that and here I can see, right up at the top, for my states table I have Maine, Rhode Island and Wyoming but that doesn't match any values in the people table. None of our quiz participants live in Maine, Rhode Island, or Wyoming. Interesting.
- Name the predicate of the following statement: SELECT EyeColor, Age FROM Student WHERE FirstName = 'Tim' ORDER BY LastName ASC;
- Explain what to use to enforce the order in which an expression must be evaluated if the WHERE clause contains multiple expressions to evaluate.
- Identify the best option to join two tables in a database to be able to display data from both.
- List a data type that is not numeric.
- Determine the result of running the following statement on a table containing columns col_1 and col_2:
- INSERT INTO Box (col_1, col_2) VALUES ('A', 'B'), ('A', 'B'), ('A', 'B'), ('A', 'B');
- Determine the best approach of deleting Jon Ramirez (ID 3452) from a Student table.