Easy-to-follow video tutorials help you learn software, creative, and business skills.Become a member
As you get more comfortable with writing SQL statements, you'll find yourself wanting to combine information from multiple tables more often than not and though a join is a classic way of doing it, it's not the only way. There is another way called a sub- query and let me explain that by kind of taking you step-by-step through the roadmap of how you might get there. So we start with something really simple. We've got a SELECT * statement here. It'll give me everything from SalesOrderHeader. Now if I execute that, I'm getting all the totaling of my sales orders, including things like the customer IDs and the subtotal, and the total due.
And I of course can scan this information just fine and look at it, but what if I'm more interested in specific stuff. What about if I want to know information about the biggest order that ever happened? Well, I could, of course, just do something very simple like an ORDER BY. TotalDue, which of course will give it to me ascending which might not be that helpful. So let's do TotalDue descending. We execute that and we know then that the first row should have the maximum number that was ever ordered, and we could scan more of that and look and try to find our customer ID and so on.
But we're getting way more information back than we want. Really what I'm interested in is customer information. Who did this? Who placed this order? Well, I could go to something simple. Instead of saying SELECT *, we could say SELECT CustomerID, execute, and we're getting them returned in their order. But I don't need the rest of them. I just want this one. Well, we could, knowing what we know now, even say SELECT TOP 1 and we've got that customer ID. But the issue is, I don't know who this is.
What I'm interested in is what company is this, who was the contact, and that information is stored in a different table, in the Customer table. Okay, so we could do a join here. I'm going to show you how you do with the join and then with the sub-query. So if we want to do a join, we know that we need to add the JOIN word after the first table and this is going to be an inner join, because I'm not interested in all the other customers. I'm just interested in one. So we're going to say INNER JOIN SalesLT.Customer.
Now, of course, because it's immediately saying this is now an ambiguous column, because customer ID is in both tables, we better do the little a and b to say which tables these are. I'm going to say SELECT TOP 1 a.CustomerID. But we're still missing one thing. Even though I've named two tables, it doesn't know how to join them. So what I'm going to do is the join word, which is ON, and we have to name the columns. Well, we're joining them based on the column CustomerID that's in both the tables, so a.CustomerID = b.CustomerID.
Now if I execute this, I should get exactly the same result, because I'm not asking for any new information to be retrieved yet. So let's just double-check that. Yup, 29736. But what I can do is now comma b. and I get CompanyName, b.FirstName, b.LastName. If I execute it, we can get all that information coming back.
Now what's the issue? Well, not a lot. I mean if this was commented, which we could do in the raw SQL with just fetch details for the biggest order, it might be obvious what it is. But if I was looking at this SQL six months down the line, I might be a little challenged to figure out exactly what it was doing because it's not all that clear. But it works, which is fine. So let me show you the other way of doing this that doesn't involve doing a join. So I'll take it back to what we had just a minute ago, which was fetch me that TOP 1 CustomerID, no join needed.
We're getting that same CustomerID back. Well, here's the deal. I want the information from the Customer table. So I am going to approach it from the other end completely. I'm just going to move that down to the bottom a little bit. I'm going to write a completely separate SELECT statement that describes the information I want, which is SELECT FirstName, LastName, CompanyName FROM the Customers table WHERE CustomerID equals.
The question is, equals what? Well, I want this information from the Customers table, whether CustomerID is equal to what I got back from this query, and that's what a sub-query is. We're going to take the query that returns the CustomerID and just feed it into the first one. I'm going to just put it in the parentheses here. So this first query, the inner query or the sub-query, will be executed first. We'll end up with the result 29736 and it would be the same as saying give me the FirstName, LastName, and CompanyName from the Customer table, where CustomerID equals 29736.
So we execute that and we get the FirstName, LastName, and CompanyName, and this is a sub-query, a query within a query. Now it might sound like a sub-query and a join are just choices, you just make the decision one or the other. Well, sometimes that's true, but you will find as you get more familiar with them sometimes it's more natural to do a join, sometimes it's more natural to do a sub-query, because there is crossover, but they're not always the same. There are instances when a sub-query is a much better, in fact, sometimes the only possible way to get some data coming back.
For example, let me take you through another example of a sub-query, but this time it will be something that will be quite challenging to do in a join. What I start to do is perhaps I start writing a query like this. I want the FirstName, LastName, and CompanyName from the Customer table where these customers have not placed any orders. The question is well, how do I find that information? Now particularly when you've got a SQL statement that needs to find some kind of negative information, give me information where this doesn't join to something, that can be a little challenging.
And oftentimes, a little trick of this is to turn it on its head. Well, let's say how would I get the first name, last name, and company name for all the customers who have ever ordered anything? But what I really want is that information where the CustomerID is to be found in the SalesOrderHeader table. So I could do that with the sub-query. I'm going to create my sub-query down here and then copy it up into the top level. I'm going to say SELECT CustomerID FROM SalesLT.SalesOrderHeader.
Now this little SELECT statement will give me a lot of different CustomerIDs and it could give me the same ones again and again and again. I only want the same CustomerID once. If CustomerID 75 is placed seven orders, I don't need seven CustomerIDs of 75. I just want one. So I'm going to say SELECT DISTINCT. So if this part, which is going to be my sub-query, returns a whole bunch of different Customer IDs, how do I tie it into the first one? Well, this is how.
We use the keyword IN, which is going to allow us to provide a range. Again, the sub-query is evaluated first. Give me all the distinct customer IDs that are found in that SalesOrde Header table and then use that to bring me the information from the Customer table. So I execute that and we are actually getting now 32 rows, which sounds about right because there were 32 orders. But what I originally asked for was I want the information of the customers that have not placed an order, whatever that means. Maybe it was canceled, maybe didn't make it all the way through.
Well, because we can use IN, we can also use NOT IN. We're taking a positive result of selecting the distinct customer IDs and getting it to feed us into really kind of a negative result. I want the customer information where this is not true. They're not in SalesOrderHeader. So I execute that and this one I get 815 rows, a whole bunch of orders that haven't been completed. So this is a second example of the kind of things that you could do with a sub-query.
Now word of warning. This kind of information, particularly on a group of large tables, could be quite an inefficient query to run if you're starting to scan through tables to find a bunch of results and then say well, where do all these results not match in some other scanned table? So a little word of warning there. Pay attention to that and as you get a bit more familiar with SQL, we're going to start running into things like analyzing our queries and taking a look at how quick things are going to run.
Get unlimited access to all courses for just $25/month.Become a member
82 Video lessons · 97510 Viewers
80 Video lessons · 141053 Viewers
59 Video lessons · 59403 Viewers
52 Video lessons · 72725 Viewers
Access exercise files from a button right under the course name.
Search within course videos and transcripts, and jump right to the results.
Remove icons showing you already watched videos if you want to start over.
Make the video wide, narrow, full-screen, or pop the player out of the page into its own window.
Click on text in the transcript to jump to that spot in the video. As the video plays, the relevant spot in the transcript will be highlighted.