Join Simon Allardice for an in-depth discussion in this video, Introduction to relationships and foreign keys, part of SQL Server 2008 Essential Training.
A vital part of designing any database is the ability to create relationships between your tables. Let's go through an example. We have a simple product table, dbo.Product, here. It has its own primary key, the ProductID, which is automatically generating a unique number for each product as it's entered into the database. And one of the benefits of having that primary key is we can use it elsewhere. So we have a different table, this time called OrderItem, helping us construct our orders.
It has its own primary key with its own automatically incremented value, but what's a bit more interesting is that it has another column here, ProductID. Now ProductID in OrderItem should reference the ProductID in the Product table, but whereas in the Product table it has to be unique, in OrderItem it doesn't. We could have, for example, the ProductID 1001 three times in OrderItem, or four times, or a dozen, all referencing the same product.
We could have 1010 pointing to our 1010 product. We could have 1002. This is the relationship that we're talking about between our tables. In the Product table, ProductID is referred to as the primary key. It is unique. In the OrderItem table, ProductID is referred to as a foreign key. It's a key to some other table, allowing us to join our information together. The benefit of having this relationship described is that we can go either way: we can go from the OrderItem and get product details, or we can go from the product and find how many order items have been created for that product.
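The Product and OrderItem relationship described above can be sketched in code. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server (in T-SQL the auto-generated keys would typically be IDENTITY columns), and the Name and Quantity columns and the product names are hypothetical, not taken from the video.

```python
import sqlite3

# SQLite (standard library) stands in for SQL Server in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE Product (
    ProductID INTEGER PRIMARY KEY,   -- primary key: unique per product
    Name      TEXT NOT NULL          -- hypothetical column
);
CREATE TABLE OrderItem (
    OrderItemID INTEGER PRIMARY KEY, -- OrderItem's own auto-assigned key
    ProductID   INTEGER NOT NULL
        REFERENCES Product (ProductID),  -- foreign key: values may repeat
    Quantity    INTEGER NOT NULL     -- hypothetical column
);
""")

conn.executemany(
    "INSERT INTO Product (ProductID, Name) VALUES (?, ?)",
    [(1001, "Widget"), (1002, "Gadget"), (1010, "Gizmo")],
)
# ProductID 1001 appears three times; 1010 and 1002 once each.
conn.executemany(
    "INSERT INTO OrderItem (ProductID, Quantity) VALUES (?, ?)",
    [(1001, 2), (1001, 1), (1001, 5), (1010, 1), (1002, 3)],
)

# Direction 1: start from each order item and fetch its product details.
item_details = conn.execute("""
    SELECT oi.OrderItemID, p.Name
    FROM OrderItem AS oi
    JOIN Product AS p ON p.ProductID = oi.ProductID
    ORDER BY oi.OrderItemID
""").fetchall()

# Direction 2: start from each product and count its order items.
items_per_product = conn.execute("""
    SELECT p.ProductID, COUNT(oi.OrderItemID)
    FROM Product AS p
    LEFT JOIN OrderItem AS oi ON oi.ProductID = p.ProductID
    GROUP BY p.ProductID
    ORDER BY p.ProductID
""").fetchall()

print(item_details)
print(items_per_product)
```

Both queries rely on the same pair of columns; the foreign key is what lets the join run in either direction.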
And when you have a relationship like this it's what's defined as a one-to-many relationship. For every one product we can have many order items. Now you'll end up creating dozens or potentially hundreds of relationships when you design your databases. If you're using SQL Server Management Studio to show a database diagram you'll actually see that relationship defined between the tables. And as you create more and more complex databases, you'll find that you're going to have relationships between most of your tables.
What I'm looking at here is just a zoomed out diagram of the AdventureWorks database and that's not even a particularly large database. It's very, very common to have relationships between your tables. And there really are two main kinds of relationships that we care about. The first one, the most common, is what we just explored, the idea of a one-to-many relationship. So one customer has many orders, one category has many products in it, one department has many employees. Now bear in mind that when you describe a one-to-many relationship you don't actually have to have many. Any one customer might only place one order.
Might even place zero orders, but one customer can have many orders, one category can have many products, one department can have many employees. But the important point here is that, at least according to your own business rules, the opposite of these statements is not true: in your business an order has one customer. You don't say an order can have many customers. It can't. An employee has one department. A product has one category. So you can view this either from the top-down or the bottom-up, whatever makes sense.
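The "zero, one, or many" point can be illustrated with a short sketch. The assumptions are worth stating: Python's built-in sqlite3 module stands in for SQL Server, and the table names Customer and CustomerOrder, the Name column, and the sample customers are all hypothetical. The last step shows one way the "an order has one customer" rule might be backed by the database: an order pointing at a customer that does not exist is rejected.

```python
import sqlite3

# SQLite (standard library) stands in for SQL Server in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL          -- hypothetical column
);
CREATE TABLE CustomerOrder (          -- hypothetical name for the orders table
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER NOT NULL       -- exactly one customer per order
        REFERENCES Customer (CustomerID)
);
""")

conn.executemany(
    "INSERT INTO Customer (CustomerID, Name) VALUES (?, ?)",
    [(1, "Ana"), (2, "Ben"), (3, "Cy")],
)
# Ana places three orders, Ben places one, Cy places none.
conn.executemany(
    "INSERT INTO CustomerOrder (CustomerID) VALUES (?)",
    [(1,), (1,), (1,), (2,)],
)

# LEFT JOIN keeps customers that have no orders, so the count can be zero.
orders_per_customer = conn.execute("""
    SELECT c.Name, COUNT(o.OrderID)
    FROM Customer AS c
    LEFT JOIN CustomerOrder AS o ON o.CustomerID = c.CustomerID
    GROUP BY c.CustomerID, c.Name
    ORDER BY c.CustomerID
""").fetchall()

# An order must point at an existing customer; customer 99 does not exist.
try:
    conn.execute("INSERT INTO CustomerOrder (CustomerID) VALUES (?)", (99,))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orders_per_customer)
print(orphan_rejected)
```

Because CustomerID in CustomerOrder is a single column, each order row can name only one customer, which is the "opposite is not true" half of the relationship.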
Now as this is the most common thing that you're going to see when you're looking at a diagram in SQL Server Management Studio, you'll see the relationship defined like this with this icon. The key represents the one and the infinity sign represents the many. So when you're looking at this diagram it doesn't matter whether the tables are shown on top or underneath. What's important is where the key is. The next kind of relationship we can have is called a many-to-many relationship. It's not as common as one-to-many, but it's still quite common.
Now you often have to think about this one, because these business situations can often feel like one-to-many at first. Let's say we have an author table with a list of a few authors inside it and their names, and some kind of key and identity for them, and we also have a title table with the list of book titles. Well, at first glance what we might say is okay, this is a one-to-many relationship. We could take author number 74, Jordan Winters, and say that we want to represent that this author has written both book 1, DB Design, and book 3, SQL Server.
And we can say that Fred Summers, author number 75, wrote book 2 on SharePoint. And if this was a classic one-to-many relationship, one author, one or more titles, we could add a new column to the title table. This would be an author ID column. It would be a foreign key to the author table. But here is the issue. What happens if in a day, or a week, or a month we say, well, actually the SQL Server book was written by two authors? Well, the way that we have it right now, that author column can only store one value, a foreign key to author ID 74.
And this is what we mean by a many-to-many relationship. One author can have many titles, one title can have many authors. Now the way some people try and model this is they'll add a new column to the title table. They'll put in author ID 2 and they'll have that be 76 and point to John Marr. However, adding new columns to your tables, and particularly this kind, what are called repeating groups or repeating columns, is a really, really bad idea and it's a definite no-no in database design.
So we'll get rid of that idea. Well, some other people think, "Well I'm going to cheat a little bit and I'll just do something quick and dirty and I'll just slide in a little set of comma-separated values there," so that the author column can point to both 74 and 76. But that's just a cheat, and like adding a new column this is not suggested either. In fact we're going to solve this problem by getting rid of the author column entirely. So we go back to two completely detached tables and what we do to fix this is we add another table.
We add what's often referred to as a junction or a linking table. Now the only reason for this table to exist is to join author and title together. So in fact the name of this table by convention would be authortitle. It could also be titleauthor. It doesn't really matter which way round it goes, because what we're going to do is set up two one-to-many relationships. In fact, in SQL Server or in any other relational database, you cannot express a many-to-many relationship directly. You can only do two one-to-many relationships.
So what we'll do is we'll define a one-to-many relationship from author to authortitle, and another from title to authortitle. So one author with ID 74 can exist in the AuthorID foreign key column in authortitle twice, or three times, or five times, or a dozen. Using that we can go from author to authortitle, find a TitleID, and map it up to the title table. And we can also go the other way. We can take the identity of a title like number 3, take it back to the authortitle table, to two rows there, so a one-to-many relationship back the other way, grab the AuthorID, and follow that back up to the author table.
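The junction table walkthrough above can be sketched as follows, using the author and title values mentioned in the video. The assumptions: Python's built-in sqlite3 module stands in for SQL Server, the Name columns are hypothetical, and making the pair (AuthorID, TitleID) the primary key of authortitle is one common design choice rather than something the video specifies.

```python
import sqlite3

# SQLite (standard library) stands in for SQL Server in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE author (AuthorID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE title  (TitleID  INTEGER PRIMARY KEY, Name TEXT NOT NULL);

-- Junction table: two one-to-many relationships, one to each side.
CREATE TABLE authortitle (
    AuthorID INTEGER NOT NULL REFERENCES author (AuthorID),
    TitleID  INTEGER NOT NULL REFERENCES title (TitleID),
    PRIMARY KEY (AuthorID, TitleID)   -- assumption: each pairing stored once
);
""")

conn.executemany(
    "INSERT INTO author (AuthorID, Name) VALUES (?, ?)",
    [(74, "Jordan Winters"), (75, "Fred Summers"), (76, "John Marr")],
)
conn.executemany(
    "INSERT INTO title (TitleID, Name) VALUES (?, ?)",
    [(1, "DB Design"), (2, "SharePoint"), (3, "SQL Server")],
)
# Author 74 wrote titles 1 and 3; title 3 has two authors, 74 and 76.
conn.executemany(
    "INSERT INTO authortitle (AuthorID, TitleID) VALUES (?, ?)",
    [(74, 1), (74, 3), (75, 2), (76, 3)],
)

# Author -> authortitle -> title: every title written by author 74.
titles_by_author_74 = [row[0] for row in conn.execute("""
    SELECT t.Name
    FROM authortitle AS link
    JOIN title AS t ON t.TitleID = link.TitleID
    WHERE link.AuthorID = 74
    ORDER BY t.TitleID
""")]

# Title -> authortitle -> author: every author of title 3.
authors_of_title_3 = [row[0] for row in conn.execute("""
    SELECT a.Name
    FROM authortitle AS link
    JOIN author AS a ON a.AuthorID = link.AuthorID
    WHERE link.TitleID = 3
    ORDER BY a.AuthorID
""")]

print(titles_by_author_74)
print(authors_of_title_3)
```

Adding a second author to the SQL Server book is just one more row in authortitle; neither the author table nor the title table needs a new column.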
So we are now expressing a many-to-many relationship. And if you go looking in, say, some of the sample databases, whether it's the larger ones like AdventureWorks or even the small ones like AdventureWorksLT, look for any table name that seems to join the names of two other tables. So we have ProductModel, ProductDescription, and then ProductModelProductDescription. You can make the assumption that the one with the long name is simply there to join ProductModel and ProductDescription together.
In fact, if you want to confirm that, what we could do is make a new database diagram. I'm going to add just those three tables that I suspect are related. I'll start from ProductDescription, hold down Ctrl and select ProductModel and ProductModelProductDescription, click Add, close this, and what I can actually see here is my many-to-many being rendered out. ProductModel is the one to the many of ProductModelProductDescription, and ProductDescription is likewise the one to the many.
The only reason for the existence of this table is to join the other two together. In a large database you're going to end up with a lot of one-to-many relationships between your tables and a few of those will really be used to create a many-to-many relationship. Officially, there is also a one-to-one relationship that is possible, but it's not common at all. If you think about it, if one row in one table is pointing to one row and only one row in another table, well you might as well just combine them so it's just one row in both places.
Although also bear in mind that if those are the official three kinds of relationships, one-to-many very common, many-to-many quite common, and one-to-one not common at all, you also have what some people consider a fourth kind of relationship. Zoom into any large database diagram and you'll find things like this, a table just existing without any relationships. Some people consider "none" to be a relationship. If you get obsessive about what these relationships are, again the geek trivia term for what describes the relationship between tables is the cardinality.
Some people say none is an official relationship. I don't think so, but you'll certainly see that a lot. You don't have to connect your tables to other parts of your database. So the key question is going to be, how do we do this? And we'll see that next.
- Using T-SQL (Transact-SQL)
- Managing databases with SQL Server Management Studio
- Understanding database normalization
- Using SELECT statements
- Building indexes
- Monitoring database size and integrity
- Backing up and restoring databases
- Creating functions and stored procedures
- Managing database permissions
- Creating and formatting reports
- Adding charts to reports
- Creating and executing a simple SSIS package