Join Martin Guidry for an in-depth discussion in this video Using first normal form, part of SQL Server 2012: Designing Database Solutions.
- In this section, I'd like to talk about First Normal Form. First Normal Form states that the tables in our database will have no repeating columns. First let's look at an example that does have repeating columns, and we'll see why this can be a problem. Here is a hypothetical Orders table, for a database, and if we look at the data, we have customer John Doe placed an order for a Red Bicycle, Black Socks, and Red Shoes, all in one order.
This database is designed to hold that data in three repeating columns. Those repeating columns being Item1, Item2, and Item3. For John Doe's order, because he ordered three items, this works perfectly fine. We have a little bit of a problem with Jane Smith. Jane Smith only ordered one item. Therefore, the column named Item2 and the column Item3 are going to be blank for her order, which is a little inefficient, but not too big of a problem.
The biggest problem with this happens when a customer wants to order four items in the same order. There would really be no way for this structure to support four items in one order. We could, of course, add another column, called Item4, but then if a customer came back and wanted to order five items, well we wouldn't be able to support five items, we'd have to add another column, and we could continue doing this forever. So for these reasons, using repeated columns can be a mistake.
Now let's look at a similar structure that doesn't utilize repeated columns. In this design, we'll have to break down the orders into two tables. One of the tables creating an OrderID, and the other table, listing for each OrderID, all of the items that have been purchased. So, now John Doe, with his three item order, instead of getting three columns, he gets three rows. One row for each item in his order, and that means if he wants to order a fourth, a fifth, a sixth item, that's not a problem, because we can continue adding rows to this table.
It also makes the other order more efficient. The order where a user only requested one item, now only takes one row. We see that with OrderID number two, the blue shirt. We can see some examples of this in our AdventureWorks Database. I'm gonna open up Management Studio, and look at AdventureWorks 2012. Look at the tables. I'll scroll down a bit, here, and near the bottom, there is Sales, SaleOrderHeader, and SaleOrderDetail.
So, if we look at some sample data from SaleOrderHeader, we'll see that every Sales Order is given an ID, just an integer, and then that links back to the other table, SalesOrderDetail. If we look at some sample data from there, we see that for every item, a new row is generated, and you'll notice, these top seven rows all have the same SalesOrderID. In other words, they were all part of the same order.
Someone must have ordered seven different items, and we'll see the ProductID out here, of what they ordered, and the quantity. So, rather than have seven different columns to support this order, we can do it in seven different rows with this technique. As you see, this like some of the other normalization techniques cause us to generate an additional table, and have a well defined relationship between the two tables.
- Planning the database infrastructure
- Designing databases vs. data warehouses
- Designing the physical database implementation
- Planning for partitioning
- Adding FileStream data
- Monitoring server health
- Managing SQL Server with PowerShell