SQL Server 2008 Essential Training
Illustration by Mark Todd

Database normalization


with Simon Allardice

  1. 2m 21s
    1. Welcome
      1m 19s
    2. Using the exercise files
      1m 2s
  2. 17m 58s
    1. SQL Server core concepts
      9m 4s
    2. SQL Server editions
      3m 8s
    3. Applications included with SQL Server
      5m 46s
  3. 26m 1s
    1. Preparing for installation
      3m 44s
    2. Creating service accounts
      2m 33s
    3. Installing SQL Server
      11m 42s
    4. Post-installation checks
      3m 9s
    5. Installing sample databases
      4m 53s
  4. 13m 35s
    1. Introduction to SQL Server Management Studio
      8m 7s
    2. Introduction to SQL Server Books Online
      3m 6s
    3. SQL Server system databases
      2m 22s
  5. 1h 26m
    1. Planning your database
      9m 39s
    2. Creating a SQL Server database
      4m 7s
    3. Creating tables
      7m 51s
    4. Data types in SQL Server
      12m 25s
    5. Defining keys
      8m 9s
    6. Creating default values
      4m 39s
    7. Creating check constraints
      2m 25s
    8. Creating unique constraints
      4m 34s
    9. Introduction to relationships and foreign keys
      9m 51s
    10. Creating relationships in SQL Server Management Studio
      8m 14s
    11. Database normalization
      11m 47s
    12. Creating computed columns
      3m 10s
  6. 23m 11s
    1. Using the SQL Server Import and Export Wizard
      3m 58s
    2. Importing Excel files into SQL Server
      6m 11s
    3. Importing CSV files into SQL Server
      5m 27s
    4. Importing Access databases into SQL Server
      7m 35s
  7. 55m 29s
    1. Introduction to Transact-SQL
      3m 43s
    2. Using SELECT statements
      7m 16s
    3. Changing the default database
      2m 21s
    4. Creating conditions in SQL
      8m 10s
    5. Sorting your output
      3m 23s
    6. Using aggregate functions
      7m 12s
    7. Finding unique values
      2m 14s
    8. Using subqueries
      9m 33s
    9. Joining multiple tables together
      8m 0s
    10. Viewing execution plans
      3m 37s
  8. 19m 36s
    1. Writing INSERT statements
      5m 47s
    2. Writing UPDATE statements
      4m 38s
    3. Writing DELETE statements
      2m 54s
    4. Using the OUTPUT clause to return inserted keys and GUIDs
      6m 17s
  9. 32m 52s
    1. Introduction to SQL functions
      6m 26s
    2. Using SQL configuration functions
      2m 14s
    3. Using string functions
      7m 26s
    4. Using date functions
      6m 27s
    5. Creating user-defined functions
      10m 19s
  10. 28m 46s
    1. Introduction to stored procedures
      4m 23s
    2. Creating stored procedures
      11m 23s
    3. Introducing transactions
      4m 23s
    4. Creating transactions
      8m 37s
  11. 16m 39s
    1. Understanding and creating indexes
      6m 32s
    2. Monitoring and rebuilding indexes
      6m 0s
    3. Monitoring database size and integrity
      4m 7s
  12. 11m 41s
    1. Creating backups
      4m 21s
    2. Creating differential backups and using backup compression
      3m 40s
    3. Restoring databases
      3m 40s
  13. 17m 40s
    1. Introduction to SQL Server security and permissions
      5m 54s
    2. Adding a Windows user to the database
      5m 7s
    3. Creating SQL Server logins and switching authentication modes
      6m 39s
  14. 36m 41s
    1. Introduction to SQL Server Reporting Services
      2m 52s
    2. Connecting to the Report Manager
      4m 29s
    3. Using Report Builder
      12m 4s
    4. Formatting values in reports
      4m 17s
    5. Adding indicators to reports
      5m 11s
    6. Adding charts to reports
      3m 54s
    7. Working with report security
      3m 54s
  15. 24m 41s
    1. Introduction to SQL Server Integration Services (SSIS)
      1m 57s
    2. Using Business Intelligence Development Studio (BIDS)
      6m 59s
    3. Creating and executing a simple SSIS package
      7m 35s
    4. Importing packages into SQL Server Management Studio
      3m 21s
    5. Scheduling jobs with SQL Server Agent
      4m 49s
  16. 31s
    1. Goodbye

6h 54m Beginner Dec 15, 2010


In SQL Server 2008 Essential Training, Simon Allardice explores all the major features of SQL Server 2008 R2, beginning with core concepts: installing, planning, and building a first database. Explore how Transact-SQL is used to retrieve, update, and insert information, and gain insight into how to effectively administer databases. The course also covers features outside SQL Server's database engine, including technologies that have grown up around it: SQL Server Reporting Services and Integration Services. Exercise files are included with the course.

Topics include:
  • Using T-SQL (Transact-SQL)
  • Managing databases with SQL Server Management Studio
  • Understanding database normalization
  • Using SELECT statements
  • Building indexes
  • Monitoring database size and integrity
  • Backing up and restoring databases
  • Creating functions and stored procedures
  • Managing database permissions
  • Creating and formatting reports
  • Adding charts to reports
  • Creating and executing a simple SSIS package

Database normalization

Database normalization is the process of taking your database design through a set of rules called normal forms so that it conforms to relational database standards. You really want to do this so that your database will contain a minimum of duplicate or redundant data. It'll contain data that's easy to get to, define, edit, and maintain, and you'll be able to perform operations on your database, even difficult ones, without creating garbage inside and without invalidating its state.

It should be carried out for every database you design, and it's really not that hard. Yes, when you first start reading about database normalization, you're likely to run into phrases like "your database won't be in third normal form until every non-prime attribute of R is non-transitively dependent (i.e. directly dependent) on every candidate key of R," but you don't have to get into all this language. You just have to understand that these are a set of rules developed about 40 years ago by E. F. Codd, the father of the relational database, and we step through them basically one, two, three: first normal form, second normal form, third normal form.

So what's first normal form? Well, it starts off with stuff we've been doing already. Your data needs to have a unique key. It should always have a unique key. There are a few very rare situations in which you don't have a unique primary key, but we're going to have one for all our databases. Really, the key for first normal form is that each of your columns, each of your fields, should contain one value and just one value, and there should be no repeating groups. Okay, what does this mean with our actual tables? Well, let's say, for example, I begin developing a customer table.

I've got a customer ID, so that's good. We've started off toward first normal form. I've got the name of the customer and the city they're based in. Then what I decide to do is say that all our customers have a representative, the person we talk to. So I add another column to the table. This would be the customer contact. Who do we speak to at ACME Corp? Who do we speak to at Two Trees or Acacia? The issue is what happens when one of these companies starts to grow a little bit, and we find out that we've got more than one contact. Well, there are a couple of ways you could deal with it.

You could just start stuffing extra data into that one column. So we could just start putting in commas or any other delimiter and putting multiple values in the one Contact column. Well, this is a no-no. This is not in first normal form if you do this, because first normal form demands that every column, every field, contains one and only one value. If you decide to shove multiple values in like this, you'll find it harder to search, you'll find it harder to sort, and you'll find it harder to maintain.
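To make the "harder to search" point concrete, here is a minimal sketch of the delimited-column anti-pattern. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the contact names are invented for illustration; only the three customer names come from the transcript.

```python
import sqlite3

# SQLite stands in for SQL Server; contact names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Anti-pattern: several contacts packed into one column.
    CREATE TABLE customer (
        customerID INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        contact    TEXT NOT NULL
    );
    INSERT INTO customer VALUES
        (1, 'ACME Corp', 'MaryAnn Lee'),
        (2, 'Two Trees', 'Bob Stone'),
        (3, 'Acacia',    'Ann Ray, Raj Patel, Mia Cho');
""")

# Finding the contact called Ann forces a substring search on the packed column.
matches = conn.execute(
    "SELECT name FROM customer WHERE contact LIKE '%Ann%' ORDER BY customerID"
).fetchall()

# ACME Corp comes back as well, because 'MaryAnn' contains 'Ann'.
print(matches)
```

The query cannot tell a contact named Ann from any other text that happens to contain those letters, which is the kind of garbage result a one-value-per-column design avoids.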

Well, what some people do is rip that back out, go back to the original design, and then start adding more columns: Contact, Contact 2, Contact 3. This is what's called a repeating group, and there should be no repeating groups. The classic sign of a repeating group is fields with the same name and different numbers. We don't want either of these things. So what do we do? What we do is what we do for a lot of the normalization steps. We rip the Contact data out and create its own customerContact table.

This then has a relationship. We've got a one-to-many relationship between customer and customerContact, where we can go from customer 1 and find the contact, go from customer 2 and find the contact, go from customer 3 and find the three contacts, and this would get it into first normal form. That's step one, because to go on to second normal form, well, first you have to be in first normal form. You don't pick and choose. You go through these one, two, three. Second normal form has the rather puzzling phrase that "any non-key field should be dependent on the entire primary key," and that's about as simply as it can be phrased.
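The first normal form fix described above could be sketched roughly as follows. This is an illustration rather than the course's exercise file: it uses Python's sqlite3 module in place of SQL Server, and the column names, cities, and contact names are assumptions.

```python
import sqlite3

# SQLite stands in for SQL Server; column names, cities, and contacts are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customerID INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        city       TEXT NOT NULL
    );
    -- One row per contact: no delimited lists, no Contact2/Contact3 columns.
    CREATE TABLE customerContact (
        contactID   INTEGER PRIMARY KEY,
        customerID  INTEGER NOT NULL REFERENCES customer(customerID),
        contactName TEXT NOT NULL
    );
""")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
    (1, "ACME Corp", "Springfield"),
    (2, "Two Trees", "Riverton"),
    (3, "Acacia", "Lakeside"),
])
conn.executemany(
    "INSERT INTO customerContact (customerID, contactName) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Cara"), (3, "Dev"), (3, "Eve")],
)

def contacts_for(customer_id):
    """Return every contact name for one customer, sorted alphabetically."""
    rows = conn.execute(
        "SELECT contactName FROM customerContact "
        "WHERE customerID = ? ORDER BY contactName",
        (customer_id,),
    )
    return [name for (name,) in rows]

print(contacts_for(1))  # one contact
print(contacts_for(3))  # three contacts
```

Adding a fourth contact for a customer is now just another row, with no change to the table design.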

Now what does this actually mean? Well, for most of what we've done in this course, this isn't an issue for us. We're already in second normal form. Let me show you a table that currently is in first normal form, but not in second normal form. I have an events table here that has a course ID, a Date, and a CourseTitle. Now what's actually happening is this table has been defined so that it's using two columns as its key. This is what's referred to as a compound primary key.

I can't use the ID column on its own as the key here, because as you see SQL101 appears multiple times, but I can combine the ID with the Date, and in a lot of cases this makes sense. The issue is that if you do this and use a compound key, you need to look at the other columns in this table. So I have got CourseTitle as Intro to SQL. Seats, five seats available. It's in room 14, and a lot of this information is unique to this one entry, and that's fine.

But second normal form demands that all my non-key columns, the things that aren't keys, CourseTitle, Seats and Room, have to be dependent on the entire primary key. Now that is the case for Seats and Room. Those are unique values based on the fact that we're running this course on a particular date in a particular room with a certain number of seats available. But CourseTitle I could get from just the course ID part of the key. This might sound a bit ivory tower, but here's the impact.

What happens if somebody reaches into this table and changes that course ID? It was SQL101, and by accident it's now changed to ASP101. Well, now I've got the wrong title for a piece of data. That's because my data is not in second normal form, and if I now look at this row, ASP101, Intro to SQL, well, which one is right? Is it the wrong ID or the wrong title? I don't know. So how do we fix this? Well, once again we're going to rip out the CourseTitle.

We're going to create a separate courses table where we can map the ID from the events table to the ID in the courses table and always have one specific value for one specific ID, and everything in the events table is based on the whole key, in this case both ID and Date. Again, if you're not using compound keys, it's not really a concern. You can just step ahead, go right through second normal form and into third normal form. About as plain-English as I can describe this one is that no non-key field, meaning nothing that is not part of the primary key, is dependent on another non-key field.
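The second normal form fix described above might look something like this sketch, again using Python's sqlite3 module in place of SQL Server. The years in the dates, the second and third event rows, and the 'Intro to ASP' title are assumptions added for illustration.

```python
import sqlite3

# SQLite stands in for SQL Server; dates, extra rows, and 'Intro to ASP' are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- CourseTitle depends only on the course ID, so it moves to its own table.
    CREATE TABLE courses (
        courseID    TEXT PRIMARY KEY,
        courseTitle TEXT NOT NULL
    );
    -- Seats and room depend on the whole compound key (courseID + eventDate).
    CREATE TABLE events (
        courseID  TEXT NOT NULL REFERENCES courses(courseID),
        eventDate TEXT NOT NULL,
        seats     INTEGER NOT NULL,
        room      INTEGER NOT NULL,
        PRIMARY KEY (courseID, eventDate)
    );
    INSERT INTO courses VALUES ('SQL101', 'Intro to SQL'), ('ASP101', 'Intro to ASP');
    INSERT INTO events VALUES
        ('SQL101', '2011-04-02', 5, 14),
        ('SQL101', '2011-05-07', 9, 11),
        ('ASP101', '2011-04-02', 3, 8);
""")

# The accident from the transcript: an event's course ID changes to ASP101.
conn.execute(
    "UPDATE events SET courseID = 'ASP101' "
    "WHERE courseID = 'SQL101' AND eventDate = '2011-05-07'"
)

# The title is looked up from the ID, so the row can no longer contradict itself.
titles = conn.execute("""
    SELECT e.courseID, e.eventDate, c.courseTitle
    FROM events AS e
    JOIN courses AS c ON c.courseID = e.courseID
    ORDER BY e.eventDate, e.courseID
""").fetchall()
print(titles)
```

After the update there is still only one title per course ID, so an "ASP101, Intro to SQL" row cannot occur.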

This is in a way similar to second normal form. It's still asking, can I figure out any of the fields I have from other fields that I have? So for example, I'm looking at an updated version of the events table. This is in both first normal form and second normal form, but it's not in third normal form. It's in first normal form: I have got my key, I don't have any repeating groups, and I don't have multiple values packed into a column. It's in second normal form because I have decided to change it to have a one-column primary key, which is EventID. But it's not in third normal form. Why? Well, what I can do is scan through my non-key fields, which for me is everything other than EventID.

SQL101 is occurring on the 2nd of April. There are apparently five seats available. It's being held in room 14, which has a capacity of 18. These values, the date, the availability, the room, could be different from row to row, so they're fine. The issue is with the Capacity column. If we look at the room, Room 14 has 18 seats and Room 11 has 24 seats and Room 8 has 12 seats. Well, that means one non-key field that we have, Capacity, is dependent on another non-key field, Room.

If we can figure out Capacity from Room, we don't need to store it in the same table. We need to, you've guessed it, split this out into its own table. So we need to pull Capacity out of this table and just keep Room. That works as long as Room can always tell us the capacity, if we have, say, a Room table. Another example of third normal form would be this, which is quite common. You'll often see an orderItems table, which has an ID and a ProductID and a UnitPrice and a Quantity and a Total, but if I look here, the Total is based on Quantity times UnitPrice.

Quantity and UnitPrice are both non-key fields. We can figure out what Total is from the other fields that we have. So we rip it out. Don't store information that's easily ascertained from other non-key fields. Now, in SQL Server you can actually create what's called a computed column, a calculated field that's not really stored in the database. So if you wanted this behavior, to have an easy way to scan the Total, particularly when you've got complex quantities, you can do that, and I'll show you that a little later. But don't store it, because there is nothing that would stop me from storing a UnitPrice of 100, a Quantity of 3, and a Total of 75,000.
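Both third normal form fixes described above can be sketched together. This example uses Python's sqlite3 module rather than SQL Server, and it uses a view to derive Total; that view is a substitute chosen for this sketch, not the SQL Server computed column the transcript mentions. The year in the date, the orderItems column names, and the product ID are assumptions.

```python
import sqlite3

# SQLite stands in for SQL Server; a view replaces SQL Server's computed column here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Capacity depends on Room, so it lives with the room, not with each event.
    CREATE TABLE rooms (
        room     INTEGER PRIMARY KEY,
        capacity INTEGER NOT NULL
    );
    CREATE TABLE events (
        eventID   INTEGER PRIMARY KEY,
        courseID  TEXT NOT NULL,
        eventDate TEXT NOT NULL,
        seats     INTEGER NOT NULL,
        room      INTEGER NOT NULL REFERENCES rooms(room)
    );
    INSERT INTO rooms VALUES (14, 18), (11, 24), (8, 12);
    INSERT INTO events VALUES (1, 'SQL101', '2011-04-02', 5, 14);

    -- Total is derived from UnitPrice and Quantity, so it is calculated, not stored.
    CREATE TABLE orderItems (
        orderItemID INTEGER PRIMARY KEY,
        productID   INTEGER NOT NULL,
        unitPrice   INTEGER NOT NULL,
        quantity    INTEGER NOT NULL
    );
    CREATE VIEW orderItemTotals AS
        SELECT orderItemID, productID, unitPrice, quantity,
               unitPrice * quantity AS total
        FROM orderItems;
    INSERT INTO orderItems VALUES (1, 42, 100, 3);
""")

# Capacity is looked up through the room rather than read from the event row.
capacity = conn.execute("""
    SELECT r.capacity
    FROM events AS e
    JOIN rooms AS r ON r.room = e.room
    WHERE e.eventID = 1
""").fetchone()[0]

# Total always equals unitPrice * quantity, so 100 x 3 can never read as 75,000.
total = conn.execute(
    "SELECT total FROM orderItemTotals WHERE orderItemID = 1"
).fetchone()[0]

print(capacity, total)
```

Because neither Capacity nor Total is stored alongside the fields it depends on, there is no second copy that can drift out of step.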

It doesn't have to make sense, and we want our data to make sense. Now in fact third normal form is quite an odd one, because you will actually find that a lot of tables out there are not in third normal form. A classic example is any table that's full of address information. If I look at a table like this and I see that I've got PostalCode being stored as the last column here, well, I can figure out the City, the State and the name of the state from the PostalCode. That means I have non-key fields that are dependent on another non-key field.

You've probably had situations yourself where you're talking to someone on the phone and filling in an address, and they don't actually ask for the city and state. They just say, "Can I get the postal code?" because all they really need is the postal code. If I get the postal code, I can figure out the rest of it. Now AddressLine1 is not dependent on the PostalCode, so AddressLine1 isn't a problem. It's City, State and the name of the state. We could rip out that information and put it in a separate table. Now, a lot of the time we don't do that, just because it makes it easier and quicker to scan through, say, address tables, and in fact the process of taking it all the way to third normal form, ripping the stuff out and then deciding to put it back in, is what's called denormalization.

But make no mistake. If you decide to store the City and the State and the name of the state in your address table, you are storing redundant data. There is no real reason why you should store the ZIP code 91502 and the city Burbank and the state of CA and the full name California a thousand times, when you could get it all from a ZIP code table with a city and a state in it. So you might denormalize to make things a bit more efficient, but do it knowingly. And those really are the three steps that we would go through.
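For comparison, the fully normalized address design described above might be sketched like this, assuming (as the transcript does) that a postal code determines the city and state. It uses Python's sqlite3 module in place of SQL Server; the table names, column names, and street addresses are invented for illustration.

```python
import sqlite3

# SQLite stands in for SQL Server; names and street addresses are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- City and state are stored once per postal code.
    CREATE TABLE postalCodes (
        postalCode TEXT PRIMARY KEY,
        city       TEXT NOT NULL,
        state      TEXT NOT NULL,
        stateName  TEXT NOT NULL
    );
    -- Each address keeps only what is specific to it, plus the postal code.
    CREATE TABLE addresses (
        addressID    INTEGER PRIMARY KEY,
        addressLine1 TEXT NOT NULL,
        postalCode   TEXT NOT NULL REFERENCES postalCodes(postalCode)
    );
    INSERT INTO postalCodes VALUES ('91502', 'Burbank', 'CA', 'California');
    INSERT INTO addresses VALUES
        (1, '100 Example St', '91502'),
        (2, '200 Sample Ave', '91502');
""")

# The full address is reassembled with a join when it is needed.
full = conn.execute("""
    SELECT a.addressLine1, p.city, p.state, p.stateName, a.postalCode
    FROM addresses AS a
    JOIN postalCodes AS p ON p.postalCode = a.postalCode
    ORDER BY a.addressID
""").fetchall()

# However many addresses share 91502, Burbank/CA/California is stored once.
stored = conn.execute("SELECT COUNT(*) FROM postalCodes").fetchone()[0]
print(full, stored)
```

The join is the cost of this design, which is the trade-off behind denormalizing address tables: the redundant columns buy simpler, quicker reads.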

You can take normalization even further, into what are called Boyce-Codd normal form and fourth and fifth normal forms, but that's really not very typical, and I've very, very rarely run across it. We want to take our database designs through first normal form, which is about our primary key and no repeating groups; second normal form, making sure our data is based on the whole key; and third normal form, making sure our data is based on nothing but the key. Or if you prefer, the quick mnemonic is that your data should always be based on the key, the whole key, and nothing but the key, so help me Codd.

Find answers to the most frequently asked questions about SQL Server 2008 Essential Training.

Q: I'm having problems installing the free Express R2 version of SQL Server on Windows XP. I tried 64-bit and 32-bit versions. In the videos, the author installs from a DVD. Do I need to do the same?
A: While the author installs from a DVD, it's not strictly necessary. There certainly shouldn't be a problem installing the Express edition from a regular download. That's the way it's intended to be installed.

If you're using Windows XP, the only officially supported version is the 32-bit version. However, you do need to make sure that your Windows XP install is completely up-to-date and patched, with XP Service Pack 3 installed. (See http://msdn.microsoft.com/en-us/library/ms143506.aspx#Express32 for formal requirements.)

It's not unusual for the install process to take a while, and with older operating systems like XP, you'll often have to back it out and try again, as there are usually a number of prerequisites that need to be installed (such as the .NET Framework 3.5 SP1 and the correct version of Windows Installer).
Q: The link to the installer for the AdventureWorks sample database, as shown in the Chapter 2 movie "Installing sample databases," no longer works. Where can I find the installer?
A: Microsoft has reorganized its site. The sample files are still there, but they're a bit harder to find. To install them:

1) Visit http://msftdbprodsamples.codeplex.com/.
2) Click the link to "SQL Server 2008 R2 OLTP."
3) Click the AdventureWorks2008R2 Data File link and agree to the conditions to download the MDF file.
4) Move the MDF file to your SQL Server data directory, usually located at C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA.
5) Open SQL Server Management Studio and connect to your instance using an account with administrative privileges.
6) Attach the sample database by right-clicking the Databases folder in the Object Explorer and choosing Attach from the pop-up menu.
7) Click the Add button in the next menu and navigate to the MDF file in the Locate Database Files window that appears. Select it and click OK.
8) Remove the reference to the log file in the "AdventureWorks2008R2" database details pane by selecting the Log entry and clicking Remove.*
9) Click OK to return to SQL Server Management Studio and complete the attachment process.

*MDF files are the "data" files for SQL Server databases. They often come along with log files (LDF files). This one didn't, so we need to remove the reference to the non-existent log file. Select the second row in the lower section (it should say File Type: Log and Message: Not Found) and click the Remove button.
