Join Adam Wilbert for an in-depth discussion in this video Understanding the relational structure, part of Learning Relational Databases (2014).
At their core, all relational databases rely on storing information about the real world in a series of simple two-dimensional data tables. These tables are organized into a series of cells, grouped into horizontal rows, and vertical columns just like what you are probably used to from using Microsoft Excel or Apple Numbers. These rows and columns go by different names in different database systems, and it's not uncommon to hear people interchange them in conversation. But they all mean essentially the same thing. The vertical columns sometimes go by the name attributes, values, or fields.
And they store a single piece of information, such as the first name or last name of the employees in your organization. The horizontal rows most commonly go by the name records, and every record is made up of a series of values stored in the intersecting columns. While every record in the data table has an identical structure horizontally, each record represents a single entity, such as an individual employee or a single product. When talking about relational databases, though, these tables need to follow a few stricter rules for their design than something that you might create in a spreadsheet program like Excel.
I need to go into a little bit of relational database theory here, but it offers a useful definition that you're likely to come across. What we naturally think of as a data table, is more specifically called a relation. The term table implies a loose grouping of rows and columns. The term relation is used for a data table with a few additional characteristics. The first characteristic is that every cell in the table contains only a single value. This means that the values are atomic and cannot or should not be split apart into smaller components. Think of the way you fill out your address when making an online purchase.
I bet you've never typed your address into a single large text box, but rather you placed your house number and street into one box, while your city, state and ZIP codes all get their own data entry fields. This is because the database that holds this information needs it split apart to efficiently process your order. It makes it easy for the retailer to charge tax similarly for everyone in a particular state. Or to target a populous city with shipping discounts. If everything was typed into a single cell, then it would be hard for the system to know where your street address ends and the city begins. The second characteristic of a relation is that all of the values in a single column must be of the same type.
If, for example, you have a column called city, then none of the rows should be trying to store an apartment number there. This keeps the relation clean, and more importantly, predictable, in the types of information you'll find in specific locations. The third requirement is pretty simple. It's that each column has a unique name. Imagine that your online retailer is collecting information about a billing address for your credit card and a separate shipping address where the product is going to go. If they were to store both in columns called address, city, or state, it could get confusing really quickly. Instead, you would want to have separate columns to store billing city, separate from shipping city.
The fourth requirement of a relation is that the order of either the columns or rows is insignificant. The fact that the billing city column appears before the shipping city makes no difference to the type of information stored. And likewise, the fact that one row appears above another row doesn't matter in the functionality of the database. And finally, probably the most important characteristic of a relation is that no two rows are identical. There needs to be at least one piece of information in at least one column that allows you to uniquely identify and separate every row from every other row. These rules will come into play as we further explore relational databases.
But for now you can start to see that the term relation is more appropriate than the generic term table when it comes to creating a place to store our data. That said most DBMSs use the more familiar table language in their interface and documentation. However, it's important to understand that even though we're often a little loose about the terminology, all of the rules and restrictions of the relations still apply. And that regardless if we call it a table or relation, we need to make sure that it follow the characteristics that I just outlined.
- What is a database management system (DBMS)?
- Moving through the database development cycle
- Preventing duplicate, inconsistent, and conflicting data entries
- Gathering requirements
- Developing relationships
- Identifying key fields
- Following a naming convention
- Developing the actual database