Join Tom Geller for an in-depth discussion in this video Planning data structure, part of Drupal 6: Online Presentation of Data.
- View Offline
This video is unusual because there is nothing in it that's specifically about Drupal. Instead we are going to talk about concepts that are essential to understanding later chapters, specifically, how databases are structured. Learning what's in this video is the ounce of prevention that will keep you from needing a pound of cure later because without good planning from the start, you might have to throw everything away and start again. In fact, you will probably find yourself returning to this video from time to time to better understand ideas you'll need, once you start playing with Drupal. So, let's talk about databases. What are they? Well, a database, obviously, is a place that contains data. Well what is data? Here is the kind of data that you are already familiar with: names and address. In this example, we've taken our data that is a name and address and a phone number and so forth and put it into what's called a record. Each one of these is a record. Tom De Nada at 123 Fake Street, etcetera etcetera is a single record. You could stop right here. Before computers, we old timers kept records on 3x5 cards just like this, but there are limits to storing data like this.
For example, let's say that you wanted to sort it by street address, you really couldn't do that because this whole record starts with the letter T for Tom De Nada or S of Sylvia De Nada and when you are sorting something you only sort based on the first letter or first few letters, you would have to look into the middle of the record in order to do a sort on the street name, which would be very difficult computationally. So, we don't set up our databases this way, but there is a solution. You can break records into what are called fields, where if the name is a field and the address is a field and the phone number is a field.
Here is what that looks like, each one of the records in now shown as a row, so Tom De Nada at 123 Fake Street San Francisco etcetera is the same record that you saw before. However, it's been broken into fields, fields with the columns that you see here. So as you look down address 123 Fake Street, 19 South Moggy Street, those are all fields within their individual records. This is what's called a Flat-file database. That compares to a relational database, which we'll get through in just a minute. It's the simplest way to store data. It's the way that most of us find most intuitive.
And once again just to remind you, each row is a record, each column is a field and the whole thing is called a table. But you have many choices when it comes to setting up a database. The first one is, how exactly do you break up the fields? In our example before, we had the full name just under Name and we had the full address just under Address. But we could have done it a different way, if we had broken it up more, we could have separated out the First name from the Last name and the Street number from the Street name and even from the Street type, so a street would be different from avenue would be different from route.
This is what's called making it more atomic, sort of like breaking up a molecule into separate atoms. The advantage to this is that now you would be able to sort not only on the Street number but also the Street name. You could sort not only on the First name but also on the Last name and so forth and you could pull out those pieces of information and replace them more easily once they are in the database program. The disadvantage of this format is in putting in the information usually and also managing it once its there. You have all typed into a form at times and you've typed Tom and then you have to press Tab to go on to the next field and then Tab. You end up pressing Tab a lot and you end up with lots of tiny little atoms, so it is possible to break up a record into two many fields.
But what should be clear from this example is that the record in the first table here, Tom De Nada at 123 Fake Street is exactly the same as the record in the second table here, the only difference is how it's broken up. So, now we have broken up our table into atoms like this. But what if we want to put it back together to print out mailing labels? That's where the database programming comes in. You need to come up with some sort of formula to put it all together. For example we see that here the name goes on one line and then we have to put in a new line symbol for that second line.
So Tom De Nada, new line, 829 14th Street, new line, city comma space state. It gets a little bit complicated and in fact when you look at the formula itself, you see that it can actually be a little bit difficult to understand. The subject of how you break up information into separate fields and then how you put it back together again is the very essence of this course. The first part is how to get the information into the database, what fields you want actually to be separated out. We cover that when we talk about an area called Content Construction Kit or CCK. That's the module that's available for Drupal which lets you decide what fields to go into a particular content type.
So in this case we have something, let's call it a person content type and we have broken it up into a Name, Address, City and so forth. Then each time we create a record, we'll put in the individual's Name, Address and City and so forth. Again we can break it up in many different ways but however we do it, we implement it using this Content Construction Kit. The second part is about putting the information back together again and we do that using a module called Views. We are going to talk about both of these modules in great detail throughout this course. There is a little bit more to know though. Often times when you are building a database, you find that you are repeating information. Let's take a look at these three people Mimi de Nada, Conrad de Nada and Sylvia de Nada. They all live at the same place, 19 South Moggy Street. Wouldn't it be good if we could just take out that information so we only had to type it in once and then have each one of those people live in this one household? The way we do that is we create multiple tables. So before we had a single table for people with their name and their address and their phone number and so forth and now we are going to have one for people and one for households and it will look something like this. This is the people table. In each one we have the Name and Phone number and Email address, but we only have a reference to the Household. The three people you see at the beginning there Mimi, Conrad and Sylvia they all live in the same household. So we just say okay, they live in household number 1. Then later we define what the household is. That household number 1 is 19 South Moggy Street etcetera.
The reason that this works is because we make one table relate to another. Where we have household 0001 in the people table, we can make it point to household 0001 in the household table where it actually spells out where that household is. This is what's called a Relational Database and it is in fact the way that Drupal's database is constructed, which brings us to why you need to know all of this. When you are planning out your data driven Drupal site, you need to figure out first of all what fields you are going to need, what do you think you are going to need to pull out of if and recombine. All of that depends on your application, that is, what you will be using your Drupal site for.
That requires a little bit of foresight. You have to think about data presentation first. Imagine what you want it to look like when you are done. What information do you want to get across? The fact is this is a very difficult thing to do because usually when you are starting out, you haven't solved all of those problems, but don't worry about it. The skills you will learn in this course will allow you to go back and re-figure all of these things. I do want to give you a tip though. Before you start adding large amounts of data run a view tests first. Just enter a small subset of it so you can see how it's all working out and then try to dry out the data in ways that you want to present it, so that you can see if it's actually working for you before you bring in thousands of records or open up your site to the public.
When you are planning your database, try to keep it pretty atomic. Think of all the possible reasons you would want to pull out different pieces of data. But keep in mind that the more atomic you make it, the more complicated your database can become. It's also good to know that Drupal itself is a relational database. There are times that you might want to manipulate Drupal's native database and I'll show you how to do that and just knowing how it works and being able to find how one table relates to another can be very useful later on. So I hope this has given you the basis you need to understand how to construct your site, because first of all you have to decide what it is you want.
We will talk more about the specifics such as Drupal's internal database in the video Understanding How Drupal Stores Data and we'll talk more about the specifics of how to implement this throughout the course.
- Importing and manipulating data in Drupal
- Presenting date-formatted information in calendars
- Understanding Drupal's data-query interface, "Views"
- Improving view appearances with grid, list, and table formatting
- Importing, exporting, and cloning views
- Extending views with arguments and relationships