From the course: AWS for Developers: DynamoDB

NoSQL versus relational DBs - Amazon Web Services (AWS) Tutorial

NoSQL versus relational DBs

- [Instructor] Chances are, you've heard of NoSQL, or maybe you've even used it a bit, but maybe you just don't get it or are too afraid to ask. Well, don't worry, we're going to dive into it now. You may also wonder, what about schemas, data integrity, and enforcing constraints? What about stored procedures and triggers? How do those work in a NoSQL environment? To unpack the SQL versus NoSQL debate, we need to go back to the creation of SQL databases to understand why they were created and what problem they were trying to solve. And that takes us back to 1970, when Edgar F. Codd defined what he called a relational database, meaning the internal structure of the data was different from what the user would see in the end. The primary benefits of a relational database are enforced data integrity, reduced storage cost, and the ability to structure data on the fly using SQL. Consider this scale. On one side is the cost of storage and on the other, the cost of processing. For most of the last four decades, the priority has been on efficiently storing large volumes of data. Since you owned the hardware, the cost of processing was relatively fixed, but running out of hard drive space and having to add more drives, well, that was something you wanted to avoid. In a relational database, information like a phone number is stored in one column of one table, and if you need to access that information from somewhere else, you must join to get it. Suppose I had a requirement to return order rows with customer info. My API needs to return the following fields: OrderId, CustomerId, Price, Name, and PhoneNumber. In a relational database, the customer fields would be stored only one time, which reduces the cost of storing thousands if not millions of orders. To make the data appear to be part of the same structure, I have to use a query that joins records together by a foreign key, thus creating the more appealing result set on the right.
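The join described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table names `customers` and `orders`, the sample customer, and the helper `fetch_orders_with_customer` are hypothetical and do not come from the course; only the five returned field names are taken from the narration.

```python
import sqlite3

# In-memory database so the sketch runs without any setup.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the foreign key

# Hypothetical schema: customer details live in exactly one place.
conn.executescript("""
    CREATE TABLE customers (
        CustomerId  INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL,
        PhoneNumber TEXT NOT NULL
    );
    CREATE TABLE orders (
        OrderId    INTEGER PRIMARY KEY,
        CustomerId INTEGER NOT NULL REFERENCES customers(CustomerId),
        Price      REAL NOT NULL
    );
""")

# Made-up sample data: one customer, two orders.
conn.execute("INSERT INTO customers VALUES (1, 'Ada Lopez', '555-0100')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(100, 1, 19.99), (101, 1, 5.5)],
)


def fetch_orders_with_customer(connection):
    """Join each order to its customer through the CustomerId foreign key."""
    return connection.execute("""
        SELECT o.OrderId, o.CustomerId, o.Price, c.Name, c.PhoneNumber
        FROM orders AS o
        JOIN customers AS c ON c.CustomerId = o.CustomerId
        ORDER BY o.OrderId
    """).fetchall()


rows = fetch_orders_with_customer(conn)
for row in rows:
    print(row)
```

The name and phone number are stored once, yet every order row in the result carries them. That is the storage saving the relational model was built around, paid for with a join at read time.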
That brings us to the modern day, where storage has become remarkably inexpensive and the focus is now on the cost of processing data. Just to put this into perspective, a gigabyte of storage in 1980 cost about $193,000. The same storage for a month on S3 is, well, you might want to sit down for this, about two cents. That's right, two shiny pennies and you can have what cost more than a Lamborghini in 1980. So let's look at our scale again. I think we would agree that the bulk of our costs are no longer in storage; the cost has moved to the side of processing. When we talk about cost, we're not just talking about money, we're talking about the cost of time as well, or how long the user has to wait to get their data, which ultimately will cost your company money. Demand for throughput has never been greater, and SQL joins are sometimes just too slow for our needs. And when I say slow, I'm talking about microsecond cost. Your query may run in under a second, but if you get thousands of requests per second, that number will grow dramatically. So now let's talk about the benefits of a NoSQL database. Number one, data is stored in the way you want. That means relationships are present in the data itself, not expressed as an ID that maps from one table to another table. Number two, NoSQL databases, if used correctly, are typically faster because no joins are required to fetch your data. Number three, NoSQL databases don't have a schema and therefore don't require database migrations or ALTER TABLE scripts when you change your models. The structure of your data can be different from row to row. In fact, as you'll see soon, it actually should be. Obviously, that doesn't mean a SQL database is bad, and just to be thorough, here are some scenarios where a SQL database might be better for your needs. Number one, a SQL database is generally going to reduce the anomalies in your data.
Where a NoSQL database might store a customer's information on one row and a transaction on the next, SQL database structures like tables are intended to store one type of entity. Number two, a SQL database allows you to run computations or calculations on the server. Anyone who's ever written a complex stored procedure knows that it's generally faster to calculate lots of data on the server than to pull that data locally and crunch those numbers yourself. Lastly, commercial SQL databases, and even the open source varieties, often have multiple vendors, including the developer, offering tiered support. In this course, however, we're only going to work with NoSQL, because our application needs to be fast, flexible, and work with our library of APIs, and that makes NoSQL our preferred choice. The fact is there are many great options on both the SQL and NoSQL sides of the aisle. There are no clear rules here, just opinions. This one is mine, and it's shared by quite a few people in the industry, but you're free to make up your own mind.
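For contrast with the relational approach, here is a minimal sketch of the NoSQL idea of storing data the way it will be read. A plain Python dict stands in for a key-value table; this is not the DynamoDB API, and the `PK`/`SK` key names, the `put_item` and `get_item` helpers, and the sample values are all hypothetical.

```python
# A dict keyed by (partition key, sort key) stands in for a NoSQL table.
table = {}


def put_item(item):
    """Store an item under its (PK, SK) pair; no schema is checked."""
    table[(item["PK"], item["SK"])] = item


def get_item(pk, sk):
    """Fetch one item with a single key lookup; no join is involved."""
    return table.get((pk, sk))


# Two different kinds of item share one table, each with its own attributes.
put_item({
    "PK": "CUSTOMER#1",
    "SK": "PROFILE",
    "Name": "Ada Lopez",
    "PhoneNumber": "555-0100",
})
put_item({
    "PK": "CUSTOMER#1",
    "SK": "ORDER#100",
    "OrderId": 100,
    "CustomerId": 1,
    "Price": "19.99",
    # Customer details are copied onto the order so one read returns
    # everything the API needs.
    "Name": "Ada Lopez",
    "PhoneNumber": "555-0100",
})

order = get_item("CUSTOMER#1", "ORDER#100")
profile = get_item("CUSTOMER#1", "PROFILE")
print(order["OrderId"], order["Price"], order["Name"], order["PhoneNumber"])
```

The order item duplicates the customer's name and phone number, which costs some storage but lets a single lookup return all five fields. The trade-off is the one mentioned above: if the phone number changes, every copy has to be updated, which is where data anomalies can creep in.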
