Join Dan Sullivan for an in-depth discussion in this video, Types of NoSQL databases, part of Advanced NoSQL for Data Science.
- [Narrator] NoSQL databases are designed to overcome the limitations of relational databases. It shouldn't be a surprise that different people face different challenges with relational databases, and they ended up designing different kinds of NoSQL databases to address those challenges. So what is a NoSQL database? One way to think about this is to look at the four different kinds of NoSQL databases. The simplest form is called the key-value database, and it's designed kind of like a dictionary, where you know a word and you're able to look up its value. In this case, you know a key and you're able to look up the value stored under it.
For example, if you have a person's ID, you could look up their first name. Now, this kind of functionality is really useful for caching and gives you some performance gains. But in general it's not really that valuable in data science, so we won't spend too much time talking about key-value databases themselves. A second type is called the document database, and what distinguishes a document database is that it's able to store multiple key-value pairs in a structure called a document. Documents roughly parallel rows in a table. Keys can be scalar, which just means that they're simple data types, like integers or strings.
But the values themselves can be more complex structures, like lists or arrays. A third type of NoSQL database is called the wide-column database. This is probably the one that is most similar to relational databases, but we have to be careful, because although it uses terms like table and column, the idea of a column and a table in a relational database is different from what it is in a wide-column database. For example, in a wide-column database the data is denormalized and columns are not fixed. They can change, so we can add columns on the fly in our application, and it's even the case that rows in the same table can have different columns.
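As a rough illustration of the three models described so far, here is a minimal sketch using plain Python structures rather than a real database. Every name and value in it (`kv_store`, `person_doc`, `people_table`, `add_column`, the sample people) is hypothetical and chosen only for this example.

```python
# Illustrative only: plain Python structures standing in for three NoSQL models.
# All names and values are hypothetical, not taken from any real database.

# Key-value: you know a key, you look up one value.
kv_store = {"person:1001": "Alice"}

# Document: several key-value pairs grouped together; values may be lists.
person_doc = {
    "_id": 1001,
    "first_name": "Alice",
    "skills": ["python", "statistics"],
}

# Wide-column: rows in the same table may carry different columns.
people_table = {
    "row-1001": {"first_name": "Alice", "city": "Portland"},
    "row-1002": {"first_name": "Bob"},
}


def add_column(table, row_key, column, value):
    """Add a column to a single row on the fly; other rows are unaffected."""
    table.setdefault(row_key, {})[column] = value


add_column(people_table, "row-1002", "languages", ["en", "es"])

first_name = kv_store["person:1001"]
columns_by_row = {key: sorted(row) for key, row in people_table.items()}
```

After this runs, `columns_by_row` shows that the two rows hold different sets of columns, which is the property that separates a wide-column table from a relational one with a fixed schema.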
And like document databases, the values here can be complex structures such as arrays and lists. A fourth type of NoSQL database is called the graph database. Graphs are basically networks, so they have two parts: entities, and the relations between those entities, which are represented by links. Links, also called edges, and entities both have properties, and that's important because those properties are things that we can query on. We can also query on links and paths between entities. Now, NoSQL databases give data scientists the flexibility they need to adapt to changing requirements while scaling to meet the compute and storage needs of their analytic projects.
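The graph model described above, with entities, links, properties on both, and queries over properties and paths, can be sketched in a few lines of plain Python. This is not how a graph database such as Neo4j is implemented or queried. The data and the helper names `find_nodes` and `find_path` are assumptions made up for this example.

```python
from collections import deque

# Hypothetical data: entities (nodes) and links (edges), each with properties.
nodes = {
    "alice": {"type": "person", "age": 34},
    "bob": {"type": "person", "age": 41},
    "acme": {"type": "company", "industry": "retail"},
}
edges = [
    ("alice", "bob", {"relation": "knows", "since": 2015}),
    ("bob", "acme", {"relation": "works_at", "since": 2019}),
]


def find_nodes(**props):
    """Return names of nodes whose properties match every given value."""
    return [
        name
        for name, node_props in nodes.items()
        if all(node_props.get(k) == v for k, v in props.items())
    ]


def find_path(start, goal):
    """Breadth-first search that follows links in their stored direction."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for src, dst, _props in edges:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None  # no path exists in this direction


people = find_nodes(type="person")
path = find_path("alice", "acme")
```

Here `find_nodes` is a query on entity properties and `find_path` is a query on paths between entities, the two kinds of query the narration mentions.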
The course begins with an introduction to NoSQL, and then delves into the specifics of document, wide-column, and graph databases. Learn key details for performing data preparation, exploration, and extraction for each type of NoSQL database. Review case studies that show how to use various NoSQL databases with popular data science tools, including the document database MongoDB, the wide-column database Cassandra, and the graph database Neo4j.
- NoSQL compared to traditional relational databases
- Performing common data science tasks
- Preparing data with document databases
- Manipulating data in NoSQL
- Preparing, exploring, extracting, and model building
- Working with document, wide-column, and graph databases
- Reviewing case studies using MongoDB, Cassandra, and Neo4j