From the course: Advanced NoSQL for Data Science

Prepare data with wide-column databases

- [Instructor] When preparing to load data into Cassandra, there are several factors to keep in mind. As with other NoSQL databases, data is denormalized. There's little choice in this matter. Cassandra does not support arbitrary sorting of query results, and it also lacks support for joins. The only way to ensure your query results include all the attributes you want, and are returned in the order you want, is to denormalize and order rows by clustering key. Do not model based on rules of normalization. Focus on designing tables to answer queries. Normalizing data in a wide-column database will negatively impact performance. Since different queries will have different attribute and sort order requirements, it is common to have multiple tables with duplicate data in different orders. Cassandra is used for big data applications. Be sure to consider the volume of data that you will need to load and the time that could take. Import utilities are usually faster than code written in Python or other…
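The idea of one table per query, with the same data duplicated in different orders, can be sketched in plain Python. This is an in-memory stand-in, not Cassandra's API: the sensor readings, the field names, and the `build_query_table` helper are all hypothetical and exist only to illustrate the modeling approach described above.

```python
from collections import defaultdict
from operator import itemgetter

# Hypothetical sensor readings; the field names are illustrative only.
readings = [
    {"sensor_id": "s1", "day": "2024-01-02", "temp": 21.5},
    {"sensor_id": "s2", "day": "2024-01-01", "temp": 19.0},
    {"sensor_id": "s1", "day": "2024-01-01", "temp": 20.1},
]


def build_query_table(rows, partition_key, clustering_key, descending=False):
    """Group rows by partition key and sort each partition by clustering key.

    Hypothetical helper: it mimics how a query-specific table stores rows
    pre-sorted so that no sorting or joining is needed at read time.
    """
    table = defaultdict(list)
    for row in rows:
        # Copy the row: each query table holds its own duplicate of the data.
        table[row[partition_key]].append(dict(row))
    for partition in table.values():
        partition.sort(key=itemgetter(clustering_key), reverse=descending)
    return dict(table)


# Query 1: all readings for a sensor, newest day first.
readings_by_sensor = build_query_table(readings, "sensor_id", "day", descending=True)

# Query 2: all readings for a day, coolest first.
readings_by_day = build_query_table(readings, "day", "temp")

print(readings_by_sensor["s1"])
print(readings_by_day["2024-01-01"])
```

In CQL, the first table would roughly correspond to a primary key of `((sensor_id), day)` declared `WITH CLUSTERING ORDER BY (day DESC)`. Every row is written twice here, which is the storage cost paid for reads that need no sorting or joining.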
