From the course: Advanced NoSQL for Data Science

Preparing data

- [Instructor] A common task in data science is preparing data. Let's review the core steps in data preparation. Collecting data is the first step of data preparation. We typically work with multiple data sets from different source systems. One factor to keep in mind is that different source systems update data at different frequencies. So when you're planning a data science project, be sure to consider how frequently to update the data if you're storing it in a NoSQL database for further analysis. Also keep in mind business rules about consolidating or linking data. Rules could be simple, such as dropping an older version of a record, or complicated, such as a series of if-then rules specifying when to link records from different source systems. Another part of data collection that is easily overlooked is data retention. After some amount of time, data is no longer useful, or it should be deleted to comply with business policies. So you want to consider how you'll…
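
As a rough illustration of two of the rules mentioned above (dropping older versions of a record, and removing data past a retention window), here is a minimal Python sketch. It is not taken from the course: the field names `id` and `updated_at`, the helper names `keep_latest` and `apply_retention`, and the 365-day window are all assumptions made for the example, and the records are plain dictionaries standing in for documents pulled from a NoSQL store.

```python
from datetime import datetime, timedelta, timezone


def keep_latest(records):
    """Simple consolidation rule: keep only the newest version of each id.

    Assumes each record is a dict with hypothetical 'id' and 'updated_at' keys.
    """
    latest = {}
    for rec in records:
        current = latest.get(rec["id"])
        if current is None or rec["updated_at"] > current["updated_at"]:
            latest[rec["id"]] = rec
    return list(latest.values())


def apply_retention(records, now, max_age):
    """Retention rule: drop records last updated before now - max_age."""
    cutoff = now - max_age
    return [rec for rec in records if rec["updated_at"] >= cutoff]


# Made-up sample data: two versions of record "a", one stale record "b".
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [
    {"id": "a", "updated_at": datetime(2023, 12, 1, tzinfo=timezone.utc)},
    {"id": "a", "updated_at": datetime(2023, 12, 15, tzinfo=timezone.utc)},
    {"id": "b", "updated_at": datetime(2022, 1, 1, tzinfo=timezone.utc)},
]

# Consolidate first, then enforce the (assumed) 365-day retention window.
prepared = apply_retention(keep_latest(records), now, timedelta(days=365))
```

In practice the retention window and the linking rules would come from the business policies of the project, and many NoSQL databases offer their own expiry features that could take the place of the manual filter shown here.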
