Join Dan Sullivan for an in-depth discussion in this video, Preparing data, part of Advanced NoSQL for Data Science.
- [Instructor] A common task in data science is preparing data. Let's review the core steps in data preparation. Collecting data is the first step of data preparation. We typically work with multiple data sets from different source systems. One factor to keep in mind is that different source systems will update data with different frequencies. So when you're planning a data science project, be sure to consider how frequently to update the data if you're storing it in a NoSQL database for further analysis. Also keep in mind business rules about consolidating or linking data.
Rules could be simple, such as dropping an older version of records, or complicated, such as a series of conditions specifying how to decide when to link records from different source systems. Another part of data collection that is easily overlooked is data retention. After some amount of time, data is no longer useful, or it should be deleted to comply with business policies. So you want to consider how you'll enforce data retention policies in your NoSQL database.
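As a rough illustration of the two ideas above, here is a minimal Python sketch of a simple consolidation rule (keep only the newest version of each record) followed by a retention rule (drop records older than a cutoff). The field names `id` and `updated_at`, the sample records, and the one-year retention window are all assumptions made for the example; they do not come from the course.

```python
from datetime import datetime, timedelta, timezone


def keep_latest(records, key="id", version_field="updated_at"):
    """Consolidation rule: keep only the newest version of each record."""
    latest = {}
    for rec in records:
        current = latest.get(rec[key])
        # Replace the stored record only if this one is newer.
        if current is None or rec[version_field] > current[version_field]:
            latest[rec[key]] = rec
    return list(latest.values())


def apply_retention(records, max_age, now, timestamp_field="updated_at"):
    """Retention rule: drop records older than the allowed age."""
    cutoff = now - max_age
    return [rec for rec in records if rec[timestamp_field] >= cutoff]


# Hypothetical records collected from two source systems.
utc = timezone.utc
now = datetime(2024, 1, 31, tzinfo=utc)
records = [
    {"id": "c1", "source": "crm",
     "updated_at": datetime(2024, 1, 10, tzinfo=utc), "email": "old@example.com"},
    {"id": "c1", "source": "billing",
     "updated_at": datetime(2024, 1, 20, tzinfo=utc), "email": "new@example.com"},
    {"id": "c2", "source": "crm",
     "updated_at": datetime(2022, 6, 1, tzinfo=utc), "email": "stale@example.com"},
]

consolidated = keep_latest(records)
retained = apply_retention(consolidated, timedelta(days=365), now)
```

In practice, retention may be better enforced by the database itself rather than in application code. MongoDB, for example, supports TTL indexes that expire documents after a set time; whether that fits depends on the policy you need to enforce.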
The course begins with an introduction to NoSQL, and then delves into the specifics of document, wide-column, and graph databases. Learn key details for performing data preparation, exploration, and extraction for each type of NoSQL database. Review case studies that show how to use various NoSQL databases with popular data science tools, including the document database MongoDB, the wide-column database Cassandra, and the graph database Neo4j.
- NoSQL compared to traditional relational databases
- Performing common data science tasks
- Preparing data with document databases
- Manipulating data in NoSQL
- Preparing, exploring, extracting, and model building
- Working with document, wide-column, and graph databases
- Reviewing case studies using MongoDB, Cassandra, and Neo4j
Skill Level Advanced
1. Why NoSQL?
Types of NoSQL databases (2m 20s)
2. Perform Common Data Science Tasks with NoSQL Databases
3. Document Databases for Data Science
4. Wide-Column Databases for Data Science
5. Graph Databases for Data Science