Join Dan Sullivan for an in-depth discussion in this video, Cassandra and Spark, part of Advanced NoSQL for Data Science.
- [Instructor] Let's briefly consider large-scale data science with Cassandra and Spark. It's now common to have data sets that are too large to store and analyze on a single server. Rather than try to find larger and larger servers, analysts are turning to clusters of machines. This allows us to easily parallelize computations, while providing ample low-latency local storage systems. Both Cassandra and Spark are designed for clusters. Cassandra is a commonly used database for big data, and Spark is widely used for analytics on large data sets.
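The idea of spreading both storage and computation across a cluster can be illustrated with a small, single-machine sketch. It is not how Cassandra or Spark are implemented; the function names (`partition_rows`, `local_sum`, `cluster_sum`), the sample rows, and the use of threads to stand in for cluster nodes are all assumptions made for illustration. Rows are assigned to a "node" by hashing their key, each node aggregates only the rows it holds, and the partial results are merged.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, num_nodes):
    """Assign each (key, value) row to a node by hashing its key.

    A stable hash (CRC32) is used so the same key always maps to the
    same node, loosely mimicking a partition key.
    """
    partitions = defaultdict(list)
    for key, value in rows:
        node = zlib.crc32(key.encode("utf-8")) % num_nodes
        partitions[node].append((key, value))
    return partitions


def local_sum(partition):
    """Work done on one node: aggregate only the rows stored there."""
    totals = defaultdict(float)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


def cluster_sum(rows, num_nodes=3):
    """Sum values per key by combining per-node partial results."""
    partitions = partition_rows(rows, num_nodes)
    # Threads stand in for cluster nodes working in parallel.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_sum, partitions.values()))
    merged = {}
    for partial in partials:
        # All rows for a key live on one node, so keys never collide here.
        merged.update(partial)
    return merged
```

Because every row with a given key lands in the same partition, each node can finish its part of the aggregation without consulting the others, which is the property that makes this kind of work easy to parallelize.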
Apache Spark is an open source data analysis platform that is known for both scalability and speed. It is widely used over Hadoop, because Spark is better able to leverage memory and rely less on slower disks. Spark is an analytics platform with libraries to support common data science requirements. Another advantage of Spark for data science is that it can run on the same cluster nodes as Cassandra, making it a logical choice for big data analytics. Spark supports the DataFrame structure,…
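As a rough sketch of how the two systems can be combined, the following PySpark code loads a Cassandra table into a Spark DataFrame. It assumes the DataStax Spark Cassandra Connector is available to Spark and that a Cassandra node is reachable at the given host; the helper names, the application name, and the keyspace and table names used below are hypothetical and do not come from the course.

```python
def cassandra_options(keyspace, table):
    """Build the reader options that identify a Cassandra table."""
    return {"keyspace": keyspace, "table": table}


def build_session(host="127.0.0.1"):
    """Create a SparkSession pointed at a Cassandra contact point.

    The host is an assumption; pyspark is imported here so the rest of
    the module can be loaded without Spark installed.
    """
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName("cassandra-spark-sketch")  # hypothetical application name
        .config("spark.cassandra.connection.host", host)
        .getOrCreate()
    )


def load_table(spark, keyspace, table):
    """Read a Cassandra table into a Spark DataFrame via the connector."""
    return (
        spark.read
        .format("org.apache.spark.sql.cassandra")
        .options(**cassandra_options(keyspace, table))
        .load()
    )
```

With a running cluster, a call such as `load_table(build_session(), "sensors", "readings")` would return a DataFrame; the keyspace `sensors` and table `readings` are placeholders for whatever schema is actually in use.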
The course begins with an introduction to NoSQL, and then delves into the specifics of document, wide-column, and graph databases. Learn key details for performing data preparation, exploration, and extraction for each type of NoSQL database. Review case studies that show how to use various NoSQL databases with popular data science tools, including the document database MongoDB, the wide-column database Cassandra, and the graph database Neo4j.
- NoSQL compared to traditional relational databases
- Performing common data science tasks
- Preparing data with document databases
- Manipulating data in NoSQL
- Preparing, exploring, extracting, and model building
- Working with document, wide-column, and graph databases
- Reviewing case studies using MongoDB, Cassandra, and Neo4j
Skill Level: Advanced
1. Why NoSQL?
Types of NoSQL databases (2m 20s)
2. Perform Common Data Science Tasks with NoSQL Databases
3. Document Databases for Data Science
4. Wide-Column Databases for Data Science
5. Graph Databases for Data Science