- Explain where and why Apache Spark stores its data.
- Differentiate between the types of data to work with.
- Explain how bucketing can be used to partition data.
- Analyze the execution plan when reading HDFS files with schema.
- Determine when and how to apply best practices for data processing.
- Leverage various tools and techniques to build a solution using Apache Spark and Hadoop.
Skill Level: Intermediate
- [Kumaran Ponnambalam] Data engineers often use stacks to leverage the power of multiple technologies. For example, there is often a need for not just scalable storage but also fast processing. Many teams find themselves using the combination of Hadoop for storage and Spark for compute, because it provides unparalleled scalability and performance for analytics pipelines. To harness this power, it is important to understand how Hadoop and Spark work with each other and to use the levers available. My name is Kumaran Ponnambalam. In this course, I will show you how to build scalable, high-performance analytics pipelines with Apache Hadoop and Spark. I will discuss only the key tools and best practices for taking advantage of this combination. We will use a Hortonworks Sandbox for this course, and Zeppelin notebooks for our examples. You need prior familiarity with both Apache Hadoop and Spark, because this course focuses only on using the two together. Please refer to other courses and resources if you want to learn the essentials of these technologies. That being said, let's explore how to maximize the combined power of Hadoop and Spark.
1. Introduction and Setup
2. HDFS Data Modeling for Analytics
3. Data Ingestion with Spark
4. Data Extraction with Spark
5. Optimizing Spark Processing
6. Use Case Project