Understand Spark lazy evaluation and how to optimize it for better performance.
- [Instructor] In this video, we are going to explore what Spark lazy evaluation is and how we can take advantage of it. Spark executes transformation statements only when an action is executed on the resulting RDDs. In other words, the execution of transformations is delayed. They are executed lazily. Consider this piece of code. It starts with loading a CSV file into an RDD, which is a transformation.
Then, it prints out the number of records in the RDD; that is an action. It proceeds to perform a couple of filters on the resulting RDDs, and then it also does a flatMap. Finally, there is an action to print the number of records in the RDD word. So, how does this all get executed? Spark executes the statements in two batches. First, it executes the CSV file loading and the counting of records in the resulting RDD.
This is because count is an action. Then, in the second batch, it executes the three transformations in one shot. This is triggered by the action to count the number of records in the RDD word. This is an important concept,…
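The two-batch behavior described above can be sketched without a cluster. What follows is a toy, pure-Python stand-in, not Spark's API and not the code shown in the video: the `LazyRDD` class, the sample CSV lines, and the shared `log` list are all assumptions made for illustration. Only the method names `filter`, `flatMap`, and `count` are chosen to mirror the RDD operations mentioned in the transcript.

```python
from typing import Callable, Iterable, List, Optional, Tuple


class LazyRDD:
    """Toy stand-in for an RDD: transformations are recorded, actions run them."""

    def __init__(self,
                 source: Callable[[], Iterable[str]],
                 steps: Optional[List[Tuple[str, Callable]]] = None,
                 log: Optional[List[str]] = None):
        self._source = source                       # how to produce the base records
        self._steps = steps or []                   # pending transformations, in order
        self.log = log if log is not None else []   # trace shared along the lineage

    def _with(self, name: str, fn: Callable) -> "LazyRDD":
        # Return a new LazyRDD with one more pending step; nothing executes here.
        return LazyRDD(self._source, self._steps + [(name, fn)], self.log)

    def filter(self, pred: Callable[[str], bool]) -> "LazyRDD":
        return self._with("filter", lambda rows: (r for r in rows if pred(r)))

    def flatMap(self, fn: Callable[[str], Iterable[str]]) -> "LazyRDD":
        return self._with("flatMap", lambda rows: (x for r in rows for x in fn(r)))

    def count(self) -> int:
        # Action: only now are the load and all recorded steps carried out.
        rows = self._source()
        self.log.append("load")
        for name, fn in self._steps:
            self.log.append(name)
            rows = fn(rows)
        total = sum(1 for _ in rows)
        self.log.append("count")
        return total


# Hypothetical sample data standing in for the CSV file.
csv_lines = ["id,text", "1,spark is lazy", "2,kafka moves data", "3,"]

lines = LazyRDD(lambda: iter(csv_lines))
first_count = lines.count()                 # batch 1: load + count

word = (lines
        .filter(lambda l: not l.startswith("id,"))       # drop the header row
        .filter(lambda l: l.split(",", 1)[1] != "")      # drop rows with empty text
        .flatMap(lambda l: l.split(",", 1)[1].split()))  # one record per word

trace_after_build = list(lines.log)         # still only batch 1; nothing new ran
word_count = word.count()                   # batch 2: load + 2 filters + flatMap + count

print(first_count, word_count)
print(trace_after_build)
print(lines.log)
```

Building `word` adds nothing to the trace; the two filters and the flatMap appear in it only once `word.count()` is called. This mirrors the transcript's point that the three transformations run together, triggered by the final action.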