Resilient distributed datasets (RDDs) are another way of loading data into Spark. In this video, learn how this older format compares to using DataFrames, and where its strengths lie.
- [Narrator] Resilient Distributed Datasets, or RDDs, were the primary API for Spark version one, and they're still available in Spark version two. Now, almost all the code we've been running using DataFrames compiles down to an RDD, so it makes sense for us to have a basic understanding of what an RDD is. An RDD is an immutable, partitioned collection of records that can be worked on in parallel. Now remember that with a DataFrame, each record is a structured row containing fields with a known schema.
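To make "an immutable, partitioned collection of records that can be worked on in parallel" concrete, here is a minimal plain-Python sketch. It is not Spark code: `ToyRDD`, `from_records`, and `num_partitions` are hypothetical names invented for illustration, and unlike Spark, which evaluates transformations lazily, this toy version runs each step immediately.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyRDD:
    """Hypothetical stand-in for an RDD; illustration only, not Spark's API."""

    def __init__(self, partitions):
        # Store partitions as tuples of tuples so they cannot be modified.
        self._partitions = tuple(tuple(part) for part in partitions)

    @classmethod
    def from_records(cls, records, num_partitions):
        records = list(records)
        # Ceiling division gives the size of each contiguous chunk.
        size = max(1, -(-len(records) // num_partitions))
        return cls(records[i * size:(i + 1) * size] for i in range(num_partitions))

    def map(self, fn):
        # Each partition is transformed independently, so a worker pool
        # can process the partitions in parallel.
        with ThreadPoolExecutor() as pool:
            new_parts = list(
                pool.map(lambda part: [fn(r) for r in part], self._partitions)
            )
        # A transformation returns a new ToyRDD; the original is unchanged.
        return ToyRDD(new_parts)

    def collect(self):
        # Gather the records from every partition back into one list.
        return [r for part in self._partitions for r in part]

    def num_partitions(self):
        return len(self._partitions)


numbers = ToyRDD.from_records(range(1, 7), num_partitions=3)
squares = numbers.map(lambda x: x * x)
```

After `map` runs, `numbers` still holds its original records, which is the immutability the narrator describes: every transformation produces a new collection rather than changing the old one.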
In the case of an RDD, the records are just Java, Scala, or Python objects, so you have complete control over them. Although this has several advantages, there are a couple of challenges. Spark does not understand the inner structure of your records as it does with DataFrames. This means that the optimizations you would have gotten automatically with DataFrames, you will need to recreate manually. The RDD APIs are available in Python as well as Scala and Java.
You can get good performance when running RDDs with Scala and Java.…
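The point about recreating optimizations by hand can be sketched in plain Python. This is not Spark code, and the record layout and function names are assumptions made up for illustration. When records are opaque objects, the steps run in the order you wrote them, so moving a cheap filter ahead of an expensive step (the kind of reordering a schema-aware optimizer can apply on your behalf) becomes your job.

```python
def make_counting_parser():
    """Return a hypothetical 'expensive' parse step plus a call counter."""
    calls = {"count": 0}

    def parse(record):
        calls["count"] += 1
        user_id, country, payload = record
        return (user_id, country, len(payload))

    return parse, calls


def naive_pipeline(records, parse):
    # Steps run as written: parse every record, then filter the results.
    parsed = [parse(r) for r in records]
    return [r for r in parsed if r[1] == "US"]


def hand_tuned_pipeline(records, parse):
    # The filter is moved ahead of the parse step by hand,
    # so only matching records pay the parsing cost.
    kept = [r for r in records if r[1] == "US"]
    return [parse(r) for r in kept]


# Hypothetical opaque records: (user_id, country, payload) tuples.
records = [(i, "US" if i % 4 == 0 else "DE", "x" * 10) for i in range(1000)]

naive_parse, naive_calls = make_counting_parser()
tuned_parse, tuned_calls = make_counting_parser()

naive_result = naive_pipeline(records, naive_parse)
tuned_result = hand_tuned_pipeline(records, tuned_parse)
```

Both pipelines return the same rows, but the hand-tuned version calls the parse step 250 times instead of 1,000.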