Understand how to filter collections in parallel.
- [Instructor] Sometimes, when we have large collections, we want to filter them. Scala makes it easy to filter collections so you can find all the members of a collection that meet some criteria. So for example, let's create an array of numbers. We'll create val v. And we'll make this one to 10,000. And let's make it an array. And let's create a parallel version by using the par method. Now let's just check the length of the collections.
v.length and pv.length. OK, they're the same. Numbers appear to be the same. So we'll just clear the screen, and we'll move on to our next step. So we have a collection of 10,000 elements. What I'd like to do now is create another value that has the elements from pv, the parallel array, that are greater than 5,000. So I'm going to make a new value, and I'll call it pvf for the filtered version of pv. And I'm going to define that as pv.filter.
So I'll apply a filter. So for each element of the collection, I want to do a test and see if it is greater than 5,000. And now I have a value called pvf,…
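The steps described in the transcript can be sketched roughly as follows. This is a reconstruction from the narration, not the instructor's exact keystrokes. The names v, pv, and pvf come from the transcript, while the placeholder syntax used for the predicate is an assumption. The sketch also assumes Scala 2.12 or earlier, where the par method is part of the standard library.

```scala
// Assumption: Scala 2.12 or earlier, where .par is built in.
// On Scala 2.13 and later, parallel collections live in the separate
// scala-parallel-collections module, and .par additionally requires:
//   import scala.collection.parallel.CollectionConverters._

// An array holding the numbers 1 through 10,000.
val v = (1 to 10000).toArray

// A parallel version of the same collection.
val pv = v.par

// Both collections should report the same length.
println(v.length)
println(pv.length)

// Keep only the elements greater than 5,000; the test is applied
// to each element, potentially on several threads at once.
val pvf = pv.filter(_ > 5000)

println(pvf.length)
```

The filter call looks the same as it would on the sequential array v; only the collection it is called on differs.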
Dan also focuses on using Scala with Spark, a distributed processing platform. He first describes how to work with Resilient Distributed Datasets (RDDs)—a fundamental Spark data structure—and then explains how to use Scala with Spark DataFrames, a new class of data structure specially designed for analytic processing. He wraps up the course by providing a summary of advantages of using Scala for data science.
- The advantages of Scala for data science
- Scala data types
- Scala arrays, vectors, and ranges
- Parallel processing in Scala
- Mapping functions over parallel collections
- When and when not to use parallel collections
- Using SQL in Scala
- Scala and Spark RDDs
- Scala and Spark DataFrames
- Creating DataFrames
Skill Level Intermediate
1. Introduction to Scala
2. Parallel Processing in Scala
3. Using SQL in Scala
4. Scala and Spark RDDs
5. Scala and Spark DataFrames