Understand how to use the functional programming construct of mapping over collections.
- [Teacher] Let's work with functions over parallel collections. We'll start the Scala REPL. Now I want to create a value v, and I'm going to set this to be a range of one to 100. I'm going to want to convert it to an array. We can convert an array into a parallel collection using the par method, and I'll call it pv for parallel version of v. And it's v.par. Now I have a parallel array. The value of pv is the same length and has the same values as v; however, the pv value allows for parallel operations.
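The REPL steps described above can be sketched roughly as follows. This is a minimal sketch that assumes Scala 2.12 or earlier, where `par` is part of the standard library; on Scala 2.13 and later it would likely need the separate scala-parallel-collections module and an extra import, as noted in the comments.

```scala
// Assumption: Scala 2.12 or earlier, where .par ships with the standard library.
// On Scala 2.13+, add the scala-parallel-collections module and
// import scala.collection.parallel.CollectionConverters._

// A range from 1 to 100, converted to an array.
val v = (1 to 100).toArray

// The parallel version of v: a parallel array holding the same elements.
val pv = v.par

// Both collections have the same length and the same values.
println(v.length)
println(pv.length)
```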
Let's start by multiplying each member of the v array by two. First, I'll clear the screen. Now I have v. Now, to apply an operation to every member of a collection, I can use the map function. I'll use the underscore as an alias for each member of the collection. I'll say, for each member of the collection, multiply by two. That doubles everything in the array. Similarly, for the parallel version, I can use the same code. I can apply the map function, use the anonymous variable, and then multiply every member by two.
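The two map calls described here might look like the sketch below. It redefines `v` and `pv` so that it stands alone, and it again assumes Scala 2.12 or earlier for `par`. The names `doubled` and `parDoubled` are illustrative and do not come from the video.

```scala
// Assumption: Scala 2.12 or earlier; see the parallel-collections module for 2.13+.
val v = (1 to 100).toArray
val pv = v.par

// The underscore stands for each member of the collection in turn.
val doubled = v.map(_ * 2)      // sequential map over the array
val parDoubled = pv.map(_ * 2)  // same code, applied to the parallel array

// Mapping over a parallel array keeps results in element order,
// so both collections hold the same values in the same positions.
println(doubled.take(5).mkString(", "))
println(parDoubled.take(5).mkString(", "))
```

The point of the demonstration is that the calling code is identical; only the collection type decides whether the work may be split across threads.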
What's going on here is that I'm getting the same results…
Dan also focuses on using Scala with Spark, a distributed processing platform. He first describes how to work with Resilient Distributed Datasets (RDDs)—a fundamental Spark data structure—and then explains how to use Scala with Spark DataFrames, a new class of data structure specially designed for analytic processing. He wraps up the course by providing a summary of advantages of using Scala for data science.
- The advantages of Scala for data science
- Scala data types
- Scala arrays, vectors, and ranges
- Parallel processing in Scala
- Mapping functions over parallel collections
- When and when not to use parallel collections
- Using SQL in Scala
- Scala and Spark RDDs
- Scala and Spark DataFrames
- Creating DataFrames
Skill Level Intermediate
1. Introduction to Scala
2. Parallel Processing in Scala
3. Using SQL in Scala
4. Scala and Spark RDDs
5. Scala and Spark DataFrames