Dan Sullivan walks you through the steps needed to install Spark.
- [Instructor] Now it's time to install Spark. Here I've opened a browser and navigated to spark.apache.org. And from the main page, I'm going to select the Download option, and that will bring me to another page where I can choose from among several different downloads. And I'm just going to accept all of the defaults in terms of the Spark release, the package type, and the download type. And I'm going to click on the link that is provided here to download Spark. Now that Spark has finished downloading, I'm going to go over to my download directory, and I have a compressed file here, so I'm going to open that.
Now that creates a new folder, and this has a fairly long name, so I'm going to rename this and I'm just going to call it Spark, which is sufficient for our purposes. Now also, Spark is now in my download directory. I'm just going to drag it over to my home directory. Okay, so now I have the Spark folder in my home directory, so I'm going to move to my terminal window and I'm going to cd to my home directory.
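The instructor does the unpack, rename, and move steps in a file manager. For readers who prefer the terminal, the same steps might be sketched roughly as follows. This is a minimal sketch, not the instructor's method: the function name `install_spark` is hypothetical, the download is assumed to be a gzip-compressed tar archive containing a single top-level folder, and the lowercase target name `spark` is a choice made here. The archive's file name depends on the release and package type selected on the download page, so it is passed as an argument rather than hard-coded.

```shell
# Hypothetical helper: unpack a downloaded Spark archive and place the
# resulting folder at a short, fixed path (by default "$HOME/spark").
# Assumptions: the archive is a gzip-compressed tar file that contains
# exactly one top-level folder, and the destination does not exist yet.
install_spark() {
  archive="$1"               # path to the downloaded archive
  dest="${2:-$HOME/spark}"   # short folder name, as in the video

  # Refuse to overwrite or nest inside an existing destination.
  if [ -e "$dest" ]; then
    echo "destination already exists: $dest" >&2
    return 1
  fi

  # Unpack into a scratch directory so the long folder name can be
  # discovered instead of guessed.
  workdir="$(mktemp -d)" || return 1
  tar -xzf "$archive" -C "$workdir" || return 1

  # Locate the single extracted folder (the one with the long name).
  extracted="$(find "$workdir" -mindepth 1 -maxdepth 1 -type d | head -n 1)"
  if [ -z "$extracted" ]; then
    echo "no folder found inside $archive" >&2
    return 1
  fi

  # Rename and move in one step, then remove the empty scratch directory.
  mv "$extracted" "$dest" || return 1
  rmdir "$workdir"
}

# Example usage (replace the placeholder with the actual downloaded file):
#   install_spark "$HOME/Downloads/<downloaded-archive>.tgz"
#   cd "$HOME"
```

Unpacking into a scratch directory avoids having to type the long folder name; the `mv` then performs the rename and the move to the home directory together.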
Dan also focuses on using Scala with Spark, a distributed processing platform. He first describes how to work with Resilient Distributed Datasets (RDDs), a fundamental Spark data structure, and then explains how to use Scala with Spark DataFrames, a new class of data structure specially designed for analytic processing. He wraps up the course by summarizing the advantages of using Scala for data science.
- The advantages of Scala for data science
- Scala data types
- Scala arrays, vectors, and ranges
- Parallel processing in Scala
- Mapping functions over parallel collections
- When and when not to use parallel collections
- Using SQL in Scala
- Scala and Spark RDDs
- Scala and Spark DataFrames
- Creating DataFrames
Skill Level Intermediate
1. Introduction to Scala
2. Parallel Processing in Scala
3. Using SQL in Scala
4. Scala and Spark RDDs
5. Scala and Spark DataFrames