Learn how to use the Spark shell.
- [Voiceover] One of the most important aspects of Spark is its use of the Resilient Distributed Dataset, or RDD, to accomplish fault tolerance. Once created, an RDD can be transformed into another RDD, or you can take an action on it. Let's create our first RDD from the README file stored in our Spark directory. You can see the README file in the Spark directory under /usr/local.
So let's quit Spark for now and check out the README file: type ls. The README.md file is there. Let's start the Spark shell again. Let's call our first RDD textFile: type val textFile = spark.read.textFile("README.md").
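The command above can be tried outside the course environment with a sketch like the following. It assumes Spark 2.x or later, where spark.read.textFile returns a Dataset[String] rather than a classic RDD; the temporary file and its three lines are a hypothetical stand-in for README.md so the snippet does not depend on a particular install path.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Dataset, SparkSession}

// Inside spark-shell a session named `spark` already exists and getOrCreate() reuses it;
// in a standalone program this starts a local session instead.
val spark = SparkSession.builder().appName("readme-example").master("local[*]").getOrCreate()

// Hypothetical stand-in for README.md (assumed contents, not the real file).
val path = Files.createTempFile("README", ".md")
Files.write(path, "# Apache Spark\n\nSpark is a fast engine.\n".getBytes(StandardCharsets.UTF_8))

// Same call as in the video: one element per line of the file.
val textFile: Dataset[String] = spark.read.textFile(path.toUri.toString)

// Two ways to obtain an RDD[String] over the same lines.
val viaDataset: RDD[String] = textFile.rdd
val viaContext: RDD[String] = spark.sparkContext.textFile(path.toUri.toString)
```

Both forms support the first() and count() actions used in this lesson, so the narration applies to either one.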
Press Enter. Looks like it worked. Now let's take some actions on the newly created RDD: type textFile.first(). Press Enter. This action returns the first item in the dataset, which is # Apache Spark.
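As a rough illustration of the first() action, here is a minimal sketch that runs against a small stand-in file. The file contents and the names firstLine and firstTwo are assumptions made for the example; take(n) is shown only as a closely related action.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Reuses the shell's session when run inside spark-shell.
val spark = SparkSession.builder().appName("first-example").master("local[*]").getOrCreate()

// Hypothetical stand-in for README.md (assumed contents).
val path = Files.createTempFile("README", ".md")
Files.write(path, "# Apache Spark\n\nSpark is a fast engine.\n".getBytes(StandardCharsets.UTF_8))
val textFile = spark.read.textFile(path.toUri.toString)

// first() is an action: it runs a job and returns the first element to the driver.
val firstLine: String = textFile.first()
println(firstLine)

// take(n) is the related action that returns the first n elements as an array.
val firstTwo: Array[String] = textFile.take(2)
```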
So that line appears first in the README.md file. Let's take another action. Type textFile.count(). Press Enter. This action counts the number of items in the dataset, which is our README.md file. There are 103 items, or lines, in the README.md file.
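The difference between a transformation and an action, mentioned at the start of the lesson, can be sketched as follows. The four-line stand-in file is hypothetical, so the counts here will not match the 103 lines reported for the real README.md; filter is used as one example of a transformation.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Reuses the shell's session when run inside spark-shell.
val spark = SparkSession.builder().appName("count-example").master("local[*]").getOrCreate()

// Hypothetical stand-in for README.md with four lines (assumed contents).
val path = Files.createTempFile("README", ".md")
Files.write(
  path,
  "# Apache Spark\n\nSpark is a fast engine.\nSee the docs.\n".getBytes(StandardCharsets.UTF_8)
)
val textFile = spark.read.textFile(path.toUri.toString)

// count() is an action: it runs a job and returns the number of elements (lines).
val total: Long = textFile.count()

// filter() is a transformation: it describes a new dataset and runs nothing yet.
val sparkLines = textFile.filter(line => line.contains("Spark"))

// The filter only executes when an action such as count() is called on the result.
val matches: Long = sparkLines.count()
```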
- Enabling technologies in data science
- Cloud computing and virtualization
- Installing and working with Proxmox, Hadoop, Spark, and Weka
- Managing virtual machines on Proxmox
- Distributed processing with Spark
- Fundamental applications of machine learning
- Distributed systems and distributed processing
- How Hadoop, Spark, and Weka can work together
Skill Level Beginner