Author Lynn Langit
Released 4/1/2020
- File systems for Hadoop and Spark
- Working with Databricks
- Loading data into tables
- Setting up Hadoop and Spark clusters on the cloud
- Running Spark jobs
- Importing and exporting Python notebooks
- Executing Spark jobs in Databricks using Python and Scala
- Importing data into Spark clusters
- Coding and executing Spark transformations and actions
- Data caching
- Spark libraries: Spark SQL, SparkR, Spark ML, and more
- Spark streaming
- Scaling Spark with AWS and GCP
Skill Level Beginner
- [Lynn] Have you been learning Apache Spark and wondering how to turn the various examples and Hello World samples you've been working with into reality and business value? Have you been thinking about how you might leverage the distributed compute power of Apache Spark in the public cloud? In this course, we're going to cover using Spark in a practical manner, based on my experience as a cloud architect building big data pipelines for customers in fields from genomic analysis to education and many others. In particular, we're going to look at cloud configurations. We're going to examine Databricks, which is software as a service running on Amazon. We're going to look at managed Spark on EMR, or Elastic MapReduce, on Amazon, and many others. I'm Lynn Langit. We have lots to cover, so let's get started.
Related Courses
- Introduction to Spark SQL and DataFrames, with Dan Sullivan (1h 53m, Intermediate)
- Learning Hadoop, with Lynn Langit (4h 6m, Beginner)
Introduction
- Using cloud services (1m 41s)
1. Hadoop and Spark Fundamentals
- Modern Hadoop and Spark (1m 39s)
- Hadoop and Spark libraries (1m 23s)
2. AWS Cloud Spark Environments
- Add Hadoop libraries (2m 33s)
- Load data into tables (1m 51s)
- Run Spark job on AWS EMR (4m 40s)
3. Spark Basics
- Apache Spark libraries (3m 24s)
- Spark shell (1m 53s)
4. Using Spark
- Tour the notebook (5m 29s)
- Import and export notebooks (2m 56s)
- Calculate Pi on Spark (8m 30s)
- Import data (2m)
- Transformations and actions (3m 21s)
- Caching and the DAG (6m 49s)
5. Spark Libraries
- Spark SQL (8m 34s)
- SparkR (6m 54s)
- Spark ML: Preparing data (4m 21s)
- Spark ML: Building the model (3m 50s)
- MXNet (25s)
6. Spark Streaming
- Spark Streaming (4m 21s)
7. Scaling Spark on AWS and GCP
Conclusion
Video: Scaling Apache Hadoop and Spark