Understand the interplay between cluster sizing, job tuning, and data preparation when running a Spark job at real-world scale.
- [Instructor] So in this next section, we are scaling our workloads even bigger. And I've started two copies of the notebook with even larger files, so you can see now our cluster is in the resizing state. So if I go into the two workloads, I have both the big workload and then I have the compressed big workload, and you'll remember this is around 800 megs and then this is one-hundredth of that. So if we go in and take a look at this, we can see that this is loaded. So it's the hipster_genomewide_001_1000.vcf.
And here, we can see the size, so it's significantly bigger than the other files we've been working with. This is uncompressed. And here, you can see that the Spark jobs are running. And then if we go back to the cluster, we can see that we have a compressed version of this as well. So this is the hipster_genomewide_001_1000.vcf.bz2 and you can see how much smaller that is in size and this is running as well.
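To make the size difference between the uncompressed and the .bz2 file concrete, here is a minimal sketch using only the Python standard library. It does not read the course's file: the rows are synthetic, VCF-like text, and the helper name `make_vcf_like_text` and the row count are assumptions made for illustration. Real genomic data will compress at a different ratio than this repetitive sample.

```python
import bz2


def make_vcf_like_text(num_rows: int) -> bytes:
    """Build synthetic, tab-separated rows that loosely resemble VCF records.

    Hypothetical helper for illustration only; not the course's data.
    """
    header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    rows = [
        f"chr1\t{1000 + i}\t.\tA\tG\t50\tPASS\tDP={i % 40}\n"
        for i in range(num_rows)
    ]
    return (header + "".join(rows)).encode("utf-8")


# Generate the sample and compress it with bzip2, the codec behind .bz2 files.
raw = make_vcf_like_text(50_000)
compressed = bz2.compress(raw)

# Compare sizes and confirm that decompression restores the original bytes.
ratio = len(raw) / len(compressed)
roundtrip_ok = bz2.decompress(compressed) == raw

print(f"uncompressed: {len(raw)} bytes")
print(f"compressed:   {len(compressed)} bytes")
print(f"ratio:        {ratio:.1f}x")
print(f"round trip ok: {roundtrip_ok}")
```

The smaller file costs less to store and move, but it has to be decompressed when read, which is part of the trade-off between data preparation and job tuning. bzip2 is also a splittable codec, so Spark can generally read one large .bz2 file in parallel rather than in a single task.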
So this starts to get into, in kind of a small way, the complexity of the real world while you're running multiple sized workloads on clusters.
- Business scenarios for Apache Spark
- Setting up a cluster
- Using Python, R, and Scala notebooks
- Scaling Azure Databricks workflows
- Data pipelines with Azure Databricks
- Machine learning architectures
- Using Azure Databricks for data warehousing
Skill Level: Intermediate
1. Big Data on Azure Databricks
2. Core Azure Databricks Workloads
Use a notebook with scikit-learn (11m 29s)
3. Scaling Azure Databricks Workloads
4. Data Pipelines with Azure Databricks
5. Machine Learning Architectures
Next steps (1m 1s)