Join Dan Sullivan for an in-depth discussion in this video, Review of Scala for data science, part of Scala Essential Training for Data Science.
- [Instructor] Scala is a language well suited for data science. Several of its features are especially important. For example, Scala is a functional programming language, so we can use things like the map operator to apply computations to members of collections. It's object oriented, so we can organize our data and methods for operating on that data into logical groups. Scala is also a scalable language, and it gives us access to a wide range of Java libraries, including JDBC.
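As a minimal sketch of the two points above (grouping data with its methods, and applying a computation to every member of a collection with map), something like the following could work. The `Measurement` type, its fields, and the sample values are hypothetical and chosen only for illustration; they do not come from the course.

```scala
// Hypothetical record type for illustration; not taken from the course.
final case class Measurement(sensor: String, celsius: Double) {
  // A method kept together with the data it operates on.
  def fahrenheit: Double = celsius * 9.0 / 5.0 + 32.0
}

object MapExample {
  // Made-up sample data.
  val readings: Vector[Measurement] = Vector(
    Measurement("a", 0.0),
    Measurement("b", 100.0),
    Measurement("c", 25.0)
  )

  // map applies the function to each member and returns a new collection;
  // the original Vector is left unchanged.
  val inFahrenheit: Vector[Double] = readings.map(_.fahrenheit)

  def main(args: Array[String]): Unit =
    println(inFahrenheit) // prints Vector(32.0, 212.0, 77.0)
}
```

Because map returns a new collection instead of mutating the old one, the same pattern carries over to parallel collections and to Spark without changing the function being applied.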
Scala is especially known for efficient computation. It compiles to Java bytecode and runs on the JVM. This means we get to take advantage of all kinds of advances in Java compiler design, like the just-in-time compiler. Parallel collections are especially useful for taking advantage of multi-core processors on our desktops and our laptops. They're even more useful if we're running on servers that have even more cores. Now, if you're working with big data, that is, data that's too big to efficiently process on a single server, look into Spark.
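To make the parallel collections point concrete, here is a small sketch of mapping the same function over a sequential and a parallel collection. It assumes Scala 2.12 or earlier, where `.par` is part of the standard library; on Scala 2.13 and later, parallel collections live in the separate scala-parallel-collections module and need `import scala.collection.parallel.CollectionConverters._`. The function `sumOfSquaresUpTo` is a hypothetical stand-in for CPU-bound work.

```scala
// Assumes Scala 2.12 or earlier, where .par is in the standard library.
// On Scala 2.13+, add the scala-parallel-collections module and
// import scala.collection.parallel.CollectionConverters._
object ParExample {
  // Hypothetical CPU-bound function used only for illustration.
  def sumOfSquaresUpTo(n: Int): Long =
    (1 to n).foldLeft(0L)((acc, i) => acc + i.toLong * i)

  // Sequential map: elements are processed one at a time.
  def sequentialMap(inputs: Vector[Int]): Vector[Long] =
    inputs.map(sumOfSquaresUpTo)

  // .par converts the Vector to a parallel collection, so the map work can
  // be split across available cores; toVector converts the result back.
  def parallelMap(inputs: Vector[Int]): Vector[Long] =
    inputs.par.map(sumOfSquaresUpTo).toVector

  def main(args: Array[String]): Unit = {
    val inputs = (1 to 1000).toVector
    // Both versions produce the same results in the same order.
    println(sequentialMap(inputs) == parallelMap(inputs)) // prints true
  }
}
```

The parallel work is kept inside methods called from `main` and not in `val` initializers of the object, since running parallel operations during object initialization can deadlock. For small collections or cheap functions, the overhead of splitting and combining work may outweigh any gain.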
Dan also focuses on using Scala with Spark, a distributed processing platform. He first describes how to work with Resilient Distributed Datasets (RDDs)—a fundamental Spark data structure—and then explains how to use Scala with Spark DataFrames, a new class of data structure specially designed for analytic processing. He wraps up the course by providing a summary of advantages of using Scala for data science.
- The advantages of Scala for data science
- Scala data types
- Scala arrays, vectors, and ranges
- Parallel processing in Scala
- Mapping functions over parallel collections
- When and when not to use parallel collections
- Using SQL in Scala
- Scala and Spark RDDs
- Scala and Spark DataFrames
- Creating DataFrames
Skill Level Intermediate
1. Introduction to Scala
2. Parallel Processing in Scala
3. Using SQL in Scala
4. Scala and Spark RDDs
5. Scala and Spark DataFrames