From the course: Scala Essential Training for Data Science
Introduction to Spark - Scala Tutorial
Introduction to Spark
- [Instructor] There are many reasons to use Scala for data science and analytics. Scala is a functional programming language, and functional languages are well suited to applying computations to data. It's also an object-oriented language, which allows us to create objects and methods that keep our data organized according to the structure of the business problem we're working on. Features like parallel collections help when we're working with large data sets: they allow us to take advantage of the multiple CPU cores found in contemporary desktops and laptops. When you start working with big data, that is, data that cannot be processed in a reasonable amount of time on a single server, it's time to consider a distributed processing framework like Spark. Spark is a distributed processing framework written in Scala. It's known for its fast processing (it's faster than Hadoop, the first popular big data analytics platform), libraries for analytics, stream processing for near real-time…
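To make the functional and object-oriented points concrete, here is a minimal sketch that uses only the Scala standard library. It is not taken from the course exercise files: the `Sale` type, its fields, the `isLarge` threshold, and `SalesSummary` are all hypothetical names chosen for illustration. The case class keeps a business rule next to the data, and the summary is built by chaining higher-order functions over an ordinary collection.

```scala
// Hypothetical record type; the name, fields, and threshold are illustrative only.
final case class Sale(region: String, amount: Double) {
  // Object-oriented side: a method that lives with the data it describes.
  def isLarge: Boolean = amount >= 100.0
}

object SalesSummary {
  // Functional side: filter, group, and aggregate with higher-order functions.
  // Returns the total of the large sales for each region.
  def largeTotalsByRegion(sales: Seq[Sale]): Map[String, Double] =
    sales
      .filter(_.isLarge)
      .groupBy(_.region)
      .map { case (region, rs) => region -> rs.map(_.amount).sum }

  def main(args: Array[String]): Unit = {
    val sales = Seq(
      Sale("east", 120.0),
      Sale("west", 80.0),
      Sale("east", 200.0),
      Sale("west", 150.0)
    )
    // Print each region with its total of large sales.
    largeTotalsByRegion(sales).foreach { case (region, total) =>
      println(s"$region: $total")
    }
  }
}
```

The same filter, group, and aggregate style carries over to parallel collections and to Spark, although those need additional libraries that this sketch deliberately leaves out.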