From the course: Scala Essential Training for Data Science

Introduction to Spark

- [Instructor] There are many reasons to use Scala for data science and analytics. Scala is a functional programming language, and functional languages are well suited to applying computations to data. It's also an object-oriented language, which allows us to create objects and methods that keep our data organized according to the structure of the business problem we're working on. Features like parallel collections help when we're working with large data sets: they let us take advantage of the multiple CPU cores found in contemporary desktops and laptops. When you start working with big data, that is, data that cannot be processed in a reasonable amount of time on a single server, it's time to consider a distributed processing framework like Spark. Spark is a distributed processing framework written in Scala. It's known for its fast processing (it's faster than Hadoop, the first popular big data analytics platform), libraries for analytics, stream processing for near real-time…
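
To make the functional and object-oriented points concrete, here is a minimal sketch using only the Scala standard library. The `Sale` type, the `SalesSummary` object, and the sample figures are hypothetical, invented for illustration; they do not come from the course.

```scala
// Hypothetical domain type: one record per sale (object-oriented side).
final case class Sale(region: String, amount: Double)

object SalesSummary {
  // Sample data invented for illustration only.
  val sales: Vector[Sale] = Vector(
    Sale("east", 120.0),
    Sale("west", 80.0),
    Sale("east", 30.0),
    Sale("west", 20.0)
  )

  // Functional side: group the records by region, then reduce each group to a total.
  def totalByRegion(rows: Seq[Sale]): Map[String, Double] =
    rows.groupBy(_.region).map { case (region, rs) =>
      region -> rs.map(_.amount).sum
    }

  def main(args: Array[String]): Unit =
    // Sort by region name so the printed order is stable.
    totalByRegion(sales).toSeq.sortBy(_._1).foreach { case (region, total) =>
      println(s"$region: $total")
    }
}
```

The same group-then-aggregate shape is what a distributed framework applies across many machines. For the single-machine parallelism mentioned above, Scala 2.12 exposes parallel collections through `.par` on standard collections; from Scala 2.13 onward they live in the separate scala-parallel-collections module, so this sketch stays with sequential collections to remain dependency-free.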
