From the course: Cloud Hadoop: Scaling Apache Spark

Apache Spark libraries

- [Instructor] So as I mentioned in the previous set of movies, one of the reasons to select Spark for fast Hadoop solutions is the number of integrated libraries. A simplified version of the Spark architecture looks like this: you've got the core Spark implementation, and you have these utility libraries sitting on top. And these are really important in terms of usability. In particular, I'll point out Spark SQL. We'll be getting our hands on this shortly, but, as you can imagine, it lets you use an ANSI SQL-like query language to execute fast, distributed, in-memory jobs over huge Hadoop datasets. This is tremendously powerful and is one of the drivers of Apache Spark's popularity. In addition, we've been talking about the Spark Streaming library, and we'll be getting our hands on that later in this course as well. We'll also be using the MLlib machine learning library, and, to be complete, there is a graph library, GraphX.…
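To make the Spark SQL point concrete, here is a minimal PySpark sketch of running a SQL query over an in-memory dataset. It is an illustration only, not the course's own exercise: the `SALES` rows, the `sales` view name, and the `total_by_region` helper are all hypothetical, and it assumes PySpark is installed and run in local mode rather than on a Hadoop cluster.

```python
# Hypothetical sample data for illustration; not taken from the course.
SALES = [("east", 120.0), ("west", 80.0), ("east", 40.0)]

# An ordinary SQL statement of the kind Spark SQL accepts.
QUERY = """
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY region
"""


def total_by_region(rows):
    """Run QUERY over `rows` with Spark SQL and return (region, total) tuples."""
    # Imported lazily so this module loads even where PySpark is not installed.
    from pyspark.sql import SparkSession

    # Assumption: a local session; on a cluster the master is set by the launcher.
    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("spark-sql-sketch")
        .getOrCreate()
    )
    try:
        # Build a DataFrame and expose it to SQL under the name "sales".
        df = spark.createDataFrame(rows, ["region", "amount"])
        df.createOrReplaceTempView("sales")
        # Spark plans and executes the query across its executors.
        result = spark.sql(QUERY).collect()
        return [(row["region"], row["total"]) for row in result]
    finally:
        spark.stop()
```

Calling `total_by_region(SALES)` returns one `(region, total)` pair per region. The same query text would work unchanged if the `sales` view were backed by files in HDFS or cloud storage instead of an in-memory list, which is much of Spark SQL's appeal.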
