From the course: Cloud Hadoop: Scaling Apache Spark

Reexamine streaming pipelines

- [Instructor] We're going to look again at streaming pipelines, because this is increasingly the work that my team and I are asked to do by our customers, and there are an increasing number of choices around these architectures. So first, of course, we're going to look at whether it's batch or stream ingest. You could use open source or commercial, usually cloud, services for stream ingest. Then we want to look at whether it's batch or stream processing. Ingest is increasingly, as you move to production, becoming separate from processing. So for batch or stream processing, you know, batch is of course MapReduce. Spark is in memory, and we have some other libraries, like Storm, that have different types of mechanisms around processing streams. I increasingly use Spark because it can do so many different things, including streaming, although there is still some Storm out there that I see. There…
