From the course: Cloud Hadoop: Scaling Apache Spark

Scale on GCP Dataproc or on Terra.bio

- [Instructor] In addition to working in Amazon, I do a lot of work on GCP, and I just wanted to take a minute here to show you that the Dataproc managed Hadoop and Spark implementation is conceptually very similar to Amazon's. In some ways I find it to be more performant, so I wanted to show it here. You create a cluster much as you do in Amazon. Now, if you've got a new Google account, there is a limit of eight CPUs, so you're going to want to set this to a smaller size: the cluster is one master and two workers, and with the defaults you'll run out of CPUs in a free trial account. So know that you have only eight here, 'cause it's four here and four here. Also, you're going to want to enable the components, because that's going to give you your Jupyter Notebook. So you're going to turn that on, and then you're going to want to select the components and get Anaconda, which is required on GCP, and Jupyter, if you want to use…
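
As a rough illustration of the sizing point above, here is a minimal Python sketch that checks whether a one-master, two-worker layout fits under an eight-CPU limit, and assembles a `gcloud dataproc clusters create` command with the Anaconda and Jupyter components turned on. The quota of eight and the cluster shape come from the transcript. The per-node vCPU counts, the reading of "four here and four here" as four vCPUs on the master plus four across the workers, and the cluster name, region, and machine types in the usage lines are all assumptions made for illustration, not the instructor's actual settings.

```python
# Sketch only: sizing arithmetic for a small Dataproc cluster.
# From the transcript: 8-CPU limit, 1 master + 2 workers.
# Assumed for illustration: vCPU counts per node, names, region, machine types.

def total_vcpus(master_vcpus, worker_vcpus, num_workers, num_masters=1):
    """Total vCPUs requested by the cluster layout."""
    return num_masters * master_vcpus + num_workers * worker_vcpus


def fits_quota(master_vcpus, worker_vcpus, num_workers, quota=8):
    """True if the layout stays within the vCPU quota."""
    return total_vcpus(master_vcpus, worker_vcpus, num_workers) <= quota


def create_command(name, region, master_type, worker_type, num_workers):
    """Build (but do not run) a gcloud command as an argument list."""
    return [
        "gcloud", "dataproc", "clusters", "create", name,
        f"--region={region}",
        f"--master-machine-type={master_type}",
        f"--worker-machine-type={worker_type}",
        f"--num-workers={num_workers}",
        # Optional components named in the transcript.
        "--optional-components=ANACONDA,JUPYTER",
        # Exposes the notebook UI through the component gateway.
        "--enable-component-gateway",
    ]


if __name__ == "__main__":
    # Three 4-vCPU nodes need 12 vCPUs, which exceeds the limit of 8.
    print(total_vcpus(4, 4, 2), fits_quota(4, 4, 2))
    # A 4-vCPU master with two 2-vCPU workers needs exactly 8.
    print(total_vcpus(4, 2, 2), fits_quota(4, 2, 2))
    # Hypothetical name, region, and machine types.
    print(" ".join(create_command(
        "demo-cluster", "us-central1", "n1-standard-4", "n1-standard-2", 2)))
```

The command is only built as a list of strings so it can be inspected before being run. Whether the Anaconda component is available depends on the Dataproc image version, so check that against the current documentation.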
