From the course: Data Science on Google Cloud Platform: Predictive Analytics

Cloud Dataproc - Google Cloud Tutorial

Cloud Dataproc

- [Instructor] Cloud Dataproc is a managed Hadoop and Apache Spark service available on GCP. It is the same product that you would use in your enterprise environment, except that it is a managed service. Everything that works on an enterprise installation of Hadoop and Spark will work here, too, as there is complete portability of code. With respect to machine learning, the algorithms supported on Apache Spark and MapReduce are supported on Cloud Dataproc as well. Cloud Dataproc provides automated cluster management, which includes automatically scaling up resources based on the jobs run on Cloud Dataproc. As you can see, Cloud Dataproc is ideal for users whose current applications running on Hadoop and Spark need to be moved to GCP.
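
To make the portability point concrete, here is a minimal sketch of how an existing PySpark script might be handed to a Dataproc cluster unchanged. It only assembles the `gcloud dataproc jobs submit pyspark` command and does not run it. The helper name `build_submit_command`, the bucket path, the cluster name, and the region are all hypothetical placeholders, not values from the course.

```python
import shlex


def build_submit_command(script_uri, cluster, region):
    """Assemble the gcloud argument list that submits a PySpark script
    to an existing Dataproc cluster. Nothing is executed here."""
    return [
        "gcloud", "dataproc", "jobs", "submit", "pyspark",
        script_uri,                # the same script you would run on-premises
        f"--cluster={cluster}",    # hypothetical cluster name
        f"--region={region}",      # region where the cluster lives
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = build_submit_command(
        "gs://example-bucket/wordcount.py",
        "example-cluster",
        "us-central1",
    )
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
```

The script itself is not modified for the cloud in this sketch; only the way it is submitted changes, which is the sense in which the code is portable.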
