From the course: Data Science on Google Cloud Platform: Predictive Analytics
Cloud Dataproc - Google Cloud Tutorial
Cloud Dataproc
- [Instructor] Cloud Dataproc is a managed Hadoop and Apache Spark service available on GCP. It is the same software that you would run in your enterprise environment, except that it is offered as a managed service. Everything that works on an enterprise installation of Hadoop and Spark will work here, too, as the code is fully portable. With respect to machine learning, the algorithms supported on Apache Spark and MapReduce are supported on Cloud Dataproc as well. Cloud Dataproc provides automated cluster management, including automated scaling up of resources based on the jobs run on the cluster. As you can see, Cloud Dataproc is ideal for users whose current applications running on Hadoop and Spark need to be moved to GCP.
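To make the portability point concrete, here is a minimal sketch of how an existing PySpark script might be submitted to a Dataproc cluster through the `gcloud dataproc jobs submit pyspark` command. It only builds and prints the command and does not run it. The helper name `build_submit_command`, the bucket, the script path, the cluster name, and the region are all hypothetical placeholders and are not part of the course material.

```python
import shlex


def build_submit_command(script_uri, cluster, region, job_args=()):
    """Build the argument list for submitting a PySpark script to Dataproc.

    Hypothetical helper for illustration: it assembles the command
    but does not execute it.
    """
    cmd = [
        "gcloud", "dataproc", "jobs", "submit", "pyspark", script_uri,
        f"--cluster={cluster}",
        f"--region={region}",
    ]
    if job_args:
        # Everything after "--" is passed through to the script itself.
        cmd.append("--")
        cmd.extend(job_args)
    return cmd


# Placeholder values (assumptions): replace with your own bucket, cluster, and region.
command = build_submit_command(
    "gs://example-bucket/jobs/train_model.py",
    "example-cluster",
    "us-central1",
    ["--input", "gs://example-bucket/data/"],
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(command))
```

The script being submitted would be the same PySpark file that runs on an on-premises cluster. In this sketch only the submission step is specific to GCP.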