From the course: Data Science on Google Cloud Platform: Building Data Pipelines
Cloud Dataflow - Google Cloud Tutorial
Cloud Dataflow
- [Narrator] Cloud Dataflow is a fully managed batch and stream processing service. There is almost zero administrative work required to create or scale compute power. It is a native GCP product, and it is tightly integrated with other GCP products. At the same time, it does provide portability to other platforms if you desire, because it is built on the Apache Beam programming model. You define your pipelines in Apache Beam and execute them on Cloud Dataflow using its own runner. It has a scalable execution engine that can auto-tune and auto-scale based on the requirements of the job and the user's overall usage settings in GCP. It has tight integrations with other GCP products, including Datastore and Pub/Sub. It has automated resource management, so you don't have to do any cluster design, administration, or fine-tuning. As you will see later on, Cloud Dataflow is very useful for creating and executing data pipelines.
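To make the "define in Apache Beam, execute on a runner" idea concrete, here is a minimal sketch in Python. It is not taken from the course. The helper names (build_pipeline_args, run_word_lengths) and the project, region, and bucket values are hypothetical placeholders. The sketch also assumes that a Dataflow run needs a project, a region, and a Cloud Storage temp location; your setup may supply some of these another way.

```python
from typing import List, Optional


def build_pipeline_args(runner: str = "DirectRunner",
                        project: Optional[str] = None,
                        region: Optional[str] = None,
                        temp_location: Optional[str] = None) -> List[str]:
    """Build Beam command-line style options for the chosen runner."""
    args = [f"--runner={runner}"]
    if runner == "DataflowRunner":
        # Assumption of this sketch: all three settings are passed explicitly.
        required = (("project", project), ("region", region),
                    ("temp_location", temp_location))
        missing = [name for name, value in required if not value]
        if missing:
            raise ValueError(f"DataflowRunner requires: {', '.join(missing)}")
        args += [f"--project={project}",
                 f"--region={region}",
                 f"--temp_location={temp_location}"]
    return args


def run_word_lengths(pipeline_args: List[str]) -> None:
    """Run a tiny pipeline; the pipeline code is the same for every runner."""
    # Imported lazily so build_pipeline_args works without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(pipeline_args)
    with beam.Pipeline(options=options) as pipeline:
        (pipeline
         | "CreateWords" >> beam.Create(["batch", "stream", "pipeline"])
         | "PairWithLength" >> beam.Map(lambda word: (word, len(word)))
         | "Print" >> beam.Map(print))


# Options for a local run on the DirectRunner.
local_args = build_pipeline_args()

# Options for Cloud Dataflow; all three values are hypothetical placeholders.
dataflow_args = build_pipeline_args(
    "DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)
```

With the apache-beam package installed, calling run_word_lengths(local_args) runs the pipeline on your machine, and passing dataflow_args instead would submit the same pipeline to Cloud Dataflow, provided the placeholders are replaced with real values and GCP credentials are configured. Only the options change, which is the portability the narrator describes.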