Join Lynn Langit for an in-depth discussion in this video, Internet of Things, part of Google Cloud Platform Essential Training (2017).
- [Instructor] The next set of recommended GCP architectures we're going to look at covers Internet of Things scenarios. The core products here are Cloud Storage, Cloud Pub/Sub, Bigtable, Cloud Dataflow and BigQuery. In IoT, there are various protocols that the devices can use. If your devices are using MQTT, a common architecture is sending those MQTT messages via a Pub/Sub broker back into some processing services so that the device can communicate with the application, for example a web app that controls the device.
So you can see on the outside of our diagram, we have our devices, and then we're using Cloud Load Balancer in front of an MQTT broker. I've had a couple of customers who were not aware that this broker was available on Google Cloud, so I wanted to make sure to include it. That being said, not all IoT devices use MQTT. There are other protocols, but this is an architecture for this type of device. Once you get the message into Pub/Sub, you're going to process the message, and that can happen in various ways.
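To make the MQTT-to-Pub/Sub hand-off a bit more concrete, here is a minimal sketch in plain Python of what a bridge might do with each message. The topic convention `devices/<device_id>/<channel>`, the function name `mqtt_to_pubsub` and the attribute names are assumptions made for illustration; they are not from the course or from any Google library. The envelope loosely follows the shape of a Pub/Sub message (a base64 `data` field plus string `attributes`), and no real client is called.

```python
import base64


def mqtt_to_pubsub(mqtt_topic: str, payload: bytes) -> dict:
    """Wrap one MQTT message in a Pub/Sub-style envelope (hypothetical layout)."""
    # Assumed MQTT topic convention: devices/<device_id>/<channel>
    parts = mqtt_topic.split("/")
    if len(parts) != 3 or parts[0] != "devices":
        raise ValueError(f"unexpected MQTT topic: {mqtt_topic!r}")
    _, device_id, channel = parts
    return {
        # The payload is carried as base64 text so binary data survives JSON.
        "data": base64.b64encode(payload).decode("ascii"),
        # Attributes let subscribers filter or route without decoding the payload.
        "attributes": {"deviceId": device_id, "channel": channel},
    }


envelope = mqtt_to_pubsub("devices/bulb-1/state", b'{"on": true}')
```

In a real deployment the envelope would be handed to a Pub/Sub publisher rather than returned, but the mapping step is the part the architecture depends on.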
So you can see on the right, we've got our application running on App Engine and we've got Cloud Pub/Sub handling the publish and subscribe for the various MQTT topics. I've also applied stream analytics on top of Cloud Dataflow using the hot ingest path from Pub/Sub, so it's in near real time. Obviously, if we have an app on our phone to turn our light bulb on or off, we don't want to have latency. This should be near real time. And then we can capture those events into an IoT warehouse so that we can learn what our customers are doing with their products and make better products.
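As a rough picture of what the stream analytics step on the hot path does, the sketch below counts events per device in fixed one-minute windows. It is plain Python standing in for a Cloud Dataflow pipeline, not Dataflow code; the function name `tumbling_window_counts` and the `(timestamp_seconds, device_id)` event shape are assumptions for illustration.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=60):
    """Count events per (window_start, device_id) in fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, device_id in events:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, device_id)] += 1
    return dict(counts)


# Hypothetical events: (seconds since some epoch, device id)
sample = [(0, "bulb-1"), (30, "bulb-1"), (61, "bulb-1"), (59, "bulb-2")]
per_window = tumbling_window_counts(sample)
```

A managed pipeline would also deal with late and out-of-order data, which this sketch ignores.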
We can fix things that don't work properly and we can see which features are important by leveraging the ability to run SQL queries against the IoT event data in BigQuery. Because I'm doing a lot of IoT work, I wanted to show an alternate architecture. Now this one is a little bit overwhelming, so I won't go through quite everything here, but you can see we've got our constrained devices that are not on TCP, and then we have our standard devices using HTTPS. This is where you have devices in the home that are connecting to an edge network.
So once that device data goes beyond that edge network into the Google Cloud Platform, let's just start with the higher level light blue boxes. We have some sort of ingest mechanism, and again that's often Cloud Pub/Sub because it's near real time, along with monitoring and logging, and those ingested messages are then sent to a Cloud Dataflow pipeline. The next step in the message path is that it goes into both the storage tier and the analytics tier. In the storage tier you could use Cloud Storage, you could use Cloud Datastore depending on the structure of the data, or even Bigtable.
Do you remember the difference between Cloud Datastore and Bigtable? Datastore is a document database and it has ACID transactions and tunable consistency. Bigtable is designed as an abstraction layer on top of basic file storage that allows you to run HBase-style queries against rows and rows and rows of log or event data. So you really want to use the right storage mechanism based on what you need to do with the incoming data. You can save a lot of money and time by selecting the appropriate storage mechanism or combination of storage mechanisms.
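The storage choice described here can be written down as a small lookup. This is only a sketch of the rule of thumb from the discussion, not an official decision guide; the category labels (`"file"`, `"document"`, `"event_rows"`) and the function name `pick_storage` are made up for illustration.

```python
# Rule-of-thumb mapping from the shape of the incoming data to a storage service.
# The keys are hypothetical labels, not GCP terminology.
STORAGE_BY_DATA_SHAPE = {
    "file": "Cloud Storage",        # raw files and other objects
    "document": "Cloud Datastore",  # structured documents needing transactions
    "event_rows": "Bigtable",       # very large volumes of log or event rows
}


def pick_storage(data_shape: str) -> str:
    """Return the storage service suggested for a given data shape."""
    try:
        return STORAGE_BY_DATA_SHAPE[data_shape]
    except KeyError:
        raise ValueError(f"no suggestion for data shape {data_shape!r}") from None


# A pipeline can write to more than one tier, so a combination is also valid.
targets = [pick_storage(shape) for shape in ("file", "event_rows")]
```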
In the same way, you want to look at the analytics tier. You can use a combination of BigQuery and Cloud Dataflow, so continuing steps in the pipeline that do processing, or you can send the data to what I call the dump truck, which is Cloud Dataproc, the managed implementation of Hadoop, Spark and the Hadoop ecosystem, if you need to do lower level processing. You can also use Cloud Datalab to do analytics and visualization at this tier if that's appropriate. Once you're done doing the analytics on your data, you're going to present the data, and you might do that using any of the various compute methods: App Engine, Container Engine or Compute Engine.
Again, that's a familiar paradigm at this point in our discussion of architecture.