From the course: Apache Flink: Real-Time Data Engineering

Using keyed streams

- [Instructor] In this video, I will discuss partitioning of streams by key and show an example. To partition, we apply the keyBy operator, which partitions the data stream by one or more attributes. Flink distributes the events in a data stream to different task slots based on the key. Flink uses a hashing algorithm to divide the stream into partitions based on the number of slots allocated to the job, and it always sends the same key to the same slot. Partitioning by key is ideal for operations that aggregate on a specific key. Once the stream is partitioned, each key is aggregated locally in its task slot, without the need to shuffle data between slots again. This helps optimize the performance of the job. The code example for using keys is in the keyed stream operations class under the chapter two package. The setting up of the Flink environment, the reading of the CSV into the data stream, and setting up the…
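
The partitioning idea described above can be sketched in plain Java, without the Flink runtime. This is only an illustration of hashing a key to a slot and aggregating locally. It is not Flink's actual hashing scheme and not the course's code. The class name KeyedPartitionSketch, the Event type, and the slotFor and partitionAndSum helpers are all hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyedPartitionSketch {

    // Hypothetical event type: a key (for example a product name) and a value.
    static final class Event {
        final String key;
        final int value;

        Event(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    // Simplified stand-in for hash partitioning: the same key always
    // maps to the same slot index in the range [0, numSlots).
    static int slotFor(String key, int numSlots) {
        return Math.floorMod(key.hashCode(), numSlots);
    }

    // Route every event to its slot, then keep a running sum per key
    // inside that slot. No slot ever needs data held by another slot.
    static List<Map<String, Integer>> partitionAndSum(List<Event> events, int numSlots) {
        List<Map<String, Integer>> slots = new ArrayList<>();
        for (int i = 0; i < numSlots; i++) {
            slots.add(new HashMap<>());
        }
        for (Event e : events) {
            Map<String, Integer> localState = slots.get(slotFor(e.key, numSlots));
            localState.merge(e.key, e.value, Integer::sum);
        }
        return slots;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("mouse", 2),
                new Event("keyboard", 5),
                new Event("mouse", 3),
                new Event("monitor", 1));

        List<Map<String, Integer>> slots = partitionAndSum(events, 4);
        for (int i = 0; i < slots.size(); i++) {
            System.out.println("slot " + i + " -> " + slots.get(i));
        }
    }
}
```

Because both "mouse" events hash to the same slot, their sum is computed entirely within that slot's local map. That is the property that makes keyed aggregation cheap once the stream has been partitioned.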
