Reading streams is different from reading persistent data. Use the DataStream API to read from a stream for processing with Flink.
- [Instructor] In this video, we will continue with the example discussed in the previous video, the basic streaming operations class. To read data from a data source, Flink needs to create a data source connection and attach it to a data stream object. For a file-based data source, we define the input format as TextInputFormat and point it to the data directory where the files are created. Then we use the readFile function on the streaming environment object to read the contents of this directory. In this function, we first specify the format of the file. Then we specify the directory to monitor for new or changed files. When new or changed files are found, Flink automatically picks them up for processing and attaches them to the data stream. The file processing mode is set to PROCESS_CONTINUOUSLY, which means that Flink will continuously monitor the directory for any new files. The monitoring interval is set to one second. To receive data into this data stream, we must declare the data type of each record. We start off with the data type as a string, reading it into the audit trail string data stream. A string format for a record has limited use; it needs to be converted into either a POJO class or a tuple for further processing. We will convert each string record into an AuditTrail POJO object using a map function. The map function continuously processes new records as they arrive in the stream. It prints the received record so we can check its output, and it also converts the record into an AuditTrail object. The AuditTrail object contains the logic to interpret the string and convert it into a corresponding object. Back in the main code, let's run this code and review the results. The log messages generated by the stream generator are printed in blue to differentiate them from the messages generated by the Flink job.
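The setup described above can be sketched roughly as follows. This is a minimal illustration, assuming Flink's DataStream API is on the classpath; the directory path, job name, and the AuditTrail string constructor are assumptions for illustration, not taken from the course code.

```java
// Sketch: read a directory of text files as a continuous stream,
// then map each raw string record into an AuditTrail POJO.
// Requires the Apache Flink streaming dependencies on the classpath.
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

public class BasicStreamingOperationsSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        String dataDir = "data/raw_audit_trail";  // hypothetical directory

        // Monitor the directory continuously, checking for new or
        // changed files every 1000 ms (the one-second interval).
        DataStream<String> auditTrailStr = env.readFile(
                new TextInputFormat(new Path(dataDir)),
                dataDir,
                FileProcessingMode.PROCESS_CONTINUOUSLY,
                1000);

        // Convert each string record into an AuditTrail POJO,
        // printing the raw record so the output can be checked.
        DataStream<AuditTrail> auditTrailObj = auditTrailStr.map(
                new MapFunction<String, AuditTrail>() {
                    @Override
                    public AuditTrail map(String record) {
                        System.out.println("--- Received Record : " + record);
                        return new AuditTrail(record);  // parsing logic lives in the POJO
                    }
                });

        env.execute("Basic Streaming Operations");
    }
}
```

Note that the processing mode and interval are what tell Flink to keep watching the directory; with PROCESS_ONCE instead, the job would read the current contents and stop.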
As the generator continues to generate events, we see that the map function receives these events and converts them into AuditTrail objects. In the next video, we will perform some computations on this data.
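The AuditTrail POJO mentioned above carries the logic to interpret a raw string record. A minimal self-contained sketch of such a class is below; the field names and the comma-delimited record layout are assumptions for illustration, not the course's actual schema. Flink POJOs need public fields (or getters/setters) and a no-argument constructor so the runtime can serialize them.

```java
// Hypothetical AuditTrail POJO: parses a comma-delimited record such as
// "1,alice,update,1623772850" into typed fields. The layout is assumed.
public class AuditTrail {
    public int id;
    public String user;
    public String operation;
    public long timestamp;

    // Flink POJO rules require a public no-arg constructor.
    public AuditTrail() {}

    // Interpret a raw string record and populate the fields.
    public AuditTrail(String record) {
        String[] parts = record.split(",");
        this.id = Integer.parseInt(parts[0].trim());
        this.user = parts[1].trim();
        this.operation = parts[2].trim();
        this.timestamp = Long.parseLong(parts[3].trim());
    }

    @Override
    public String toString() {
        return "AuditTrail{id=" + id + ", user=" + user
                + ", operation=" + operation
                + ", timestamp=" + timestamp + "}";
    }

    public static void main(String[] args) {
        AuditTrail trail = new AuditTrail("1,alice,update,1623772850");
        System.out.println(trail);
    }
}
```

Keeping the parsing inside the POJO's constructor keeps the map function in the streaming job a one-liner, which is the pattern the walkthrough describes.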
- Streaming with Apache Flink
- Using the DataStream API for basic stream processing
- Working with process functions
- Windowing and joins
- Setting up event-time processing
- State management in Flink