Build code for processing web hits data in the solution.
- [Instructor] In this video, I'm going to show you how to set up a pipeline for web clicks. The web servers that run the website keep recording web click events and writing them to log files. We are going to set up a Kafka Connect process to monitor these log files, receive data from them, and publish it to a Kafka topic. Then we're going to use Spark to subscribe to this data, process it, and update the summary table. As you can see on the top right, the summary table only has sales data at this point, and we are now going to update it with the web hits data as well.
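The processing step described above amounts to counting incoming web click events and merging those counts into a summary that already holds sales figures. The plain-Python sketch below stands in for that logic without Kafka or Spark, so it is not the course's Spark code. The event format (a timestamp and a product separated by a comma), the product key, and the field names `sales` and `web_hits` are all assumptions made for illustration.

```python
from collections import Counter


def count_hits(lines):
    """Count events per product from lines shaped like 'timestamp,product'.

    The line format is an assumption; the real log format may differ.
    """
    hits = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _timestamp, product = line.split(",", 1)
        hits[product.strip()] += 1
    return hits


def update_summary(summary, hits):
    """Add hit counts to the summary in place, creating rows as needed."""
    for product, count in hits.items():
        row = summary.setdefault(product, {"sales": 0, "web_hits": 0})
        row["web_hits"] = row.get("web_hits", 0) + count
    return summary


# Hypothetical starting state: the summary holds only sales data.
summary = {"laptop": {"sales": 3}, "phone": {"sales": 5}}

# Hypothetical web click events, as they might appear in the log file.
events = [
    "2018-01-01 10:00:00,laptop",
    "2018-01-01 10:00:05,phone",
    "2018-01-01 10:00:09,laptop",
]

update_summary(summary, count_hits(events))
print(summary)
```

In a streaming job the same merge would run once per batch of events, with the summary kept in an external table rather than in an in-memory dictionary.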
First, let us focus on setting up the Kafka Connect process for getting web clicks data. This is a FileStreamSource, and it is going to monitor the specific file given on line number 32. It's called webclicks.text, and the content of this particular file is shown here, so you can see what an event looks like in this case. It has a timestamp that indicates when some event happened, and then it…
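A FileStreamSource connector like the one described above is typically configured through a small properties file. The sketch below builds such a configuration in Python and renders it in properties format. Only the file name `webclicks.text` comes from the video; the connector name, the directory, and the topic name are hypothetical placeholders, and the configuration shown on screen may differ.

```python
# Minimal sketch: render a Kafka Connect FileStreamSource configuration.
# Connector name, directory, and topic are hypothetical placeholders.


def build_file_source_config(name, file_path, topic):
    """Return the basic FileStreamSource settings as a dictionary."""
    return {
        "name": name,
        "connector.class": "FileStreamSource",
        "tasks.max": "1",
        "file": file_path,  # the file the connector monitors
        "topic": topic,     # the Kafka topic each new line is published to
    }


def to_properties(config):
    """Render the settings as key=value lines, one per setting."""
    return "\n".join(f"{key}={value}" for key, value in config.items())


config = build_file_source_config(
    name="webclicks-file-source",     # hypothetical connector name
    file_path="/tmp/webclicks.text",  # directory is an assumption
    topic="use-case-webclicks",       # hypothetical topic name
)
properties_text = to_properties(config)
print(properties_text)
```

Saved to a file, this output could be passed to a standalone Kafka Connect worker; each line appended to the monitored file would then be published as one record on the topic.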