Analyze the problem statement for the log accumulation use case, and set out the goals to achieve while architecting the solution.
- [Instructor] We now move on to the next use case, which is related to the previous one. Let's turn our attention to Server Log Accumulation. First, let us define the problem. Your company has a farm of 200 web servers that provide your e-commerce service. When a visitor gets to your website, they are redirected by a load balancer to one of those 200 instances, which then serves the visitor for the rest of the user session.
Each of these servers generates its own performance logs. The logs carry information about CPU utilization, memory utilization, and disk I/O. Overall, they generate about 40 GB of data per day across all the nodes. Your IT department wants to analyze these logs to understand server performance and improve user experience. You are asked to accumulate these logs and send them to a network management system, as well as to a new database for your analytics team.
The network management system (NMS) is for third-party alerting purposes. So here is your use case: architect a data pipeline to collect distributed logs and accumulate them in a central database. Your goals for the use case would be the following: the solution needs the scalability to handle 200 servers, and also potential future growth of this setup to double or triple the volume.
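To put the scalability goal in numbers, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 GB = 1000 MB) and an evenly spread load across servers and across the day, neither of which is stated in the problem; the function and variable names are illustrative only.

```python
# Rough sizing for the log accumulation use case.
# Assumptions (not from the problem statement): 1 GB = 1000 MB,
# load is spread evenly across servers and across the day.

SERVERS = 200
DAILY_VOLUME_GB = 40.0
SECONDS_PER_DAY = 24 * 60 * 60


def sizing(daily_gb, servers, growth=1):
    """Return daily volume and average rates for a given growth factor."""
    total_gb = daily_gb * growth
    total_mb = total_gb * 1000
    return {
        "total_gb_per_day": total_gb,
        "per_server_mb_per_day": total_mb / servers,
        "avg_mb_per_s": total_mb / SECONDS_PER_DAY,
    }


if __name__ == "__main__":
    for growth in (1, 2, 3):
        s = sizing(DAILY_VOLUME_GB, SERVERS, growth)
        print(
            f"x{growth}: {s['total_gb_per_day']:.0f} GB/day, "
            f"{s['per_server_mb_per_day']:.0f} MB/server/day, "
            f"{s['avg_mb_per_s']:.2f} MB/s average"
        )
```

Under these assumptions, the average aggregate rate stays modest even at triple the volume, so the harder part of the scalability goal is likely the number of sources and any peak-hour bursts rather than raw throughput.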
There should be minimal custom work required to build the solution. This ensures that the solution can be deployed with minimal effort and resources on the organization's part. The solution should support multiple sources of data, as well as multiple receivers of data. Remember, we need to support 200 web servers as sources, and the network management system and a server log store as sinks.
The solution should provide guaranteed delivery. There should be no loss of events during transit. Let us now roll into architecting the solution.
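As a toy illustration of the last two goals (multiple sinks and guaranteed delivery), the sketch below keeps every event pending until each sink has acknowledged it, and retries on failure. It is an in-memory model only, not the solution this course arrives at; `FlakySink`, `fan_out`, and `max_attempts` are hypothetical names invented for the example.

```python
from collections import deque


class FlakySink:
    """Hypothetical sink that rejects its first `failures` deliveries."""

    def __init__(self, name, failures=0):
        self.name = name
        self.failures = failures
        self.received = []

    def deliver(self, event):
        """Return True (acknowledge) on success, False on failure."""
        if self.failures > 0:
            self.failures -= 1
            return False
        self.received.append(event)
        return True


def fan_out(events, sinks, max_attempts=5):
    """Deliver every event to every sink, retrying until acknowledged."""
    # One pending entry per (event, sink) pair, with an attempt counter.
    pending = deque((event, sink, 1) for event in events for sink in sinks)
    while pending:
        event, sink, attempt = pending.popleft()
        if sink.deliver(event):
            continue  # acknowledged, so the entry can be dropped
        if attempt >= max_attempts:
            raise RuntimeError(f"{sink.name} did not accept {event!r}")
        # Not acknowledged: keep the event and try again later.
        pending.append((event, sink, attempt + 1))


if __name__ == "__main__":
    nms = FlakySink("nms", failures=2)
    log_store = FlakySink("log_store")
    fan_out(["cpu=71", "mem=48", "disk_io=12"], [nms, log_store])
    print(len(nms.received), len(log_store.received))
```

Note that retried events reach a sink later than events that succeeded on the first attempt, so this sketch guarantees delivery but not ordering.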
There is no coding involved. Instead you will see how big data tools can help solve some of the most complex challenges for businesses that generate, store, and analyze large amounts of data. The use cases are drawn from a variety of industries, including ecommerce and IT. Instructor Kumaran Ponnambalam shows how to analyze a problem, draw an architectural outline, choose the right technologies, and finalize the solution. After each use case, he reviews related best practices for data acquisition, transport, processing, storage, and service. Each lesson is rich in practical techniques and insights from a developer who has experienced the benefits and shortcomings of these technologies firsthand.
- Components of a big data application
- Big data app development strategies
- Use cases: archiving audit logs and performing customer analytics
- Technology options
- Designing solutions
- Best practices