Analyze the problem statement for the data warehouse use case, and set out the goals while architecting the solution.
- [Instructor] Let's begin with our first use case: the audit trail data archive. This is a simple one, and it is a great starter project for businesses that want to begin using big data for their data and analytics needs. I'll define the problem first. Your company has an eCommerce application that it hosts for its customers to buy a variety of products and services. This application creates an audit trail for all the actions that its users perform on its website every day.
The audit trail contains information like the username, when the user visited the website, pages visited, products reviewed, total time on each page, et cetera. This is important for the business to analyze user activity trends. It generates about 10 GB of data per day. Currently, your company uses an Oracle DB to store data. As you may know, Oracle is relatively expensive when the database has to scale into a cluster.
In order to control costs, and due to scalability limitations, your company has been keeping only 15 days of audit trail online. But this data is insufficient for your analysts. They have been demanding access to three years' worth of data online. So, here is your task. Architect an audit trail data archive that should be able to store three years of data. At about 10 GB per day, that would be more than 10 terabytes of data online.
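The sizing figure above can be checked with a quick back-of-the-envelope calculation. The sketch below is only illustrative: the function name `archive_size_tb` is made up for this example, and it assumes a 365-day year, a constant 10 GB per day, and decimal units (1 TB = 1000 GB), none of which are spelled out in the problem statement.

```python
# Rough sizing of the audit trail archive, using the figures from the
# problem statement (10 GB per day, 15 days kept today, 3 years wanted).
# Assumptions: constant daily volume, 365-day years, 1 TB = 1000 GB.

DAILY_VOLUME_GB = 10
CURRENT_RETENTION_DAYS = 15
TARGET_RETENTION_YEARS = 3
DAYS_PER_YEAR = 365


def archive_size_tb(daily_gb, years, days_per_year=DAYS_PER_YEAR, gb_per_tb=1000):
    """Return the total archive size in terabytes for the given retention."""
    return daily_gb * years * days_per_year / gb_per_tb


# Data kept online today: 15 days at 10 GB per day.
current_online_gb = DAILY_VOLUME_GB * CURRENT_RETENTION_DAYS

# Data the analysts are asking for: three years at 10 GB per day.
target_online_tb = archive_size_tb(DAILY_VOLUME_GB, TARGET_RETENTION_YEARS)

print(f"Online today: {current_online_gb} GB")
print(f"Three-year archive: {target_online_tb:.2f} TB")
```

Under these assumptions the archive works out to roughly 11 TB, compared with 150 GB online today. Using binary units (1 TB = 1024 GB) gives a slightly smaller number that is still above 10 TB, so the estimate in the problem statement holds either way.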
You're also given specific goals for this use case. First, the solution needs to be cheap and cost-effective. It cannot cost the company significant amounts of money. Second, the solution should be massively scalable. Storing and managing more than 10 terabytes of data is no small task. Finally, the solution should allow ad hoc querying capabilities to help your analysts use this data for their work.
Let's get started with our potential solution.
There is no coding involved. Instead you will see how big data tools can help solve some of the most complex challenges for businesses that generate, store, and analyze large amounts of data. The use cases are drawn from a variety of industries, including ecommerce and IT. Instructor Kumaran Ponnambalam shows how to analyze a problem, draw an architectural outline, choose the right technologies, and finalize the solution. After each use case, he reviews related best practices for data acquisition, transport, processing, storage, and service. Each lesson is rich in practical techniques and insights from a developer who has experienced the benefits and shortcomings of these technologies firsthand.
- Components of a big data application
- Big data app development strategies
- Use cases: archiving audit logs and performing customer analytics
- Technology options
- Designing solutions
- Best practices