Discuss some best practices while building data transport modules within the big data architecture.
- [Instructor] We will look into some best practices in setting up parallel processing architectures in this video. Store all data, including session state, in a central database. This database should work as a data grid and be able to scale horizontally. Design your services to be stateless. Any state should be stored in the central database. Deploy multiple instances of the services behind a load balancer.
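As a rough illustration of the stateless-service idea, here is a minimal Python sketch. It is not from the course: `CentralStore`, `ServiceInstance`, and `add_to_cart` are hypothetical names, the in-memory dictionary only stands in for a real horizontally scalable data grid, and `itertools.cycle` is a toy substitute for a load balancer.

```python
import itertools
import threading


class CentralStore:
    """In-memory stand-in for the central data grid (hypothetical)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def update(self, key, default, fn):
        # Atomically read, transform, and write back the value for one key.
        with self._lock:
            value = fn(self._data.get(key, default))
            self._data[key] = value
            return value


class ServiceInstance:
    """A stateless worker: it keeps no session data between calls."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        # All session state is read from and written to the central store.
        cart = self.store.update(session_id, [], lambda items: items + [item])
        return {"served_by": self.name, "cart": cart}


store = CentralStore()
instances = [ServiceInstance(f"instance-{i}", store) for i in range(3)]
balancer = itertools.cycle(instances)  # naive round-robin "load balancer"

# Three requests for the same session land on three different instances.
responses = [
    next(balancer).add_to_cart("session-42", item)
    for item in ["book", "pen", "lamp"]
]
final_cart = responses[-1]["cart"]
```

Because no instance holds the cart itself, any instance can serve the next request for `session-42`, and instances can be added or removed without losing session data.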
Each instance can handle a request independently of the other instances, as all required state is stored in a central data grid. Different technologies provide different kinds of data partitions and partition management capabilities. This is a key scalability feature, so spend time learning and designing these partitions. Data processing should be done in map-type operations as much as possible to ensure parallelism.
Reduce operations should be kept to a minimum and should be in the last stage, after unnecessary data is filtered and trimmed.
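The map-first, reduce-last pattern can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample records, the `valid` flag, and the function names `map_partition` and `merge_totals` are invented for the example, and a thread pool stands in for the workers of a real cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical input, already split into independent partitions.
partitions = [
    [{"user": "a", "amount": 20, "valid": True},
     {"user": "b", "amount": 5, "valid": False}],
    [{"user": "a", "amount": 7, "valid": True}],
    [{"user": "c", "amount": 11, "valid": True},
     {"user": "a", "amount": 3, "valid": True}],
]


def map_partition(records):
    """Map stage: filter and pre-aggregate one partition on its own."""
    totals = {}
    for record in records:
        if not record["valid"]:
            continue  # drop unnecessary data before any reduce step
        user = record["user"]
        totals[user] = totals.get(user, 0) + record["amount"]
    return totals


def merge_totals(left, right):
    """Reduce stage: combine two small partial results."""
    merged = dict(left)
    for user, amount in right.items():
        merged[user] = merged.get(user, 0) + amount
    return merged


# Each partition is processed independently, so the map stage can run in parallel.
with ThreadPoolExecutor() as pool:
    partials = list(pool.map(map_partition, partitions))

# The single reduce runs last, over data that is already filtered and trimmed.
totals = reduce(merge_totals, partials, {})
```

The reduce step only sees one small dictionary per partition rather than every raw record, which is why keeping it last and minimal limits the amount of data that has to be brought together in one place.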
There is no coding involved. Instead you will see how big data tools can help solve some of the most complex challenges for businesses that generate, store, and analyze large amounts of data. The use cases are drawn from a variety of industries, including ecommerce and IT. Instructor Kumaran Ponnambalam shows how to analyze a problem, draw an architectural outline, choose the right technologies, and finalize the solution. After each use case, he reviews related best practices for real-time streaming, predictive analytics, parallel processing, and pipeline management. Each lesson is rich in practical techniques and insights from a developer who has experienced the benefits and shortcomings of these technologies firsthand.
- Components of a big data application
- Big data app development strategies
- Use cases: fraud detection and product recommendations
- Technology options
- Designing solutions
- Best practices