In this video, Jeff Winesett discusses AWS Auto Scaling. Auto Scaling provides a service to automate the horizontal scaling of cloud applications, and helps provide the elasticity the cloud promises.
- [Announcer] Few AWS services provide more benefit when implementing elasticity than Auto Scaling. Auto Scaling is a service focused on helping implement elasticity. Remember when I first talked about understanding elasticity? I used some graphs that demonstrated the issues that can occur when not allocating enough capacity, and the waste that is generated by allocating too much capacity in traditional environments. The solution was to take advantage of the elastic nature of the cloud.
Increase capacity exactly when load demands it, and decrease capacity when load lightens, always matching the right amount of capacity to demand. Auto Scaling is a service provided by Amazon that makes this easy. Start by defining the conditions under which the application should scale out or in, based on the number of desired EC2 instances. Then the Auto Scaling service will add or remove instances when these conditions are met.
Take, for example, the web application architected as depicted here. This application is distributed across two Availability Zones because it has been designed for failure, and doing so achieves a high degree of fault tolerance. During periods of normal to light load, this application works fine with just these two servers. Now imagine the traffic starts to pick up a little. CloudWatch is being used to monitor the CPU usage on the servers, and will trigger alarms when the CPU levels on the boxes rise.
With Auto Scaling, a condition can be defined such that if the CPU utilization rises above 75%, say for more than a minute, the application will automatically scale out by having two more instances added: one in Zone A, and another in Zone B. This will likely lower the CPU utilization across each server. If at any later time CPU levels again rise above 75%, two more instances will be added to accommodate the increase in demand.
This continues in such a manner, so that supply always matches demand. Auto Scaling has also been used to set up a condition such that if the average CPU utilization drops below 35%, instances will be removed: one from Zone A, and one from Zone B. This will continue to happen until the application gets back down to a defined minimum number of instances. Customers are happy because demand was met, and the accounting department is happy because only the needed resources were used, and they were paid for only while in use.
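The scale-out and scale-in behavior in this example can be sketched as a small simulation. This is only an illustration of the decision rule, not the Auto Scaling service or its API: the names `ScalingRule`, `next_instance_count`, and `simulate` are hypothetical, and the maximum of eight instances is an assumed limit that the example does not state.

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    """Hypothetical container for the thresholds used in the example."""
    scale_out_above: float = 75.0  # average CPU % that triggers a scale out
    scale_in_below: float = 35.0   # average CPU % that triggers a scale in
    step: int = 2                  # one instance per zone, across two zones
    min_instances: int = 2         # the defined minimum from the example
    max_instances: int = 8         # assumed upper limit, not from the source


def next_instance_count(current: int, avg_cpu: float, rule: ScalingRule) -> int:
    """Return the instance count after applying the rule to one CPU reading."""
    if avg_cpu > rule.scale_out_above:
        return min(current + rule.step, rule.max_instances)
    if avg_cpu < rule.scale_in_below:
        return max(current - rule.step, rule.min_instances)
    return current


def simulate(cpu_readings, rule=None, start=2):
    """Apply the rule to a sequence of average CPU readings, in order."""
    if rule is None:
        rule = ScalingRule()
    counts = []
    current = start
    for cpu in cpu_readings:
        current = next_instance_count(current, cpu, rule)
        counts.append(current)
    return counts
```

Feeding in a series of readings that rises past 75% and later falls below 35% shows the instance count growing two at a time and then shrinking back, never dropping under the minimum.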
Auto Scaling has three primary components. The first is called the launch configuration. This defines what to scale. This is where the EC2 instance size is defined, along with what AMI to use, what security groups to apply, and any storage needs. These are very much the same configuration settings that need to be defined when launching a new instance. The second is called an Auto Scaling group. This defines where to launch the instances, and also defines limits on the number of instances to launch, should certain events occur.
One of these is a desired capacity number. Auto Scaling will work to keep your number of instances equal to your desired capacity. So, with just these two components defined, Auto Scaling will provide some fault tolerance. The third component is optional, but is really important to fully implement real-time elasticity. It is called a scaling policy, and it defines when and under what conditions instance scaling should happen.
This is where CloudWatch alarms can be defined, based on certain metrics breaching specified thresholds. Referring to the initial example, it is in the scaling policy where the conditions are defined to have two more instances added if the average CPU utilization rises above 75%, and to have instances removed if that same metric falls below 35%. It's a good idea to have both a scale-out and a scale-in policy defined, in order to fully take advantage of elasticity.
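One way to picture how an alarm ties a metric threshold to a scaling action is the sketch below. It is a toy model rather than the CloudWatch API: `Alarm`, `POLICIES`, and `adjustment_for` are hypothetical names, and requiring two consecutive breaching datapoints is an assumed stand-in for a breach that lasts "more than a minute".

```python
import operator
from dataclasses import dataclass


@dataclass
class Alarm:
    """Hypothetical alarm: fires only on a sustained threshold breach."""
    name: str
    threshold: float
    comparison: str          # ">" or "<"
    evaluation_periods: int  # consecutive datapoints that must breach

    def in_alarm(self, datapoints):
        """True when the most recent datapoints all breach the threshold."""
        compare = {">": operator.gt, "<": operator.lt}[self.comparison]
        recent = datapoints[-self.evaluation_periods:]
        return (len(recent) == self.evaluation_periods
                and all(compare(value, self.threshold) for value in recent))


# Each policy pairs an alarm with a change in instance count.
# The thresholds come from the example; the period count is an assumption.
POLICIES = [
    (Alarm("cpu-high", 75.0, ">", 2), +2),  # scale out by two instances
    (Alarm("cpu-low", 35.0, "<", 2), -2),   # scale in by two instances
]


def adjustment_for(datapoints):
    """Return the instance-count change requested by the first firing alarm."""
    for alarm, delta in POLICIES:
        if alarm.in_alarm(datapoints):
            return delta
    return 0
```

Requiring more than one breaching datapoint keeps a single short spike from adding instances, which is the reason for the "more than a minute" condition in the example.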
The system needs to scale out to handle increases in load, but also needs to scale back in when load decreases, to keep costs low and avoid waste. Implementing elasticity rule #2: take advantage of the Auto Scaling service to greatly simplify the process of automating your scaling. This helps keep application users happy and business costs minimized.
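Looking back at the first two components, the following sketch models a launch configuration and an Auto Scaling group that keeps the instance count at the desired capacity, which is where the basic fault tolerance comes from. It is a simplified model and not the AWS API: the class and method names, the instance ID format, the rule of launching into the emptiest zone, and all values passed in (AMI ID, instance type, security group, zone names) are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LaunchConfiguration:
    """What to launch. Values supplied by callers are placeholders."""
    ami_id: str
    instance_type: str
    security_groups: tuple


@dataclass
class AutoScalingGroup:
    """Where to launch, and how many. A toy model, not the AWS service."""
    launch_config: LaunchConfiguration
    zones: tuple
    min_size: int
    max_size: int
    desired_capacity: int
    instances: dict = field(default_factory=dict)  # instance id -> zone
    _next_id: int = 0

    def _count_in(self, zone):
        """Number of running instances in the given zone."""
        return sum(1 for z in self.instances.values() if z == zone)

    def reconcile(self):
        """Launch or terminate instances until the count matches the target."""
        # Clamp the desired capacity to the group's limits.
        target = max(self.min_size, min(self.desired_capacity, self.max_size))
        while len(self.instances) < target:
            # Assumed balancing rule: launch into the zone with fewest instances.
            zone = min(self.zones, key=self._count_in)
            self._next_id += 1
            self.instances[f"i-{self._next_id:04d}"] = zone
        while len(self.instances) > target:
            # Assumed balancing rule: terminate from the fullest zone.
            zone = max(self.zones, key=self._count_in)
            victim = next(i for i, z in self.instances.items() if z == zone)
            del self.instances[victim]
        return dict(self.instances)
```

If an instance disappears from `instances`, for example after a simulated failure, the next call to `reconcile` launches a replacement in the zone that lost it, so the group returns to its desired capacity without any scaling policy being defined.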
- Benefits of cloud services
- Making architectures scalable
- Examining cloud constraints
- Virtual servers, EC2, and Elastic IP
- Using the Amazon Machine Image
- Elastic load balancing
- Using CloudWatch for monitoring
- Security models
- Elastic block storage
- S3, CloudFront, and Elastic Beanstalk
- Handling queues, workflows, and notifications
- Caching options and services
- Identity and access management
- Creating a custom server image
- Application deployment strategies
- Serverless architectures