Once the instances are launched, you may find that even though the model is that of a scaling group, you get all three nodes right away, because the default is the maximum scale. This may not be appropriate, so in this video you'll learn how to define a simple policy that adjusts the scale to better match the number of nodes needed to run the system.
- [Instructor] The worker group that we deployed with CloudFormation auto scales to the maximum by default. In this module's directory we have an ASG policy file, which we can view with more. We're going to create an auto scaling policy against our group so that, rather than just sitting at the maximum scale, which is the default for this model, the group tracks a parameter. In this case, we're going to look at the target CPU utilization for each of the nodes.
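The transcript doesn't show the file's contents, but a target tracking configuration for EC2 Auto Scaling typically looks something like this minimal sketch (the filename asg-policy.json and the exact contents are assumptions, not taken from the video):

    {
      "TargetValue": 70.0,
      "PredefinedMetricSpecification": {
        "PredefinedMetricType": "ASGAverageCPUUtilization"
      },
      "DisableScaleIn": false
    }

This is the shape the AWS CLI expects when a target tracking configuration is passed in from a file.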
And we want to target a number; I find that 70% usually works pretty well. So if the average CPU utilization falls below the 70% target, the policy will remove a node from the group rather than leaving idle nodes in place. We're going to do this in the console. Here we're in the EC2 dashboard, which you reach from Services, under Compute, EC2, and we want to find auto scaling. In the menu on the left, scroll down; about halfway down the page is Auto Scaling, and we want Auto Scaling Groups.
Now within the groups we'll have this list, and sometimes the detail view isn't very visible; you can change the size of the view here or just drag it up. We want to go to Scaling Policies, and we'll see that by default there aren't any. If we look at the actual state of our auto scaling group, we see we have three instances: even though the minimum is one, we have three, which is the maximum, because that's what the system desires by default. You allow up to three, so it gives you three. We're going to add a policy, call it scale70, and track the average CPU utilization.
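If you'd rather check the group's capacity from the command line than squint at the console, here's a sketch using the AWS CLI (the group name eks-workers is a placeholder for whatever your CloudFormation stack actually created):

    # Show min, max, and desired capacity for the worker group.
    # "eks-workers" is a placeholder group name; substitute your own.
    aws autoscaling describe-auto-scaling-groups \
      --auto-scaling-group-names eks-workers \
      --query "AutoScalingGroups[0].{Min:MinSize,Max:MaxSize,Desired:DesiredCapacity}"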
There are other metrics to track here, such as network in and out and load balancer request counts, but CPU utilization is a reasonable approach for general purpose compute. We're going to set a target value of 70, because that's percent of CPU. The warm-up setting asks how many seconds instances need to warm up after scaling; ours don't need any, because they're going to connect into the Kubernetes environment, get configured, and then start being used, so there's nothing extra to wait for before the resources are usable. Disable scale-in would stop the policy from ever removing instances on its own; we leave it unchecked so the group can scale in. The CLI sketch below shows the same settings.
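For reference, the same policy can be created from the CLI rather than the console; a sketch, assuming the placeholder names scale70 and eks-workers and the configuration file shown earlier:

    # Attach a target tracking policy equivalent to the console steps above.
    # Policy and group names are placeholders; asg-policy.json holds the
    # target tracking configuration (70% average CPU, scale-in enabled).
    aws autoscaling put-scaling-policy \
      --auto-scaling-group-name eks-workers \
      --policy-name scale70 \
      --policy-type TargetTrackingScaling \
      --target-tracking-configuration file://asg-policy.json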
Again, for general purpose use this is fine; obviously, you can play with these parameters. We're going to click Create. What will happen is, after a period of time, once the policy runs through a few metrics cycles, we'll see it start to remove nodes from the environment. We're not going to sit and wait for that, because it takes a little while. Instead, we can look at it another way: we can go back to our command line and run kubectl get nodes to see how many nodes the system has.
We still have three. If we come back and check in five minutes, we'll probably be down to one, because there's currently nothing generating any load on this system. Here, after three minutes, we can see that the scale-in is starting to happen. It'll take another couple of minutes before it scales down to a single node.
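If you want to watch the scale-in happen live rather than re-running the command by hand, kubectl can stream node changes:

    # One-off check of the current node count.
    kubectl get nodes

    # Stream updates as the auto scaling group removes nodes.
    kubectl get nodes --watch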