- [Instructor] When we talk about building apps for the cloud it's good to think about scaling from the outset. Planning to scale starts with how you build your app. When you build an application, you hope it will attract a wide audience. Of course the more concurrent users your app attracts, the more you're going to need to prepare to make sure your app has the capacity to serve them well. Let's assume that your application is reasonably well-performing on its own. Your app has the basic traits of a well-performing application. The server requests are small, and as fast as possible.
Network calls and file I/O are minimized, or avoided when possible. And resource-intensive processes are run asynchronously in the background, not occupying web workers. You've done all that and more, and you're just running up against the limits of your application server. Maybe you're close to maxing out CPU, or RAM. It's time to scale. Scaling a web app comes in two flavors: vertical and horizontal. Vertical scaling means to simply increase the resources on a single server. This means adding hardware, more RAM, or a faster CPU.
Or in AWS terms, it would mean changing to a more powerful instance type. Whether it's an AWS instance or physical hardware, scaling vertically means incurring some downtime on that machine as you upgrade. Even an EC2 instance has to at least reboot when changing type. Horizontal scaling refers to distributing your application over multiple servers. Each server is alike, running a copy of your application. A device called a load balancer, such as Amazon's ELB, sits in front of the group and coordinates traffic among them. Horizontal scaling has a number of advantages over vertical scaling.
It can be cheaper to achieve the same capacity. It gets expensive to build and run highly provisioned machines. Price tends to scale up faster when you scale vertically. It can provide fault tolerance. If a single-host web app has hardware problems, you're toast. In a multi-host environment, hardware failures may mean you lose capacity for a while, but they don't necessarily mean you're offline. Finally, there's no downtime to add capacity. When scaling horizontally, increasing or decreasing capacity is much easier, especially if you use services like AWS Auto Scaling Groups, which allow you to define scaling criteria that add or remove EC2 instances when you need them.
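The scaling-criteria idea can be sketched in plain code. The toy function below is only an illustration of the decision an Auto Scaling Group makes for you; the function name, thresholds, and group sizes are hypothetical, not real AWS parameters or defaults.

```python
# Toy model of auto scaling criteria: given the group's average CPU
# utilization, decide how many instances the group should have.
# All names and numbers here are illustrative, not AWS defaults.

def desired_capacity(current: int, avg_cpu: float,
                     scale_out_at: float = 70.0, scale_in_at: float = 30.0,
                     min_size: int = 2, max_size: int = 10) -> int:
    """Return the new instance count for the group."""
    if avg_cpu > scale_out_at:
        return min(current + 1, max_size)   # add an instance, respect the ceiling
    if avg_cpu < scale_in_at:
        return max(current - 1, min_size)   # remove an instance, respect the floor
    return current                          # inside the comfortable band: no change

print(desired_capacity(3, avg_cpu=85.0))  # busy group scales out
print(desired_capacity(3, avg_cpu=20.0))  # idle group scales in
```

In a real Auto Scaling Group you declare this policy instead of coding it: you set the minimum, maximum, and the CloudWatch metric thresholds, and AWS launches or terminates instances on your behalf.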
AWS has no service that auto scales instances vertically. So horizontal scaling is great, right? Yes, but your app needs to be built with it in mind. First, you must make sure to separate the tiers of your application. You might be used to installing a database like MySQL on the same host as your web application. This image represents a single web server configured in just that way. The app server talks to the database port, like so. But when you're horizontally scaling, the idea is to distribute load over multiple copies of your server, and load balance with something like an ELB.
So what then? Do you build a new host with its own database? No, the databases will be immediately out of sync, and your app will be broken. You could point the new host to the first server's database, but that's a disaster waiting to happen. What happens if the first web server experiences hardware failure? Your database goes with it. Instead, you need to separate your concerns. Dedicate the app server to being just an app server. Build the database separately, then point the app server there. Then new app servers can be built to replicate the first, and point to the same database.
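In practice, "point the app server at the separate database" usually comes down to configuration. A minimal sketch, assuming a hypothetical APP_DATABASE_HOST environment variable and hostname; your framework's database config file plays the same role:

```python
import os

# Hypothetical setup: each app server learns where the shared database
# tier lives from its environment, rather than assuming localhost.
# The variable name and hostname below are illustrative.
os.environ.setdefault("APP_DATABASE_HOST", "db.internal.example.com")

def database_host() -> str:
    # Falling back to localhost is exactly what breaks horizontal scaling;
    # every copy of the app server should resolve to the same database tier.
    return os.environ.get("APP_DATABASE_HOST", "localhost")

print(database_host())
```

Because every new app server gets the same configuration, adding a copy behind the load balancer requires no changes to the database tier.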
Of course, you'll need a load balancer at this point. This multi-tier model allows you to scale each layer independently. Load balancers and database engines tend to scale better than application servers. So you might end up with a diamond-shaped architecture, like this. Also critical to supporting horizontal scaling: avoid the file system. Remember this picture? The problem was that a critical resource needed by every application server, in this case the database, was locked to a single instance. In fact, you can get into trouble with any use of the local file system.
For instance, if you are storing user session information on the file system, your users will experience problems when you scale to multiple hosts. Imagine you have your nice n-tier architecture, but you're still storing session information on the file system. If this user's shopping cart is stored on server A, what happens when the load balancer sends their next page click to server B? The cart is gone. The same problem can occur if you allow users to upload data, say, profile pictures, to your app. If they get stored to a local file system, users routed to other hosts will not see them.
Some load balancers support the concept of sticky sessions, meaning the load balancer is smart enough to always send this particular user to node A. But this is not something we want to rely on. Instead, look at alternate solutions. Chances are that your application framework allows you to specify where session information is stored. Ruby on Rails, for instance, lets you specify Active Record as its session store. If you point Active Record to a remote database, you're good. That's point one: use database-backed or cookie session stores.
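The database-backed idea can be sketched with nothing but the standard library. Here an in-memory SQLite database stands in for the shared remote database (a framework store like Rails' Active Record session store does the equivalent for you); the session ID and cart contents are made up for the example:

```python
import sqlite3

# Toy database-backed session store. In production the connection would
# point at the shared database tier, not an in-memory SQLite database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (session_id TEXT PRIMARY KEY, cart TEXT)")

def save_cart(session_id: str, cart: str) -> None:
    # Upsert the cart for this session ID (the ID travels in the user's cookie).
    db.execute("INSERT OR REPLACE INTO sessions VALUES (?, ?)", (session_id, cart))

def load_cart(session_id: str) -> str:
    row = db.execute("SELECT cart FROM sessions WHERE session_id = ?",
                     (session_id,)).fetchone()
    return row[0] if row else ""

# "Server A" handles the add-to-cart request...
save_cart("abc123", "widget,gadget")

# ...and "server B" sees the same cart, because the store is shared.
print(load_cart("abc123"))
```

Any app server holding the user's session ID can read the same row, so it no longer matters which node the load balancer picks.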
If you need to let users upload assets, consider storing those in the database, or even better, look at cloud-based storage like S3. That's point two. Finally, if your application absolutely requires a local volume to function, take a look at one of AWS's newest offerings, Elastic File System, or EFS. Unlike Elastic Block Store, EFS can scale itself, and be mounted by multiple instances, meaning your application servers can share the volume, and horizontal scaling still works.
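The principle behind the upload fix can be sketched abstractly. The ObjectStore class below is a toy stand-in for a shared service like S3 (real code would use an SDK such as boto3); the class, key, and payload are invented for illustration. The point is simply that uploads live somewhere every app server can reach, not on one host's disk:

```python
# Toy shared object store, standing in for something like S3.
# Keys look like paths, values are raw bytes.

class ObjectStore:
    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

shared = ObjectStore()

# "Server A" handles the profile-picture upload...
shared.put("avatars/user42.png", b"\x89PNG...")

# ...and "server B" can serve it back, because the store is not local.
print(len(shared.get("avatars/user42.png")))
```

Swap the toy class for S3 (or a shared EFS mount) and the behavior is the same: the asset is visible to every host behind the load balancer.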