From the course: AWS Essential Training for Architects (2019)

Caching: AWS ElastiCache - Amazon Web Services (AWS) Tutorial

Caching: AWS ElastiCache

- [Instructor] For applications that read data from persistent storage on disk, such as a traditional database, few approaches add more performance than an in-memory caching layer between the application's business logic and the database, especially for data that is accessed frequently, sometimes referred to as hot data. Ideally, the caching layer should be placed as close to the application server as possible, and for a very simple architecture consisting of only one server, this could be achieved by adding an in-memory cache directly to that single app server. However, that architecture is not optimized for scaling. If the application tier is ever going to scale horizontally, it's best not to store cached data on each individual server: it would result in too many cache misses, the data could easily get out of sync, and it would be hard to maintain.

Enter Amazon ElastiCache. With just a few lines of code added to an application, a caching layer can be added that greatly improves speed, increases throughput, and is more cost-effective than trying to scale out your own database layer. ElastiCache supports two popular caching engines, Memcached and Redis.

Memcached has been around since 2003 and is a tried-and-tested caching engine. It's popular and well supported in most application languages and frameworks. It is a fairly flat cache: it offers a single datatype for stored data, a string, with a default size limit of one megabyte, though newer versions can be configured to support larger objects. Memcached has no persistence, so if something happens to one of the nodes in a Memcached cluster, the data on that node is lost. Memcached is a distributed cache, built with scalability in mind.

Redis is the other caching engine supported by ElastiCache. It supports everything that Memcached does but adds some additional features. It supports the string type as well, but allows values of up to 512 megabytes. Redis supports persistence, so if the data stored in the cache needs to survive a failure, Redis can provide that. Business logic can be written in the Lua scripting language and executed in memory on the cache. It is highly available and supports a master and read-replica architecture. Redis also supports other datatypes, such as sets, sorted sets with scoring, lists, and hashes.
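To make that concrete, here is a minimal sketch using the redis-py client (assuming a recent version). The endpoint hostname is a hypothetical placeholder; with ElastiCache you would use your cluster's endpoint from the AWS console, and the keys here are made up for illustration.

```python
import redis

# Hypothetical endpoint; substitute your ElastiCache cluster's endpoint.
r = redis.Redis(host="my-cache.example.cache.amazonaws.com", port=6379,
                decode_responses=True)

# The string type -- the only type Memcached offers; Redis allows up to 512 MB.
r.set("user:42:name", "Ada")
print(r.get("user:42:name"))  # "Ada"

# A hash: field/value pairs stored under a single key.
r.hset("user:42", mapping={"name": "Ada", "plan": "pro"})

# A sorted set with scoring, e.g. a leaderboard.
r.zadd("leaderboard", {"ada": 2048, "grace": 4096})
print(r.zrevrange("leaderboard", 0, 1, withscores=True))
```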
So how does one choose? That depends, of course, on the specific application being built and the use cases and requirements of the project. Memcached is a good choice when the requirements are basic. If you're using a language that already has sophisticated support for Memcached, or Memcached is baked into the framework you're using, it might be the right choice. However, because Redis has everything Memcached has plus many more options and features, it has the potential to add more value and better support an application as it grows in complexity and future feature demands. So consider using Redis unless there are specific reasons why Memcached is a better fit.

There are a couple of patterns used when adding basic caching to an application; both are sketched in code below. One is referred to as write-through. In the write-through approach, the data is first written to a persistence tier, such as a database, and then immediately afterward written to the cache. In this pattern, data is stored in the cache every time new data is added, regardless of whether users of the application ever request it for reading.

The other common pattern is called lazy loading. The lazy-load approach first attempts to retrieve the data from the cache. If it's not found there, an attempt is made to retrieve it from the database, and once retrieved, it is stored in the cache. In this approach, no data is added to the cache until the first time it is requested.

There are pros and cons to both approaches. With write-through, all data is stored in the cache, which increases the cache hit rate for requested data, but it can require much more memory to hold all of that data, even data that is never actively requested. With lazy loading, only the needed data is set in the cache, but there is a higher chance of a miss, which has a negative performance impact the first time a piece of data is requested.

In practice, both approaches are often combined with a time-to-live, or TTL, setting. A TTL expires data after a period of time, which frees up memory, so it can ease the concern about write-through consuming too much memory with unneeded data.
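Here is a minimal sketch of the write-through pattern described above, assuming redis-py and SQLite as the persistence tier; the endpoint, table, and key names are hypothetical.

```python
import sqlite3
import redis

# Hypothetical endpoint and schema, for illustration only.
cache = redis.Redis(host="my-cache.example.cache.amazonaws.com", port=6379,
                    decode_responses=True)
db = sqlite3.connect("app.db")
db.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")

def save_user(user_id, name):
    # Write-through: persist to the database first...
    db.execute("INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
               (user_id, name))
    db.commit()
    # ...then immediately write the same data to the cache,
    # whether or not anyone ever reads it.
    cache.set(f"user:{user_id}", name)
```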
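And a sketch of the lazy-load pattern, reusing the connections above, with a TTL so idle entries eventually expire; the five-minute value is an arbitrary example.

```python
def get_user(user_id):
    key = f"user:{user_id}"
    # Lazy load: try the cache first.
    name = cache.get(key)
    if name is not None:
        return name  # cache hit
    # Cache miss: fall back to the database.
    row = db.execute("SELECT name FROM users WHERE id = ?",
                     (user_id,)).fetchone()
    if row is None:
        return None
    # Store the result for next time, with a five-minute TTL (ex=300)
    # so data that stops being requested frees its memory.
    cache.set(key, row[0], ex=300)
    return row[0]
```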
Using the ElastiCache service can make data reads 10 to 15 times faster than reading from the database. It reduces load on the database and can prevent having to scale the database tier up or out, which saves money. With any in-memory caching solution, the available memory limits the amount of data that can be stored, so it can still be constraining, and it may not be the only step needed to optimize application performance, but it's certainly one of the best places to start. So, when architecting applications on AWS, consider using ElastiCache as an in-memory data access solution to eliminate bottlenecks in database data access and speed up the overall delivery of application data.