From the course: AWS: High Availability

Exploring high availability in Route 53 - Amazon Web Services (AWS) Tutorial

Exploring high availability in Route 53

- [Instructor] Route 53 is a highly available AWS service that provides Domain Name System, or DNS, services. Route 53 itself is considered a highly available component. Let's look at how you can configure DNS failover to support the design of highly available multi-region systems. Route 53 can perform health checks on three types of resources. The first type is an endpoint. An endpoint can be a domain name or an IP address, and the health check can be based on TCP, or on HTTP or HTTPS with or without string matching. The second type is known as a calculated health check. It's a health check that monitors other health checks. If that sounds a bit meta, it's because it is. Third, Route 53 can monitor CloudWatch alarms. This is very useful if you need to fail over based on an alarm you've configured. Let's look under the hood at how endpoint health checks are evaluated. First, endpoints have to respond successfully in a timely manner. For TCP checks, the connection must be established within 10 seconds. Meanwhile, HTTP and HTTPS checks get a total of six seconds to respond. The six seconds are divided into two parts. Four seconds are allocated to establishing a TCP connection. The remaining two seconds are allotted for receiving a successful HTTP status code. For HTTP or HTTPS checks, success means an HTTP status code in the 2xx or 3xx range. If you want to go a step further and confirm that your site is functional, an advanced feature is the ability to look for a string in the first 5,120 bytes of the response body. If the string isn't found, the health check fails. Note that pattern matching with regular expressions is not supported. The second characteristic has to do with failure rate.
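The first characteristic, a timely and successful response, can be sketched in a few lines of Python. This is only an illustration of the rules described in the transcript, not Route 53's actual implementation. The `Probe` class and both function names are hypothetical, and the 5,120-byte search window is the figure given in the AWS documentation.

```python
from dataclasses import dataclass
from typing import Optional

# Limits as described in the transcript (seconds).
TCP_CONNECT_LIMIT_S = 10.0    # TCP checks: time allowed to connect
HTTP_CONNECT_LIMIT_S = 4.0    # HTTP/HTTPS checks: time allowed to connect
HTTP_RESPONSE_LIMIT_S = 2.0   # HTTP/HTTPS checks: time allowed for a status code
SEARCH_WINDOW_BYTES = 5120    # string matching only inspects this many body bytes


@dataclass
class Probe:
    """Hypothetical record of a single probe against an endpoint."""
    connect_seconds: float
    response_seconds: float = 0.0
    status_code: Optional[int] = None
    body: bytes = b""


def tcp_check_passes(probe: Probe) -> bool:
    """A TCP check passes if the connection was made in time."""
    return probe.connect_seconds <= TCP_CONNECT_LIMIT_S


def http_check_passes(probe: Probe, search_string: Optional[str] = None) -> bool:
    """An HTTP(S) check passes if it connects and answers in time with a
    2xx or 3xx status and, when requested, the search string appears
    within the first SEARCH_WINDOW_BYTES bytes of the body."""
    if probe.connect_seconds > HTTP_CONNECT_LIMIT_S:
        return False
    if probe.response_seconds > HTTP_RESPONSE_LIMIT_S:
        return False
    if probe.status_code is None or not 200 <= probe.status_code < 400:
        return False
    if search_string is not None:
        # Plain substring search; regular expressions are not supported.
        return search_string.encode() in probe.body[:SEARCH_WINDOW_BYTES]
    return True
```

Note that a page can return a 200 status and still fail the check if the search string only appears after the first 5,120 bytes of the body.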
The number of consecutive failures that marks an endpoint as unhealthy is configured when you create a health check. Route 53 health checks originate in a variety of AWS regions around the world. To allow for regional networking variances, Route 53 considers an endpoint healthy if more than 18% of its health checkers evaluate the endpoint as healthy. Calculated health checks look at other health checks. It's a hierarchical concept in which a parent health check can monitor the status of up to 255 child health checks. A useful thing about calculated health checks is that you can configure how many child health checks must be healthy for the parent to count as healthy. For example, suppose you run an application that uses APIs from two external sources. Using Route 53, you establish two health checks, one for each API endpoint. You then create a calculated health check, which reports healthy as long as both underlying health checks are healthy. If one of your API providers goes offline, its child health check would show unhealthy, causing the parent health check to show unhealthy. For health checks based on CloudWatch alarms, what you really need to think through are the conditions for triggering an unhealthy state. As expected, when the CloudWatch alarm is in a state of OK, the Route 53 health check shows healthy. Similarly, when the alarm is in a state of ALARM, Route 53 shows unhealthy. However, when there's not enough data and the alarm's state is INSUFFICIENT_DATA, you have the option of defaulting to healthy or unhealthy. Alternatively, you can elect to use the last known status. You'll likely want to take action in case your health check fails. As is the case with many AWS services, Route 53 health checks can be configured to create alarms and post to SNS topics for notification purposes. One design pattern is a simple active-passive DNS failover.
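The three decision rules just described (the 18% agreement threshold, the parent and child relationship, and the CloudWatch state mapping) can be written out as a rough Python sketch. The function names are hypothetical and this is not how Route 53 is implemented internally. The strings "Healthy", "Unhealthy", and "LastKnownStatus" are assumed here to match the values Route 53 accepts for its insufficient-data setting.

```python
from typing import Iterable

HEALTHY_AGENT_THRESHOLD_PERCENT = 18  # "more than 18%" of health checkers
MAX_CHILD_HEALTH_CHECKS = 255         # limit stated in the transcript


def endpoint_is_healthy(healthy_agents: int, total_agents: int) -> bool:
    """Healthy if strictly more than 18% of health checkers say healthy."""
    if total_agents <= 0:
        raise ValueError("total_agents must be positive")
    # Integer comparison avoids floating-point rounding at the boundary.
    return healthy_agents * 100 > HEALTHY_AGENT_THRESHOLD_PERCENT * total_agents


def calculated_check_is_healthy(child_statuses: Iterable[bool],
                                health_threshold: int) -> bool:
    """Parent is healthy if at least health_threshold children are healthy."""
    children = list(child_statuses)
    if len(children) > MAX_CHILD_HEALTH_CHECKS:
        raise ValueError("too many child health checks")
    return sum(children) >= health_threshold


def alarm_backed_check_is_healthy(alarm_state: str,
                                  insufficient_data_status: str,
                                  last_known_healthy: bool = True) -> bool:
    """Map a CloudWatch alarm state onto a health check status."""
    if alarm_state == "OK":
        return True
    if alarm_state == "ALARM":
        return False
    if alarm_state != "INSUFFICIENT_DATA":
        raise ValueError(f"unknown alarm state: {alarm_state}")
    # Not enough data: fall back to the configured behaviour.
    if insufficient_data_status == "Healthy":
        return True
    if insufficient_data_status == "Unhealthy":
        return False
    if insufficient_data_status == "LastKnownStatus":
        return last_known_healthy
    raise ValueError(f"unknown setting: {insufficient_data_status}")
```

For the two-API example, `calculated_check_is_healthy([True, False], health_threshold=2)` evaluates to `False`, because only one of the two child checks is healthy.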
Suppose you operate a website out of a US West region with a warm standby in one of the US East regions, and you've set up health checks to monitor whether or not your site is up. Under normal operations, Route 53 sends people to US West. If the Route 53 health check detects an application failure, the DNS records that depend on the health check are deactivated while the backup DNS records become active. This results in network traffic being redirected to your warm standby in about three minutes.
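As a rough illustration of the active-passive pattern, the sketch below builds a primary and a secondary failover record and simulates which one answers. The dictionaries are assumed to follow the record set shape used by the Route 53 API (`SetIdentifier`, `Failover`, `HealthCheckId`), but nothing here calls AWS. The helper names, the domain, the IP addresses, and the health check ID are all hypothetical placeholders.

```python
from typing import Dict, Optional


def failover_record(name: str, ip_address: str, role: str,
                    health_check_id: Optional[str] = None,
                    ttl: int = 60) -> Dict:
    """Build a failover record set dictionary (hypothetical helper)."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{name}-{role.lower()}",
        "Failover": role,
        "TTL": ttl,  # a short TTL lets resolvers pick up a failover sooner
        "ResourceRecords": [{"Value": ip_address}],
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return record


def answer_for(primary: Dict, secondary: Dict, primary_healthy: bool) -> str:
    """Simulate the failover decision: serve the primary while it is
    healthy, otherwise serve the warm standby."""
    chosen = primary if primary_healthy else secondary
    return chosen["ResourceRecords"][0]["Value"]


# Placeholder values: documentation-range IPs and a made-up health check ID.
primary = failover_record("www.example.com.", "192.0.2.10", "PRIMARY",
                          health_check_id="hypothetical-health-check-id")
secondary = failover_record("www.example.com.", "198.51.100.20", "SECONDARY")
```

With these placeholders, `answer_for(primary, secondary, primary_healthy=False)` returns the standby address. In a real setup Route 53 makes this decision itself, based on the health check attached to the primary record.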
