Creating a Highly Available Architecture in AWS
- Amit Dhanik

- Jun 16, 2021
- 11 min read
Updated: Jun 25, 2021
Hi guys, today I am going to discuss how we can build a simple yet highly available architecture on AWS, and some of the basic features you should always have while designing your application architecture on the cloud. The goal is an application that suffers minimal performance degradation, is highly available, and can be scaled easily.
I have discussed all the services used in this architecture (more of a service overview) and the proper approach to take when building such an application from scratch. So without wasting any time, let's jump in.
Disclaimer - Readers are expected to have basic knowledge of AWS services, although I have tried my best to explain everything from the basics. Feel free to ask your questions in the comment section.

Suppose we have a mobile application hosted on AWS. The primary requirements of every application hosted on AWS are generally the same. These are -
An application must be Highly Available - Suppose we have hosted our application in 2 Availability Zones (A.Z 1 and A.Z 2). An A.Z can be thought of as a data center where your application is hosted. An A.Z resides inside a Region, and a Region can have multiple A.Zs which are isolated and physically separated from each other within a geographic area. For example, if you want to launch an application that serves people in India, such as Cred, you would want to host it in the Asia Pacific (Mumbai) region. As of now, AWS provides 3 A.Zs in the Mumbai region, so you can launch your application in any one of them. Remember, these 3 A.Zs in Mumbai are physically separated by a meaningful distance. This means that if any one of the Availability Zones goes down, the other 2 are still available and functioning.
This is an important characteristic of A.Zs, and it overcomes a significant drawback of the data centers companies used to run themselves. Earlier, companies faced huge problems: if their data center went down due to some mishap (common reasons for downtime include software upgrades, OS upgrades, file/data corruption, defective application code, user error, etc.), it would halt all ongoing activities and set the business back. Now, with the option of hosting your application in multiple A.Zs, there is rarely any downtime, and you also don't have to maintain a huge data center of your own (the extra cost is minimized by making use of AWS A.Zs).
So, how do you make your application Highly Available?
Deploy your application across multiple AZs in the same region for fault tolerance (if one A.Z goes down, your application still functions) and for low latency. You should never depend on one A.Z if you care about the availability and performance of your application. This is a big RED FLAG for your architecture. Here we can see that our application is hosted in two Availability Zones inside the VPC. Below is a representation of 3 A.Zs in the EU region.
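If you want to check which A.Zs a region offers before deciding where to deploy, here is a minimal boto3 sketch. It assumes AWS credentials are already configured, and ap-south-1 (Mumbai) is just an example region:

```python
import boto3

# Assumes AWS credentials are configured; ap-south-1 (Mumbai) is an example region
ec2 = boto3.client("ec2", region_name="ap-south-1")

response = ec2.describe_availability_zones()
for az in response["AvailabilityZones"]:
    print(az["ZoneName"], az["State"])  # e.g. ap-south-1a available
```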

After you have decided to take your application online, how do clients contact your website? How does your site come up when someone types in the domain name? Let's discuss how all of this is achieved using our first service, Route 53.
Route 53 is the DNS service provided by Amazon. Like a phone book, Route 53 helps you look up the IP address behind any name listed in the internet's phone book. What does a DNS do? A DNS converts a human-friendly domain name (e.g. youthindiaspeak.com) into an Internet Protocol (IP) address (e.g. 185.230.63.171). Why?
IP addresses are used by computers to identify each other on the network. Computers don't understand the human-readable format and hence talk in numbers. We have two different forms of IP addresses -
IPv4 (32 bits)
IPv6 (128 bits)
Note - In AWS, by default, all VPCs and subnets must have IPv4 CIDR blocks; you can't change this behavior. You can optionally associate an IPv6 CIDR block with your VPC.
Amazon Route 53 is a global service; it does not require you to select a region. It is used to connect user requests to infrastructure running on AWS – such as Amazon EC2 instances, Elastic Load Balancing load balancers, or Amazon S3 buckets – and can also be used to route users to infrastructure outside of AWS. The main features provided by Route 53 are DNS management, traffic management, health checks to monitor your applications and web resources, and a domain registration facility.

FUNCTIONING OF ROUTE 53 (Extra Details, beginners can skip this part)
So, assuming that we have registered our domain with Route 53, we can create record sets for our hosted zones. To route traffic to your resources, you create records, also known as resource record sets, in your hosted zone. After the DNS request reaches Route 53, it sends the IP back to the client that requested it: the client types in the domain name, the DNS request hits Route 53, Route 53 resolves the domain name to an IPv4 address and returns the mapped IP to the client, and using that IP the client can communicate with the EC2 instance where the application is hosted. Here, an A record is used to resolve IPv4 addresses. This all happens in milliseconds!! The image below can help you visualize what's going on when you search for a website.
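For example, creating (or updating) an A record in a hosted zone looks roughly like this with boto3. The hosted zone ID, domain name, and IP below are placeholders, not real values:

```python
import boto3

route53 = boto3.client("route53")

# Placeholder values - replace with your own hosted zone ID, domain, and instance IP
route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",  # create the record, or update it if it already exists
            "ResourceRecordSet": {
                "Name": "www.example.com",
                "Type": "A",                      # an A record maps a name to an IPv4 address
                "TTL": 300,                       # resolvers cache the answer for 300 seconds
                "ResourceRecords": [{"Value": "203.0.113.10"}],
            },
        }]
    },
)
```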

Note - When we send a DNS request for a URL, Route 53 gives us back the IP address, and the response is cached (for example by the web browser or the DNS resolver) along with a TTL (time to live). The time for which a DNS resolver caches a response is set by this TTL value, which is associated with every record. It means that until the TTL expires, the resolver should not ask Route 53 for the same IP again. When the TTL expires, it sends the request again and gets the updated data. Since many clients may be resolving the same application, the TTL makes sure Route 53 is not bombarded with traffic and is not loaded all the time. Amazon Route 53 does not have a default TTL for any record type. You must always specify a TTL for each record so that caching DNS resolvers cache your DNS records for the length of time specified by the TTL.
You might want to look at this video to understand what could potentially go wrong with your DNS.
Hence, you always have to strike a balance between how long the values should be cached and how much pressure goes on the DNS. You can also select different types of routing to your web server. Suppose you have hosted a website on an EC2 instance and it goes down. What do you do then? For these purposes, Route 53 comes with various routing policies, like the failover routing policy, in which you fail over to another A.Z from where your website can still be accessed when the primary one goes down. Route 53 monitors the health of your primary site using health checks, which monitor the health of your endpoints.
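Here is a rough boto3 sketch of how failover routing is wired up. All IDs, names, and IPs are placeholders, and the matching SECONDARY record for the standby endpoint is omitted for brevity:

```python
import boto3

route53 = boto3.client("route53")

# Health check against the primary endpoint (placeholder IP)
hc = route53.create_health_check(
    CallerReference="primary-hc-001",  # any unique string
    HealthCheckConfig={
        "IPAddress": "203.0.113.10",
        "Port": 80,
        "Type": "HTTP",
        "ResourcePath": "/",
        "RequestInterval": 30,
        "FailureThreshold": 3,
    },
)

# Primary failover record tied to the health check; a similar record with
# "Failover": "SECONDARY" would point at the standby endpoint
route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "www.example.com",
                "Type": "A",
                "SetIdentifier": "primary",
                "Failover": "PRIMARY",
                "TTL": 60,
                "ResourceRecords": [{"Value": "203.0.113.10"}],
                "HealthCheckId": hc["HealthCheck"]["Id"],
            },
        }]
    },
)
```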
IGW - Internet Gateway (0.0.0.0/0)
An IGW, in simple terms, is a virtual router that connects your VPC (all your applications sit inside a VPC, except for global services) to the internet. It allows communication between instances in your VPC and the internet with the help of route tables. A route table contains a set of rules, called routes, that are used to determine where network traffic from your subnet or gateway is directed. A subnet can be associated with only one route table at a time (subnet association), but multiple subnets can be attached to a single route table (read that again!!).
Eg - Consider your mobile as your instance and the Wi-Fi box as the Internet Gateway. If you do not have a Wi-Fi box, you won't be able to talk/connect to the internet. So, the IGW allows the resources in your VPC to talk to the internet. Just like you can connect your mobile to only one Wi-Fi network at a time, you can attach only one Internet Gateway at a time to one VPC. You can have more than one IGW created, but you cannot have more than one IGW attached to a VPC at any point in time.
Also, if you have resources live in your VPC that are using public or Elastic IP addresses, for example an EC2 instance with a public IP, you cannot detach your IGW from the VPC.
So, the IGW helps the resources in your VPC communicate with the internet. It performs network address translation for instances that have a public IP address. Your instance always has a private IP and may also be assigned a public IP. When traffic leaves the VPC and goes to the internet, the source address field is set to the public IP on which the reply from the internet is expected. When the response is received, the destination address is translated back into the instance's private IPv4 address before the traffic is delivered to the VPC. The Internet Gateway does this one-to-one NAT on behalf of your instance, and thus the private IP is never exposed. Remember, a public IP or Elastic IP is a must for communication over the internet. If you want to give your instances internet access without attaching any public IP to them, you will have to make use of NAT Gateways (discussed later), as private IPs are not reachable outside of our network.
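Putting the IGW and route table pieces together, a minimal boto3 sketch looks something like this. The VPC ID is a placeholder; the snippet only shows the calls involved:

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")
vpc_id = "vpc-0123456789abcdef0"  # placeholder - use your own VPC ID

# Create an Internet Gateway and attach it to the VPC (only one IGW per VPC)
igw = ec2.create_internet_gateway()
igw_id = igw["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

# Create a custom route table and add a default route (0.0.0.0/0) to the IGW;
# any subnet associated with this route table becomes a public subnet
rtb = ec2.create_route_table(VpcId=vpc_id)
rtb_id = rtb["RouteTable"]["RouteTableId"]
ec2.create_route(RouteTableId=rtb_id, DestinationCidrBlock="0.0.0.0/0", GatewayId=igw_id)
```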
Public vs Private IP
The AWS docs explain it well. I have posted a snippet of the information, which you may find useful if you don't want to go through the whole article.
Each instance that receives a public IP address is also given an external DNS hostname; for example, ec2-203-0-113-25.compute-1.amazonaws.com. We resolve an external DNS hostname to the public IP address of the instance from outside its VPC, and to the private IPv4 address of the instance from inside its VPC. The public IP address is mapped to the primary private IP address through network address translation (NAT).
A private IPv4 address is an IP address that's not reachable over the Internet. You can use private IPv4 addresses for communication between instances in the same network (EC2-Classic or a VPC). A Public IP address is an IPv4 address that's reachable from the Internet. You can use public addresses for communication between your instances and the Internet.
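If you want to see both addresses for a running instance yourself, a quick boto3 sketch (the instance ID is a placeholder) would be:

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")

# Placeholder instance ID - replace with one of your own instances
resp = ec2.describe_instances(InstanceIds=["i-0123456789abcdef0"])
instance = resp["Reservations"][0]["Instances"][0]

print("Private IP:", instance["PrivateIpAddress"])
print("Public IP:", instance.get("PublicIpAddress", "none assigned"))
print("Public DNS:", instance.get("PublicDnsName", ""))
```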
Hope this clears everything!!

ELASTIC LOAD BALANCER
A load balancer, as the name suggests, is used to balance the load across our servers. It accepts incoming traffic from clients (users) and routes the requests to the EC2 instances specified as targets in the load balancer configuration. A load balancer has listeners (like ears) on which it listens for incoming connection requests. Listeners have ports and protocols on which they operate. We have three types of load balancers.
Classic Load Balancer (legacy, rarely used now)
Network Load Balancer (Layer 4 - TCP, TLS, UDP, and TCP_UDP; e.g. TCP-80, TLS-443, UDP-53)
Application Load Balancer (Layer 7 - HTTP-80 or HTTPS-443)
EG -

Note - Before creating a load balancer, you must have 2 A.Zs functioning. This is a precondition for load balancers: you must specify subnets from at least two Availability Zones to increase the availability of your load balancer.
A load balancer serves as a single point of contact for clients. The load balancer distributes the incoming application traffic evenly across multiple targets, such as EC2 instances in multiple A.Zs (only if cross-zone load balancing is enabled). If cross-zone load balancing is disabled, each load balancer node distributes requests evenly across the registered instances in its own Availability Zone only. For an ALB, cross-zone load balancing is always enabled.
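For a Network Load Balancer, where cross-zone load balancing is off by default, it can be switched on with a single attribute call. A minimal sketch, assuming you already have the load balancer's ARN (the ARN below is a placeholder):

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="ap-south-1")

# Placeholder ARN - replace with your NLB's ARN
elbv2.modify_load_balancer_attributes(
    LoadBalancerArn="arn:aws:elasticloadbalancing:ap-south-1:123456789012:loadbalancer/net/my-nlb/0000000000000000",
    Attributes=[{"Key": "load_balancing.cross_zone.enabled", "Value": "true"}],
)
```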
How does a load balancer help increase the availability of your application?
If a single server in one of the A.Zs goes down, the load balancer redirects traffic to the remaining online servers. Load balancers are able to do this because they continuously monitor the health of the registered targets and route traffic only to the healthy targets. If the load balancer detects an unhealthy target, it stops routing traffic to that target until the target is healthy again. So, our load balancer is most effective when we ensure that we have at least one instance/registered target in each of our A.Zs. This way, if an instance in one A.Z goes down, the load balancer can send traffic to an instance in the other Availability Zone.
Here our load balancer is sending the traffic to our web servers, which are deployed across two A.Zs. Each A.Z has one public and one private subnet defined.
Note - Very important point - Our load balancer (2) is an internet-facing load balancer while load balancer (4) is an internal load balancer. What is the difference?
When we create a load balancer, we have to choose whether it will be an internet-facing load balancer or an internal load balancer. An internet-facing load balancer has a public IP address, and its DNS name is publicly resolvable. This is so that the load balancer can route requests from clients over the internet to the EC2 instances that are registered with it. Load balancer 4, on the other hand, is an internal load balancer. Internal load balancers have only private IP addresses, hence they can only route requests from clients that have access to the VPC.
Here, we are using both internal and internet-facing load balancers. This is because our application uses web servers that must be connected to the internet, and application servers that are only connected to the web servers. We have an internet-facing load balancer with the web servers registered with it, and an internal load balancer with the application servers registered with it. The web servers receive requests from the internet-facing load balancer and send requests to the application servers via the internal load balancer. The application servers receive requests from the internal load balancer.
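In the API, the only real difference between the two is the Scheme parameter. Below is a minimal boto3 sketch of the internet-facing web tier; the subnet, security group, VPC, and instance IDs are all placeholders. The internal load balancer for the application tier would be identical except for Scheme="internal" and private subnets:

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="ap-south-1")

# Internet-facing ALB in two public subnets (placeholder IDs)
lb = elbv2.create_load_balancer(
    Name="web-alb",
    Subnets=["subnet-public-az1", "subnet-public-az2"],
    SecurityGroups=["sg-0123456789abcdef0"],
    Scheme="internet-facing",   # use "internal" for the app-tier load balancer
    Type="application",
)
lb_arn = lb["LoadBalancers"][0]["LoadBalancerArn"]

# Target group with a health check; the load balancer only routes to healthy targets
tg = elbv2.create_target_group(
    Name="web-servers",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",
    TargetType="instance",
    HealthCheckPath="/",
)
tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

# Register one web server per A.Z and listen on HTTP/80
elbv2.register_targets(TargetGroupArn=tg_arn, Targets=[{"Id": "i-aaaa"}, {"Id": "i-bbbb"}])
elbv2.create_listener(
    LoadBalancerArn=lb_arn,
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
)
```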

The architecture is usually defined as follows -
We have web servers hosted in the public subnets so that users from the internet can access our website.
We have the application layer hosted in the private subnets. Private subnets cannot be accessed by users from the internet, as they are associated with a route table that does not have a route to the IGW.
Application and DB servers are used by the company for development and testing purposes, while the web servers are accessed by clients using the company's products.
VPC - Virtual Private Cloud
A VPC is a virtual network dedicated to your AWS resources, and it can span all the A.Zs in a region. A VPC helps us with the following -
Complete control over our Virtual Networking environment.
Selection of our own IP Address range. (https://cidr.xyz/)
Creation of Public and Private Subnets.
Configuration of Route Tables and Network Gateways.
Two ways of communicating with a VPC
Internet Gateway - Helps you communicate over the internet
VPN (Virtual Private Network) - Helps you communicate with your corporate network.
Inside our first A.Z, we have a public subnet and a private subnet. We also see a NAT instance present inside our public subnet. Let's discuss their functionality. See the diagram below for a reference to all the subnets discussed.
PUBLIC SUBNET - A public subnet is a subnet associated with a custom route table that has a route to the IGW. Instances launched inside our public subnet are associated with a public IP, with the help of which they can communicate over the internet. Eg - In the image, Subnet 1 (1A) is a public subnet, as it has an Elastic IP defined, and the custom route table attached to this subnet has a route to the IGW (0.0.0.0/0). So, the instances in this subnet can talk to each other via private IPs and communicate with the internet via public IPs (the IGW does the NAT discussed above).
PRIVATE SUBNET - A private subnet is a subnet associated with the main route table, which does not have any route to the IGW. Eg - Here, in the image, we see that Subnet 2 (2A) has only a private IPv4 address, and the route table with which it is associated does not have any route to the IGW. So the instances residing within this subnet won't be able to communicate with the internet.
The last one (3A) is also a private subnet, but what is the difference? It doesn't have a route to the IGW (of course!), but its traffic is routed to a Virtual Private Gateway for a VPN connection. Hence this is called a VPN-only subnet. It is generally used to connect to your AWS resources from your company premises.
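Tying the VPC, subnets, and route tables together, here is a rough boto3 sketch of a public and a private subnet. The CIDR blocks and the A.Z name are arbitrary examples, and the public route table ID is a placeholder for the table with the 0.0.0.0/0 route to the IGW from the earlier sketch:

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")

# VPC with an example CIDR block
vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")
vpc_id = vpc["Vpc"]["VpcId"]

# Public and private subnets in the same A.Z (example CIDRs)
public_subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.1.0/24", AvailabilityZone="ap-south-1a")
private_subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.2.0/24", AvailabilityZone="ap-south-1a")

# Instances launched in the public subnet get a public IP automatically
ec2.modify_subnet_attribute(
    SubnetId=public_subnet["Subnet"]["SubnetId"],
    MapPublicIpOnLaunch={"Value": True},
)

# Associate the public subnet with the route table that has the 0.0.0.0/0 -> IGW route;
# the private subnet stays on the main route table, which has no route to the IGW
public_rtb_id = "rtb-0123456789abcdef0"  # placeholder - route table from the IGW sketch
ec2.associate_route_table(RouteTableId=public_rtb_id, SubnetId=public_subnet["Subnet"]["SubnetId"])
```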

So, now we have our A.Zs and our public and private subnets defined. But why are those NAT Gateways placed inside our A.Zs? And what is their use?
This will be covered in the upcoming Part 2 of the highly available architecture series. Stay tuned!!
I hope you all found the post useful and start using different AWS services as well. If you have any queries, you can always reach out to me. Feel free to provide your feedback in the comment section. Leave a like if you enjoyed reading. Thanks!!
Connect with me on LinkedIn - Amit Dhanik.
Credits - This post was successful because of the following people - ACloudguru, AWS resources, pythoholic, and many others.



