Scaling to Multiple Servers with AWS ALB

This post is part of a series of posts about how to deploy a Django app to AWS. If you haven't deployed your app to EC2 and RDS yet, I recommend reading the AWS EC2 and AWS RDS posts first.

So your website's getting so much traffic that you're ready to scale your web app to multiple servers? Great! In this blog post, we'll walk through how to set up our single EC2 machine for scalability, create a template to spin up more instances on-demand, and use an AWS Application Load Balancer (ALB) to distribute traffic amongst them. Let's get started!

01  Move Logging to AWS CloudWatch

At the moment, our EC2 instance logs to its own hard drive. This doesn't scale because each server will have its own logs, and it's difficult to aggregate them. Instead, we'll use AWS CloudWatch to send our logs to a centralized location.

First, let's grant permissions to our EC2 machine to send metrics to CloudWatch.

  1. If you don't already have an IAM role for your EC2 machines, create a new role for sending metrics to CloudWatch called ec2-role. Give your new (or existing) role the permission called CloudWatchAgentServerPolicy.
  2. Attach the role to our EC2 instance if you haven't already. Go to EC2 > Click on the instance > Actions > Security > Modify IAM role > Attach the new role.

Next, let's install the CloudWatch Agent on our EC2 machine so that our logs are forwarded to CloudWatch.

  1. SSH into your EC2 machine.
  2. Run the following commands to launch the CloudWatch Agent configuration wizard:
    sudo apt install -y collectd
    wget https://amazoncloudwatch-agent.s3.amazonaws.com/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
    sudo apt install -y ./amazon-cloudwatch-agent.deb
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
  3. Accept the defaults for every prompt except two: answer no to "Do you want to store the config in the SSM parameter store?" and to "Do you want the CloudWatch agent to also retrieve X-ray traces?"
  4. Also say yes to "Do you want to monitor any log files?" and add these four log files:
    • /var/log/nginx/access.log
    • /var/log/nginx/error.log
    • /var/log/gunicorn/gunicorn.log
    • /var/log/gunicorn/myapp.log
    Note: For retention policy, -1 means "never delete."
  5. Finally, start the CloudWatch agent using:
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s

Note: The wizard's generated configuration is stored at /opt/aws/amazon-cloudwatch-agent/bin/config.json. The wizard can also save this configuration to AWS Systems Manager > Parameter Store (we declined that above). When we launch the CloudWatch agent, we can point it either at a local JSON file or at a config stored in the Parameter Store. Good practice is to pull the config from the Parameter Store, but for the simplicity of this tutorial we'll use the local file. If we ever wanted to fetch a config from the Parameter Store instead, we could use -c ssm:<parameterName>.

Great! The CloudWatch Agent is now configured and is forwarding our logs to CloudWatch. We can check the status of the CloudWatch Agent on any of our EC2 machines at any time using amazon-cloudwatch-agent-ctl -a status. If we navigate to CloudWatch > Log groups, we should see our logs streaming in.
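
We can also sanity-check the pipeline programmatically. The sketch below assumes the log group is named nginx-access.log (as configured later in this post) and the us-west-1 region; adjust both to your setup:

import boto3

logs = boto3.client('logs', region_name='us-west-1')

# Pull the ten most recent events from the nginx access log group
events = logs.filter_log_events(
    logGroupName='nginx-access.log',   # name chosen in the agent config
    limit=10,
)
for event in events['events']:
    print(event['timestamp'], event['message'])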

02  Migrate Secrets to AWS Secrets Manager

Currently, our secrets (database passwords, API keys, and so on) are probably stored as environment variables on our EC2 machine. This won't scale either: creating, modifying, or deleting secrets gets cumbersome as the number of servers grows, and environment variables aren't a particularly secure place for secrets in the first place. Instead, we'll use AWS Secrets Manager to store our secrets in one central, access-controlled location.

First, let's grant our EC2 machine access to Secrets Manager:

  1. Navigate to AWS IAM console and select Roles.
  2. If you don't already have an IAM role for your EC2 machines, create a new role called ec2-role for reading secrets from Secrets Manager. Give your new (or existing) role the permission called SecretsManagerReadWrite.
  3. Attach the role to the EC2 instance if you haven't already. Go to EC2 > Click on the instance > Actions > Security > Modify IAM role > Attach the new role.

Now, let's create a new secret in Secrets Manager:

  1. Navigate to AWS Secrets console and select Store a new secret.
  2. Select Other type of secret.
  3. Give it a key, such as MYSECRET, and a value, such as MYVALUE, then select Next.
  4. Give it a readable name, such as my-secret, then select Next.
  5. On the final page, we are shown some sample code for accessing our new secret. Select Store.

Now that our secret is created, let's access it from Django. Keep in mind that good practice mandates using different secrets for development vs production—for example, different API keys or database passwords. With that in mind, modify settings.py to look like the following:

# at the top of settings.py
import json
import os

import boto3

# ...

if DEBUG:
    MYSECRET = os.environ['MYSECRET']
else:
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name='us-west-1'
    )
    response = client.get_secret_value(SecretId='my-secret')
    secrets = json.loads(response['SecretString'])
    MYSECRET = secrets['MYSECRET']

Now, we can access and apply our secrets in settings.py using AWS Secrets Manager.
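
If you end up with more than one secret, a small helper keeps settings.py tidy and avoids parsing the same secret payload repeatedly. This is just a sketch; get_secret and the secret and key names here are illustrative, not part of the steps above:

import json

import boto3

_secrets_cache = {}


def get_secret(secret_id, key, region_name='us-west-1'):
    """Fetch one key from a Secrets Manager secret, caching the parsed payload."""
    if secret_id not in _secrets_cache:
        client = boto3.session.Session().client(
            service_name='secretsmanager', region_name=region_name)
        response = client.get_secret_value(SecretId=secret_id)
        _secrets_cache[secret_id] = json.loads(response['SecretString'])
    return _secrets_cache[secret_id][key]


# usage in settings.py (hypothetical):
# MYSECRET = os.environ['MYSECRET'] if DEBUG else get_secret('my-secret', 'MYSECRET')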

03  Create an EC2 Launch Template

Great! Now that our EC2 machine is ready for scaling, let's create a template so that we can quickly launch more EC2 instances without having to painstakingly set each one up manually. There are a couple of strategies here, such as baking a new Amazon Machine Image (AMI), which is a snapshot of our EC2 machine, or defining a launch template with a user-data script. For this tutorial, we'll use the simplest strategy: a launch template.

A launch template lets us spin up new EC2 instances with one click, each pre-configured with our latest code and configuration files.

  1. Navigate to the AWS EC2 console.
  2. Navigate to Launch templates > Create launch template.
  3. Give it a name such as web-server-template.
  4. You might notice a checkbox called Provide guidance to help me set up a template that I can use with EC2 Auto-scaling. Leave this unchecked for now and we'll come back to it later.
  5. Choose an instance type, such as t2.micro for the cheapest option or m6i.large for a more powerful option.
  6. Choose the key pair you typically use for your EC2 machine.
  7. Choose the security group you used for your EC2 machine.
  8. Choose the subnet you typically use for your EC2 machine.
  9. Add a storage volume of at least 10GB (or more depending on the size of your web app).
  10. Finally, in Advanced details > IAM instance profile, select the role you created for your EC2 machine.
  11. In Advanced details > User data, add the following script to install and configure our web app:
    #!/bin/bash
    cd /home/ubuntu
    sudo apt update
    sudo apt -y upgrade
    sudo apt install -y collectd nginx gunicorn supervisor python3-pip postgresql-client-common postgresql-client-14
    
    # code
    cd /home/ubuntu
    git clone https://github.com/your-username/your-public-repo.git
    cd your-public-repo
    pip install -r requirements.txt
    
    # nginx
    cd /etc/nginx/sites-available/
    cat <<- EOF | sudo tee example.com
    server {
        listen 80;
        server_name example.com www.example.com;
    
        location / {
            include proxy_params;
            proxy_pass http://localhost:8000;
            client_max_body_size 100m;
            proxy_read_timeout 3600s;
        }
    }
    EOF
    cd /etc/nginx/sites-enabled/
    sudo rm default
    sudo ln -s /etc/nginx/sites-available/example.com example.com
    sudo nginx -s reload
    
    # gunicorn
    cd /home/ubuntu
    mkdir .gunicorn
    cat <<- EOF > /home/ubuntu/.gunicorn/config.py
    	"""Gunicorn config file"""
    
    	# Django WSGI application path in pattern MODULE_NAME:VARIABLE_NAME
    	wsgi_app = "example.wsgi:application"
    	# The granularity of Error log outputs
    	loglevel = "debug"
    	# The number of worker processes for handling requests
    	workers = 4
    	# The socket to bind
    	bind = "0.0.0.0:8000"
    	# Restart workers when code changes (development only!)
    	#reload = True
    	# Write access and error info to /var/log
    	accesslog = errorlog = "/var/log/gunicorn/gunicorn.log"
    	# Redirect stdout/stderr to log file
    	capture_output = True
    	# PID file so you can easily fetch process ID
    	#pidfile = "/var/run/gunicorn/dev.pid"
    	# Daemonize the Gunicorn process (detach & enter background)
    	#daemon = True
    	# Workers silent for more than this many seconds are killed and restarted
    	timeout = 600
    	# Restart workers after this many requests
    	max_requests = 10
    	# Stagger reloading of workers to avoid restarting at the same time
    	max_requests_jitter = 30
    EOF
    chown -R ubuntu:ubuntu /home/ubuntu/.gunicorn
    # gunicorn logs to /var/log/gunicorn (see accesslog above), so make sure the directory exists
    sudo mkdir -p /var/log/gunicorn
    
    # supervisor
    cat <<- EOF | sudo tee /etc/supervisor/conf.d/gunicorn.conf
    	[program:django_gunicorn]
    	directory=/home/ubuntu/your-public-repo/
    	command=/usr/bin/gunicorn -c /home/ubuntu/.gunicorn/config.py
    	autostart=true
    	autorestart=true
    	stdout_logfile=/var/log/supervisor/django-gunicorn-out.log
    	stderr_logfile=/var/log/supervisor/django-gunicorn-err.log
    EOF
    sudo systemctl restart supervisor
    
    # logging
    cd /home/ubuntu
    wget https://amazoncloudwatch-agent.s3.amazonaws.com/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
    sudo apt install -y ./amazon-cloudwatch-agent.deb
    cd /opt/aws/amazon-cloudwatch-agent/bin
    # quote EOF so bash leaves the ${aws:...} placeholders below for the agent to resolve
    cat << 'EOF' | sudo tee config.json
    {
        "agent": {
            "metrics_collection_interval": 60,
            "run_as_user": "root"
        },
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": "/var/log/nginx/access.log",
                            "log_group_class": "STANDARD",
                            "log_group_name": "nginx-access.log",
                            "log_stream_name": "prod",
                            "retention_in_days": -1
                        },
                        {
                            "file_path": "/var/log/nginx/error.log",
                            "log_group_class": "STANDARD",
                            "log_group_name": "nginx-error.log",
                            "log_stream_name": "prod",
                            "retention_in_days": -1
                        },
                        {
                            "file_path": "/var/log/gunicorn/gunicorn.log",
                            "log_group_class": "STANDARD",
                            "log_group_name": "gunicorn.log",
                            "log_stream_name": "prod",
                            "retention_in_days": -1
                        },
                        {
                            "file_path": "/home/ubuntu/my-public-repo/myapp.log",
                            "log_group_class": "STANDARD",
                            "log_group_name": "myapp.log",
                            "log_stream_name": "prod",
                            "retention_in_days": -1
                        }
                    ]
                }
            }
        },
        "metrics": {
            "aggregation_dimensions": [
                [
                    "InstanceId"
                ]
            ],
            "append_dimensions": {
                "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
                "ImageId": "${aws:ImageId}",
                "InstanceId": "${aws:InstanceId}",
                "InstanceType": "${aws:InstanceType}"
            },
            "metrics_collected": {
                "collectd": {
                    "metrics_aggregation_interval": 60
                },
                "disk": {
                    "measurement": [
                        "used_percent"
                    ],
                    "metrics_collection_interval": 60,
                    "resources": [
                        "*"
                    ]
                },
                "mem": {
                    "measurement": [
                        "mem_used_percent"
                    ],
                    "metrics_collection_interval": 60
                },
                "statsd": {
                    "metrics_aggregation_interval": 60,
                    "metrics_collection_interval": 10,
                    "service_address": ":8125"
                }
            }
        }
    }
    EOF
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
  12. Finally, select Create launch template.

Great! Now we can launch one or more new web servers by selecting our template > Actions > Launch instance from template. Feel free to modify this template as you go (for example, to add extra configuration for .gitignore, tmux, or npm) by selecting our template > Actions > Modify template (Create new version).
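
The same launch can also be scripted, which comes in handy once you have more than a couple of servers. A minimal boto3 sketch, assuming the template name above and the us-west-1 region:

import boto3

ec2 = boto3.client('ec2', region_name='us-west-1')

# Launch one instance from the latest version of our launch template
response = ec2.run_instances(
    LaunchTemplate={
        'LaunchTemplateName': 'web-server-template',
        'Version': '$Latest',          # or pin a specific version number
    },
    MinCount=1,
    MaxCount=1,
)
print(response['Instances'][0]['InstanceId'])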

04  Add an AWS Application Load Balancer

Let's create a target group for our web servers. A target group is simply a group of EC2 instances that we want to distribute traffic to. We'll also need an endpoint that the target group can ping to check whether each web server is still responding. We could use a custom middleware to intercept this request and put less load on our server (a sketch of that approach follows the list below), but for simplicity let's just add a regular view at the endpoint /status:

  1. Add a new view to our top-level views.py that simply returns a basic HTTP response:
    from django.http import HttpResponse

    def status_view(request):
        return HttpResponse("Healthy.")
  2. Add a new URL to our top-level urls.py that points to this view:
    urlpatterns = [
        # ... existing URL patterns ...
        path('status/', views.status_view, name='status'),
    ]
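
As mentioned above, a middleware can answer the health check before the rest of the middleware stack and URL routing run, shaving a little work off each ping. Here's a sketch of that alternative (the class name and module are illustrative); it would be listed near the top of MIDDLEWARE in settings.py:

# middleware.py
from django.http import HttpResponse


class HealthCheckMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Short-circuit health checks; everything else continues down the stack
        if request.path in ('/status', '/status/'):
            return HttpResponse("Healthy.")
        return self.get_response(request)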

Now that we have a status endpoint, we can create a target group for our web servers that periodically sends health checks to this endpoint:

  1. Navigate to the AWS EC2 console and select Target Groups > Create target group.
  2. Select a target type of Instances.
  3. Give it a name such as web-server-tg.
  4. Select a protocol of HTTP, which defaults to port 80.
  5. Under Health checks, select HTTP for the protocol and /status/ for the path (the trailing slash matches our URL pattern, so the check gets a 200 rather than a redirect).
  6. Select Next. On the next page, select the instance(s) to add to this target group, then select Include as pending below.
  7. Select Create target group.

Now that our instance(s) are added to the target group, each one should receive a ping to /status/ every few seconds. If we select the newly created target group, we should see a count of healthy and unhealthy web servers. The load balancer uses this to stop sending requests to unhealthy servers, and later we can even spin up new web servers automatically when the number of healthy ones drops below a certain threshold.
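
The same health information is available programmatically, which is useful for dashboards or scripts. A small boto3 sketch (the target group ARN is a placeholder for your own):

import boto3

elbv2 = boto3.client('elbv2', region_name='us-west-1')
health = elbv2.describe_target_health(
    TargetGroupArn='arn:aws:elasticloadbalancing:us-west-1:123456789012:targetgroup/web-server-tg/abc123',  # placeholder
)
for target in health['TargetHealthDescriptions']:
    print(target['Target']['Id'], target['TargetHealth']['State'])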

For now, let's modify our security groups to only allow web traffic to access our load balancer, not our EC2 servers directly. First, let's create a new security group for our ALB:

  1. Navigate to the AWS EC2 console and select Security Groups > Create security group.
  2. Give it a name such as alb-sg.
  3. Add two inbound rules that allow HTTP and HTTPS traffic from anywhere.

Then, let's modify the existing security group for our web servers to only allow web traffic from our ALB, not the open web:

  1. Navigate to the AWS EC2 console and select Security Groups > web-server-sg.
  2. Select Actions > Edit inbound rules.
  3. Remove all inbound rules except for the one that allows SSH traffic.
  4. Add a new inbound rule that allows HTTP traffic from the security group of our ALB we just created.
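
The important detail in that last rule is that its source is another security group (alb-sg) rather than a CIDR block, so only traffic arriving via the load balancer reaches the web servers. If you prefer to script the rule, here's a boto3 sketch (both group IDs are placeholders):

import boto3

ec2 = boto3.client('ec2', region_name='us-west-1')
ec2.authorize_security_group_ingress(
    GroupId='sg-0aaaaaaaaaaaaaaaa',      # web-server-sg (placeholder ID)
    IpPermissions=[{
        'IpProtocol': 'tcp',
        'FromPort': 80,
        'ToPort': 80,
        # Source is the ALB's security group, not an IP range
        'UserIdGroupPairs': [{'GroupId': 'sg-0bbbbbbbbbbbbbbbb'}],  # alb-sg (placeholder ID)
    }],
)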

Now we're ready to create the AWS Application Load Balancer (ALB) which will distribute traffic to the web servers in our new target group:

  1. Navigate to the AWS EC2 console and select Load Balancers > Create load balancer. Select Application Load Balancer and Create.
  2. Give it a name such as web-server-alb.
  3. Select a scheme of Internet-facing.
  4. Select any two or more Availability Zones.
  5. Select the security group we created earlier.
  6. Select the target group we created earlier.
  7. Select Create.
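
Under the hood, the piece that ties the ALB to the target group is a listener: the ALB listens on port 80 and forwards requests to web-server-tg. The console wizard creates this for us, but a boto3 sketch makes the wiring explicit (both ARNs are placeholders):

import boto3

elbv2 = boto3.client('elbv2', region_name='us-west-1')
elbv2.create_listener(
    LoadBalancerArn='arn:aws:elasticloadbalancing:us-west-1:123456789012:loadbalancer/app/web-server-alb/abc123',  # placeholder
    Protocol='HTTP',
    Port=80,
    DefaultActions=[{
        'Type': 'forward',
        'TargetGroupArn': 'arn:aws:elasticloadbalancing:us-west-1:123456789012:targetgroup/web-server-tg/abc123',  # placeholder
    }],
)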

Now that everything's hooked up, the final step is to point our domain to our ALB instead of to our EC2 instance.

  1. Select the ALB we just created and navigate to the Details tab. Copy the DNS name of the ALB.
  2. Navigate to your domain registrar, remove the previous A records that pointed to the EC2 machine, and add a CNAME record (or, if you use Route 53, an alias A record) that points to the DNS name of the ALB, such as dualstack.alb-603648898.us-west-1.elb.amazonaws.com.
  3. Save the changes and test your domain by navigating to it in your browser. You should see your web app.
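
A quick end-to-end check from any machine confirms the cut-over worked. A sketch using your own domain (requests is a third-party package, installed with pip install requests):

import socket

import requests

# The domain should now resolve to the ALB's IP addresses...
print(socket.gethostbyname_ex('www.example.com'))

# ...and the health endpoint should respond through the load balancer
print(requests.get('http://www.example.com/status/').status_code)  # expect 200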

Well done! We've now enabled multiple web servers and created a load balancer to distribute traffic to them.

05  Next Steps

Congratulations on completing this series of blog posts! From here, we could enable EC2 Auto Scaling, which can automatically terminate and spin up servers using our launch template to maintain a certain number of healthy servers at all times. It can also be configured to scale the number of servers up or down based on average usage (CPU, memory, or network traffic) across the fleet, so capacity keeps up with traffic.
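
As a taste of what that next step might look like, here's a hedged boto3 sketch that would create an Auto Scaling group from our launch template and register it with our target group; the group name, subnet IDs, ARN, and sizing are all illustrative:

import boto3

autoscaling = boto3.client('autoscaling', region_name='us-west-1')
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName='web-server-asg',                  # illustrative name
    LaunchTemplate={
        'LaunchTemplateName': 'web-server-template',
        'Version': '$Latest',
    },
    MinSize=2,
    MaxSize=6,
    DesiredCapacity=2,
    TargetGroupARNs=['arn:aws:elasticloadbalancing:us-west-1:123456789012:targetgroup/web-server-tg/abc123'],  # placeholder
    VPCZoneIdentifier='subnet-0aaaaaaa,subnet-0bbbbbbb',    # two subnets (placeholders)
    HealthCheckType='ELB',              # use the target group's /status/ checks
    HealthCheckGracePeriod=300,
)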