Rolling Out Your Own VPN in AWS ─ Part 2

March 4, 2019

This is the second and final part of my tutorial on creating your personal OpenVPN server in the cloud. Please read part 1 of the series first if you’ve not done so already, as this part builds directly on top of the infrastructure stood up in the previous instalment.

Making the ECS Instance More Reliable

The first thing we need to do is create the IAM role that we will attach to our auto-scaled ECS instances so they can assign an Elastic IP to themselves at startup. This will allow our OpenVPN server to keep the same IP even if the underlying EC2 instance is replaced by a new one. That is not possible at the moment, because a regular public IPv4 address is released whenever the EC2 instance using it is destroyed.

Head to the IAM dashboard in the AWS Console and create a new role. I called my role EIPECSInstanceRole. Then, attach the following two policies to it:

AmazonEC2ContainerServiceforEC2Role: this is the same AWS managed policy the default ECS instance role utilises. We need it to allow the instances to register themselves with ECS.
AssociateEIPAddressPolicy: this is our addition to the above policy, which allows the instances to associate an Elastic IP address with themselves. Create this policy with the JSON document below:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ec2:AssociateAddress",
            "Resource": "*"
        }
    ]
}
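
If you would rather script this than click through the console, the role setup above could be sketched with the AWS CLI roughly as follows. This is a sketch under assumptions, not the exact steps used for this post: it assumes the AWS CLI is already configured with IAM permissions, the function names are made up for illustration, and the instance profile (which the console creates for you behind the scenes) is simply named after the role.

```shell
#!/bin/bash
# Sketch only: a CLI equivalent of the console steps above.

# Trust policy that lets EC2 instances assume the role.
trust_policy() {
cat <<'EOF'
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": { "Service": "ec2.amazonaws.com" },
            "Action": "sts:AssumeRole"
        }
    ]
}
EOF
}

# The AssociateEIPAddressPolicy document shown above.
eip_policy() {
cat <<'EOF'
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ec2:AssociateAddress",
            "Resource": "*"
        }
    ]
}
EOF
}

# Create the role, attach both policies and wrap it in an instance profile.
create_eip_role() {
    local role=EIPECSInstanceRole
    local policy_arn
    aws iam create-role --role-name "$role" \
        --assume-role-policy-document "$(trust_policy)"
    aws iam attach-role-policy --role-name "$role" \
        --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role
    policy_arn=$(aws iam create-policy --policy-name AssociateEIPAddressPolicy \
        --policy-document "$(eip_policy)" --query Policy.Arn --output text)
    aws iam attach-role-policy --role-name "$role" --policy-arn "$policy_arn"
    # Unlike the console, the CLI does not create the instance profile for us.
    aws iam create-instance-profile --instance-profile-name "$role"
    aws iam add-role-to-instance-profile \
        --instance-profile-name "$role" --role-name "$role"
}
```

Nothing is created until you call create_eip_role, so you can inspect the two policy documents first by running trust_policy or eip_policy on their own.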

Now navigate to the EC2 dashboard and select “Elastic IPs”. Allocate a new address and make a note of its allocation ID, as you’ll need it shortly.

Go to the Route 53 dashboard and edit the DNS record set for your VPN, which you created in part 1 of this tutorial. Change it to point at the new Elastic IP so that the VPN domain name always resolves to a static IP. We will associate this IP with our OpenVPN server in a minute.
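
The Elastic IP allocation and the DNS change can also be sketched from the command line. The snippet below is only an illustration: the function names are hypothetical, the hosted zone ID and record name are values you would have to supply from part 1, and the TTL of 300 seconds is an arbitrary default rather than anything prescribed here.

```shell
#!/bin/bash
# Print a Route 53 change batch that points an A record at an IPv4 address.
# $1: record name (e.g. vpn.example.com), $2: IPv4 address, $3: TTL (default 300)
change_batch() {
    local name=$1 ip=$2 ttl=${3:-300}
    cat <<EOF
{
    "Changes": [
        {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "$name",
                "Type": "A",
                "TTL": $ttl,
                "ResourceRecords": [ { "Value": "$ip" } ]
            }
        }
    ]
}
EOF
}

# Allocate a new Elastic IP and point the VPN record at it.
# $1: hosted zone ID, $2: record name
point_dns_at_new_eip() {
    local zone_id=$1 name=$2 alloc_id ip
    alloc_id=$(aws ec2 allocate-address --domain vpc \
        --query AllocationId --output text)
    ip=$(aws ec2 describe-addresses --allocation-ids "$alloc_id" \
        --query 'Addresses[0].PublicIp' --output text)
    # Keep this ID: the user data script needs it later.
    echo "Allocation ID: $alloc_id"
    aws route53 change-resource-record-sets --hosted-zone-id "$zone_id" \
        --change-batch "$(change_batch "$name" "$ip")"
}
```

UPSERT is used so the same call works whether the record already exists or not, which matches the "edit the existing record" step above.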

Now select the “Launch Configurations” option on the left pane of the EC2 dashboard. Locate the default ECS auto scaling group launch configuration (it shouldn’t be hard to find) and copy it, as launch configurations cannot be edited in place and we need to make some changes.

Edit the “Launch configuration details” first. Here, select the IAM role you just created instead of the default ecsInstanceRole. Then, expand the “Advanced details” section below and replace the contents of the user data box with the following script:

#!/bin/bash
# Register this instance with the openvpn ECS cluster
echo ECS_CLUSTER=openvpn >> /etc/ecs/ecs.config
echo ECS_BACKEND_HOST= >> /etc/ecs/ecs.config
# Mount the OpenVPN data volume
mkdir -p /ecs-data/openvpn-data
echo "/dev/sdf /ecs-data/openvpn-data ext4 defaults 0 2" >> /etc/fstab
mount -a
# Install the AWS CLI
yum install -y python36
curl -O https://bootstrap.pypa.io/get-pip.py
python3 get-pip.py
export PATH=/usr/local/bin:$PATH
pip install awscli
# Associate the Elastic IP with this instance
instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
aws --region=eu-west-1 ec2 associate-address --instance-id "$instance_id" --allocation-id <eip-id>

Here, <eip-id> is the allocation ID of the Elastic IP address you created a few moments ago. If you are not deploying in eu-west-1, change the region to match your own. This user data script runs when the ECS instance boots up. Besides mounting the OpenVPN data volume, it accomplishes two key tasks:

Register the EC2 instance with the ECS cluster.
Fetch its own instance ID and associate a pre-determined Elastic IP with itself.
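
One caveat with the second task: the plain curl call to the metadata endpoint only works while the instance accepts IMDSv1 requests. On instances set up to require IMDSv2, a session token has to be fetched first. A minimal sketch of the association step with a token and a few retries could look like this. The function names, the 60 second token lifetime and the retry count are my own assumptions for illustration, not part of the original script.

```shell
#!/bin/bash
METADATA_URL=http://169.254.169.254/latest

# Fetch this instance's ID using an IMDSv2 session token.
get_instance_id() {
    local token
    token=$(curl -s -X PUT "$METADATA_URL/api/token" \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 60")
    curl -s -H "X-aws-ec2-metadata-token: $token" \
        "$METADATA_URL/meta-data/instance-id"
}

# Associate the Elastic IP with this instance, retrying up to five times.
# $1: allocation ID of the Elastic IP, $2: region
associate_eip() {
    local alloc_id=$1 region=$2 attempt
    for attempt in 1 2 3 4 5; do
        if aws --region "$region" ec2 associate-address \
            --instance-id "$(get_instance_id)" \
            --allocation-id "$alloc_id"; then
            return 0
        fi
        sleep 5
    done
    return 1
}
```

In the user data script, the last two lines would then be replaced by something like associate_eip <eip-id> eu-west-1, with the placeholder swapped for your own allocation ID.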

Confirm this step and move on.

The final step in setting up our launch configuration is to automatically attach a volume created from the OpenVPN config EBS snapshot we took in part 1.

In the “Storage” section of the config, add an EBS volume in addition to the root volume.

Set the device name to /dev/sdf (it really could be anything else but this is the name that we use in the scripts elsewhere so we’re keeping it for simplicity).
Select the snapshot you created in part 1.
Match the size of the snapshot (e.g. 5 GB if the snapshot was taken from a 5 GB EBS volume).

Leave all the other defaults, again ensuring the volume will be encrypted.

Go on to create this new launch configuration and when it’s done, navigate to the Auto Scaling Groups dashboard on the left pane and modify the ECS auto scaling group to use the new launch config.
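
For reference, the storage settings and the final switch-over can be expressed for the AWS CLI as well. This is a rough sketch with hypothetical function names. The snapshot ID, auto scaling group name and launch configuration name are placeholders you would replace with your own. No encryption flag is set because a volume created from an encrypted snapshot is itself encrypted.

```shell
#!/bin/bash
# Print the block device mapping for the OpenVPN data volume.
# $1: ID of the snapshot from part 1, $2: volume size in GiB
block_device_mappings() {
    cat <<EOF
[
    {
        "DeviceName": "/dev/sdf",
        "Ebs": {
            "SnapshotId": "$1",
            "VolumeSize": $2
        }
    }
]
EOF
}

# Point the ECS auto scaling group at the new launch configuration.
# $1: auto scaling group name, $2: new launch configuration name
switch_launch_config() {
    aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name "$1" \
        --launch-configuration-name "$2"
}
```

The JSON printed by block_device_mappings is in the shape that aws autoscaling create-launch-configuration expects for its --block-device-mappings option, should you choose to create the launch configuration from the CLI instead of copying it in the console.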

That’s all we need to do to make our ECS EC2 fleet more reliable.

Self-healing ECS Tasks

We need to make one final change to our ECS cluster. At the moment, if our ECS instance were killed, a replacement would come up, grab the IP that we’ve mapped to our public VPN domain name in Route 53 and register itself as a valid ECS cluster node. That is great, but unfortunately ECS will not schedule any tasks to run on it unless we explicitly tell it to do so. So how do we fix that? Easy: all we need to do is create a Service to put in front of our task.

Head to the ECS dashboard, select your openvpn cluster and select the “Services” tab. Click on “Create” and configure it as follows (anything that is not mentioned can be left as default):

Launch type: EC2
Task definition: the latest revision of your openvpn definition.
Cluster: openvpn
Service name: openvpn
Service type: REPLICA
Number of tasks: 1. This is important as it will ensure there is one instance of our task running at all times.

On the next screen, do not provision a load balancer and disable service discovery integration. We won’t be needing any of this because there’s only one instance of the OpenVPN container running in the cluster and we’re already mapping its EIP to DNS. Do not set auto scaling on the next screen either and confirm all your settings in the final prompt.

This will create an ECS Service for your openvpn task. Now, your task is seen as a resource that needs to stay at a baseline capacity of 1 instance at all times as its Service dictates. ECS will ensure that this is the case by launching a new openvpn task if the previous one suddenly dies.
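
The same Service could be created from the command line. The sketch below assumes your task definition family is called openvpn, in which case ECS picks its latest active revision. The function name is made up for illustration.

```shell
#!/bin/bash
# Create a replica service that keeps exactly one openvpn task running.
create_openvpn_service() {
    aws ecs create-service \
        --cluster openvpn \
        --service-name openvpn \
        --task-definition openvpn \
        --launch-type EC2 \
        --scheduling-strategy REPLICA \
        --desired-count 1
}
```

The --desired-count 1 option is the command line counterpart of the "Number of tasks" field: it is what makes ECS start a replacement task whenever the running one disappears.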

The net result of these changes is that, in the event of a catastrophic failure that wipes out your ECS instance, AWS will automatically bring up a new, healthy one for you, give it the same address to use so it remains accessible and re-launch your OpenVPN server container on it. All of this with no intervention on your part. Neat!

You can watch this process unfold by manually terminating the existing ECS EC2 instance in your AWS environment. I timed mine, and the VPN server was back up 120 seconds after the instance died. Not too bad.
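
If you want to time the recovery yourself, a small script along these lines could do it. It is only a sketch: the function names are hypothetical and the five second polling interval is arbitrary. It also measures when ECS reports the task as running again, which is not quite the same thing as the VPN accepting connections.

```shell
#!/bin/bash
# Number of openvpn tasks ECS currently reports as running.
running_count() {
    aws ecs describe-services --cluster openvpn --services openvpn \
        --query 'services[0].runningCount' --output text
}

# Block until running_count equals $1, polling every $2 seconds (default 5).
wait_for_count() {
    until [ "$(running_count)" = "$1" ]; do
        sleep "${2:-5}"
    done
}

# Terminate the given instance and time how long the service takes to recover.
# $1: ID of the ECS EC2 instance to terminate
time_recovery() {
    local start
    aws ec2 terminate-instances --instance-ids "$1" > /dev/null
    start=$(date +%s)
    # ECS can keep reporting the old task for a while, so wait for the
    # count to drop to 0 before waiting for it to return to 1.
    wait_for_count 0
    wait_for_count 1
    echo "Recovered in $(( $(date +%s) - start )) seconds"
}
```

Calling time_recovery with the ID of your current ECS instance kills it and prints the elapsed time once the replacement task is up.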

So that’s it as far as basic VPN deployments on the cloud go! I hope all the steps worked out for you and you’re now reading this via your brand-new, self-healing AWS VPN!

