AWS Transit Gateway in Action: VPN to VPCs Connectivity
AWS Transit Gateway was launched at re:Invent in November 2018. It can massively simplify networking that involves multiple VPCs, VPNs and shared services.
This post covers the following points:
Use Case (Simplified)
I have separate AWS accounts for staging and prod. I also need a VPN so that my on-prem resources get bi-directional access to both the staging and prod accounts. This covers my on-prem Active Directory, as well as VPN access for operations tools.
There must be no connectivity between staging and prod.
This is a great use case for Transit Gateway - thankfully the very terse AWS documentation has been fleshed out recently to better describe "Isolated VPCs".
Transit Gateway Concepts
The Transit Gateway is (mostly) well described in the AWS Transit Gateway documentation. To quote the docs:
transit gateway - a network transit hub that you can use to interconnect your virtual private clouds (VPC) and on-premises networks.
attachment — You can attach a VPC, an AWS Direct Connect gateway, or a VPN connection to a transit gateway.
transit gateway route table — A transit gateway has a default route table and can optionally have additional route tables. A route table includes dynamic and static routes that decide the next hop based on the destination IP address of the packet. The target of these routes could be a VPC or a VPN connection. By default, the VPCs and VPN connections that you attach to a transit gateway are associated with the default transit gateway route table.
associations — Each attachment is associated with exactly one route table. Each route table can be associated with zero to many attachments.
route propagation — A VPC or VPN connection can dynamically propagate routes to a transit gateway route table. With a VPC, you must create static routes to send traffic to the transit gateway. With a VPN connection, routes are propagated from the transit gateway to your on-premises router using Border Gateway Protocol (BGP).
I found route propagation unclear – so here is my alternative, less formal and more verbose explanation!
A transit gateway route table has a set of routes (a mapping of CIDR blocks to destinations, so AWS can determine the next hop for routing an IP packet). In the route table's configuration, if you choose to "propagate" a VPC, AWS automatically adds the CIDR block of that VPC to the route table. Similarly, if you "propagate" a site-to-site VPN, AWS automatically adds the CIDR blocks for that VPN (either statically defined in the VPN, or learned dynamically if using BGP).
When you "associate" a VPC with a route table, you are telling it to use that route table for packets coming from that VPC (i.e. it will be able to route to everything that propagates to that route table, and to other statically defined routes that you add).
A VPC (or VPN) can be associated with at most one route table. However, a VPC (or VPN) can "propagate" to multiple route tables.
To allow an EC2 instance within a VPC to send traffic via routes defined in the transit gateway, you have to reconfigure the VPC subnet's route table. You need to add static routes for the desired CIDR blocks, pointing at the transit gateway.
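To make "associate" and "propagate" concrete, here is a minimal Python sketch of the lookup logic. It is a toy model for building intuition, not how AWS implements it. The class names `Attachment` and `TgwRouteTable`, and the VPC CIDR blocks `10.0.0.0/16` and `10.1.0.0/16`, are placeholders I made up for illustration.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Attachment:
    """A VPC or VPN attached to the transit gateway (toy model)."""
    name: str
    cidrs: list  # CIDR blocks this attachment can advertise


@dataclass
class TgwRouteTable:
    """Toy transit gateway route table: CIDR block -> attachment name."""
    name: str
    routes: dict = field(default_factory=dict)
    associated: set = field(default_factory=set)

    def propagate(self, attachment):
        # Propagation copies the attachment's CIDR blocks into this table.
        for cidr in attachment.cidrs:
            self.routes[ipaddress.ip_network(cidr)] = attachment.name

    def associate(self, attachment):
        # Association: packets arriving from this attachment use this table.
        self.associated.add(attachment.name)

    def next_hop(self, source, dest_ip):
        """Return the attachment a packet is forwarded to, or None if dropped."""
        if source.name not in self.associated:
            return None
        ip = ipaddress.ip_address(dest_ip)
        matches = [net for net in self.routes if ip in net]
        if not matches:
            return None
        # Longest-prefix match, as in ordinary IP routing.
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


# Placeholder CIDRs for two VPCs; 172.31.0.0/16 stands in for on-prem.
vpc_a = Attachment("vpc-a", ["10.0.0.0/16"])
vpc_b = Attachment("vpc-b", ["10.1.0.0/16"])
vpn = Attachment("vpn", ["172.31.0.0/16"])

vpn_table = TgwRouteTable("for-vpn")
vpn_table.associate(vpn)
vpn_table.propagate(vpc_a)
vpn_table.propagate(vpc_b)

vpc_table = TgwRouteTable("for-vpcs")
vpc_table.associate(vpc_a)
vpc_table.associate(vpc_b)
vpc_table.propagate(vpn)
```

With this wiring, `vpn_table.next_hop(vpn, "10.0.5.9")` resolves to `vpc-a`, while `vpc_table.next_hop(vpc_a, "10.1.0.7")` returns `None`, because nothing propagates the CIDR of `vpc-b` into the table that the VPCs use.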
Let's look at a concrete example.
In the instructions below, I'll configure it by hand so we can understand how it all wires together. However, it's much better to use CloudFormation for this.
I'll use two AWS accounts, along with AWS Organizations:
- staging: contains a VPC with CIDR block 220.127.116.11/16
- production: contains a VPC with CIDR block 18.104.22.168/16
I already have a VPN device on-prem, where the on-prem address range is 172.31.0.0/16.
I’ll create the transit gateway and the VPN in the ‘production’ account. (Another good choice would be to use a ‘shared services’ account. I’ll stick with ‘production’ for simplicity, and so that to affect production you’d need to have IAM permissions within the production account.)
Transit Gateway Setup
Create Transit Gateway
To create the transit gateway in the ‘production’ account, I go to the 'VPC' service, choose `Transit Gateways`, and choose `Create Transit Gateway`. I fill in the details as shown below:
I've chosen not to use the default route table because I don't want a flat network, where traffic can flow between staging and production. Instead I'll create the route tables, associations and propagations later (which also better demonstrates the concepts).
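For anyone scripting this instead of clicking through the console, the same step could look roughly like the Python sketch below. It assumes `ec2_client` is a `boto3.client("ec2")` for the 'production' account. The only settings taken from this post are the two disabled default-route-table options; the ASN and the other option values are my assumptions, so adjust them to match your own setup.

```python
def transit_gateway_request(description, amazon_side_asn=64512):
    """Build keyword arguments for the EC2 CreateTransitGateway call."""
    return {
        "Description": description,
        "Options": {
            # Assumed value: any private ASN that suits your network.
            "AmazonSideAsn": amazon_side_asn,
            # No default route table, so attachments are not wired
            # together into a flat network automatically.
            "DefaultRouteTableAssociation": "disable",
            "DefaultRouteTablePropagation": "disable",
            # Assumed values for the remaining options.
            "AutoAcceptSharedAttachments": "disable",
            "DnsSupport": "enable",
            "VpnEcmpSupport": "enable",
        },
    }


def create_transit_gateway(ec2_client, description):
    """Create the transit gateway and return its ID."""
    response = ec2_client.create_transit_gateway(
        **transit_gateway_request(description)
    )
    return response["TransitGateway"]["TransitGatewayId"]
```

Keeping the request in its own function makes it easy to review or unit-test the options without touching AWS.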
Share Transit Gateway Cross-accounts
I want to share this transit gateway across multiple accounts within my organization. I therefore have to allow this in the AWS Resource Access Manager.
In the master account, as per the Resource Access Manager docs, I open https://console.aws.amazon.com/ram/home#Settings and “enable sharing within your AWS Organization”. (I also take a note of the organization id, which I’ll need later.)
In the ‘production’ account, as per the Transit Gateway docs, I create a resource share. I use my organization id as the principal, being sure to untick “allow external accounts”.
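A scripted version of the resource share might look like the sketch below. It assumes `ram_client` is a `boto3.client("ram")` in the 'production' account. As far as I know, the API expects the organization as an ARN rather than the bare id you type into the console, so both ARNs are passed in by the caller; the function names are my own.

```python
def resource_share_request(name, transit_gateway_arn, organization_arn):
    """Build keyword arguments for the RAM CreateResourceShare call."""
    return {
        "name": name,
        "resourceArns": [transit_gateway_arn],
        "principals": [organization_arn],
        # Equivalent of unticking "allow external accounts" in the console.
        "allowExternalPrincipals": False,
    }


def share_transit_gateway(ram_client, name, transit_gateway_arn, organization_arn):
    """Share the transit gateway with the organization; return the share ARN."""
    response = ram_client.create_resource_share(
        **resource_share_request(name, transit_gateway_arn, organization_arn)
    )
    return response["resourceShare"]["resourceShareArn"]
```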
I then attach my two VPCs to the transit gateway, from the two different accounts (by logging into each of the accounts in turn).
Similarly, I'll attach the VPN to the transit gateway.
First I create a Customer Gateway:
Next I create the VPN Connection. Note this is created and managed via the Transit Gateway Attachment, rather than in the VPN section of the AWS console (even though it is subsequently listed in the VPN section). However, if you need to delete the VPN connection then it must be done through the VPN section.
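The attachments could be scripted along the lines of the Python sketch below. It assumes `ec2_client` is a `boto3.client("ec2")` for whichever account owns the resource being attached. The function names, the ASN `65000` and all IDs are placeholders of mine, and the parameter names reflect my understanding of the boto3 EC2 API, so check them against your boto3 version.

```python
def attach_vpc(ec2_client, transit_gateway_id, vpc_id, subnet_ids):
    """Attach a VPC to the transit gateway; return the attachment ID."""
    response = ec2_client.create_transit_gateway_vpc_attachment(
        TransitGatewayId=transit_gateway_id,
        VpcId=vpc_id,
        SubnetIds=list(subnet_ids),
    )
    return response["TransitGatewayVpcAttachment"]["TransitGatewayAttachmentId"]


def attach_static_vpn(ec2_client, transit_gateway_id, on_prem_public_ip, bgp_asn=65000):
    """Create a customer gateway, then a static VPN on the transit gateway.

    Returns the VPN connection ID.
    """
    gateway = ec2_client.create_customer_gateway(
        BgpAsn=bgp_asn,  # assumed placeholder ASN
        PublicIp=on_prem_public_ip,
        Type="ipsec.1",
    )
    customer_gateway_id = gateway["CustomerGateway"]["CustomerGatewayId"]

    # Passing TransitGatewayId (instead of a virtual private gateway)
    # is what makes this VPN a transit gateway attachment.
    vpn = ec2_client.create_vpn_connection(
        CustomerGatewayId=customer_gateway_id,
        Type="ipsec.1",
        TransitGatewayId=transit_gateway_id,
        Options={"StaticRoutesOnly": True},
    )
    return vpn["VpnConnection"]["VpnConnectionId"]
```

`attach_vpc` would be run once per account (staging and production), each time with that account's credentials.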
Create Transit Gateway Route Tables: for VPN
I create the first route table that the VPN will use. I'll "associate" the VPN so that it can make use of this route table; I'll "propagate" the VPCs so that their CIDR block(s) are added to the route table. This allows packets from the VPN to be routed to the VPCs.
The route table configuration used by the VPN is shown below:
This shows that the VPN is “associated” with the transit gateway route table so traffic coming from that VPN can use this route table. The two VPCs “propagate” their CIDR blocks into the route table, so the VPN’s traffic can be routed to these two VPCs.
Create Transit Gateway Route Tables: for VPCs
Next I repeat these steps for a route table that the VPCs will use. I'll "associate" the VPCs so that they can make use of this route table; I'll "propagate" the VPN so that its CIDR block(s) are added to the route table.
The route table configuration used by the VPCs is shown below:
If I’d used BGP for the VPN, we’d be done. However, I ticked “static”. This means the route table doesn’t know what routes to propagate for the VPN connection. Under “Routes”, I therefore click “Create route”. I give it the on-prem CIDR and choose the VPN attachment. This creates a route with two attachments, for the two IPs used by the two tunnels of the VPN connection.
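Pulling the two route tables together, a scripted version of the steps above might look like the sketch below. It assumes `ec2_client` is a `boto3.client("ec2")` in the account that owns the transit gateway, and that the attachment IDs are already known. The function name is my own. A real script would also wait for each new route table to become available before using it, which I've left out for brevity.

```python
def configure_isolated_routing(ec2_client, transit_gateway_id,
                               vpn_attachment_id, vpc_attachment_ids,
                               on_prem_cidr):
    """Create one route table for the VPN and one for the VPCs.

    Returns (vpn_table_id, vpc_table_id).
    """
    def new_table():
        response = ec2_client.create_transit_gateway_route_table(
            TransitGatewayId=transit_gateway_id
        )
        return response["TransitGatewayRouteTable"]["TransitGatewayRouteTableId"]

    # Table used by traffic arriving from the VPN: it learns the VPC CIDRs.
    vpn_table = new_table()
    ec2_client.associate_transit_gateway_route_table(
        TransitGatewayRouteTableId=vpn_table,
        TransitGatewayAttachmentId=vpn_attachment_id,
    )
    for vpc_attachment_id in vpc_attachment_ids:
        ec2_client.enable_transit_gateway_route_table_propagation(
            TransitGatewayRouteTableId=vpn_table,
            TransitGatewayAttachmentId=vpc_attachment_id,
        )

    # Table used by traffic arriving from the VPCs: it only knows on-prem,
    # so the VPCs have no route to each other.
    vpc_table = new_table()
    for vpc_attachment_id in vpc_attachment_ids:
        ec2_client.associate_transit_gateway_route_table(
            TransitGatewayRouteTableId=vpc_table,
            TransitGatewayAttachmentId=vpc_attachment_id,
        )
    ec2_client.enable_transit_gateway_route_table_propagation(
        TransitGatewayRouteTableId=vpc_table,
        TransitGatewayAttachmentId=vpn_attachment_id,
    )
    # A static VPN propagates nothing, so add the on-prem route by hand.
    ec2_client.create_transit_gateway_route(
        DestinationCidrBlock=on_prem_cidr,
        TransitGatewayRouteTableId=vpc_table,
        TransitGatewayAttachmentId=vpn_attachment_id,
    )
    return vpn_table, vpc_table
```

The isolation between staging and production comes from what is missing: no VPC attachment is ever propagated into the table that the VPCs are associated with.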
VPC Subnet Route Tables
We still need to tell the VPC subnets' route tables to route traffic to the transit gateway.
I open the route table for the first VPC subnet (in the 'VPC' service, I click on subnets, find my subnet and click the "route table" link).
I add a route so that traffic being sent from the VPC to on-prem will go via the Transit Gateway. I choose the on-prem CIDR (172.31.0.0/16) and from the drop-down I choose "Transit Gateway" and then my transit gateway's ID.
I repeat this for my other subnets in this VPC, and do the same in the 'production' VPC.
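Since the same route has to be added to several subnet route tables, this step is a natural candidate for a small loop. The sketch below assumes `ec2_client` is a `boto3.client("ec2")` for the account that owns the VPC, and that you have already collected the route table IDs; the function name is hypothetical.

```python
def route_on_prem_via_transit_gateway(ec2_client, route_table_ids,
                                      transit_gateway_id,
                                      on_prem_cidr="172.31.0.0/16"):
    """Add a route to on-prem, via the transit gateway, to each route table."""
    for route_table_id in route_table_ids:
        ec2_client.create_route(
            RouteTableId=route_table_id,
            DestinationCidrBlock=on_prem_cidr,
            TransitGatewayId=transit_gateway_id,
        )
```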
We're done! Let's test if this works.
I have three private VMs, each listening on an open TCP port: one VM in each VPC and one on-prem. I check that from the on-prem VM I can reach each of the VPC VMs, and vice-versa. I check that from the 'staging' VM I cannot reach the 'production' VM, and vice-versa.
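These checks are easy to script with plain TCP connection attempts. The sketch below uses only the Python standard library; the helper names are my own, and you would fill in the private IPs and port of your own VMs. Note that a destination blocked by routing typically shows up as a timeout rather than an immediate refusal, so keep the timeout short.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False


def failed_expectations(expectations, timeout=3.0):
    """Check (host, port, should_connect) tuples; return those that failed."""
    failures = []
    for host, port, should_connect in expectations:
        if can_connect(host, port, timeout) != should_connect:
            failures.append((host, port, should_connect))
    return failures
```

Run from the 'staging' VM, for example, the expectations would list the on-prem VM with `True` and the 'production' VM with `False`; an empty result means the routing behaves as intended from that vantage point.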
There are a few really important things I've not covered in this blog. There is a huge gap between a blog example or the AWS "getting started" documentation and a production-quality system. This is particularly true for an enterprise that requires buy-in from infosec and operations, and that needs to maintain and evolve its systems over many years.
Below are a few of the topics to consider:
- Lock down the network connections - do you really need everything in on-prem to be able to reach everything in your VPCs, and vice-versa?! As well as locking down the VPCs with security groups and NACLs, you may want to restrict the routed CIDRs to only those that really need access.
- Lock down user IAM access to all of these AWS resources, following the principle of least privilege. This includes the configuration of the Transit Gateway, the VPN and the VPC routes.
- Enable monitoring to detect when there is a problem (see docs for Monitor your Transit Gateway)
- Use configuration-as-code (e.g. CloudFormation or Terraform) to set up the Transit Gateway, and to subsequently manage changes to its configuration.
Other Use Cases
The AWS docs give a few more ideas of AWS Transit Gateway use cases. There are many more possibilities. If your VPC inter-connectivity requirements are small and simple, stick with VPC Peering and VPNs attached directly to a single VPC. However, consider Transit Gateway if you need non-trivial inter-connectivity, are already using or thinking about Transit VPCs, need “shared services” VPC(s), or have more advanced routing requirements such as for Intrusion Prevention Systems (IPS).