Setting up a VPC with Public and Private Subnets with NAT on the Cheap

I am back at my AWS studies, taking Adrian Cantrill’s Solutions Architect Professional course (https://learn.cantrill.io). But between study sessions, I thought I would just play around with some things in AWS.

To that end, I set out to create a CloudFormation script that would create a VPC that had public subnets in three AZs, six private subnets in three AZs (2 each), and a NAT Gateway.

My goal was to leave this infrastructure up and running for me to experiment with. But I was quickly dissuaded from this idea when I saw the costs. The VPC and subnets would be doable, of course: there is no charge for those, and the Internet Gateway is free (aside from data transfer through it).

The problem is the NAT gateway. It is a beast: it supports up to 55,000 simultaneous connections to each unique destination, starts at 5 Gbps of bandwidth and scales automatically up to 100 Gbps, and can process up to 10 million packets per second. But this comes at a cost: $0.045/hour, which doesn’t sound bad until you work out that it is about $32.00/mo.

For a learning and experimental project, at least for me, that’s a bit steep. I have no problem paying for something I will use daily and that enables a ton of projects. But in this case, I’ll play with it a few days a month. I may leave some things running in this infrastructure all the time, but nothing that will make me any money or justify that much spending on this one component.

So I decided to opt for a NAT instance. A NAT instance is not nearly as robust as the NAT gateway; it is really just an EC2 instance configured to do NAT. AWS actually has an AMI that is pre-configured as a NAT instance, but I opted to use a standard Amazon Linux 2023 AMI and configure it to do NAT myself, because the AWS NAT instance AMI is nearing end of life and the NAT instance documentation indicates they are not replacing it.

Diagram of Architecture

The above is a simplified depiction of the architecture. The private subnets use the private route table and route non-local traffic to the NAT instance, while the public subnets use the public route table and route non-local traffic to the Internet Gateway (IGW).
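To make "non-local traffic" concrete, here is a minimal Python sketch of how a route table picks a target: the most specific matching route wins. The two dictionaries are simplified stand-ins for the route tables described above, and the target labels ("local", "nat-instance", "internet-gateway") are names I made up for illustration, not real AWS identifiers.

```python
import ipaddress

# Hypothetical, simplified copies of the two route tables described above.
# The targets are plain labels chosen for this sketch, not AWS resource IDs.
PRIVATE_ROUTES = {
    "10.1.0.0/16": "local",
    "0.0.0.0/0": "nat-instance",
}
PUBLIC_ROUTES = {
    "10.1.0.0/16": "local",
    "0.0.0.0/0": "internet-gateway",
}


def pick_route(routes, destination):
    """Return the target of the most specific route containing destination."""
    address = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if address in ipaddress.ip_network(cidr)
    ]
    # Longest prefix wins, so the /16 local route beats the /0 default route.
    _, target = max(matches, key=lambda match: match[0].prefixlen)
    return target


if __name__ == "__main__":
    print(pick_route(PRIVATE_ROUTES, "10.1.20.15"))     # stays inside the VPC
    print(pick_route(PRIVATE_ROUTES, "93.184.216.34"))  # leaves via the NAT instance
    print(pick_route(PUBLIC_ROUTES, "93.184.216.34"))   # leaves via the IGW
```

Any address inside 10.1.0.0/16 resolves to the local route in both tables. Only the default route differs between them, which is the whole difference between a "public" and a "private" subnet in this design.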

The idea behind the subnets was that we might want application and database tiers, each with its own group of private subnets spread across the AZs for resiliency. The public subnets (the first three) are for a web application layer. The final three subnets are for the NAT instance and eventually a VPN instance (because, for my pet projects, the Site-to-Site VPN option from Amazon is also too expensive to justify).
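With twelve subnets it is easy to mistype a CIDR, so a quick sanity check can help before deploying. The sketch below is my own helper, not part of the stacks in this post. It takes the default CIDR values from the network template's parameters (shown later in the post) and checks that every subnet sits inside the VPC range and that no two subnets overlap. The function name check_layout is hypothetical.

```python
import ipaddress
from itertools import combinations

VPC_CIDR = "10.1.0.0/16"

# Default subnet CIDRs copied from the network template's parameters.
SUBNET_CIDRS = {
    "PublicSubnetOne": "10.1.0.0/24",
    "PublicSubnetTwo": "10.1.1.0/24",
    "PublicSubnetThree": "10.1.2.0/24",
    "PrivateSubnetOne": "10.1.10.0/24",
    "PrivateSubnetTwo": "10.1.11.0/24",
    "PrivateSubnetThree": "10.1.12.0/24",
    "PrivateSubnetFour": "10.1.20.0/24",
    "PrivateSubnetFive": "10.1.21.0/24",
    "PrivateSubnetSix": "10.1.22.0/24",
    "NetworkServiceSubnetOne": "10.1.250.0/24",
    "NetworkServiceSubnetTwo": "10.1.251.0/24",
    "NetworkServiceSubnetThree": "10.1.252.0/24",
}


def check_layout(vpc_cidr, subnet_cidrs):
    """Return a list of problems: subnets outside the VPC or overlapping."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = {
        name: ipaddress.ip_network(cidr) for name, cidr in subnet_cidrs.items()
    }
    problems = []
    # Every subnet must be fully contained in the VPC range.
    for name, subnet in subnets.items():
        if not subnet.subnet_of(vpc):
            problems.append(f"{name} ({subnet}) is outside {vpc}")
    # No pair of subnets may share addresses.
    for (name_a, net_a), (name_b, net_b) in combinations(subnets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} overlaps {name_b}")
    return problems


if __name__ == "__main__":
    for problem in check_layout(VPC_CIDR, SUBNET_CIDRS):
        print(problem)
```

With the defaults above, the function returns an empty list, so nothing is printed. If you override the parameters, update the dictionary to match before running the check.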

The security group for the NAT instance is set up to accept inbound traffic from the VPC CIDR block, 10.1.0.0/16, and to allow outbound traffic to any address. The instance itself does not have an SSH key installed and won’t accept any SSH connections.

This means we need to enable it for SSM connections. To do this I first implemented this Stack:

AWSTemplateFormatVersion: '2010-09-09'
Description: 'IAM Role for EC2 Instances to interact with AWS Systems Manager'

Resources:
  MyEC2SSMRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: ec2.amazonaws.com
            Action: 'sts:AssumeRole'
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore'

Outputs:
  EC2SSMRoleName:
    Description: 'Name of the IAM Role for Systems Manager'
    Value: !Ref MyEC2SSMRole
    Export:
      Name: EC2SSMRoleName

This script creates an IAM role and exports the name of that role as EC2SSMRoleName. The role carries the AmazonSSMManagedInstanceCore managed policy, so any EC2 instance that uses it through an instance profile should be reachable with SSM from the console.

The next step was actually creating all of the network infrastructure that isn’t specific to the NAT instance. Initially, I had combined the NAT instance in the same script, but as I made changes to the NAT instance configuration it became tedious to manage. I also thought I might experiment with load balancing and auto-scaling on the NAT instance at some point, and having a separate script for this made more sense.

AWSTemplateFormatVersion: '2010-09-09'
Description: 'AWS CloudFormation Template: VPC with Public and Private Subnets'

Parameters:
  VpcCidr:
    Type: String
    Default: '10.1.0.0/16'
    Description: The CIDR block for the VPC
  PublicSubnetOneCidr:
    Type: String
    Default: '10.1.0.0/24'
  PublicSubnetTwoCidr:
    Type: String
    Default: '10.1.1.0/24'
  PublicSubnetThreeCidr:
    Type: String
    Default: '10.1.2.0/24'
  PrivateSubnetOneCidr:
    Type: String
    Default: '10.1.10.0/24'
  PrivateSubnetTwoCidr:
    Type: String
    Default: '10.1.11.0/24'
  PrivateSubnetThreeCidr:
    Type: String
    Default: '10.1.12.0/24'
  PrivateSubnetFourCidr:
    Type: String
    Default: '10.1.20.0/24'
  PrivateSubnetFiveCidr:
    Type: String
    Default: '10.1.21.0/24'
  PrivateSubnetSixCidr:
    Type: String
    Default: '10.1.22.0/24'
  NetworkServiceSubnetOneCidr:
    Type: String
    Default: '10.1.250.0/24'
  NetworkServiceSubnetTwoCidr:
    Type: String
    Default: '10.1.251.0/24'
  NetworkServiceSubnetThreeCidr:
    Type: String
    Default: '10.1.252.0/24'

Resources:
  MainVPC:
    Type: 'AWS::EC2::VPC'
    Properties:
      CidrBlock: !Ref VpcCidr
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: MainVPC

  # Internet Gateway
  MainInternetGateway:
    Type: 'AWS::EC2::InternetGateway'
    Properties:
      Tags:
        - Key: Name
          Value: MainInternetGateway

  # Attach Internet Gateway to VPC
  GatewayAttachment:
    Type: 'AWS::EC2::VPCGatewayAttachment'
    Properties:
      VpcId: !Ref MainVPC
      InternetGatewayId: !Ref MainInternetGateway

  # Public Subnets
  PublicSubnetOne:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref PublicSubnetOneCidr
      AvailabilityZone: !Select [0, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: PublicSubnetOne

  PublicSubnetTwo:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref PublicSubnetTwoCidr
      AvailabilityZone: !Select [1, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: PublicSubnetTwo

  PublicSubnetThree:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref PublicSubnetThreeCidr
      AvailabilityZone: !Select [2, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: PublicSubnetThree

  # NAT/VPN Public Subnets
  NetworkServiceSubnetOne:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref NetworkServiceSubnetOneCidr
      AvailabilityZone: !Select [0, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: NetworkServiceSubnetOne

  NetworkServiceSubnetTwo:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref NetworkServiceSubnetTwoCidr
      AvailabilityZone: !Select [1, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: NetworkServiceSubnetTwo

  NetworkServiceSubnetThree:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref NetworkServiceSubnetThreeCidr
      AvailabilityZone: !Select [2, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: NetworkServiceSubnetThree

  # Private Subnets
  PrivateSubnetOne:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref PrivateSubnetOneCidr
      AvailabilityZone: !Select [0, !GetAZs '']
      Tags:
        - Key: Name
          Value: PrivateSubnetOne

  PrivateSubnetTwo:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref PrivateSubnetTwoCidr
      AvailabilityZone: !Select [1, !GetAZs '']
      Tags:
        - Key: Name
          Value: PrivateSubnetTwo

  PrivateSubnetThree:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref PrivateSubnetThreeCidr
      AvailabilityZone: !Select [2, !GetAZs '']
      Tags:
        - Key: Name
          Value: PrivateSubnetThree

  PrivateSubnetFour:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref PrivateSubnetFourCidr
      AvailabilityZone: !Select [0, !GetAZs '']
      Tags:
        - Key: Name
          Value: PrivateSubnetFour

  PrivateSubnetFive:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref PrivateSubnetFiveCidr
      AvailabilityZone: !Select [1, !GetAZs '']
      Tags:
        - Key: Name
          Value: PrivateSubnetFive

  PrivateSubnetSix:
    Type: 'AWS::EC2::Subnet'
    Properties:
      VpcId: !Ref MainVPC
      CidrBlock: !Ref PrivateSubnetSixCidr
      AvailabilityZone: !Select [2, !GetAZs '']
      Tags:
        - Key: Name
          Value: PrivateSubnetSix

  # Public Route Table
  PublicRouteTable:
    Type: 'AWS::EC2::RouteTable'
    Properties:
      VpcId: !Ref MainVPC
      Tags:
        - Key: Name
          Value: PublicRouteTable

  # Public Route for Internet Access
  PublicRoute:
    Type: 'AWS::EC2::Route'
    DependsOn: GatewayAttachment
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: '0.0.0.0/0'
      GatewayId: !Ref MainInternetGateway

  PublicSubnetOneRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref PublicSubnetOne
      RouteTableId: !Ref PublicRouteTable
 
  PublicSubnetTwoRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref PublicSubnetTwo
      RouteTableId: !Ref PublicRouteTable

  PublicSubnetThreeRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref PublicSubnetThree
      RouteTableId: !Ref PublicRouteTable

  NetworkServiceSubnetOneRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref NetworkServiceSubnetOne
      RouteTableId: !Ref PublicRouteTable

  NetworkServiceSubnetTwoRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref NetworkServiceSubnetTwo
      RouteTableId: !Ref PublicRouteTable

  NetworkServiceSubnetThreeRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref NetworkServiceSubnetThree
      RouteTableId: !Ref PublicRouteTable

  PrivateRouteTable:
    Type: 'AWS::EC2::RouteTable'
    Properties:
      VpcId: !Ref MainVPC
      Tags:
        - Key: Name
          Value: PrivateRouteTable

  # Associate Private Subnets with Private Route Table
  PrivateSubnetOneRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref PrivateSubnetOne
      RouteTableId: !Ref PrivateRouteTable

  PrivateSubnetTwoRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref PrivateSubnetTwo
      RouteTableId: !Ref PrivateRouteTable

  PrivateSubnetThreeRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref PrivateSubnetThree
      RouteTableId: !Ref PrivateRouteTable

  PrivateSubnetFourRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref PrivateSubnetFour
      RouteTableId: !Ref PrivateRouteTable

  PrivateSubnetFiveRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref PrivateSubnetFive
      RouteTableId: !Ref PrivateRouteTable

  PrivateSubnetSixRouteTableAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: !Ref PrivateSubnetSix
      RouteTableId: !Ref PrivateRouteTable

Outputs:
  VpcId:
    Description: VPC ID
    Value: !Ref MainVPC
    Export:
      Name: MainVPCId 

  VpcCidrBlock:
    Description: VPC Cidr Block
    Value: !GetAtt MainVPC.CidrBlock
    Export:
      Name: MainVPCCidrBlock

  PublicSubnetOneId:
    Description: Public Subnet One ID
    Value: !Ref PublicSubnetOne
    Export:
      Name: PublicSubnetOneId

  PublicSubnetTwoId:
    Description: Public Subnet Two ID
    Value: !Ref PublicSubnetTwo
    Export:
      Name: PublicSubnetTwoId

  PublicSubnetThreeId:
    Description: Public Subnet Three ID
    Value: !Ref PublicSubnetThree
    Export:
      Name: PublicSubnetThreeId

  PrivateSubnetOneId:
    Description: Private Subnet One ID
    Value: !Ref PrivateSubnetOne
    Export:
      Name: PrivateSubnetOneId

  PrivateSubnetTwoId:
    Description: Private Subnet Two ID
    Value: !Ref PrivateSubnetTwo
    Export:
      Name: PrivateSubnetTwoId

  PrivateSubnetThreeId:
    Description: Private Subnet Three ID
    Value: !Ref PrivateSubnetThree
    Export:
      Name: PrivateSubnetThreeId

  PrivateSubnetFourId:
    Description: Private Subnet Four ID
    Value: !Ref PrivateSubnetFour
    Export:
      Name: PrivateSubnetFourId

  PrivateSubnetFiveId:
    Description: Private Subnet Five ID
    Value: !Ref PrivateSubnetFive
    Export:
      Name: PrivateSubnetFiveId

  PrivateSubnetSixId:
    Description: Private Subnet Six ID
    Value: !Ref PrivateSubnetSix
    Export:
      Name: PrivateSubnetSixId

  NetworkServiceSubnetOneId:
    Description: Network Service Subnet One ID
    Value: !Ref NetworkServiceSubnetOne
    Export:
      Name: NetworkServiceSubnetOneId

  NetworkServiceSubnetTwoId:
    Description: Network Service Subnet Two ID
    Value: !Ref NetworkServiceSubnetTwo
    Export:
      Name: NetworkServiceSubnetTwoId

  NetworkServiceSubnetThreeId:
    Description: Network Service Subnet Three ID
    Value: !Ref NetworkServiceSubnetThree
    Export:
      Name: NetworkServiceSubnetThreeId

  MainInternetGatewayId:
    Description: Main Internet Gateway ID
    Value: !Ref MainInternetGateway
    Export:
      Name: MainInternetGatewayId

  PublicRouteTableId:
    Description: Public Route Table ID
    Value: !Ref PublicRouteTable
    Export:
      Name: PublicRouteTableId

  PrivateRouteTableId:
    Description: Private Route Table ID
    Value: !Ref PrivateRouteTable
    Export:
      Name: PrivateRouteTableId

Note that the exports here support both the NAT instance script and any future scripts that might need this data. Below is the NAT instance script:

AWSTemplateFormatVersion: '2010-09-09'
Description: 'AWS CloudFormation Template: Create NAT Instance in NetworkServiceSubnetOne'

Parameters:
  NatInstanceType:
    Type: String
    Default: 't4g.nano'
  NatInstanceAmi:
    Type: String
    Default: 'ami-04c97e62cb19d53f1'

Resources:
  IamSSMInstanceProfile:
    Type: 'AWS::IAM::InstanceProfile'
    Properties:
      Roles:
        - !ImportValue EC2SSMRoleName

  NatInstanceElasticIP:
    Type: 'AWS::EC2::EIP'
    Properties:
      Domain: vpc

  NatInstanceSecurityGroup:
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupDescription: "Security group that allows inbound from VPC and outbound to anywhere"
      VpcId: !ImportValue MainVPCId
      SecurityGroupIngress:
        - IpProtocol: -1
          CidrIp: !ImportValue MainVPCCidrBlock

      SecurityGroupEgress:
        - IpProtocol: -1
          CidrIp: 0.0.0.0/0  # Allow all outbound to any destination

  NatInstanceNetworkInterface:
    Type: 'AWS::EC2::NetworkInterface'
    Properties:
      SubnetId: !ImportValue NetworkServiceSubnetOneId
      Description: 'Network Interface for NAT Instance'
      GroupSet:
        - !Ref NatInstanceSecurityGroup
      SourceDestCheck: false

  NatInstance:
    Type: 'AWS::EC2::Instance'
    Properties:
      InstanceType: !Ref NatInstanceType
      ImageId: !Ref NatInstanceAmi
      NetworkInterfaces: 
      - DeviceIndex: 0
        NetworkInterfaceId: !Ref NatInstanceNetworkInterface

      IamInstanceProfile: !Ref IamSSMInstanceProfile
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          sudo yum update -y
          sudo yum install iptables -y
          # Add IP forwarding to sysctl.conf
          echo "net.ipv4.ip_forward = 1" | sudo tee -a /etc/sysctl.conf
          sudo sysctl -p

          # Add iptables rule to rc.local for persistence
          echo '#!/bin/bash' | sudo tee /etc/rc.d/rc.local
          echo 'sleep 30' | sudo tee -a /etc/rc.d/rc.local
          echo 'PRIMARY_INTERFACE=$(route | grep '\''^default'\'' | grep -o '\''[^ ]*$'\'')' | sudo tee -a /etc/rc.d/rc.local
          echo 'iptables -t nat -A POSTROUTING -o $PRIMARY_INTERFACE -s 0.0.0.0/0 -j MASQUERADE' | sudo tee -a /etc/rc.d/rc.local
          sudo chmod +x /etc/rc.d/rc.local

          echo '[Unit]' | sudo tee /etc/systemd/system/rc-local.service
          echo 'Description=/etc/rc.d/rc.local Compatibility' | sudo tee -a /etc/systemd/system/rc-local.service 
          echo 'ConditionPathExists=/etc/rc.d/rc.local' | sudo tee -a /etc/systemd/system/rc-local.service 
          echo '[Service]' | sudo tee -a /etc/systemd/system/rc-local.service 
          echo 'Type=forking' | sudo tee -a /etc/systemd/system/rc-local.service 
          echo 'ExecStart=/etc/rc.d/rc.local' | sudo tee -a /etc/systemd/system/rc-local.service 
          echo 'TimeoutSec=0' | sudo tee -a /etc/systemd/system/rc-local.service 
          echo 'StandardOutput=tty' | sudo tee -a /etc/systemd/system/rc-local.service 
          echo 'RemainAfterExit=yes' | sudo tee -a /etc/systemd/system/rc-local.service 
          echo 'SysVStartPriority=99' | sudo tee -a /etc/systemd/system/rc-local.service 
          echo '[Install]' | sudo tee -a /etc/systemd/system/rc-local.service 
          echo 'WantedBy=multi-user.target' | sudo tee -a /etc/systemd/system/rc-local.service 

          sudo systemctl enable rc-local
          sudo systemctl start rc-local

  MyElasticIPAssociation:
    Type: 'AWS::EC2::EIPAssociation'
    Properties:
      InstanceId: !Ref NatInstance
      AllocationId: !GetAtt NatInstanceElasticIP.AllocationId

  PrivateRoute:
    Type: 'AWS::EC2::Route'
    Properties:
      RouteTableId: !ImportValue PrivateRouteTableId
      DestinationCidrBlock: 0.0.0.0/0
      NetworkInterfaceId: !Ref NatInstanceNetworkInterface

Outputs:
  NatInstanceIPAddress:
    Description: The Nat Instance IP Address
    Value: !GetAtt NatInstanceElasticIP.PublicIp
    Export: 
      Name: NatInstanceIPAddress

You will note that a good portion of this script is actually the setup of the instance in the UserData. This part of the configuration was the most frustrating and time-consuming step. There is a less-than-ideal workaround in the script that pauses for 30 seconds; I found that without this pause, I would inconsistently get errors from iptables.

I chose to start with a t4g.nano instance for my NAT instance. This is a fairly weak instance type, so I’m not sure how it will handle the load, and I haven’t really tested it much yet. But if it works and is able to handle the little bit of traffic I plan to push through it, it will be much cheaper than the NAT Gateway option. At current rates, this will cost me approximately $0.0042/hr, which works out to about $3.00/mo. So if this works out, it is about $29.00/mo cheaper than using a NAT Gateway. For my use case, I’ll take that trade-off.
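For anyone who wants to rerun the numbers with their own rates, here is a small Python sketch of the arithmetic. The two hourly rates are the ones quoted in this post. The 730 hours per month is my assumption (8,760 hours in a year divided by 12), and the calculation covers the hourly charge only, so data transfer, storage, and anything else billed separately are left out.

```python
HOURS_PER_MONTH = 730  # assumption: average month, 8760 hours / 12

NAT_GATEWAY_HOURLY = 0.045    # USD per hour, rate quoted in this post
NAT_INSTANCE_HOURLY = 0.0042  # USD per hour, t4g.nano rate quoted in this post


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Convert an hourly rate into an approximate monthly cost in USD."""
    return round(hourly_rate * hours, 2)


if __name__ == "__main__":
    gateway = monthly_cost(NAT_GATEWAY_HOURLY)
    instance = monthly_cost(NAT_INSTANCE_HOURLY)
    print(f"NAT gateway:  ${gateway:.2f}/mo")
    print(f"NAT instance: ${instance:.2f}/mo")
    print(f"Difference:   ${gateway - instance:.2f}/mo")
```

With a 730-hour month the results land slightly above the rounded figures in the text, and a 720-hour month brings them a little closer. Either way the gap between the two options stays at roughly $29 to $30 a month.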

My next step will be to set up a VPN instance to give me a full-time connection between AWS and my home network. I will implement that as a new stack on top of the existing VPC / subnet stack described here. I plan to start with the same t4g.nano and work up from there if the load is too much. If it works out with the t4g.nano though, it looks like a similar savings to this.

Another perk of projects like this is I learn new things and get to apply things I have only learned about in courses.