Connecting AWS and Azure with a Site-to-Site VPN Using Terraform

You have workloads in AWS. You have workloads in Azure. They need to talk to each other over private IPs, with traffic encrypted as it crosses the public internet. The solution is a site-to-site VPN, and while both clouds support it natively, getting them to agree on the details takes some understanding of how each side works.

In this blog post I’ll walk through the concepts, the differences between AWS and Azure VPN implementations, and show you a working Terraform setup that provisions both sides in a single apply.

The Building Blocks

A site-to-site VPN has two layers, and it helps to understand each one on its own before looking at how they fit together.

Layer 1: IPsec – The Encrypted Pipe

IPsec creates an encrypted tunnel between two public IP addresses. It negotiates in two phases:

  • Phase 1 (IKE) establishes a secure control channel. The two sides authenticate using a pre-shared key, agree on encryption parameters (AES-256, SHA-256, etc.), and set up a session that rekeys every 8 hours.
  • Phase 2 (IPsec SA) creates the actual data tunnel inside that control channel. This is where your traffic flows, encrypted. It rekeys every hour for forward secrecy — even if a key leaks, only one hour of traffic is exposed.

The result is an encrypted pipe between two public IPs. But a pipe alone isn’t useful. For that you’ll also need routing.
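As a sketch, the rekey intervals described above map directly onto arguments of the AWS provider's aws_vpn_connection resource (the gateway and customer gateway references below are placeholders; the lifetimes shown are the AWS defaults):

```hcl
resource "aws_vpn_connection" "example" {
  vpn_gateway_id      = aws_vpn_gateway.example.id      # placeholder
  customer_gateway_id = aws_customer_gateway.example.id # placeholder
  type                = "ipsec.1"

  tunnel1_phase1_lifetime_seconds = 28800 # Phase 1 (IKE): rekey every 8 hours
  tunnel1_phase2_lifetime_seconds = 3600  # Phase 2 (IPsec SA): rekey every hour
}
```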

Layer 2: BGP – The Routing Protocol

IPsec doesn’t know anything about your internal networks. It just encrypts and forwards. BGP (Border Gateway Protocol) runs inside the IPsec tunnel and handles route exchange:

  • Azure tells AWS: “I have 10.224.0.0/12, send traffic for it to me”
  • AWS tells Azure: “I have 172.31.0.0/16, send traffic for it to me”
  • If a tunnel goes down, BGP detects it within seconds and reroutes

Each BGP speaker needs an ASN (Autonomous System Number), basically a unique identifier. AWS defaults to 64512, Azure defaults to 65515. These are from the private range (64512-65534), analogous to how 192.168.x.x is a private IP range.

BGP peers need IP addresses to talk to each other. But which IPs? Your real subnets (172.31.x.x, 10.224.x.x) haven’t been routed yet — that’s what BGP is trying to set up. It’s a chicken-and-egg problem.

The solution is link-local addresses from the 169.254.0.0/16 range. These are special IPs that only exist on a single point-to-point link — in this case, inside the IPsec tunnel. Each tunnel gets a /30 subnet (4 IPs, 2 usable):

        ┌────────── IPsec Tunnel ──────────┐
Azure ── 169.254.21.2 <──BGP──> 169.254.21.1 ── AWS
        └──────────────────────────────────┘
                  (169.254.21.0/30)

Within each /30, AWS takes host 1 and the customer (Azure) takes host 2. Azure calls these “APIPA addresses” (Automatic Private IP Addressing); AWS calls them “inside CIDRs.” Both mean the same: a temporary IP that exists only inside the tunnel for BGP to use.
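In Terraform, both ends of a /30 can be derived from the same CIDR with the built-in cidrhost function — a minimal sketch:

```hcl
locals {
  tunnel_cidr    = "169.254.21.0/30"
  aws_inside_ip  = cidrhost(local.tunnel_cidr, 1) # 169.254.21.1 — AWS side
  azure_apipa_ip = cidrhost(local.tunnel_cidr, 2) # 169.254.21.2 — Azure side
}
```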

How AWS and Azure Differ

Both clouds support IPsec + BGP, but they handle the details differently:

| Aspect | AWS | Azure |
|---|---|---|
| Gateway | Virtual Private Gateway (VGW) — no public IPs of its own | Virtual Network Gateway — has dedicated public IPs |
| Remote peer representation | Customer Gateway (CGW) — metadata only | Local Network Gateway — metadata only |
| Tunnels per connection | 2 endpoints per VPN connection (always) | 1 per connection, pinned to a gateway instance |
| Inside tunnel IPs | Auto-assigned or specified, must be 169.254.x.x/30 | Manually specified, must be 169.254.x.x (APIPA) |
| Pre-shared keys | Generated by AWS | You provide them (or auto-generate) |
| Provisioning time | ~5 minutes | ~30-45 minutes |
| Route propagation | Explicit per-route-table setting | Automatic within the VNet |

Key Difference: Tunnel Counts

AWS always creates two endpoints per VPN connection, each with its own public IP in a different AZ. The VGW itself has no public IPs. The tunnel endpoint IPs are allocated when the VPN connection is created.

Azure creates one IPsec connection per connection resource, targeting a specific gateway instance. In active-standby mode (one instance), all connections share a single point of failure. In active-active mode (two instances, two public IPs), connections can be distributed across instances.

Why Active-Active Matters

An Azure gateway instance can have multiple APIPA addresses in its list, but without custom_bgp_addresses on each connection, Azure just picks the first address for all connections, regardless of which tunnel it’s peering with. The custom_bgp_addresses block that pins a specific APIPA to a specific connection requires both a primary and secondary field, mapping to ip-config-1 and ip-config-2 respectively. Active-standby only has one ip-config, so custom_bgp_addresses doesn’t work there. You must use active-active mode. The cost difference is negligible — just one extra static public IP (~$4/month on top of the ~$140/month VpnGw1 gateway).

With active-active, the full picture for an AWS-to-Azure VPN is:

  • 2 Azure public IPs (one per instance)
  • 2 AWS Customer Gateways (one per Azure IP, since a CGW only accepts a single IP)
  • 2 AWS VPN Connections (one per CGW, each producing 2 endpoints = 4 total)
  • 4 Azure connection resources (one per AWS endpoint)


Azure Instance 0 (PIP 1)            Azure Instance 1 (PIP 2)
          |                                    |
      AWS CGW 0                            AWS CGW 1
          |                                    |
  AWS VPN Connection 0                 AWS VPN Connection 1
      /         \                          /         \
 Tunnel 1    Tunnel 2                 Tunnel 1    Tunnel 2
     |           |                        |           |
 vpn_aws_    vpn_aws_                 vpn_aws_    vpn_aws_
  i0_t1       i0_t2                    i1_t1       i1_t2

If Azure instance 0 goes down, instance 1 and its tunnels stay up. If an AWS endpoint goes down, the other endpoint for that CGW takes over. Full redundancy on both sides.

APIPA Address Allocation

Each tunnel needs a unique /30 from the 169.254.0.0/16 range. AWS takes host 1, Azure takes host 2:

| Tunnel | Inside CIDR | AWS (host 1) | Azure (host 2) |
|---|---|---|---|
| CGW0 Tunnel 1 -> Instance 0 | 169.254.21.0/30 | 169.254.21.1 | 169.254.21.2 |
| CGW0 Tunnel 2 -> Instance 0 | 169.254.22.0/30 | 169.254.22.1 | 169.254.22.2 |
| CGW1 Tunnel 1 -> Instance 1 | 169.254.21.4/30 | 169.254.21.5 | 169.254.21.6 |
| CGW1 Tunnel 2 -> Instance 1 | 169.254.22.4/30 | 169.254.22.5 | 169.254.22.6 |

All of these are derived from a single Terraform variable — a list of lists where the first list contains CIDRs for instance 0 and the second for instance 1:

variable "aws_tunnel_inside_cidrs" {
  type    = list(list(string))
  default = [
    ["169.254.21.0/30", "169.254.22.0/30"],  # Azure instance 0
    ["169.254.21.4/30", "169.254.22.4/30"],  # Azure instance 1
  ]
}

The AWS module consumes these CIDRs directly. The Azure gateway derives its APIPA addresses using cidrhost(cidr, 2), which picks host 2 from each /30. One variable drives both sides — no duplication, no drift.

The Terraform Implementation

Module Structure

We split the VPN into reusable modules:

modules/
├── azure/vpn/
│   ├── gateway/       # Azure VPN Gateway (subnet, 2 public IPs, gateway)
│   └── connection/    # Generic: Local Network Gateway + VPN Connection
└── aws/vpn/
    └── gateway/       # 2 Customer Gateways + 2 VPN Connections (4 tunnels)

The Azure connection module is generic. It takes a remote peer’s public IP, BGP ASN, APIPA address, and custom BGP addresses, then creates the Azure-side connection. It’s used once per tunnel, regardless of whether the remote peer is AWS, an office firewall, or another cloud.
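A sketch of what that module’s input interface might look like (variable names here are illustrative, not necessarily the exact ones from the repo):

```hcl
variable "connection_name"            { type = string }
variable "gateway_id"                 { type = string } # the Azure Virtual Network Gateway
variable "remote_gateway_address"     { type = string } # remote peer's public IP
variable "remote_bgp_asn"             { type = number }
variable "remote_bgp_peering_address" { type = string } # remote side's 169.254.x.x inside IP

variable "shared_key" {
  type      = string
  default   = null # when null, the module generates one (see the PSK section below)
  sensitive = true
}
```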

The Azure VPN Gateway

The gateway is always provisioned with active-active and two public IPs:

resource "azurerm_subnet" "gateway" {
  name                 = "GatewaySubnet" # Azure requires this exact name
  resource_group_name  = var.vnet_resource_group_name
  virtual_network_name = var.vnet_name
  address_prefixes     = [var.gateway_subnet_prefix] # minimum /27
}

resource "azurerm_public_ip" "pip1" {
  name                = "${var.base_resource_name_h}-vpngw-pip-1"
  location            = var.location
  resource_group_name = var.resource_group_name
  allocation_method   = "Static"
  sku                 = "Standard"
}

resource "azurerm_public_ip" "pip2" {
  name                = "${var.base_resource_name_h}-vpngw-pip-2"
  location            = var.location
  resource_group_name = var.resource_group_name
  allocation_method   = "Static"
  sku                 = "Standard"
}

resource "azurerm_virtual_network_gateway" "this" {
  name                = "${var.base_resource_name_h}-vpngw"
  location            = var.location
  resource_group_name = var.resource_group_name
  type                = "Vpn"
  vpn_type            = "RouteBased"
  sku                 = "VpnGw1"
  active_active       = true
  bgp_enabled         = true

  bgp_settings {
    asn = 65515

    peering_addresses {
      ip_configuration_name = "vpn-ip-config-1"
      apipa_addresses       = var.azure_apipa_addresses_1 # instance 0 APIPA addresses
    }

    peering_addresses {
      ip_configuration_name = "vpn-ip-config-2"
      apipa_addresses       = var.azure_apipa_addresses_2 # instance 1 APIPA addresses
    }
  }

  ip_configuration {
    name                          = "vpn-ip-config-1"
    public_ip_address_id          = azurerm_public_ip.pip1.id
    private_ip_address_allocation = "Dynamic"
    subnet_id                     = azurerm_subnet.gateway.id
  }

  ip_configuration {
    name                          = "vpn-ip-config-2"
    public_ip_address_id          = azurerm_public_ip.pip2.id
    private_ip_address_allocation = "Dynamic"
    subnet_id                     = azurerm_subnet.gateway.id
  }
}

Each ip_configuration maps to a gateway instance. Each instance gets its own APIPA addresses. Note: this resource takes 30-45 minutes to provision.

The APIPA addresses are derived from the AWS tunnel CIDRs using cidrhost:

locals {
  azure_apipa_instance_0 = [for cidr in var.aws_tunnel_inside_cidrs[0] : cidrhost(cidr, 2)]
  azure_apipa_instance_1 = [for cidr in var.aws_tunnel_inside_cidrs[1] : cidrhost(cidr, 2)]
}

cidrhost(cidr, 2) picks host 2 from each /30 — the Azure side of the tunnel.

The AWS Side

The existing VGW is looked up via a data source. We create two Customer Gateways (one per Azure public IP) and two VPN Connections:

data "aws_vpn_gateway" "this" {
  id = var.aws_vgw_id
}

resource "aws_customer_gateway" "instance_0" {
  bgp_asn    = var.azure_bgp_asn             # 65515
  ip_address = var.azure_gateway_public_ip_1 # Azure PIP 1
  type       = "ipsec.1"
}

resource "aws_customer_gateway" "instance_1" {
  bgp_asn    = var.azure_bgp_asn             # 65515
  ip_address = var.azure_gateway_public_ip_2 # Azure PIP 2
  type       = "ipsec.1"
}

resource "aws_vpn_connection" "instance_0" {
  vpn_gateway_id      = data.aws_vpn_gateway.this.id
  customer_gateway_id = aws_customer_gateway.instance_0.id
  type                = "ipsec.1"

  tunnel1_inside_cidr = var.tunnel_inside_cidrs_instance_0[0] # 169.254.21.0/30
  tunnel2_inside_cidr = var.tunnel_inside_cidrs_instance_0[1] # 169.254.22.0/30
}

resource "aws_vpn_connection" "instance_1" {
  vpn_gateway_id      = data.aws_vpn_gateway.this.id
  customer_gateway_id = aws_customer_gateway.instance_1.id
  type                = "ipsec.1"

  tunnel1_inside_cidr = var.tunnel_inside_cidrs_instance_1[0] # 169.254.21.4/30
  tunnel2_inside_cidr = var.tunnel_inside_cidrs_instance_1[1] # 169.254.22.4/30
}

We specify the inside CIDRs explicitly so they match the APIPA addresses already configured on the Azure gateway. If you omit them, AWS auto-assigns random /30s and you’d have to update Azure to match.

Each VPN Connection outputs tunnel public IPs, BGP peering addresses, and pre-shared keys — 4 sets of outputs total.
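The output names used below are illustrative, but they map onto real attributes of aws_vpn_connection — roughly:

```hcl
# One tunnel's worth of outputs; repeat for tunnel2 and for instance_1.
output "i0_tunnel1_address" {
  value = aws_vpn_connection.instance_0.tunnel1_address # AWS-side public endpoint IP
}

output "i0_tunnel1_bgp_peering_address" {
  value = aws_vpn_connection.instance_0.tunnel1_vgw_inside_address # AWS's 169.254.x.x host 1
}

output "i0_tunnel1_shared_key" {
  value     = aws_vpn_connection.instance_0.tunnel1_preshared_key
  sensitive = true # keep the pre-shared key out of plain CLI output
}
```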

Wiring the Two Sides Together

This is where Terraform shines. The AWS module outputs feed directly into the Azure connection modules — no manual copying of IPs or keys:

module "vpn_gateway" {
  source = "../../modules/azure/vpn/gateway"
  # ... creates active-active Azure VPN Gateway with 2 public IPs
  azure_apipa_addresses_1 = local.azure_apipa_instance_0
  azure_apipa_addresses_2 = local.azure_apipa_instance_1
}

module "aws_vpn_gateway" {
  source = "../../modules/aws/vpn/gateway"

  aws_vgw_id                     = var.aws_vgw_id
  azure_gateway_public_ip_1      = module.vpn_gateway.gateway_public_ip_1
  azure_gateway_public_ip_2      = module.vpn_gateway.gateway_public_ip_2
  azure_bgp_asn                  = module.vpn_gateway.gateway_bgp_asn
  tunnel_inside_cidrs_instance_0 = var.aws_tunnel_inside_cidrs[0]
  tunnel_inside_cidrs_instance_1 = var.aws_tunnel_inside_cidrs[1]
}

# 4 Azure connections — one per AWS tunnel endpoint
module "vpn_aws_i0_t1" {
  source = "../../modules/azure/vpn/connection"

  connection_name            = "aws-i0-t1"
  gateway_id                 = module.vpn_gateway.gateway_id
  remote_gateway_address     = module.aws_vpn_gateway.i0_tunnel1_address
  remote_bgp_asn             = module.aws_vpn_gateway.vgw_bgp_asn
  remote_bgp_peering_address = module.aws_vpn_gateway.i0_tunnel1_bgp_peering_address
  shared_key                 = module.aws_vpn_gateway.i0_tunnel1_shared_key

  custom_bgp_address_primary   = local.azure_apipa_instance_0[0] # active on instance 0
  custom_bgp_address_secondary = local.azure_apipa_instance_1[0] # required, unused
}
# ... repeat for i0_t2, i1_t1, i1_t2

The Azure gateway’s two public IPs flow into AWS as two Customer Gateways. AWS creates two VPN Connections (4 tunnel endpoints) and outputs their details. Those details flow back into four Azure connections. Everything is a single terraform apply.

Custom BGP Addresses

Each Azure connection specifies custom_bgp_addresses with primary (instance 0’s APIPA) and secondary (instance 1’s APIPA). For connections targeting instance 0, the primary address is the one actually used for BGP peering. For connections targeting instance 1, the secondary is used. The other is required by Azure but not active for that connection.
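Inside the connection module, this plausibly maps onto the azurerm provider’s resources along these lines (a sketch; variable and resource names are assumptions, but the custom_bgp_addresses block is the provider’s real syntax):

```hcl
resource "azurerm_local_network_gateway" "remote" {
  name                = "${var.connection_name}-lng"
  location            = var.location
  resource_group_name = var.resource_group_name
  gateway_address     = var.remote_gateway_address # AWS tunnel endpoint public IP

  bgp_settings {
    asn                 = var.remote_bgp_asn             # e.g. 64512 for the VGW
    bgp_peering_address = var.remote_bgp_peering_address # AWS's 169.254.x.x host 1
  }
}

resource "azurerm_virtual_network_gateway_connection" "this" {
  name                       = var.connection_name
  location                   = var.location
  resource_group_name        = var.resource_group_name
  type                       = "IPsec"
  virtual_network_gateway_id = var.gateway_id
  local_network_gateway_id   = azurerm_local_network_gateway.remote.id
  shared_key                 = var.shared_key
  enable_bgp                 = true

  custom_bgp_addresses {
    primary   = var.custom_bgp_address_primary   # APIPA bound to ip-config-1 (instance 0)
    secondary = var.custom_bgp_address_secondary # APIPA bound to ip-config-2 (instance 1)
  }
}
```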

The Pre-Shared Key Problem

The connection module needs to support two scenarios: generating a key (for office connections where you control both sides) and accepting a key (for AWS connections where AWS generates them). The naive approach with a conditional count on random_password breaks at plan time because the AWS key isn’t known until apply.

The fix is to always generate a key, but only use it as a fallback:

resource "random_password" "shared_key" {
  length  = 64
  special = true
}

locals {
  shared_key = var.shared_key != null ? var.shared_key : random_password.shared_key.result
}

AWS connections pass in their key, so the generated one is ignored. Office connections omit the variable, so the generated key is used.

Verifying the Connection

After terraform apply completes (about 50 minutes, mostly the Azure gateway), you can verify from both sides.

AWS side — check tunnel status for both VPN connections:

aws ec2 describe-vpn-connections \
  --filters "Name=tag:Name,Values=azure-vpn-i0,azure-vpn-i1" \
  --query 'VpnConnections[*].{Name:Tags[?Key==`Name`].Value|[0],Telemetry:VgwTelemetry[*].{IP:OutsideIpAddress,Status:Status,Routes:AcceptedRouteCount}}' \
  --output json

All four tunnels should show UP with at least 1 accepted BGP route (the Azure VNet CIDR).

Azure side — check learned routes:

az network vnet-gateway list-learned-routes \
  --resource-group <resource-group> \
  --name <gateway-name> \
  --output table

You should see the AWS VPC CIDR learned via EBgp from four peers (the four tunnel APIPA addresses). You’ll also see IBgp routes between the two Azure instances — this is the two gateway instances sharing routes internally, confirming that active-active is working.

One More Thing: Route Propagation on AWS

Azure automatically makes BGP-learned routes available within the VNet. AWS does not — you need to enable route propagation on each VPC route table where resources need to reach Azure.

Without it, your EC2 instances learn nothing about 10.224.0.0/12 even though the VGW has the route via BGP. Enable propagation in the AWS console under VPC > Route Tables > Route Propagation, or via:

aws ec2 enable-vgw-route-propagation \
  --route-table-id rtb-xxxxxxxx \
  --gateway-id vgw-xxxxxxxx

This is a per-route-table setting, so if you have separate route tables for public and private subnets, enable it on each one that needs Azure connectivity.
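The same setting can also be managed in Terraform via the AWS provider’s dedicated resource (the route table variable here is a placeholder):

```hcl
resource "aws_vpn_gateway_route_propagation" "private" {
  vpn_gateway_id = var.aws_vgw_id
  route_table_id = var.private_route_table_id # repeat per route table that needs Azure routes
}
```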

A Note on Teardown

If you need to destroy and recreate the VPN (e.g. switching from active-standby to active-active), be aware of a dependency ordering issue: the AWS Customer Gateway cannot be deleted while a VPN Connection still references it. Terraform may try to delete the CGW first and fail. The workaround is to delete the AWS VPN Connection before the CGW, either by running a targeted destroy or by removing the VPN connection manually first. The Azure gateway itself takes ~10 minutes to tear down.

Summary

Connecting AWS and Azure via a site-to-site VPN is fundamentally the same pattern as connecting on-prem to a cloud: IPsec for encryption, BGP for routing, link-local addresses for peering inside the tunnel. The main complexity comes from the asymmetry: AWS creates two tunnel endpoints per VPN connection while Azure creates one connection per tunnel, and Azure requires active-active mode to support multiple APIPA-based BGP sessions.

With active-active, the setup work doubles: two Azure public IPs, two AWS Customer Gateways, two AWS VPN Connections (4 tunnel endpoints), and four Azure connection resources. But by splitting the Terraform into a gateway module (created once) and a reusable connection module (created per tunnel), all four connections use the same code with different inputs. The AWS module outputs wire directly into the Azure connection modules, so a single terraform apply provisions both sides without any manual value copying.

If you’d like help configuring VPNs between AWS, Azure, or on-prem servers, contact us at Trailhead. We can help you do it with Terraform in a way that is automated and repeatable.


Piotr Kolodziej

Born and raised in Poland, Piotr did his first programming on an Atari 800XL. He covered his first dial-up modem with a duvet so his mom wouldn’t hear it and freak out over the internet bills. He graduated with a degree in Telecommunication and Business Application Programming. Piotr is a certified .NET and Azure Developer, and is passionate about excellent software architecture.
