You have workloads in AWS. You have workloads in Azure. They need to talk to each other over private IPs, through encrypted tunnels across the public internet. The solution is a site-to-site VPN, and while both clouds support it natively, getting them to agree on the details takes some understanding of how each side works.
In this blog post I’ll walk through the concepts, the differences between AWS and Azure VPN implementations, and show you a working Terraform setup that provisions both sides in a single apply.
The Building Blocks
A site-to-site VPN has three layers, and it's worth understanding each one on its own before seeing how they fit together.
Layer 1: IPsec – The Encrypted Pipe
IPsec creates an encrypted tunnel between two public IP addresses. It negotiates in two phases:
- Phase 1 (IKE) establishes a secure control channel. The two sides authenticate using a pre-shared key, agree on encryption parameters (AES-256, SHA-256, etc.), and set up a session that rekeys periodically (every 8 hours by default on AWS).
- Phase 2 (IPsec SA) creates the actual data tunnel inside that control channel. This is where your traffic flows, encrypted. It rekeys more often (hourly by default) for forward secrecy — even if a key leaks, only about an hour of traffic is exposed.
The result is an encrypted pipe between two public IPs. But a pipe alone isn’t useful. For that you’ll also need routing.
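If you want to pin these parameters rather than rely on negotiation defaults, AWS exposes them as per-tunnel options on its VPN connection resource. A sketch using real `aws_vpn_connection` tunnel arguments (the variable names are placeholders):

```hcl
resource "aws_vpn_connection" "example" {
  vpn_gateway_id      = var.vgw_id
  customer_gateway_id = var.cgw_id
  type                = "ipsec.1"

  # Pin Phase 1 (IKE) and Phase 2 (IPsec SA) parameters explicitly.
  tunnel1_ike_versions                 = ["ikev2"]
  tunnel1_phase1_encryption_algorithms = ["AES256"]
  tunnel1_phase1_integrity_algorithms  = ["SHA2-256"]
  tunnel1_phase1_lifetime_seconds      = 28800 # 8 hours
  tunnel1_phase2_lifetime_seconds      = 3600  # 1 hour
}
```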
Layer 2: BGP – The Routing Protocol
IPsec doesn’t know anything about your internal networks. It just encrypts and forwards. BGP (Border Gateway Protocol) runs inside the IPsec tunnel and handles route exchange:
- Azure tells AWS: “I have 10.224.0.0/12, send traffic for it to me”
- AWS tells Azure: “I have 172.31.0.0/16, send traffic for it to me”
- If a tunnel goes down, BGP detects it within seconds and reroutes
Each BGP speaker needs an ASN (Autonomous System Number), basically a unique identifier. AWS defaults to 64512, Azure defaults to 65515. These are from the private range (64512-65534), analogous to how 192.168.x.x is a private IP range.
Layer 3: Link-Local Addresses – The Inside of the Tunnel
BGP peers need IP addresses to talk to each other. But which IPs? Your real subnets (172.31.x.x, 10.224.x.x) haven’t been routed yet — that’s what BGP is trying to set up. It’s a chicken-and-egg problem.
The solution is link-local addresses from the 169.254.0.0/16 range. These are special IPs that only exist on a single point-to-point link — in this case, inside the IPsec tunnel. Each tunnel gets a /30 subnet (4 IPs, 2 usable):
┌────────── IPsec Tunnel ──────────┐
Azure ── 169.254.21.2 <──BGP──> 169.254.21.1 ── AWS
└──────────────────────────────────┘
(169.254.21.0/30)
Within each /30, AWS takes host 1 and the customer (Azure) takes host 2. Azure calls these “APIPA addresses” (Automatic Private IP Addressing); AWS calls them “inside CIDRs.” Both mean the same: a temporary IP that exists only inside the tunnel for BGP to use.
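Terraform's built-in `cidrhost` function makes this host-1/host-2 convention easy to compute from the /30 alone. A minimal sketch:

```hcl
locals {
  tunnel_cidr = "169.254.21.0/30"

  aws_inside   = cidrhost(local.tunnel_cidr, 1) # "169.254.21.1" — AWS side
  azure_inside = cidrhost(local.tunnel_cidr, 2) # "169.254.21.2" — Azure side
}
```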
How AWS and Azure Differ
Both clouds support IPsec + BGP, but they handle the details differently:
| Aspect | AWS | Azure |
|---|---|---|
| Gateway | Virtual Private Gateway (VGW) — no public IPs of its own | Virtual Network Gateway — has dedicated public IPs |
| Remote peer representation | Customer Gateway (CGW) — metadata only | Local Network Gateway — metadata only |
| Tunnels per connection | 2 tunnel endpoints per VPN connection (always) | 1 tunnel per connection, pinned to a gateway instance |
| Inside tunnel IPs | Auto-assigned or specified, must be 169.254.x.x/30 | Manually specified, must be 169.254.x.x (APIPA) |
| Pre-shared keys | Generated by AWS | You provide them (or auto-generate) |
| Provisioning time | ~5 minutes | ~30-45 minutes |
| Route propagation | Explicit per-route-table setting | Automatic within the VNet |
Key Difference: Tunnel Counts
AWS always creates two endpoints per VPN connection, each with its own public IP in a different AZ. The VGW itself has no public IPs. The tunnel endpoint IPs are allocated when the VPN connection is created.
Azure creates one IPsec connection per connection resource, targeting a specific gateway instance. In active-standby mode (one instance), all connections share a single point of failure. In active-active mode (two instances, two public IPs), connections can be distributed across instances.
Why Active-Active Matters
An Azure gateway instance can carry multiple APIPA addresses, but without custom_bgp_addresses on each connection, Azure simply picks the first address in the list for every connection, regardless of which tunnel it’s peering with. The custom_bgp_addresses block that pins a specific APIPA to a specific connection requires both a primary and a secondary field, mapping to ip-config-1 and ip-config-2 respectively. Active-standby mode only has one ip-config, so custom_bgp_addresses doesn’t work there — you must use active-active mode. The cost difference is negligible: one extra static public IP on top of the ~$140/month VpnGw1 gateway.
With active-active, the full picture for an AWS-to-Azure VPN is:
- 2 Azure public IPs (one per instance)
- 2 AWS Customer Gateways (one per Azure IP, since a CGW only accepts a single IP)
- 2 AWS VPN Connections (one per CGW, each producing 2 endpoints = 4 total)
- 4 Azure connection resources (one per AWS endpoint)
Azure Instance 0 (PIP 1) Azure Instance 1 (PIP 2)
| |
AWS CGW 0 AWS CGW 1
| |
AWS VPN Connection 0 AWS VPN Connection 1
/ \ / \
Tunnel 1 Tunnel 2 Tunnel 1 Tunnel 2
| | | |
vpn_aws_ vpn_aws_ vpn_aws_ vpn_aws_
i0_t1 i0_t2 i1_t1 i1_t2
If Azure instance 0 goes down, instance 1 and its tunnels stay up. If an AWS endpoint goes down, the other endpoint for that CGW takes over. Full redundancy on both sides.
APIPA Address Allocation
Each tunnel needs a unique /30 from the 169.254.0.0/16 range. AWS takes host 1, Azure takes host 2:
| Tunnel | Inside CIDR | AWS (host 1) | Azure (host 2) |
|---|---|---|---|
| CGW0 Tunnel 1 -> Instance 0 | 169.254.21.0/30 | 169.254.21.1 | 169.254.21.2 |
| CGW0 Tunnel 2 -> Instance 0 | 169.254.22.0/30 | 169.254.22.1 | 169.254.22.2 |
| CGW1 Tunnel 1 -> Instance 1 | 169.254.21.4/30 | 169.254.21.5 | 169.254.21.6 |
| CGW1 Tunnel 2 -> Instance 1 | 169.254.22.4/30 | 169.254.22.5 | 169.254.22.6 |
All of these are derived from a single Terraform variable — a list of lists where the first list contains CIDRs for instance 0 and the second for instance 1:
variable "aws_tunnel_inside_cidrs" {
type = list(list(string))
default = [
["169.254.21.0/30", "169.254.22.0/30"], # Azure instance 0
["169.254.21.4/30", "169.254.22.4/30"], # Azure instance 1
]
}
The AWS module consumes these CIDRs directly. The Azure gateway derives its APIPA addresses using cidrhost(cidr, 2), which picks host 2 from each /30. One variable drives both sides — no duplication, no drift.
The Terraform Implementation
Module Structure
We split the VPN into reusable modules:
modules/
├── azure/vpn/
│ ├── gateway/ # Azure VPN Gateway (subnet, 2 public IPs, gateway)
│ └── connection/ # Generic: Local Network Gateway + VPN Connection
└── aws/vpn/
└── gateway/ # 2 Customer Gateways + 2 VPN Connections (4 tunnels)
The Azure connection module is generic. It takes a remote peer’s public IP, BGP ASN, APIPA address, and custom BGP addresses, then creates the Azure-side connection. It’s used once per tunnel, regardless of whether the remote peer is AWS, an office firewall, or another cloud.
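The connection module’s internals aren’t shown in full below, so here’s a rough sketch of what it contains. The resource arguments are real azurerm attributes; the variable names match the inputs used later in this post:

```hcl
# The remote peer (e.g. an AWS tunnel endpoint), represented as metadata.
resource "azurerm_local_network_gateway" "this" {
  name                = var.connection_name
  location            = var.location
  resource_group_name = var.resource_group_name
  gateway_address     = var.remote_gateway_address # remote public IP

  bgp_settings {
    asn                 = var.remote_bgp_asn
    bgp_peering_address = var.remote_bgp_peering_address # remote 169.254.x.x
  }
}

resource "azurerm_virtual_network_gateway_connection" "this" {
  name                       = var.connection_name
  location                   = var.location
  resource_group_name        = var.resource_group_name
  type                       = "IPsec"
  virtual_network_gateway_id = var.gateway_id
  local_network_gateway_id   = azurerm_local_network_gateway.this.id
  shared_key                 = var.shared_key
  enable_bgp                 = true

  # Pins a specific APIPA address to this connection (active-active only).
  custom_bgp_addresses {
    primary   = var.custom_bgp_address_primary
    secondary = var.custom_bgp_address_secondary
  }
}
```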
The Azure VPN Gateway
The gateway is always provisioned with active-active and two public IPs:
resource "azurerm_subnet" "gateway" {
name = "GatewaySubnet" # Azure requires this exact name
resource_group_name = var.vnet_resource_group_name
virtual_network_name = var.vnet_name
address_prefixes = [var.gateway_subnet_prefix] # minimum /27
}
resource "azurerm_public_ip" "pip1" {
name = "${var.base_resource_name_h}-vpngw-pip-1"
location = var.location
resource_group_name = var.resource_group_name
allocation_method = "Static"
sku = "Standard"
}
resource "azurerm_public_ip" "pip2" {
name = "${var.base_resource_name_h}-vpngw-pip-2"
location = var.location
resource_group_name = var.resource_group_name
allocation_method = "Static"
sku = "Standard"
}
resource "azurerm_virtual_network_gateway" "this" {
name = "${var.base_resource_name_h}-vpngw"
location = var.location
resource_group_name = var.resource_group_name
type = "Vpn"
vpn_type = "RouteBased"
sku = "VpnGw1"
active_active = true
bgp_enabled = true
bgp_settings {
asn = 65515
peering_addresses {
ip_configuration_name = "vpn-ip-config-1"
apipa_addresses = var.azure_apipa_addresses_1 # instance 0 APIPA addresses
}
peering_addresses {
ip_configuration_name = "vpn-ip-config-2"
apipa_addresses = var.azure_apipa_addresses_2 # instance 1 APIPA addresses
}
}
ip_configuration {
name = "vpn-ip-config-1"
public_ip_address_id = azurerm_public_ip.pip1.id
private_ip_address_allocation = "Dynamic"
subnet_id = azurerm_subnet.gateway.id
}
ip_configuration {
name = "vpn-ip-config-2"
public_ip_address_id = azurerm_public_ip.pip2.id
private_ip_address_allocation = "Dynamic"
subnet_id = azurerm_subnet.gateway.id
}
}
Each ip_configuration maps to a gateway instance. Each instance gets its own APIPA addresses. Note: this resource takes 30-45 minutes to provision.
The APIPA addresses are derived from the AWS tunnel CIDRs using cidrhost:
locals {
azure_apipa_instance_0 = [for cidr in var.aws_tunnel_inside_cidrs[0] : cidrhost(cidr, 2)]
azure_apipa_instance_1 = [for cidr in var.aws_tunnel_inside_cidrs[1] : cidrhost(cidr, 2)]
}
cidrhost(cidr, 2) picks host 2 from each /30 — the Azure side of the tunnel.
The AWS Side
The existing VGW is looked up via a data source. We create two Customer Gateways (one per Azure public IP) and two VPN Connections:
data "aws_vpn_gateway" "this" {
id = var.aws_vgw_id
}
resource "aws_customer_gateway" "instance_0" {
bgp_asn = var.azure_bgp_asn # 65515
ip_address = var.azure_gateway_public_ip_1 # Azure PIP 1
type = "ipsec.1"
}
resource "aws_customer_gateway" "instance_1" {
bgp_asn = var.azure_bgp_asn # 65515
ip_address = var.azure_gateway_public_ip_2 # Azure PIP 2
type = "ipsec.1"
}
resource "aws_vpn_connection" "instance_0" {
vpn_gateway_id = data.aws_vpn_gateway.this.id
customer_gateway_id = aws_customer_gateway.instance_0.id
type = "ipsec.1"
tunnel1_inside_cidr = var.tunnel_inside_cidrs_instance_0[0] # 169.254.21.0/30
tunnel2_inside_cidr = var.tunnel_inside_cidrs_instance_0[1] # 169.254.22.0/30
}
resource "aws_vpn_connection" "instance_1" {
vpn_gateway_id = data.aws_vpn_gateway.this.id
customer_gateway_id = aws_customer_gateway.instance_1.id
type = "ipsec.1"
tunnel1_inside_cidr = var.tunnel_inside_cidrs_instance_1[0] # 169.254.21.4/30
tunnel2_inside_cidr = var.tunnel_inside_cidrs_instance_1[1] # 169.254.22.4/30
}
We specify the inside CIDRs explicitly so they match the APIPA addresses already configured on the Azure gateway. If you omit them, AWS auto-assigns random /30s and you’d have to update Azure to match.
Each VPN Connection outputs tunnel public IPs, BGP peering addresses, and pre-shared keys — 4 sets of outputs total.
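The aws_vpn_connection resource exposes all of these as attributes; the module just forwards them. A sketch of the instance-0, tunnel-1 trio (the attribute names are real, the output names are this post’s convention):

```hcl
output "i0_tunnel1_address" {
  value = aws_vpn_connection.instance_0.tunnel1_address # endpoint public IP
}

output "i0_tunnel1_bgp_peering_address" {
  value = aws_vpn_connection.instance_0.tunnel1_vgw_inside_address # AWS's 169.254.x.x
}

output "i0_tunnel1_shared_key" {
  value     = aws_vpn_connection.instance_0.tunnel1_preshared_key
  sensitive = true
}
```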
Wiring the Two Sides Together
This is where Terraform shines. The AWS module outputs feed directly into the Azure connection modules — no manual copying of IPs or keys:
module "vpn_gateway" {
source = "../../modules/azure/vpn/gateway"
# ... creates active-active Azure VPN Gateway with 2 public IPs
azure_apipa_addresses_1 = local.azure_apipa_instance_0
azure_apipa_addresses_2 = local.azure_apipa_instance_1
}
module "aws_vpn_gateway" {
source = "../../modules/aws/vpn/gateway"
aws_vgw_id = var.aws_vgw_id
azure_gateway_public_ip_1 = module.vpn_gateway.gateway_public_ip_1
azure_gateway_public_ip_2 = module.vpn_gateway.gateway_public_ip_2
azure_bgp_asn = module.vpn_gateway.gateway_bgp_asn
tunnel_inside_cidrs_instance_0 = var.aws_tunnel_inside_cidrs[0]
tunnel_inside_cidrs_instance_1 = var.aws_tunnel_inside_cidrs[1]
}
# 4 Azure connections — one per AWS tunnel endpoint
module "vpn_aws_i0_t1" {
source = "../../modules/azure/vpn/connection"
connection_name = "aws-i0-t1"
gateway_id = module.vpn_gateway.gateway_id
remote_gateway_address     = module.aws_vpn_gateway.i0_tunnel1_address
remote_bgp_asn             = module.aws_vpn_gateway.vgw_bgp_asn
remote_bgp_peering_address = module.aws_vpn_gateway.i0_tunnel1_bgp_peering_address
shared_key                 = module.aws_vpn_gateway.i0_tunnel1_shared_key
custom_bgp_address_primary = local.azure_apipa_instance_0[0] # active on instance 0
custom_bgp_address_secondary = local.azure_apipa_instance_1[0] # required, unused
}
# ... repeat for i0_t2, i1_t1, i1_t2
The Azure gateway’s two public IPs flow into AWS as two Customer Gateways. AWS creates two VPN Connections (4 tunnel endpoints) and outputs their details. Those details flow back into four Azure connections. Everything is a single terraform apply.
Custom BGP Addresses
Each Azure connection specifies custom_bgp_addresses with primary (instance 0’s APIPA) and secondary (instance 1’s APIPA). For connections targeting instance 0, the primary address is the one actually used for BGP peering. For connections targeting instance 1, the secondary is used. The other is required by Azure but not active for that connection.
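Concretely, a connection targeting instance 1 passes the same two fields with the roles effectively swapped (module input names as used earlier in this post):

```hcl
module "vpn_aws_i1_t1" {
  source = "../../modules/azure/vpn/connection"
  # ... remote peer details from the AWS module's i1 tunnel 1 outputs ...

  custom_bgp_address_primary   = local.azure_apipa_instance_0[0] # required, unused
  custom_bgp_address_secondary = local.azure_apipa_instance_1[0] # active on instance 1
}
```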
The Pre-Shared Key Problem
The connection module needs to support two scenarios: generating a key (for office connections where you control both sides) and accepting a key (for AWS connections where AWS generates them). The naive approach with a conditional count on random_password breaks at plan time because the AWS key isn’t known until apply.
The fix is to always generate a key, but only use it as a fallback:
resource "random_password" "shared_key" {
length = 64
special = true
}
locals {
shared_key = var.shared_key != null ? var.shared_key : random_password.shared_key.result
}
AWS connections pass in their key, so the generated one is ignored. Office connections omit the variable, so the generated key is used.
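The corresponding variable just needs a null default so callers can omit it; `sensitive = true` keeps the key out of plan output. The exact declaration shape here is an assumption:

```hcl
variable "shared_key" {
  description = "Pre-shared key from the remote side; leave null to generate one"
  type        = string
  default     = null
  sensitive   = true
}
```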
Verifying the Connection
After terraform apply completes (about 50 minutes, mostly the Azure gateway), you can verify from both sides.
AWS side — check tunnel status for both VPN connections:
aws ec2 describe-vpn-connections \
--filters "Name=tag:Name,Values=azure-vpn-i0,azure-vpn-i1" \
--query 'VpnConnections[*].{Name:Tags[?Key==`Name`].Value|[0],Telemetry:VgwTelemetry[*].{IP:OutsideIpAddress,Status:Status,Routes:AcceptedRouteCount}}' \
--output json
All four tunnels should show UP with at least 1 accepted BGP route (the Azure VNet CIDR).
Azure side — check learned routes:
az network vnet-gateway list-learned-routes \
--resource-group <resource-group> \
--name <gateway-name> \
--output table
You should see the AWS VPC CIDR learned via EBgp from four peers (the four tunnel APIPA addresses). You’ll also see IBgp routes between the two Azure instances — this is the two gateway instances sharing routes internally, confirming that active-active is working.
One More Thing: Route Propagation on AWS
Azure automatically makes BGP-learned routes available within the VNet. AWS does not — you need to enable route propagation on each VPC route table where resources need to reach Azure.
Without it, your EC2 instances learn nothing about 10.224.0.0/12 even though the VGW has the route via BGP. Enable propagation in the AWS console under VPC > Route Tables > Route Propagation, or via:
aws ec2 enable-vgw-route-propagation \
--route-table-id rtb-xxxxxxxx \
--gateway-id vgw-xxxxxxxx
This is a per-route-table setting, so if you have separate route tables for public and private subnets, enable it on each one that needs Azure connectivity.
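In Terraform, the same setting is the aws_vpn_gateway_route_propagation resource — one per route table that needs Azure connectivity (the route table variable is a placeholder):

```hcl
resource "aws_vpn_gateway_route_propagation" "private" {
  vpn_gateway_id = data.aws_vpn_gateway.this.id
  route_table_id = var.private_route_table_id
}
```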
A Note on Teardown
If you need to destroy and recreate the VPN (e.g. switching from active-standby to active-active), be aware of a dependency ordering issue: the AWS Customer Gateway cannot be deleted while a VPN Connection still references it. Terraform may try to delete the CGW first and fail. The workaround is to delete the AWS VPN Connection before the CGW, either by running a targeted destroy or by removing the VPN connection manually first. The Azure gateway itself takes ~10 minutes to tear down.
Summary
Connecting AWS and Azure via a site-to-site VPN is fundamentally the same pattern as connecting on-prem to a cloud: IPsec for encryption, BGP for routing, link-local addresses for peering inside the tunnel. The main complexity comes from the asymmetry: AWS creates two tunnel endpoints per VPN connection while Azure creates one connection per tunnel, and Azure requires active-active mode to support multiple APIPA-based BGP sessions.
With active-active, the setup work doubles: two Azure public IPs, two AWS Customer Gateways, two AWS VPN Connections (4 tunnel endpoints), and four Azure connection resources. But by splitting the Terraform into a gateway module (created once) and a reusable connection module (created per tunnel), all four connections use the same code with different inputs. The AWS module outputs wire directly into the Azure connection modules, so a single terraform apply provisions both sides without any manual value copying.
If you’d like help configuring VPNs between AWS, Azure, or on-prem servers, contact us at Trailhead. We can help you do it with Terraform in a way that is automatic and repeatable.