Tailscale behind an Azure NAT gateway fails to establish a direct connection

I have configured a Tailscale exit node on an Azure VM. The VM sits in a VNet subnet that uses the default Azure Internet gateway as its default gateway. I have the necessary NSG rules to allow UDP 41641 and 3478, my Tailscale clients make a “direct” connection, and everything works as I expected.
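
For reference, the NSG rules in question can be created with the Azure CLI roughly like this (a sketch; the resource group and NSG names are placeholders):

    # Allow Tailscale's WireGuard port (41641) and STUN (3478) inbound.
    # "my-rg" and "my-nsg" are placeholder names for this example.
    az network nsg rule create \
      --resource-group my-rg \
      --nsg-name my-nsg \
      --name allow-tailscale-udp \
      --priority 100 \
      --direction Inbound \
      --access Allow \
      --protocol Udp \
      --destination-port-ranges 41641 3478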

However, I have several remote systems that must allow incoming access only from my Tailscale VPN devices, i.e. from the outbound IP of my exit node. Because my exit node uses the default Azure Internet gateway, VPN clients' outbound connections use one of Azure's default outbound IP addresses. I can't allow those default Azure IPs in my remote firewalls, as they are shared by many other tenants and can change from time to time. I therefore created an Azure NAT gateway with a public IP address and attached the NAT gateway to my exit node's subnet. This works as expected: VPN outbound connections now use the public IP of my NAT gateway, and I can use that IP in the firewalls to allow inbound access to my remote systems.
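
In case it helps anyone reproduce this, the NAT gateway setup described above looks roughly like the following Azure CLI sketch (all names are placeholders):

    # Create a Standard-SKU public IP, attach it to a NAT gateway,
    # and associate the gateway with the exit node's subnet.
    az network public-ip create --resource-group my-rg --name natgw-ip --sku Standard
    az network nat gateway create --resource-group my-rg --name my-natgw \
      --public-ip-addresses natgw-ip
    az network vnet subnet update --resource-group my-rg --vnet-name my-vnet \
      --name exit-node-subnet --nat-gateway my-natgw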

However, I then noticed my Tailscale clients no longer use a “direct” connection; instead they go through a relay, adding latency. If I revert to my old setup, I get a direct connection again. The Azure documentation does not say anything about the NAT gateway blocking any UDP traffic in particular.

One of my colleagues also reproduced the same issue by simply using an Azure VM behind a NAT gateway and a remote Tailscale client. We are effectively doing something similar to what is documented in Connect to an AWS VPC using subnet routes · Tailscale, although in Azure rather than AWS.

Does anyone have any insight into how I can use an Azure NAT gateway for the Tailscale host while still getting a direct connection?

To make direct connections, we need one side of the connection to know a UDP port number on which packets from the Internet will make it back through its NAT firewall and be delivered. We can accomplish this in one of a few ways:

  • if one side of the connection uses “easy NAT”, where the rewritten source UDP port is always the same no matter what destination we send to. We figure this out by sending packets to our own DERP servers and checking whether they all see the same source port. By contrast, “hard NAT” is where the UDP source port is different for every destination. The Azure NAT gateway is a hard NAT.
  • if one side of the connection has a protocol available to ask its firewall to open a port: UPnP, NAT-PMP, or PCP. Cloud providers never offer these, but many residential routers do.
  • if one side of the connection manually configures a known UDP port number to ingress through the firewall. By default we use port 41641, though tailscaled can be started with a --port=N argument (see the sketch below).
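
A minimal sketch of that third option on a Linux host (41641 is already the default; it is pinned explicitly here just for illustration):

    # Start tailscaled on a fixed, known UDP port so the firewall in
    # front of it can be configured to let that port in.
    tailscaled --port=41641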

For your case, it is probably best to focus on the other end of the connections, not the NAT gateway. If a way can be found to make that end an easy NAT, for example by enabling NAT-PMP, then direct connections would be possible even through the NAT gateway.
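
To see what the other end looks like, tailscale netcheck on that machine reports the relevant fields; roughly speaking, a direct connection becomes plausible when the report shows something like:

    tailscale netcheck
    # In the report, look for either of these lines:
    #   * MappingVariesByDestIP: false   (easy NAT)
    #   * PortMapping: UPnP, NAT-PMP, PCP  (a port-mapping protocol is available)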

Thanks, that explains it well. I will see what I can do.

Hi, sorry to reply to this older post, but I was running into the same issue described here.

We have an Azure subnet using a NAT gateway and installed Tailscale on a dedicated VM acting as a subnet router. Outbound connections on the Tailscale VM are allowed.
Connections are permanently established through a DERP relay and never directly.

My work notebook sits behind a router that supports UPnP, and tailscale netcheck shows that UPnP was successful, so we have one side behind easy NAT, and the Tailscale VM should simply be able to send to the port that my router assigned through UPnP.
According to my understanding of Tailscale (one-sided easy NAT is sufficient), this should result in a direct connection.
But when I run tailscale ping, it never upgrades to a direct connection; it always stays on the relay.
The Tailscale admin console also shows that UDP is available for our Tailscale VM.
Am I missing something here?
Thanks in advance.

My experience is similar to that of @Theragus. From a machine on my home network with both NAT-PMP and PCP available (OPNsense), I cannot get a direct connection to my subnet router on a NATed Azure VM (nor to one on GCP, for that matter).

2023/03/31 11:13:09 portmap: [v1] Got PMP response; IP: <home-ip-address>, epoch: 8217281

Report:
	* UDP: true
	* IPv4: yes, <home-ip-address>:64339
	* IPv6: no, but OS has support
	* MappingVariesByDestIP: true
	* HairPinning: false
	* PortMapping: NAT-PMP
	* Nearest DERP: Seattle
	* DERP latency:
		- sea: 32.4ms  (Seattle)
		- den: 32.6ms  (Denver)
		- ord: 49.7ms  (Chicago)
		- dfw: 51.9ms  (Dallas)
		- tor: 60.3ms  (Toronto)
		- mia: 72.3ms  (Miami)
		- hnl: 74.1ms  (Honolulu)
		- nyc: 74.4ms  (New York City)
		- lax: 80.3ms  (Los Angeles)
		- sfo: 88.5ms  (San Francisco)
		- tok: 102ms   (Tokyo)
		- hkg: 137.3ms (Hong Kong)
		- lhr: 140.1ms (London)
		- ams: 146ms   (Amsterdam)
		- par: 147.2ms (Paris)
		- fra: 154.2ms (Frankfurt)
		- mad: 162ms   (Madrid)
		- waw: 165.2ms (Warsaw)
		- syd: 175.6ms (Sydney)
		- sao: 175.9ms (São Paulo)
		- sin: 184.5ms (Singapore)
		- blr:         (Bangalore)
		- jnb:         (Johannesburg)
		- dbi:         (Dubai)
phil@home-laptop:~$ tailscale ping subnet-router-azure
pong from subnet-router-azure (100.113.216.75) via DERP(nyc) in 343ms
pong from subnet-router-azure (100.113.216.75) via DERP(nyc) in 87ms
pong from subnet-router-azure (100.113.216.75) via DERP(nyc) in 88ms
pong from subnet-router-azure (100.113.216.75) via DERP(nyc) in 89ms
pong from subnet-router-azure (100.113.216.75) via DERP(nyc) in 88ms
pong from subnet-router-azure (100.113.216.75) via DERP(nyc) in 87ms
pong from subnet-router-azure (100.113.216.75) via DERP(nyc) in 87ms
pong from subnet-router-azure (100.113.216.75) via DERP(nyc) in 87ms
pong from subnet-router-azure (100.113.216.75) via DERP(nyc) in 87ms
pong from subnet-router-azure (100.113.216.75) via DERP(nyc) in 87ms
2023/03/31 11:14:07 direct connection not established

In the Tailscale admin interface, information for my machine includes:

  • OS: macOS
  • Tailscale version: 1.36.2
  • Relays: —
  • client connectivity:
    • Varies: Yes
    • Hairpinning: No
    • IPv6: No
    • UDP: Yes
    • UPnP: No
    • PCP: Yes
    • NAT-PMP: Yes

Also strange: tailscale netcheck reports 30+ ms DERP latency to Seattle, but I am about 25 miles from Seattle, and direct pings of the current sea DERP servers indicate just 3-4 ms latency:

PING derp10b.tailscale.com (192.73.240.161): 56 data bytes
64 bytes from 192.73.240.161: icmp_seq=0 ttl=55 time=3.702 ms
64 bytes from 192.73.240.161: icmp_seq=1 ttl=55 time=3.551 ms

--- derp10b.tailscale.com ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 3.551/3.627/3.702/0.076 ms
PING derp10c.tailscale.com (192.73.240.121): 56 data bytes
64 bytes from 192.73.240.121: icmp_seq=0 ttl=55 time=3.547 ms
64 bytes from 192.73.240.121: icmp_seq=1 ttl=55 time=3.606 ms

--- derp10c.tailscale.com ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 3.547/3.577/3.606/0.030 ms
PING derp10d.tailscale.com (192.73.240.132): 56 data bytes
64 bytes from 192.73.240.132: icmp_seq=0 ttl=55 time=3.363 ms
64 bytes from 192.73.240.132: icmp_seq=1 ttl=55 time=3.354 ms

--- derp10d.tailscale.com ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 3.354/3.359/3.363/0.004 ms