Fortigate firewall odd behaviour when connecting direct

I have a machine called “macmini” that has subnet routing and exit node enabled. macmini sits in the office behind a FortiGate router.

I have a home machine called t14, and I have --accept-routes enabled

When I first set it up, it worked perfectly and seemed to go through the lhr relay. But after a while it switches to a direct connection, and then the connection stops working. Running tailscale ping from t14 shows:

pong from macmini ( via DERP(lhr) in 27ms
pong from macmini ( via DERP(lhr) in 126ms
pong from macmini ( via DERP(lhr) in 12ms
pong from macmini ( via DERP(lhr) in 276ms
pong from macmini ( via in 20ms

After it’s connected via our office public IP, tailscale ping returns OK, but ssh or any other connection just won’t work.

Our office uses a FortiGate router. The default policy is allow all outgoing: (first line)

I thought this should be enough after reading “How NAT traversal works” (Tailscale).

I tried adding a rule allowing incoming UDP 41641 on the router and then restarted tailscaled, after which the direct connection started to work. See the second line in the above image.

Question: is this what I’m supposed to do? Surely the FortiGate is a stateful firewall that remembers outgoing connections and allows the matching incoming traffic?

And how did tailscale decide to go for direct connection instead of lhr relay? Obviously the relay works but direct doesn’t?

Actually, I noticed that after a while (not sure how long) Tailscale seems to have realised that the direct connection didn’t work and reverted to the relay? I couldn’t reliably reproduce this.

We need to allow

in order for Tailscale to work over DERP or a direct connection.

Well the first line in my router config allows any outgoing traffic

The connection is negotiated based on the latency from source to destination. Unless the direct connection has an issue or interruption, the connection should stay direct; otherwise it will be swapped over to the DERP connection.

Another thing to try: does normal ping work? If so, try ping -s 1252. If that fails, can you find a lower number that does work?
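The size search suggested above can be automated. This is only a minimal sketch, assuming a `ping` that accepts `-c` and `-s` (as on Linux and macOS); the helper names `ping_ok` and `largest_working_payload` are made up for illustration. For reference, a 1252-byte payload plus 8 bytes of ICMP header and 20 bytes of IPv4 header gives a 1280-byte packet, which matches Tailscale's default MTU.

```python
import subprocess
from typing import Callable


def ping_ok(host: str, payload: int) -> bool:
    """Send one echo request with `payload` data bytes; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-s", str(payload), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_working_payload(
    probe: Callable[[int], bool], low: int = 0, high: int = 1252
) -> int:
    """Binary-search the largest payload in [low, high] for which probe() succeeds.

    Returns -1 if even `low` fails. Assumes that if size n works,
    every smaller size works too.
    """
    if not probe(low):
        return -1
    if probe(high):
        return high
    # Invariant: probe(low) succeeded, probe(high) failed.
    while high - low > 1:
        mid = (low + high) // 2
        if probe(mid):
            low = mid
        else:
            high = mid
    return low
```

Against a real host it would be called as `largest_working_payload(lambda n: ping_ok("macmini", n))`. If the result is below 1252, something on the path is probably dropping larger packets.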

Does macmini have any firewall rules that may be preventing incoming or outgoing traffic on tailscale0? iptables -L -n -v should show what the current firewall rules are. This does not seem like as likely to be the problem since you report that it works fine until the connection switches to direct.

To answer your questions in the original post:

Tailscale decides to go for the direct connection instead of the relay when it sees successful communication over a direct connection. It will periodically send probes to every IP address of the remote device, and request that the remote do the same in the other direction. Until the probes succeed, it will use the relay. If the direct connection stops working, it will eventually give up and return to using the relay. It is unusual for tailscale ping to succeed over a direct connection but other traffic to not work correctly.
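The behaviour described above can be pictured as a small state machine. The sketch below is a toy model only, not Tailscale's actual implementation: the class name `PeerPath` and the `give_up_after` threshold are assumptions made for illustration, and the real client's timing and heuristics are more involved.

```python
from dataclasses import dataclass


@dataclass
class PeerPath:
    """Toy model of path selection for a single peer."""

    direct_ok: bool = False   # has a direct probe succeeded recently?
    missed_probes: int = 0    # consecutive failed direct probes
    give_up_after: int = 3    # hypothetical threshold, not a real Tailscale setting

    def on_probe_result(self, success: bool) -> None:
        """Record the outcome of one direct-path probe."""
        if success:
            self.direct_ok = True
            self.missed_probes = 0
        else:
            self.missed_probes += 1
            if self.missed_probes >= self.give_up_after:
                # Direct path looks dead: fall back to the relay.
                self.direct_ok = False

    def current_path(self) -> str:
        """Traffic uses the relay until a direct probe has succeeded."""
        return "direct" if self.direct_ok else "relay"
```

In this model the relay is the default, one successful probe switches to the direct path, and only a run of failed probes switches back, which is consistent with the delayed return to the relay reported earlier in the thread.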

Your Fortigate router appears to vary port numbers to different destinations (“Hard NAT” in the NAT traversal document), which makes direct connections difficult.

  • Adding a port forward can help but is not guaranteed to work.
  • If enabling a NAT traversal protocol such as NAT-PMP on the router is acceptable, that will typically resolve issues. If NAT-PMP is not labelled as a separate option, sometimes NAT-PMP is enabled by enabling UPnP (we do not currently use UPnP).
  • Sometimes there is an option to avoid changing port numbers quite so much: e.g. pfSense has a “disable firewall scrub” option that reportedly helps.
  • I took a look through the FortiGate documentation and it mentions “When using the IP pool for source NAT, you can define a fixed port to guarantee the source port number is unchanged.” which (if I read it correctly) might help: if you can force macmini:41641/udp to always map to the same external port then other devices should be able to establish direct connections. The external port doesn’t have to be 41641 (useful if you have multiple devices you wish to expose), the actual port number will be automatically discovered using STUN. If the external port used in SNAT is fixed and you can also use a port forward, that will probably be very effective.
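To illustrate why varying port numbers per destination matters, here is a toy simulation, not FortiGate or Tailscale code. The `Nat` class, the helper function, and all addresses are hypothetical (the public addresses come from the documentation ranges). It compares the external port seen by a STUN server with the one used toward a peer.

```python
import itertools


class Nat:
    """Toy source NAT that hands out external ports from a counter."""

    def __init__(self, per_destination: bool, first_port: int = 50000):
        # per_destination=True models "hard NAT": a new port for each destination.
        self.per_destination = per_destination
        self._ports = itertools.count(first_port)
        self._table = {}

    def external_port(self, internal, destination):
        """Return the external port used for traffic from `internal` to `destination`."""
        key = (internal, destination) if self.per_destination else internal
        if key not in self._table:
            self._table[key] = next(self._ports)
        return self._table[key]


def stun_port_matches_peer_port(nat: Nat) -> bool:
    """True if the port discovered via STUN is also the port used toward a peer."""
    internal = ("192.168.1.10", 41641)  # hypothetical LAN address of macmini
    stun = ("198.51.100.1", 3478)       # hypothetical STUN server
    peer = ("203.0.113.7", 41641)       # hypothetical remote peer
    return nat.external_port(internal, stun) == nat.external_port(internal, peer)
```

With `Nat(per_destination=False)` the two ports agree, so the port learned through STUN is the one the peer should send to. With `Nat(per_destination=True)` they differ, which is the situation that a fixed source port or a port forward is meant to avoid.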

Thanks Adrian for the detailed reply. I have reverted my configuration to the original, which essentially has only one policy: allow everything outgoing and deny everything incoming.

There is no firewall on macmini. sudo ufw status shows Status: inactive

At the moment, running tailscale status on home machine t14 shows direct connection to macmini in the office. And ping or ssh works fine.

I have full control of the router & network config. I think I’ll see when it breaks next time and try to go through the suggestions you gave.

Another observation: I have multiple machines in the office, e.g. there is another called l5501. However, at any point, only one machine in the office seems able to connect directly to home machine t14:

❯ tailscale ping macmini
pong from macmini ( via in 18ms
❯ tailscale ping l5501
pong from l5501 ( via DERP(lhr) in 19ms
pong from l5501 ( via DERP(lhr) in 12ms
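For tracking which office machines are direct and which are relayed over time, output like the above can be classified with a short script. This is a rough sketch: the function name `classify` is made up, and the regular expression is only an assumption based on the lines quoted in this thread (where the addresses are missing), so it may need adjusting for other output.

```python
import re

# Matches lines such as "pong from macmini ( via DERP(lhr) in 27ms".
# The text between "via" and "in" is the path; it may be empty or an ip:port.
PONG = re.compile(r"^pong from (?P<peer>\S+) .*?via (?P<path>.*?)\s*in (?P<ms>\d+)ms$")


def classify(line: str):
    """Return (peer, "relay" or "direct", latency_ms), or None for non-pong lines."""
    match = PONG.match(line.strip())
    if match is None:
        return None
    # A path of the form DERP(<region>) means the reply came through a relay.
    is_relay = re.fullmatch(r"DERP\((\w+)\)", match["path"]) is not None
    kind = "relay" if is_relay else "direct"
    return match["peer"], kind, int(match["ms"])
```

Feeding each line of `tailscale ping` output through `classify` gives one tuple per pong, which makes it easier to spot the moment a peer flips between relay and direct.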