Very slow speeds over Tailscale to machine shared from external network

Setup:
Tailscale network A: Several devices (Windows, Linux, Android)
Tailscale network B: Synology NAS running DSM 7
Network B shares the NAS as an external machine with network A

Copying a 160 MB file with scp from a Linode server to the NAS, using a direct connection to the Linode server's public IP, gives speeds of about 85 Mbps.

Running the same scp command against the Tailscale IP of the server gives speeds of around 4.7 Mbps.
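
For reference, the two transfers differ only in the address used; the usernames, paths, and IPs below are placeholders rather than the actual values from this setup:

# Pulling the file onto the NAS via the Linode's public IP (~85 Mbps here):
scp user@203.0.113.10:/tmp/testfile-160M /volume1/share/

# The same transfer via the Linode's Tailscale IP (~4.7 Mbps here):
scp user@100.101.102.103:/tmp/testfile-160M /volume1/share/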

Running an iperf server on the NAS and then running the client on the Linode server shows this:

Traceroute on Linode side:

Traceroute on NAS side:

Is this just expected because the NAS is shared as an external machine rather than being on the same Tailscale network?

(Sidenote: I also tried sharing the Linode server as an external machine with network B while the NAS was still being shared with network A. That made Tailscale very unhappy and I couldn't access the NAS at all.)

While copying a file, or immediately afterwards, run tailscale status - this will tell you whether you're making a direct connection or being relayed through a DERP server.

If the machines are unable to make a direct connection, they will fall back to a DERP relay, which still provides connectivity but is not as fast or responsive as a direct connection.
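
A minimal check, assuming the peer shows up under a name like "nas" in the status output (the name and grep pattern are placeholders):

# Run on either end during or right after the transfer; the peer's line
# reports either a direct <ip>:<port> path or a DERP relay.
tailscale status | grep nas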

This is what I see when I run this on the Linode:

... linux active; direct 207.109.84.50:41641, tx 122843548 rx 1290371044

So you are getting a direct connection. Something I have seen is that containers and VMs tend to have slower throughput: virtualizing the networking is CPU intensive, the encryption is also CPU intensive, and together they create a bottleneck.

Unfortunately, this is not Tailscale-specific; I have run benchmarks against plain WireGuard connections and see the same performance hit for VMs and containers.

If you're able to run Tailscale directly on the NAS, rather than in a container, that should help, depending on the resources available on the NAS.

Tailscale already isn't running in a container, and the CPU usage stays quite low.

I've been able to do some more testing. I went to where the NAS is located and put my laptop on the same LAN, then I put both on the same Tailscale network as my Linode server. So: laptop + NAS on a single LAN, and the Linode server on a remote LAN, but all three are now part of a single Tailscale network.

In this setup I see the following using iperf3 (the commands are sketched after the list):
Laptop to / from Linode: Works great
Laptop to / from NAS: Works great
Linode to / from NAS: Still slow as hell
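
A sketch of the pairwise tests; the hostnames are placeholders for the machines' Tailscale names or 100.x addresses:

# On the machine acting as the server:
iperf3 -s

# From the other machine in each pair:
iperf3 -c nas        # e.g. laptop -> NAS
iperf3 -c linode -R  # -R reverses the direction of the test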

Tailscale is indicating direct connections in all cases.
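
Another quick way to confirm the path is tailscale ping, which reports whether replies arrive directly or via a DERP relay (the peer name is a placeholder):

tailscale ping nas
# Each reply line shows either a direct <ip>:<port> endpoint or DERP(<region>).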

I also tried updating Tailscale on both the Linode server and the Synology NAS to 20.1, and I still see the same results.

I’d like to gather more information on this so we can track down why it might be misbehaving.

Can you run top on the Synology and the Linode machines during a large file transfer, and keep an eye on the %Cpu(s) "id" (idle) value - see how much it drops on each.
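
For example (the batch-mode flags are from procps top; the top bundled with DSM may differ slightly):

# Interactive: watch the "id" (idle) percentage in the %Cpu(s) line.
top

# Or take a non-interactive sample every 5 seconds during the transfer:
top -b -d 5 | grep 'Cpu(s)'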

Also, you can try enabling the TUN device on your Synology if you have not already, and let us know if that makes a notable difference.
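
A quick way to check for the TUN device over SSH on the Synology, plus a sketch of the enable step - the exact helper command and package restart vary by DSM and Tailscale package version, so check Tailscale's Synology documentation before running it:

ls /dev/net/tun   # if this is missing, Tailscale is using slower userspace networking

# Roughly, on DSM 7 (assumption - verify against the current docs):
sudo /var/packages/Tailscale/target/bin/tailscale configure-host
# ...then restart the Tailscale package from Package Center.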