Dramatic decline in performance over a direct connection

After finding my Time Machine backup claiming it would take 17 hours to back up 50GB over a 1Gbps direct Ethernet connection to my server via Tailscale, I ran some perf tests and realized that I am getting less than 10Mbps over this link. This was not the case when I originally set it up, when I used to get over 600Mbps. What tools are available to figure out why Tailscale tunnel performance has dropped dramatically?

iperf3 -c 192.168.1.220
Connecting to host 192.168.1.220, port 5201
[ 5] local 192.168.1.216 port 58150 connected to 192.168.1.220 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 115 MBytes 966 Mbits/sec
[ 5] 1.00-2.00 sec 112 MBytes 939 Mbits/sec
[ 5] 2.00-3.00 sec 113 MBytes 944 Mbits/sec
[ 5] 3.00-4.00 sec 112 MBytes 941 Mbits/sec
[ 5] 4.00-5.00 sec 111 MBytes 931 Mbits/sec
[ 5] 5.00-6.00 sec 113 MBytes 952 Mbits/sec
[ 5] 6.00-7.00 sec 112 MBytes 941 Mbits/sec
[ 5] 7.00-8.00 sec 112 MBytes 942 Mbits/sec
[ 5] 8.00-9.00 sec 112 MBytes 941 Mbits/sec
[ 5] 9.00-10.00 sec 112 MBytes 936 Mbits/sec


[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 1.10 GBytes 943 Mbits/sec sender
[ 5] 0.00-10.00 sec 1.10 GBytes 941 Mbits/sec receiver

iperf Done.
➜ ~ iperf3 -c 100.69.246.43
Connecting to host 100.69.246.43, port 5201
[ 5] local 100.92.5.92 port 58161 connected to 100.69.246.43 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 3.03 MBytes 25.4 Mbits/sec
[ 5] 1.00-2.00 sec 715 KBytes 5.85 Mbits/sec
[ 5] 2.00-3.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 3.00-4.00 sec 1.24 MBytes 10.4 Mbits/sec
[ 5] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec
[ 5] 5.00-6.00 sec 50.1 KBytes 411 Kbits/sec
[ 5] 6.00-7.00 sec 1.31 MBytes 11.0 Mbits/sec
[ 5] 7.00-8.00 sec 1.24 MBytes 10.4 Mbits/sec
[ 5] 8.00-9.00 sec 2.17 MBytes 18.2 Mbits/sec
[ 5] 9.00-10.00 sec 692 KBytes 5.67 Mbits/sec


[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 10.4 MBytes 8.74 Mbits/sec sender
[ 5] 0.00-10.03 sec 10.2 MBytes 8.50 Mbits/sec receiver

iperf Done.

/Applications/Tailscale.app/Contents/MacOS/Tailscale ping -verbose 100.69.246.43
pong from nshome-nas-1 (100.69.246.43) via 192.168.1.220:41641 in 3ms
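
As a sanity check on the Time Machine estimate in the opening post, the arithmetic below works out what average rate "50GB in 17 hours" implies, and how long the same backup would take at the 600Mbps seen originally. It is only a rough sketch: it assumes decimal gigabytes and ignores any protocol or Time Machine overhead, and the function names are made up for illustration.

```python
def implied_rate_mbps(size_gb: float, hours: float) -> float:
    """Average rate in Mbit/s needed to move size_gb (decimal GB) in `hours`."""
    bits = size_gb * 1e9 * 8
    seconds = hours * 3600
    return bits / seconds / 1e6


def transfer_minutes(size_gb: float, rate_mbps: float) -> float:
    """Minutes needed to move size_gb (decimal GB) at a steady rate_mbps."""
    bits = size_gb * 1e9 * 8
    return bits / (rate_mbps * 1e6) / 60


# The estimate from the opening post: 50GB in 17 hours.
print(f"{implied_rate_mbps(50, 17):.1f} Mbit/s")    # 6.5 Mbit/s
# The same backup at the originally observed 600Mbps.
print(f"{transfer_minutes(50, 600):.1f} minutes")   # 11.1 minutes
```

The implied rate of roughly 6.5 Mbit/s is in the same range as the 8.5 to 8.7 Mbit/s iperf3 measured over the Tailscale IP, so the backup estimate and the iperf3 result appear to agree.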

Digging more into this issue -

So my MacBook has both WiFi and a USB-C Ethernet dongle. If I disable WiFi and just use the Ethernet dongle - then I get around 400Mbps. If I enable WiFi and use DHCP then I get around 300Mbps. If I enable WiFi and assign a static IP then I get around 10Mbps.

In all cases the service order is set to give the Ethernet connection higher priority than WiFi. Outside of Tailscale this seems to make no difference in performance. But within the Tailscale network the performance just drops like a rock.

Hello,

Can you provide us with the source and destination IPs and a timeline of when you tried to connect in all 3 ways, so we can dig into the logs? Please feel free to email us the details at support@tailscale.com.

Sat Jan 16 11:56:45 CST 2021

SRC 100.92.5.92
DST 100.69.246.43

Now the performance is consistently around 250Mbps in all three scenarios, which for a 1000Mbps interface seems very slow for WireGuard.

Can you share the CPU utilization at the time of this transfer speed? Is it high or normal?

It is normal. This is an i9 MacBook Pro 16". CPU usage doesn’t exceed 15%.

It’s an area that we’re working on, and each release has gotten a bit better. Please follow https://github.com/tailscale/tailscale/issues/414 for further updates on this issue.

Let me just add that I have seen the same issue but it only started occurring in the last 2 weeks.

The scenario is that I have two servers in the cloud and one pushes its backups to the other. It used to take about 2h and now lately it went to 8-11h with roughly the same sized backups.

An iperf3 test shows:

with a direct connection:

iperf3 -c 63.xxx.xxx.219 -f K
Connecting to host 63.xxx.xxx.219, port 5201
[  5] local 45.xxx.xxx.248 port 58758 connected to 63.xxx.xxx.219 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.51 MBytes  2573 KBytes/sec    0    441 KBytes
[  5]   1.00-2.00   sec  12.5 MBytes  12799 KBytes/sec    0   6.01 MBytes
[  5]   2.00-3.00   sec  21.2 MBytes  21762 KBytes/sec    0   6.01 MBytes
[  5]   3.00-4.00   sec  17.5 MBytes  17920 KBytes/sec    0   6.01 MBytes
[  5]   4.00-5.00   sec  17.5 MBytes  17919 KBytes/sec    0   6.01 MBytes
[  5]   5.00-6.00   sec  20.0 MBytes  20482 KBytes/sec    0   6.01 MBytes
[  5]   6.00-7.00   sec  17.5 MBytes  17919 KBytes/sec    0   6.01 MBytes
[  5]   7.00-8.00   sec  18.8 MBytes  19201 KBytes/sec    0   6.01 MBytes
[  5]   8.00-9.00   sec  18.8 MBytes  19200 KBytes/sec    0   6.01 MBytes
[  5]   9.00-10.00  sec  17.5 MBytes  17918 KBytes/sec    0   6.01 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   164 MBytes  16769 KBytes/sec    0             sender
[  5]   0.00-10.16  sec   164 MBytes  16512 KBytes/sec                  receiver

with a connection via tailscale IPs:

iperf3 -c 100.xxx.xxx.126 -f K
Connecting to host 100.xxx.xxx.126, port 5201
[  5] local 100.xxx.xxx.40 port 41858 connected to 100.xxx.xxx.126 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   167 KBytes   167 KBytes/sec    6   19.2 KBytes
[  5]   1.00-2.00   sec   177 KBytes   178 KBytes/sec    2   13.2 KBytes
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 KBytes/sec    2   7.20 KBytes
[  5]   3.00-4.00   sec  68.4 KBytes  68.4 KBytes/sec    2   7.20 KBytes
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 KBytes/sec    2   6.00 KBytes
[  5]   5.00-6.00   sec  68.4 KBytes  68.4 KBytes/sec    1   4.80 KBytes
[  5]   6.00-7.00   sec  0.00 Bytes  0.00 KBytes/sec    1   7.20 KBytes
[  5]   7.00-8.00   sec  66.0 KBytes  66.0 KBytes/sec    2   6.00 KBytes
[  5]   8.00-9.00   sec  0.00 Bytes  0.00 KBytes/sec    1   4.80 KBytes
[  5]   9.00-10.00  sec  70.8 KBytes  70.8 KBytes/sec    2   4.80 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   618 KBytes  61.8 KBytes/sec   21             sender
[  5]   0.00-10.16  sec   510 KBytes  50.2 KBytes/sec                  receiver

CPU usage was around 10% on both servers during this iperf3 test.

BUT, having just checked, during the actual transfer of the backups, CPU usage does indeed go to 100% :frowning:

Still, the iperf3 results are very disappointing.

Those are indeed some very low numbers. Even 10% of CPU to get… about 60 KBytes/sec? That doesn’t seem right at all.

It’s possible you’re seeing packet loss somewhere in the network. Can you try an iperf test with UDP mode instead? You probably have to tweak the bandwidth setting by hand, unlike TCP mode.

I’ve done the test with UDP and re-done the one with TCP. The one with UDP triggers 100% CPU on the receiving end and shows lots of losses. Any tips on what and where to debug to figure out why I get such low throughput with tailscale? It used to be way more performant with these exact same 2 servers, as I have been sending the backups through tailscale for many months.

UDP, first with public IP, then with tailscale IP

root@hyper:~# iperf3 -c 63.81.90.219 -u -b 0 -f K

Connecting to host 63.81.90.219, port 5201
[  5] local 45.157.178.248 port 40778 connected to 63.81.90.219 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec   201 MBytes  206081 KBytes/sec  145760
[  5]   1.00-2.00   sec   245 MBytes  250610 KBytes/sec  177220
[  5]   2.00-3.00   sec   230 MBytes  235117 KBytes/sec  166280
[  5]   3.00-4.00   sec   241 MBytes  246953 KBytes/sec  174610
[  5]   4.00-5.00   sec   231 MBytes  236687 KBytes/sec  167390
[  5]   5.00-6.00   sec   236 MBytes  241686 KBytes/sec  170910
[  5]   6.00-7.00   sec   242 MBytes  248238 KBytes/sec  175550
[  5]   7.00-8.00   sec   258 MBytes  263809 KBytes/sec  186580
[  5]   8.00-9.00   sec   244 MBytes  250050 KBytes/sec  176810
[  5]   9.00-10.00  sec   234 MBytes  239189 KBytes/sec  169160
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  2.31 GBytes  241841 KBytes/sec  0.000 ms  0/1710270 (0%)  sender
[  5]   0.00-10.25  sec  1.07 GBytes  109914 KBytes/sec  0.019 ms  913845/1710240 (53%)  receiver

iperf Done.

root@hyper:~# iperf3 -c 100.73.203.126 -u -b 0 -f K

Connecting to host 100.73.203.126, port 5201
[  5] local 100.80.58.40 port 45937 connected to 100.73.203.126 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec   316 MBytes  323089 KBytes/sec  269420
[  5]   1.00-2.00   sec   450 MBytes  460304 KBytes/sec  383840
[  5]   2.00-3.00   sec   413 MBytes  423336 KBytes/sec  353010
[  5]   3.00-4.00   sec   452 MBytes  463308 KBytes/sec  386340
[  5]   4.00-5.00   sec   429 MBytes  439393 KBytes/sec  366400
[  5]   5.00-6.00   sec   437 MBytes  447848 KBytes/sec  373450
[  5]   6.00-7.00   sec   450 MBytes  460953 KBytes/sec  384370
[  5]   7.00-8.00   sec   454 MBytes  465337 KBytes/sec  388040
[  5]   8.00-9.00   sec   464 MBytes  475324 KBytes/sec  396360
[  5]   9.00-10.00  sec   462 MBytes  473537 KBytes/sec  394870
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  4.23 GBytes  443243 KBytes/sec  0.000 ms  0/3696100 (0%)  sender
[  5]   0.00-10.63  sec   359 MBytes  34551 KBytes/sec  0.020 ms  3388668/3694910 (92%)  receiver

TCP, first with public IP, then with tailscale IP

root@hyper:~# iperf3 -c 63.81.90.219  -f K

Connecting to host 63.81.90.219, port 5201
[  5] local 45.157.178.248 port 39510 connected to 63.81.90.219 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.06 MBytes  1080 KBytes/sec    0    161 KBytes
[  5]   1.00-2.00   sec  10.4 MBytes  10607 KBytes/sec    0   6.00 MBytes
[  5]   2.00-3.00   sec  17.5 MBytes  17920 KBytes/sec    0   6.02 MBytes
[  5]   3.00-4.00   sec  17.5 MBytes  17922 KBytes/sec    0   6.02 MBytes
[  5]   4.00-5.00   sec  17.5 MBytes  17918 KBytes/sec    0   6.02 MBytes
[  5]   5.00-6.00   sec  17.5 MBytes  17920 KBytes/sec    0   6.02 MBytes
[  5]   6.00-7.00   sec  17.5 MBytes  17920 KBytes/sec    0   6.02 MBytes
[  5]   7.00-8.00   sec  17.5 MBytes  17920 KBytes/sec    0   6.02 MBytes
[  5]   8.00-9.00   sec  17.5 MBytes  17920 KBytes/sec    0   6.02 MBytes
[  5]   9.00-10.00  sec  17.5 MBytes  17920 KBytes/sec    0   6.02 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   151 MBytes  15505 KBytes/sec    0             sender
[  5]   0.00-10.17  sec   151 MBytes  15175 KBytes/sec                  receiver

iperf Done.


root@hyper:~# iperf3 -c 100.73.203.126 -f K

Connecting to host 100.73.203.126, port 5201
[  5] local 100.80.58.40 port 50846 connected to 100.73.203.126 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   115 KBytes   115 KBytes/sec    1   10.8 KBytes
[  5]   1.00-2.00   sec  45.6 KBytes  45.6 KBytes/sec    4   3.60 KBytes
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 KBytes/sec    2   4.80 KBytes
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 KBytes/sec    1   3.60 KBytes
[  5]   4.00-5.00   sec  44.4 KBytes  44.4 KBytes/sec    1   3.60 KBytes
[  5]   5.00-6.00   sec  46.8 KBytes  46.8 KBytes/sec    0   7.20 KBytes
[  5]   6.00-7.00   sec  0.00 Bytes  0.00 KBytes/sec    1   7.20 KBytes
[  5]   7.00-8.00   sec  45.6 KBytes  45.6 KBytes/sec    3   3.60 KBytes
[  5]   8.00-9.00   sec  0.00 Bytes  0.00 KBytes/sec    1   3.60 KBytes
[  5]   9.00-10.00  sec  44.4 KBytes  44.4 KBytes/sec    1   3.60 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   342 KBytes  34.2 KBytes/sec   15             sender
[  5]   0.00-10.16  sec   289 KBytes  28.4 KBytes/sec                  receiver

iperf Done.

I think there’s a clue here in the fact that UDP seems to be so much faster than TCP for you even on the non-tailscale network (about 109 MBytes/sec received over UDP vs about 15 MBytes/sec over TCP). This suggests that you’re seeing significant packet loss between the two devices, regardless of tailscale. In turn, that can cause TCP to slow down. Because of how tailscale’s tunneling works (a few extra bytes per packet, a bit of extra latency because of the trip through the tun device), it’s possible that TCP will slow itself down more over tailscale than without, under the same packet loss conditions.
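
To put a rough number on how hard packet loss can hit TCP on a long path, here is a sketch of the well-known Mathis et al. approximation for a loss-limited TCP flow: rate ≤ (MSS / RTT) × (C / √p), with C ≈ √(3/2). The inputs are illustrative assumptions, not measurements of this exact transfer: a round trip of about 168 ms (the figure mtr reports to the far server elsewhere in this thread), about 1.3% loss (the isoping counters posted in this thread), an MSS of 1460 bytes for a plain 1500-byte path, and an MSS of 1240 bytes assuming a 1280-byte tunnel MTU. The model describes Reno-style congestion control under random loss, so other congestion control algorithms may behave differently.

```python
import math


def mathis_ceiling_bps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Approximate upper bound on loss-limited TCP throughput, in bits/s.

    Mathis et al. approximation: rate <= (MSS / RTT) * (C / sqrt(p)),
    with C = sqrt(3/2). `loss` is the packet loss probability (0 < loss <= 1).
    """
    c = math.sqrt(3.0 / 2.0)
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss))


# Assumed, illustrative inputs (see the paragraph above).
RTT_SECONDS = 0.168
LOSS = 0.013

for label, mss in (("plain path, MSS 1460", 1460), ("tunnel, MSS 1240", 1240)):
    bps = mathis_ceiling_bps(mss, RTT_SECONDS, LOSS)
    print(f"{label}: about {bps / 1e6:.2f} Mbit/s")
```

Under these assumptions both ceilings come out well below 1 Mbit/s, and the smaller tunnel MSS lowers the ceiling a little further. That is at least consistent with the idea that the same loss hurts slightly more inside the tunnel.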

Can you use iperf’s -b parameter in a UDP test to find a speed (on the non-tailscale network) slightly less than 109 MBytes/sec, which has very low UDP packet loss? How low can it go? On a healthy network, 0% would be normal. (iperf3’s automatic UDP speed always has high packet loss, so you have to specify a rate in order to get the packet loss rate down.)
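
The -b sweep described above can be scripted rather than run by hand. The sketch below runs one UDP test per target rate and prints the loss reported for each. It assumes iperf3 is installed locally, that an iperf3 server is already listening on the target host, and that the JSON report exposes the loss figure at end.sum.lost_percent, which is the layout I would expect for UDP tests but is worth checking against your iperf3 version. The list of rates is an arbitrary example.

```python
import json
import subprocess
import sys


def lost_percent(report_json: str) -> float:
    """Pull the UDP loss percentage out of an iperf3 --json report.

    Assumes the report has the field end.sum.lost_percent.
    """
    report = json.loads(report_json)
    return float(report["end"]["sum"]["lost_percent"])


def sweep(host: str, rates=("50M", "70M", "90M", "100M"), seconds: int = 10) -> dict:
    """Run one UDP iperf3 test per target rate and return {rate: loss percent}."""
    results = {}
    for rate in rates:
        cmd = ["iperf3", "-c", host, "-u", "-b", rate, "-t", str(seconds), "--json"]
        out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
        results[rate] = lost_percent(out)
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python sweep.py <server address>
    for rate, loss in sweep(sys.argv[1]).items():
        print(f"{rate}: {loss:.2f}% lost")
```

Running the sweep a few times and at different times of day may help separate a fixed bottleneck from intermittent loss.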

Another thing you can try is my isoping tool: https://github.com/apenwarr/isochronous, which can measure latency and packet loss in each direction separately, to see if one is worse than the other. You can run this either at the same time as iperf (in case high throughput triggers a problem) or separately.

@apenwarr thanks for the help. I will try your suggestions about iperf’s -b parameter this evening.

About the tool you linked, since there are no instructions whatsoever I have to skip that - I might be in way over my head. I guess that needs to be compiled and then I still have no idea about how to use it or am I missing that info somewhere?

Yeah, the isoping docs suck, sorry. But it should compile with just “make,” and then you run it kind of like iperf, one on each end. It’ll tell you latency and packet loss in each direction.

Firstly, I really want to thank you for the valuable pointers so far. It turns out it’s a networking problem and unrelated to tailscale, please see the results below. I don’t want to keep posting here since it’s not related to tailscale, so I’d really appreciate it if you have any final tips or links for me.

Thanks. I tried isoping but I’m not sure how to interpret the results. After running for about 3 minutes it looks like this:

time paradox: backsliding start by -113 usec
  82.6 ms tx    82.0 ms rx  (min=82.0)  loss: 11/839 tx  0/838 rx
  82.5 ms tx    82.0 ms rx  (min=82.0)  loss: 11/840 tx  0/839 rx
  82.6 ms tx    82.2 ms rx  (min=82.0)  loss: 11/841 tx  0/840 rx
  82.6 ms tx    82.4 ms rx  (min=82.0)  loss: 11/842 tx  0/841 rx
  82.5 ms tx    82.2 ms rx  (min=82.0)  loss: 11/843 tx  0/842 rx
  82.8 ms tx    82.2 ms rx  (min=82.0)  loss: 11/844 tx  0/843 rx
  82.5 ms tx    82.0 ms rx  (min=82.0)  loss: 11/845 tx  0/844 rx
  82.4 ms tx    82.1 ms rx  (min=82.0)  loss: 11/846 tx  0/845 rx
  82.6 ms tx    82.1 ms rx  (min=82.0)  loss: 11/847 tx  0/846 rx
  82.7 ms tx    82.0 ms rx  (min=82.0)  loss: 11/848 tx  0/847 rx
  82.6 ms tx    82.1 ms rx  (min=82.0)  loss: 11/849 tx  0/848 rx
  82.5 ms tx    82.1 ms rx  (min=82.0)  loss: 11/850 tx  0/849 rx
  82.4 ms tx    82.0 ms rx  (min=82.0)  loss: 11/851 tx  0/850 rx
  82.8 ms tx    82.1 ms rx  (min=82.0)  loss: 11/852 tx  0/851 rx
  82.6 ms tx    82.2 ms rx  (min=82.0)  loss: 11/853 tx  0/852 rx
  82.4 ms tx    82.2 ms rx  (min=82.0)  loss: 11/854 tx  0/853 rx
  82.7 ms tx    82.0 ms rx  (min=82.0)  loss: 11/855 tx  0/854 rx
  82.6 ms tx    82.1 ms rx  (min=82.0)  loss: 11/856 tx  0/855 rx
  82.5 ms tx    82.1 ms rx  (min=82.0)  loss: 11/857 tx  0/856 rx
  82.6 ms tx    82.1 ms rx  (min=82.0)  loss: 11/858 tx  0/857 rx
  82.5 ms tx    82.0 ms rx  (min=82.0)  loss: 11/859 tx  0/858 rx
^C
---
tx: min/avg/max/mdev = 82.20/82.58/88.45/0.30 ms
rx: min/avg/max/mdev = 81.82/82.13/93.23/0.50 ms

Btw, I tried iperf3 UDP, playing around with the -b parameter. Basically, setting anything from -b 70M to -b 100M consistently shows a loss between 0% and 3%. I ran these tests multiple times, so I guess I might as well go with 100M. This was for the non-tailscale IP.

iperf3 -c 63.81.90.219 -u -b 100M -f M
Connecting to host 63.81.90.219, port 5201
[  5] local 45.157.178.248 port 56530 connected to 63.81.90.219 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec  11.9 MBytes  11.9 MBytes/sec  8626
[  5]   1.00-2.00   sec  11.9 MBytes  11.9 MBytes/sec  8632
[  5]   2.00-3.00   sec  11.9 MBytes  11.9 MBytes/sec  8633
[  5]   3.00-4.00   sec  11.9 MBytes  11.9 MBytes/sec  8632
[  5]   4.00-5.00   sec  11.9 MBytes  11.9 MBytes/sec  8633
[  5]   5.00-6.00   sec  11.9 MBytes  11.9 MBytes/sec  8633
[  5]   6.00-7.00   sec  11.9 MBytes  11.9 MBytes/sec  8632
[  5]   7.00-8.00   sec  11.9 MBytes  11.9 MBytes/sec  8632
[  5]   8.00-9.00   sec  11.9 MBytes  11.9 MBytes/sec  8634
[  5]   9.00-10.00  sec  11.9 MBytes  11.9 MBytes/sec  8632
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   119 MBytes  11.9 MBytes/sec  0.000 ms  0/86319 (0%)  sender
[  5]   0.00-10.17  sec   116 MBytes  11.4 MBytes/sec  0.047 ms  2545/86319 (2.9%)  receiver

iperf Done.
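
A note on units in the output above, since this thread mixes Mbits/sec, KBytes/sec and MBytes/sec: the 11.9 MBytes/sec column is simply the -b 100M target (100 Mbit/s, decimal) expressed in the 1024-based megabytes that -f M prints. The helpers below are a small sketch of that conversion. The function names are made up, and the 1024-based assumption is inferred from the numbers in this thread (for example 2.51 MBytes shown as 2573 KBytes/sec).

```python
def mbit_to_iperf_mbytes(mbit_per_s: float) -> float:
    """Decimal Mbit/s -> MBytes/sec as printed with -f M (1 MByte = 1024 * 1024 bytes)."""
    return mbit_per_s * 1_000_000 / 8 / (1024 * 1024)


def kbytes_to_mbit(kbytes_per_s: float) -> float:
    """KBytes/sec as printed with -f K (1 KByte = 1024 bytes) -> decimal Mbit/s."""
    return kbytes_per_s * 1024 * 8 / 1_000_000


# The -b 100M target, as it shows up in the Bitrate column with -f M.
print(f"{mbit_to_iperf_mbytes(100):.1f} MBytes/sec")   # 11.9 MBytes/sec
# The earlier non-tailscale TCP result of 15175 KBytes/sec, in Mbit/s.
print(f"{kbytes_to_mbit(15175):.0f} Mbit/s")           # 124 Mbit/s
```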

Next, trying via the tailscale IP. Also getting consistent loss, just slightly higher at around 4%:

iperf3 -c 100.73.203.126 -u -b 100M -f M
Connecting to host 100.73.203.126, port 5201
[  5] local 100.80.58.40 port 59517 connected to 100.73.203.126 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec  11.9 MBytes  11.9 MBytes/sec  10171
[  5]   1.00-2.00   sec  11.9 MBytes  11.9 MBytes/sec  10179
[  5]   2.00-3.00   sec  11.9 MBytes  11.9 MBytes/sec  10179
[  5]   3.00-4.00   sec  11.9 MBytes  11.9 MBytes/sec  10179
[  5]   4.00-5.00   sec  11.9 MBytes  11.9 MBytes/sec  10179
[  5]   5.00-6.00   sec  11.9 MBytes  11.9 MBytes/sec  10180
[  5]   6.00-7.00   sec  11.9 MBytes  11.9 MBytes/sec  10179
[  5]   7.00-8.00   sec  11.9 MBytes  11.9 MBytes/sec  10179
[  5]   8.00-9.00   sec  11.9 MBytes  11.9 MBytes/sec  10179
[  5]   9.00-10.00  sec  11.9 MBytes  11.9 MBytes/sec  10180
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   119 MBytes  11.9 MBytes/sec  0.000 ms  0/101784 (0%)  sender
[  5]   0.00-10.16  sec   114 MBytes  11.2 MBytes/sec  0.055 ms  4235/101784 (4.2%)  receiver

iperf Done.

Now I need to figure out whether the problem is on the client or on the server side, so I ran the test from my local workstation with both servers as targets and found:

iperf3 -c 63.81.90.219 -u -b 100M -f M
Connecting to host 63.81.90.219, port 5201
[  5] local 10.10.10.10 port 40106 connected to 63.81.90.219 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec  11.9 MBytes  11.9 MBytes/sec  8673
[  5]   1.00-2.00   sec  11.9 MBytes  11.9 MBytes/sec  8681
[  5]   2.00-3.00   sec  11.9 MBytes  11.9 MBytes/sec  8680
[  5]   3.00-4.00   sec  11.9 MBytes  11.9 MBytes/sec  8681
[  5]   4.00-5.00   sec  11.9 MBytes  11.9 MBytes/sec  8680
[  5]   5.00-6.00   sec  11.9 MBytes  11.9 MBytes/sec  8681
[  5]   6.00-7.00   sec  11.9 MBytes  11.9 MBytes/sec  8680
[  5]   7.00-8.00   sec  11.9 MBytes  11.9 MBytes/sec  8681
[  5]   8.00-9.00   sec  11.9 MBytes  11.9 MBytes/sec  8681
[  5]   9.00-10.00  sec  11.9 MBytes  11.9 MBytes/sec  8680
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   119 MBytes  11.9 MBytes/sec  0.000 ms  0/86798 (0%)  sender
[  5]   0.00-10.18  sec  45.9 MBytes  4.51 MBytes/sec  0.392 ms  51353/84801 (61%)  receiver

iperf Done.

and to the other target:

iperf3 -c 45.157.178.248 -u -b 100M -f M
Connecting to host 45.157.178.248, port 5201
[  5] local 10.10.10.10 port 36019 connected to 45.157.178.248 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec  11.9 MBytes  11.9 MBytes/sec  8673
[  5]   1.00-2.00   sec  11.9 MBytes  11.9 MBytes/sec  8681
[  5]   2.00-3.00   sec  11.9 MBytes  11.9 MBytes/sec  8680
[  5]   3.00-4.00   sec  11.9 MBytes  11.9 MBytes/sec  8681
[  5]   4.00-5.00   sec  11.9 MBytes  11.9 MBytes/sec  8680
[  5]   5.00-6.00   sec  11.9 MBytes  11.9 MBytes/sec  8681
[  5]   6.00-7.00   sec  11.9 MBytes  11.9 MBytes/sec  8680
[  5]   7.00-8.00   sec  11.9 MBytes  11.9 MBytes/sec  8681
[  5]   8.00-9.00   sec  11.9 MBytes  11.9 MBytes/sec  8680
[  5]   9.00-10.00  sec  11.9 MBytes  11.9 MBytes/sec  8681
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   119 MBytes  11.9 MBytes/sec  0.000 ms  0/86798 (0%)  sender
[  5]   0.00-10.01  sec  46.7 MBytes  4.66 MBytes/sec  0.401 ms  50818/84800 (60%)  receiver

iperf Done.

Looks like an issue on my local side since all of a sudden both targets show the same loss % :frowning:

Tried iperf3 from a 3rd server towards the initial 2 servers which were showing the problem:

iperf3 -c 45.157.178.248 -u -b 100M -f M
Connecting to host 45.157.178.248, port 5201
[  4] local 188.165.225.88 port 50235 connected to 45.157.178.248 port 5201
[ ID] Interval           Transfer     Bandwidth       Total Datagrams
[  4]   0.00-1.00   sec  10.9 MBytes  10.9 MBytes/sec  1394
[  4]   1.00-2.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   2.00-3.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   3.00-4.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   4.00-5.00   sec  11.9 MBytes  11.9 MBytes/sec  1525
[  4]   5.00-6.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   6.00-7.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   7.00-8.00   sec  11.9 MBytes  11.9 MBytes/sec  1525
[  4]   8.00-9.00   sec  11.9 MBytes  11.9 MBytes/sec  1527
[  4]   9.00-10.00  sec  11.9 MBytes  11.9 MBytes/sec  1526
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]   0.00-10.00  sec   118 MBytes  11.8 MBytes/sec  0.133 ms  1394/11904 (12%)
[  4] Sent 11904 datagrams

and

iperf3 -c 63.81.90.219 -u -b 100M -f M
Connecting to host 63.81.90.219, port 5201
[  4] local 188.165.225.88 port 43873 connected to 63.81.90.219 port 5201
[ ID] Interval           Transfer     Bandwidth       Total Datagrams
[  4]   0.00-1.00   sec  10.9 MBytes  10.9 MBytes/sec  1394
[  4]   1.00-2.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   2.00-3.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   3.00-4.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   4.00-5.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   5.00-6.00   sec  11.9 MBytes  11.9 MBytes/sec  1525
[  4]   6.00-7.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   7.00-8.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   8.00-9.00   sec  11.9 MBytes  11.9 MBytes/sec  1526
[  4]   9.00-10.00  sec  11.9 MBytes  11.9 MBytes/sec  1526
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]   0.00-10.00  sec   118 MBytes  11.8 MBytes/sec  0.180 ms  1441/12336 (12%)
[  4] Sent 12336 datagrams

iperf Done.

Not sure where to go from here. I guess there is a networking problem between the 2 servers I mentioned initially, seeing that reducing the speed to 100M led to an acceptable packet loss of around 3%.

Trying to keep iperf3 bandwidth to 100M on the non-tailscale IP shows inconsistency:

iperf3 -c 63.81.90.219 -b 100M -f M
Connecting to host 63.81.90.219, port 5201
[  5] local 45.157.178.248 port 46860 connected to 63.81.90.219 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.07 MBytes  2.07 MBytes/sec    0    452 KBytes
[  5]   1.00-2.00   sec  11.9 MBytes  11.9 MBytes/sec    0   5.45 MBytes
[  5]   2.00-3.00   sec  15.8 MBytes  15.7 MBytes/sec    0   5.45 MBytes
[  5]   3.00-4.00   sec  15.8 MBytes  15.8 MBytes/sec    0   5.45 MBytes
[  5]   4.00-5.00   sec  14.2 MBytes  14.2 MBytes/sec    0   5.45 MBytes
[  5]   5.00-6.00   sec  11.9 MBytes  11.9 MBytes/sec    0   5.45 MBytes
[  5]   6.00-7.00   sec  11.9 MBytes  11.9 MBytes/sec    0   5.45 MBytes
[  5]   7.00-8.00   sec  12.0 MBytes  12.0 MBytes/sec    0   5.45 MBytes
[  5]   8.00-9.00   sec  11.9 MBytes  11.9 MBytes/sec    0   5.45 MBytes
[  5]   9.00-10.00  sec  12.0 MBytes  12.0 MBytes/sec    0   5.45 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   119 MBytes  11.9 MBytes/sec    0             sender
[  5]   0.00-10.16  sec   119 MBytes  11.7 MBytes/sec                  receiver

iperf Done.

iperf3 -c 63.81.90.219 -b 100M -f M
Connecting to host 63.81.90.219, port 5201
[  5] local 45.157.178.248 port 46956 connected to 63.81.90.219 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.19 MBytes  1.19 MBytes/sec    6    168 KBytes
[  5]   1.00-2.00   sec   896 KBytes  0.87 MBytes/sec    9   59.4 KBytes
[  5]   2.00-3.00   sec   256 KBytes  0.25 MBytes/sec    4   33.9 KBytes
[  5]   3.00-4.00   sec   128 KBytes  0.13 MBytes/sec    4   18.4 KBytes
[  5]   4.00-5.00   sec   128 KBytes  0.12 MBytes/sec    2   9.90 KBytes
[  5]   5.00-6.00   sec  0.00 Bytes  0.00 MBytes/sec    0   12.7 KBytes
[  5]   6.00-7.00   sec   128 KBytes  0.13 MBytes/sec    0   17.0 KBytes
[  5]   7.00-8.00   sec   128 KBytes  0.13 MBytes/sec    1   14.1 KBytes
[  5]   8.00-9.00   sec  0.00 Bytes  0.00 MBytes/sec    2   12.7 KBytes
[  5]   9.00-10.00  sec   128 KBytes  0.13 MBytes/sec    0   15.6 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.94 MBytes  0.29 MBytes/sec   28             sender
[  5]   0.00-10.17  sec  1.96 MBytes  0.19 MBytes/sec                  receiver

iperf Done.

No need to leave the forum just because the problem likely isn’t tailscale :slight_smile: This place is whatever people want to make it. And unfortunately, debugging network problems is kind of fun.

It looks like when you tried to reach both of your servers from your desktop, you had very high packet loss, so I suspect maybe your home uplink is slow? You might want to try in the other direction, or just intentionally reduce the speed in that test until your packet loss goes back down. 60% loss is extremely high.

Did you say that if you run iperf3 below 70 Mbps between the two servers, there’s near 0% loss? That’s… interesting. And doesn’t quite match what isoping was showing, which was 11/859 = 1.3% loss even when not heavily loaded. Notice that the isoping loss was only in one direction (the tx direction, from the data you posted). That’s a sign that your link may be asymmetrically broken.
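
For anyone else reading isoping output, the per-direction loss can be worked out from the counters on each status line. The sketch below parses the last line posted earlier in the thread. The regular expression is written against that pasted format only and is an assumption about isoping's output in general.

```python
import re

# Matches e.g. "loss: 11/859 tx  0/858 rx" at the end of an isoping status line.
LOSS_RE = re.compile(r"loss:\s*(\d+)/(\d+)\s*tx\s+(\d+)/(\d+)\s*rx")


def loss_percentages(line: str) -> tuple[float, float]:
    """Return (tx_loss_pct, rx_loss_pct) parsed from one isoping status line."""
    m = LOSS_RE.search(line)
    if m is None:
        raise ValueError("no loss counters found in line")
    tx_lost, tx_total, rx_lost, rx_total = map(int, m.groups())
    return 100 * tx_lost / tx_total, 100 * rx_lost / rx_total


last_line = "  82.5 ms tx    82.0 ms rx  (min=82.0)  loss: 11/859 tx  0/858 rx"
tx, rx = loss_percentages(last_line)
print(f"tx loss {tx:.2f}%, rx loss {rx:.2f}%")   # tx loss 1.28%, rx loss 0.00%
```

All of the loss is in the tx direction, which is what points at an asymmetric problem.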

Have you played with the mtr tool before? It might be able to find which hop on the link between your two servers is the first one with the problem.

What threw me for a loop is that this test: iperf3 -c 63.81.90.219 -b 100M -f M
run several times in a row results, seemingly at random, in either
a) 11.7 MBytes/sec
b) 0.19 MBytes/sec

That makes zero sense to me :frowning:

###edit###
Thanks for the encouragement about it being OK to ask about this problem on this forum. I guess I was just trying not to pollute a product-specific forum with general questions, but seeing that I already hijacked another person’s thread and haven’t been told off, I guess people around here are quite friendly :slight_smile:

Here’s a quick look at what mtr shows from both servers:

 mtr -rwbzc 100 -i 0.2 -rw 63.81.90.219
Start: 2021-04-21T21:53:44+0200
HOST: hyper                                                                  Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS197540 45.157.176.2                                                    0.0%   100    0.5   3.2   0.2  59.0  10.6
  2. AS47147  ae3-4019.bbr02.anx84.nue.de.anexia-it.net (144.208.211.10)      0.0%   100    0.7   1.7   0.3  36.0   5.3
  3. AS47147  ae0-0.bbr01.anx84.nue.de.anexia-it.net (144.208.208.139)        0.0%   100    3.7   4.4   3.7  20.4   2.2
  4. AS47147  ae2-0.bbr02.anx25.fra.de.anexia-it.net (144.208.208.141)        0.0%   100    4.2   7.9   3.7  69.1   9.9
  5. AS47147  ae0-0.bbr01.anx25.fra.de.anexia-it.net (144.208.208.143)        0.0%   100    3.7   6.6   3.5  62.0   8.1
  6. AS1299   ffm-b5-link.ip.twelve99.net (62.115.14.116)                     0.0%   100    3.9   5.5   3.6  37.1   6.4
  7. AS1299   ffm-bb1-link.ip.twelve99.net (62.115.114.88)                    0.0%   100   95.5  95.5  95.3  96.1   0.2
  8. AS1299   prs-bb1-link.ip.twelve99.net (62.115.123.13)                    0.0%   100   94.7  94.7  94.4  95.4   0.2
  9. AS1299   ash-bb2-link.ip.twelve99.net (62.115.112.242)                   0.0%   100  100.4 100.5 100.3 101.1   0.2
 10. AS1299   ash-b2-link.ip.twelve99.net (62.115.123.125)                    0.0%   100  100.2 100.3  99.7 115.8   1.9
 11. AS1299   verizon-ic332636-ash-b2.ip.twelve99-cust.net (80.239.135.179)  21.0%   100  197.0 197.1 196.4 213.6   2.4
 12. AS???    0.ae1.GW1.SCL2.ALTER.NET (140.222.232.209)                      0.0%   100  166.5 167.7 166.3 202.7   5.4
 13. AS701    lanset-gw.customer.alter.net (204.148.181.2)                    0.0%   100  173.7 173.9 173.3 184.8   1.5
 14. AS16578  63.81.90.219                                                    0.0%   100  168.2 168.2 168.0 168.9   0.1

and

mtr -rwbzc 100 -i 0.2 -rw 45.157.178.248
Start: 2021-04-21T21:53:48+0200
HOST: ict                                                                  Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS16578  63.81.80.1                                                    0.0%   100    0.5   0.8   0.5   9.4   1.0
  2. AS701    97.TenGigE0-3-0-12.GW6.SCL2.ALTER.NET (204.148.181.1)         0.0%   100    7.3   7.5   7.2  15.2   1.0
  3. AS???    0.et-11-3-0.GW7.LAX15.ALTER.NET (140.222.235.151)             0.0%   100   16.2  16.2  15.0  48.7   4.1
  4. AS701    152.179.21.22                                                 0.0%   100   15.5  15.5  15.2  19.6   0.6
  5. AS3320   pd900c602.dip0.t-ipconnect.de (217.0.198.2)                   0.0%   100  4656. 1241. 168.2 7005. 1984.8
  6. AS3320   80.156.161.186                                                0.0%   100  161.9 165.3 160.9 223.5  11.6
  7. AS47147  ae1-0.bbr01.anx84.nue.de.anexia-it.net (144.208.208.140)      0.0%   100  165.0 164.9 163.9 182.6   3.0
  8. AS47147  netcup-gw.bbr01.anx84.nue.de.anexia-it.net (144.208.211.31)   0.0%   100  165.6 167.9 165.5 234.3   9.8
  9. AS197540 hyper.ict-consult.co.za (45.157.178.248)                      0.0%   100  168.2 168.2 168.1 168.8   0.1
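
One way to read mtr reports like the two above: loss that shows up at a single intermediate hop but not at any later hop is commonly attributed to that router rate-limiting its own ICMP replies, whereas loss that carries through to the destination is more likely to be real. This is a rule of thumb rather than proof, and mtr's default ICMP probes may not be treated the same way as iperf3's UDP or TCP traffic. The sketch below applies that rule to a few lines copied from the first report. The regular expression is written against the report layout pasted in this thread, and the function names are made up.

```python
import re

# Hop line of the pasted report: " N. ASxxx host (ip)   Loss%  Snt  Last ..."
HOP_RE = re.compile(r"^\s*(\d+)\.\s+(\S+)\s+(.+?)\s+(\d+(?:\.\d+)?)%\s+\d+\s")

SAMPLE = """\
 10. AS1299   ash-b2-link.ip.twelve99.net (62.115.123.125)                    0.0%   100  100.2 100.3  99.7 115.8   1.9
 11. AS1299   verizon-ic332636-ash-b2.ip.twelve99-cust.net (80.239.135.179)  21.0%   100  197.0 197.1 196.4 213.6   2.4
 12. AS???    0.ae1.GW1.SCL2.ALTER.NET (140.222.232.209)                      0.0%   100  166.5 167.7 166.3 202.7   5.4
 14. AS16578  63.81.90.219                                                    0.0%   100  168.2 168.2 168.0 168.9   0.1
"""


def parse_hops(report: str) -> list:
    """Return [(hop number, host, loss percent), ...] for each hop line."""
    hops = []
    for line in report.splitlines():
        m = HOP_RE.match(line)
        if m:
            hops.append((int(m.group(1)), m.group(3), float(m.group(4))))
    return hops


def lossy_hops(hops: list) -> list:
    """Label each hop that shows loss as persistent or isolated."""
    out = []
    for i, (number, host, loss) in enumerate(hops):
        if loss > 0:
            # Persistent means every later hop, including the destination, also shows loss.
            persists = all(later > 0 for _, _, later in hops[i + 1:])
            label = "persists to destination" if persists else "isolated to this hop"
            out.append((number, host, loss, label))
    return out


for number, host, loss, label in lossy_hops(parse_hops(SAMPLE)):
    print(f"hop {number} ({host}): {loss}% loss, {label}")
```

On the sample lines, hop 11 is flagged as isolated, since hops 12 and 14 report 0% loss. By the rule of thumb this report alone does not pin the packet loss on that hop.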

Hi Ovidiu,

Do you mind capturing one of your slow-over-tailscale iperf sessions so we can analyze the packets on our end? You can do that with a command like one of the following (on the receiving side):

on linux:
tcpdump -ni tailscale0 -w iperf.pcap

or on macOS:
tcpdump -ni utun2 -w iperf.pcap
(you may have to replace utun2 with some other interface name to match the tailscale interface).

Make sure to compress the file using zip or gzip, and then please send to support@tailscale.com. Maybe we can figure out the pattern to why tailscale is slower than an unencrypted link in your high-packet-loss situation. Even though the correct fix is probably to eliminate the packet loss, we’d at least like for tailscale to not further aggravate the problem.
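
If the command-line zip or gzip tools are not handy, the capture can also be compressed with a few lines of Python. This is just a convenience sketch. The helper name is made up, and iperf.pcap is the file name used in the tcpdump commands above.

```python
import gzip
import shutil
import sys
from pathlib import Path


def gzip_file(path: str) -> Path:
    """Write a gzip-compressed copy next to `path` and return its location."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python compress.py iperf.pcap
    print(gzip_file(sys.argv[1]))
```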

Thanks!

I’ve run iperf3 via UDP and sent you the info but after running 5 times, the results all of a sudden look very stable. It might very well be that the networking problem between the 2 servers has been fixed. :frowning: