tailscale version 1.34.2
Windows 11 Pro 22H2 22621.1105
I have a tailscale network with a variety of devices. In my home I have some iOS devices, a NAS, and a Windows Desktop. In the cloud I have a few Linux VPCs.
In terms of the fundamentals, everything seems to be working correctly. The devices can all connect to each other. The addresses and DNS are all correct. IPv4 and v6 are both good to go. I’m honestly impressed by tailscale. My only issues are the very well known iOS battery usage problems, and the following problem.
I have a very annoying and persistent issue that occurs only for connections between my Windows desktop and devices that are outside my home network. Here is what I have found so far.
I first encountered the issue when I was working on a cloud VPC using SSH. I’m not using tailscale SSH. I just have OpenSSH server running on the VPC with port 22 open on the tailscale interface, and I use the standard OpenSSH client. Every so often, maybe once or twice a minute, the connection hangs. I type and nothing appears. Then, after a few seconds of delay, everything I typed appears at once, as if a batch of buffered packets suddenly got through. You can imagine how infuriatingly difficult it is to get any work done when the problem recurs this reliably.
I performed several tests and discovered that this issue was not specific to SSH. I could reproduce it reliably using the tailscale ping command. Here is what I discovered.
- When using SSH between my iPad and a VPC in the cloud, no issues.
- When using tailscale ping between any two devices on my home network, no issues.
- When using tailscale ping between any two different VPCs in the cloud, no issues.
- When using tailscale ping between a VPC and a device on my home network other than my Windows desktop, in either direction, no issues.
- When using tailscale ping between my Windows Desktop and a cloud VPC, in either direction, timeouts recur reliably every single time. Even if I do a fresh restart of my Windows desktop, the problem is there.
- When using regular ping, outside of tailscale, between my Windows Desktop and the cloud VPC, there are no problems with either IPv4 or IPv6.
I still have to get another Windows device on my home network to test with to determine if it’s a Windows problem, or a problem with just my desktop specifically.
Here is what the pings look like. In this example I am pinging from a cloud VPC to my desktop over tailscale. If I attempt the pings in the reverse direction, from desktop to VPC, the exact same phenomenon occurs. And it happens every single time.
In this example 100.100.100.100 is the tailscale IP of my desktop.
66.66.66.66 is the public IPv4 address of my home network’s router.
2600:2600:2600:2600::2600 is the public IPv6 address of my home network’s router.
apreche@myvpc:~$ tailscale ping -c 100 --until-direct=false --verbose mydesktop
2023/01/23 16:38:34 lookup "mydesktop" => "100.100.100.100"
ping "100.100.100.100" timed out
pong from mydesktop (100.100.100.100) via 66.66.66.66:1115 in 12ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 7ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 21ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 9ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 8ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 6ms
ping "100.100.100.100" timed out
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 7ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 7ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 9ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 7ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 8ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 6ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 6ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 7ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 10ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 6ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 7ms
ping "100.100.100.100" timed out
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 9ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 8ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 9ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 8ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 10ms
ping "100.100.100.100" timed out
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 12ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 6ms
pong from mydesktop (100.100.100.100) via [2600:2600:2600:2600::2600]:41641 in 9ms
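One thing that might help narrow this down is checking whether the timeouts are evenly spaced (which would point at some periodic event) and which endpoint each pong came through. Below is a minimal Python sketch for summarizing saved ping output. It assumes the output lines look exactly like the ones above; `summarize` and the two regexes are names I made up for illustration and are not part of tailscale.

```python
import re
from collections import Counter

# Patterns assume the line shapes shown in the output above.
PONG_RE = re.compile(r"^pong from (\S+) \((\S+)\) via (\S+) in (\d+(?:\.\d+)?)ms$")
TIMEOUT_RE = re.compile(r'^ping "(\S+)" timed out$')


def summarize(lines):
    """Count probes and timeouts, and record where in the run each timeout fell."""
    endpoints = Counter()      # pongs per "via" endpoint
    latencies = []             # round-trip times in ms
    timeout_positions = []     # 1-based probe index of each timeout
    probe = 0
    for raw in lines:
        line = raw.strip()
        pong = PONG_RE.match(line)
        if pong:
            probe += 1
            endpoints[pong.group(3)] += 1
            latencies.append(float(pong.group(4)))
        elif TIMEOUT_RE.match(line):
            probe += 1
            timeout_positions.append(probe)
        # Anything else (e.g. the "lookup" line) is ignored.
    # Distance, in probes, between consecutive timeouts.
    gaps = [b - a for a, b in zip(timeout_positions, timeout_positions[1:])]
    return {
        "probes": probe,
        "timeouts": len(timeout_positions),
        "timeout_positions": timeout_positions,
        "gaps_between_timeouts": gaps,
        "endpoints": dict(endpoints),
        "max_latency_ms": max(latencies) if latencies else None,
    }
```

To use it, save the ping output to a text file (any name will do) and pass the open file object to `summarize`. Roughly constant gaps would suggest something periodic, while irregular gaps would suggest ordinary packet loss on the path.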
Does anyone have any ideas on how I can further diagnose this problem? I’ve narrowed it down a whole bunch, but now I have no clues as to what the cause could be. Some Windows firewall or network configuration perhaps?
Thanks.