Exit node breaks split-horizon DNS

Tailscale version: 1.22.2 (macOS), 1.22.2 (Linux), 1.22.0 (iOS)
Operating system & version: macOS 12.3.1, Amazon Linux 2, iOS 15.4.1

I have Tailscale installed on my M1 MacBook and iPhone, connecting to several devices at home, including a Raspberry Pi 4 running Ubuntu as a subnet router and exit node. MagicDNS is configured for split-horizon DNS: queries for *.axeltech.com are forwarded to my local DNS server, which sits on a subnet advertised by the RPi subnet router.

This all works fine when I am out of the house, whether or not I am using the exit node. DNS queries for *.axeltech.com names correctly resolve using the internal nameserver. Other DNS queries resolve using either the client's default nameserver or my home nameserver, depending on whether I am using the exit node.

However, I am currently traveling and decided to stand up another exit node in AWS (Amazon Linux 2) to get lower latency while I’m using unencrypted hotel wifi. When I switch to this exit node in either my macOS or iOS Tailscale client, split-horizon DNS breaks. DNS resolution for non-axeltech.com domains works fine and appears to use the EC2 DNS resolver from the exit node, which makes sense. DNS queries for axeltech.com domains time out.

The odd thing is that routing to the new exit node and to my existing subnet router back at home both work fine. I can dig my home nameserver directly and resolve *.axeltech.com domains. I can also connect to devices behind that subnet router by IP address, and all of my other internet traffic goes out through the new exit node.

I originally thought that Tailscale was preferring the default DNS via the exit node over the explicit split-horizon domain. In that case, though, I would expect NXDOMAIN responses for hostnames that are internal only, and the external IP address for hostnames that exist in public DNS for that domain. Instead I’m getting timeouts.

# Querying my home's nameserver directly via subnet router works
❯ dig @10.70.10.1 dvr.axeltech.com

; <<>> DiG 9.10.6 <<>> @10.70.10.1 dvr.axeltech.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44896
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;dvr.axeltech.com.		IN	A

;; ANSWER SECTION:
dvr.axeltech.com.	3600	IN	A	10.70.8.6

;; Query time: 80 msec
;; SERVER: 10.70.10.1#53(10.70.10.1)
;; WHEN: Wed Apr 13 10:01:52 PDT 2022
;; MSG SIZE  rcvd: 61

# the same query against an external nameserver returns no answer (NOERROR with an empty answer section), as expected
❯ dig @1.1.1.1 dvr.axeltech.com

; <<>> DiG 9.10.6 <<>> @1.1.1.1 dvr.axeltech.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7958
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;dvr.axeltech.com.		IN	A

;; AUTHORITY SECTION:
axeltech.com.		900	IN	SOA	ns-1022.awsdns-63.net. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400

;; Query time: 87 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Wed Apr 13 10:02:38 PDT 2022
;; MSG SIZE  rcvd: 127

# querying Tailscale's resolver times out
❯ dig @100.100.100.100 dvr.axeltech.com

; <<>> DiG 9.10.6 <<>> @100.100.100.100 dvr.axeltech.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

# but querying for an external hostname works just fine
❯ dig @100.100.100.100 tailscale.com

; <<>> DiG 9.10.6 <<>> @100.100.100.100 tailscale.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17645
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4095
;; QUESTION SECTION:
;tailscale.com.			IN	A

;; ANSWER SECTION:
tailscale.com.		60	IN	A	104.16.243.78
tailscale.com.		60	IN	A	104.16.244.78

;; Query time: 67 msec
;; SERVER: 100.100.100.100#53(100.100.100.100)
;; WHEN: Wed Apr 13 10:03:43 PDT 2022
;; MSG SIZE  rcvd: 74

I solved the issue. I needed to enable --accept-routes on the new exit node I stood up in AWS on the US West Coast.

This is counterintuitive, but it seems that with split-horizon MagicDNS, all DNS queries go via the exit node instead of taking the same path from the client that other traffic takes. Although my client could reach my home DNS resolver directly via the subnet router advertisement, my new West Coast exit node could not. Once I accepted the routes on that exit node, my DNS queries started working again.

So it seems that traffic to machines behind the subnet router takes a direct path from my client to that subnet router. DNS queries, however, first go to the exit node I selected and then follow the split-horizon DNS rules I set up in the MagicDNS settings to reach the appropriate DNS server, in this case from the new exit node to my home subnet router.
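
To make that inference concrete, here is a toy Python model of the path I think a DNS query takes. It is only a sketch of the behaviour I observed, not Tailscale's actual implementation. The Node class, the function names, the device names, and the 10.70.0.0/16 home route are all made up for illustration (I only know that 10.70.10.1 and 10.70.8.6 live behind the subnet router).

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a tailnet device and the subnet routes it has accepted."""
    name: str
    accepted_routes: list = field(default_factory=list)  # CIDR strings

    def can_reach(self, ip: str) -> bool:
        """True if any accepted subnet route covers the given IP address."""
        addr = ipaddress.ip_address(ip)
        return any(addr in ipaddress.ip_network(cidr) for cidr in self.accepted_routes)


def split_dns_resolver(hostname: str, split_dns: dict):
    """Return the resolver configured for the hostname's domain, or None."""
    name = hostname.rstrip(".")
    for domain, resolver in split_dns.items():
        if name == domain or name.endswith("." + domain):
            return resolver
    return None


def resolve_outcome(hostname: str, client: Node, exit_node, split_dns: dict) -> str:
    """Describe what happens to a query under the behaviour inferred above."""
    resolver = split_dns_resolver(hostname, split_dns)
    if resolver is None:
        return "answered by the default resolver"
    # Assumption from my observations: with an exit node selected, the exit
    # node (not the client) forwards the query to the split-DNS resolver.
    forwarder = exit_node if exit_node is not None else client
    if forwarder.can_reach(resolver):
        return f"forwarded to {resolver} by {forwarder.name}"
    return f"timeout: {forwarder.name} has no route to {resolver}"


SPLIT_DNS = {"axeltech.com": "10.70.10.1"}
HOME_ROUTE = "10.70.0.0/16"  # assumed CIDR, chosen only to cover 10.70.10.1

client = Node("macbook", [HOME_ROUTE])
aws_before = Node("aws-exit", [])           # exit node without --accept-routes
aws_after = Node("aws-exit", [HOME_ROUTE])  # exit node with --accept-routes

print(resolve_outcome("dvr.axeltech.com", client, aws_before, SPLIT_DNS))
print(resolve_outcome("dvr.axeltech.com", client, aws_after, SPLIT_DNS))
print(resolve_outcome("tailscale.com", client, aws_before, SPLIT_DNS))
```

In this model the client's own routes never matter once an exit node is selected, which matches what I saw: dig @10.70.10.1 worked from my laptop the whole time, while queries through 100.100.100.100 timed out until the exit node itself accepted the home route.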