Issues using Tailscale subnet router in production

Referencing the article here: Subnet router failover · Tailscale

While the basic ingredients for running a Tailscale subnet router in HA exist, it's missing some key operational parts. The requirements to run in HA are:

  1. Need to be able to script initialization and authentication of tailscale.
  2. Set it and forget it configuration. Not needing to recreate router boxes constantly.
  3. Be able to survive restarts.

To achieve 1, the only method is to generate auth keys and add them to the script that will call tailscale up on the box. However, auth keys have an expiry period, so there is now an additional burden to constantly recreate router boxes so that the key on them doesn't expire, which violates 2.
For 3, Tailscale doesn't seem to have an easy mechanism to run tailscale up with custom params (like auth-key) in a systemd init script, at least nothing that I could find in the docs. Since this is needed to achieve 3, it is currently left to the user to implement.

This post resulted from a support conversation. To summarize:

  1. Need to be able to script initialization and authentication of tailscale.

This is done via an authkey. It is true that the authkey itself expires after 90 days; however, the node created with the authkey does not expire along with it.
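As a rough illustration of scripting that first login, here is a minimal sketch that only assembles the `tailscale up` command line. The helper name `build_up_command`, the example routes, the placeholder key, and the `TS_AUTHKEY` environment variable are assumptions for illustration and did not come from the support conversation.

```python
import os
import shlex


def build_up_command(auth_key, routes):
    """Return the argv for a one-time `tailscale up` on a new subnet router.

    `auth_key` is the pre-generated auth key; `routes` is a list of CIDR
    strings to advertise (hypothetical values in the demo below).
    """
    if not auth_key:
        raise ValueError("an auth key is required for unattended login")
    cmd = ["tailscale", "up", f"--auth-key={auth_key}"]
    if routes:
        # Multiple routes are passed as one comma-separated value.
        cmd.append("--advertise-routes=" + ",".join(routes))
    return cmd


if __name__ == "__main__":
    # TS_AUTHKEY is an assumed variable name; the fallback is a placeholder.
    key = os.environ.get("TS_AUTHKEY", "tskey-auth-EXAMPLE")
    argv = build_up_command(key, ["10.0.0.0/24", "10.0.1.0/24"])
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(argv))
```

The key is only needed for this one registration step, so it can be dropped from the box once the node has joined.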

  2. Set it and forget it configuration. Not needing to recreate router boxes constantly.

A new node creates a node key, whose lifetime is independent of the authkey that created it. By default, node keys need to be refreshed every 6 months, but a node key can be set to never expire.
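To make the two lifetimes concrete, here is a small sketch of the date arithmetic. It treats "6 months" as 180 days, which is an approximation, and the function names are hypothetical; it only models the timing described in this reply and does not talk to Tailscale.

```python
from datetime import datetime, timedelta

AUTH_KEY_LIFETIME = timedelta(days=90)   # authkey lifetime stated in this thread
NODE_KEY_LIFETIME = timedelta(days=180)  # "6 months", approximated as 180 days


def auth_key_usable(key_created, now):
    """The authkey only matters at registration time: is it still valid now?"""
    return now < key_created + AUTH_KEY_LIFETIME


def node_key_expiry(registered_at, expiry_disabled=False):
    """Return when the node key lapses, or None if expiry was disabled."""
    if expiry_disabled:
        return None
    return registered_at + NODE_KEY_LIFETIME


if __name__ == "__main__":
    key_created = datetime(2024, 1, 1)
    registered = datetime(2024, 3, 1)  # node joins while the authkey is valid
    print(auth_key_usable(key_created, registered))
    print(node_key_expiry(registered))
    print(node_key_expiry(registered, expiry_disabled=True))
```

The point is that the node's clock starts at registration, so a router registered on day 60 of the authkey's life is unaffected when the authkey lapses on day 90.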

  3. Be able to survive restarts.

Arguments to “tailscale up” are stored and retained; they do not need to be re-issued every time.
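If a boot-time script is still wanted as a safety net, one option is to check whether the node actually needs to log in before re-running anything. The sketch below parses the output of `tailscale status --json` and looks at its `BackendState` field; the helper names are hypothetical, and treating `NeedsLogin` as the only trigger is an assumption rather than advice from the support conversation.

```python
import json
import subprocess


def needs_login(status_json):
    """Return True if the status JSON reports that the node must log in again."""
    status = json.loads(status_json)
    return status.get("BackendState") == "NeedsLogin"


def read_status_json():
    """Fetch the current status from the local tailscale CLI (not called here)."""
    result = subprocess.run(
        ["tailscale", "status", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Sample payloads stand in for real CLI output so the sketch runs anywhere.
    print(needs_login('{"BackendState": "Running"}'))
    print(needs_login('{"BackendState": "NeedsLogin"}'))
```

On a healthy router that has simply rebooted, the check returns False and nothing needs to be re-issued, which matches the behaviour described above.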