timmy-home/docs/FLEET_SECRET_ROTATION.md
Alexander Whitestone b334139fb5
feat: add fleet secret rotation playbook (#694)
2026-04-14 23:59:54 -04:00

Fleet Secret Rotation

Issue: timmy-home#694

This runbook provides a single place to rotate fleet API keys, service tokens, and SSH authorized keys without hand-editing remote hosts.

Files

  • ansible/inventory/hosts.ini — fleet hosts (ezra, bezalel)
  • ansible/inventory/group_vars/fleet.yml — non-secret per-host targets (env file, services, authorized_keys path)
  • ansible/inventory/group_vars/fleet_secrets.vault.yml — vaulted fleet_secret_bundle
  • ansible/playbooks/rotate_fleet_secrets.yml — staged rotation + restart verification + rollback

Secret inventory shape

fleet_secret_bundle is keyed by host. Each host carries the env secrets to rewrite plus the full authorized_keys payload to distribute.

fleet_secret_bundle:
  ezra:
    env:
      GITEA_TOKEN: !vault |
        ...
      TELEGRAM_BOT_TOKEN: !vault |
        ...
      PRIMARY_MODEL_API_KEY: !vault |
        ...
    ssh_authorized_keys: !vault |
      ...

The committed vault file contains placeholder encrypted values only. Replace them with real rotated material before production use.

Rotate a new bundle

From repo root:

cd ansible
ansible-vault edit inventory/group_vars/fleet_secrets.vault.yml
ansible-playbook -i inventory/hosts.ini playbooks/rotate_fleet_secrets.yml --ask-vault-pass

Or update one value at a time: encrypt it with ansible-vault encrypt_string and paste the resulting !vault block into fleet_secret_bundle.
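
A minimal sketch of the single-value path, assuming a POSIX shell and a vault password entered interactively. The function name encrypt_fleet_secret is hypothetical; --ask-vault-pass and --stdin-name are standard ansible-vault encrypt_string options. Reading the value from stdin keeps it out of shell history.

```shell
# Hypothetical helper: encrypt one value for fleet_secret_bundle.
# The secret is read from stdin, so it never appears on the command line.
encrypt_fleet_secret() {
  key_name="$1"
  if [ -z "$key_name" ]; then
    echo "usage: encrypt_fleet_secret KEY_NAME" >&2
    return 2
  fi
  # Prompts for the vault password, then reads the plaintext from stdin.
  ansible-vault encrypt_string --ask-vault-pass --stdin-name "$key_name"
}
```

For example, run encrypt_fleet_secret GITEA_TOKEN, paste the new token, and end input with Ctrl-D. The command prints a GITEA_TOKEN: !vault | block that can replace the matching entry under the host's env map.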

What the playbook does

  1. Validates that each host has a secret bundle and target metadata.
  2. Writes rollback snapshots under /var/lib/timmy/secret-rotations/<rotation_id>/<host>/.
  3. Stages a candidate .env file and candidate authorized_keys file before promotion.
  4. Promotes staged files into place.
  5. Restarts every declared dependent service.
  6. Verifies each service with systemctl is-active.
  7. If anything fails, restores the previous .env and authorized_keys, restarts services again, and aborts the run.
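
Steps 5 and 6 can be spot-checked by hand with a loop like the one below. This is a sketch, not the playbook's actual tasks: restart_and_verify and the SYSTEMCTL override are hypothetical names, and the service names passed in are whatever fleet.yml declares for the host.

```shell
# Sketch of steps 5-6: restart each dependent service, then confirm it is active.
# SYSTEMCTL is a hypothetical override so the loop can be exercised without systemd.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

restart_and_verify() {
  for service in "$@"; do
    "$SYSTEMCTL" restart "$service" || return 1
    # is-active exits 0 only when the unit is active; --quiet suppresses the state text.
    if ! "$SYSTEMCTL" is-active --quiet "$service"; then
      echo "service not active after restart: $service" >&2
      return 1
    fi
  done
  return 0
}
```

A non-zero return from this loop corresponds to the failure case in step 7, where the playbook restores the previous files and aborts.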

Rollback semantics

Rollback is host-safe and runs automatically inside the playbook's rescue: block.

  • Existing .env and authorized_keys files are restored from backup when they existed before rotation.
  • Newly created files are removed if the host had no prior version.
  • Service restart is retried after rollback so the node returns to the last-known-good bundle.
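
The two restore cases above can be illustrated in shell. This is a sketch under stated assumptions, not the playbook's implementation: snapshot_file, restore_file, and the .absent marker file are hypothetical, and the snapshot directory stands in for the per-host path under /var/lib/timmy/secret-rotations/.

```shell
# Record the pre-rotation state of one file.
# If the file does not exist yet, leave a marker so rollback knows to delete it.
snapshot_file() {
  target="$1"
  snapshot_dir="$2"
  name=$(basename "$target")
  mkdir -p "$snapshot_dir" || return 1
  if [ -e "$target" ]; then
    cp -p "$target" "$snapshot_dir/$name"
  else
    : > "$snapshot_dir/$name.absent"
  fi
}

# Return one file to its pre-rotation state.
restore_file() {
  target="$1"
  snapshot_dir="$2"
  name=$(basename "$target")
  if [ -e "$snapshot_dir/$name.absent" ]; then
    # The host had no prior version: remove the newly created file.
    rm -f "$target"
  else
    # The host had a prior version: put the backup back in place.
    cp -p "$snapshot_dir/$name" "$target"
  fi
}
```

Tracking absence explicitly is one way to distinguish "restore the old file" from "remove the new file"; without it, a missing backup would be ambiguous.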

Operational notes

  • Keep required_env_keys in ansible/inventory/group_vars/fleet.yml aligned with each house's real runtime contract.
  • ssh_authorized_keys distributes public keys only. Rotate corresponding private keys out-of-band, then publish the new authorized key list through the vault.
  • Use one vault edit per rotation window so API keys, bot tokens, and SSH access move together.
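
As a rough aid for the first note, the check below lists required keys that are missing from an env file. It is a sketch only: missing_env_keys is a hypothetical helper, it assumes plain KEY=value lines, and the keys passed in should come from required_env_keys in fleet.yml.

```shell
# Print every required key that has no KEY= line in the given env file.
# Returns 0 when nothing is missing, 1 otherwise.
missing_env_keys() {
  env_file="$1"
  shift
  missing=0
  for key in "$@"; do
    if ! grep -q "^${key}=" "$env_file"; then
      echo "$key"
      missing=1
    fi
  done
  return "$missing"
}
```

Called with a host's env file path followed by the key names, it prints nothing when every required key is present.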