Some checks failed
PR Checklist / pr-checklist (pull_request) Failing after 1m27s
Implements the Ansible Infrastructure as Code story from KT 2026-04-08. One canonical Ansible playbook defines: - Deadman switch (snapshot good config on health, rollback+restart on death) - Golden state config deployment (Anthropic BANNED, Kimi→Gemini→Ollama) - Cron schedule (source-controlled, no manual crontab edits) - Agent startup sequence (pull→validate→start→verify) - request_log telemetry table (every inference call logged) - Thin config pattern (immutable local pointer to upstream) - Gitea webhook handler (deploy on merge) - Config validator (rejects banned providers) Fleet inventory: Timmy (Mac), Allegro (VPS), Bezalel (VPS), Ezra (VPS) Roles: wizard_base, golden_state, deadman_switch, request_log, cron_manager Addresses: timmy-config #442, #443, #444, #445, #446 References: KT Final 2026-04-08 P2, KT Bezalel 2026-04-08 #1-#5
18 lines
675 B
YAML
18 lines
675 B
YAML
---
|
|
# =============================================================================
|
|
# deadman_switch.yml — Deploy Deadman Switch to All Wizards
|
|
# =============================================================================
|
|
# The deadman watch already fires and detects dead agents.
|
|
# This playbook wires the ACTION:
|
|
# - On healthy check: snapshot current config as "last known good"
|
|
# - On failed check: rollback config to snapshot, restart agent
|
|
# =============================================================================
|
|
|
|
- name: "Deploy Deadman Switch ACTION"
|
|
hosts: wizards
|
|
become: true
|
|
|
|
roles:
|
|
- role: deadman_switch
|
|
tags: [deadman, recovery]
|