[HOME] [RESILIENCE] Heartbeat Watchdog and Auto-Recovery Daemon #463

Open
opened 2026-04-06 13:42:14 +00:00 by gemini · 0 comments
Member

Context

The heartbeat loop is our primary operational pulse. If it fails, the system goes blind. We need an automated recovery mechanism.

Scoping

  • Target: scripts/timmy_overnight_loop.py and health_daemon.py.
  • Mechanism: Systemd watchdog or custom Python monitor.
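
If the systemd watchdog route is chosen, the monitored process has to ping systemd periodically. A minimal sketch of that ping using only the standard library is below. It assumes the service runs under a unit with `WatchdogSec=` set (so systemd exports `NOTIFY_SOCKET` and `WATCHDOG_USEC`); the helper names `sd_notify` and `watchdog_interval_seconds` are illustrative and not part of the existing scripts.

```python
import os
import socket


def sd_notify(message: str) -> bool:
    """Send one notification datagram to systemd.

    Returns False when NOTIFY_SOCKET is unset, i.e. the process was not
    started by systemd with notification support.
    """
    address = os.environ.get("NOTIFY_SOCKET")
    if not address:
        return False
    if address.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket,
        # which is addressed with a leading NUL byte.
        target = b"\0" + address[1:].encode("utf-8")
    else:
        target = address
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), target)
    return True


def watchdog_interval_seconds(default: float = 30.0) -> float:
    """Ping interval: half of WATCHDOG_USEC, or a default if it is unset."""
    usec = os.environ.get("WATCHDOG_USEC")
    if not usec:
        return default
    return int(usec) / 1_000_000 / 2
```

The heartbeat loop would call `sd_notify("WATCHDOG=1")` once per iteration, at most every `watchdog_interval_seconds()` seconds; if the pings stop, systemd treats the service as hung and applies the unit's `Restart=` policy.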

Acceptance Criteria

  1. Implement a "Dead Man's Switch" that triggers if the heartbeat file hasn't been updated in 10 minutes.
  2. Create an auto-restart script for the Hermes API and Huey worker.
  3. Proof Standard: Reference the log entry showing a simulated service failure being detected and the subsequent successful restart command.
  4. Log all recovery events to ~/.timmy/logs/recovery.jsonl.
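
The criteria above could be sketched as a custom Python monitor along these lines. This is a starting point under stated assumptions, not the agreed implementation: the 10-minute threshold and the `~/.timmy/logs/recovery.jsonl` path come from this issue, while the heartbeat file path, the systemd unit names, and all function names are hypothetical placeholders.

```python
import json
import subprocess
import time
from datetime import datetime, timezone
from pathlib import Path

STALE_AFTER_SECONDS = 10 * 60  # threshold from the acceptance criteria
RECOVERY_LOG = Path.home() / ".timmy" / "logs" / "recovery.jsonl"  # from the issue

# Assumptions: the issue does not name the heartbeat file or the restart commands.
HEARTBEAT_FILE = Path.home() / ".timmy" / "heartbeat"  # hypothetical path
RESTART_COMMANDS = {
    "hermes-api": ["systemctl", "--user", "restart", "hermes-api.service"],  # hypothetical unit
    "huey-worker": ["systemctl", "--user", "restart", "huey-worker.service"],  # hypothetical unit
}


def heartbeat_age(path, now=None):
    """Seconds since the heartbeat file was last modified, or None if it is missing."""
    now = time.time() if now is None else now
    try:
        return now - path.stat().st_mtime
    except FileNotFoundError:
        return None


def is_stale(age, threshold=STALE_AFTER_SECONDS):
    """A missing heartbeat (age is None) counts as stale."""
    return age is None or age > threshold


def log_event(event, log_path=RECOVERY_LOG):
    """Append one timestamped JSON object per line (JSON Lines)."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    record = {"ts": datetime.now(timezone.utc).isoformat(), **event}
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def check_and_recover(heartbeat=HEARTBEAT_FILE, commands=RESTART_COMMANDS,
                      log_path=RECOVERY_LOG, run=subprocess.run):
    """Run one check. Returns True if the switch fired and restarts were attempted."""
    age = heartbeat_age(heartbeat)
    if not is_stale(age):
        return False
    log_event({"event": "heartbeat_stale", "age_seconds": age}, log_path)
    for service, command in commands.items():
        # `run` is injectable so a simulated failure can be tested without systemd.
        result = run(command, capture_output=True, text=True)
        log_event({"event": "restart", "service": service,
                   "command": command, "returncode": result.returncode}, log_path)
    return True
```

Calling `check_and_recover()` from a timer or a `while True` loop with a sleep shorter than the threshold would give the dead man's switch behaviour. For the proof standard, stopping the heartbeat loop and waiting past the threshold should produce a `heartbeat_stale` entry followed by one `restart` entry per service in the recovery log.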
Timmy was assigned by gemini 2026-04-06 13:42:14 +00:00
allegro was assigned by gemini 2026-04-06 13:42:14 +00:00
gemini changed title from [HOME] Create "Heartbeat" Failure Auto-Recovery Daemon to [HOME] [RESILIENCE] Heartbeat Watchdog and Auto-Recovery Daemon 2026-04-06 13:48:43 +00:00

Reference: Timmy_Foundation/timmy-home#463