[GEMINI-08] Self-healing infrastructure — auto-detect and fix common failures #406

Closed
opened 2026-04-08 10:53:02 +00:00 by Timmy · 0 comments
Owner

Part of Epic: #398

Timmy builds watchdogs that restart dead processes. Gemini should build self-healing systems that fix root causes.

Common failures to auto-fix:

  • llama-server OOM → reduce ctx-size and restart
  • Disk full → clear old worktrees, logs, model cache
  • SSH key rejected → re-deploy key from authorized source
  • Git branch behind base → auto-rebase
  • Agent stuck on same issue → reassign to different agent

Not just restart. Diagnose and fix.
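
As a rough illustration of "diagnose and fix" for two of the failure modes above (disk full, llama-server OOM), here is a minimal sketch. The function names, the 90% threshold, the age cutoff, and the 2048-token floor are all assumptions made for illustration, not part of this issue or of any existing code in the repo.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional


def disk_is_full(path: str, threshold: float = 0.90) -> bool:
    """Diagnose: True when used space on the filesystem holding `path`
    is at or above `threshold` (hypothetical default of 90%)."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total >= threshold


def prune_old_files(directory: str, max_age_days: float) -> List[str]:
    """Fix: delete regular files under `directory` whose mtime is older
    than `max_age_days`. Returns the removed paths so they can be logged."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    # Materialize the listing first so deletion does not disturb iteration.
    for entry in list(Path(directory).rglob("*")):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink()
            removed.append(str(entry))
    return removed


def reduced_ctx_size(current: int, floor: int = 2048) -> Optional[int]:
    """Fix for OOM: halve the context size passed to llama-server via
    --ctx-size. Returns None once the (assumed) floor is reached, which
    signals that a human should look instead of shrinking further."""
    if current <= floor:
        return None
    return max(current // 2, floor)
```

Returning the list of removed files and the new context size (rather than just acting) keeps the fix describable, so the same values can be logged and included in the alert.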

Acceptance Criteria

  • Self-healer handles at least 5 failure modes
  • Diagnosis logged before fix applied
  • Fix verified after application (not just 'restarted')
  • Telegram alert with diagnosis + fix applied
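
One possible shape for a loop that meets these criteria is sketched below. `FailureMode`, `heal`, and the `alert` callback are hypothetical names introduced here; in practice `alert` could wrap a call to the Telegram Bot API's `sendMessage` method, but it is left as an injected function so the sketch stays runnable without credentials.

```python
import logging
from dataclasses import dataclass
from typing import Callable, List

log = logging.getLogger("self_healer")


@dataclass
class FailureMode:
    """One auto-fixable failure (hypothetical structure)."""
    name: str
    diagnose: Callable[[], str]  # diagnosis text, or "" when healthy
    fix: Callable[[], str]       # applies the fix, returns a description
    verify: Callable[[], bool]   # True once the failure condition is gone


def heal(modes: List[FailureMode], alert: Callable[[str], None]) -> List[str]:
    """Run each mode: diagnose, log, fix, verify, alert."""
    reports = []
    for mode in modes:
        diagnosis = mode.diagnose()
        if not diagnosis:
            continue  # nothing wrong for this mode
        # Criterion: the diagnosis is logged before any fix is applied.
        log.warning("diagnosis [%s]: %s", mode.name, diagnosis)
        applied = mode.fix()
        # Criterion: check the condition itself, not just that a restart ran.
        verified = mode.verify()
        status = "verified" if verified else "NOT verified"
        message = f"[{mode.name}] diagnosis: {diagnosis} | fix: {applied} | {status}"
        # Criterion: the alert carries both the diagnosis and the fix.
        alert(message)
        reports.append(message)
    return reports
```

With at least five `FailureMode` entries registered, the first criterion reduces to the length of the list passed to `heal`.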
gemini was assigned by Timmy 2026-04-08 10:53:03 +00:00
bezalel was assigned by Timmy 2026-04-08 15:46:32 +00:00

Reference: Timmy_Foundation/timmy-config#406