docs: inventory automation state and stale resurrection paths

This commit is contained in:
Alexander Whitestone
2026-04-04 16:16:10 -04:00
parent 4489cee478
commit 6fa3f97284
3 changed files with 379 additions and 34 deletions

View File

@@ -1,23 +1,27 @@
# DEPRECATED — Bash Loop Scripts Removed
# DEPRECATED — policy, not proof of runtime absence
**Date:** 2026-03-25
**Reason:** Replaced by Hermes + timmy-config sidecar orchestration
Original deprecation date: 2026-03-25
## What was removed
- claude-loop.sh, gemini-loop.sh, agent-loop.sh
- timmy-orchestrator.sh, workforce-manager.py
- nexus-merge-bot.sh, claudemax-watchdog.sh, timmy-loopstat.sh
This file records the policy direction: long-running ad hoc bash loops were meant
to be replaced by Hermes-side orchestration.
## What replaces them
**Harness:** Hermes
**Overlay repo:** Timmy_Foundation/timmy-config
**Entry points:** `orchestration.py`, `tasks.py`, `deploy.sh`
**Features:** Huey + SQLite scheduling, local-model health checks, session export, DPO artifact staging
But policy and world state diverged.
Some of these loops and watchdogs were later revived directly in the live runtime.
## Why
The bash loops crash-looped, produced zero work after relaunch, had no crash
recovery, no durable export path, and required too many ad hoc scripts. The
Hermes sidecar keeps orchestration close to Timmy's actual config and training
surfaces.
Do NOT use this file as proof that something is gone.
Use `docs/automation-inventory.md` as the current world-state document.
Do NOT recreate bash loops. If orchestration is broken, fix the Hermes sidecar.
## Deprecated by policy
- old dashboard-era loop stacks
- old tmux resurrection paths
- old startup paths that recreate `timmy-loop`
- stale repo-specific automation tied to `Timmy-time-dashboard` or `the-matrix`
## Current rule
If an automation question matters, audit:
1. launchd loaded jobs
2. live process table
3. Hermes cron list
4. the automation inventory doc
Only then decide what is actually live.

View File

@@ -14,10 +14,9 @@ timmy-config/
├── DEPRECATED.md ← What was removed and why
├── config.yaml ← Hermes harness configuration
├── channel_directory.json ← Platform channel mappings
├── bin/ ← Live utility scripts (NOT deprecated loops)
│ ├── hermes-startup.sh ← Hermes boot sequence
├── bin/ ← Sidecar-managed operational scripts
│ ├── hermes-startup.sh ← Dormant startup path (audit before enabling)
│ ├── agent-dispatch.sh ← Manual agent dispatch
│ ├── deploy-allegro-house.sh← Bootstraps the remote Allegro wizard house
│ ├── ops-panel.sh ← Ops dashboard panel
│ ├── ops-gitea.sh ← Gitea ops helpers
│ ├── pipeline-freshness.sh ← Session/export drift check
@@ -26,7 +25,7 @@ timmy-config/
├── skins/ ← UI skins (timmy skin)
├── playbooks/ ← Agent playbooks (YAML)
├── cron/ ← Cron job definitions
├── wizards/ ← Remote wizard-house templates + units
├── docs/automation-inventory.md ← Live automation + stale-state inventory
└── training/ ← Transitional training recipes, not canonical lived data
```
@@ -42,9 +41,10 @@ If a file answers "who is Timmy?" or "how does Hermes host him?", it belongs
here. If it answers "what has Timmy done or learned?" it belongs in
`timmy-home`.
The scripts in `bin/` are live operational helpers for the Hermes sidecar.
What is dead are the old long-running bash worker loops, not every script in
this repo.
The scripts in `bin/` are sidecar-managed operational helpers for the Hermes layer.
Do NOT assume older prose about removed loops is still true at runtime.
Audit the live machine first, then read `docs/automation-inventory.md` for the
current reality and stale-state risks.
## Orchestration: Huey
@@ -56,15 +56,6 @@ pip install huey
huey_consumer.py tasks.huey -w 2 -k thread
```
## Proof Standard
This repo uses a hard proof rule for merges.
- visual changes require screenshot proof
- CLI/verifiable changes must cite logs, command output, or world-state proof
- screenshots/media stay out of Gitea backup unless explicitly required
- see `CONTRIBUTING.md` for the merge gate
## Deploy
```bash

View File

@@ -0,0 +1,350 @@
# Automation Inventory
Last audited: 2026-04-04 15:55 EDT
Owner: Timmy sidecar / Timmy home split
Purpose: document every known automation that can restart services, revive old worktrees, reuse stale session state, or re-enter old queue state.
## Why this file exists
The failure mode is not just "a process is running".
The failure mode is:
- launchd or a watchdog restarts something behind our backs
- the restarted process reads old config, old labels, old worktrees, old session mappings, or old tmux assumptions
- the machine appears haunted because old state comes back after we thought it was gone
This file is the source of truth for what automations exist, what state they read, and how to stop or reset them safely.
## Source-of-truth split
Not all automations live in one repo.
1. timmy-config
Path: ~/.timmy/timmy-config
Owns: sidecar deployment, ~/.hermes/config.yaml overlay, launch-facing helper scripts in timmy-config/bin/
2. timmy-home
Path: ~/.timmy
Owns: Kimi heartbeat script at uniwizard/kimi-heartbeat.sh and other workspace-native automation
3. live runtime
Path: ~/.hermes/bin
Reality: some scripts are still only present live in ~/.hermes/bin and are NOT yet mirrored into timmy-config/bin/
Rule:
- Do not assume ~/.hermes/bin is canonical.
- Do not assume timmy-config contains every currently running automation.
- Audit runtime first, then reconcile to source control.
## Current live automations
### A. launchd-loaded automations
These are loaded right now according to `launchctl list` after the 2026-04-04 phase-2 cleanup.
The only Timmy-specific launchd jobs still loaded are the ones below.
#### 1. ai.hermes.gateway
- Plist: ~/Library/LaunchAgents/ai.hermes.gateway.plist
- Command: `python -m hermes_cli.main gateway run --replace`
- HERMES_HOME: `~/.hermes`
- Logs:
- `~/.hermes/logs/gateway.log`
- `~/.hermes/logs/gateway.error.log`
- KeepAlive: yes
- RunAtLoad: yes
- State it reuses:
- `~/.hermes/config.yaml`
- `~/.hermes/channel_directory.json`
- `~/.hermes/sessions/sessions.json`
- `~/.hermes/state.db`
- Old-state risk:
- if config drifted, this gateway will faithfully revive the drift
- if Telegram/session mappings are stale, it will continue stale conversations
Stop:
```bash
launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/ai.hermes.gateway.plist
```
Start:
```bash
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/ai.hermes.gateway.plist
```
#### 2. ai.hermes.gateway-fenrir
- Plist: ~/Library/LaunchAgents/ai.hermes.gateway-fenrir.plist
- Command: same gateway binary
- HERMES_HOME: `~/.hermes/profiles/fenrir`
- Logs:
- `~/.hermes/profiles/fenrir/logs/gateway.log`
- `~/.hermes/profiles/fenrir/logs/gateway.error.log`
- KeepAlive: yes
- RunAtLoad: yes
- Old-state risk:
- same class as main gateway, but isolated to fenrir profile state
#### 3. ai.openclaw.gateway
- Plist: ~/Library/LaunchAgents/ai.openclaw.gateway.plist
- Command: `node .../openclaw/dist/index.js gateway --port 18789`
- Logs:
- `~/.openclaw/logs/gateway.log`
- `~/.openclaw/logs/gateway.err.log`
- KeepAlive: yes
- RunAtLoad: yes
- Old-state risk:
- long-lived gateway survives toolchain assumptions and keeps accepting work even if upstream routing changed
#### 4. ai.timmy.kimi-heartbeat
- Plist: ~/Library/LaunchAgents/ai.timmy.kimi-heartbeat.plist
- Command: `/bin/bash ~/.timmy/uniwizard/kimi-heartbeat.sh`
- Interval: every 300s
- Logs:
- `/tmp/kimi-heartbeat-launchd.log`
- `/tmp/kimi-heartbeat-launchd.err`
- script log: `/tmp/kimi-heartbeat.log`
- State it reuses:
- `/tmp/kimi-heartbeat.lock`
- Gitea labels: `assigned-kimi`, `kimi-in-progress`, `kimi-done`
- repo issue bodies/comments as task memory
- Current behavior as of this audit:
- stale `kimi-in-progress` tasks are now reclaimed after 1 hour of silence
- Old-state risk:
- labels ARE the queue state; if labels are stale, the heartbeat used to starve forever
- the heartbeat is source-controlled in timmy-home, not timmy-config
Stop:
```bash
launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/ai.timmy.kimi-heartbeat.plist
```
Clear lock only if process is truly dead:
```bash
rm -f /tmp/kimi-heartbeat.lock
```
#### 5. ai.timmy.claudemax-watchdog
- Plist: ~/Library/LaunchAgents/ai.timmy.claudemax-watchdog.plist
- Command: `/bin/bash ~/.hermes/bin/claudemax-watchdog.sh`
- Interval: every 300s
- Logs:
- `~/.hermes/logs/claudemax-watchdog.log`
- launchd wrapper: `~/.hermes/logs/claudemax-launchd.log`
- State it reuses:
- live process table via `pgrep`
- recent Claude logs `~/.hermes/logs/claude-*.log`
- backlog count from Gitea
- Current behavior as of this audit:
- will NOT restart claude-loop if recent Claude logs say `You've hit your limit`
- will log-and-skip missing helper scripts instead of failing loudly
- Old-state risk:
- any watchdog can resurrect a loop you meant to leave dead
- this is the first place to check when a loop "comes back"
### B. quarantined legacy launch agents
These were moved out of `~/Library/LaunchAgents` on 2026-04-04 to:
`~/Library/LaunchAgents.quarantine/timmy-legacy-20260404/`
#### 6. com.timmy.dashboard-backend
- Former plist: `com.timmy.dashboard-backend.plist`
- Former command: uvicorn `dashboard.app:app`
- Former working directory: `~/worktrees/kimi-repo`
- Quarantine reason:
- served code from a specific stale worktree
- could revive old backend state by launchd KeepAlive alone
#### 7. com.timmy.matrix-frontend
- Former plist: `com.timmy.matrix-frontend.plist`
- Former command: `npx vite --host`
- Former working directory: `~/worktrees/the-matrix`
- Quarantine reason:
- pointed at the old `the-matrix` lineage instead of current nexus truth
- could revive a stale frontend every login
#### 8. ai.hermes.startup
- Former plist: `ai.hermes.startup.plist`
- Former command: `~/.hermes/bin/hermes-startup.sh`
- Quarantine reason:
- startup path still expected missing `timmy-tmux.sh`
- could recreate old webhook/tmux assumptions at login
#### 9. com.timmy.tick
- Former plist: `com.timmy.tick.plist`
- Former command: `/Users/apayne/Timmy-time-dashboard/deploy/timmy-tick-mac.sh`
- Quarantine reason:
- pure dashboard-era legacy path
### C. running now but NOT launchd-managed
These are live processes, but not currently represented by a loaded launchd plist.
They can still persist because they were started with `nohup` or by other parent scripts.
#### 8. gemini-loop.sh
- Live process: `~/.hermes/bin/gemini-loop.sh`
- State files:
- `~/.hermes/logs/gemini-loop.log`
- `~/.hermes/logs/gemini-skip-list.json`
- `~/.hermes/logs/gemini-active.json`
- `~/.hermes/logs/gemini-locks/`
- `~/.hermes/logs/gemini-pids/`
- worktrees under `~/worktrees/gemini-w*`
- per-issue logs `~/.hermes/logs/gemini-*.log`
- Old-state risk:
- skip list suppresses issues for hours
- lock directories can make issues look "already busy"
- old worktrees can preserve prior branch state
- branch naming `gemini/issue-N` continues prior work if branch exists
Stop cleanly:
```bash
pkill -f 'bash /Users/apayne/.hermes/bin/gemini-loop.sh'
pkill -f 'gemini .*--yolo'
rm -rf ~/.hermes/logs/gemini-locks/*.lock ~/.hermes/logs/gemini-pids/*.pid
printf '{}\n' > ~/.hermes/logs/gemini-active.json
```
#### 9. timmy-orchestrator.sh
- Live process: `~/.hermes/bin/timmy-orchestrator.sh`
- State files:
- `~/.hermes/logs/timmy-orchestrator.log`
- `~/.hermes/logs/timmy-orchestrator.pid`
- `~/.hermes/logs/timmy-reviews.log`
- `~/.hermes/logs/workforce-manager.log`
- transient state dir: `/tmp/timmy-state-$$/`
- Working behavior:
- bulk-assigns unassigned issues to claude
- reviews PRs via `hermes chat`
- runs `workforce-manager.py`
- Old-state risk:
- writes agent assignments back into Gitea
- can repopulate agent queues even after you thought they were cleared
- not represented in timmy-config/bin yet as of this audit
### D. Hermes cron automations
Current cron inventory from `cronjob(list, include_disabled=true)`:
Enabled:
- `a77a87392582` — Health Monitor — every 5m
Paused:
- `9e0624269ba7` — Triage Heartbeat
- `e29eda4a8548` — PR Review Sweep
- `5e9d952871bc` — Agent Status Check
- `36fb2f630a17` — Hermes Philosophy Loop
Old-state risk:
- paused crons are not dead forever; they are resumable state
- LLM-wrapped crons can revive old routing/model assumptions if resumed blindly
### E. file exists but NOT currently loaded
These are the ones most likely to surprise us later because they still exist and point at old realities.
#### 10. com.tower.pr-automerge
- Plist: `~/Library/LaunchAgents/com.tower.pr-automerge.plist`
- Points to: `/Users/apayne/hermes-config/bin/pr-automerge.sh`
- Not loaded at audit time
- Separate Tower-era automation path; not part of current Timmy sidecar truth
## State carriers that make the machine feel haunted
These are the files and external states that most often "bring back old state":
### Hermes runtime state
- `~/.hermes/config.yaml`
- `~/.hermes/channel_directory.json`
- `~/.hermes/sessions/sessions.json`
- `~/.hermes/state.db`
### Loop state
- `~/.hermes/logs/claude-skip-list.json`
- `~/.hermes/logs/claude-active.json`
- `~/.hermes/logs/claude-locks/`
- `~/.hermes/logs/claude-pids/`
- `~/.hermes/logs/gemini-skip-list.json`
- `~/.hermes/logs/gemini-active.json`
- `~/.hermes/logs/gemini-locks/`
- `~/.hermes/logs/gemini-pids/`
### Kimi queue state
- Gitea labels, not local files, are the queue truth
- `assigned-kimi`
- `kimi-in-progress`
- `kimi-done`
### Worktree state
- `~/worktrees/*`
- especially old frontend/backend worktrees like:
- `~/worktrees/the-matrix`
- `~/worktrees/kimi-repo`
### Launchd state
- plist files in `~/Library/LaunchAgents`
- anything with `RunAtLoad` and `KeepAlive` can resurrect automatically
## Audit commands
List loaded Timmy/Hermes automations:
```bash
launchctl list | egrep 'timmy|kimi|claude|max|dashboard|matrix|gateway|huey'
```
List Timmy/Hermes launch agent files:
```bash
find ~/Library/LaunchAgents -maxdepth 1 -name '*.plist' | egrep 'timmy|hermes|openclaw|tower'
```
List running loop scripts:
```bash
ps -Ao pid,ppid,etime,command | egrep '/Users/apayne/.hermes/bin/|/Users/apayne/.timmy/uniwizard/'
```
List cron jobs:
```bash
hermes cron list --include-disabled
```
## Safe reset order when old state keeps coming back
1. Stop launchd jobs first
```bash
launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/ai.timmy.kimi-heartbeat.plist || true
launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/ai.timmy.claudemax-watchdog.plist || true
launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/ai.hermes.gateway.plist || true
launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/ai.hermes.gateway-fenrir.plist || true
launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/ai.openclaw.gateway.plist || true
```
2. Kill manual loops
```bash
pkill -f 'gemini-loop.sh' || true
pkill -f 'timmy-orchestrator.sh' || true
pkill -f 'claude-loop.sh' || true
pkill -f 'claude .*--print' || true
pkill -f 'gemini .*--yolo' || true
```
3. Clear local loop state
```bash
rm -rf ~/.hermes/logs/claude-locks/*.lock ~/.hermes/logs/claude-pids/*.pid
rm -rf ~/.hermes/logs/gemini-locks/*.lock ~/.hermes/logs/gemini-pids/*.pid
printf '{}\n' > ~/.hermes/logs/claude-active.json
printf '{}\n' > ~/.hermes/logs/gemini-active.json
rm -f /tmp/kimi-heartbeat.lock
```
4. If gateway/session drift is the problem, back up before clearing
```bash
cp ~/.hermes/config.yaml ~/.hermes/config.yaml.bak.$(date +%Y%m%d-%H%M%S)
cp ~/.hermes/sessions/sessions.json ~/.hermes/sessions/sessions.json.bak.$(date +%Y%m%d-%H%M%S)
```
5. Relaunch only what you explicitly want
## Current contradictions to fix later
1. README and DEPRECATED were corrected on 2026-04-04, but older local clones may still have stale prose.
2. The quarantined launch agents now live under `~/Library/LaunchAgents.quarantine/timmy-legacy-20260404/`; if someone moves them back, the old state can return.
3. `gemini-loop.sh` and `timmy-orchestrator.sh` are live but not yet mirrored into timmy-config/bin/.
4. The open docs PR must be kept clean: do not mix operational script recovery and documentation history on the same branch.
Until those are reconciled, trust this inventory over older prose.