Tighten Hermes cutover and export checks
@@ -1,7 +1,7 @@
 # DEPRECATED — Bash Loop Scripts Removed

 **Date:** 2026-03-25
-**Reason:** Replaced by sovereign-orchestration (SQLite + Python single-process executor)
+**Reason:** Replaced by Hermes + timmy-config sidecar orchestration

 ## What was removed
 - claude-loop.sh, gemini-loop.sh, agent-loop.sh
@@ -9,14 +9,15 @@
 - nexus-merge-bot.sh, claudemax-watchdog.sh, timmy-loopstat.sh

 ## What replaces them
-**Repo:** Timmy_Foundation/sovereign-orchestration
-**Entry point:** `python3 src/sovereign_executor.py --workers 3 --poll 30`
-**Features:** SQLite task queue, crash recovery, dedup, playbooks, MCP server
-**Issues:** #29 (fix imports), #30 (deploy as service)
+**Harness:** Hermes
+**Overlay repo:** Timmy_Foundation/timmy-config
+**Entry points:** `orchestration.py`, `tasks.py`, `deploy.sh`
+**Features:** Huey + SQLite scheduling, local-model health checks, session export, DPO artifact staging

 ## Why
 The bash loops crash-looped, produced zero work after relaunch, had no crash
-recovery, no dedup, and required 8 separate scripts. The Python executor is
-one process with SQLite durability.
+recovery, no durable export path, and required too many ad hoc scripts. The
+Hermes sidecar keeps orchestration close to Timmy's actual config and training
+surfaces.

-Do NOT recreate bash loops. If the executor is broken, fix the executor.
+Do NOT recreate bash loops. If orchestration is broken, fix the Hermes sidecar.

@@ -14,11 +14,12 @@ timmy-config/
 ├── DEPRECATED.md ← What was removed and why
 ├── config.yaml ← Hermes harness configuration
 ├── channel_directory.json ← Platform channel mappings
-├── bin/ ← Utility scripts (NOT loops — see below)
+├── bin/ ← Live utility scripts (NOT deprecated loops)
 │   ├── hermes-startup.sh ← Hermes boot sequence
 │   ├── agent-dispatch.sh ← Manual agent dispatch
 │   ├── ops-panel.sh ← Ops dashboard panel
 │   ├── ops-gitea.sh ← Gitea ops helpers
+│   ├── pipeline-freshness.sh ← Session/export drift check
 │   └── timmy-status.sh ← Status check
 ├── memories/ ← Persistent memory YAML
 ├── skins/ ← UI skins (timmy skin)
@@ -39,10 +40,14 @@ If a file answers "who is Timmy?" or "how does Hermes host him?", it belongs
 here. If it answers "what has Timmy done or learned?" it belongs in
 `timmy-home`.

+The scripts in `bin/` are live operational helpers for the Hermes sidecar.
+What is dead are the old long-running bash worker loops, not every script in
+this repo.
+
 ## Orchestration: Huey

 All orchestration (triage, PR review, dispatch) runs via [Huey](https://github.com/coleifer/huey) with SQLite.
-`orchestration.py` (6 lines) + `tasks.py` (~70 lines) replace the entire sovereign-orchestration repo (3,846 lines).
+`orchestration.py` + `tasks.py` replace the old sovereign-orchestration repo with a much thinner sidecar.

 ```bash
 pip install huey

bin/pipeline-freshness.sh (executable file, 42 lines changed)
@@ -0,0 +1,42 @@
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+SESSIONS_DIR="$HOME/.hermes/sessions"
+EXPORT_DIR="$HOME/.timmy/training-data/dpo-pairs"
+
+latest_session=$(find "$SESSIONS_DIR" -maxdepth 1 -name 'session_*.json' -type f -print 2>/dev/null | sort | tail -n 1)
+latest_export=$(find "$EXPORT_DIR" -maxdepth 1 -name 'session_*.json' -type f -print 2>/dev/null | sort | tail -n 1)
+
+echo "latest_session=${latest_session:-none}"
+echo "latest_export=${latest_export:-none}"
+
+if [ -z "${latest_session:-}" ]; then
+    echo "status=ok"
+    echo "reason=no sessions yet"
+    exit 0
+fi
+
+if [ -z "${latest_export:-}" ]; then
+    echo "status=lagging"
+    echo "reason=no exports yet"
+    exit 1
+fi
+
+session_mtime=$(stat -f '%m' "$latest_session")
+export_mtime=$(stat -f '%m' "$latest_export")
+lag_minutes=$(( (session_mtime - export_mtime) / 60 ))
+if [ "$lag_minutes" -lt 0 ]; then
+    lag_minutes=0
+fi
+
+echo "lag_minutes=$lag_minutes"
+
+if [ "$lag_minutes" -gt 300 ]; then
+    echo "status=lagging"
+    echo "reason=exports more than 5 hours behind sessions"
+    exit 1
+fi
+
+echo "status=ok"
+echo "reason=exports within freshness window"
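
Note on portability: `stat -f '%m'` is the BSD/macOS form; GNU coreutils spells the same query `stat -c '%Y'`, so the script as committed presumably targets a macOS host. The decision logic itself is platform-neutral. Below is a minimal Python sketch of the same classification, assuming the script's 300-minute window and `session_*.json` naming; the function names `newest_mtime` and `export_status` are illustrative and not part of the commit.

```python
from pathlib import Path
from typing import Optional, Tuple

FRESHNESS_WINDOW_MINUTES = 300  # same 5-hour threshold the script uses


def newest_mtime(directory: Path, pattern: str = "session_*.json") -> Optional[float]:
    """Return the mtime of the lexicographically last match, or None if none exist."""
    files = sorted(directory.glob(pattern))
    return files[-1].stat().st_mtime if files else None


def export_status(session_mtime: Optional[float],
                  export_mtime: Optional[float]) -> Tuple[str, str]:
    """Classify export freshness from two optional mtimes (seconds since epoch)."""
    if session_mtime is None:
        return "ok", "no sessions yet"
    if export_mtime is None:
        return "lagging", "no exports yet"
    # Clamp at zero: an export newer than the latest session is not lag.
    lag_minutes = max(0, int((session_mtime - export_mtime) // 60))
    if lag_minutes > FRESHNESS_WINDOW_MINUTES:
        return "lagging", "exports more than 5 hours behind sessions"
    return "ok", "exports within freshness window"
```

Keeping the classification separate from the filesystem lookup makes the three outcomes (no sessions, no exports, lag over the window) easy to exercise without touching `~/.hermes`.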
deploy.sh (24 lines changed)
@@ -3,7 +3,7 @@
 # This is the canonical way to deploy Timmy's configuration.
 # Hermes-agent is the engine. timmy-config is the driver's seat.
 #
-# Usage: ./deploy.sh [--restart-loops]
+# Usage: ./deploy.sh

 set -euo pipefail

@@ -74,24 +74,10 @@ done
 chmod +x "$HERMES_HOME/bin/"*.sh "$HERMES_HOME/bin/"*.py 2>/dev/null || true
 log "bin/ -> $HERMES_HOME/bin/"

-# === Restart loops if requested ===
-if [ "${1:-}" = "--restart-loops" ]; then
-    log "Killing existing loops..."
-    pkill -f 'claude-loop.sh' 2>/dev/null || true
-    pkill -f 'gemini-loop.sh' 2>/dev/null || true
-    pkill -f 'timmy-orchestrator.sh' 2>/dev/null || true
-    sleep 2
-
-    log "Clearing stale locks..."
-    rm -rf "$HERMES_HOME/logs/claude-locks/"* 2>/dev/null || true
-    rm -rf "$HERMES_HOME/logs/gemini-locks/"* 2>/dev/null || true
-
-    log "Relaunching loops..."
-    nohup bash "$HERMES_HOME/bin/timmy-orchestrator.sh" >> "$HERMES_HOME/logs/timmy-orchestrator.log" 2>&1 &
-    nohup bash "$HERMES_HOME/bin/claude-loop.sh" 2 >> "$HERMES_HOME/logs/claude-loop.log" 2>&1 &
-    nohup bash "$HERMES_HOME/bin/gemini-loop.sh" 1 >> "$HERMES_HOME/logs/gemini-loop.log" 2>&1 &
-    sleep 1
-    log "Loops relaunched."
+if [ "${1:-}" != "" ]; then
+    echo "ERROR: deploy.sh no longer accepts legacy loop flags." >&2
+    echo "Deploy the sidecar only. Do not relaunch deprecated bash loops." >&2
+    exit 1
 fi

 log "Deploy complete. timmy-config applied to $HERMES_HOME/"

tasks.py (53 lines changed)
@@ -26,6 +26,13 @@ NET_LINE_LIMIT = 10

 HEARTBEAT_MODEL = "hermes4:14b"
 FALLBACK_MODEL = "hermes3:8b"
+LOCAL_PROVIDER_BASE_URL = "http://localhost:8081/v1"
+LOCAL_PROVIDER_MODEL = HEARTBEAT_MODEL
+
+
+def newest_file(directory, pattern):
+    files = sorted(directory.glob(pattern))
+    return files[-1] if files else None


 def hermes_local(prompt, model=None, caller_tag=None, toolsets=None):
@@ -322,26 +329,32 @@ def session_export():

 @huey.periodic_task(crontab(minute="*/5"))  # every 5 minutes
 def model_health():
-    """Check Ollama is running, a model is loaded, inference responds."""
+    """Check the active local inference surface and export freshness."""
     checks = {}
+    models_url = f"{LOCAL_PROVIDER_BASE_URL}/models"
+    chat_url = f"{LOCAL_PROVIDER_BASE_URL}/chat/completions"

-    # 1. Is Ollama process running?
+    checks["provider"] = "local-llama.cpp"
+    checks["provider_base_url"] = LOCAL_PROVIDER_BASE_URL
+    checks["provider_model"] = LOCAL_PROVIDER_MODEL
+
+    # 1. Is the local inference process running?
     try:
         result = subprocess.run(
-            ["pgrep", "-f", "ollama"],
+            ["pgrep", "-f", "llama-server|ollama"],
             capture_output=True, timeout=5
         )
-        checks["ollama_running"] = result.returncode == 0
+        checks["local_inference_running"] = result.returncode == 0
     except Exception:
-        checks["ollama_running"] = False
+        checks["local_inference_running"] = False

-    # 2. Can we hit the API?
+    # 2. Can we hit the configured API?
     try:
         import urllib.request
-        req = urllib.request.Request("http://localhost:11434/api/tags")
+        req = urllib.request.Request(models_url)
         with urllib.request.urlopen(req, timeout=5) as resp:
             data = json.loads(resp.read())
-            models = [m["name"] for m in data.get("models", [])]
+            models = [m.get("id", "?") for m in data.get("data", [])]
             checks["models_loaded"] = models
             checks["api_responding"] = True
     except Exception as e:
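
Note on the response-shape change in this hunk: the old code read Ollama's native `/api/tags` payload, which lists models under `models[].name`, while the new code reads an OpenAI-style `/v1/models` payload, which lists them under `data[].id`. The sketch below shows both shapes side by side. It is illustrative only; `model_ids` is a hypothetical helper that does not appear in the commit, and the sample payloads in the usage lines are hand-written rather than captured from a server.

```python
import json
from typing import List


def model_ids(body: bytes) -> List[str]:
    """Extract model identifiers from either response shape."""
    data = json.loads(body)
    if "data" in data:
        # OpenAI-compatible list: {"object": "list", "data": [{"id": ...}, ...]}
        return [m.get("id", "?") for m in data["data"]]
    # Ollama /api/tags: {"models": [{"name": ...}, ...]}
    return [m.get("name", "?") for m in data.get("models", [])]


# Hand-written sample payloads, one per shape.
openai_style = b'{"object": "list", "data": [{"id": "hermes4:14b"}]}'
ollama_style = b'{"models": [{"name": "hermes3:8b"}]}'
print(model_ids(openai_style), model_ids(ollama_style))
```

The commit itself takes the simpler route of supporting only the new shape, which is reasonable once the provider base URL is fixed to the local `/v1` endpoint.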
@@ -352,13 +365,13 @@ def model_health():
     if checks.get("api_responding"):
         try:
             payload = json.dumps({
-                "model": "hermes3:8b",
+                "model": LOCAL_PROVIDER_MODEL,
                 "messages": [{"role": "user", "content": "ping"}],
                 "max_tokens": 5,
                 "stream": False,
             }).encode()
             req = urllib.request.Request(
-                "http://localhost:11434/v1/chat/completions",
+                chat_url,
                 data=payload,
                 headers={"Content-Type": "application/json"},
             )
@@ -368,6 +381,26 @@ def model_health():
             checks["inference_ok"] = False
             checks["inference_error"] = str(e)

+    # 4. Is session export keeping up with new Hermes sessions?
+    sessions_dir = HERMES_HOME / "sessions"
+    export_dir = TIMMY_HOME / "training-data" / "dpo-pairs"
+    latest_session = newest_file(sessions_dir, "session_*.json")
+    latest_export = newest_file(export_dir, "session_*.json")
+    checks["latest_session"] = latest_session.name if latest_session else None
+    checks["latest_export"] = latest_export.name if latest_export else None
+    if latest_session and latest_export:
+        session_mtime = latest_session.stat().st_mtime
+        export_mtime = latest_export.stat().st_mtime
+        lag_minutes = max(0, int((session_mtime - export_mtime) // 60))
+        checks["export_lag_minutes"] = lag_minutes
+        checks["export_fresh"] = lag_minutes <= 300
+    elif latest_session and not latest_export:
+        checks["export_lag_minutes"] = None
+        checks["export_fresh"] = False
+    else:
+        checks["export_lag_minutes"] = 0
+        checks["export_fresh"] = True
+
     # Write health status to a file for other tools to read
     health_file = HERMES_HOME / "model_health.json"
     checks["timestamp"] = datetime.now(timezone.utc).isoformat()

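
Note for consumers of `model_health.json`: the hunk above only shows the file path and the `checks` keys, not the write itself, so the following is a sketch under the assumption that the file holds the `checks` dict serialized as a single JSON object. `REQUIRED_CHECKS`, `load_health`, and `failing_checks` are hypothetical names introduced for illustration; only the key names come from the diff.

```python
import json
from pathlib import Path
from typing import Dict, List

# Boolean keys set by model_health() in this commit.
REQUIRED_CHECKS = (
    "local_inference_running",
    "api_responding",
    "inference_ok",
    "export_fresh",
)


def load_health(path: Path) -> Dict[str, object]:
    """Read the health file; assumes it contains one JSON object."""
    return json.loads(path.read_text())


def failing_checks(checks: Dict[str, object]) -> List[str]:
    """Return the required checks that are missing or falsy."""
    return [key for key in REQUIRED_CHECKS if not checks.get(key)]
```

Treating a missing key as a failure is a deliberate choice here: in the diff, the inference check is only attempted when `api_responding` is true, so `inference_ok` may be absent when the API is down, and a reader should probably not interpret that absence as healthy.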