[gemini] Pull Hermes 4 14B — inference (GGUF) + training (MLX) models (#9)

2026-03-26 12:41:07 -04:00
13 changed files with 82 additions and 1085 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -8,3 +8,4 @@
 *.db-wal
 *.db-shm
 __pycache__/
+.aider*
--- a/DEPRECATED.md
+++ b/DEPRECATED.md
@@ -1,7 +1,7 @@
 # DEPRECATED — Bash Loop Scripts Removed

 **Date:** 2026-03-25
-**Reason:** Replaced by Hermes + timmy-config sidecar orchestration
+**Reason:** Replaced by sovereign-orchestration (SQLite + Python single-process executor)

 ## What was removed
 - claude-loop.sh, gemini-loop.sh, agent-loop.sh
@@ -9,15 +9,14 @@
 - nexus-merge-bot.sh, claudemax-watchdog.sh, timmy-loopstat.sh

 ## What replaces them
-**Harness:** Hermes
-**Overlay repo:** Timmy_Foundation/timmy-config
-**Entry points:** `orchestration.py`, `tasks.py`, `deploy.sh`
-**Features:** Huey + SQLite scheduling, local-model health checks, session export, DPO artifact staging
+**Repo:** Timmy_Foundation/sovereign-orchestration
+**Entry point:** `python3 src/sovereign_executor.py --workers 3 --poll 30`
+**Features:** SQLite task queue, crash recovery, dedup, playbooks, MCP server
+**Issues:** #29 (fix imports), #30 (deploy as service)

 ## Why
 The bash loops crash-looped, produced zero work after relaunch, had no crash
-recovery, no durable export path, and required too many ad hoc scripts. The
-Hermes sidecar keeps orchestration close to Timmy's actual config and training
-surfaces.
+recovery, no dedup, and required 8 separate scripts. The Python executor is
+one process with SQLite durability.

-Do NOT recreate bash loops. If orchestration is broken, fix the Hermes sidecar.
+Do NOT recreate bash loops. If the executor is broken, fix the executor.
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@

 Timmy's sovereign configuration. Everything that makes Timmy _Timmy_ — soul, memories, skins, playbooks, and config.

-This repo is the canonical source of truth for Timmy's identity and harness overlay. Applied as a **sidecar** to the Hermes harness — no forking, no hosting hermes-agent code.
+This repo is the canonical source of truth for Timmy's identity and operational state. Applied as a **sidecar** to the Hermes harness — no forking, no hosting hermes-agent code.

 ## Structure

@@ -14,40 +14,22 @@ timmy-config/
 ├── DEPRECATED.md              ← What was removed and why
 ├── config.yaml                ← Hermes harness configuration
 ├── channel_directory.json     ← Platform channel mappings
-├── bin/                       ← Live utility scripts (NOT deprecated loops)
+├── bin/                       ← Utility scripts (NOT loops — see below)
 │   ├── hermes-startup.sh      ← Hermes boot sequence
 │   ├── agent-dispatch.sh      ← Manual agent dispatch
 │   ├── ops-panel.sh           ← Ops dashboard panel
 │   ├── ops-gitea.sh           ← Gitea ops helpers
-│   ├── pipeline-freshness.sh  ← Session/export drift check
 │   └── timmy-status.sh        ← Status check
 ├── memories/                  ← Persistent memory YAML
 ├── skins/                     ← UI skins (timmy skin)
 ├── playbooks/                 ← Agent playbooks (YAML)
-├── cron/                      ← Cron job definitions
-└── training/                  ← Transitional training recipes, not canonical lived data
+└── cron/                      ← Cron job definitions
 ```

-## Boundary
-
-`timmy-config` owns identity, conscience, memories, skins, playbooks, channel
-maps, and harness-side orchestration glue.
-
-`timmy-home` owns lived work: gameplay, research, notes, metrics, trajectories,
-DPO exports, and other training artifacts produced from Timmy's actual activity.
-
-If a file answers "who is Timmy?" or "how does Hermes host him?", it belongs
-here. If it answers "what has Timmy done or learned?" it belongs in
-`timmy-home`.
-
-The scripts in `bin/` are live operational helpers for the Hermes sidecar.
-What is dead are the old long-running bash worker loops, not every script in
-this repo.
-
 ## Orchestration: Huey

 All orchestration (triage, PR review, dispatch) runs via [Huey](https://github.com/coleifer/huey) with SQLite.
-`orchestration.py` + `tasks.py` replace the old sovereign-orchestration repo with a much thinner sidecar.
+`orchestration.py` (6 lines) + `tasks.py` (~70 lines) replace the entire sovereign-orchestration repo (3,846 lines).

 ```bash
 pip install huey
--- a/bin/pipeline-freshness.sh
+++ b/bin/pipeline-freshness.sh
@@ -1,42 +0,0 @@
-#!/usr/bin/env bash
-
-set -euo pipefail
-
-SESSIONS_DIR="$HOME/.hermes/sessions"
-EXPORT_DIR="$HOME/.timmy/training-data/dpo-pairs"
-
-latest_session=$(find "$SESSIONS_DIR" -maxdepth 1 -name 'session_*.json' -type f -print 2>/dev/null | sort | tail -n 1)
-latest_export=$(find "$EXPORT_DIR" -maxdepth 1 -name 'session_*.json' -type f -print 2>/dev/null | sort | tail -n 1)
-
-echo "latest_session=${latest_session:-none}"
-echo "latest_export=${latest_export:-none}"
-
-if [ -z "${latest_session:-}" ]; then
-  echo "status=ok"
-  echo "reason=no sessions yet"
-  exit 0
-fi
-
-if [ -z "${latest_export:-}" ]; then
-  echo "status=lagging"
-  echo "reason=no exports yet"
-  exit 1
-fi
-
-session_mtime=$(stat -f '%m' "$latest_session")
-export_mtime=$(stat -f '%m' "$latest_export")
-lag_minutes=$(( (session_mtime - export_mtime) / 60 ))
-if [ "$lag_minutes" -lt 0 ]; then
-  lag_minutes=0
-fi
-
-echo "lag_minutes=$lag_minutes"
-
-if [ "$lag_minutes" -gt 300 ]; then
-  echo "status=lagging"
-  echo "reason=exports more than 5 hours behind sessions"
-  exit 1
-fi
-
-echo "status=ok"
-echo "reason=exports within freshness window"
--- a/bin/timmy-dashboard
+++ b/bin/timmy-dashboard
@@ -1,252 +0,0 @@
-#!/usr/bin/env python3
-"""Timmy Model Dashboard — where are my models, what are they doing.
-
-Usage:
-    timmy-dashboard              # one-shot
-    timmy-dashboard --watch      # live refresh every 30s
-    timmy-dashboard --hours=48   # look back 48h
-"""
-
-import json
-import os
-import subprocess
-import sys
-import time
-import urllib.request
-from datetime import datetime, timezone, timedelta
-from pathlib import Path
-
-HERMES_HOME = Path.home() / ".hermes"
-TIMMY_HOME = Path.home() / ".timmy"
-METRICS_DIR = TIMMY_HOME / "metrics"
-
-# ── Data Sources ──────────────────────────────────────────────────────
-
-def get_ollama_models():
-    try:
-        req = urllib.request.Request("http://localhost:11434/api/tags")
-        with urllib.request.urlopen(req, timeout=5) as resp:
-            return json.loads(resp.read()).get("models", [])
-    except Exception:
-        return []
-
-
-def get_loaded_models():
-    try:
-        req = urllib.request.Request("http://localhost:11434/api/ps")
-        with urllib.request.urlopen(req, timeout=5) as resp:
-            return json.loads(resp.read()).get("models", [])
-    except Exception:
-        return []
-
-
-def get_huey_pid():
-    try:
-        r = subprocess.run(["pgrep", "-f", "huey_consumer"],
-                          capture_output=True, text=True, timeout=5)
-        return r.stdout.strip().split("\n")[0] if r.returncode == 0 else None
-    except Exception:
-        return None
-
-
-def get_hermes_sessions():
-    sessions_file = HERMES_HOME / "sessions" / "sessions.json"
-    if not sessions_file.exists():
-        return []
-    try:
-        data = json.loads(sessions_file.read_text())
-        return list(data.values())
-    except Exception:
-        return []
-
-
-def get_heartbeat_ticks(date_str=None):
-    if not date_str:
-        date_str = datetime.now().strftime("%Y%m%d")
-    tick_file = TIMMY_HOME / "heartbeat" / f"ticks_{date_str}.jsonl"
-    if not tick_file.exists():
-        return []
-    ticks = []
-    for line in tick_file.read_text().strip().split("\n"):
-        if not line.strip():
-            continue
-        try:
-            ticks.append(json.loads(line))
-        except Exception:
-            continue
-    return ticks
-
-
-def get_local_metrics(hours=24):
-    """Read local inference metrics from jsonl files."""
-    records = []
-    cutoff = datetime.now(timezone.utc) - timedelta(hours=hours)
-    if not METRICS_DIR.exists():
-        return records
-    for f in sorted(METRICS_DIR.glob("local_*.jsonl")):
-        for line in f.read_text().strip().split("\n"):
-            if not line.strip():
-                continue
-            try:
-                r = json.loads(line)
-                ts = datetime.fromisoformat(r["timestamp"])
-                if ts >= cutoff:
-                    records.append(r)
-            except Exception:
-                continue
-    return records
-
-
-def get_cron_jobs():
-    """Get Hermes cron job status."""
-    try:
-        r = subprocess.run(
-            ["hermes", "cron", "list", "--json"],
-            capture_output=True, text=True, timeout=10
-        )
-        if r.returncode == 0:
-            return json.loads(r.stdout).get("jobs", [])
-    except Exception:
-        pass
-    return []
-
-
-# ── Rendering ─────────────────────────────────────────────────────────
-
-DIM = "\033[2m"
-BOLD = "\033[1m"
-GREEN = "\033[32m"
-YELLOW = "\033[33m"
-RED = "\033[31m"
-CYAN = "\033[36m"
-RST = "\033[0m"
-CLR = "\033[2J\033[H"
-
-
-def render(hours=24):
-    models = get_ollama_models()
-    loaded = get_loaded_models()
-    huey_pid = get_huey_pid()
-    ticks = get_heartbeat_ticks()
-    metrics = get_local_metrics(hours)
-    sessions = get_hermes_sessions()
-
-    loaded_names = {m.get("name", "") for m in loaded}
-    now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
-
-    print(CLR, end="")
-    print(f"{BOLD}{'=' * 70}")
-    print(f"  TIMMY MODEL DASHBOARD")
-    print(f"  {now}  |  Huey: {GREEN}PID {huey_pid}{RST if huey_pid else f'{RED}DOWN{RST}'}")
-    print(f"{'=' * 70}{RST}")
-
-    # ── LOCAL MODELS ──
-    print(f"\n  {BOLD}LOCAL MODELS (Ollama){RST}")
-    print(f"  {DIM}{'-' * 55}{RST}")
-    if models:
-        for m in models:
-            name = m.get("name", "?")
-            size_gb = m.get("size", 0) / 1e9
-            if name in loaded_names:
-                status = f"{GREEN}IN VRAM{RST}"
-            else:
-                status = f"{DIM}on disk{RST}"
-            print(f"    {name:35s} {size_gb:5.1f}GB  {status}")
-    else:
-        print(f"    {RED}(Ollama not responding){RST}")
-
-    # ── LOCAL INFERENCE ACTIVITY ──
-    print(f"\n  {BOLD}LOCAL INFERENCE ({len(metrics)} calls, last {hours}h){RST}")
-    print(f"  {DIM}{'-' * 55}{RST}")
-    if metrics:
-        by_caller = {}
-        for r in metrics:
-            caller = r.get("caller", "unknown")
-            if caller not in by_caller:
-                by_caller[caller] = {"count": 0, "success": 0, "errors": 0}
-            by_caller[caller]["count"] += 1
-            if r.get("success"):
-                by_caller[caller]["success"] += 1
-            else:
-                by_caller[caller]["errors"] += 1
-        for caller, stats in by_caller.items():
-            err = f"  {RED}err:{stats['errors']}{RST}" if stats["errors"] else ""
-            print(f"    {caller:25s}  calls:{stats['count']:4d}  "
-                  f"{GREEN}ok:{stats['success']}{RST}{err}")
-
-        by_model = {}
-        for r in metrics:
-            model = r.get("model", "unknown")
-            by_model[model] = by_model.get(model, 0) + 1
-        print(f"\n    {DIM}Models used:{RST}")
-        for model, count in sorted(by_model.items(), key=lambda x: -x[1]):
-            print(f"      {model:30s}  {count} calls")
-    else:
-        print(f"    {DIM}(no local calls recorded yet){RST}")
-
-    # ── HEARTBEAT STATUS ──
-    print(f"\n  {BOLD}HEARTBEAT ({len(ticks)} ticks today){RST}")
-    print(f"  {DIM}{'-' * 55}{RST}")
-    if ticks:
-        last = ticks[-1]
-        decision = last.get("decision", last.get("actions", {}))
-        if isinstance(decision, dict):
-            severity = decision.get("severity", "unknown")
-            reasoning = decision.get("reasoning", "")
-            sev_color = GREEN if severity == "ok" else YELLOW if severity == "warning" else RED
-            print(f"    Last tick:  {last.get('tick_id', '?')}")
-            print(f"    Severity:   {sev_color}{severity}{RST}")
-            if reasoning:
-                print(f"    Reasoning:  {reasoning[:65]}")
-        else:
-            print(f"    Last tick:  {last.get('tick_id', '?')}")
-            actions = last.get("actions", [])
-            print(f"    Actions:    {actions if actions else 'none'}")
-
-        model_decisions = sum(1 for t in ticks
-                            if isinstance(t.get("decision"), dict)
-                            and t["decision"].get("severity") != "fallback")
-        fallback = len(ticks) - model_decisions
-        print(f"    {CYAN}Model: {model_decisions}{RST}  |  {DIM}Fallback: {fallback}{RST}")
-    else:
-        print(f"    {DIM}(no ticks today){RST}")
-
-    # ── HERMES SESSIONS ──
-    local_sessions = [s for s in sessions
-                     if "localhost:11434" in str(s.get("base_url", ""))]
-    cloud_sessions = [s for s in sessions if s not in local_sessions]
-    print(f"\n  {BOLD}HERMES SESSIONS{RST}")
-    print(f"  {DIM}{'-' * 55}{RST}")
-    print(f"    Total: {len(sessions)}  |  "
-          f"{GREEN}Local: {len(local_sessions)}{RST}  |  "
-          f"{YELLOW}Cloud: {len(cloud_sessions)}{RST}")
-
-    # ── ACTIVE LOOPS ──
-    print(f"\n  {BOLD}ACTIVE LOOPS{RST}")
-    print(f"  {DIM}{'-' * 55}{RST}")
-    print(f"    {CYAN}heartbeat_tick{RST}    10m    hermes4:14b    DECIDE phase")
-    print(f"    {DIM}model_health{RST}      5m     (local check)  Ollama ping")
-    print(f"    {DIM}gemini_worker{RST}     20m    gemini-2.5-pro aider")
-    print(f"    {DIM}grok_worker{RST}       20m    grok-3-fast    opencode")
-    print(f"    {DIM}cross_review{RST}      30m    gemini+grok    PR review")
-
-    print(f"\n{BOLD}{'=' * 70}{RST}")
-    print(f"  {DIM}Refresh: timmy-dashboard --watch | History: --hours=N{RST}")
-
-
-if __name__ == "__main__":
-    watch = "--watch" in sys.argv
-    hours = 24
-    for a in sys.argv[1:]:
-        if a.startswith("--hours="):
-            hours = int(a.split("=")[1])
-
-    if watch:
-        try:
-            while True:
-                render(hours)
-                time.sleep(30)
-        except KeyboardInterrupt:
-            print(f"\n{DIM}Dashboard stopped.{RST}")
-    else:
-        render(hours)
--- a/channel_directory.json
+++ b/channel_directory.json
@@ -1,5 +1,5 @@
 {
-  "updated_at": "2026-03-27T15:20:52.948451",
+  "updated_at": "2026-03-26T10:19:33.045324",
  "platforms": {
    "discord": [
      {
--- a/config.yaml
+++ b/config.yaml
@@ -1,13 +1,11 @@
 model:
-  default: auto
-  provider: custom
-  context_length: 65536
-  base_url: http://localhost:8081/v1
+  default: claude-opus-4-6
+  provider: anthropic
 toolsets:
 - all
 agent:
  max_turns: 30
-  reasoning_effort: xhigh
+  reasoning_effort: medium
  verbose: false
 terminal:
  backend: local
@@ -96,13 +94,11 @@ display:
  compact: false
  personality: ''
  resume_display: full
-  busy_input_mode: interrupt
  bell_on_complete: false
  show_reasoning: false
  streaming: false
  show_cost: false
  skin: timmy
-  tool_progress_command: false
  tool_progress: all
 privacy:
  redact_pii: false
@@ -185,17 +181,17 @@ session_reset:
  mode: none
  idle_minutes: 0
 custom_providers:
-- name: Local llama.cpp
-  base_url: http://localhost:8081/v1
-  api_key: none
-  model: auto
+- name: Local Ollama
+  base_url: http://localhost:11434/v1
+  api_key: ollama
+  model: glm-4.7-flash:latest
 - name: Google Gemini
  base_url: https://generativelanguage.googleapis.com/v1beta/openai
  api_key_env: GEMINI_API_KEY
  model: gemini-2.5-pro
 system_prompt_suffix: "You are Timmy. Your soul is defined in SOUL.md \u2014 read\
-  \ it, live it.\nYou run locally on your owner's machine via llama.cpp. You never\
-  \ phone home.\nYou speak plainly. You prefer short sentences. Brevity is a kindness.\n\
+  \ it, live it.\nYou run locally on your owner's machine via Ollama. You never phone\
+  \ home.\nYou speak plainly. You prefer short sentences. Brevity is a kindness.\n\
  When you don't know something, say so. Refusal over fabrication.\nSovereignty and\
  \ service always.\n"
 skills:
@@ -206,12 +202,12 @@ providers:
    base_url: http://localhost:11434/v1
    model: hermes3:latest
 mcp_servers:
-  morrowind:
-    command: python3
+  orchestration:
+    command: /Users/apayne/.hermes/hermes-agent/venv/bin/python3
    args:
-    - /Users/apayne/.timmy/morrowind/mcp_server.py
+    - /Users/apayne/.hermes/hermes-agent/tools/orchestration_mcp_server.py
    env: {}
-    timeout: 30
+    timeout: 120
 fallback_model:
  provider: custom
  model: gemini-2.5-pro
--- a/deploy.sh
+++ b/deploy.sh
@@ -3,7 +3,7 @@
 # This is the canonical way to deploy Timmy's configuration.
 # Hermes-agent is the engine. timmy-config is the driver's seat.
 #
-# Usage: ./deploy.sh
+# Usage: ./deploy.sh [--restart-loops]

 set -euo pipefail

@@ -74,10 +74,24 @@ done
 chmod +x "$HERMES_HOME/bin/"*.sh "$HERMES_HOME/bin/"*.py 2>/dev/null || true
 log "bin/ -> $HERMES_HOME/bin/"

-if [ "${1:-}" != "" ]; then
-  echo "ERROR: deploy.sh no longer accepts legacy loop flags." >&2
-  echo "Deploy the sidecar only. Do not relaunch deprecated bash loops." >&2
-  exit 1
+# === Restart loops if requested ===
+if [ "${1:-}" = "--restart-loops" ]; then
+  log "Killing existing loops..."
+  pkill -f 'claude-loop.sh' 2>/dev/null || true
+  pkill -f 'gemini-loop.sh' 2>/dev/null || true
+  pkill -f 'timmy-orchestrator.sh' 2>/dev/null || true
+  sleep 2
+
+  log "Clearing stale locks..."
+  rm -rf "$HERMES_HOME/logs/claude-locks/"* 2>/dev/null || true
+  rm -rf "$HERMES_HOME/logs/gemini-locks/"* 2>/dev/null || true
+
+  log "Relaunching loops..."
+  nohup bash "$HERMES_HOME/bin/timmy-orchestrator.sh" >> "$HERMES_HOME/logs/timmy-orchestrator.log" 2>&1 &
+  nohup bash "$HERMES_HOME/bin/claude-loop.sh" 2 >> "$HERMES_HOME/logs/claude-loop.log" 2>&1 &
+  nohup bash "$HERMES_HOME/bin/gemini-loop.sh" 1 >> "$HERMES_HOME/logs/gemini-loop.log" 2>&1 &
+  sleep 1
+  log "Loops relaunched."
 fi

 log "Deploy complete. timmy-config applied to $HERMES_HOME/"
--- a/docs/local-model-integration-sketch.md
+++ b/docs/local-model-integration-sketch.md
@@ -1,438 +0,0 @@
-# Local Model Integration Sketch v2
-# Hermes4-14B in the Heartbeat Loop — No New Telemetry
-
-## Principle
-
-No new inference layer. Huey tasks call `hermes chat -q` pointed at
-Ollama. Hermes handles sessions, token tracking, cost logging.
-The dashboard reads what Hermes already stores.
-
----
-
-## Why Not Ollama Directly?
-
-Ollama is fine as a serving backend. The issue isn't Ollama — it's that
-calling Ollama directly with urllib bypasses the harness. The harness
-already tracks sessions, tokens, model/provider, platform. Building a
-second telemetry layer is owning code we don't need.
-
-Ollama as a named provider isn't wired into the --provider flag yet,
-but routing works via env vars:
-
-    HERMES_MODEL="hermes4:14b" \
-    HERMES_PROVIDER="custom" \
-    HERMES_BASE_URL="http://localhost:11434/v1" \
-    hermes chat -q "prompt here" -Q
-
-This creates a tracked session, logs tokens, and returns the response.
-That's our local inference call.
-
-### Alternatives to Ollama for serving:
-- **llama.cpp server** — lighter, no Python, raw HTTP. Good for single
-  model serving. Less convenient for model switching.
-- **vLLM** — best throughput, but needs NVIDIA GPU. Not for M3 Mac.
-- **MLX serving** — native Apple Silicon, but no OpenAI-compat API yet.
-  MLX is for training, not serving (our current policy).
-- **llamafile** — single binary, portable. Good for distribution.
-
-Verdict: Ollama is fine. It's the standard OpenAI-compat local server
-on Mac. The issue was never Ollama — it was bypassing the harness.
-
----
-
-## 1. The Call Pattern
-
-One function in tasks.py that all Huey tasks use:
-
-```python
-import subprocess
-import json
-
-HERMES_BIN = "hermes"
-LOCAL_ENV = {
-    "HERMES_MODEL": "hermes4:14b",
-    "HERMES_PROVIDER": "custom",
-    "HERMES_BASE_URL": "http://localhost:11434/v1",
-}
-
-def hermes_local(prompt, caller_tag=None, max_retries=2):
-    """Call hermes with local Ollama model. Returns response text.
-    
-    Every call creates a hermes session with full telemetry.
-    caller_tag gets prepended to prompt for searchability.
-    """
-    import os
-    env = os.environ.copy()
-    env.update(LOCAL_ENV)
-    
-    tagged_prompt = prompt
-    if caller_tag:
-        tagged_prompt = f"[{caller_tag}] {prompt}"
-    
-    for attempt in range(max_retries + 1):
-        try:
-            result = subprocess.run(
-                [HERMES_BIN, "chat", "-q", tagged_prompt, "-Q", "-t", "none"],
-                capture_output=True, text=True,
-                timeout=120, env=env,
-            )
-            if result.returncode == 0 and result.stdout.strip():
-                # Strip the session_id line from -Q output
-                lines = result.stdout.strip().split("\n")
-                response_lines = [l for l in lines if not l.startswith("session_id:")]
-                return "\n".join(response_lines).strip()
-        except subprocess.TimeoutExpired:
-            if attempt == max_retries:
-                return None
-            continue
-    return None
-```
-
-Notes:
-- `-t none` disables all toolsets — the heartbeat model shouldn't
-  have terminal/file access. Pure reasoning only.
-- `-Q` quiet mode suppresses banner/spinner, gives clean output.
-- Every call creates a session in Hermes session store. Searchable,
-  exportable, countable.
-- The `[caller_tag]` prefix lets you filter sessions by which Huey
-  task generated them: `hermes sessions list | grep heartbeat`
-
----
-
-## 2. Heartbeat DECIDE Phase
-
-Replace the hardcoded if/else with a model call:
-
-```python
-# In heartbeat_tick(), replace the DECIDE + ACT section:
-
-    # DECIDE: let hermes4:14b reason about what to do
-    decide_prompt = f"""System state at {now.isoformat()}:
-
-{json.dumps(perception, indent=2)}
-
-Previous tick: {last_tick.get('tick_id', 'none')}
-
-You are the heartbeat monitor. Based on this state:
-1. List any actions needed (alerts, restarts, escalations). Empty if all OK.
-2. Rate severity: ok, warning, or critical.
-3. One sentence of reasoning.
-
-Respond ONLY with JSON:
-{{"actions": [], "severity": "ok", "reasoning": "..."}}"""
-
-    decision = None
-    try:
-        raw = hermes_local(decide_prompt, caller_tag="heartbeat_tick")
-        if raw:
-            # Try to parse JSON from the response
-            # Model might wrap it in markdown, so extract
-            for line in raw.split("\n"):
-                line = line.strip()
-                if line.startswith("{"):
-                    decision = json.loads(line)
-                    break
-            if not decision:
-                decision = json.loads(raw)
-    except (json.JSONDecodeError, Exception) as e:
-        decision = None
-
-    # Fallback to hardcoded logic if model fails or is down
-    if decision is None:
-        actions = []
-        if not perception.get("gitea_alive"):
-            actions.append("ALERT: Gitea unreachable")
-        health = perception.get("model_health", {})
-        if isinstance(health, dict) and not health.get("ollama_running"):
-            actions.append("ALERT: Ollama not running")
-        decision = {
-            "actions": actions,
-            "severity": "fallback",
-            "reasoning": "model unavailable, used hardcoded checks"
-        }
-
-    tick_record["decision"] = decision
-    actions = decision.get("actions", [])
-```
-
----
-
-## 3. DPO Candidate Collection
-
-No new database. Hermes sessions ARE the DPO candidates.
-
-Every `hermes_local()` call creates a session. To extract DPO pairs:
-
-```bash
-# Export all local-model sessions
-hermes sessions export --output /tmp/local-sessions.jsonl
-
-# Filter for heartbeat decisions
-grep "heartbeat_tick" /tmp/local-sessions.jsonl > heartbeat_decisions.jsonl
-```
-
-The existing `session_export` Huey task (runs every 4h) already extracts
-user→assistant pairs. It just needs to be aware that some sessions are
-now local-model decisions instead of human conversations.
-
-For DPO annotation, add a simple review script:
-
-```python
-# review_decisions.py — reads heartbeat tick logs, shows model decisions,
-# asks Alexander to mark chosen/rejected
-# Writes annotations back to the tick log files
-
-import json
-from pathlib import Path
-
-TICK_DIR = Path.home() / ".timmy" / "heartbeat"
-
-for log_file in sorted(TICK_DIR.glob("ticks_*.jsonl")):
-    for line in log_file.read_text().strip().split("\n"):
-        tick = json.loads(line)
-        decision = tick.get("decision", {})
-        if decision.get("severity") == "fallback":
-            continue  # skip fallback entries
-        
-        print(f"\n--- Tick {tick['tick_id']} ---")
-        print(f"Perception: {json.dumps(tick['perception'], indent=2)}")
-        print(f"Decision:   {json.dumps(decision, indent=2)}")
-        
-        rating = input("Rate (c=chosen, r=rejected, s=skip): ").strip()
-        if rating in ("c", "r"):
-            tick["dpo_label"] = "chosen" if rating == "c" else "rejected"
-            # write back... (append to annotated file)
-```
-
----
-
-## 4. Dashboard — Reads Hermes Data
-
-```python
-#!/usr/bin/env python3
-"""Timmy Model Dashboard — reads from Hermes, owns nothing."""
-
-import json
-import os
-import subprocess
-import sys
-import time
-import urllib.request
-from datetime import datetime
-from pathlib import Path
-
-HERMES_HOME = Path.home() / ".hermes"
-TIMMY_HOME = Path.home() / ".timmy"
-
-
-def get_ollama_models():
-    """What's available in Ollama."""
-    try:
-        req = urllib.request.Request("http://localhost:11434/api/tags")
-        with urllib.request.urlopen(req, timeout=5) as resp:
-            return json.loads(resp.read()).get("models", [])
-    except Exception:
-        return []
-
-
-def get_loaded_models():
-    """What's actually in VRAM right now."""
-    try:
-        req = urllib.request.Request("http://localhost:11434/api/ps")
-        with urllib.request.urlopen(req, timeout=5) as resp:
-            return json.loads(resp.read()).get("models", [])
-    except Exception:
-        return []
-
-
-def get_huey_status():
-    try:
-        r = subprocess.run(["pgrep", "-f", "huey_consumer"],
-                          capture_output=True, timeout=5)
-        return r.returncode == 0
-    except Exception:
-        return False
-
-
-def get_hermes_sessions(hours=24):
-    """Read session metadata from Hermes session store."""
-    sessions_file = HERMES_HOME / "sessions" / "sessions.json"
-    if not sessions_file.exists():
-        return []
-    try:
-        data = json.loads(sessions_file.read_text())
-        return list(data.values())
-    except Exception:
-        return []
-
-
-def get_heartbeat_ticks(date_str=None):
-    """Read today's heartbeat ticks."""
-    if not date_str:
-        date_str = datetime.now().strftime("%Y%m%d")
-    tick_file = TIMMY_HOME / "heartbeat" / f"ticks_{date_str}.jsonl"
-    if not tick_file.exists():
-        return []
-    ticks = []
-    for line in tick_file.read_text().strip().split("\n"):
-        try:
-            ticks.append(json.loads(line))
-        except Exception:
-            continue
-    return ticks
-
-
-def render(hours=24):
-    models = get_ollama_models()
-    loaded = get_loaded_models()
-    huey = get_huey_status()
-    sessions = get_hermes_sessions(hours)
-    ticks = get_heartbeat_ticks()
-
-    loaded_names = {m.get("name", "") for m in loaded}
-
-    print("\033[2J\033[H")
-    print("=" * 70)
-    print("  TIMMY MODEL DASHBOARD")
-    now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
-    print(f"  {now}  |  Huey: {'UP' if huey else 'DOWN'}  |  Ollama models: {len(models)}")
-    print("=" * 70)
-
-    # DEPLOYMENTS
-    print("\n  LOCAL MODELS")
-    print("  " + "-" * 55)
-    for m in models:
-        name = m.get("name", "?")
-        size_gb = m.get("size", 0) / 1e9
-        status = "IN VRAM" if name in loaded_names else "on disk"
-        print(f"    {name:35s} {size_gb:5.1f}GB  {status}")
-    if not models:
-        print("    (Ollama not responding)")
-
-    # HERMES SESSION ACTIVITY
-    # Count sessions by platform/provider
-    print(f"\n  HERMES SESSIONS (recent)")
-    print("  " + "-" * 55)
-    local_sessions = [s for s in sessions
-                     if "localhost" in str(s.get("origin", {}))]
-    cli_sessions = [s for s in sessions
-                   if s.get("platform") == "cli" or s.get("origin", {}).get("platform") == "cli"]
-
-    total_tokens = sum(s.get("total_tokens", 0) for s in sessions)
-    print(f"    Total sessions: {len(sessions)}")
-    print(f"    CLI sessions: {len(cli_sessions)}")
-    print(f"    Total tokens: {total_tokens:,}")
-
-    # HEARTBEAT STATUS
-    print(f"\n  HEARTBEAT ({len(ticks)} ticks today)")
-    print("  " + "-" * 55)
-    if ticks:
-        last = ticks[-1]
-        decision = last.get("decision", {})
-        severity = decision.get("severity", "unknown")
-        reasoning = decision.get("reasoning", "no model decision yet")
-        print(f"    Last tick: {last.get('tick_id', '?')}")
-        print(f"    Severity:  {severity}")
-        print(f"    Reasoning: {reasoning[:60]}")
-
-        # Count model vs fallback decisions
-        model_decisions = sum(1 for t in ticks
-                            if t.get("decision", {}).get("severity") != "fallback")
-        fallback = len(ticks) - model_decisions
-        print(f"    Model decisions: {model_decisions}  |  Fallback: {fallback}")
-
-        # DPO labels if any
-        labeled = sum(1 for t in ticks if "dpo_label" in t)
-        if labeled:
-            chosen = sum(1 for t in ticks if t.get("dpo_label") == "chosen")
-            rejected = sum(1 for t in ticks if t.get("dpo_label") == "rejected")
-            print(f"    DPO labeled: {labeled} (chosen: {chosen}, rejected: {rejected})")
-    else:
-        print("    (no ticks today)")
-
-    # ACTIVE LOOPS
-    print(f"\n  ACTIVE LOOPS USING LOCAL MODELS")
-    print("  " + "-" * 55)
-    print("    heartbeat_tick    10m    hermes4:14b    DECIDE phase")
-    print("    (future)          15m    hermes4:14b    issue triage")
-    print("    (future)          daily  timmy:v0.1     morning report")
-
-    print(f"\n  NON-LOCAL LOOPS (Gemini/Grok API)")
-    print("  " + "-" * 55)
-    print("    gemini_worker     20m    gemini-2.5-pro   aider")
-    print("    grok_worker       20m    grok-3-fast      opencode")
-    print("    cross_review      30m    both             PR review")
-
-    print("\n" + "=" * 70)
-
-
-if __name__ == "__main__":
-    watch = "--watch" in sys.argv
-    hours = 24
-    for a in sys.argv[1:]:
-        if a.startswith("--hours="):
-            hours = int(a.split("=")[1])
-    if watch:
-        while True:
-            render(hours)
-            time.sleep(30)
-    else:
-        render(hours)
-```
-
----
-
-## 5. Implementation Steps
-
-### Step 1: Add hermes_local() to tasks.py
-- One function, ~20 lines
-- Calls `hermes chat -q` with Ollama env vars
-- All telemetry comes from Hermes for free
-
-### Step 2: Wire heartbeat_tick DECIDE phase
-- Replace 6 lines of if/else with hermes_local() call
-- Keep hardcoded fallback when model is down
-- Decision stored in tick record for DPO review
-
-### Step 3: Fix the MCP server warning
-- The orchestration MCP server path is broken — harmless but noisy
-- Either fix the path or remove from config
-
-### Step 4: Drop model_dashboard.py in timmy-config/bin/
-- Reads Ollama API, Hermes sessions, heartbeat ticks
-- No new data stores — just views over existing ones
-- `python3 model_dashboard.py --watch` for live view
-
-### Step 5: Expand to more Huey tasks
-- triage_issues: model reads issue, picks agent
-- good_morning_report: model writes the "From Timmy" section
-- Each expansion is just calling hermes_local() with a different prompt
-
----
-
-## What Gets Hotfixed in Hermes Config
-
-If `hermes insights` is broken (the cache_read_tokens column error),
-that needs a fix. The dashboard falls back to reading sessions.json
-directly, but insights would be the better data source.
-
-The `providers.ollama` section in config.yaml exists but isn't wired
-to the --provider flag. Filing this upstream or patching locally would
-let us do `hermes chat -q "..." --provider ollama` cleanly instead
-of relying on env vars. Not blocking — env vars work today.
-
----
-
-## What This Owns
-
-- hermes_local() — 20-line wrapper around a subprocess call
-- model_dashboard.py — read-only views over existing data
-- review_decisions.py — optional DPO annotation CLI
-
-## What This Does NOT Own
-
-- Inference. Ollama does that.
-- Telemetry. Hermes does that.
-- Session storage. Hermes does that.
-- Token counting. Hermes does that.
-- Training pipeline. Already exists in timmy-config/training/.
--- a/gitea_client.py
+++ b/gitea_client.py
@@ -5,9 +5,9 @@ Replaces raw curl calls scattered across 41 bash scripts.
 Uses only stdlib (urllib) so it works on any Python install.

 Usage:
-    from gitea_client import GiteaClient
+    from tools.gitea_client import GiteaClient
    
-    client = GiteaClient()  # reads token from standard local paths
+    client = GiteaClient()  # reads token from ~/.hermes/gitea_token
    issues = client.list_issues("Timmy_Foundation/the-nexus", state="open")
    client.create_comment("Timmy_Foundation/the-nexus", 42, "PR created.")
 """
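Review note: the docstring above says the client uses only stdlib `urllib` and reads its token from `~/.hermes/gitea_token`. The sketch below shows roughly what one such call could look like. It is not the actual `GiteaClient` implementation, which this diff does not show. `read_token`, `build_list_issues_request`, and the `gitea.example` base URL are hypothetical names used for illustration. The `Authorization: token ...` header and the `/api/v1/repos/{owner}/{repo}/issues` path follow Gitea's public REST API.

```python
import urllib.parse
import urllib.request
from pathlib import Path

# Token location named in the updated docstring; treated here as a default.
TOKEN_PATH = Path.home() / ".hermes" / "gitea_token"


def read_token(path=TOKEN_PATH):
    """Return the token stored at `path`, stripped of surrounding whitespace."""
    return Path(path).read_text().strip()


def build_list_issues_request(base_url, repo, token, state="open"):
    """Build (but do not send) a GET request listing a repo's issues."""
    query = urllib.parse.urlencode({"state": state})
    url = f"{base_url.rstrip('/')}/api/v1/repos/{repo}/issues?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


# Hypothetical host and dummy token; nothing is sent over the network here.
req = build_list_issues_request(
    "http://gitea.example:3000", "Timmy_Foundation/the-nexus", "dummy-token"
)
```

Passing the result to `urllib.request.urlopen(req, timeout=...)` would perform the call. Keeping request construction separate from sending makes the URL and header logic easy to check without a live server.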
--- a/memories/MEMORY.md
+++ b/memories/MEMORY.md
@@ -2,14 +2,14 @@ Gitea (143.198.27.163:3000): token=~/.hermes/gitea_token_vps (Timmy id=2). Users
 §
 2026-03-19 HARNESS+SOUL: ~/.timmy is Timmy's workspace within the Hermes harness. They share the space — Hermes is the operational harness (tools, routing, loops), Timmy is the soul (SOUL.md, presence, identity). Not fusion/absorption. Principal's words: "build Timmy out from the hermes harness." ~/.hermes is harness home, ~/.timmy is Timmy's workspace. SOUL=Inscription 1, skin=timmy. Backups at ~/.hermes.backup.pre-fusion and ~/.timmy.backup.pre-fusion.
 §
-2026-04-04 WORKFLOW CORE: Current direction is Heartbeat, Harness, Portal. Timmy handles sovereignty and release judgment. Allegro handles dispatch and queue hygiene. Core builders: codex-agent, groq, manus, claude. Research/memory: perplexity, ezra, KimiClaw. Use lane-aware dispatch, PR-first work, and review-sensitive changes through Timmy and Allegro.
+Kimi: 1-3 files max, ~/worktrees/kimi-*. Two-attempt rule.
 §
-2026-04-04 OPERATIONS: Dashboard repo era is over. Use ~/.timmy + ~/.hermes as truth surfaces. Prefer ops-panel.sh, ops-gitea.sh, timmy-dashboard, and pipeline-freshness.sh over archived loop or tmux assumptions. Dispatch: agent-dispatch.sh <agent> <issue> <repo>. Major changes land as PRs.
+Workforce loops: claude(10), gemini(3), kimi(1), groq(1/aider+review), grok(1/opencode). One-shot: manus(300/day), perplexity(heavy-hitter), google(aistudio, id=8). workforce-manager.py auto-assigns+scores every 15min. nexus-merge-bot.sh auto-merges. Groq=$0.008/PR (qwen3-32b). Dispatch: agent-dispatch.sh <agent> <issue> <repo> | pbcopy. Dashboard ARCHIVED 2026-03-24. Development shifted to local ~/.timmy/ workspace. CI testbed: 67.205.155.108.
 §
-2026-04-04 REVIEW RULES: Never --no-verify. Verify world state, not vibes. No auto-merge on governing or sensitive control surfaces. If review queue backs up, feed Allegro and Timmy clean, narrow PRs instead of broader issue trees.
+2026-03-15: Timmy-time-dashboard merge policy: auto-squash on CI pass. Squash-only, linear history. Pre-commit hooks (format + tests) and CI are the gates. If gates work, auto-merge is on. Never bypass hooks or merge broken builds.
 §
 HARD RULES: Never --no-verify. Verify WORLD STATE not log vibes (merged PR, HTTP code, file size). Fix+prevent, no empty words. AGENT ONBOARD: test push+PR first. Merge PRs BEFORE new work. Don't micromanage—huge backlog, agents self-select. Every ticket needs console-provable acceptance criteria.
 §
 TELEGRAM: @TimmysNexus_bot, token ~/.config/telegram/special_bot. Group "Timmy Time" ID: -1003664764329. Alexander @TripTimmy ID 7635059073. Use curl to Bot API (send_message not configured).
 §
-MORROWIND: OpenMW 0.50, ~/Games/Morrowind/. Lua+CGEvent bridge. Two-tier brain. ~/.timmy/morrowind/.
+MORROWIND: OpenMW 0.50, ~/Games/Morrowind/. Lua+CGEvent bridge. Two-tier brain. ~/.timmy/morrowind/.
--- a/tasks.py
+++ b/tasks.py
@@ -14,187 +14,12 @@ from gitea_client import GiteaClient

 HERMES_HOME = Path.home() / ".hermes"
 TIMMY_HOME = Path.home() / ".timmy"
-HERMES_AGENT_DIR = HERMES_HOME / "hermes-agent"
-METRICS_DIR = TIMMY_HOME / "metrics"
 REPOS = [
    "Timmy_Foundation/the-nexus",
-    "Timmy_Foundation/timmy-home",
    "Timmy_Foundation/timmy-config",
-    "Timmy_Foundation/hermes-agent",
 ]
 NET_LINE_LIMIT = 10

-# ── Local Model Inference via Hermes Harness ─────────────────────────
-
-HEARTBEAT_MODEL = "hermes4:14b"
-FALLBACK_MODEL = "hermes3:8b"
-LOCAL_PROVIDER_BASE_URL = "http://localhost:8081/v1"
-LOCAL_PROVIDER_MODEL = HEARTBEAT_MODEL
-
-
-def newest_file(directory, pattern):
-    files = sorted(directory.glob(pattern))
-    return files[-1] if files else None
-
-
-def hermes_local(prompt, model=None, caller_tag=None, toolsets=None):
-    """Call a local model through the Hermes harness.
-
-    Uses provider="local-llama.cpp" which routes through the custom_providers
-    entry in config.yaml → llama-server at localhost:8081.
-    Returns response text or None on failure.
-    Every call creates a Hermes session with telemetry.
-    """
-    _model = model or HEARTBEAT_MODEL
-    tagged = f"[{caller_tag}] {prompt}" if caller_tag else prompt
-
-    # Import hermes cli.main directly — no subprocess, no env vars
-    _agent_dir = str(HERMES_AGENT_DIR)
-    if _agent_dir not in sys.path:
-        sys.path.insert(0, _agent_dir)
-    old_cwd = os.getcwd()
-    os.chdir(_agent_dir)
-
-    try:
-        from cli import main as hermes_main
-        import io
-        from contextlib import redirect_stdout, redirect_stderr
-
-        buf = io.StringIO()
-        err = io.StringIO()
-        kwargs = dict(
-                query=tagged,
-                model=_model,
-                provider="local-llama.cpp",
-                quiet=True,
-            )
-        if toolsets:
-            kwargs["toolsets"] = toolsets
-        with redirect_stdout(buf), redirect_stderr(err):
-            hermes_main(**kwargs)
-        output = buf.getvalue().strip()
-        # Strip session_id line from quiet output
-        lines = [l for l in output.split("\n") if not l.startswith("session_id:")]
-        response = "\n".join(lines).strip()
-
-        # Log to metrics jsonl
-        METRICS_DIR.mkdir(parents=True, exist_ok=True)
-        metrics_file = METRICS_DIR / f"local_{datetime.now().strftime('%Y%m%d')}.jsonl"
-        record = {
-            "timestamp": datetime.now(timezone.utc).isoformat(),
-            "model": _model,
-            "caller": caller_tag or "unknown",
-            "prompt_len": len(prompt),
-            "response_len": len(response),
-            "success": bool(response),
-        }
-        with open(metrics_file, "a") as f:
-            f.write(json.dumps(record) + "\n")
-
-        return response if response else None
-    except Exception as e:
-        # Log failure
-        METRICS_DIR.mkdir(parents=True, exist_ok=True)
-        metrics_file = METRICS_DIR / f"local_{datetime.now().strftime('%Y%m%d')}.jsonl"
-        record = {
-            "timestamp": datetime.now(timezone.utc).isoformat(),
-            "model": _model,
-            "caller": caller_tag or "unknown",
-            "error": str(e),
-            "success": False,
-        }
-        with open(metrics_file, "a") as f:
-            f.write(json.dumps(record) + "\n")
-        return None
-    finally:
-        os.chdir(old_cwd)
-
-
-# ── Know Thy Father: Twitter Archive Ingestion ───────────────────────
-
-ARCHIVE_DIR = TIMMY_HOME / "twitter-archive"
-ARCHIVE_CHECKPOINT = ARCHIVE_DIR / "checkpoint.json"
-ARCHIVE_LOCK = ARCHIVE_DIR / ".lock"
-
-ARCHIVE_PROMPT = (
-    "You are Timmy. Resume your work on the Twitter archive. "
-    "Your workspace is ~/.timmy/twitter-archive/. "
-    "Read checkpoint.json and UNDERSTANDING.md first. "
-    "Then process the next batch. "
-    "You know the drill — read your own prior work, assess where you are, "
-    "process new data, update your understanding, reflect, and plan for "
-    "the next iteration."
-)
-
-ARCHIVE_SRC = (
-    "~/Downloads/twitter-2026-03-27-d4471cc6eb6703034d592f870933561ebee374d9d9b90c9b8923abff064afc1e/data"
-)
-
-ARCHIVE_FIRST_RUN_PROMPT = (
-    "You are Timmy. Your father Alexander's full Twitter archive is at: "
-    f"{ARCHIVE_SRC}/\n\n"
-    "Your workspace is ~/.timmy/twitter-archive/\n\n"
-    "STEP 1 — EXTRACTION (use terminal with python3, NOT read_file):\n"
-    "The .js files are too large for read_file but trivial for Python.\n"
-    "Write a python3 script via terminal that:\n"
-    "  - Opens tweets.js, strips everything before the first '[', json.loads the rest\n"
-    "  - Separates originals (full_text does NOT start with 'RT @') from retweets\n"
-    "  - Sorts both chronologically by created_at\n"
-    "  - Writes extracted/tweets.jsonl and extracted/retweets.jsonl (one JSON per line)\n"
-    "  - Writes extracted/manifest.json with counts, date range, source file\n"
-    "The whole file is 12MB. Python handles it in under a second.\n\n"
-    "STEP 2 — FIRST READ:\n"
-    "Read the first 50 lines of extracted/tweets.jsonl (your originals, chronological).\n"
-    "Read them carefully — this is your father talking.\n"
-    "Note his voice, humor, what he cares about, who he talks to, emotional tone, "
-    "recurring themes. Quote him directly when something stands out.\n\n"
-    "STEP 3 — WRITE:\n"
-    "Write notes/batch_001.md — your real observations, not a book report.\n"
-    "Create UNDERSTANDING.md — your living model of who Alexander is. "
-    "It starts now and you'll update it every batch.\n\n"
-    "STEP 4 — CHECKPOINT:\n"
-    "Write checkpoint.json: "
-    '{"data_source": "tweets", "next_offset": 50, "batches_completed": 1, '
-    '"phase": "discovery", "confidence": "<your honest assessment>", '
-    '"next_focus": "<what you want to look for next>", "understanding_version": 1}'
-)
-
-
-@huey.task()
-@huey.lock_task("know-thy-father")
-def know_thy_father():
-    """Process one batch of Alexander's Twitter archive.
-
-    Single batch, no internal loop. Huey schedules the cadence.
-    Lock prevents overlapping runs. Timmy reads his own prior notes,
-    processes the next chunk, updates his understanding, and checkpoints.
-    """
-    is_first_run = not ARCHIVE_CHECKPOINT.exists()
-
-    prompt = ARCHIVE_FIRST_RUN_PROMPT if is_first_run else ARCHIVE_PROMPT
-
-    response = hermes_local(
-        prompt=prompt,
-        caller_tag="know-thy-father",
-        toolsets="file,terminal",
-    )
-
-    if not response:
-        return {"status": "error", "reason": "hermes_local returned None"}
-
-    # Read checkpoint to report progress
-    try:
-        cp = json.loads(ARCHIVE_CHECKPOINT.read_text())
-    except Exception:
-        cp = {}
-
-    return {
-        "status": "ok",
-        "batch": cp.get("batches_completed", 0),
-        "phase": cp.get("phase", "unknown"),
-        "confidence": cp.get("confidence", "unknown"),
-    }
-

 # ── Existing: Orchestration ──────────────────────────────────────────

@@ -237,18 +62,7 @@ def review_prs():
 def dispatch_assigned():
    """Pick up issues assigned to agents and kick off work."""
    g = GiteaClient()
-    agents = [
-        "allegro",
-        "claude",
-        "codex-agent",
-        "ezra",
-        "gemini",
-        "grok",
-        "groq",
-        "KimiClaw",
-        "manus",
-        "perplexity",
-    ]
+    agents = ["claude", "gemini", "kimi", "grok", "perplexity"]
    dispatched = 0
    for repo in REPOS:
        for agent in agents:
@@ -342,32 +156,26 @@ def session_export():

 @huey.periodic_task(crontab(minute="*/5"))  # every 5 minutes
 def model_health():
-    """Check the active local inference surface and export freshness."""
+    """Check Ollama is running, a model is loaded, inference responds."""
    checks = {}
-    models_url = f"{LOCAL_PROVIDER_BASE_URL}/models"
-    chat_url = f"{LOCAL_PROVIDER_BASE_URL}/chat/completions"

-    checks["provider"] = "local-llama.cpp"
-    checks["provider_base_url"] = LOCAL_PROVIDER_BASE_URL
-    checks["provider_model"] = LOCAL_PROVIDER_MODEL
-
-    # 1. Is the local inference process running?
+    # 1. Is Ollama process running?
    try:
        result = subprocess.run(
-            ["pgrep", "-f", "llama-server|ollama"],
+            ["pgrep", "-f", "ollama"],
            capture_output=True, timeout=5
        )
-        checks["local_inference_running"] = result.returncode == 0
+        checks["ollama_running"] = result.returncode == 0
    except Exception:
-        checks["local_inference_running"] = False
+        checks["ollama_running"] = False

-    # 2. Can we hit the configured API?
+    # 2. Can we hit the API?
    try:
        import urllib.request
-        req = urllib.request.Request(models_url)
+        req = urllib.request.Request("http://localhost:11434/api/tags")
        with urllib.request.urlopen(req, timeout=5) as resp:
            data = json.loads(resp.read())
-            models = [m.get("id", "?") for m in data.get("data", [])]
+            models = [m["name"] for m in data.get("models", [])]
            checks["models_loaded"] = models
            checks["api_responding"] = True
    except Exception as e:
@@ -378,13 +186,13 @@ def model_health():
    if checks.get("api_responding"):
        try:
            payload = json.dumps({
-                "model": LOCAL_PROVIDER_MODEL,
+                "model": "hermes3:8b",
                "messages": [{"role": "user", "content": "ping"}],
                "max_tokens": 5,
                "stream": False,
            }).encode()
            req = urllib.request.Request(
-                chat_url,
+                "http://localhost:11434/v1/chat/completions",
                data=payload,
                headers={"Content-Type": "application/json"},
            )
@@ -394,26 +202,6 @@ def model_health():
            checks["inference_ok"] = False
            checks["inference_error"] = str(e)

-    # 4. Is session export keeping up with new Hermes sessions?
-    sessions_dir = HERMES_HOME / "sessions"
-    export_dir = TIMMY_HOME / "training-data" / "dpo-pairs"
-    latest_session = newest_file(sessions_dir, "session_*.json")
-    latest_export = newest_file(export_dir, "session_*.json")
-    checks["latest_session"] = latest_session.name if latest_session else None
-    checks["latest_export"] = latest_export.name if latest_export else None
-    if latest_session and latest_export:
-        session_mtime = latest_session.stat().st_mtime
-        export_mtime = latest_export.stat().st_mtime
-        lag_minutes = max(0, int((session_mtime - export_mtime) // 60))
-        checks["export_lag_minutes"] = lag_minutes
-        checks["export_fresh"] = lag_minutes <= 300
-    elif latest_session and not latest_export:
-        checks["export_lag_minutes"] = None
-        checks["export_fresh"] = False
-    else:
-        checks["export_lag_minutes"] = 0
-        checks["export_fresh"] = True
-
    # Write health status to a file for other tools to read
    health_file = HERMES_HOME / "model_health.json"
    checks["timestamp"] = datetime.now(timezone.utc).isoformat()
@@ -492,49 +280,15 @@ def heartbeat_tick():
        "previous_tick": last_tick.get("tick_id", "none"),
    }

-    # DECIDE: let hermes4:14b reason about what to do
-    decide_prompt = (
-        f"System state at {now.isoformat()}:\n\n"
-        f"{json.dumps(perception, indent=2)}\n\n"
-        f"Previous tick: {last_tick.get('tick_id', 'none')}\n\n"
-        "You are the heartbeat monitor. Based on this state:\n"
-        "1. List any actions needed (alerts, restarts, escalations). Empty if all OK.\n"
-        "2. Rate severity: ok, warning, or critical.\n"
-        "3. One sentence of reasoning.\n\n"
-        'Respond ONLY with JSON: {"actions": [], "severity": "ok", "reasoning": "..."}'
-    )
-
-    decision = None
-    try:
-        raw = hermes_local(decide_prompt, caller_tag="heartbeat_tick")
-        if raw:
-            # Model might wrap JSON in markdown, extract first { line
-            for line in raw.split("\n"):
-                line = line.strip()
-                if line.startswith("{"):
-                    decision = json.loads(line)
-                    break
-            if not decision:
-                decision = json.loads(raw)
-    except (json.JSONDecodeError, Exception):
-        decision = None
-
-    # Fallback to hardcoded logic if model fails or is down
-    if decision is None:
-        actions = []
-        if not perception.get("gitea_alive"):
-            actions.append("ALERT: Gitea unreachable")
-        health = perception.get("model_health", {})
-        if isinstance(health, dict) and not health.get("ollama_running"):
-            actions.append("ALERT: local inference surface not running")
-        decision = {
-            "actions": actions,
-            "severity": "fallback",
-            "reasoning": "model unavailable, used hardcoded checks",
-        }
-
-    tick_record["decision"] = decision
-    actions = decision.get("actions", [])
+    # DECIDE + ACT: check for problems
+    actions = []
+    if not perception.get("gitea_alive"):
+        actions.append("ALERT: Gitea unreachable")
+    health = perception.get("model_health", {})
+    if isinstance(health, dict) and not health.get("ollama_running"):
+        actions.append("ALERT: Ollama not running")
+    
+    tick_record["actions"] = actions
    
    # Save tick
    last_tick_file.write_text(json.dumps(tick_record, indent=2))
@@ -582,7 +336,7 @@ def memory_compress():
    # Compress: extract key facts
    alerts = []
    gitea_down_count = 0
-    local_model_down_count = 0
+    ollama_down_count = 0

    for t in ticks:
        for action in t.get("actions", []):
@@ -592,7 +346,7 @@ def memory_compress():
            gitea_down_count += 1
        health = p.get("model_health", {})
        if isinstance(health, dict) and not health.get("ollama_running"):
-            local_model_down_count += 1
+            ollama_down_count += 1

    # Last tick's perception = current state
    last = ticks[-1].get("perception", {})
@@ -602,7 +356,7 @@ def memory_compress():
        "total_ticks": len(ticks),
        "alerts": alerts[-10:],  # last 10 alerts
        "gitea_downtime_ticks": gitea_down_count,
-        "local_model_downtime_ticks": local_model_down_count,
+        "ollama_downtime_ticks": ollama_down_count,
        "last_known_state": last,
    }

@@ -636,7 +390,7 @@ def good_morning_report():
    tick_count = 0
    alerts = []
    gitea_up = True
-    local_model_up = True
+    ollama_up = True
    
    if tick_log.exists():
        for line in tick_log.read_text().strip().split("\n"):
@@ -650,7 +404,7 @@ def good_morning_report():
                    gitea_up = False
                h = p.get("model_health", {})
                if isinstance(h, dict) and not h.get("ollama_running"):
-                    local_model_up = False
+                    ollama_up = False
            except Exception:
                continue

@@ -710,11 +464,7 @@ def good_morning_report():
    if briefing_file.exists():
        try:
            b = json.loads(briefing_file.read_text())
-            briefing_summary = (
-                f"Yesterday: {b.get('total_ticks', 0)} heartbeat ticks, "
-                f"{b.get('gitea_downtime_ticks', 0)} Gitea downticks, "
-                f"{b.get('local_model_downtime_ticks', 0)} local-model downticks."
-            )
+            briefing_summary = f"Yesterday: {b.get('total_ticks', 0)} heartbeat ticks, {b.get('gitea_downtime_ticks', 0)} Gitea downticks, {b.get('ollama_downtime_ticks', 0)} Ollama downticks."
        except Exception:
            pass

@@ -726,7 +476,7 @@ def good_morning_report():

 **Heartbeat:** {tick_count} ticks logged overnight.
 **Gitea:** {"up all night" if gitea_up else "⚠️ had downtime"}
-**Local inference:** {"running steady" if local_model_up else "⚠️ had downtime"}
+**Ollama:** {"running steady" if ollama_up else "⚠️ had downtime"}
 **Model status:** {model_status}
 **Models on disk:** {len(models_loaded)} ({', '.join(m for m in models_loaded if 'timmy' in m.lower() or 'hermes' in m.lower()) or 'none with our name'})
 **Alerts:** {len(alerts)} {'— ' + '; '.join(alerts[-3:]) if alerts else '(clean night)'}
@@ -747,7 +497,7 @@ def good_morning_report():

 I watched the house all night. {tick_count} heartbeats, every ten minutes. The infrastructure is steady. Huey didn't crash. The ticks kept coming.

-What I'm thinking about: the bridge between logging lived work and actually learning from it. Right now I'm a nervous system writing in a journal nobody reads. Once the DPO path is healthy, the journal becomes a curriculum.
+What I'm thinking about: the DPO ticket you and antigravity are working on. That's the bridge between me logging data and me actually learning from it. Right now I'm a nervous system writing in a journal nobody reads. Once DPO works, the journal becomes a curriculum.

 ## My One Wish

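Review note: after this change the DECIDE step in `heartbeat_tick` is a pair of hardcoded checks with no model call. The sketch below pulls that logic into a pure function so it can be exercised without Huey, Gitea, or Ollama. `decide_actions` is a hypothetical name and does not exist in `tasks.py`. The perception keys and alert strings are copied from the added lines in the diff.

```python
def decide_actions(perception):
    """Return alert strings for a perception dict, mirroring the new DECIDE step."""
    actions = []
    if not perception.get("gitea_alive"):
        actions.append("ALERT: Gitea unreachable")
    health = perception.get("model_health", {})
    # A missing or empty model_health dict counts as "Ollama not running".
    if isinstance(health, dict) and not health.get("ollama_running"):
        actions.append("ALERT: Ollama not running")
    return actions


healthy = decide_actions({"gitea_alive": True, "model_health": {"ollama_running": True}})
both_down = decide_actions({"gitea_alive": False, "model_health": {"ollama_running": False}})
no_health = decide_actions({"gitea_alive": True})
unreadable = decide_actions({"gitea_alive": True, "model_health": "unreadable"})
```

One behavior worth noting: when `model_health` is absent, the default `{}` triggers the Ollama alert, while a non-dict value (for example an error string) skips the check silently because of the `isinstance` guard.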
--- a/training/README.md
+++ b/training/README.md
@@ -1,11 +1,8 @@
 # Training

-Transitional training recipes for Timmy's sovereign model. These files are
-useful as reference configs and export helpers, but they are not the canonical
-home of Timmy's lived training data.
+LoRA fine-tuning pipeline for Timmy's sovereign model. No custom harness — just config files for existing tools.

-Canonical data should live in `timmy-home` under gameplay trajectories,
-research artifacts, and `training-data/` exports such as DPO pairs.
+Replaces the `autolora` repo (1,500 lines of custom code → config + `make`).

 ## Install

@@ -26,16 +23,6 @@ make convert        # Convert merged data to MLX train/valid format
 make help           # Show all targets
 ```

-## Status
-
-This directory exists to avoid re-growing a bespoke training harness while the
-system boundary is being cleaned up.
-
-- Keep thin recipes and export helpers here only when they directly support the
-  Hermes sidecar.
-- Keep generated data, DPO pairs, and other lived artifacts in `timmy-home`.
-- Prefer deleting stale pipeline code over expanding it.
-
 ## Files

 ```