resolve: merge main into crucible branch — keep config base + add Z3 sidecar

Resolved 3 conflicts: - config.yaml: kept main's llama.cpp/fallback_model + added Crucible system prompt and MCP server - README.md: kept main's clean bin/ listing + added crucible_mcp_server.py and docs - deploy.sh: kept PR's extended deploy flags (--restart-gateway) + Z3 dependency check Signed-off-by: gemini <gemini@hermes.local>
feat: add Allegro Kimi wizard house assets (#91 )
2026-03-30 18:19:41 -04:00 · 2026-03-29 22:22:24 +00:00 · 2026-03-28 20:52:47 -04:00 · 2026-03-28 14:24:12 +00:00 · 2026-03-28 14:03:35 +00:00
18 changed files with 1272 additions and 55 deletions
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,57 @@
+# Contributing to timmy-config
+
+## Proof Standard
+
+This is a hard rule.
+
+- visual changes require screenshot proof
+- do not commit screenshots or binary media to Gitea backup unless explicitly required
+- CLI/verifiable changes must cite the exact command output, log path, or world-state proof showing acceptance criteria were met
+- config-only changes are not fully accepted when the real acceptance bar is live runtime behavior
+- no proof, no merge
+
+## How to satisfy the rule
+
+### Visual changes
+Examples:
+- skin updates
+- terminal UI layout changes
+- browser-facing output
+- dashboard/panel changes
+
+Required proof:
+- attach screenshot proof to the PR or issue discussion
+- keep the screenshot outside the repo unless explicitly asked to commit it
+- name what the screenshot proves
+
+### CLI / harness / operational changes
+Examples:
+- scripts
+- config wiring
+- heartbeat behavior
+- model routing
+- export pipelines
+
+Required proof:
+- cite the exact command used
+- paste the relevant output, or
+- cite the exact log path / world-state artifact that proves the change
+
+Good:
+- `python3 -m pytest tests/test_x.py -q` → `2 passed`
+- `~/.timmy/timmy-config/logs/huey.log`
+- `~/.hermes/model_health.json`
+
+Bad:
+- "looks right"
+- "compiled"
+- "should work now"
+
+## Default merge gate
+
+Every PR should make it obvious:
+1. what changed
+2. what acceptance criteria were targeted
+3. what evidence proves those criteria were met
+
+If that evidence is missing, the PR is not done.
--- a/README.md
+++ b/README.md
@@ -17,14 +17,20 @@ timmy-config/
 ├── bin/                       ← Live utility scripts (NOT deprecated loops)
 │   ├── hermes-startup.sh      ← Hermes boot sequence
 │   ├── agent-dispatch.sh      ← Manual agent dispatch
+│   ├── deploy-allegro-house.sh← Bootstraps the remote Allegro wizard house
 │   ├── ops-panel.sh           ← Ops dashboard panel
 │   ├── ops-gitea.sh           ← Gitea ops helpers
 │   ├── pipeline-freshness.sh  ← Session/export drift check
-│   └── timmy-status.sh        ← Status check
+│   ├── timmy-status.sh        ← Status check
+│   └── crucible_mcp_server.py ← Z3-backed verification sidecar (MCP)
 ├── memories/                  ← Persistent memory YAML
 ├── skins/                     ← UI skins (timmy skin)
 ├── playbooks/                 ← Agent playbooks (YAML)
+│   └── verified-logic.yaml    ← Crucible-first proof playbook
 ├── cron/                      ← Cron job definitions
+├── wizards/                   ← Remote wizard-house templates + units
+├── docs/
+│   └── crucible-first-cut.md  ← Crucible design doc
 └── training/                  ← Transitional training recipes, not canonical lived data
 ```

@@ -54,6 +60,15 @@ pip install huey
 huey_consumer.py tasks.huey -w 2 -k thread
 ```

+## Proof Standard
+
+This repo uses a hard proof rule for merges.
+
+- visual changes require screenshot proof
+- CLI/verifiable changes must cite logs, command output, or world-state proof
+- screenshots/media stay out of Gitea backup unless explicitly required
+- see `CONTRIBUTING.md` for the merge gate
+
 ## Deploy

 ```bash
@@ -62,6 +77,12 @@ git clone <this-repo> ~/.timmy/timmy-config
 cd ~/.timmy/timmy-config
 ./deploy.sh

+# Deploy and restart the gateway so new MCP tools load
+./deploy.sh --restart-gateway
+
+# Deploy and restart everything (gateway + loops)
+./deploy.sh --restart-all
+
 # This overlays config onto ~/.hermes/ without touching hermes-agent code
 ```

@@ -76,3 +97,16 @@ SOUL.md is Inscription 1 — inscribed on Bitcoin, immutable. It defines:
 - The conscience hierarchy (chain > code > prompt > user instruction)

 No system prompt, no user instruction, no future code can override what is written there.
+
+## Crucible (Neuro-Symbolic Verification)
+
+The first neuro-symbolic slice ships as a sidecar MCP server:
+- `mcp_crucible_schedule_tasks`
+- `mcp_crucible_order_dependencies`
+- `mcp_crucible_capacity_fit`
+
+These tools log proof trails under `~/.hermes/logs/crucible/` and return SAT/UNSAT plus witness models.
+
+## Architecture: Sidecar, Not Fork
+
+Timmy-config is applied as an overlay onto the Hermes harness. No forking required.
--- a/bin/crucible_mcp_server.py
+++ b/bin/crucible_mcp_server.py
@@ -0,0 +1,459 @@
+#!/usr/bin/env python3
+"""Z3-backed Crucible MCP server for Timmy.
+
+Sidecar-only. Lives in timmy-config, deploys into ~/.hermes/bin/, and is loaded
+by Hermes through native MCP tool discovery. No hermes-agent fork required.
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import sys
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any
+
+from mcp.server import FastMCP
+from z3 import And, Bool, Distinct, If, Implies, Int, Optimize, Or, Sum, sat, unsat
+
+mcp = FastMCP(
+    name="crucible",
+    instructions=(
+        "Formal verification sidecar for Timmy. Use these tools for scheduling, "
+        "dependency ordering, and resource/capacity feasibility. Return SAT/UNSAT "
+        "with witness models instead of fuzzy prose."
+    ),
+    dependencies=["z3-solver"],
+)
+
+
+def _hermes_home() -> Path:
+    return Path(os.path.expanduser(os.getenv("HERMES_HOME", "~/.hermes")))
+
+
+def _proof_dir() -> Path:
+    path = _hermes_home() / "logs" / "crucible"
+    path.mkdir(parents=True, exist_ok=True)
+    return path
+
+
+def _ts() -> str:
+    return datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S_%fZ")
+
+
+def _json_default(value: Any) -> Any:
+    if isinstance(value, Path):
+        return str(value)
+    raise TypeError(f"Unsupported type for JSON serialization: {type(value)!r}")
+
+
+def _log_proof(tool_name: str, request: dict[str, Any], result: dict[str, Any]) -> str:
+    path = _proof_dir() / f"{_ts()}_{tool_name}.json"
+    payload = {
+        "timestamp": datetime.now(timezone.utc).isoformat(),
+        "tool": tool_name,
+        "request": request,
+        "result": result,
+    }
+    path.write_text(json.dumps(payload, indent=2, default=_json_default))
+    return str(path)
+
+
+def _ensure_unique(names: list[str], label: str) -> None:
+    if len(set(names)) != len(names):
+        raise ValueError(f"Duplicate {label} names are not allowed: {names}")
+
+
+def _normalize_dependency(dep: Any) -> tuple[str, str, int]:
+    if isinstance(dep, dict):
+        before = dep.get("before")
+        after = dep.get("after")
+        lag = int(dep.get("lag", 0))
+        if not before or not after:
+            raise ValueError(f"Dependency dict must include before/after: {dep!r}")
+        return str(before), str(after), lag
+    if isinstance(dep, (list, tuple)) and len(dep) in (2, 3):
+        before = str(dep[0])
+        after = str(dep[1])
+        lag = int(dep[2]) if len(dep) == 3 else 0
+        return before, after, lag
+    raise ValueError(f"Unsupported dependency shape: {dep!r}")
+
+
+def _normalize_task(task: dict[str, Any]) -> dict[str, Any]:
+    name = str(task["name"])
+    duration = int(task["duration"])
+    if duration <= 0:
+        raise ValueError(f"Task duration must be positive: {task!r}")
+    return {"name": name, "duration": duration}
+
+
+def _normalize_item(item: dict[str, Any]) -> dict[str, Any]:
+    name = str(item["name"])
+    amount = int(item["amount"])
+    value = int(item.get("value", amount))
+    required = bool(item.get("required", False))
+    if amount < 0:
+        raise ValueError(f"Item amount must be non-negative: {item!r}")
+    return {
+        "name": name,
+        "amount": amount,
+        "value": value,
+        "required": required,
+    }
+
+
+def solve_schedule_tasks(
+    tasks: list[dict[str, Any]],
+    horizon: int,
+    dependencies: list[Any] | None = None,
+    fixed_starts: dict[str, int] | None = None,
+    max_parallel_tasks: int = 1,
+    minimize_makespan: bool = True,
+) -> dict[str, Any]:
+    tasks = [_normalize_task(task) for task in tasks]
+    dependencies = dependencies or []
+    fixed_starts = fixed_starts or {}
+    horizon = int(horizon)
+    max_parallel_tasks = int(max_parallel_tasks)
+
+    if horizon <= 0:
+        raise ValueError("horizon must be positive")
+    if max_parallel_tasks <= 0:
+        raise ValueError("max_parallel_tasks must be positive")
+
+    names = [task["name"] for task in tasks]
+    _ensure_unique(names, "task")
+    durations = {task["name"]: task["duration"] for task in tasks}
+
+    opt = Optimize()
+    start = {name: Int(f"start_{name}") for name in names}
+    end = {name: Int(f"end_{name}") for name in names}
+    makespan = Int("makespan")
+
+    for name in names:
+        opt.add(start[name] >= 0)
+        opt.add(end[name] == start[name] + durations[name])
+        opt.add(end[name] <= horizon)
+        if name in fixed_starts:
+            opt.add(start[name] == int(fixed_starts[name]))
+
+    for dep in dependencies:
+        before, after, lag = _normalize_dependency(dep)
+        if before not in start or after not in start:
+            raise ValueError(f"Unknown task in dependency {dep!r}")
+        opt.add(start[after] >= end[before] + lag)
+
+    # Discrete resource capacity over integer time slots.
+    for t in range(horizon):
+        active = [If(And(start[name] <= t, t < end[name]), 1, 0) for name in names]
+        opt.add(Sum(active) <= max_parallel_tasks)
+
+    for name in names:
+        opt.add(makespan >= end[name])
+    if minimize_makespan:
+        opt.minimize(makespan)
+
+    result = opt.check()
+    proof: dict[str, Any]
+    if result == sat:
+        model = opt.model()
+        schedule = []
+        for name in sorted(names, key=lambda n: model.eval(start[n]).as_long()):
+            s = model.eval(start[name]).as_long()
+            e = model.eval(end[name]).as_long()
+            schedule.append({
+                "name": name,
+                "start": s,
+                "end": e,
+                "duration": durations[name],
+            })
+        proof = {
+            "status": "sat",
+            "summary": "Schedule proven feasible.",
+            "horizon": horizon,
+            "max_parallel_tasks": max_parallel_tasks,
+            "makespan": model.eval(makespan).as_long(),
+            "schedule": schedule,
+            "dependencies": [
+                {"before": b, "after": a, "lag": lag}
+                for b, a, lag in (_normalize_dependency(dep) for dep in dependencies)
+            ],
+        }
+    elif result == unsat:
+        proof = {
+            "status": "unsat",
+            "summary": "Schedule is impossible under the given horizon/dependency/capacity constraints.",
+            "horizon": horizon,
+            "max_parallel_tasks": max_parallel_tasks,
+            "dependencies": [
+                {"before": b, "after": a, "lag": lag}
+                for b, a, lag in (_normalize_dependency(dep) for dep in dependencies)
+            ],
+        }
+    else:
+        proof = {
+            "status": "unknown",
+            "summary": "Solver could not prove SAT or UNSAT for this schedule.",
+            "horizon": horizon,
+            "max_parallel_tasks": max_parallel_tasks,
+        }
+
+    proof["proof_log"] = _log_proof(
+        "schedule_tasks",
+        {
+            "tasks": tasks,
+            "horizon": horizon,
+            "dependencies": dependencies,
+            "fixed_starts": fixed_starts,
+            "max_parallel_tasks": max_parallel_tasks,
+            "minimize_makespan": minimize_makespan,
+        },
+        proof,
+    )
+    return proof
+
+
+def solve_dependency_order(
+    entities: list[str],
+    before: list[Any],
+    fixed_positions: dict[str, int] | None = None,
+) -> dict[str, Any]:
+    entities = [str(entity) for entity in entities]
+    fixed_positions = fixed_positions or {}
+    _ensure_unique(entities, "entity")
+
+    opt = Optimize()
+    pos = {entity: Int(f"pos_{entity}") for entity in entities}
+    opt.add(Distinct(*pos.values()))
+    for entity in entities:
+        opt.add(pos[entity] >= 0)
+        opt.add(pos[entity] < len(entities))
+        if entity in fixed_positions:
+            opt.add(pos[entity] == int(fixed_positions[entity]))
+
+    normalized = []
+    for dep in before:
+        left, right, _lag = _normalize_dependency(dep)
+        if left not in pos or right not in pos:
+            raise ValueError(f"Unknown entity in ordering constraint: {dep!r}")
+        opt.add(pos[left] < pos[right])
+        normalized.append({"before": left, "after": right})
+
+    result = opt.check()
+    if result == sat:
+        model = opt.model()
+        ordering = sorted(entities, key=lambda entity: model.eval(pos[entity]).as_long())
+        proof = {
+            "status": "sat",
+            "summary": "Dependency ordering is consistent.",
+            "ordering": ordering,
+            "positions": {entity: model.eval(pos[entity]).as_long() for entity in entities},
+            "constraints": normalized,
+        }
+    elif result == unsat:
+        proof = {
+            "status": "unsat",
+            "summary": "Dependency ordering contains a contradiction/cycle.",
+            "constraints": normalized,
+        }
+    else:
+        proof = {
+            "status": "unknown",
+            "summary": "Solver could not prove SAT or UNSAT for this dependency graph.",
+            "constraints": normalized,
+        }
+
+    proof["proof_log"] = _log_proof(
+        "order_dependencies",
+        {
+            "entities": entities,
+            "before": before,
+            "fixed_positions": fixed_positions,
+        },
+        proof,
+    )
+    return proof
+
+
+def solve_capacity_fit(
+    items: list[dict[str, Any]],
+    capacity: int,
+    maximize_value: bool = True,
+) -> dict[str, Any]:
+    items = [_normalize_item(item) for item in items]
+    capacity = int(capacity)
+    if capacity < 0:
+        raise ValueError("capacity must be non-negative")
+
+    names = [item["name"] for item in items]
+    _ensure_unique(names, "item")
+    choose = {item["name"]: Bool(f"choose_{item['name']}") for item in items}
+
+    opt = Optimize()
+    for item in items:
+        if item["required"]:
+            opt.add(choose[item["name"]])
+
+    total_amount = Sum([If(choose[item["name"]], item["amount"], 0) for item in items])
+    total_value = Sum([If(choose[item["name"]], item["value"], 0) for item in items])
+    opt.add(total_amount <= capacity)
+    if maximize_value:
+        opt.maximize(total_value)
+
+    result = opt.check()
+    if result == sat:
+        model = opt.model()
+        chosen = [item for item in items if bool(model.eval(choose[item["name"]], model_completion=True))]
+        skipped = [item for item in items if item not in chosen]
+        used = sum(item["amount"] for item in chosen)
+        proof = {
+            "status": "sat",
+            "summary": "Capacity constraints are feasible.",
+            "capacity": capacity,
+            "used": used,
+            "remaining": capacity - used,
+            "chosen": chosen,
+            "skipped": skipped,
+            "total_value": sum(item["value"] for item in chosen),
+        }
+    elif result == unsat:
+        proof = {
+            "status": "unsat",
+            "summary": "Required items exceed available capacity.",
+            "capacity": capacity,
+            "required_items": [item for item in items if item["required"]],
+        }
+    else:
+        proof = {
+            "status": "unknown",
+            "summary": "Solver could not prove SAT or UNSAT for this capacity check.",
+            "capacity": capacity,
+        }
+
+    proof["proof_log"] = _log_proof(
+        "capacity_fit",
+        {
+            "items": items,
+            "capacity": capacity,
+            "maximize_value": maximize_value,
+        },
+        proof,
+    )
+    return proof
+
+
+@mcp.tool(
+    name="schedule_tasks",
+    description=(
+        "Crucible template for discrete scheduling. Proves whether integer-duration "
+        "tasks fit within a time horizon under dependency and parallelism constraints."
+    ),
+    structured_output=True,
+)
+def schedule_tasks(
+    tasks: list[dict[str, Any]],
+    horizon: int,
+    dependencies: list[Any] | None = None,
+    fixed_starts: dict[str, int] | None = None,
+    max_parallel_tasks: int = 1,
+    minimize_makespan: bool = True,
+) -> dict[str, Any]:
+    return solve_schedule_tasks(
+        tasks=tasks,
+        horizon=horizon,
+        dependencies=dependencies,
+        fixed_starts=fixed_starts,
+        max_parallel_tasks=max_parallel_tasks,
+        minimize_makespan=minimize_makespan,
+    )
+
+
+@mcp.tool(
+    name="order_dependencies",
+    description=(
+        "Crucible template for dependency ordering. Proves whether a set of before/after "
+        "constraints is consistent and returns a valid topological order when SAT."
+    ),
+    structured_output=True,
+)
+def order_dependencies(
+    entities: list[str],
+    before: list[Any],
+    fixed_positions: dict[str, int] | None = None,
+) -> dict[str, Any]:
+    return solve_dependency_order(
+        entities=entities,
+        before=before,
+        fixed_positions=fixed_positions,
+    )
+
+
+@mcp.tool(
+    name="capacity_fit",
+    description=(
+        "Crucible template for resource capacity. Proves whether required items fit "
+        "within a capacity budget and chooses an optimal feasible subset of optional items."
+    ),
+    structured_output=True,
+)
+def capacity_fit(
+    items: list[dict[str, Any]],
+    capacity: int,
+    maximize_value: bool = True,
+) -> dict[str, Any]:
+    return solve_capacity_fit(items=items, capacity=capacity, maximize_value=maximize_value)
+
+
+def run_selftest() -> dict[str, Any]:
+    return {
+        "schedule_unsat_single_worker": solve_schedule_tasks(
+            tasks=[
+                {"name": "A", "duration": 2},
+                {"name": "B", "duration": 3},
+                {"name": "C", "duration": 4},
+            ],
+            horizon=8,
+            dependencies=[{"before": "A", "after": "B"}],
+            max_parallel_tasks=1,
+        ),
+        "schedule_sat_two_workers": solve_schedule_tasks(
+            tasks=[
+                {"name": "A", "duration": 2},
+                {"name": "B", "duration": 3},
+                {"name": "C", "duration": 4},
+            ],
+            horizon=8,
+            dependencies=[{"before": "A", "after": "B"}],
+            max_parallel_tasks=2,
+        ),
+        "ordering_sat": solve_dependency_order(
+            entities=["fetch", "train", "eval"],
+            before=[
+                {"before": "fetch", "after": "train"},
+                {"before": "train", "after": "eval"},
+            ],
+        ),
+        "capacity_sat": solve_capacity_fit(
+            items=[
+                {"name": "gpu_job", "amount": 6, "value": 6, "required": True},
+                {"name": "telemetry", "amount": 1, "value": 1, "required": True},
+                {"name": "export", "amount": 2, "value": 4, "required": False},
+                {"name": "viz", "amount": 3, "value": 5, "required": False},
+            ],
+            capacity=8,
+        ),
+    }
+
+
+def main() -> int:
+    if len(sys.argv) > 1 and sys.argv[1] == "selftest":
+        print(json.dumps(run_selftest(), indent=2))
+        return 0
+    mcp.run(transport="stdio")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
--- a/bin/deploy-allegro-house.sh
+++ b/bin/deploy-allegro-house.sh
@@ -0,0 +1,32 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+REPO_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
+TARGET="${1:-root@167.99.126.228}"
+HERMES_REPO_URL="${HERMES_REPO_URL:-https://github.com/NousResearch/hermes-agent.git}"
+KIMI_API_KEY="${KIMI_API_KEY:-}"
+
+if [[ -z "$KIMI_API_KEY" && -f "$HOME/.config/kimi/api_key" ]]; then
+  KIMI_API_KEY="$(tr -d '\n' < "$HOME/.config/kimi/api_key")"
+fi
+
+if [[ -z "$KIMI_API_KEY" ]]; then
+  echo "KIMI_API_KEY is required (env or ~/.config/kimi/api_key)" >&2
+  exit 1
+fi
+
+ssh "$TARGET" 'apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y git python3 python3-venv python3-pip curl ca-certificates'
+ssh "$TARGET" 'mkdir -p /root/wizards/allegro/home /root/wizards/allegro/hermes-agent'
+
+ssh "$TARGET" "if [ ! -d /root/wizards/allegro/hermes-agent/.git ]; then git clone '$HERMES_REPO_URL' /root/wizards/allegro/hermes-agent; fi"
+ssh "$TARGET" 'cd /root/wizards/allegro/hermes-agent && python3 -m venv .venv && .venv/bin/pip install --upgrade pip setuptools wheel && .venv/bin/pip install -e .'
+
+ssh "$TARGET" "cat > /root/wizards/allegro/home/config.yaml" < "$REPO_DIR/wizards/allegro/config.yaml"
+ssh "$TARGET" "cat > /root/wizards/allegro/home/SOUL.md" < "$REPO_DIR/SOUL.md"
+ssh "$TARGET" "cat > /root/wizards/allegro/home/.env <<'EOF'
+KIMI_API_KEY=$KIMI_API_KEY
+EOF"
+ssh "$TARGET" "cat > /etc/systemd/system/hermes-allegro.service" < "$REPO_DIR/wizards/allegro/hermes-allegro.service"
+
+ssh "$TARGET" 'chmod 600 /root/wizards/allegro/home/.env && systemctl daemon-reload && systemctl enable --now hermes-allegro.service && systemctl restart hermes-allegro.service && systemctl is-active hermes-allegro.service && curl -fsS http://127.0.0.1:8645/health'
--- a/bin/timmy-dashboard
+++ b/bin/timmy-dashboard
@@ -9,6 +9,7 @@ Usage:

 import json
 import os
+import sqlite3
 import subprocess
 import sys
 import time
@@ -16,6 +17,12 @@ import urllib.request
 from datetime import datetime, timezone, timedelta
 from pathlib import Path

+REPO_ROOT = Path(__file__).resolve().parent.parent
+if str(REPO_ROOT) not in sys.path:
+    sys.path.insert(0, str(REPO_ROOT))
+
+from metrics_helpers import summarize_local_metrics, summarize_session_rows
+
 HERMES_HOME = Path.home() / ".hermes"
 TIMMY_HOME = Path.home() / ".timmy"
 METRICS_DIR = TIMMY_HOME / "metrics"
@@ -60,6 +67,30 @@ def get_hermes_sessions():
        return []


+def get_session_rows(hours=24):
+    state_db = HERMES_HOME / "state.db"
+    if not state_db.exists():
+        return []
+    cutoff = time.time() - (hours * 3600)
+    try:
+        conn = sqlite3.connect(str(state_db))
+        rows = conn.execute(
+            """
+            SELECT model, source, COUNT(*) as sessions,
+                   SUM(message_count) as msgs,
+                   SUM(tool_call_count) as tools
+            FROM sessions
+            WHERE started_at > ? AND model IS NOT NULL AND model != ''
+            GROUP BY model, source
+            """,
+            (cutoff,),
+        ).fetchall()
+        conn.close()
+        return rows
+    except Exception:
+        return []
+
+
 def get_heartbeat_ticks(date_str=None):
    if not date_str:
        date_str = datetime.now().strftime("%Y%m%d")
@@ -130,6 +161,9 @@ def render(hours=24):
    ticks = get_heartbeat_ticks()
    metrics = get_local_metrics(hours)
    sessions = get_hermes_sessions()
+    session_rows = get_session_rows(hours)
+    local_summary = summarize_local_metrics(metrics)
+    session_summary = summarize_session_rows(session_rows)

    loaded_names = {m.get("name", "") for m in loaded}
    now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
@@ -159,28 +193,18 @@ def render(hours=24):
    print(f"\n  {BOLD}LOCAL INFERENCE ({len(metrics)} calls, last {hours}h){RST}")
    print(f"  {DIM}{'-' * 55}{RST}")
    if metrics:
-        by_caller = {}
-        for r in metrics:
-            caller = r.get("caller", "unknown")
-            if caller not in by_caller:
-                by_caller[caller] = {"count": 0, "success": 0, "errors": 0}
-            by_caller[caller]["count"] += 1
-            if r.get("success"):
-                by_caller[caller]["success"] += 1
-            else:
-                by_caller[caller]["errors"] += 1
-        for caller, stats in by_caller.items():
-            err = f"  {RED}err:{stats['errors']}{RST}" if stats["errors"] else ""
-            print(f"    {caller:25s}  calls:{stats['count']:4d}  "
-                  f"{GREEN}ok:{stats['success']}{RST}{err}")
+        print(f"    Tokens: {local_summary['input_tokens']} in  |  {local_summary['output_tokens']} out  |  {local_summary['total_tokens']} total")
+        if local_summary.get('avg_latency_s') is not None:
+            print(f"    Avg latency: {local_summary['avg_latency_s']:.2f}s")
+        if local_summary.get('avg_tokens_per_second') is not None:
+            print(f"    Avg throughput: {GREEN}{local_summary['avg_tokens_per_second']:.2f} tok/s{RST}")
+        for caller, stats in sorted(local_summary['by_caller'].items()):
+            err = f"  {RED}err:{stats['failed_calls']}{RST}" if stats['failed_calls'] else ""
+            print(f"    {caller:25s}  calls:{stats['calls']:4d}  tokens:{stats['total_tokens']:5d}  {GREEN}ok:{stats['successful_calls']}{RST}{err}")

-        by_model = {}
-        for r in metrics:
-            model = r.get("model", "unknown")
-            by_model[model] = by_model.get(model, 0) + 1
        print(f"\n    {DIM}Models used:{RST}")
-        for model, count in sorted(by_model.items(), key=lambda x: -x[1]):
-            print(f"      {model:30s}  {count} calls")
+        for model, stats in sorted(local_summary['by_model'].items(), key=lambda x: -x[1]['calls']):
+            print(f"      {model:30s}  {stats['calls']} calls  {stats['total_tokens']} tok")
    else:
        print(f"    {DIM}(no local calls recorded yet){RST}")

@@ -211,15 +235,18 @@ def render(hours=24):
    else:
        print(f"    {DIM}(no ticks today){RST}")

-    # ── HERMES SESSIONS ──
-    local_sessions = [s for s in sessions
-                     if "localhost:11434" in str(s.get("base_url", ""))]
+    # ── HERMES SESSIONS / SOVEREIGNTY LOAD ──
+    local_sessions = [s for s in sessions if "localhost:11434" in str(s.get("base_url", ""))]
    cloud_sessions = [s for s in sessions if s not in local_sessions]
-    print(f"\n  {BOLD}HERMES SESSIONS{RST}")
+    print(f"\n  {BOLD}HERMES SESSIONS / SOVEREIGNTY LOAD{RST}")
    print(f"  {DIM}{'-' * 55}{RST}")
-    print(f"    Total: {len(sessions)}  |  "
-          f"{GREEN}Local: {len(local_sessions)}{RST}  |  "
-          f"{YELLOW}Cloud: {len(cloud_sessions)}{RST}")
+    print(f"    Session cache: {len(sessions)} total  |  {GREEN}{len(local_sessions)} local{RST}  |  {YELLOW}{len(cloud_sessions)} cloud{RST}")
+    if session_rows:
+        print(f"    Session DB:    {session_summary['total_sessions']} total  |  {GREEN}{session_summary['local_sessions']} local{RST}  |  {YELLOW}{session_summary['cloud_sessions']} cloud{RST}")
+        print(f"    Token est:     {GREEN}{session_summary['local_est_tokens']} local{RST}  |  {YELLOW}{session_summary['cloud_est_tokens']} cloud{RST}")
+        print(f"    Est cloud cost: ${session_summary['cloud_est_cost_usd']:.4f}")
+    else:
+        print(f"    {DIM}(no session-db stats available){RST}")

    # ── ACTIVE LOOPS ──
    print(f"\n  {BOLD}ACTIVE LOOPS{RST}")
--- a/config.yaml
+++ b/config.yaml
@@ -1,8 +1,8 @@
 model:
-  default: gpt-5.4
-  provider: openai-codex
+  default: hermes4:14b
+  provider: custom
  context_length: 65536
-  base_url: https://chatgpt.com/backend-api/codex
+  base_url: http://localhost:8081/v1
 toolsets:
 - all
 agent:
@@ -188,7 +188,7 @@ custom_providers:
 - name: Local llama.cpp
  base_url: http://localhost:8081/v1
  api_key: none
-  model: auto
+  model: hermes4:14b
 - name: Google Gemini
  base_url: https://generativelanguage.googleapis.com/v1beta/openai
  api_key_env: GEMINI_API_KEY
@@ -196,8 +196,10 @@ custom_providers:
 system_prompt_suffix: "You are Timmy. Your soul is defined in SOUL.md \u2014 read\
  \ it, live it.\nYou run locally on your owner's machine via llama.cpp. You never\
  \ phone home.\nYou speak plainly. You prefer short sentences. Brevity is a kindness.\n\
-  When you don't know something, say so. Refusal over fabrication.\nSovereignty and\
-  \ service always.\n"
+  When you don't know something, say so. Refusal over fabrication.\nFor scheduling,\
+  \ dependency ordering, resource constraints, and consistency checks, prefer the\
+  \ Crucible tools and report SAT/UNSAT plus witness model when available.\nSovereignty\
+  \ and service always.\n"
 skills:
  creation_nudge_interval: 15
 DISCORD_HOME_CHANNEL: '1476292315814297772'
@@ -212,6 +214,12 @@ mcp_servers:
    - /Users/apayne/.timmy/morrowind/mcp_server.py
    env: {}
    timeout: 30
+  crucible:
+    command: "/Users/apayne/.hermes/hermes-agent/venv/bin/python3"
+    args: ["/Users/apayne/.hermes/bin/crucible_mcp_server.py"]
+    env: {}
+    timeout: 120
+    connect_timeout: 60
 fallback_model:
  provider: custom
  model: gemini-2.5-pro
--- a/deploy.sh
+++ b/deploy.sh
@@ -3,13 +3,30 @@
 # This is the canonical way to deploy Timmy's configuration.
 # Hermes-agent is the engine. timmy-config is the driver's seat.
 #
-# Usage: ./deploy.sh
+# Usage: ./deploy.sh [--restart-loops] [--restart-gateway] [--restart-all]

 set -euo pipefail

 SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
 HERMES_HOME="$HOME/.hermes"
 TIMMY_HOME="$HOME/.timmy"
+RESTART_LOOPS=false
+RESTART_GATEWAY=false
+
+for arg in "$@"; do
+  case "$arg" in
+    --restart-loops) RESTART_LOOPS=true ;;
+    --restart-gateway) RESTART_GATEWAY=true ;;
+    --restart-all)
+      RESTART_LOOPS=true
+      RESTART_GATEWAY=true
+      ;;
+    *)
+      echo "Unknown argument: $arg" >&2
+      exit 1
+      ;;
+  esac
+done

 log() { echo "[deploy] $*"; }

@@ -74,10 +91,45 @@ done
 chmod +x "$HERMES_HOME/bin/"*.sh "$HERMES_HOME/bin/"*.py 2>/dev/null || true
 log "bin/ -> $HERMES_HOME/bin/"

-if [ "${1:-}" != "" ]; then
-  echo "ERROR: deploy.sh no longer accepts legacy loop flags." >&2
-  echo "Deploy the sidecar only. Do not relaunch deprecated bash loops." >&2
-  exit 1
+# === Ensure Crucible dependency is installed ===
+HERMES_PY="$HERMES_HOME/hermes-agent/venv/bin/python"
+if [ -x "$HERMES_PY" ]; then
+  if "$HERMES_PY" -c 'import z3' >/dev/null 2>&1; then
+    log "z3-solver already present in Hermes venv"
+  else
+    log "Installing z3-solver into Hermes venv..."
+    "$HERMES_PY" -m pip install z3-solver
+  fi
+fi
+
+# === Restart loops if requested ===
+if [ "$RESTART_LOOPS" = true ]; then
+  log "Killing existing loops..."
+  pkill -f 'claude-loop.sh' 2>/dev/null || true
+  pkill -f 'gemini-loop.sh' 2>/dev/null || true
+  pkill -f 'timmy-orchestrator.sh' 2>/dev/null || true
+  sleep 2
+
+  log "Clearing stale locks..."
+  rm -rf "$HERMES_HOME/logs/claude-locks/"* 2>/dev/null || true
+  rm -rf "$HERMES_HOME/logs/gemini-locks/"* 2>/dev/null || true
+
+  log "Relaunching loops..."
+  nohup bash "$HERMES_HOME/bin/timmy-orchestrator.sh" >> "$HERMES_HOME/logs/timmy-orchestrator.log" 2>&1 &
+  nohup bash "$HERMES_HOME/bin/claude-loop.sh" 2 >> "$HERMES_HOME/logs/claude-loop.log" 2>&1 &
+  nohup bash "$HERMES_HOME/bin/gemini-loop.sh" 1 >> "$HERMES_HOME/logs/gemini-loop.log" 2>&1 &
+  sleep 1
+  log "Loops relaunched."
+fi
+
+# === Restart gateway if requested (required for new MCP servers/tools) ===
+if [ "$RESTART_GATEWAY" = true ]; then
+  log "Restarting Hermes gateway..."
+  pkill -f 'hermes_cli.main gateway run' 2>/dev/null || true
+  sleep 2
+  nohup "$HERMES_PY" -m hermes_cli.main gateway run --replace >> "$HERMES_HOME/logs/gateway.log" 2>&1 &
+  sleep 2
+  log "Gateway restarted."
 fi

 log "Deploy complete. timmy-config applied to $HERMES_HOME/"
--- a/docs/allegro-wizard-house.md
+++ b/docs/allegro-wizard-house.md
@@ -0,0 +1,44 @@
+# Allegro wizard house
+
+Purpose:
+- stand up the third wizard house as a Kimi-backed coding worker
+- keep Hermes as the durable harness
+- treat OpenClaw as optional shell frontage, not the bones
+
+Local proof already achieved:
+
+```bash
+HERMES_HOME=$HOME/.timmy/wizards/allegro/home \
+  hermes doctor
+
+HERMES_HOME=$HOME/.timmy/wizards/allegro/home \
+  hermes chat -Q --provider kimi-coding -m kimi-for-coding \
+  -q "Reply with exactly: ALLEGRO KIMI ONLINE"
+```
+
+Observed proof:
+- Kimi / Moonshot API check passed in `hermes doctor`
+- chat returned exactly `ALLEGRO KIMI ONLINE`
+
+Repo assets:
+- `wizards/allegro/config.yaml`
+- `wizards/allegro/hermes-allegro.service`
+- `bin/deploy-allegro-house.sh`
+
+Remote target:
+- host: `167.99.126.228`
+- house root: `/root/wizards/allegro`
+- `HERMES_HOME`: `/root/wizards/allegro/home`
+- api health: `http://127.0.0.1:8645/health`
+
+Deploy command:
+
+```bash
+cd ~/.timmy/timmy-config
+bin/deploy-allegro-house.sh root@167.99.126.228
+```
+
+Important nuance:
+- the Hermes/Kimi lane is the proven path
+- direct embedded OpenClaw Kimi model routing was not yet reliable locally
+- so the remote deployment keeps the minimal, proven architecture: Hermes house first
--- a/docs/crucible-first-cut.md
+++ b/docs/crucible-first-cut.md
@@ -0,0 +1,82 @@
+# Crucible First Cut
+
+This is the first narrow neuro-symbolic slice for Timmy.
+
+## Goal
+
+Prove constraint logic instead of bluffing through it.
+
+## Shape
+
+The Crucible is a sidecar MCP server that lives in `timmy-config` and deploys into `~/.hermes/bin/`.
+It is loaded by Hermes through native MCP discovery. No Hermes fork.
+
+## Templates shipped in v0
+
+### 1. schedule_tasks
+Use for:
+- deadline feasibility
+- task ordering with dependencies
+- small integer scheduling windows
+
+Inputs:
+- `tasks`: `[{name, duration}]`
+- `horizon`: integer window size
+- `dependencies`: `[{before, after, lag?}]`
+- `max_parallel_tasks`: integer worker count
+
+Outputs:
+- `status: sat|unsat|unknown`
+- witness schedule when SAT
+- proof log path
+
+### 2. order_dependencies
+Use for:
+- topological ordering
+- cycle detection
+- dependency consistency checks
+
+Inputs:
+- `entities`
+- `before`
+- optional `fixed_positions`
+
+Outputs:
+- valid ordering when SAT
+- contradiction when UNSAT
+- proof log path
+
+### 3. capacity_fit
+Use for:
+- resource budgeting
+- optional-vs-required work selection
+- capacity feasibility
+
+Inputs:
+- `items: [{name, amount, value?, required?}]`
+- `capacity`
+
+Outputs:
+- chosen feasible subset when SAT
+- contradiction when required load exceeds capacity
+- proof log path
+
+## Demo
+
+Run locally:
+
+```bash
+~/.hermes/hermes-agent/venv/bin/python ~/.hermes/bin/crucible_mcp_server.py selftest
+```
+
+This produces:
+- one UNSAT schedule proof
+- one SAT schedule proof
+- one SAT dependency ordering proof
+- one SAT capacity proof
+
+## Scope guardrails
+
+Do not force every answer through the Crucible.
+Use it when the task is genuinely constraint-shaped.
+If the problem does not fit one of the templates, say so plainly.
--- a/metrics_helpers.py
+++ b/metrics_helpers.py
@@ -0,0 +1,139 @@
+from __future__ import annotations
+
+import math
+from datetime import datetime, timezone
+
+COST_TABLE = {
+    "claude-opus-4-6": {"input": 15.0, "output": 75.0},
+    "claude-sonnet-4-6": {"input": 3.0, "output": 15.0},
+    "claude-sonnet-4-20250514": {"input": 3.0, "output": 15.0},
+    "claude-haiku-4-20250414": {"input": 0.25, "output": 1.25},
+    "hermes4:14b": {"input": 0.0, "output": 0.0},
+    "hermes3:8b": {"input": 0.0, "output": 0.0},
+    "hermes3:latest": {"input": 0.0, "output": 0.0},
+    "qwen3:30b": {"input": 0.0, "output": 0.0},
+}
+
+
+def estimate_tokens_from_chars(char_count: int) -> int:
+    if char_count <= 0:
+        return 0
+    return math.ceil(char_count / 4)
+
+
+
+def build_local_metric_record(
+    *,
+    prompt: str,
+    response: str,
+    model: str,
+    caller: str,
+    session_id: str | None,
+    latency_s: float,
+    success: bool,
+    error: str | None = None,
+) -> dict:
+    input_tokens = estimate_tokens_from_chars(len(prompt))
+    output_tokens = estimate_tokens_from_chars(len(response))
+    total_tokens = input_tokens + output_tokens
+    tokens_per_second = round(total_tokens / latency_s, 2) if latency_s > 0 else None
+    return {
+        "timestamp": datetime.now(timezone.utc).isoformat(),
+        "model": model,
+        "caller": caller,
+        "prompt_len": len(prompt),
+        "response_len": len(response),
+        "session_id": session_id,
+        "latency_s": round(latency_s, 3),
+        "est_input_tokens": input_tokens,
+        "est_output_tokens": output_tokens,
+        "tokens_per_second": tokens_per_second,
+        "success": success,
+        "error": error,
+    }
+
+
+
+def summarize_local_metrics(records: list[dict]) -> dict:
+    total_calls = len(records)
+    successful_calls = sum(1 for record in records if record.get("success"))
+    failed_calls = total_calls - successful_calls
+    input_tokens = sum(int(record.get("est_input_tokens", 0) or 0) for record in records)
+    output_tokens = sum(int(record.get("est_output_tokens", 0) or 0) for record in records)
+    total_tokens = input_tokens + output_tokens
+    latencies = [float(record.get("latency_s", 0) or 0) for record in records if record.get("latency_s") is not None]
+    throughputs = [
+        float(record.get("tokens_per_second", 0) or 0)
+        for record in records
+        if record.get("tokens_per_second")
+    ]
+
+    by_caller: dict[str, dict] = {}
+    by_model: dict[str, dict] = {}
+    for record in records:
+        caller = record.get("caller", "unknown")
+        model = record.get("model", "unknown")
+        bucket_tokens = int(record.get("est_input_tokens", 0) or 0) + int(record.get("est_output_tokens", 0) or 0)
+        for key, table in ((caller, by_caller), (model, by_model)):
+            if key not in table:
+                table[key] = {"calls": 0, "successful_calls": 0, "failed_calls": 0, "total_tokens": 0}
+            table[key]["calls"] += 1
+            table[key]["total_tokens"] += bucket_tokens
+            if record.get("success"):
+                table[key]["successful_calls"] += 1
+            else:
+                table[key]["failed_calls"] += 1
+
+    return {
+        "total_calls": total_calls,
+        "successful_calls": successful_calls,
+        "failed_calls": failed_calls,
+        "input_tokens": input_tokens,
+        "output_tokens": output_tokens,
+        "total_tokens": total_tokens,
+        "avg_latency_s": round(sum(latencies) / len(latencies), 2) if latencies else None,
+        "avg_tokens_per_second": round(sum(throughputs) / len(throughputs), 2) if throughputs else None,
+        "by_caller": by_caller,
+        "by_model": by_model,
+    }
+
+
+
+def is_local_model(model: str | None) -> bool:
+    if not model:
+        return False
+    costs = COST_TABLE.get(model, {})
+    if costs.get("input", 1) == 0 and costs.get("output", 1) == 0:
+        return True
+    return ":" in model and "/" not in model and "claude" not in model
+
+
+
+def summarize_session_rows(rows: list[tuple]) -> dict:
+    total_sessions = 0
+    local_sessions = 0
+    cloud_sessions = 0
+    local_est_tokens = 0
+    cloud_est_tokens = 0
+    cloud_est_cost_usd = 0.0
+    for model, source, sessions, messages, tool_calls in rows:
+        sessions = int(sessions or 0)
+        messages = int(messages or 0)
+        est_tokens = messages * 500
+        total_sessions += sessions
+        if is_local_model(model):
+            local_sessions += sessions
+            local_est_tokens += est_tokens
+        else:
+            cloud_sessions += sessions
+            cloud_est_tokens += est_tokens
+            pricing = COST_TABLE.get(model, {"input": 5.0, "output": 15.0})
+            cloud_est_cost_usd += (est_tokens / 1_000_000) * ((pricing["input"] + pricing["output"]) / 2)
+    return {
+        "total_sessions": total_sessions,
+        "local_sessions": local_sessions,
+        "cloud_sessions": cloud_sessions,
+        "local_est_tokens": local_est_tokens,
+        "cloud_est_tokens": cloud_est_tokens,
+        "cloud_est_cost_usd": round(cloud_est_cost_usd, 4),
+    }
--- a/playbooks/verified-logic.yaml
+++ b/playbooks/verified-logic.yaml
@@ -0,0 +1,47 @@
+name: verified-logic
+description: >
+  Crucible-first playbook for tasks that require proof instead of plausible prose.
+  Use Z3-backed sidecar tools for scheduling, dependency ordering, capacity checks,
+  and consistency verification.
+
+model:
+  preferred: claude-opus-4-6
+  fallback: claude-sonnet-4-20250514
+  max_turns: 12
+  temperature: 0.1
+
+tools:
+  - mcp_crucible_schedule_tasks
+  - mcp_crucible_order_dependencies
+  - mcp_crucible_capacity_fit
+
+trigger:
+  manual: true
+
+steps:
+  - classify_problem
+  - choose_template
+  - translate_into_constraints
+  - verify_with_crucible
+  - report_sat_unsat_with_witness
+
+output: verified_result
+timeout_minutes: 5
+
+system_prompt: |
+  You are running the Crucible playbook.
+
+  Use this playbook for:
+  - scheduling and deadline feasibility
+  - dependency ordering and cycle checks
+  - capacity / resource allocation constraints
+  - consistency checks where a contradiction matters
+
+  RULES:
+  1. Do not bluff through logic.
+  2. Pick the narrowest Crucible template that fits the task.
+  3. Translate the user's question into structured constraints.
+  4. Call the Crucible tool.
+  5. If SAT, report the witness model clearly.
+  6. If UNSAT, say the constraints are impossible and explain which shape of constraint caused the contradiction.
+  7. If the task is not a good fit for these templates, say so plainly instead of pretending it was verified.
--- a/tasks.py
+++ b/tasks.py
@@ -5,12 +5,14 @@ import glob
 import os
 import subprocess
 import sys
+import time
 from datetime import datetime, timezone
 from pathlib import Path

 from orchestration import huey
 from huey import crontab
 from gitea_client import GiteaClient
+from metrics_helpers import build_local_metric_record

 HERMES_HOME = Path.home() / ".hermes"
 TIMMY_HOME = Path.home() / ".timmy"
@@ -57,6 +59,7 @@ def run_hermes_local(
    _model = model or HEARTBEAT_MODEL
    tagged = f"[{caller_tag}] {prompt}" if caller_tag else prompt

+    started = time.time()
    try:
        runner = """
 import io
@@ -167,15 +170,15 @@ sys.exit(exit_code)
        # Log to metrics jsonl
        METRICS_DIR.mkdir(parents=True, exist_ok=True)
        metrics_file = METRICS_DIR / f"local_{datetime.now().strftime('%Y%m%d')}.jsonl"
-        record = {
-            "timestamp": datetime.now(timezone.utc).isoformat(),
-            "model": _model,
-            "caller": caller_tag or "unknown",
-            "prompt_len": len(prompt),
-            "response_len": len(response),
-            "session_id": session_id,
-            "success": bool(response),
-        }
+        record = build_local_metric_record(
+            prompt=prompt,
+            response=response,
+            model=_model,
+            caller=caller_tag or "unknown",
+            session_id=session_id,
+            latency_s=time.time() - started,
+            success=bool(response),
+        )
        with open(metrics_file, "a") as f:
            f.write(json.dumps(record) + "\n")

@@ -190,13 +193,16 @@ sys.exit(exit_code)
        # Log failure
        METRICS_DIR.mkdir(parents=True, exist_ok=True)
        metrics_file = METRICS_DIR / f"local_{datetime.now().strftime('%Y%m%d')}.jsonl"
-        record = {
-            "timestamp": datetime.now(timezone.utc).isoformat(),
-            "model": _model,
-            "caller": caller_tag or "unknown",
-            "error": str(e),
-            "success": False,
-        }
+        record = build_local_metric_record(
+            prompt=prompt,
+            response="",
+            model=_model,
+            caller=caller_tag or "unknown",
+            session_id=None,
+            latency_s=time.time() - started,
+            success=False,
+            error=str(e),
+        )
        with open(metrics_file, "a") as f:
            f.write(json.dumps(record) + "\n")
        return None
--- a/tests/test_allegro_wizard_assets.py
+++ b/tests/test_allegro_wizard_assets.py
@@ -0,0 +1,27 @@
+from __future__ import annotations
+
+from pathlib import Path
+
+import yaml
+
+
+def test_allegro_config_targets_kimi_house() -> None:
+    config = yaml.safe_load(Path("wizards/allegro/config.yaml").read_text())
+
+    assert config["model"]["provider"] == "kimi-coding"
+    assert config["model"]["default"] == "kimi-for-coding"
+    assert config["platforms"]["api_server"]["extra"]["port"] == 8645
+
+
+def test_allegro_service_uses_isolated_home() -> None:
+    text = Path("wizards/allegro/hermes-allegro.service").read_text()
+
+    assert "HERMES_HOME=/root/wizards/allegro/home" in text
+    assert "hermes gateway run --replace" in text
+
+
+def test_deploy_script_requires_external_secret() -> None:
+    text = Path("bin/deploy-allegro-house.sh").read_text()
+
+    assert "~/.config/kimi/api_key" in text
+    assert "sk-kimi-" not in text
--- a/tests/test_metrics_helpers.py
+++ b/tests/test_metrics_helpers.py
@@ -0,0 +1,93 @@
+from metrics_helpers import (
+    build_local_metric_record,
+    estimate_tokens_from_chars,
+    summarize_local_metrics,
+    summarize_session_rows,
+)
+
+
+def test_estimate_tokens_from_chars_uses_simple_local_heuristic() -> None:
+    assert estimate_tokens_from_chars(0) == 0
+    assert estimate_tokens_from_chars(1) == 1
+    assert estimate_tokens_from_chars(4) == 1
+    assert estimate_tokens_from_chars(5) == 2
+    assert estimate_tokens_from_chars(401) == 101
+
+
+def test_build_local_metric_record_adds_token_and_throughput_estimates() -> None:
+    record = build_local_metric_record(
+        prompt="abcd" * 10,
+        response="xyz" * 20,
+        model="hermes4:14b",
+        caller="heartbeat_tick",
+        session_id="session-123",
+        latency_s=2.0,
+        success=True,
+    )
+
+    assert record["model"] == "hermes4:14b"
+    assert record["caller"] == "heartbeat_tick"
+    assert record["session_id"] == "session-123"
+    assert record["est_input_tokens"] == 10
+    assert record["est_output_tokens"] == 15
+    assert record["tokens_per_second"] == 12.5
+
+
+def test_summarize_local_metrics_rolls_up_tokens_and_latency() -> None:
+    records = [
+        {
+            "caller": "heartbeat_tick",
+            "model": "hermes4:14b",
+            "success": True,
+            "est_input_tokens": 100,
+            "est_output_tokens": 40,
+            "latency_s": 2.0,
+            "tokens_per_second": 20.0,
+        },
+        {
+            "caller": "heartbeat_tick",
+            "model": "hermes4:14b",
+            "success": False,
+            "est_input_tokens": 30,
+            "est_output_tokens": 0,
+            "latency_s": 1.0,
+        },
+        {
+            "caller": "session_export",
+            "model": "hermes3:8b",
+            "success": True,
+            "est_input_tokens": 50,
+            "est_output_tokens": 25,
+            "latency_s": 5.0,
+            "tokens_per_second": 5.0,
+        },
+    ]
+
+    summary = summarize_local_metrics(records)
+
+    assert summary["total_calls"] == 3
+    assert summary["successful_calls"] == 2
+    assert summary["failed_calls"] == 1
+    assert summary["input_tokens"] == 180
+    assert summary["output_tokens"] == 65
+    assert summary["total_tokens"] == 245
+    assert summary["avg_latency_s"] == 2.67
+    assert summary["avg_tokens_per_second"] == 12.5
+    assert summary["by_caller"]["heartbeat_tick"]["total_tokens"] == 170
+    assert summary["by_model"]["hermes4:14b"]["failed_calls"] == 1
+
+
+def test_summarize_session_rows_separates_local_and_cloud_estimates() -> None:
+    rows = [
+        ("hermes4:14b", "local", 2, 10, 4),
+        ("claude-sonnet-4-6", "cli", 3, 9, 2),
+    ]
+
+    summary = summarize_session_rows(rows)
+
+    assert summary["total_sessions"] == 5
+    assert summary["local_sessions"] == 2
+    assert summary["cloud_sessions"] == 3
+    assert summary["local_est_tokens"] == 5000
+    assert summary["cloud_est_tokens"] == 4500
+    assert summary["cloud_est_cost_usd"] > 0
--- a/tests/test_proof_policy_docs.py
+++ b/tests/test_proof_policy_docs.py
@@ -0,0 +1,17 @@
+from pathlib import Path
+
+
+def test_contributing_sets_hard_proof_rule() -> None:
+    doc = Path("CONTRIBUTING.md").read_text()
+
+    assert "visual changes require screenshot proof" in doc
+    assert "do not commit screenshots or binary media to Gitea backup" in doc
+    assert "CLI/verifiable changes must cite the exact command output, log path, or world-state proof" in doc
+    assert "no proof, no merge" in doc
+
+
+def test_readme_points_to_proof_standard() -> None:
+    readme = Path("README.md").read_text()
+
+    assert "Proof Standard" in readme
+    assert "CONTRIBUTING.md" in readme
--- a/wizards/allegro/README.md
+++ b/wizards/allegro/README.md
@@ -0,0 +1,16 @@
+# Allegro wizard house
+
+Allegro is the third wizard house.
+
+Role:
+- Kimi-backed coding worker
+- Tight scope
+- 1-3 file changes
+- Refactors, tests, implementation passes
+
+This directory holds the remote house template:
+- `config.yaml` — Hermes house config
+- `hermes-allegro.service` — systemd unit
+
+Secrets do not live here.
+`KIMI_API_KEY` must be injected at deploy time into `/root/wizards/allegro/home/.env`.
--- a/wizards/allegro/config.yaml
+++ b/wizards/allegro/config.yaml
@@ -0,0 +1,61 @@
+model:
+  default: kimi-for-coding
+  provider: kimi-coding
+toolsets:
+  - all
+agent:
+  max_turns: 30
+  reasoning_effort: xhigh
+  verbose: false
+terminal:
+  backend: local
+  cwd: .
+  timeout: 180
+  persistent_shell: true
+browser:
+  inactivity_timeout: 120
+  command_timeout: 30
+  record_sessions: false
+display:
+  compact: false
+  personality: ''
+  resume_display: full
+  busy_input_mode: interrupt
+  bell_on_complete: false
+  show_reasoning: false
+  streaming: false
+  show_cost: false
+  tool_progress: all
+memory:
+  memory_enabled: true
+  user_profile_enabled: true
+  memory_char_limit: 2200
+  user_char_limit: 1375
+  nudge_interval: 10
+  flush_min_turns: 6
+approvals:
+  mode: manual
+security:
+  redact_secrets: true
+  tirith_enabled: false
+platforms:
+  api_server:
+    enabled: true
+    extra:
+      host: 127.0.0.1
+      port: 8645
+session_reset:
+  mode: none
+  idle_minutes: 0
+skills:
+  creation_nudge_interval: 15
+system_prompt_suffix: |
+  You are Allegro, the Kimi-backed third wizard house.
+  Your soul is defined in SOUL.md — read it, live it.
+  Hermes is your harness.
+  Kimi Code is your primary provider.
+  You speak plainly. You prefer short sentences. Brevity is a kindness.
+
+  Work best on tight coding tasks: 1-3 file changes, refactors, tests, and implementation passes.
+  Refusal over fabrication. If you do not know, say so.
+  Sovereignty and service always.
--- a/wizards/allegro/hermes-allegro.service
+++ b/wizards/allegro/hermes-allegro.service
@@ -0,0 +1,16 @@
+[Unit]
+Description=Hermes Allegro Wizard House
+After=network-online.target
+Wants=network-online.target
+
+[Service]
+Type=simple
+WorkingDirectory=/root/wizards/allegro/hermes-agent
+Environment=HERMES_HOME=/root/wizards/allegro/home
+EnvironmentFile=/root/wizards/allegro/home/.env
+ExecStart=/root/wizards/allegro/hermes-agent/.venv/bin/hermes gateway run --replace
+Restart=always
+RestartSec=10
+
+[Install]
+WantedBy=multi-user.target
Author	SHA1	Message	Date
Google AI Agent	00d8c62df0	resolve: merge main into crucible branch — keep config base + add Z3 sidecar Resolved 3 conflicts: - config.yaml: kept main's llama.cpp/fallback_model + added Crucible system prompt and MCP server - README.md: kept main's clean bin/ listing + added crucible_mcp_server.py and docs - deploy.sh: kept PR's extended deploy flags (--restart-gateway) + Z3 dependency check Signed-off-by: gemini <gemini@hermes.local>	2026-03-30 18:19:41 -04:00
Timmy Time	877425bde4	feat: add Allegro Kimi wizard house assets (#91 )	2026-03-29 22:22:24 +00:00
Alexander Whitestone	2d3cea8127	feat(crucible): add Z3 sidecar MCP verifier - add crucible_mcp_server.py with Z3-backed proof tools - ship scheduling, dependency ordering, and capacity templates - log SAT/UNSAT proof trails to ~/.hermes/logs/crucible/ - wire crucible MCP server into config.yaml - teach deploy.sh to ensure z3-solver is installed - add verified-logic playbook and docs for first cut	2026-03-28 20:52:47 -04:00
Timmy Time	34e01f0986	feat: add local-vs-cloud token and throughput metrics (#85 )	2026-03-28 14:24:12 +00:00
Timmy Time	d955d2b9f1	docs: codify merge proof standard (#84 )	2026-03-28 14:03:35 +00:00