Compare commits: 1 commit (feature/is... → feature/mo..., SHA1 215329146a)

docs/protocol/morrowind-perception-command-spec.md (new file, 312 lines)
@@ -0,0 +1,312 @@
# Morrowind Perception/Command Protocol Specification

**Version:** 1.0.0
**Status:** Draft
**Authors:** Timmy Infrastructure Team
**Date:** 2026-03-21

---

## 1. Overview

This document defines the **engine-agnostic Perception/Command protocol** used by Timmy's
heartbeat loop to observe the game world and issue commands. The protocol is designed
around the **Falsework Rule**: TES3MP (Morrowind) is scaffolding. If the engine swaps,
only the bridge and perception script change — the heartbeat, reasoning, and journal
remain sovereign.

### 1.1 Design Principles

- **Engine-agnostic**: Schemas reference abstract concepts (cells, entities, quests), not
  Morrowind-specific internals.
- **Versioned**: Every payload carries a `protocol_version` so consumers can negotiate
  compatibility.
- **Typed at the boundary**: Pydantic v2 models enforce validation on both the producer
  (bridge) and consumer (heartbeat) side.
- **Logged by default**: Every command is persisted to the SQLite command log for
  training-data extraction (see Issue #855).

---

## 2. Protocol Version Strategy

| Field | Type | Description |
| ------------------ | ------ | ------------------------------ |
| `protocol_version` | string | SemVer string (e.g. `"1.0.0"`) |

### Compatibility Rules

- **Patch** bump (1.0.x): additive fields with defaults — fully backward-compatible.
- **Minor** bump (1.x.0): new optional endpoints or enum values — old clients still work.
- **Major** bump (x.0.0): breaking schema change — requires coordinated upgrade of bridge
  and heartbeat.

Consumers MUST reject payloads whose major version exceeds their own.
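The rejection rule can be sketched as a small helper (the name `is_compatible` is hypothetical; the real negotiation lives in the bridge and heartbeat code):

```python
# Compare only the major components of two SemVer strings.
# A payload is acceptable unless its major version is newer than ours.
def is_compatible(payload_version: str, own_version: str) -> bool:
    payload_major = int(payload_version.split(".")[0])
    own_major = int(own_version.split(".")[0])
    return payload_major <= own_major
```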
---

## 3. Perception Output Schema

Returned by `GET /perception`. Represents a single snapshot of the game world as observed
by the bridge.

```json
{
  "protocol_version": "1.0.0",
  "timestamp": "2026-03-21T14:30:00Z",
  "agent_id": "timmy",
  "location": {
    "cell": "Balmora",
    "x": 1024.5,
    "y": -512.3,
    "z": 64.0,
    "interior": false
  },
  "health": {
    "current": 85,
    "max": 100
  },
  "nearby_entities": [
    {
      "entity_id": "npc_001",
      "name": "Caius Cosades",
      "entity_type": "npc",
      "distance": 12.5,
      "disposition": 65
    }
  ],
  "inventory_summary": {
    "gold": 150,
    "item_count": 23,
    "encumbrance_pct": 0.45
  },
  "active_quests": [
    {
      "quest_id": "mq_01",
      "name": "Report to Caius Cosades",
      "stage": 10
    }
  ],
  "environment": {
    "time_of_day": "afternoon",
    "weather": "clear",
    "is_combat": false,
    "is_dialogue": false
  },
  "raw_engine_data": {}
}
```
### 3.1 Field Reference

| Field | Type | Required | Description |
| ------------------- | ----------------- | -------- | ---------------------------------------------------------- |
| `protocol_version` | string | yes | Protocol SemVer |
| `timestamp` | ISO 8601 datetime | yes | When the snapshot was taken |
| `agent_id` | string | yes | Which agent this perception belongs to |
| `location.cell` | string | yes | Current cell/zone name |
| `location.x/y/z` | float | yes | World coordinates |
| `location.interior` | bool | yes | Whether the agent is indoors |
| `health.current` | int (0–max) | yes | Current health |
| `health.max` | int (>0) | yes | Maximum health |
| `nearby_entities` | array | yes | Entities within perception radius (may be empty) |
| `inventory_summary` | object | yes | Lightweight inventory overview |
| `active_quests` | array | yes | Currently tracked quests |
| `environment` | object | yes | World-state flags |
| `raw_engine_data` | object | no | Opaque engine-specific blob (not relied upon by heartbeat) |

### 3.2 Entity Types

The `entity_type` field uses a controlled vocabulary:

| Value | Description |
| ----------- | ------------------------ |
| `npc` | Non-player character |
| `creature` | Hostile or neutral mob |
| `item` | Pickup-able world item |
| `door` | Door or transition |
| `container` | Lootable container |
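The canonical models are Pydantic v2 (see §8). As a rough, dependency-free illustration of the §3.1 shapes and ranges, the core pieces could look like the following (stdlib dataclasses, hypothetical class names):

```python
from dataclasses import dataclass, field


@dataclass
class Health:
    current: int
    max: int

    def __post_init__(self) -> None:
        # Enforce the §3.1 ranges: max > 0 and 0 <= current <= max.
        if self.max <= 0 or not (0 <= self.current <= self.max):
            raise ValueError("health out of range")


@dataclass
class Location:
    cell: str
    x: float
    y: float
    z: float
    interior: bool


@dataclass
class PerceptionSnapshot:
    protocol_version: str
    timestamp: str
    agent_id: str
    location: Location
    health: Health
    nearby_entities: list = field(default_factory=list)
    raw_engine_data: dict = field(default_factory=dict)  # opaque, optional
```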
---

## 4. Command Input Schema

Sent via `POST /command`. Represents a single action the agent wants to take in the world.

```json
{
  "protocol_version": "1.0.0",
  "timestamp": "2026-03-21T14:30:01Z",
  "agent_id": "timmy",
  "command": "move_to",
  "params": {
    "target_cell": "Balmora",
    "target_x": 1050.0,
    "target_y": -500.0
  },
  "reasoning": "Moving closer to Caius Cosades to begin the main quest dialogue.",
  "episode_id": "ep_20260321_001",
  "context": {
    "perception_timestamp": "2026-03-21T14:30:00Z",
    "heartbeat_cycle": 42
  }
}
```
### 4.1 Field Reference

| Field | Type | Required | Description |
| ------------------ | ----------------- | -------- | ----------------------------------------------------- |
| `protocol_version` | string | yes | Protocol SemVer |
| `timestamp` | ISO 8601 datetime | yes | When the command was issued |
| `agent_id` | string | yes | Which agent is issuing the command |
| `command` | string (enum) | yes | Command type (see §4.2) |
| `params` | object | yes | Command-specific parameters (may be empty `{}`) |
| `reasoning` | string | yes | Natural-language explanation of *why* this command |
| `episode_id` | string | no | Groups commands into training episodes |
| `context` | object | no | Metadata linking command to its triggering perception |

### 4.2 Command Types

| Command | Description | Key Params |
| --------------- | ------------------------------------- | ---------------------------------- |
| `move_to` | Navigate to coordinates or entity | `target_cell`, `target_x/y/z` |
| `interact` | Interact with entity (talk, activate) | `entity_id`, `interaction_type` |
| `use_item` | Use an inventory item | `item_id`, `target_entity_id?` |
| `wait` | Wait/idle for a duration | `duration_seconds` |
| `combat_action` | Perform a combat action | `action_type`, `target_entity_id` |
| `dialogue` | Choose a dialogue option | `entity_id`, `topic`, `choice_idx` |
| `journal_note` | Write an internal journal observation | `content`, `tags` |
| `noop` | Heartbeat tick with no action | — |
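A minimal consumer-side check of the command vocabulary might look like this (a sketch only; the real validation is done by the Pydantic `CommandInput` model):

```python
# The §4.2 controlled vocabulary as a plain set.
COMMAND_TYPES = {
    "move_to", "interact", "use_item", "wait",
    "combat_action", "dialogue", "journal_note", "noop",
}


def check_command_type(payload: dict) -> None:
    # An unknown type maps to INVALID_COMMAND (HTTP 400) in §5.2.
    cmd = payload.get("command")
    if cmd not in COMMAND_TYPES:
        raise ValueError(f"unknown command type: {cmd!r}")
```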
---

## 5. API Contracts

### 5.1 `GET /perception`

Returns the latest perception snapshot.

**Response:** `200 OK` with `PerceptionOutput` JSON body.

**Error Responses:**

| Status | Code | Description |
| ------ | -------------------- | ----------------------------------- |
| 503 | `BRIDGE_UNAVAILABLE` | Game bridge is not connected |
| 504 | `PERCEPTION_TIMEOUT` | Bridge did not respond in time |
| 422 | `SCHEMA_MISMATCH` | Bridge returned incompatible schema |

### 5.2 `POST /command`

Submit a command for the agent to execute.

**Request:** `CommandInput` JSON body.

**Response:** `202 Accepted`

```json
{
  "status": "accepted",
  "command_id": "cmd_abc123",
  "logged": true
}
```

**Error Responses:**

| Status | Code | Description |
| ------ | -------------------- | ----------------------------------- |
| 400 | `INVALID_COMMAND` | Command type not recognized |
| 400 | `VALIDATION_ERROR` | Payload fails Pydantic validation |
| 409 | `COMMAND_CONFLICT` | Agent is busy executing another cmd |
| 503 | `BRIDGE_UNAVAILABLE` | Game bridge is not connected |
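On the client side, the `POST /command` outcomes split into accept/retry/fail buckets (a sketch using the retryability classification from §7.2; function name is illustrative):

```python
# Error codes marked retryable in §7.2.
RETRYABLE_CODES = {
    "BRIDGE_UNAVAILABLE", "PERCEPTION_TIMEOUT",
    "COMMAND_CONFLICT", "INTERNAL_ERROR",
}


def classify_response(status: int, body: dict) -> str:
    """Map a POST /command response to 'accepted', 'retry', or 'fail'."""
    if status == 202:
        return "accepted"
    code = body.get("error", {}).get("code", "")
    return "retry" if code in RETRYABLE_CODES else "fail"
```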
### 5.3 `GET /morrowind/status`

Health-check endpoint for the Morrowind bridge.

**Response:** `200 OK`

```json
{
  "bridge_connected": true,
  "engine": "tes3mp",
  "protocol_version": "1.0.0",
  "uptime_seconds": 3600,
  "last_perception_at": "2026-03-21T14:30:00Z"
}
```

---
## 6. Engine-Swap Documentation (The Falsework Rule)

### What Changes

| Component | Changes on Engine Swap? | Notes |
| ---------------------- | ----------------------- | ---------------------------------------------------- |
| Bridge process | **YES** — replaced | New bridge speaks same protocol to new engine |
| Perception Lua script | **YES** — replaced | New engine's scripting language/API |
| `PerceptionOutput` | NO | Schema is engine-agnostic |
| `CommandInput` | NO | Schema is engine-agnostic |
| Heartbeat loop | NO | Consumes `PerceptionOutput`, emits `Command` |
| Reasoning/LLM layer | NO | Operates on abstract perception data |
| Journal system | NO | Writes `journal_note` commands |
| Command log + training | NO | Logs all commands regardless of engine |
| Dashboard WebSocket | NO | Separate protocol (`src/infrastructure/protocol.py`) |

### Swap Procedure

1. Implement new bridge that serves `GET /perception` and accepts `POST /command`.
2. Update `raw_engine_data` field documentation for the new engine.
3. Extend `entity_type` enum if the new engine has novel entity categories.
4. Bump `protocol_version` minor (or major if schema changes are required).
5. Run integration tests against the new bridge.
---

## 7. Error Handling Specification

### 7.1 Error Response Format

All error responses follow a consistent structure:

```json
{
  "error": {
    "code": "BRIDGE_UNAVAILABLE",
    "message": "Human-readable error description",
    "details": {},
    "timestamp": "2026-03-21T14:30:00Z"
  }
}
```

### 7.2 Error Codes

| Code | HTTP Status | Retry? | Description |
| -------------------- | ----------- | ------ | ---------------------------------------- |
| `BRIDGE_UNAVAILABLE` | 503 | yes | Bridge process not connected |
| `PERCEPTION_TIMEOUT` | 504 | yes | Bridge did not respond within deadline |
| `SCHEMA_MISMATCH` | 422 | no | Protocol version incompatibility |
| `INVALID_COMMAND` | 400 | no | Unknown command type |
| `VALIDATION_ERROR` | 400 | no | Pydantic validation failed |
| `COMMAND_CONFLICT` | 409 | yes | Agent busy — retry after current command |
| `INTERNAL_ERROR` | 500 | yes | Unexpected server error |

### 7.3 Retry Policy

Clients SHOULD implement exponential backoff for retryable errors:

- Initial delay: 100ms
- Max delay: 5s
- Max retries: 5
- Jitter: ±50ms
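The policy above can be sketched as a delay generator (illustrative; actual client code would sleep between attempts):

```python
import random


def backoff_delays(max_retries: int = 5, initial: float = 0.1,
                   cap: float = 5.0, jitter: float = 0.05):
    """Yield one delay (in seconds) per retry: exponential, capped, jittered."""
    delay = initial
    for _ in range(max_retries):
        # Cap the exponential delay at 5s, then apply ±50ms jitter.
        yield min(delay, cap) + random.uniform(-jitter, jitter)
        delay *= 2
```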
---

## 8. Appendix: Pydantic Model Reference

The canonical Pydantic v2 models live in `src/infrastructure/morrowind/schemas.py`.
These models serve as both runtime validation and living documentation of this spec.
Any change to this spec document MUST be reflected in the Pydantic models, and vice versa.
@@ -20,6 +20,7 @@ if config.config_file_name is not None:
# target_metadata = mymodel.Base.metadata
from src.dashboard.models.database import Base
from src.dashboard.models.calm import Task, JournalEntry
from src.infrastructure.morrowind.command_log import CommandLog  # noqa: F401
target_metadata = Base.metadata

# other values from the config, defined by the needs of env.py,
migrations/versions/a1b2c3d4e5f6_create_command_log_table.py (new file, 89 lines)
@@ -0,0 +1,89 @@
"""Create command_log table

Revision ID: a1b2c3d4e5f6
Revises: 0093c15b4bbf
Create Date: 2026-03-21 12:00:00.000000

"""

from typing import Sequence, Union

import sqlalchemy as sa
from alembic import op

# revision identifiers, used by Alembic.
revision: str = "a1b2c3d4e5f6"
down_revision: Union[str, Sequence[str], None] = "0093c15b4bbf"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    """Upgrade schema."""
    op.create_table(
        "command_log",
        sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
        sa.Column("timestamp", sa.DateTime(), nullable=False),
        sa.Column("command", sa.String(length=64), nullable=False),
        sa.Column("params", sa.Text(), nullable=False, server_default="{}"),
        sa.Column("reasoning", sa.Text(), nullable=False, server_default=""),
        sa.Column(
            "perception_snapshot", sa.Text(), nullable=False, server_default="{}"
        ),
        sa.Column("outcome", sa.Text(), nullable=True),
        sa.Column(
            "agent_id",
            sa.String(length=64),
            nullable=False,
            server_default="timmy",
        ),
        sa.Column("episode_id", sa.String(length=128), nullable=True),
        sa.Column("cell", sa.String(length=255), nullable=True),
        sa.Column(
            "protocol_version",
            sa.String(length=16),
            nullable=False,
            server_default="1.0.0",
        ),
        sa.Column("created_at", sa.DateTime(), nullable=False),
        sa.PrimaryKeyConstraint("id"),
    )
    op.create_index(
        op.f("ix_command_log_timestamp"), "command_log", ["timestamp"], unique=False
    )
    op.create_index(
        op.f("ix_command_log_command"), "command_log", ["command"], unique=False
    )
    op.create_index(
        op.f("ix_command_log_agent_id"), "command_log", ["agent_id"], unique=False
    )
    op.create_index(
        op.f("ix_command_log_episode_id"),
        "command_log",
        ["episode_id"],
        unique=False,
    )
    op.create_index(
        op.f("ix_command_log_cell"), "command_log", ["cell"], unique=False
    )
    op.create_index(
        "ix_command_log_cmd_cell", "command_log", ["command", "cell"], unique=False
    )
    op.create_index(
        "ix_command_log_episode",
        "command_log",
        ["episode_id", "timestamp"],
        unique=False,
    )


def downgrade() -> None:
    """Downgrade schema."""
    op.drop_index("ix_command_log_episode", table_name="command_log")
    op.drop_index("ix_command_log_cmd_cell", table_name="command_log")
    op.drop_index(op.f("ix_command_log_cell"), table_name="command_log")
    op.drop_index(op.f("ix_command_log_episode_id"), table_name="command_log")
    op.drop_index(op.f("ix_command_log_agent_id"), table_name="command_log")
    op.drop_index(op.f("ix_command_log_command"), table_name="command_log")
    op.drop_index(op.f("ix_command_log_timestamp"), table_name="command_log")
    op.drop_table("command_log")
@@ -50,7 +50,6 @@ sounddevice = { version = ">=0.4.6", optional = true }
sentence-transformers = { version = ">=2.0.0", optional = true }
numpy = { version = ">=1.24.0", optional = true }
requests = { version = ">=2.31.0", optional = true }
trafilatura = { version = ">=1.6.0", optional = true }
GitPython = { version = ">=3.1.40", optional = true }
pytest = { version = ">=8.0.0", optional = true }
pytest-asyncio = { version = ">=0.24.0", optional = true }
@@ -68,7 +67,6 @@ voice = ["pyttsx3", "openai-whisper", "piper-tts", "sounddevice"]
celery = ["celery"]
embeddings = ["sentence-transformers", "numpy"]
git = ["GitPython"]
research = ["requests", "trafilatura"]
dev = ["pytest", "pytest-asyncio", "pytest-cov", "pytest-timeout", "pytest-randomly", "pytest-xdist", "selenium"]

[tool.poetry.group.dev.dependencies]
@@ -17,23 +17,8 @@ REPO_ROOT = Path(__file__).resolve().parent.parent
RETRO_FILE = REPO_ROOT / ".loop" / "retro" / "cycles.jsonl"
SUMMARY_FILE = REPO_ROOT / ".loop" / "retro" / "summary.json"


def _get_gitea_api() -> str:
    """Read Gitea API URL from env var, then ~/.hermes/gitea_api file, then default."""
    # Check env vars first (TIMMY_GITEA_API is preferred, GITEA_API for compatibility)
    api_url = os.environ.get("TIMMY_GITEA_API") or os.environ.get("GITEA_API")
    if api_url:
        return api_url
    # Check ~/.hermes/gitea_api file
    api_file = Path.home() / ".hermes" / "gitea_api"
    if api_file.exists():
        return api_file.read_text().strip()
    # Default fallback
    return "http://localhost:3000/api/v1"


GITEA_API = _get_gitea_api()
REPO_SLUG = os.environ.get("REPO_SLUG", "rockachopa/Timmy-time-dashboard")
GITEA_API = "http://localhost:3000/api/v1"
REPO_SLUG = "rockachopa/Timmy-time-dashboard"
TOKEN_FILE = Path.home() / ".hermes" / "gitea_token"

TAG_RE = re.compile(r"\[([^\]]+)\]")
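The `_get_gitea_api` lookup order (env var, then `~/.hermes/gitea_api`, then default) is easiest to see with the environment injected. A hypothetical, test-friendly variant of the same logic:

```python
DEFAULT_API = "http://localhost:3000/api/v1"


def resolve_gitea_api(env: dict, file_value=None) -> str:
    # TIMMY_GITEA_API is preferred; GITEA_API is kept for compatibility.
    url = env.get("TIMMY_GITEA_API") or env.get("GITEA_API")
    if url:
        return url
    # Fall back to the contents of ~/.hermes/gitea_api, if present.
    if file_value:
        return file_value.strip()
    return DEFAULT_API
```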
@@ -277,8 +277,6 @@ def main() -> None:
            args.tests_passed = int(cr["tests_passed"])
        if not args.notes and cr.get("notes"):
            args.notes = cr["notes"]
        # Consume-once: delete after reading so stale results don't poison future cycles
        CYCLE_RESULT_FILE.unlink(missing_ok=True)

    # Auto-detect issue from branch when not explicitly provided
    if args.issue is None:
@@ -1,83 +0,0 @@
#!/bin/bash
# Gitea backup script — run on the VPS before any hardening changes.
# Usage: sudo bash scripts/gitea_backup.sh [off-site-dest]
#
# off-site-dest: optional rsync/scp destination for off-site copy
#                e.g. user@backup-host:/backups/gitea/
#
# Refs: #971, #990

set -euo pipefail

BACKUP_DIR="/opt/gitea/backups"
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
GITEA_CONF="/etc/gitea/app.ini"
GITEA_WORK_DIR="/var/lib/gitea"
OFFSITE_DEST="${1:-}"

echo "=== Gitea Backup — $TIMESTAMP ==="

# Ensure backup directory exists
mkdir -p "$BACKUP_DIR"
cd "$BACKUP_DIR"

# Run the dump
echo "[1/4] Running gitea dump..."
gitea dump -c "$GITEA_CONF"

# Find the newest zip (gitea dump names it gitea-dump-*.zip)
BACKUP_FILE=$(ls -t "$BACKUP_DIR"/gitea-dump-*.zip 2>/dev/null | head -1)

if [ -z "$BACKUP_FILE" ]; then
    echo "ERROR: No backup zip found in $BACKUP_DIR"
    exit 1
fi

BACKUP_SIZE=$(stat -c%s "$BACKUP_FILE" 2>/dev/null || stat -f%z "$BACKUP_FILE")
echo "[2/4] Backup created: $BACKUP_FILE ($BACKUP_SIZE bytes)"

if [ "$BACKUP_SIZE" -eq 0 ]; then
    echo "ERROR: Backup file is 0 bytes"
    exit 1
fi

# Lock down permissions
chmod 600 "$BACKUP_FILE"

# Verify contents
echo "[3/4] Verifying backup contents..."
CONTENTS=$(unzip -l "$BACKUP_FILE" 2>/dev/null || true)

check_component() {
    if echo "$CONTENTS" | grep -q "$1"; then
        echo "  OK: $2"
    else
        echo "  WARN: $2 not found in backup"
    fi
}

check_component "gitea-db.sql" "Database dump"
check_component "gitea-repo" "Repositories"
check_component "custom" "Custom config"
check_component "app.ini" "app.ini"

# Off-site copy
if [ -n "$OFFSITE_DEST" ]; then
    echo "[4/4] Copying to off-site: $OFFSITE_DEST"
    rsync -avz "$BACKUP_FILE" "$OFFSITE_DEST"
    echo "  Off-site copy complete."
else
    echo "[4/4] No off-site destination provided. Skipping."
    echo "  To copy later: scp $BACKUP_FILE user@backup-host:/backups/gitea/"
fi

echo ""
echo "=== Backup complete ==="
echo "File: $BACKUP_FILE"
echo "Size: $BACKUP_SIZE bytes"
echo ""
echo "To verify restore on a clean instance:"
echo "  1. Copy zip to test machine"
echo "  2. unzip $BACKUP_FILE"
echo "  3. gitea restore --from <extracted-dir> -c /etc/gitea/app.ini"
echo "  4. Verify repos and DB are intact"
@@ -30,22 +30,7 @@ IDLE_STATE_FILE = REPO_ROOT / ".loop" / "idle_state.json"
CYCLE_RESULT_FILE = REPO_ROOT / ".loop" / "cycle_result.json"
TOKEN_FILE = Path.home() / ".hermes" / "gitea_token"


def _get_gitea_api() -> str:
    """Read Gitea API URL from env var, then ~/.hermes/gitea_api file, then default."""
    # Check env vars first (TIMMY_GITEA_API is preferred, GITEA_API for compatibility)
    api_url = os.environ.get("TIMMY_GITEA_API") or os.environ.get("GITEA_API")
    if api_url:
        return api_url
    # Check ~/.hermes/gitea_api file
    api_file = Path.home() / ".hermes" / "gitea_api"
    if api_file.exists():
        return api_file.read_text().strip()
    # Default fallback
    return "http://localhost:3000/api/v1"


GITEA_API = _get_gitea_api()
GITEA_API = os.environ.get("GITEA_API", "http://localhost:3000/api/v1")
REPO_SLUG = os.environ.get("REPO_SLUG", "rockachopa/Timmy-time-dashboard")

# Default cycle duration in seconds (5 min); stale threshold = 2× this
@@ -202,11 +187,7 @@ def load_queue() -> list[dict]:
        # Persist the cleaned queue so stale entries don't recur
        _save_cleaned_queue(data, open_numbers)
        return ready
    except json.JSONDecodeError as exc:
        print(f"[loop-guard] WARNING: Corrupt queue.json ({exc}) — returning empty queue")
        return []
    except OSError as exc:
        print(f"[loop-guard] WARNING: Cannot read queue.json ({exc}) — returning empty queue")
    except (json.JSONDecodeError, OSError):
        return []
@@ -20,28 +20,11 @@ from datetime import datetime, timezone
from pathlib import Path

# ── Config ──────────────────────────────────────────────────────────────


def _get_gitea_api() -> str:
    """Read Gitea API URL from env var, then ~/.hermes/gitea_api file, then default."""
    # Check env vars first (TIMMY_GITEA_API is preferred, GITEA_API for compatibility)
    api_url = os.environ.get("TIMMY_GITEA_API") or os.environ.get("GITEA_API")
    if api_url:
        return api_url
    # Check ~/.hermes/gitea_api file
    api_file = Path.home() / ".hermes" / "gitea_api"
    if api_file.exists():
        return api_file.read_text().strip()
    # Default fallback
    return "http://localhost:3000/api/v1"


GITEA_API = _get_gitea_api()
GITEA_API = os.environ.get("GITEA_API", "http://localhost:3000/api/v1")
REPO_SLUG = os.environ.get("REPO_SLUG", "rockachopa/Timmy-time-dashboard")
TOKEN_FILE = Path.home() / ".hermes" / "gitea_token"
REPO_ROOT = Path(__file__).resolve().parent.parent
QUEUE_FILE = REPO_ROOT / ".loop" / "queue.json"
QUEUE_BACKUP_FILE = REPO_ROOT / ".loop" / "queue.json.bak"
RETRO_FILE = REPO_ROOT / ".loop" / "retro" / "triage.jsonl"
QUARANTINE_FILE = REPO_ROOT / ".loop" / "quarantine.json"
CYCLE_RETRO_FILE = REPO_ROOT / ".loop" / "retro" / "cycles.jsonl"
@@ -343,38 +326,9 @@ def run_triage() -> list[dict]:
    ready = [s for s in scored if s["ready"]]
    not_ready = [s for s in scored if not s["ready"]]

    # Save backup before writing (if current file exists and is valid)
    if QUEUE_FILE.exists():
        try:
            json.loads(QUEUE_FILE.read_text())  # Validate current file
            QUEUE_BACKUP_FILE.write_text(QUEUE_FILE.read_text())
        except (json.JSONDecodeError, OSError):
            pass  # Current file is corrupt, don't overwrite backup

    # Write new queue file
    QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
    QUEUE_FILE.write_text(json.dumps(ready, indent=2) + "\n")

    # Validate the write by re-reading and parsing
    try:
        json.loads(QUEUE_FILE.read_text())
    except (json.JSONDecodeError, OSError) as exc:
        print(f"[triage] ERROR: queue.json validation failed: {exc}", file=sys.stderr)
        # Restore from backup if available
        if QUEUE_BACKUP_FILE.exists():
            try:
                backup_data = QUEUE_BACKUP_FILE.read_text()
                json.loads(backup_data)  # Validate backup
                QUEUE_FILE.write_text(backup_data)
                print(f"[triage] Restored queue.json from backup")
            except (json.JSONDecodeError, OSError) as restore_exc:
                print(f"[triage] ERROR: Backup restore failed: {restore_exc}", file=sys.stderr)
                # Write empty list as last resort
                QUEUE_FILE.write_text("[]\n")
        else:
            # No backup, write empty list
            QUEUE_FILE.write_text("[]\n")

    # Write retro entry
    retro_entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
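The backup-then-validate write in `run_triage` condenses to this pattern (a sketch with hypothetical names, not the repo's exact code): keep a known-good backup, write, re-validate the write, and fall back on failure.

```python
import json
from pathlib import Path


def safe_write_queue(queue_file: Path, backup_file: Path, items: list) -> None:
    # Back up the current file only if it parses as valid JSON.
    if queue_file.exists():
        try:
            current = queue_file.read_text()
            json.loads(current)
            backup_file.write_text(current)
        except (json.JSONDecodeError, OSError):
            pass  # corrupt current file: keep the last good backup
    # Write, then verify the write round-trips; restore on failure.
    queue_file.write_text(json.dumps(items, indent=2) + "\n")
    try:
        json.loads(queue_file.read_text())
    except (json.JSONDecodeError, OSError):
        if backup_file.exists():
            queue_file.write_text(backup_file.read_text())
        else:
            queue_file.write_text("[]\n")
```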
@@ -1,67 +0,0 @@
---
name: Architecture Spike
type: research
typical_query_count: 2-4
expected_output_length: 600-1200 words
cascade_tier: groq_preferred
description: >
  Investigate how to connect two systems or components. Produces an integration
  architecture with sequence diagram, key decisions, and a proof-of-concept outline.
---

# Architecture Spike: Connect {system_a} to {system_b}

## Context

We need to integrate **{system_a}** with **{system_b}** in the context of
**{project_context}**. This spike answers: what is the best way to wire them
together, and what are the trade-offs?

## Constraints

- Prefer approaches that avoid adding new infrastructure dependencies.
- The integration should be **{sync_or_async}** (synchronous / asynchronous).
- Must work within: {environment_constraints}.

## Research Steps

1. Identify the APIs / protocols exposed by both systems.
2. List all known integration patterns (direct API, message queue, webhook, SDK, etc.).
3. Evaluate each pattern for complexity, reliability, and latency.
4. Select the recommended approach and outline a proof-of-concept.

## Output Format

### Integration Options

| Pattern | Complexity | Reliability | Latency | Notes |
|---------|------------|-------------|---------|-------|
| ... | ... | ... | ... | ... |

### Recommended Approach

**Pattern:** {pattern_name}

**Why:** One paragraph explaining the choice.

### Sequence Diagram

```
{system_a} -> {middleware} -> {system_b}
```

Describe the data flow step by step:

1. {system_a} does X...
2. {middleware} transforms / routes...
3. {system_b} receives Y...

### Proof-of-Concept Outline

- Files to create or modify
- Key libraries / dependencies needed
- Estimated effort: {effort_estimate}

### Open Questions

Bullet list of decisions that need human input before proceeding.
@@ -1,74 +0,0 @@
---
name: Competitive Scan
type: research
typical_query_count: 3-5
expected_output_length: 800-1500 words
cascade_tier: groq_preferred
description: >
  Compare a project against its alternatives. Produces a feature matrix,
  strengths/weaknesses analysis, and positioning summary.
---

# Competitive Scan: {project} vs Alternatives

## Context

Compare **{project}** against **{alternatives}** (comma-separated list of
competitors). The goal is to understand where {project} stands and identify
differentiation opportunities.

## Constraints

- Comparison date: {date}.
- Focus areas: {focus_areas} (e.g., features, pricing, community, performance).
- Perspective: {perspective} (user, developer, business).

## Research Steps

1. Gather key facts about {project} (features, pricing, community size, release cadence).
2. Gather the same data for each alternative in {alternatives}.
3. Build a feature comparison matrix.
4. Identify strengths and weaknesses for each entry.
5. Summarize positioning and recommend next steps.

## Output Format

### Overview

One paragraph: what space does {project} compete in, and who are the main players?

### Feature Matrix

| Feature / Attribute | {project} | {alt_1} | {alt_2} | {alt_3} |
|---------------------|-----------|---------|---------|---------|
| {feature_1} | ... | ... | ... | ... |
| {feature_2} | ... | ... | ... | ... |
| Pricing | ... | ... | ... | ... |
| License | ... | ... | ... | ... |
| Community Size | ... | ... | ... | ... |
| Last Major Release | ... | ... | ... | ... |

### Strengths & Weaknesses

#### {project}
- **Strengths:** ...
- **Weaknesses:** ...

#### {alt_1}
- **Strengths:** ...
- **Weaknesses:** ...

_(Repeat for each alternative)_

### Positioning Map

Describe where each project sits along the key dimensions (e.g., simplicity
vs power, free vs paid, niche vs general).

### Recommendations

Bullet list of actions based on the competitive landscape:

- **Differentiate on:** {differentiator}
- **Watch out for:** {threat}
- **Consider adopting from {alt}:** {feature_or_approach}
@@ -1,68 +0,0 @@
---
name: Game Analysis
type: research
typical_query_count: 2-3
expected_output_length: 600-1000 words
cascade_tier: local_ok
description: >
  Evaluate a game for AI agent playability. Assesses API availability,
  observation/action spaces, and existing bot ecosystems.
---

# Game Analysis: {game}

## Context

Evaluate **{game}** to determine whether an AI agent can play it effectively.
Focus on programmatic access, observation space, action space, and existing
bot/AI ecosystems.

## Constraints

- Platform: {platform} (PC, console, mobile, browser).
- Agent type: {agent_type} (reinforcement learning, rule-based, LLM-driven, hybrid).
- Budget for API/licenses: {budget}.

## Research Steps

1. Identify official APIs, modding support, or programmatic access methods for {game}.
2. Characterize the observation space (screen pixels, game state JSON, memory reading, etc.).
3. Characterize the action space (keyboard/mouse, API calls, controller inputs).
4. Survey existing bots, AI projects, or research papers for {game}.
5. Assess feasibility and difficulty for the target agent type.

## Output Format

### Game Profile

| Property | Value |
|-------------------|-------------------------|
| Game | {game} |
| Genre | {genre} |
| Platform | {platform} |
| API Available | Yes / No / Partial |
| Mod Support | Yes / No / Limited |
| Existing AI Work | Extensive / Some / None |

### Observation Space

Describe what data the agent can access and how (API, screen capture, memory hooks, etc.).

### Action Space

Describe how the agent can interact with the game (input methods, timing constraints, etc.).

### Existing Ecosystem

List known bots, frameworks, research papers, or communities working on AI for {game}.

### Feasibility Assessment

- **Difficulty:** Easy / Medium / Hard / Impractical
- **Best approach:** {recommended_agent_type}
- **Key challenges:** Bullet list
- **Estimated time to MVP:** {time_estimate}

### Recommendation

One paragraph: should we proceed, and if so, what is the first step?
@@ -1,79 +0,0 @@
---
name: Integration Guide
type: research
typical_query_count: 3-5
expected_output_length: 1000-2000 words
cascade_tier: groq_preferred
description: >
  Step-by-step guide to wire a specific tool into an existing stack,
  complete with code samples, configuration, and testing steps.
---

# Integration Guide: Wire {tool} into {stack}

## Context

Integrate **{tool}** into our **{stack}** stack. The goal is to
**{integration_goal}** (e.g., "add vector search to the dashboard",
"send notifications via Telegram").

## Constraints

- Must follow existing project conventions (see CLAUDE.md).
- No new cloud AI dependencies unless explicitly approved.
- Environment config via `pydantic-settings` / `config.py`.

## Research Steps

1. Review {tool}'s official documentation for installation and setup.
2. Identify the minimal dependency set required.
3. Map {tool}'s API to our existing patterns (singletons, graceful degradation).
4. Write integration code with proper error handling.
5. Define configuration variables and their defaults.

## Output Format

### Prerequisites

- Dependencies to install (with versions)
- External services or accounts required
- Environment variables to configure

### Configuration

```python
# In config.py — add these fields to Settings:
{config_fields}
```

### Implementation

```python
# {file_path}
{implementation_code}
```

### Graceful Degradation

Describe how the integration behaves when {tool} is unavailable:

| Scenario | Behavior | Log Level |
|-----------------------|--------------------|-----------|
| {tool} not installed | {fallback} | WARNING |
| {tool} unreachable | {fallback} | WARNING |
| Invalid credentials | {fallback} | ERROR |

### Testing

```python
# tests/unit/test_{tool_snake}.py
{test_code}
```

### Verification Checklist

- [ ] Dependency added to pyproject.toml
- [ ] Config fields added with sensible defaults
- [ ] Graceful degradation tested (service down)
- [ ] Unit tests pass (`tox -e unit`)
- [ ] No new linting errors (`tox -e lint`)
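The "Graceful Degradation" section of the deleted template above asks integrations to keep working, with a logged fallback, when the tool is missing. A minimal sketch of that pattern follows; the module name `some_optional_tool` and the `search()` helper are hypothetical, not part of the project.

```python
import logging

logger = logging.getLogger(__name__)

# Optional-dependency import guard: the integration degrades instead of crashing.
try:
    import some_optional_tool  # hypothetical optional dependency
    TOOL_AVAILABLE = True
except ImportError:
    some_optional_tool = None
    TOOL_AVAILABLE = False
    logger.warning("some_optional_tool not installed; search() degrades to empty results")


def search(query: str) -> list[str]:
    """Return results from the tool, or an empty list when it is unavailable."""
    if not TOOL_AVAILABLE:
        return []  # degrade gracefully rather than raising
    return some_optional_tool.search(query)
```

This mirrors the template's table: "not installed" maps to a WARNING log plus a safe fallback value.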
@@ -1,67 +0,0 @@
---
name: State of the Art
type: research
typical_query_count: 4-6
expected_output_length: 1000-2000 words
cascade_tier: groq_preferred
description: >
  Comprehensive survey of what currently exists in a given field or domain.
  Produces a structured landscape overview with key players, trends, and gaps.
---

# State of the Art: {field} (as of {date})

## Context

Survey the current landscape of **{field}**. Identify key players, recent
developments, dominant approaches, and notable gaps. This is a point-in-time
snapshot intended to inform decision-making.

## Constraints

- Focus on developments from the last {timeframe} (e.g., 12 months, 2 years).
- Prioritize {priority} (open-source, commercial, academic, or all).
- Target audience: {audience} (technical team, leadership, general).

## Research Steps

1. Identify the major categories or sub-domains within {field}.
2. For each category, list the leading projects, companies, or research groups.
3. Note recent milestones, releases, or breakthroughs.
4. Identify emerging trends and directions.
5. Highlight gaps — things that don't exist yet but should.

## Output Format

### Executive Summary

Two to three sentences: what is the state of {field} right now?

### Landscape Map

| Category | Key Players | Maturity | Trend |
|---------------|--------------------------|-------------|------------------------------|
| {category_1} | {player_a}, {player_b} | Early / GA | Growing / Stable / Declining |
| {category_2} | {player_c}, {player_d} | Early / GA | Growing / Stable / Declining |

### Recent Milestones

Chronological list of notable events in the last {timeframe}:

- **{date_1}:** {event_description}
- **{date_2}:** {event_description}

### Trends

Numbered list of the top 3-5 trends shaping {field}:

1. **{trend_name}** — {one-line description}
2. **{trend_name}** — {one-line description}

### Gaps & Opportunities

Bullet list of things that are missing, underdeveloped, or ripe for innovation.

### Implications for Us

One paragraph: what does this mean for our project? What should we do next?
@@ -1,52 +0,0 @@
---
name: Tool Evaluation
type: research
typical_query_count: 3-5
expected_output_length: 800-1500 words
cascade_tier: groq_preferred
description: >
  Discover and evaluate all shipping tools/libraries/services in a given domain.
  Produces a ranked comparison table with pros, cons, and recommendation.
---

# Tool Evaluation: {domain}

## Context

You are researching tools, libraries, and services for **{domain}**.
The goal is to find everything that is currently shipping (not vaporware)
and produce a structured comparison.

## Constraints

- Only include tools that have public releases or hosted services available today.
- If a tool is in beta/preview, note that clearly.
- Focus on {focus_criteria} when evaluating (e.g., cost, ease of integration, community size).

## Research Steps

1. Identify all actively maintained tools in the **{domain}** space.
2. For each tool, gather: name, URL, license/pricing, last release date, language/platform.
3. Evaluate each tool against the focus criteria.
4. Rank by overall fit for the use case: **{use_case}**.

## Output Format

### Summary

One paragraph: what the landscape looks like and the top recommendation.

### Comparison Table

| Tool | License / Price | Last Release | Language | {focus_criteria} Score | Notes |
|------|-----------------|--------------|----------|------------------------|-------|
| ... | ... | ... | ... | ... | ... |

### Top Pick

- **Recommended:** {tool_name} — {one-line reason}
- **Runner-up:** {tool_name} — {one-line reason}

### Risks & Gaps

Bullet list of things to watch out for (missing features, vendor lock-in, etc.).
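The deleted research templates above all rely on `{placeholder}` slots. As a rough illustration of how such a template could be filled with the Python stdlib (the template text and values here are made up; the project's real renderer may differ):

```python
# Hypothetical sketch: filling a research template's {placeholders}
# with str.format_map. Not the project's actual rendering code.
template = "# Tool Evaluation: {domain}\n\nRank by overall fit for: {use_case}\n"

filled = template.format_map({
    "domain": "vector databases",
    "use_case": "semantic search over notes",
})

print(filled.splitlines()[0])  # the rendered H1 line
```

`format_map` raises `KeyError` for any unfilled slot, which makes missing template variables fail loudly.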
@@ -87,12 +87,8 @@ class Settings(BaseSettings):
    xai_base_url: str = "https://api.x.ai/v1"
    grok_default_model: str = "grok-3-fast"
    grok_max_sats_per_query: int = 200
    grok_sats_hard_cap: int = 100  # Absolute ceiling on sats per Grok query
    grok_free: bool = False  # Skip Lightning invoice when user has own API key

    # ── Database ──────────────────────────────────────────────────────────
    db_busy_timeout_ms: int = 5000  # SQLite PRAGMA busy_timeout (ms)

    # ── Claude (Anthropic) — cloud fallback backend ────────────────────────
    # Used when Ollama is offline and local inference isn't available.
    # Set ANTHROPIC_API_KEY to enable. Default model is Haiku (fast + cheap).
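The `Settings` fields in the hunk above follow the pydantic-settings convention of resolving each field from an upper-cased environment variable (e.g. `GROK_DEFAULT_MODEL` overrides `grok_default_model`). A stdlib-only sketch of that resolution rule, purely for illustration, assuming the real project uses `pydantic-settings`:

```python
import os

def resolve(field_name: str, default):
    """Mimic the env-var override convention: GROK_DEFAULT_MODEL -> grok_default_model."""
    raw = os.environ.get(field_name.upper())
    if raw is None:
        return default
    # Coerce to the default's type (str, int, ...); real pydantic-settings
    # does full validation, including proper bool parsing.
    return type(default)(raw)

model = resolve("grok_default_model", "grok-3-fast")
timeout = resolve("db_busy_timeout_ms", 5000)
```

With no environment overrides set, the declared defaults win, which is what the hunk's inline comments describe.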
@@ -44,7 +44,6 @@ from dashboard.routes.mobile import router as mobile_router
from dashboard.routes.models import api_router as models_api_router
from dashboard.routes.models import router as models_router
from dashboard.routes.quests import router as quests_router
from dashboard.routes.scorecards import router as scorecards_router
from dashboard.routes.spark import router as spark_router
from dashboard.routes.system import router as system_router
from dashboard.routes.tasks import router as tasks_router
@@ -630,7 +629,6 @@ app.include_router(matrix_router)
app.include_router(tower_router)
app.include_router(daily_run_router)
app.include_router(quests_router)
app.include_router(scorecards_router)


@app.websocket("/ws")
@@ -1,353 +0,0 @@
"""Agent scorecard routes — API endpoints for generating and viewing scorecards."""

from __future__ import annotations

import logging
from datetime import datetime

from fastapi import APIRouter, Query, Request
from fastapi.responses import HTMLResponse, JSONResponse

from dashboard.services.scorecard_service import (
    PeriodType,
    generate_all_scorecards,
    generate_scorecard,
    get_tracked_agents,
)
from dashboard.templating import templates

logger = logging.getLogger(__name__)

router = APIRouter(prefix="/scorecards", tags=["scorecards"])


def _format_period_label(period_type: PeriodType) -> str:
    """Format a period type for display."""
    return "Daily" if period_type == PeriodType.daily else "Weekly"


@router.get("/api/agents")
async def list_tracked_agents() -> dict[str, list[str]]:
    """Return the list of tracked agent IDs.

    Returns:
        Dict with "agents" key containing list of agent IDs
    """
    return {"agents": get_tracked_agents()}


@router.get("/api/{agent_id}")
async def get_agent_scorecard(
    agent_id: str,
    period: str = Query(default="daily", description="Period type: 'daily' or 'weekly'"),
) -> JSONResponse:
    """Generate a scorecard for a specific agent.

    Args:
        agent_id: The agent ID (e.g., 'kimi', 'claude')
        period: 'daily' or 'weekly' (default: daily)

    Returns:
        JSON response with scorecard data
    """
    try:
        period_type = PeriodType(period.lower())
    except ValueError:
        return JSONResponse(
            status_code=400,
            content={"error": f"Invalid period '{period}'. Use 'daily' or 'weekly'."},
        )

    try:
        scorecard = generate_scorecard(agent_id, period_type)

        if scorecard is None:
            return JSONResponse(
                status_code=404,
                content={"error": f"No scorecard found for agent '{agent_id}'"},
            )

        return JSONResponse(content=scorecard.to_dict())

    except Exception as exc:
        logger.error("Failed to generate scorecard for %s: %s", agent_id, exc)
        return JSONResponse(
            status_code=500,
            content={"error": f"Failed to generate scorecard: {str(exc)}"},
        )


@router.get("/api")
async def get_all_scorecards(
    period: str = Query(default="daily", description="Period type: 'daily' or 'weekly'"),
) -> JSONResponse:
    """Generate scorecards for all tracked agents.

    Args:
        period: 'daily' or 'weekly' (default: daily)

    Returns:
        JSON response with list of scorecard data
    """
    try:
        period_type = PeriodType(period.lower())
    except ValueError:
        return JSONResponse(
            status_code=400,
            content={"error": f"Invalid period '{period}'. Use 'daily' or 'weekly'."},
        )

    try:
        scorecards = generate_all_scorecards(period_type)
        return JSONResponse(
            content={
                "period": period_type.value,
                "scorecards": [s.to_dict() for s in scorecards],
                "count": len(scorecards),
            }
        )

    except Exception as exc:
        logger.error("Failed to generate scorecards: %s", exc)
        return JSONResponse(
            status_code=500,
            content={"error": f"Failed to generate scorecards: {str(exc)}"},
        )


@router.get("", response_class=HTMLResponse)
async def scorecards_page(request: Request) -> HTMLResponse:
    """Render the scorecards dashboard page.

    Returns:
        HTML page with scorecard interface
    """
    agents = get_tracked_agents()
    return templates.TemplateResponse(
        request,
        "scorecards.html",
        {
            "agents": agents,
            "periods": ["daily", "weekly"],
        },
    )


@router.get("/panel/{agent_id}", response_class=HTMLResponse)
async def agent_scorecard_panel(
    request: Request,
    agent_id: str,
    period: str = Query(default="daily"),
) -> HTMLResponse:
    """Render an individual agent scorecard panel (for HTMX).

    Args:
        request: The request object
        agent_id: The agent ID
        period: 'daily' or 'weekly'

    Returns:
        HTML panel with scorecard content
    """
    try:
        period_type = PeriodType(period.lower())
    except ValueError:
        period_type = PeriodType.daily

    try:
        scorecard = generate_scorecard(agent_id, period_type)

        if scorecard is None:
            return HTMLResponse(
                content=f"""
                <div class="card mc-panel">
                    <h5 class="card-title">{agent_id.title()}</h5>
                    <p class="text-muted">No activity recorded for this period.</p>
                </div>
                """,
                status_code=200,
            )

        data = scorecard.to_dict()

        # Build patterns HTML
        patterns_html = ""
        if data["patterns"]:
            patterns_list = "".join([f"<li>{p}</li>" for p in data["patterns"]])
            patterns_html = f"""
            <div class="mt-3">
                <h6>Patterns</h6>
                <ul class="list-unstyled text-info">
                    {patterns_list}
                </ul>
            </div>
            """

        # Build bullets HTML
        bullets_html = "".join([f"<li>{b}</li>" for b in data["narrative_bullets"]])

        # Build metrics summary
        metrics = data["metrics"]

        html_content = f"""
        <div class="card mc-panel">
            <div class="card-header d-flex justify-content-between align-items-center">
                <h5 class="card-title mb-0">{agent_id.title()}</h5>
                <span class="badge bg-secondary">{_format_period_label(period_type)}</span>
            </div>
            <div class="card-body">
                <ul class="list-unstyled mb-3">
                    {bullets_html}
                </ul>

                <div class="row text-center small">
                    <div class="col">
                        <div class="text-muted">PRs</div>
                        <div class="fw-bold">{metrics["prs_opened"]}/{metrics["prs_merged"]}</div>
                        <div class="text-muted" style="font-size: 0.75rem;">
                            {int(metrics["pr_merge_rate"] * 100)}% merged
                        </div>
                    </div>
                    <div class="col">
                        <div class="text-muted">Issues</div>
                        <div class="fw-bold">{metrics["issues_touched"]}</div>
                    </div>
                    <div class="col">
                        <div class="text-muted">Tests</div>
                        <div class="fw-bold">{metrics["tests_affected"]}</div>
                    </div>
                    <div class="col">
                        <div class="text-muted">Tokens</div>
                        <div class="fw-bold {"text-success" if metrics["token_net"] >= 0 else "text-danger"}">
                            {"+" if metrics["token_net"] > 0 else ""}{metrics["token_net"]}
                        </div>
                    </div>
                </div>

                {patterns_html}
            </div>
        </div>
        """

        return HTMLResponse(content=html_content)

    except Exception as exc:
        logger.error("Failed to render scorecard panel for %s: %s", agent_id, exc)
        return HTMLResponse(
            content=f"""
            <div class="card mc-panel border-danger">
                <h5 class="card-title">{agent_id.title()}</h5>
                <p class="text-danger">Error loading scorecard: {str(exc)}</p>
            </div>
            """,
            status_code=200,
        )


@router.get("/all/panels", response_class=HTMLResponse)
async def all_scorecard_panels(
    request: Request,
    period: str = Query(default="daily"),
) -> HTMLResponse:
    """Render all agent scorecard panels (for HTMX).

    Args:
        request: The request object
        period: 'daily' or 'weekly'

    Returns:
        HTML with all scorecard panels
    """
    try:
        period_type = PeriodType(period.lower())
    except ValueError:
        period_type = PeriodType.daily

    try:
        scorecards = generate_all_scorecards(period_type)

        panels: list[str] = []
        for scorecard in scorecards:
            data = scorecard.to_dict()

            # Build patterns HTML
            patterns_html = ""
            if data["patterns"]:
                patterns_list = "".join([f"<li>{p}</li>" for p in data["patterns"]])
                patterns_html = f"""
                <div class="mt-3">
                    <h6>Patterns</h6>
                    <ul class="list-unstyled text-info">
                        {patterns_list}
                    </ul>
                </div>
                """

            # Build bullets HTML
            bullets_html = "".join([f"<li>{b}</li>" for b in data["narrative_bullets"]])
            metrics = data["metrics"]

            panel_html = f"""
            <div class="col-md-6 col-lg-4 mb-3">
                <div class="card mc-panel">
                    <div class="card-header d-flex justify-content-between align-items-center">
                        <h5 class="card-title mb-0">{scorecard.agent_id.title()}</h5>
                        <span class="badge bg-secondary">{_format_period_label(period_type)}</span>
                    </div>
                    <div class="card-body">
                        <ul class="list-unstyled mb-3">
                            {bullets_html}
                        </ul>

                        <div class="row text-center small">
                            <div class="col">
                                <div class="text-muted">PRs</div>
                                <div class="fw-bold">{metrics["prs_opened"]}/{metrics["prs_merged"]}</div>
                                <div class="text-muted" style="font-size: 0.75rem;">
                                    {int(metrics["pr_merge_rate"] * 100)}% merged
                                </div>
                            </div>
                            <div class="col">
                                <div class="text-muted">Issues</div>
                                <div class="fw-bold">{metrics["issues_touched"]}</div>
                            </div>
                            <div class="col">
                                <div class="text-muted">Tests</div>
                                <div class="fw-bold">{metrics["tests_affected"]}</div>
                            </div>
                            <div class="col">
                                <div class="text-muted">Tokens</div>
                                <div class="fw-bold {"text-success" if metrics["token_net"] >= 0 else "text-danger"}">
                                    {"+" if metrics["token_net"] > 0 else ""}{metrics["token_net"]}
                                </div>
                            </div>
                        </div>

                        {patterns_html}
                    </div>
                </div>
            </div>
            """
            panels.append(panel_html)

        html_content = f"""
        <div class="row">
            {"".join(panels)}
        </div>
        <div class="text-muted small mt-2">
            Generated: {datetime.now().strftime("%Y-%m-%d %H:%M:%S UTC")}
        </div>
        """

        return HTMLResponse(content=html_content)

    except Exception as exc:
        logger.error("Failed to render all scorecard panels: %s", exc)
        return HTMLResponse(
            content=f"""
            <div class="alert alert-danger">
                Error loading scorecards: {str(exc)}
            </div>
            """,
            status_code=200,
        )
@@ -56,13 +56,11 @@ async def self_modify_queue(request: Request):

@router.get("/swarm/mission-control", response_class=HTMLResponse)
async def mission_control(request: Request):
    """Render the swarm mission control dashboard page."""
    return templates.TemplateResponse(request, "mission_control.html", {})


@router.get("/bugs", response_class=HTMLResponse)
async def bugs_page(request: Request):
    """Render the bug tracking page."""
    return templates.TemplateResponse(
        request,
        "bugs.html",
@@ -77,19 +75,16 @@ async def bugs_page(request: Request):

@router.get("/self-coding", response_class=HTMLResponse)
async def self_coding(request: Request):
    """Render the self-coding automation status page."""
    return templates.TemplateResponse(request, "self_coding.html", {"stats": {}})


@router.get("/hands", response_class=HTMLResponse)
async def hands_page(request: Request):
    """Render the hands (automation executions) page."""
    return templates.TemplateResponse(request, "hands.html", {"executions": []})


@router.get("/creative/ui", response_class=HTMLResponse)
async def creative_ui(request: Request):
    """Render the creative UI playground page."""
    return templates.TemplateResponse(request, "creative.html", {})


@@ -145,7 +145,6 @@ async def tasks_page(request: Request):

@router.get("/tasks/pending", response_class=HTMLResponse)
async def tasks_pending(request: Request):
    """Return HTMX partial for pending approval tasks."""
    with _get_db() as db:
        rows = db.execute(
            "SELECT * FROM tasks WHERE status='pending_approval' ORDER BY created_at DESC"
@@ -165,7 +164,6 @@ async def tasks_pending(request: Request):

@router.get("/tasks/active", response_class=HTMLResponse)
async def tasks_active(request: Request):
    """Return HTMX partial for active (approved/running/paused) tasks."""
    with _get_db() as db:
        rows = db.execute(
            "SELECT * FROM tasks WHERE status IN ('approved','running','paused') ORDER BY created_at DESC"
@@ -185,7 +183,6 @@ async def tasks_active(request: Request):

@router.get("/tasks/completed", response_class=HTMLResponse)
async def tasks_completed(request: Request):
    """Return HTMX partial for completed/vetoed/failed tasks (last 50)."""
    with _get_db() as db:
        rows = db.execute(
            "SELECT * FROM tasks WHERE status IN ('completed','vetoed','failed') ORDER BY completed_at DESC LIMIT 50"
@@ -244,31 +241,26 @@ async def create_task_form(

@router.post("/tasks/{task_id}/approve", response_class=HTMLResponse)
async def approve_task(request: Request, task_id: str):
    """Approve a pending task and move it to the active queue."""
    return await _set_status(request, task_id, "approved")


@router.post("/tasks/{task_id}/veto", response_class=HTMLResponse)
async def veto_task(request: Request, task_id: str):
    """Veto a task, marking it as rejected."""
    return await _set_status(request, task_id, "vetoed")


@router.post("/tasks/{task_id}/pause", response_class=HTMLResponse)
async def pause_task(request: Request, task_id: str):
    """Pause a running or approved task."""
    return await _set_status(request, task_id, "paused")


@router.post("/tasks/{task_id}/cancel", response_class=HTMLResponse)
async def cancel_task(request: Request, task_id: str):
    """Cancel a task (marks as vetoed)."""
    return await _set_status(request, task_id, "vetoed")


@router.post("/tasks/{task_id}/retry", response_class=HTMLResponse)
async def retry_task(request: Request, task_id: str):
    """Retry a failed/vetoed task by moving it back to approved."""
    return await _set_status(request, task_id, "approved")


@@ -279,7 +271,6 @@ async def modify_task(
    title: str = Form(...),
    description: str = Form(""),
):
    """Update task title and description."""
    with _get_db() as db:
        db.execute(
            "UPDATE tasks SET title=?, description=? WHERE id=?",
@@ -1,17 +0,0 @@
"""Dashboard services for business logic."""

from dashboard.services.scorecard_service import (
    PeriodType,
    ScorecardSummary,
    generate_all_scorecards,
    generate_scorecard,
    get_tracked_agents,
)

__all__ = [
    "PeriodType",
    "ScorecardSummary",
    "generate_all_scorecards",
    "generate_scorecard",
    "get_tracked_agents",
]
@@ -1,515 +0,0 @@
|
||||
"""Agent scorecard service — track and summarize agent performance.
|
||||
|
||||
Generates daily/weekly scorecards showing:
|
||||
- Issues touched, PRs opened/merged
|
||||
- Tests affected, tokens earned/spent
|
||||
- Pattern highlights (merge rate, activity quality)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import UTC, datetime, timedelta
|
||||
from enum import StrEnum
|
||||
from typing import Any
|
||||
|
||||
from infrastructure.events.bus import Event, get_event_bus
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Bot/agent usernames to track
|
||||
TRACKED_AGENTS = frozenset({"hermes", "kimi", "manus", "claude", "gemini"})
|
||||
|
||||
|
||||
class PeriodType(StrEnum):
|
||||
daily = "daily"
|
||||
weekly = "weekly"
|
||||
|
||||
|
||||
@dataclass
|
||||
class AgentMetrics:
|
||||
"""Raw metrics collected for an agent over a period."""
|
||||
|
||||
agent_id: str
|
||||
issues_touched: set[int] = field(default_factory=set)
|
||||
prs_opened: set[int] = field(default_factory=set)
|
||||
prs_merged: set[int] = field(default_factory=set)
|
||||
tests_affected: set[str] = field(default_factory=set)
|
||||
tokens_earned: int = 0
|
||||
tokens_spent: int = 0
|
||||
commits: int = 0
|
||||
comments: int = 0
|
||||
|
||||
@property
|
||||
def pr_merge_rate(self) -> float:
|
||||
"""Calculate PR merge rate (0.0 - 1.0)."""
|
||||
opened = len(self.prs_opened)
|
||||
if opened == 0:
|
||||
return 0.0
|
||||
return len(self.prs_merged) / opened
|
||||
|
||||
|
||||
@dataclass
|
||||
class ScorecardSummary:
|
||||
"""A generated scorecard with narrative summary."""
|
||||
|
||||
agent_id: str
|
||||
period_type: PeriodType
|
||||
period_start: datetime
|
||||
period_end: datetime
|
||||
metrics: AgentMetrics
|
||||
narrative_bullets: list[str] = field(default_factory=list)
|
||||
patterns: list[str] = field(default_factory=list)
|
||||
|
||||
def to_dict(self) -> dict[str, Any]:
|
||||
"""Convert scorecard to dictionary for JSON serialization."""
|
||||
return {
|
||||
"agent_id": self.agent_id,
|
||||
"period_type": self.period_type.value,
|
||||
"period_start": self.period_start.isoformat(),
|
||||
"period_end": self.period_end.isoformat(),
|
||||
"metrics": {
|
||||
"issues_touched": len(self.metrics.issues_touched),
|
||||
"prs_opened": len(self.metrics.prs_opened),
|
||||
"prs_merged": len(self.metrics.prs_merged),
|
||||
"pr_merge_rate": round(self.metrics.pr_merge_rate, 2),
|
||||
"tests_affected": len(self.tests_affected),
|
||||
"commits": self.metrics.commits,
|
||||
"comments": self.metrics.comments,
|
||||
"tokens_earned": self.metrics.tokens_earned,
|
||||
"tokens_spent": self.metrics.tokens_spent,
|
||||
"token_net": self.metrics.tokens_earned - self.metrics.tokens_spent,
|
||||
},
|
||||
"narrative_bullets": self.narrative_bullets,
|
||||
"patterns": self.patterns,
|
||||
}
|
||||
|
||||
@property
|
||||
def tests_affected(self) -> set[str]:
|
||||
"""Alias for metrics.tests_affected."""
|
||||
return self.metrics.tests_affected
|
||||
|
||||
|
||||
def _get_period_bounds(
|
||||
period_type: PeriodType, reference_date: datetime | None = None
|
||||
) -> tuple[datetime, datetime]:
|
||||
"""Calculate start and end timestamps for a period.
|
||||
|
||||
Args:
|
||||
period_type: daily or weekly
|
||||
reference_date: The date to calculate from (defaults to now)
|
||||
|
||||
Returns:
|
||||
Tuple of (period_start, period_end) in UTC
|
||||
"""
|
||||
if reference_date is None:
|
||||
reference_date = datetime.now(UTC)
|
||||
|
||||
# Normalize to start of day
|
||||
end = reference_date.replace(hour=0, minute=0, second=0, microsecond=0)
|
||||
|
||||
if period_type == PeriodType.daily:
|
||||
start = end - timedelta(days=1)
|
||||
else: # weekly
|
||||
start = end - timedelta(days=7)
|
||||
|
||||
return start, end
|
||||
|
||||
|
||||
def _collect_events_for_period(
|
||||
start: datetime, end: datetime, agent_id: str | None = None
|
||||
) -> list[Event]:
|
||||
"""Collect events from the event bus for a time period.
|
||||
|
||||
Args:
|
||||
start: Period start time
|
||||
end: Period end time
|
||||
agent_id: Optional agent filter
|
||||
|
||||
Returns:
|
||||
List of matching events
|
||||
"""
|
||||
bus = get_event_bus()
|
||||
events: list[Event] = []
|
||||
|
||||
# Query persisted events for relevant types
|
||||
event_types = [
|
||||
"gitea.push",
|
||||
"gitea.issue.opened",
|
||||
"gitea.issue.comment",
|
||||
"gitea.pull_request",
|
||||
"agent.task.completed",
|
||||
"test.execution",
|
||||
]
|
||||
|
||||
for event_type in event_types:
|
||||
try:
|
||||
type_events = bus.replay(
|
||||
event_type=event_type,
|
||||
source=agent_id,
|
||||
limit=1000,
|
||||
)
|
||||
events.extend(type_events)
|
||||
except Exception as exc:
|
||||
logger.debug("Failed to replay events for %s: %s", event_type, exc)
|
||||
|
||||
# Filter by timestamp
|
||||
filtered = []
|
||||
for event in events:
|
||||
try:
|
||||
event_time = datetime.fromisoformat(event.timestamp.replace("Z", "+00:00"))
|
||||
if start <= event_time < end:
|
||||
filtered.append(event)
|
||||
except (ValueError, AttributeError):
|
||||
continue
|
||||
|
||||
return filtered
|
||||
|
||||
|
||||
def _extract_actor_from_event(event: Event) -> str:
|
||||
"""Extract the actor/agent from an event."""
|
||||
# Try data fields first
|
||||
if "actor" in event.data:
|
||||
return event.data["actor"]
|
||||
if "agent_id" in event.data:
|
||||
return event.data["agent_id"]
|
||||
# Fall back to source
|
||||
return event.source
|
||||
|
||||
|
||||
def _is_tracked_agent(actor: str) -> bool:
|
||||
"""Check if an actor is a tracked agent."""
|
||||
return actor.lower() in TRACKED_AGENTS


def _aggregate_metrics(events: list[Event]) -> dict[str, AgentMetrics]:
    """Aggregate metrics from events grouped by agent.

    Args:
        events: List of events to process

    Returns:
        Dict mapping agent_id -> AgentMetrics
    """
    metrics_by_agent: dict[str, AgentMetrics] = {}

    for event in events:
        actor = _extract_actor_from_event(event)

        # Skip non-agent events unless they explicitly have an agent_id
        if not _is_tracked_agent(actor) and "agent_id" not in event.data:
            continue

        if actor not in metrics_by_agent:
            metrics_by_agent[actor] = AgentMetrics(agent_id=actor)

        metrics = metrics_by_agent[actor]

        # Process based on event type
        event_type = event.type

        if event_type == "gitea.push":
            metrics.commits += event.data.get("num_commits", 1)

        elif event_type == "gitea.issue.opened":
            issue_num = event.data.get("issue_number", 0)
            if issue_num:
                metrics.issues_touched.add(issue_num)

        elif event_type == "gitea.issue.comment":
            metrics.comments += 1
            issue_num = event.data.get("issue_number", 0)
            if issue_num:
                metrics.issues_touched.add(issue_num)

        elif event_type == "gitea.pull_request":
            pr_num = event.data.get("pr_number", 0)
            action = event.data.get("action", "")
            merged = event.data.get("merged", False)

            if pr_num:
                if action == "opened":
                    metrics.prs_opened.add(pr_num)
                elif action == "closed" and merged:
                    metrics.prs_merged.add(pr_num)
                # Also count as touched issue for tracking
                metrics.issues_touched.add(pr_num)

        elif event_type == "agent.task.completed":
            # Extract test files from task data
            affected = event.data.get("tests_affected", [])
            for test in affected:
                metrics.tests_affected.add(test)

            # Token rewards from task completion
            reward = event.data.get("token_reward", 0)
            if reward:
                metrics.tokens_earned += reward

        elif event_type == "test.execution":
            # Track test files that were executed
            test_files = event.data.get("test_files", [])
            for test in test_files:
                metrics.tests_affected.add(test)

    return metrics_by_agent


def _query_token_transactions(agent_id: str, start: datetime, end: datetime) -> tuple[int, int]:
    """Query the lightning ledger for token transactions.

    Args:
        agent_id: The agent to query for
        start: Period start
        end: Period end

    Returns:
        Tuple of (tokens_earned, tokens_spent)
    """
    try:
        from lightning.ledger import get_transactions

        transactions = get_transactions(limit=1000)

        earned = 0
        spent = 0

        for tx in transactions:
            # Filter by agent if specified
            if tx.agent_id and tx.agent_id != agent_id:
                continue

            # Filter by timestamp
            try:
                tx_time = datetime.fromisoformat(tx.created_at.replace("Z", "+00:00"))
                if not (start <= tx_time < end):
                    continue
            except (ValueError, AttributeError):
                continue

            if tx.tx_type.value == "incoming":
                earned += tx.amount_sats
            else:
                spent += tx.amount_sats

        return earned, spent

    except Exception as exc:
        logger.debug("Failed to query token transactions: %s", exc)
        return 0, 0


def _generate_narrative_bullets(metrics: AgentMetrics, period_type: PeriodType) -> list[str]:
    """Generate narrative summary bullets for a scorecard.

    Args:
        metrics: The agent's metrics
        period_type: daily or weekly

    Returns:
        List of narrative bullet points
    """
    bullets: list[str] = []
    period_label = "day" if period_type == PeriodType.daily else "week"

    # Activity summary
    activities = []
    if metrics.commits:
        activities.append(f"{metrics.commits} commit{'s' if metrics.commits != 1 else ''}")
    if metrics.prs_opened:
        activities.append(
            f"{len(metrics.prs_opened)} PR{'s' if len(metrics.prs_opened) != 1 else ''} opened"
        )
    if metrics.prs_merged:
        activities.append(
            f"{len(metrics.prs_merged)} PR{'s' if len(metrics.prs_merged) != 1 else ''} merged"
        )
    if metrics.issues_touched:
        activities.append(
            f"{len(metrics.issues_touched)} issue{'s' if len(metrics.issues_touched) != 1 else ''} touched"
        )
    if metrics.comments:
        activities.append(f"{metrics.comments} comment{'s' if metrics.comments != 1 else ''}")

    if activities:
        bullets.append(f"Active across {', '.join(activities)} this {period_label}.")

    # Test activity
    if metrics.tests_affected:
        bullets.append(
            f"Affected {len(metrics.tests_affected)} test file{'s' if len(metrics.tests_affected) != 1 else ''}."
        )

    # Token summary
    net_tokens = metrics.tokens_earned - metrics.tokens_spent
    if metrics.tokens_earned or metrics.tokens_spent:
        if net_tokens > 0:
            bullets.append(
                f"Net earned {net_tokens} tokens ({metrics.tokens_earned} earned, {metrics.tokens_spent} spent)."
            )
        elif net_tokens < 0:
            bullets.append(
                f"Net spent {abs(net_tokens)} tokens ({metrics.tokens_earned} earned, {metrics.tokens_spent} spent)."
            )
        else:
            bullets.append(
                f"Balanced token flow ({metrics.tokens_earned} earned, {metrics.tokens_spent} spent)."
            )

    # Handle empty case
    if not bullets:
        bullets.append(f"No recorded activity this {period_label}.")

    return bullets


def _detect_patterns(metrics: AgentMetrics) -> list[str]:
    """Detect interesting patterns in agent behavior.

    Args:
        metrics: The agent's metrics

    Returns:
        List of pattern descriptions
    """
    patterns: list[str] = []

    pr_opened = len(metrics.prs_opened)
    merge_rate = metrics.pr_merge_rate

    # Merge rate patterns
    if pr_opened >= 3:
        if merge_rate >= 0.8:
            patterns.append("High merge rate with few failures — code quality focus.")
        elif merge_rate <= 0.3:
            patterns.append("Lots of noisy PRs, low merge rate — may need review support.")

    # Activity patterns
    if metrics.commits > 10 and pr_opened == 0:
        patterns.append("High commit volume without PRs — working directly on main?")

    if len(metrics.issues_touched) > 5 and metrics.comments == 0:
        patterns.append("Touching many issues but low comment volume — silent worker.")

    if metrics.comments > len(metrics.issues_touched) * 2:
        patterns.append("Highly communicative — lots of discussion relative to work items.")

    # Token patterns
    net_tokens = metrics.tokens_earned - metrics.tokens_spent
    if net_tokens > 100:
        patterns.append("Strong token accumulation — high value delivery.")
    elif net_tokens < -50:
        patterns.append("High token spend — may be in experimentation phase.")

    return patterns
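The merge-rate thresholds above read `metrics.pr_merge_rate`, a property of `AgentMetrics` that is not shown in this excerpt. A plausible reconstruction, assuming it is the fraction of opened PRs that were merged (a hypothetical helper, not the project's actual property):

```python
def pr_merge_rate(prs_opened: set[int], prs_merged: set[int]) -> float:
    # Hypothetical sketch: fraction of opened PRs that were also merged.
    # Returns 0.0 when no PRs were opened, avoiding division by zero.
    if not prs_opened:
        return 0.0
    return len(prs_merged & prs_opened) / len(prs_opened)


rate = pr_merge_rate({101, 102, 103, 104}, {101, 102, 103})
```

With three of four opened PRs merged, the rate is 0.75, which clears the 0.3 low-water mark but not the 0.8 "code quality focus" threshold.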


def generate_scorecard(
    agent_id: str,
    period_type: PeriodType = PeriodType.daily,
    reference_date: datetime | None = None,
) -> ScorecardSummary | None:
    """Generate a scorecard for a single agent.

    Args:
        agent_id: The agent to generate scorecard for
        period_type: daily or weekly
        reference_date: The date to calculate from (defaults to now)

    Returns:
        ScorecardSummary or None if agent has no activity
    """
    start, end = _get_period_bounds(period_type, reference_date)

    # Collect events
    events = _collect_events_for_period(start, end, agent_id)

    # Aggregate metrics
    all_metrics = _aggregate_metrics(events)

    # Get metrics for this specific agent
    if agent_id not in all_metrics:
        # Create empty metrics - still generate a scorecard
        metrics = AgentMetrics(agent_id=agent_id)
    else:
        metrics = all_metrics[agent_id]

    # Augment with token data from ledger
    tokens_earned, tokens_spent = _query_token_transactions(agent_id, start, end)
    metrics.tokens_earned = max(metrics.tokens_earned, tokens_earned)
    metrics.tokens_spent = max(metrics.tokens_spent, tokens_spent)

    # Generate narrative and patterns
    narrative = _generate_narrative_bullets(metrics, period_type)
    patterns = _detect_patterns(metrics)

    return ScorecardSummary(
        agent_id=agent_id,
        period_type=period_type,
        period_start=start,
        period_end=end,
        metrics=metrics,
        narrative_bullets=narrative,
        patterns=patterns,
    )


def generate_all_scorecards(
    period_type: PeriodType = PeriodType.daily,
    reference_date: datetime | None = None,
) -> list[ScorecardSummary]:
    """Generate scorecards for all tracked agents.

    Args:
        period_type: daily or weekly
        reference_date: The date to calculate from (defaults to now)

    Returns:
        List of ScorecardSummary for all agents with activity
    """
    start, end = _get_period_bounds(period_type, reference_date)

    # Collect all events
    events = _collect_events_for_period(start, end)

    # Aggregate metrics for all agents
    all_metrics = _aggregate_metrics(events)

    # Include tracked agents even if no activity
    for agent_id in TRACKED_AGENTS:
        if agent_id not in all_metrics:
            all_metrics[agent_id] = AgentMetrics(agent_id=agent_id)

    # Generate scorecards
    scorecards: list[ScorecardSummary] = []

    for agent_id, metrics in all_metrics.items():
        # Augment with token data
        tokens_earned, tokens_spent = _query_token_transactions(agent_id, start, end)
        metrics.tokens_earned = max(metrics.tokens_earned, tokens_earned)
        metrics.tokens_spent = max(metrics.tokens_spent, tokens_spent)

        narrative = _generate_narrative_bullets(metrics, period_type)
        patterns = _detect_patterns(metrics)

        scorecard = ScorecardSummary(
            agent_id=agent_id,
            period_type=period_type,
            period_start=start,
            period_end=end,
            metrics=metrics,
            narrative_bullets=narrative,
            patterns=patterns,
        )
        scorecards.append(scorecard)

    # Sort by agent_id for consistent ordering
    scorecards.sort(key=lambda s: s.agent_id)

    return scorecards


def get_tracked_agents() -> list[str]:
    """Return the list of tracked agent IDs."""
    return sorted(TRACKED_AGENTS)
@@ -51,7 +51,6 @@
     <a href="/thinking" class="mc-test-link mc-link-thinking">THINKING</a>
     <a href="/swarm/mission-control" class="mc-test-link">MISSION CTRL</a>
     <a href="/swarm/live" class="mc-test-link">SWARM</a>
-    <a href="/scorecards" class="mc-test-link">SCORECARDS</a>
     <a href="/bugs" class="mc-test-link mc-link-bugs">BUGS</a>
   </div>
 </div>
@@ -124,7 +123,6 @@
     <a href="/thinking" class="mc-mobile-link">THINKING</a>
     <a href="/swarm/mission-control" class="mc-mobile-link">MISSION CONTROL</a>
     <a href="/swarm/live" class="mc-mobile-link">SWARM</a>
-    <a href="/scorecards" class="mc-mobile-link">SCORECARDS</a>
     <a href="/bugs" class="mc-mobile-link">BUGS</a>
     <div class="mc-mobile-section-label">INTELLIGENCE</div>
     <a href="/spark/ui" class="mc-mobile-link">SPARK</a>
@@ -1,113 +0,0 @@
-{% extends "base.html" %}
-
-{% block title %}Agent Scorecards - Timmy Time{% endblock %}
-
-{% block extra_styles %}{% endblock %}
-
-{% block content %}
-<div class="container-fluid py-4">
-  <!-- Header -->
-  <div class="d-flex justify-content-between align-items-center mb-4">
-    <div>
-      <h1 class="h3 mb-0">AGENT SCORECARDS</h1>
-      <p class="text-muted small mb-0">Track agent performance across issues, PRs, tests, and tokens</p>
-    </div>
-    <div class="d-flex gap-2">
-      <select id="period-select" class="form-select form-select-sm" style="width: auto;">
-        <option value="daily" selected>Daily</option>
-        <option value="weekly">Weekly</option>
-      </select>
-      <button class="btn btn-sm btn-primary" onclick="refreshScorecards()">
-        <span>Refresh</span>
-      </button>
-    </div>
-  </div>
-
-  <!-- Scorecards Grid -->
-  <div id="scorecards-container"
-       hx-get="/scorecards/all/panels?period=daily"
-       hx-trigger="load"
-       hx-swap="innerHTML">
-    <div class="text-center py-5">
-      <div class="spinner-border text-secondary" role="status">
-        <span class="visually-hidden">Loading...</span>
-      </div>
-      <p class="text-muted mt-2">Loading scorecards...</p>
-    </div>
-  </div>
-
-  <!-- API Reference -->
-  <div class="mt-5 pt-4 border-top">
-    <h5 class="text-muted">API Reference</h5>
-    <div class="row g-3">
-      <div class="col-md-6">
-        <div class="card mc-panel">
-          <div class="card-body">
-            <h6 class="card-title">List Tracked Agents</h6>
-            <code>GET /scorecards/api/agents</code>
-            <p class="small text-muted mt-2">Returns all tracked agent IDs</p>
-          </div>
-        </div>
-      </div>
-      <div class="col-md-6">
-        <div class="card mc-panel">
-          <div class="card-body">
-            <h6 class="card-title">Get All Scorecards</h6>
-            <code>GET /scorecards/api?period=daily|weekly</code>
-            <p class="small text-muted mt-2">Returns scorecards for all agents</p>
-          </div>
-        </div>
-      </div>
-      <div class="col-md-6">
-        <div class="card mc-panel">
-          <div class="card-body">
-            <h6 class="card-title">Get Agent Scorecard</h6>
-            <code>GET /scorecards/api/{agent_id}?period=daily|weekly</code>
-            <p class="small text-muted mt-2">Returns scorecard for a specific agent</p>
-          </div>
-        </div>
-      </div>
-      <div class="col-md-6">
-        <div class="card mc-panel">
-          <div class="card-body">
-            <h6 class="card-title">HTML Panel (HTMX)</h6>
-            <code>GET /scorecards/panel/{agent_id}?period=daily|weekly</code>
-            <p class="small text-muted mt-2">Returns HTML panel for embedding</p>
-          </div>
-        </div>
-      </div>
-    </div>
-  </div>
-</div>
-
-<script>
-  // Period selector change handler
-  document.getElementById('period-select').addEventListener('change', function() {
-    refreshScorecards();
-  });
-
-  function refreshScorecards() {
-    var period = document.getElementById('period-select').value;
-    var container = document.getElementById('scorecards-container');
-
-    // Show loading state
-    container.innerHTML = `
-      <div class="text-center py-5">
-        <div class="spinner-border text-secondary" role="status">
-          <span class="visually-hidden">Loading...</span>
-        </div>
-        <p class="text-muted mt-2">Loading scorecards...</p>
-      </div>
-    `;
-
-    // Trigger HTMX request
-    htmx.ajax('GET', '/scorecards/all/panels?period=' + period, {
-      target: '#scorecards-container',
-      swap: 'innerHTML'
-    });
-  }
-
-  // Auto-refresh every 5 minutes
-  setInterval(refreshScorecards, 300000);
-</script>
-{% endblock %}
18
src/infrastructure/morrowind/__init__.py
Normal file
@@ -0,0 +1,18 @@
"""Morrowind engine-agnostic perception/command protocol.
|
||||
|
||||
This package implements the Perception/Command protocol defined in
|
||||
``docs/protocol/morrowind-perception-command-spec.md``. It provides:
|
||||
|
||||
- Pydantic v2 schemas for runtime validation (``schemas``)
|
||||
- SQLite command logging and query interface (``command_log``)
|
||||
- Training-data export pipeline (``training_export``)
|
||||
"""
|
||||
|
||||
from .schemas import CommandInput, CommandType, EntityType, PerceptionOutput
|
||||
|
||||
__all__ = [
|
||||
"CommandInput",
|
||||
"CommandType",
|
||||
"EntityType",
|
||||
"PerceptionOutput",
|
||||
]
307
src/infrastructure/morrowind/command_log.py
Normal file
@@ -0,0 +1,307 @@
"""SQLite command log for the Morrowind Perception/Command protocol.
|
||||
|
||||
Every heartbeat cycle is logged — the resulting dataset serves as organic
|
||||
training data for local model fine-tuning (Phase 7+).
|
||||
|
||||
Usage::
|
||||
|
||||
from infrastructure.morrowind.command_log import CommandLogger
|
||||
|
||||
logger = CommandLogger() # uses project default DB
|
||||
logger.log_command(command_input, perception_snapshot)
|
||||
results = logger.query(command_type="move_to", limit=100)
|
||||
logger.export_training_data("export.jsonl")
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import logging
|
||||
from datetime import UTC, datetime, timedelta
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
from sqlalchemy import (
|
||||
Column,
|
||||
DateTime,
|
||||
Index,
|
||||
Integer,
|
||||
String,
|
||||
Text,
|
||||
create_engine,
|
||||
)
|
||||
from sqlalchemy.orm import Session, sessionmaker
|
||||
|
||||
from src.dashboard.models.database import Base
|
||||
|
||||
from .schemas import CommandInput, CommandType, PerceptionOutput
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Default database path — same SQLite file as the rest of the project.
|
||||
DEFAULT_DB_PATH = Path("./data/timmy_calm.db")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# SQLAlchemy model
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class CommandLog(Base):
|
||||
"""Persisted command log entry.
|
||||
|
||||
Schema columns mirror the requirements from Issue #855:
|
||||
timestamp, command, params, reasoning, perception_snapshot,
|
||||
outcome, episode_id.
|
||||
"""
|
||||
|
||||
__tablename__ = "command_log"
|
||||
|
||||
id = Column(Integer, primary_key=True, autoincrement=True)
|
||||
|
||||
timestamp = Column(
|
||||
DateTime, nullable=False, default=lambda: datetime.now(UTC), index=True
|
||||
)
|
||||
command = Column(String(64), nullable=False, index=True)
|
||||
params = Column(Text, nullable=False, default="{}")
|
||||
reasoning = Column(Text, nullable=False, default="")
|
||||
|
||||
perception_snapshot = Column(Text, nullable=False, default="{}")
|
||||
outcome = Column(Text, nullable=True)
|
||||
|
||||
agent_id = Column(String(64), nullable=False, default="timmy", index=True)
|
||||
episode_id = Column(String(128), nullable=True, index=True)
|
||||
cell = Column(String(255), nullable=True, index=True)
|
||||
protocol_version = Column(String(16), nullable=False, default="1.0.0")
|
||||
|
||||
created_at = Column(
|
||||
DateTime, nullable=False, default=lambda: datetime.now(UTC)
|
||||
)
|
||||
|
||||
__table_args__ = (
|
||||
Index("ix_command_log_cmd_cell", "command", "cell"),
|
||||
Index("ix_command_log_episode", "episode_id", "timestamp"),
|
||||
)


# ---------------------------------------------------------------------------
# CommandLogger — high-level API
# ---------------------------------------------------------------------------


class CommandLogger:
    """High-level interface for logging, querying, and exporting commands.

    Args:
        db_url: SQLAlchemy database URL. Defaults to the project SQLite path.
    """

    def __init__(self, db_url: str | None = None) -> None:
        if db_url is None:
            DEFAULT_DB_PATH.parent.mkdir(parents=True, exist_ok=True)
            db_url = f"sqlite:///{DEFAULT_DB_PATH}"
        self._engine = create_engine(
            db_url, connect_args={"check_same_thread": False}
        )
        self._SessionLocal = sessionmaker(
            autocommit=False, autoflush=False, bind=self._engine
        )
        # Ensure table exists.
        Base.metadata.create_all(bind=self._engine, tables=[CommandLog.__table__])

    def _get_session(self) -> Session:
        return self._SessionLocal()

    # -- Write ---------------------------------------------------------------

    def log_command(
        self,
        command_input: CommandInput,
        perception: PerceptionOutput | None = None,
        outcome: str | None = None,
    ) -> int:
        """Persist a command to the log.

        Returns the auto-generated row id.
        """
        perception_json = perception.model_dump_json() if perception else "{}"
        cell = perception.location.cell if perception else None

        entry = CommandLog(
            timestamp=command_input.timestamp,
            command=command_input.command.value,
            params=json.dumps(command_input.params),
            reasoning=command_input.reasoning,
            perception_snapshot=perception_json,
            outcome=outcome,
            agent_id=command_input.agent_id,
            episode_id=command_input.episode_id,
            cell=cell,
            protocol_version=command_input.protocol_version,
        )

        session = self._get_session()
        try:
            session.add(entry)
            session.commit()
            session.refresh(entry)
            row_id: int = entry.id
            return row_id
        except Exception:
            session.rollback()
            raise
        finally:
            session.close()

    # -- Read ----------------------------------------------------------------

    def query(
        self,
        *,
        command_type: str | CommandType | None = None,
        cell: str | None = None,
        episode_id: str | None = None,
        agent_id: str | None = None,
        since: datetime | None = None,
        until: datetime | None = None,
        limit: int = 100,
        offset: int = 0,
    ) -> list[dict[str, Any]]:
        """Query command log entries with optional filters.

        Returns a list of dicts (serialisable to JSON).
        """
        session = self._get_session()
        try:
            q = session.query(CommandLog)

            if command_type is not None:
                q = q.filter(CommandLog.command == str(command_type))
            if cell is not None:
                q = q.filter(CommandLog.cell == cell)
            if episode_id is not None:
                q = q.filter(CommandLog.episode_id == episode_id)
            if agent_id is not None:
                q = q.filter(CommandLog.agent_id == agent_id)
            if since is not None:
                q = q.filter(CommandLog.timestamp >= since)
            if until is not None:
                q = q.filter(CommandLog.timestamp <= until)

            q = q.order_by(CommandLog.timestamp.desc())
            q = q.offset(offset).limit(limit)

            rows = q.all()
            return [self._row_to_dict(row) for row in rows]
        finally:
            session.close()

    # -- Export --------------------------------------------------------------

    def export_training_data(
        self,
        output_path: str | Path,
        *,
        episode_id: str | None = None,
        since: datetime | None = None,
        until: datetime | None = None,
    ) -> int:
        """Export command log entries as a JSONL file for fine-tuning.

        Each line is a JSON object with ``perception`` (input) and
        ``command`` + ``reasoning`` (target output).

        Returns the number of rows exported.
        """
        output_path = Path(output_path)
        output_path.parent.mkdir(parents=True, exist_ok=True)

        session = self._get_session()
        try:
            q = session.query(CommandLog)
            if episode_id is not None:
                q = q.filter(CommandLog.episode_id == episode_id)
            if since is not None:
                q = q.filter(CommandLog.timestamp >= since)
            if until is not None:
                q = q.filter(CommandLog.timestamp <= until)
            q = q.order_by(CommandLog.timestamp.asc())

            count = 0
            with open(output_path, "w", encoding="utf-8") as fh:
                for row in q.yield_per(500):
                    record = {
                        "input": {
                            "perception": json.loads(row.perception_snapshot),
                        },
                        "output": {
                            "command": row.command,
                            "params": json.loads(row.params),
                            "reasoning": row.reasoning,
                        },
                        "metadata": {
                            "timestamp": row.timestamp.isoformat() if row.timestamp else None,
                            "episode_id": row.episode_id,
                            "cell": row.cell,
                            "outcome": row.outcome,
                        },
                    }
                    fh.write(json.dumps(record) + "\n")
                    count += 1
            logger.info("Exported %d training records to %s", count, output_path)
            return count
        finally:
            session.close()
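Each exported JSONL line pairs a perception snapshot (model input) with the chosen command and its reasoning (target output), plus metadata. A self-contained sketch of one record's shape — the field values here are made up for illustration, but the three-part input/output/metadata structure mirrors what ``export_training_data`` writes:

```python
import json

# Illustrative record; values are invented, structure matches the export.
record = {
    "input": {"perception": {"location": {"cell": "Balmora", "x": 0.0, "y": 0.0}}},
    "output": {
        "command": "move_to",
        "params": {"target": "door_1"},
        "reasoning": "Head to the exit.",
    },
    "metadata": {
        "timestamp": "2026-03-21T12:00:00+00:00",
        "episode_id": "ep-001",
        "cell": "Balmora",
        "outcome": None,
    },
}

line = json.dumps(record)  # one line of the exported JSONL file
parsed = json.loads(line)
```

Because `json.dumps` never emits raw newlines, each record occupies exactly one line, which is what makes the file streamable for fine-tuning pipelines.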

    # -- Storage management --------------------------------------------------

    def rotate(self, max_age_days: int = 90) -> int:
        """Delete command log entries older than *max_age_days*.

        Returns the number of rows deleted.
        """
        cutoff = datetime.now(UTC) - timedelta(days=max_age_days)
        session = self._get_session()
        try:
            deleted = (
                session.query(CommandLog)
                .filter(CommandLog.timestamp < cutoff)
                .delete(synchronize_session=False)
            )
            session.commit()
            logger.info("Rotated %d command log entries older than %s", deleted, cutoff)
            return deleted
        except Exception:
            session.rollback()
            raise
        finally:
            session.close()

    def count(self) -> int:
        """Return the total number of command log entries."""
        session = self._get_session()
        try:
            return session.query(CommandLog).count()
        finally:
            session.close()

    # -- Helpers -------------------------------------------------------------

    @staticmethod
    def _row_to_dict(row: CommandLog) -> dict[str, Any]:
        return {
            "id": row.id,
            "timestamp": row.timestamp.isoformat() if row.timestamp else None,
            "command": row.command,
            "params": json.loads(row.params) if row.params else {},
            "reasoning": row.reasoning,
            "perception_snapshot": json.loads(row.perception_snapshot)
            if row.perception_snapshot
            else {},
            "outcome": row.outcome,
            "agent_id": row.agent_id,
            "episode_id": row.episode_id,
            "cell": row.cell,
            "protocol_version": row.protocol_version,
            "created_at": row.created_at.isoformat() if row.created_at else None,
        }
186
src/infrastructure/morrowind/schemas.py
Normal file
@@ -0,0 +1,186 @@
"""Pydantic v2 models for the Morrowind Perception/Command protocol.
|
||||
|
||||
These models enforce the contract defined in
|
||||
``docs/protocol/morrowind-perception-command-spec.md`` at runtime.
|
||||
They are engine-agnostic by design — see the Falsework Rule.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from datetime import datetime
|
||||
from enum import StrEnum
|
||||
from typing import Any
|
||||
|
||||
from pydantic import BaseModel, Field, model_validator
|
||||
|
||||
PROTOCOL_VERSION = "1.0.0"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Enums
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class EntityType(StrEnum):
|
||||
"""Controlled vocabulary for nearby entity types."""
|
||||
|
||||
NPC = "npc"
|
||||
CREATURE = "creature"
|
||||
ITEM = "item"
|
||||
DOOR = "door"
|
||||
CONTAINER = "container"
|
||||
|
||||
|
||||
class CommandType(StrEnum):
|
||||
"""All supported command types."""
|
||||
|
||||
MOVE_TO = "move_to"
|
||||
INTERACT = "interact"
|
||||
USE_ITEM = "use_item"
|
||||
WAIT = "wait"
|
||||
COMBAT_ACTION = "combat_action"
|
||||
DIALOGUE = "dialogue"
|
||||
JOURNAL_NOTE = "journal_note"
|
||||
NOOP = "noop"


# ---------------------------------------------------------------------------
# Perception Output sub-models
# ---------------------------------------------------------------------------


class Location(BaseModel):
    """Agent position within the game world."""

    cell: str = Field(..., description="Current cell/zone name")
    x: float = Field(..., description="World X coordinate")
    y: float = Field(..., description="World Y coordinate")
    z: float = Field(0.0, description="World Z coordinate")
    interior: bool = Field(False, description="Whether the agent is indoors")


class HealthStatus(BaseModel):
    """Agent health information."""

    current: int = Field(..., ge=0, description="Current health points")
    max: int = Field(..., gt=0, description="Maximum health points")

    @model_validator(mode="after")
    def current_le_max(self) -> "HealthStatus":
        if self.current > self.max:
            raise ValueError(
                f"current ({self.current}) cannot exceed max ({self.max})"
            )
        return self


class NearbyEntity(BaseModel):
    """An entity within the agent's perception radius."""

    entity_id: str = Field(..., description="Unique entity identifier")
    name: str = Field(..., description="Display name")
    entity_type: EntityType = Field(..., description="Entity category")
    distance: float = Field(..., ge=0, description="Distance from agent")
    disposition: int | None = Field(None, description="NPC disposition (0-100)")


class InventorySummary(BaseModel):
    """Lightweight overview of the agent's inventory."""

    gold: int = Field(0, ge=0, description="Gold held")
    item_count: int = Field(0, ge=0, description="Total items carried")
    encumbrance_pct: float = Field(
        0.0, ge=0.0, le=1.0, description="Encumbrance as fraction (0.0–1.0)"
    )


class QuestInfo(BaseModel):
    """A currently tracked quest."""

    quest_id: str = Field(..., description="Quest identifier")
    name: str = Field(..., description="Quest display name")
    stage: int = Field(0, ge=0, description="Current quest stage")


class Environment(BaseModel):
    """World-state flags."""

    time_of_day: str = Field("unknown", description="Time period (morning, afternoon, etc.)")
    weather: str = Field("clear", description="Current weather condition")
    is_combat: bool = Field(False, description="Whether the agent is in combat")
    is_dialogue: bool = Field(False, description="Whether the agent is in dialogue")


# ---------------------------------------------------------------------------
# Top-level schemas
# ---------------------------------------------------------------------------


class PerceptionOutput(BaseModel):
    """Complete perception snapshot returned by ``GET /perception``.

    This is the engine-agnostic view of the game world consumed by the
    heartbeat loop and reasoning layer.
    """

    protocol_version: str = Field(
        default=PROTOCOL_VERSION,
        description="Protocol SemVer string",
    )
    timestamp: datetime = Field(..., description="When the snapshot was taken")
    agent_id: str = Field(..., description="Which agent this perception belongs to")

    location: Location
    health: HealthStatus
    nearby_entities: list[NearbyEntity] = Field(default_factory=list)
    inventory_summary: InventorySummary = Field(default_factory=InventorySummary)
    active_quests: list[QuestInfo] = Field(default_factory=list)
    environment: Environment = Field(default_factory=Environment)

    raw_engine_data: dict[str, Any] = Field(
        default_factory=dict,
        description="Opaque engine-specific blob — not relied upon by heartbeat",
    )
|
||||
|
||||
|
||||
class CommandContext(BaseModel):
|
||||
"""Metadata linking a command to its triggering perception."""
|
||||
|
||||
perception_timestamp: datetime | None = Field(
|
||||
None, description="Timestamp of the perception that triggered this command"
|
||||
)
|
||||
heartbeat_cycle: int | None = Field(
|
||||
None, ge=0, description="Heartbeat cycle number"
|
||||
)
|
||||
|
||||
|
||||
class CommandInput(BaseModel):
|
||||
"""Command payload sent via ``POST /command``.
|
||||
|
||||
Every command includes a ``reasoning`` field so the command log
|
||||
captures the agent's intent — critical for training-data export.
|
||||
"""
|
||||
|
||||
protocol_version: str = Field(
|
||||
default=PROTOCOL_VERSION,
|
||||
description="Protocol SemVer string",
|
||||
)
|
||||
timestamp: datetime = Field(..., description="When the command was issued")
|
||||
agent_id: str = Field(..., description="Which agent is issuing the command")
|
||||
|
||||
command: CommandType = Field(..., description="Command type")
|
||||
params: dict[str, Any] = Field(
|
||||
default_factory=dict, description="Command-specific parameters"
|
||||
)
|
||||
reasoning: str = Field(
|
||||
...,
|
||||
min_length=1,
|
||||
description="Natural-language explanation of why this command was chosen",
|
||||
)
|
||||
|
||||
episode_id: str | None = Field(
|
||||
None, description="Groups commands into training episodes"
|
||||
)
|
||||
context: CommandContext | None = Field(
|
||||
None, description="Metadata linking command to its triggering perception"
|
||||
)
|
||||
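To make the validation behaviour concrete, here is a trimmed-down, self-contained sketch of ``CommandInput`` using Pydantic v2. The ``PROTOCOL_VERSION`` constant and ``CommandType`` enum are stand-ins for definitions that live elsewhere in ``schemas.py`` and may differ in shape:

```python
from datetime import datetime, timezone
from enum import Enum
from pydantic import BaseModel, Field, ValidationError

# Stand-ins for names defined elsewhere in schemas.py (assumed shapes)
PROTOCOL_VERSION = "1.0.0"

class CommandType(str, Enum):
    MOVE = "move"
    ATTACK = "attack"

class CommandInput(BaseModel):
    protocol_version: str = Field(default=PROTOCOL_VERSION)
    timestamp: datetime
    agent_id: str
    command: CommandType
    reasoning: str = Field(..., min_length=1)

cmd = CommandInput(
    timestamp=datetime.now(timezone.utc),
    agent_id="timmy",
    command=CommandType.MOVE,
    reasoning="Head toward Balmora to progress the active quest.",
)
print(cmd.protocol_version)  # "1.0.0" — defaulted, never omitted from the log

# min_length=1 means an empty reasoning string is rejected outright,
# so every logged command carries intent for training-data export.
try:
    CommandInput(
        timestamp=datetime.now(timezone.utc),
        agent_id="timmy",
        command=CommandType.MOVE,
        reasoning="",
    )
except ValidationError:
    print("empty reasoning rejected")
```

This is why the spec can promise that every exported training pair has a non-empty reasoning field: the boundary model refuses to accept anything else.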
243	src/infrastructure/morrowind/training_export.py	Normal file
@@ -0,0 +1,243 @@
"""Fine-tuning dataset export pipeline for command log data.
|
||||
|
||||
Transforms raw command log entries into structured training datasets
|
||||
suitable for supervised fine-tuning of local models.
|
||||
|
||||
Usage::
|
||||
|
||||
from infrastructure.morrowind.training_export import TrainingExporter
|
||||
|
||||
exporter = TrainingExporter(command_logger)
|
||||
stats = exporter.export_chat_format("train.jsonl")
|
||||
stats = exporter.export_episode_sequences("episodes/", min_length=5)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import logging
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
from .command_log import CommandLogger
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
class ExportStats:
|
||||
"""Statistics about an export run."""
|
||||
|
||||
total_records: int = 0
|
||||
episodes_exported: int = 0
|
||||
skipped_records: int = 0
|
||||
output_path: str = ""
|
||||
format: str = ""
|
||||
exported_at: str = field(default_factory=lambda: datetime.utcnow().isoformat())
|
||||
|
||||
|
||||
class TrainingExporter:
|
||||
"""Builds fine-tuning datasets from the command log.
|
||||
|
||||
Supports multiple output formats used by common fine-tuning
|
||||
frameworks (chat-completion style, instruction-following, episode
|
||||
sequences).
|
||||
|
||||
Args:
|
||||
command_logger: A :class:`CommandLogger` instance to read from.
|
||||
"""
|
||||
|
||||
def __init__(self, command_logger: CommandLogger) -> None:
|
||||
self._logger = command_logger
|
||||
|
||||
# -- Chat-completion format ----------------------------------------------
|
||||
|
||||
def export_chat_format(
|
||||
self,
|
||||
output_path: str | Path,
|
||||
*,
|
||||
since: datetime | None = None,
|
||||
until: datetime | None = None,
|
||||
max_records: int | None = None,
|
||||
) -> ExportStats:
|
||||
"""Export as chat-completion training pairs.
|
||||
|
||||
Each line is a JSON object with ``messages`` list containing a
|
||||
``system`` prompt, ``user`` (perception), and ``assistant``
|
||||
(command + reasoning) message.
|
||||
|
||||
This format is compatible with OpenAI / Llama fine-tuning APIs.
|
||||
"""
|
||||
output_path = Path(output_path)
|
||||
output_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
rows = self._logger.query(
|
||||
since=since,
|
||||
until=until,
|
||||
limit=max_records or 100_000,
|
||||
)
|
||||
# query returns newest-first; reverse for chronological export
|
||||
rows.reverse()
|
||||
|
||||
stats = ExportStats(
|
||||
output_path=str(output_path),
|
||||
format="chat_completion",
|
||||
)
|
||||
|
||||
with open(output_path, "w", encoding="utf-8") as fh:
|
||||
for row in rows:
|
||||
perception = row.get("perception_snapshot", {})
|
||||
if not perception:
|
||||
stats.skipped_records += 1
|
||||
continue
|
||||
|
||||
record = {
|
||||
"messages": [
|
||||
{
|
||||
"role": "system",
|
||||
"content": (
|
||||
"You are an autonomous agent navigating a game world. "
|
||||
"Given a perception of the world state, decide what "
|
||||
"command to execute and explain your reasoning."
|
||||
),
|
||||
},
|
||||
{
|
||||
"role": "user",
|
||||
"content": json.dumps(perception),
|
||||
},
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": json.dumps(
|
||||
{
|
||||
"command": row.get("command"),
|
||||
"params": row.get("params", {}),
|
||||
"reasoning": row.get("reasoning", ""),
|
||||
}
|
||||
),
|
||||
},
|
||||
],
|
||||
}
|
||||
fh.write(json.dumps(record) + "\n")
|
||||
stats.total_records += 1
|
||||
|
||||
logger.info(
|
||||
"Exported %d chat-format records to %s (skipped %d)",
|
||||
stats.total_records,
|
||||
output_path,
|
||||
stats.skipped_records,
|
||||
)
|
||||
return stats
|
||||
|
||||
# -- Episode sequences ---------------------------------------------------
|
||||
|
||||
def export_episode_sequences(
|
||||
self,
|
||||
output_dir: str | Path,
|
||||
*,
|
||||
min_length: int = 3,
|
||||
since: datetime | None = None,
|
||||
until: datetime | None = None,
|
||||
) -> ExportStats:
|
||||
"""Export command sequences grouped by episode.
|
||||
|
||||
Each episode is written as a separate JSONL file in *output_dir*.
|
||||
Episodes shorter than *min_length* are skipped.
|
||||
"""
|
||||
output_dir = Path(output_dir)
|
||||
output_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Gather all rows (high limit) and group by episode.
|
||||
rows = self._logger.query(since=since, until=until, limit=1_000_000)
|
||||
rows.reverse() # chronological
|
||||
|
||||
episodes: dict[str, list[dict[str, Any]]] = {}
|
||||
for row in rows:
|
||||
ep_id = row.get("episode_id") or "unknown"
|
||||
episodes.setdefault(ep_id, []).append(row)
|
||||
|
||||
stats = ExportStats(
|
||||
output_path=str(output_dir),
|
||||
format="episode_sequence",
|
||||
)
|
||||
|
||||
for ep_id, entries in episodes.items():
|
||||
if len(entries) < min_length:
|
||||
stats.skipped_records += len(entries)
|
||||
continue
|
||||
|
||||
ep_file = output_dir / f"{ep_id}.jsonl"
|
||||
with open(ep_file, "w", encoding="utf-8") as fh:
|
||||
for entry in entries:
|
||||
fh.write(json.dumps(entry, default=str) + "\n")
|
||||
stats.total_records += 1
|
||||
stats.episodes_exported += 1
|
||||
|
||||
logger.info(
|
||||
"Exported %d episodes (%d records) to %s",
|
||||
stats.episodes_exported,
|
||||
stats.total_records,
|
||||
output_dir,
|
||||
)
|
||||
return stats
|
||||
|
||||
# -- Instruction-following format ----------------------------------------
|
||||
|
||||
def export_instruction_format(
|
||||
self,
|
||||
output_path: str | Path,
|
||||
*,
|
||||
since: datetime | None = None,
|
||||
until: datetime | None = None,
|
||||
max_records: int | None = None,
|
||||
) -> ExportStats:
|
||||
"""Export as instruction/response pairs (Alpaca-style).
|
||||
|
||||
Each line has ``instruction``, ``input``, and ``output`` fields.
|
||||
"""
|
||||
output_path = Path(output_path)
|
||||
output_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
rows = self._logger.query(
|
||||
since=since,
|
||||
until=until,
|
||||
limit=max_records or 100_000,
|
||||
)
|
||||
rows.reverse()
|
||||
|
||||
stats = ExportStats(
|
||||
output_path=str(output_path),
|
||||
format="instruction",
|
||||
)
|
||||
|
||||
with open(output_path, "w", encoding="utf-8") as fh:
|
||||
for row in rows:
|
||||
perception = row.get("perception_snapshot", {})
|
||||
if not perception:
|
||||
stats.skipped_records += 1
|
||||
continue
|
||||
|
||||
record = {
|
||||
"instruction": (
|
||||
"Given the following game world perception, decide what "
|
||||
"command to execute. Explain your reasoning."
|
||||
),
|
||||
"input": json.dumps(perception),
|
||||
"output": json.dumps(
|
||||
{
|
||||
"command": row.get("command"),
|
||||
"params": row.get("params", {}),
|
||||
"reasoning": row.get("reasoning", ""),
|
||||
}
|
||||
),
|
||||
}
|
||||
fh.write(json.dumps(record) + "\n")
|
||||
stats.total_records += 1
|
||||
|
||||
logger.info(
|
||||
"Exported %d instruction-format records to %s",
|
||||
stats.total_records,
|
||||
output_path,
|
||||
)
|
||||
return stats
|
||||
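The episode-grouping step in ``export_episode_sequences`` can be illustrated standalone. The rows below are hypothetical command-log entries (the real rows come from ``CommandLogger.query``); note how a missing or ``None`` ``episode_id`` falls into the ``"unknown"`` bucket:

```python
# Hypothetical command-log rows — field names match what the exporter reads
rows = [
    {"episode_id": "ep1", "command": "move", "reasoning": "go"},
    {"episode_id": None, "command": "wait", "reasoning": "idle"},
    {"episode_id": "ep1", "command": "attack", "reasoning": "fight"},
]

# Group by episode exactly the way export_episode_sequences does:
# `row.get(...) or "unknown"` folds both missing keys and None values
# into a single catch-all episode.
episodes: dict[str, list[dict]] = {}
for row in rows:
    ep_id = row.get("episode_id") or "unknown"
    episodes.setdefault(ep_id, []).append(row)

print(sorted(episodes))      # ['ep1', 'unknown']
print(len(episodes["ep1"]))  # 2
```

Because grouping is done in memory after a single chronological pass, each episode file preserves command order within the episode.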
@@ -1,29 +0,0 @@
"""World interface — engine-agnostic adapter pattern for embodied agents.

Provides the ``WorldInterface`` ABC and an adapter registry so Timmy can
observe, act, and speak in any game world (Morrowind, Luanti, Godot, …)
through a single contract.

Quick start::

    from infrastructure.world import get_adapter, register_adapter
    from infrastructure.world.interface import WorldInterface

    register_adapter("mock", MockWorldAdapter)
    world = get_adapter("mock")
    perception = world.observe()
"""

from infrastructure.world.registry import AdapterRegistry

_registry = AdapterRegistry()

register_adapter = _registry.register
get_adapter = _registry.get
list_adapters = _registry.list_adapters

__all__ = [
    "register_adapter",
    "get_adapter",
    "list_adapters",
]
@@ -1 +0,0 @@
"""Built-in world adapters."""
@@ -1,99 +0,0 @@
"""Mock world adapter — returns canned perception and logs commands.

Useful for testing the heartbeat loop and WorldInterface contract
without a running game server.
"""

from __future__ import annotations

import logging
from dataclasses import dataclass
from datetime import UTC, datetime

from infrastructure.world.interface import WorldInterface
from infrastructure.world.types import (
    ActionResult,
    ActionStatus,
    CommandInput,
    PerceptionOutput,
)

logger = logging.getLogger(__name__)


@dataclass
class _ActionLog:
    """Record of an action dispatched to the mock world."""

    command: CommandInput
    timestamp: datetime


class MockWorldAdapter(WorldInterface):
    """In-memory mock adapter for testing.

    * ``observe()`` returns configurable canned perception.
    * ``act()`` logs the command and returns success.
    * ``speak()`` logs the message.

    Inspect ``action_log`` and ``speech_log`` to verify behaviour in tests.
    """

    def __init__(
        self,
        *,
        location: str = "Test Chamber",
        entities: list[str] | None = None,
        events: list[str] | None = None,
    ) -> None:
        self._location = location
        self._entities = entities or ["TestNPC"]
        self._events = events or []
        self._connected = False
        self.action_log: list[_ActionLog] = []
        self.speech_log: list[dict] = []

    # -- lifecycle ---------------------------------------------------------

    def connect(self) -> None:
        self._connected = True
        logger.info("MockWorldAdapter connected")

    def disconnect(self) -> None:
        self._connected = False
        logger.info("MockWorldAdapter disconnected")

    @property
    def is_connected(self) -> bool:
        return self._connected

    # -- core contract -----------------------------------------------------

    def observe(self) -> PerceptionOutput:
        logger.debug("MockWorldAdapter.observe()")
        return PerceptionOutput(
            timestamp=datetime.now(UTC),
            location=self._location,
            entities=list(self._entities),
            events=list(self._events),
            raw={"adapter": "mock"},
        )

    def act(self, command: CommandInput) -> ActionResult:
        logger.debug("MockWorldAdapter.act(%s)", command.action)
        self.action_log.append(_ActionLog(command=command, timestamp=datetime.now(UTC)))
        return ActionResult(
            status=ActionStatus.SUCCESS,
            message=f"Mock executed: {command.action}",
            data={"adapter": "mock"},
        )

    def speak(self, message: str, target: str | None = None) -> None:
        logger.debug("MockWorldAdapter.speak(%r, target=%r)", message, target)
        self.speech_log.append(
            {
                "message": message,
                "target": target,
                "timestamp": datetime.now(UTC).isoformat(),
            }
        )
@@ -1,58 +0,0 @@
"""TES3MP world adapter — stub for Morrowind multiplayer via TES3MP.

This adapter will eventually connect to a TES3MP server and translate
the WorldInterface contract into TES3MP commands. For now every method
raises ``NotImplementedError`` with guidance on what needs wiring up.

Once PR #864 merges, import PerceptionOutput and CommandInput directly
from ``infrastructure.morrowind.schemas`` if their shapes differ from
the canonical types in ``infrastructure.world.types``.
"""

from __future__ import annotations

import logging

from infrastructure.world.interface import WorldInterface
from infrastructure.world.types import ActionResult, CommandInput, PerceptionOutput

logger = logging.getLogger(__name__)


class TES3MPWorldAdapter(WorldInterface):
    """Stub adapter for TES3MP (Morrowind multiplayer).

    All core methods raise ``NotImplementedError``.
    Implement ``connect()`` first — it should open a socket to the
    TES3MP server and authenticate.
    """

    def __init__(self, *, host: str = "localhost", port: int = 25565) -> None:
        self._host = host
        self._port = port
        self._connected = False

    # -- lifecycle ---------------------------------------------------------

    def connect(self) -> None:
        raise NotImplementedError("TES3MPWorldAdapter.connect() — wire up TES3MP server socket")

    def disconnect(self) -> None:
        raise NotImplementedError("TES3MPWorldAdapter.disconnect() — close TES3MP server socket")

    @property
    def is_connected(self) -> bool:
        return self._connected

    # -- core contract (stubs) ---------------------------------------------

    def observe(self) -> PerceptionOutput:
        raise NotImplementedError("TES3MPWorldAdapter.observe() — poll TES3MP for player/NPC state")

    def act(self, command: CommandInput) -> ActionResult:
        raise NotImplementedError(
            "TES3MPWorldAdapter.act() — translate CommandInput to TES3MP packet"
        )

    def speak(self, message: str, target: str | None = None) -> None:
        raise NotImplementedError("TES3MPWorldAdapter.speak() — send chat message via TES3MP")
@@ -1,64 +0,0 @@
"""Abstract WorldInterface — the contract every game-world adapter must fulfil.

Follows a Gymnasium-inspired pattern: observe → act → speak, with each
method returning strongly-typed data structures.

Any future engine (TES3MP, Luanti, Godot, …) plugs in by subclassing
``WorldInterface`` and implementing the three methods.
"""

from __future__ import annotations

from abc import ABC, abstractmethod

from infrastructure.world.types import ActionResult, CommandInput, PerceptionOutput


class WorldInterface(ABC):
    """Engine-agnostic base class for world adapters.

    Subclasses must implement:
    - ``observe()`` — gather structured perception from the world
    - ``act()`` — dispatch a command and return the outcome
    - ``speak()`` — send a message to an NPC / player / broadcast

    Lifecycle hooks ``connect()`` and ``disconnect()`` are optional.
    """

    # -- lifecycle (optional overrides) ------------------------------------

    def connect(self) -> None:  # noqa: B027
        """Establish connection to the game world.

        Default implementation is a no-op. Override to open sockets,
        authenticate, etc.
        """

    def disconnect(self) -> None:  # noqa: B027
        """Tear down the connection.

        Default implementation is a no-op.
        """

    @property
    def is_connected(self) -> bool:
        """Return ``True`` if the adapter has an active connection.

        Default returns ``True``. Override for adapters that maintain
        persistent connections.
        """
        return True

    # -- core contract (must implement) ------------------------------------

    @abstractmethod
    def observe(self) -> PerceptionOutput:
        """Return a structured snapshot of the current world state."""

    @abstractmethod
    def act(self, command: CommandInput) -> ActionResult:
        """Execute *command* in the world and return the result."""

    @abstractmethod
    def speak(self, message: str, target: str | None = None) -> None:
        """Send *message* in the world, optionally directed at *target*."""
@@ -1,54 +0,0 @@
"""Adapter registry — register and instantiate world adapters by name.

Usage::

    registry = AdapterRegistry()
    registry.register("mock", MockWorldAdapter)
    adapter = registry.get("mock", some_kwarg="value")
"""

from __future__ import annotations

import logging
from typing import Any

from infrastructure.world.interface import WorldInterface

logger = logging.getLogger(__name__)


class AdapterRegistry:
    """Name → WorldInterface class registry with instantiation."""

    def __init__(self) -> None:
        self._adapters: dict[str, type[WorldInterface]] = {}

    def register(self, name: str, cls: type[WorldInterface]) -> None:
        """Register an adapter class under *name*.

        Raises ``TypeError`` if *cls* is not a ``WorldInterface`` subclass.
        """
        if not (isinstance(cls, type) and issubclass(cls, WorldInterface)):
            raise TypeError(f"{cls!r} is not a WorldInterface subclass")
        if name in self._adapters:
            logger.warning("Overwriting adapter %r (was %r)", name, self._adapters[name])
        self._adapters[name] = cls
        logger.info("Registered world adapter: %s → %s", name, cls.__name__)

    def get(self, name: str, **kwargs: Any) -> WorldInterface:
        """Instantiate and return the adapter registered as *name*.

        Raises ``KeyError`` if *name* is not registered.
        """
        cls = self._adapters[name]
        return cls(**kwargs)

    def list_adapters(self) -> list[str]:
        """Return sorted list of registered adapter names."""
        return sorted(self._adapters)

    def __contains__(self, name: str) -> bool:
        return name in self._adapters

    def __len__(self) -> int:
        return len(self._adapters)
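The register/get flow reduces to a small runnable sketch. Type checking and logging are stripped out here, and ``MockAdapter`` is a hypothetical stand-in, so this shows only the mechanism: classes go in by name, configured instances come out via keyword arguments:

```python
class AdapterRegistry:
    """Minimal stand-in mirroring registry.py's name → class mapping."""

    def __init__(self) -> None:
        self._adapters: dict[str, type] = {}

    def register(self, name: str, cls: type) -> None:
        self._adapters[name] = cls

    def get(self, name: str, **kwargs):
        # KeyError on unknown names, same as the real registry
        return self._adapters[name](**kwargs)

    def list_adapters(self) -> list[str]:
        return sorted(self._adapters)

    def __contains__(self, name: str) -> bool:
        return name in self._adapters

class MockAdapter:
    """Hypothetical adapter used only for this demonstration."""

    def __init__(self, *, location: str = "Test Chamber") -> None:
        self.location = location

registry = AdapterRegistry()
registry.register("mock", MockAdapter)
adapter = registry.get("mock", location="Balmora")
print(adapter.location)          # Balmora
print("mock" in registry)        # True
print(registry.list_adapters())  # ['mock']
```

Storing classes rather than instances is the key design choice: each ``get()`` call produces a fresh, independently configured adapter, so tests never share mutable world state.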
@@ -1,71 +0,0 @@
"""Canonical data types for world interaction.

These mirror the PerceptionOutput / CommandInput types from PR #864's
``morrowind/schemas.py``. When that PR merges, these can be replaced
with re-exports — but until then they serve as the stable contract for
every WorldInterface adapter.
"""

from __future__ import annotations

from dataclasses import dataclass, field
from datetime import UTC, datetime
from enum import StrEnum


class ActionStatus(StrEnum):
    """Outcome of an action dispatched to the world."""

    SUCCESS = "success"
    FAILURE = "failure"
    PENDING = "pending"
    NOOP = "noop"


@dataclass
class PerceptionOutput:
    """Structured world state returned by ``WorldInterface.observe()``.

    Attributes:
        timestamp: When the observation was captured.
        location: Free-form location descriptor (e.g. "Balmora, Fighters Guild").
        entities: List of nearby entity descriptions.
        events: Recent game events since last observation.
        raw: Optional raw / engine-specific payload for advanced consumers.
    """

    timestamp: datetime = field(default_factory=lambda: datetime.now(UTC))
    location: str = ""
    entities: list[str] = field(default_factory=list)
    events: list[str] = field(default_factory=list)
    raw: dict = field(default_factory=dict)


@dataclass
class CommandInput:
    """Action command sent via ``WorldInterface.act()``.

    Attributes:
        action: Verb / action name (e.g. "move", "attack", "use_item").
        target: Optional target identifier.
        parameters: Arbitrary key-value payload for engine-specific params.
    """

    action: str
    target: str | None = None
    parameters: dict = field(default_factory=dict)


@dataclass
class ActionResult:
    """Outcome returned by ``WorldInterface.act()``.

    Attributes:
        status: Whether the action succeeded, failed, etc.
        message: Human-readable description of the outcome.
        data: Arbitrary engine-specific result payload.
    """

    status: ActionStatus = ActionStatus.SUCCESS
    message: str = ""
    data: dict = field(default_factory=dict)
@@ -1,286 +0,0 @@
|
||||
"""Heartbeat v2 — WorldInterface-driven cognitive loop.
|
||||
|
||||
Drives real observe → reason → act → reflect cycles through whatever
|
||||
``WorldInterface`` adapter is connected. When no adapter is present,
|
||||
gracefully falls back to the existing ``run_cycle()`` behaviour.
|
||||
|
||||
Usage::
|
||||
|
||||
heartbeat = Heartbeat(world=adapter, interval=30.0)
|
||||
await heartbeat.run_once() # single cycle
|
||||
await heartbeat.start() # background loop
|
||||
heartbeat.stop() # graceful shutdown
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
import time
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import UTC, datetime
|
||||
|
||||
from loop.phase1_gather import gather
|
||||
from loop.phase2_reason import reason
|
||||
from loop.phase3_act import act
|
||||
from loop.schema import ContextPayload
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Cycle log entry
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
@dataclass
|
||||
class CycleRecord:
|
||||
"""One observe → reason → act → reflect cycle."""
|
||||
|
||||
cycle_id: int
|
||||
timestamp: str
|
||||
observation: dict = field(default_factory=dict)
|
||||
reasoning_summary: str = ""
|
||||
action_taken: str = ""
|
||||
action_status: str = ""
|
||||
reflect_notes: str = ""
|
||||
duration_ms: int = 0
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Heartbeat
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class Heartbeat:
|
||||
"""Manages the recurring cognitive loop with optional world adapter.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
world:
|
||||
A ``WorldInterface`` instance (or ``None`` for passive mode).
|
||||
interval:
|
||||
Seconds between heartbeat ticks. 30 s for embodied mode,
|
||||
300 s (5 min) for passive thinking.
|
||||
on_cycle:
|
||||
Optional async callback invoked after each cycle with the
|
||||
``CycleRecord``.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
*,
|
||||
world=None, # WorldInterface | None
|
||||
interval: float = 30.0,
|
||||
on_cycle=None, # Callable[[CycleRecord], Awaitable[None]] | None
|
||||
) -> None:
|
||||
self._world = world
|
||||
self._interval = interval
|
||||
self._on_cycle = on_cycle
|
||||
self._cycle_count: int = 0
|
||||
self._running = False
|
||||
self._task: asyncio.Task | None = None
|
||||
self.history: list[CycleRecord] = []
|
||||
|
||||
# -- properties --------------------------------------------------------
|
||||
|
||||
@property
|
||||
def world(self):
|
||||
return self._world
|
||||
|
||||
@world.setter
|
||||
def world(self, adapter) -> None:
|
||||
self._world = adapter
|
||||
|
||||
@property
|
||||
def interval(self) -> float:
|
||||
return self._interval
|
||||
|
||||
@interval.setter
|
||||
def interval(self, value: float) -> None:
|
||||
self._interval = max(1.0, value)
|
||||
|
||||
@property
|
||||
def is_running(self) -> bool:
|
||||
return self._running
|
||||
|
||||
@property
|
||||
def cycle_count(self) -> int:
|
||||
return self._cycle_count
|
||||
|
||||
# -- single cycle ------------------------------------------------------
|
||||
|
||||
async def run_once(self) -> CycleRecord:
|
||||
"""Execute one full heartbeat cycle.
|
||||
|
||||
If a world adapter is present:
|
||||
1. Observe — ``world.observe()``
|
||||
2. Gather + Reason + Act via the three-phase loop, with the
|
||||
observation injected into the payload
|
||||
3. Dispatch the decided action back to ``world.act()``
|
||||
4. Reflect — log the cycle
|
||||
|
||||
Without an adapter the existing loop runs on a timer-sourced
|
||||
payload (passive thinking).
|
||||
"""
|
||||
self._cycle_count += 1
|
||||
start = time.monotonic()
|
||||
record = CycleRecord(
|
||||
cycle_id=self._cycle_count,
|
||||
timestamp=datetime.now(UTC).isoformat(),
|
||||
)
|
||||
|
||||
if self._world is not None:
|
||||
record = await self._embodied_cycle(record)
|
||||
else:
|
||||
record = await self._passive_cycle(record)
|
||||
|
||||
record.duration_ms = int((time.monotonic() - start) * 1000)
|
||||
self.history.append(record)
|
||||
|
||||
# Broadcast via WebSocket (best-effort)
|
||||
await self._broadcast(record)
|
||||
|
||||
if self._on_cycle:
|
||||
await self._on_cycle(record)
|
||||
|
||||
logger.info(
|
||||
"Heartbeat cycle #%d complete (%d ms) — action=%s status=%s",
|
||||
record.cycle_id,
|
||||
record.duration_ms,
|
||||
record.action_taken or "(passive)",
|
||||
record.action_status or "n/a",
|
||||
)
|
||||
return record
|
||||
|
||||
# -- background loop ---------------------------------------------------
|
||||
|
||||
async def start(self) -> None:
|
||||
"""Start the recurring heartbeat loop as a background task."""
|
||||
if self._running:
|
||||
logger.warning("Heartbeat already running")
|
||||
return
|
||||
self._running = True
|
||||
self._task = asyncio.current_task() or asyncio.ensure_future(self._loop())
|
||||
if self._task is not asyncio.current_task():
|
||||
return
|
||||
await self._loop()
|
||||
|
||||
async def _loop(self) -> None:
|
||||
logger.info(
|
||||
"Heartbeat loop started (interval=%.1fs, adapter=%s)",
|
||||
self._interval,
|
||||
type(self._world).__name__ if self._world else "None",
|
||||
)
|
||||
while self._running:
|
||||
try:
|
||||
await self.run_once()
|
||||
except Exception:
|
||||
logger.exception("Heartbeat cycle failed")
|
||||
await asyncio.sleep(self._interval)
|
||||
|
||||
def stop(self) -> None:
|
||||
"""Signal the heartbeat loop to stop after the current cycle."""
|
||||
self._running = False
|
||||
logger.info("Heartbeat stop requested")
|
||||
|
||||
# -- internal: embodied cycle ------------------------------------------
|
||||
|
||||
async def _embodied_cycle(self, record: CycleRecord) -> CycleRecord:
|
||||
"""Cycle with a live world adapter: observe → reason → act → reflect."""
|
||||
from infrastructure.world.types import ActionStatus, CommandInput
|
||||
|
||||
# 1. Observe
|
||||
perception = self._world.observe()
|
||||
record.observation = {
|
||||
"location": perception.location,
|
||||
"entities": perception.entities,
|
||||
"events": perception.events,
|
||||
}
|
||||
|
||||
# 2. Feed observation into the three-phase loop
|
||||
obs_content = (
|
||||
f"Location: {perception.location}\n"
|
||||
f"Entities: {', '.join(perception.entities)}\n"
|
||||
f"Events: {', '.join(perception.events)}"
|
||||
)
|
||||
payload = ContextPayload(
|
||||
source="world",
|
||||
content=obs_content,
|
||||
metadata={"perception": record.observation},
|
||||
)
|
||||
|
||||
gathered = gather(payload)
|
||||
reasoned = reason(gathered)
|
||||
acted = act(reasoned)
|
||||
|
||||
# Extract action decision from the acted payload
|
||||
action_name = acted.metadata.get("action", "idle")
|
||||
action_target = acted.metadata.get("action_target")
|
||||
action_params = acted.metadata.get("action_params", {})
|
||||
record.reasoning_summary = acted.metadata.get("reasoning", acted.content[:200])
|
||||
|
||||
# 3. Dispatch action to world
|
||||
if action_name != "idle":
|
||||
```python
                cmd = CommandInput(
                    action=action_name,
                    target=action_target,
                    parameters=action_params,
                )
                result = self._world.act(cmd)
                record.action_taken = action_name
                record.action_status = result.status.value
            else:
                record.action_taken = "idle"
                record.action_status = ActionStatus.NOOP.value

        # 4. Reflect
        record.reflect_notes = (
            f"Observed {len(perception.entities)} entities at {perception.location}. "
            f"Action: {record.action_taken} → {record.action_status}."
        )

        return record

    # -- internal: passive cycle -------------------------------------------

    async def _passive_cycle(self, record: CycleRecord) -> CycleRecord:
        """Cycle without a world adapter — existing think_once() behaviour."""
        payload = ContextPayload(
            source="timer",
            content="heartbeat",
            metadata={"mode": "passive"},
        )

        gathered = gather(payload)
        reasoned = reason(gathered)
        acted = act(reasoned)

        record.reasoning_summary = acted.content[:200]
        record.action_taken = "think"
        record.action_status = "noop"
        record.reflect_notes = "Passive thinking cycle — no world adapter connected."

        return record

    # -- broadcast ---------------------------------------------------------

    async def _broadcast(self, record: CycleRecord) -> None:
        """Emit heartbeat cycle data via WebSocket (best-effort)."""
        try:
            from infrastructure.ws_manager.handler import ws_manager

            await ws_manager.broadcast(
                "heartbeat.cycle",
                {
                    "cycle_id": record.cycle_id,
                    "timestamp": record.timestamp,
                    "action": record.action_taken,
                    "action_status": record.action_status,
                    "reasoning_summary": record.reasoning_summary[:300],
                    "observation": record.observation,
                    "duration_ms": record.duration_ms,
                },
            )
        except (ImportError, AttributeError, ConnectionError, RuntimeError) as exc:
            logger.debug("Heartbeat broadcast skipped: %s", exc)
```
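For orientation, the passive path above reduces to a plain three-phase pipeline. A minimal self-contained sketch follows — the `ContextPayload` / `gather` / `reason` / `act` names mirror the code above, but this simplified dataclass is illustrative, not the project's actual model:

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class ContextPayload:
    """Immutable payload; each phase returns a copy with extra metadata."""
    source: str
    content: str
    metadata: dict = field(default_factory=dict)

    def with_metadata(self, **extra) -> "ContextPayload":
        return replace(self, metadata={**self.metadata, **extra})


def gather(p: ContextPayload) -> ContextPayload:
    return p.with_metadata(phase="gather", gathered=True)


def reason(p: ContextPayload) -> ContextPayload:
    return p.with_metadata(phase="reason")


def act(p: ContextPayload) -> ContextPayload:
    return p.with_metadata(phase="act")


# The passive cycle is just function composition over the payload.
acted = act(reason(gather(ContextPayload(source="timer", content="heartbeat"))))
print(acted.metadata["phase"], acted.metadata["gathered"])  # → act True
```

Keeping the payload frozen means each phase's metadata accumulates without mutating earlier stages, which is what lets the heartbeat log any intermediate payload safely.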
```diff
@@ -17,9 +17,9 @@ logger = logging.getLogger(__name__)
def gather(payload: ContextPayload) -> ContextPayload:
    """Accept raw input and return structured context for reasoning.

    When the payload carries a ``perception`` dict in metadata (injected by
    the heartbeat loop from a WorldInterface adapter), that observation is
    folded into the gathered context. Otherwise behaves as before.
    Stub: tags the payload with phase=gather and logs transit.
    Timmy will flesh this out with context selection, memory lookup,
    adapter polling, and attention-residual weighting.
    """
    logger.info(
        "Phase 1 (Gather) received: source=%s content_len=%d tokens=%d",
@@ -28,20 +28,7 @@ def gather(payload: ContextPayload) -> ContextPayload:
        payload.token_count,
    )

    extra: dict = {"phase": "gather", "gathered": True}

    # Enrich with world observation when present
    perception = payload.metadata.get("perception")
    if perception:
        extra["world_observation"] = perception
        logger.info(
            "Phase 1 (Gather) world observation: location=%s entities=%d events=%d",
            perception.get("location", "?"),
            len(perception.get("entities", [])),
            len(perception.get("events", [])),
        )

    result = payload.with_metadata(**extra)
    result = payload.with_metadata(phase="gather", gathered=True)

    logger.info(
        "Phase 1 (Gather) produced: metadata_keys=%s",
```
```diff
@@ -14,8 +14,6 @@ from dataclasses import dataclass, field
from datetime import UTC, datetime
from pathlib import Path

from config import settings

logger = logging.getLogger(__name__)

# Paths
@@ -30,7 +28,7 @@ def get_connection() -> Generator[sqlite3.Connection, None, None]:
    with closing(sqlite3.connect(str(DB_PATH))) as conn:
        conn.row_factory = sqlite3.Row
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute(f"PRAGMA busy_timeout={settings.db_busy_timeout_ms}")
        conn.execute("PRAGMA busy_timeout=5000")
        _ensure_schema(conn)
        yield conn
```
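The connection setup in the hunk above follows a common SQLite recipe: WAL journaling so readers don't block the writer, plus a busy timeout to ride out short write contention. A standalone sketch of the same recipe — the 5000 ms default matches the hardcoded line above, while the helper name and signature are illustrative:

```python
import sqlite3
from contextlib import closing, contextmanager


@contextmanager
def get_connection(db_path: str, busy_timeout_ms: int = 5000):
    # closing() guarantees the handle is released even if the body raises
    with closing(sqlite3.connect(db_path)) as conn:
        conn.row_factory = sqlite3.Row           # rows addressable by column name
        conn.execute("PRAGMA journal_mode=WAL")  # concurrent readers, single writer
        conn.execute(f"PRAGMA busy_timeout={busy_timeout_ms}")
        yield conn


with get_connection(":memory:") as conn:
    conn.execute("CREATE TABLE t (x)")
    conn.execute("INSERT INTO t VALUES (1)")
    row = conn.execute("SELECT x FROM t").fetchone()
    print(row["x"])  # → 1
```

Note that `busy_timeout` only helps against *other* connections; within one process a single shared connection (or a per-thread pool, as tested later in this diff) is still the safer pattern.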
```diff
@@ -20,7 +20,6 @@ from dataclasses import dataclass, field
from datetime import UTC, datetime, timedelta
from pathlib import Path

from config import settings
from timmy.memory.embeddings import (
    EMBEDDING_DIM,
    EMBEDDING_MODEL,  # noqa: F401 — re-exported for backward compatibility
@@ -112,7 +111,7 @@ def get_connection() -> Generator[sqlite3.Connection, None, None]:
    with closing(sqlite3.connect(str(DB_PATH))) as conn:
        conn.row_factory = sqlite3.Row
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute(f"PRAGMA busy_timeout={settings.db_busy_timeout_ms}")
        conn.execute("PRAGMA busy_timeout=5000")
        _ensure_schema(conn)
        yield conn

@@ -950,7 +949,7 @@ class SemanticMemory:
        with closing(sqlite3.connect(str(self.db_path))) as conn:
            conn.row_factory = sqlite3.Row
            conn.execute("PRAGMA journal_mode=WAL")
            conn.execute(f"PRAGMA busy_timeout={settings.db_busy_timeout_ms}")
            conn.execute("PRAGMA busy_timeout=5000")
            # Ensure schema exists
            conn.execute("""
                CREATE TABLE IF NOT EXISTS memories (
```
```diff
@@ -24,9 +24,6 @@ from config import settings

logger = logging.getLogger(__name__)

# Max characters of user query included in Lightning invoice memo
_INVOICE_MEMO_MAX_LEN = 50

# Lazy imports to handle test mocking
_ImportError = None
try:
@@ -450,6 +447,7 @@ def consult_grok(query: str) -> str:
        )
    except (ImportError, AttributeError) as exc:
        logger.warning("Tool execution failed (consult_grok logging): %s", exc)
        pass

    # Generate Lightning invoice for monetization (unless free mode)
    invoice_info = ""
@@ -458,11 +456,12 @@ def consult_grok(query: str) -> str:
            from lightning.factory import get_backend as get_ln_backend

            ln = get_ln_backend()
            sats = min(settings.grok_max_sats_per_query, settings.grok_sats_hard_cap)
            inv = ln.create_invoice(sats, f"Grok query: {query[:_INVOICE_MEMO_MAX_LEN]}")
            sats = min(settings.grok_max_sats_per_query, 100)
            inv = ln.create_invoice(sats, f"Grok query: {query[:50]}")
            invoice_info = f"\n[Lightning invoice: {sats} sats — {inv.payment_request[:40]}...]"
        except (ImportError, OSError, ValueError) as exc:
            logger.warning("Tool execution failed (Lightning invoice): %s", exc)
            pass

    result = backend.run(query)

@@ -473,69 +472,6 @@ def consult_grok(query: str) -> str:
    return response


def web_fetch(url: str, max_tokens: int = 4000) -> str:
    """Fetch a web page and return its main text content.

    Downloads the URL, extracts readable text using trafilatura, and
    truncates to a token budget. Use this to read full articles, docs,
    or blog posts that web_search only returns snippets for.

    Args:
        url: The URL to fetch (must start with http:// or https://).
        max_tokens: Maximum approximate token budget (default 4000).
            Text is truncated to max_tokens * 4 characters.

    Returns:
        Extracted text content, or an error message on failure.
    """
    if not url or not url.startswith(("http://", "https://")):
        return f"Error: invalid URL — must start with http:// or https://: {url!r}"

    try:
        import requests as _requests
    except ImportError:
        return "Error: 'requests' package is not installed. Install with: pip install requests"

    try:
        import trafilatura
    except ImportError:
        return (
            "Error: 'trafilatura' package is not installed. Install with: pip install trafilatura"
        )

    try:
        resp = _requests.get(
            url,
            timeout=15,
            headers={"User-Agent": "TimmyResearchBot/1.0"},
        )
        resp.raise_for_status()
    except _requests.exceptions.Timeout:
        return f"Error: request timed out after 15 seconds for {url}"
    except _requests.exceptions.HTTPError as exc:
        return f"Error: HTTP {exc.response.status_code} for {url}"
    except _requests.exceptions.RequestException as exc:
        return f"Error: failed to fetch {url} — {exc}"

    text = trafilatura.extract(resp.text, include_tables=True, include_links=True)
    if not text:
        return f"Error: could not extract readable content from {url}"

    char_budget = max_tokens * 4
    if len(text) > char_budget:
        text = text[:char_budget] + f"\n\n[…truncated to ~{max_tokens} tokens]"

    return text


def _register_web_fetch_tool(toolkit: Toolkit) -> None:
    """Register the web_fetch tool for full-page content extraction."""
    try:
        toolkit.register(web_fetch, name="web_fetch")
    except Exception as exc:
        logger.warning("Tool execution failed (web_fetch registration): %s", exc)


def _register_core_tools(toolkit: Toolkit, base_path: Path) -> None:
    """Register core execution and file tools."""
    # Python execution
@@ -735,7 +671,6 @@ def create_full_toolkit(base_dir: str | Path | None = None):
    base_path = Path(base_dir) if base_dir else Path(settings.repo_root)

    _register_core_tools(toolkit, base_path)
    _register_web_fetch_tool(toolkit)
    _register_grok_tool(toolkit)
    _register_memory_tools(toolkit)
    _register_agentic_loop_tool(toolkit)
@@ -893,11 +828,6 @@ def _analysis_tool_catalog() -> dict:
            "description": "Evaluate mathematical expressions with exact results",
            "available_in": ["orchestrator"],
        },
        "web_fetch": {
            "name": "Web Fetch",
            "description": "Fetch a web page and extract clean readable text (trafilatura)",
            "available_in": ["orchestrator"],
        },
    }


@@ -1010,7 +940,7 @@ def _merge_catalog(
                "available_in": available_in,
            }
        except ImportError:
            logger.debug("Optional catalog %s.%s not available", module_path, attr_name)
            pass


def get_all_available_tools() -> dict[str, dict]:
```
@@ -1,680 +0,0 @@
```python
"""Tests for agent scorecard functionality."""

from datetime import UTC, datetime, timedelta
from unittest.mock import MagicMock, patch

from dashboard.services.scorecard_service import (
    AgentMetrics,
    PeriodType,
    ScorecardSummary,
    _aggregate_metrics,
    _detect_patterns,
    _extract_actor_from_event,
    _generate_narrative_bullets,
    _get_period_bounds,
    _is_tracked_agent,
    _query_token_transactions,
    generate_all_scorecards,
    generate_scorecard,
    get_tracked_agents,
)
from infrastructure.events.bus import Event


class TestPeriodBounds:
    """Test period boundary calculations."""

    def test_daily_period_bounds(self):
        """Test daily period returns correct 24-hour window."""
        reference = datetime(2026, 3, 21, 12, 30, 45, tzinfo=UTC)
        start, end = _get_period_bounds(PeriodType.daily, reference)

        assert end == datetime(2026, 3, 21, 0, 0, 0, tzinfo=UTC)
        assert start == datetime(2026, 3, 20, 0, 0, 0, tzinfo=UTC)
        assert (end - start) == timedelta(days=1)

    def test_weekly_period_bounds(self):
        """Test weekly period returns correct 7-day window."""
        reference = datetime(2026, 3, 21, 12, 30, 45, tzinfo=UTC)
        start, end = _get_period_bounds(PeriodType.weekly, reference)

        assert end == datetime(2026, 3, 21, 0, 0, 0, tzinfo=UTC)
        assert start == datetime(2026, 3, 14, 0, 0, 0, tzinfo=UTC)
        assert (end - start) == timedelta(days=7)

    def test_default_reference_date(self):
        """Test default reference date uses current time."""
        start, end = _get_period_bounds(PeriodType.daily)
        now = datetime.now(UTC)

        # End should be start of current day (midnight)
        expected_end = now.replace(hour=0, minute=0, second=0, microsecond=0)
        assert end == expected_end
        # Start should be 24 hours before end
        assert (end - start) == timedelta(days=1)


class TestTrackedAgents:
    """Test agent tracking functions."""

    def test_get_tracked_agents(self):
        """Test get_tracked_agents returns sorted list."""
        agents = get_tracked_agents()
        assert isinstance(agents, list)
        assert "kimi" in agents
        assert "claude" in agents
        assert "gemini" in agents
        assert "hermes" in agents
        assert "manus" in agents
        assert agents == sorted(agents)

    def test_is_tracked_agent_true(self):
        """Test _is_tracked_agent returns True for tracked agents."""
        assert _is_tracked_agent("kimi") is True
        assert _is_tracked_agent("KIMI") is True  # case insensitive
        assert _is_tracked_agent("claude") is True
        assert _is_tracked_agent("hermes") is True

    def test_is_tracked_agent_false(self):
        """Test _is_tracked_agent returns False for untracked agents."""
        assert _is_tracked_agent("unknown") is False
        assert _is_tracked_agent("rockachopa") is False
        assert _is_tracked_agent("") is False


class TestExtractActor:
    """Test actor extraction from events."""

    def test_extract_from_actor_field(self):
        """Test extraction from data.actor field."""
        event = Event(type="test", source="system", data={"actor": "kimi"})
        assert _extract_actor_from_event(event) == "kimi"

    def test_extract_from_agent_id_field(self):
        """Test extraction from data.agent_id field."""
        event = Event(type="test", source="system", data={"agent_id": "claude"})
        assert _extract_actor_from_event(event) == "claude"

    def test_extract_from_source_fallback(self):
        """Test fallback to event.source."""
        event = Event(type="test", source="gemini", data={})
        assert _extract_actor_from_event(event) == "gemini"

    def test_actor_priority_over_agent_id(self):
        """Test actor field takes priority over agent_id."""
        event = Event(type="test", source="system", data={"actor": "kimi", "agent_id": "claude"})
        assert _extract_actor_from_event(event) == "kimi"


class TestAggregateMetrics:
    """Test metrics aggregation from events."""

    def test_empty_events(self):
        """Test aggregation with no events returns empty dict."""
        result = _aggregate_metrics([])
        assert result == {}

    def test_push_event_aggregation(self):
        """Test push events aggregate commits correctly."""
        events = [
            Event(type="gitea.push", source="gitea", data={"actor": "kimi", "num_commits": 3}),
            Event(type="gitea.push", source="gitea", data={"actor": "kimi", "num_commits": 2}),
        ]
        result = _aggregate_metrics(events)

        assert "kimi" in result
        assert result["kimi"].commits == 5

    def test_issue_opened_aggregation(self):
        """Test issue opened events aggregate correctly."""
        events = [
            Event(
                type="gitea.issue.opened",
                source="gitea",
                data={"actor": "claude", "issue_number": 100},
            ),
            Event(
                type="gitea.issue.opened",
                source="gitea",
                data={"actor": "claude", "issue_number": 101},
            ),
        ]
        result = _aggregate_metrics(events)

        assert "claude" in result
        assert len(result["claude"].issues_touched) == 2
        assert 100 in result["claude"].issues_touched
        assert 101 in result["claude"].issues_touched

    def test_comment_aggregation(self):
        """Test comment events aggregate correctly."""
        events = [
            Event(
                type="gitea.issue.comment",
                source="gitea",
                data={"actor": "gemini", "issue_number": 100},
            ),
            Event(
                type="gitea.issue.comment",
                source="gitea",
                data={"actor": "gemini", "issue_number": 101},
            ),
        ]
        result = _aggregate_metrics(events)

        assert "gemini" in result
        assert result["gemini"].comments == 2
        assert len(result["gemini"].issues_touched) == 2  # Comments touch issues too

    def test_pr_events_aggregation(self):
        """Test PR open and merge events aggregate correctly."""
        events = [
            Event(
                type="gitea.pull_request",
                source="gitea",
                data={"actor": "kimi", "pr_number": 50, "action": "opened"},
            ),
            Event(
                type="gitea.pull_request",
                source="gitea",
                data={"actor": "kimi", "pr_number": 50, "action": "closed", "merged": True},
            ),
            Event(
                type="gitea.pull_request",
                source="gitea",
                data={"actor": "kimi", "pr_number": 51, "action": "opened"},
            ),
        ]
        result = _aggregate_metrics(events)

        assert "kimi" in result
        assert len(result["kimi"].prs_opened) == 2
        assert len(result["kimi"].prs_merged) == 1
        assert 50 in result["kimi"].prs_merged

    def test_untracked_agent_filtered(self):
        """Test events from untracked agents are filtered out."""
        events = [
            Event(
                type="gitea.push", source="gitea", data={"actor": "rockachopa", "num_commits": 5}
            ),
        ]
        result = _aggregate_metrics(events)

        assert "rockachopa" not in result

    def test_task_completion_aggregation(self):
        """Test task completion events aggregate test files."""
        events = [
            Event(
                type="agent.task.completed",
                source="gitea",
                data={
                    "agent_id": "kimi",
                    "tests_affected": ["test_foo.py", "test_bar.py"],
                    "token_reward": 10,
                },
            ),
        ]
        result = _aggregate_metrics(events)

        assert "kimi" in result
        assert len(result["kimi"].tests_affected) == 2
        assert "test_foo.py" in result["kimi"].tests_affected
        assert result["kimi"].tokens_earned == 10


class TestAgentMetrics:
    """Test AgentMetrics class."""

    def test_merge_rate_zero_prs(self):
        """Test merge rate is 0 when no PRs opened."""
        metrics = AgentMetrics(agent_id="kimi")
        assert metrics.pr_merge_rate == 0.0

    def test_merge_rate_perfect(self):
        """Test 100% merge rate calculation."""
        metrics = AgentMetrics(agent_id="kimi", prs_opened={1, 2, 3}, prs_merged={1, 2, 3})
        assert metrics.pr_merge_rate == 1.0

    def test_merge_rate_partial(self):
        """Test partial merge rate calculation."""
        metrics = AgentMetrics(agent_id="kimi", prs_opened={1, 2, 3, 4}, prs_merged={1, 2})
        assert metrics.pr_merge_rate == 0.5


class TestDetectPatterns:
    """Test pattern detection logic."""

    def test_high_merge_rate_pattern(self):
        """Test detection of high merge rate pattern."""
        metrics = AgentMetrics(
            agent_id="kimi",
            prs_opened={1, 2, 3, 4, 5},
            prs_merged={1, 2, 3, 4},  # 80% merge rate
        )
        patterns = _detect_patterns(metrics)

        assert any("High merge rate" in p for p in patterns)

    def test_low_merge_rate_pattern(self):
        """Test detection of low merge rate pattern."""
        metrics = AgentMetrics(
            agent_id="kimi",
            prs_opened={1, 2, 3, 4, 5},
            prs_merged={1},  # 20% merge rate
        )
        patterns = _detect_patterns(metrics)

        assert any("low merge rate" in p for p in patterns)

    def test_high_commits_no_prs_pattern(self):
        """Test detection of direct-to-main commits pattern."""
        metrics = AgentMetrics(
            agent_id="kimi",
            commits=15,
            prs_opened=set(),
        )
        patterns = _detect_patterns(metrics)

        assert any("High commit volume without PRs" in p for p in patterns)

    def test_silent_worker_pattern(self):
        """Test detection of silent worker pattern."""
        metrics = AgentMetrics(
            agent_id="kimi",
            issues_touched={1, 2, 3, 4, 5, 6},
            comments=0,
        )
        patterns = _detect_patterns(metrics)

        assert any("silent worker" in p for p in patterns)

    def test_communicative_pattern(self):
        """Test detection of highly communicative pattern."""
        metrics = AgentMetrics(
            agent_id="kimi",
            issues_touched={1, 2},  # 2 issues
            comments=10,  # 5x comments per issue
        )
        patterns = _detect_patterns(metrics)

        assert any("Highly communicative" in p for p in patterns)

    def test_token_accumulation_pattern(self):
        """Test detection of token accumulation pattern."""
        metrics = AgentMetrics(
            agent_id="kimi",
            tokens_earned=150,
            tokens_spent=10,
        )
        patterns = _detect_patterns(metrics)

        assert any("Strong token accumulation" in p for p in patterns)

    def test_token_spend_pattern(self):
        """Test detection of high token spend pattern."""
        metrics = AgentMetrics(
            agent_id="kimi",
            tokens_earned=10,
            tokens_spent=100,
        )
        patterns = _detect_patterns(metrics)

        assert any("High token spend" in p for p in patterns)


class TestGenerateNarrative:
    """Test narrative bullet generation."""

    def test_empty_metrics_narrative(self):
        """Test narrative for empty metrics mentions no activity."""
        metrics = AgentMetrics(agent_id="kimi")
        bullets = _generate_narrative_bullets(metrics, PeriodType.daily)

        assert len(bullets) == 1
        assert "No recorded activity" in bullets[0]

    def test_activity_summary_narrative(self):
        """Test narrative includes activity summary."""
        metrics = AgentMetrics(
            agent_id="kimi",
            commits=5,
            prs_opened={1, 2},
            prs_merged={1},
        )
        bullets = _generate_narrative_bullets(metrics, PeriodType.daily)

        activity_bullet = next((b for b in bullets if "Active across" in b), None)
        assert activity_bullet is not None
        assert "5 commits" in activity_bullet
        assert "2 PRs opened" in activity_bullet
        assert "1 PR merged" in activity_bullet

    def test_tests_affected_narrative(self):
        """Test narrative includes tests affected."""
        metrics = AgentMetrics(
            agent_id="kimi",
            tests_affected={"test_a.py", "test_b.py"},
        )
        bullets = _generate_narrative_bullets(metrics, PeriodType.daily)

        assert any("2 test files" in b for b in bullets)

    def test_tokens_earned_narrative(self):
        """Test narrative includes token earnings."""
        metrics = AgentMetrics(
            agent_id="kimi",
            tokens_earned=100,
            tokens_spent=20,
        )
        bullets = _generate_narrative_bullets(metrics, PeriodType.daily)

        assert any("Net earned 80 tokens" in b for b in bullets)

    def test_tokens_spent_narrative(self):
        """Test narrative includes token spending."""
        metrics = AgentMetrics(
            agent_id="kimi",
            tokens_earned=20,
            tokens_spent=100,
        )
        bullets = _generate_narrative_bullets(metrics, PeriodType.daily)

        assert any("Net spent 80 tokens" in b for b in bullets)

    def test_balanced_tokens_narrative(self):
        """Test narrative for balanced token flow."""
        metrics = AgentMetrics(
            agent_id="kimi",
            tokens_earned=100,
            tokens_spent=100,
        )
        bullets = _generate_narrative_bullets(metrics, PeriodType.daily)

        assert any("Balanced token flow" in b for b in bullets)


class TestScorecardSummary:
    """Test ScorecardSummary dataclass."""

    def test_to_dict_structure(self):
        """Test to_dict returns expected structure."""
        metrics = AgentMetrics(
            agent_id="kimi",
            issues_touched={1, 2},
            prs_opened={10, 11},
            prs_merged={10},
            tokens_earned=100,
            tokens_spent=20,
        )
        summary = ScorecardSummary(
            agent_id="kimi",
            period_type=PeriodType.daily,
            period_start=datetime.now(UTC),
            period_end=datetime.now(UTC),
            metrics=metrics,
            narrative_bullets=["Test bullet"],
            patterns=["Test pattern"],
        )
        data = summary.to_dict()

        assert data["agent_id"] == "kimi"
        assert data["period_type"] == "daily"
        assert "metrics" in data
        assert data["metrics"]["issues_touched"] == 2
        assert data["metrics"]["prs_opened"] == 2
        assert data["metrics"]["prs_merged"] == 1
        assert data["metrics"]["pr_merge_rate"] == 0.5
        assert data["metrics"]["tokens_earned"] == 100
        assert data["metrics"]["token_net"] == 80
        assert data["narrative_bullets"] == ["Test bullet"]
        assert data["patterns"] == ["Test pattern"]


class TestQueryTokenTransactions:
    """Test token transaction querying."""

    def test_empty_ledger(self):
        """Test empty ledger returns zero values."""
        with patch("lightning.ledger.get_transactions", return_value=[]):
            earned, spent = _query_token_transactions("kimi", datetime.now(UTC), datetime.now(UTC))
        assert earned == 0
        assert spent == 0

    def test_ledger_with_transactions(self):
        """Test ledger aggregation of transactions."""
        now = datetime.now(UTC)
        mock_tx = [
            MagicMock(
                agent_id="kimi",
                tx_type=MagicMock(value="incoming"),
                amount_sats=100,
                created_at=now.isoformat(),
            ),
            MagicMock(
                agent_id="kimi",
                tx_type=MagicMock(value="outgoing"),
                amount_sats=30,
                created_at=now.isoformat(),
            ),
        ]
        with patch("lightning.ledger.get_transactions", return_value=mock_tx):
            earned, spent = _query_token_transactions(
                "kimi", now - timedelta(hours=1), now + timedelta(hours=1)
            )
        assert earned == 100
        assert spent == 30

    def test_ledger_filters_by_agent(self):
        """Test ledger filters transactions by agent_id."""
        now = datetime.now(UTC)
        mock_tx = [
            MagicMock(
                agent_id="claude",
                tx_type=MagicMock(value="incoming"),
                amount_sats=100,
                created_at=now.isoformat(),
            ),
        ]
        with patch("lightning.ledger.get_transactions", return_value=mock_tx):
            earned, spent = _query_token_transactions(
                "kimi", now - timedelta(hours=1), now + timedelta(hours=1)
            )
        assert earned == 0  # Transaction was for claude, not kimi

    def test_ledger_filters_by_time(self):
        """Test ledger filters transactions by time range."""
        now = datetime.now(UTC)
        old_time = now - timedelta(days=2)
        mock_tx = [
            MagicMock(
                agent_id="kimi",
                tx_type=MagicMock(value="incoming"),
                amount_sats=100,
                created_at=old_time.isoformat(),
            ),
        ]
        with patch("lightning.ledger.get_transactions", return_value=mock_tx):
            # Query for today only
            earned, spent = _query_token_transactions(
                "kimi", now - timedelta(hours=1), now + timedelta(hours=1)
            )
        assert earned == 0  # Transaction was 2 days ago


class TestGenerateScorecard:
    """Test scorecard generation."""

    def test_generate_scorecard_no_activity(self):
        """Test scorecard generation for agent with no activity."""
        with patch(
            "dashboard.services.scorecard_service._collect_events_for_period", return_value=[]
        ):
            with patch(
                "dashboard.services.scorecard_service._query_token_transactions",
                return_value=(0, 0),
            ):
                scorecard = generate_scorecard("kimi", PeriodType.daily)

        assert scorecard is not None
        assert scorecard.agent_id == "kimi"
        assert scorecard.period_type == PeriodType.daily
        assert len(scorecard.narrative_bullets) == 1
        assert "No recorded activity" in scorecard.narrative_bullets[0]

    def test_generate_scorecard_with_activity(self):
        """Test scorecard generation includes activity."""
        events = [
            Event(type="gitea.push", source="gitea", data={"actor": "kimi", "num_commits": 5}),
        ]
        with patch(
            "dashboard.services.scorecard_service._collect_events_for_period", return_value=events
        ):
            with patch(
                "dashboard.services.scorecard_service._query_token_transactions",
                return_value=(100, 20),
            ):
                scorecard = generate_scorecard("kimi", PeriodType.daily)

        assert scorecard is not None
        assert scorecard.metrics.commits == 5
        assert scorecard.metrics.tokens_earned == 100
        assert scorecard.metrics.tokens_spent == 20


class TestGenerateAllScorecards:
    """Test generating scorecards for all agents."""

    def test_generates_for_all_tracked_agents(self):
        """Test all tracked agents get scorecards even with no activity."""
        with patch(
            "dashboard.services.scorecard_service._collect_events_for_period", return_value=[]
        ):
            with patch(
                "dashboard.services.scorecard_service._query_token_transactions",
                return_value=(0, 0),
            ):
                scorecards = generate_all_scorecards(PeriodType.daily)

        agent_ids = {s.agent_id for s in scorecards}
        expected = {"kimi", "claude", "gemini", "hermes", "manus"}
        assert expected.issubset(agent_ids)

    def test_scorecards_sorted(self):
        """Test scorecards are sorted by agent_id."""
        with patch(
            "dashboard.services.scorecard_service._collect_events_for_period", return_value=[]
        ):
            with patch(
                "dashboard.services.scorecard_service._query_token_transactions",
                return_value=(0, 0),
            ):
                scorecards = generate_all_scorecards(PeriodType.daily)

        agent_ids = [s.agent_id for s in scorecards]
        assert agent_ids == sorted(agent_ids)


class TestScorecardRoutes:
    """Test scorecard API routes."""

    def test_list_agents_endpoint(self, client):
        """Test GET /scorecards/api/agents returns tracked agents."""
        response = client.get("/scorecards/api/agents")
        assert response.status_code == 200
        data = response.json()
        assert "agents" in data
        assert "kimi" in data["agents"]
        assert "claude" in data["agents"]

    def test_get_scorecard_endpoint(self, client):
        """Test GET /scorecards/api/{agent_id} returns scorecard."""
        with patch("dashboard.routes.scorecards.generate_scorecard") as mock_generate:
            mock_generate.return_value = ScorecardSummary(
                agent_id="kimi",
                period_type=PeriodType.daily,
                period_start=datetime.now(UTC),
                period_end=datetime.now(UTC),
                metrics=AgentMetrics(agent_id="kimi"),
                narrative_bullets=["Test bullet"],
                patterns=[],
            )
            response = client.get("/scorecards/api/kimi?period=daily")

        assert response.status_code == 200
        data = response.json()
        assert data["agent_id"] == "kimi"
        assert data["period_type"] == "daily"

    def test_get_scorecard_invalid_period(self, client):
        """Test GET with invalid period returns 400."""
        response = client.get("/scorecards/api/kimi?period=invalid")
        assert response.status_code == 400
        assert "error" in response.json()

    def test_get_all_scorecards_endpoint(self, client):
        """Test GET /scorecards/api returns all scorecards."""
        with patch("dashboard.routes.scorecards.generate_all_scorecards") as mock_generate:
            mock_generate.return_value = [
                ScorecardSummary(
                    agent_id="kimi",
                    period_type=PeriodType.daily,
                    period_start=datetime.now(UTC),
                    period_end=datetime.now(UTC),
                    metrics=AgentMetrics(agent_id="kimi"),
                    narrative_bullets=[],
                    patterns=[],
                ),
            ]
            response = client.get("/scorecards/api?period=daily")

        assert response.status_code == 200
        data = response.json()
        assert data["period"] == "daily"
        assert "scorecards" in data
        assert len(data["scorecards"]) == 1

    def test_scorecards_page_renders(self, client):
        """Test GET /scorecards returns HTML page."""
        response = client.get("/scorecards")
        assert response.status_code == 200
        assert "text/html" in response.headers.get("content-type", "")
        assert "AGENT SCORECARDS" in response.text

    def test_scorecard_panel_renders(self, client):
        """Test GET /scorecards/panel/{agent_id} returns HTML."""
        with patch("dashboard.routes.scorecards.generate_scorecard") as mock_generate:
            mock_generate.return_value = ScorecardSummary(
                agent_id="kimi",
                period_type=PeriodType.daily,
                period_start=datetime.now(UTC),
                period_end=datetime.now(UTC),
                metrics=AgentMetrics(agent_id="kimi", commits=5),
                narrative_bullets=["Active across 5 commits this day."],
                patterns=["High activity"],
            )
            response = client.get("/scorecards/panel/kimi?period=daily")

        assert response.status_code == 200
        assert "text/html" in response.headers.get("content-type", "")
        assert "Kimi" in response.text

    def test_all_panels_renders(self, client):
        """Test GET /scorecards/all/panels returns HTML with all panels."""
        with patch("dashboard.routes.scorecards.generate_all_scorecards") as mock_generate:
            mock_generate.return_value = [
                ScorecardSummary(
                    agent_id="kimi",
                    period_type=PeriodType.daily,
                    period_start=datetime.now(UTC),
                    period_end=datetime.now(UTC),
                    metrics=AgentMetrics(agent_id="kimi"),
                    narrative_bullets=[],
                    patterns=[],
                ),
            ]
            response = client.get("/scorecards/all/panels?period=daily")

        assert response.status_code == 200
        assert "text/html" in response.headers.get("content-type", "")
```
@@ -242,145 +242,6 @@ class TestCloseAll:
            conn.execute("SELECT 1")


class TestConnectionLeaks:
    """Test that connections do not leak."""

    def test_get_connection_after_close_returns_fresh_connection(self, tmp_path):
        """After close, get_connection() returns a new working connection."""
        pool = ConnectionPool(tmp_path / "test.db")
        conn1 = pool.get_connection()
        pool.close_connection()

        conn2 = pool.get_connection()
        assert conn2 is not conn1
        # New connection must be usable
        cursor = conn2.execute("SELECT 1")
        assert cursor.fetchone()[0] == 1
        pool.close_connection()

    def test_context_manager_does_not_leak_connection(self, tmp_path):
        """After context manager exit, thread-local conn is cleared."""
        pool = ConnectionPool(tmp_path / "test.db")
        with pool.connection():
            pass
        # Thread-local should be cleaned up
        assert pool._local.conn is None

    def test_context_manager_exception_does_not_leak_connection(self, tmp_path):
        """Connection is cleaned up even when an exception occurs."""
        pool = ConnectionPool(tmp_path / "test.db")
        try:
            with pool.connection():
                raise RuntimeError("boom")
        except RuntimeError:
            pass
        assert pool._local.conn is None

    def test_threads_do_not_leak_into_each_other(self, tmp_path):
        """A connection opened in one thread is invisible to another."""
        pool = ConnectionPool(tmp_path / "test.db")
        # Open a connection on main thread
        pool.get_connection()

        visible_from_other_thread = []

        def check():
            has_conn = hasattr(pool._local, "conn") and pool._local.conn is not None
            visible_from_other_thread.append(has_conn)

        t = threading.Thread(target=check)
        t.start()
        t.join()

        assert visible_from_other_thread == [False]
        pool.close_connection()

    def test_repeated_open_close_cycles(self, tmp_path):
        """Repeated open/close cycles do not accumulate leaked connections."""
        pool = ConnectionPool(tmp_path / "test.db")
        for _ in range(50):
            with pool.connection() as conn:
                conn.execute("SELECT 1")
            # After each cycle, connection should be cleaned up
            assert pool._local.conn is None


class TestPragmaApplication:
    """Test that SQLite pragmas can be applied and persist on pooled connections.

    The codebase uses WAL journal mode and busy_timeout pragmas on connections
    obtained from the pool. These tests verify that pattern works correctly.
    """

    def test_wal_journal_mode_persists(self, tmp_path):
        """WAL journal mode set on a pooled connection persists for its lifetime."""
        pool = ConnectionPool(tmp_path / "test.db")
        conn = pool.get_connection()
        conn.execute("PRAGMA journal_mode=WAL")
        mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
        assert mode == "wal"

        # Same connection should retain the pragma
        same_conn = pool.get_connection()
        mode2 = same_conn.execute("PRAGMA journal_mode").fetchone()[0]
        assert mode2 == "wal"
        pool.close_connection()

    def test_busy_timeout_persists(self, tmp_path):
        """busy_timeout pragma set on a pooled connection persists."""
        pool = ConnectionPool(tmp_path / "test.db")
        conn = pool.get_connection()
        conn.execute("PRAGMA busy_timeout=5000")
        timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
        assert timeout == 5000
        pool.close_connection()

    def test_pragmas_apply_per_connection(self, tmp_path):
        """Pragmas set on one thread's connection are independent of another's."""
        pool = ConnectionPool(tmp_path / "test.db")
        conn_main = pool.get_connection()
        conn_main.execute("PRAGMA cache_size=9999")

        other_cache = []

        def check_pragma():
            conn = pool.get_connection()
            # Don't set cache_size — should get the default, not 9999
            val = conn.execute("PRAGMA cache_size").fetchone()[0]
            other_cache.append(val)
            pool.close_connection()

        t = threading.Thread(target=check_pragma)
        t.start()
        t.join()

        # Other thread's connection should NOT have our custom cache_size
        assert other_cache[0] != 9999
        pool.close_connection()

    def test_session_pragma_resets_on_new_connection(self, tmp_path):
        """Session-level pragmas (cache_size) reset on a new connection."""
        pool = ConnectionPool(tmp_path / "test.db")
        conn1 = pool.get_connection()
        conn1.execute("PRAGMA cache_size=9999")
        assert conn1.execute("PRAGMA cache_size").fetchone()[0] == 9999
        pool.close_connection()

        conn2 = pool.get_connection()
        cache = conn2.execute("PRAGMA cache_size").fetchone()[0]
        # New connection gets default cache_size, not the previous value
        assert cache != 9999
        pool.close_connection()

    def test_wal_mode_via_context_manager(self, tmp_path):
        """WAL mode can be set within a context manager block."""
        pool = ConnectionPool(tmp_path / "test.db")
        with pool.connection() as conn:
            conn.execute("PRAGMA journal_mode=WAL")
            mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
            assert mode == "wal"


class TestIntegration:
    """Integration tests for real-world usage patterns."""
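The tests above pin down the pool's observable contract: `get_connection()` returns a thread-local `sqlite3` connection, `close_connection()` closes it and clears the slot, and `connection()` is a context manager that cleans up even when the body raises. A minimal sketch that satisfies this contract (hypothetical; the real `ConnectionPool` lives elsewhere in the repo and is not shown in this diff):

```python
import sqlite3
import threading
from contextlib import contextmanager
from pathlib import Path


class ConnectionPool:
    """One sqlite3 connection per thread, stored in threading.local()."""

    def __init__(self, db_path):
        self._db_path = Path(db_path)
        self._local = threading.local()
        self._local.conn = None

    def get_connection(self):
        # Reuse this thread's connection if present, else open a fresh one.
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = sqlite3.connect(self._db_path)
            self._local.conn = conn
        return conn

    def close_connection(self):
        conn = getattr(self._local, "conn", None)
        if conn is not None:
            conn.close()
        self._local.conn = None

    @contextmanager
    def connection(self):
        # Clear the thread-local slot even if the body raises.
        try:
            yield self.get_connection()
        finally:
            self.close_connection()
```

Because pragmas such as `journal_mode` and `busy_timeout` are set per connection, a thread-local slot gives exactly the isolation the pragma tests assert.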
@@ -1,129 +0,0 @@
"""Tests for the WorldInterface contract and type system."""

import pytest

from infrastructure.world.interface import WorldInterface
from infrastructure.world.types import (
    ActionResult,
    ActionStatus,
    CommandInput,
    PerceptionOutput,
)

# ---------------------------------------------------------------------------
# Type construction
# ---------------------------------------------------------------------------


class TestPerceptionOutput:
    def test_defaults(self):
        p = PerceptionOutput()
        assert p.location == ""
        assert p.entities == []
        assert p.events == []
        assert p.raw == {}
        assert p.timestamp is not None

    def test_custom_values(self):
        p = PerceptionOutput(
            location="Balmora",
            entities=["Guard", "Merchant"],
            events=["door_opened"],
        )
        assert p.location == "Balmora"
        assert len(p.entities) == 2
        assert "door_opened" in p.events


class TestCommandInput:
    def test_minimal(self):
        c = CommandInput(action="move")
        assert c.action == "move"
        assert c.target is None
        assert c.parameters == {}

    def test_with_target_and_params(self):
        c = CommandInput(action="attack", target="Rat", parameters={"weapon": "sword"})
        assert c.target == "Rat"
        assert c.parameters["weapon"] == "sword"


class TestActionResult:
    def test_defaults(self):
        r = ActionResult()
        assert r.status == ActionStatus.SUCCESS
        assert r.message == ""

    def test_failure(self):
        r = ActionResult(status=ActionStatus.FAILURE, message="blocked")
        assert r.status == ActionStatus.FAILURE


class TestActionStatus:
    def test_values(self):
        assert ActionStatus.SUCCESS.value == "success"
        assert ActionStatus.FAILURE.value == "failure"
        assert ActionStatus.PENDING.value == "pending"
        assert ActionStatus.NOOP.value == "noop"


# ---------------------------------------------------------------------------
# Abstract contract
# ---------------------------------------------------------------------------


class TestWorldInterfaceContract:
    """Verify the ABC cannot be instantiated directly."""

    def test_cannot_instantiate(self):
        with pytest.raises(TypeError):
            WorldInterface()

    def test_subclass_must_implement_observe(self):
        class Incomplete(WorldInterface):
            def act(self, command):
                pass

            def speak(self, message, target=None):
                pass

        with pytest.raises(TypeError):
            Incomplete()

    def test_subclass_must_implement_act(self):
        class Incomplete(WorldInterface):
            def observe(self):
                return PerceptionOutput()

            def speak(self, message, target=None):
                pass

        with pytest.raises(TypeError):
            Incomplete()

    def test_subclass_must_implement_speak(self):
        class Incomplete(WorldInterface):
            def observe(self):
                return PerceptionOutput()

            def act(self, command):
                return ActionResult()

        with pytest.raises(TypeError):
            Incomplete()

    def test_complete_subclass_instantiates(self):
        class Complete(WorldInterface):
            def observe(self):
                return PerceptionOutput()

            def act(self, command):
                return ActionResult()

            def speak(self, message, target=None):
                pass

        adapter = Complete()
        assert adapter.is_connected is True  # default
        assert isinstance(adapter.observe(), PerceptionOutput)
        assert isinstance(adapter.act(CommandInput(action="test")), ActionResult)
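The contract tests above imply an ABC with three abstract methods, `observe`, `act`, and `speak`, plus an `is_connected` property that defaults to `True`. A sketch consistent with those assertions (names taken from the tests; the real interface lives in `infrastructure/world/interface.py` and may differ in signatures and return types):

```python
from abc import ABC, abstractmethod


class WorldInterface(ABC):
    """Abstract world adapter: observe, act, speak."""

    @property
    def is_connected(self) -> bool:
        # Concrete adapters may override; the default matches the contract test.
        return True

    @abstractmethod
    def observe(self):
        """Return a perception snapshot of the world."""

    @abstractmethod
    def act(self, command):
        """Execute a command and return an action result."""

    @abstractmethod
    def speak(self, message, target=None):
        """Say something in-world, optionally to a target."""
```

With `abstractmethod`, any subclass missing one of the three methods stays abstract, which is exactly why each `Incomplete` class above raises `TypeError` on instantiation.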
@@ -1,80 +0,0 @@
"""Tests for the MockWorldAdapter — full observe/act/speak cycle."""

from infrastructure.world.adapters.mock import MockWorldAdapter
from infrastructure.world.types import ActionStatus, CommandInput, PerceptionOutput


class TestMockWorldAdapter:
    def test_observe_returns_perception(self):
        adapter = MockWorldAdapter(location="Vivec")
        perception = adapter.observe()
        assert isinstance(perception, PerceptionOutput)
        assert perception.location == "Vivec"
        assert perception.raw == {"adapter": "mock"}

    def test_observe_entities(self):
        adapter = MockWorldAdapter(entities=["Jiub", "Silt Strider"])
        perception = adapter.observe()
        assert perception.entities == ["Jiub", "Silt Strider"]

    def test_act_logs_command(self):
        adapter = MockWorldAdapter()
        cmd = CommandInput(action="move", target="north")
        result = adapter.act(cmd)
        assert result.status == ActionStatus.SUCCESS
        assert "move" in result.message
        assert len(adapter.action_log) == 1
        assert adapter.action_log[0].command.action == "move"

    def test_act_multiple_commands(self):
        adapter = MockWorldAdapter()
        adapter.act(CommandInput(action="attack"))
        adapter.act(CommandInput(action="defend"))
        adapter.act(CommandInput(action="retreat"))
        assert len(adapter.action_log) == 3

    def test_speak_logs_message(self):
        adapter = MockWorldAdapter()
        adapter.speak("Hello, traveler!")
        assert len(adapter.speech_log) == 1
        assert adapter.speech_log[0]["message"] == "Hello, traveler!"
        assert adapter.speech_log[0]["target"] is None

    def test_speak_with_target(self):
        adapter = MockWorldAdapter()
        adapter.speak("Die, scum!", target="Cliff Racer")
        assert adapter.speech_log[0]["target"] == "Cliff Racer"

    def test_lifecycle(self):
        adapter = MockWorldAdapter()
        assert adapter.is_connected is False
        adapter.connect()
        assert adapter.is_connected is True
        adapter.disconnect()
        assert adapter.is_connected is False

    def test_full_observe_act_speak_cycle(self):
        """Acceptance criterion: full observe/act/speak cycle passes."""
        adapter = MockWorldAdapter(
            location="Seyda Neen",
            entities=["Fargoth", "Hrisskar"],
            events=["quest_started"],
        )
        adapter.connect()

        # Observe
        perception = adapter.observe()
        assert perception.location == "Seyda Neen"
        assert len(perception.entities) == 2
        assert "quest_started" in perception.events

        # Act
        result = adapter.act(CommandInput(action="talk", target="Fargoth"))
        assert result.status == ActionStatus.SUCCESS

        # Speak
        adapter.speak("Where is your ring, Fargoth?", target="Fargoth")
        assert len(adapter.speech_log) == 1

        adapter.disconnect()
        assert adapter.is_connected is False
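From these assertions, the mock adapter's shape can be reconstructed: it starts disconnected, echoes its constructor arguments through `observe()`, appends every command to `action_log` (each entry exposing a `.command` attribute), and records speech as dicts. A sketch under those assumptions (plain dicts stand in for `PerceptionOutput`/`ActionResult`; the real adapter in `infrastructure/world/adapters/mock.py` returns the typed models):

```python
from dataclasses import dataclass


@dataclass
class _LoggedAction:
    """One entry in action_log, exposing the original command."""
    command: object


class MockWorldAdapter:
    """In-memory adapter: records every act() and speak() for assertions."""

    def __init__(self, location="", entities=None, events=None):
        self._location = location
        self._entities = list(entities or [])
        self._events = list(events or [])
        self._connected = False
        self.action_log = []
        self.speech_log = []

    @property
    def is_connected(self):
        return self._connected

    def connect(self):
        self._connected = True

    def disconnect(self):
        self._connected = False

    def observe(self):
        # Real code returns a PerceptionOutput; a dict stands in here.
        return {
            "location": self._location,
            "entities": self._entities,
            "events": self._events,
            "raw": {"adapter": "mock"},
        }

    def act(self, command):
        self.action_log.append(_LoggedAction(command=command))
        return {"status": "success", "message": f"executed {command}"}

    def speak(self, message, target=None):
        self.speech_log.append({"message": message, "target": target})
```

Recording rather than executing is what makes the heartbeat tests later in this diff deterministic: assertions inspect the logs instead of a live game server.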
@@ -1,68 +0,0 @@
"""Tests for the adapter registry."""

import pytest

from infrastructure.world.adapters.mock import MockWorldAdapter
from infrastructure.world.registry import AdapterRegistry


class TestAdapterRegistry:
    def test_register_and_get(self):
        reg = AdapterRegistry()
        reg.register("mock", MockWorldAdapter)
        adapter = reg.get("mock")
        assert isinstance(adapter, MockWorldAdapter)

    def test_register_with_kwargs(self):
        reg = AdapterRegistry()
        reg.register("mock", MockWorldAdapter)
        adapter = reg.get("mock", location="Custom Room")
        assert adapter._location == "Custom Room"

    def test_get_unknown_raises(self):
        reg = AdapterRegistry()
        with pytest.raises(KeyError):
            reg.get("nonexistent")

    def test_register_non_subclass_raises(self):
        reg = AdapterRegistry()
        with pytest.raises(TypeError):
            reg.register("bad", dict)

    def test_list_adapters(self):
        reg = AdapterRegistry()
        reg.register("beta", MockWorldAdapter)
        reg.register("alpha", MockWorldAdapter)
        assert reg.list_adapters() == ["alpha", "beta"]

    def test_contains(self):
        reg = AdapterRegistry()
        reg.register("mock", MockWorldAdapter)
        assert "mock" in reg
        assert "other" not in reg

    def test_len(self):
        reg = AdapterRegistry()
        assert len(reg) == 0
        reg.register("mock", MockWorldAdapter)
        assert len(reg) == 1

    def test_overwrite_warns(self, caplog):
        import logging

        reg = AdapterRegistry()
        reg.register("mock", MockWorldAdapter)
        with caplog.at_level(logging.WARNING):
            reg.register("mock", MockWorldAdapter)
        assert "Overwriting" in caplog.text


class TestModuleLevelRegistry:
    """Test the convenience functions in infrastructure.world.__init__."""

    def test_register_and_get(self):
        from infrastructure.world import get_adapter, register_adapter

        register_adapter("test_mock", MockWorldAdapter)
        adapter = get_adapter("test_mock")
        assert isinstance(adapter, MockWorldAdapter)
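The registry behaviour asserted above (class-not-instance registration, kwargs forwarded at `get()`, sorted listing, `in`/`len` support, overwrite warning, subclass check) can be sketched in a few lines. This is a hypothetical reconstruction, with a stand-in base class so the snippet is self-contained; the real `AdapterRegistry` checks against the actual `WorldInterface`:

```python
import logging

logger = logging.getLogger(__name__)


class WorldInterface:
    """Stand-in base class for this sketch."""


class AdapterRegistry:
    """Maps names to WorldInterface subclasses; get() instantiates them."""

    def __init__(self):
        self._classes = {}

    def register(self, name, adapter_cls):
        # Reject anything that is not a WorldInterface subclass.
        if not (isinstance(adapter_cls, type) and issubclass(adapter_cls, WorldInterface)):
            raise TypeError(f"{adapter_cls!r} is not a WorldInterface subclass")
        if name in self._classes:
            logger.warning("Overwriting adapter %r", name)
        self._classes[name] = adapter_cls

    def get(self, name, **kwargs):
        # Instantiate on demand, forwarding constructor kwargs.
        if name not in self._classes:
            raise KeyError(name)
        return self._classes[name](**kwargs)

    def list_adapters(self):
        return sorted(self._classes)

    def __contains__(self, name):
        return name in self._classes

    def __len__(self):
        return len(self._classes)
```

Storing classes rather than instances is the design choice the kwargs test exercises: each `get()` yields a fresh adapter configured for that caller.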
@@ -1,44 +0,0 @@
"""Tests for the TES3MP stub adapter."""

import pytest

from infrastructure.world.adapters.tes3mp import TES3MPWorldAdapter
from infrastructure.world.types import CommandInput


class TestTES3MPStub:
    """Acceptance criterion: stub imports cleanly and raises NotImplementedError."""

    def test_instantiates(self):
        adapter = TES3MPWorldAdapter(host="127.0.0.1", port=25565)
        assert adapter._host == "127.0.0.1"
        assert adapter._port == 25565

    def test_is_connected_default_false(self):
        adapter = TES3MPWorldAdapter()
        assert adapter.is_connected is False

    def test_connect_raises(self):
        adapter = TES3MPWorldAdapter()
        with pytest.raises(NotImplementedError, match="connect"):
            adapter.connect()

    def test_disconnect_raises(self):
        adapter = TES3MPWorldAdapter()
        with pytest.raises(NotImplementedError, match="disconnect"):
            adapter.disconnect()

    def test_observe_raises(self):
        adapter = TES3MPWorldAdapter()
        with pytest.raises(NotImplementedError, match="observe"):
            adapter.observe()

    def test_act_raises(self):
        adapter = TES3MPWorldAdapter()
        with pytest.raises(NotImplementedError, match="act"):
            adapter.act(CommandInput(action="move"))

    def test_speak_raises(self):
        adapter = TES3MPWorldAdapter()
        with pytest.raises(NotImplementedError, match="speak"):
            adapter.speak("Hello")
@@ -58,55 +58,6 @@ class TestDetectIssueFromBranch:
        assert mod.detect_issue_from_branch() is None


class TestConsumeOnce:
    """cycle_result.json must be deleted after reading."""

    def test_cycle_result_deleted_after_read(self, mod, tmp_path):
        """After _load_cycle_result() data is consumed in main(), the file is deleted."""
        result_file = tmp_path / "cycle_result.json"
        result_file.write_text('{"issue": 42, "type": "bug"}')

        with (
            patch.object(mod, "CYCLE_RESULT_FILE", result_file),
            patch.object(mod, "RETRO_FILE", tmp_path / "retro" / "cycles.jsonl"),
            patch.object(mod, "SUMMARY_FILE", tmp_path / "retro" / "summary.json"),
            patch.object(mod, "EPOCH_COUNTER_FILE", tmp_path / "retro" / ".epoch_counter"),
            patch(
                "sys.argv",
                ["cycle_retro", "--cycle", "1", "--success", "--main-green", "--duration", "60"],
            ),
        ):
            mod.main()

        assert not result_file.exists(), "cycle_result.json should be deleted after consumption"

    def test_cycle_result_not_deleted_when_empty(self, mod, tmp_path):
        """If cycle_result.json doesn't exist, no error occurs."""
        result_file = tmp_path / "nonexistent_result.json"

        with (
            patch.object(mod, "CYCLE_RESULT_FILE", result_file),
            patch.object(mod, "RETRO_FILE", tmp_path / "retro" / "cycles.jsonl"),
            patch.object(mod, "SUMMARY_FILE", tmp_path / "retro" / "summary.json"),
            patch.object(mod, "EPOCH_COUNTER_FILE", tmp_path / "retro" / ".epoch_counter"),
            patch(
                "sys.argv",
                [
                    "cycle_retro",
                    "--cycle",
                    "1",
                    "--success",
                    "--main-green",
                    "--duration",
                    "60",
                    "--issue",
                    "10",
                ],
            ),
        ):
            mod.main()  # Should not raise


class TestBackfillExtractIssueNumber:
    """Tests for backfill_retro.extract_issue_number PR-number filtering."""
@@ -1,176 +0,0 @@
"""Tests for Heartbeat v2 — WorldInterface-driven cognitive loop.

Acceptance criteria:
- With MockWorldAdapter: heartbeat runs, logs show observe→reason→act→reflect
- Without adapter: existing think_once() behaviour unchanged
- WebSocket broadcasts include current action and reasoning summary
"""

from unittest.mock import AsyncMock, patch

import pytest

from infrastructure.world.adapters.mock import MockWorldAdapter
from infrastructure.world.types import ActionStatus
from loop.heartbeat import CycleRecord, Heartbeat


@pytest.fixture
def mock_adapter():
    adapter = MockWorldAdapter(
        location="Balmora",
        entities=["Guard", "Merchant"],
        events=["player_entered"],
    )
    adapter.connect()
    return adapter


class TestHeartbeatWithAdapter:
    """With MockWorldAdapter: heartbeat runs full embodied cycle."""

    @pytest.mark.asyncio
    async def test_run_once_returns_cycle_record(self, mock_adapter):
        hb = Heartbeat(world=mock_adapter)
        record = await hb.run_once()
        assert isinstance(record, CycleRecord)
        assert record.cycle_id == 1

    @pytest.mark.asyncio
    async def test_observation_populated(self, mock_adapter):
        hb = Heartbeat(world=mock_adapter)
        record = await hb.run_once()
        assert record.observation["location"] == "Balmora"
        assert "Guard" in record.observation["entities"]
        assert "player_entered" in record.observation["events"]

    @pytest.mark.asyncio
    async def test_action_dispatched_to_world(self, mock_adapter):
        """Act phase should dispatch to world.act() for non-idle actions."""
        hb = Heartbeat(world=mock_adapter)
        record = await hb.run_once()
        # The default loop phases don't set an explicit action, so it
        # falls through to "idle" → NOOP. That's correct behaviour —
        # the real LLM-powered reason phase will set action metadata.
        assert record.action_status in (
            ActionStatus.NOOP.value,
            ActionStatus.SUCCESS.value,
        )

    @pytest.mark.asyncio
    async def test_reflect_notes_present(self, mock_adapter):
        hb = Heartbeat(world=mock_adapter)
        record = await hb.run_once()
        assert "Balmora" in record.reflect_notes

    @pytest.mark.asyncio
    async def test_cycle_count_increments(self, mock_adapter):
        hb = Heartbeat(world=mock_adapter)
        await hb.run_once()
        await hb.run_once()
        assert hb.cycle_count == 2
        assert len(hb.history) == 2

    @pytest.mark.asyncio
    async def test_duration_recorded(self, mock_adapter):
        hb = Heartbeat(world=mock_adapter)
        record = await hb.run_once()
        assert record.duration_ms >= 0

    @pytest.mark.asyncio
    async def test_on_cycle_callback(self, mock_adapter):
        received = []

        async def callback(record):
            received.append(record)

        hb = Heartbeat(world=mock_adapter, on_cycle=callback)
        await hb.run_once()
        assert len(received) == 1
        assert received[0].cycle_id == 1


class TestHeartbeatWithoutAdapter:
    """Without adapter: existing think_once() behaviour unchanged."""

    @pytest.mark.asyncio
    async def test_passive_cycle(self):
        hb = Heartbeat(world=None)
        record = await hb.run_once()
        assert record.action_taken == "think"
        assert record.action_status == "noop"
        assert "Passive" in record.reflect_notes

    @pytest.mark.asyncio
    async def test_passive_no_observation(self):
        hb = Heartbeat(world=None)
        record = await hb.run_once()
        assert record.observation == {}


class TestHeartbeatLifecycle:
    def test_interval_property(self):
        hb = Heartbeat(interval=60.0)
        assert hb.interval == 60.0
        hb.interval = 10.0
        assert hb.interval == 10.0

    def test_interval_minimum(self):
        hb = Heartbeat()
        hb.interval = 0.1
        assert hb.interval == 1.0

    def test_world_property(self):
        hb = Heartbeat()
        assert hb.world is None
        adapter = MockWorldAdapter()
        hb.world = adapter
        assert hb.world is adapter

    def test_stop_sets_flag(self):
        hb = Heartbeat()
        assert not hb.is_running
        hb.stop()
        assert not hb.is_running


class TestHeartbeatBroadcast:
    """WebSocket broadcasts include action and reasoning summary."""

    @pytest.mark.asyncio
    async def test_broadcast_called(self, mock_adapter):
        with patch(
            "loop.heartbeat.ws_manager",
            create=True,
        ) as mock_ws:
            mock_ws.broadcast = AsyncMock()
            # Patch the import inside heartbeat
            with patch("infrastructure.ws_manager.handler.ws_manager") as ws_mod:
                ws_mod.broadcast = AsyncMock()
                hb = Heartbeat(world=mock_adapter)
                await hb.run_once()
                ws_mod.broadcast.assert_called_once()
                call_args = ws_mod.broadcast.call_args
                assert call_args[0][0] == "heartbeat.cycle"
                data = call_args[0][1]
                assert "action" in data
                assert "reasoning_summary" in data
                assert "observation" in data


class TestHeartbeatLog:
    """Verify logging of observe→reason→act→reflect cycle."""

    @pytest.mark.asyncio
    async def test_embodied_cycle_logs(self, mock_adapter, caplog):
        import logging

        with caplog.at_level(logging.INFO):
            hb = Heartbeat(world=mock_adapter)
            await hb.run_once()

        messages = caplog.text
        assert "Phase 1 (Gather)" in messages
        assert "Phase 2 (Reason)" in messages
        assert "Phase 3 (Act)" in messages
        assert "Heartbeat cycle #1 complete" in messages
@@ -1,97 +0,0 @@
"""Tests for load_queue corrupt JSON handling in loop_guard.py."""

from __future__ import annotations

import json
from pathlib import Path

import pytest
import scripts.loop_guard as lg


@pytest.fixture(autouse=True)
def _isolate(tmp_path, monkeypatch):
    """Redirect loop_guard paths to tmp_path for isolation."""
    monkeypatch.setattr(lg, "QUEUE_FILE", tmp_path / "queue.json")
    monkeypatch.setattr(lg, "IDLE_STATE_FILE", tmp_path / "idle_state.json")
    monkeypatch.setattr(lg, "CYCLE_RESULT_FILE", tmp_path / "cycle_result.json")
    monkeypatch.setattr(lg, "GITEA_API", "http://test:3000/api/v1")
    monkeypatch.setattr(lg, "REPO_SLUG", "owner/repo")


def test_load_queue_missing_file(tmp_path):
    """Missing queue file returns empty list."""
    result = lg.load_queue()
    assert result == []


def test_load_queue_valid_data(tmp_path):
    """Valid queue.json returns ready items."""
    data = [
        {"issue": 1, "title": "Ready issue", "ready": True},
        {"issue": 2, "title": "Not ready", "ready": False},
    ]
    lg.QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
    lg.QUEUE_FILE.write_text(json.dumps(data, indent=2))

    result = lg.load_queue()
    assert len(result) == 1
    assert result[0]["issue"] == 1


def test_load_queue_corrupt_json_logs_warning(tmp_path, capsys):
    """Corrupt queue.json returns empty list and logs warning."""
    lg.QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
    lg.QUEUE_FILE.write_text("not valid json {{{")

    result = lg.load_queue()
    assert result == []

    captured = capsys.readouterr()
    assert "WARNING" in captured.out
    assert "Corrupt queue.json" in captured.out


def test_load_queue_not_a_list(tmp_path):
    """Queue.json that is not a list returns empty list."""
    lg.QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
    lg.QUEUE_FILE.write_text(json.dumps({"not": "a list"}))

    result = lg.load_queue()
    assert result == []


def test_load_queue_no_ready_items(tmp_path):
    """Queue with no ready items returns empty list."""
    data = [
        {"issue": 1, "title": "Not ready 1", "ready": False},
        {"issue": 2, "title": "Not ready 2", "ready": False},
    ]
    lg.QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
    lg.QUEUE_FILE.write_text(json.dumps(data, indent=2))

    result = lg.load_queue()
    assert result == []


def test_load_queue_oserror_logs_warning(tmp_path, monkeypatch, capsys):
    """OSError when reading queue.json returns empty list and logs warning."""
    lg.QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
    lg.QUEUE_FILE.write_text("[]")

    # Mock Path.read_text to raise OSError
    original_read_text = Path.read_text

    def mock_read_text(self, *args, **kwargs):
        if self.name == "queue.json":
            raise OSError("Permission denied")
        return original_read_text(self, *args, **kwargs)

    monkeypatch.setattr(Path, "read_text", mock_read_text)

    result = lg.load_queue()
    assert result == []

    captured = capsys.readouterr()
    assert "WARNING" in captured.out
    assert "Cannot read queue.json" in captured.out
@@ -1,159 +0,0 @@
"""Tests for queue.json validation and backup in triage_score.py."""

from __future__ import annotations

import json

import pytest
import scripts.triage_score as ts


@pytest.fixture(autouse=True)
def _isolate(tmp_path, monkeypatch):
    """Redirect triage_score paths to tmp_path for isolation."""
    monkeypatch.setattr(ts, "QUEUE_FILE", tmp_path / "queue.json")
    monkeypatch.setattr(ts, "QUEUE_BACKUP_FILE", tmp_path / "queue.json.bak")
    monkeypatch.setattr(ts, "RETRO_FILE", tmp_path / "retro" / "triage.jsonl")
    monkeypatch.setattr(ts, "QUARANTINE_FILE", tmp_path / "quarantine.json")
    monkeypatch.setattr(ts, "CYCLE_RETRO_FILE", tmp_path / "retro" / "cycles.jsonl")


def test_backup_created_on_write(tmp_path):
    """When writing queue.json, a backup should be created from previous valid file."""
    # Create initial valid queue file
    initial_data = [{"issue": 1, "title": "Test", "ready": True}]
    ts.QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
    ts.QUEUE_FILE.write_text(json.dumps(initial_data))

    # Write new data
    new_data = [{"issue": 2, "title": "New", "ready": True}]
    ts.QUEUE_FILE.write_text(json.dumps(new_data, indent=2) + "\n")

    # Manually run the backup logic as run_triage would
    if ts.QUEUE_FILE.exists():
        try:
            json.loads(ts.QUEUE_FILE.read_text())
            ts.QUEUE_BACKUP_FILE.write_text(ts.QUEUE_FILE.read_text())
        except (json.JSONDecodeError, OSError):
            pass

    # Both files should exist with same content
    assert ts.QUEUE_BACKUP_FILE.exists()
    assert json.loads(ts.QUEUE_BACKUP_FILE.read_text()) == new_data


def test_corrupt_queue_restored_from_backup(tmp_path, capsys):
    """If queue.json is corrupt, it should be restored from backup."""
    # Create a valid backup
    valid_data = [{"issue": 1, "title": "Backup", "ready": True}]
    ts.QUEUE_BACKUP_FILE.parent.mkdir(parents=True, exist_ok=True)
    ts.QUEUE_BACKUP_FILE.write_text(json.dumps(valid_data, indent=2) + "\n")

    # Create a corrupt queue file
    ts.QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
    ts.QUEUE_FILE.write_text("not valid json {{{")

    # Run validation and restore logic
    try:
        json.loads(ts.QUEUE_FILE.read_text())
    except (json.JSONDecodeError, OSError):
        if ts.QUEUE_BACKUP_FILE.exists():
            try:
                backup_data = ts.QUEUE_BACKUP_FILE.read_text()
                json.loads(backup_data)  # Validate backup
                ts.QUEUE_FILE.write_text(backup_data)
                print("[triage] Restored queue.json from backup")
            except (json.JSONDecodeError, OSError):
                ts.QUEUE_FILE.write_text("[]\n")
        else:
            ts.QUEUE_FILE.write_text("[]\n")

    # Queue should be restored from backup
    assert json.loads(ts.QUEUE_FILE.read_text()) == valid_data
    captured = capsys.readouterr()
    assert "Restored queue.json from backup" in captured.out


def test_corrupt_queue_no_backup_writes_empty_list(tmp_path):
    """If queue.json is corrupt and no backup exists, write empty list."""
    # Ensure no backup exists
    assert not ts.QUEUE_BACKUP_FILE.exists()

    # Create a corrupt queue file
    ts.QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
    ts.QUEUE_FILE.write_text("not valid json {{{")

    # Run validation and restore logic
    try:
        json.loads(ts.QUEUE_FILE.read_text())
    except (json.JSONDecodeError, OSError):
        if ts.QUEUE_BACKUP_FILE.exists():
            try:
                backup_data = ts.QUEUE_BACKUP_FILE.read_text()
                json.loads(backup_data)
                ts.QUEUE_FILE.write_text(backup_data)
            except (json.JSONDecodeError, OSError):
                ts.QUEUE_FILE.write_text("[]\n")
        else:
            ts.QUEUE_FILE.write_text("[]\n")

    # Should have empty list
    assert json.loads(ts.QUEUE_FILE.read_text()) == []


def test_corrupt_backup_writes_empty_list(tmp_path):
    """If both queue.json and backup are corrupt, write empty list."""
    # Create a corrupt backup
    ts.QUEUE_BACKUP_FILE.parent.mkdir(parents=True, exist_ok=True)
    ts.QUEUE_BACKUP_FILE.write_text("also corrupt backup")

    # Create a corrupt queue file
    ts.QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
    ts.QUEUE_FILE.write_text("not valid json {{{")

    # Run validation and restore logic
    try:
        json.loads(ts.QUEUE_FILE.read_text())
    except (json.JSONDecodeError, OSError):
|
||||
if ts.QUEUE_BACKUP_FILE.exists():
|
||||
try:
|
||||
backup_data = ts.QUEUE_BACKUP_FILE.read_text()
|
||||
json.loads(backup_data)
|
||||
ts.QUEUE_FILE.write_text(backup_data)
|
||||
except (json.JSONDecodeError, OSError):
|
||||
ts.QUEUE_FILE.write_text("[]\n")
|
||||
else:
|
||||
ts.QUEUE_FILE.write_text("[]\n")
|
||||
|
||||
# Should have empty list
|
||||
assert json.loads(ts.QUEUE_FILE.read_text()) == []
|
||||
|
||||
|
||||
def test_valid_queue_not_corrupt_no_backup_overwrite(tmp_path):
|
||||
"""Don't overwrite backup if current queue.json is corrupt."""
|
||||
# Create a valid backup
|
||||
valid_backup = [{"issue": 99, "title": "Old Backup", "ready": True}]
|
||||
ts.QUEUE_BACKUP_FILE.parent.mkdir(parents=True, exist_ok=True)
|
||||
ts.QUEUE_BACKUP_FILE.write_text(json.dumps(valid_backup, indent=2) + "\n")
|
||||
|
||||
# Create a corrupt queue file
|
||||
ts.QUEUE_FILE.parent.mkdir(parents=True, exist_ok=True)
|
||||
ts.QUEUE_FILE.write_text("corrupt data")
|
||||
|
||||
# Try to save backup (should skip because current is corrupt)
|
||||
if ts.QUEUE_FILE.exists():
|
||||
try:
|
||||
json.loads(ts.QUEUE_FILE.read_text()) # This will fail
|
||||
ts.QUEUE_BACKUP_FILE.write_text(ts.QUEUE_FILE.read_text())
|
||||
except (json.JSONDecodeError, OSError):
|
||||
pass # Should hit this branch
|
||||
|
||||
# Backup should still have original valid data
|
||||
assert json.loads(ts.QUEUE_BACKUP_FILE.read_text()) == valid_backup
|
||||
|
||||
|
||||
def test_backup_path_configuration():
|
||||
"""Ensure backup file path is properly configured relative to queue file."""
|
||||
assert ts.QUEUE_BACKUP_FILE.parent == ts.QUEUE_FILE.parent
|
||||
assert ts.QUEUE_BACKUP_FILE.name == "queue.json.bak"
|
||||
assert ts.QUEUE_FILE.name == "queue.json"
|
||||
266 tests/test_command_log.py Normal file
@@ -0,0 +1,266 @@

"""Tests for the Morrowind command log and training export pipeline."""

import json
from datetime import UTC, datetime, timedelta
from pathlib import Path

import pytest

from src.infrastructure.morrowind.command_log import CommandLog, CommandLogger
from src.infrastructure.morrowind.schemas import (
    CommandInput,
    CommandType,
    PerceptionOutput,
)
from src.infrastructure.morrowind.training_export import TrainingExporter


# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

NOW = datetime(2026, 3, 21, 14, 30, 0, tzinfo=UTC)


def _make_perception(**overrides) -> PerceptionOutput:
    defaults = {
        "timestamp": NOW,
        "agent_id": "timmy",
        "location": {"cell": "Balmora", "x": 1024.5, "y": -512.3, "z": 64.0},
        "health": {"current": 85, "max": 100},
    }
    defaults.update(overrides)
    return PerceptionOutput(**defaults)


def _make_command(**overrides) -> CommandInput:
    defaults = {
        "timestamp": NOW,
        "agent_id": "timmy",
        "command": "move_to",
        "params": {"target_x": 1050.0},
        "reasoning": "Moving closer to quest target.",
    }
    defaults.update(overrides)
    return CommandInput(**defaults)


@pytest.fixture
def logger(tmp_path: Path) -> CommandLogger:
    """CommandLogger backed by a temporary file-based SQLite DB."""
    db_path = tmp_path / "test.db"
    return CommandLogger(db_url=f"sqlite:///{db_path}")


@pytest.fixture
def exporter(logger: CommandLogger) -> TrainingExporter:
    return TrainingExporter(logger)


# ---------------------------------------------------------------------------
# CommandLogger — log_command
# ---------------------------------------------------------------------------


class TestLogCommand:
    def test_basic_log(self, logger: CommandLogger):
        cmd = _make_command()
        row_id = logger.log_command(cmd)
        assert row_id >= 1

    def test_log_with_perception(self, logger: CommandLogger):
        cmd = _make_command()
        perception = _make_perception()
        row_id = logger.log_command(cmd, perception=perception)
        assert row_id >= 1

        results = logger.query(limit=1)
        assert len(results) == 1
        assert results[0]["cell"] == "Balmora"
        assert results[0]["perception_snapshot"]["location"]["cell"] == "Balmora"

    def test_log_with_outcome(self, logger: CommandLogger):
        cmd = _make_command()
        logger.log_command(cmd, outcome="success: arrived at destination")
        results = logger.query(limit=1)
        assert results[0]["outcome"] == "success: arrived at destination"

    def test_log_preserves_episode_id(self, logger: CommandLogger):
        cmd = _make_command(episode_id="ep_test_001")
        logger.log_command(cmd)
        results = logger.query(episode_id="ep_test_001")
        assert len(results) == 1
        assert results[0]["episode_id"] == "ep_test_001"


# ---------------------------------------------------------------------------
# CommandLogger — query
# ---------------------------------------------------------------------------


class TestQuery:
    def test_filter_by_command_type(self, logger: CommandLogger):
        logger.log_command(_make_command(command="move_to"))
        logger.log_command(_make_command(command="noop"))
        logger.log_command(_make_command(command="move_to"))

        results = logger.query(command_type="move_to")
        assert len(results) == 2
        assert all(r["command"] == "move_to" for r in results)

    def test_filter_by_cell(self, logger: CommandLogger):
        p1 = _make_perception(location={"cell": "Balmora", "x": 0, "y": 0, "z": 0})
        p2 = _make_perception(location={"cell": "Vivec", "x": 0, "y": 0, "z": 0})
        logger.log_command(_make_command(), perception=p1)
        logger.log_command(_make_command(), perception=p2)

        results = logger.query(cell="Vivec")
        assert len(results) == 1
        assert results[0]["cell"] == "Vivec"

    def test_filter_by_time_range(self, logger: CommandLogger):
        t1 = NOW - timedelta(hours=2)
        t2 = NOW - timedelta(hours=1)
        t3 = NOW

        logger.log_command(_make_command(timestamp=t1.isoformat()))
        logger.log_command(_make_command(timestamp=t2.isoformat()))
        logger.log_command(_make_command(timestamp=t3.isoformat()))

        results = logger.query(since=NOW - timedelta(hours=1, minutes=30), until=NOW)
        assert len(results) == 2

    def test_limit_and_offset(self, logger: CommandLogger):
        for _ in range(5):
            logger.log_command(_make_command())

        results = logger.query(limit=2, offset=0)
        assert len(results) == 2

        results = logger.query(limit=10, offset=3)
        assert len(results) == 2

    def test_empty_query(self, logger: CommandLogger):
        results = logger.query()
        assert results == []


# ---------------------------------------------------------------------------
# CommandLogger — export_training_data (JSONL)
# ---------------------------------------------------------------------------


class TestExportTrainingData:
    def test_basic_export(self, logger: CommandLogger, tmp_path: Path):
        perception = _make_perception()
        for _ in range(3):
            logger.log_command(_make_command(), perception=perception)

        output = tmp_path / "train.jsonl"
        count = logger.export_training_data(output)
        assert count == 3
        assert output.exists()

        lines = output.read_text().strip().split("\n")
        assert len(lines) == 3
        record = json.loads(lines[0])
        assert "input" in record
        assert "output" in record
        assert record["output"]["command"] == "move_to"

    def test_export_filter_by_episode(self, logger: CommandLogger, tmp_path: Path):
        logger.log_command(_make_command(episode_id="ep_a"), perception=_make_perception())
        logger.log_command(_make_command(episode_id="ep_b"), perception=_make_perception())

        output = tmp_path / "ep_a.jsonl"
        count = logger.export_training_data(output, episode_id="ep_a")
        assert count == 1


# ---------------------------------------------------------------------------
# CommandLogger — storage management
# ---------------------------------------------------------------------------


class TestStorageManagement:
    def test_count(self, logger: CommandLogger):
        assert logger.count() == 0
        logger.log_command(_make_command())
        logger.log_command(_make_command())
        assert logger.count() == 2

    def test_rotate_old_entries(self, logger: CommandLogger):
        old_time = NOW - timedelta(days=100)
        logger.log_command(_make_command(timestamp=old_time.isoformat()))
        logger.log_command(_make_command(timestamp=NOW.isoformat()))

        deleted = logger.rotate(max_age_days=90)
        assert deleted == 1
        assert logger.count() == 1

    def test_rotate_nothing_to_delete(self, logger: CommandLogger):
        logger.log_command(_make_command(timestamp=NOW.isoformat()))
        deleted = logger.rotate(max_age_days=1)
        assert deleted == 0


# ---------------------------------------------------------------------------
# TrainingExporter — chat format
# ---------------------------------------------------------------------------


class TestTrainingExporterChat:
    def test_chat_format_export(
        self, logger: CommandLogger, exporter: TrainingExporter, tmp_path: Path
    ):
        perception = _make_perception()
        for _ in range(3):
            logger.log_command(_make_command(), perception=perception)

        output = tmp_path / "chat.jsonl"
        stats = exporter.export_chat_format(output)
        assert stats.total_records == 3
        assert stats.format == "chat_completion"

        lines = output.read_text().strip().split("\n")
        record = json.loads(lines[0])
        assert record["messages"][0]["role"] == "system"
        assert record["messages"][1]["role"] == "user"
        assert record["messages"][2]["role"] == "assistant"


# ---------------------------------------------------------------------------
# TrainingExporter — episode sequences
# ---------------------------------------------------------------------------


class TestTrainingExporterEpisodes:
    def test_episode_export(
        self, logger: CommandLogger, exporter: TrainingExporter, tmp_path: Path
    ):
        perception = _make_perception()
        for _ in range(5):
            logger.log_command(
                _make_command(episode_id="ep_test"),
                perception=perception,
            )

        output_dir = tmp_path / "episodes"
        stats = exporter.export_episode_sequences(output_dir, min_length=3)
        assert stats.episodes_exported == 1
        assert stats.total_records == 5
        assert (output_dir / "ep_test.jsonl").exists()

    def test_short_episodes_skipped(
        self, logger: CommandLogger, exporter: TrainingExporter, tmp_path: Path
    ):
        perception = _make_perception()
        logger.log_command(_make_command(episode_id="short"), perception=perception)

        output_dir = tmp_path / "episodes"
        stats = exporter.export_episode_sequences(output_dir, min_length=3)
        assert stats.episodes_exported == 0
        assert stats.skipped_records == 1
242 tests/test_morrowind_schemas.py Normal file
@@ -0,0 +1,242 @@

"""Tests for Morrowind Perception/Command protocol Pydantic schemas."""

from datetime import UTC, datetime

import pytest
from pydantic import ValidationError

from src.infrastructure.morrowind.schemas import (
    PROTOCOL_VERSION,
    CommandContext,
    CommandInput,
    CommandType,
    EntityType,
    Environment,
    HealthStatus,
    InventorySummary,
    Location,
    NearbyEntity,
    PerceptionOutput,
    QuestInfo,
)


# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

NOW = datetime(2026, 3, 21, 14, 30, 0, tzinfo=UTC)


def _make_perception(**overrides) -> PerceptionOutput:
    defaults = {
        "timestamp": NOW,
        "agent_id": "timmy",
        "location": {"cell": "Balmora", "x": 1024.5, "y": -512.3, "z": 64.0, "interior": False},
        "health": {"current": 85, "max": 100},
    }
    defaults.update(overrides)
    return PerceptionOutput(**defaults)


def _make_command(**overrides) -> CommandInput:
    defaults = {
        "timestamp": NOW,
        "agent_id": "timmy",
        "command": "move_to",
        "params": {"target_cell": "Balmora", "target_x": 1050.0},
        "reasoning": "Moving closer to the quest target.",
    }
    defaults.update(overrides)
    return CommandInput(**defaults)


# ---------------------------------------------------------------------------
# PerceptionOutput tests
# ---------------------------------------------------------------------------


class TestPerceptionOutput:
    def test_minimal_valid(self):
        p = _make_perception()
        assert p.protocol_version == PROTOCOL_VERSION
        assert p.agent_id == "timmy"
        assert p.location.cell == "Balmora"
        assert p.health.current == 85
        assert p.nearby_entities == []
        assert p.active_quests == []

    def test_full_payload(self):
        p = _make_perception(
            nearby_entities=[
                {
                    "entity_id": "npc_001",
                    "name": "Caius Cosades",
                    "entity_type": "npc",
                    "distance": 12.5,
                    "disposition": 65,
                }
            ],
            inventory_summary={"gold": 150, "item_count": 23, "encumbrance_pct": 0.45},
            active_quests=[{"quest_id": "mq_01", "name": "Report to Caius", "stage": 10}],
            environment={
                "time_of_day": "afternoon",
                "weather": "clear",
                "is_combat": False,
                "is_dialogue": False,
            },
            raw_engine_data={"tes3mp_version": "0.8.1"},
        )
        assert len(p.nearby_entities) == 1
        assert p.nearby_entities[0].entity_type == EntityType.NPC
        assert p.inventory_summary.gold == 150
        assert p.active_quests[0].quest_id == "mq_01"
        assert p.raw_engine_data["tes3mp_version"] == "0.8.1"

    def test_serialization_roundtrip(self):
        p = _make_perception()
        json_str = p.model_dump_json()
        p2 = PerceptionOutput.model_validate_json(json_str)
        assert p2.location.cell == p.location.cell
        assert p2.health.current == p.health.current

    def test_missing_required_fields(self):
        with pytest.raises(ValidationError):
            PerceptionOutput(timestamp=NOW, agent_id="timmy")  # no location/health

    def test_default_protocol_version(self):
        p = _make_perception()
        assert p.protocol_version == "1.0.0"


# ---------------------------------------------------------------------------
# Health validation
# ---------------------------------------------------------------------------


class TestHealthStatus:
    def test_current_cannot_exceed_max(self):
        with pytest.raises(ValidationError, match="cannot exceed max"):
            HealthStatus(current=150, max=100)

    def test_max_must_be_positive(self):
        with pytest.raises(ValidationError):
            HealthStatus(current=0, max=0)

    def test_current_can_be_zero(self):
        h = HealthStatus(current=0, max=100)
        assert h.current == 0


# ---------------------------------------------------------------------------
# Location
# ---------------------------------------------------------------------------


class TestLocation:
    def test_defaults(self):
        loc = Location(cell="Seyda Neen", x=0.0, y=0.0)
        assert loc.z == 0.0
        assert loc.interior is False


# ---------------------------------------------------------------------------
# NearbyEntity
# ---------------------------------------------------------------------------


class TestNearbyEntity:
    def test_all_entity_types(self):
        for et in EntityType:
            e = NearbyEntity(entity_id="e1", name="Test", entity_type=et, distance=1.0)
            assert e.entity_type == et

    def test_invalid_entity_type(self):
        with pytest.raises(ValidationError):
            NearbyEntity(entity_id="e1", name="Test", entity_type="dragon", distance=1.0)

    def test_negative_distance_rejected(self):
        with pytest.raises(ValidationError):
            NearbyEntity(entity_id="e1", name="Test", entity_type="npc", distance=-5.0)


# ---------------------------------------------------------------------------
# InventorySummary
# ---------------------------------------------------------------------------


class TestInventorySummary:
    def test_encumbrance_bounds(self):
        with pytest.raises(ValidationError):
            InventorySummary(encumbrance_pct=1.5)
        with pytest.raises(ValidationError):
            InventorySummary(encumbrance_pct=-0.1)

    def test_defaults(self):
        inv = InventorySummary()
        assert inv.gold == 0
        assert inv.item_count == 0
        assert inv.encumbrance_pct == 0.0


# ---------------------------------------------------------------------------
# CommandInput tests
# ---------------------------------------------------------------------------


class TestCommandInput:
    def test_minimal_valid(self):
        c = _make_command()
        assert c.command == CommandType.MOVE_TO
        assert c.reasoning == "Moving closer to the quest target."
        assert c.episode_id is None

    def test_all_command_types(self):
        for ct in CommandType:
            c = _make_command(command=ct.value)
            assert c.command == ct

    def test_invalid_command_type(self):
        with pytest.raises(ValidationError):
            _make_command(command="fly_to_moon")

    def test_reasoning_required(self):
        with pytest.raises(ValidationError):
            CommandInput(
                timestamp=NOW,
                agent_id="timmy",
                command="noop",
                reasoning="",  # min_length=1
            )

    def test_with_episode_and_context(self):
        c = _make_command(
            episode_id="ep_001",
            context={"perception_timestamp": NOW, "heartbeat_cycle": 42},
        )
        assert c.episode_id == "ep_001"
        assert c.context.heartbeat_cycle == 42

    def test_serialization_roundtrip(self):
        c = _make_command(episode_id="ep_002")
        json_str = c.model_dump_json()
        c2 = CommandInput.model_validate_json(json_str)
        assert c2.command == c.command
        assert c2.episode_id == c.episode_id


# ---------------------------------------------------------------------------
# Enum coverage
# ---------------------------------------------------------------------------


class TestEnums:
    def test_entity_type_values(self):
        assert set(EntityType) == {"npc", "creature", "item", "door", "container"}

    def test_command_type_values(self):
        expected = {
            "move_to", "interact", "use_item", "wait",
            "combat_action", "dialogue", "journal_note", "noop",
        }
        assert set(CommandType) == expected
@@ -1,158 +0,0 @@

"""Unit tests for the web_fetch tool in timmy.tools."""

from __future__ import annotations

from unittest.mock import MagicMock, patch

from timmy.tools import web_fetch


class TestWebFetch:
    """Tests for the web_fetch function."""

    def test_invalid_url_no_scheme(self):
        """URLs without an http(s) scheme are rejected."""
        result = web_fetch("example.com")
        assert "Error: invalid URL" in result

    def test_invalid_url_empty(self):
        """An empty URL is rejected."""
        result = web_fetch("")
        assert "Error: invalid URL" in result

    def test_invalid_url_ftp(self):
        """Non-HTTP schemes are rejected."""
        result = web_fetch("ftp://example.com")
        assert "Error: invalid URL" in result

    @patch("timmy.tools.trafilatura", create=True)
    @patch("timmy.tools._requests", create=True)
    def test_successful_fetch(self, mock_requests, mock_trafilatura):
        """Happy path: fetch + extract returns text."""
        # We need to patch at import level inside the function
        mock_resp = MagicMock()
        mock_resp.text = "<html><body><p>Hello world</p></body></html>"

        with patch.dict(
            "sys.modules", {"requests": mock_requests, "trafilatura": mock_trafilatura}
        ):
            mock_requests.get.return_value = mock_resp
            mock_requests.exceptions = _make_exceptions()
            mock_trafilatura.extract.return_value = "Hello world"

            result = web_fetch("https://example.com")

        assert result == "Hello world"

    @patch.dict("sys.modules", {"requests": MagicMock(), "trafilatura": MagicMock()})
    def test_truncation(self):
        """Long text is truncated to max_tokens * 4 chars."""
        import sys

        mock_trafilatura = sys.modules["trafilatura"]
        mock_requests = sys.modules["requests"]

        long_text = "a" * 20000
        mock_resp = MagicMock()
        mock_resp.text = "<html><body>" + long_text + "</body></html>"
        mock_requests.get.return_value = mock_resp
        mock_requests.exceptions = _make_exceptions()
        mock_trafilatura.extract.return_value = long_text

        result = web_fetch("https://example.com", max_tokens=100)

        # 100 tokens * 4 chars = 400 chars max
        assert len(result) < 500
        assert "[…truncated" in result

    @patch.dict("sys.modules", {"requests": MagicMock(), "trafilatura": MagicMock()})
    def test_extraction_failure(self):
        """Returns an error when trafilatura can't extract text."""
        import sys

        mock_trafilatura = sys.modules["trafilatura"]
        mock_requests = sys.modules["requests"]

        mock_resp = MagicMock()
        mock_resp.text = "<html></html>"
        mock_requests.get.return_value = mock_resp
        mock_requests.exceptions = _make_exceptions()
        mock_trafilatura.extract.return_value = None

        result = web_fetch("https://example.com")
        assert "Error: could not extract" in result

    @patch.dict("sys.modules", {"trafilatura": MagicMock()})
    def test_timeout(self):
        """Timeout errors are handled gracefully."""
        mock_requests = MagicMock()
        exc_mod = _make_exceptions()
        mock_requests.exceptions = exc_mod
        mock_requests.get.side_effect = exc_mod.Timeout("timed out")

        with patch.dict("sys.modules", {"requests": mock_requests}):
            result = web_fetch("https://example.com")

        assert "timed out" in result

    @patch.dict("sys.modules", {"trafilatura": MagicMock()})
    def test_http_error(self):
        """HTTP errors (404, 500, etc.) are handled gracefully."""
        mock_requests = MagicMock()
        exc_mod = _make_exceptions()
        mock_requests.exceptions = exc_mod

        mock_response = MagicMock()
        mock_response.status_code = 404
        mock_requests.get.return_value.raise_for_status.side_effect = exc_mod.HTTPError(
            response=mock_response
        )

        with patch.dict("sys.modules", {"requests": mock_requests}):
            result = web_fetch("https://example.com/nope")

        assert "404" in result

    def test_missing_requests(self):
        """Graceful error when requests is not installed."""
        with patch.dict("sys.modules", {"requests": None}):
            result = web_fetch("https://example.com")
        assert "requests" in result and "not installed" in result

    def test_missing_trafilatura(self):
        """Graceful error when trafilatura is not installed."""
        mock_requests = MagicMock()
        with patch.dict("sys.modules", {"requests": mock_requests, "trafilatura": None}):
            result = web_fetch("https://example.com")
        assert "trafilatura" in result and "not installed" in result

    def test_catalog_entry_exists(self):
        """web_fetch should appear in the tool catalog."""
        from timmy.tools import get_all_available_tools

        catalog = get_all_available_tools()
        assert "web_fetch" in catalog
        assert "orchestrator" in catalog["web_fetch"]["available_in"]


def _make_exceptions():
    """Create a mock exceptions module with real exception classes."""

    class Timeout(Exception):
        pass

    class HTTPError(Exception):
        def __init__(self, *args, response=None, **kwargs):
            super().__init__(*args, **kwargs)
            self.response = response

    class RequestException(Exception):
        pass

    mod = MagicMock()
    mod.Timeout = Timeout
    mod.HTTPError = HTTPError
    mod.RequestException = RequestException
    return mod
@@ -60,17 +60,8 @@ class TestGetToken:
 
         assert token == "file-token-456"
 
-    def test_returns_none_when_no_token(self, monkeypatch):
+    def test_returns_none_when_no_token(self):
         """Return None when no token available."""
-        # Prevent repo-root .timmy_gitea_token fallback from leaking real token
-        _orig_exists = Path.exists
-
-        def _exists_no_timmy(self):
-            if self.name == ".timmy_gitea_token":
-                return False
-            return _orig_exists(self)
-
-        monkeypatch.setattr(Path, "exists", _exists_no_timmy)
         config = {"token_file": "/nonexistent/path"}
         token = hs.get_token(config)
 
@@ -1,232 +0,0 @@
|
||||
# Timmy Automations Backlog Organization
|
||||
|
||||
**Date:** 2026-03-21
|
||||
**Issue:** #720 - Refine and group Timmy Automations backlog
|
||||
**Organized by:** Kimi agent
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
The Timmy Automations backlog has been organized into **10 milestones** grouping related work into coherent iterations. This document serves as the authoritative reference for milestone purposes and issue assignments.
|
||||
|
||||
---
|
||||
|
||||
## Milestones Overview
|
||||
|
||||
| Milestone | Issues | Due Date | Description |
|
||||
|-----------|--------|----------|-------------|
|
||||
| **Automation Hub v1** | 2 open | 2026-04-10 | Core automation infrastructure - Timmy Automations module, orchestration, and workflow management |
|
||||
| **Daily Run v1** | 8 open | 2026-04-15 | First iteration of the Daily Run automation system - 10-minute ritual, agenda generation, and focus presets |
|
||||
| **Infrastructure** | 3 open | 2026-04-15 | Infrastructure and deployment tasks - DNS, SSL, VPS, and DevOps |
|
||||
| **Dashboard v1** | 0 open | 2026-04-20 | Mission Control dashboard enhancements - Daily Run metrics, triage visibility, and agent scorecards |
|
||||
| **Inbox & Focus v1** | 1 open | 2026-04-25 | Unified inbox view for Timmy - issue triage, focus management, and work selection |
|
||||
| **Token Economy v1** | 4 open | 2026-04-30 | Token-based reward system for agents - rules, scorecards, quests, and adaptive rewards |
|
||||
| **Code Hygiene** | 14 open | 2026-04-30 | Code quality improvements - tests, docstrings, refactoring, and hardcoded value extraction |
|
||||
| **Matrix Staging** | 19 open | 2026-04-05 | The Matrix 3D world staging deployment - UI fixes, WebSocket, Workshop integration |
|
||||
| **OpenClaw Sovereignty** | 11 open | 2026-05-15 | Deploy sovereign AI agent on Hermes VPS - Ollama, OpenClaw, and Matrix portal integration |
|
||||
|
||||
---
|
||||
|
||||
## Detailed Breakdown
|
||||
|
||||
### Automation Hub v1 (Due: 2026-04-10)
|
||||
Core automation infrastructure - the foundation for all other automation work.
|
||||
|
||||
| Issue | Title | Status |
|
||||
|-------|-------|--------|
|
||||
| #720 | Refine and group Timmy Automations backlog | **In Progress** |
|
||||
| #719 | Generate weekly narrative summary of work and vibes | Open |
|
||||
|
||||
**Recommendation:** Complete #719 first to establish the narrative logging pattern before other milestones.
|
||||
|
||||
---

### Daily Run v1 (Due: 2026-04-15)

The 10-minute ritual that starts Timmy's day - agenda generation, focus presets, and health checks.

| Issue | Title | Status |
|-------|-------|--------|
| #716 | Add focus-day presets for Daily Run and work selection | Open |
| #704 | Enrich Daily Run agenda with classifications and suggestions | Open |
| #705 | Add helper to log Daily Run sessions to a logbook issue | Open |
| #706 | Capture Daily Run feels notes and surface nudges | Open |
| #707 | Integrate Deep Triage outputs into Daily Run agenda | Open |
| #708 | Map flakiness and risky areas for test tightening | Open |
| #709 | Add a library of test-tightening recipes for Daily Run | Open |
| #710 | Implement quick health snapshot before coding | Open |

**Recommendation:** Start with #710 (health snapshot) as it provides immediate value and informs other Daily Run features. Then #716 (focus presets) to establish the work selection pattern.

---

### Infrastructure (Due: 2026-04-15)

DevOps and deployment tasks required for production stability.

| Issue | Title | Status |
|-------|-------|--------|
| #687 | Pre-commit and pre-push hooks fail on main due to 256 ModuleNotFoundErrors | Open |
| #688 | Point all 4 domains to Hermes VPS in GoDaddy DNS | Open |
| #689 | Run SSL provisioning after DNS is pointed | Open |

**Recommendation:** These are sequential: #687 blocks commits, and #688 blocks #689. Prioritize #687 for code hygiene.

---

### Dashboard v1 (Due: 2026-04-20)

Mission Control dashboard for automation visibility. Currently empty, as related work is in Token Economy (#712).

**Note:** Issue #718 (dashboard card for Daily Run) is already closed. Issue #712 (agent scorecards) spans both the Token Economy and Dashboard milestones.

---

### Inbox & Focus v1 (Due: 2026-04-25)

Unified view for issue triage and work selection.

| Issue | Title | Status |
|-------|-------|--------|
| #715 | Implement Timmy Inbox unified view | Open |

**Note:** This is a significant feature that may need to be broken down further once work begins.

---

### Token Economy v1 (Due: 2026-04-30)

Reward system for agent participation and quality work.

| Issue | Title | Status |
|-------|-------|--------|
| #711 | Centralize agent token rules and hooks for automations | Open |
| #712 | Generate daily/weekly agent scorecards | Open |
| #713 | Implement token quest system for agents | Open |
| #714 | Adapt token rewards based on system stress signals | Open |

**Recommendation:** Start with #711 to establish the token infrastructure, then #712 for visibility. #713 and #714 are enhancements that build on the base system.

---

### Code Hygiene (Due: 2026-04-30)

Ongoing code quality improvements. These are good "filler" tasks between larger features.

| Issue | Title | Status |
|-------|-------|--------|
| #769 | Add unit tests for src/infrastructure/db_pool.py | Open |
| #770 | Add unit tests for src/dashboard/routes/health.py | Open |
| #771 | Refactor run_agentic_loop() — 120 lines, extract helpers | Open |
| #772 | Refactor produce_system_status() — 88 lines, split into sections | Open |
| #773 | Add docstrings to public functions in src/dashboard/routes/tasks.py | Open |
| #774 | Add docstrings to VoiceTTS.set_rate(), set_volume(), set_voice() | Open |
| #775 | Add docstrings to system route functions in src/dashboard/routes/system.py | Open |
| #776 | Extract hardcoded PRAGMA busy_timeout=5000 to config | Open |
| #777 | DRY up tasks_pending/active/completed — extract shared helper | Open |
| #778 | Remove bare `pass` after logged exceptions in src/timmy/tools.py | Open |
| #779 | Add unit tests for src/timmy/conversation.py | Open |
| #780 | Add unit tests for src/timmy/interview.py | Open |
| #781 | Add error handling for missing DB in src/dashboard/routes/tasks.py | Open |
| #782 | Extract hardcoded sats limit in consult_grok() to config | Open |

**Recommendation:** These are independent and can be picked up in any order. Good candidates for when blocked on larger features.

---

### Matrix Staging (Due: 2026-04-05)

The Matrix 3D world - UI fixes and WebSocket integration for the Workshop.

**QA Issues:**

| Issue | Title |
|-------|-------|
| #733 | The Matrix staging deployment — 3 issues to fix |
| #757 | No landing page or enter button — site loads directly into 3D world |
| #758 | WebSocket never connects — VITE_WS_URL is empty in production build |
| #759 | Missing Submit Job and Fund Session UI buttons |
| #760 | Chat messages silently dropped when WebSocket is offline |
| #761 | All routes serve identical content — no client-side router |
| #762 | All 5 agents permanently show IDLE state |
| #763 | Chat clear button overlaps connection status on small viewports |
| #764 | Mobile: status panel overlaps HUD agent count on narrow viewports |

**UI Enhancement Issues:**

| Issue | Title |
|-------|-------|
| #747 | Add graceful offline mode — show demo mode instead of hanging |
| #748 | Add loading spinner/progress bar while 3D scene initializes |
| #749 | Add keyboard shortcuts — Escape to close modals, Enter to submit chat |
| #750 | Chat input should auto-focus when Workshop panel opens |
| #751 | Add connection status indicator with color coding |
| #752 | Add dark/light theme toggle |
| #753 | Fund Session modal should show explanatory text about what sats do |
| #754 | Submit Job modal should validate input before submission |
| #755 | Add About/Info panel explaining what The Matrix/Workshop is |
| #756 | Add FPS counter visibility toggle — debug-only by default |

**Note:** This milestone has the earliest due date (2026-04-05) and the most issues. Consider splitting it into "Matrix Critical" (QA blockers) and "Matrix Polish" (UI enhancements).

---

### OpenClaw Sovereignty (Due: 2026-05-15)

Deploy a sovereign AI agent on Hermes VPS - the long-term goal of Timmy's independence from cloud APIs.

| Issue | Title | Status |
|-------|-------|--------|
| #721 | Research: OpenClaw architecture, deployment modes, and Ollama integration | Open |
| #722 | Research: Best small LLMs for agentic tool-calling on constrained hardware | Open |
| #723 | Research: OpenClaw SOUL.md and AGENTS.md patterns | Open |
| #724 | [1/8] Audit Hermes VPS resources and prepare for OpenClaw deployment | Open |
| #725 | [2/8] Install and configure Ollama on Hermes VPS | Open |
| #726 | [3/8] Install OpenClaw on Hermes VPS and complete onboarding | Open |
| #727 | [4/8] Expose OpenClaw gateway via Tailscale for Matrix portal access | Open |
| #728 | [5/8] Create Timmy's SOUL.md and AGENTS.md — sovereign agent persona | Open |
| #729 | [6/8] Integrate OpenClaw chat as a portal/scroll in The Matrix frontend | Open |
| #730 | [7/8] Create openclaw-tools Gitea repo — Timmy's sovereign toolbox | Open |
| #731 | [8/8] Write sovereignty migration plan — offload tasks from Anthropic to OpenClaw | Open |

**Note:** This is a research-heavy, sequential milestone. Issues #721-#723 should be completed before implementation begins. Consider creating a research summary document as output from the research issues.

---

## Issues Intentionally Left Unassigned

The following issues remain without milestone assignment by design:

### Philosophy Issues

Ongoing discussion threads that don't fit a milestone structure:

- #502, #511, #521, #528, #536, #543, #548, #556, #566, #571, #583, #588, #596, #602, #608, #613, #623, #630, #642

### Feature Ideas / Future Work

Ideas that need more definition before milestone assignment:

- #654, #653, #652, #651, #650 (ASCII Video showcase)
- #664 (Chain Memory song)
- #578, #577, #579 (Autonomous action, identity evolution, contextual mastery)

### Completed Issues

Already-closed issues remain in their original state without milestone assignment.

---

## Recommended Execution Order

Based on priority and dependencies:

1. **Automation Hub v1** (April 10) - Foundation for all automation work
2. **Daily Run v1** (April 15) - Core developer experience improvement
3. **Infrastructure** (April 15) - Unblocks production deployments
4. **Matrix Staging** (April 5) - *Parallel track* - UI team work
5. **Inbox & Focus v1** (April 25) - Builds on Daily Run patterns
6. **Dashboard v1** (April 20) - Visualizes Token Economy data
7. **Token Economy v1** (April 30) - Gamification layer
8. **Code Hygiene** (April 30) - *Ongoing* - Fill gaps between features
9. **OpenClaw Sovereignty** (May 15) - Long-term research and deployment

---

## Notes for Future Triage

- Issues should be assigned to milestones at creation time
- Each milestone should have a "Definition of Done" documented
- Consider creating epic issues for large milestones (OpenClaw, Matrix)
- Weekly triage should review unassigned issues and new arrivals
- Milestone due dates should be adjusted based on velocity

---

*This document is maintained as part of the Timmy Automations subsystem. Update it when milestone structure changes.*

@@ -53,26 +53,21 @@ def load_config() -> dict:


def get_token(config: dict) -> str | None:
    """Get Gitea token from config or file.

    Priority: config["token"] > config["token_file"] > .timmy_gitea_token
    """
    if "token" in config:
        return config["token"]

    # Explicit token_file from config takes priority
    token_file_str = config.get("token_file", "")
    if token_file_str:
        token_file = Path(token_file_str).expanduser()
        if token_file.exists():
            return token_file.read_text().strip()

    # Fallback: repo-root .timmy_gitea_token
    repo_root = Path(__file__).resolve().parent.parent.parent
    timmy_token_path = repo_root / ".timmy_gitea_token"
    if timmy_token_path.exists():
        return timmy_token_path.read_text().strip()

    return None