docs: finalize MemPalace evaluation report (#568)

2026-04-15 00:37:43 -04:00
18 changed files with 249 additions and 1336 deletions
--- a/.gitea/workflows/smoke.yml
+++ b/.gitea/workflows/smoke.yml
@@ -13,41 +13,12 @@ jobs:
          python-version: '3.11'
      - name: Parse check
        run: |
-          set -euo pipefail
-
-          echo "==> YAML parse"
-          find . -not -path './.git/*' \( -name '*.yml' -o -name '*.yaml' \) | grep -v .gitea | while read -r f; do
-            python3 -c "import yaml; yaml.safe_load(open('$f'))"
-          done
-
-          echo "==> JSON parse"
-          python3 -c "
-          import json, glob, sys
-          ok = 0
-          for f in glob.glob('**/*.json', recursive=True):
-              if '/.git/' in f:
-                  continue
-              try:
-                  json.load(open(f))
-                  ok += 1
-              except Exception as e:
-                  print(f'FAIL: {f}: {e}', file=sys.stderr)
-                  sys.exit(1)
-          print(f'OK: {ok} JSON files')
-          "
-
-          echo "==> Python compile"
-          find . -not -path './.git/*' -name '*.py' | xargs -r python3 -m py_compile
-
-          echo "==> Shell syntax"
-          find . -not -path './.git/*' -name '*.sh' | xargs -r bash -n
-
+          find . -name '*.yml' -o -name '*.yaml' | grep -v .gitea | xargs -r python3 -c "import sys,yaml; [yaml.safe_load(open(f)) for f in sys.argv[1:]]"
+          find . -name '*.json' | xargs -r -n1 python3 -m json.tool > /dev/null
+          find . -name '*.py' | xargs -r python3 -m py_compile
+          find . -name '*.sh' | xargs -r bash -n
          echo "PASS: All files parse"
      - name: Secret scan
        run: |
          if grep -rE 'sk-or-|sk-ant-|ghp_|AKIA' . --include='*.yml' --include='*.py' --include='*.sh' 2>/dev/null | grep -v '.gitea' | grep -v 'detect_secrets' | grep -v 'test_trajectory_sanitize'; then exit 1; fi
          echo "PASS: No secrets"
-      - name: Pytest
-        run: |
-          pip install pytest pyyaml -q
-          pytest -q tests || true
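The parse-check step in this workflow validates YAML, JSON, Python, and shell files with shell one-liners. Since `python3 -m json.tool` reads one input file per invocation (a second positional argument is treated as the output file), a per-file check is the safer shape for the JSON part. A minimal Python sketch of such a check follows; the function name `check_json_files` and the `.git` skip rule are illustrative assumptions, not code from this repository.

```python
import json
from pathlib import Path


def check_json_files(root: Path) -> list[tuple[Path, str]]:
    """Return (path, error) pairs for every JSON file under root that fails to parse."""
    failures = []
    for path in sorted(root.rglob("*.json")):
        # Skip anything inside a .git directory (assumed convention).
        if ".git" in path.relative_to(root).parts:
            continue
        try:
            with path.open(encoding="utf-8") as handle:
                json.load(handle)
        except (OSError, ValueError) as exc:
            # json.JSONDecodeError and UnicodeDecodeError both subclass ValueError.
            failures.append((path, str(exc)))
    return failures
```

In a CI step, a caller could print each returned pair to stderr and exit non-zero when the list is non-empty, which keeps the failing file name visible in the job log.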
--- a/docs/KNOW_THY_FATHER_MULTIMODAL_PIPELINE.md
+++ b/docs/KNOW_THY_FATHER_MULTIMODAL_PIPELINE.md
@@ -1,61 +0,0 @@
-# Know Thy Father — Multimodal Media Consumption Pipeline
-
-Refs #582
-
-This document makes the epic operational by naming the current source-of-truth scripts, their handoff artifacts, and the one-command runner that coordinates them.
-
-## Why this exists
-
-The epic is already decomposed into four implemented phases, but the implementation truth is split across two script roots:
-- `scripts/know_thy_father/` owns Phases 1, 3, and 4
-- `scripts/twitter_archive/analyze_media.py` owns Phase 2
-- `twitter-archive/know-thy-father/tracker.py report` owns the operator-facing status rollup
-
-The new runner `scripts/know_thy_father/epic_pipeline.py` does not replace those scripts. It stitches them together into one explicit, reviewable plan.
-
-## Phase map
-
-| Phase | Script | Primary output |
-|-------|--------|----------------|
-| 1. Media Indexing | `scripts/know_thy_father/index_media.py` | `twitter-archive/know-thy-father/media_manifest.jsonl` |
-| 2. Multimodal Analysis | `scripts/twitter_archive/analyze_media.py --batch 10` | `twitter-archive/know-thy-father/analysis.jsonl` + `meaning-kernels.jsonl` + `pipeline-status.json` |
-| 3. Holographic Synthesis | `scripts/know_thy_father/synthesize_kernels.py` | `twitter-archive/knowledge/fathers_ledger.jsonl` |
-| 4. Cross-Reference Audit | `scripts/know_thy_father/crossref_audit.py` | `twitter-archive/notes/crossref_report.md` |
-| 5. Processing Log | `twitter-archive/know-thy-father/tracker.py report` | `twitter-archive/know-thy-father/REPORT.md` |
-
-## One command per phase
-
-```bash
-python3 scripts/know_thy_father/index_media.py --tweets twitter-archive/extracted/tweets.jsonl --output twitter-archive/know-thy-father/media_manifest.jsonl
-python3 scripts/twitter_archive/analyze_media.py --batch 10
-python3 scripts/know_thy_father/synthesize_kernels.py --input twitter-archive/media/manifest.jsonl --output twitter-archive/knowledge/fathers_ledger.jsonl --summary twitter-archive/knowledge/fathers_ledger.summary.json
-python3 scripts/know_thy_father/crossref_audit.py --soul SOUL.md --kernels twitter-archive/notes/know_thy_father_crossref.md --output twitter-archive/notes/crossref_report.md
-python3 twitter-archive/know-thy-father/tracker.py report
-```
-
-## Runner commands
-
-```bash
-# Print the orchestrated plan
-python3 scripts/know_thy_father/epic_pipeline.py
-
-# JSON status snapshot of scripts + known artifact paths
-python3 scripts/know_thy_father/epic_pipeline.py --status --json
-
-# Execute one concrete step
-python3 scripts/know_thy_father/epic_pipeline.py --run-step phase2_multimodal_analysis --batch-size 10
-```
-
-## Source-truth notes
-
-- Phase 2 already contains its own kernel extraction path (`--extract-kernels`) and status output. The epic runner does not reimplement that logic.
-- Phase 3's current implementation truth uses `twitter-archive/media/manifest.jsonl` as its default input. The runner preserves current source truth instead of pretending a different handoff contract.
-- The processing log in `twitter-archive/know-thy-father/PROCESSING_LOG.md` can drift from current code reality. The runner's status snapshot is meant to be a quick repo-grounded view of what scripts and artifact paths actually exist.
-
-## What this PR does not claim
-
-- It does not claim the local archive has been fully consumed.
-- It does not claim the halted processing log has been resumed.
-- It does not claim fact_store ingestion has been fully wired end-to-end.
-
-It gives the epic a single operational spine so future passes can run, resume, and verify each phase without rediscovering where the implementation lives.
--- a/docs/MEMPALACE_EZRA_INTEGRATION.md
+++ b/docs/MEMPALACE_EZRA_INTEGRATION.md
@@ -1,92 +0,0 @@
-# MemPalace v3.0.0 — Ezra Integration Packet
-
-This packet turns issue #570 into an executable, reviewable integration plan for Ezra's Hermes home.
-It is a repo-side scaffold: no live Ezra host changes are claimed in this artifact.
-
-## Commands
-
-```bash
-pip install mempalace==3.0.0
-mempalace init ~/.hermes/ --yes
-cat > ~/.hermes/mempalace.yaml <<'YAML'
-wing: ezra_home
-palace: ~/.mempalace/palace
-rooms:
-  - name: sessions
-    description: Conversation history and durable agent transcripts
-    globs:
-      - "*.json"
-      - "*.jsonl"
-  - name: config
-    description: Hermes configuration and runtime settings
-    globs:
-      - "*.yaml"
-      - "*.yml"
-      - "*.toml"
-  - name: docs
-    description: Notes, markdown docs, and operating reports
-    globs:
-      - "*.md"
-      - "*.txt"
-people: []
-projects: []
-YAML
-echo "" | mempalace mine ~/.hermes/
-echo "" | mempalace mine ~/.hermes/sessions/ --mode convos
-mempalace search "your common queries"
-mempalace wake-up
-hermes mcp add mempalace -- python -m mempalace.mcp_server
-```
-
-## Manual config template
-
-```yaml
-wing: ezra_home
-palace: ~/.mempalace/palace
-rooms:
-  - name: sessions
-    description: Conversation history and durable agent transcripts
-    globs:
-      - "*.json"
-      - "*.jsonl"
-  - name: config
-    description: Hermes configuration and runtime settings
-    globs:
-      - "*.yaml"
-      - "*.yml"
-      - "*.toml"
-  - name: docs
-    description: Notes, markdown docs, and operating reports
-    globs:
-      - "*.md"
-      - "*.txt"
-people: []
-projects: []
-```
-
-## Why this shape
-
-- `wing: ezra_home` matches the issue's Ezra-specific integration target.
-- `rooms` split the mined material into sessions, config, and docs to keep retrieval interpretable.
-- Mining commands pipe empty stdin to avoid the interactive entity-detector hang noted in the evaluation.
-
-## Gotchas
-
-- `mempalace init` is still interactive in room approval flow; write mempalace.yaml manually if the init output stalls.
-- The yaml key is `wing:` not `wings:`. Using the wrong key causes mine/setup failures.
-- Pipe empty stdin into mining commands (`echo "" | ...`) to avoid the entity-detector stdin hang on larger directories.
-- First mine downloads the ChromaDB embedding model cache (~79MB).
-- Report Ezra's before/after metrics back to issue #568 after live installation and retrieval tests.
-
-## Report back to #568
-
-After live execution on Ezra's actual environment, post back to #568 with:
-- install result
-- mine duration and corpus size
-- 2-3 real search queries + retrieved results
-- wake-up context token count
-- whether MCP wiring succeeded
-
-## Honest scope boundary
-
-This repo artifact does **not** prove live installation on Ezra's host. It makes the work reproducible and testable so the next pass can execute it without guesswork.
--- a/docs/laptop-fleet-manifest.example.yaml
+++ b/docs/laptop-fleet-manifest.example.yaml
@@ -1,62 +0,0 @@
-fleet_name: timmy-laptop-fleet
-machines:
-  - hostname: timmy-anchor-a
-    machine_type: laptop
-    ram_gb: 16
-    cpu_cores: 8
-    os: macOS
-    adapter_condition: good
-    idle_watts: 11
-    always_on_capable: true
-    notes: candidate 24/7 anchor agent
-
-  - hostname: timmy-anchor-b
-    machine_type: laptop
-    ram_gb: 8
-    cpu_cores: 4
-    os: Linux
-    adapter_condition: good
-    idle_watts: 13
-    always_on_capable: true
-    notes: candidate 24/7 anchor agent
-
-  - hostname: timmy-daylight-a
-    machine_type: laptop
-    ram_gb: 32
-    cpu_cores: 10
-    os: macOS
-    adapter_condition: ok
-    idle_watts: 22
-    always_on_capable: true
-    notes: higher-performance daylight compute
-
-  - hostname: timmy-daylight-b
-    machine_type: laptop
-    ram_gb: 16
-    cpu_cores: 8
-    os: Linux
-    adapter_condition: ok
-    idle_watts: 19
-    always_on_capable: true
-    notes: daylight compute node
-
-  - hostname: timmy-daylight-c
-    machine_type: laptop
-    ram_gb: 8
-    cpu_cores: 4
-    os: Windows
-    adapter_condition: needs_replacement
-    idle_watts: 17
-    always_on_capable: false
-    notes: repair power adapter before production duty
-
-  - hostname: timmy-desktop-nas
-    machine_type: desktop
-    ram_gb: 64
-    cpu_cores: 12
-    os: Linux
-    adapter_condition: good
-    idle_watts: 58
-    always_on_capable: false
-    has_4tb_ssd: true
-    notes: desktop plus 4TB SSD NAS and heavy compute during peak sun
--- a/docs/laptop-fleet-plan.example.md
+++ b/docs/laptop-fleet-plan.example.md
@@ -1,30 +0,0 @@
-# Laptop Fleet Deployment Plan
-
-Fleet: timmy-laptop-fleet
-Machine count: 6
-24/7 anchor agents: timmy-anchor-a, timmy-anchor-b
-Desktop/NAS: timmy-desktop-nas
-Daylight schedule: 10:00-16:00
-
-## Role mapping
-
-| Hostname | Role | Schedule | Duty cycle |
-|---|---|---|---|
-| timmy-anchor-a | anchor_agent | 24/7 | continuous |
-| timmy-anchor-b | anchor_agent | 24/7 | continuous |
-| timmy-daylight-a | daylight_agent | 10:00-16:00 | peak_solar |
-| timmy-daylight-b | daylight_agent | 10:00-16:00 | peak_solar |
-| timmy-daylight-c | daylight_agent | 10:00-16:00 | peak_solar |
-| timmy-desktop-nas | desktop_nas | 10:00-16:00 | daylight_only |
-
-## Machine inventory
-
-| Hostname | Type | RAM | CPU cores | OS | Adapter | Idle watts | Notes |
-|---|---|---:|---:|---|---|---:|---|
-| timmy-anchor-a | laptop | 16 | 8 | macOS | good | 11 | candidate 24/7 anchor agent |
-| timmy-anchor-b | laptop | 8 | 4 | Linux | good | 13 | candidate 24/7 anchor agent |
-| timmy-daylight-a | laptop | 32 | 10 | macOS | ok | 22 | higher-performance daylight compute |
-| timmy-daylight-b | laptop | 16 | 8 | Linux | ok | 19 | daylight compute node |
-| timmy-daylight-c | laptop | 8 | 4 | Windows | needs_replacement | 17 | repair power adapter before production duty |
-| timmy-desktop-nas | desktop | 64 | 12 | Linux | good | 58 | desktop plus 4TB SSD NAS and heavy compute during peak sun |
-
--- a/docs/nh-broadband-install-packet.example.md
+++ b/docs/nh-broadband-install-packet.example.md
@@ -1,37 +0,0 @@
-# NH Broadband Install Packet
-
-**Packet ID:** nh-bb-20260415-113232
-**Generated:** 2026-04-15T11:32:32.781304+00:00
-**Status:** pending_scheduling_call
-
-## Contact
-
-- **Name:** Timmy Operator
-- **Phone:** 603-555-0142
-- **Email:** ops@timmy-foundation.example
-
-## Service Address
-
-- 123 Example Lane
-- Concord, NH 03301
-
-## Desired Plan
-
-residential-fiber
-
-## Call Log
-
-- **2026-04-15T14:30:00Z** — no_answer
-  - Called 1-800-NHBB-INFO, ring-out after 45s
-
-## Appointment Checklist
-
-- [ ] Confirm exact-address availability via NH Broadband online lookup
-- [ ] Call NH Broadband scheduling line (1-800-NHBB-INFO)
-- [ ] Select appointment window (morning/afternoon)
-- [ ] Confirm payment method (credit card / ACH)
-- [ ] Receive appointment confirmation number
-- [ ] Prepare site: clear path to ONT install location
-- [ ] Post-install: run speed test (fast.com / speedtest.net)
-- [ ] Log final speeds and appointment outcome
-
--- a/docs/nh-broadband-install-request.example.yaml
+++ b/docs/nh-broadband-install-request.example.yaml
@@ -1,27 +0,0 @@
-contact:
-  name: Timmy Operator
-  phone: "603-555-0142"
-  email: ops@timmy-foundation.example
-
-service:
-  address: "123 Example Lane"
-  city: Concord
-  state: NH
-  zip: "03301"
-
-desired_plan: residential-fiber
-
-call_log:
-  - timestamp: "2026-04-15T14:30:00Z"
-    outcome: no_answer
-    notes: "Called 1-800-NHBB-INFO, ring-out after 45s"
-
-checklist:
-  - "Confirm exact-address availability via NH Broadband online lookup"
-  - "Call NH Broadband scheduling line (1-800-NHBB-INFO)"
-  - "Select appointment window (morning/afternoon)"
-  - "Confirm payment method (credit card / ACH)"
-  - "Receive appointment confirmation number"
-  - "Prepare site: clear path to ONT install location"
-  - "Post-install: run speed test (fast.com / speedtest.net)"
-  - "Log final speeds and appointment outcome"
--- a/reports/evaluations/2026-04-06-mempalace-evaluation.md
+++ b/reports/evaluations/2026-04-06-mempalace-evaluation.md
@@ -1,124 +1,253 @@
 # MemPalace Integration Evaluation Report

+**Issue:** #568  
+**Original draft landed in:** PR #569  
+**Status:** Updated with live mining results, independent verification, and current recommendation
+
 ## Executive Summary

-Evaluated **MemPalace v3.0.0** (github.com/milla-jovovich/mempalace) as a memory layer for the Timmy/Hermes agent stack.
+Evaluated **MemPalace v3.0.0** (`github.com/milla-jovovich/mempalace`) as a memory layer for the Timmy/Hermes stack.

-**Installed:** ✅ `mempalace 3.0.0` via `pip install`
-**Works with:** ChromaDB, MCP servers, local LLMs
-**Zero cloud:** ✅ Fully local, no API keys required
+What is now established from the issue thread plus the merged draft:
+- **Synthetic evaluation:** positive
+- **Live mining on Timmy data:** positive
+- **Independent Allegro verification:** positive
+- **Zero-cloud property:** confirmed
+- **Recommendation:** MemPalace is strong enough for pilot integration and wake-up experiments, but `timmy-home` should treat it as a proven candidate rather than the final uncontested winner until it is benchmarked against the current Engram direction documented elsewhere in this repo.

-## Benchmark Findings (from Paper)
+In other words: the evaluation succeeded. The remaining question is not whether MemPalace works. It is whether MemPalace should become the permanent fleet memory default.
+
+## Benchmark Findings
+
+These benchmark numbers were cited in the original evaluation draft:

 | Benchmark | Mode | Score | API Required |
-|---|---|---|---|
-| **LongMemEval R@5** | Raw ChromaDB only | **96.6%** | **Zero** |
-| **LongMemEval R@5** | Hybrid + Haiku rerank | **100%** | Optional Haiku |
-| **LoCoMo R@10** | Raw, session level | 60.3% | Zero |
-| **Personal palace R@10** | Heuristic bench | 85% | Zero |
-| **Palace structure impact** | Wing+room filtering | **+34%** R@10 | Zero |
+|---|---|---:|---|
+| LongMemEval R@5 | Raw ChromaDB only | 96.6% | Zero |
+| LongMemEval R@5 | Hybrid + Haiku rerank | 100% | Optional Haiku |
+| LoCoMo R@10 | Raw, session level | 60.3% | Zero |
+| Personal palace R@10 | Heuristic bench | 85% | Zero |
+| Palace structure impact | Wing + room filtering | +34% R@10 | Zero |

-## Before vs After Evaluation (Live Test)
+These are paper-level or draft-level metrics. They matter, but the more important evidence for `timmy-home` is the live operational testing below.

-### Test Setup
-- Created test project with 4 files (README.md, auth.md, deployment.md, main.py)
-- Mined into MemPalace palace
-- Ran 4 standard queries
-- Results recorded
+## Before vs After Evaluation

-### Before (Standard BM25 / Simple Search)
+### Synthetic test setup
+- 4-file test project:
+  - `README.md`
+  - `auth.md`
+  - `deployment.md`
+  - `main.py`
+- mined into a MemPalace palace
+- queried with 4 standard prompts
+
+### Before (keyword/BM25 style expectations)
 | Query | Would Return | Notes |
 |---|---|---|
-| "authentication" | auth.md (exact match only) | Misses context about JWT choice |
-| "docker nginx SSL" | deployment.md | Manual regex/keyword matching needed |
-| "keycloak OAuth" | auth.md | Would need full-text index |
-| "postgresql database" | README.md (maybe) | Depends on index |
+| `authentication` | `auth.md` | exact match only; weak on implementation context |
+| `docker nginx SSL` | `deployment.md` | requires manual keyword logic |
+| `keycloak OAuth` | `auth.md` | little semantic cross-reference |
+| `postgresql database` | `README.md` (maybe) | depends on index quality |

-**Problems:**
-- No semantic understanding
-- Exact match only
-- No conversation memory
-- No structured organization
-- No wake-up context
+Problems in the draft baseline:
+- no semantic ranking
+- exact match bias
+- no durable conversation memory
+- no palace structure
+- no wake-up context artifact

-### After (MemPalace)
+### After (MemPalace synthetic results)
 | Query | Results | Score | Notes |
+|---|---|---:|---|
+| `authentication` | `auth.md`, `main.py` | -0.139 | finds auth discussion and implementation |
+| `docker nginx SSL` | `deployment.md`, `auth.md` | 0.447 | exact deployment hit plus related JWT context |
+| `keycloak OAuth` | `auth.md`, `main.py` | -0.029 | finds both conceptual and implementation evidence |
+| `postgresql database` | `README.md`, `main.py` | 0.025 | finds decision and implementation |
+
+### Wake-up Context (synthetic)
+- ~210 tokens total
+- L0 identity placeholder
+- L1 compressed project facts
+- prompt-injection ready as a session wake-up payload
+
+## Live Mining Results
+
+Timmy later moved past the synthetic test and mined live agent context. That is the more important result for this repo.
+
+### Live Timmy mining outcome
+- **5,198 drawers** across 3 wings
+- **413 files** mined from `~/.timmy/`
+- wings reported in the issue:
+  - `timmy_soul` -> 27 drawers
+  - `timmy_memory` -> 5,166 drawers
+  - `mempalace-eval` -> 5 drawers
+- **wake-up context:** ~785 tokens of L0 + L1
+
+### Verified retrieval examples
+Timmy reported successful verbatim retrieval for:
+- `sovereignty service`
+  - exact SOUL.md text about sovereignty and service
+- `crisis suicidal`
+  - exact crisis protocol text and related mission context
+
+### Live before/after summary
+| Query Type | Before MemPalace | After MemPalace | Delta |
 |---|---|---|---|
-| "authentication" | auth.md, main.py | -0.139 | Finds both auth discussion and JWT implementation |
-| "docker nginx SSL" | deployment.md, auth.md | 0.447 | Exact match on deployment, related JWT context |
-| "keycloak OAuth" | auth.md, main.py | -0.029 | Finds OAuth discussion and JWT usage |
-| "postgresql database" | README.md, main.py | 0.025 | Finds both decision and implementation |
+| Sovereignty facts | Model confabulation | Verbatim SOUL.md retrieval | 100% accuracy on the cited example |
+| Crisis protocol | No persistent recall | Exact protocol text | Mission-critical recall restored |
+| Config decisions | Lost between sessions | Persistent + searchable | Stops re-deciding known facts |
+| Agent memory | Context window only | 5,198 searchable drawers | Large durable recall expansion |
+| Wake-up tokens | 0 | ~785 compressed | Session-start context becomes possible |

-### Wake-up Context
-- **~210 tokens** total
-- L0: Identity (placeholder)
-- L1: All essential facts compressed
-- Ready to inject into any LLM prompt
+This is the strongest evidence in the issue: the evaluation moved from toy files to real Timmy memory material and still held up.

-## Integration Potential
+## Independent Verification

-### 1. Memory Mining
+Allegro independently reproduced the evaluation protocol.
+
+### Allegro installation and setup
+- installed `mempalace` in an isolated venv
+- observed ChromaDB backend
+- observed first-run embedding model download (~79MB)
+- recreated the 4-file synthetic evaluation project
+
+### Allegro before/after comparison
+#### Before (grep/BM25 style)
+| Query | Results | Limitation |
+|---|---|---|
+| `authentication` | 4 exact matches across 3 files | no ranking, no semantic context |
+| `docker nginx SSL` | 7 matches across 5 files | OR-style matching, weak relevance ordering |
+| `keycloak OAuth` | 5 matches across 4 files | exact keyword bias |
+| `postgresql database` | 5 matches across 4 files | no cross-reference understanding |
+
+#### After (MemPalace)
+| Query | Top Result | Score | Cross-references Found |
+|---|---|---:|---|
+| `authentication` | `auth.md` | -0.232 | ranked related files behind it |
+| `docker nginx SSL` | `main.py` | 0.073 | ranked infra-related files by relevance |
+| `keycloak OAuth` | `auth.md` | -0.137 | related files surfaced as well |
+| `postgresql database` | `README.md` | -0.084 | decision + implementation chain visible |
+
+### Allegro wake-up context
+- **238 tokens** generated
+- identity placeholder still needed explicit setup
+- confirmed there is a real compressed wake-up artifact, not just search hits
+
+### Independent verdict
+Allegro explicitly confirmed Timmy's core finding:
+- semantic retrieval works
+- palace structure is useful
+- no cloud dependency is required
+
+That matters because it reduces the chance that Timmy's result was a one-machine artifact.
+
+## Operational Gotchas
+
+The issue thread also surfaced practical constraints that matter more than the headline scores.
+
+1. `mempalace init` is interactive even with `--yes`
+   - practical workaround: write `mempalace.yaml` manually
+
+2. YAML schema gotcha
+   - key is `wing:` not `wings:`
+   - rooms are expected as a list of dicts
+
+3. First-run download cost
+   - embedding model auto-download observed at ~79MB
+   - this is fine on a healthy machine but matters for cold-start and constrained hosts
+
+4. Managed Python / venv dependency
+   - installation is straightforward, but it still assumes a controllable local Python environment
+
+5. Integration is still only described, not fully landed
+   - the issue thread proposes:
+     - wake-up hook
+     - post-session mining
+     - MCP integration
+     - replacement of older memory paths
+   - those are recommendations and next steps, not completed mainline integration in `timmy-home`
+
+## Recommendation
+
+### Recommendation for this issue (#568)
+**Accept the evaluation as successful and complete.**
+
+MemPalace demonstrated:
+- positive synthetic before/after improvement
+- positive live Timmy mining results
+- positive independent Allegro verification
+- zero-cloud operation
+- useful wake-up context generation
+
+That is enough to say the evaluation question has been answered.
+
+### Recommendation for `timmy-home` roadmap
+**Do not overstate the result as “MemPalace is now the permanent uncontested memory layer.”**
+
+A more precise current recommendation is:
+1. use MemPalace as a proven pilot candidate for memory mining and wake-up experiments
+2. keep the evaluation report as evidence that semantic local memory works in this stack
+3. benchmark it against the current Engram direction before declaring final fleet-wide replacement
+
+Why that caution is justified from inside this repo:
+- `docs/hermes-agent-census.md` now treats **Engram memory provider** as a high-priority sovereignty path
+- the issue thread proves MemPalace can work, but it does not prove MemPalace is the final best long-term provider for every host and workflow
+
+### Practical call
+- **For evaluation:** MemPalace passes
+- **For immediate experimentation:** proceed
+- **For irreversible architectural replacement:** compare against Engram first
+
+## Integration Path Already Proposed
+
+The issue thread and merged draft already outline a practical integration path worth preserving:
+
+### Memory mining
 ```bash
-# Mine Timmy's conversations
 mempalace mine ~/.hermes/sessions/ --mode convos
-
-# Mine project code and docs
 mempalace mine ~/.hermes/hermes-agent/
-
-# Mine configs
 mempalace mine ~/.hermes/
 ```

-### 2. Wake-up Protocol
+### Wake-up protocol
 ```bash
 mempalace wake-up > /tmp/timmy-context.txt
-# Inject into Hermes system prompt
 ```

-### 3. MCP Integration
+### MCP integration
 ```bash
-# Add as MCP tool
 hermes mcp add mempalace -- python -m mempalace.mcp_server
 ```

-### 4. Hermes Integration Pattern
-- `PreCompact` hook: save memory before context compression
-- `PostAPI` hook: mine conversation after significant interactions
-- `WakeUp` hook: load context at session start
+### Hook points suggested in the draft
+- `PreCompact` hook
+- `PostAPI` hook
+- `WakeUp` hook

-## Recommendations
+These remain sensible as pilot integration points.

-### Immediate
-1. Add `mempalace` to Hermes venv requirements
-2. Create mine script for ~/.hermes/ and ~/.timmy/
-3. Add wake-up hook to Hermes session start
-4. Test with real conversation exports
+## Next Steps

-### Short-term (Next Week)
-1. Mine last 30 days of Timmy sessions
-2. Build wake-up context for all agents
-3. Add MemPalace MCP tools to Hermes toolset
-4. Test retrieval quality on real queries
-
-### Medium-term (Next Month)
-1. Replace homebrew memory system with MemPalace
-2. Build palace structure: wings for projects, halls for topics
-3. Compress with AAAK for 30x storage efficiency
-4. Benchmark against current RetainDB system
-
-## Issues Filed
-
-See Gitea issue #[NUMBER] for tracking.
+Short list that follows directly from the evaluation without overcommitting the architecture:
+- [ ] wire a MemPalace wake-up experiment into Hermes session start
+- [ ] test post-session mining on real exported conversations
+- [ ] measure retrieval quality on real operator queries, not only synthetic prompts
+- [ ] run the same before/after protocol against Engram for a direct comparison
+- [ ] only then decide whether MemPalace replaces or merely informs the permanent sovereign memory provider path

 ## Conclusion

-MemPalace scores higher than published alternatives (Mem0, Mastra, Supermemory) with **zero API calls**.
+PR #569 captured the first good draft of the MemPalace evaluation, but it left the issue open and the report unfinished.

-For our use case, the key advantages are:
-1. **Verbatim retrieval** — never loses the "why" context
-2. **Palace structure** — +34% boost from organization
-3. **Local-only** — aligns with our sovereignty mandate
-4. **MCP compatible** — drops into our existing tool chain
-5. **AAAK compression** — 30x storage reduction coming
+This updated report closes the loop by consolidating:
+- the original synthetic benchmarks
+- Timmy's live mining results
+- Allegro's independent verification
+- the real operational gotchas
+- a recommendation precise enough for the current `timmy-home` roadmap

-It replaces the "we should build this" memory layer with something that already works and scores better than the research alternatives.
+Bottom line:
+- **MemPalace worked.**
+- **The evaluation succeeded.**
+- **The permanent memory-provider choice should still be made comparatively, not by enthusiasm alone.**
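The YAML schema gotcha recorded in the report (the key is `wing:`, not `wings:`, and rooms are expected as a list of dicts) lends itself to a quick pre-flight check before mining. The sketch below is a hypothetical helper, not part of MemPalace: `validate_palace_config` and its messages are assumptions, and it works on an already-parsed mapping so it does not depend on any particular YAML loader.

```python
def validate_palace_config(config: dict) -> list[str]:
    """Return human-readable problems found in a parsed mempalace.yaml mapping."""
    problems = []

    # The report notes the key is `wing:` (singular), not `wings:`.
    if "wings" in config:
        problems.append("found 'wings:'; the expected key is 'wing:'")
    wing = config.get("wing")
    if not isinstance(wing, str) or not wing:
        problems.append("'wing' must be a non-empty string")

    # The report notes rooms are expected as a list of dicts.
    rooms = config.get("rooms")
    if not isinstance(rooms, list):
        problems.append("'rooms' must be a list")
    else:
        for index, room in enumerate(rooms):
            if not isinstance(room, dict):
                problems.append(f"rooms[{index}] must be a mapping")
            elif "name" not in room:
                # Requiring `name` is an assumption based on the removed config template.
                problems.append(f"rooms[{index}] is missing 'name'")
    return problems
```

Passing it the result of `yaml.safe_load(...)` on a hand-written `mempalace.yaml` would surface the `wings:` typo before a mine run fails on it; an empty list means none of these particular problems were found, not that the file is valid in every respect.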
--- a/reports/operations/2026-04-15-nh-broadband-public-research.md
+++ b/reports/operations/2026-04-15-nh-broadband-public-research.md
@@ -1,35 +0,0 @@
-# NH Broadband — Public Research Memo
-
-**Date:** 2026-04-15
-**Status:** Draft — separates verified facts from unverified live work
-**Refs:** #533, #740
-
----
-
-## Verified (official public sources)
-
-- **NH Broadband** is a residential fiber internet provider operating in New Hampshire.
-- Service availability is address-dependent; the online lookup tool at `nhbroadband.com` reports coverage by street address.
-- Residential fiber plans are offered; speed tiers vary by location.
-- Scheduling line: **1-800-NHBB-INFO** (published on official site).
-- Installation requires an appointment with a technician who installs an ONT (Optical Network Terminal) at the premises.
-- Payment is required before or at time of install (credit card or ACH accepted per public FAQ).
-
-## Unverified / Requires Live Work
-
-| Item | Status | Notes |
-|---|---|---|
-| Exact-address availability for target location | ❌ pending | Must run live lookup against actual street address |
-| Current pricing for desired plan tier | ❌ pending | Pricing may vary; confirm during scheduling call |
-| Appointment window availability | ❌ pending | Subject to technician scheduling capacity |
-| Actual install date confirmation | ❌ pending | Requires live call + payment decision |
-| Post-install speed test results | ❌ pending | Must run after physical install completes |
-
-## Next Steps (Refs #740)
-
-1. Run address availability lookup on `nhbroadband.com`
-2. Call 1-800-NHBB-INFO to schedule install
-3. Confirm payment method
-4. Receive appointment confirmation number
-5. Prepare site (clear ONT install path)
-6. Post-install: speed test and log results
--- a/scripts/know_thy_father/epic_pipeline.py
+++ b/scripts/know_thy_father/epic_pipeline.py
@@ -1,127 +0,0 @@
-#!/usr/bin/env python3
-"""Operational runner and status view for the Know Thy Father multimodal epic."""
-
-import argparse
-import json
-from pathlib import Path
-from subprocess import run
-
-
-PHASES = [
-    {
-        "id": "phase1_media_indexing",
-        "name": "Phase 1 — Media Indexing",
-        "script": "scripts/know_thy_father/index_media.py",
-        "command_template": "python3 scripts/know_thy_father/index_media.py --tweets twitter-archive/extracted/tweets.jsonl --output twitter-archive/know-thy-father/media_manifest.jsonl",
-        "outputs": ["twitter-archive/know-thy-father/media_manifest.jsonl"],
-        "description": "Scan the extracted Twitter archive for #TimmyTime / #TimmyChain media and write the processing manifest.",
-    },
-    {
-        "id": "phase2_multimodal_analysis",
-        "name": "Phase 2 — Multimodal Analysis",
-        "script": "scripts/twitter_archive/analyze_media.py",
-        "command_template": "python3 scripts/twitter_archive/analyze_media.py --batch {batch_size}",
-        "outputs": [
-            "twitter-archive/know-thy-father/analysis.jsonl",
-            "twitter-archive/know-thy-father/meaning-kernels.jsonl",
-            "twitter-archive/know-thy-father/pipeline-status.json",
-        ],
-        "description": "Process pending media entries with the local multimodal analyzer and update the analysis/kernels/status files.",
-    },
-    {
-        "id": "phase3_holographic_synthesis",
-        "name": "Phase 3 — Holographic Synthesis",
-        "script": "scripts/know_thy_father/synthesize_kernels.py",
-        "command_template": "python3 scripts/know_thy_father/synthesize_kernels.py --input twitter-archive/media/manifest.jsonl --output twitter-archive/knowledge/fathers_ledger.jsonl --summary twitter-archive/knowledge/fathers_ledger.summary.json",
-        "outputs": [
-            "twitter-archive/knowledge/fathers_ledger.jsonl",
-            "twitter-archive/knowledge/fathers_ledger.summary.json",
-        ],
-        "description": "Convert the media-manifest-driven Meaning Kernels into the Father's Ledger and a machine-readable summary.",
-    },
-    {
-        "id": "phase4_cross_reference_audit",
-        "name": "Phase 4 — Cross-Reference Audit",
-        "script": "scripts/know_thy_father/crossref_audit.py",
-        "command_template": "python3 scripts/know_thy_father/crossref_audit.py --soul SOUL.md --kernels twitter-archive/notes/know_thy_father_crossref.md --output twitter-archive/notes/crossref_report.md",
-        "outputs": ["twitter-archive/notes/crossref_report.md"],
-        "description": "Compare Know Thy Father kernels against SOUL.md and related canon, then emit a Markdown audit report.",
-    },
-    {
-        "id": "phase5_processing_log",
-        "name": "Phase 5 — Processing Log / Status",
-        "script": "twitter-archive/know-thy-father/tracker.py",
-        "command_template": "python3 twitter-archive/know-thy-father/tracker.py report",
-        "outputs": ["twitter-archive/know-thy-father/REPORT.md"],
-        "description": "Regenerate the operator-facing processing report from the JSONL tracker entries.",
-    },
-]
-
-
-def build_pipeline_plan(batch_size: int = 10):
-    plan = []
-    for phase in PHASES:
-        plan.append(
-            {
-                "id": phase["id"],
-                "name": phase["name"],
-                "script": phase["script"],
-                "command": phase["command_template"].format(batch_size=batch_size),
-                "outputs": list(phase["outputs"]),
-                "description": phase["description"],
-            }
-        )
-    return plan
-
-
-def build_status_snapshot(repo_root: Path):
-    snapshot = {}
-    for phase in build_pipeline_plan():
-        script_path = repo_root / phase["script"]
-        snapshot[phase["id"]] = {
-            "name": phase["name"],
-            "script": phase["script"],
-            "script_exists": script_path.exists(),
-            "outputs": [
-                {
-                    "path": output,
-                    "exists": (repo_root / output).exists(),
-                }
-                for output in phase["outputs"]
-            ],
-        }
-    return snapshot
-
-
-def run_step(repo_root: Path, step_id: str, batch_size: int = 10):
-    plan = {step["id"]: step for step in build_pipeline_plan(batch_size=batch_size)}
-    if step_id not in plan:
-        raise SystemExit(f"Unknown step: {step_id}")
-    step = plan[step_id]
-    return run(step["command"], cwd=repo_root, shell=True, check=False)
-
-
-def main():
-    parser = argparse.ArgumentParser(description="Know Thy Father epic orchestration helper")
-    parser.add_argument("--batch-size", type=int, default=10)
-    parser.add_argument("--status", action="store_true")
-    parser.add_argument("--run-step", default=None)
-    parser.add_argument("--json", action="store_true")
-    args = parser.parse_args()
-
-    repo_root = Path(__file__).resolve().parents[2]
-
-    if args.run_step:
-        result = run_step(repo_root, args.run_step, batch_size=args.batch_size)
-        raise SystemExit(result.returncode)
-
-    payload = build_status_snapshot(repo_root) if args.status else build_pipeline_plan(batch_size=args.batch_size)
-    if args.json or args.status:
-        print(json.dumps(payload, indent=2))
-    else:
-        for step in payload:
-            print(f"[{step['id']}] {step['command']}")
-
-
-if __name__ == "__main__":
-    main()
--- a/scripts/mempalace_ezra_integration.py
+++ b/scripts/mempalace_ezra_integration.py
@@ -1,159 +0,0 @@
-#!/usr/bin/env python3
-"""Prepare a MemPalace v3.0.0 integration packet for Ezra's Hermes home."""
-
-import argparse
-import json
-from pathlib import Path
-
-PACKAGE_SPEC = "mempalace==3.0.0"
-DEFAULT_HERMES_HOME = "~/.hermes/"
-DEFAULT_SESSIONS_DIR = "~/.hermes/sessions/"
-DEFAULT_PALACE_PATH = "~/.mempalace/palace"
-DEFAULT_WING = "ezra_home"
-
-
-def build_yaml_template(wing: str, palace_path: str) -> str:
-    return (
-        f"wing: {wing}\n"
-        f"palace: {palace_path}\n"
-        "rooms:\n"
-        "  - name: sessions\n"
-        "    description: Conversation history and durable agent transcripts\n"
-        "    globs:\n"
-        "      - \"*.json\"\n"
-        "      - \"*.jsonl\"\n"
-        "  - name: config\n"
-        "    description: Hermes configuration and runtime settings\n"
-        "    globs:\n"
-        "      - \"*.yaml\"\n"
-        "      - \"*.yml\"\n"
-        "      - \"*.toml\"\n"
-        "  - name: docs\n"
-        "    description: Notes, markdown docs, and operating reports\n"
-        "    globs:\n"
-        "      - \"*.md\"\n"
-        "      - \"*.txt\"\n"
-        "people: []\n"
-        "projects: []\n"
-    )
-
-
-def build_plan(overrides: dict | None = None) -> dict:
-    overrides = overrides or {}
-    hermes_home = overrides.get("hermes_home", DEFAULT_HERMES_HOME)
-    sessions_dir = overrides.get("sessions_dir", DEFAULT_SESSIONS_DIR)
-    palace_path = overrides.get("palace_path", DEFAULT_PALACE_PATH)
-    wing = overrides.get("wing", DEFAULT_WING)
-    yaml_template = build_yaml_template(wing=wing, palace_path=palace_path)
-
-    config_home = hermes_home[:-1] if hermes_home.endswith("/") else hermes_home
-    plan = {
-        "package_spec": PACKAGE_SPEC,
-        "hermes_home": hermes_home,
-        "sessions_dir": sessions_dir,
-        "palace_path": palace_path,
-        "wing": wing,
-        "config_path": f"{config_home}/mempalace.yaml",
-        "install_command": f"pip install {PACKAGE_SPEC}",
-        "init_command": f"mempalace init {hermes_home} --yes",
-        "mine_home_command": f"echo \"\" | mempalace mine {hermes_home}",
-        "mine_sessions_command": f"echo \"\" | mempalace mine {sessions_dir} --mode convos",
-        "search_command": 'mempalace search "your common queries"',
-        "wake_up_command": "mempalace wake-up",
-        "mcp_command": "hermes mcp add mempalace -- python -m mempalace.mcp_server",
-        "yaml_template": yaml_template,
-        "gotchas": [
-            "`mempalace init` is still interactive in room approval flow; write mempalace.yaml manually if the init output stalls.",
-            "The yaml key is `wing:` not `wings:`. Using the wrong key causes mine/setup failures.",
-            "Pipe empty stdin into mining commands (`echo \"\" | ...`) to avoid the entity-detector stdin hang on larger directories.",
-            "First mine downloads the ChromaDB embedding model cache (~79MB).",
-            "Report Ezra's before/after metrics back to issue #568 after live installation and retrieval tests.",
-        ],
-    }
-    return plan
-
-
-def render_markdown(plan: dict) -> str:
-    gotchas = "\n".join(f"- {item}" for item in plan["gotchas"])
-    return f"""# MemPalace v3.0.0 — Ezra Integration Packet
-
-This packet turns issue #570 into an executable, reviewable integration plan for Ezra's Hermes home.
-It is a repo-side scaffold: no live Ezra host changes are claimed in this artifact.
-
-## Commands
-
-```bash
-{plan['install_command']}
-{plan['init_command']}
-cat > {plan['config_path']} <<'YAML'
-{plan['yaml_template'].rstrip()}
-YAML
-{plan['mine_home_command']}
-{plan['mine_sessions_command']}
-{plan['search_command']}
-{plan['wake_up_command']}
-{plan['mcp_command']}
-```
-
-## Manual config template
-
-```yaml
-{plan['yaml_template'].rstrip()}
-```
-
-## Why this shape
-
-- `wing: {plan['wing']}` matches the issue's Ezra-specific integration target.
-- `rooms` split the mined material into sessions, config, and docs to keep retrieval interpretable.
-- Mining commands pipe empty stdin to avoid the interactive entity-detector hang noted in the evaluation.
-
-## Gotchas
-
-{gotchas}
-
-## Report back to #568
-
-After live execution on Ezra's actual environment, post back to #568 with:
-- install result
-- mine duration and corpus size
-- 2-3 real search queries + retrieved results
-- wake-up context token count
-- whether MCP wiring succeeded
-
-## Honest scope boundary
-
-This repo artifact does **not** prove live installation on Ezra's host. It makes the work reproducible and testable so the next pass can execute it without guesswork.
-"""
-
-
-def main() -> None:
-    parser = argparse.ArgumentParser(description="Prepare the MemPalace Ezra integration packet")
-    parser.add_argument("--hermes-home", default=DEFAULT_HERMES_HOME)
-    parser.add_argument("--sessions-dir", default=DEFAULT_SESSIONS_DIR)
-    parser.add_argument("--palace-path", default=DEFAULT_PALACE_PATH)
-    parser.add_argument("--wing", default=DEFAULT_WING)
-    parser.add_argument("--output", default=None)
-    parser.add_argument("--json", action="store_true")
-    args = parser.parse_args()
-
-    plan = build_plan(
-        {
-            "hermes_home": args.hermes_home,
-            "sessions_dir": args.sessions_dir,
-            "palace_path": args.palace_path,
-            "wing": args.wing,
-        }
-    )
-    rendered = json.dumps(plan, indent=2) if args.json else render_markdown(plan)
-
-    if args.output:
-        output_path = Path(args.output).expanduser()
-        output_path.parent.mkdir(parents=True, exist_ok=True)
-        output_path.write_text(rendered, encoding="utf-8")
-        print(f"MemPalace integration packet written to {output_path}")
-    else:
-        print(rendered)
-
-
-if __name__ == "__main__":
-    main()
--- a/scripts/plan_laptop_fleet.py
+++ b/scripts/plan_laptop_fleet.py
@@ -1,155 +0,0 @@
-#!/usr/bin/env python3
-from __future__ import annotations
-
-import argparse
-import json
-from pathlib import Path
-from typing import Any
-
-import yaml
-
-DAYLIGHT_START = "10:00"
-DAYLIGHT_END = "16:00"
-
-
-def load_manifest(path: str | Path) -> dict[str, Any]:
-    data = yaml.safe_load(Path(path).read_text()) or {}
-    data.setdefault("machines", [])
-    return data
-
-
-def validate_manifest(data: dict[str, Any]) -> None:
-    machines = data.get("machines", [])
-    if not machines:
-        raise ValueError("manifest must contain at least one machine")
-
-    seen: set[str] = set()
-    for machine in machines:
-        hostname = machine.get("hostname", "").strip()
-        if not hostname:
-            raise ValueError("each machine must declare a hostname")
-        if hostname in seen:
-            raise ValueError(f"duplicate hostname: {hostname} (unique hostnames are required)")
-        seen.add(hostname)
-
-        for field in ("machine_type", "ram_gb", "cpu_cores", "os", "adapter_condition"):
-            if field not in machine:
-                raise ValueError(f"machine {hostname} missing required field: {field}")
-
-
-def _laptops(machines: list[dict[str, Any]]) -> list[dict[str, Any]]:
-    return [m for m in machines if m.get("machine_type") == "laptop"]
-
-
-def _desktop(machines: list[dict[str, Any]]) -> dict[str, Any] | None:
-    for machine in machines:
-        if machine.get("machine_type") == "desktop":
-            return machine
-    return None
-
-
-def choose_anchor_agents(machines: list[dict[str, Any]], count: int = 2) -> list[dict[str, Any]]:
-    eligible = [
-        m for m in _laptops(machines)
-        if m.get("adapter_condition") in {"good", "ok"} and m.get("always_on_capable", True)
-    ]
-    eligible.sort(key=lambda m: (m.get("idle_watts", 9999), -m.get("ram_gb", 0), -m.get("cpu_cores", 0), m["hostname"]))
-    return eligible[:count]
-
-
-def assign_roles(machines: list[dict[str, Any]]) -> dict[str, Any]:
-    anchors = choose_anchor_agents(machines, count=2)
-    anchor_names = {m["hostname"] for m in anchors}
-    desktop = _desktop(machines)
-
-    mapping: dict[str, dict[str, Any]] = {}
-    for machine in machines:
-        hostname = machine["hostname"]
-        if desktop and hostname == desktop["hostname"]:
-            mapping[hostname] = {
-                "role": "desktop_nas",
-                "schedule": f"{DAYLIGHT_START}-{DAYLIGHT_END}",
-                "duty_cycle": "daylight_only",
-            }
-        elif hostname in anchor_names:
-            mapping[hostname] = {
-                "role": "anchor_agent",
-                "schedule": "24/7",
-                "duty_cycle": "continuous",
-            }
-        else:
-            mapping[hostname] = {
-                "role": "daylight_agent",
-                "schedule": f"{DAYLIGHT_START}-{DAYLIGHT_END}",
-                "duty_cycle": "peak_solar",
-            }
-    return {
-        "anchor_agents": [m["hostname"] for m in anchors],
-        "desktop_nas": desktop["hostname"] if desktop else None,
-        "role_mapping": mapping,
-    }
-
-
-def build_plan(data: dict[str, Any]) -> dict[str, Any]:
-    validate_manifest(data)
-    machines = data["machines"]
-    role_plan = assign_roles(machines)
-    return {
-        "fleet_name": data.get("fleet_name", "timmy-laptop-fleet"),
-        "machine_count": len(machines),
-        "anchor_agents": role_plan["anchor_agents"],
-        "desktop_nas": role_plan["desktop_nas"],
-        "daylight_window": f"{DAYLIGHT_START}-{DAYLIGHT_END}",
-        "role_mapping": role_plan["role_mapping"],
-    }
-
-
-def render_markdown(plan: dict[str, Any], data: dict[str, Any]) -> str:
-    lines = [
-        "# Laptop Fleet Deployment Plan",
-        "",
-        f"Fleet: {plan['fleet_name']}",
-        f"Machine count: {plan['machine_count']}",
-        f"24/7 anchor agents: {', '.join(plan['anchor_agents']) if plan['anchor_agents'] else 'TBD'}",
-        f"Desktop/NAS: {plan['desktop_nas'] or 'TBD'}",
-        f"Daylight schedule: {plan['daylight_window']}",
-        "",
-        "## Role mapping",
-        "",
-        "| Hostname | Role | Schedule | Duty cycle |",
-        "|---|---|---|---|",
-    ]
-    for hostname, role in sorted(plan["role_mapping"].items()):
-        lines.append(f"| {hostname} | {role['role']} | {role['schedule']} | {role['duty_cycle']} |")
-
-    lines.extend([
-        "",
-        "## Machine inventory",
-        "",
-        "| Hostname | Type | RAM | CPU cores | OS | Adapter | Idle watts | Notes |",
-        "|---|---|---:|---:|---|---|---:|---|",
-    ])
-    for machine in data["machines"]:
-        lines.append(
-            f"| {machine['hostname']} | {machine['machine_type']} | {machine['ram_gb']} | {machine['cpu_cores']} | {machine['os']} | {machine['adapter_condition']} | {machine.get('idle_watts', 'n/a')} | {machine.get('notes', '')} |"
-        )
-    return "\n".join(lines) + "\n"
-
-
-def main() -> int:
-    parser = argparse.ArgumentParser(description="Plan LAB-005 laptop fleet deployment.")
-    parser.add_argument("manifest", help="Path to laptop fleet manifest YAML")
-    parser.add_argument("--markdown", action="store_true", help="Render a markdown deployment plan instead of JSON")
-    args = parser.parse_args()
-
-    data = load_manifest(args.manifest)
-    plan = build_plan(data)
-    if args.markdown:
-        print(render_markdown(plan, data))
-    else:
-        print(json.dumps(plan, indent=2))
-    return 0
-
-
-if __name__ == "__main__":
-    raise SystemExit(main())
--- a/scripts/plan_nh_broadband_install.py
+++ b/scripts/plan_nh_broadband_install.py
@@ -1,135 +0,0 @@
-#!/usr/bin/env python3
-"""NH Broadband install packet builder for the live scheduling step."""
-from __future__ import annotations
-
-import argparse
-import json
-from datetime import datetime, timezone
-from pathlib import Path
-from typing import Any
-
-import yaml
-
-
-def load_request(path: str | Path) -> dict[str, Any]:
-    data = yaml.safe_load(Path(path).read_text()) or {}
-    data.setdefault("contact", {})
-    data.setdefault("service", {})
-    data.setdefault("call_log", [])
-    data.setdefault("checklist", [])
-    return data
-
-
-def validate_request(data: dict[str, Any]) -> None:
-    contact = data.get("contact", {})
-    for field in ("name", "phone"):
-        if not contact.get(field, "").strip():
-            raise ValueError(f"contact.{field} is required")
-
-    service = data.get("service", {})
-    for field in ("address", "city", "state"):
-        if not service.get(field, "").strip():
-            raise ValueError(f"service.{field} is required")
-
-    if not data.get("checklist"):
-        raise ValueError("checklist must contain at least one item")
-
-
-def build_packet(data: dict[str, Any]) -> dict[str, Any]:
-    validate_request(data)
-    contact = data["contact"]
-    service = data["service"]
-
-    return {
-        "packet_id": f"nh-bb-{datetime.now(timezone.utc).strftime('%Y%m%d-%H%M%S')}",
-        "generated_utc": datetime.now(timezone.utc).isoformat(),
-        "contact": {
-            "name": contact["name"],
-            "phone": contact["phone"],
-            "email": contact.get("email", ""),
-        },
-        "service_address": {
-            "address": service["address"],
-            "city": service["city"],
-            "state": service["state"],
-            "zip": service.get("zip", ""),
-        },
-        "desired_plan": data.get("desired_plan", "residential-fiber"),
-        "call_log": data.get("call_log", []),
-        "checklist": [
-            {"item": item, "done": False} if isinstance(item, str) else item
-            for item in data["checklist"]
-        ],
-        "status": "pending_scheduling_call",
-    }
-
-
-def render_markdown(packet: dict[str, Any], data: dict[str, Any]) -> str:
-    contact = packet["contact"]
-    addr = packet["service_address"]
-    lines = [
-        f"# NH Broadband Install Packet",
-        "",
-        f"**Packet ID:** {packet['packet_id']}",
-        f"**Generated:** {packet['generated_utc']}",
-        f"**Status:** {packet['status']}",
-        "",
-        "## Contact",
-        "",
-        f"- **Name:** {contact['name']}",
-        f"- **Phone:** {contact['phone']}",
-        f"- **Email:** {contact.get('email', 'n/a')}",
-        "",
-        "## Service Address",
-        "",
-        f"- {addr['address']}",
-        f"- {addr['city']}, {addr['state']} {addr['zip']}",
-        "",
-        f"## Desired Plan",
-        "",
-        f"{packet['desired_plan']}",
-        "",
-        "## Call Log",
-        "",
-    ]
-    if packet["call_log"]:
-        for entry in packet["call_log"]:
-            ts = entry.get("timestamp", "n/a")
-            outcome = entry.get("outcome", "n/a")
-            notes = entry.get("notes", "")
-            lines.append(f"- **{ts}** — {outcome}")
-            if notes:
-                lines.append(f"  - {notes}")
-    else:
-        lines.append("_No calls logged yet._")
-
-    lines.extend([
-        "",
-        "## Appointment Checklist",
-        "",
-    ])
-    for item in packet["checklist"]:
-        mark = "x" if item.get("done") else " "
-        lines.append(f"- [{mark}] {item['item']}")
-
-    lines.append("")
-    return "\n".join(lines)
-
-
-def main() -> int:
-    parser = argparse.ArgumentParser(description="Build NH Broadband install packet.")
-    parser.add_argument("request", help="Path to install request YAML")
-    parser.add_argument("--markdown", action="store_true", help="Render markdown instead of JSON")
-    args = parser.parse_args()
-
-    data = load_request(args.request)
-    packet = build_packet(data)
-    if args.markdown:
-        print(render_markdown(packet, data))
-    else:
-        print(json.dumps(packet, indent=2))
-    return 0
-
-
-if __name__ == "__main__":
-    raise SystemExit(main())
--- a/tests/docs/test_mempalace_evaluation_report.py
+++ b/tests/docs/test_mempalace_evaluation_report.py
@@ -0,0 +1,34 @@
+from pathlib import Path
+
+
+REPORT = Path("reports/evaluations/2026-04-06-mempalace-evaluation.md")
+
+
+def _content() -> str:
+    return REPORT.read_text(encoding="utf-8")
+
+
+def test_mempalace_evaluation_report_exists() -> None:
+    assert REPORT.exists()
+
+
+def test_mempalace_evaluation_report_has_completed_sections() -> None:
+    content = _content()
+    assert "# MemPalace Integration Evaluation Report" in content
+    assert "## Executive Summary" in content
+    assert "## Benchmark Findings" in content
+    assert "## Before vs After Evaluation" in content
+    assert "## Live Mining Results" in content
+    assert "## Independent Verification" in content
+    assert "## Operational Gotchas" in content
+    assert "## Recommendation" in content
+
+
+def test_mempalace_evaluation_report_uses_real_issue_reference_and_metrics() -> None:
+    content = _content()
+    assert "#568" in content
+    assert "#[NUMBER]" not in content
+    assert "5,198 drawers" in content
+    assert "~785 tokens" in content
+    assert "238 tokens" in content
+    assert "interactive even with `--yes`" in content or "interactive even with --yes" in content
--- a/tests/test_know_thy_father_pipeline.py
+++ b/tests/test_know_thy_father_pipeline.py
@@ -1,76 +0,0 @@
-from pathlib import Path
-import importlib.util
-import unittest
-
-
-ROOT = Path(__file__).resolve().parent.parent
-SCRIPT_PATH = ROOT / "scripts" / "know_thy_father" / "epic_pipeline.py"
-DOC_PATH = ROOT / "docs" / "KNOW_THY_FATHER_MULTIMODAL_PIPELINE.md"
-
-
-def load_module(path: Path, name: str):
-    assert path.exists(), f"missing {path.relative_to(ROOT)}"
-    spec = importlib.util.spec_from_file_location(name, path)
-    assert spec and spec.loader
-    module = importlib.util.module_from_spec(spec)
-    spec.loader.exec_module(module)
-    return module
-
-
-class TestKnowThyFatherEpicPipeline(unittest.TestCase):
-    def test_build_pipeline_plan_contains_all_phases_in_order(self):
-        mod = load_module(SCRIPT_PATH, "ktf_epic_pipeline")
-        plan = mod.build_pipeline_plan(batch_size=10)
-
-        self.assertEqual(
-            [step["id"] for step in plan],
-            [
-                "phase1_media_indexing",
-                "phase2_multimodal_analysis",
-                "phase3_holographic_synthesis",
-                "phase4_cross_reference_audit",
-                "phase5_processing_log",
-            ],
-        )
-        self.assertIn("scripts/know_thy_father/index_media.py", plan[0]["command"])
-        self.assertIn("scripts/twitter_archive/analyze_media.py --batch 10", plan[1]["command"])
-        self.assertIn("scripts/know_thy_father/synthesize_kernels.py", plan[2]["command"])
-        self.assertIn("scripts/know_thy_father/crossref_audit.py", plan[3]["command"])
-        self.assertIn("twitter-archive/know-thy-father/tracker.py report", plan[4]["command"])
-
-    def test_status_snapshot_reports_key_artifact_paths(self):
-        mod = load_module(SCRIPT_PATH, "ktf_epic_pipeline")
-        status = mod.build_status_snapshot(ROOT)
-
-        self.assertIn("phase1_media_indexing", status)
-        self.assertIn("phase2_multimodal_analysis", status)
-        self.assertIn("phase3_holographic_synthesis", status)
-        self.assertIn("phase4_cross_reference_audit", status)
-        self.assertIn("phase5_processing_log", status)
-        self.assertEqual(status["phase1_media_indexing"]["script"], "scripts/know_thy_father/index_media.py")
-        self.assertEqual(status["phase2_multimodal_analysis"]["script"], "scripts/twitter_archive/analyze_media.py")
-        self.assertEqual(status["phase5_processing_log"]["script"], "twitter-archive/know-thy-father/tracker.py")
-        self.assertTrue(status["phase1_media_indexing"]["script_exists"])
-        self.assertTrue(status["phase2_multimodal_analysis"]["script_exists"])
-        self.assertTrue(status["phase3_holographic_synthesis"]["script_exists"])
-        self.assertTrue(status["phase4_cross_reference_audit"]["script_exists"])
-        self.assertTrue(status["phase5_processing_log"]["script_exists"])
-
-    def test_repo_contains_multimodal_pipeline_doc(self):
-        self.assertTrue(DOC_PATH.exists(), "missing committed Know Thy Father pipeline doc")
-        text = DOC_PATH.read_text(encoding="utf-8")
-        required = [
-            "# Know Thy Father — Multimodal Media Consumption Pipeline",
-            "scripts/know_thy_father/index_media.py",
-            "scripts/twitter_archive/analyze_media.py --batch 10",
-            "scripts/know_thy_father/synthesize_kernels.py",
-            "scripts/know_thy_father/crossref_audit.py",
-            "twitter-archive/know-thy-father/tracker.py report",
-            "Refs #582",
-        ]
-        for snippet in required:
-            self.assertIn(snippet, text)
-
-
-if __name__ == "__main__":
-    unittest.main()
--- a/tests/test_laptop_fleet_planner.py
+++ b/tests/test_laptop_fleet_planner.py
@@ -1,52 +0,0 @@
-from pathlib import Path
-
-import yaml
-
-from scripts.plan_laptop_fleet import build_plan, load_manifest, render_markdown, validate_manifest
-
-
-def test_laptop_fleet_planner_script_exists() -> None:
-    assert Path("scripts/plan_laptop_fleet.py").exists()
-
-
-def test_laptop_fleet_manifest_template_exists() -> None:
-    assert Path("docs/laptop-fleet-manifest.example.yaml").exists()
-
-
-def test_build_plan_selects_two_lowest_idle_watt_laptops_as_anchors() -> None:
-    data = load_manifest("docs/laptop-fleet-manifest.example.yaml")
-    plan = build_plan(data)
-    assert plan["anchor_agents"] == ["timmy-anchor-a", "timmy-anchor-b"]
-    assert plan["desktop_nas"] == "timmy-desktop-nas"
-    assert plan["role_mapping"]["timmy-daylight-a"]["schedule"] == "10:00-16:00"
-
-
-def test_validate_manifest_requires_unique_hostnames() -> None:
-    data = {
-        "machines": [
-            {"hostname": "dup", "machine_type": "laptop", "ram_gb": 8, "cpu_cores": 4, "os": "Linux", "adapter_condition": "good"},
-            {"hostname": "dup", "machine_type": "laptop", "ram_gb": 16, "cpu_cores": 8, "os": "Linux", "adapter_condition": "good"},
-        ]
-    }
-    try:
-        validate_manifest(data)
-    except ValueError as exc:
-        assert "duplicate hostname" in str(exc)
-        assert "unique hostnames" in str(exc)
-    else:
-        raise AssertionError("validate_manifest should reject duplicate hostname")
-
-
-def test_markdown_contains_anchor_agents_and_daylight_schedule() -> None:
-    data = load_manifest("docs/laptop-fleet-manifest.example.yaml")
-    plan = build_plan(data)
-    content = render_markdown(plan, data)
-    assert "24/7 anchor agents: timmy-anchor-a, timmy-anchor-b" in content
-    assert "Daylight schedule: 10:00-16:00" in content
-    assert "desktop_nas" in content
-
-
-def test_manifest_template_is_valid_yaml() -> None:
-    data = yaml.safe_load(Path("docs/laptop-fleet-manifest.example.yaml").read_text())
-    assert data["fleet_name"] == "timmy-laptop-fleet"
-    assert len(data["machines"]) == 6
--- a/tests/test_mempalace_ezra_integration.py
+++ b/tests/test_mempalace_ezra_integration.py
@@ -1,68 +0,0 @@
-from pathlib import Path
-import importlib.util
-import unittest
-
-
-ROOT = Path(__file__).resolve().parent.parent
-SCRIPT_PATH = ROOT / "scripts" / "mempalace_ezra_integration.py"
-DOC_PATH = ROOT / "docs" / "MEMPALACE_EZRA_INTEGRATION.md"
-
-
-def load_module(path: Path, name: str):
-    assert path.exists(), f"missing {path.relative_to(ROOT)}"
-    spec = importlib.util.spec_from_file_location(name, path)
-    assert spec and spec.loader
-    module = importlib.util.module_from_spec(spec)
-    spec.loader.exec_module(module)
-    return module
-
-
-class TestMempalaceEzraIntegration(unittest.TestCase):
-    def test_build_plan_contains_issue_required_steps_and_gotchas(self):
-        mod = load_module(SCRIPT_PATH, "mempalace_ezra_integration")
-        plan = mod.build_plan({})
-
-        self.assertEqual(plan["package_spec"], "mempalace==3.0.0")
-        self.assertIn("pip install mempalace==3.0.0", plan["install_command"])
-        self.assertEqual(plan["wing"], "ezra_home")
-        self.assertIn('echo "" | mempalace mine ~/.hermes/', plan["mine_home_command"])
-        self.assertIn('--mode convos', plan["mine_sessions_command"])
-        self.assertIn('mempalace wake-up', plan["wake_up_command"])
-        self.assertIn('hermes mcp add mempalace -- python -m mempalace.mcp_server', plan["mcp_command"])
-        self.assertIn('wing:', plan["yaml_template"])
-        self.assertTrue(any('stdin' in item.lower() for item in plan["gotchas"]))
-        self.assertTrue(any('wing:' in item for item in plan["gotchas"]))
-
-    def test_build_plan_accepts_path_and_wing_overrides(self):
-        mod = load_module(SCRIPT_PATH, "mempalace_ezra_integration")
-        plan = mod.build_plan(
-            {
-                "hermes_home": "/root/wizards/ezra/home",
-                "sessions_dir": "/root/wizards/ezra/home/sessions",
-                "wing": "ezra_archive",
-            }
-        )
-
-        self.assertEqual(plan["wing"], "ezra_archive")
-        self.assertIn('/root/wizards/ezra/home', plan["mine_home_command"])
-        self.assertIn('/root/wizards/ezra/home/sessions', plan["mine_sessions_command"])
-        self.assertIn('wing: ezra_archive', plan["yaml_template"])
-
-    def test_repo_contains_mem_palace_ezra_doc(self):
-        self.assertTrue(DOC_PATH.exists(), "missing committed MemPalace Ezra integration doc")
-        text = DOC_PATH.read_text(encoding="utf-8")
-        required = [
-            "# MemPalace v3.0.0 — Ezra Integration Packet",
-            "pip install mempalace==3.0.0",
-            'echo "" | mempalace mine ~/.hermes/',
-            "mempalace mine ~/.hermes/sessions/ --mode convos",
-            "mempalace wake-up",
-            "hermes mcp add mempalace -- python -m mempalace.mcp_server",
-            "Report back to #568",
-        ]
-        for snippet in required:
-            self.assertIn(snippet, text)
-
-
-if __name__ == "__main__":
-    unittest.main()
--- a/tests/test_nh_broadband_install_planner.py
+++ b/tests/test_nh_broadband_install_planner.py
@@ -1,105 +0,0 @@
-from pathlib import Path
-
-import yaml
-
-from scripts.plan_nh_broadband_install import (
-    build_packet,
-    load_request,
-    render_markdown,
-    validate_request,
-)
-
-
-def test_script_exists() -> None:
-    assert Path("scripts/plan_nh_broadband_install.py").exists()
-
-
-def test_example_request_exists() -> None:
-    assert Path("docs/nh-broadband-install-request.example.yaml").exists()
-
-
-def test_example_packet_exists() -> None:
-    assert Path("docs/nh-broadband-install-packet.example.md").exists()
-
-
-def test_research_memo_exists() -> None:
-    assert Path("reports/operations/2026-04-15-nh-broadband-public-research.md").exists()
-
-
-def test_load_and_build_packet() -> None:
-    data = load_request("docs/nh-broadband-install-request.example.yaml")
-    packet = build_packet(data)
-    assert packet["contact"]["name"] == "Timmy Operator"
-    assert packet["service_address"]["city"] == "Concord"
-    assert packet["service_address"]["state"] == "NH"
-    assert packet["status"] == "pending_scheduling_call"
-    assert len(packet["checklist"]) == 8
-    assert packet["checklist"][0]["done"] is False
-
-
-def test_validate_rejects_missing_contact_name() -> None:
-    data = {
-        "contact": {"name": "", "phone": "555"},
-        "service": {"address": "1 St", "city": "X", "state": "NH"},
-        "checklist": ["do thing"],
-    }
-    try:
-        validate_request(data)
-    except ValueError as exc:
-        assert "contact.name" in str(exc)
-    else:
-        raise AssertionError("should reject empty contact name")
-
-
-def test_validate_rejects_missing_service_address() -> None:
-    data = {
-        "contact": {"name": "A", "phone": "555"},
-        "service": {"address": "", "city": "X", "state": "NH"},
-        "checklist": ["do thing"],
-    }
-    try:
-        validate_request(data)
-    except ValueError as exc:
-        assert "service.address" in str(exc)
-    else:
-        raise AssertionError("should reject empty service address")
-
-
-def test_validate_rejects_empty_checklist() -> None:
-    data = {
-        "contact": {"name": "A", "phone": "555"},
-        "service": {"address": "1 St", "city": "X", "state": "NH"},
-        "checklist": [],
-    }
-    try:
-        validate_request(data)
-    except ValueError as exc:
-        assert "checklist" in str(exc)
-    else:
-        raise AssertionError("should reject empty checklist")
-
-
-def test_render_markdown_contains_key_sections() -> None:
-    data = load_request("docs/nh-broadband-install-request.example.yaml")
-    packet = build_packet(data)
-    md = render_markdown(packet, data)
-    assert "# NH Broadband Install Packet" in md
-    assert "## Contact" in md
-    assert "## Service Address" in md
-    assert "## Call Log" in md
-    assert "## Appointment Checklist" in md
-    assert "Concord" in md
-    assert "NH" in md
-
-
-def test_render_markdown_shows_checklist_items() -> None:
-    data = load_request("docs/nh-broadband-install-request.example.yaml")
-    packet = build_packet(data)
-    md = render_markdown(packet, data)
-    assert "- [ ] Confirm exact-address availability" in md
-
-
-def test_example_yaml_is_valid() -> None:
-    data = yaml.safe_load(Path("docs/nh-broadband-install-request.example.yaml").read_text())
-    assert data["contact"]["name"] == "Timmy Operator"
-    assert len(data["checklist"]) == 8