feat: add Know Thy Father epic orchestrator (#582 )

feat: add Ezra mempalace integration packet (#570 )
2026-04-15 01:52:58 -04:00 · 2026-04-15 01:37:47 -04:00
11 changed files with 586 additions and 386 deletions
--- a/.gitea/workflows/agent-pr-gate.yml
+++ b/.gitea/workflows/agent-pr-gate.yml
@@ -1,97 +0,0 @@
-name: Agent PR Gate
-'on':
-  pull_request:
-    branches: [main]
-
-jobs:
-  gate:
-    runs-on: ubuntu-latest
-    outputs:
-      syntax_status: ${{ steps.syntax.outcome }}
-      tests_status: ${{ steps.tests.outcome }}
-      criteria_status: ${{ steps.criteria.outcome }}
-      risk_level: ${{ steps.risk.outputs.level }}
-    steps:
-      - uses: actions/checkout@v4
-        with:
-          fetch-depth: 0
-
-      - uses: actions/setup-python@v5
-        with:
-          python-version: '3.11'
-
-      - name: Install CI dependencies
-        run: |
-          python3 -m pip install --quiet pyyaml pytest
-
-      - id: risk
-        name: Classify PR risk
-        run: |
-          BASE_REF="${GITHUB_BASE_REF:-main}"
-          git fetch origin "$BASE_REF" --depth 1
-          git diff --name-only "origin/$BASE_REF"...HEAD > /tmp/changed_files.txt
-          python3 scripts/agent_pr_gate.py classify-risk --files-file /tmp/changed_files.txt > /tmp/risk.json
-          python3 - <<'PY'
-          import json, os
-          with open('/tmp/risk.json', 'r', encoding='utf-8') as fh:
-              data = json.load(fh)
-          with open(os.environ['GITHUB_OUTPUT'], 'a', encoding='utf-8') as fh:
-              fh.write('level=' + data['risk'] + '\n')
-          PY
-
-      - id: syntax
-        name: Syntax and parse checks
-        continue-on-error: true
-        run: |
-          find . \( -name '*.yml' -o -name '*.yaml' \) | grep -v .gitea | xargs -r python3 -c "import sys,yaml; [yaml.safe_load(open(f)) for f in sys.argv[1:]]"
-          find . -name '*.json' | while read f; do python3 -m json.tool "$f" > /dev/null || exit 1; done
-          find . -name '*.py' | xargs -r python3 -m py_compile
-          find . -name '*.sh' | xargs -r bash -n
-
-      - id: tests
-        name: Test suite
-        continue-on-error: true
-        run: |
-          pytest -q --ignore=uni-wizard/v2/tests/test_author_whitelist.py
-
-      - id: criteria
-        name: PR criteria verification
-        continue-on-error: true
-        run: |
-          python3 scripts/agent_pr_gate.py validate-pr --event-path "$GITHUB_EVENT_PATH"
-
-      - name: Fail gate if any required check failed
-        if: steps.syntax.outcome != 'success' || steps.tests.outcome != 'success' || steps.criteria.outcome != 'success'
-        run: exit 1
-
-  report:
-    needs: gate
-    if: always()
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-
-      - uses: actions/setup-python@v5
-        with:
-          python-version: '3.11'
-
-      - name: Post PR gate report
-        env:
-          GITEA_TOKEN: ${{ github.token }}
-        run: |
-          python3 scripts/agent_pr_gate.py comment \
-            --event-path "$GITHUB_EVENT_PATH" \
-            --token "$GITEA_TOKEN" \
-            --syntax "${{ needs.gate.outputs.syntax_status }}" \
-            --tests "${{ needs.gate.outputs.tests_status }}" \
-            --criteria "${{ needs.gate.outputs.criteria_status }}" \
-            --risk "${{ needs.gate.outputs.risk_level }}"
-
-      - name: Auto-merge low-risk clean PRs
-        if: needs.gate.result == 'success' && needs.gate.outputs.risk_level == 'low'
-        env:
-          GITEA_TOKEN: ${{ github.token }}
-        run: |
-          python3 scripts/agent_pr_gate.py merge \
-            --event-path "$GITHUB_EVENT_PATH" \
-            --token "$GITEA_TOKEN"
--- a/.gitea/workflows/smoke.yml
+++ b/.gitea/workflows/smoke.yml
@@ -1,5 +1,5 @@
 name: Smoke Test
-'on':
+on:
  pull_request:
  push:
    branches: [main]
@@ -11,13 +11,10 @@ jobs:
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
-      - name: Install parse dependencies
-        run: |
-          python3 -m pip install --quiet pyyaml
      - name: Parse check
        run: |
-          find . \( -name '*.yml' -o -name '*.yaml' \) | grep -v .gitea | xargs -r python3 -c "import sys,yaml; [yaml.safe_load(open(f)) for f in sys.argv[1:]]"
-          find . -name '*.json' | while read f; do python3 -m json.tool "$f" > /dev/null || exit 1; done
+          find . -name '*.yml' -o -name '*.yaml' | grep -v .gitea | xargs -r python3 -c "import sys,yaml; [yaml.safe_load(open(f)) for f in sys.argv[1:]]"
+          find . -name '*.json' | xargs -r python3 -m json.tool > /dev/null
          find . -name '*.py' | xargs -r python3 -m py_compile
          find . -name '*.sh' | xargs -r bash -n
          echo "PASS: All files parse"
--- a/docs/KNOW_THY_FATHER_MULTIMODAL_PIPELINE.md
+++ b/docs/KNOW_THY_FATHER_MULTIMODAL_PIPELINE.md
@@ -0,0 +1,61 @@
+# Know Thy Father — Multimodal Media Consumption Pipeline
+
+Refs #582
+
+This document makes the epic operational by naming the current source-of-truth scripts, their handoff artifacts, and the one-command runner that coordinates them.
+
+## Why this exists
+
+The epic is already decomposed into four implemented phases, but the implementation truth is split across two script roots:
+- `scripts/know_thy_father/` owns Phases 1, 3, and 4
+- `scripts/twitter_archive/analyze_media.py` owns Phase 2
+- `twitter-archive/know-thy-father/tracker.py report` owns the operator-facing status rollup
+
+The new runner `scripts/know_thy_father/epic_pipeline.py` does not replace those scripts. It stitches them together into one explicit, reviewable plan.
+
+## Phase map
+
+| Phase | Script | Primary output |
+|-------|--------|----------------|
+| 1. Media Indexing | `scripts/know_thy_father/index_media.py` | `twitter-archive/know-thy-father/media_manifest.jsonl` |
+| 2. Multimodal Analysis | `scripts/twitter_archive/analyze_media.py --batch 10` | `twitter-archive/know-thy-father/analysis.jsonl` + `meaning-kernels.jsonl` + `pipeline-status.json` |
+| 3. Holographic Synthesis | `scripts/know_thy_father/synthesize_kernels.py` | `twitter-archive/knowledge/fathers_ledger.jsonl` |
+| 4. Cross-Reference Audit | `scripts/know_thy_father/crossref_audit.py` | `twitter-archive/notes/crossref_report.md` |
+| 5. Processing Log | `twitter-archive/know-thy-father/tracker.py report` | `twitter-archive/know-thy-father/REPORT.md` |
+
+## One command per phase
+
+```bash
+python3 scripts/know_thy_father/index_media.py --tweets twitter-archive/extracted/tweets.jsonl --output twitter-archive/know-thy-father/media_manifest.jsonl
+python3 scripts/twitter_archive/analyze_media.py --batch 10
+python3 scripts/know_thy_father/synthesize_kernels.py --input twitter-archive/media/manifest.jsonl --output twitter-archive/knowledge/fathers_ledger.jsonl --summary twitter-archive/knowledge/fathers_ledger.summary.json
+python3 scripts/know_thy_father/crossref_audit.py --soul SOUL.md --kernels twitter-archive/notes/know_thy_father_crossref.md --output twitter-archive/notes/crossref_report.md
+python3 twitter-archive/know-thy-father/tracker.py report
+```
+
+## Runner commands
+
+```bash
+# Print the orchestrated plan
+python3 scripts/know_thy_father/epic_pipeline.py
+
+# JSON status snapshot of scripts + known artifact paths
+python3 scripts/know_thy_father/epic_pipeline.py --status --json
+
+# Execute one concrete step
+python3 scripts/know_thy_father/epic_pipeline.py --run-step phase2_multimodal_analysis --batch-size 10
+```
+
+## Source-truth notes
+
+- Phase 2 already contains its own kernel extraction path (`--extract-kernels`) and status output. The epic runner does not reimplement that logic.
+- Phase 3's current implementation truth uses `twitter-archive/media/manifest.jsonl` as its default input. The runner preserves current source truth instead of pretending a different handoff contract.
+- The processing log in `twitter-archive/know-thy-father/PROCESSING_LOG.md` can drift from current code reality. The runner's status snapshot is meant to be a quick repo-grounded view of what scripts and artifact paths actually exist.
+
+## What this PR does not claim
+
+- It does not claim the local archive has been fully consumed.
+- It does not claim the halted processing log has been resumed.
+- It does not claim fact_store ingestion has been fully wired end-to-end.
+
+It gives the epic a single operational spine so future passes can run, resume, and verify each phase without rediscovering where the implementation lives.
--- a/docs/MEMPALACE_EZRA_INTEGRATION.md
+++ b/docs/MEMPALACE_EZRA_INTEGRATION.md
@@ -0,0 +1,92 @@
+# MemPalace v3.0.0 — Ezra Integration Packet
+
+This packet turns issue #570 into an executable, reviewable integration plan for Ezra's Hermes home.
+It is a repo-side scaffold: no live Ezra host changes are claimed in this artifact.
+
+## Commands
+
+```bash
+pip install mempalace==3.0.0
+mempalace init ~/.hermes/ --yes
+cat > ~/.hermes/mempalace.yaml <<'YAML'
+wing: ezra_home
+palace: ~/.mempalace/palace
+rooms:
+  - name: sessions
+    description: Conversation history and durable agent transcripts
+    globs:
+      - "*.json"
+      - "*.jsonl"
+  - name: config
+    description: Hermes configuration and runtime settings
+    globs:
+      - "*.yaml"
+      - "*.yml"
+      - "*.toml"
+  - name: docs
+    description: Notes, markdown docs, and operating reports
+    globs:
+      - "*.md"
+      - "*.txt"
+people: []
+projects: []
+YAML
+echo "" | mempalace mine ~/.hermes/
+echo "" | mempalace mine ~/.hermes/sessions/ --mode convos
+mempalace search "your common queries"
+mempalace wake-up
+hermes mcp add mempalace -- python -m mempalace.mcp_server
+```
+
+## Manual config template
+
+```yaml
+wing: ezra_home
+palace: ~/.mempalace/palace
+rooms:
+  - name: sessions
+    description: Conversation history and durable agent transcripts
+    globs:
+      - "*.json"
+      - "*.jsonl"
+  - name: config
+    description: Hermes configuration and runtime settings
+    globs:
+      - "*.yaml"
+      - "*.yml"
+      - "*.toml"
+  - name: docs
+    description: Notes, markdown docs, and operating reports
+    globs:
+      - "*.md"
+      - "*.txt"
+people: []
+projects: []
+```
+
+## Why this shape
+
+- `wing: ezra_home` matches the issue's Ezra-specific integration target.
+- `rooms` split the mined material into sessions, config, and docs to keep retrieval interpretable.
+- Mining commands pipe empty stdin to avoid the interactive entity-detector hang noted in the evaluation.
+
+## Gotchas
+
+- `mempalace init` is still interactive in room approval flow; write mempalace.yaml manually if the init output stalls.
+- The yaml key is `wing:` not `wings:`. Using the wrong key causes mine/setup failures.
+- Pipe empty stdin into mining commands (`echo "" | ...`) to avoid the entity-detector stdin hang on larger directories.
+- First mine downloads the ChromaDB embedding model cache (~79MB).
+- Report Ezra's before/after metrics back to issue #568 after live installation and retrieval tests.
+
+## Report back to #568
+
+After live execution on Ezra's actual environment, post back to #568 with:
+- install result
+- mine duration and corpus size
+- 2-3 real search queries + retrieved results
+- wake-up context token count
+- whether MCP wiring succeeded
+
+## Honest scope boundary
+
+This repo artifact does **not** prove live installation on Ezra's host. It makes the work reproducible and testable so the next pass can execute it without guesswork.
--- a/scripts/agent_pr_gate.py
+++ b/scripts/agent_pr_gate.py
@@ -1,191 +0,0 @@
-#!/usr/bin/env python3
-import argparse
-import json
-import os
-import re
-import sys
-import urllib.request
-from pathlib import Path
-
-API_BASE = "https://forge.alexanderwhitestone.com/api/v1"
-LOW_RISK_PREFIXES = (
-    'docs/', 'reports/', 'notes/', 'tickets/', 'research/', 'briefings/',
-    'twitter-archive/notes/', 'tests/'
-)
-LOW_RISK_SUFFIXES = {'.md', '.txt', '.jsonl'}
-MEDIUM_RISK_PREFIXES = ('.gitea/workflows/',)
-HIGH_RISK_PREFIXES = (
-    'scripts/', 'deploy/', 'infrastructure/', 'metrics/', 'heartbeat/',
-    'wizards/', 'evennia/', 'uniwizard/', 'uni-wizard/', 'timmy-local/',
-    'evolution/'
-)
-HIGH_RISK_SUFFIXES = {'.py', '.sh', '.ini', '.service'}
-
-
-def read_changed_files(path):
-    return [line.strip() for line in Path(path).read_text(encoding='utf-8').splitlines() if line.strip()]
-
-
-def classify_risk(files):
-    if not files:
-        return 'high'
-    level = 'low'
-    for file_path in files:
-        path = file_path.strip()
-        suffix = Path(path).suffix.lower()
-        if path.startswith(LOW_RISK_PREFIXES):
-            continue
-        if path.startswith(HIGH_RISK_PREFIXES) or suffix in HIGH_RISK_SUFFIXES:
-            return 'high'
-        if path.startswith(MEDIUM_RISK_PREFIXES):
-            level = 'medium'
-            continue
-        if path.startswith(LOW_RISK_PREFIXES) or suffix in LOW_RISK_SUFFIXES:
-            continue
-        level = 'high'
-    return level
-
-
-def validate_pr_body(title, body):
-    details = []
-    combined = f"{title}\n{body}".strip()
-    if not re.search(r'#\d+', combined):
-        details.append('PR body/title must include an issue reference like #562.')
-    if not re.search(r'(^|\n)\s*(verification|tests?)\s*:', body, re.IGNORECASE):
-        details.append('PR body must include a Verification: section.')
-    return (len(details) == 0, details)
-
-
-def build_comment_body(syntax_status, tests_status, criteria_status, risk_level):
-    statuses = {
-        'syntax': syntax_status,
-        'tests': tests_status,
-        'criteria': criteria_status,
-    }
-    all_clean = all(value == 'success' for value in statuses.values())
-    action = 'auto-merge' if all_clean and risk_level == 'low' else 'human review'
-    lines = [
-        '## Agent PR Gate',
-        '',
-        '| Check | Status |',
-        '|-------|--------|',
-        f"| Syntax / parse | {syntax_status} |",
-        f"| Test suite | {tests_status} |",
-        f"| PR criteria | {criteria_status} |",
-        f"| Risk level | {risk_level} |",
-        '',
-    ]
-    failed = [name for name, value in statuses.items() if value != 'success']
-    if failed:
-        lines.append('### Failure details')
-        for name in failed:
-            lines.append(f'- {name} reported failure. Inspect the workflow logs for that step.')
-    else:
-        lines.append('All automated checks passed.')
-    lines.extend([
-        '',
-        f'Recommendation: {action}.',
-        'Low-risk documentation/test-only PRs may be auto-merged. Operational changes stay in human review.',
-    ])
-    return '\n'.join(lines)
-
-
-def _read_event(event_path):
-    data = json.loads(Path(event_path).read_text(encoding='utf-8'))
-    pr = data.get('pull_request') or {}
-    repo = (data.get('repository') or {}).get('full_name') or os.environ.get('GITHUB_REPOSITORY')
-    pr_number = pr.get('number') or data.get('number')
-    title = pr.get('title') or ''
-    body = pr.get('body') or ''
-    return repo, pr_number, title, body
-
-
-def _request_json(method, url, token, payload=None):
-    data = None if payload is None else json.dumps(payload).encode('utf-8')
-    headers = {'Authorization': f'token {token}', 'Content-Type': 'application/json'}
-    req = urllib.request.Request(url, data=data, headers=headers, method=method)
-    with urllib.request.urlopen(req, timeout=30) as resp:
-        return json.loads(resp.read().decode('utf-8'))
-
-
-def post_comment(repo, pr_number, token, body):
-    url = f'{API_BASE}/repos/{repo}/issues/{pr_number}/comments'
-    return _request_json('POST', url, token, {'body': body})
-
-
-def merge_pr(repo, pr_number, token):
-    url = f'{API_BASE}/repos/{repo}/pulls/{pr_number}/merge'
-    return _request_json('POST', url, token, {'Do': 'merge'})
-
-
-def cmd_classify_risk(args):
-    files = list(args.files or [])
-    if args.files_file:
-        files.extend(read_changed_files(args.files_file))
-    print(json.dumps({'risk': classify_risk(files), 'files': files}, indent=2))
-    return 0
-
-
-def cmd_validate_pr(args):
-    _, _, title, body = _read_event(args.event_path)
-    ok, details = validate_pr_body(title, body)
-    if ok:
-        print('PR body validation passed.')
-        return 0
-    for detail in details:
-        print(detail)
-    return 1
-
-
-def cmd_comment(args):
-    repo, pr_number, _, _ = _read_event(args.event_path)
-    body = build_comment_body(args.syntax, args.tests, args.criteria, args.risk)
-    post_comment(repo, pr_number, args.token, body)
-    print(f'Commented on PR #{pr_number} in {repo}.')
-    return 0
-
-
-def cmd_merge(args):
-    repo, pr_number, _, _ = _read_event(args.event_path)
-    merge_pr(repo, pr_number, args.token)
-    print(f'Merged PR #{pr_number} in {repo}.')
-    return 0
-
-
-def build_parser():
-    parser = argparse.ArgumentParser(description='Agent PR CI helpers for timmy-home.')
-    sub = parser.add_subparsers(dest='command', required=True)
-
-    classify = sub.add_parser('classify-risk')
-    classify.add_argument('--files-file')
-    classify.add_argument('files', nargs='*')
-    classify.set_defaults(func=cmd_classify_risk)
-
-    validate = sub.add_parser('validate-pr')
-    validate.add_argument('--event-path', required=True)
-    validate.set_defaults(func=cmd_validate_pr)
-
-    comment = sub.add_parser('comment')
-    comment.add_argument('--event-path', required=True)
-    comment.add_argument('--token', required=True)
-    comment.add_argument('--syntax', required=True)
-    comment.add_argument('--tests', required=True)
-    comment.add_argument('--criteria', required=True)
-    comment.add_argument('--risk', required=True)
-    comment.set_defaults(func=cmd_comment)
-
-    merge = sub.add_parser('merge')
-    merge.add_argument('--event-path', required=True)
-    merge.add_argument('--token', required=True)
-    merge.set_defaults(func=cmd_merge)
-    return parser
-
-
-def main(argv=None):
-    parser = build_parser()
-    args = parser.parse_args(argv)
-    return args.func(args)
-
-
-if __name__ == '__main__':
-    sys.exit(main())
--- a/scripts/know_thy_father/epic_pipeline.py
+++ b/scripts/know_thy_father/epic_pipeline.py
@@ -0,0 +1,127 @@
+#!/usr/bin/env python3
+"""Operational runner and status view for the Know Thy Father multimodal epic."""
+
+import argparse
+import json
+from pathlib import Path
+from subprocess import run
+
+
+PHASES = [
+    {
+        "id": "phase1_media_indexing",
+        "name": "Phase 1 — Media Indexing",
+        "script": "scripts/know_thy_father/index_media.py",
+        "command_template": "python3 scripts/know_thy_father/index_media.py --tweets twitter-archive/extracted/tweets.jsonl --output twitter-archive/know-thy-father/media_manifest.jsonl",
+        "outputs": ["twitter-archive/know-thy-father/media_manifest.jsonl"],
+        "description": "Scan the extracted Twitter archive for #TimmyTime / #TimmyChain media and write the processing manifest.",
+    },
+    {
+        "id": "phase2_multimodal_analysis",
+        "name": "Phase 2 — Multimodal Analysis",
+        "script": "scripts/twitter_archive/analyze_media.py",
+        "command_template": "python3 scripts/twitter_archive/analyze_media.py --batch {batch_size}",
+        "outputs": [
+            "twitter-archive/know-thy-father/analysis.jsonl",
+            "twitter-archive/know-thy-father/meaning-kernels.jsonl",
+            "twitter-archive/know-thy-father/pipeline-status.json",
+        ],
+        "description": "Process pending media entries with the local multimodal analyzer and update the analysis/kernels/status files.",
+    },
+    {
+        "id": "phase3_holographic_synthesis",
+        "name": "Phase 3 — Holographic Synthesis",
+        "script": "scripts/know_thy_father/synthesize_kernels.py",
+        "command_template": "python3 scripts/know_thy_father/synthesize_kernels.py --input twitter-archive/media/manifest.jsonl --output twitter-archive/knowledge/fathers_ledger.jsonl --summary twitter-archive/knowledge/fathers_ledger.summary.json",
+        "outputs": [
+            "twitter-archive/knowledge/fathers_ledger.jsonl",
+            "twitter-archive/knowledge/fathers_ledger.summary.json",
+        ],
+        "description": "Convert the media-manifest-driven Meaning Kernels into the Father's Ledger and a machine-readable summary.",
+    },
+    {
+        "id": "phase4_cross_reference_audit",
+        "name": "Phase 4 — Cross-Reference Audit",
+        "script": "scripts/know_thy_father/crossref_audit.py",
+        "command_template": "python3 scripts/know_thy_father/crossref_audit.py --soul SOUL.md --kernels twitter-archive/notes/know_thy_father_crossref.md --output twitter-archive/notes/crossref_report.md",
+        "outputs": ["twitter-archive/notes/crossref_report.md"],
+        "description": "Compare Know Thy Father kernels against SOUL.md and related canon, then emit a Markdown audit report.",
+    },
+    {
+        "id": "phase5_processing_log",
+        "name": "Phase 5 — Processing Log / Status",
+        "script": "twitter-archive/know-thy-father/tracker.py",
+        "command_template": "python3 twitter-archive/know-thy-father/tracker.py report",
+        "outputs": ["twitter-archive/know-thy-father/REPORT.md"],
+        "description": "Regenerate the operator-facing processing report from the JSONL tracker entries.",
+    },
+]
+
+
+def build_pipeline_plan(batch_size: int = 10):
+    plan = []
+    for phase in PHASES:
+        plan.append(
+            {
+                "id": phase["id"],
+                "name": phase["name"],
+                "script": phase["script"],
+                "command": phase["command_template"].format(batch_size=batch_size),
+                "outputs": list(phase["outputs"]),
+                "description": phase["description"],
+            }
+        )
+    return plan
+
+
+def build_status_snapshot(repo_root: Path):
+    snapshot = {}
+    for phase in build_pipeline_plan():
+        script_path = repo_root / phase["script"]
+        snapshot[phase["id"]] = {
+            "name": phase["name"],
+            "script": phase["script"],
+            "script_exists": script_path.exists(),
+            "outputs": [
+                {
+                    "path": output,
+                    "exists": (repo_root / output).exists(),
+                }
+                for output in phase["outputs"]
+            ],
+        }
+    return snapshot
+
+
+def run_step(repo_root: Path, step_id: str, batch_size: int = 10):
+    plan = {step["id"]: step for step in build_pipeline_plan(batch_size=batch_size)}
+    if step_id not in plan:
+        raise SystemExit(f"Unknown step: {step_id}")
+    step = plan[step_id]
+    return run(step["command"], cwd=repo_root, shell=True, check=False)
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Know Thy Father epic orchestration helper")
+    parser.add_argument("--batch-size", type=int, default=10)
+    parser.add_argument("--status", action="store_true")
+    parser.add_argument("--run-step", default=None)
+    parser.add_argument("--json", action="store_true")
+    args = parser.parse_args()
+
+    repo_root = Path(__file__).resolve().parents[2]
+
+    if args.run_step:
+        result = run_step(repo_root, args.run_step, batch_size=args.batch_size)
+        raise SystemExit(result.returncode)
+
+    payload = build_status_snapshot(repo_root) if args.status else build_pipeline_plan(batch_size=args.batch_size)
+    if args.json or args.status:
+        print(json.dumps(payload, indent=2))
+    else:
+        for step in payload:
+            print(f"[{step['id']}] {step['command']}")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/mempalace_ezra_integration.py
+++ b/scripts/mempalace_ezra_integration.py
@@ -0,0 +1,159 @@
+#!/usr/bin/env python3
+"""Prepare a MemPalace v3.0.0 integration packet for Ezra's Hermes home."""
+
+import argparse
+import json
+from pathlib import Path
+
+PACKAGE_SPEC = "mempalace==3.0.0"
+DEFAULT_HERMES_HOME = "~/.hermes/"
+DEFAULT_SESSIONS_DIR = "~/.hermes/sessions/"
+DEFAULT_PALACE_PATH = "~/.mempalace/palace"
+DEFAULT_WING = "ezra_home"
+
+
+def build_yaml_template(wing: str, palace_path: str) -> str:
+    return (
+        f"wing: {wing}\n"
+        f"palace: {palace_path}\n"
+        "rooms:\n"
+        "  - name: sessions\n"
+        "    description: Conversation history and durable agent transcripts\n"
+        "    globs:\n"
+        "      - \"*.json\"\n"
+        "      - \"*.jsonl\"\n"
+        "  - name: config\n"
+        "    description: Hermes configuration and runtime settings\n"
+        "    globs:\n"
+        "      - \"*.yaml\"\n"
+        "      - \"*.yml\"\n"
+        "      - \"*.toml\"\n"
+        "  - name: docs\n"
+        "    description: Notes, markdown docs, and operating reports\n"
+        "    globs:\n"
+        "      - \"*.md\"\n"
+        "      - \"*.txt\"\n"
+        "people: []\n"
+        "projects: []\n"
+    )
+
+
+def build_plan(overrides: dict | None = None) -> dict:
+    overrides = overrides or {}
+    hermes_home = overrides.get("hermes_home", DEFAULT_HERMES_HOME)
+    sessions_dir = overrides.get("sessions_dir", DEFAULT_SESSIONS_DIR)
+    palace_path = overrides.get("palace_path", DEFAULT_PALACE_PATH)
+    wing = overrides.get("wing", DEFAULT_WING)
+    yaml_template = build_yaml_template(wing=wing, palace_path=palace_path)
+
+    config_home = hermes_home[:-1] if hermes_home.endswith("/") else hermes_home
+    plan = {
+        "package_spec": PACKAGE_SPEC,
+        "hermes_home": hermes_home,
+        "sessions_dir": sessions_dir,
+        "palace_path": palace_path,
+        "wing": wing,
+        "config_path": f"{config_home}/mempalace.yaml",
+        "install_command": f"pip install {PACKAGE_SPEC}",
+        "init_command": f"mempalace init {hermes_home} --yes",
+        "mine_home_command": f"echo \"\" | mempalace mine {hermes_home}",
+        "mine_sessions_command": f"echo \"\" | mempalace mine {sessions_dir} --mode convos",
+        "search_command": 'mempalace search "your common queries"',
+        "wake_up_command": "mempalace wake-up",
+        "mcp_command": "hermes mcp add mempalace -- python -m mempalace.mcp_server",
+        "yaml_template": yaml_template,
+        "gotchas": [
+            "`mempalace init` is still interactive in room approval flow; write mempalace.yaml manually if the init output stalls.",
+            "The yaml key is `wing:` not `wings:`. Using the wrong key causes mine/setup failures.",
+            "Pipe empty stdin into mining commands (`echo \"\" | ...`) to avoid the entity-detector stdin hang on larger directories.",
+            "First mine downloads the ChromaDB embedding model cache (~79MB).",
+            "Report Ezra's before/after metrics back to issue #568 after live installation and retrieval tests.",
+        ],
+    }
+    return plan
+
+
+def render_markdown(plan: dict) -> str:
+    gotchas = "\n".join(f"- {item}" for item in plan["gotchas"])
+    return f"""# MemPalace v3.0.0 — Ezra Integration Packet
+
+This packet turns issue #570 into an executable, reviewable integration plan for Ezra's Hermes home.
+It is a repo-side scaffold: no live Ezra host changes are claimed in this artifact.
+
+## Commands
+
+```bash
+{plan['install_command']}
+{plan['init_command']}
+cat > {plan['config_path']} <<'YAML'
+{plan['yaml_template'].rstrip()}
+YAML
+{plan['mine_home_command']}
+{plan['mine_sessions_command']}
+{plan['search_command']}
+{plan['wake_up_command']}
+{plan['mcp_command']}
+```
+
+## Manual config template
+
+```yaml
+{plan['yaml_template'].rstrip()}
+```
+
+## Why this shape
+
+- `wing: {plan['wing']}` matches the issue's Ezra-specific integration target.
+- `rooms` split the mined material into sessions, config, and docs to keep retrieval interpretable.
+- Mining commands pipe empty stdin to avoid the interactive entity-detector hang noted in the evaluation.
+
+## Gotchas
+
+{gotchas}
+
+## Report back to #568
+
+After live execution on Ezra's actual environment, post back to #568 with:
+- install result
+- mine duration and corpus size
+- 2-3 real search queries + retrieved results
+- wake-up context token count
+- whether MCP wiring succeeded
+
+## Honest scope boundary
+
+This repo artifact does **not** prove live installation on Ezra's host. It makes the work reproducible and testable so the next pass can execute it without guesswork.
+"""
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(description="Prepare the MemPalace Ezra integration packet")
+    parser.add_argument("--hermes-home", default=DEFAULT_HERMES_HOME)
+    parser.add_argument("--sessions-dir", default=DEFAULT_SESSIONS_DIR)
+    parser.add_argument("--palace-path", default=DEFAULT_PALACE_PATH)
+    parser.add_argument("--wing", default=DEFAULT_WING)
+    parser.add_argument("--output", default=None)
+    parser.add_argument("--json", action="store_true")
+    args = parser.parse_args()
+
+    plan = build_plan(
+        {
+            "hermes_home": args.hermes_home,
+            "sessions_dir": args.sessions_dir,
+            "palace_path": args.palace_path,
+            "wing": args.wing,
+        }
+    )
+    rendered = json.dumps(plan, indent=2) if args.json else render_markdown(plan)
+
+    if args.output:
+        output_path = Path(args.output).expanduser()
+        output_path.parent.mkdir(parents=True, exist_ok=True)
+        output_path.write_text(rendered, encoding="utf-8")
+        print(f"MemPalace integration packet written to {output_path}")
+    else:
+        print(rendered)
+
+
+if __name__ == "__main__":
+    main()
--- a/tests/test_agent_pr_gate.py
+++ b/tests/test_agent_pr_gate.py
@@ -1,68 +0,0 @@
-import pathlib
-import sys
-import tempfile
-import unittest
-
-ROOT = pathlib.Path(__file__).resolve().parents[1]
-sys.path.insert(0, str(ROOT / 'scripts'))
-
-import agent_pr_gate  # noqa: E402
-
-
-class TestAgentPrGate(unittest.TestCase):
-    def test_classify_risk_low_for_docs_and_tests_only(self):
-        level = agent_pr_gate.classify_risk([
-            'docs/runbook.md',
-            'reports/daily-summary.md',
-            'tests/test_agent_pr_gate.py',
-        ])
-        self.assertEqual(level, 'low')
-
-    def test_classify_risk_high_for_operational_paths(self):
-        level = agent_pr_gate.classify_risk([
-            'scripts/failover_monitor.py',
-            'deploy/playbook.yml',
-        ])
-        self.assertEqual(level, 'high')
-
-    def test_validate_pr_body_requires_issue_ref_and_verification(self):
-        ok, details = agent_pr_gate.validate_pr_body(
-            'feat: add thing',
-            'What changed only\n\nNo verification section here.'
-        )
-        self.assertFalse(ok)
-        self.assertIn('issue reference', ' '.join(details).lower())
-        self.assertIn('verification', ' '.join(details).lower())
-
-    def test_validate_pr_body_accepts_issue_ref_and_verification(self):
-        ok, details = agent_pr_gate.validate_pr_body(
-            'feat: add thing (#562)',
-            'Refs #562\n\nVerification:\n- pytest -q\n'
-        )
-        self.assertTrue(ok)
-        self.assertEqual(details, [])
-
-    def test_build_comment_body_reports_failures_and_human_review(self):
-        body = agent_pr_gate.build_comment_body(
-            syntax_status='success',
-            tests_status='failure',
-            criteria_status='success',
-            risk_level='high',
-        )
-        self.assertIn('tests', body.lower())
-        self.assertIn('failure', body.lower())
-        self.assertIn('human review', body.lower())
-
-    def test_changed_files_file_loader_ignores_blanks(self):
-        with tempfile.NamedTemporaryFile('w+', delete=False) as handle:
-            handle.write('docs/one.md\n\nreports/two.md\n')
-            path = handle.name
-        try:
-            files = agent_pr_gate.read_changed_files(path)
-        finally:
-            pathlib.Path(path).unlink(missing_ok=True)
-        self.assertEqual(files, ['docs/one.md', 'reports/two.md'])
-
-
-if __name__ == '__main__':
-    unittest.main()
--- a/tests/test_agent_pr_workflow.py
+++ b/tests/test_agent_pr_workflow.py
@@ -1,24 +0,0 @@
-import pathlib
-import unittest
-import yaml
-
-ROOT = pathlib.Path(__file__).resolve().parents[1]
-WORKFLOW = ROOT / '.gitea' / 'workflows' / 'agent-pr-gate.yml'
-
-
-class TestAgentPrWorkflow(unittest.TestCase):
-    def test_workflow_exists(self):
-        self.assertTrue(WORKFLOW.exists(), 'agent-pr-gate workflow should exist')
-
-    def test_workflow_has_pr_gate_and_reporting_jobs(self):
-        data = yaml.safe_load(WORKFLOW.read_text(encoding='utf-8'))
-        self.assertIn('pull_request', data.get('on', {}))
-        jobs = data.get('jobs', {})
-        self.assertIn('gate', jobs)
-        self.assertIn('report', jobs)
-        report_steps = jobs['report']['steps']
-        self.assertTrue(any('Auto-merge low-risk clean PRs' in (step.get('name') or '') for step in report_steps))
-
-
-if __name__ == '__main__':
-    unittest.main()
--- a/tests/test_know_thy_father_pipeline.py
+++ b/tests/test_know_thy_father_pipeline.py
@@ -0,0 +1,76 @@
+from pathlib import Path
+import importlib.util
+import unittest
+
+
+ROOT = Path(__file__).resolve().parent.parent
+SCRIPT_PATH = ROOT / "scripts" / "know_thy_father" / "epic_pipeline.py"
+DOC_PATH = ROOT / "docs" / "KNOW_THY_FATHER_MULTIMODAL_PIPELINE.md"
+
+
+def load_module(path: Path, name: str):
+    assert path.exists(), f"missing {path.relative_to(ROOT)}"
+    spec = importlib.util.spec_from_file_location(name, path)
+    assert spec and spec.loader
+    module = importlib.util.module_from_spec(spec)
+    spec.loader.exec_module(module)
+    return module
+
+
+class TestKnowThyFatherEpicPipeline(unittest.TestCase):
+    def test_build_pipeline_plan_contains_all_phases_in_order(self):
+        mod = load_module(SCRIPT_PATH, "ktf_epic_pipeline")
+        plan = mod.build_pipeline_plan(batch_size=10)
+
+        self.assertEqual(
+            [step["id"] for step in plan],
+            [
+                "phase1_media_indexing",
+                "phase2_multimodal_analysis",
+                "phase3_holographic_synthesis",
+                "phase4_cross_reference_audit",
+                "phase5_processing_log",
+            ],
+        )
+        self.assertIn("scripts/know_thy_father/index_media.py", plan[0]["command"])
+        self.assertIn("scripts/twitter_archive/analyze_media.py --batch 10", plan[1]["command"])
+        self.assertIn("scripts/know_thy_father/synthesize_kernels.py", plan[2]["command"])
+        self.assertIn("scripts/know_thy_father/crossref_audit.py", plan[3]["command"])
+        self.assertIn("twitter-archive/know-thy-father/tracker.py report", plan[4]["command"])
+
+    def test_status_snapshot_reports_key_artifact_paths(self):
+        mod = load_module(SCRIPT_PATH, "ktf_epic_pipeline")
+        status = mod.build_status_snapshot(ROOT)
+
+        self.assertIn("phase1_media_indexing", status)
+        self.assertIn("phase2_multimodal_analysis", status)
+        self.assertIn("phase3_holographic_synthesis", status)
+        self.assertIn("phase4_cross_reference_audit", status)
+        self.assertIn("phase5_processing_log", status)
+        self.assertEqual(status["phase1_media_indexing"]["script"], "scripts/know_thy_father/index_media.py")
+        self.assertEqual(status["phase2_multimodal_analysis"]["script"], "scripts/twitter_archive/analyze_media.py")
+        self.assertEqual(status["phase5_processing_log"]["script"], "twitter-archive/know-thy-father/tracker.py")
+        self.assertTrue(status["phase1_media_indexing"]["script_exists"])
+        self.assertTrue(status["phase2_multimodal_analysis"]["script_exists"])
+        self.assertTrue(status["phase3_holographic_synthesis"]["script_exists"])
+        self.assertTrue(status["phase4_cross_reference_audit"]["script_exists"])
+        self.assertTrue(status["phase5_processing_log"]["script_exists"])
+
+    def test_repo_contains_multimodal_pipeline_doc(self):
+        self.assertTrue(DOC_PATH.exists(), "missing committed Know Thy Father pipeline doc")
+        text = DOC_PATH.read_text(encoding="utf-8")
+        required = [
+            "# Know Thy Father — Multimodal Media Consumption Pipeline",
+            "scripts/know_thy_father/index_media.py",
+            "scripts/twitter_archive/analyze_media.py --batch 10",
+            "scripts/know_thy_father/synthesize_kernels.py",
+            "scripts/know_thy_father/crossref_audit.py",
+            "twitter-archive/know-thy-father/tracker.py report",
+            "Refs #582",
+        ]
+        for snippet in required:
+            self.assertIn(snippet, text)
+
+
+if __name__ == "__main__":
+    unittest.main()
--- a/tests/test_mempalace_ezra_integration.py
+++ b/tests/test_mempalace_ezra_integration.py
@@ -0,0 +1,68 @@
+from pathlib import Path
+import importlib.util
+import unittest
+
+
+ROOT = Path(__file__).resolve().parent.parent
+SCRIPT_PATH = ROOT / "scripts" / "mempalace_ezra_integration.py"
+DOC_PATH = ROOT / "docs" / "MEMPALACE_EZRA_INTEGRATION.md"
+
+
+def load_module(path: Path, name: str):
+    assert path.exists(), f"missing {path.relative_to(ROOT)}"
+    spec = importlib.util.spec_from_file_location(name, path)
+    assert spec and spec.loader
+    module = importlib.util.module_from_spec(spec)
+    spec.loader.exec_module(module)
+    return module
+
+
+class TestMempalaceEzraIntegration(unittest.TestCase):
+    def test_build_plan_contains_issue_required_steps_and_gotchas(self):
+        mod = load_module(SCRIPT_PATH, "mempalace_ezra_integration")
+        plan = mod.build_plan({})
+
+        self.assertEqual(plan["package_spec"], "mempalace==3.0.0")
+        self.assertIn("pip install mempalace==3.0.0", plan["install_command"])
+        self.assertEqual(plan["wing"], "ezra_home")
+        self.assertIn('echo "" | mempalace mine ~/.hermes/', plan["mine_home_command"])
+        self.assertIn('--mode convos', plan["mine_sessions_command"])
+        self.assertIn('mempalace wake-up', plan["wake_up_command"])
+        self.assertIn('hermes mcp add mempalace -- python -m mempalace.mcp_server', plan["mcp_command"])
+        self.assertIn('wing:', plan["yaml_template"])
+        self.assertTrue(any('stdin' in item.lower() for item in plan["gotchas"]))
+        self.assertTrue(any('wing:' in item for item in plan["gotchas"]))
+
+    def test_build_plan_accepts_path_and_wing_overrides(self):
+        mod = load_module(SCRIPT_PATH, "mempalace_ezra_integration")
+        plan = mod.build_plan(
+            {
+                "hermes_home": "/root/wizards/ezra/home",
+                "sessions_dir": "/root/wizards/ezra/home/sessions",
+                "wing": "ezra_archive",
+            }
+        )
+
+        self.assertEqual(plan["wing"], "ezra_archive")
+        self.assertIn('/root/wizards/ezra/home', plan["mine_home_command"])
+        self.assertIn('/root/wizards/ezra/home/sessions', plan["mine_sessions_command"])
+        self.assertIn('wing: ezra_archive', plan["yaml_template"])
+
+    def test_repo_contains_mem_palace_ezra_doc(self):
+        self.assertTrue(DOC_PATH.exists(), "missing committed MemPalace Ezra integration doc")
+        text = DOC_PATH.read_text(encoding="utf-8")
+        required = [
+            "# MemPalace v3.0.0 — Ezra Integration Packet",
+            "pip install mempalace==3.0.0",
+            'echo "" | mempalace mine ~/.hermes/',
+            "mempalace mine ~/.hermes/sessions/ --mode convos",
+            "mempalace wake-up",
+            "hermes mcp add mempalace -- python -m mempalace.mcp_server",
+            "Report back to #568",
+        ]
+        for snippet in required:
+            self.assertIn(snippet, text)
+
+
+if __name__ == "__main__":
+    unittest.main()
Author	SHA1	Message	Date
Alexander Whitestone	89dfa1e5de	feat: add Know Thy Father epic orchestrator (#582 ) Some checks failed Smoke Test / smoke (pull_request) Failing after 23s Details	2026-04-15 01:52:58 -04:00
Alexander Whitestone	d791c087cb	feat: add Ezra mempalace integration packet (#570 ) Some checks failed Smoke Test / smoke (pull_request) Failing after 22s Details	2026-04-15 01:37:47 -04:00