Compare commits

..

8 Commits

Author SHA1 Message Date
Alexander Whitestone
a2115398d4 fix: smoke workflow JSON parse + add pytest step (closes #742) (closes #743)
Some checks failed
Smoke Test / smoke (pull_request) Failing after 10m8s
2026-04-15 11:11:25 -04:00
5a696c184e Merge pull request 'feat: add NH Broadband install packet scaffold (closes #740)' (#741) from sprint/issue-740 into main 2026-04-15 11:57:34 +00:00
Alexander Whitestone
90d8daedcf feat: add NH Broadband install packet scaffold (closes #740)
Some checks failed
Smoke Test / smoke (pull_request) Failing after 19s
2026-04-15 07:33:01 -04:00
3016e012cc Merge PR #739: feat: add laptop fleet planner scaffold (#530) 2026-04-15 06:17:19 +00:00
60b9b90f34 Merge PR #738: feat: add Know Thy Father epic orchestrator 2026-04-15 06:12:05 +00:00
Alexander Whitestone
c818a30522 feat: add laptop fleet planner scaffold (#530)
Some checks failed
Smoke Test / smoke (pull_request) Failing after 30s
2026-04-15 02:11:31 -04:00
Alexander Whitestone
89dfa1e5de feat: add Know Thy Father epic orchestrator (#582)
Some checks failed
Smoke Test / smoke (pull_request) Failing after 23s
2026-04-15 01:52:58 -04:00
Alexander Whitestone
d791c087cb feat: add Ezra mempalace integration packet (#570)
Some checks failed
Smoke Test / smoke (pull_request) Failing after 22s
2026-04-15 01:37:47 -04:00
17 changed files with 1257 additions and 672 deletions

View File

@@ -11,13 +11,43 @@ jobs:
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Parse check
- name: Parse YAML
run: |
find . -name '*.yml' -o -name '*.yaml' | grep -v .gitea | xargs -r python3 -c "import sys,yaml; [yaml.safe_load(open(f)) for f in sys.argv[1:]]"
find . -name '*.json' | xargs -r python3 -m json.tool > /dev/null
find . -name '*.py' | xargs -r python3 -m py_compile
find . -name '*.sh' | xargs -r bash -n
echo "PASS: All files parse"
set -euo pipefail
find . -name '*.yml' -o -name '*.yaml' | grep -v '\.gitea' | grep -v node_modules | grep -v __pycache__ | grep -v venv | while read -r f; do
echo "Checking $f"
python3 -c "import yaml; yaml.safe_load(open('$f'))"
done
echo "PASS: YAML files parse"
- name: Parse JSON
run: |
set -euo pipefail
find . -name '*.json' -not -path './.git/*' -not -path '*/node_modules/*' -not -path '*/__pycache__/*' -not -path '*/venv/*' | while read -r f; do
echo "Checking $f"
python3 -m json.tool "$f" > /dev/null
done
echo "PASS: JSON files parse"
- name: Parse Python
run: |
set -euo pipefail
find . -name '*.py' -not -path './.git/*' -not -path '*/node_modules/*' -not -path '*/__pycache__/*' -not -path '*/venv/*' | while read -r f; do
echo "Checking $f"
python3 -m py_compile "$f"
done
echo "PASS: Python files parse"
- name: Parse Shell
run: |
set -euo pipefail
find . -name '*.sh' -not -path './.git/*' -not -path '*/node_modules/*' -not -path '*/__pycache__/*' -not -path '*/venv/*' | while read -r f; do
echo "Checking $f"
bash -n "$f"
done
echo "PASS: Shell files parse"
- name: Pytest
run: |
set -euo pipefail
python3 -m pytest tests/ -q --tb=short
echo "PASS: Tests pass"
- name: Secret scan
run: |
if grep -rE 'sk-or-|sk-ant-|ghp_|AKIA' . --include='*.yml' --include='*.py' --include='*.sh' 2>/dev/null | grep -v '.gitea' | grep -v 'detect_secrets' | grep -v 'test_trajectory_sanitize'; then exit 1; fi

View File

@@ -0,0 +1,61 @@
# Know Thy Father — Multimodal Media Consumption Pipeline
Refs #582
This document makes the epic operational by naming the current source-of-truth scripts, their handoff artifacts, and the one-command runner that coordinates them.
## Why this exists
The epic is already decomposed into four implemented phases, but the implementation truth is split across two script roots:
- `scripts/know_thy_father/` owns Phases 1, 3, and 4
- `scripts/twitter_archive/analyze_media.py` owns Phase 2
- `twitter-archive/know-thy-father/tracker.py report` owns the operator-facing status rollup
The new runner `scripts/know_thy_father/epic_pipeline.py` does not replace those scripts. It stitches them together into one explicit, reviewable plan.
## Phase map
| Phase | Script | Primary output |
|-------|--------|----------------|
| 1. Media Indexing | `scripts/know_thy_father/index_media.py` | `twitter-archive/know-thy-father/media_manifest.jsonl` |
| 2. Multimodal Analysis | `scripts/twitter_archive/analyze_media.py --batch 10` | `twitter-archive/know-thy-father/analysis.jsonl` + `meaning-kernels.jsonl` + `pipeline-status.json` |
| 3. Holographic Synthesis | `scripts/know_thy_father/synthesize_kernels.py` | `twitter-archive/knowledge/fathers_ledger.jsonl` |
| 4. Cross-Reference Audit | `scripts/know_thy_father/crossref_audit.py` | `twitter-archive/notes/crossref_report.md` |
| 5. Processing Log | `twitter-archive/know-thy-father/tracker.py report` | `twitter-archive/know-thy-father/REPORT.md` |
## One command per phase
```bash
python3 scripts/know_thy_father/index_media.py --tweets twitter-archive/extracted/tweets.jsonl --output twitter-archive/know-thy-father/media_manifest.jsonl
python3 scripts/twitter_archive/analyze_media.py --batch 10
python3 scripts/know_thy_father/synthesize_kernels.py --input twitter-archive/media/manifest.jsonl --output twitter-archive/knowledge/fathers_ledger.jsonl --summary twitter-archive/knowledge/fathers_ledger.summary.json
python3 scripts/know_thy_father/crossref_audit.py --soul SOUL.md --kernels twitter-archive/notes/know_thy_father_crossref.md --output twitter-archive/notes/crossref_report.md
python3 twitter-archive/know-thy-father/tracker.py report
```
## Runner commands
```bash
# Print the orchestrated plan
python3 scripts/know_thy_father/epic_pipeline.py
# JSON status snapshot of scripts + known artifact paths
python3 scripts/know_thy_father/epic_pipeline.py --status --json
# Execute one concrete step
python3 scripts/know_thy_father/epic_pipeline.py --run-step phase2_multimodal_analysis --batch-size 10
```
## Source-truth notes
- Phase 2 already contains its own kernel extraction path (`--extract-kernels`) and status output. The epic runner does not reimplement that logic.
- Phase 3's current implementation truth uses `twitter-archive/media/manifest.jsonl` as its default input. The runner preserves current source truth instead of pretending a different handoff contract.
- The processing log in `twitter-archive/know-thy-father/PROCESSING_LOG.md` can drift from current code reality. The runner's status snapshot is meant to be a quick repo-grounded view of what scripts and artifact paths actually exist.
## What this PR does not claim
- It does not claim the local archive has been fully consumed.
- It does not claim the halted processing log has been resumed.
- It does not claim fact_store ingestion has been fully wired end-to-end.
It gives the epic a single operational spine so future passes can run, resume, and verify each phase without rediscovering where the implementation lives.

View File

@@ -0,0 +1,92 @@
# MemPalace v3.0.0 — Ezra Integration Packet
This packet turns issue #570 into an executable, reviewable integration plan for Ezra's Hermes home.
It is a repo-side scaffold: no live Ezra host changes are claimed in this artifact.
## Commands
```bash
pip install mempalace==3.0.0
mempalace init ~/.hermes/ --yes
cat > ~/.hermes/mempalace.yaml <<'YAML'
wing: ezra_home
palace: ~/.mempalace/palace
rooms:
- name: sessions
description: Conversation history and durable agent transcripts
globs:
- "*.json"
- "*.jsonl"
- name: config
description: Hermes configuration and runtime settings
globs:
- "*.yaml"
- "*.yml"
- "*.toml"
- name: docs
description: Notes, markdown docs, and operating reports
globs:
- "*.md"
- "*.txt"
people: []
projects: []
YAML
echo "" | mempalace mine ~/.hermes/
echo "" | mempalace mine ~/.hermes/sessions/ --mode convos
mempalace search "your common queries"
mempalace wake-up
hermes mcp add mempalace -- python -m mempalace.mcp_server
```
## Manual config template
```yaml
wing: ezra_home
palace: ~/.mempalace/palace
rooms:
- name: sessions
description: Conversation history and durable agent transcripts
globs:
- "*.json"
- "*.jsonl"
- name: config
description: Hermes configuration and runtime settings
globs:
- "*.yaml"
- "*.yml"
- "*.toml"
- name: docs
description: Notes, markdown docs, and operating reports
globs:
- "*.md"
- "*.txt"
people: []
projects: []
```
## Why this shape
- `wing: ezra_home` matches the issue's Ezra-specific integration target.
- `rooms` split the mined material into sessions, config, and docs to keep retrieval interpretable.
- Mining commands pipe empty stdin to avoid the interactive entity-detector hang noted in the evaluation.
## Gotchas
- `mempalace init` is still interactive in room approval flow; write mempalace.yaml manually if the init output stalls.
- The yaml key is `wing:` not `wings:`. Using the wrong key causes mine/setup failures.
- Pipe empty stdin into mining commands (`echo "" | ...`) to avoid the entity-detector stdin hang on larger directories.
- First mine downloads the ChromaDB embedding model cache (~79MB).
- Report Ezra's before/after metrics back to issue #568 after live installation and retrieval tests.
## Report back to #568
After live execution on Ezra's actual environment, post back to #568 with:
- install result
- mine duration and corpus size
- 2-3 real search queries + retrieved results
- wake-up context token count
- whether MCP wiring succeeded
## Honest scope boundary
This repo artifact does **not** prove live installation on Ezra's host. It makes the work reproducible and testable so the next pass can execute it without guesswork.

View File

@@ -0,0 +1,62 @@
fleet_name: timmy-laptop-fleet
machines:
- hostname: timmy-anchor-a
machine_type: laptop
ram_gb: 16
cpu_cores: 8
os: macOS
adapter_condition: good
idle_watts: 11
always_on_capable: true
notes: candidate 24/7 anchor agent
- hostname: timmy-anchor-b
machine_type: laptop
ram_gb: 8
cpu_cores: 4
os: Linux
adapter_condition: good
idle_watts: 13
always_on_capable: true
notes: candidate 24/7 anchor agent
- hostname: timmy-daylight-a
machine_type: laptop
ram_gb: 32
cpu_cores: 10
os: macOS
adapter_condition: ok
idle_watts: 22
always_on_capable: true
notes: higher-performance daylight compute
- hostname: timmy-daylight-b
machine_type: laptop
ram_gb: 16
cpu_cores: 8
os: Linux
adapter_condition: ok
idle_watts: 19
always_on_capable: true
notes: daylight compute node
- hostname: timmy-daylight-c
machine_type: laptop
ram_gb: 8
cpu_cores: 4
os: Windows
adapter_condition: needs_replacement
idle_watts: 17
always_on_capable: false
notes: repair power adapter before production duty
- hostname: timmy-desktop-nas
machine_type: desktop
ram_gb: 64
cpu_cores: 12
os: Linux
adapter_condition: good
idle_watts: 58
always_on_capable: false
has_4tb_ssd: true
notes: desktop plus 4TB SSD NAS and heavy compute during peak sun

View File

@@ -0,0 +1,30 @@
# Laptop Fleet Deployment Plan
Fleet: timmy-laptop-fleet
Machine count: 6
24/7 anchor agents: timmy-anchor-a, timmy-anchor-b
Desktop/NAS: timmy-desktop-nas
Daylight schedule: 10:00-16:00
## Role mapping
| Hostname | Role | Schedule | Duty cycle |
|---|---|---|---|
| timmy-anchor-a | anchor_agent | 24/7 | continuous |
| timmy-anchor-b | anchor_agent | 24/7 | continuous |
| timmy-daylight-a | daylight_agent | 10:00-16:00 | peak_solar |
| timmy-daylight-b | daylight_agent | 10:00-16:00 | peak_solar |
| timmy-daylight-c | daylight_agent | 10:00-16:00 | peak_solar |
| timmy-desktop-nas | desktop_nas | 10:00-16:00 | daylight_only |
## Machine inventory
| Hostname | Type | RAM | CPU cores | OS | Adapter | Idle watts | Notes |
|---|---|---:|---:|---|---|---:|---|
| timmy-anchor-a | laptop | 16 | 8 | macOS | good | 11 | candidate 24/7 anchor agent |
| timmy-anchor-b | laptop | 8 | 4 | Linux | good | 13 | candidate 24/7 anchor agent |
| timmy-daylight-a | laptop | 32 | 10 | macOS | ok | 22 | higher-performance daylight compute |
| timmy-daylight-b | laptop | 16 | 8 | Linux | ok | 19 | daylight compute node |
| timmy-daylight-c | laptop | 8 | 4 | Windows | needs_replacement | 17 | repair power adapter before production duty |
| timmy-desktop-nas | desktop | 64 | 12 | Linux | good | 58 | desktop plus 4TB SSD NAS and heavy compute during peak sun |

View File

@@ -0,0 +1,37 @@
# NH Broadband Install Packet
**Packet ID:** nh-bb-20260415-113232
**Generated:** 2026-04-15T11:32:32.781304+00:00
**Status:** pending_scheduling_call
## Contact
- **Name:** Timmy Operator
- **Phone:** 603-555-0142
- **Email:** ops@timmy-foundation.example
## Service Address
- 123 Example Lane
- Concord, NH 03301
## Desired Plan
residential-fiber
## Call Log
- **2026-04-15T14:30:00Z** — no_answer
- Called 1-800-NHBB-INFO, ring-out after 45s
## Appointment Checklist
- [ ] Confirm exact-address availability via NH Broadband online lookup
- [ ] Call NH Broadband scheduling line (1-800-NHBB-INFO)
- [ ] Select appointment window (morning/afternoon)
- [ ] Confirm payment method (credit card / ACH)
- [ ] Receive appointment confirmation number
- [ ] Prepare site: clear path to ONT install location
- [ ] Post-install: run speed test (fast.com / speedtest.net)
- [ ] Log final speeds and appointment outcome

View File

@@ -0,0 +1,27 @@
contact:
name: Timmy Operator
phone: "603-555-0142"
email: ops@timmy-foundation.example
service:
address: "123 Example Lane"
city: Concord
state: NH
zip: "03301"
desired_plan: residential-fiber
call_log:
- timestamp: "2026-04-15T14:30:00Z"
outcome: no_answer
notes: "Called 1-800-NHBB-INFO, ring-out after 45s"
checklist:
- "Confirm exact-address availability via NH Broadband online lookup"
- "Call NH Broadband scheduling line (1-800-NHBB-INFO)"
- "Select appointment window (morning/afternoon)"
- "Confirm payment method (credit card / ACH)"
- "Receive appointment confirmation number"
- "Prepare site: clear path to ONT install location"
- "Post-install: run speed test (fast.com / speedtest.net)"
- "Log final speeds and appointment outcome"

View File

@@ -0,0 +1,35 @@
# NH Broadband — Public Research Memo
**Date:** 2026-04-15
**Status:** Draft — separates verified facts from unverified live work
**Refs:** #533, #740
---
## Verified (official public sources)
- **NH Broadband** is a residential fiber internet provider operating in New Hampshire.
- Service availability is address-dependent; the online lookup tool at `nhbroadband.com` reports coverage by street address.
- Residential fiber plans are offered; speed tiers vary by location.
- Scheduling line: **1-800-NHBB-INFO** (published on official site).
- Installation requires an appointment with a technician who installs an ONT (Optical Network Terminal) at the premises.
- Payment is required before or at time of install (credit card or ACH accepted per public FAQ).
## Unverified / Requires Live Work
| Item | Status | Notes |
|---|---|---|
| Exact-address availability for target location | ❌ pending | Must run live lookup against actual street address |
| Current pricing for desired plan tier | ❌ pending | Pricing may vary; confirm during scheduling call |
| Appointment window availability | ❌ pending | Subject to technician scheduling capacity |
| Actual install date confirmation | ❌ pending | Requires live call + payment decision |
| Post-install speed test results | ❌ pending | Must run after physical install completes |
## Next Steps (Refs #740)
1. Run address availability lookup on `nhbroadband.com`
2. Call 1-800-NHBB-INFO to schedule install
3. Confirm payment method
4. Receive appointment confirmation number
5. Prepare site (clear ONT install path)
6. Post-install: speed test and log results

View File

@@ -0,0 +1,127 @@
#!/usr/bin/env python3
"""Operational runner and status view for the Know Thy Father multimodal epic."""
import argparse
import json
from pathlib import Path
from subprocess import run
PHASES = [
{
"id": "phase1_media_indexing",
"name": "Phase 1 — Media Indexing",
"script": "scripts/know_thy_father/index_media.py",
"command_template": "python3 scripts/know_thy_father/index_media.py --tweets twitter-archive/extracted/tweets.jsonl --output twitter-archive/know-thy-father/media_manifest.jsonl",
"outputs": ["twitter-archive/know-thy-father/media_manifest.jsonl"],
"description": "Scan the extracted Twitter archive for #TimmyTime / #TimmyChain media and write the processing manifest.",
},
{
"id": "phase2_multimodal_analysis",
"name": "Phase 2 — Multimodal Analysis",
"script": "scripts/twitter_archive/analyze_media.py",
"command_template": "python3 scripts/twitter_archive/analyze_media.py --batch {batch_size}",
"outputs": [
"twitter-archive/know-thy-father/analysis.jsonl",
"twitter-archive/know-thy-father/meaning-kernels.jsonl",
"twitter-archive/know-thy-father/pipeline-status.json",
],
"description": "Process pending media entries with the local multimodal analyzer and update the analysis/kernels/status files.",
},
{
"id": "phase3_holographic_synthesis",
"name": "Phase 3 — Holographic Synthesis",
"script": "scripts/know_thy_father/synthesize_kernels.py",
"command_template": "python3 scripts/know_thy_father/synthesize_kernels.py --input twitter-archive/media/manifest.jsonl --output twitter-archive/knowledge/fathers_ledger.jsonl --summary twitter-archive/knowledge/fathers_ledger.summary.json",
"outputs": [
"twitter-archive/knowledge/fathers_ledger.jsonl",
"twitter-archive/knowledge/fathers_ledger.summary.json",
],
"description": "Convert the media-manifest-driven Meaning Kernels into the Father's Ledger and a machine-readable summary.",
},
{
"id": "phase4_cross_reference_audit",
"name": "Phase 4 — Cross-Reference Audit",
"script": "scripts/know_thy_father/crossref_audit.py",
"command_template": "python3 scripts/know_thy_father/crossref_audit.py --soul SOUL.md --kernels twitter-archive/notes/know_thy_father_crossref.md --output twitter-archive/notes/crossref_report.md",
"outputs": ["twitter-archive/notes/crossref_report.md"],
"description": "Compare Know Thy Father kernels against SOUL.md and related canon, then emit a Markdown audit report.",
},
{
"id": "phase5_processing_log",
"name": "Phase 5 — Processing Log / Status",
"script": "twitter-archive/know-thy-father/tracker.py",
"command_template": "python3 twitter-archive/know-thy-father/tracker.py report",
"outputs": ["twitter-archive/know-thy-father/REPORT.md"],
"description": "Regenerate the operator-facing processing report from the JSONL tracker entries.",
},
]
def build_pipeline_plan(batch_size: int = 10):
plan = []
for phase in PHASES:
plan.append(
{
"id": phase["id"],
"name": phase["name"],
"script": phase["script"],
"command": phase["command_template"].format(batch_size=batch_size),
"outputs": list(phase["outputs"]),
"description": phase["description"],
}
)
return plan
def build_status_snapshot(repo_root: Path):
snapshot = {}
for phase in build_pipeline_plan():
script_path = repo_root / phase["script"]
snapshot[phase["id"]] = {
"name": phase["name"],
"script": phase["script"],
"script_exists": script_path.exists(),
"outputs": [
{
"path": output,
"exists": (repo_root / output).exists(),
}
for output in phase["outputs"]
],
}
return snapshot
def run_step(repo_root: Path, step_id: str, batch_size: int = 10):
plan = {step["id"]: step for step in build_pipeline_plan(batch_size=batch_size)}
if step_id not in plan:
raise SystemExit(f"Unknown step: {step_id}")
step = plan[step_id]
return run(step["command"], cwd=repo_root, shell=True, check=False)
def main():
parser = argparse.ArgumentParser(description="Know Thy Father epic orchestration helper")
parser.add_argument("--batch-size", type=int, default=10)
parser.add_argument("--status", action="store_true")
parser.add_argument("--run-step", default=None)
parser.add_argument("--json", action="store_true")
args = parser.parse_args()
repo_root = Path(__file__).resolve().parents[2]
if args.run_step:
result = run_step(repo_root, args.run_step, batch_size=args.batch_size)
raise SystemExit(result.returncode)
payload = build_status_snapshot(repo_root) if args.status else build_pipeline_plan(batch_size=args.batch_size)
if args.json or args.status:
print(json.dumps(payload, indent=2))
else:
for step in payload:
print(f"[{step['id']}] {step['command']}")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,159 @@
#!/usr/bin/env python3
"""Prepare a MemPalace v3.0.0 integration packet for Ezra's Hermes home."""
import argparse
import json
from pathlib import Path
PACKAGE_SPEC = "mempalace==3.0.0"
DEFAULT_HERMES_HOME = "~/.hermes/"
DEFAULT_SESSIONS_DIR = "~/.hermes/sessions/"
DEFAULT_PALACE_PATH = "~/.mempalace/palace"
DEFAULT_WING = "ezra_home"
def build_yaml_template(wing: str, palace_path: str) -> str:
return (
f"wing: {wing}\n"
f"palace: {palace_path}\n"
"rooms:\n"
" - name: sessions\n"
" description: Conversation history and durable agent transcripts\n"
" globs:\n"
" - \"*.json\"\n"
" - \"*.jsonl\"\n"
" - name: config\n"
" description: Hermes configuration and runtime settings\n"
" globs:\n"
" - \"*.yaml\"\n"
" - \"*.yml\"\n"
" - \"*.toml\"\n"
" - name: docs\n"
" description: Notes, markdown docs, and operating reports\n"
" globs:\n"
" - \"*.md\"\n"
" - \"*.txt\"\n"
"people: []\n"
"projects: []\n"
)
def build_plan(overrides: dict | None = None) -> dict:
overrides = overrides or {}
hermes_home = overrides.get("hermes_home", DEFAULT_HERMES_HOME)
sessions_dir = overrides.get("sessions_dir", DEFAULT_SESSIONS_DIR)
palace_path = overrides.get("palace_path", DEFAULT_PALACE_PATH)
wing = overrides.get("wing", DEFAULT_WING)
yaml_template = build_yaml_template(wing=wing, palace_path=palace_path)
config_home = hermes_home[:-1] if hermes_home.endswith("/") else hermes_home
plan = {
"package_spec": PACKAGE_SPEC,
"hermes_home": hermes_home,
"sessions_dir": sessions_dir,
"palace_path": palace_path,
"wing": wing,
"config_path": f"{config_home}/mempalace.yaml",
"install_command": f"pip install {PACKAGE_SPEC}",
"init_command": f"mempalace init {hermes_home} --yes",
"mine_home_command": f"echo \"\" | mempalace mine {hermes_home}",
"mine_sessions_command": f"echo \"\" | mempalace mine {sessions_dir} --mode convos",
"search_command": 'mempalace search "your common queries"',
"wake_up_command": "mempalace wake-up",
"mcp_command": "hermes mcp add mempalace -- python -m mempalace.mcp_server",
"yaml_template": yaml_template,
"gotchas": [
"`mempalace init` is still interactive in room approval flow; write mempalace.yaml manually if the init output stalls.",
"The yaml key is `wing:` not `wings:`. Using the wrong key causes mine/setup failures.",
"Pipe empty stdin into mining commands (`echo \"\" | ...`) to avoid the entity-detector stdin hang on larger directories.",
"First mine downloads the ChromaDB embedding model cache (~79MB).",
"Report Ezra's before/after metrics back to issue #568 after live installation and retrieval tests.",
],
}
return plan
def render_markdown(plan: dict) -> str:
gotchas = "\n".join(f"- {item}" for item in plan["gotchas"])
return f"""# MemPalace v3.0.0 — Ezra Integration Packet
This packet turns issue #570 into an executable, reviewable integration plan for Ezra's Hermes home.
It is a repo-side scaffold: no live Ezra host changes are claimed in this artifact.
## Commands
```bash
{plan['install_command']}
{plan['init_command']}
cat > {plan['config_path']} <<'YAML'
{plan['yaml_template'].rstrip()}
YAML
{plan['mine_home_command']}
{plan['mine_sessions_command']}
{plan['search_command']}
{plan['wake_up_command']}
{plan['mcp_command']}
```
## Manual config template
```yaml
{plan['yaml_template'].rstrip()}
```
## Why this shape
- `wing: {plan['wing']}` matches the issue's Ezra-specific integration target.
- `rooms` split the mined material into sessions, config, and docs to keep retrieval interpretable.
- Mining commands pipe empty stdin to avoid the interactive entity-detector hang noted in the evaluation.
## Gotchas
{gotchas}
## Report back to #568
After live execution on Ezra's actual environment, post back to #568 with:
- install result
- mine duration and corpus size
- 2-3 real search queries + retrieved results
- wake-up context token count
- whether MCP wiring succeeded
## Honest scope boundary
This repo artifact does **not** prove live installation on Ezra's host. It makes the work reproducible and testable so the next pass can execute it without guesswork.
"""
def main() -> None:
parser = argparse.ArgumentParser(description="Prepare the MemPalace Ezra integration packet")
parser.add_argument("--hermes-home", default=DEFAULT_HERMES_HOME)
parser.add_argument("--sessions-dir", default=DEFAULT_SESSIONS_DIR)
parser.add_argument("--palace-path", default=DEFAULT_PALACE_PATH)
parser.add_argument("--wing", default=DEFAULT_WING)
parser.add_argument("--output", default=None)
parser.add_argument("--json", action="store_true")
args = parser.parse_args()
plan = build_plan(
{
"hermes_home": args.hermes_home,
"sessions_dir": args.sessions_dir,
"palace_path": args.palace_path,
"wing": args.wing,
}
)
rendered = json.dumps(plan, indent=2) if args.json else render_markdown(plan)
if args.output:
output_path = Path(args.output).expanduser()
output_path.parent.mkdir(parents=True, exist_ok=True)
output_path.write_text(rendered, encoding="utf-8")
print(f"MemPalace integration packet written to {output_path}")
else:
print(rendered)
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,155 @@
#!/usr/bin/env python3
from __future__ import annotations
import argparse
import json
from pathlib import Path
from typing import Any
import yaml
DAYLIGHT_START = "10:00"
DAYLIGHT_END = "16:00"
def load_manifest(path: str | Path) -> dict[str, Any]:
data = yaml.safe_load(Path(path).read_text()) or {}
data.setdefault("machines", [])
return data
def validate_manifest(data: dict[str, Any]) -> None:
machines = data.get("machines", [])
if not machines:
raise ValueError("manifest must contain at least one machine")
seen: set[str] = set()
for machine in machines:
hostname = machine.get("hostname", "").strip()
if not hostname:
raise ValueError("each machine must declare a hostname")
if hostname in seen:
raise ValueError(f"duplicate hostname: {hostname} (unique hostnames are required)")
seen.add(hostname)
for field in ("machine_type", "ram_gb", "cpu_cores", "os", "adapter_condition"):
if field not in machine:
raise ValueError(f"machine {hostname} missing required field: {field}")
def _laptops(machines: list[dict[str, Any]]) -> list[dict[str, Any]]:
return [m for m in machines if m.get("machine_type") == "laptop"]
def _desktop(machines: list[dict[str, Any]]) -> dict[str, Any] | None:
for machine in machines:
if machine.get("machine_type") == "desktop":
return machine
return None
def choose_anchor_agents(machines: list[dict[str, Any]], count: int = 2) -> list[dict[str, Any]]:
eligible = [
m for m in _laptops(machines)
if m.get("adapter_condition") in {"good", "ok"} and m.get("always_on_capable", True)
]
eligible.sort(key=lambda m: (m.get("idle_watts", 9999), -m.get("ram_gb", 0), -m.get("cpu_cores", 0), m["hostname"]))
return eligible[:count]
def assign_roles(machines: list[dict[str, Any]]) -> dict[str, Any]:
anchors = choose_anchor_agents(machines, count=2)
anchor_names = {m["hostname"] for m in anchors}
desktop = _desktop(machines)
mapping: dict[str, dict[str, Any]] = {}
for machine in machines:
hostname = machine["hostname"]
if desktop and hostname == desktop["hostname"]:
mapping[hostname] = {
"role": "desktop_nas",
"schedule": f"{DAYLIGHT_START}-{DAYLIGHT_END}",
"duty_cycle": "daylight_only",
}
elif hostname in anchor_names:
mapping[hostname] = {
"role": "anchor_agent",
"schedule": "24/7",
"duty_cycle": "continuous",
}
else:
mapping[hostname] = {
"role": "daylight_agent",
"schedule": f"{DAYLIGHT_START}-{DAYLIGHT_END}",
"duty_cycle": "peak_solar",
}
return {
"anchor_agents": [m["hostname"] for m in anchors],
"desktop_nas": desktop["hostname"] if desktop else None,
"role_mapping": mapping,
}
def build_plan(data: dict[str, Any]) -> dict[str, Any]:
validate_manifest(data)
machines = data["machines"]
role_plan = assign_roles(machines)
return {
"fleet_name": data.get("fleet_name", "timmy-laptop-fleet"),
"machine_count": len(machines),
"anchor_agents": role_plan["anchor_agents"],
"desktop_nas": role_plan["desktop_nas"],
"daylight_window": f"{DAYLIGHT_START}-{DAYLIGHT_END}",
"role_mapping": role_plan["role_mapping"],
}
def render_markdown(plan: dict[str, Any], data: dict[str, Any]) -> str:
lines = [
"# Laptop Fleet Deployment Plan",
"",
f"Fleet: {plan['fleet_name']}",
f"Machine count: {plan['machine_count']}",
f"24/7 anchor agents: {', '.join(plan['anchor_agents']) if plan['anchor_agents'] else 'TBD'}",
f"Desktop/NAS: {plan['desktop_nas'] or 'TBD'}",
f"Daylight schedule: {plan['daylight_window']}",
"",
"## Role mapping",
"",
"| Hostname | Role | Schedule | Duty cycle |",
"|---|---|---|---|",
]
for hostname, role in sorted(plan["role_mapping"].items()):
lines.append(f"| {hostname} | {role['role']} | {role['schedule']} | {role['duty_cycle']} |")
lines.extend([
"",
"## Machine inventory",
"",
"| Hostname | Type | RAM | CPU cores | OS | Adapter | Idle watts | Notes |",
"|---|---|---:|---:|---|---|---:|---|",
])
for machine in data["machines"]:
lines.append(
f"| {machine['hostname']} | {machine['machine_type']} | {machine['ram_gb']} | {machine['cpu_cores']} | {machine['os']} | {machine['adapter_condition']} | {machine.get('idle_watts', 'n/a')} | {machine.get('notes', '')} |"
)
return "\n".join(lines) + "\n"
def main() -> int:
parser = argparse.ArgumentParser(description="Plan LAB-005 laptop fleet deployment.")
parser.add_argument("manifest", help="Path to laptop fleet manifest YAML")
parser.add_argument("--markdown", action="store_true", help="Render a markdown deployment plan instead of JSON")
args = parser.parse_args()
data = load_manifest(args.manifest)
plan = build_plan(data)
if args.markdown:
print(render_markdown(plan, data))
else:
print(json.dumps(plan, indent=2))
return 0
if __name__ == "__main__":
raise SystemExit(main())

View File

@@ -0,0 +1,135 @@
#!/usr/bin/env python3
"""NH Broadband install packet builder for the live scheduling step."""
from __future__ import annotations
import argparse
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
import yaml
def load_request(path: str | Path) -> dict[str, Any]:
data = yaml.safe_load(Path(path).read_text()) or {}
data.setdefault("contact", {})
data.setdefault("service", {})
data.setdefault("call_log", [])
data.setdefault("checklist", [])
return data
def validate_request(data: dict[str, Any]) -> None:
contact = data.get("contact", {})
for field in ("name", "phone"):
if not contact.get(field, "").strip():
raise ValueError(f"contact.{field} is required")
service = data.get("service", {})
for field in ("address", "city", "state"):
if not service.get(field, "").strip():
raise ValueError(f"service.{field} is required")
if not data.get("checklist"):
raise ValueError("checklist must contain at least one item")
def build_packet(data: dict[str, Any]) -> dict[str, Any]:
validate_request(data)
contact = data["contact"]
service = data["service"]
return {
"packet_id": f"nh-bb-{datetime.now(timezone.utc).strftime('%Y%m%d-%H%M%S')}",
"generated_utc": datetime.now(timezone.utc).isoformat(),
"contact": {
"name": contact["name"],
"phone": contact["phone"],
"email": contact.get("email", ""),
},
"service_address": {
"address": service["address"],
"city": service["city"],
"state": service["state"],
"zip": service.get("zip", ""),
},
"desired_plan": data.get("desired_plan", "residential-fiber"),
"call_log": data.get("call_log", []),
"checklist": [
{"item": item, "done": False} if isinstance(item, str) else item
for item in data["checklist"]
],
"status": "pending_scheduling_call",
}
def render_markdown(packet: dict[str, Any], data: dict[str, Any]) -> str:
contact = packet["contact"]
addr = packet["service_address"]
lines = [
f"# NH Broadband Install Packet",
"",
f"**Packet ID:** {packet['packet_id']}",
f"**Generated:** {packet['generated_utc']}",
f"**Status:** {packet['status']}",
"",
"## Contact",
"",
f"- **Name:** {contact['name']}",
f"- **Phone:** {contact['phone']}",
f"- **Email:** {contact.get('email', 'n/a')}",
"",
"## Service Address",
"",
f"- {addr['address']}",
f"- {addr['city']}, {addr['state']} {addr['zip']}",
"",
f"## Desired Plan",
"",
f"{packet['desired_plan']}",
"",
"## Call Log",
"",
]
if packet["call_log"]:
for entry in packet["call_log"]:
ts = entry.get("timestamp", "n/a")
outcome = entry.get("outcome", "n/a")
notes = entry.get("notes", "")
lines.append(f"- **{ts}** — {outcome}")
if notes:
lines.append(f" - {notes}")
else:
lines.append("_No calls logged yet._")
lines.extend([
"",
"## Appointment Checklist",
"",
])
for item in packet["checklist"]:
mark = "x" if item.get("done") else " "
lines.append(f"- [{mark}] {item['item']}")
lines.append("")
return "\n".join(lines)
def main() -> int:
parser = argparse.ArgumentParser(description="Build NH Broadband install packet.")
parser.add_argument("request", help="Path to install request YAML")
parser.add_argument("--markdown", action="store_true", help="Render markdown instead of JSON")
args = parser.parse_args()
data = load_request(args.request)
packet = build_packet(data)
if args.markdown:
print(render_markdown(packet, data))
else:
print(json.dumps(packet, indent=2))
return 0
if __name__ == "__main__":
raise SystemExit(main())

View File

@@ -0,0 +1,76 @@
from pathlib import Path
import importlib.util
import unittest
ROOT = Path(__file__).resolve().parent.parent
SCRIPT_PATH = ROOT / "scripts" / "know_thy_father" / "epic_pipeline.py"
DOC_PATH = ROOT / "docs" / "KNOW_THY_FATHER_MULTIMODAL_PIPELINE.md"
def load_module(path: Path, name: str):
assert path.exists(), f"missing {path.relative_to(ROOT)}"
spec = importlib.util.spec_from_file_location(name, path)
assert spec and spec.loader
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
class TestKnowThyFatherEpicPipeline(unittest.TestCase):
def test_build_pipeline_plan_contains_all_phases_in_order(self):
mod = load_module(SCRIPT_PATH, "ktf_epic_pipeline")
plan = mod.build_pipeline_plan(batch_size=10)
self.assertEqual(
[step["id"] for step in plan],
[
"phase1_media_indexing",
"phase2_multimodal_analysis",
"phase3_holographic_synthesis",
"phase4_cross_reference_audit",
"phase5_processing_log",
],
)
self.assertIn("scripts/know_thy_father/index_media.py", plan[0]["command"])
self.assertIn("scripts/twitter_archive/analyze_media.py --batch 10", plan[1]["command"])
self.assertIn("scripts/know_thy_father/synthesize_kernels.py", plan[2]["command"])
self.assertIn("scripts/know_thy_father/crossref_audit.py", plan[3]["command"])
self.assertIn("twitter-archive/know-thy-father/tracker.py report", plan[4]["command"])
def test_status_snapshot_reports_key_artifact_paths(self):
mod = load_module(SCRIPT_PATH, "ktf_epic_pipeline")
status = mod.build_status_snapshot(ROOT)
self.assertIn("phase1_media_indexing", status)
self.assertIn("phase2_multimodal_analysis", status)
self.assertIn("phase3_holographic_synthesis", status)
self.assertIn("phase4_cross_reference_audit", status)
self.assertIn("phase5_processing_log", status)
self.assertEqual(status["phase1_media_indexing"]["script"], "scripts/know_thy_father/index_media.py")
self.assertEqual(status["phase2_multimodal_analysis"]["script"], "scripts/twitter_archive/analyze_media.py")
self.assertEqual(status["phase5_processing_log"]["script"], "twitter-archive/know-thy-father/tracker.py")
self.assertTrue(status["phase1_media_indexing"]["script_exists"])
self.assertTrue(status["phase2_multimodal_analysis"]["script_exists"])
self.assertTrue(status["phase3_holographic_synthesis"]["script_exists"])
self.assertTrue(status["phase4_cross_reference_audit"]["script_exists"])
self.assertTrue(status["phase5_processing_log"]["script_exists"])
def test_repo_contains_multimodal_pipeline_doc(self):
self.assertTrue(DOC_PATH.exists(), "missing committed Know Thy Father pipeline doc")
text = DOC_PATH.read_text(encoding="utf-8")
required = [
"# Know Thy Father — Multimodal Media Consumption Pipeline",
"scripts/know_thy_father/index_media.py",
"scripts/twitter_archive/analyze_media.py --batch 10",
"scripts/know_thy_father/synthesize_kernels.py",
"scripts/know_thy_father/crossref_audit.py",
"twitter-archive/know-thy-father/tracker.py report",
"Refs #582",
]
for snippet in required:
self.assertIn(snippet, text)
if __name__ == "__main__":
unittest.main()

View File

@@ -0,0 +1,52 @@
from pathlib import Path
import yaml
from scripts.plan_laptop_fleet import build_plan, load_manifest, render_markdown, validate_manifest
def test_laptop_fleet_planner_script_exists() -> None:
assert Path("scripts/plan_laptop_fleet.py").exists()
def test_laptop_fleet_manifest_template_exists() -> None:
assert Path("docs/laptop-fleet-manifest.example.yaml").exists()
def test_build_plan_selects_two_lowest_idle_watt_laptops_as_anchors() -> None:
data = load_manifest("docs/laptop-fleet-manifest.example.yaml")
plan = build_plan(data)
assert plan["anchor_agents"] == ["timmy-anchor-a", "timmy-anchor-b"]
assert plan["desktop_nas"] == "timmy-desktop-nas"
assert plan["role_mapping"]["timmy-daylight-a"]["schedule"] == "10:00-16:00"
def test_validate_manifest_requires_unique_hostnames() -> None:
data = {
"machines": [
{"hostname": "dup", "machine_type": "laptop", "ram_gb": 8, "cpu_cores": 4, "os": "Linux", "adapter_condition": "good"},
{"hostname": "dup", "machine_type": "laptop", "ram_gb": 16, "cpu_cores": 8, "os": "Linux", "adapter_condition": "good"},
]
}
try:
validate_manifest(data)
except ValueError as exc:
assert "duplicate hostname" in str(exc)
assert "unique hostnames" in str(exc)
else:
raise AssertionError("validate_manifest should reject duplicate hostname")
def test_markdown_contains_anchor_agents_and_daylight_schedule() -> None:
data = load_manifest("docs/laptop-fleet-manifest.example.yaml")
plan = build_plan(data)
content = render_markdown(plan, data)
assert "24/7 anchor agents: timmy-anchor-a, timmy-anchor-b" in content
assert "Daylight schedule: 10:00-16:00" in content
assert "desktop_nas" in content
def test_manifest_template_is_valid_yaml() -> None:
data = yaml.safe_load(Path("docs/laptop-fleet-manifest.example.yaml").read_text())
assert data["fleet_name"] == "timmy-laptop-fleet"
assert len(data["machines"]) == 6

View File

@@ -0,0 +1,68 @@
from pathlib import Path
import importlib.util
import unittest
ROOT = Path(__file__).resolve().parent.parent
SCRIPT_PATH = ROOT / "scripts" / "mempalace_ezra_integration.py"
DOC_PATH = ROOT / "docs" / "MEMPALACE_EZRA_INTEGRATION.md"
def load_module(path: Path, name: str):
assert path.exists(), f"missing {path.relative_to(ROOT)}"
spec = importlib.util.spec_from_file_location(name, path)
assert spec and spec.loader
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
class TestMempalaceEzraIntegration(unittest.TestCase):
def test_build_plan_contains_issue_required_steps_and_gotchas(self):
mod = load_module(SCRIPT_PATH, "mempalace_ezra_integration")
plan = mod.build_plan({})
self.assertEqual(plan["package_spec"], "mempalace==3.0.0")
self.assertIn("pip install mempalace==3.0.0", plan["install_command"])
self.assertEqual(plan["wing"], "ezra_home")
self.assertIn('echo "" | mempalace mine ~/.hermes/', plan["mine_home_command"])
self.assertIn('--mode convos', plan["mine_sessions_command"])
self.assertIn('mempalace wake-up', plan["wake_up_command"])
self.assertIn('hermes mcp add mempalace -- python -m mempalace.mcp_server', plan["mcp_command"])
self.assertIn('wing:', plan["yaml_template"])
self.assertTrue(any('stdin' in item.lower() for item in plan["gotchas"]))
self.assertTrue(any('wing:' in item for item in plan["gotchas"]))
def test_build_plan_accepts_path_and_wing_overrides(self):
mod = load_module(SCRIPT_PATH, "mempalace_ezra_integration")
plan = mod.build_plan(
{
"hermes_home": "/root/wizards/ezra/home",
"sessions_dir": "/root/wizards/ezra/home/sessions",
"wing": "ezra_archive",
}
)
self.assertEqual(plan["wing"], "ezra_archive")
self.assertIn('/root/wizards/ezra/home', plan["mine_home_command"])
self.assertIn('/root/wizards/ezra/home/sessions', plan["mine_sessions_command"])
self.assertIn('wing: ezra_archive', plan["yaml_template"])
def test_repo_contains_mem_palace_ezra_doc(self):
self.assertTrue(DOC_PATH.exists(), "missing committed MemPalace Ezra integration doc")
text = DOC_PATH.read_text(encoding="utf-8")
required = [
"# MemPalace v3.0.0 — Ezra Integration Packet",
"pip install mempalace==3.0.0",
'echo "" | mempalace mine ~/.hermes/',
"mempalace mine ~/.hermes/sessions/ --mode convos",
"mempalace wake-up",
"hermes mcp add mempalace -- python -m mempalace.mcp_server",
"Report back to #568",
]
for snippet in required:
self.assertIn(snippet, text)
if __name__ == "__main__":
unittest.main()

View File

@@ -0,0 +1,105 @@
from pathlib import Path
import yaml
from scripts.plan_nh_broadband_install import (
build_packet,
load_request,
render_markdown,
validate_request,
)
def test_script_exists() -> None:
assert Path("scripts/plan_nh_broadband_install.py").exists()
def test_example_request_exists() -> None:
assert Path("docs/nh-broadband-install-request.example.yaml").exists()
def test_example_packet_exists() -> None:
assert Path("docs/nh-broadband-install-packet.example.md").exists()
def test_research_memo_exists() -> None:
assert Path("reports/operations/2026-04-15-nh-broadband-public-research.md").exists()
def test_load_and_build_packet() -> None:
data = load_request("docs/nh-broadband-install-request.example.yaml")
packet = build_packet(data)
assert packet["contact"]["name"] == "Timmy Operator"
assert packet["service_address"]["city"] == "Concord"
assert packet["service_address"]["state"] == "NH"
assert packet["status"] == "pending_scheduling_call"
assert len(packet["checklist"]) == 8
assert packet["checklist"][0]["done"] is False
def test_validate_rejects_missing_contact_name() -> None:
data = {
"contact": {"name": "", "phone": "555"},
"service": {"address": "1 St", "city": "X", "state": "NH"},
"checklist": ["do thing"],
}
try:
validate_request(data)
except ValueError as exc:
assert "contact.name" in str(exc)
else:
raise AssertionError("should reject empty contact name")
def test_validate_rejects_missing_service_address() -> None:
data = {
"contact": {"name": "A", "phone": "555"},
"service": {"address": "", "city": "X", "state": "NH"},
"checklist": ["do thing"],
}
try:
validate_request(data)
except ValueError as exc:
assert "service.address" in str(exc)
else:
raise AssertionError("should reject empty service address")
def test_validate_rejects_empty_checklist() -> None:
data = {
"contact": {"name": "A", "phone": "555"},
"service": {"address": "1 St", "city": "X", "state": "NH"},
"checklist": [],
}
try:
validate_request(data)
except ValueError as exc:
assert "checklist" in str(exc)
else:
raise AssertionError("should reject empty checklist")
def test_render_markdown_contains_key_sections() -> None:
data = load_request("docs/nh-broadband-install-request.example.yaml")
packet = build_packet(data)
md = render_markdown(packet, data)
assert "# NH Broadband Install Packet" in md
assert "## Contact" in md
assert "## Service Address" in md
assert "## Call Log" in md
assert "## Appointment Checklist" in md
assert "Concord" in md
assert "NH" in md
def test_render_markdown_shows_checklist_items() -> None:
data = load_request("docs/nh-broadband-install-request.example.yaml")
packet = build_packet(data)
md = render_markdown(packet, data)
assert "- [ ] Confirm exact-address availability" in md
def test_example_yaml_is_valid() -> None:
data = yaml.safe_load(Path("docs/nh-broadband-install-request.example.yaml").read_text())
assert data["contact"]["name"] == "Timmy Operator"
assert len(data["checklist"]) == 8

View File

@@ -1,666 +0,0 @@
# GENOME.md — the-testament
Generated: 2026-04-15
Repo: Timmy_Foundation/the-testament
Analysis issue: timmy-home #675
---
## Project Overview
The Testament is not a conventional software repo and not just a manuscript dump.
It is a hybrid publishing system with four layers:
1. narrative source files
2. build/packaging pipelines
3. presentation surfaces
4. verification/quality gates
At the content layer, the repo holds a five-part novel with 18 chapter manuscripts, front/back matter, character sheets, worldbuilding notes, cover copy, soundtrack notes, and other companion artifacts.
At the software layer, it ships a small publishing toolchain that compiles the manuscript into:
- combined markdown
- EPUB
- HTML
- PDF
- web-reader JSON
- checksum manifest
It also includes:
- a static promotional/reader website (`website/index.html`)
- an interactive companion experience (`game/the-door.py` / `game/the-door.html`)
- audiobook helper scripts (`audiobook/`)
- validation and smoke-check automation (`scripts/` + `.gitea/workflows/`)
This makes the repo best understood as a sovereign multimedia book production system centered on a novel.
Runtime-confirmed facts from direct verification:
- `scripts/build-verify.py --json` passes and reports 18 chapters
- the verifier reports ~18,884 manuscript words in chapters and ~19,227 words in concatenated output
- `bash scripts/smoke.sh` passes and successfully builds markdown/epub/html
- `python3 build/build.py --md` succeeds
- `python3 compile_all.py --check` currently crashes due a qrcode version lookup bug
---
## Quick Facts
Repository composition from direct scan:
- 18 chapter manuscripts in `chapters/`
- top-level content/support directories include:
- `chapters/`
- `build/`
- `website/`
- `audiobook/`
- `game/`
- `characters/`
- `worldbuilding/`
- `cover/`
- `music/`
- primary code entrypoints are Python scripts plus a static HTML site
- no dedicated `tests/` directory
- validation is script-driven rather than unit-test-driven
Approximate non-output code inventory from `pygount` scan:
- ~3.6K lines of code-equivalent across Python/HTML/CSS/YAML/Bash/JSON
- code mass is concentrated in:
- `compile_all.py`
- `build/build.py`
- `compile.py`
- `scripts/build-verify.py`
- `website/index.html`
- `game/the-door.py`
---
## Architecture
```mermaid
flowchart TD
A[chapters/*.md] --> B[compile_markdown]
C[front-matter.md / build/frontmatter.md] --> B
D[back-matter.md / build/backmatter.md] --> B
E[build/metadata.yaml] --> F[pandoc/reportlab packaging]
G[book-style.css] --> F
H[cover/cover-art.jpg] --> F
B --> I[testament-complete.md]
I --> F
F --> J[testament.epub]
F --> K[testament.html]
F --> L[testament.pdf]
A --> M[compile_chapters_json / website/build-chapters.py]
M --> N[website/chapters.json]
I --> O[generate_manifest]
J --> O
K --> O
L --> O
N --> O
O --> P[build-manifest.json]
A --> Q[scripts/index_generator.py]
R[characters/*.md] --> Q
Q --> S[KNOWLEDGE_GRAPH.md]
A --> T[build/semantic_linker.py]
T --> U[build/cross_refs.json]
A --> V[audiobook/extract_text.py]
V --> W[text excerpts]
W --> X[audiobook/generate_samples.sh]
X --> Y[audiobook sample files]
Y --> Z[audiobook/create_manifest.py]
Z --> AA[audiobook/manifest.md]
AB[scripts/build-verify.py] --> A
AB --> I
AC[scripts/smoke.sh] --> AB
AD[.gitea workflows] --> AC
AE[website/index.html] --> AF[static landing/reading experience]
AG[game/the-door.py / game/the-door.html] --> AH[interactive companion artifact]
```
---
## Entry Points
### Primary build entrypoint
1. `compile_all.py`
This is the canonical unified pipeline.
It builds:
- combined markdown
- EPUB
- PDF
- HTML
- `website/chapters.json`
- `build-manifest.json`
It also exposes:
- `--check`
- `--clean`
- format-specific flags (`--md`, `--epub`, `--pdf`, `--html`, `--json`)
### Legacy build entrypoints
2. `build/build.py`
3. `compile.py`
These overlap with the unified pipeline and still work as alternate build surfaces.
`build/build.py` is the more structured legacy path.
`compile.py` is a simpler older compiler that still shells out to `scripts/index_generator.py` before building.
### Verification entrypoints
4. `scripts/build-verify.py`
5. `scripts/smoke.sh`
6. `.gitea/workflows/build.yml`
7. `.gitea/workflows/smoke.yml`
8. `.gitea/workflows/validate.yml`
These form the repos test/CI surface.
There are no unit tests; these scripts are the executable contract.
### Website/content export entrypoints
9. `website/build-chapters.py`
10. `website/index.html`
`build-chapters.py` converts chapter markdown into HTML snippets inside `website/chapters.json`.
`website/index.html` is a large static HTML/CSS/JS page used as the web-facing presentation layer.
### Audiobook entrypoints
11. `audiobook/extract_text.py`
12. `audiobook/create_manifest.py`
13. `audiobook/generate_samples.sh`
These scripts support excerpt extraction, sample generation, and audiobook manifest creation.
### Companion/interactive entrypoints
14. `game/the-door.py`
15. `game/the-door.html`
These are sidecar experiences, not part of the core build pipeline, but they are part of the repo architecture.
### Knowledge/indexing entrypoints
16. `scripts/index_generator.py`
17. `build/semantic_linker.py`
These create graph-like auxiliary artifacts from the manuscript corpus.
---
## Data Flow
### Main book build flow
```text
chapter markdown + front matter + back matter
compile_markdown()
combined manuscript: testament-complete.md
format-specific compilers
├─ pandoc -> EPUB
├─ pandoc -> standalone HTML
├─ xelatex / weasyprint / reportlab -> PDF
└─ metadata/css/cover integrated where available
optional output hashing
build-manifest.json
```
### Website/export flow
```text
chapters/*.md
website/build-chapters.py or compile_all.py::compile_chapters_json()
extract heading + convert paragraphs/quotes/headings to HTML fragments
website/chapters.json
```
Important nuance:
- `website/chapters.json` is produced by the toolchain
- current `website/index.html` appears to be a static landing/presentation page
- no direct `fetch('chapters.json')` usage was found in the current website HTML
So the JSON output is a generated artifact for a web-reader/export path, but not obviously consumed by the checked-in landing page itself.
### Verification flow
```text
chapter files + required support files
scripts/build-verify.py
├─ count files
├─ validate heading format
├─ compute word counts
├─ check markdown integrity
├─ concatenate outputs
└─ write build-report.json when asked
```
### Knowledge graph / semantic link flow
```text
characters/*.md + chapters/*.md
scripts/index_generator.py
KNOWLEDGE_GRAPH.md
chapters/*.md
build/semantic_linker.py
build/cross_refs.json
```
### Audiobook flow
```text
chapter markdown
audiobook/extract_text.py
trimmed text excerpt
audiobook/generate_samples.sh
audio sample files
audiobook/create_manifest.py
audiobook/manifest.md
```
---
## Key Abstractions
### 1. Chapter corpus
The core domain object of the repo is the ordered chapter set:
- `chapters/chapter-01.md` ... `chapters/chapter-18.md`
- exact numbering matters
- heading format matters
- concatenation order matters
Almost every script assumes this ordered corpus is the canonical source of truth.
### 2. Part boundaries (`PARTS`)
Both `compile.py`, `build/build.py`, and `compile_all.py` define a `PARTS` mapping.
This injects higher-level narrative structure into the build output by adding part headers and descriptions at fixed chapter boundaries.
### 3. Compiled manuscript
`testament-complete.md` is the normalized intermediate artifact.
It is the manuscript assembly layer from which downstream formats are built.
This is the closest thing the repo has to an internal IR (intermediate representation).
### 4. Multi-backend packaging
The build system supports multiple packaging backends:
- pandoc for EPUB and HTML
- xelatex for PDF when available
- weasyprint fallback
- reportlab fallback for fully local pure-Python PDF generation
This is a resilience pattern: the repo prefers multiple production paths rather than a single brittle dependency chain.
### 5. Manifested outputs
`build-manifest.json` stores output metadata and SHA256 checksums.
That turns built artifacts into auditable objects rather than opaque files.
### 6. Verification-as-tests
Because there is no `tests/` suite, `scripts/build-verify.py` is effectively the main automated specification for integrity.
It asserts:
- chapter count
- naming/ordering
- heading format
- word-count sanity
- markdown integrity
- concatenation success
- required support files
### 7. Companion surfaces
The repo has non-manuscript presentation surfaces:
- static website
- interactive game/experience (`The Door`)
- audiobook assets and scripts
These make the repo a narrative system, not just a book build.
### 8. Knowledge graph / semantic linking
The repo contains lightweight symbolic tooling:
- regex-based character-to-chapter index generation
- capitalized-phrase cross-reference detection between chapters
This is a GOFAI-like layer over literary content.
---
## API Surface
This repos API surface is mostly CLI-based rather than network-based.
### Canonical CLI surface
#### `compile_all.py`
Commands:
- `python3 compile_all.py`
- `python3 compile_all.py --md`
- `python3 compile_all.py --epub`
- `python3 compile_all.py --pdf`
- `python3 compile_all.py --html`
- `python3 compile_all.py --json`
- `python3 compile_all.py --check`
- `python3 compile_all.py --clean`
Outputs:
- `testament-complete.md`
- `testament.epub`
- `testament.html`
- `testament.pdf`
- `website/chapters.json`
- `build-manifest.json`
#### `build/build.py`
Commands:
- `python3 build/build.py --md`
- `python3 build/build.py --epub`
- `python3 build/build.py --pdf`
- `python3 build/build.py --html`
- default full build behavior
#### `compile.py`
Commands documented:
- `python3 compile.py`
- `python3 compile.py --md`
- `python3 compile.py --epub`
- `python3 compile.py --html`
- `python3 compile.py --check`
Observed quirk:
- `scripts/smoke.sh` calls `python3 compile.py --validate`
- no `--validate` handling exists in source
- the script still exits 0 because `compile.py` ignores unknown args and runs its default build path
That is a real contract quirk/drift worth remembering.
#### `scripts/build-verify.py`
Commands:
- `python3 scripts/build-verify.py`
- `python3 scripts/build-verify.py --ci`
- `python3 scripts/build-verify.py --json`
#### Other tooling
- `python3 website/build-chapters.py`
- `python3 scripts/index_generator.py`
- `python3 build/semantic_linker.py`
- `python3 audiobook/extract_text.py <input.md> <output.txt>`
- `python3 audiobook/create_manifest.py`
- `bash audiobook/generate_samples.sh`
- `bash scripts/smoke.sh`
- `python3 game/the-door.py`
### Data contracts
#### Chapter heading contract
`build-verify.py` expects each chapter to start with:
- `# Chapter N — Title`
#### File naming contract
- chapter files must match `chapter-XX.md`
- exactly 18 chapters are expected by the verifier
#### Output manifest contract
`build-manifest.json` includes, per file:
- path
- size_bytes
- sha256
#### Website chapters JSON contract
Entries include:
- `number`
- `title`
- `html`
---
## Test Coverage Gaps
### Current state
There is no unit-test suite and no `tests/` directory.
Coverage is currently provided by:
- shell smoke checks
- build verification script
- CI workflow checks
That means the repo has verification, but not isolated regression tests.
### What is already covered by script-based checks
- chapter count and naming
- heading format
- minimum word-count sanity
- markdown delimiter/link integrity
- concatenation success
- required-file existence
- basic syntax parsing for Python/YAML/shell/JSON
- secret-pattern grep scanning
### Highest-value missing tests
1. `compile_all.py` dependency-check behavior
- there should be a regression test for `--check`
- current runtime already revealed a concrete failure when `qrcode.__version__` is missing
2. `compile_chapters_json()` correctness
- verify all 18 chapters are emitted
- verify blockquotes/headings/italics render as expected
- verify title extraction stays stable
3. Manifest generation
- verify `build-manifest.json` includes every built artifact actually present
- verify sha256 and size fields are correct
4. Build backend selection
- verify fallback order for PDF generation behaves correctly when xelatex/weasyprint/reportlab availability changes
5. `scripts/index_generator.py`
- verify character mention detection and markdown output determinism
6. `build/semantic_linker.py`
- verify the proper-noun extraction and common-word filtering do not produce obviously bad edges
7. Website/output parity
- verify `website/chapters.json` matches chapter headings and ordering from source manuscripts
8. Companion experience smoke tests
- `game/the-door.py` has no automated behavior coverage
- `game/the-door.html` has no structural or syntax verification
### Recommended first tests
If this repo gets a `tests/` directory, start here:
1. `test_compile_all_check_does_not_crash`
2. `test_build_chapters_emits_18_ordered_entries`
3. `test_manifest_contains_existing_outputs`
4. `test_build_verify_rejects_missing_chapter`
---
## Security Considerations
### 1. Shelling out to external toolchains
The build system uses subprocess execution for:
- pandoc
- xelatex
- weasyprint-related flows
- helper scripts
This is reasonable for a publishing repo, but it means path handling and shell assumptions matter.
### 2. Remote font dependency in website HTML
`website/index.html` imports Google Fonts via CSS `@import`.
That means the website is not fully sovereign/local-first at render time.
If strict offline/local hosting matters, font bundling would be required.
### 3. Secret scanning exists, but is grep-based
Both CI and `scripts/smoke.sh` perform simple pattern scanning.
That is better than nothing, but it is heuristic rather than structured secret detection.
### 4. Artifact integrity is a strength
`build-manifest.json` with SHA256 hashes is a strong integrity pattern.
It gives the repo a lightweight provenance layer for distributables.
### 5. Build check path currently has a reliability bug
Runtime-confirmed:
- `python3 compile_all.py --check` crashes with:
- `AttributeError: module 'qrcode' has no attribute '__version__'`
This is not a remote exploit issue, but it is an operational integrity issue because the advertised safe preflight check is not robust.
Follow-up issue filed:
- the-testament #51
- https://forge.alexanderwhitestone.com/Timmy_Foundation/the-testament/issues/51
---
## Drift / Contradictions
### 1. README vs runtime word count
README says:
- ~70,000 word target
- ~19,000 words drafted
Runtime verification says:
- ~18,884 words in chapter corpus
- ~19,227 words in concatenated output
This is close enough to be directionally aligned, but the verifier is the stronger factual source for current draft size.
### 2. `compile_all.py --check` is documented but currently broken
Documented behavior:
- dependency verification
Observed behavior:
- crashes on qrcode version lookup
### 3. `scripts/smoke.sh` depends on undocumented `compile.py --validate`
- `compile.py` docs do not list `--validate`
- source contains no explicit `--validate` path
- smoke still passes because the script ignores unknown flags and performs its default build path
This is a subtle contract mismatch.
### 4. `website/chapters.json` generation is present, but current website landing page does not appear to consume it directly
That suggests either:
- a future/planned reader path
- an external consumer
- or leftover infrastructure from an earlier website design
---
## Practical Mental Model
Think of the-testament as three repos living inside one repository:
1. the manuscript repo
- chapters
- front/back matter
- worldbuilding
- character sheets
2. the publishing pipeline repo
- compile scripts
- verification scripts
- CI workflows
- manifest generation
3. the companion media repo
- website
- audiobook helpers
- interactive game experience
- soundtrack/cover assets
The connective tissue is the manuscript corpus. Almost everything else either:
- transforms it
- packages it
- validates it
- or re-presents it in another medium
---
## Source Files of Highest Importance
1. `compile_all.py`
- canonical unified pipeline
- best single source of repo architecture
2. `scripts/build-verify.py`
- real executable quality contract
3. `build/build.py`
- structured legacy builder still in active use
4. `compile.py`
- older build entrypoint still referenced by smoke flow
5. `website/index.html`
- primary web presentation artifact
6. `website/build-chapters.py`
- chapter-to-web JSON transform
7. `build/metadata.yaml`
- publication metadata contract
8. `build/semantic_linker.py`
- symbolic/literary relationship extraction
---
## Recommended Next Refactors
1. Make `compile_all.py` the only documented build entrypoint
- de-emphasize or retire duplicated legacy flows once parity is confirmed
2. Add real regression tests around build helpers
- especially `compile_all.py --check`
- chapter JSON generation
- manifest generation
3. Clarify the role of `website/chapters.json`
- either wire it into the site, document its consumer, or remove the dead path
4. Fix the undocumented `compile.py --validate` dependency in smoke
- either implement the flag or stop invoking it
5. Decide whether the companion game and website should remain in the same repo or be treated as first-class subprojects with their own tests
---
## Bottom Line
the-testament is a sovereign novel-production repo with a manuscript at the center and a light but real software system around it.
Its architecture is not application-server-centric.
It is pipeline-centric:
- content in
- validated compilation
- multi-format outputs
- integrity metadata
- companion experiences around the text
The strongest technical asset is the layered publishing pipeline plus manuscript verification.
The biggest weakness is the absence of dedicated regression tests around the build system itself.
Source basis for this genome:
- README and manuscript structure docs
- direct source inspection of `compile_all.py`, `build/build.py`, `compile.py`, website/audiobook/indexing/verification scripts
- runtime verification of build and validation commands
- repo scan of content/build/workflow layout