4582653bb4 feat: Codebase Genome — the-nexus full analysis (#672)
Some checks failed
Smoke Test / smoke (pull_request) Failing after 16s
Complete GENOME.md for the-nexus repository:

- Project overview (3D world, Three.js, portal architecture)
- Architecture diagram (Mermaid)
- Entry points and data flow
- Key abstractions (NEXUS, SpatialMemory, Portal System, GOFAI)
- API surface (internal + external)
- Dependencies (Three.js, Python WebSocket)
- Test coverage gaps (7 critical paths untested)
- Security considerations (WebSocket auth, localStorage)
- Technical debt (4082-line app.js, no TypeScript)
- Migration notes from CLAUDE.md

Closes #672
2026-04-14 22:33:42 -04:00
2 changed files with 319 additions and 666 deletions

the-nexus-GENOME.md Normal file

@@ -0,0 +1,319 @@
# GENOME.md — the-nexus
**Generated:** 2026-04-14
**Repo:** Timmy_Foundation/the-nexus
**Analysis:** Codebase Genome #672
---
## Project Overview
The Nexus is Timmy's canonical 3D home-world — a browser-based Three.js application that serves as:
1. **Local-first training ground** for Timmy (the sovereign AI)
2. **Wizardly visualization surface** for the fleet system
3. **Portal architecture** connecting to other worlds and services
The app is a real-time 3D environment with spatial memory, GOFAI reasoning, agent presence, and portal-based navigation.
---
## Architecture
```mermaid
graph TB
subgraph Browser["BROWSER LAYER"]
HTML[index.html]
APP[app.js - 4082 lines]
CSS[style.css]
Worker[gofai_worker.js]
end
subgraph ThreeJS["THREE.JS RENDERING"]
Scene[Scene Management]
Camera[Camera System]
Renderer[WebGL Renderer]
Post[Post-processing<br/>Bloom, SMAA]
Physics[Physics/Player]
end
subgraph Nexus["NEXUS COMPONENTS"]
SM[SpatialMemory]
SA[SpatialAudio]
MB[MemoryBirth]
MO[MemoryOptimizer]
MI[MemoryInspect]
MP[MemoryPulse]
RT[ReasoningTrace]
RV[ResonanceVisualizer]
end
subgraph GOFAI["GOFAI REASONING"]
Worker2[Web Worker]
Rules[Rule Engine]
Facts[Fact Store]
Inference[Inference Loop]
end
subgraph Backend["BACKEND SERVICES"]
Server[server.py<br/>WebSocket Bridge]
L402[L402 Cost API]
Portal[Portal Registry]
end
subgraph Data["DATA/PERSISTENCE"]
Local[localStorage]
IDB[IndexedDB]
JSON[portals.json]
Vision[vision.json]
end
HTML --> APP
APP --> ThreeJS
APP --> Nexus
APP --> GOFAI
APP --> Backend
APP --> Data
Worker2 --> APP
Server --> APP
```
---
## Entry Points
### Primary Entry
- **`index.html`** — Main HTML shell, loads app.js
- **`app.js`** — Main application (4082 lines), Three.js scene setup
### Secondary Entry Points
- **`boot.js`** — Bootstrap sequence
- **`bootstrap.mjs`** — ES module bootstrap
- **`server.py`** — WebSocket bridge server
### Configuration Entry Points
- **`portals.json`** — Portal definitions and destinations
- **`vision.json`** — Vision/agent configuration
- **`config/fleet_agents.json`** — Fleet agent definitions
---
## Data Flow
```
User Input
    ↓
app.js (Event Loop)
    ↓
┌─────────────────────────────────────┐
│ Three.js Scene │
│ - Player movement │
│ - Camera controls │
│ - Physics simulation │
│ - Portal detection │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│ Nexus Components │
│ - SpatialMemory (room/context) │
│ - MemoryBirth (new memories) │
│ - MemoryPulse (heartbeat) │
│ - ReasoningTrace (GOFAI output) │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│ GOFAI Worker (off-thread) │
│ - Rule evaluation │
│ - Fact inference │
│ - Decision making │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│ Backend Services │
│ - WebSocket (server.py) │
│ - L402 cost API │
│ - Portal registry │
└─────────────────────────────────────┘
    ↓
Persistence (localStorage/IndexedDB)
```
---
## Key Abstractions
### 1. Nexus Object (`NEXUS`)
Central configuration and state object containing:
- Color palette
- Room definitions
- Portal configurations
- Agent settings
### 2. SpatialMemory
Manages room-based context for the AI agent:
- Room transitions trigger context switches
- Facts are stored per-room
- NPCs have location awareness
### 3. Portal System
Connects the 3D world to external services:
- Portals defined in `portals.json`
- Each portal links to a service/endpoint
- Visual indicators in 3D space
### 4. GOFAI Worker
Off-thread reasoning engine:
- Rule-based inference
- Fact store with persistence
- Decision making for agent behavior
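The rule-based inference described above can be illustrated with a minimal forward-chaining loop. This is a sketch of the general technique, written in Python for brevity; it is not the code in `gofai_worker.js`, and the rule format and fact names (`player_near_portal` and so on) are hypothetical.

```python
def forward_chain(facts, rules):
    """Apply rules until no new facts can be derived.

    facts: iterable of hashable facts.
    rules: list of (premises, conclusion) pairs; a rule fires when every
           premise is already in the fact store.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Hypothetical facts and rules, for illustration only.
RULES = [
    (("player_near_portal", "portal_online"), "portal_can_activate"),
    (("portal_can_activate",), "show_portal_prompt"),
]
derived = forward_chain({"player_near_portal", "portal_online"}, RULES)
```

The loop stops once a full pass adds nothing, so it terminates for any finite rule set.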
### 5. Memory Components
- **MemoryBirth**: Creates new memories from interactions
- **MemoryOptimizer**: Compresses and deduplicates memories
- **MemoryPulse**: Heartbeat system for memory health
- **MemoryInspect**: Debug/inspection interface
---
## API Surface
### Internal APIs (JavaScript)
| Module | Export | Purpose |
|--------|--------|---------|
| `app.js` | `NEXUS` | Main config/state object |
| `SpatialMemory` | class | Room-based context management |
| `SpatialAudio` | class | 3D positional audio |
| `MemoryBirth` | class | Memory creation |
| `MemoryOptimizer` | class | Memory compression |
| `ReasoningTrace` | class | GOFAI reasoning visualization |
### External APIs (HTTP/WebSocket)
| Endpoint | Protocol | Purpose |
|----------|----------|---------|
| `ws://localhost:PORT` | WebSocket | Real-time bridge to backend |
| `http://localhost:8080/api/cost-estimate` | HTTP | L402 cost estimation |
| Portal endpoints | Various | External service connections |
---
## Dependencies
### Runtime Dependencies
- **Three.js** — 3D rendering engine
- **Three.js Addons** — Post-processing (Bloom, SMAA)
### Build Dependencies
- **ES Modules** — Native browser modules
- **No bundler** — Direct script loading
### Backend Dependencies
- **Python 3.x** — server.py
- **WebSocket** — Real-time communication
---
## Test Coverage
### Existing Tests
- `tests/boot.test.js` — Bootstrap sequence tests
### Test Gaps
1. **Three.js scene initialization** — No tests
2. **Portal system** — No tests
3. **Memory components** — No tests
4. **GOFAI worker** — No tests
5. **WebSocket communication** — No tests
6. **Spatial memory transitions** — No tests
7. **Physics/player movement** — No tests
### Recommended Test Priorities
1. Portal detection and activation
2. Spatial memory room transitions
3. GOFAI worker message passing
4. WebSocket connection handling
5. Memory persistence (localStorage/IndexedDB)
---
## Security Considerations
### Current Risks
1. **WebSocket without auth** — server.py has no authentication
2. **localStorage sensitive data** — Memories stored unencrypted
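3. **CORS open** — No origin restrictions on WebSocket (browsers do not apply CORS to WebSocket handshakes, so the server must validate the `Origin` header itself)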
4. **L402 endpoint** — Cost API may expose internal state
### Mitigations
1. Add WebSocket authentication
2. Encrypt sensitive memories
3. Restrict allowed origins by validating the `Origin` header on the WebSocket handshake
4. Rate limit L402 endpoint
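The first and third mitigations could be combined into one handshake check. The sketch below is a library-independent helper under stated assumptions: how `server.py` reads the `Origin` header and a client token is not shown in this analysis, and `ALLOWED_ORIGINS`, `SHARED_TOKEN`, and `is_handshake_allowed` are hypothetical names.

```python
import hmac

# Hypothetical values; a real deployment would load these from configuration.
ALLOWED_ORIGINS = {"http://localhost:8080"}
SHARED_TOKEN = "change-me"


def is_handshake_allowed(origin, token,
                         allowed_origins=ALLOWED_ORIGINS,
                         expected_token=SHARED_TOKEN):
    """Return True only if the Origin is allow-listed and the token matches."""
    if origin not in allowed_origins:
        return False
    if token is None:
        return False
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(token.encode(), expected_token.encode())
```

A server would call this before accepting the connection and close the socket when it returns `False`.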
---
## File Structure
```
the-nexus/
├── app.js # Main app (4082 lines)
├── index.html # HTML shell
├── style.css # Styles
├── server.py # WebSocket bridge
├── boot.js # Bootstrap
├── bootstrap.mjs # ES module bootstrap
├── gofai_worker.js # GOFAI web worker
├── portals.json # Portal definitions
├── vision.json # Vision config
├── nexus/ # Nexus components
│   └── components/
│       ├── spatial-memory.js
│       ├── spatial-audio.js
│       ├── memory-birth.js
│       ├── memory-optimizer.js
│       ├── memory-inspect.js
│       ├── memory-pulse.js
│       ├── reasoning-trace.js
│       └── resonance-visualizer.js
├── config/ # Configuration
├── docs/ # Documentation
├── tests/ # Tests
├── agent/ # Agent components
├── bin/ # Scripts
└── assets/ # Static assets
```
---
## Technical Debt
1. **Large app.js** (4082 lines) — Should be split into modules
2. **No TypeScript** — Pure JavaScript, no type safety
3. **Manual DOM manipulation** — Could use a framework
4. **No build system** — Direct ES modules, no optimization
5. **Limited error handling** — Minimal try/catch coverage
---
## Migration Notes
From CLAUDE.md:
- Current `main` does NOT ship the old root frontend files
- A clean checkout serves a directory listing
- The live browser shell exists in legacy form at `/Users/apayne/the-matrix`
- Migration priorities: #684 (docs), #685 (legacy audit), #686 (smoke tests), #687 (restore shell)
---
## Next Steps
1. **Restore browser shell** — Bring frontend back to main
2. **Add tests** — Cover critical paths (portals, memory, GOFAI)
3. **Split app.js** — Modularize the 4082-line file
4. **Add authentication** — Secure WebSocket and APIs
5. **TypeScript migration** — Add type safety
---
*Generated by Codebase Genome pipeline — Issue #672*


@@ -1,666 +0,0 @@
# GENOME.md — the-testament
Generated: 2026-04-15
Repo: Timmy_Foundation/the-testament
Analysis issue: timmy-home #675
---
## Project Overview
The Testament is not a conventional software repo and not just a manuscript dump.
It is a hybrid publishing system with four layers:
1. narrative source files
2. build/packaging pipelines
3. presentation surfaces
4. verification/quality gates
At the content layer, the repo holds a five-part novel with 18 chapter manuscripts, front/back matter, character sheets, worldbuilding notes, cover copy, soundtrack notes, and other companion artifacts.
At the software layer, it ships a small publishing toolchain that compiles the manuscript into:
- combined markdown
- EPUB
- HTML
- PDF
- web-reader JSON
- checksum manifest
It also includes:
- a static promotional/reader website (`website/index.html`)
- an interactive companion experience (`game/the-door.py` / `game/the-door.html`)
- audiobook helper scripts (`audiobook/`)
- validation and smoke-check automation (`scripts/` + `.gitea/workflows/`)
This makes the repo best understood as a sovereign multimedia book production system centered on a novel.
Runtime-confirmed facts from direct verification:
- `scripts/build-verify.py --json` passes and reports 18 chapters
- the verifier reports ~18,884 manuscript words in chapters and ~19,227 words in concatenated output
- `bash scripts/smoke.sh` passes and successfully builds markdown/epub/html
- `python3 build/build.py --md` succeeds
- `python3 compile_all.py --check` currently crashes due to a qrcode version lookup bug
---
## Quick Facts
Repository composition from direct scan:
- 18 chapter manuscripts in `chapters/`
- top-level content/support directories include:
- `chapters/`
- `build/`
- `website/`
- `audiobook/`
- `game/`
- `characters/`
- `worldbuilding/`
- `cover/`
- `music/`
- primary code entrypoints are Python scripts plus a static HTML site
- no dedicated `tests/` directory
- validation is script-driven rather than unit-test-driven
Approximate non-output code inventory from `pygount` scan:
- ~3.6K lines of code-equivalent across Python/HTML/CSS/YAML/Bash/JSON
- code mass is concentrated in:
- `compile_all.py`
- `build/build.py`
- `compile.py`
- `scripts/build-verify.py`
- `website/index.html`
- `game/the-door.py`
---
## Architecture
```mermaid
flowchart TD
A[chapters/*.md] --> B[compile_markdown]
C[front-matter.md / build/frontmatter.md] --> B
D[back-matter.md / build/backmatter.md] --> B
E[build/metadata.yaml] --> F[pandoc/reportlab packaging]
G[book-style.css] --> F
H[cover/cover-art.jpg] --> F
B --> I[testament-complete.md]
I --> F
F --> J[testament.epub]
F --> K[testament.html]
F --> L[testament.pdf]
A --> M[compile_chapters_json / website/build-chapters.py]
M --> N[website/chapters.json]
I --> O[generate_manifest]
J --> O
K --> O
L --> O
N --> O
O --> P[build-manifest.json]
A --> Q[scripts/index_generator.py]
R[characters/*.md] --> Q
Q --> S[KNOWLEDGE_GRAPH.md]
A --> T[build/semantic_linker.py]
T --> U[build/cross_refs.json]
A --> V[audiobook/extract_text.py]
V --> W[text excerpts]
W --> X[audiobook/generate_samples.sh]
X --> Y[audiobook sample files]
Y --> Z[audiobook/create_manifest.py]
Z --> AA[audiobook/manifest.md]
AB[scripts/build-verify.py] --> A
AB --> I
AC[scripts/smoke.sh] --> AB
AD[.gitea workflows] --> AC
AE[website/index.html] --> AF[static landing/reading experience]
AG[game/the-door.py / game/the-door.html] --> AH[interactive companion artifact]
```
---
## Entry Points
### Primary build entrypoint
1. `compile_all.py`
This is the canonical unified pipeline.
It builds:
- combined markdown
- EPUB
- PDF
- HTML
- `website/chapters.json`
- `build-manifest.json`
It also exposes:
- `--check`
- `--clean`
- format-specific flags (`--md`, `--epub`, `--pdf`, `--html`, `--json`)
### Legacy build entrypoints
2. `build/build.py`
3. `compile.py`
These overlap with the unified pipeline and still work as alternate build surfaces.
`build/build.py` is the more structured legacy path.
`compile.py` is a simpler older compiler that still shells out to `scripts/index_generator.py` before building.
### Verification entrypoints
4. `scripts/build-verify.py`
5. `scripts/smoke.sh`
6. `.gitea/workflows/build.yml`
7. `.gitea/workflows/smoke.yml`
8. `.gitea/workflows/validate.yml`
These form the repo's test/CI surface.
There are no unit tests; these scripts are the executable contract.
### Website/content export entrypoints
9. `website/build-chapters.py`
10. `website/index.html`
`build-chapters.py` converts chapter markdown into HTML snippets inside `website/chapters.json`.
`website/index.html` is a large static HTML/CSS/JS page used as the web-facing presentation layer.
### Audiobook entrypoints
11. `audiobook/extract_text.py`
12. `audiobook/create_manifest.py`
13. `audiobook/generate_samples.sh`
These scripts support excerpt extraction, sample generation, and audiobook manifest creation.
### Companion/interactive entrypoints
14. `game/the-door.py`
15. `game/the-door.html`
These are sidecar experiences, not part of the core build pipeline, but they are part of the repo architecture.
### Knowledge/indexing entrypoints
16. `scripts/index_generator.py`
17. `build/semantic_linker.py`
These create graph-like auxiliary artifacts from the manuscript corpus.
---
## Data Flow
### Main book build flow
```text
chapter markdown + front matter + back matter
    ↓
compile_markdown()
    ↓
combined manuscript: testament-complete.md
    ↓
format-specific compilers
├─ pandoc -> EPUB
├─ pandoc -> standalone HTML
├─ xelatex / weasyprint / reportlab -> PDF
└─ metadata/css/cover integrated where available
    ↓
optional output hashing
    ↓
build-manifest.json
```
### Website/export flow
```text
chapters/*.md
    ↓
website/build-chapters.py or compile_all.py::compile_chapters_json()
    ↓
extract heading + convert paragraphs/quotes/headings to HTML fragments
    ↓
website/chapters.json
```
Important nuance:
- `website/chapters.json` is produced by the toolchain
- current `website/index.html` appears to be a static landing/presentation page
- no direct `fetch('chapters.json')` usage was found in the current website HTML
So the JSON output is a generated artifact for a web-reader/export path, but not obviously consumed by the checked-in landing page itself.
### Verification flow
```text
chapter files + required support files
    ↓
scripts/build-verify.py
├─ count files
├─ validate heading format
├─ compute word counts
├─ check markdown integrity
├─ concatenate outputs
└─ write build-report.json when asked
```
### Knowledge graph / semantic link flow
```text
characters/*.md + chapters/*.md
    ↓
scripts/index_generator.py
    ↓
KNOWLEDGE_GRAPH.md

chapters/*.md
    ↓
build/semantic_linker.py
    ↓
build/cross_refs.json
```
### Audiobook flow
```text
chapter markdown
    ↓
audiobook/extract_text.py
    ↓
trimmed text excerpt
    ↓
audiobook/generate_samples.sh
    ↓
audio sample files
    ↓
audiobook/create_manifest.py
    ↓
audiobook/manifest.md
```
---
## Key Abstractions
### 1. Chapter corpus
The core domain object of the repo is the ordered chapter set:
- `chapters/chapter-01.md` ... `chapters/chapter-18.md`
- exact numbering matters
- heading format matters
- concatenation order matters
Almost every script assumes this ordered corpus is the canonical source of truth.
### 2. Part boundaries (`PARTS`)
`compile.py`, `build/build.py`, and `compile_all.py` each define a `PARTS` mapping.
This injects higher-level narrative structure into the build output by adding part headers and descriptions at fixed chapter boundaries.
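A minimal sketch of how such a mapping might drive assembly is shown below. The boundaries and titles in `PARTS` here are placeholders, not the values in the build scripts, and `assemble` is a hypothetical helper rather than the real `compile_markdown()`.

```python
# Hypothetical boundaries and titles; the real PARTS mapping lives in the build scripts.
PARTS = {
    1: "Part One",
    5: "Part Two",
}


def assemble(chapters, parts=PARTS):
    """Join chapter texts in numeric order, inserting a part header
    before each chapter number that opens a part.

    chapters: dict mapping chapter number -> markdown text.
    """
    blocks = []
    for number in sorted(chapters):
        if number in parts:
            blocks.append(f"# {parts[number]}")
        blocks.append(chapters[number].strip())
    return "\n\n".join(blocks) + "\n"
```

Keying the mapping by the chapter that opens each part keeps the part structure out of the chapter files themselves.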
### 3. Compiled manuscript
`testament-complete.md` is the normalized intermediate artifact.
It is the manuscript assembly layer from which downstream formats are built.
This is the closest thing the repo has to an internal IR (intermediate representation).
### 4. Multi-backend packaging
The build system supports multiple packaging backends:
- pandoc for EPUB and HTML
- xelatex for PDF when available
- weasyprint fallback
- reportlab fallback for fully local pure-Python PDF generation
This is a resilience pattern: the repo prefers multiple production paths rather than a single brittle dependency chain.
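The fallback order can be sketched as a small selection function. This assumes the preference order listed above and separates detection from selection so the ordering is easy to test; whether the real scripts detect weasyprint as a Python module or as a command-line tool is not stated, so `detect_available` is an assumption, as are all the names used here.

```python
import importlib.util
import shutil

# Preference order taken from the list above.
PDF_BACKENDS = ("xelatex", "weasyprint", "reportlab")


def detect_available():
    """Probe the local machine for each backend (assumed detection method)."""
    return {
        "xelatex": shutil.which("xelatex") is not None,
        "weasyprint": importlib.util.find_spec("weasyprint") is not None,
        "reportlab": importlib.util.find_spec("reportlab") is not None,
    }


def pick_pdf_backend(available):
    """Return the first available backend in preference order, or None."""
    for name in PDF_BACKENDS:
        if available.get(name):
            return name
    return None
```

Typical use would be `pick_pdf_backend(detect_available())`, with a `None` result reported as "no PDF backend installed" instead of a crash.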
### 5. Manifested outputs
`build-manifest.json` stores output metadata and SHA256 checksums.
That turns built artifacts into auditable objects rather than opaque files.
### 6. Verification-as-tests
Because there is no `tests/` suite, `scripts/build-verify.py` is effectively the main automated specification for integrity.
It asserts:
- chapter count
- naming/ordering
- heading format
- word-count sanity
- markdown integrity
- concatenation success
- required support files
### 7. Companion surfaces
The repo has non-manuscript presentation surfaces:
- static website
- interactive game/experience (`The Door`)
- audiobook assets and scripts
These make the repo a narrative system, not just a book build.
### 8. Knowledge graph / semantic linking
The repo contains lightweight symbolic tooling:
- regex-based character-to-chapter index generation
- capitalized-phrase cross-reference detection between chapters
This is a GOFAI-like layer over literary content.
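The character-to-chapter index might look roughly like the sketch below. It illustrates the regex approach only; the real logic in `scripts/index_generator.py` is not reproduced here, and the function name and sample inputs are hypothetical.

```python
import re


def build_character_index(characters, chapters):
    """Map each character name to the sorted chapter numbers that mention it.

    characters: iterable of names.
    chapters: dict mapping chapter number -> chapter text.
    """
    index = {}
    for name in characters:
        # Word boundaries keep "Ann" from matching inside "Annual".
        pattern = re.compile(r"\b" + re.escape(name) + r"\b")
        index[name] = sorted(
            number for number, text in chapters.items() if pattern.search(text)
        )
    return index
```

Sorting the chapter numbers keeps the generated output deterministic, which matters if the result is committed as `KNOWLEDGE_GRAPH.md`.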
---
## API Surface
This repo's API surface is mostly CLI-based rather than network-based.
### Canonical CLI surface
#### `compile_all.py`
Commands:
- `python3 compile_all.py`
- `python3 compile_all.py --md`
- `python3 compile_all.py --epub`
- `python3 compile_all.py --pdf`
- `python3 compile_all.py --html`
- `python3 compile_all.py --json`
- `python3 compile_all.py --check`
- `python3 compile_all.py --clean`
Outputs:
- `testament-complete.md`
- `testament.epub`
- `testament.html`
- `testament.pdf`
- `website/chapters.json`
- `build-manifest.json`
#### `build/build.py`
Commands:
- `python3 build/build.py --md`
- `python3 build/build.py --epub`
- `python3 build/build.py --pdf`
- `python3 build/build.py --html`
- default full build behavior
#### `compile.py`
Commands documented:
- `python3 compile.py`
- `python3 compile.py --md`
- `python3 compile.py --epub`
- `python3 compile.py --html`
- `python3 compile.py --check`
Observed quirk:
- `scripts/smoke.sh` calls `python3 compile.py --validate`
- no `--validate` handling exists in source
- the script still exits 0 because `compile.py` ignores unknown args and runs its default build path
That is a real contract quirk/drift worth remembering.
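One way to surface this kind of drift is strict argument parsing. The sketch below uses `argparse` with only the flags documented for `compile.py`; it is not the script's actual parser, and `build_parser` and `parse_strict` are hypothetical names.

```python
import argparse


def build_parser():
    """Parser that accepts only the flags documented for compile.py."""
    parser = argparse.ArgumentParser(prog="compile.py")
    for flag in ("--md", "--epub", "--html", "--check"):
        parser.add_argument(flag, action="store_true")
    return parser


def parse_strict(argv):
    """Return parsed args, or None when argv contains an unknown flag."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints a usage error and exits with status 2 on unknown flags.
        return None
```

In a real CLI the `SystemExit` would be left to propagate, so a call such as `compile.py --validate` would exit non-zero and the smoke script would fail loudly instead of silently running the default build.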
#### `scripts/build-verify.py`
Commands:
- `python3 scripts/build-verify.py`
- `python3 scripts/build-verify.py --ci`
- `python3 scripts/build-verify.py --json`
#### Other tooling
- `python3 website/build-chapters.py`
- `python3 scripts/index_generator.py`
- `python3 build/semantic_linker.py`
- `python3 audiobook/extract_text.py <input.md> <output.txt>`
- `python3 audiobook/create_manifest.py`
- `bash audiobook/generate_samples.sh`
- `bash scripts/smoke.sh`
- `python3 game/the-door.py`
### Data contracts
#### Chapter heading contract
`build-verify.py` expects each chapter to start with:
- `# Chapter N — Title`
#### File naming contract
- chapter files must match `chapter-XX.md`
- exactly 18 chapters are expected by the verifier
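The two contracts above can be expressed as regular expressions. This is a sketch, not the checks in `build-verify.py`; in particular, the cross-check that the filename number matches the heading number is an added assumption, and `check_chapter` is a hypothetical helper.

```python
import re

# \u2014 is the dash character used in the documented heading format.
HEADING_RE = re.compile(r"^# Chapter (\d+) \u2014 (.+)$")
FILENAME_RE = re.compile(r"^chapter-(\d{2})\.md$")


def check_chapter(filename, first_line):
    """Return a list of contract violations for one chapter file."""
    problems = []
    name_match = FILENAME_RE.match(filename)
    head_match = HEADING_RE.match(first_line.strip())
    if not name_match:
        problems.append("bad filename")
    if not head_match:
        problems.append("bad heading")
    # Assumed extra rule: the two numbers should agree.
    if name_match and head_match:
        if int(name_match.group(1)) != int(head_match.group(1)):
            problems.append("number mismatch")
    return problems
```

Returning a list of problems instead of raising on the first one lets a verifier report every violation in a single run.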
#### Output manifest contract
`build-manifest.json` includes, per file:
- path
- size_bytes
- sha256
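A manifest entry with these three fields can be produced with the standard library alone. The sketch below assumes the manifest is a list of entries; the actual top-level shape of `build-manifest.json` is not specified here, and `manifest_entry` and `build_manifest` are hypothetical names.

```python
import hashlib
from pathlib import Path


def manifest_entry(path):
    """Describe one file with the three fields listed above."""
    data = Path(path).read_bytes()
    return {
        "path": str(path),
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def build_manifest(paths):
    """Build entries only for the outputs that actually exist on disk."""
    return [manifest_entry(p) for p in paths if Path(p).is_file()]
```

Skipping missing files means a partial build (for example, no PDF backend installed) still yields a valid manifest for the artifacts that were produced.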
#### Website chapters JSON contract
Entries include:
- `number`
- `title`
- `html`
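To make the entry shape concrete, here is a simplified sketch of turning one chapter into such an entry. It handles only paragraphs and blockquotes and escapes all text; the real converters in `website/build-chapters.py` and `compile_chapters_json()` also handle in-chapter headings and may differ in other details. `chapter_to_entry` is a hypothetical name.

```python
import html
import re

# \u2014 is the dash character used in the documented heading format.
HEADING_RE = re.compile(r"^# Chapter (\d+) \u2014 (.+)$")


def chapter_to_entry(markdown_text):
    """Convert one chapter into a {number, title, html} entry."""
    lines = markdown_text.strip().splitlines()
    match = HEADING_RE.match(lines[0].strip())
    if not match:
        raise ValueError("chapter does not start with the expected heading")
    fragments = []
    body = "\n".join(lines[1:]).strip()
    # Blank lines separate blocks; each block becomes one HTML fragment.
    for block in re.split(r"\n\s*\n", body):
        block_lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not block_lines:
            continue
        if all(ln.startswith(">") for ln in block_lines):
            text = " ".join(ln.lstrip(">").strip() for ln in block_lines)
            fragments.append(f"<blockquote>{html.escape(text)}</blockquote>")
        else:
            text = " ".join(block_lines)
            fragments.append(f"<p>{html.escape(text)}</p>")
    return {
        "number": int(match.group(1)),
        "title": match.group(2),
        "html": "\n".join(fragments),
    }
```

Escaping with `html.escape` keeps stray `<` or `&` characters in the manuscript from breaking the generated markup.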
---
## Test Coverage Gaps
### Current state
There is no unit-test suite and no `tests/` directory.
Coverage is currently provided by:
- shell smoke checks
- build verification script
- CI workflow checks
That means the repo has verification, but not isolated regression tests.
### What is already covered by script-based checks
- chapter count and naming
- heading format
- minimum word-count sanity
- markdown delimiter/link integrity
- concatenation success
- required-file existence
- basic syntax parsing for Python/YAML/shell/JSON
- secret-pattern grep scanning
### Highest-value missing tests
1. `compile_all.py` dependency-check behavior
- there should be a regression test for `--check`
- current runtime already revealed a concrete failure when `qrcode.__version__` is missing
2. `compile_chapters_json()` correctness
- verify all 18 chapters are emitted
- verify blockquotes/headings/italics render as expected
- verify title extraction stays stable
3. Manifest generation
- verify `build-manifest.json` includes every built artifact actually present
- verify sha256 and size fields are correct
4. Build backend selection
- verify fallback order for PDF generation behaves correctly when xelatex/weasyprint/reportlab availability changes
5. `scripts/index_generator.py`
- verify character mention detection and markdown output determinism
6. `build/semantic_linker.py`
- verify the proper-noun extraction and common-word filtering do not produce obviously bad edges
7. Website/output parity
- verify `website/chapters.json` matches chapter headings and ordering from source manuscripts
8. Companion experience smoke tests
- `game/the-door.py` has no automated behavior coverage
- `game/the-door.html` has no structural or syntax verification
### Recommended first tests
If this repo gets a `tests/` directory, start here:
1. `test_compile_all_check_does_not_crash`
2. `test_build_chapters_emits_18_ordered_entries`
3. `test_manifest_contains_existing_outputs`
4. `test_build_verify_rejects_missing_chapter`
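As a starting point, the last of these could be shaped like the sketch below. It tests a stand-in helper, `missing_chapters`, because the internals of `scripts/build-verify.py` are not reproduced here; a real test would call the verifier's own function or CLI instead.

```python
import tempfile
from pathlib import Path

EXPECTED_CHAPTERS = 18  # count expected by the verifier, per the file naming contract


def missing_chapters(chapter_dir, expected=EXPECTED_CHAPTERS):
    """Return the chapter numbers in 1..expected with no chapter-XX.md file."""
    present = {p.name for p in Path(chapter_dir).glob("chapter-*.md")}
    return [n for n in range(1, expected + 1)
            if f"chapter-{n:02d}.md" not in present]


def test_rejects_missing_chapter():
    with tempfile.TemporaryDirectory() as tmp:
        for n in range(1, EXPECTED_CHAPTERS + 1):
            if n != 7:  # leave one gap on purpose
                (Path(tmp) / f"chapter-{n:02d}.md").write_text("# stub\n")
        assert missing_chapters(tmp) == [7]
```

Building the fixture in a temporary directory keeps the test independent of the real `chapters/` folder.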
---
## Security Considerations
### 1. Shelling out to external toolchains
The build system uses subprocess execution for:
- pandoc
- xelatex
- weasyprint-related flows
- helper scripts
This is reasonable for a publishing repo, but it means path handling and shell assumptions matter.
### 2. Remote font dependency in website HTML
`website/index.html` imports Google Fonts via CSS `@import`.
That means the website is not fully sovereign/local-first at render time.
If strict offline/local hosting matters, font bundling would be required.
### 3. Secret scanning exists, but is grep-based
Both CI and `scripts/smoke.sh` perform simple pattern scanning.
That is better than nothing, but it is heuristic rather than structured secret detection.
### 4. Artifact integrity is a strength
`build-manifest.json` with SHA256 hashes is a strong integrity pattern.
It gives the repo a lightweight provenance layer for distributables.
### 5. Build check path currently has a reliability bug
Runtime-confirmed:
- `python3 compile_all.py --check` crashes with:
- `AttributeError: module 'qrcode' has no attribute '__version__'`
This is not a remote exploit issue, but it is an operational integrity issue because the advertised safe preflight check is not robust.
Follow-up issue filed:
- the-testament #51
- https://forge.alexanderwhitestone.com/Timmy_Foundation/the-testament/issues/51
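A more defensive version lookup avoids depending on a module-level `__version__` attribute. The sketch below is one possible approach, not the fix adopted in the follow-up issue: it asks `importlib.metadata` first and falls back to `getattr` with a default. `package_version` is a hypothetical helper, and it assumes the distribution name matches the import name.

```python
import importlib
from importlib import metadata


def package_version(name):
    """Return a version string, "unknown", or None if the package is absent."""
    try:
        # Reads the installed distribution's metadata; no attribute needed.
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        pass
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Some modules define no __version__; report that without raising.
    return getattr(module, "__version__", "unknown")
```

With this shape, a dependency check can distinguish "not installed" (`None`) from "installed, version not reported" (`"unknown"`) without raising `AttributeError`.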
---
## Drift / Contradictions
### 1. README vs runtime word count
README says:
- ~70,000 word target
- ~19,000 words drafted
Runtime verification says:
- ~18,884 words in chapter corpus
- ~19,227 words in concatenated output
This is close enough to be directionally aligned, but the verifier is the stronger factual source for current draft size.
### 2. `compile_all.py --check` is documented but currently broken
Documented behavior:
- dependency verification
Observed behavior:
- crashes on qrcode version lookup
### 3. `scripts/smoke.sh` depends on undocumented `compile.py --validate`
- `compile.py` docs do not list `--validate`
- source contains no explicit `--validate` path
- smoke still passes because `compile.py` ignores unknown flags and performs its default build path
This is a subtle contract mismatch.
### 4. `website/chapters.json` generation is present, but current website landing page does not appear to consume it directly
That suggests either:
- a future/planned reader path
- an external consumer
- or leftover infrastructure from an earlier website design
---
## Practical Mental Model
Think of the-testament as three repos living inside one repository:
1. the manuscript repo
- chapters
- front/back matter
- worldbuilding
- character sheets
2. the publishing pipeline repo
- compile scripts
- verification scripts
- CI workflows
- manifest generation
3. the companion media repo
- website
- audiobook helpers
- interactive game experience
- soundtrack/cover assets
The connective tissue is the manuscript corpus. Almost everything else either:
- transforms it
- packages it
- validates it
- or re-presents it in another medium
---
## Source Files of Highest Importance
1. `compile_all.py`
- canonical unified pipeline
- best single source of repo architecture
2. `scripts/build-verify.py`
- real executable quality contract
3. `build/build.py`
- structured legacy builder still in active use
4. `compile.py`
- older build entrypoint still referenced by smoke flow
5. `website/index.html`
- primary web presentation artifact
6. `website/build-chapters.py`
- chapter-to-web JSON transform
7. `build/metadata.yaml`
- publication metadata contract
8. `build/semantic_linker.py`
- symbolic/literary relationship extraction
---
## Recommended Next Refactors
1. Make `compile_all.py` the only documented build entrypoint
- de-emphasize or retire duplicated legacy flows once parity is confirmed
2. Add real regression tests around build helpers
- especially `compile_all.py --check`
- chapter JSON generation
- manifest generation
3. Clarify the role of `website/chapters.json`
- either wire it into the site, document its consumer, or remove the dead path
4. Fix the undocumented `compile.py --validate` dependency in smoke
- either implement the flag or stop invoking it
5. Decide whether the companion game and website should remain in the same repo or be treated as first-class subprojects with their own tests
---
## Bottom Line
the-testament is a sovereign novel-production repo with a manuscript at the center and a light but real software system around it.
Its architecture is not application-server-centric.
It is pipeline-centric:
- content in
- validated compilation
- multi-format outputs
- integrity metadata
- companion experiences around the text
The strongest technical asset is the layered publishing pipeline plus manuscript verification.
The biggest weakness is the absence of dedicated regression tests around the build system itself.
Source basis for this genome:
- README and manuscript structure docs
- direct source inspection of `compile_all.py`, `build/build.py`, `compile.py`, website/audiobook/indexing/verification scripts
- runtime verification of build and validation commands
- repo scan of content/build/workflow layout