# GENOME.md — the-testament

Generated: 2026-04-15
Repo: Timmy_Foundation/the-testament
Analysis issue: timmy-home #675

---

## Project Overview

The Testament is not a conventional software repo and not just a manuscript dump.
It is a hybrid publishing system with four layers:

1. narrative source files
2. build/packaging pipelines
3. presentation surfaces
4. verification/quality gates

At the content layer, the repo holds a five-part novel with 18 chapter manuscripts, front/back matter, character sheets, worldbuilding notes, cover copy, soundtrack notes, and other companion artifacts.

At the software layer, it ships a small publishing toolchain that compiles the manuscript into:
- combined markdown
- EPUB
- HTML
- PDF
- web-reader JSON
- checksum manifest

It also includes:
- a static promotional/reader website (`website/index.html`)
- an interactive companion experience (`game/the-door.py` / `game/the-door.html`)
- audiobook helper scripts (`audiobook/`)
- validation and smoke-check automation (`scripts/` + `.gitea/workflows/`)

This makes the repo best understood as a sovereign multimedia book production system centered on a novel.

Runtime-confirmed facts from direct verification:
- `scripts/build-verify.py --json` passes and reports 18 chapters
- the verifier reports ~18,884 manuscript words in chapters and ~19,227 words in concatenated output
- `bash scripts/smoke.sh` passes and successfully builds markdown/epub/html
- `python3 build/build.py --md` succeeds
- `python3 compile_all.py --check` currently crashes due to a `qrcode` version lookup bug

---

## Quick Facts

Repository composition from direct scan:
- 18 chapter manuscripts in `chapters/`
- top-level content/support directories include:
  - `chapters/`
  - `build/`
  - `website/`
  - `audiobook/`
  - `game/`
  - `characters/`
  - `worldbuilding/`
  - `cover/`
  - `music/`
- primary code entrypoints are Python scripts plus a static HTML site
- no dedicated `tests/` directory
- validation is script-driven rather than unit-test-driven

Approximate non-output code inventory from `pygount` scan:
- ~3.6K lines of code-equivalent across Python/HTML/CSS/YAML/Bash/JSON
- code mass is concentrated in:
  - `compile_all.py`
  - `build/build.py`
  - `compile.py`
  - `scripts/build-verify.py`
  - `website/index.html`
  - `game/the-door.py`

---

## Architecture

```mermaid
flowchart TD
    A[chapters/*.md] --> B[compile_markdown]
    C[front-matter.md / build/frontmatter.md] --> B
    D[back-matter.md / build/backmatter.md] --> B
    E[build/metadata.yaml] --> F[pandoc/reportlab packaging]
    G[book-style.css] --> F
    H[cover/cover-art.jpg] --> F

    B --> I[testament-complete.md]
    I --> F

    F --> J[testament.epub]
    F --> K[testament.html]
    F --> L[testament.pdf]

    A --> M[compile_chapters_json / website/build-chapters.py]
    M --> N[website/chapters.json]

    I --> O[generate_manifest]
    J --> O
    K --> O
    L --> O
    N --> O
    O --> P[build-manifest.json]

    A --> Q[scripts/index_generator.py]
    R[characters/*.md] --> Q
    Q --> S[KNOWLEDGE_GRAPH.md]

    A --> T[build/semantic_linker.py]
    T --> U[build/cross_refs.json]

    A --> V[audiobook/extract_text.py]
    V --> W[text excerpts]
    W --> X[audiobook/generate_samples.sh]
    X --> Y[audiobook sample files]
    Y --> Z[audiobook/create_manifest.py]
    Z --> AA[audiobook/manifest.md]

    AB[scripts/build-verify.py] --> A
    AB --> I
    AC[scripts/smoke.sh] --> AB
    AD[.gitea workflows] --> AC

    AE[website/index.html] --> AF[static landing/reading experience]
    AG[game/the-door.py / game/the-door.html] --> AH[interactive companion artifact]
```

---

## Entry Points

### Primary build entrypoint
1. `compile_all.py`

This is the canonical unified pipeline.
It builds:
- combined markdown
- EPUB
- PDF
- HTML
- `website/chapters.json`
- `build-manifest.json`

It also exposes:
- `--check`
- `--clean`
- format-specific flags (`--md`, `--epub`, `--pdf`, `--html`, `--json`)

### Legacy build entrypoints
2. `build/build.py`
3. `compile.py`

These overlap with the unified pipeline and still work as alternate build surfaces.
`build/build.py` is the more structured legacy path.
`compile.py` is a simpler older compiler that still shells out to `scripts/index_generator.py` before building.

### Verification entrypoints
4. `scripts/build-verify.py`
5. `scripts/smoke.sh`
6. `.gitea/workflows/build.yml`
7. `.gitea/workflows/smoke.yml`
8. `.gitea/workflows/validate.yml`

These form the repo’s test/CI surface.
There are no unit tests; these scripts are the executable contract.

### Website/content export entrypoints
9. `website/build-chapters.py`
10. `website/index.html`

`build-chapters.py` converts chapter markdown into HTML snippets inside `website/chapters.json`.
`website/index.html` is a large static HTML/CSS/JS page used as the web-facing presentation layer.

### Audiobook entrypoints
11. `audiobook/extract_text.py`
12. `audiobook/create_manifest.py`
13. `audiobook/generate_samples.sh`

These scripts support excerpt extraction, sample generation, and audiobook manifest creation.

### Companion/interactive entrypoints
14. `game/the-door.py`
15. `game/the-door.html`

These are sidecar experiences, not part of the core build pipeline, but they are part of the repo architecture.

### Knowledge/indexing entrypoints
16. `scripts/index_generator.py`
17. `build/semantic_linker.py`

These create graph-like auxiliary artifacts from the manuscript corpus.

---

## Data Flow

### Main book build flow

```text
chapter markdown + front matter + back matter
↓
compile_markdown()
↓
combined manuscript: testament-complete.md
↓
format-specific compilers
├─ pandoc -> EPUB
├─ pandoc -> standalone HTML
├─ xelatex / weasyprint / reportlab -> PDF
└─ metadata/css/cover integrated where available
↓
optional output hashing
↓
build-manifest.json
```

### Website/export flow

```text
chapters/*.md
↓
website/build-chapters.py or compile_all.py::compile_chapters_json()
↓
extract heading + convert paragraphs/quotes/headings to HTML fragments
↓
website/chapters.json
```

Important nuance:
- `website/chapters.json` is produced by the toolchain
- current `website/index.html` appears to be a static landing/presentation page
- no direct `fetch('chapters.json')` usage was found in the current website HTML

So the JSON output is a generated artifact for a web-reader/export path, but not obviously consumed by the checked-in landing page itself.

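The export step in this flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the logic in `website/build-chapters.py`: the helper name `chapter_to_entry`, the blank-line block splitting, and the chosen HTML tags are hypothetical. Only the `number`/`title`/`html` fields and the chapter heading format come from this document.

```python
import html
import json
import re

# Heading format taken from the verifier contract; \u2014 is the dash it uses.
HEADING_RE = re.compile(r"^# Chapter (\d+) \u2014 (.+)$")


def chapter_to_entry(markdown_text: str) -> dict:
    """Turn one chapter's markdown into a {number, title, html} entry (hypothetical helper)."""
    lines = markdown_text.strip().splitlines()
    match = HEADING_RE.match(lines[0])
    if match is None:
        raise ValueError(f"unexpected chapter heading: {lines[0]!r}")

    fragments = []
    # Blank lines separate blocks; each block becomes one HTML fragment.
    for block in "\n".join(lines[1:]).split("\n\n"):
        text = " ".join(part.strip() for part in block.splitlines()).strip()
        if not text:
            continue
        if text.startswith("## "):
            fragments.append(f"<h3>{html.escape(text[3:])}</h3>")
        elif text.startswith(">"):
            quote = " ".join(part.lstrip("> ").strip() for part in block.splitlines()).strip()
            fragments.append(f"<blockquote>{html.escape(quote)}</blockquote>")
        else:
            fragments.append(f"<p>{html.escape(text)}</p>")

    return {
        "number": int(match.group(1)),
        "title": match.group(2).strip(),
        "html": "\n".join(fragments),
    }


# Placeholder chapter text, not taken from the manuscript.
sample = "# Chapter 1 \u2014 Example Title\n\nFirst paragraph.\n\n> A quoted line.\n"
print(json.dumps(chapter_to_entry(sample), indent=2))
```

Escaping the text before wrapping it in tags keeps stray `<` or `&` characters in the manuscript from corrupting the generated fragments.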
### Verification flow

```text
chapter files + required support files
↓
scripts/build-verify.py
├─ count files
├─ validate heading format
├─ compute word counts
├─ check markdown integrity
├─ concatenate outputs
└─ write build-report.json when asked
```

### Knowledge graph / semantic link flow

```text
characters/*.md + chapters/*.md
↓
scripts/index_generator.py
↓
KNOWLEDGE_GRAPH.md

chapters/*.md
↓
build/semantic_linker.py
↓
build/cross_refs.json
```

### Audiobook flow

```text
chapter markdown
↓
audiobook/extract_text.py
↓
trimmed text excerpt
↓
audiobook/generate_samples.sh
↓
audio sample files
↓
audiobook/create_manifest.py
↓
audiobook/manifest.md
```

---

## Key Abstractions

### 1. Chapter corpus
The core domain object of the repo is the ordered chapter set:
- `chapters/chapter-01.md` ... `chapters/chapter-18.md`
- exact numbering matters
- heading format matters
- concatenation order matters

Almost every script assumes this ordered corpus is the canonical source of truth.

### 2. Part boundaries (`PARTS`)
All three builders (`compile.py`, `build/build.py`, and `compile_all.py`) define a `PARTS` mapping.
This injects higher-level narrative structure into the build output by adding part headers and descriptions at fixed chapter boundaries.

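A minimal sketch of how such a mapping can drive concatenation is shown here. The part titles, descriptions, boundary chapters, and the `assemble` helper are placeholders for illustration; the real values live in the build scripts and are not reproduced in this document.

```python
# Hypothetical part boundaries keyed by the first chapter of each part.
PARTS = {
    1: ("Part One", "Placeholder description for the first part."),
    5: ("Part Two", "Placeholder description for the second part."),
}


def assemble(chapters: dict[int, str], parts: dict[int, tuple[str, str]]) -> str:
    """Concatenate chapters in numeric order, inserting a part header at each boundary."""
    pieces = []
    for number in sorted(chapters):
        if number in parts:
            title, description = parts[number]
            pieces.append(f"# {title}\n\n*{description}*")
        pieces.append(chapters[number].strip())
    return "\n\n".join(pieces) + "\n"


# Placeholder chapter bodies, not taken from the manuscript.
demo = {1: "# Chapter 1\n\nText.", 5: "# Chapter 5\n\nMore text."}
print(assemble(demo, PARTS))
```

Because the same mapping is defined in three scripts, any change to a boundary has to be made in all three places until the builders are consolidated.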
### 3. Compiled manuscript
`testament-complete.md` is the normalized intermediate artifact.
It is the manuscript assembly layer from which downstream formats are built.

This is the closest thing the repo has to an internal IR (intermediate representation).

### 4. Multi-backend packaging
The build system supports multiple packaging backends:
- pandoc for EPUB and HTML
- xelatex for PDF when available
- weasyprint fallback
- reportlab fallback for fully local pure-Python PDF generation

This is a resilience pattern: the repo prefers multiple production paths rather than a single brittle dependency chain.

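The PDF fallback order listed above could be expressed as a small selection function. This is a sketch, not the code in `compile_all.py`: the function name `pick_pdf_backend` is hypothetical, and probing weasyprint and reportlab as importable packages (rather than as command-line tools) is an assumption.

```python
import importlib.util
import shutil
from typing import Callable, Optional


def pick_pdf_backend(
    has_command: Callable[[str], bool] = lambda name: shutil.which(name) is not None,
    has_module: Callable[[str], bool] = lambda name: importlib.util.find_spec(name) is not None,
) -> Optional[str]:
    """Return the first available backend in the documented order, or None."""
    if has_command("xelatex"):
        return "xelatex"
    if has_module("weasyprint"):  # assumption: probed as an importable package
        return "weasyprint"
    if has_module("reportlab"):
        return "reportlab"
    return None


# The result depends on what is installed on the machine running this.
print(pick_pdf_backend())
```

Passing the two probes in as parameters lets a test simulate any combination of installed tools without touching the real environment.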
### 5. Manifested outputs
`build-manifest.json` stores output metadata and SHA256 checksums.
That turns built artifacts into auditable objects rather than opaque files.

### 6. Verification-as-tests
Because there is no `tests/` suite, `scripts/build-verify.py` is effectively the main automated specification for integrity.
It asserts:
- chapter count
- naming/ordering
- heading format
- word-count sanity
- markdown integrity
- concatenation success
- required support files

### 7. Companion surfaces
The repo has non-manuscript presentation surfaces:
- static website
- interactive game/experience (`The Door`)
- audiobook assets and scripts

These make the repo a narrative system, not just a book build.

### 8. Knowledge graph / semantic linking
The repo contains lightweight symbolic tooling:
- regex-based character-to-chapter index generation
- capitalized-phrase cross-reference detection between chapters

This is a GOFAI-like layer over literary content.

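As an illustration of the first technique, a regex-based character-to-chapter index can be as small as the sketch below. The function name `index_mentions`, the whole-word matching rule, and the sample names are assumptions; `scripts/index_generator.py` may work differently.

```python
import re
from collections import defaultdict


def index_mentions(characters: list[str], chapters: dict[int, str]) -> dict[str, list[int]]:
    """Map each character name to the sorted chapter numbers that mention it."""
    index = defaultdict(list)
    for name in characters:
        # Whole-word match so a short name does not match inside a longer word.
        pattern = re.compile(rf"\b{re.escape(name)}\b")
        for number in sorted(chapters):
            if pattern.search(chapters[number]):
                index[name].append(number)
    return dict(index)


# Placeholder names and text, not taken from the manuscript.
chapters = {1: "Alice opened the door.", 2: "Bob waited. Alice did not."}
print(index_mentions(["Alice", "Bob"], chapters))  # {'Alice': [1, 2], 'Bob': [2]}
```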
---

## API Surface

This repo’s API surface is mostly CLI-based rather than network-based.

### Canonical CLI surface

#### `compile_all.py`
Commands:
- `python3 compile_all.py`
- `python3 compile_all.py --md`
- `python3 compile_all.py --epub`
- `python3 compile_all.py --pdf`
- `python3 compile_all.py --html`
- `python3 compile_all.py --json`
- `python3 compile_all.py --check`
- `python3 compile_all.py --clean`

Outputs:
- `testament-complete.md`
- `testament.epub`
- `testament.html`
- `testament.pdf`
- `website/chapters.json`
- `build-manifest.json`

#### `build/build.py`
Commands:
- `python3 build/build.py --md`
- `python3 build/build.py --epub`
- `python3 build/build.py --pdf`
- `python3 build/build.py --html`
- default full build behavior

#### `compile.py`
Commands documented:
- `python3 compile.py`
- `python3 compile.py --md`
- `python3 compile.py --epub`
- `python3 compile.py --html`
- `python3 compile.py --check`

Observed quirk:
- `scripts/smoke.sh` calls `python3 compile.py --validate`
- no `--validate` handling exists in source
- the script still exits 0 because `compile.py` ignores unknown args and runs its default build path

That is a real contract quirk/drift worth remembering.

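One way to make this kind of drift fail loudly is strict argument parsing. The sketch below assumes the script were switched to `argparse`; how `compile.py` actually reads its arguments is not shown in this document, and `build_parser` and `parse_strict` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser limited to the flags documented for compile.py."""
    parser = argparse.ArgumentParser(prog="compile.py")
    for flag in ("--md", "--epub", "--html", "--check"):
        parser.add_argument(flag, action="store_true")
    return parser


def parse_strict(argv: list[str]) -> argparse.Namespace:
    # parse_args() exits with status 2 on an unknown flag instead of ignoring it.
    return build_parser().parse_args(argv)


print(parse_strict(["--md"]).md)  # True
```

With a parser like this, `python3 compile.py --validate` would exit non-zero, so `scripts/smoke.sh` would surface the mismatch instead of silently running a full build.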
#### `scripts/build-verify.py`
Commands:
- `python3 scripts/build-verify.py`
- `python3 scripts/build-verify.py --ci`
- `python3 scripts/build-verify.py --json`

#### Other tooling
- `python3 website/build-chapters.py`
- `python3 scripts/index_generator.py`
- `python3 build/semantic_linker.py`
- `python3 audiobook/extract_text.py <input.md> <output.txt>`
- `python3 audiobook/create_manifest.py`
- `bash audiobook/generate_samples.sh`
- `bash scripts/smoke.sh`
- `python3 game/the-door.py`

### Data contracts

#### Chapter heading contract
`build-verify.py` expects each chapter to start with:
- `# Chapter N — Title`

#### File naming contract
- chapter files must match `chapter-XX.md`
- exactly 18 chapters are expected by the verifier

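The two contracts above can be checked with a pair of regular expressions. This is a sketch rather than the verifier's implementation: `check_chapters` is a hypothetical helper, and the rule that the heading number must equal the file number is an assumption not stated in this document.

```python
import re

FILENAME_RE = re.compile(r"^chapter-(\d{2})\.md$")
# \u2014 is the dash character used in the heading contract.
HEADING_RE = re.compile(r"^# Chapter (\d+) \u2014 \S.*$")
EXPECTED_CHAPTERS = 18


def check_chapters(files: dict[str, str]) -> list[str]:
    """Return contract violations for a {filename: first_line} mapping."""
    problems = []
    numbers = []
    for name, first_line in sorted(files.items()):
        name_match = FILENAME_RE.match(name)
        if name_match is None:
            problems.append(f"{name}: file name does not match chapter-XX.md")
            continue
        numbers.append(int(name_match.group(1)))
        heading = HEADING_RE.match(first_line)
        if heading is None:
            problems.append(f"{name}: heading does not match the expected format")
        elif int(heading.group(1)) != numbers[-1]:
            # Assumption: heading number and file number should agree.
            problems.append(f"{name}: heading number differs from file number")
    if numbers != list(range(1, EXPECTED_CHAPTERS + 1)):
        problems.append(f"expected chapters 1..{EXPECTED_CHAPTERS}, found {numbers}")
    return problems


# Placeholder titles, not taken from the manuscript.
files = {f"chapter-{n:02d}.md": f"# Chapter {n} \u2014 Title {n}" for n in range(1, 19)}
print(check_chapters(files))  # []
```

Returning a list of problems instead of raising on the first one lets a single run report every broken chapter.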
#### Output manifest contract
`build-manifest.json` includes, per file:
- path
- size_bytes
- sha256

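A per-file entry with these three fields can be produced with the standard library alone. The sketch below is illustrative: the helper names and the choice of a flat list as the top-level shape are assumptions, since the actual layout of `build-manifest.json` is not shown here.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def manifest_entry(path: Path) -> dict:
    """Describe one built artifact with path, size_bytes, and sha256."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large PDFs or EPUBs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "path": str(path),
        "size_bytes": path.stat().st_size,
        "sha256": digest.hexdigest(),
    }


def build_manifest(paths: list[Path]) -> list[dict]:
    # Only artifacts that actually exist on disk are recorded.
    return [manifest_entry(p) for p in paths if p.is_file()]


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "testament-complete.md"
    artifact.write_bytes(b"hello\n")
    print(json.dumps(build_manifest([artifact, Path(tmp) / "missing.pdf"]), indent=2))
```

Skipping missing files mirrors the idea that the manifest should describe what was actually built, which is also what a regression test for manifest generation would need to assert.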
#### Website chapters JSON contract
Entries include:
- `number`
- `title`
- `html`

---

## Test Coverage Gaps

### Current state
There is no unit-test suite and no `tests/` directory.
Coverage is currently provided by:
- shell smoke checks
- build verification script
- CI workflow checks

That means the repo has verification, but not isolated regression tests.

### What is already covered by script-based checks
- chapter count and naming
- heading format
- minimum word-count sanity
- markdown delimiter/link integrity
- concatenation success
- required-file existence
- basic syntax parsing for Python/YAML/shell/JSON
- secret-pattern grep scanning

### Highest-value missing tests

1. `compile_all.py` dependency-check behavior
   - there should be a regression test for `--check`
   - a runtime check already revealed a concrete failure when `qrcode.__version__` is missing

2. `compile_chapters_json()` correctness
   - verify all 18 chapters are emitted
   - verify blockquotes/headings/italics render as expected
   - verify title extraction stays stable

3. Manifest generation
   - verify `build-manifest.json` includes every built artifact actually present
   - verify sha256 and size fields are correct

4. Build backend selection
   - verify fallback order for PDF generation behaves correctly when xelatex/weasyprint/reportlab availability changes

5. `scripts/index_generator.py`
   - verify character mention detection and markdown output determinism

6. `build/semantic_linker.py`
   - verify the proper-noun extraction and common-word filtering do not produce obviously bad edges

7. Website/output parity
   - verify `website/chapters.json` matches chapter headings and ordering from source manuscripts

8. Companion experience smoke tests
   - `game/the-door.py` has no automated behavior coverage
   - `game/the-door.html` has no structural or syntax verification

### Recommended first tests
If this repo gets a `tests/` directory, start here:
1. `test_compile_all_check_does_not_crash`
2. `test_build_chapters_emits_18_ordered_entries`
3. `test_manifest_contains_existing_outputs`
4. `test_build_verify_rejects_missing_chapter`

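The first of these tests might look like the pytest-style sketch below. It assumes the test runs from the repository root and treats "crash" as a non-zero exit accompanied by a Python traceback; the helper `crashed` is hypothetical, and whether `--check` should also exit 0 when a dependency is missing is left open.

```python
import subprocess
import sys


def crashed(command: list[str]) -> bool:
    """True when the command died with an uncaught Python exception."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode != 0 and "Traceback (most recent call last)" in result.stderr


def test_compile_all_check_does_not_crash():
    # Assumes the working directory is the repository root.
    assert not crashed([sys.executable, "compile_all.py", "--check"])
```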
---

## Security Considerations

### 1. Shelling out to external toolchains
The build system uses subprocess execution for:
- pandoc
- xelatex
- weasyprint-related flows
- helper scripts

This is reasonable for a publishing repo, but it means path handling and shell assumptions matter.

### 2. Remote font dependency in website HTML
`website/index.html` imports Google Fonts via CSS `@import`.
That means the website is not fully sovereign/local-first at render time.
If strict offline/local hosting matters, font bundling would be required.

### 3. Secret scanning exists, but is grep-based
Both CI and `scripts/smoke.sh` perform simple pattern scanning.
That is better than nothing, but it is heuristic rather than structured secret detection.

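For readers unfamiliar with this style of check, a pattern scan amounts to something like the sketch below. The patterns and the `scan_text` helper are illustrative only; the expressions actually used in CI and `scripts/smoke.sh` are not reproduced in this document.

```python
import re

# Illustrative patterns, not the repo's actual grep expressions.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "assigned_token": re.compile(
        r"(?i)\b(?:api[_-]?key|token|secret)\b\s*[:=]\s*['\"][^'\"]{12,}['\"]"
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


sample = 'title = "The Testament"\napi_key = "0123456789abcdef"\n'
print(scan_text(sample))  # [(2, 'assigned_token')]
```

A scan like this only catches secrets that resemble a known pattern, which is why it counts as a heuristic: anything with an unfamiliar shape passes unnoticed.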
### 4. Artifact integrity is a strength
`build-manifest.json` with SHA256 hashes is a strong integrity pattern.
It gives the repo a lightweight provenance layer for distributables.

### 5. Build check path currently has a reliability bug
Runtime-confirmed:
- `python3 compile_all.py --check` crashes with:
  - `AttributeError: module 'qrcode' has no attribute '__version__'`

This is not a remote exploit issue, but it is an operational integrity issue because the advertised safe preflight check is not robust.

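A more defensive version lookup avoids reading `__version__` directly. The sketch below is one possible approach, not the fix applied in the repo: `safe_version` is a hypothetical helper built on the standard library's `importlib.metadata`.

```python
import importlib
from importlib import metadata


def safe_version(package: str) -> str:
    """Best-effort version string that never raises for a missing attribute."""
    try:
        # Preferred: read the installed distribution's metadata.
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        pass
    try:
        module = importlib.import_module(package)
    except ImportError:
        return "not installed"
    # Fall back to __version__ only if the module happens to define it.
    return str(getattr(module, "__version__", "unknown"))


# The result depends on what is installed on the machine running this.
print(safe_version("qrcode"))
```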
Follow-up issue filed:
- the-testament #51
- https://forge.alexanderwhitestone.com/Timmy_Foundation/the-testament/issues/51

---

## Drift / Contradictions

### 1. README vs runtime word count
README says:
- ~70,000 word target
- ~19,000 words drafted

Runtime verification says:
- ~18,884 words in chapter corpus
- ~19,227 words in concatenated output

This is close enough to be directionally aligned, but the verifier is the stronger factual source for current draft size.

### 2. `compile_all.py --check` is documented but currently broken
Documented behavior:
- dependency verification

Observed behavior:
- crashes on qrcode version lookup

### 3. `scripts/smoke.sh` depends on undocumented `compile.py --validate`
- `compile.py` docs do not list `--validate`
- source contains no explicit `--validate` path
- smoke still passes because `compile.py` ignores unknown flags and performs its default build path

This is a subtle contract mismatch.

### 4. `website/chapters.json` generation is present, but current website landing page does not appear to consume it directly
That suggests either:
- a future/planned reader path
- an external consumer
- or leftover infrastructure from an earlier website design

---

## Practical Mental Model

Think of the-testament as three repos living inside one repository:

1. the manuscript repo
   - chapters
   - front/back matter
   - worldbuilding
   - character sheets

2. the publishing pipeline repo
   - compile scripts
   - verification scripts
   - CI workflows
   - manifest generation

3. the companion media repo
   - website
   - audiobook helpers
   - interactive game experience
   - soundtrack/cover assets

The connective tissue is the manuscript corpus. Almost everything else either:
- transforms it
- packages it
- validates it
- or re-presents it in another medium

---

## Source Files of Highest Importance

1. `compile_all.py`
   - canonical unified pipeline
   - best single source of repo architecture

2. `scripts/build-verify.py`
   - real executable quality contract

3. `build/build.py`
   - structured legacy builder still in active use

4. `compile.py`
   - older build entrypoint still referenced by smoke flow

5. `website/index.html`
   - primary web presentation artifact

6. `website/build-chapters.py`
   - chapter-to-web JSON transform

7. `build/metadata.yaml`
   - publication metadata contract

8. `build/semantic_linker.py`
   - symbolic/literary relationship extraction

---

## Recommended Next Refactors

1. Make `compile_all.py` the only documented build entrypoint
   - de-emphasize or retire duplicated legacy flows once parity is confirmed

2. Add real regression tests around build helpers
   - especially `compile_all.py --check`
   - chapter JSON generation
   - manifest generation

3. Clarify the role of `website/chapters.json`
   - either wire it into the site, document its consumer, or remove the dead path

4. Fix the undocumented `compile.py --validate` dependency in smoke
   - either implement the flag or stop invoking it

5. Decide whether the companion game and website should remain in the same repo or be treated as first-class subprojects with their own tests

---

## Bottom Line

the-testament is a sovereign novel-production repo with a manuscript at the center and a light but real software system around it.

Its architecture is not application-server-centric.
It is pipeline-centric:
- content in
- validated compilation
- multi-format outputs
- integrity metadata
- companion experiences around the text

The strongest technical asset is the layered publishing pipeline plus manuscript verification.
The biggest weakness is the absence of dedicated regression tests around the build system itself.

Source basis for this genome:
- README and manuscript structure docs
- direct source inspection of `compile_all.py`, `build/build.py`, `compile.py`, website/audiobook/indexing/verification scripts
- runtime verification of build and validation commands
- repo scan of content/build/workflow layout