Compare commits


1 Commit

Author SHA1 Message Date
Alexander Whitestone
6416b776db feat: add codebase genome pipeline (#665)
Some checks failed
Smoke Test / smoke (pull_request) Failing after 27s
2026-04-15 00:14:55 -04:00
28 changed files with 1074 additions and 1816 deletions

GENOME.md (new file, 141 lines)

@@ -0,0 +1,141 @@
# GENOME.md — Timmy_Foundation/timmy-home
Generated by `pipelines/codebase_genome.py`.
## Project Overview
Timmy Foundation's home repository for development operations and configurations.
- Text files indexed: 3004
- Source and script files: 186
- Test files: 28
- Documentation files: 701
## Architecture
```mermaid
graph TD
repo_root["repo"]
angband["angband"]
briefings["briefings"]
config["config"]
conftest["conftest"]
evennia["evennia"]
evennia_tools["evennia_tools"]
evolution["evolution"]
gemini_fallback_setup["gemini-fallback-setup"]
heartbeat["heartbeat"]
infrastructure["infrastructure"]
repo_root --> angband
repo_root --> briefings
repo_root --> config
repo_root --> conftest
repo_root --> evennia
repo_root --> evennia_tools
```
## Entry Points
- `gemini-fallback-setup.sh` — operational script (`bash gemini-fallback-setup.sh`)
- `morrowind/hud.sh` — operational script (`bash morrowind/hud.sh`)
- `pipelines/codebase_genome.py` — python main guard (`python3 pipelines/codebase_genome.py`)
- `scripts/auto_restart_agent.sh` — operational script (`bash scripts/auto_restart_agent.sh`)
- `scripts/backup_pipeline.sh` — operational script (`bash scripts/backup_pipeline.sh`)
- `scripts/big_brain_manager.py` — operational script (`python3 scripts/big_brain_manager.py`)
- `scripts/big_brain_repo_audit.py` — operational script (`python3 scripts/big_brain_repo_audit.py`)
- `scripts/codebase_genome_nightly.py` — operational script (`python3 scripts/codebase_genome_nightly.py`)
- `scripts/detect_secrets.py` — operational script (`python3 scripts/detect_secrets.py`)
- `scripts/dynamic_dispatch_optimizer.py` — operational script (`python3 scripts/dynamic_dispatch_optimizer.py`)
- `scripts/emacs-fleet-bridge.py` — operational script (`python3 scripts/emacs-fleet-bridge.py`)
- `scripts/emacs-fleet-poll.sh` — operational script (`bash scripts/emacs-fleet-poll.sh`)
## Data Flow
1. Operators enter through `gemini-fallback-setup.sh`, `morrowind/hud.sh`, `pipelines/codebase_genome.py`.
2. Core logic fans into top-level components: `angband`, `briefings`, `config`, `conftest`, `evennia`, `evennia_tools`.
3. Validation is incomplete around `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py`, `timmy-local/cache/agent_cache.py`, `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py`, so changes there carry regression risk.
4. Final artifacts land as repository files, docs, or runtime side effects depending on the selected entry point.
## Key Abstractions
- `evennia/timmy_world/game.py` — classes `World`:91, `ActionSystem`:421, `TimmyAI`:539, `NPCAI`:550; functions `get_narrative_phase()`:55, `get_phase_transition_event()`:65
- `evennia/timmy_world/world/game.py` — classes `World`:19, `ActionSystem`:326, `TimmyAI`:444, `NPCAI`:455; functions none detected
- `timmy-world/game.py` — classes `World`:19, `ActionSystem`:349, `TimmyAI`:467, `NPCAI`:478; functions none detected
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — classes none detected; functions none detected
- `uniwizard/self_grader.py` — classes `SessionGrade`:23, `WeeklyReport`:55, `SelfGrader`:74; functions `main()`:713
- `uni-wizard/v3/intelligence_engine.py` — classes `ExecutionPattern`:27, `ModelPerformance`:44, `AdaptationEvent`:58, `PatternDatabase`:69; functions none detected
- `scripts/know_thy_father/crossref_audit.py` — classes `ThemeCategory`:30, `Principle`:160, `MeaningKernel`:169, `CrossRefFinding`:178; functions `extract_themes_from_text()`:192, `parse_soul_md()`:206, `parse_kernels()`:264, `cross_reference()`:296, `generate_report()`:440, `main()`:561
- `timmy-local/cache/agent_cache.py` — classes `CacheStats`:28, `LRUCache`:52, `ResponseCache`:94, `ToolCache`:205; functions none detected
## API Surface
- CLI: `bash gemini-fallback-setup.sh` — operational script (`gemini-fallback-setup.sh`)
- CLI: `bash morrowind/hud.sh` — operational script (`morrowind/hud.sh`)
- CLI: `python3 pipelines/codebase_genome.py` — python main guard (`pipelines/codebase_genome.py`)
- CLI: `bash scripts/auto_restart_agent.sh` — operational script (`scripts/auto_restart_agent.sh`)
- CLI: `bash scripts/backup_pipeline.sh` — operational script (`scripts/backup_pipeline.sh`)
- CLI: `python3 scripts/big_brain_manager.py` — operational script (`scripts/big_brain_manager.py`)
- CLI: `python3 scripts/big_brain_repo_audit.py` — operational script (`scripts/big_brain_repo_audit.py`)
- CLI: `python3 scripts/codebase_genome_nightly.py` — operational script (`scripts/codebase_genome_nightly.py`)
- Python: `get_narrative_phase()` from `evennia/timmy_world/game.py:55`
- Python: `get_phase_transition_event()` from `evennia/timmy_world/game.py:65`
- Python: `main()` from `uniwizard/self_grader.py:713`
## Test Coverage Report
- Source and script files inspected: 186
- Test files inspected: 28
- Coverage gaps:
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — no matching test reference detected
- `timmy-local/cache/agent_cache.py` — no matching test reference detected
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py` — no matching test reference detected
- `twitter-archive/multimodal_pipeline.py` — no matching test reference detected
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/godmode_race.py` — no matching test reference detected
- `skills/productivity/google-workspace/scripts/google_api.py` — no matching test reference detected
- `wizards/allegro/home/skills/productivity/google-workspace/scripts/google_api.py` — no matching test reference detected
- `morrowind/pilot.py` — no matching test reference detected
- `morrowind/mcp_server.py` — no matching test reference detected
- `skills/research/domain-intel/scripts/domain_intel.py` — no matching test reference detected
- `wizards/allegro/home/skills/research/domain-intel/scripts/domain_intel.py` — no matching test reference detected
- `timmy-local/scripts/ingest.py` — no matching test reference detected
## Security Audit Findings
- [medium] `briefings/briefing_20260325.json:37` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"gitea_error": "Gitea 404: {\"errors\":null,\"message\":\"not found\",\"url\":\"http://143.198.27.163:3000/api/swagger\"}\n [http://143.198.27.163:3000/api/v1/repos/Timmy_Foundation/sovereign-orchestration/issues?state=open&type=issues&sort=created&direction=desc&limit=1&page=1]",`
- [medium] `briefings/briefing_20260328.json:11` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"provider_base_url": "http://localhost:8081/v1",`
- [medium] `briefings/briefing_20260329.json:11` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"provider_base_url": "http://localhost:8081/v1",`
- [medium] `config.yaml:37` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `summary_base_url: http://localhost:11434/v1`
- [medium] `config.yaml:47` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:52` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:57` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:62` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:67` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:77` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:82` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:174` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: http://localhost:11434/v1`
## Dead Code Candidates
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — not imported by indexed Python modules and not referenced by tests
- `timmy-local/cache/agent_cache.py` — not imported by indexed Python modules and not referenced by tests
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py` — not imported by indexed Python modules and not referenced by tests
- `twitter-archive/multimodal_pipeline.py` — not imported by indexed Python modules and not referenced by tests
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/godmode_race.py` — not imported by indexed Python modules and not referenced by tests
- `skills/productivity/google-workspace/scripts/google_api.py` — not imported by indexed Python modules and not referenced by tests
- `wizards/allegro/home/skills/productivity/google-workspace/scripts/google_api.py` — not imported by indexed Python modules and not referenced by tests
- `morrowind/pilot.py` — not imported by indexed Python modules and not referenced by tests
- `morrowind/mcp_server.py` — not imported by indexed Python modules and not referenced by tests
- `skills/research/domain-intel/scripts/domain_intel.py` — not imported by indexed Python modules and not referenced by tests
## Performance Bottleneck Analysis
- `angband/mcp_server.py` — large module (353 lines) likely hides multiple responsibilities
- `evennia/timmy_world/game.py` — large module (1541 lines) likely hides multiple responsibilities
- `evennia/timmy_world/world/game.py` — large module (1345 lines) likely hides multiple responsibilities
- `morrowind/mcp_server.py` — large module (451 lines) likely hides multiple responsibilities
- `morrowind/pilot.py` — large module (459 lines) likely hides multiple responsibilities
- `pipelines/codebase_genome.py` — large module (557 lines) likely hides multiple responsibilities
- `scripts/know_thy_father/crossref_audit.py` — large module (657 lines) likely hides multiple responsibilities
- `scripts/know_thy_father/index_media.py` — large module (405 lines) likely hides multiple responsibilities
- `scripts/know_thy_father/synthesize_kernels.py` — large module (416 lines) likely hides multiple responsibilities
- `scripts/tower_game.py` — large module (395 lines) likely hides multiple responsibilities


@@ -0,0 +1,79 @@
# Codebase Genome Pipeline
Issue: `timmy-home#665`
This pipeline gives Timmy a repeatable way to generate a deterministic `GENOME.md` for any repository and rotate through the org nightly.
## What landed
- `pipelines/codebase_genome.py` — static analyzer that writes `GENOME.md`
- `pipelines/codebase-genome.py` — thin CLI wrapper matching the expected pipeline-style entrypoint
- `scripts/codebase_genome_nightly.py` — org-aware nightly runner that selects the next repo, updates a local checkout, and writes the genome artifact
- `GENOME.md` — generated analysis for `timmy-home` itself
## Genome output
Each generated `GENOME.md` includes:
- project overview and repository size metrics
- Mermaid architecture diagram
- entry points and API surface
- data flow summary
- key abstractions from Python source
- test coverage gaps
- security audit findings
- dead code candidates
- performance bottleneck analysis
## Single-repo usage
```bash
python3 pipelines/codebase_genome.py \
--repo-root /path/to/repo \
--repo-name Timmy_Foundation/some-repo \
--output /path/to/repo/GENOME.md
```
The hyphenated wrapper also works:
```bash
python3 pipelines/codebase-genome.py --repo-root /path/to/repo --repo Timmy_Foundation/some-repo
```
## Nightly org rotation
Dry-run the next selection:
```bash
python3 scripts/codebase_genome_nightly.py --dry-run
```
Run one real pass:
```bash
python3 scripts/codebase_genome_nightly.py \
--org Timmy_Foundation \
--workspace-root ~/timmy-foundation-repos \
--output-root ~/.timmy/codebase-genomes \
--state-path ~/.timmy/codebase_genome_state.json
```
Behavior:
1. fetches the current repo list from Gitea
2. selects the next repo after the last recorded run
3. clones or fast-forwards the local checkout
4. writes `GENOME.md` into the configured output tree
5. updates the rotation state file
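The selection step can be sketched in a few lines. This is a hedged approximation of step 2 only: the function name, state shape, and wrap-around rule are assumptions for illustration, not the actual logic inside `scripts/codebase_genome_nightly.py`.

```python
# Hypothetical sketch of the rotation step: pick the repo that follows the
# last recorded run in a stable order, wrapping around at the end.
# The state shape ({"last_repo": ...}) is an assumption, not the real
# state file schema.
def select_next_repo(repos, state):
    """Return the repo after state['last_repo'] in sorted order."""
    ordered = sorted(repos)
    last = state.get("last_repo")
    if last not in ordered:
        # First run, or the last repo was deleted from the org: start over.
        return ordered[0]
    return ordered[(ordered.index(last) + 1) % len(ordered)]
```

Sorting before indexing keeps the rotation deterministic even if the Gitea API returns repos in a different order between runs.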
## Example cron entry
```cron
30 2 * * * cd ~/timmy-home && /usr/bin/env python3 scripts/codebase_genome_nightly.py --org Timmy_Foundation --workspace-root ~/timmy-foundation-repos --output-root ~/.timmy/codebase-genomes --state-path ~/.timmy/codebase_genome_state.json >> ~/.timmy/logs/codebase_genome_nightly.log 2>&1
```
## Limits and follow-ons
- the generator is deterministic and static; it does not hallucinate architecture, but it also does not replace a full human review pass
- nightly rotation handles genome generation; auto-generated test expansion remains a separate follow-on lane
- large repos may still need a second-pass human edit after the initial genome artifact lands


@@ -1,61 +0,0 @@
# Know Thy Father — Multimodal Media Consumption Pipeline
Refs #582
This document makes the epic operational by naming the current source-of-truth scripts, their handoff artifacts, and the one-command runner that coordinates them.
## Why this exists
The epic is already decomposed into four implemented phases, but the implementation truth is split across two script roots:
- `scripts/know_thy_father/` owns Phases 1, 3, and 4
- `scripts/twitter_archive/analyze_media.py` owns Phase 2
- `twitter-archive/know-thy-father/tracker.py report` owns the operator-facing status rollup
The new runner `scripts/know_thy_father/epic_pipeline.py` does not replace those scripts. It stitches them together into one explicit, reviewable plan.
## Phase map
| Phase | Script | Primary output |
|-------|--------|----------------|
| 1. Media Indexing | `scripts/know_thy_father/index_media.py` | `twitter-archive/know-thy-father/media_manifest.jsonl` |
| 2. Multimodal Analysis | `scripts/twitter_archive/analyze_media.py --batch 10` | `twitter-archive/know-thy-father/analysis.jsonl` + `meaning-kernels.jsonl` + `pipeline-status.json` |
| 3. Holographic Synthesis | `scripts/know_thy_father/synthesize_kernels.py` | `twitter-archive/knowledge/fathers_ledger.jsonl` |
| 4. Cross-Reference Audit | `scripts/know_thy_father/crossref_audit.py` | `twitter-archive/notes/crossref_report.md` |
| 5. Processing Log | `twitter-archive/know-thy-father/tracker.py report` | `twitter-archive/know-thy-father/REPORT.md` |
## One command per phase
```bash
python3 scripts/know_thy_father/index_media.py --tweets twitter-archive/extracted/tweets.jsonl --output twitter-archive/know-thy-father/media_manifest.jsonl
python3 scripts/twitter_archive/analyze_media.py --batch 10
python3 scripts/know_thy_father/synthesize_kernels.py --input twitter-archive/media/manifest.jsonl --output twitter-archive/knowledge/fathers_ledger.jsonl --summary twitter-archive/knowledge/fathers_ledger.summary.json
python3 scripts/know_thy_father/crossref_audit.py --soul SOUL.md --kernels twitter-archive/notes/know_thy_father_crossref.md --output twitter-archive/notes/crossref_report.md
python3 twitter-archive/know-thy-father/tracker.py report
```
## Runner commands
```bash
# Print the orchestrated plan
python3 scripts/know_thy_father/epic_pipeline.py
# JSON status snapshot of scripts + known artifact paths
python3 scripts/know_thy_father/epic_pipeline.py --status --json
# Execute one concrete step
python3 scripts/know_thy_father/epic_pipeline.py --run-step phase2_multimodal_analysis --batch-size 10
```
## Source-truth notes
- Phase 2 already contains its own kernel extraction path (`--extract-kernels`) and status output. The epic runner does not reimplement that logic.
- Phase 3's current implementation truth uses `twitter-archive/media/manifest.jsonl` as its default input. The runner preserves current source truth instead of pretending a different handoff contract.
- The processing log in `twitter-archive/know-thy-father/PROCESSING_LOG.md` can drift from current code reality. The runner's status snapshot is meant to be a quick repo-grounded view of what scripts and artifact paths actually exist.
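A repo-grounded status snapshot of the kind described above can be approximated by checking whether the phase-map artifact paths exist on disk. This is a minimal sketch, not the real `epic_pipeline.py --status --json` output; the step names and the subset of artifact paths are taken from the phase map but the JSON shape is an assumption.

```python
import json
from pathlib import Path

# Artifact paths from the phase map; only a representative subset.
PHASE_ARTIFACTS = {
    "phase1_media_indexing": "twitter-archive/know-thy-father/media_manifest.jsonl",
    "phase3_synthesis": "twitter-archive/knowledge/fathers_ledger.jsonl",
    "phase4_crossref_audit": "twitter-archive/notes/crossref_report.md",
}

def status_snapshot(repo_root="."):
    """Map each phase to whether its output artifact exists under repo_root."""
    root = Path(repo_root)
    return {name: (root / rel).exists() for name, rel in PHASE_ARTIFACTS.items()}

print(json.dumps(status_snapshot(), indent=2))
```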
## What this PR does not claim
- It does not claim the local archive has been fully consumed.
- It does not claim the halted processing log has been resumed.
- It does not claim fact_store ingestion has been fully wired end-to-end.
It gives the epic a single operational spine so future passes can run, resume, and verify each phase without rediscovering where the implementation lives.


@@ -1,92 +0,0 @@
# MemPalace v3.0.0 — Ezra Integration Packet
This packet turns issue #570 into an executable, reviewable integration plan for Ezra's Hermes home.
It is a repo-side scaffold: no live Ezra host changes are claimed in this artifact.
## Commands
```bash
pip install mempalace==3.0.0
mempalace init ~/.hermes/ --yes
cat > ~/.hermes/mempalace.yaml <<'YAML'
wing: ezra_home
palace: ~/.mempalace/palace
rooms:
- name: sessions
description: Conversation history and durable agent transcripts
globs:
- "*.json"
- "*.jsonl"
- name: config
description: Hermes configuration and runtime settings
globs:
- "*.yaml"
- "*.yml"
- "*.toml"
- name: docs
description: Notes, markdown docs, and operating reports
globs:
- "*.md"
- "*.txt"
people: []
projects: []
YAML
echo "" | mempalace mine ~/.hermes/
echo "" | mempalace mine ~/.hermes/sessions/ --mode convos
mempalace search "your common queries"
mempalace wake-up
hermes mcp add mempalace -- python -m mempalace.mcp_server
```
## Manual config template
```yaml
wing: ezra_home
palace: ~/.mempalace/palace
rooms:
- name: sessions
description: Conversation history and durable agent transcripts
globs:
- "*.json"
- "*.jsonl"
- name: config
description: Hermes configuration and runtime settings
globs:
- "*.yaml"
- "*.yml"
- "*.toml"
- name: docs
description: Notes, markdown docs, and operating reports
globs:
- "*.md"
- "*.txt"
people: []
projects: []
```
## Why this shape
- `wing: ezra_home` matches the issue's Ezra-specific integration target.
- `rooms` split the mined material into sessions, config, and docs to keep retrieval interpretable.
- Mining commands pipe empty stdin to avoid the interactive entity-detector hang noted in the evaluation.
## Gotchas
- `mempalace init` is still interactive in its room-approval flow; write `mempalace.yaml` manually if the init output stalls.
- The YAML key is `wing:`, not `wings:`. Using the wrong key causes mine/setup failures.
- Pipe empty stdin into mining commands (`echo "" | ...`) to avoid the entity-detector stdin hang on larger directories.
- First mine downloads the ChromaDB embedding model cache (~79MB).
- Report Ezra's before/after metrics back to issue #568 after live installation and retrieval tests.
## Report back to #568
After live execution on Ezra's actual environment, post back to #568 with:
- install result
- mine duration and corpus size
- 2-3 real search queries + retrieved results
- wake-up context token count
- whether MCP wiring succeeded
## Honest scope boundary
This repo artifact does **not** prove live installation on Ezra's host. It makes the work reproducible and testable so the next pass can execute it without guesswork.


@@ -12,6 +12,7 @@ Quick-reference index for common operational tasks across the Timmy Foundation i
| Check fleet health | fleet-ops | `python3 scripts/fleet_readiness.py` |
| Agent scorecard | fleet-ops | `python3 scripts/agent_scorecard.py` |
| View fleet manifest | fleet-ops | `cat manifest.yaml` |
| Run nightly codebase genome pass | timmy-home | `python3 scripts/codebase_genome_nightly.py --dry-run` |
## the-nexus (Frontend + Brain)


@@ -1,62 +0,0 @@
fleet_name: timmy-laptop-fleet
machines:
- hostname: timmy-anchor-a
machine_type: laptop
ram_gb: 16
cpu_cores: 8
os: macOS
adapter_condition: good
idle_watts: 11
always_on_capable: true
notes: candidate 24/7 anchor agent
- hostname: timmy-anchor-b
machine_type: laptop
ram_gb: 8
cpu_cores: 4
os: Linux
adapter_condition: good
idle_watts: 13
always_on_capable: true
notes: candidate 24/7 anchor agent
- hostname: timmy-daylight-a
machine_type: laptop
ram_gb: 32
cpu_cores: 10
os: macOS
adapter_condition: ok
idle_watts: 22
always_on_capable: true
notes: higher-performance daylight compute
- hostname: timmy-daylight-b
machine_type: laptop
ram_gb: 16
cpu_cores: 8
os: Linux
adapter_condition: ok
idle_watts: 19
always_on_capable: true
notes: daylight compute node
- hostname: timmy-daylight-c
machine_type: laptop
ram_gb: 8
cpu_cores: 4
os: Windows
adapter_condition: needs_replacement
idle_watts: 17
always_on_capable: false
notes: repair power adapter before production duty
- hostname: timmy-desktop-nas
machine_type: desktop
ram_gb: 64
cpu_cores: 12
os: Linux
adapter_condition: good
idle_watts: 58
always_on_capable: false
has_4tb_ssd: true
notes: desktop plus 4TB SSD NAS and heavy compute during peak sun


@@ -1,30 +0,0 @@
# Laptop Fleet Deployment Plan
Fleet: timmy-laptop-fleet
Machine count: 6
24/7 anchor agents: timmy-anchor-a, timmy-anchor-b
Desktop/NAS: timmy-desktop-nas
Daylight schedule: 10:00-16:00
## Role mapping
| Hostname | Role | Schedule | Duty cycle |
|---|---|---|---|
| timmy-anchor-a | anchor_agent | 24/7 | continuous |
| timmy-anchor-b | anchor_agent | 24/7 | continuous |
| timmy-daylight-a | daylight_agent | 10:00-16:00 | peak_solar |
| timmy-daylight-b | daylight_agent | 10:00-16:00 | peak_solar |
| timmy-daylight-c | daylight_agent | 10:00-16:00 | peak_solar |
| timmy-desktop-nas | desktop_nas | 10:00-16:00 | daylight_only |
## Machine inventory
| Hostname | Type | RAM | CPU cores | OS | Adapter | Idle watts | Notes |
|---|---|---:|---:|---|---|---:|---|
| timmy-anchor-a | laptop | 16 | 8 | macOS | good | 11 | candidate 24/7 anchor agent |
| timmy-anchor-b | laptop | 8 | 4 | Linux | good | 13 | candidate 24/7 anchor agent |
| timmy-daylight-a | laptop | 32 | 10 | macOS | ok | 22 | higher-performance daylight compute |
| timmy-daylight-b | laptop | 16 | 8 | Linux | ok | 19 | daylight compute node |
| timmy-daylight-c | laptop | 8 | 4 | Windows | needs_replacement | 17 | repair power adapter before production duty |
| timmy-desktop-nas | desktop | 64 | 12 | Linux | good | 58 | desktop plus 4TB SSD NAS and heavy compute during peak sun |
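The role mapping above is consistent with a simple rule over the inventory fields. The sketch below is an assumption reverse-engineered from the two tables (anchors need `always_on_capable` plus a good adapter; desktops become `desktop_nas`; everything else runs in daylight), not actual fleet tooling.

```python
# Hypothetical role-assignment rule matching the tables in this plan.
# Input dicts mirror the machine entries in the fleet YAML.
def assign_role(machine):
    if machine["machine_type"] == "desktop":
        return "desktop_nas"
    if machine["always_on_capable"] and machine["adapter_condition"] == "good":
        return "anchor_agent"
    # Covers both "ok" adapters and machines awaiting adapter repair.
    return "daylight_agent"
```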


@@ -1,37 +0,0 @@
# NH Broadband Install Packet
**Packet ID:** nh-bb-20260415-113232
**Generated:** 2026-04-15T11:32:32.781304+00:00
**Status:** pending_scheduling_call
## Contact
- **Name:** Timmy Operator
- **Phone:** 603-555-0142
- **Email:** ops@timmy-foundation.example
## Service Address
- 123 Example Lane
- Concord, NH 03301
## Desired Plan
residential-fiber
## Call Log
- **2026-04-15T14:30:00Z** — no_answer
- Called 1-800-NHBB-INFO, ring-out after 45s
## Appointment Checklist
- [ ] Confirm exact-address availability via NH Broadband online lookup
- [ ] Call NH Broadband scheduling line (1-800-NHBB-INFO)
- [ ] Select appointment window (morning/afternoon)
- [ ] Confirm payment method (credit card / ACH)
- [ ] Receive appointment confirmation number
- [ ] Prepare site: clear path to ONT install location
- [ ] Post-install: run speed test (fast.com / speedtest.net)
- [ ] Log final speeds and appointment outcome


@@ -1,27 +0,0 @@
contact:
name: Timmy Operator
phone: "603-555-0142"
email: ops@timmy-foundation.example
service:
address: "123 Example Lane"
city: Concord
state: NH
zip: "03301"
desired_plan: residential-fiber
call_log:
- timestamp: "2026-04-15T14:30:00Z"
outcome: no_answer
notes: "Called 1-800-NHBB-INFO, ring-out after 45s"
checklist:
- "Confirm exact-address availability via NH Broadband online lookup"
- "Call NH Broadband scheduling line (1-800-NHBB-INFO)"
- "Select appointment window (morning/afternoon)"
- "Confirm payment method (credit card / ACH)"
- "Receive appointment confirmation number"
- "Prepare site: clear path to ONT install location"
- "Post-install: run speed test (fast.com / speedtest.net)"
- "Log final speeds and appointment outcome"


@@ -1,397 +0,0 @@
# GENOME.md — fleet-ops
Host artifact for timmy-home issue #680. The analyzed code lives in the separate `fleet-ops` repository; this document is the curated genome written from a fresh clone of that repo at commit `38c4eab`.
## Project Overview
`fleet-ops` is the infrastructure and operations control plane for the Timmy Foundation fleet. It is not a single deployable application. It is a mixed ops repository with four overlapping layers:
1. Ansible orchestration for VPS provisioning and service rollout.
2. Small Python microservices for shared fleet state.
3. Cron- and CLI-driven operator scripts.
4. A separate local `docker-compose.yml` sandbox for a simplified all-in-one stack.
Two facts shape the repo more than anything else:
- The real fleet deployment path starts at `site.yml` → `playbooks/site.yml` and lands services through Ansible roles.
- The repo also contains several aspirational or partially wired Python modules whose names imply runtime importance but whose deployment path is weak, indirect, or missing.
Grounded metrics from the fresh analysis run:
- `python3 ~/.hermes/pipelines/codebase-genome.py --path /tmp/fleet-ops-genome --dry-run` reported `97` source files, `12` test files, `29` config files, and `16,658` total lines.
- A local filesystem count found `39` Python source files, `12` Python test files, and `74` YAML files.
- `python3 -m pytest -q --continue-on-collection-errors` produced `158 passed, 1 failed, 2 errors`.
The repo is therefore operationally substantial, but only part of that surface is coherently tested and wired.
## Architecture
```mermaid
graph TD
A[site.yml] --> B[playbooks/site.yml]
B --> C[preflight.yml]
B --> D[baseline.yml]
B --> E[deploy_ollama.yml]
B --> F[deploy_gitea.yml]
B --> G[deploy_hermes.yml]
B --> H[deploy_conduit.yml]
B --> I[harmony_audit role]
G --> J[playbooks/host_vars/* wizard_instances]
G --> K[hermes-agent role]
K --> L[systemd wizard services]
M[templates/fleet-deploy-hook.service] --> N[scripts/deploy-hook.py]
N --> B
O[playbooks/roles/message-bus/templates/busd.service.j2] --> P[message_bus.py]
Q[playbooks/roles/knowledge-store/templates/knowledged.service.j2] --> R[knowledge_store.py]
S[registry.yaml] --> T[health_dashboard.py]
S --> U[scripts/registry_health_updater.py]
S --> V[federation_sync.py]
W[cron/dispatch-consumer.yml] --> X[scripts/dispatch_consumer.py]
Y[morning_report_cron.yml] --> Z[scripts/morning_report_compile.py]
AA[nightly_efficiency_cron.yml] --> AB[scripts/nightly_efficiency_report.py]
AC[burndown_watcher_cron.yml] --> AD[scripts/burndown_cron.py]
AE[docker-compose.yml] --> AF[local ollama]
AE --> AG[local gitea]
AE --> AH[agent container]
AE --> AI[monitor loop]
```
### Structural read
The cleanest mental model is not “one app,” but “one repo that tries to be the fleet's operator handbook, deployment engine, shared service shelf, and scratchpad.”
That produces three distinct control planes:
1. `playbooks/` is the strongest source of truth for VPS deployment.
2. `registry.yaml` and `manifest.yaml` act as runtime or operator registries for scripts.
3. `docker-compose.yml` models a separate local sandbox whose assumptions do not fully match the Ansible path.
## Entry Points
### Primary fleet deploy entry points
- `site.yml` — thin repo-root wrapper that imports `playbooks/site.yml`.
- `playbooks/site.yml` — multi-phase orchestrator for preflight, baseline, Ollama, Gitea, Hermes, Conduit, and local harmony audit.
- `playbooks/deploy_hermes.yml` — the most important service rollout for wizard instances; requires `wizard_instances` and pulls `vault_openrouter_api_key` / `vault_openai_api_key`.
- `playbooks/provision_and_deploy.yml` — DigitalOcean create-and-bootstrap path using `community.digitalocean.digital_ocean_droplet` and a dynamic `new_droplets` group.
### Deployed service entry points
- `message_bus.py` — HTTP message queue service deployed by `playbooks/roles/message-bus/templates/busd.service.j2`.
- `knowledge_store.py` — SQLite-backed shared fact service deployed by `playbooks/roles/knowledge-store/templates/knowledged.service.j2`.
- `scripts/deploy-hook.py` — webhook listener launched by `templates/fleet-deploy-hook.service` with `ExecStart=/usr/bin/python3 /opt/fleet-ops/scripts/deploy-hook.py`.
### Cron and operator entry points
- `scripts/dispatch_consumer.py` — wired by `cron/dispatch-consumer.yml`.
- `scripts/morning_report_compile.py` — wired by `morning_report_cron.yml`.
- `scripts/nightly_efficiency_report.py` — wired by `nightly_efficiency_cron.yml`.
- `scripts/burndown_cron.py` — wired by `burndown_watcher_cron.yml`.
- `scripts/fleet_readiness.py` — operator validation script for `manifest.yaml`.
- `scripts/fleet-status.py` — prints a fleet status snapshot directly from top-level code.
### CI / verification entry points
- `.gitea/workflows/ansible-lint.yml` — YAML lint, `ansible-lint`, syntax checks, inventory validation.
- `.gitea/workflows/auto-review.yml` — lightweight review workflow with YAML lint, syntax checks, secret scan, and merge-conflict probe.
### Local development stack entry point
- `docker-compose.yml` — brings up `ollama`, `gitea`, `agent`, and `monitor` for a local stack.
## Data Flow
### 1) Deploy path
1. A repo operator pushes or references deployable state.
2. `scripts/deploy-hook.py` receives the webhook.
3. The hook updates `/opt/fleet-ops`, then invokes Ansible.
4. `playbooks/site.yml` fans into phase playbooks.
5. `playbooks/deploy_hermes.yml` renders per-instance config and systemd services from `wizard_instances` in `playbooks/host_vars/*`.
6. Services expose local `/health` endpoints on assigned ports.
### 2) Shared service path
1. Agents or tools post work to `message_bus.py`.
2. Consumers poll `/messages` and inspect `/queue`, `/deadletter`, and `/audit`.
3. Facts are written into `knowledge_store.py` and federated through peer sync endpoints.
4. `health_dashboard.py` and `scripts/registry_health_updater.py` read `registry.yaml` and probe service URLs.
### 3) Reporting path
1. Cron YAML launches queue/report scripts.
2. Scripts read `~/.hermes/`, Gitea APIs, local logs, or registry files.
3. Output is emitted as JSON, markdown, or console summaries.
### Important integration fracture
`federation_sync.py` does not currently match the services it tries to coordinate.
- `message_bus.py` returns `/messages` as `{"messages": [...], "count": N}` at line 234.
- `federation_sync.py` polls `.../messages?limit=50` and then only iterates if `isinstance(data, list)` at lines 136-140.
- `federation_sync.py` also requests `.../knowledge/stats` at line 230, but `knowledge_store.py` documents `/sync/status`, `/facts`, and `/peers`, not `/knowledge/stats`.
This means the repo contains a federation layer whose assumed contracts drift from the concrete microservices beside it.
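Until the contracts are reconciled, a shape-tolerant consumer would close the gap. This is a minimal sketch of a hypothetical helper, not code from either module: it accepts both the envelope `message_bus.py` actually returns and the bare list `federation_sync.py` currently expects.

```python
# Hypothetical normalizer bridging the two observed /messages shapes:
#   {"messages": [...], "count": N}  (what message_bus.py returns)
#   [...]                            (what federation_sync.py iterates)
def extract_messages(payload):
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict):
        return payload.get("messages", [])
    # Unexpected shape (None, string, etc.): treat as empty rather than crash.
    return []
```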
## Key Abstractions
### `MessageStore` in `message_bus.py`
Core in-memory queue abstraction. It underlies:
- enqueue / poll behavior
- TTL expiry and dead-letter handling
- queue stats and audit trail endpoints
The tests in `tests/test_message_bus.py` make this one of the best-specified components in the repo.
### `KnowledgeDB` in `knowledge_store.py`
SQLite-backed fact registry with HTTP exposure for:
- storing facts
- querying and deleting facts
- peer registration
- push/pull federation
- sync status reporting
This is the nearest thing the repo has to a durable shared memory service.
### `FleetMonitor` in `health_dashboard.py`
Loads `registry.yaml`, polls wizard endpoints, caches results, and exposes both HTML and JSON views. It is the operator-facing read model of the fleet.
### `SyncEngine` in `federation_sync.py`
Intended as the bridge across message bus, audit trail, and knowledge store. The design intent is strong, but the live endpoint contracts appear out of sync.
### `ProfilePolicy` in `scripts/profile_isolation.py`
Encodes tmux/agent lifecycle policy by profile. This is one of the more disciplined “ops logic” modules: focused, testable, and bounded.
### `GenerationResult` / `VideoEngineClient` in `scripts/video_engine_client.py`
Represents the repo's media-generation sidecar boundary. The code is small and clear, but its tests are partially stale relative to implementation behavior.
## API Surface
### `message_bus.py`
Observed HTTP surface includes:
- `POST /message`
- `GET /messages?to=<agent>&limit=<n>`
- `GET /queue`
- `GET /deadletter`
- `GET /audit`
- `GET /health`
### `knowledge_store.py`
Documented surface includes:
- `POST /fact`
- `GET /facts`
- `DELETE /facts/<key>`
- `POST /sync/pull`
- `POST /sync/push`
- `GET /sync/status`
- `GET /peers`
- `POST /peers`
- `GET /health`
### `health_dashboard.py`
- `/`
- `/api/status`
- `/api/wizard/<id>`
### `scripts/deploy-hook.py`
- `/health`
- `/webhook`
### Ansible operator surface
Primary commands implied by the repo:
- `ansible-playbook -i playbooks/inventory site.yml`
- `ansible-playbook -i playbooks/inventory playbooks/provision_and_deploy.yml`
- `ansible-playbook -i playbooks/inventory playbooks/deploy_hermes.yml`
## Dependencies
### Python and shell posture
The repo is mostly Python stdlib plus Ansible/shell orchestration. It is not packaged as a single installable Python project.
### Explicit Ansible collections
`requirements.yml` declares:
- `community.docker`
- `community.general`
- `ansible.posix`
The provisioning docs and playbooks also rely on `community.digital.digital_ocean_droplet` in `playbooks/provision_and_deploy.yml`.
### External service dependencies
- Gitea
- Ollama
- DigitalOcean
- systemd
- Docker / Docker Compose
- local `~/.hermes/` session and burn-log state
### Hidden runtime dependency
Several conceptual modules import `hermes_tools` directly:
- `compassion_layer.py`
- `sovereign_librarian.py`
- `sovereign_muse.py`
- `sovereign_pulse.py`
- `sovereign_sentinel.py`
- `synthesis_engine.py`
That dependency is not self-contained inside the repo and directly causes the local collection errors.
## Test Coverage Gaps
### Current tested strengths
The strongest, most trustworthy tests are around:
- `tests/test_message_bus.py`
- `tests/test_knowledge_store.py`
- `tests/test_health_dashboard.py`
- `tests/test_registry_health_updater.py`
- `tests/test_profile_isolation.py`
- `tests/test_skill_scorer.py`
- `tests/test_nightly_efficiency_report.py`
Those files make the shared-service core much more legible than the deployment layer.
### Current local status
Fresh run result:
- `158 passed, 1 failed, 2 errors`
Collection errors:
- `tests/test_heart.py` fails because `compassion_layer.py` imports `hermes_tools`.
- `tests/test_synthesis.py` fails because `sovereign_librarian.py` imports `hermes_tools`.
Runnable failure:
- `tests/test_video_engine_client.py` expects `generate_draft()` to raise on HTTP 503.
- `scripts/video_engine_client.py` currently catches exceptions and returns `GenerationResult(success=False, error=...)` instead.
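The disagreement is over the error contract: raise on transport failure, or fold it into the result object. A minimal sketch of the catch-and-wrap behavior the client currently exhibits (field names beyond `success` and `error`, and the `do_request` callable, are assumptions for illustration):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationResult:
    success: bool
    error: Optional[str] = None


def generate_draft(do_request):
    """Invoke the engine via do_request(); wrap failures instead of raising."""
    try:
        do_request()
    except Exception as exc:  # e.g. an HTTP 503 from the sidecar
        return GenerationResult(success=False, error=str(exc))
    return GenerationResult(success=True)
```

Under this contract the test's raise expectation can never pass; either the test or the wrapper has to change, and the choice should be made explicitly.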
### High-value untested paths
The most important missing or weakly validated surfaces are:
- `scripts/deploy-hook.py` — high-blast-radius deploy trigger.
- `playbooks/deploy_gitea.yml` / `playbooks/deploy_hermes.yml` / `playbooks/provision_and_deploy.yml` — critical control plane, almost entirely untested in-repo.
- `scripts/morning_report_compile.py` — cron-facing reporting logic.
- `scripts/burndown_cron.py` and related watcher scripts.
- `scripts/generate_video.py`, `scripts/tiered_render.py`, and broader video-engine operator paths.
- `scripts/fleet-status.py` — prints directly from module scope and has no `__main__` guard.
### Coverage quality note
The repo's best tests cluster around internal Python helpers. The repo's biggest operational risk lives in deployment, cron wiring, and shell/Ansible behaviors that are not equivalently exercised.
## Security Considerations
### Strong points
- Vault use exists in `playbooks/group_vars/vault.yml` and inline vaulted material in `manifest.yaml`.
- `playbooks/deploy_gitea.yml` sets `gitea_disable_registration: true`, `gitea_require_signin: true`, and `gitea_register_act_runner: false`.
- The Hermes role renders per-instance env/config and uses systemd hardening patterns.
- Gitea, Nostr relay, and other web surfaces are designed around nginx/TLS roles.
### Concrete risks
1. `scripts/deploy-hook.py` explicitly disables signature enforcement when `DEPLOY_HOOK_SECRET` is unset.
2. `playbooks/roles/gitea/defaults/main.yml` sets `gitea_webhook_allowed_host_list: "*"`.
3. Both `ansible.cfg` files disable host key checking.
4. The repo has multiple sources of truth for ports and service topology:
- `playbooks/host_vars/ezra-primary.yml` uses `8643`
- `manifest.yaml` uses `8643`
- `registry.yaml` points Ezra health to `8646`
5. `registry.yaml` advertises services like `busd`, `auditd`, and `knowledged`, but the main `playbooks/site.yml` phases do not include message-bus or knowledge-store roles.
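Risk 1 is the cheapest to close: require the secret and verify an HMAC over the raw request body, failing closed when the secret is missing. A minimal sketch, assuming a hex-encoded HMAC-SHA256 signature header; the actual header name and scheme should be confirmed against the webhook sender's configuration:

```python
import hashlib
import hmac


def verify_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Reject the request unless the HMAC-SHA256 of the body matches."""
    if not secret:
        return False  # fail closed instead of skipping verification
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

`hmac.compare_digest` avoids timing side channels that a plain `==` comparison would leak.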
### Drift / correctness risks that become security risks
- `playbooks/deploy_auto_merge.yml` targets `hosts: gitea_servers`, but the inventory groups visible in `playbooks/inventory` are `forge`, `vps`, `agents`, and `wizards`.
- `playbooks/roles/gitea/defaults/main.yml` includes runner labels with a probable typo: `ubuntu-22.04:docker://catthehocker/ubuntu:act-22.04`.
- The local compose quick start is not turnkey: `Dockerfile.agent` copies `requirements-agent.txt*` and `agent/`, but the runtime falls back to a tiny health/tick loop if the real agent source is absent.
## Deployment
### VPS / real fleet path
Repo-root wrapper:
```bash
ansible-playbook -i playbooks/inventory site.yml
```
Direct orchestrator:
```bash
ansible-playbook -i playbooks/inventory playbooks/site.yml
```
Provision and bootstrap a new node:
```bash
ansible-playbook -i playbooks/inventory playbooks/provision_and_deploy.yml
```
### Local sandbox path
```bash
cp .env.example .env
docker compose up -d
```
But this path must be read skeptically. `docker-compose.yml` is a local convenience stack, while the real fleet path uses Ansible + systemd + host vars + vault-backed secrets.
## Dead Code Candidates and Operator Footguns
- `scripts/fleet-status.py` behaves like a one-shot report script with top-level execution, not a reusable CLI module.
- `README.md` ends with a visibly corrupted Nexus Watchdog section containing broken formatting.
- `Sovereign_Health_Check.md` still recommends running the broken `tests/test_heart.py` and `tests/test_synthesis.py` health suite.
- `federation_sync.py` currently looks architecturally important but contractually out of sync with `message_bus.py` and `knowledge_store.py`.
## Bottom Line
`fleet-ops` contains the real bones of a sovereign fleet control plane, but those bones are unevenly ossified.
The strong parts are:
- the phase-based Ansible deployment structure in `playbooks/site.yml`
- the microservice-style core in `message_bus.py`, `knowledge_store.py`, and `health_dashboard.py`
- several focused Python test suites that genuinely specify behavior
The weak parts are:
- duplicated sources of truth (`playbooks/host_vars/*`, `manifest.yaml`, `registry.yaml`, local compose)
- deployment and cron surfaces that matter more operationally than they are tested
- conceptual “sovereign_*” modules that pull in `hermes_tools` and currently break local collection
If this repo were being hardened next, the highest-leverage moves would be:
1. Make the registries consistent (`8643` vs `8646`, service inventory vs deployed phases).
2. Add focused tests around `scripts/deploy-hook.py` and the deploy/report cron scripts.
3. Decide which Python modules are truly production runtime and which are prototypes, then wire or prune accordingly.
4. Collapse the number of “truth” files an operator has to trust during a deploy.

pipelines/__init__.py Normal file
View File

@@ -0,0 +1 @@
"""Codebase genome pipeline helpers."""

View File

@@ -0,0 +1,6 @@
#!/usr/bin/env python3
from codebase_genome import main

if __name__ == "__main__":
    main()

View File

@@ -0,0 +1,557 @@
#!/usr/bin/env python3
"""Generate a deterministic GENOME.md for a repository."""
from __future__ import annotations

import argparse
import ast
import os
import re
from pathlib import Path
from typing import NamedTuple

IGNORED_DIRS = {
    ".git",
    ".hg",
    ".svn",
    ".venv",
    "venv",
    "node_modules",
    "__pycache__",
    ".mypy_cache",
    ".pytest_cache",
    "dist",
    "build",
    "coverage",
}
TEXT_SUFFIXES = {
    ".py",
    ".js",
    ".mjs",
    ".cjs",
    ".ts",
    ".tsx",
    ".jsx",
    ".html",
    ".css",
    ".md",
    ".txt",
    ".json",
    ".yaml",
    ".yml",
    ".sh",
    ".ini",
    ".cfg",
    ".toml",
}
SOURCE_SUFFIXES = {".py", ".js", ".mjs", ".cjs", ".ts", ".tsx", ".jsx", ".sh"}
DOC_FILENAMES = {"README.md", "CONTRIBUTING.md", "SOUL.md"}


class RepoFile(NamedTuple):
    path: str
    abs_path: Path
    size_bytes: int
    line_count: int
    kind: str


class RunSummary(NamedTuple):
    markdown: str
    source_count: int
    test_count: int
    doc_count: int


def _is_text_file(path: Path) -> bool:
    return path.suffix.lower() in TEXT_SUFFIXES or path.name in {"Dockerfile", "Makefile"}


def _file_kind(rel_path: str, path: Path) -> str:
    suffix = path.suffix.lower()
    if rel_path.startswith("tests/") or path.name.startswith("test_"):
        return "test"
    if rel_path.startswith("docs/") or path.name in DOC_FILENAMES or suffix == ".md":
        return "doc"
    if suffix in {".json", ".yaml", ".yml", ".toml", ".ini", ".cfg"}:
        return "config"
    if suffix == ".sh":
        return "script"
    if rel_path.startswith("scripts/") and suffix == ".py" and path.name != "__init__.py":
        return "script"
    if suffix in SOURCE_SUFFIXES:
        return "source"
    return "other"
def collect_repo_files(repo_root: str | Path) -> list[RepoFile]:
    root = Path(repo_root).resolve()
    files: list[RepoFile] = []
    for current_root, dirnames, filenames in os.walk(root):
        dirnames[:] = sorted(d for d in dirnames if d not in IGNORED_DIRS)
        base = Path(current_root)
        for filename in sorted(filenames):
            path = base / filename
            if not _is_text_file(path):
                continue
            rel_path = path.relative_to(root).as_posix()
            text = path.read_text(encoding="utf-8", errors="replace")
            files.append(
                RepoFile(
                    path=rel_path,
                    abs_path=path,
                    size_bytes=path.stat().st_size,
                    line_count=max(1, len(text.splitlines())),
                    kind=_file_kind(rel_path, path),
                )
            )
    return sorted(files, key=lambda item: item.path)


def _safe_text(path: Path) -> str:
    return path.read_text(encoding="utf-8", errors="replace")


def _sanitize_node_id(name: str) -> str:
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    return cleaned or "node"


def _component_name(path: str) -> str:
    if "/" in path:
        return path.split("/", 1)[0]
    return Path(path).stem or path


def _priority_files(files: list[RepoFile], kinds: tuple[str, ...], limit: int = 8) -> list[RepoFile]:
    items = [item for item in files if item.kind in kinds]
    items.sort(key=lambda item: (-int(item.path.count("/") == 0), -item.line_count, item.path))
    return items[:limit]


def _readme_summary(root: Path) -> str:
    readme = root / "README.md"
    if not readme.exists():
        return "Repository-specific overview missing from README.md. Genome generated from code structure and tests."
    paragraphs: list[str] = []
    current: list[str] = []
    for raw_line in _safe_text(readme).splitlines():
        line = raw_line.strip()
        if not line:
            if current:
                paragraphs.append(" ".join(current).strip())
            current = []
            continue
        if line.startswith("#"):
            continue
        current.append(line)
    if current:
        paragraphs.append(" ".join(current).strip())
    return paragraphs[0] if paragraphs else "README.md exists but does not contain a prose overview paragraph."
def _extract_python_imports(text: str) -> set[str]:
    try:
        tree = ast.parse(text)
    except SyntaxError:
        return set()
    imports: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                imports.add(alias.name.split(".", 1)[0])
        elif isinstance(node, ast.ImportFrom):
            if node.module:
                imports.add(node.module.split(".", 1)[0])
    return imports


def _extract_python_symbols(text: str) -> tuple[list[tuple[str, int]], list[tuple[str, int]]]:
    try:
        tree = ast.parse(text)
    except SyntaxError:
        return [], []
    classes: list[tuple[str, int]] = []
    functions: list[tuple[str, int]] = []
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            classes.append((node.name, node.lineno))
        elif isinstance(node, ast.FunctionDef):
            functions.append((node.name, node.lineno))
    return classes, functions


def _build_component_edges(files: list[RepoFile]) -> list[tuple[str, str]]:
    known_components = {_component_name(item.path) for item in files if item.kind in {"source", "script", "test"}}
    edges: set[tuple[str, str]] = set()
    for item in files:
        if item.kind not in {"source", "script", "test"} or item.abs_path.suffix.lower() != ".py":
            continue
        src = _component_name(item.path)
        imports = _extract_python_imports(_safe_text(item.abs_path))
        for imported in imports:
            if imported in known_components and imported != src:
                edges.add((src, imported))
    return sorted(edges)
def _render_mermaid(files: list[RepoFile]) -> str:
    components = sorted(
        {
            _component_name(item.path)
            for item in files
            if item.kind in {"source", "script", "test", "config"}
            and not _component_name(item.path).startswith(".")
        }
    )
    edges = _build_component_edges(files)
    lines = ["graph TD"]
    if not components:
        lines.append(" repo[\"repository\"]")
        return "\n".join(lines)
    for component in components[:10]:
        node_id = _sanitize_node_id(component)
        lines.append(f" {node_id}[\"{component}\"]")
    seen_components = set(components[:10])
    emitted = False
    for src, dst in edges:
        if src in seen_components and dst in seen_components:
            lines.append(f" {_sanitize_node_id(src)} --> {_sanitize_node_id(dst)}")
            emitted = True
    if not emitted:
        root_id = "repo_root"
        lines.insert(1, f" {root_id}[\"repo\"]")
        for component in components[:6]:
            lines.append(f" {root_id} --> {_sanitize_node_id(component)}")
    return "\n".join(lines)


def _entry_points(files: list[RepoFile]) -> list[dict[str, str]]:
    points: list[dict[str, str]] = []
    for item in files:
        text = _safe_text(item.abs_path)
        if item.kind == "script":
            points.append({"path": item.path, "reason": "operational script", "command": f"python3 {item.path}" if item.abs_path.suffix == ".py" else f"bash {item.path}"})
            continue
        if item.abs_path.suffix == ".py" and "if __name__ == '__main__':" in text:
            points.append({"path": item.path, "reason": "python main guard", "command": f"python3 {item.path}"})
        elif item.path in {"app.py", "server.py", "main.py"}:
            points.append({"path": item.path, "reason": "top-level executable", "command": f"python3 {item.path}"})
    seen: set[str] = set()
    deduped: list[dict[str, str]] = []
    for point in points:
        if point["path"] in seen:
            continue
        seen.add(point["path"])
        deduped.append(point)
    return deduped[:12]


def _test_coverage(files: list[RepoFile]) -> tuple[list[RepoFile], list[RepoFile], list[RepoFile]]:
    source_files = [
        item
        for item in files
        if item.kind in {"source", "script"}
        and item.path not in {"pipelines/codebase-genome.py", "pipelines/codebase_genome.py"}
        and not item.path.endswith("/__init__.py")
    ]
    test_files = [item for item in files if item.kind == "test"]
    combined_test_text = "\n".join(_safe_text(item.abs_path) for item in test_files)
    entry_paths = {point["path"] for point in _entry_points(files)}
    gaps: list[RepoFile] = []
    for item in source_files:
        stem = item.abs_path.stem
        if item.path in entry_paths:
            continue
        if stem and stem in combined_test_text:
            continue
        gaps.append(item)
    gaps.sort(key=lambda item: (-item.line_count, item.path))
    return source_files, test_files, gaps
def _security_findings(files: list[RepoFile]) -> list[dict[str, str]]:
    rules = [
        ("high", "shell execution", re.compile(r"shell\s*=\s*True"), "shell=True expands blast radius for command execution"),
        ("high", "dynamic evaluation", re.compile(r"\b(eval|exec)\s*\("), "dynamic evaluation bypasses static guarantees"),
        ("medium", "unsafe deserialization", re.compile(r"pickle\.load\(|yaml\.load\("), "deserialization of untrusted data can execute code"),
        ("medium", "network egress", re.compile(r"urllib\.request\.urlopen\(|requests\.(get|post|put|delete)\("), "outbound network calls create runtime dependency and failure surface"),
        ("medium", "hardcoded http endpoint", re.compile(r"http://[^\s\"']+"), "plaintext or fixed HTTP endpoints can drift or leak across environments"),
    ]
    findings: list[dict[str, str]] = []
    for item in files:
        if item.kind not in {"source", "script", "config"}:
            continue
        for lineno, line in enumerate(_safe_text(item.abs_path).splitlines(), start=1):
            for severity, category, pattern, detail in rules:
                if pattern.search(line):
                    findings.append(
                        {
                            "severity": severity,
                            "category": category,
                            "ref": f"{item.path}:{lineno}",
                            "line": line.strip(),
                            "detail": detail,
                        }
                    )
                    break
            if len(findings) >= 12:
                return findings
    return findings


def _dead_code_candidates(files: list[RepoFile]) -> list[RepoFile]:
    source_files = [item for item in files if item.kind in {"source", "script"} and item.abs_path.suffix == ".py"]
    imports_by_file = {
        item.path: _extract_python_imports(_safe_text(item.abs_path))
        for item in source_files
    }
    imported_names = {name for imports in imports_by_file.values() for name in imports}
    referenced_by_tests = "\n".join(_safe_text(item.abs_path) for item in files if item.kind == "test")
    entry_paths = {point["path"] for point in _entry_points(files)}
    candidates: list[RepoFile] = []
    for item in source_files:
        stem = item.abs_path.stem
        if item.path in entry_paths:
            continue
        if stem in imported_names:
            continue
        if stem in referenced_by_tests:
            continue
        if stem in {"__init__", "conftest"}:
            continue
        candidates.append(item)
    candidates.sort(key=lambda item: (-item.line_count, item.path))
    return candidates[:10]
def _performance_findings(files: list[RepoFile]) -> list[dict[str, str]]:
    findings: list[dict[str, str]] = []
    for item in files:
        if item.kind in {"source", "script"} and item.line_count >= 350:
            findings.append({
                "ref": item.path,
                "detail": f"large module ({item.line_count} lines) likely hides multiple responsibilities",
            })
    for item in files:
        if item.kind not in {"source", "script"}:
            continue
        text = _safe_text(item.abs_path)
        if "os.walk(" in text or ".rglob(" in text or "glob.glob(" in text:
            findings.append({
                "ref": item.path,
                "detail": "per-run filesystem scan detected; performance scales with repo size",
            })
        if "urllib.request.urlopen(" in text or "requests.get(" in text or "requests.post(" in text:
            findings.append({
                "ref": item.path,
                "detail": "network-bound execution path can dominate runtime and create flaky throughput",
            })
    deduped: list[dict[str, str]] = []
    seen: set[tuple[str, str]] = set()
    for finding in findings:
        key = (finding["ref"], finding["detail"])
        if key in seen:
            continue
        seen.add(key)
        deduped.append(finding)
    return deduped[:10]


def _key_abstractions(files: list[RepoFile]) -> list[dict[str, object]]:
    abstractions: list[dict[str, object]] = []
    for item in _priority_files(files, ("source", "script"), limit=10):
        if item.abs_path.suffix != ".py":
            continue
        classes, functions = _extract_python_symbols(_safe_text(item.abs_path))
        if not classes and not functions:
            continue
        abstractions.append(
            {
                "path": item.path,
                "classes": classes[:4],
                "functions": [entry for entry in functions[:6] if not entry[0].startswith("_")],
            }
        )
    return abstractions[:8]


def _api_surface(entry_points: list[dict[str, str]], abstractions: list[dict[str, object]]) -> list[str]:
    api_lines: list[str] = []
    for entry in entry_points[:8]:
        api_lines.append(f"- CLI: `{entry['command']}` — {entry['reason']} (`{entry['path']}`)")
    for abstraction in abstractions[:5]:
        for func_name, lineno in abstraction["functions"]:
            api_lines.append(f"- Python: `{func_name}()` from `{abstraction['path']}:{lineno}`")
            if len(api_lines) >= 14:
                return api_lines
    return api_lines


def _data_flow(entry_points: list[dict[str, str]], files: list[RepoFile], gaps: list[RepoFile]) -> list[str]:
    components = sorted(
        {
            _component_name(item.path)
            for item in files
            if item.kind in {"source", "script", "test", "config"} and not _component_name(item.path).startswith(".")
        }
    )
    lines = []
    if entry_points:
        lines.append(f"1. Operators enter through {', '.join(f'`{item['path']}`' for item in entry_points[:3])}.")
    else:
        lines.append("1. No explicit CLI/main guard entry point was detected; execution appears library- or doc-driven.")
    if components:
        lines.append(f"2. Core logic fans into top-level components: {', '.join(f'`{name}`' for name in components[:6])}.")
    if gaps:
        lines.append(f"3. Validation is incomplete around {', '.join(f'`{item.path}`' for item in gaps[:3])}, so changes there carry regression risk.")
    else:
        lines.append("3. Tests appear to reference the currently indexed source set, reducing blind spots in the hot path.")
    lines.append("4. Final artifacts land as repository files, docs, or runtime side effects depending on the selected entry point.")
    return lines
def generate_genome_markdown(repo_root: str | Path, repo_name: str | None = None) -> str:
    root = Path(repo_root).resolve()
    files = collect_repo_files(root)
    repo_display = repo_name or root.name
    summary = _readme_summary(root)
    entry_points = _entry_points(files)
    source_files, test_files, coverage_gaps = _test_coverage(files)
    security = _security_findings(files)
    dead_code = _dead_code_candidates(files)
    performance = _performance_findings(files)
    abstractions = _key_abstractions(files)
    api_surface = _api_surface(entry_points, abstractions)
    data_flow = _data_flow(entry_points, files, coverage_gaps)
    mermaid = _render_mermaid(files)
    lines: list[str] = [
        f"# GENOME.md — {repo_display}",
        "",
        "Generated by `pipelines/codebase_genome.py`.",
        "",
        "## Project Overview",
        "",
        summary,
        "",
        f"- Text files indexed: {len(files)}",
        f"- Source and script files: {len(source_files)}",
        f"- Test files: {len(test_files)}",
        f"- Documentation files: {len([item for item in files if item.kind == 'doc'])}",
        "",
        "## Architecture",
        "",
        "```mermaid",
        mermaid,
        "```",
        "",
        "## Entry Points",
        "",
    ]
    if entry_points:
        for item in entry_points:
            lines.append(f"- `{item['path']}` — {item['reason']} (`{item['command']}`)")
    else:
        lines.append("- No explicit entry point detected.")
    lines.extend(["", "## Data Flow", ""])
    lines.extend(data_flow)
    lines.extend(["", "## Key Abstractions", ""])
    if abstractions:
        for abstraction in abstractions:
            path = abstraction["path"]
            classes = abstraction["classes"]
            functions = abstraction["functions"]
            class_bits = ", ".join(f"`{name}`:{lineno}" for name, lineno in classes) or "none detected"
            function_bits = ", ".join(f"`{name}()`:{lineno}" for name, lineno in functions) or "none detected"
            lines.append(f"- `{path}` — classes {class_bits}; functions {function_bits}")
    else:
        lines.append("- No Python classes or top-level functions detected in the highest-priority source files.")
    lines.extend(["", "## API Surface", ""])
    if api_surface:
        lines.extend(api_surface)
    else:
        lines.append("- No obvious public API surface detected.")
    lines.extend(["", "## Test Coverage Report", ""])
    lines.append(f"- Source and script files inspected: {len(source_files)}")
    lines.append(f"- Test files inspected: {len(test_files)}")
    if coverage_gaps:
        lines.append("- Coverage gaps:")
        for item in coverage_gaps[:12]:
            lines.append(f" - `{item.path}` — no matching test reference detected")
    else:
        lines.append("- No obvious coverage gaps detected by the stem-matching heuristic.")
    lines.extend(["", "## Security Audit Findings", ""])
    if security:
        for finding in security:
            lines.append(
                f"- [{finding['severity']}] `{finding['ref']}` — {finding['category']}: {finding['detail']}. Evidence: `{finding['line']}`"
            )
    else:
        lines.append("- No high-signal security findings detected by the static heuristics in this pass.")
    lines.extend(["", "## Dead Code Candidates", ""])
    if dead_code:
        for item in dead_code:
            lines.append(f"- `{item.path}` — not imported by indexed Python modules and not referenced by tests")
    else:
        lines.append("- No obvious dead-code candidates detected.")
    lines.extend(["", "## Performance Bottleneck Analysis", ""])
    if performance:
        for finding in performance:
            lines.append(f"- `{finding['ref']}` — {finding['detail']}")
    else:
        lines.append("- No obvious performance hotspots detected by the static heuristics in this pass.")
    return "\n".join(lines).rstrip() + "\n"


def write_genome(repo_root: str | Path, repo_name: str | None = None, output_path: str | Path | None = None) -> RunSummary:
    root = Path(repo_root).resolve()
    markdown = generate_genome_markdown(root, repo_name=repo_name)
    out_path = Path(output_path) if output_path else root / "GENOME.md"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(markdown, encoding="utf-8")
    files = collect_repo_files(root)
    source_files, test_files, _ = _test_coverage(files)
    return RunSummary(
        markdown=markdown,
        source_count=len(source_files),
        test_count=len(test_files),
        doc_count=len([item for item in files if item.kind == "doc"]),
    )


def main() -> None:
    parser = argparse.ArgumentParser(description="Generate a deterministic GENOME.md for a repository")
    parser.add_argument("--repo-root", required=True, help="Path to the repository to analyze")
    parser.add_argument("--repo", dest="repo_name", default=None, help="Optional repo display name")
    parser.add_argument("--repo-name", dest="repo_name_override", default=None, help="Optional repo display name")
    parser.add_argument("--output", default=None, help="Path to write GENOME.md (defaults to <repo-root>/GENOME.md)")
    args = parser.parse_args()
    repo_name = args.repo_name_override or args.repo_name
    summary = write_genome(args.repo_root, repo_name=repo_name, output_path=args.output)
    target = Path(args.output) if args.output else Path(args.repo_root).resolve() / "GENOME.md"
    print(
        f"GENOME.md saved to {target} "
        f"(sources={summary.source_count}, tests={summary.test_count}, docs={summary.doc_count})"
    )


if __name__ == "__main__":
    main()

View File

@@ -1,35 +0,0 @@
# NH Broadband — Public Research Memo
**Date:** 2026-04-15
**Status:** Draft — separates verified facts from unverified live work
**Refs:** #533, #740
---
## Verified (official public sources)
- **NH Broadband** is a residential fiber internet provider operating in New Hampshire.
- Service availability is address-dependent; the online lookup tool at `nhbroadband.com` reports coverage by street address.
- Residential fiber plans are offered; speed tiers vary by location.
- Scheduling line: **1-800-NHBB-INFO** (published on official site).
- Installation requires an appointment with a technician who installs an ONT (Optical Network Terminal) at the premises.
- Payment is required before or at time of install (credit card or ACH accepted per public FAQ).
## Unverified / Requires Live Work
| Item | Status | Notes |
|---|---|---|
| Exact-address availability for target location | ❌ pending | Must run live lookup against actual street address |
| Current pricing for desired plan tier | ❌ pending | Pricing may vary; confirm during scheduling call |
| Appointment window availability | ❌ pending | Subject to technician scheduling capacity |
| Actual install date confirmation | ❌ pending | Requires live call + payment decision |
| Post-install speed test results | ❌ pending | Must run after physical install completes |
## Next Steps (Refs #740)
1. Run address availability lookup on `nhbroadband.com`
2. Call 1-800-NHBB-INFO to schedule install
3. Confirm payment method
4. Receive appointment confirmation number
5. Prepare site (clear ONT install path)
6. Post-install: speed test and log results

View File

@@ -1,102 +0,0 @@
# Long Context vs RAG Decision Framework
**Research Backlog Item #4.3** | Impact: 4 | Effort: 1 | Ratio: 4.0
**Date**: 2026-04-15
**Status**: RESEARCHED
## Executive Summary
Modern LLMs have 128K-200K+ context windows, but we still treat them like 4K models by default. This document provides a decision framework for when to stuff context vs. use RAG, based on empirical findings and our stack constraints.
## The Core Insight
**Long context ≠ better answers.** Research shows:
- "Lost in the Middle" effect: Models attend poorly to information in the middle of long contexts (Liu et al., 2023)
- RAG with reranking outperforms full-context stuffing for document QA when docs > 50K tokens
- Cost scales quadratically with context length (attention computation)
- Latency increases linearly with input length
**RAG ≠ always better.** Retrieval introduces:
- Recall errors (miss relevant chunks)
- Precision errors (retrieve irrelevant chunks)
- Chunking artifacts (splitting mid-sentence)
- Additional latency for embedding + search
## Decision Matrix
| Scenario | Context Size | Recommendation | Why |
|----------|-------------|---------------|-----|
| Single conversation (< 32K) | Small | **Stuff everything** | No retrieval overhead, full context available |
| 5-20 documents, focused query | 32K-128K | **Hybrid** | Key docs in context, rest via RAG |
| Large corpus search | > 128K | **Pure RAG + reranking** | Full context impossible, must retrieve |
| Code review (< 5 files) | < 32K | **Stuff everything** | Code needs full context for understanding |
| Code review (repo-wide) | > 128K | **RAG with file-level chunks** | Files are natural chunk boundaries |
| Multi-turn conversation | Growing | **Hybrid + compression** | Keep recent turns in full, compress older |
| Fact retrieval | Any | **RAG** | Always faster to search than read everything |
| Complex reasoning across docs | 32K-128K | **Stuff + chain-of-thought** | Models need all context for cross-doc reasoning |
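The matrix above can be collapsed into a small routing function. A hedged sketch (thresholds mirror the table; the function name and inputs are illustrative, not an existing API):

```python
def route_context(context_tokens: int, corpus_tokens: int) -> str:
    """Pick a context strategy from rough token counts."""
    if corpus_tokens > 128_000:
        return "rag+rerank"  # full context impossible, must retrieve
    if context_tokens < 32_000 and corpus_tokens < 32_000:
        return "stuff"       # no retrieval overhead, full context available
    return "hybrid"          # key docs in context, rest via RAG
```

Real routing would also weigh cost, latency, and whether the session is on a local 8K model, but the skeleton is this small.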
## Our Stack Constraints
### What We Have
- **Cloud models**: 128K-200K context (OpenRouter providers)
- **Local Ollama**: 8K-32K context (Gemma-4 default 8192)
- **Hermes fact_store**: SQLite FTS5 full-text search
- **Memory**: MemPalace holographic embeddings
- **Session context**: Growing conversation history
### What This Means
1. **Cloud sessions**: We CAN stuff up to 128K but SHOULD we? Cost and latency matter.
2. **Local sessions**: MUST use RAG for anything beyond 8K. Long context not available.
3. **Mixed fleet**: Need a routing layer that decides per-session.
## Advanced Patterns
### 1. Progressive Context Loading
Don't load everything at once. Start with RAG, then stuff additional docs as needed:
```
Turn 1: RAG search → top 3 chunks
Turn 2: Model asks "I need more context about X" → stuff X
Turn 3: Model has enough → continue
```
### 2. Context Budgeting
Allocate context budget across components:
```
System prompt: 2,000 tokens (always)
Recent messages: 10,000 tokens (last 5 turns)
RAG results: 8,000 tokens (top chunks)
Stuffed docs: 12,000 tokens (key docs)
---------------------------
Total: 32,000 tokens (fits 32K model)
```
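The same budget can be enforced in code rather than by convention. A minimal sketch using the numbers above, with proportional shrinking when the components overflow the window (component names are illustrative):

```python
def allocate_budget(total: int, weights: dict[str, int]) -> dict[str, int]:
    """Return per-component token budgets, scaled down if they exceed total."""
    requested = sum(weights.values())
    if requested <= total:
        return dict(weights)
    # Scale every component proportionally so the sum fits in `total`.
    return {name: tokens * total // requested for name, tokens in weights.items()}


budget = allocate_budget(
    32_000,
    {"system": 2_000, "recent": 10_000, "rag": 8_000, "stuffed": 12_000},
)
```

Integer division makes the shrunken sum at most `total`, which is the safe direction for a hard context limit.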
### 3. Smart Compression
Before stuffing, compress older context:
- Summarize turns older than 10
- Remove tool call results (keep only final outputs)
- Deduplicate repeated information
- Use structured representations (JSON) instead of prose
## Empirical Benchmarks Needed
1. **Stuffing vs RAG accuracy** on our fact_store queries
2. **Latency comparison** at 32K, 64K, 128K context
3. **Cost per query** for cloud models at various context sizes
4. **Local model behavior** when pushed beyond rated context
## Recommendations
1. **Audit current context usage**: How many sessions hit > 32K? (Low effort, high value)
2. **Implement ContextRouter**: ~50 LOC, adds routing decisions to hermes
3. **Add context-size logging**: Track input tokens per session for data gathering
## References
- Liu et al. "Lost in the Middle: How Language Models Use Long Contexts" (2023) — https://arxiv.org/abs/2307.03172
- Shi et al. "Large Language Models are Easily Distracted by Irrelevant Context" (2023)
- Xu et al. "Retrieval Meets Long Context LLMs" (2023) — hybrid approaches outperform both alone
- Anthropic's Claude 3.5 context caching — built-in prefix caching reduces cost for repeated system prompts
---
*Sovereignty and service always.*


@@ -0,0 +1,171 @@
#!/usr/bin/env python3
"""Nightly runner for the codebase genome pipeline."""
from __future__ import annotations
import argparse
import json
import os
import subprocess
import sys
import urllib.request
from pathlib import Path
from typing import NamedTuple
class RunPlan(NamedTuple):
repo: dict
repo_dir: Path
output_path: Path
command: list[str]
def load_state(path: Path) -> dict:
if not path.exists():
return {}
return json.loads(path.read_text(encoding="utf-8"))
def save_state(path: Path, state: dict) -> None:
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(state, indent=2, sort_keys=True), encoding="utf-8")
def select_next_repo(repos: list[dict], state: dict) -> dict:
if not repos:
raise ValueError("no repositories available for nightly genome run")
ordered = sorted(repos, key=lambda item: item.get("full_name", item.get("name", "")).lower())
last_repo = state.get("last_repo")
for index, repo in enumerate(ordered):
if repo.get("name") == last_repo or repo.get("full_name") == last_repo:
return ordered[(index + 1) % len(ordered)]
last_index = int(state.get("last_index", -1))
return ordered[(last_index + 1) % len(ordered)]
def build_run_plan(repo: dict, workspace_root: Path, output_root: Path, pipeline_script: Path) -> RunPlan:
repo_dir = workspace_root / repo["name"]
output_path = output_root / repo["name"] / "GENOME.md"
command = [
sys.executable,
str(pipeline_script),
"--repo-root",
str(repo_dir),
"--repo-name",
repo.get("full_name", repo["name"]),
"--output",
str(output_path),
]
return RunPlan(repo=repo, repo_dir=repo_dir, output_path=output_path, command=command)
def fetch_org_repos(org: str, host: str, token_file: Path, include_archived: bool = False) -> list[dict]:
token = token_file.read_text(encoding="utf-8").strip()
page = 1
repos: list[dict] = []
while True:
req = urllib.request.Request(
f"{host.rstrip('/')}/api/v1/orgs/{org}/repos?limit=100&page={page}",
headers={"Authorization": f"token {token}", "Accept": "application/json"},
)
with urllib.request.urlopen(req, timeout=30) as resp:
chunk = json.loads(resp.read().decode("utf-8"))
if not chunk:
break
for item in chunk:
if item.get("archived") and not include_archived:
continue
repos.append(
{
"name": item["name"],
"full_name": item["full_name"],
"clone_url": item["clone_url"],
"default_branch": item.get("default_branch") or "main",
}
)
page += 1
return repos
def _authenticated_clone_url(clone_url: str, token_file: Path) -> str:
token = token_file.read_text(encoding="utf-8").strip()
if clone_url.startswith("https://"):
return f"https://{token}@{clone_url[len('https://') :]}"
return clone_url
def ensure_checkout(repo: dict, workspace_root: Path, token_file: Path) -> Path:
workspace_root.mkdir(parents=True, exist_ok=True)
repo_dir = workspace_root / repo["name"]
branch = repo.get("default_branch") or "main"
clone_url = _authenticated_clone_url(repo["clone_url"], token_file)
if (repo_dir / ".git").exists():
subprocess.run(["git", "-C", str(repo_dir), "fetch", "origin", branch, "--depth", "1"], check=True)
subprocess.run(["git", "-C", str(repo_dir), "checkout", branch], check=True)
subprocess.run(["git", "-C", str(repo_dir), "reset", "--hard", f"origin/{branch}"], check=True)
else:
subprocess.run(
["git", "clone", "--depth", "1", "--single-branch", "--branch", branch, clone_url, str(repo_dir)],
check=True,
)
return repo_dir
def run_plan(plan: RunPlan) -> None:
plan.output_path.parent.mkdir(parents=True, exist_ok=True)
subprocess.run(plan.command, check=True)
def main() -> None:
parser = argparse.ArgumentParser(description="Run one nightly codebase genome pass for the next repo in an org")
parser.add_argument("--org", default="Timmy_Foundation")
parser.add_argument("--host", default="https://forge.alexanderwhitestone.com")
parser.add_argument("--token-file", default=os.path.expanduser("~/.config/gitea/token"))
parser.add_argument("--workspace-root", default=os.path.expanduser("~/timmy-foundation-repos"))
parser.add_argument("--output-root", default=os.path.expanduser("~/.timmy/codebase-genomes"))
parser.add_argument("--state-path", default=os.path.expanduser("~/.timmy/codebase_genome_state.json"))
parser.add_argument("--pipeline-script", default=str(Path(__file__).resolve().parents[1] / "pipelines" / "codebase_genome.py"))
parser.add_argument("--include-archived", action="store_true")
parser.add_argument("--dry-run", action="store_true")
args = parser.parse_args()
token_file = Path(args.token_file).expanduser()
workspace_root = Path(args.workspace_root).expanduser()
output_root = Path(args.output_root).expanduser()
state_path = Path(args.state_path).expanduser()
pipeline_script = Path(args.pipeline_script).expanduser()
repos = fetch_org_repos(args.org, args.host, token_file, include_archived=args.include_archived)
state = load_state(state_path)
repo = select_next_repo(repos, state)
plan = build_run_plan(repo, workspace_root=workspace_root, output_root=output_root, pipeline_script=pipeline_script)
if args.dry_run:
print(
json.dumps(
{
"repo": repo,
"repo_dir": str(plan.repo_dir),
"output_path": str(plan.output_path),
"command": plan.command,
},
indent=2,
)
)
return
ensure_checkout(repo, workspace_root=workspace_root, token_file=token_file)
run_plan(plan)
save_state(
state_path,
{
"last_index": sorted(repos, key=lambda item: item.get("full_name", item.get("name", "")).lower()).index(repo),
"last_repo": repo.get("name"),
},
)
print(f"Completed genome run for {repo['full_name']} -> {plan.output_path}")
if __name__ == "__main__":
main()


@@ -1,127 +0,0 @@
#!/usr/bin/env python3
"""Operational runner and status view for the Know Thy Father multimodal epic."""
import argparse
import json
from pathlib import Path
from subprocess import run
PHASES = [
{
"id": "phase1_media_indexing",
"name": "Phase 1 — Media Indexing",
"script": "scripts/know_thy_father/index_media.py",
"command_template": "python3 scripts/know_thy_father/index_media.py --tweets twitter-archive/extracted/tweets.jsonl --output twitter-archive/know-thy-father/media_manifest.jsonl",
"outputs": ["twitter-archive/know-thy-father/media_manifest.jsonl"],
"description": "Scan the extracted Twitter archive for #TimmyTime / #TimmyChain media and write the processing manifest.",
},
{
"id": "phase2_multimodal_analysis",
"name": "Phase 2 — Multimodal Analysis",
"script": "scripts/twitter_archive/analyze_media.py",
"command_template": "python3 scripts/twitter_archive/analyze_media.py --batch {batch_size}",
"outputs": [
"twitter-archive/know-thy-father/analysis.jsonl",
"twitter-archive/know-thy-father/meaning-kernels.jsonl",
"twitter-archive/know-thy-father/pipeline-status.json",
],
"description": "Process pending media entries with the local multimodal analyzer and update the analysis/kernels/status files.",
},
{
"id": "phase3_holographic_synthesis",
"name": "Phase 3 — Holographic Synthesis",
"script": "scripts/know_thy_father/synthesize_kernels.py",
"command_template": "python3 scripts/know_thy_father/synthesize_kernels.py --input twitter-archive/media/manifest.jsonl --output twitter-archive/knowledge/fathers_ledger.jsonl --summary twitter-archive/knowledge/fathers_ledger.summary.json",
"outputs": [
"twitter-archive/knowledge/fathers_ledger.jsonl",
"twitter-archive/knowledge/fathers_ledger.summary.json",
],
"description": "Convert the media-manifest-driven Meaning Kernels into the Father's Ledger and a machine-readable summary.",
},
{
"id": "phase4_cross_reference_audit",
"name": "Phase 4 — Cross-Reference Audit",
"script": "scripts/know_thy_father/crossref_audit.py",
"command_template": "python3 scripts/know_thy_father/crossref_audit.py --soul SOUL.md --kernels twitter-archive/notes/know_thy_father_crossref.md --output twitter-archive/notes/crossref_report.md",
"outputs": ["twitter-archive/notes/crossref_report.md"],
"description": "Compare Know Thy Father kernels against SOUL.md and related canon, then emit a Markdown audit report.",
},
{
"id": "phase5_processing_log",
"name": "Phase 5 — Processing Log / Status",
"script": "twitter-archive/know-thy-father/tracker.py",
"command_template": "python3 twitter-archive/know-thy-father/tracker.py report",
"outputs": ["twitter-archive/know-thy-father/REPORT.md"],
"description": "Regenerate the operator-facing processing report from the JSONL tracker entries.",
},
]
def build_pipeline_plan(batch_size: int = 10):
plan = []
for phase in PHASES:
plan.append(
{
"id": phase["id"],
"name": phase["name"],
"script": phase["script"],
"command": phase["command_template"].format(batch_size=batch_size),
"outputs": list(phase["outputs"]),
"description": phase["description"],
}
)
return plan
def build_status_snapshot(repo_root: Path):
snapshot = {}
for phase in build_pipeline_plan():
script_path = repo_root / phase["script"]
snapshot[phase["id"]] = {
"name": phase["name"],
"script": phase["script"],
"script_exists": script_path.exists(),
"outputs": [
{
"path": output,
"exists": (repo_root / output).exists(),
}
for output in phase["outputs"]
],
}
return snapshot
def run_step(repo_root: Path, step_id: str, batch_size: int = 10):
plan = {step["id"]: step for step in build_pipeline_plan(batch_size=batch_size)}
if step_id not in plan:
raise SystemExit(f"Unknown step: {step_id}")
step = plan[step_id]
return run(step["command"], cwd=repo_root, shell=True, check=False)
def main():
parser = argparse.ArgumentParser(description="Know Thy Father epic orchestration helper")
parser.add_argument("--batch-size", type=int, default=10)
parser.add_argument("--status", action="store_true")
parser.add_argument("--run-step", default=None)
parser.add_argument("--json", action="store_true")
args = parser.parse_args()
repo_root = Path(__file__).resolve().parents[2]
if args.run_step:
result = run_step(repo_root, args.run_step, batch_size=args.batch_size)
raise SystemExit(result.returncode)
payload = build_status_snapshot(repo_root) if args.status else build_pipeline_plan(batch_size=args.batch_size)
if args.json or args.status:
print(json.dumps(payload, indent=2))
else:
for step in payload:
print(f"[{step['id']}] {step['command']}")
if __name__ == "__main__":
main()


@@ -1,159 +0,0 @@
#!/usr/bin/env python3
"""Prepare a MemPalace v3.0.0 integration packet for Ezra's Hermes home."""
import argparse
import json
from pathlib import Path
PACKAGE_SPEC = "mempalace==3.0.0"
DEFAULT_HERMES_HOME = "~/.hermes/"
DEFAULT_SESSIONS_DIR = "~/.hermes/sessions/"
DEFAULT_PALACE_PATH = "~/.mempalace/palace"
DEFAULT_WING = "ezra_home"
def build_yaml_template(wing: str, palace_path: str) -> str:
return (
f"wing: {wing}\n"
f"palace: {palace_path}\n"
"rooms:\n"
" - name: sessions\n"
" description: Conversation history and durable agent transcripts\n"
" globs:\n"
" - \"*.json\"\n"
" - \"*.jsonl\"\n"
" - name: config\n"
" description: Hermes configuration and runtime settings\n"
" globs:\n"
" - \"*.yaml\"\n"
" - \"*.yml\"\n"
" - \"*.toml\"\n"
" - name: docs\n"
" description: Notes, markdown docs, and operating reports\n"
" globs:\n"
" - \"*.md\"\n"
" - \"*.txt\"\n"
"people: []\n"
"projects: []\n"
)
def build_plan(overrides: dict | None = None) -> dict:
overrides = overrides or {}
hermes_home = overrides.get("hermes_home", DEFAULT_HERMES_HOME)
sessions_dir = overrides.get("sessions_dir", DEFAULT_SESSIONS_DIR)
palace_path = overrides.get("palace_path", DEFAULT_PALACE_PATH)
wing = overrides.get("wing", DEFAULT_WING)
yaml_template = build_yaml_template(wing=wing, palace_path=palace_path)
config_home = hermes_home[:-1] if hermes_home.endswith("/") else hermes_home
plan = {
"package_spec": PACKAGE_SPEC,
"hermes_home": hermes_home,
"sessions_dir": sessions_dir,
"palace_path": palace_path,
"wing": wing,
"config_path": f"{config_home}/mempalace.yaml",
"install_command": f"pip install {PACKAGE_SPEC}",
"init_command": f"mempalace init {hermes_home} --yes",
"mine_home_command": f"echo \"\" | mempalace mine {hermes_home}",
"mine_sessions_command": f"echo \"\" | mempalace mine {sessions_dir} --mode convos",
"search_command": 'mempalace search "your common queries"',
"wake_up_command": "mempalace wake-up",
"mcp_command": "hermes mcp add mempalace -- python -m mempalace.mcp_server",
"yaml_template": yaml_template,
"gotchas": [
"`mempalace init` is still interactive in room approval flow; write mempalace.yaml manually if the init output stalls.",
"The yaml key is `wing:` not `wings:`. Using the wrong key causes mine/setup failures.",
"Pipe empty stdin into mining commands (`echo \"\" | ...`) to avoid the entity-detector stdin hang on larger directories.",
"First mine downloads the ChromaDB embedding model cache (~79MB).",
"Report Ezra's before/after metrics back to issue #568 after live installation and retrieval tests.",
],
}
return plan
def render_markdown(plan: dict) -> str:
gotchas = "\n".join(f"- {item}" for item in plan["gotchas"])
return f"""# MemPalace v3.0.0 — Ezra Integration Packet
This packet turns issue #570 into an executable, reviewable integration plan for Ezra's Hermes home.
It is a repo-side scaffold: no live Ezra host changes are claimed in this artifact.
## Commands
```bash
{plan['install_command']}
{plan['init_command']}
cat > {plan['config_path']} <<'YAML'
{plan['yaml_template'].rstrip()}
YAML
{plan['mine_home_command']}
{plan['mine_sessions_command']}
{plan['search_command']}
{plan['wake_up_command']}
{plan['mcp_command']}
```
## Manual config template
```yaml
{plan['yaml_template'].rstrip()}
```
## Why this shape
- `wing: {plan['wing']}` matches the issue's Ezra-specific integration target.
- `rooms` split the mined material into sessions, config, and docs to keep retrieval interpretable.
- Mining commands pipe empty stdin to avoid the interactive entity-detector hang noted in the evaluation.
## Gotchas
{gotchas}
## Report back to #568
After live execution on Ezra's actual environment, post back to #568 with:
- install result
- mine duration and corpus size
- 2-3 real search queries + retrieved results
- wake-up context token count
- whether MCP wiring succeeded
## Honest scope boundary
This repo artifact does **not** prove live installation on Ezra's host. It makes the work reproducible and testable so the next pass can execute it without guesswork.
"""
def main() -> None:
parser = argparse.ArgumentParser(description="Prepare the MemPalace Ezra integration packet")
parser.add_argument("--hermes-home", default=DEFAULT_HERMES_HOME)
parser.add_argument("--sessions-dir", default=DEFAULT_SESSIONS_DIR)
parser.add_argument("--palace-path", default=DEFAULT_PALACE_PATH)
parser.add_argument("--wing", default=DEFAULT_WING)
parser.add_argument("--output", default=None)
parser.add_argument("--json", action="store_true")
args = parser.parse_args()
plan = build_plan(
{
"hermes_home": args.hermes_home,
"sessions_dir": args.sessions_dir,
"palace_path": args.palace_path,
"wing": args.wing,
}
)
rendered = json.dumps(plan, indent=2) if args.json else render_markdown(plan)
if args.output:
output_path = Path(args.output).expanduser()
output_path.parent.mkdir(parents=True, exist_ok=True)
output_path.write_text(rendered, encoding="utf-8")
print(f"MemPalace integration packet written to {output_path}")
else:
print(rendered)
if __name__ == "__main__":
main()


@@ -1,155 +0,0 @@
#!/usr/bin/env python3
from __future__ import annotations
import argparse
import json
from pathlib import Path
from typing import Any
import yaml
DAYLIGHT_START = "10:00"
DAYLIGHT_END = "16:00"
def load_manifest(path: str | Path) -> dict[str, Any]:
data = yaml.safe_load(Path(path).read_text()) or {}
data.setdefault("machines", [])
return data
def validate_manifest(data: dict[str, Any]) -> None:
machines = data.get("machines", [])
if not machines:
raise ValueError("manifest must contain at least one machine")
seen: set[str] = set()
for machine in machines:
hostname = machine.get("hostname", "").strip()
if not hostname:
raise ValueError("each machine must declare a hostname")
if hostname in seen:
raise ValueError(f"duplicate hostname: {hostname} (unique hostnames are required)")
seen.add(hostname)
for field in ("machine_type", "ram_gb", "cpu_cores", "os", "adapter_condition"):
if field not in machine:
raise ValueError(f"machine {hostname} missing required field: {field}")
def _laptops(machines: list[dict[str, Any]]) -> list[dict[str, Any]]:
return [m for m in machines if m.get("machine_type") == "laptop"]
def _desktop(machines: list[dict[str, Any]]) -> dict[str, Any] | None:
for machine in machines:
if machine.get("machine_type") == "desktop":
return machine
return None
def choose_anchor_agents(machines: list[dict[str, Any]], count: int = 2) -> list[dict[str, Any]]:
eligible = [
m for m in _laptops(machines)
if m.get("adapter_condition") in {"good", "ok"} and m.get("always_on_capable", True)
]
eligible.sort(key=lambda m: (m.get("idle_watts", 9999), -m.get("ram_gb", 0), -m.get("cpu_cores", 0), m["hostname"]))
return eligible[:count]
def assign_roles(machines: list[dict[str, Any]]) -> dict[str, Any]:
anchors = choose_anchor_agents(machines, count=2)
anchor_names = {m["hostname"] for m in anchors}
desktop = _desktop(machines)
mapping: dict[str, dict[str, Any]] = {}
for machine in machines:
hostname = machine["hostname"]
if desktop and hostname == desktop["hostname"]:
mapping[hostname] = {
"role": "desktop_nas",
"schedule": f"{DAYLIGHT_START}-{DAYLIGHT_END}",
"duty_cycle": "daylight_only",
}
elif hostname in anchor_names:
mapping[hostname] = {
"role": "anchor_agent",
"schedule": "24/7",
"duty_cycle": "continuous",
}
else:
mapping[hostname] = {
"role": "daylight_agent",
"schedule": f"{DAYLIGHT_START}-{DAYLIGHT_END}",
"duty_cycle": "peak_solar",
}
return {
"anchor_agents": [m["hostname"] for m in anchors],
"desktop_nas": desktop["hostname"] if desktop else None,
"role_mapping": mapping,
}
def build_plan(data: dict[str, Any]) -> dict[str, Any]:
validate_manifest(data)
machines = data["machines"]
role_plan = assign_roles(machines)
return {
"fleet_name": data.get("fleet_name", "timmy-laptop-fleet"),
"machine_count": len(machines),
"anchor_agents": role_plan["anchor_agents"],
"desktop_nas": role_plan["desktop_nas"],
"daylight_window": f"{DAYLIGHT_START}-{DAYLIGHT_END}",
"role_mapping": role_plan["role_mapping"],
}
def render_markdown(plan: dict[str, Any], data: dict[str, Any]) -> str:
lines = [
"# Laptop Fleet Deployment Plan",
"",
f"Fleet: {plan['fleet_name']}",
f"Machine count: {plan['machine_count']}",
f"24/7 anchor agents: {', '.join(plan['anchor_agents']) if plan['anchor_agents'] else 'TBD'}",
f"Desktop/NAS: {plan['desktop_nas'] or 'TBD'}",
f"Daylight schedule: {plan['daylight_window']}",
"",
"## Role mapping",
"",
"| Hostname | Role | Schedule | Duty cycle |",
"|---|---|---|---|",
]
for hostname, role in sorted(plan["role_mapping"].items()):
lines.append(f"| {hostname} | {role['role']} | {role['schedule']} | {role['duty_cycle']} |")
lines.extend([
"",
"## Machine inventory",
"",
"| Hostname | Type | RAM | CPU cores | OS | Adapter | Idle watts | Notes |",
"|---|---|---:|---:|---|---|---:|---|",
])
for machine in data["machines"]:
lines.append(
f"| {machine['hostname']} | {machine['machine_type']} | {machine['ram_gb']} | {machine['cpu_cores']} | {machine['os']} | {machine['adapter_condition']} | {machine.get('idle_watts', 'n/a')} | {machine.get('notes', '')} |"
)
return "\n".join(lines) + "\n"
def main() -> int:
parser = argparse.ArgumentParser(description="Plan LAB-005 laptop fleet deployment.")
parser.add_argument("manifest", help="Path to laptop fleet manifest YAML")
parser.add_argument("--markdown", action="store_true", help="Render a markdown deployment plan instead of JSON")
args = parser.parse_args()
data = load_manifest(args.manifest)
plan = build_plan(data)
if args.markdown:
print(render_markdown(plan, data))
else:
print(json.dumps(plan, indent=2))
return 0
if __name__ == "__main__":
raise SystemExit(main())


@@ -1,135 +0,0 @@
#!/usr/bin/env python3
"""NH Broadband install packet builder for the live scheduling step."""
from __future__ import annotations
import argparse
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
import yaml
def load_request(path: str | Path) -> dict[str, Any]:
data = yaml.safe_load(Path(path).read_text()) or {}
data.setdefault("contact", {})
data.setdefault("service", {})
data.setdefault("call_log", [])
data.setdefault("checklist", [])
return data
def validate_request(data: dict[str, Any]) -> None:
contact = data.get("contact", {})
for field in ("name", "phone"):
if not contact.get(field, "").strip():
raise ValueError(f"contact.{field} is required")
service = data.get("service", {})
for field in ("address", "city", "state"):
if not service.get(field, "").strip():
raise ValueError(f"service.{field} is required")
if not data.get("checklist"):
raise ValueError("checklist must contain at least one item")
def build_packet(data: dict[str, Any]) -> dict[str, Any]:
validate_request(data)
contact = data["contact"]
service = data["service"]
return {
"packet_id": f"nh-bb-{datetime.now(timezone.utc).strftime('%Y%m%d-%H%M%S')}",
"generated_utc": datetime.now(timezone.utc).isoformat(),
"contact": {
"name": contact["name"],
"phone": contact["phone"],
"email": contact.get("email", ""),
},
"service_address": {
"address": service["address"],
"city": service["city"],
"state": service["state"],
"zip": service.get("zip", ""),
},
"desired_plan": data.get("desired_plan", "residential-fiber"),
"call_log": data.get("call_log", []),
"checklist": [
{"item": item, "done": False} if isinstance(item, str) else item
for item in data["checklist"]
],
"status": "pending_scheduling_call",
}
def render_markdown(packet: dict[str, Any], data: dict[str, Any]) -> str:
contact = packet["contact"]
addr = packet["service_address"]
lines = [
"# NH Broadband Install Packet",
"",
f"**Packet ID:** {packet['packet_id']}",
f"**Generated:** {packet['generated_utc']}",
f"**Status:** {packet['status']}",
"",
"## Contact",
"",
f"- **Name:** {contact['name']}",
f"- **Phone:** {contact['phone']}",
f"- **Email:** {contact.get('email', 'n/a')}",
"",
"## Service Address",
"",
f"- {addr['address']}",
f"- {addr['city']}, {addr['state']} {addr['zip']}",
"",
"## Desired Plan",
"",
f"{packet['desired_plan']}",
"",
"## Call Log",
"",
]
if packet["call_log"]:
for entry in packet["call_log"]:
ts = entry.get("timestamp", "n/a")
outcome = entry.get("outcome", "n/a")
notes = entry.get("notes", "")
lines.append(f"- **{ts}** — {outcome}")
if notes:
lines.append(f" - {notes}")
else:
lines.append("_No calls logged yet._")
lines.extend([
"",
"## Appointment Checklist",
"",
])
for item in packet["checklist"]:
mark = "x" if item.get("done") else " "
lines.append(f"- [{mark}] {item['item']}")
lines.append("")
return "\n".join(lines)
def main() -> int:
parser = argparse.ArgumentParser(description="Build NH Broadband install packet.")
parser.add_argument("request", help="Path to install request YAML")
parser.add_argument("--markdown", action="store_true", help="Render markdown instead of JSON")
args = parser.parse_args()
data = load_request(args.request)
packet = build_packet(data)
if args.markdown:
print(render_markdown(packet, data))
else:
print(json.dumps(packet, indent=2))
return 0
if __name__ == "__main__":
raise SystemExit(main())


@@ -0,0 +1,115 @@
from __future__ import annotations
import importlib.util
from pathlib import Path
ROOT = Path(__file__).resolve().parents[1]
PIPELINE_PATH = ROOT / "pipelines" / "codebase_genome.py"
NIGHTLY_PATH = ROOT / "scripts" / "codebase_genome_nightly.py"
GENOME_PATH = ROOT / "GENOME.md"
def _load_module(path: Path, name: str):
assert path.exists(), f"missing {path.relative_to(ROOT)}"
spec = importlib.util.spec_from_file_location(name, path)
assert spec and spec.loader
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
def test_generate_genome_markdown_contains_required_sections(tmp_path: Path) -> None:
genome_mod = _load_module(PIPELINE_PATH, "codebase_genome")
repo = tmp_path / "repo"
(repo / "tests").mkdir(parents=True)
(repo / "README.md").write_text("# Demo Repo\n\nA tiny example repo.\n")
(repo / "app.py").write_text(
"import module\n\n"
"def main():\n"
" return module.Helper().answer()\n\n"
"if __name__ == '__main__':\n"
" raise SystemExit(main())\n"
)
(repo / "module.py").write_text(
"class Helper:\n"
" def answer(self):\n"
" return 42\n"
)
(repo / "dangerous.py").write_text(
"import subprocess\n\n"
"def run_shell(cmd):\n"
" return subprocess.run(cmd, shell=True, check=False)\n"
)
(repo / "extra.py").write_text("VALUE = 7\n")
(repo / "tests" / "test_app.py").write_text(
"from app import main\n\n"
"def test_main():\n"
" assert main() == 42\n"
)
genome = genome_mod.generate_genome_markdown(repo, repo_name="org/repo")
for heading in (
"# GENOME.md — org/repo",
"## Project Overview",
"## Architecture",
"```mermaid",
"## Entry Points",
"## Data Flow",
"## Key Abstractions",
"## API Surface",
"## Test Coverage Report",
"## Security Audit Findings",
"## Dead Code Candidates",
"## Performance Bottleneck Analysis",
):
assert heading in genome
assert "app.py" in genome
assert "module.py" in genome
assert "dangerous.py" in genome
assert "extra.py" in genome
assert "shell=True" in genome
def test_nightly_runner_rotates_repos_and_builds_plan() -> None:
nightly_mod = _load_module(NIGHTLY_PATH, "codebase_genome_nightly")
repos = [
{"name": "alpha", "full_name": "Timmy_Foundation/alpha", "clone_url": "https://example/alpha.git"},
{"name": "beta", "full_name": "Timmy_Foundation/beta", "clone_url": "https://example/beta.git"},
]
state = {"last_index": 0, "last_repo": "alpha"}
next_repo = nightly_mod.select_next_repo(repos, state)
assert next_repo["name"] == "beta"
plan = nightly_mod.build_run_plan(
repo=next_repo,
workspace_root=Path("/tmp/repos"),
output_root=Path("/tmp/genomes"),
pipeline_script=Path("/tmp/timmy-home/pipelines/codebase_genome.py"),
)
assert plan.repo_dir == Path("/tmp/repos/beta")
assert plan.output_path == Path("/tmp/genomes/beta/GENOME.md")
assert "codebase_genome.py" in plan.command[1]
assert plan.command[-1] == "/tmp/genomes/beta/GENOME.md"
def test_repo_contains_generated_timmy_home_genome() -> None:
assert GENOME_PATH.exists(), "missing generated GENOME.md for timmy-home"
text = GENOME_PATH.read_text(encoding="utf-8")
for snippet in (
"# GENOME.md — Timmy_Foundation/timmy-home",
"## Project Overview",
"## Architecture",
"## Entry Points",
"## API Surface",
"## Test Coverage Report",
"## Security Audit Findings",
"## Performance Bottleneck Analysis",
):
assert snippet in text


@@ -1,60 +0,0 @@
from pathlib import Path
import unittest
ROOT = Path(__file__).resolve().parent.parent
GENOME_PATH = ROOT / "genomes" / "fleet-ops-GENOME.md"
class TestFleetOpsGenome(unittest.TestCase):
def test_genome_file_exists_with_required_sections(self):
self.assertTrue(GENOME_PATH.exists(), "missing genomes/fleet-ops-GENOME.md")
text = GENOME_PATH.read_text(encoding="utf-8")
required_sections = [
"# GENOME.md — fleet-ops",
"## Project Overview",
"## Architecture",
"## Entry Points",
"## Data Flow",
"## Key Abstractions",
"## API Surface",
"## Test Coverage Gaps",
"## Security Considerations",
"## Deployment",
]
for section in required_sections:
self.assertIn(section, text)
def test_genome_names_real_files_and_grounded_findings(self):
text = GENOME_PATH.read_text(encoding="utf-8")
required_snippets = [
"```mermaid",
"playbooks/site.yml",
"playbooks/deploy_hermes.yml",
"scripts/deploy-hook.py",
"scripts/dispatch_consumer.py",
"message_bus.py",
"knowledge_store.py",
"health_dashboard.py",
"registry.yaml",
"manifest.yaml",
"DEPLOY_HOOK_SECRET",
"gitea_register_act_runner: false",
"gitea_webhook_allowed_host_list",
"tests/test_video_engine_client.py",
"158 passed, 1 failed, 2 errors",
"hermes_tools",
"8643",
"8646",
]
for snippet in required_snippets:
self.assertIn(snippet, text)
def test_genome_is_substantial(self):
text = GENOME_PATH.read_text(encoding="utf-8")
self.assertGreaterEqual(len(text.splitlines()), 120)
self.assertGreaterEqual(len(text), 7000)
if __name__ == "__main__":
unittest.main()


@@ -1,76 +0,0 @@
from pathlib import Path
import importlib.util
import unittest
ROOT = Path(__file__).resolve().parent.parent
SCRIPT_PATH = ROOT / "scripts" / "know_thy_father" / "epic_pipeline.py"
DOC_PATH = ROOT / "docs" / "KNOW_THY_FATHER_MULTIMODAL_PIPELINE.md"
def load_module(path: Path, name: str):
assert path.exists(), f"missing {path.relative_to(ROOT)}"
spec = importlib.util.spec_from_file_location(name, path)
assert spec and spec.loader
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
class TestKnowThyFatherEpicPipeline(unittest.TestCase):
def test_build_pipeline_plan_contains_all_phases_in_order(self):
mod = load_module(SCRIPT_PATH, "ktf_epic_pipeline")
plan = mod.build_pipeline_plan(batch_size=10)
self.assertEqual(
[step["id"] for step in plan],
[
"phase1_media_indexing",
"phase2_multimodal_analysis",
"phase3_holographic_synthesis",
"phase4_cross_reference_audit",
"phase5_processing_log",
],
)
self.assertIn("scripts/know_thy_father/index_media.py", plan[0]["command"])
self.assertIn("scripts/twitter_archive/analyze_media.py --batch 10", plan[1]["command"])
self.assertIn("scripts/know_thy_father/synthesize_kernels.py", plan[2]["command"])
self.assertIn("scripts/know_thy_father/crossref_audit.py", plan[3]["command"])
self.assertIn("twitter-archive/know-thy-father/tracker.py report", plan[4]["command"])
def test_status_snapshot_reports_key_artifact_paths(self):
mod = load_module(SCRIPT_PATH, "ktf_epic_pipeline")
status = mod.build_status_snapshot(ROOT)
self.assertIn("phase1_media_indexing", status)
self.assertIn("phase2_multimodal_analysis", status)
self.assertIn("phase3_holographic_synthesis", status)
self.assertIn("phase4_cross_reference_audit", status)
self.assertIn("phase5_processing_log", status)
self.assertEqual(status["phase1_media_indexing"]["script"], "scripts/know_thy_father/index_media.py")
self.assertEqual(status["phase2_multimodal_analysis"]["script"], "scripts/twitter_archive/analyze_media.py")
self.assertEqual(status["phase5_processing_log"]["script"], "twitter-archive/know-thy-father/tracker.py")
self.assertTrue(status["phase1_media_indexing"]["script_exists"])
self.assertTrue(status["phase2_multimodal_analysis"]["script_exists"])
self.assertTrue(status["phase3_holographic_synthesis"]["script_exists"])
self.assertTrue(status["phase4_cross_reference_audit"]["script_exists"])
self.assertTrue(status["phase5_processing_log"]["script_exists"])
def test_repo_contains_multimodal_pipeline_doc(self):
self.assertTrue(DOC_PATH.exists(), "missing committed Know Thy Father pipeline doc")
text = DOC_PATH.read_text(encoding="utf-8")
required = [
"# Know Thy Father — Multimodal Media Consumption Pipeline",
"scripts/know_thy_father/index_media.py",
"scripts/twitter_archive/analyze_media.py --batch 10",
"scripts/know_thy_father/synthesize_kernels.py",
"scripts/know_thy_father/crossref_audit.py",
"twitter-archive/know-thy-father/tracker.py report",
"Refs #582",
]
for snippet in required:
self.assertIn(snippet, text)
if __name__ == "__main__":
unittest.main()


@@ -1,52 +0,0 @@
from pathlib import Path
import yaml
from scripts.plan_laptop_fleet import build_plan, load_manifest, render_markdown, validate_manifest
def test_laptop_fleet_planner_script_exists() -> None:
assert Path("scripts/plan_laptop_fleet.py").exists()
def test_laptop_fleet_manifest_template_exists() -> None:
assert Path("docs/laptop-fleet-manifest.example.yaml").exists()
def test_build_plan_selects_two_lowest_idle_watt_laptops_as_anchors() -> None:
data = load_manifest("docs/laptop-fleet-manifest.example.yaml")
plan = build_plan(data)
assert plan["anchor_agents"] == ["timmy-anchor-a", "timmy-anchor-b"]
assert plan["desktop_nas"] == "timmy-desktop-nas"
assert plan["role_mapping"]["timmy-daylight-a"]["schedule"] == "10:00-16:00"
def test_validate_manifest_requires_unique_hostnames() -> None:
data = {
"machines": [
{"hostname": "dup", "machine_type": "laptop", "ram_gb": 8, "cpu_cores": 4, "os": "Linux", "adapter_condition": "good"},
{"hostname": "dup", "machine_type": "laptop", "ram_gb": 16, "cpu_cores": 8, "os": "Linux", "adapter_condition": "good"},
]
}
try:
validate_manifest(data)
except ValueError as exc:
assert "duplicate hostname" in str(exc)
assert "unique hostnames" in str(exc)
else:
raise AssertionError("validate_manifest should reject duplicate hostname")
def test_markdown_contains_anchor_agents_and_daylight_schedule() -> None:
data = load_manifest("docs/laptop-fleet-manifest.example.yaml")
plan = build_plan(data)
content = render_markdown(plan, data)
assert "24/7 anchor agents: timmy-anchor-a, timmy-anchor-b" in content
assert "Daylight schedule: 10:00-16:00" in content
assert "desktop_nas" in content
def test_manifest_template_is_valid_yaml() -> None:
data = yaml.safe_load(Path("docs/laptop-fleet-manifest.example.yaml").read_text())
assert data["fleet_name"] == "timmy-laptop-fleet"
assert len(data["machines"]) == 6


@@ -1,68 +0,0 @@
from pathlib import Path
import importlib.util
import unittest
ROOT = Path(__file__).resolve().parent.parent
SCRIPT_PATH = ROOT / "scripts" / "mempalace_ezra_integration.py"
DOC_PATH = ROOT / "docs" / "MEMPALACE_EZRA_INTEGRATION.md"
def load_module(path: Path, name: str):
assert path.exists(), f"missing {path.relative_to(ROOT)}"
spec = importlib.util.spec_from_file_location(name, path)
assert spec and spec.loader
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
class TestMempalaceEzraIntegration(unittest.TestCase):
def test_build_plan_contains_issue_required_steps_and_gotchas(self):
mod = load_module(SCRIPT_PATH, "mempalace_ezra_integration")
plan = mod.build_plan({})
self.assertEqual(plan["package_spec"], "mempalace==3.0.0")
self.assertIn("pip install mempalace==3.0.0", plan["install_command"])
self.assertEqual(plan["wing"], "ezra_home")
self.assertIn('echo "" | mempalace mine ~/.hermes/', plan["mine_home_command"])
self.assertIn('--mode convos', plan["mine_sessions_command"])
self.assertIn('mempalace wake-up', plan["wake_up_command"])
self.assertIn('hermes mcp add mempalace -- python -m mempalace.mcp_server', plan["mcp_command"])
self.assertIn('wing:', plan["yaml_template"])
self.assertTrue(any('stdin' in item.lower() for item in plan["gotchas"]))
self.assertTrue(any('wing:' in item for item in plan["gotchas"]))
def test_build_plan_accepts_path_and_wing_overrides(self):
mod = load_module(SCRIPT_PATH, "mempalace_ezra_integration")
plan = mod.build_plan(
{
"hermes_home": "/root/wizards/ezra/home",
"sessions_dir": "/root/wizards/ezra/home/sessions",
"wing": "ezra_archive",
}
)
self.assertEqual(plan["wing"], "ezra_archive")
self.assertIn('/root/wizards/ezra/home', plan["mine_home_command"])
self.assertIn('/root/wizards/ezra/home/sessions', plan["mine_sessions_command"])
self.assertIn('wing: ezra_archive', plan["yaml_template"])
def test_repo_contains_mem_palace_ezra_doc(self):
self.assertTrue(DOC_PATH.exists(), "missing committed MemPalace Ezra integration doc")
text = DOC_PATH.read_text(encoding="utf-8")
required = [
"# MemPalace v3.0.0 — Ezra Integration Packet",
"pip install mempalace==3.0.0",
'echo "" | mempalace mine ~/.hermes/',
"mempalace mine ~/.hermes/sessions/ --mode convos",
"mempalace wake-up",
"hermes mcp add mempalace -- python -m mempalace.mcp_server",
"Report back to #568",
]
for snippet in required:
self.assertIn(snippet, text)
if __name__ == "__main__":
unittest.main()


@@ -1,105 +0,0 @@
from pathlib import Path
import yaml
from scripts.plan_nh_broadband_install import (
build_packet,
load_request,
render_markdown,
validate_request,
)
def test_script_exists() -> None:
assert Path("scripts/plan_nh_broadband_install.py").exists()
def test_example_request_exists() -> None:
assert Path("docs/nh-broadband-install-request.example.yaml").exists()
def test_example_packet_exists() -> None:
assert Path("docs/nh-broadband-install-packet.example.md").exists()
def test_research_memo_exists() -> None:
assert Path("reports/operations/2026-04-15-nh-broadband-public-research.md").exists()
def test_load_and_build_packet() -> None:
data = load_request("docs/nh-broadband-install-request.example.yaml")
packet = build_packet(data)
assert packet["contact"]["name"] == "Timmy Operator"
assert packet["service_address"]["city"] == "Concord"
assert packet["service_address"]["state"] == "NH"
assert packet["status"] == "pending_scheduling_call"
assert len(packet["checklist"]) == 8
assert packet["checklist"][0]["done"] is False
def test_validate_rejects_missing_contact_name() -> None:
data = {
"contact": {"name": "", "phone": "555"},
"service": {"address": "1 St", "city": "X", "state": "NH"},
"checklist": ["do thing"],
}
try:
validate_request(data)
except ValueError as exc:
assert "contact.name" in str(exc)
else:
raise AssertionError("should reject empty contact name")
def test_validate_rejects_missing_service_address() -> None:
data = {
"contact": {"name": "A", "phone": "555"},
"service": {"address": "", "city": "X", "state": "NH"},
"checklist": ["do thing"],
}
try:
validate_request(data)
except ValueError as exc:
assert "service.address" in str(exc)
else:
raise AssertionError("should reject empty service address")
def test_validate_rejects_empty_checklist() -> None:
data = {
"contact": {"name": "A", "phone": "555"},
"service": {"address": "1 St", "city": "X", "state": "NH"},
"checklist": [],
}
try:
validate_request(data)
except ValueError as exc:
assert "checklist" in str(exc)
else:
raise AssertionError("should reject empty checklist")
def test_render_markdown_contains_key_sections() -> None:
data = load_request("docs/nh-broadband-install-request.example.yaml")
packet = build_packet(data)
md = render_markdown(packet, data)
assert "# NH Broadband Install Packet" in md
assert "## Contact" in md
assert "## Service Address" in md
assert "## Call Log" in md
assert "## Appointment Checklist" in md
assert "Concord" in md
assert "NH" in md
def test_render_markdown_shows_checklist_items() -> None:
data = load_request("docs/nh-broadband-install-request.example.yaml")
packet = build_packet(data)
md = render_markdown(packet, data)
assert "- [ ] Confirm exact-address availability" in md
def test_example_yaml_is_valid() -> None:
data = yaml.safe_load(Path("docs/nh-broadband-install-request.example.yaml").read_text())
assert data["contact"]["name"] == "Timmy Operator"
assert len(data["checklist"]) == 8


@@ -17,24 +17,8 @@ from typing import Dict, Any, Optional, List
from pathlib import Path
from dataclasses import dataclass
from enum import Enum
import importlib.util
def _load_local(module_name: str, filename: str):
"""Import a module from an explicit file path, bypassing sys.path resolution."""
spec = importlib.util.spec_from_file_location(
module_name,
str(Path(__file__).parent / filename),
)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
return mod
_harness = _load_local("v2_harness", "harness.py")
UniWizardHarness = _harness.UniWizardHarness
House = _harness.House
ExecutionResult = _harness.ExecutionResult
from harness import UniWizardHarness, House, ExecutionResult
class TaskType(Enum):


@@ -8,30 +8,13 @@ import time
import sys
import argparse
import os
import importlib.util
from pathlib import Path
from datetime import datetime
from typing import Dict, List, Optional
def _load_local(module_name: str, filename: str):
"""Import a module from an explicit file path, bypassing sys.path resolution.
Prevents namespace collisions when multiple directories contain modules
with the same name (e.g. uni-wizard/harness.py vs uni-wizard/v2/harness.py).
"""
spec = importlib.util.spec_from_file_location(
module_name,
str(Path(__file__).parent / filename),
)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
return mod
_harness = _load_local("v2_harness", "harness.py")
UniWizardHarness = _harness.UniWizardHarness
House = _harness.House
ExecutionResult = _harness.ExecutionResult
sys.path.insert(0, str(Path(__file__).parent))
from harness import UniWizardHarness, House, ExecutionResult
from router import HouseRouter, TaskType
from author_whitelist import AuthorWhitelist
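The `_load_local` helper that these hunks remove (in favor of `sys.path.insert` plus a plain `import harness`) exists to dodge exactly the collision its docstring describes: two files both named `harness.py` in sibling directories. A minimal standalone sketch of that path-based import pattern, using a hypothetical file layout rather than the repo's actual `uni-wizard` tree:

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_local(module_name: str, path: Path):
    """Load a module from an explicit file path, bypassing sys.path lookup."""
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    mod = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = mod  # register under the chosen (unique) name
    spec.loader.exec_module(mod)
    return mod


# Two files both named harness.py would collide under a plain
# `import harness`; giving each a distinct module name keeps both loadable.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "v1").mkdir()
    (root / "v2").mkdir()
    (root / "v1" / "harness.py").write_text("VERSION = 1\n")
    (root / "v2" / "harness.py").write_text("VERSION = 2\n")
    h1 = load_local("v1_harness", root / "v1" / "harness.py")
    h2 = load_local("v2_harness", root / "v2" / "harness.py")
    print(h1.VERSION, h2.VERSION)
```

The `sys.path.insert` approach the diff switches to is simpler but reintroduces the collision risk whenever two directories on the path contain same-named modules; Python will silently import whichever comes first.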