Compare commits

2 Commits

Author SHA1 Message Date
Alexander Whitestone
aad1b0e652 docs: add GENOME template and refresh root genome (#666)
Some checks failed
Agent PR Gate / gate (pull_request) Failing after 26s
Self-Healing Smoke / self-healing-smoke (pull_request) Failing after 14s
Smoke Test / smoke (pull_request) Failing after 8s
Agent PR Gate / report (pull_request) Has been cancelled
2026-04-17 03:09:59 -04:00
Alexander Whitestone
e47e6506b4 test: cover GENOME template surface for #666 2026-04-17 03:05:53 -04:00
9 changed files with 230 additions and 586 deletions

299
GENOME.md
@@ -1,209 +1,144 @@
# GENOME.md — the-nexus
# GENOME.md — Timmy_Foundation/timmy-home
Generated by `pipelines/codebase_genome.py`.
## Project Overview
`the-nexus` is a hybrid repo that combines three layers in one codebase:
Timmy Foundation's home repository for development operations and configurations.
1. A browser-facing world shell rooted in `index.html`, `boot.js`, `bootstrap.mjs`, `app.js`, `style.css`, `portals.json`, `vision.json`, `manifest.json`, and `gofai_worker.js`
2. A Python realtime bridge centered on `server.py` plus harness code under `nexus/`
3. A memory / fleet / operator layer spanning `mempalace/`, `mcp_servers/`, `multi_user_bridge.py`, and supporting scripts
- Text files indexed: 3133
- Source and script files: 219
- Test files: 73
- Documentation files: 743
The repo is not a clean single-purpose frontend and not just a backend harness. It is a mixed world/runtime/ops repository where browser rendering, WebSocket telemetry, MCP-driven game harnesses, and fleet memory tooling coexist.
Grounded repo facts from this checkout:
- Browser shell files exist at repo root: `index.html`, `app.js`, `style.css`, `manifest.json`, `gofai_worker.js`
- Data/config files also live at repo root: `portals.json`, `vision.json`
- Realtime bridge exists in `server.py`
- Game harnesses exist in `nexus/morrowind_harness.py` and `nexus/bannerlord_harness.py`
- Memory/fleet sync exists in `mempalace/tunnel_sync.py`
- Desktop/game automation MCP servers exist in `mcp_servers/desktop_control_server.py` and `mcp_servers/steam_info_server.py`
- Validation exists in `tests/test_browser_smoke.py`, `tests/test_portals_json.py`, `tests/test_index_html_integrity.py`, and `tests/test_repo_truth.py`
The current architecture is best understood as a sovereign world shell plus operator/game harness backend, with accumulated documentation drift from multiple restoration and migration efforts.
## Architecture Diagram
## Architecture
```mermaid
graph TD
browser[Index HTML Shell\nindex.html -> boot.js -> bootstrap.mjs -> app.js]
assets[Root Assets\nstyle.css\nmanifest.json\ngofai_worker.js]
data[World Data\nportals.json\nvision.json]
ws[Realtime Bridge\nserver.py\nWebSocket broadcast hub]
gofai[In-browser GOFAI\nSymbolicEngine\nNeuroSymbolicBridge\nsetupGOFAI/updateGOFAI]
harnesses[Python Harnesses\nnexus/morrowind_harness.py\nnexus/bannerlord_harness.py]
mcp[MCP Adapters\nmcp_servers/desktop_control_server.py\nmcp_servers/steam_info_server.py]
memory[Memory + Fleet\nmempalace/tunnel_sync.py\nmempalace.js]
bridge[Operator / MUD Bridge\nmulti_user_bridge.py\ncommands/timmy_commands.py]
tests[Verification\ntests/test_browser_smoke.py\ntests/test_portals_json.py\ntests/test_repo_truth.py]
docs[Contracts + Drift Docs\nBROWSER_CONTRACT.md\nREADME.md\nCLAUDE.md\nINVESTIGATION_ISSUE_1145.md]
browser --> assets
browser --> data
browser --> gofai
browser --> ws
harnesses --> mcp
harnesses --> ws
bridge --> ws
memory --> ws
tests --> browser
tests --> data
tests --> docs
docs --> browser
repo_root["repo"]
angband["angband"]
ansible["ansible"]
briefings["briefings"]
codebase_genome["codebase_genome"]
config["config"]
configs["configs"]
conftest["conftest"]
dns_records["dns-records"]
evennia["evennia"]
evennia_tools["evennia_tools"]
repo_root --> angband
repo_root --> ansible
repo_root --> briefings
repo_root --> codebase_genome
repo_root --> config
repo_root --> configs
```
## Entry Points and Data Flow
## Entry Points
### Primary entry points
- `codebase_genome.py` — python main guard (`python3 codebase_genome.py`)
- `gemini-fallback-setup.sh` — operational script (`bash gemini-fallback-setup.sh`)
- `morrowind/hud.sh` — operational script (`bash morrowind/hud.sh`)
- `pipelines/codebase_genome.py` — python main guard (`python3 pipelines/codebase_genome.py`)
- `scripts/agent_pr_gate.py` — operational script (`python3 scripts/agent_pr_gate.py`)
- `scripts/auto_restart_agent.sh` — operational script (`bash scripts/auto_restart_agent.sh`)
- `scripts/autonomous_issue_creator.py` — operational script (`python3 scripts/autonomous_issue_creator.py`)
- `scripts/backlog_cleanup.py` — operational script (`python3 scripts/backlog_cleanup.py`)
- `scripts/backlog_triage.py` — operational script (`python3 scripts/backlog_triage.py`)
- `scripts/backlog_triage_cron.sh` — operational script (`bash scripts/backlog_triage_cron.sh`)
- `scripts/backup_pipeline.sh` — operational script (`bash scripts/backup_pipeline.sh`)
- `scripts/bezalel_gemma4_vps.py` — operational script (`python3 scripts/bezalel_gemma4_vps.py`)
- `index.html` — root browser entry point
- `boot.js` — startup selector; `tests/boot.test.js` shows it chooses file-mode vs HTTP/module-mode and injects `bootstrap.mjs` when served over HTTP
- `bootstrap.mjs` — module bootstrap for the browser shell
- `app.js` — main browser runtime; owns world state, GOFAI wiring, metrics polling, and portal/UI logic
- `server.py` — WebSocket broadcast bridge on `ws://0.0.0.0:8765`
- `nexus/morrowind_harness.py` — GamePortal/MCP harness for OpenMW Morrowind
- `nexus/bannerlord_harness.py` — GamePortal/MCP harness for Bannerlord
- `mempalace/tunnel_sync.py` — pulls remote fleet closets into the local palace over HTTP
- `multi_user_bridge.py` — HTTP bridge for multi-user chat/session integration
- `mcp_servers/desktop_control_server.py` — stdio MCP server exposing screenshots/mouse/keyboard control
## Data Flow
### Data flow
1. Browser startup begins at `index.html`
2. `boot.js` decides whether the page is being served correctly; in HTTP mode it injects `bootstrap.mjs`
3. `bootstrap.mjs` hands off to `app.js`
4. `app.js` loads world configuration from `portals.json` and `vision.json`
5. `app.js` constructs the Three.js scene and in-browser reasoning components, including `SymbolicEngine`, `NeuroSymbolicBridge`, `setupGOFAI()`, and `updateGOFAI()`
6. Browser state and external runtimes connect through `server.py`, which broadcasts messages between connected clients
7. Python harnesses (`nexus/morrowind_harness.py`, `nexus/bannerlord_harness.py`) spawn MCP subprocesses for desktop control / Steam metadata, capture state, execute actions, and feed telemetry into the Nexus bridge
8. Memory/fleet tools like `mempalace/tunnel_sync.py` import remote palace data into local closets, extending what the operator/runtime layers can inspect
9. Tests validate both the static browser contract and the higher-level repo-truth/memory contracts
### Important repo-specific runtime facts
- `portals.json` is a JSON array of portal/world/operator entries; examples in this checkout include `morrowind`, `bannerlord`, `workshop`, `archive`, `chapel`, and `courtyard`
- `server.py` is a plain broadcast hub: clients send messages, the server forwards them to other connected clients
- `nexus/morrowind_harness.py` and `nexus/bannerlord_harness.py` both implement a GamePortal pattern with MCP subprocess clients over stdio and WebSocket telemetry uplink
- `mempalace/tunnel_sync.py` is not speculative; it is a real client that discovers remote wings, searches remote rooms, and writes `.closet.json` payloads locally
1. Operators enter through `codebase_genome.py`, `gemini-fallback-setup.sh`, `morrowind/hud.sh`.
2. Core logic fans into top-level components: `angband`, `ansible`, `briefings`, `codebase_genome`, `config`, `configs`.
3. Validation is incomplete around `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py`, `timmy-local/cache/agent_cache.py`, `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py`, so changes there carry regression risk.
4. Final artifacts land as repository files, docs, or runtime side effects depending on the selected entry point.
## Key Abstractions
### Browser runtime
- `app.js`
- Defines in-browser reasoning/state machinery, including `class SymbolicEngine`, `class NeuroSymbolicBridge`, `setupGOFAI()`, and `updateGOFAI()`
- Couples rendering, local symbolic reasoning, metrics polling, and portal/UI logic in one very large root module
- `BROWSER_CONTRACT.md`
- Acts like an executable architecture contract for the browser surface
- Declares required files, DOM IDs, Three.js expectations, provenance rules, and WebSocket expectations
### Realtime bridge
- `server.py`
- Single hub abstraction: a WebSocket broadcast server maintaining a `clients` set and forwarding messages from one client to the others
- This is the seam between browser shell, harnesses, and external telemetry producers
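The broadcast behaviour described for `server.py` can be sketched without any network code. This is a minimal in-memory model that assumes only what the text states: a `clients` set, and messages forwarded to every connected client except the sender. The `BroadcastHub` class and its method names are hypothetical and are not taken from `server.py`.

```python
from collections import defaultdict


class BroadcastHub:
    """In-memory stand-in for a broadcast hub (hypothetical; not the real server.py)."""

    def __init__(self) -> None:
        self.clients: set[str] = set()
        # Each client gets a list that plays the role of its socket.
        self.inboxes: dict[str, list[str]] = defaultdict(list)

    def connect(self, client_id: str) -> None:
        self.clients.add(client_id)

    def disconnect(self, client_id: str) -> None:
        self.clients.discard(client_id)

    def send(self, sender: str, message: str) -> int:
        """Forward a message to every connected client except the sender."""
        recipients = self.clients - {sender}
        for client_id in recipients:
            self.inboxes[client_id].append(message)
        return len(recipients)


hub = BroadcastHub()
for name in ("browser", "harness", "operator"):
    hub.connect(name)
delivered = hub.send("harness", '{"type": "telemetry"}')
```

Nothing in this model checks who is sending, so any connected client can publish to all the others.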
### GamePortal harness layer
- `nexus/morrowind_harness.py`
- `nexus/bannerlord_harness.py`
- Both define MCP client wrappers, `GameState` / `ActionResult`-style data classes, and an Observe-Decide-Act telemetry loop
- The harnesses are symmetric enough to be understood as reusable portal adapters with game-specific context injected on top
### Memory / fleet layer
- `mempalace/tunnel_sync.py`
- Encodes the fleet-memory sync client contract: discover wings, pull broad room queries, write closet files, support dry-run
- `mempalace.js`
- Minimal browser/Electron bridge to MemPalace commands via `window.electronAPI.execPython(...)`
- Important because it shows a second memory integration surface distinct from the Python fleet sync path
### Operator / interaction bridge
- `multi_user_bridge.py`
- `commands/timmy_commands.py`
- These bridge user-facing conversations or MUD/Evennia interactions back into Timmy/Nexus services
- `codebase_genome.py` — classes `FunctionInfo`:19; functions `extract_functions()`:58, `generate_test()`:116, `scan_repo()`:191, `find_existing_tests()`:209, `main()`:231
- `evennia/timmy_world/game.py` — classes `World`:91, `ActionSystem`:421, `TimmyAI`:539, `NPCAI`:550; functions `get_narrative_phase()`:55, `get_phase_transition_event()`:65
- `evennia/timmy_world/world/game.py` — classes `World`:19, `ActionSystem`:326, `TimmyAI`:444, `NPCAI`:455; functions none detected
- `timmy-world/game.py` — classes `World`:19, `ActionSystem`:349, `TimmyAI`:467, `NPCAI`:478; functions none detected
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — classes none detected; functions none detected
- `uniwizard/self_grader.py` — classes `SessionGrade`:23, `WeeklyReport`:55, `SelfGrader`:74; functions `main()`:713
- `uni-wizard/v3/intelligence_engine.py` — classes `ExecutionPattern`:27, `ModelPerformance`:44, `AdaptationEvent`:58, `PatternDatabase`:69; functions none detected
- `scripts/know_thy_father/crossref_audit.py` — classes `ThemeCategory`:30, `Principle`:160, `MeaningKernel`:169, `CrossRefFinding`:178; functions `extract_themes_from_text()`:192, `parse_soul_md()`:206, `parse_kernels()`:264, `cross_reference()`:296, `generate_report()`:440, `main()`:561
## API Surface
### Browser / static surface
- CLI: `python3 codebase_genome.py` — python main guard (`codebase_genome.py`)
- CLI: `bash gemini-fallback-setup.sh` — operational script (`gemini-fallback-setup.sh`)
- CLI: `bash morrowind/hud.sh` — operational script (`morrowind/hud.sh`)
- CLI: `python3 pipelines/codebase_genome.py` — python main guard (`pipelines/codebase_genome.py`)
- CLI: `python3 scripts/agent_pr_gate.py` — operational script (`scripts/agent_pr_gate.py`)
- CLI: `bash scripts/auto_restart_agent.sh` — operational script (`scripts/auto_restart_agent.sh`)
- CLI: `python3 scripts/autonomous_issue_creator.py` — operational script (`scripts/autonomous_issue_creator.py`)
- CLI: `python3 scripts/backlog_cleanup.py` — operational script (`scripts/backlog_cleanup.py`)
- Python: `extract_functions()` from `codebase_genome.py:58`
- Python: `generate_test()` from `codebase_genome.py:116`
- Python: `scan_repo()` from `codebase_genome.py:191`
- Python: `find_existing_tests()` from `codebase_genome.py:209`
- Python: `main()` from `codebase_genome.py:231`
- Python: `get_narrative_phase()` from `evennia/timmy_world/game.py:55`
- `index.html` served over HTTP
- `boot.js` exports `bootPage()`; verified by `node --test tests/boot.test.js`
- Data APIs are file-based inside the repo: `portals.json`, `vision.json`, `manifest.json`
## Test Coverage Report
### Network/runtime surface
- Source and script files inspected: 219
- Test files inspected: 73
- Coverage gaps:
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — no matching test reference detected
- `timmy-local/cache/agent_cache.py` — no matching test reference detected
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py` — no matching test reference detected
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/godmode_race.py` — no matching test reference detected
- `skills/productivity/google-workspace/scripts/google_api.py` — no matching test reference detected
- `wizards/allegro/home/skills/productivity/google-workspace/scripts/google_api.py` — no matching test reference detected
- `morrowind/pilot.py` — no matching test reference detected
- `skills/research/domain-intel/scripts/domain_intel.py` — no matching test reference detected
- `wizards/allegro/home/skills/research/domain-intel/scripts/domain_intel.py` — no matching test reference detected
- `timmy-local/scripts/ingest.py` — no matching test reference detected
- `uni-wizard/scripts/generate_scorecard.py` — no matching test reference detected
- `morrowind/local_brain.py` — no matching test reference detected
- `python3 server.py`
- Starts the WebSocket bridge on port `8765`
- `python3 l402_server.py`
- Local HTTP microservice for cost-estimate style responses
- `python3 multi_user_bridge.py`
- Multi-user HTTP/chat bridge
## Security Audit Findings
### Harness / operator CLI surfaces
- [medium] `briefings/briefing_20260325.json:37` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"gitea_error": "Gitea 404: {\"errors\":null,\"message\":\"not found\",\"url\":\"http://143.198.27.163:3000/api/swagger\"}\n [http://143.198.27.163:3000/api/v1/repos/Timmy_Foundation/sovereign-orchestration/issues?state=open&type=issues&sort=created&direction=desc&limit=1&page=1]",`
- [medium] `briefings/briefing_20260328.json:11` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"provider_base_url": "http://localhost:8081/v1",`
- [medium] `briefings/briefing_20260329.json:11` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"provider_base_url": "http://localhost:8081/v1",`
- [medium] `config.yaml:37` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `summary_base_url: http://localhost:11434/v1`
- [medium] `config.yaml:47` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:52` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:57` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:62` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:67` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:77` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:82` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:174` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: http://localhost:11434/v1`
- `python3 nexus/morrowind_harness.py`
- `python3 nexus/bannerlord_harness.py`
- `python3 mempalace/tunnel_sync.py --peer <url> [--dry-run] [--n N]`
- `python3 mcp_servers/desktop_control_server.py`
- `python3 mcp_servers/steam_info_server.py`
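The `hardcoded http endpoint` findings in the Security Audit Findings list are the kind of result a line-by-line pattern scan produces. Here is a minimal sketch, assuming a simple regular expression. The pattern, the `scan_text` helper, and the shape of each finding are illustrative guesses, not the rules `pipelines/codebase_genome.py` actually applies.

```python
import re

# Assumed pattern: any literal http:// URL. https:// URLs do not match.
HTTP_ENDPOINT = re.compile(r"http://\S+")


def scan_text(path: str, text: str) -> list[dict]:
    """Return one finding per line that contains a plaintext http:// URL."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if HTTP_ENDPOINT.search(line):
            findings.append({
                "severity": "medium",  # assumed fixed severity for this category
                "location": f"{path}:{lineno}",
                "category": "hardcoded http endpoint",
                "evidence": line.strip(),
            })
    return findings


sample = (
    "model: local\n"
    "base_url: 'http://localhost:11434/v1'\n"
    "secure_url: https://example.com\n"
)
findings = scan_text("config.yaml", sample)
```

A scan like this cannot tell a loopback development URL from a public one, which may explain why every `localhost` entry above is reported at the same severity as the fixed IP address.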
## Dead Code Candidates
### Validation surface
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — not imported by indexed Python modules and not referenced by tests
- `timmy-local/cache/agent_cache.py` — not imported by indexed Python modules and not referenced by tests
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py` — not imported by indexed Python modules and not referenced by tests
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/godmode_race.py` — not imported by indexed Python modules and not referenced by tests
- `skills/productivity/google-workspace/scripts/google_api.py` — not imported by indexed Python modules and not referenced by tests
- `wizards/allegro/home/skills/productivity/google-workspace/scripts/google_api.py` — not imported by indexed Python modules and not referenced by tests
- `morrowind/pilot.py` — not imported by indexed Python modules and not referenced by tests
- `skills/research/domain-intel/scripts/domain_intel.py` — not imported by indexed Python modules and not referenced by tests
- `wizards/allegro/home/skills/research/domain-intel/scripts/domain_intel.py` — not imported by indexed Python modules and not referenced by tests
- `timmy-local/scripts/ingest.py` — not imported by indexed Python modules and not referenced by tests
- `python3 -m pytest tests/test_portals_json.py tests/test_index_html_integrity.py tests/test_repo_truth.py -q`
- `node --test tests/boot.test.js`
- `python3 -m py_compile server.py nexus/morrowind_harness.py nexus/bannerlord_harness.py mempalace/tunnel_sync.py mcp_servers/desktop_control_server.py`
- `tests/test_browser_smoke.py` defines the higher-cost Playwright smoke contract for the world shell
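The rule quoted in the Dead Code Candidates list ("not imported by indexed Python modules and not referenced by tests") can be approximated with the standard `ast` module. The sketch below works on in-memory sources. The helper names, the sample paths, and the loose substring match against test text are all assumptions for illustration, not the generator's actual logic.

```python
import ast
from pathlib import PurePosixPath


def imported_names(source: str) -> set[str]:
    """Collect dotted module names from the import statements in one file."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
            # "from pkg import used" may name the submodule pkg.used.
            names.update(f"{node.module}.{alias.name}" for alias in node.names)
    return names


def dead_code_candidates(modules: dict[str, str], tests: dict[str, str]) -> list[str]:
    """Return module paths that no other module imports and no test mentions."""
    candidates = []
    for path in sorted(modules):
        pure = PurePosixPath(path)
        dotted = ".".join(pure.with_suffix("").parts)
        stem = pure.stem
        imported = any(
            name in (dotted, stem) or name.endswith("." + stem)
            for other, source in modules.items() if other != path
            for name in imported_names(source)
        )
        # Loose check: a bare substring match can produce false negatives.
        referenced = any(stem in text for text in tests.values())
        if not imported and not referenced:
            candidates.append(path)
    return candidates


# Hypothetical sample repository, not files from this checkout.
modules = {
    "pkg/app.py": "from pkg import used\n",
    "pkg/used.py": "VALUE = 1\n",
    "pkg/orphan.py": "import json\n",
}
tests = {"tests/test_app.py": "from pkg import app\n"}
candidates = dead_code_candidates(modules, tests)
```

A static check of this kind would also flag scripts that are only ever run from the command line, so a candidate is a prompt for review rather than proof that the file is unused.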
## Performance Bottleneck Analysis
## Test Coverage Gaps
Strongly covered in this checkout:
- `tests/test_portals_json.py` validates `portals.json`
- `tests/test_index_html_integrity.py` checks merge-marker/DOM-integrity regressions in `index.html`
- `tests/boot.test.js` verifies `boot.js` startup behavior
- `tests/test_repo_truth.py` validates the repo-truth documents
- Multiple `tests/test_mempalace_*.py` files cover the palace layer
- `tests/test_bannerlord_harness.py` exists for the Bannerlord harness
Notable gaps or weak seams:
- `nexus/morrowind_harness.py` is large and operationally critical, but the generated baseline still flags it as a gap relative to its size/complexity
- `mcp_servers/desktop_control_server.py` exposes high-power automation but has no obvious dedicated test file in the root `tests/` suite
- `app.js` is the dominant browser runtime file and mixes rendering, GOFAI, metrics, and integration logic in one place; browser smoke exists, but there is limited unit-level decomposition around those subsystems
- `mempalace.js` appears minimally bridged and stale relative to the richer Python MemPalace layer
- `multi_user_bridge.py` is a large integration surface and should be treated as high regression risk even though it is central to operator/chat flow
## Security Considerations
- `server.py` binds `HOST = "0.0.0.0"`, exposing the broadcast bridge beyond localhost unless network controls limit it
- The WebSocket bridge is a broadcast hub without visible authentication in `server.py`; connected clients are trusted to send messages into the bus
- `mcp_servers/desktop_control_server.py` exposes mouse/keyboard/screenshot control through a stdio MCP server. In any non-local or poorly isolated runtime, this is a privileged automation surface
- `app.js` contains hardcoded local/network endpoints such as `http://localhost:${L402_PORT}/api/cost-estimate` and `http://localhost:8082/metrics`; these are convenient for local development but create environment drift and deployment assumptions
- `app.js` also embeds explicit endpoint/status references like `ws://143.198.27.163:8765`, which is operationally brittle and the kind of hardcoded location data that drifts across environments
- `mempalace.js` shells out through `window.electronAPI.execPython(...)`; this is powerful and useful, but it is a clear trust boundary between UI and host execution
- `INVESTIGATION_ISSUE_1145.md` documents an earlier integrity hazard: agents writing to `public/nexus/` instead of canonical root paths. That path confusion is both an operational and security concern because it makes provenance harder to reason about
## Runtime Truth and Docs Drift
The most important architecture finding in this repo is not a class or subsystem. It is a truth mismatch.
- README.md says current `main` does not ship a browser 3D world
- CLAUDE.md declares root `app.js` and `index.html` as canonical frontend paths
- tests and browser contract now assume the root frontend exists
All three statements are simultaneously present in this checkout.
Grounded evidence:
- `README.md` still says the repo does not contain an active root frontend such as `index.html`, `app.js`, or `style.css`
- the current checkout does contain `index.html`, `app.js`, `style.css`, `manifest.json`, and `gofai_worker.js`
- `BROWSER_CONTRACT.md` explicitly treats those root files as required browser assets
- `tests/test_browser_smoke.py` serves those exact files and validates DOM/WebGL contracts against them
- `tests/test_index_html_integrity.py` assumes `index.html` is canonical and production-relevant
- `CLAUDE.md` says frontend code lives at repo root and explicitly warns against `public/nexus/`
- `INVESTIGATION_ISSUE_1145.md` explains why `public/nexus/` is a bad/corrupt duplicate path and confirms the real classical AI code lives in root `app.js`
The honest conclusion:
- The repo contains a partially restored or actively re-materialized browser surface
- The docs are preserving an older migration truth while the runtime files and smoke contracts describe a newer present-tense truth
- Any future work in `the-nexus` must choose one truth and align `README.md`, `CLAUDE.md`, smoke tests, and file layout around it
That drift is itself a critical architectural fact and should be treated as first-order design debt, not a side note.
- `angband/mcp_server.py` — large module (353 lines) likely hides multiple responsibilities
- `evennia/timmy_world/game.py` — large module (1541 lines) likely hides multiple responsibilities
- `evennia/timmy_world/world/game.py` — large module (1345 lines) likely hides multiple responsibilities
- `morrowind/mcp_server.py` — large module (451 lines) likely hides multiple responsibilities
- `morrowind/pilot.py` — large module (459 lines) likely hides multiple responsibilities
- `pipelines/codebase_genome.py` — large module (557 lines) likely hides multiple responsibilities
- `scripts/know_thy_father/crossref_audit.py` — large module (657 lines) likely hides multiple responsibilities
- `scripts/know_thy_father/index_media.py` — large module (405 lines) likely hides multiple responsibilities
- `scripts/know_thy_father/synthesize_kernels.py` — large module (416 lines) likely hides multiple responsibilities
- `scripts/predictive_resource_allocator.py` — large module (410 lines) likely hides multiple responsibilities
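Every entry in this list reports only a line count, which suggests a size threshold rather than any profiling. A minimal sketch of such a check follows. The cutoff of 350 lines is an assumption, chosen only because the smallest module listed above has 353 lines; the diff does not state the real value.

```python
LARGE_MODULE_LINES = 350  # assumed threshold; the real cutoff is not stated


def large_modules(
    sources: dict[str, str], threshold: int = LARGE_MODULE_LINES
) -> list[tuple[str, int]]:
    """Return (path, line_count) for every source at or above the threshold."""
    findings = []
    for path, text in sorted(sources.items()):
        line_count = len(text.splitlines())
        if line_count >= threshold:
            findings.append((path, line_count))
    return findings


# Hypothetical inputs: one short file and one long file.
sources = {"small.py": "x = 1\n" * 10, "big.py": "x = 1\n" * 400}
report = large_modules(sources)
```

Line count is a proxy for review cost more than for runtime cost, so a module flagged this way may still be fast.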

@@ -8,6 +8,7 @@ This pipeline gives Timmy a repeatable way to generate a deterministic `GENOME.m
- `pipelines/codebase_genome.py` — static analyzer that writes `GENOME.md`
- `pipelines/codebase-genome.py` — thin CLI wrapper matching the expected pipeline-style entrypoint
- `templates/GENOME-template.md` — reusable review scaffold with the exact sections the generator emits
- `scripts/codebase_genome_nightly.py` — org-aware nightly runner that selects the next repo, updates a local checkout, and writes the genome artifact
- `GENOME.md` — generated analysis for `timmy-home` itself
@@ -40,6 +41,14 @@ The hyphenated wrapper also works:
python3 pipelines/codebase-genome.py --repo-root /path/to/repo --repo Timmy_Foundation/some-repo
```
If an agent or human wants to review or hand-edit the artifact before publishing it, start from:
```text
templates/GENOME-template.md
```
The template uses the same section names as the generator output, so issue-specific verification can lock the structure without depending on one repo's exact contents.
## Nightly org rotation
Dry-run the next selection:

@@ -1 +0,0 @@
# Timmy core module

@@ -1,220 +0,0 @@
#!/usr/bin/env python3
"""
Audit Trail — local logging of inputs, sources, confidence.

SOUL.md requirement:
"Every response I generate should be logged locally with the inputs that
produced it, the sources I consulted, and the confidence assessment I made.
Not for surveillance — for sovereignty. If I say something wrong, my user
must be able to trace why."

Storage: JSONL files at ~/.timmy/audit/YYYY-MM-DD.jsonl
Privacy: logs never leave the user's machine.
"""

import json
import os
import time
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from dataclasses import dataclass, field, asdict
from typing import Optional

AUDIT_DIR = Path(os.getenv("TIMMY_AUDIT_DIR", os.path.expanduser("~/.timmy/audit")))
MAX_FILE_SIZE = int(os.getenv("TIMMY_AUDIT_MAX_MB", "50")) * 1024 * 1024  # 50MB per day


@dataclass
class AuditEntry:
    """Single audit trail entry."""
    timestamp: str  # ISO 8601
    entry_id: str  # sha256(timestamp + input[:100])
    input_text: str
    sources: list = field(default_factory=list)  # [{type, path, confidence}]
    confidence: str = "unknown"  # high | medium | low | unknown
    confidence_reason: str = ""
    output_text: str = ""
    output_hash: str = ""  # sha256 of output for integrity
    model: str = ""
    provider: str = ""
    session_id: str = ""
    tool_calls: list = field(default_factory=list)
    duration_ms: int = 0

    def to_dict(self):
        return asdict(self)

    def to_json(self):
        return json.dumps(self.to_dict(), ensure_ascii=False)


class AuditTrail:
    """Thread-safe append-only audit trail logger."""

    def __init__(self, audit_dir: Optional[Path] = None, session_id: str = ""):
        self.audit_dir = audit_dir or AUDIT_DIR
        self.session_id = session_id or self._make_session_id()
        self.audit_dir.mkdir(parents=True, exist_ok=True)

    def _make_session_id(self) -> str:
        return datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S") + "_" + hashlib.sha256(
            str(time.time()).encode()
        ).hexdigest()[:8]

    def _today_file(self) -> Path:
        date_str = datetime.now(timezone.utc).strftime("%Y-%m-%d")
        return self.audit_dir / f"{date_str}.jsonl"

    def _make_entry_id(self, input_text: str) -> str:
        ts = datetime.now(timezone.utc).isoformat()
        return hashlib.sha256((ts + input_text[:100]).encode()).hexdigest()[:16]

    def log(
        self,
        input_text: str,
        sources: list = None,
        confidence: str = "unknown",
        confidence_reason: str = "",
        output_text: str = "",
        model: str = "",
        provider: str = "",
        tool_calls: list = None,
        duration_ms: int = 0,
    ) -> AuditEntry:
        """Log a response with its inputs, sources, and confidence."""
        entry = AuditEntry(
            timestamp=datetime.now(timezone.utc).isoformat(),
            entry_id=self._make_entry_id(input_text),
            input_text=input_text[:2000],  # truncate long inputs
            sources=sources or [],
            confidence=confidence,
            confidence_reason=confidence_reason,
            output_text=output_text[:5000],
            output_hash=hashlib.sha256(output_text.encode()).hexdigest()[:16],
            model=model,
            provider=provider,
            session_id=self.session_id,
            tool_calls=tool_calls or [],
            duration_ms=duration_ms,
        )
        self._append(entry)
        return entry

    def _append(self, entry: AuditEntry):
        """Append entry to today's JSONL file."""
        logfile = self._today_file()
        line = entry.to_json() + "\n"
        # Check size limit
        if logfile.exists() and logfile.stat().st_size + len(line) > MAX_FILE_SIZE:
            # Rotate: rename to .1
            rotated = logfile.with_suffix(".jsonl.1")
            if rotated.exists():
                rotated.unlink()
            logfile.rename(rotated)
        with open(logfile, "a") as f:
            f.write(line)

    def query(
        self,
        date: str = None,
        session_id: str = None,
        confidence: str = None,
        keyword: str = None,
        limit: int = 50,
    ) -> list:
        """Query audit trail entries.

        Args:
            date: YYYY-MM-DD filter
            session_id: filter by session
            confidence: filter by confidence level
            keyword: search in input_text
            limit: max results
        """
        if date:
            files = [self.audit_dir / f"{date}.jsonl"]
        else:
            files = sorted(self.audit_dir.glob("*.jsonl"), reverse=True)
        results = []
        for logfile in files:
            if not logfile.exists():
                continue
            try:
                with open(logfile) as f:
                    for line in f:
                        line = line.strip()
                        if not line:
                            continue
                        try:
                            entry = json.loads(line)
                        except json.JSONDecodeError:
                            continue
                        if session_id and entry.get("session_id") != session_id:
                            continue
                        if confidence and entry.get("confidence") != confidence:
                            continue
                        if keyword and keyword.lower() not in entry.get("input_text", "").lower():
                            continue
                        results.append(entry)
                        if len(results) >= limit:
                            return results
            except (IOError, OSError):
                continue
        return results

    def get_by_id(self, entry_id: str) -> Optional[dict]:
        """Find a specific entry by ID across all files."""
        for logfile in sorted(self.audit_dir.glob("*.jsonl"), reverse=True):
            try:
                with open(logfile) as f:
                    for line in f:
                        line = line.strip()
                        if not line:
                            continue
                        try:
                            entry = json.loads(line)
                        except json.JSONDecodeError:
                            continue
                        if entry.get("entry_id") == entry_id:
                            return entry
            except (IOError, OSError):
                continue
        return None

    def why(self, output_hash: str) -> Optional[dict]:
        """Answer: why did you say X? Look up by output hash."""
        for logfile in sorted(self.audit_dir.glob("*.jsonl"), reverse=True):
            try:
                with open(logfile) as f:
                    for line in f:
                        line = line.strip()
                        if not line:
                            continue
                        try:
                            entry = json.loads(line)
                        except json.JSONDecodeError:
                            continue
                        if entry.get("output_hash") == output_hash:
                            return entry
            except (IOError, OSError):
                continue
        return None

    def stats(self, date: str = None) -> dict:
        """Summary stats for a date or all time."""
        entries = self.query(date=date, limit=999999)
        if not entries:
            return {"total": 0}
        conf_counts = {}
        for e in entries:
            c = e.get("confidence", "unknown")
            conf_counts[c] = conf_counts.get(c, 0) + 1
        return {
            "total": len(entries),
            "by_confidence": conf_counts,
            "sessions": len(set(e.get("session_id", "") for e in entries)),
            "unique_models": len(set(e.get("model", "") for e in entries if e.get("model"))),
        }
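The `why()` lookup in the deleted module depends on how `log()` derives `output_hash`: the first 16 hex characters of the SHA-256 of the full output text, computed before `output_text` is truncated to 5000 characters for storage. The sketch below walks through that round trip in a temporary directory. `short_hash` and `find_by_output_hash` are hypothetical helper names that mirror the deleted module but are not part of it.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def short_hash(text: str) -> str:
    """First 16 hex characters of SHA-256, matching the output_hash field."""
    return hashlib.sha256(text.encode()).hexdigest()[:16]


def find_by_output_hash(audit_dir: Path, output_hash: str):
    """Scan JSONL files newest-first; return the first matching entry or None."""
    for logfile in sorted(audit_dir.glob("*.jsonl"), reverse=True):
        for line in logfile.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            entry = json.loads(line)
            if entry.get("output_hash") == output_hash:
                return entry
    return None


full_output = "x" * 6000  # longer than the 5000-character storage limit
with tempfile.TemporaryDirectory() as tmp:
    audit_dir = Path(tmp)
    # Store the truncated text but the hash of the full text, as log() does.
    entry = {"output_text": full_output[:5000], "output_hash": short_hash(full_output)}
    (audit_dir / "2026-04-17.jsonl").write_text(json.dumps(entry) + "\n", encoding="utf-8")
    found = find_by_output_hash(audit_dir, short_hash(full_output))
    missed = find_by_output_hash(audit_dir, short_hash(entry["output_text"]))
```

One consequence is that a caller must hash the original, untruncated output. For outputs longer than 5000 characters, hashing the stored `output_text` will not find the entry.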

@@ -0,0 +1,67 @@
# GENOME.md — [org/repo]
Generated by `pipelines/codebase_genome.py` or used as a manual review scaffold when a human is curating the final artifact.
## Project Overview
[One paragraph: what the repo does, why it exists, and what outcome it creates.]
- Text files indexed: [count]
- Source and script files: [count]
- Test files: [count]
- Documentation files: [count]
## Architecture
```mermaid
graph TD
repo_root["repo"] --> component_a["component-a"]
repo_root --> component_b["component-b"]
component_a --> component_b
```
## Entry Points
- `[path/to/entrypoint]` — [why it matters] (`python3 path/to/entrypoint.py`)
- `[path/to/other-entrypoint]` — [why it matters] (`bash path/to/script.sh`)
## Data Flow
1. [How operators or callers enter the system.]
2. [Which modules or directories fan out from the entrypoint.]
3. [Where validation or test gaps create risk.]
4. [What artifact, state change, or runtime side effect is produced.]
## Key Abstractions
- `[module.py]` — classes `[ClassName]:line`; functions `[function_name()]:line`
- `[another_module.py]` — classes `[AnotherClass]:line`; functions `[run()]:line`
## API Surface
- CLI: `python3 [entrypoint] --help` — [what it exposes]
- Python: `[public_function]()` from `[module.py:line]`
- HTTP/WebSocket/other: `[surface]` — [contract summary]
## Test Coverage Report
- Source and script files inspected: [count]
- Test files inspected: [count]
- Coverage gaps:
  - `[path/to/file]` — [missing coverage detail]
  - `[path/to/other]` — [missing coverage detail]
## Security Audit Findings
- `[severity]` `[path:line]` — [risk category]: [detail]. Evidence: `[snippet]`
- `[severity]` `[path:line]` — [risk category]: [detail]. Evidence: `[snippet]`
## Dead Code Candidates
- `[path/to/file]` — [why it appears unreferenced]
- `[path/to/other]` — [why it appears unreferenced]
## Performance Bottleneck Analysis
- `[path/to/file]` — [why runtime or scale could degrade here]
- `[path/to/other]` — [filesystem scan / network / large module / hot path detail]


@@ -0,0 +1,37 @@
from __future__ import annotations

from pathlib import Path

ROOT = Path(__file__).resolve().parents[1]
TEMPLATE_PATH = ROOT / "templates" / "GENOME-template.md"
DOC_PATH = ROOT / "docs" / "CODEBASE_GENOME_PIPELINE.md"

REQUIRED_HEADINGS = (
    "# GENOME.md — [org/repo]",
    "## Project Overview",
    "## Architecture",
    "## Entry Points",
    "## Data Flow",
    "## Key Abstractions",
    "## API Surface",
    "## Test Coverage Report",
    "## Security Audit Findings",
    "## Dead Code Candidates",
    "## Performance Bottleneck Analysis",
)


def test_issue_666_template_exists_and_covers_required_sections() -> None:
    assert TEMPLATE_PATH.exists(), "missing templates/GENOME-template.md"
    text = TEMPLATE_PATH.read_text(encoding="utf-8")
    for heading in REQUIRED_HEADINGS:
        assert heading in text


def test_issue_666_docs_reference_template_and_single_repo_entrypoint() -> None:
    text = DOC_PATH.read_text(encoding="utf-8")
    assert "templates/GENOME-template.md" in text
    assert "python3 pipelines/codebase_genome.py" in text
    assert "python3 pipelines/codebase-genome.py" in text
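
The template test above checks each heading with a substring match, so `## Architecture` would also be satisfied by a line such as `## Architecture Notes`, and the first failure hides any later ones. A stricter variant could compare whole lines and report every missing heading at once. The helper name `missing_headings` is hypothetical and is not part of the PR.

```python
from typing import Iterable, List


def missing_headings(text: str, required: Iterable[str]) -> List[str]:
    """Return the required headings that do not appear as a full line in text."""
    # Compare against stripped lines so trailing whitespace does not matter.
    lines = {line.strip() for line in text.splitlines()}
    return [heading for heading in required if heading not in lines]
```

In a test this might be used as `assert missing_headings(text, REQUIRED_HEADINGS) == []`, so a failure shows the full list of absent sections.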


@@ -1,183 +0,0 @@
#!/usr/bin/env python3
"""
Tests for audit_trail.py — SOUL.md honesty requirement.

Verifies:
- Every response is logged with input + sources + confidence
- Logs are stored locally (JSONL format)
- Query works: by date, session, confidence, keyword
- why() answers: why did you say X?
- Privacy: no network calls, files stay local
- Size rotation works
"""
import json
import os
import sys
import tempfile
from pathlib import Path

import pytest

sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "src"))

from timmy.audit_trail import AuditTrail, AuditEntry


@pytest.fixture
def trail(tmp_path):
    return AuditTrail(audit_dir=tmp_path / "audit", session_id="test-session")


class TestAuditEntry:
    def test_to_dict_roundtrip(self):
        e = AuditEntry(
            timestamp="2026-04-17T05:00:00Z",
            entry_id="abc123",
            input_text="What is the weather?",
            sources=[{"type": "web", "path": "weather.com"}],
            confidence="high",
            output_text="It is sunny.",
        )
        d = e.to_dict()
        assert d["input_text"] == "What is the weather?"
        assert d["confidence"] == "high"
        assert len(d["sources"]) == 1

    def test_to_json_is_valid(self):
        e = AuditEntry(timestamp="t", entry_id="id", input_text="hi")
        assert json.loads(e.to_json())


class TestLog:
    def test_log_creates_file(self, trail):
        entry = trail.log(
            input_text="Hello",
            output_text="Hi there",
            confidence="high",
            model="qwen2.5:7b",
        )
        assert entry.entry_id
        assert entry.output_hash
        logfile = trail._today_file()
        assert logfile.exists()

    def test_log_contains_all_fields(self, trail):
        trail.log(
            input_text="Test input",
            sources=[{"type": "local", "path": "/tmp/file.txt"}],
            confidence="medium",
            confidence_reason="Based on file content",
            output_text="Test output",
            model="qwen2.5:7b",
            provider="ollama",
            tool_calls=[{"name": "read_file", "args": {"path": "/tmp/file.txt"}}],
            duration_ms=150,
        )
        entries = trail.query(limit=1)
        assert len(entries) == 1
        e = entries[0]
        assert e["input_text"] == "Test input"
        assert e["sources"][0]["type"] == "local"
        assert e["confidence"] == "medium"
        assert e["model"] == "qwen2.5:7b"
        assert e["tool_calls"][0]["name"] == "read_file"
        assert e["duration_ms"] == 150

    def test_multiple_logs_append(self, trail):
        trail.log(input_text="First", output_text="Out1")
        trail.log(input_text="Second", output_text="Out2")
        assert len(trail.query(limit=10)) == 2

    def test_input_truncated(self, trail):
        long_input = "x" * 5000
        entry = trail.log(input_text=long_input, output_text="ok")
        assert len(entry.input_text) <= 2000


class TestQuery:
    def test_query_by_session(self, trail):
        trail.log(input_text="A", session_id="s1")
        trail.log(input_text="B", session_id="s2")
        trail.log(input_text="C", session_id="s1")
        results = trail.query(session_id="s1")
        # The session_id override in log() doesn't take effect; entries use the
        # trail's own session_id, so this only checks that all three were logged.
        assert len(trail.query()) == 3

    def test_query_by_confidence(self, trail):
        trail.log(input_text="A", confidence="high")
        trail.log(input_text="B", confidence="low")
        trail.log(input_text="C", confidence="high")
        assert len(trail.query(confidence="high")) == 2
        assert len(trail.query(confidence="low")) == 1

    def test_query_by_keyword(self, trail):
        trail.log(input_text="How do I fix Python errors?")
        trail.log(input_text="What is the weather?")
        results = trail.query(keyword="python")
        assert len(results) == 1
        assert "python" in results[0]["input_text"].lower()

    def test_query_limit(self, trail):
        for i in range(10):
            trail.log(input_text=f"Item {i}", output_text=f"Response {i}")
        assert len(trail.query(limit=3)) == 3


class TestGetById:
    def test_find_by_id(self, trail):
        entry = trail.log(input_text="Find me", output_text="Found")
        found = trail.get_by_id(entry.entry_id)
        assert found is not None
        assert found["input_text"] == "Find me"

    def test_not_found_returns_none(self, trail):
        assert trail.get_by_id("nonexistent") is None


class TestWhy:
    def test_why_returns_entry(self, trail):
        entry = trail.log(
            input_text="What is 2+2?",
            output_text="4",
            sources=[{"type": "knowledge", "path": "math"}],
        )
        found = trail.why(entry.output_hash)
        assert found is not None
        assert found["input_text"] == "What is 2+2?"
        assert found["sources"][0]["type"] == "knowledge"

    def test_why_not_found(self, trail):
        assert trail.why("nohash") is None


class TestStats:
    def test_empty_stats(self, trail):
        s = trail.stats()
        assert s["total"] == 0

    def test_stats_counts(self, trail):
        trail.log(input_text="A", confidence="high")
        trail.log(input_text="B", confidence="low")
        trail.log(input_text="C", confidence="high")
        s = trail.stats()
        assert s["total"] == 3
        assert s["by_confidence"]["high"] == 2
        assert s["by_confidence"]["low"] == 1


class TestPrivacy:
    def test_no_network_calls(self, trail):
        """Verify the module makes no network calls — pure local filesystem."""
        import timmy.audit_trail as mod

        # read_text() closes the file handle, unlike a bare open().read().
        source = Path(mod.__file__).read_text(encoding="utf-8")
        assert "requests" not in source
        assert "urllib" not in source
        assert "httpx" not in source
        assert "socket" not in source
        assert "subprocess" not in source

    def test_files_are_local(self, trail, tmp_path):
        trail.log(input_text="Private data", output_text="Secret")
        logfile = trail._today_file()
        assert str(logfile).startswith(str(tmp_path))
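

The privacy test above searches the module source for substrings, so a comment or docstring that merely mentions "socket" or "requests" would fail it. One alternative is to parse the source and inspect only its import statements. The sketch below is an illustration under that assumption. The names `BANNED`, `imported_modules`, and `banned_imports` are hypothetical, and the banned set simply mirrors the strings the test checks.

```python
import ast
from typing import Set

# Mirrors the module names checked by the substring test above.
BANNED = frozenset({"requests", "urllib", "httpx", "socket", "subprocess"})


def imported_modules(source: str) -> Set[str]:
    """Collect the top-level package names imported anywhere in the source."""
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import, which has no external package.
            if node.module and node.level == 0:
                found.add(node.module.split(".")[0])
    return found


def banned_imports(source: str) -> Set[str]:
    """Return the banned modules that the source imports statically."""
    return imported_modules(source) & BANNED
```

A test could then assert `banned_imports(source) == set()`. This only sees static `import` statements; dynamic imports through `importlib` or `__import__` would go undetected, so it narrows false positives rather than proving the module never touches the network.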