Compare commits


1 Commits

Author SHA1 Message Date
8d60b6c693 fix(#676): Add GENOME.md for compounding-intelligence
Some checks failed
Smoke Test / smoke (pull_request) Failing after 25s
Complete codebase genome:
- Project overview and three-pipeline architecture
- Mermaid architecture diagram
- Entry points and data flow
- Knowledge schema and confidence scoring
- Key abstractions
- Test coverage analysis with gaps
- Security considerations
- Dependencies and status
2026-04-15 03:25:20 +00:00
3 changed files with 139 additions and 823 deletions

835
GENOME.md

@@ -1,741 +1,184 @@
# GENOME.md — timmy-home
# GENOME.md — compounding-intelligence
Generated: 2026-04-15 04:23:24Z
Issue: #670
Branch: `fix/670`
> Auto-generated codebase genome. Repo 9/16 in the Codebase Genome series.
## Project Overview
`timmy-home` is not a single product application. It is the Timmy Foundation workspace repo: an operational monorepo that mixes live guardrails, fleet scripts, local-world experiments, training pipelines, research artifacts, notes, and prototype agent runtimes.
**compounding-intelligence** turns 1B+ daily tokens into durable, compounding fleet intelligence. It solves the core problem of AI agent amnesia: every session starts at zero, rediscovering the same facts, pitfalls, and patterns that previous sessions already learned.
The most important reality check is in `OPERATIONS.md`: the active production system is Hermes plus the `timmy-config` sidecar, while `timmy-home` functions as the workspace, proving ground, and artifact store.
The project implements three pipelines forming a compounding loop:
### What the repo is good at
```
SESSION ENDS --> HARVESTER --> KNOWLEDGE STORE --> BOOTSTRAPPER --> NEW SESSION STARTS SMARTER
                                      |
                                      v
                                  MEASURER --> Prove it's working
```
- local operational utilities and proof-oriented runbooks
- offline-first data transforms for training and archive analysis
- experimental Evennia / world-shell work
- prototype autonomous harnesses (`uniwizard/`, `uni-wizard/`)
- game-agent experiments (`morrowind/`, Tower simulations)
- telemetry, reports, and training-data staging
### What the repo is not
- not a clean single-purpose package
- not a consistently production-hardened deployable service
- not one coherent architecture; it is several architectures sharing one tree
### Repository shape
Metric snapshot from `/tmp/BURN-7-6`:
- 3,119 files total
- 234 source files (`.py`, `.sh`, `.js`)
- 29 test files
- 21 config files
- 397,683 text lines total
- 42,428 Python lines
- 2,403 shell lines
- 272,829 Markdown lines
The file count is dominated by data and documentation:
- `training-data/` → 2,013 files
- `wizards/` → 345 files
- `skills/` → 327 files
This matters: operational code is only a small fraction of the repo. Any analysis that treats the repo as “just a Python package” will be wrong.
## Runtime Reality
The current live-system contract is defined more accurately by `OPERATIONS.md` than by `README.md`.
- `README.md` documents secret scanning and a tiny subset of the repo.
- `OPERATIONS.md` states that the active system is Hermes + `timmy-config` sidecar.
- `CONTRIBUTING.md` defines the repo's most important invariant: proof is required for merge.
That means `timmy-home` behaves as a workspace + runbook + experiment garden around a live sidecar/orchestrator that mostly lives elsewhere.
**Key insight**: Intelligence from a million tokens of work evaporates when the session ends. This project captures it, stores it, and injects it into future sessions so they start smarter.
## Architecture
```mermaid
graph TD
    subgraph Inputs
        A[Gitea / Forge issues]
        B[Hermes sessions and local state]
        C[Twitter archive + media]
        D[OpenMW / Evennia runtime state]
        E[Local machine + VPS fleet]
    end
    subgraph Ops_Workspace
        F[scripts/\nops, proof, hygiene, reports]
        G[tests/ + .gitea/workflows/\nverification surface]
        H[OPERATIONS.md / CONTRIBUTING.md\nrunbook + merge gate]
    end
    subgraph World_and_Agent_Runtimes
        I[evennia/ + scripts/evennia/\nlocal world lane]
        J[timmy-local/\ncache + in-world commands]
        K[morrowind/\nOpenMW MCP + local brain]
        L[uniwizard/ + uni-wizard/\nrouting and harness experiments]
    end
    subgraph Analysis_and_Training
        M[scripts/twitter_archive/\narchive extraction + media analysis]
        N[scripts/know_thy_father/\nmeaning kernels + crossref]
        O[evennia_tools/ + metrics/\ntelemetry + sovereignty accounting]
        P[training-data/ reports/ briefings/\nartifacts]
    end
    A --> F
    B --> F
    B --> O
    C --> M
    C --> N
    D --> I
    D --> J
    D --> K
    E --> F
    E --> L
    F --> G
    H --> G
    I --> O
    J --> O
    K --> P
    L --> P
    M --> P
    N --> P
    O --> P
```

```mermaid
graph LR
    A[Session Transcripts] -->|Harvester| B[Knowledge Store]
    B -->|Bootstrapper| C[New Session Context]
    C --> D[Agent Work]
    D --> A
    B -->|Measurer| E[Dashboard]
    E -->|Metrics| F[Proof of Compounding]
    subgraph Knowledge Store
        B1[index.json]
        B2[global/]
        B3[repos/{repo}.md]
        B4[agents/{agent}.md]
    end
```
## Major Domains
### Pipeline 1: Harvester
- **Input**: Finished session transcripts (JSONL format)
- **Process**: LLM extracts durable knowledge using structured prompt
- **Output**: Facts stored in `knowledge/` directory
- **Categories**: fact, pitfall, pattern, tool-quirk, question
- **Deduplication**: Content-hash based, existing knowledge has priority
### 1. Proof and operational guardrails
### Pipeline 2: Bootstrapper
- **Input**: `knowledge/` store
- **Process**: Queries for relevant facts, assembles compact 2k-token context
- **Output**: Injected context at session start
- **Goal**: New sessions start with full situational awareness
Key files:
### Pipeline 3: Measurer
- **Input**: Knowledge store + session metrics
- **Process**: Tracks knowledge velocity, error reduction, hit rate
- **Output**: Dashboard.md + daily reports
- **Goal**: Prove the compounding loop works
- `README.md`
- `OPERATIONS.md`
- `CONTRIBUTING.md`
- `.gitea/workflows/smoke.yml`
- `scripts/detect_secrets.py`
- `scripts/trajectory_sanitize.py`
## Directory Structure
Purpose:
- define what counts as valid proof
- protect the repo from obvious secret leaks
- sanitize training/session artifacts before reuse
- provide a minimal smoke gate in CI
This is the most stable and best-tested area of the repo.
### 2. Evennia / world-shell lane
Key files:
- `scripts/evennia/bootstrap_local_evennia.py`
- `scripts/evennia/verify_local_evennia.py`
- `scripts/evennia/evennia_mcp_server.py`
- `evennia/timmy_world/...`
- `timmy-local/evennia/...`
- `evennia_tools/layout.py`
- `evennia_tools/telemetry.py`
- `evennia_tools/training.py`
Purpose:
- bootstrap and verify a local Evennia runtime
- expose Evennia via MCP/telnet tooling
- model Timmy as a room-based world shell with tool-bridging commands
- collect telemetry and training traces from world interactions
Reality:
- there is a mostly stock Evennia scaffold under `evennia/timmy_world/`
- there is a richer but incomplete prototype under `timmy-local/evennia/`
- the two are conceptually aligned but not fully unified
### 3. Tower / narrative simulation lane
Key files:
- `scripts/tower_game.py`
- `timmy-world/game.py`
- `evennia/timmy_world/game.py`
- `evennia/timmy_world/world/game.py`
Purpose:
- simulate Timmy's narrative “Tower” world
- track rooms, trust, phases, energy, and dialogue
- serve as an emergence / world-state experimentation lane
Reality:
- `scripts/tower_game.py` is the cleanest and best-tested version
- three larger Tower variants exist in parallel, creating drift and maintenance risk
### 4. Twitter archive / Know Thy Father lane
Key files:
- `scripts/twitter_archive/extract_archive.py`
- `scripts/twitter_archive/extract_media_manifest.py`
- `scripts/twitter_archive/analyze_media.py`
- `scripts/twitter_archive/build_dpo_pairs.py`
- `scripts/know_thy_father/index_media.py`
- `scripts/know_thy_father/synthesize_kernels.py`
- `scripts/know_thy_father/crossref_audit.py`
- `twitter-archive/know-thy-father/tracker.py`
Purpose:
- normalize raw archive exports
- build media manifests and hashtag metrics
- derive analysis artifacts and “meaning kernels”
- compare extracted themes against Timmy's declared principles
- generate DPO / training-ready artifacts
This is one of the strongest parts of the repo in terms of tests, but it also has multiple overlapping generations of pipeline code.
### 5. Harness / routing experiments
Key files:
- `uniwizard/task_classifier.py`
- `uniwizard/quality_scorer.py`
- `uniwizard/self_grader.py`
- `uni-wizard/harness.py`
- `uni-wizard/daemons/task_router.py`
- `uni-wizard/daemons/health_daemon.py`
- `uni-wizard/v2/...`
- `uni-wizard/v3/...`
- `uni-wizard/v4/...`
Purpose:
- classify prompts and route them to candidate backends
- record backend quality over time
- grade Hermes sessions and generate self-improvement signals
- prototype unified local-first agent harnesses
- explore multi-house routing, telemetry, and adaptation engines
Reality:
- `uniwizard/` is cleaner and more testable
- `uni-wizard/` contains several architecture generations in one tree
- versioned tests expose namespace/import collisions that currently break full-repo pytest
### 6. Game-agent / local reflex lane
Key files:
- `morrowind/mcp_server.py`
- `morrowind/local_brain.py`
- `morrowind/pilot.py`
- `morrowind/play.py`
- `morrowind/agent.py`
Purpose:
- drive OpenMW locally through perception + keystroke automation
- expose game actions through MCP tools
- compare deterministic and model-driven control loops
- save trajectories for training or DPO later
This lane is high-side-effect and essentially untested.
### 7. Telemetry, bridge, and sovereignty accounting
Key files:
- `metrics/model_tracker.py`
- `infrastructure/timmy-bridge/client/timmy_client.py`
- `infrastructure/timmy-bridge/monitor/timmy_monitor.py`
- `infrastructure/timmy-bridge/reports/generate_report.py`
Purpose:
- track local-vs-cloud usage and rough cost avoidance
- relay state/artifacts through a bridge architecture
- record telemetry and generate retrospective reports
Reality:
- structurally useful
- but parts of the bridge and crypto story are explicitly placeholder-grade
```
compounding-intelligence/
|-- README.md # Project overview and roadmap
|-- knowledge/
| |-- index.json # Machine-readable fact index (versioned)
| |-- global/ # Cross-repo knowledge
| |-- repos/{repo}.md # Per-repo knowledge files
| |-- agents/{agent}.md # Agent-type notes
|-- scripts/
| |-- test_harvest_prompt.py # Validation for harvest prompt output
| |-- test_harvest_prompt_comprehensive.py # Extended test suite
|-- templates/
| |-- harvest-prompt.md # LLM prompt for knowledge extraction
|-- metrics/
| |-- .gitkeep # Placeholder for dashboard
|-- test_sessions/
| |-- session_failure.jsonl # Test data: failed session
| |-- session_partial.jsonl # Test data: partial session
| |-- session_patterns.jsonl # Test data: pattern extraction
| |-- session_questions.jsonl # Test data: question identification
| |-- session_success.jsonl # Test data: successful session
```
## Entry Points
There is no single canonical `main.py`. The repo has many entry points, grouped by lane.
| File | Purpose | Entry |
|------|---------|-------|
| `templates/harvest-prompt.md` | Extraction prompt | LLM input template |
| `scripts/test_harvest_prompt.py` | Validation | `python3 test_harvest_prompt.py` |
| `knowledge/index.json` | Data store | Read/write by all pipelines |
### Repo-level operational entry points
## Data Flow
- `scripts/detect_secrets.py`
- secret scanning CLI and pre-commit helper
- `scripts/trajectory_sanitize.py`
- artifact sanitization CLI for JSON / JSONL session data
- `scripts/fleet_milestones.py`
- milestone-trigger state machine and logger
- `scripts/failover_monitor.py`
- simple host reachability recorder
- `.gitea/workflows/smoke.yml`
- CI smoke entry point for parse checks and grep-based secret scan
```
1. Agent completes session -> session transcript (JSONL)
2. Harvester reads transcript
3. LLM processes via harvest-prompt.md template
4. Extracted knowledge validated against schema
5. Deduplicated against existing index.json
6. New facts appended with source attribution
7. Bootstrapper queries index.json for relevant facts
8. Context injected into next session
9. Measurer tracks velocity and quality metrics
```
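Steps 4 through 6 (validate, deduplicate, append) can be sketched in a few lines. This is an illustrative sketch, not the repo's harvester code: it assumes `index.json` holds a flat JSON list of items and hashes only the `fact` text, per the content-hash rule described above.

```python
import hashlib
import json
from pathlib import Path

def dedup_append(index_path: Path, new_facts: list[dict]) -> int:
    """Append only facts whose content hash is not already indexed.

    Existing knowledge wins ties ("existing knowledge has priority").
    Assumes index.json is a flat JSON list of knowledge items.
    """
    index = json.loads(index_path.read_text()) if index_path.exists() else []
    seen = {hashlib.sha256(item["fact"].encode()).hexdigest() for item in index}
    added = 0
    for item in new_facts:
        digest = hashlib.sha256(item["fact"].encode()).hexdigest()
        if digest not in seen:  # new content only; duplicates are dropped
            index.append(item)
            seen.add(digest)
            added += 1
    index_path.write_text(json.dumps(index, indent=2))
    return added
```

Hashing the `fact` sentence keeps deduplication cheap and deterministic, at the cost of treating any rewording as a new fact.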
### Evennia / world entry points
## Knowledge Schema
- `scripts/evennia/bootstrap_local_evennia.py`
- local world bootstrap and provisioning
- `scripts/evennia/verify_local_evennia.py`
- runtime verification via HTTP, shell, telnet
- `scripts/evennia/evennia_mcp_server.py`
- MCP bridge into the Evennia runtime
- `timmy-local/evennia/world/build.py`
- creates rooms, exits, tool objects, and Timmy character state
Each knowledge item in `index.json`:
### Archive / research entry points
```json
{
"fact": "One sentence description",
"category": "fact|pitfall|pattern|tool-quirk|question",
"repo": "Repository name or 'global'",
"confidence": 0.0-1.0,
"source": "mempalace|fact_store|skill|harvester",
"source_file": "Origin file if applicable",
"migrated_at": "ISO 8601 timestamp"
}
```
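A minimal validator for this schema could look like the following. The function name and error-list convention are illustrative assumptions, not code taken from the repo's `scripts/test_harvest_prompt.py`:

```python
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
VALID_SOURCES = {"mempalace", "fact_store", "skill", "harvester"}

def validate_item(item: dict) -> list[str]:
    """Return a list of human-readable schema violations (empty = valid)."""
    errors = []
    # fact: required, non-empty string
    if not isinstance(item.get("fact"), str) or not item["fact"].strip():
        errors.append("fact must be a non-empty string")
    # category and source: closed vocabularies from the schema above
    if item.get("category") not in VALID_CATEGORIES:
        errors.append(f"unknown category: {item.get('category')!r}")
    if item.get("source") not in VALID_SOURCES:
        errors.append(f"unknown source: {item.get('source')!r}")
    # confidence: numeric, clamped range
    conf = item.get("confidence")
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        errors.append("confidence must be a number in [0.0, 1.0]")
    return errors
```

Returning a list of violations rather than raising makes it easy to report every problem in one pass over a harvested batch.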
- `scripts/twitter_archive/extract_archive.py`
- `scripts/twitter_archive/extract_media_manifest.py`
- `scripts/twitter_archive/analyze_media.py`
- `scripts/twitter_archive/build_dpo_pairs.py`
- `scripts/know_thy_father/index_media.py`
- `scripts/know_thy_father/synthesize_kernels.py`
- `scripts/know_thy_father/crossref_audit.py`
### Harness / agent entry points
- `uni-wizard/harness.py`
- `uni-wizard/daemons/task_router.py`
- `uni-wizard/daemons/health_daemon.py`
- `uniwizard/self_grader.py`
- `uniwizard/quality_scorer.py`
- `uniwizard/task_classifier.py`
### Game-agent entry points
- `morrowind/mcp_server.py`
- `morrowind/local_brain.py`
- `morrowind/pilot.py`
- `morrowind/play.py`
## Data Flows
### 1. Proof / hygiene flow
1. Developer changes repo files.
2. `scripts/detect_secrets.py` scans content for obvious leaks.
3. `CONTRIBUTING.md` requires exact proof artifacts, not hand-wavy claims.
4. `.gitea/workflows/smoke.yml` attempts parse checks in CI.
5. Sanitized session artifacts can be passed through `scripts/trajectory_sanitize.py` before reuse.
### 2. Archive-to-training flow
1. Raw archive export enters `scripts/twitter_archive/extract_archive.py`.
2. Media-bearing rows are indexed by `extract_media_manifest.py`.
3. `analyze_media.py` attaches batch analysis output and kernels.
4. `scripts/know_thy_father/*.py` transforms archive slices into meaning kernels and conscience cross-references.
5. Results land in `training-data/`, reports, and summary artifacts.
### 3. Evennia world flow
1. `bootstrap_local_evennia.py` prepares local runtime state.
2. `timmy-local/evennia/world/build.py` constructs rooms, exits, and Timmy's in-world state.
3. `scripts/evennia/evennia_mcp_server.py` bridges runtime control to Hermes/MCP.
4. `evennia_tools/telemetry.py` writes JSONL event streams and metadata.
5. Example traces are stored under `~/.timmy/training-data/evennia` and repo examples.
### 4. Harness / routing flow
1. Prompt text enters `uniwizard/task_classifier.py`.
2. Candidate backends are ranked by type and complexity.
3. `uniwizard/quality_scorer.py` records observed backend performance.
4. `uniwizard/self_grader.py` parses Hermes sessions and grades task quality.
5. `uni-wizard/` branches this idea into increasingly ambitious task routers, telemetry systems, and adaptation engines.
### 5. Morrowind agent flow
1. OpenMW state is parsed from logs or screenshots.
2. `morrowind/local_brain.py` or `pilot.py` builds a local control prompt.
3. A local model or deterministic motor layer selects an action.
4. `morrowind/mcp_server.py` or local automation injects keys/actions.
5. Trajectories are logged under `~/.timmy/morrowind/`.
### Confidence Scoring
- **0.9-1.0**: Explicitly stated with verification
- **0.7-0.8**: Clearly implied by multiple data points
- **0.5-0.6**: Suggested but not fully verified
- **0.3-0.4**: Inferred from limited data
- **0.1-0.2**: Speculative or uncertain
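These tiers feed directly into context assembly: the bootstrapper can rank by confidence and stop at the token budget. A hedged sketch follows; the greedy selection policy, the chars-divided-by-4 token estimate, and the function name are assumptions, not documented behavior:

```python
def select_bootstrap_facts(items, repo, min_confidence=0.5, token_budget=2000):
    """Greedily pick the highest-confidence facts for one repo (plus
    'global') until a rough token budget is exhausted."""
    relevant = [i for i in items
                if i["repo"] in (repo, "global") and i["confidence"] >= min_confidence]
    relevant.sort(key=lambda i: i["confidence"], reverse=True)
    picked, used = [], 0
    for item in relevant:
        cost = max(1, len(item["fact"]) // 4)  # crude chars/4 token estimate
        if used + cost > token_budget:
            break
        picked.append(item)
        used += cost
    return picked
```

A real implementation would likely weight recency and repo relevance too, but confidence-first selection matches the tiers above.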
## Key Abstractions
### `TowerGame` / `GameState` / `Phase` / `Room`
File: `scripts/tower_game.py`
- the cleanest narrative-engine abstraction in the repo
- models room state, trust, energy, narrative phase, dialogue, and monologue
- heavily covered by `tests/test_tower_game.py`
### `TimmyCharacter` and `TimmyRoom`
Files:
- `timmy-local/evennia/typeclasses/characters.py`
- `timmy-local/evennia/typeclasses/rooms.py`
Purpose:
- represent Timmy as a persistent Evennia character with task, knowledge, tool, preference, and metric state
- model rooms as functional workspaces like Workshop, Library, Observatory, Forge, and Dispatch
### In-world tool commands
File: `timmy-local/evennia/commands/tools.py`
Key commands:
- `read`, `write`, `search`
- `git_status`, `git_log`, `git_pull`
- `sysinfo`, `health`
- `think`
- `gitea_issues`
- room navigation commands
This is the repo's most direct “world shell as operations interface” abstraction.
### `UniWizardHarness`
Files:
- `uni-wizard/harness.py`
- `uni-wizard/v2/harness.py`
Purpose:
- unify system, git, network, and file-style actions behind one harness surface
- later versions add provenance, house routing, telemetry, and policy logic
The downside is namespace drift across versions.
### `TaskClassifier` and `ClassificationResult`
File: `uniwizard/task_classifier.py`
Purpose:
- classify prompts by type and complexity
- recommend backend order for execution
This is a central abstraction in the repo's routing experiments.
### `PatternDatabase` / `IntelligenceEngine`
File: `uni-wizard/v3/intelligence_engine.py`
Purpose:
- store execution patterns and model performance
- derive adaptation events and predictions
This is a prototype intelligence layer, not yet a clean production subsystem.
### `MeaningKernel`
File: `scripts/know_thy_father/synthesize_kernels.py`
Purpose:
- normalize archive/media findings into structured semantic outputs
- feed downstream summaries and fact-style exports
### Evennia telemetry helpers
File: `evennia_tools/telemetry.py`
Purpose:
- produce deterministic session-day directories, event log paths, metadata files, and append-only telemetry streams
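An append-only JSONL stream of this kind is simple to sketch. The helper below is illustrative only: the function name, `events.jsonl` filename, and directory layout are assumptions, not the actual `evennia_tools/telemetry.py` API:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def append_event(root: Path, session_day: str, event: dict) -> Path:
    """Append one event to <root>/<session_day>/events.jsonl.

    Deterministic path per session day; append-only, one JSON object
    per line, with a timezone-aware timestamp.
    """
    day_dir = root / session_day
    day_dir.mkdir(parents=True, exist_ok=True)
    log_path = day_dir / "events.jsonl"
    record = {"ts": datetime.now(timezone.utc).isoformat(), **event}
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return log_path
```

Append-only JSONL keeps the stream crash-tolerant and trivially greppable, which fits the telemetry use described here.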
### `TimmyClient`
File: `infrastructure/timmy-bridge/client/timmy_client.py`
Purpose:
- send heartbeat/artifact state toward the bridge/relay system
This is important structurally, but still partly demo-grade in implementation quality.
1. **Knowledge Item**: Atomic unit of extracted intelligence. One fact, one category, one confidence score.
2. **Knowledge Store**: Directory-based persistent storage with JSON index.
3. **Harvest Prompt**: Structured LLM prompt that converts session transcripts to knowledge items.
4. **Bootstrap Context**: Compact 2k-token summary injected at session start.
5. **Compounding Loop**: The cycle of extract -> store -> inject -> work -> extract.
## API Surface
## CLI commands
### Knowledge Store (file-based)
- **Read**: `knowledge/index.json` — all facts
- **Write**: Append to `index.json` after deduplication
- **Query**: Filter by category, repo, confidence threshold
### Proof / hygiene
```bash
python3 scripts/detect_secrets.py <paths>
python3 scripts/trajectory_sanitize.py --input <in> --output <out>
python3 scripts/fleet_milestones.py --list
python3 scripts/fleet_milestones.py --trigger <milestone>
python3 scripts/failover_monitor.py
```
### Evennia
```bash
python3 scripts/evennia/bootstrap_local_evennia.py
python3 scripts/evennia/verify_local_evennia.py
python3 scripts/evennia/evennia_mcp_server.py
python3 scripts/evennia/eval_world_basics.py
python3 scripts/evennia/generate_sample_trace.py
```
### Archive / research
```bash
python3 scripts/twitter_archive/extract_archive.py ...
python3 scripts/twitter_archive/extract_media_manifest.py ...
python3 scripts/twitter_archive/analyze_media.py --status
python3 scripts/twitter_archive/analyze_media.py --batch <n>
python3 scripts/know_thy_father/index_media.py ...
python3 scripts/know_thy_father/synthesize_kernels.py ...
python3 scripts/know_thy_father/crossref_audit.py ...
```
### Harness / routing
```bash
python3 uniwizard/self_grader.py ...
python3 uniwizard/quality_scorer.py ...
python3 uniwizard/task_classifier.py ...
python3 uni-wizard/harness.py ...
python3 uni-wizard/daemons/task_router.py ...
python3 uni-wizard/daemons/health_daemon.py ...
```
### Game-agent
```bash
python3 morrowind/mcp_server.py
python3 morrowind/local_brain.py
python3 morrowind/pilot.py
python3 morrowind/play.py
```
## In-world commands
Documented in `timmy-local/evennia/commands/tools.py`:
- `read <path>`
- `write <path> = <content>`
- `search <pattern>`
- `git status`
- `git log [n]`
- `git pull`
- `sysinfo`
- `health`
- `think <prompt>`
- `gitea issues`
- movement/status commands for Workshop, Library, Observatory
### Templates
- **harvest-prompt.md**: Input template for LLM extraction
- **bootstrap-context.md**: Output template for session injection
## Test Coverage
### Current verified state
| Test File | Covers | Status |
|-----------|--------|--------|
| `test_harvest_prompt.py` | Schema validation, required fields | Present |
| `test_harvest_prompt_comprehensive.py` | Extended validation, edge cases | Present |
| `test_sessions/session_failure.jsonl` | Failure extraction | Test data |
| `test_sessions/session_partial.jsonl` | Partial session handling | Test data |
| `test_sessions/session_patterns.jsonl` | Pattern extraction | Test data |
| `test_sessions/session_questions.jsonl` | Question identification | Test data |
| `test_sessions/session_success.jsonl` | Full extraction | Test data |
Commands run on this branch:
- `pytest -q tests/test_fleet_milestones.py tests/test_failover_monitor.py` -> `7 passed`
- `pytest -q tests` -> `150 passed, 19 warnings`
- `pytest -q` -> `260 passed, 4 failed, 45 warnings`
### Tests added in this branch
- `tests/test_fleet_milestones.py`
- covers state persistence, dry-run behavior, unknown milestones, idempotence
- `tests/test_failover_monitor.py`
- covers online/offline detection and status-file emission
These close two concrete gaps around previously untested operational scripts.
### Well-covered areas
- secret detection
- trajectory sanitization
- Tower game (`scripts/tower_game.py`)
- Know Thy Father pipeline
- Twitter archive pipeline
- Evennia telemetry/layout helper modules
- documentation proof-policy assertions
### Weak or missing coverage
- no real end-to-end coverage for `scripts/evennia/bootstrap_local_evennia.py`
- no runtime coverage for `scripts/evennia/verify_local_evennia.py` against a real Evennia world
- no automated tests for `morrowind/`
- no meaningful automated coverage for `infrastructure/timmy-bridge/`
- most shell/provisioning scripts remain untested
- `uni-wizard/v2` and `uni-wizard/v3` are not full-repo-pytest clean
### Concrete failing area
Full-repo pytest currently fails in `uni-wizard/v2/tests/test_author_whitelist.py` because `uni-wizard/v2/task_router_daemon.py` imports `harness` and resolves to `uni-wizard/harness.py` instead of the version-local v2 harness. That is a real namespace collision, not flaky test noise.
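One hedged way out (a sketch of a possible approach, not the fix actually adopted in the repo) is to bind each version's `harness.py` to a unique module name instead of relying on `sys.path` order:

```python
import importlib.util
from pathlib import Path

def load_module_as(name: str, path: Path):
    """Load the file at `path` under the unique module name `name`,
    so e.g. uni-wizard/v2 gets its own harness regardless of which
    `harness` module sys.path would resolve first."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

Usage would look like `harness_v2 = load_module_as("uni_wizard_v2_harness", Path("uni-wizard/v2/harness.py"))`; the module name here is hypothetical.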
### Gaps
- No integration tests for full harvester pipeline
- No tests for bootstrapper context assembly
- No tests for measurer metrics computation
- No tests for deduplication logic
- No CI pipeline configured
## Security Considerations
### 1. Hard-coded IPs and host-specific paths
1. **Knowledge injection**: Bootstrapper injects context from knowledge store. Malicious facts in the store could influence agent behavior. Trust scoring partially mitigates this.
2. **Session transcripts**: May contain sensitive data (tokens, API keys). Harvester must filter sensitive patterns before storage.
3. **LLM extraction**: Harvest prompt instructs "no hallucination" but LLMs can still confabulate. Confidence scoring and source attribution provide auditability.
4. **File-based storage**: No access control on knowledge files. Anyone with filesystem access can read/modify.
Examples appear across the repo, including:
## Dependencies
- `timmy-local/README.md`
- `scripts/setup-uni-wizard.sh`
- `scripts/provision-timmy-vps.sh`
- `scripts/emacs-fleet-poll.sh`
- `scripts/emacs-fleet-bridge.py`
- `scripts/failover_monitor.py`
- Python 3.10+
- No external packages (stdlib only)
- LLM access for harvester pipeline (Ollama or cloud provider)
- Hermes agent framework for session management
Risk:
## Status
- fragile deploy assumptions
- easy environment drift
- accidental disclosure in public docs or configs
### 2. Broad mutation surfaces in world-shell commands
`timmy-local/evennia/commands/tools.py` exposes host file, git, search, system, and network operations inside the world abstraction.
Risk:
- large blast radius if command permissions are weak
- hard to separate playful world interaction from privileged host mutation
### 3. CI is smoke-only and partly broken
`.gitea/workflows/smoke.yml` does not run pytest.
Worse, its JSON parse command is broken as written:
```bash
find . -name '*.json' | xargs -r python3 -m json.tool > /dev/null
```
`python3 -m json.tool` accepts at most one input file per invocation (a second positional argument is treated as the *output* file), so batching many paths through `xargs` into one invocation either errors out or silently overwrites a file; the smoke workflow can fail or mislead.
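A file-by-file variant avoids the problem. This is a suggested replacement assuming GNU `xargs` (for the `-r` flag), not the workflow's current contents:

```bash
# Validate each JSON file in its own json.tool invocation (-n 1),
# since python3 -m json.tool reads at most one input file per run.
find . -name '*.json' -print0 | xargs -0 -r -n 1 python3 -m json.tool > /dev/null
```

`xargs` exits non-zero if any invocation fails, so the CI step still fails on the first malformed file.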
### 4. Placeholder-grade crypto / bridge logic
`infrastructure/timmy-bridge/` contains meaningful structure, but parts of the relay/client story are still simplified or placeholder quality.
Risk:
- implementation may look more production-ready than it is
- trust assumptions may exceed actual cryptographic guarantees
### 5. Offensive tooling co-located with ops code
The `wizards/.../red-teaming/godmode/...` lane contains explicit jailbreak automation assets.
Risk:
- policy confusion
- accidental inclusion in downstream automation
- higher review burden for a repo that also houses operational infrastructure
### 6. Deprecated UTC calls
Multiple tested modules still use `datetime.utcnow()` and emit deprecation warnings.
Risk:
- future Python compatibility debt
- noisy test output that can hide more important warnings
## Dependencies and External Assumptions
### Python / system
- Python 3.11 in CI
- `pytest`
- `PyYAML` for YAML smoke parsing
- `sqlite3`
- shell utilities like `bash`, `ping`, `grep`, `xargs`
### Local-runtime assumptions
- Hermes local session/state under `~/.hermes/`
- Timmy local state under `~/.timmy/`
- Evennia runtime availability for world-shell scripts
- local inference endpoints (llama.cpp / Ollama / localhost services)
- `ffmpeg` / `ffprobe` for media analysis paths
- OpenMW / Apple automation for `morrowind/`
- SSH/systemd availability for fleet scripts
## Deployment and Operability
The most important deploy fact is this:
- live orchestration is described in `OPERATIONS.md` as Hermes + `timmy-config` sidecar
- `timmy-home` supplies workspace scripts, experiments, runbooks, and artifacts around that live system
Practical run surfaces:
- CI smoke: `.gitea/workflows/smoke.yml`
- local tests: `pytest -q tests`
- Evennia world lane: `scripts/evennia/*.py`
- archive lane: `scripts/twitter_archive/*.py`, `scripts/know_thy_father/*.py`
- local-world experiments: `timmy-local/`
- harness experiments: `uniwizard/`, `uni-wizard/`
## Duplication and Drift Candidates
### High-confidence duplication
- `scripts/tower_game.py`
- `timmy-world/game.py`
- `evennia/timmy_world/game.py`
- `evennia/timmy_world/world/game.py`
These are overlapping Tower implementations with different behavior and shared concepts.
### Architecture-generation drift
- `uniwizard/`
- `uni-wizard/`
- `uni-wizard/v2/`
- `uni-wizard/v3/`
- `uni-wizard/v4/`
Multiple generations coexist with conflicting import assumptions.
### Pipeline overlap
- `twitter-archive/`
- `scripts/twitter_archive/`
- `scripts/know_thy_father/`
These lanes overlap in mission and artifact shape.
## Performance and Scaling Notes
- most heavy data volume lives in `training-data/`, so repo-wide file scans are expensive by default
- smoke commands that blindly walk all JSON files will age poorly as artifact volume grows
- archive/media pipelines depend on batch processing and checkpointing to remain tractable
- routing/telemetry systems rely heavily on local SQLite and JSONL append-only files, which is simple and inspectable but may become contention-prone under sustained automation
## Recommended Follow-up Work
1. Fix `.gitea/workflows/smoke.yml` so JSON validation iterates file-by-file and pytest is part of CI. Filed as issue #715.
2. Resolve `uni-wizard` namespace collisions so `pytest -q` is green repo-wide. Filed as issue #716.
3. Decide which Tower implementation is canonical and retire the others or clearly mark them experimental.
4. Separate production-grade bridge/runtime code from placeholder or speculative prototypes.
5. Centralize host/path/IP configuration instead of embedding machine-specific values in docs and scripts.
6. Add end-to-end verification for the Evennia runtime lane.
7. Add at least smoke coverage for the `morrowind/` and `timmy-bridge/` lanes.
## Verification Notes
This GENOME is based on:
- direct inspection of the repo root, top-level metrics, and key runtime docs
- direct test execution on this branch
- direct reproduction of the broken CI JSON-parse command
- targeted new tests added for untested operational scripts
- deeper file-level analysis across Evennia, archive, harness, and game-agent lanes
It should be read as a map of a workspace monorepo with several real subsystems and several prototype subsystems, not as documentation for one singular deployable app.
- **Phase**: Early development
- **Epics**: 4 (Harvester, Knowledge Store, Bootstrap, Measurement)
- **Milestone**: 4 (Retroactive Harvest)
- **Open Issues**: Active development across harvester and knowledge store pipelines


@@ -1,45 +0,0 @@
import json
import subprocess

from scripts import failover_monitor as monitor


def test_check_health_reports_online(monkeypatch):
    def fake_check_call(cmd, stdout=None):
        assert cmd[:4] == ["ping", "-c", "1", "-W"]
        return 0

    monkeypatch.setattr(monitor.subprocess, "check_call", fake_check_call)
    assert monitor.check_health("1.2.3.4") == "ONLINE"


def test_check_health_reports_offline(monkeypatch):
    def fake_check_call(cmd, stdout=None):
        raise subprocess.CalledProcessError(returncode=1, cmd=cmd)

    monkeypatch.setattr(monitor.subprocess, "check_call", fake_check_call)
    assert monitor.check_health("1.2.3.4") == "OFFLINE"


def test_main_writes_status_file_and_prints(tmp_path, monkeypatch, capsys):
    monkeypatch.setattr(monitor, "STATUS_FILE", tmp_path / "failover_status.json")
    monkeypatch.setattr(monitor, "FLEET", {"ezra": "1.1.1.1", "bezalel": "2.2.2.2"})
    monkeypatch.setattr(monitor.time, "time", lambda: 1713148800.0)
    monkeypatch.setattr(
        monitor,
        "check_health",
        lambda host: "ONLINE" if host == "1.1.1.1" else "OFFLINE",
    )
    monitor.main()
    payload = json.loads(monitor.STATUS_FILE.read_text())
    assert payload == {
        "timestamp": 1713148800.0,
        "fleet": {"ezra": "ONLINE", "bezalel": "OFFLINE"},
    }
    captured = capsys.readouterr()
    assert "ALLEGRO FAILOVER MONITOR" in captured.out.upper()
    assert "EZRA: ONLINE" in captured.out
    assert "BEZALEL: OFFLINE" in captured.out


@@ -1,82 +0,0 @@
import json
from datetime import datetime

import pytest

from scripts import fleet_milestones as fm


class FixedDateTime:
    @classmethod
    def utcnow(cls):
        return datetime(2026, 4, 15, 1, 2, 3)


def test_trigger_persists_state_and_log(tmp_path, monkeypatch, capsys):
    state_file = tmp_path / "milestones.json"
    log_file = tmp_path / "fleet_milestones.log"
    monkeypatch.setattr(fm, "STATE_FILE", state_file)
    monkeypatch.setattr(fm, "LOG_FILE", log_file)
    monkeypatch.setattr(fm, "datetime", FixedDateTime)
    fm.trigger("health_check_first_run")
    saved = json.loads(state_file.read_text())
    assert saved["health_check_first_run"] == {
        "triggered_at": "2026-04-15T01:02:03Z",
        "phase": 1,
    }
    log_lines = log_file.read_text().strip().splitlines()
    assert len(log_lines) == 1
    assert "First automated health check ran" in log_lines[0]
    captured = capsys.readouterr()
    assert "MILESTONE" in captured.out


def test_trigger_dry_run_logs_without_persisting_state(tmp_path, monkeypatch):
    state_file = tmp_path / "milestones.json"
    log_file = tmp_path / "fleet_milestones.log"
    monkeypatch.setattr(fm, "STATE_FILE", state_file)
    monkeypatch.setattr(fm, "LOG_FILE", log_file)
    monkeypatch.setattr(fm, "datetime", FixedDateTime)
    fm.trigger("backup_first_success", dry_run=True)
    assert not state_file.exists()
    assert "First automated backup completed" in log_file.read_text()


def test_trigger_unknown_key_exits(monkeypatch):
    monkeypatch.setattr(fm, "datetime", FixedDateTime)
    with pytest.raises(SystemExit) as exc:
        fm.trigger("not-a-real-milestone")
    assert exc.value.code == 1


def test_trigger_is_idempotent_once_recorded(tmp_path, monkeypatch, capsys):
    state_file = tmp_path / "milestones.json"
    log_file = tmp_path / "fleet_milestones.log"
    state_file.write_text(
        json.dumps(
            {
                "health_check_first_run": {
                    "triggered_at": "2026-04-01T00:00:00Z",
                    "phase": 1,
                }
            }
        )
    )
    monkeypatch.setattr(fm, "STATE_FILE", state_file)
    monkeypatch.setattr(fm, "LOG_FILE", log_file)
    monkeypatch.setattr(fm, "datetime", FixedDateTime)
    fm.trigger("health_check_first_run")
    assert not log_file.exists()
    captured = capsys.readouterr()
    assert "already triggered" in captured.out