Compare commits


1 Commit

Author SHA1 Message Date
Alexander Whitestone
173ce54eed feat: add timmy-home genome analysis (#670)
Some checks failed
Smoke Test / smoke (pull_request) Failing after 19s
2026-04-15 00:28:53 -04:00
7 changed files with 868 additions and 353 deletions

741
GENOME.md Normal file

@@ -0,0 +1,741 @@
# GENOME.md — timmy-home
Generated: 2026-04-15 04:23:24Z
Issue: #670
Branch: `fix/670`
## Project Overview
`timmy-home` is not a single product application. It is the Timmy Foundation workspace repo: an operational monorepo that mixes live guardrails, fleet scripts, local-world experiments, training pipelines, research artifacts, notes, and prototype agent runtimes.
The most important reality check is in `OPERATIONS.md`: the active production system is Hermes plus the `timmy-config` sidecar, while `timmy-home` functions as the workspace, proving ground, and artifact store.
### What the repo is good at
- local operational utilities and proof-oriented runbooks
- offline-first data transforms for training and archive analysis
- experimental Evennia / world-shell work
- prototype autonomous harnesses (`uniwizard/`, `uni-wizard/`)
- game-agent experiments (`morrowind/`, Tower simulations)
- telemetry, reports, and training-data staging
### What the repo is not
- not a clean single-purpose package
- not a consistently production-hardened deployable service
- not one coherent architecture; it is several architectures sharing one tree
### Repository shape
Metric snapshot from `/tmp/BURN-7-6`:
- 3,119 files total
- 234 source files (`.py`, `.sh`, `.js`)
- 29 test files
- 21 config files
- 397,683 text lines total
- 42,428 Python lines
- 2,403 shell lines
- 272,829 Markdown lines
The file count is dominated by data and documentation:
- `training-data/` → 2,013 files
- `wizards/` → 345 files
- `skills/` → 327 files
This matters: operational code is only a small fraction of the repo. Any analysis that treats the repo as “just a Python package” will be wrong.
## Runtime Reality
The current live-system contract is defined more accurately by `OPERATIONS.md` than by `README.md`.
- `README.md` documents secret scanning and a tiny subset of the repo.
- `OPERATIONS.md` states that the active system is Hermes + `timmy-config` sidecar.
- `CONTRIBUTING.md` defines the repo's most important invariant: proof is required for merge.
That means `timmy-home` behaves as a workspace + runbook + experiment garden around a live sidecar/orchestrator that mostly lives elsewhere.
## Architecture
```mermaid
graph TD
subgraph Inputs
A[Gitea / Forge issues]
B[Hermes sessions and local state]
C[Twitter archive + media]
D[OpenMW / Evennia runtime state]
E[Local machine + VPS fleet]
end
subgraph Ops_Workspace
F[scripts/\nops, proof, hygiene, reports]
G[tests/ + .gitea/workflows/\nverification surface]
H[OPERATIONS.md / CONTRIBUTING.md\nrunbook + merge gate]
end
subgraph World_and_Agent_Runtimes
I[evennia/ + scripts/evennia/\nlocal world lane]
J[timmy-local/\ncache + in-world commands]
K[morrowind/\nOpenMW MCP + local brain]
L[uniwizard/ + uni-wizard/\nrouting and harness experiments]
end
subgraph Analysis_and_Training
M[scripts/twitter_archive/\narchive extraction + media analysis]
N[scripts/know_thy_father/\nmeaning kernels + crossref]
O[evennia_tools/ + metrics/\ntelemetry + sovereignty accounting]
P[training-data/ reports/ briefings/\nartifacts]
end
A --> F
B --> F
B --> O
C --> M
C --> N
D --> I
D --> J
D --> K
E --> F
E --> L
F --> G
H --> G
I --> O
J --> O
K --> P
L --> P
M --> P
N --> P
O --> P
```
## Major Domains
### 1. Proof and operational guardrails
Key files:
- `README.md`
- `OPERATIONS.md`
- `CONTRIBUTING.md`
- `.gitea/workflows/smoke.yml`
- `scripts/detect_secrets.py`
- `scripts/trajectory_sanitize.py`
Purpose:
- define what counts as valid proof
- protect the repo from obvious secret leaks
- sanitize training/session artifacts before reuse
- provide a minimal smoke gate in CI
This is the most stable and best-tested area of the repo.
### 2. Evennia / world-shell lane
Key files:
- `scripts/evennia/bootstrap_local_evennia.py`
- `scripts/evennia/verify_local_evennia.py`
- `scripts/evennia/evennia_mcp_server.py`
- `evennia/timmy_world/...`
- `timmy-local/evennia/...`
- `evennia_tools/layout.py`
- `evennia_tools/telemetry.py`
- `evennia_tools/training.py`
Purpose:
- bootstrap and verify a local Evennia runtime
- expose Evennia via MCP/telnet tooling
- model Timmy as a room-based world shell with tool-bridging commands
- collect telemetry and training traces from world interactions
Reality:
- there is a mostly stock Evennia scaffold under `evennia/timmy_world/`
- there is a richer but incomplete prototype under `timmy-local/evennia/`
- the two are conceptually aligned but not fully unified
### 3. Tower / narrative simulation lane
Key files:
- `scripts/tower_game.py`
- `timmy-world/game.py`
- `evennia/timmy_world/game.py`
- `evennia/timmy_world/world/game.py`
Purpose:
- simulate Timmy's narrative “Tower” world
- track rooms, trust, phases, energy, and dialogue
- serve as an emergence / world-state experimentation lane
Reality:
- `scripts/tower_game.py` is the cleanest and best-tested version
- three larger Tower variants exist in parallel, creating drift and maintenance risk
### 4. Twitter archive / Know Thy Father lane
Key files:
- `scripts/twitter_archive/extract_archive.py`
- `scripts/twitter_archive/extract_media_manifest.py`
- `scripts/twitter_archive/analyze_media.py`
- `scripts/twitter_archive/build_dpo_pairs.py`
- `scripts/know_thy_father/index_media.py`
- `scripts/know_thy_father/synthesize_kernels.py`
- `scripts/know_thy_father/crossref_audit.py`
- `twitter-archive/know-thy-father/tracker.py`
Purpose:
- normalize raw archive exports
- build media manifests and hashtag metrics
- derive analysis artifacts and “meaning kernels”
- compare extracted themes against Timmy's declared principles
- generate DPO / training-ready artifacts
This is one of the strongest parts of the repo in terms of tests, but it also has multiple overlapping generations of pipeline code.
### 5. Harness / routing experiments
Key files:
- `uniwizard/task_classifier.py`
- `uniwizard/quality_scorer.py`
- `uniwizard/self_grader.py`
- `uni-wizard/harness.py`
- `uni-wizard/daemons/task_router.py`
- `uni-wizard/daemons/health_daemon.py`
- `uni-wizard/v2/...`
- `uni-wizard/v3/...`
- `uni-wizard/v4/...`
Purpose:
- classify prompts and route them to candidate backends
- record backend quality over time
- grade Hermes sessions and generate self-improvement signals
- prototype unified local-first agent harnesses
- explore multi-house routing, telemetry, and adaptation engines
Reality:
- `uniwizard/` is cleaner and more testable
- `uni-wizard/` contains several architecture generations in one tree
- versioned tests expose namespace/import collisions that currently break full-repo pytest
### 6. Game-agent / local reflex lane
Key files:
- `morrowind/mcp_server.py`
- `morrowind/local_brain.py`
- `morrowind/pilot.py`
- `morrowind/play.py`
- `morrowind/agent.py`
Purpose:
- drive OpenMW locally through perception + keystroke automation
- expose game actions through MCP tools
- compare deterministic and model-driven control loops
- save trajectories for training or DPO later
This lane is high-side-effect and essentially untested.
### 7. Telemetry, bridge, and sovereignty accounting
Key files:
- `metrics/model_tracker.py`
- `infrastructure/timmy-bridge/client/timmy_client.py`
- `infrastructure/timmy-bridge/monitor/timmy_monitor.py`
- `infrastructure/timmy-bridge/reports/generate_report.py`
Purpose:
- track local-vs-cloud usage and rough cost avoidance
- relay state/artifacts through a bridge architecture
- record telemetry and generate retrospective reports
Reality:
- structurally useful
- but parts of the bridge and crypto story are explicitly placeholder-grade
## Entry Points
There is no single canonical `main.py`. The repo has many entry points, grouped by lane.
### Repo-level operational entry points
- `scripts/detect_secrets.py`
- secret scanning CLI and pre-commit helper
- `scripts/trajectory_sanitize.py`
- artifact sanitization CLI for JSON / JSONL session data
- `scripts/fleet_milestones.py`
- milestone-trigger state machine and logger
- `scripts/failover_monitor.py`
- simple host reachability recorder
- `.gitea/workflows/smoke.yml`
- CI smoke entry point for parse checks and grep-based secret scan
### Evennia / world entry points
- `scripts/evennia/bootstrap_local_evennia.py`
- local world bootstrap and provisioning
- `scripts/evennia/verify_local_evennia.py`
- runtime verification via HTTP, shell, telnet
- `scripts/evennia/evennia_mcp_server.py`
- MCP bridge into the Evennia runtime
- `timmy-local/evennia/world/build.py`
- creates rooms, exits, tool objects, and Timmy character state
### Archive / research entry points
- `scripts/twitter_archive/extract_archive.py`
- `scripts/twitter_archive/extract_media_manifest.py`
- `scripts/twitter_archive/analyze_media.py`
- `scripts/twitter_archive/build_dpo_pairs.py`
- `scripts/know_thy_father/index_media.py`
- `scripts/know_thy_father/synthesize_kernels.py`
- `scripts/know_thy_father/crossref_audit.py`
### Harness / agent entry points
- `uni-wizard/harness.py`
- `uni-wizard/daemons/task_router.py`
- `uni-wizard/daemons/health_daemon.py`
- `uniwizard/self_grader.py`
- `uniwizard/quality_scorer.py`
- `uniwizard/task_classifier.py`
### Game-agent entry points
- `morrowind/mcp_server.py`
- `morrowind/local_brain.py`
- `morrowind/pilot.py`
- `morrowind/play.py`
## Data Flows
### 1. Proof / hygiene flow
1. Developer changes repo files.
2. `scripts/detect_secrets.py` scans content for obvious leaks.
3. `CONTRIBUTING.md` requires exact proof artifacts, not hand-wavy claims.
4. `.gitea/workflows/smoke.yml` attempts parse checks in CI.
5. Sanitized session artifacts can be passed through `scripts/trajectory_sanitize.py` before reuse.
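The hygiene idea in steps 2 and 5 can be sketched as a minimal pattern scan. The patterns and function below are illustrative assumptions, not the actual rules in `scripts/detect_secrets.py`:

```python
import re

# Illustrative patterns only -- not the repo's real detection rules.
PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_text(text: str) -> list[str]:
    """Return the names of any secret patterns found in `text`."""
    return [name for name, pat in PATTERNS.items() if pat.search(text)]

print(scan_text("token = AKIAABCDEFGHIJKLMNOP"))  # ['aws_access_key']
```

The real scanner presumably also handles file traversal and allowlisting; the core check is this kind of regex pass over content.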
### 2. Archive-to-training flow
1. Raw archive export enters `scripts/twitter_archive/extract_archive.py`.
2. Media-bearing rows are indexed by `extract_media_manifest.py`.
3. `analyze_media.py` attaches batch analysis output and kernels.
4. `scripts/know_thy_father/*.py` transforms archive slices into meaning kernels and conscience cross-references.
5. Results land in `training-data/`, reports, and summary artifacts.
### 3. Evennia world flow
1. `bootstrap_local_evennia.py` prepares local runtime state.
2. `timmy-local/evennia/world/build.py` constructs rooms, exits, and Timmy's in-world state.
3. `scripts/evennia/evennia_mcp_server.py` bridges runtime control to Hermes/MCP.
4. `evennia_tools/telemetry.py` writes JSONL event streams and metadata.
5. Example traces are stored under `~/.timmy/training-data/evennia` and repo examples.
### 4. Harness / routing flow
1. Prompt text enters `uniwizard/task_classifier.py`.
2. Candidate backends are ranked by type and complexity.
3. `uniwizard/quality_scorer.py` records observed backend performance.
4. `uniwizard/self_grader.py` parses Hermes sessions and grades task quality.
5. `uni-wizard/` branches this idea into increasingly ambitious task routers, telemetry systems, and adaptation engines.
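The classify-then-rank idea behind steps 1 and 2 can be sketched in a few lines. The heuristics, class fields, and backend names here are illustrative assumptions, not `uniwizard`'s real API:

```python
from dataclasses import dataclass

@dataclass
class ClassificationResult:
    task_type: str
    complexity: str
    backend_order: list[str]  # cheapest viable backend first

def classify(prompt: str) -> ClassificationResult:
    """Toy classifier: route by coarse type and complexity signals."""
    lowered = prompt.lower()
    task_type = "code" if any(k in lowered for k in ("def ", "import ", "fix bug")) else "chat"
    complexity = "high" if len(prompt) > 400 else "low"
    # Low-complexity work can try small local backends before escalating.
    order = (["local-small", "local-large", "cloud"]
             if complexity == "low" else ["local-large", "cloud"])
    return ClassificationResult(task_type, complexity, order)

result = classify("fix bug in parser")
print(result.task_type, result.backend_order[0])  # code local-small
```

A quality scorer then closes the loop by recording which backend actually performed well per task type.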
### 5. Morrowind agent flow
1. OpenMW state is parsed from logs or screenshots.
2. `morrowind/local_brain.py` or `pilot.py` builds a local control prompt.
3. A local model or deterministic motor layer selects an action.
4. `morrowind/mcp_server.py` or local automation injects keys/actions.
5. Trajectories are logged under `~/.timmy/morrowind/`.
## Key Abstractions
### `TowerGame` / `GameState` / `Phase` / `Room`
File: `scripts/tower_game.py`
- the cleanest narrative-engine abstraction in the repo
- models room state, trust, energy, narrative phase, dialogue, and monologue
- heavily covered by `tests/test_tower_game.py`
### `TimmyCharacter` and `TimmyRoom`
Files:
- `timmy-local/evennia/typeclasses/characters.py`
- `timmy-local/evennia/typeclasses/rooms.py`
Purpose:
- represent Timmy as a persistent Evennia character with task, knowledge, tool, preference, and metric state
- model rooms as functional workspaces like Workshop, Library, Observatory, Forge, and Dispatch
### In-world tool commands
File: `timmy-local/evennia/commands/tools.py`
Key commands:
- `read`, `write`, `search`
- `git_status`, `git_log`, `git_pull`
- `sysinfo`, `health`
- `think`
- `gitea_issues`
- room navigation commands
This is the repo's most direct “world shell as operations interface” abstraction.
### `UniWizardHarness`
Files:
- `uni-wizard/harness.py`
- `uni-wizard/v2/harness.py`
Purpose:
- unify system, git, network, and file-style actions behind one harness surface
- later versions add provenance, house routing, telemetry, and policy logic
The downside is namespace drift across versions.
### `TaskClassifier` and `ClassificationResult`
File: `uniwizard/task_classifier.py`
Purpose:
- classify prompts by type and complexity
- recommend backend order for execution
This is a central abstraction in the repo's routing experiments.
### `PatternDatabase` / `IntelligenceEngine`
File: `uni-wizard/v3/intelligence_engine.py`
Purpose:
- store execution patterns and model performance
- derive adaptation events and predictions
This is a prototype intelligence layer, not yet a clean production subsystem.
### `MeaningKernel`
File: `scripts/know_thy_father/synthesize_kernels.py`
Purpose:
- normalize archive/media findings into structured semantic outputs
- feed downstream summaries and fact-style exports
### Evennia telemetry helpers
File: `evennia_tools/telemetry.py`
Purpose:
- produce deterministic session-day directories, event log paths, metadata files, and append-only telemetry streams
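The append-only JSONL pattern behind these helpers looks roughly like this; the directory layout and function name are assumptions for illustration, not `evennia_tools/telemetry.py`'s actual interface:

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

def append_event(base: Path, session: str, event: dict) -> Path:
    """Append one event to a per-session, per-day JSONL stream.
    Illustrative sketch; the real module's layout may differ."""
    day = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    log_dir = base / session / day
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / "events.jsonl"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
    return path

base = Path(tempfile.mkdtemp())
p = append_event(base, "session-001", {"kind": "move", "room": "Workshop"})
append_event(base, "session-001", {"kind": "say", "text": "hello"})
print(len(p.read_text().splitlines()))  # 2
```

Append-only JSONL keeps every event durable and greppable, which is why it shows up throughout the telemetry and training lanes.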
### `TimmyClient`
File: `infrastructure/timmy-bridge/client/timmy_client.py`
Purpose:
- send heartbeat/artifact state toward the bridge/relay system
This is important structurally, but still partly demo-grade in implementation quality.
## API Surface
## CLI commands
### Proof / hygiene
```bash
python3 scripts/detect_secrets.py <paths>
python3 scripts/trajectory_sanitize.py --input <in> --output <out>
python3 scripts/fleet_milestones.py --list
python3 scripts/fleet_milestones.py --trigger <milestone>
python3 scripts/failover_monitor.py
```
### Evennia
```bash
python3 scripts/evennia/bootstrap_local_evennia.py
python3 scripts/evennia/verify_local_evennia.py
python3 scripts/evennia/evennia_mcp_server.py
python3 scripts/evennia/eval_world_basics.py
python3 scripts/evennia/generate_sample_trace.py
```
### Archive / research
```bash
python3 scripts/twitter_archive/extract_archive.py ...
python3 scripts/twitter_archive/extract_media_manifest.py ...
python3 scripts/twitter_archive/analyze_media.py --status
python3 scripts/twitter_archive/analyze_media.py --batch <n>
python3 scripts/know_thy_father/index_media.py ...
python3 scripts/know_thy_father/synthesize_kernels.py ...
python3 scripts/know_thy_father/crossref_audit.py ...
```
### Harness / routing
```bash
python3 uniwizard/self_grader.py ...
python3 uniwizard/quality_scorer.py ...
python3 uniwizard/task_classifier.py ...
python3 uni-wizard/harness.py ...
python3 uni-wizard/daemons/task_router.py ...
python3 uni-wizard/daemons/health_daemon.py ...
```
### Game-agent
```bash
python3 morrowind/mcp_server.py
python3 morrowind/local_brain.py
python3 morrowind/pilot.py
python3 morrowind/play.py
```
## In-world commands
Documented in `timmy-local/evennia/commands/tools.py`:
- `read <path>`
- `write <path> = <content>`
- `search <pattern>`
- `git status`
- `git log [n]`
- `git pull`
- `sysinfo`
- `health`
- `think <prompt>`
- `gitea issues`
- movement/status commands for Workshop, Library, Observatory
## Test Coverage
### Current verified state
Commands run on this branch:
- `pytest -q tests/test_fleet_milestones.py tests/test_failover_monitor.py` → `7 passed`
- `pytest -q tests` → `150 passed, 19 warnings`
- `pytest -q` → `260 passed, 4 failed, 45 warnings`
### Tests added in this branch
- `tests/test_fleet_milestones.py`
- covers state persistence, dry-run behavior, unknown milestones, idempotence
- `tests/test_failover_monitor.py`
- covers online/offline detection and status-file emission
These close two concrete gaps around previously untested operational scripts.
### Well-covered areas
- secret detection
- trajectory sanitization
- Tower game (`scripts/tower_game.py`)
- Know Thy Father pipeline
- Twitter archive pipeline
- Evennia telemetry/layout helper modules
- documentation proof-policy assertions
### Weak or missing coverage
- no real end-to-end coverage for `scripts/evennia/bootstrap_local_evennia.py`
- no runtime coverage for `scripts/evennia/verify_local_evennia.py` against a real Evennia world
- no automated tests for `morrowind/`
- no meaningful automated coverage for `infrastructure/timmy-bridge/`
- most shell/provisioning scripts remain untested
- `uni-wizard/v2` and `uni-wizard/v3` are not full-repo-pytest clean
### Concrete failing area
Full-repo pytest currently fails in `uni-wizard/v2/tests/test_author_whitelist.py` because `uni-wizard/v2/task_router_daemon.py` imports `harness` and resolves to `uni-wizard/harness.py` instead of the version-local v2 harness. That is a real namespace collision, not flaky test noise.
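The collision is easy to reproduce in isolation. The paths below are illustrative stand-ins, but the mechanism is the same: with two modules named `harness` importable, the first `sys.path` match wins silently:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two directories, each containing a module named `harness`.
root = Path(tempfile.mkdtemp())
for d, tag in (("top", "top-level"), ("v2", "v2-local")):
    pkg = root / d
    pkg.mkdir()
    (pkg / "harness.py").write_text(f"VERSION = {tag!r}\n")

sys.path.insert(0, str(root / "v2"))
sys.path.insert(0, str(root / "top"))  # top-level dir now shadows v2

harness = importlib.import_module("harness")
print(harness.VERSION)  # top-level
```

The usual fixes are version-qualified package imports (so v2 code names its own harness explicitly) or keeping only one `harness`-bearing directory on `sys.path` at a time.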
## Security Considerations
### 1. Hard-coded IPs and host-specific paths
Examples appear across the repo, including:
- `timmy-local/README.md`
- `scripts/setup-uni-wizard.sh`
- `scripts/provision-timmy-vps.sh`
- `scripts/emacs-fleet-poll.sh`
- `scripts/emacs-fleet-bridge.py`
- `scripts/failover_monitor.py`
Risk:
- fragile deploy assumptions
- easy environment drift
- accidental disclosure in public docs or configs
### 2. Broad mutation surfaces in world-shell commands
`timmy-local/evennia/commands/tools.py` exposes host file, git, search, system, and network operations inside the world abstraction.
Risk:
- large blast radius if command permissions are weak
- hard to separate playful world interaction from privileged host mutation
### 3. CI is smoke-only and partly broken
`.gitea/workflows/smoke.yml` does not run pytest.
Worse, its JSON parse command is broken as written:
```bash
find . -name '*.json' | xargs -r python3 -m json.tool > /dev/null
```
`python3 -m json.tool` accepts at most one input file (plus an optional output path) per invocation, so when `xargs` batches many filenames onto one command line the tool errors out, and the smoke workflow can fail or mislead.
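A per-file loop sidesteps the batching problem. This is one possible shape of the fix, using illustrative fixture files rather than repo content:

```shell
# Per-file JSON check: each bad file is reported precisely, and
# json.tool only ever sees a single input. Fixture files below are
# illustrative, not repo content.
workdir=$(mktemp -d)
printf '{"ok": true}' > "$workdir/good.json"
printf '{broken'      > "$workdir/bad.json"

bad=0
for f in "$workdir"/*.json; do
  python3 -m json.tool "$f" > /dev/null 2>&1 || { echo "invalid JSON: $f"; bad=1; }
done
echo "bad=$bad"
```

In CI the same loop would walk the tracked `*.json` files (likely excluding `training-data/`, per the performance notes below) and exit non-zero when `bad=1`.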
### 4. Placeholder-grade crypto / bridge logic
`infrastructure/timmy-bridge/` contains meaningful structure, but parts of the relay/client story are still simplified or placeholder quality.
Risk:
- implementation may look more production-ready than it is
- trust assumptions may exceed actual cryptographic guarantees
### 5. Offensive tooling co-located with ops code
The `wizards/.../red-teaming/godmode/...` lane contains explicit jailbreak automation assets.
Risk:
- policy confusion
- accidental inclusion in downstream automation
- higher review burden for a repo that also houses operational infrastructure
### 6. Deprecated UTC calls
Multiple tested modules still use `datetime.utcnow()` and emit deprecation warnings.
Risk:
- future Python compatibility debt
- noisy test output that can hide more important warnings
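The migration is mechanical. A minimal before/after, assuming callers want to keep the same `...Z` string format:

```python
from datetime import datetime, timezone

# Deprecated pattern still used in several modules (warns on 3.12+):
#     stamp = datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ")
# Timezone-aware replacement producing the identical string:
stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
print(stamp.endswith("Z"))  # True
```

The aware object also compares and subtracts safely against other aware datetimes, which the naive `utcnow()` result does not.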
## Dependencies and External Assumptions
### Python / system
- Python 3.11 in CI
- `pytest`
- `PyYAML` for YAML smoke parsing
- `sqlite3`
- shell utilities like `bash`, `ping`, `grep`, `xargs`
### Local-runtime assumptions
- Hermes local session/state under `~/.hermes/`
- Timmy local state under `~/.timmy/`
- Evennia runtime availability for world-shell scripts
- local inference endpoints (llama.cpp / Ollama / localhost services)
- `ffmpeg` / `ffprobe` for media analysis paths
- OpenMW / Apple automation for `morrowind/`
- SSH/systemd availability for fleet scripts
## Deployment and Operability
The most important deploy fact is this:
- live orchestration is described in `OPERATIONS.md` as Hermes + `timmy-config` sidecar
- `timmy-home` supplies workspace scripts, experiments, runbooks, and artifacts around that live system
Practical run surfaces:
- CI smoke: `.gitea/workflows/smoke.yml`
- local tests: `pytest -q tests`
- Evennia world lane: `scripts/evennia/*.py`
- archive lane: `scripts/twitter_archive/*.py`, `scripts/know_thy_father/*.py`
- local-world experiments: `timmy-local/`
- harness experiments: `uniwizard/`, `uni-wizard/`
## Duplication and Drift Candidates
### High-confidence duplication
- `scripts/tower_game.py`
- `timmy-world/game.py`
- `evennia/timmy_world/game.py`
- `evennia/timmy_world/world/game.py`
These are overlapping Tower implementations with different behavior and shared concepts.
### Architecture-generation drift
- `uniwizard/`
- `uni-wizard/`
- `uni-wizard/v2/`
- `uni-wizard/v3/`
- `uni-wizard/v4/`
Multiple generations coexist with conflicting import assumptions.
### Pipeline overlap
- `twitter-archive/`
- `scripts/twitter_archive/`
- `scripts/know_thy_father/`
These lanes overlap in mission and artifact shape.
## Performance and Scaling Notes
- most heavy data volume lives in `training-data/`, so repo-wide file scans are expensive by default
- smoke commands that blindly walk all JSON files will age poorly as artifact volume grows
- archive/media pipelines depend on batch processing and checkpointing to remain tractable
- routing/telemetry systems rely heavily on local SQLite and JSONL append-only files, which is simple and inspectable but may become contention-prone under sustained automation
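If SQLite contention does become real under sustained automation, WAL mode is the first inexpensive lever, since readers no longer block on the single writer. A minimal sketch (database path is illustrative):

```python
import os
import sqlite3
import tempfile

# WAL lets readers proceed while one writer appends -- a cheap
# mitigation for the contention risk noted above.
db = os.path.join(tempfile.mkdtemp(), "telemetry.db")
con = sqlite3.connect(db)
con.execute("PRAGMA journal_mode=WAL")
con.execute("CREATE TABLE events (ts REAL, payload TEXT)")
con.execute("INSERT INTO events VALUES (1713148800.0, 'heartbeat')")
con.commit()
print(con.execute("SELECT COUNT(*) FROM events").fetchone()[0])  # 1
```

JSONL appends have the same single-writer property for free, which is part of why they dominate the telemetry lanes.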
## Recommended Follow-up Work
1. Fix `.gitea/workflows/smoke.yml` so JSON validation iterates file-by-file and pytest is part of CI. Filed as issue #715.
2. Resolve `uni-wizard` namespace collisions so `pytest -q` is green repo-wide. Filed as issue #716.
3. Decide which Tower implementation is canonical and retire the others or clearly mark them experimental.
4. Separate production-grade bridge/runtime code from placeholder or speculative prototypes.
5. Centralize host/path/IP configuration instead of embedding machine-specific values in docs and scripts.
6. Add end-to-end verification for the Evennia runtime lane.
7. Add at least smoke coverage for the `morrowind/` and `timmy-bridge/` lanes.
## Verification Notes
This GENOME is based on:
- direct inspection of the repo root, top-level metrics, and key runtime docs
- direct test execution on this branch
- direct reproduction of the broken CI JSON-parse command
- targeted new tests added for untested operational scripts
- deeper file-level analysis across Evennia, archive, harness, and game-agent lanes
It should be read as a map of a workspace monorepo with several real subsystems and several prototype subsystems, not as documentation for one singular deployable app.


@@ -1,61 +0,0 @@
# [PHASE-1] Survival - Keep the Lights On
Phase 1 is the manual-clicker stage of the fleet. The machines exist. The services exist. The human is still the automation loop.
## Phase Definition
- Current state: fleet exists, agents run, everything important still depends on human vigilance.
- Resources tracked here: Capacity, Uptime.
- Next phase: [PHASE-2] Automation - Self-Healing Infrastructure
## Current Buildings
- VPS hosts: Ezra, Allegro, Bezalel
- Agents: Timmy harness, Code Claw heartbeat, Gemini AI Studio worker
- Gitea forge
- Evennia worlds
## Current Resource Snapshot
- Fleet operational: yes
- Uptime baseline: 0.0%
- Days at or above 95% uptime: 0
- Capacity utilization: 0.0%
## Next Phase Trigger
To unlock [PHASE-2] Automation - Self-Healing Infrastructure, the fleet must hold both of these conditions at once:
- Uptime >= 95% for 30 consecutive days
- Capacity utilization > 60%
- Current trigger state: NOT READY
## Missing Requirements
- Uptime 0.0% / 95.0%
- Days at or above 95% uptime: 0/30
- Capacity utilization 0.0% / >60.0%
## Manual Clicker Interpretation
Paperclips analogy: Phase 1 = Manual clicker. You ARE the automation.
Every restart, every SSH, every check is a manual click.
## Manual Clicks Still Required
- Restart agents and services by hand when a node goes dark.
- SSH into machines to verify health, disk, and memory.
- Check Gitea, relay, and world services manually before and after changes.
- Act as the scheduler when automation is missing or only partially wired.
## Repo Signals Already Present
- `scripts/fleet_health_probe.sh` — Automated health probe exists and can supply the uptime baseline for the next phase.
- `scripts/fleet_milestones.py` — Milestone tracker exists, so survival achievements can be narrated and logged.
- `scripts/auto_restart_agent.sh` — Auto-restart tooling already exists as phase-2 groundwork.
- `scripts/backup_pipeline.sh` — Backup pipeline scaffold exists for post-survival automation work.
- `infrastructure/timmy-bridge/reports/generate_report.py` — Bridge reporting exists and can summarize heartbeat-driven uptime.
## Notes
- The fleet is alive, but the human is still the control loop.
- Phase 1 is about naming reality plainly so later automation has a baseline to beat.


@@ -12,7 +12,6 @@ Quick-reference index for common operational tasks across the Timmy Foundation i
| Check fleet health | fleet-ops | `python3 scripts/fleet_readiness.py` |
| Agent scorecard | fleet-ops | `python3 scripts/agent_scorecard.py` |
| View fleet manifest | fleet-ops | `cat manifest.yaml` |
| Render Phase-1 survival report | timmy-home | `python3 scripts/fleet_phase_status.py --output docs/FLEET_PHASE_1_SURVIVAL.md` |
## the-nexus (Frontend + Brain)


@@ -1,224 +0,0 @@
#!/usr/bin/env python3
"""Render the current fleet survival phase as a durable report."""
from __future__ import annotations
import argparse
import json
from copy import deepcopy
from pathlib import Path
from typing import Any
PHASE_NAME = "[PHASE-1] Survival - Keep the Lights On"
NEXT_PHASE_NAME = "[PHASE-2] Automation - Self-Healing Infrastructure"
TARGET_UPTIME_PERCENT = 95.0
TARGET_UPTIME_DAYS = 30
TARGET_CAPACITY_PERCENT = 60.0
DEFAULT_BUILDINGS = [
"VPS hosts: Ezra, Allegro, Bezalel",
"Agents: Timmy harness, Code Claw heartbeat, Gemini AI Studio worker",
"Gitea forge",
"Evennia worlds",
]
DEFAULT_MANUAL_CLICKS = [
"Restart agents and services by hand when a node goes dark.",
"SSH into machines to verify health, disk, and memory.",
"Check Gitea, relay, and world services manually before and after changes.",
"Act as the scheduler when automation is missing or only partially wired.",
]
REPO_SIGNAL_FILES = {
"scripts/fleet_health_probe.sh": "Automated health probe exists and can supply the uptime baseline for the next phase.",
"scripts/fleet_milestones.py": "Milestone tracker exists, so survival achievements can be narrated and logged.",
"scripts/auto_restart_agent.sh": "Auto-restart tooling already exists as phase-2 groundwork.",
"scripts/backup_pipeline.sh": "Backup pipeline scaffold exists for post-survival automation work.",
"infrastructure/timmy-bridge/reports/generate_report.py": "Bridge reporting exists and can summarize heartbeat-driven uptime.",
}
DEFAULT_SNAPSHOT = {
"fleet_operational": True,
"resources": {
"uptime_percent": 0.0,
"days_at_or_above_95_percent": 0,
"capacity_utilization_percent": 0.0,
},
"current_buildings": DEFAULT_BUILDINGS,
"manual_clicks": DEFAULT_MANUAL_CLICKS,
"notes": [
"The fleet is alive, but the human is still the control loop.",
"Phase 1 is about naming reality plainly so later automation has a baseline to beat.",
],
}
def default_snapshot() -> dict[str, Any]:
return deepcopy(DEFAULT_SNAPSHOT)
def _deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
result = deepcopy(base)
for key, value in override.items():
if isinstance(value, dict) and isinstance(result.get(key), dict):
result[key] = _deep_merge(result[key], value)
else:
result[key] = value
return result
def load_snapshot(snapshot_path: Path | None = None) -> dict[str, Any]:
snapshot = default_snapshot()
if snapshot_path is None:
return snapshot
override = json.loads(snapshot_path.read_text(encoding="utf-8"))
return _deep_merge(snapshot, override)
def collect_repo_signals(repo_root: Path) -> list[str]:
signals: list[str] = []
for rel_path, description in REPO_SIGNAL_FILES.items():
if (repo_root / rel_path).exists():
signals.append(f"`{rel_path}` — {description}")
return signals
def compute_phase_status(snapshot: dict[str, Any], repo_root: Path | None = None) -> dict[str, Any]:
repo_root = repo_root or Path(__file__).resolve().parents[1]
resources = snapshot.get("resources", {})
uptime_percent = float(resources.get("uptime_percent", 0.0))
uptime_days = int(resources.get("days_at_or_above_95_percent", 0))
capacity_percent = float(resources.get("capacity_utilization_percent", 0.0))
fleet_operational = bool(snapshot.get("fleet_operational", False))
missing: list[str] = []
if not fleet_operational:
missing.append("Fleet operational flag is false.")
if uptime_percent < TARGET_UPTIME_PERCENT:
missing.append(f"Uptime {uptime_percent:.1f}% / {TARGET_UPTIME_PERCENT:.1f}%")
if uptime_days < TARGET_UPTIME_DAYS:
missing.append(f"Days at or above 95% uptime: {uptime_days}/{TARGET_UPTIME_DAYS}")
if capacity_percent <= TARGET_CAPACITY_PERCENT:
missing.append(f"Capacity utilization {capacity_percent:.1f}% / >{TARGET_CAPACITY_PERCENT:.1f}%")
return {
"title": PHASE_NAME,
"current_phase": "PHASE-1 Survival",
"fleet_operational": fleet_operational,
"resources": {
"uptime_percent": uptime_percent,
"days_at_or_above_95_percent": uptime_days,
"capacity_utilization_percent": capacity_percent,
},
"current_buildings": list(snapshot.get("current_buildings", DEFAULT_BUILDINGS)),
"manual_clicks": list(snapshot.get("manual_clicks", DEFAULT_MANUAL_CLICKS)),
"notes": list(snapshot.get("notes", [])),
"repo_signals": collect_repo_signals(repo_root),
"next_phase": NEXT_PHASE_NAME,
"next_phase_ready": fleet_operational and not missing,
"missing_requirements": missing,
}
def render_markdown(status: dict[str, Any]) -> str:
resources = status["resources"]
missing = status["missing_requirements"]
ready_line = "READY" if status["next_phase_ready"] else "NOT READY"
lines = [
f"# {status['title']}",
"",
"Phase 1 is the manual-clicker stage of the fleet. The machines exist. The services exist. The human is still the automation loop.",
"",
"## Phase Definition",
"",
"- Current state: fleet exists, agents run, everything important still depends on human vigilance.",
"- Resources tracked here: Capacity, Uptime.",
f"- Next phase: {status['next_phase']}",
"",
"## Current Buildings",
"",
]
lines.extend(f"- {item}" for item in status["current_buildings"])
lines.extend([
"",
"## Current Resource Snapshot",
"",
f"- Fleet operational: {'yes' if status['fleet_operational'] else 'no'}",
f"- Uptime baseline: {resources['uptime_percent']:.1f}%",
f"- Days at or above 95% uptime: {resources['days_at_or_above_95_percent']}",
f"- Capacity utilization: {resources['capacity_utilization_percent']:.1f}%",
"",
"## Next Phase Trigger",
"",
f"To unlock {status['next_phase']}, the fleet must hold both of these conditions at once:",
f"- Uptime >= {TARGET_UPTIME_PERCENT:.0f}% for {TARGET_UPTIME_DAYS} consecutive days",
f"- Capacity utilization > {TARGET_CAPACITY_PERCENT:.0f}%",
f"- Current trigger state: {ready_line}",
"",
"## Missing Requirements",
"",
])
if missing:
lines.extend(f"- {item}" for item in missing)
else:
lines.append("- None. Phase 2 can unlock now.")
lines.extend([
"",
"## Manual Clicker Interpretation",
"",
"Paperclips analogy: Phase 1 = Manual clicker. You ARE the automation.",
"Every restart, every SSH, every check is a manual click.",
"",
"## Manual Clicks Still Required",
"",
])
lines.extend(f"- {item}" for item in status["manual_clicks"])
lines.extend([
"",
"## Repo Signals Already Present",
"",
])
if status["repo_signals"]:
lines.extend(f"- {item}" for item in status["repo_signals"])
else:
lines.append("- No survival-adjacent repo signals detected.")
if status["notes"]:
lines.extend(["", "## Notes", ""])
lines.extend(f"- {item}" for item in status["notes"])
return "\n".join(lines).rstrip() + "\n"
def main() -> None:
parser = argparse.ArgumentParser(description="Render the fleet phase-1 survival report")
parser.add_argument("--snapshot", help="Optional JSON snapshot overriding the default phase-1 baseline")
parser.add_argument("--output", help="Write markdown report to this path")
parser.add_argument("--json", action="store_true", help="Print computed status as JSON instead of markdown")
args = parser.parse_args()
snapshot = load_snapshot(Path(args.snapshot).expanduser() if args.snapshot else None)
repo_root = Path(__file__).resolve().parents[1]
status = compute_phase_status(snapshot, repo_root=repo_root)
if args.json:
rendered = json.dumps(status, indent=2)
else:
rendered = render_markdown(status)
if args.output:
output_path = Path(args.output).expanduser()
output_path.parent.mkdir(parents=True, exist_ok=True)
output_path.write_text(rendered, encoding="utf-8")
print(f"Phase status written to {output_path}")
else:
print(rendered)
if __name__ == "__main__":
main()
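The renderer above closes with a pattern worth noting: accumulate markdown lines in a list, join once, and normalize the tail with `rstrip() + "\n"` so the output always ends in exactly one newline, even when an optional section (such as Notes) leaves stray trailing blanks. A standalone sketch of that pattern (the `render` name and report content here are illustrative, not from the repo):

```python
def render(items):
    # Accumulate markdown lines, then join and normalize the tail:
    # rstrip() drops any trailing blank lines, and the final "\n"
    # guarantees the output ends with exactly one newline.
    lines = ["# Report", ""]
    lines.extend(f"- {item}" for item in items)
    lines.extend(["", ""])  # stray trailing blanks get stripped below
    return "\n".join(lines).rstrip() + "\n"

print(render(["ezra: ONLINE", "bezalel: OFFLINE"]))
```

This is why the committed doc ends cleanly regardless of which optional sections were emitted.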

View File

@@ -0,0 +1,45 @@
import json
import subprocess

from scripts import failover_monitor as monitor


def test_check_health_reports_online(monkeypatch):
    def fake_check_call(cmd, stdout=None):
        assert cmd[:4] == ["ping", "-c", "1", "-W"]
        return 0

    monkeypatch.setattr(monitor.subprocess, "check_call", fake_check_call)
    assert monitor.check_health("1.2.3.4") == "ONLINE"


def test_check_health_reports_offline(monkeypatch):
    def fake_check_call(cmd, stdout=None):
        raise subprocess.CalledProcessError(returncode=1, cmd=cmd)

    monkeypatch.setattr(monitor.subprocess, "check_call", fake_check_call)
    assert monitor.check_health("1.2.3.4") == "OFFLINE"


def test_main_writes_status_file_and_prints(tmp_path, monkeypatch, capsys):
    monkeypatch.setattr(monitor, "STATUS_FILE", tmp_path / "failover_status.json")
    monkeypatch.setattr(monitor, "FLEET", {"ezra": "1.1.1.1", "bezalel": "2.2.2.2"})
    monkeypatch.setattr(monitor.time, "time", lambda: 1713148800.0)
    monkeypatch.setattr(
        monitor,
        "check_health",
        lambda host: "ONLINE" if host == "1.1.1.1" else "OFFLINE",
    )

    monitor.main()

    payload = json.loads(monitor.STATUS_FILE.read_text())
    assert payload == {
        "timestamp": 1713148800.0,
        "fleet": {"ezra": "ONLINE", "bezalel": "OFFLINE"},
    }
    captured = capsys.readouterr()
    assert "ALLEGRO FAILOVER MONITOR" in captured.out.upper()
    assert "EZRA: ONLINE" in captured.out
    assert "BEZALEL: OFFLINE" in captured.out

View File

@@ -0,0 +1,82 @@
import json
from datetime import datetime

import pytest

from scripts import fleet_milestones as fm


class FixedDateTime:
    @classmethod
    def utcnow(cls):
        return datetime(2026, 4, 15, 1, 2, 3)


def test_trigger_persists_state_and_log(tmp_path, monkeypatch, capsys):
    state_file = tmp_path / "milestones.json"
    log_file = tmp_path / "fleet_milestones.log"
    monkeypatch.setattr(fm, "STATE_FILE", state_file)
    monkeypatch.setattr(fm, "LOG_FILE", log_file)
    monkeypatch.setattr(fm, "datetime", FixedDateTime)

    fm.trigger("health_check_first_run")

    saved = json.loads(state_file.read_text())
    assert saved["health_check_first_run"] == {
        "triggered_at": "2026-04-15T01:02:03Z",
        "phase": 1,
    }
    log_lines = log_file.read_text().strip().splitlines()
    assert len(log_lines) == 1
    assert "First automated health check ran" in log_lines[0]
    captured = capsys.readouterr()
    assert "MILESTONE" in captured.out


def test_trigger_dry_run_logs_without_persisting_state(tmp_path, monkeypatch):
    state_file = tmp_path / "milestones.json"
    log_file = tmp_path / "fleet_milestones.log"
    monkeypatch.setattr(fm, "STATE_FILE", state_file)
    monkeypatch.setattr(fm, "LOG_FILE", log_file)
    monkeypatch.setattr(fm, "datetime", FixedDateTime)

    fm.trigger("backup_first_success", dry_run=True)

    assert not state_file.exists()
    assert "First automated backup completed" in log_file.read_text()


def test_trigger_unknown_key_exits(monkeypatch):
    monkeypatch.setattr(fm, "datetime", FixedDateTime)

    with pytest.raises(SystemExit) as exc:
        fm.trigger("not-a-real-milestone")

    assert exc.value.code == 1


def test_trigger_is_idempotent_once_recorded(tmp_path, monkeypatch, capsys):
    state_file = tmp_path / "milestones.json"
    log_file = tmp_path / "fleet_milestones.log"
    state_file.write_text(
        json.dumps(
            {
                "health_check_first_run": {
                    "triggered_at": "2026-04-01T00:00:00Z",
                    "phase": 1,
                }
            }
        )
    )
    monkeypatch.setattr(fm, "STATE_FILE", state_file)
    monkeypatch.setattr(fm, "LOG_FILE", log_file)
    monkeypatch.setattr(fm, "datetime", FixedDateTime)

    fm.trigger("health_check_first_run")

    assert not log_file.exists()
    captured = capsys.readouterr()
    assert "already triggered" in captured.out

View File

@@ -1,67 +0,0 @@
from __future__ import annotations

import importlib.util
from pathlib import Path

ROOT = Path(__file__).resolve().parents[1]
SCRIPT_PATH = ROOT / "scripts" / "fleet_phase_status.py"
DOC_PATH = ROOT / "docs" / "FLEET_PHASE_1_SURVIVAL.md"


def _load_module(path: Path, name: str):
    assert path.exists(), f"missing {path.relative_to(ROOT)}"
    spec = importlib.util.spec_from_file_location(name, path)
    assert spec and spec.loader
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def test_compute_phase_status_tracks_survival_gate_requirements() -> None:
    mod = _load_module(SCRIPT_PATH, "fleet_phase_status")
    status = mod.compute_phase_status(
        {
            "fleet_operational": True,
            "resources": {
                "uptime_percent": 94.5,
                "days_at_or_above_95_percent": 12,
                "capacity_utilization_percent": 45.0,
            },
        }
    )
    assert status["current_phase"] == "PHASE-1 Survival"
    assert status["next_phase_ready"] is False
    assert any("94.5% / 95.0%" in item for item in status["missing_requirements"])
    assert any("12/30" in item for item in status["missing_requirements"])
    assert any("45.0% / >60.0%" in item for item in status["missing_requirements"])


def test_render_markdown_preserves_phase_buildings_and_manual_clicker_language() -> None:
    mod = _load_module(SCRIPT_PATH, "fleet_phase_status")
    status = mod.compute_phase_status(mod.default_snapshot())
    report = mod.render_markdown(status)
    for snippet in (
        "# [PHASE-1] Survival - Keep the Lights On",
        "VPS hosts: Ezra, Allegro, Bezalel",
        "Timmy harness",
        "Gitea forge",
        "Evennia worlds",
        "Every restart, every SSH, every check is a manual click.",
    ):
        assert snippet in report


def test_repo_contains_generated_phase_1_doc() -> None:
    assert DOC_PATH.exists(), "missing committed phase-1 survival doc"
    text = DOC_PATH.read_text(encoding="utf-8")
    for snippet in (
        "# [PHASE-1] Survival - Keep the Lights On",
        "## Current Buildings",
        "## Next Phase Trigger",
        "## Manual Clicker Interpretation",
    ):
        assert snippet in text
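The deleted tests relied on a reusable trick via `_load_module`: loading a script file as a module with `importlib.util` instead of importing it from a package, which lets tests exercise a standalone `scripts/*.py` file without touching `sys.path`. A self-contained demo of the same pattern (the temp file and the `ANSWER` name are illustrative):

```python
import importlib.util
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Write a throwaway "script" to load by path, as _load_module did.
    script = Path(tmp) / "demo_script.py"
    script.write_text("ANSWER = 42\n", encoding="utf-8")

    # spec_from_file_location + exec_module loads any .py file as a
    # module object, no package layout required.
    spec = importlib.util.spec_from_file_location("demo_script", script)
    assert spec is not None and spec.loader is not None
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    print(module.ANSWER)  # → 42
```

Dropping these tests means the script's markdown contract is no longer pinned by CI; the `@@ -1,67 +0,0 @@` hunk above removes the whole file.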