# GENOME.md — timmy-config

## Project Overview

timmy-config is the sovereign configuration sidecar that makes Timmy Timmy. It houses the SOUL, skills, playbooks, memories, skins, operational scripts, Ansible playbooks, training data, and cron jobs — applied as an overlay to the Hermes harness without forking hermes-agent code.

As a sidecar, timmy-config lives outside the main Hermes codebase yet drives its behavior through configuration files, deployed scripts, and orchestrated workflows. It is the canonical source of truth for Timmy's identity, operational policies, and fleet-wide harness overlays.

Key statistics:

  • 260 source files, 76 test files, 38 config files
  • ~67K total lines of code and configuration
  • Last commit: sprint/issue-858 branch, 415 total commits

## Architecture Diagram

```mermaid
graph TD
    A[timmy-config root] --> B[config/]
    A --> C[bin/]
    A --> D[scripts/]
    A --> E[ansible/]
    A --> F[training/]
    A --> G[playbooks/]
    A --> H[wizards/]
    A --> I[pipeline/]

    B --> B1[config.yaml<br/>base config]
    B --> B2[config.*.yaml<br/>env overlays]
    B --> B3[config_overlay.py<br/>programmatic merge]

    C --> C1[deploy-allegro-house.sh]
    C --> C2[hermes-startup.sh]
    C --> C3[provider-health-monitor.py]
    C --> C4[soul_eval_gate.py]

    D --> D1[config_drift.py]
    D --> D2[provision_wizard.py]
    D --> D3[architecture_linter.py]

    E --> E1[site.yml<br/>wizard base]
    E --> E2[deadman_switch.yml]
    E --> E3[golden_state.yml]

    F --> F1[training_pair_provenance.py]
    F --> F2[validate_provenance.py]

    G --> G1[issue-triager.yaml]
    G --> G2[pr-reviewer.yaml]
    G --> G3[security-auditor.yaml]

    H --> H1[allegro/config.yaml]
    H --> H2[bezalel/config.yaml]
    H --> H3[ezra/config.yaml]

    I --> I1[orchestrator.py]
    I --> I2[nightly_scheduler.py]
    I --> I3[quality_gate.py]

    subgraph "Hermes Runtime"
        M[hermes-agent] --> N[timmy-home<br/>runtime]
        N --> O[Wizard houses<br/>/root/wizards/*]
    end

    H --> M
    C --> M
    D --> M
    I --> M

    style M fill:#e1f5e1
    style N fill:#fff3e0
```

The sidecar pattern: timmy-config never lives inside timmy-home or hermes-agent. Instead, its files are deployed into those runtimes:

  • Wizard configs (wizards/*/config.yaml) are copied into /root/wizards/<wizard>/home/
  • Bin scripts are symlinked or copied into ~/bin and sourced by agent startup
  • Playbooks and training data are referenced via path overlays

## Entry Points and Data Flow

### Primary Entry Points

  1. Deployment entry: deploy.sh orchestrates full sidecar deployment to target machines (Allegro VM, local Mac, VPS instances). Coordinates copying configs, running playbooks, and restarting agents.

  2. Configuration overlay: config.yaml is the root configuration consumed by Hermes at startup. Combined with env-specific overlays (config.dev.yaml, config.prod.yaml, config.gateway.yaml) via config_overlay.py.

  3. Wizard identity bootstrap: wizards/<wizard>/config.yaml defines model routing, fallback chains, and toolset permissions. Applied when a wizard house is first provisioned.

  4. Orchestration CLI: bin/timmy-orchestrator.sh wraps pipeline/orchestrator.py to dispatch work across the fleet.

  5. Task execution: tasks.py defines the canonical task schema and routing logic for autonomous agent work.

  6. Gitea integration: gitea_client.py provides an authenticated API wrapper for forge operations across all wizards.

  7. Health & monitoring: bin/provider-health-monitor.py and bin/pane-watchdog.sh run as daemons to maintain liveness.

  8. Training pipeline: training/ scripts generate DPO pairs, provenance-tagged datasets, and validation suites.

### Data Flow

```
config/ (source-of-truth)
  ↓ deploy.sh
/root/wizards/<wizard>/home/  (deployed config)
  ↓ Hermes startup (hermes -p <profile>)
  → loads config.yaml → merges overlay → applies fallbacks
  → reads SOUL.md → sets identity
  → loads skills from ~/.hermes/skills/
  → activates toolset list
  → begins conversation loop
```

Monitoring daemons under scripts/ independently poll for config drift, token budgets, and agent health, writing to logs/ and optionally raising Gitea issues.


## Key Abstractions

Sidecar: A configuration repository that is never imported as a Python module but whose files are deployed into a target runtime. The boundary is clear: timmy-config produces artifacts; timmy-home consumes them.

Wizard house: A sovereign runtime directory (/root/wizards/<name>/home) containing the deployed config, SOUL, sessions, and logs. Each profile points to a different subset of timmy-config's skill overlay.

Config overlay: A layered YAML merge: config.yaml (base) → environment overlay (dev/cron/gateway/prod) → programmatic patches from config_overlay.py. The result is the final Hermes profile.
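
A minimal sketch of that layering, assuming a recursive dict merge in which overlay values win and nested dicts merge key-by-key (the real conflict rules live in config_overlay.py):

```python
# Sketch of the layered merge; overlay values replace base scalars,
# nested dicts merge recursively. Actual rules (e.g. list concatenation
# vs. replacement) are defined by config_overlay.py.
import yaml

def deep_merge(base: dict, overlay: dict) -> dict:
    """Recursively merge overlay into base; overlay wins on conflicts."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

def load_profile(base_path: str, overlay_path: str, patches: dict) -> dict:
    """config.yaml -> env overlay -> programmatic patches, in that order."""
    with open(base_path) as f:
        profile = yaml.safe_load(f)
    with open(overlay_path) as f:
        profile = deep_merge(profile, yaml.safe_load(f))
    return deep_merge(profile, patches)
```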

Provider fallback chain: Declared in config as the ordered list fallback_providers: [{provider, model}, …]. When the primary provider fails or exhausts its quota, the chain advances to the next entry.
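
A hedged sketch of the walk; provider and model names are stand-ins, and ProviderError/call_provider stub the real backend layer:

```python
# Walk a declared fallback chain until one provider answers.
class ProviderError(Exception):
    """Raised when a provider fails or its quota is exhausted."""

def call_provider(provider: str, model: str, prompt: str) -> str:
    """Stand-in for the real backend call (HTTP to Ollama, provider SDK, ...)."""
    raise ProviderError(f"{provider}/{model} unavailable")

def complete_with_fallback(prompt: str, chain: list[dict]) -> str:
    """Try each chain entry in order; raise only if all fail."""
    last_error = None
    for entry in chain:
        try:
            return call_provider(entry["provider"], entry["model"], prompt)
        except ProviderError as err:
            last_error = err
    raise RuntimeError(f"all providers in chain failed: {last_error}")

# Shape as declared in config: fallback_providers: [{provider, model}, ...]
fallback_providers = [
    {"provider": "anthropic", "model": "claude-sonnet"},
    {"provider": "ollama", "model": "llama3"},
]
```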

Task schema: Defined in tasks.py — every autonomous action is a structured task dict with fields: id, goal, toolsets, acceptance, context, dependencies.
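
An illustrative task dict with the fields named above, plus a minimal validation pass; the field values are hypothetical, and the canonical rules live in tasks.py:

```python
# Example task dict; values are illustrative, schema fields from tasks.py.
task = {
    "id": "task-0042",
    "goal": "triage new issues on the forge",
    "toolsets": ["gitea", "shell"],
    "acceptance": "every open issue has a label and a priority comment",
    "context": {"repo": "timmy/timmy-config"},
    "dependencies": [],  # ids of tasks that must complete first
}

REQUIRED_FIELDS = {"id", "goal", "toolsets", "acceptance", "context", "dependencies"}

def validate(task: dict) -> None:
    """Minimal checks: required keys present, no self-dependency."""
    missing = REQUIRED_FIELDS - task.keys()
    if missing:
        raise ValueError(f"task missing fields: {sorted(missing)}")
    if task["id"] in task["dependencies"]:
        raise ValueError("task cannot depend on itself")
```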

Training provenance: All training data pairs are tagged with source commit, generation script hash, and licensing metadata. Enforced by training/provenance.py and scripts/backfill_training_provenance.py.
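
A sketch of what such a tag could look like; field names are illustrative, and the enforced schema lives in training/provenance.py:

```python
# Tag a training pair with its source commit, generator hash, and license.
import hashlib
import subprocess

def provenance_tag(script_path: str, license_id: str) -> dict:
    commit = subprocess.check_output(
        ["git", "rev-parse", "HEAD"], text=True
    ).strip()
    with open(script_path, "rb") as f:
        script_hash = hashlib.sha256(f.read()).hexdigest()
    return {
        "source_commit": commit,
        "generator_sha256": script_hash,
        "license": license_id,
    }
```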

Fleet operator: Human-in-the-loop who monitors bin/ health checks, runs muda-audit.sh, and updates skill sets via skill_installer.py.


## API Surface

Internal APIs (within timmy-config tooling):

  • config_overlay.py: merge_config(base, overlay, patches) → merged dict
  • gitea_client.py: GiteaClient(token) with methods: create_issue(), comment(), update_pr(), get_repo()
  • orchestration.py: Orchestrator.dispatch(task) → job_id, Orchestrator.status(job_id) → state
  • tasks.py: Task.from_dict(), Task.validate(), Task.to_yaml()

External APIs (timmy-config touches):

  • Gitea REST API (forge.alexanderwhitestone.com) — tokens stored in wizard .env files, used by gitea_client.py and bin scripts
  • Hermes agent CLI (hermes -p <profile> chat --yolo) — invoked by bin wrappers
  • Ollama/local model servers — contacted via provider backends
  • Ansible (for fleet-wide rollouts) — ansible/ playbooks called by deploy scripts

Key file: gitea_client.py encapsulates all Gitea HTTP calls with token auth. It is not imported as a library by hermes-agent — it is executed as a subprocess from bin scripts.
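
As a sketch of that wrapper's shape, here is a minimal client against the standard Gitea REST API (/api/v1, token auth) covering two of the methods listed above; the real implementation is gitea_client.py:

```python
# Minimal Gitea client sketch; method names mirror the internal API
# surface above, endpoints follow Gitea's standard /api/v1 routes.
import requests

class GiteaClient:
    def __init__(self, token: str,
                 base_url: str = "https://forge.alexanderwhitestone.com"):
        self.session = requests.Session()
        self.session.headers["Authorization"] = f"token {token}"
        self.api = f"{base_url}/api/v1"

    def create_issue(self, owner: str, repo: str,
                     title: str, body: str = "") -> dict:
        resp = self.session.post(
            f"{self.api}/repos/{owner}/{repo}/issues",
            json={"title": title, "body": body},
        )
        resp.raise_for_status()
        return resp.json()

    def comment(self, owner: str, repo: str, index: int, body: str) -> dict:
        resp = self.session.post(
            f"{self.api}/repos/{owner}/{repo}/issues/{index}/comments",
            json={"body": body},
        )
        resp.raise_for_status()
        return resp.json()
```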


## Test Coverage Gaps

Coverage is estimated at 28% across 140 source modules. Significant gaps exist in:

  • Configuration layer: config_overlay.py has no unit tests for deep-merge edge cases (e.g. conflicting keys, list concatenation vs. replacement)
  • Orchestration: pipeline/orchestrator.py lacks integration tests for dependency resolution and failure recovery
  • Task schema: tasks.py validation rules untested for circular dependency detection
  • Gitea wrapper: No tests for HTTP error handling, rate limit backoff, or malformed response recovery
  • Health monitors: bin/provider-health-monitor.py and bin/pane-watchdog.sh have smoke tests but no coverage of edge-case failure scenarios
  • Training provenance: training/provenance.py and scripts/backfill_training_provenance.py lack schema validation tests
  • Wizard bootstrap: hermes-sovereign/wizard-bootstrap/wizard_bootstrap.py has minimal test coverage for permission scenarios
  • Deployment scripts: deploy.sh and bin/deploy-allegro-house.sh are shell scripts with no automated validation

High-priority targets for smoke tests:

  1. config_overlay.py — merge correctness under all conflict modes (see the sketch after this list)
  2. gitea_client.py — error paths and retry logic
  3. orchestration.py — end-to-end task dispatch lifecycle
  4. tasks.py — validation of malformed task definitions
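
A sketch of target 1, assuming the merge_config(base, overlay, patches) signature from the API surface above; the asserted conflict rules (overlay wins, nested dicts merge) are plausible defaults and should be aligned with the documented behavior:

```python
# Smoke tests for merge correctness under conflict; run with pytest.
from config_overlay import merge_config

def test_overlay_scalar_wins():
    merged = merge_config({"model": "a"}, {"model": "b"}, {})
    assert merged["model"] == "b"

def test_nested_dicts_merge():
    base = {"providers": {"primary": "a", "timeout": 30}}
    overlay = {"providers": {"primary": "b"}}
    merged = merge_config(base, overlay, {})
    assert merged["providers"] == {"primary": "b", "timeout": 30}

def test_patches_apply_last():
    merged = merge_config({"k": 1}, {"k": 2}, {"k": 3})
    assert merged["k"] == 3
```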

## Security Considerations

  • ⚠️ Uses subprocess/os.system — ensure command injection protection in all bin/ wrapper scripts (validate inputs, use shlex.quote; see the sketch after this list)
  • ⚠️ Secrets/passwords referenced — confirm .env files are never checked into git; periodically run grep -r "password" across bin/ and scripts/
  • ⚠️ SQL usage detected — ensure parameterized queries, never interpolate raw values
  • ⚠️ Makes HTTP requests — validate URLs, use HTTPS for forge, pin known-good hosts
  • Access control: wizard house directories must be 700 root-owned; .env files 600
  • Ansible playbooks use become: true — review privilege escalation boundaries
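
A minimal sketch of the injection-safe subprocess pattern for bin/ wrappers; the service unit and log path are hypothetical:

```python
import shlex
import subprocess

def restart_wizard(wizard_name: str) -> None:
    # Preferred: pass an argument vector so no shell ever parses the input.
    # (The wizard@ service unit is hypothetical.)
    subprocess.run(["systemctl", "restart", f"wizard@{wizard_name}"], check=True)

def tail_wizard_log(wizard_name: str) -> str:
    # If a shell string is unavoidable, quote every interpolated value.
    cmd = f"tail -n 50 /root/wizards/{shlex.quote(wizard_name)}/home/logs/agent.log"
    return subprocess.check_output(cmd, shell=True, text=True)
```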

The sidecar boundary itself is a security control: timmy-config never runs inside the agent; it only deploys artifacts. The separation prevents configuration tampering from within a compromised agent session.


## Performance Characteristics

  • Startup latency: Hermes agent loads ~38 YAML config files on startup (config/, wizards/*/config.yaml, overlays); in practice this takes under 2s on modern hardware. The merged config_overlay.py output could be cached to a pickle to reduce the cold-start penalty (see the sketch after this list).
  • Token budgeting: bin/token-tracker.py and bin/token-optimizer.py watch LLM token consumption — the sidecar embeds pruning policies to keep per-turn context under model limits.
  • Parallel dispatch: pipeline/orchestrator.py uses Python asyncio for concurrent task submission across multiple agents. GIL-bound CPU work rare; most time is I/O (HTTP to providers, subprocess spawn).
  • Filesystem churn: training/ data generation scripts process large datasets (Twitter archive, scene descriptions). Recommend streaming pipelines to avoid loading entire manifests into RAM.
  • Monitoring overhead: Health check scripts run every 30–60s. Each performs lightweight HTTP HEAD requests; aggregate load is negligible.
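
A sketch of the pickle cache suggested under startup latency; paths are illustrative, and the cache file should get the same restrictive ownership treatment as other deployed artifacts:

```python
# Reuse the merged profile unless any source YAML is newer than the cache.
import os
import pickle
from typing import Callable

CACHE_PATH = "/tmp/timmy-profile.pkl"  # illustrative; keep root-owned in practice

def load_cached_profile(sources: list[str], build: Callable[[], dict]) -> dict:
    """build() performs the full YAML load + overlay merge."""
    newest_source = max(os.path.getmtime(p) for p in sources)
    if os.path.exists(CACHE_PATH) and os.path.getmtime(CACHE_PATH) >= newest_source:
        with open(CACHE_PATH, "rb") as f:
            return pickle.load(f)
    profile = build()
    with open(CACHE_PATH, "wb") as f:
        pickle.dump(profile, f)
    return profile
```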

Generated by Codebase Genome Pipeline. Review and update manually.