Compare commits

..

3 Commits

Author SHA1 Message Date
Alexander Whitestone
8d9e7cbf7e docs: record hermes-agent test finding (#668)
Some checks are pending
Smoke Test / smoke (pull_request) Waiting to run
2026-04-15 00:26:33 -04:00
Alexander Whitestone
85bc612100 test: validate hermes-agent genome (#668) 2026-04-15 00:24:01 -04:00
Alexander Whitestone
9e120888c0 docs: add hermes-agent genome draft (#668) 2026-04-15 00:21:39 -04:00
9 changed files with 542 additions and 1044 deletions

572
GENOME.md
View File

@@ -1,141 +1,485 @@
# GENOME.md — Timmy_Foundation/timmy-home
# GENOME.md — hermes-agent
Generated by `pipelines/codebase_genome.py`.
Repository-wide facts in this document come from two grounded passes over `/Users/apayne/hermes-agent` on 2026-04-15:
- `python3 ~/.hermes/pipelines/codebase-genome.py --path /Users/apayne/hermes-agent --dry-run`
- targeted manual inspection of the core runtime, tooling, gateway, ACP, cron, and persistence modules
This is the Timmy Foundation fork of `hermes-agent`, not a generic upstream summary.
## Project Overview
Timmy Foundation's home repository for development operations and configurations.
`hermes-agent` is a multi-surface AI agent runtime, not just a terminal chatbot.
It combines:
- a rich interactive CLI/TUI
- a synchronous core agent loop
- a large tool registry with terminal, file, web, browser, MCP, memory, cron, delegation, and code-execution tools
- a multi-platform messaging gateway
- ACP editor integration
- an OpenAI-compatible API server
- cron scheduling
- persistent session/memory/state stores
- batch and RL-adjacent research surfaces
- Text files indexed: 3004
- Source and script files: 186
- Test files: 28
- Documentation files: 701
The product promise in `README.md` is that Hermes is a self-improving agent:
- it creates and updates skills
- persists memory across sessions
- searches past conversations
- delegates to subagents
- runs scheduled automations
- can operate through multiple runtime backends and communication surfaces
## Architecture
Grounded quick facts from the analyzed checkout:
- pipeline scan: 395 source files, 561 test files, 11 config files, 331,794 total lines
- Python-only pass: 307 non-test `.py` modules and 561 test Python files
- Python LOC split: 211,709 source LOC / 184,512 test LOC
- current branch: `main`
- current commit: `95d11dfd`
- last commit seen by pipeline: `95d11dfd docs: automation templates gallery + comparison post (#9821)`
- total commits reported by pipeline: 4140
- largest Python modules observed:
- `run_agent.py` — 10,871 LOC
- `cli.py` — 10,017 LOC
- `gateway/run.py` — 9,289 LOC
- `hermes_cli/main.py` — 6,056 LOC
That size profile matters. Hermes is architecturally broad, but a few very large orchestration files still dominate the control plane.
## Architecture Diagram
```mermaid
graph TD
repo_root["repo"]
angband["angband"]
briefings["briefings"]
config["config"]
conftest["conftest"]
evennia["evennia"]
evennia_tools["evennia_tools"]
evolution["evolution"]
gemini_fallback_setup["gemini-fallback-setup"]
heartbeat["heartbeat"]
infrastructure["infrastructure"]
repo_root --> angband
repo_root --> briefings
repo_root --> config
repo_root --> conftest
repo_root --> evennia
repo_root --> evennia_tools
flowchart TD
A[CLI / Gateway / ACP / API / Cron / Batch] --> B[AIAgent in run_agent.py]
B --> C[agent/prompt_builder.py]
B --> D[agent/memory_manager.py]
B --> E[agent/context_compressor.py]
B --> F[model_tools.py]
F --> G[tools/registry.py]
G --> H[tools/*.py built-in tools]
G --> I[tools/mcp_tool.py imported MCP tools]
G --> J[delegate / execute_code / cron / browser / terminal / file tools]
B --> K[hermes_state.py SQLite SessionDB]
B --> L[toolsets.py toolset selection]
M[cli.py + hermes_cli/main.py] --> B
N[gateway/run.py] --> B
O[acp_adapter/server.py] --> B
P[gateway/platforms/api_server.py] --> B
Q[cron/scheduler.py + cron/jobs.py] --> B
R[batch_runner.py] --> B
N --> S[gateway/session.py]
N --> T[gateway/platforms/* adapters]
P --> U[Responses API store]
O --> V[ACP session/event server]
Q --> W[cron job persistence + delivery]
K --> X[state.db / FTS5 search]
S --> Y[sessions.json mapping]
J --> Z[local shell, files, web, browser, subprocesses, remote MCP servers]
```
## Entry Points
## Entry Points and Data Flow
- `gemini-fallback-setup.sh` — operational script (`bash gemini-fallback-setup.sh`)
- `morrowind/hud.sh` — operational script (`bash morrowind/hud.sh`)
- `pipelines/codebase_genome.py` — python main guard (`python3 pipelines/codebase_genome.py`)
- `scripts/auto_restart_agent.sh` — operational script (`bash scripts/auto_restart_agent.sh`)
- `scripts/backup_pipeline.sh` — operational script (`bash scripts/backup_pipeline.sh`)
- `scripts/big_brain_manager.py` — operational script (`python3 scripts/big_brain_manager.py`)
- `scripts/big_brain_repo_audit.py` — operational script (`python3 scripts/big_brain_repo_audit.py`)
- `scripts/codebase_genome_nightly.py` — operational script (`python3 scripts/codebase_genome_nightly.py`)
- `scripts/detect_secrets.py` — operational script (`python3 scripts/detect_secrets.py`)
- `scripts/dynamic_dispatch_optimizer.py` — operational script (`python3 scripts/dynamic_dispatch_optimizer.py`)
- `scripts/emacs-fleet-bridge.py` — operational script (`python3 scripts/emacs-fleet-bridge.py`)
- `scripts/emacs-fleet-poll.sh` — operational script (`bash scripts/emacs-fleet-poll.sh`)
### Primary entry points
## Data Flow
1. `hermes``hermes_cli.main:main`
- canonical CLI entry point
- preloads profile context and builds the argparse/subcommand shell
- hands interactive chat to `cli.py`
1. Operators enter through `gemini-fallback-setup.sh`, `morrowind/hud.sh`, `pipelines/codebase_genome.py`.
2. Core logic fans into top-level components: `angband`, `briefings`, `config`, `conftest`, `evennia`, `evennia_tools`.
3. Validation is incomplete around `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py`, `timmy-local/cache/agent_cache.py`, `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py`, so changes there carry regression risk.
4. Final artifacts land as repository files, docs, or runtime side effects depending on the selected entry point.
2. `hermes-agent``run_agent:main`
- direct runner around the core agent loop
- closest entry point to the raw agent runtime
3. `hermes-acp``acp_adapter.entry:main`
- ACP server for VS Code / Zed / JetBrains style integrations
4. `gateway/run.py`
- async orchestration loop for Telegram, Discord, Slack, WhatsApp, Signal, Matrix, webhook, email, SMS, and other adapters
5. `gateway/platforms/api_server.py`
- OpenAI-compatible HTTP surface
- exposes `/v1/chat/completions`, `/v1/responses`, `/v1/models`, `/v1/runs`, and `/health`
6. `cron/scheduler.py` + `cron/jobs.py`
- scheduled job execution and delivery
7. `batch_runner.py`
- parallel batch trajectory and research workloads
### Core data flow
1. An entry surface receives input:
- terminal prompt
- incoming platform message
- ACP editor request
- HTTP request
- scheduled cron job
- batch input
2. The surface resolves runtime state:
- profile/config
- platform identity
- model/provider settings
- toolset selection
- current session ID and conversation history
3. `run_agent.py` assembles the effective prompt:
- persona/system directives
- platform hints
- context files (`AGENTS.md`, `SOUL.md`, repo-local context)
- skill content
- memory blocks from `agent/memory_manager.py`
- compression summaries from `agent/context_compressor.py`
4. `model_tools.py` discovers and filters tools:
- imports tool modules so they self-register into `tools/registry.py`
- resolves enabled toolsets from `toolsets.py`
- returns tool schemas to the active model provider
5. The model responds with either:
- final assistant text
- tool calls
6. Tool calls are dispatched through:
- `model_tools.py`
- `tools/registry.py`
- the concrete tool handler
7. Tool outputs are appended back into the conversation and the loop continues until a final answer is produced.
8. State is persisted through:
- `hermes_state.py` for sessions/messages/search
- `gateway/session.py` for gateway session routing state
- dedicated stores for response APIs, background processes, and cron jobs
This is a layered architecture: many user-facing surfaces, one central agent runtime, one central tool registry, and several specialized persistence layers.
## Key Abstractions
- `evennia/timmy_world/game.py` — classes `World`:91, `ActionSystem`:421, `TimmyAI`:539, `NPCAI`:550; functions `get_narrative_phase()`:55, `get_phase_transition_event()`:65
- `evennia/timmy_world/world/game.py` — classes `World`:19, `ActionSystem`:326, `TimmyAI`:444, `NPCAI`:455; functions none detected
- `timmy-world/game.py` — classes `World`:19, `ActionSystem`:349, `TimmyAI`:467, `NPCAI`:478; functions none detected
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — classes none detected; functions none detected
- `uniwizard/self_grader.py` — classes `SessionGrade`:23, `WeeklyReport`:55, `SelfGrader`:74; functions `main()`:713
- `uni-wizard/v3/intelligence_engine.py` — classes `ExecutionPattern`:27, `ModelPerformance`:44, `AdaptationEvent`:58, `PatternDatabase`:69; functions none detected
- `scripts/know_thy_father/crossref_audit.py` — classes `ThemeCategory`:30, `Principle`:160, `MeaningKernel`:169, `CrossRefFinding`:178; functions `extract_themes_from_text()`:192, `parse_soul_md()`:206, `parse_kernels()`:264, `cross_reference()`:296, `generate_report()`:440, `main()`:561
- `timmy-local/cache/agent_cache.py` — classes `CacheStats`:28, `LRUCache`:52, `ResponseCache`:94, `ToolCache`:205; functions none detected
### 1. `AIAgent` (`run_agent.py`)
This is the heart of Hermes.
It owns:
- provider/model invocation
- tool-loop orchestration
- prompt assembly
- memory integration
- compression and token budgeting
- final response construction
### 2. `IterationBudget` (`run_agent.py`)
A guardrail abstraction around how much work a turn may do.
It matters because Hermes is not just text generation — it may launch tools, spawn subagents, or recurse through internal workflows.
### 3. `ToolRegistry` / tool self-registration (`tools/registry.py`)
Every major tool advertises itself into a central registry.
That gives Hermes one place to manage:
- schemas
- handlers
- availability checks
- environment requirements
- dispatch behavior
This is a defining architectural trait of the codebase.
### 4. Toolsets (`toolsets.py`)
Tool exposure is not hardcoded per surface.
Instead, Hermes uses named toolsets and platform-specific aliases such as CLI, gateway, ACP, and API-server presets.
This is how one agent runtime can safely shape different operating surfaces.
### 5. `MemoryManager` (`agent/memory_manager.py`)
Hermes supports both built-in memory and external memory providers.
The abstraction here is not “a markdown note” but a memory multiplexor that decides what memory context gets injected and how memory tools behave.
### 6. `ContextCompressor` (`agent/context_compressor.py`)
Compression is a first-class subsystem.
Hermes treats long-context management as part of the runtime architecture, not an afterthought.
### 7. `SessionDB` (`hermes_state.py`)
SQLite + FTS5 session persistence is core infrastructure.
This is what makes cross-session recall, search, billing/accounting, and agent continuity practical.
### 8. `SessionStore` / `SessionContext` (`gateway/session.py`)
The gateway needs a routing abstraction different from raw message history.
It tracks home channels, session keys, reset policy, and platform-specific mapping.
### 9. `HermesACPAgent` (`acp_adapter/server.py`)
ACP is not bolted on as a thin shim.
It wraps Hermes as an editor-native agent with its own session/event lifecycle.
### 10. `ProcessRegistry` (`tools/process_registry.py`)
Long-running background commands are first-class managed resources.
Hermes tracks them explicitly rather than treating subprocesses as disposable side effects.
## API Surface
- CLI: `bash gemini-fallback-setup.sh` — operational script (`gemini-fallback-setup.sh`)
- CLI: `bash morrowind/hud.sh` — operational script (`morrowind/hud.sh`)
- CLI: `python3 pipelines/codebase_genome.py` — python main guard (`pipelines/codebase_genome.py`)
- CLI: `bash scripts/auto_restart_agent.sh` — operational script (`scripts/auto_restart_agent.sh`)
- CLI: `bash scripts/backup_pipeline.sh` — operational script (`scripts/backup_pipeline.sh`)
- CLI: `python3 scripts/big_brain_manager.py` — operational script (`scripts/big_brain_manager.py`)
- CLI: `python3 scripts/big_brain_repo_audit.py` — operational script (`scripts/big_brain_repo_audit.py`)
- CLI: `python3 scripts/codebase_genome_nightly.py` — operational script (`scripts/codebase_genome_nightly.py`)
- Python: `get_narrative_phase()` from `evennia/timmy_world/game.py:55`
- Python: `get_phase_transition_event()` from `evennia/timmy_world/game.py:65`
- Python: `main()` from `uniwizard/self_grader.py:713`
### CLI and shell API
Important surfaces exposed by packaging and command routing:
- `hermes`
- `hermes-agent`
- `hermes-acp`
- subcommands in `hermes_cli/main.py`
- slash commands defined centrally in `hermes_cli/commands.py`
## Test Coverage Report
The slash-command registry is a notable design choice because the same command metadata feeds:
- CLI help
- gateway help
- Telegram bot command menus
- Slack subcommand routing
- autocomplete
- Source and script files inspected: 186
- Test files inspected: 28
- Coverage gaps:
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — no matching test reference detected
- `timmy-local/cache/agent_cache.py` — no matching test reference detected
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py` — no matching test reference detected
- `twitter-archive/multimodal_pipeline.py` — no matching test reference detected
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/godmode_race.py` — no matching test reference detected
- `skills/productivity/google-workspace/scripts/google_api.py` — no matching test reference detected
- `wizards/allegro/home/skills/productivity/google-workspace/scripts/google_api.py` — no matching test reference detected
- `morrowind/pilot.py` — no matching test reference detected
- `morrowind/mcp_server.py` — no matching test reference detected
- `skills/research/domain-intel/scripts/domain_intel.py` — no matching test reference detected
- `wizards/allegro/home/skills/research/domain-intel/scripts/domain_intel.py` — no matching test reference detected
- `timmy-local/scripts/ingest.py` — no matching test reference detected
### HTTP API surface
From `gateway/platforms/api_server.py`, the major routes are:
- `POST /v1/chat/completions`
- `POST /v1/responses`
- `GET /v1/responses/{response_id}`
- `DELETE /v1/responses/{response_id}`
- `GET /v1/models`
- `POST /v1/runs`
- `GET /v1/runs/{run_id}/events`
- `GET /health`
## Security Audit Findings
This makes Hermes usable as an OpenAI-compatible backend for external clients and web UIs.
- [medium] `briefings/briefing_20260325.json:37` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"gitea_error": "Gitea 404: {\"errors\":null,\"message\":\"not found\",\"url\":\"http://143.198.27.163:3000/api/swagger\"}\n [http://143.198.27.163:3000/api/v1/repos/Timmy_Foundation/sovereign-orchestration/issues?state=open&type=issues&sort=created&direction=desc&limit=1&page=1]",`
- [medium] `briefings/briefing_20260328.json:11` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"provider_base_url": "http://localhost:8081/v1",`
- [medium] `briefings/briefing_20260329.json:11` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"provider_base_url": "http://localhost:8081/v1",`
- [medium] `config.yaml:37` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `summary_base_url: http://localhost:11434/v1`
- [medium] `config.yaml:47` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:52` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:57` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:62` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:67` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:77` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:82` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
- [medium] `config.yaml:174` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: http://localhost:11434/v1`
### Messaging platform API surface
The gateway platform abstraction exposes Hermes across many adapters under `gateway/platforms/`.
Observed adapters include:
- Telegram
- Discord
- Slack
- WhatsApp
- Signal
- Matrix
- Home Assistant
- webhook
- email
- SMS
- Mattermost
- QQBot
- WeCom / Weixin
- DingTalk
- BlueBubbles
## Dead Code Candidates
### Tool API surface
The tool surface is broad and central to the product:
- terminal execution
- process management
- file IO / search / patch
- browser automation
- web search/extract
- cron jobs
- memory and session search
- subagent delegation
- execute_code sandbox
- MCP tool import
- TTS / vision / image generation
- smart-home integrations
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — not imported by indexed Python modules and not referenced by tests
- `timmy-local/cache/agent_cache.py` — not imported by indexed Python modules and not referenced by tests
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py` — not imported by indexed Python modules and not referenced by tests
- `twitter-archive/multimodal_pipeline.py` — not imported by indexed Python modules and not referenced by tests
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/godmode_race.py` — not imported by indexed Python modules and not referenced by tests
- `skills/productivity/google-workspace/scripts/google_api.py` — not imported by indexed Python modules and not referenced by tests
- `wizards/allegro/home/skills/productivity/google-workspace/scripts/google_api.py` — not imported by indexed Python modules and not referenced by tests
- `morrowind/pilot.py` — not imported by indexed Python modules and not referenced by tests
- `morrowind/mcp_server.py` — not imported by indexed Python modules and not referenced by tests
- `skills/research/domain-intel/scripts/domain_intel.py` — not imported by indexed Python modules and not referenced by tests
### MCP / ACP surface
Hermes participates on both sides:
- as an MCP client via `tools/mcp_tool.py`
- as an MCP server for messaging/session capabilities via `mcp_serve.py`
- as an ACP server via `acp_adapter/*`
## Performance Bottleneck Analysis
That makes Hermes an orchestration hub, not just a single runtime process.
- `angband/mcp_server.py` — large module (353 lines) likely hides multiple responsibilities
- `evennia/timmy_world/game.py` — large module (1541 lines) likely hides multiple responsibilities
- `evennia/timmy_world/world/game.py` — large module (1345 lines) likely hides multiple responsibilities
- `morrowind/mcp_server.py` — large module (451 lines) likely hides multiple responsibilities
- `morrowind/pilot.py` — large module (459 lines) likely hides multiple responsibilities
- `pipelines/codebase_genome.py` — large module (557 lines) likely hides multiple responsibilities
- `scripts/know_thy_father/crossref_audit.py` — large module (657 lines) likely hides multiple responsibilities
- `scripts/know_thy_father/index_media.py` — large module (405 lines) likely hides multiple responsibilities
- `scripts/know_thy_father/synthesize_kernels.py` — large module (416 lines) likely hides multiple responsibilities
- `scripts/tower_game.py` — large module (395 lines) likely hides multiple responsibilities
## Test Coverage Gaps
### Current observed test posture
A live collection pass on the analyzed checkout produced:
- 11,470 tests collected
- 50 deselected
- 6 collection errors
The collection errors are all ACP-related:
- `tests/acp/test_entry.py`
- `tests/acp/test_events.py`
- `tests/acp/test_mcp_e2e.py`
- `tests/acp/test_permissions.py`
- `tests/acp/test_server.py`
- `tests/acp/test_tools.py`
Root cause from the live run:
- `ModuleNotFoundError: No module named 'acp'`
- equivalently: `ModuleNotFoundError: No module named `acp`` in the failing ACP collection lane
- this lines up with `pyproject.toml`, where ACP support is optional and gated behind the `acp` extra (`agent-client-protocol>=0.9.0,<1.0`)
A secondary signal from collection:
- `tests/tools/test_file_sync_perf.py` emits `PytestUnknownMarkWarning: Unknown pytest.mark.ssh`
This specific collection problem is now tracked in hermes-agent issue `#779`.
### Where coverage looks strong
By file distribution, the codebase is heavily tested around:
- `gateway/`
- `tools/`
- `hermes_cli/`
- `run_agent`
- `cli`
- `agent`
That matches the product center of gravity: runtime orchestration, tool dispatch, and communication surfaces.
### Highest-value remaining gaps
The biggest gaps are not in total test count. They are in critical-path complexity.
1. `run_agent.py`
- the most important file in the repo and also the largest
- likely has broad behavior coverage, but branch-level completeness is improbable at 10k+ LOC
2. `cli.py`
- extremely large UI/orchestration surface
- high risk of hidden regressions across streaming, voice, slash-command routing, and interaction state
3. `gateway/run.py`
- core async gateway brain
- many platform-specific edge cases converge here
4. `hermes_cli/main.py`
- main command shell is huge and mixes parsing, routing, setup, and environment behavior
5. ACP end-to-end coverage under optional dependency installation
- current collection failure proves this lane is environment-sensitive
- ACP deserves a reliable extras-aware CI lane so collection failures are surfaced intentionally, not accidentally
6. `batch_runner.py` and `trajectory_compressor.py`
- research/training surfaces appear lighter and deserve more explicit contract tests
7. cron lifecycle and delivery failure behavior
- `cron/scheduler.py` and `cron/jobs.py` are safety-critical for unattended automation
8. optional or integration-heavy backends
- platform adapters like Feishu / Discord / Telegram
- container/cloud terminal environments
- MCP server interop
- API server streaming edge cases
### Missing tests for critical paths
The next high-leverage test work should target:
- ACP extras-enabled collection and smoke execution
- `run_agent.py` happy-path + interruption + compression + delegate + approval interaction boundaries
- `gateway/run.py` cache/interrupt/restart/session-boundary behavior at integration level
- `cron/scheduler.py` delivery error recovery, stale-job cleanup, and due-job fairness
- `batch_runner.py` and `trajectory_compressor.py` contract tests
- API-server Responses lifecycle and streaming segmentation behavior
## Security Considerations
Hermes is security-sensitive because it can run commands, read files, talk to platforms, call browsers, and broker MCP tools.
The codebase already contains several strong defensive layers.
### 1. Prompt-injection defense for context files
`agent/prompt_builder.py` scans context files such as `AGENTS.md`, `SOUL.md`, and similar instructions for:
- prompt-override language
- hidden comment/HTML tricks
- invisible unicode
- secret exfiltration patterns
That is an important architectural guardrail because Hermes explicitly ingests repository-local instruction files.
### 2. Dangerous-command approval system
`tools/approval.py` centralizes detection of destructive commands and risky shell behavior.
The repo treats command approval as a core policy subsystem, not a UI nicety.
### 3. File-path and device protections
`tools/file_tools.py` blocks dangerous device paths and sensitive system writes.
It also redacts sensitive content in read/search results and blocks reads from internal Hermes-sensitive locations.
### 4. Terminal/workdir sanitization
`tools/terminal_tool.py` constrains workdir handling and shell execution boundaries.
This matters because terminal access is one of the highest-risk capabilities Hermes exposes.
### 5. MCP subprocess hygiene
`tools/mcp_tool.py` filters environment variables passed to MCP servers and strips credentials from surfaced errors.
Given that MCP introduces third-party subprocesses into the tool graph, this is a critical boundary.
### 6. Gateway privacy and pairing controls
Gateway code includes pairing, session routing, and ID-redaction logic.
That is important because Hermes operates across public and semi-public communication surfaces.
### 7. HTTP/API hardening
`gateway/platforms/api_server.py` includes auth, CORS handling, and response-store boundaries.
This makes the API server a real production surface, not just a convenience wrapper.
### 8. Supply-chain awareness
`pyproject.toml` pins many dependencies to constrained ranges and includes security notes for selected packages.
That indicates explicit supply-chain thinking in dependency management.
## Performance Characteristics
### 1. prompt caching is a first-class optimization
Hermes preserves long-lived agent instances and supports provider-specific prompt caching for compatible providers.
That is essential because repeated system prompts and tool schemas are expensive.
### 2. context compression is built into the runtime
Compression is not a manual rescue path only.
Hermes estimates token budgets, prunes old tool noise, and can summarize prior context when needed.
### 3. parallel tool execution exists, but selectively
The runtime can batch safe tool calls in parallel rather than serializing every read-only action.
This improves latency without giving up all control over side effects.
### 4. Async loop reuse reduces orchestration overhead
The runtime avoids constantly recreating event loops for async tools, which matters when many tool calls are issued inside otherwise synchronous agent flows.
### 5. SQLite is tuned for agent workloads
`hermes_state.py` uses WAL mode, short lock windows, and retry logic instead of pretending SQLite is magically contention-free.
This is a sensible tradeoff for sovereign local persistence.
### 6. Background processes are explicitly managed
`ProcessRegistry` maintains output windows, state, and watcher behavior so long-running commands do not become invisible resource leaks.
### 7. Large control-plane files are a real performance and maintenance cost
The repo has broad feature coverage, but a few huge orchestration files dominate complexity:
- `run_agent.py`
- `cli.py`
- `gateway/run.py`
- `hermes_cli/main.py`
These files are not just maintainability debt; they also create higher reasoning and regression load for both humans and agents working in the codebase.
## Critical Modules to Name Explicitly
The following files define the real control plane of Hermes and should always be named in any serious architecture summary:
- `run_agent.py`
- `model_tools.py`
- `tools/registry.py`
- `toolsets.py`
- `cli.py`
- `hermes_cli/main.py`
- `hermes_cli/commands.py`
- `hermes_state.py`
- `agent/prompt_builder.py`
- `agent/context_compressor.py`
- `agent/memory_manager.py`
- `tools/terminal_tool.py`
- `tools/file_tools.py`
- `tools/mcp_tool.py`
- `gateway/run.py`
- `gateway/session.py`
- `gateway/platforms/api_server.py`
- `acp_adapter/server.py`
- `cron/scheduler.py`
- `cron/jobs.py`
- `batch_runner.py`
- `trajectory_compressor.py`
## Practical Takeaway
Hermes Agent is best understood as a sovereign agent operating system.
The CLI, gateway, ACP server, API server, cron scheduler, and tool graph are all frontends onto one core runtime.
The strongest qualities of the codebase are:
- broad feature coverage
- a central tool-registry design
- serious persistence/memory infrastructure
- strong security thinking around prompts, tools, files, and approvals
- a deep test surface across gateway/tools/CLI behavior
The most important risks are:
- extremely large orchestration files
- optional-surface fragility, especially ACP extras and integration-heavy adapters
- under-tested research/batch lanes relative to the core runtime
- growing complexity at the boundaries where multiple surfaces reuse the same agent loop

View File

@@ -1,79 +0,0 @@
# Codebase Genome Pipeline
Issue: `timmy-home#665`
This pipeline gives Timmy a repeatable way to generate a deterministic `GENOME.md` for any repository and rotate through the org nightly.
## What landed
- `pipelines/codebase_genome.py` — static analyzer that writes `GENOME.md`
- `pipelines/codebase-genome.py` — thin CLI wrapper matching the expected pipeline-style entrypoint
- `scripts/codebase_genome_nightly.py` — org-aware nightly runner that selects the next repo, updates a local checkout, and writes the genome artifact
- `GENOME.md` — generated analysis for `timmy-home` itself
## Genome output
Each generated `GENOME.md` includes:
- project overview and repository size metrics
- Mermaid architecture diagram
- entry points and API surface
- data flow summary
- key abstractions from Python source
- test coverage gaps
- security audit findings
- dead code candidates
- performance bottleneck analysis
## Single-repo usage
```bash
python3 pipelines/codebase_genome.py \
--repo-root /path/to/repo \
--repo-name Timmy_Foundation/some-repo \
--output /path/to/repo/GENOME.md
```
The hyphenated wrapper also works:
```bash
python3 pipelines/codebase-genome.py --repo-root /path/to/repo --repo Timmy_Foundation/some-repo
```
## Nightly org rotation
Dry-run the next selection:
```bash
python3 scripts/codebase_genome_nightly.py --dry-run
```
Run one real pass:
```bash
python3 scripts/codebase_genome_nightly.py \
--org Timmy_Foundation \
--workspace-root ~/timmy-foundation-repos \
--output-root ~/.timmy/codebase-genomes \
--state-path ~/.timmy/codebase_genome_state.json
```
Behavior:
1. fetches the current repo list from Gitea
2. selects the next repo after the last recorded run
3. clones or fast-forwards the local checkout
4. writes `GENOME.md` into the configured output tree
5. updates the rotation state file
## Example cron entry
```cron
30 2 * * * cd ~/timmy-home && /usr/bin/env python3 scripts/codebase_genome_nightly.py --org Timmy_Foundation --workspace-root ~/timmy-foundation-repos --output-root ~/.timmy/codebase-genomes --state-path ~/.timmy/codebase_genome_state.json >> ~/.timmy/logs/codebase_genome_nightly.log 2>&1
```
## Limits and follow-ons
- the generator is deterministic and static; it does not hallucinate architecture, but it also does not replace a full human review pass
- nightly rotation handles genome generation; auto-generated test expansion remains a separate follow-on lane
- large repos may still need a second-pass human edit after the initial genome artifact lands

View File

@@ -12,7 +12,6 @@ Quick-reference index for common operational tasks across the Timmy Foundation i
| Check fleet health | fleet-ops | `python3 scripts/fleet_readiness.py` |
| Agent scorecard | fleet-ops | `python3 scripts/agent_scorecard.py` |
| View fleet manifest | fleet-ops | `cat manifest.yaml` |
| Run nightly codebase genome pass | timmy-home | `python3 scripts/codebase_genome_nightly.py --dry-run` |
## the-nexus (Frontend + Brain)

View File

@@ -1 +0,0 @@
"""Codebase genome pipeline helpers."""

View File

@@ -1,6 +0,0 @@
#!/usr/bin/env python3
from codebase_genome import main
if __name__ == "__main__":
main()

View File

@@ -1,557 +0,0 @@
#!/usr/bin/env python3
"""Generate a deterministic GENOME.md for a repository."""
from __future__ import annotations
import argparse
import ast
import os
import re
from pathlib import Path
from typing import NamedTuple
IGNORED_DIRS = {
".git",
".hg",
".svn",
".venv",
"venv",
"node_modules",
"__pycache__",
".mypy_cache",
".pytest_cache",
"dist",
"build",
"coverage",
}
TEXT_SUFFIXES = {
".py",
".js",
".mjs",
".cjs",
".ts",
".tsx",
".jsx",
".html",
".css",
".md",
".txt",
".json",
".yaml",
".yml",
".sh",
".ini",
".cfg",
".toml",
}
SOURCE_SUFFIXES = {".py", ".js", ".mjs", ".cjs", ".ts", ".tsx", ".jsx", ".sh"}
DOC_FILENAMES = {"README.md", "CONTRIBUTING.md", "SOUL.md"}
class RepoFile(NamedTuple):
path: str
abs_path: Path
size_bytes: int
line_count: int
kind: str
class RunSummary(NamedTuple):
markdown: str
source_count: int
test_count: int
doc_count: int
def _is_text_file(path: Path) -> bool:
return path.suffix.lower() in TEXT_SUFFIXES or path.name in {"Dockerfile", "Makefile"}
def _file_kind(rel_path: str, path: Path) -> str:
suffix = path.suffix.lower()
if rel_path.startswith("tests/") or path.name.startswith("test_"):
return "test"
if rel_path.startswith("docs/") or path.name in DOC_FILENAMES or suffix == ".md":
return "doc"
if suffix in {".json", ".yaml", ".yml", ".toml", ".ini", ".cfg"}:
return "config"
if suffix == ".sh":
return "script"
if rel_path.startswith("scripts/") and suffix == ".py" and path.name != "__init__.py":
return "script"
if suffix in SOURCE_SUFFIXES:
return "source"
return "other"
def collect_repo_files(repo_root: str | Path) -> list[RepoFile]:
root = Path(repo_root).resolve()
files: list[RepoFile] = []
for current_root, dirnames, filenames in os.walk(root):
dirnames[:] = sorted(d for d in dirnames if d not in IGNORED_DIRS)
base = Path(current_root)
for filename in sorted(filenames):
path = base / filename
if not _is_text_file(path):
continue
rel_path = path.relative_to(root).as_posix()
text = path.read_text(encoding="utf-8", errors="replace")
files.append(
RepoFile(
path=rel_path,
abs_path=path,
size_bytes=path.stat().st_size,
line_count=max(1, len(text.splitlines())),
kind=_file_kind(rel_path, path),
)
)
return sorted(files, key=lambda item: item.path)
def _safe_text(path: Path) -> str:
return path.read_text(encoding="utf-8", errors="replace")
def _sanitize_node_id(name: str) -> str:
cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
return cleaned or "node"
def _component_name(path: str) -> str:
if "/" in path:
return path.split("/", 1)[0]
return Path(path).stem or path
def _priority_files(files: list[RepoFile], kinds: tuple[str, ...], limit: int = 8) -> list[RepoFile]:
items = [item for item in files if item.kind in kinds]
items.sort(key=lambda item: (-int(item.path.count("/") == 0), -item.line_count, item.path))
return items[:limit]
def _readme_summary(root: Path) -> str:
readme = root / "README.md"
if not readme.exists():
return "Repository-specific overview missing from README.md. Genome generated from code structure and tests."
paragraphs: list[str] = []
current: list[str] = []
for raw_line in _safe_text(readme).splitlines():
line = raw_line.strip()
if not line:
if current:
paragraphs.append(" ".join(current).strip())
current = []
continue
if line.startswith("#"):
continue
current.append(line)
if current:
paragraphs.append(" ".join(current).strip())
return paragraphs[0] if paragraphs else "README.md exists but does not contain a prose overview paragraph."
def _extract_python_imports(text: str) -> set[str]:
try:
tree = ast.parse(text)
except SyntaxError:
return set()
imports: set[str] = set()
for node in ast.walk(tree):
if isinstance(node, ast.Import):
for alias in node.names:
imports.add(alias.name.split(".", 1)[0])
elif isinstance(node, ast.ImportFrom):
if node.module:
imports.add(node.module.split(".", 1)[0])
return imports
def _extract_python_symbols(text: str) -> tuple[list[tuple[str, int]], list[tuple[str, int]]]:
try:
tree = ast.parse(text)
except SyntaxError:
return [], []
classes: list[tuple[str, int]] = []
functions: list[tuple[str, int]] = []
for node in tree.body:
if isinstance(node, ast.ClassDef):
classes.append((node.name, node.lineno))
elif isinstance(node, ast.FunctionDef):
functions.append((node.name, node.lineno))
return classes, functions
def _build_component_edges(files: list[RepoFile]) -> list[tuple[str, str]]:
known_components = {_component_name(item.path) for item in files if item.kind in {"source", "script", "test"}}
edges: set[tuple[str, str]] = set()
for item in files:
if item.kind not in {"source", "script", "test"} or item.abs_path.suffix.lower() != ".py":
continue
src = _component_name(item.path)
imports = _extract_python_imports(_safe_text(item.abs_path))
for imported in imports:
if imported in known_components and imported != src:
edges.add((src, imported))
return sorted(edges)
def _render_mermaid(files: list[RepoFile]) -> str:
components = sorted(
{
_component_name(item.path)
for item in files
if item.kind in {"source", "script", "test", "config"}
and not _component_name(item.path).startswith(".")
}
)
edges = _build_component_edges(files)
lines = ["graph TD"]
if not components:
lines.append(" repo[\"repository\"]")
return "\n".join(lines)
for component in components[:10]:
node_id = _sanitize_node_id(component)
lines.append(f" {node_id}[\"{component}\"]")
seen_components = set(components[:10])
emitted = False
for src, dst in edges:
if src in seen_components and dst in seen_components:
lines.append(f" {_sanitize_node_id(src)} --> {_sanitize_node_id(dst)}")
emitted = True
if not emitted:
root_id = "repo_root"
lines.insert(1, f" {root_id}[\"repo\"]")
for component in components[:6]:
lines.append(f" {root_id} --> {_sanitize_node_id(component)}")
return "\n".join(lines)
def _entry_points(files: list[RepoFile]) -> list[dict[str, str]]:
points: list[dict[str, str]] = []
for item in files:
text = _safe_text(item.abs_path)
if item.kind == "script":
points.append({"path": item.path, "reason": "operational script", "command": f"python3 {item.path}" if item.abs_path.suffix == ".py" else f"bash {item.path}"})
continue
if item.abs_path.suffix == ".py" and "if __name__ == '__main__':" in text:
points.append({"path": item.path, "reason": "python main guard", "command": f"python3 {item.path}"})
elif item.path in {"app.py", "server.py", "main.py"}:
points.append({"path": item.path, "reason": "top-level executable", "command": f"python3 {item.path}"})
seen: set[str] = set()
deduped: list[dict[str, str]] = []
for point in points:
if point["path"] in seen:
continue
seen.add(point["path"])
deduped.append(point)
return deduped[:12]
def _test_coverage(files: list[RepoFile]) -> tuple[list[RepoFile], list[RepoFile], list[RepoFile]]:
source_files = [
item
for item in files
if item.kind in {"source", "script"}
and item.path not in {"pipelines/codebase-genome.py", "pipelines/codebase_genome.py"}
and not item.path.endswith("/__init__.py")
]
test_files = [item for item in files if item.kind == "test"]
combined_test_text = "\n".join(_safe_text(item.abs_path) for item in test_files)
entry_paths = {point["path"] for point in _entry_points(files)}
gaps: list[RepoFile] = []
for item in source_files:
stem = item.abs_path.stem
if item.path in entry_paths:
continue
if stem and stem in combined_test_text:
continue
gaps.append(item)
gaps.sort(key=lambda item: (-item.line_count, item.path))
return source_files, test_files, gaps
def _security_findings(files: list[RepoFile]) -> list[dict[str, str]]:
rules = [
("high", "shell execution", re.compile(r"shell\s*=\s*True"), "shell=True expands blast radius for command execution"),
("high", "dynamic evaluation", re.compile(r"\b(eval|exec)\s*\("), "dynamic evaluation bypasses static guarantees"),
("medium", "unsafe deserialization", re.compile(r"pickle\.load\(|yaml\.load\("), "deserialization of untrusted data can execute code"),
("medium", "network egress", re.compile(r"urllib\.request\.urlopen\(|requests\.(get|post|put|delete)\("), "outbound network calls create runtime dependency and failure surface"),
("medium", "hardcoded http endpoint", re.compile(r"http://[^\s\"']+"), "plaintext or fixed HTTP endpoints can drift or leak across environments"),
]
findings: list[dict[str, str]] = []
for item in files:
if item.kind not in {"source", "script", "config"}:
continue
for lineno, line in enumerate(_safe_text(item.abs_path).splitlines(), start=1):
for severity, category, pattern, detail in rules:
if pattern.search(line):
findings.append(
{
"severity": severity,
"category": category,
"ref": f"{item.path}:{lineno}",
"line": line.strip(),
"detail": detail,
}
)
break
if len(findings) >= 12:
return findings
return findings
def _dead_code_candidates(files: list[RepoFile]) -> list[RepoFile]:
source_files = [item for item in files if item.kind in {"source", "script"} and item.abs_path.suffix == ".py"]
imports_by_file = {
item.path: _extract_python_imports(_safe_text(item.abs_path))
for item in source_files
}
imported_names = {name for imports in imports_by_file.values() for name in imports}
referenced_by_tests = "\n".join(_safe_text(item.abs_path) for item in files if item.kind == "test")
entry_paths = {point["path"] for point in _entry_points(files)}
candidates: list[RepoFile] = []
for item in source_files:
stem = item.abs_path.stem
if item.path in entry_paths:
continue
if stem in imported_names:
continue
if stem in referenced_by_tests:
continue
if stem in {"__init__", "conftest"}:
continue
candidates.append(item)
candidates.sort(key=lambda item: (-item.line_count, item.path))
return candidates[:10]
def _performance_findings(files: list[RepoFile]) -> list[dict[str, str]]:
findings: list[dict[str, str]] = []
for item in files:
if item.kind in {"source", "script"} and item.line_count >= 350:
findings.append({
"ref": item.path,
"detail": f"large module ({item.line_count} lines) likely hides multiple responsibilities",
})
for item in files:
if item.kind not in {"source", "script"}:
continue
text = _safe_text(item.abs_path)
if "os.walk(" in text or ".rglob(" in text or "glob.glob(" in text:
findings.append({
"ref": item.path,
"detail": "per-run filesystem scan detected; performance scales with repo size",
})
if "urllib.request.urlopen(" in text or "requests.get(" in text or "requests.post(" in text:
findings.append({
"ref": item.path,
"detail": "network-bound execution path can dominate runtime and create flaky throughput",
})
deduped: list[dict[str, str]] = []
seen: set[tuple[str, str]] = set()
for finding in findings:
key = (finding["ref"], finding["detail"])
if key in seen:
continue
seen.add(key)
deduped.append(finding)
return deduped[:10]
def _key_abstractions(files: list[RepoFile]) -> list[dict[str, object]]:
abstractions: list[dict[str, object]] = []
for item in _priority_files(files, ("source", "script"), limit=10):
if item.abs_path.suffix != ".py":
continue
classes, functions = _extract_python_symbols(_safe_text(item.abs_path))
if not classes and not functions:
continue
abstractions.append(
{
"path": item.path,
"classes": classes[:4],
"functions": [entry for entry in functions[:6] if not entry[0].startswith("_")],
}
)
return abstractions[:8]
def _api_surface(entry_points: list[dict[str, str]], abstractions: list[dict[str, object]]) -> list[str]:
api_lines: list[str] = []
for entry in entry_points[:8]:
api_lines.append(f"- CLI: `{entry['command']}` — {entry['reason']} (`{entry['path']}`)")
for abstraction in abstractions[:5]:
for func_name, lineno in abstraction["functions"]:
api_lines.append(f"- Python: `{func_name}()` from `{abstraction['path']}:{lineno}`")
if len(api_lines) >= 14:
return api_lines
return api_lines
def _data_flow(entry_points: list[dict[str, str]], files: list[RepoFile], gaps: list[RepoFile]) -> list[str]:
components = sorted(
{
_component_name(item.path)
for item in files
if item.kind in {"source", "script", "test", "config"} and not _component_name(item.path).startswith(".")
}
)
lines = []
if entry_points:
lines.append(f"1. Operators enter through {', '.join(f'`{item['path']}`' for item in entry_points[:3])}.")
else:
lines.append("1. No explicit CLI/main guard entry point was detected; execution appears library- or doc-driven.")
if components:
lines.append(f"2. Core logic fans into top-level components: {', '.join(f'`{name}`' for name in components[:6])}.")
if gaps:
lines.append(f"3. Validation is incomplete around {', '.join(f'`{item.path}`' for item in gaps[:3])}, so changes there carry regression risk.")
else:
lines.append("3. Tests appear to reference the currently indexed source set, reducing blind spots in the hot path.")
lines.append("4. Final artifacts land as repository files, docs, or runtime side effects depending on the selected entry point.")
return lines
def generate_genome_markdown(repo_root: str | Path, repo_name: str | None = None) -> str:
root = Path(repo_root).resolve()
files = collect_repo_files(root)
repo_display = repo_name or root.name
summary = _readme_summary(root)
entry_points = _entry_points(files)
source_files, test_files, coverage_gaps = _test_coverage(files)
security = _security_findings(files)
dead_code = _dead_code_candidates(files)
performance = _performance_findings(files)
abstractions = _key_abstractions(files)
api_surface = _api_surface(entry_points, abstractions)
data_flow = _data_flow(entry_points, files, coverage_gaps)
mermaid = _render_mermaid(files)
lines: list[str] = [
f"# GENOME.md — {repo_display}",
"",
"Generated by `pipelines/codebase_genome.py`.",
"",
"## Project Overview",
"",
summary,
"",
f"- Text files indexed: {len(files)}",
f"- Source and script files: {len(source_files)}",
f"- Test files: {len(test_files)}",
f"- Documentation files: {len([item for item in files if item.kind == 'doc'])}",
"",
"## Architecture",
"",
"```mermaid",
mermaid,
"```",
"",
"## Entry Points",
"",
]
if entry_points:
for item in entry_points:
lines.append(f"- `{item['path']}` — {item['reason']} (`{item['command']}`)")
else:
lines.append("- No explicit entry point detected.")
lines.extend(["", "## Data Flow", ""])
lines.extend(data_flow)
lines.extend(["", "## Key Abstractions", ""])
if abstractions:
for abstraction in abstractions:
path = abstraction["path"]
classes = abstraction["classes"]
functions = abstraction["functions"]
class_bits = ", ".join(f"`{name}`:{lineno}" for name, lineno in classes) or "none detected"
function_bits = ", ".join(f"`{name}()`:{lineno}" for name, lineno in functions) or "none detected"
lines.append(f"- `{path}` — classes {class_bits}; functions {function_bits}")
else:
lines.append("- No Python classes or top-level functions detected in the highest-priority source files.")
lines.extend(["", "## API Surface", ""])
if api_surface:
lines.extend(api_surface)
else:
lines.append("- No obvious public API surface detected.")
lines.extend(["", "## Test Coverage Report", ""])
lines.append(f"- Source and script files inspected: {len(source_files)}")
lines.append(f"- Test files inspected: {len(test_files)}")
if coverage_gaps:
lines.append("- Coverage gaps:")
for item in coverage_gaps[:12]:
lines.append(f" - `{item.path}` — no matching test reference detected")
else:
lines.append("- No obvious coverage gaps detected by the stem-matching heuristic.")
lines.extend(["", "## Security Audit Findings", ""])
if security:
for finding in security:
lines.append(
f"- [{finding['severity']}] `{finding['ref']}` — {finding['category']}: {finding['detail']}. Evidence: `{finding['line']}`"
)
else:
lines.append("- No high-signal security findings detected by the static heuristics in this pass.")
lines.extend(["", "## Dead Code Candidates", ""])
if dead_code:
for item in dead_code:
lines.append(f"- `{item.path}` — not imported by indexed Python modules and not referenced by tests")
else:
lines.append("- No obvious dead-code candidates detected.")
lines.extend(["", "## Performance Bottleneck Analysis", ""])
if performance:
for finding in performance:
lines.append(f"- `{finding['ref']}` — {finding['detail']}")
else:
lines.append("- No obvious performance hotspots detected by the static heuristics in this pass.")
return "\n".join(lines).rstrip() + "\n"
def write_genome(repo_root: str | Path, repo_name: str | None = None, output_path: str | Path | None = None) -> RunSummary:
root = Path(repo_root).resolve()
markdown = generate_genome_markdown(root, repo_name=repo_name)
out_path = Path(output_path) if output_path else root / "GENOME.md"
out_path.parent.mkdir(parents=True, exist_ok=True)
out_path.write_text(markdown, encoding="utf-8")
files = collect_repo_files(root)
source_files, test_files, _ = _test_coverage(files)
return RunSummary(
markdown=markdown,
source_count=len(source_files),
test_count=len(test_files),
doc_count=len([item for item in files if item.kind == "doc"]),
)
def main() -> None:
parser = argparse.ArgumentParser(description="Generate a deterministic GENOME.md for a repository")
parser.add_argument("--repo-root", required=True, help="Path to the repository to analyze")
parser.add_argument("--repo", dest="repo_name", default=None, help="Optional repo display name")
parser.add_argument("--repo-name", dest="repo_name_override", default=None, help="Optional repo display name")
parser.add_argument("--output", default=None, help="Path to write GENOME.md (defaults to <repo-root>/GENOME.md)")
args = parser.parse_args()
repo_name = args.repo_name_override or args.repo_name
summary = write_genome(args.repo_root, repo_name=repo_name, output_path=args.output)
target = Path(args.output) if args.output else Path(args.repo_root).resolve() / "GENOME.md"
print(
f"GENOME.md saved to {target} "
f"(sources={summary.source_count}, tests={summary.test_count}, docs={summary.doc_count})"
)
if __name__ == "__main__":
main()

View File

@@ -1,171 +0,0 @@
#!/usr/bin/env python3
"""Nightly runner for the codebase genome pipeline."""
from __future__ import annotations
import argparse
import json
import os
import subprocess
import sys
import urllib.request
from pathlib import Path
from typing import NamedTuple
class RunPlan(NamedTuple):
repo: dict
repo_dir: Path
output_path: Path
command: list[str]
def load_state(path: Path) -> dict:
if not path.exists():
return {}
return json.loads(path.read_text(encoding="utf-8"))
def save_state(path: Path, state: dict) -> None:
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(state, indent=2, sort_keys=True), encoding="utf-8")
def select_next_repo(repos: list[dict], state: dict) -> dict:
if not repos:
raise ValueError("no repositories available for nightly genome run")
ordered = sorted(repos, key=lambda item: item.get("full_name", item.get("name", "")).lower())
last_repo = state.get("last_repo")
for index, repo in enumerate(ordered):
if repo.get("name") == last_repo or repo.get("full_name") == last_repo:
return ordered[(index + 1) % len(ordered)]
last_index = int(state.get("last_index", -1))
return ordered[(last_index + 1) % len(ordered)]
def build_run_plan(repo: dict, workspace_root: Path, output_root: Path, pipeline_script: Path) -> RunPlan:
repo_dir = workspace_root / repo["name"]
output_path = output_root / repo["name"] / "GENOME.md"
command = [
sys.executable,
str(pipeline_script),
"--repo-root",
str(repo_dir),
"--repo-name",
repo.get("full_name", repo["name"]),
"--output",
str(output_path),
]
return RunPlan(repo=repo, repo_dir=repo_dir, output_path=output_path, command=command)
def fetch_org_repos(org: str, host: str, token_file: Path, include_archived: bool = False) -> list[dict]:
token = token_file.read_text(encoding="utf-8").strip()
page = 1
repos: list[dict] = []
while True:
req = urllib.request.Request(
f"{host.rstrip('/')}/api/v1/orgs/{org}/repos?limit=100&page={page}",
headers={"Authorization": f"token {token}", "Accept": "application/json"},
)
with urllib.request.urlopen(req, timeout=30) as resp:
chunk = json.loads(resp.read().decode("utf-8"))
if not chunk:
break
for item in chunk:
if item.get("archived") and not include_archived:
continue
repos.append(
{
"name": item["name"],
"full_name": item["full_name"],
"clone_url": item["clone_url"],
"default_branch": item.get("default_branch") or "main",
}
)
page += 1
return repos
def _authenticated_clone_url(clone_url: str, token_file: Path) -> str:
token = token_file.read_text(encoding="utf-8").strip()
if clone_url.startswith("https://"):
return f"https://{token}@{clone_url[len('https://') :]}"
return clone_url
def ensure_checkout(repo: dict, workspace_root: Path, token_file: Path) -> Path:
workspace_root.mkdir(parents=True, exist_ok=True)
repo_dir = workspace_root / repo["name"]
branch = repo.get("default_branch") or "main"
clone_url = _authenticated_clone_url(repo["clone_url"], token_file)
if (repo_dir / ".git").exists():
subprocess.run(["git", "-C", str(repo_dir), "fetch", "origin", branch, "--depth", "1"], check=True)
subprocess.run(["git", "-C", str(repo_dir), "checkout", branch], check=True)
subprocess.run(["git", "-C", str(repo_dir), "reset", "--hard", f"origin/{branch}"], check=True)
else:
subprocess.run(
["git", "clone", "--depth", "1", "--single-branch", "--branch", branch, clone_url, str(repo_dir)],
check=True,
)
return repo_dir
def run_plan(plan: RunPlan) -> None:
plan.output_path.parent.mkdir(parents=True, exist_ok=True)
subprocess.run(plan.command, check=True)
def main() -> None:
parser = argparse.ArgumentParser(description="Run one nightly codebase genome pass for the next repo in an org")
parser.add_argument("--org", default="Timmy_Foundation")
parser.add_argument("--host", default="https://forge.alexanderwhitestone.com")
parser.add_argument("--token-file", default=os.path.expanduser("~/.config/gitea/token"))
parser.add_argument("--workspace-root", default=os.path.expanduser("~/timmy-foundation-repos"))
parser.add_argument("--output-root", default=os.path.expanduser("~/.timmy/codebase-genomes"))
parser.add_argument("--state-path", default=os.path.expanduser("~/.timmy/codebase_genome_state.json"))
parser.add_argument("--pipeline-script", default=str(Path(__file__).resolve().parents[1] / "pipelines" / "codebase_genome.py"))
parser.add_argument("--include-archived", action="store_true")
parser.add_argument("--dry-run", action="store_true")
args = parser.parse_args()
token_file = Path(args.token_file).expanduser()
workspace_root = Path(args.workspace_root).expanduser()
output_root = Path(args.output_root).expanduser()
state_path = Path(args.state_path).expanduser()
pipeline_script = Path(args.pipeline_script).expanduser()
repos = fetch_org_repos(args.org, args.host, token_file, include_archived=args.include_archived)
state = load_state(state_path)
repo = select_next_repo(repos, state)
plan = build_run_plan(repo, workspace_root=workspace_root, output_root=output_root, pipeline_script=pipeline_script)
if args.dry_run:
print(
json.dumps(
{
"repo": repo,
"repo_dir": str(plan.repo_dir),
"output_path": str(plan.output_path),
"command": plan.command,
},
indent=2,
)
)
return
ensure_checkout(repo, workspace_root=workspace_root, token_file=token_file)
run_plan(plan)
save_state(
state_path,
{
"last_index": sorted(repos, key=lambda item: item.get("full_name", item.get("name", "")).lower()).index(repo),
"last_repo": repo.get("name"),
},
)
print(f"Completed genome run for {repo['full_name']} -> {plan.output_path}")
if __name__ == "__main__":
main()

View File

@@ -1,115 +0,0 @@
from __future__ import annotations
import importlib.util
from pathlib import Path
ROOT = Path(__file__).resolve().parents[1]
PIPELINE_PATH = ROOT / "pipelines" / "codebase_genome.py"
NIGHTLY_PATH = ROOT / "scripts" / "codebase_genome_nightly.py"
GENOME_PATH = ROOT / "GENOME.md"
def _load_module(path: Path, name: str):
assert path.exists(), f"missing {path.relative_to(ROOT)}"
spec = importlib.util.spec_from_file_location(name, path)
assert spec and spec.loader
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
def test_generate_genome_markdown_contains_required_sections(tmp_path: Path) -> None:
genome_mod = _load_module(PIPELINE_PATH, "codebase_genome")
repo = tmp_path / "repo"
(repo / "tests").mkdir(parents=True)
(repo / "README.md").write_text("# Demo Repo\n\nA tiny example repo.\n")
(repo / "app.py").write_text(
"import module\n\n"
"def main():\n"
" return module.Helper().answer()\n\n"
"if __name__ == '__main__':\n"
" raise SystemExit(main())\n"
)
(repo / "module.py").write_text(
"class Helper:\n"
" def answer(self):\n"
" return 42\n"
)
(repo / "dangerous.py").write_text(
"import subprocess\n\n"
"def run_shell(cmd):\n"
" return subprocess.run(cmd, shell=True, check=False)\n"
)
(repo / "extra.py").write_text("VALUE = 7\n")
(repo / "tests" / "test_app.py").write_text(
"from app import main\n\n"
"def test_main():\n"
" assert main() == 42\n"
)
genome = genome_mod.generate_genome_markdown(repo, repo_name="org/repo")
for heading in (
"# GENOME.md — org/repo",
"## Project Overview",
"## Architecture",
"```mermaid",
"## Entry Points",
"## Data Flow",
"## Key Abstractions",
"## API Surface",
"## Test Coverage Report",
"## Security Audit Findings",
"## Dead Code Candidates",
"## Performance Bottleneck Analysis",
):
assert heading in genome
assert "app.py" in genome
assert "module.py" in genome
assert "dangerous.py" in genome
assert "extra.py" in genome
assert "shell=True" in genome
def test_nightly_runner_rotates_repos_and_builds_plan() -> None:
nightly_mod = _load_module(NIGHTLY_PATH, "codebase_genome_nightly")
repos = [
{"name": "alpha", "full_name": "Timmy_Foundation/alpha", "clone_url": "https://example/alpha.git"},
{"name": "beta", "full_name": "Timmy_Foundation/beta", "clone_url": "https://example/beta.git"},
]
state = {"last_index": 0, "last_repo": "alpha"}
next_repo = nightly_mod.select_next_repo(repos, state)
assert next_repo["name"] == "beta"
plan = nightly_mod.build_run_plan(
repo=next_repo,
workspace_root=Path("/tmp/repos"),
output_root=Path("/tmp/genomes"),
pipeline_script=Path("/tmp/timmy-home/pipelines/codebase_genome.py"),
)
assert plan.repo_dir == Path("/tmp/repos/beta")
assert plan.output_path == Path("/tmp/genomes/beta/GENOME.md")
assert "codebase_genome.py" in plan.command[1]
assert plan.command[-1] == "/tmp/genomes/beta/GENOME.md"
def test_repo_contains_generated_timmy_home_genome() -> None:
assert GENOME_PATH.exists(), "missing generated GENOME.md for timmy-home"
text = GENOME_PATH.read_text(encoding="utf-8")
for snippet in (
"# GENOME.md — Timmy_Foundation/timmy-home",
"## Project Overview",
"## Architecture",
"## Entry Points",
"## API Surface",
"## Test Coverage Report",
"## Security Audit Findings",
"## Performance Bottleneck Analysis",
):
assert snippet in text

View File

@@ -0,0 +1,84 @@
from pathlib import Path
GENOME = Path('GENOME.md')
def read_genome() -> str:
assert GENOME.exists(), 'GENOME.md must exist at repo root'
return GENOME.read_text(encoding='utf-8')
def test_genome_exists():
assert GENOME.exists(), 'GENOME.md must exist at repo root'
def test_genome_has_required_sections():
text = read_genome()
for heading in [
'# GENOME.md — hermes-agent',
'## Project Overview',
'## Architecture Diagram',
'## Entry Points and Data Flow',
'## Key Abstractions',
'## API Surface',
'## Test Coverage Gaps',
'## Security Considerations',
'## Performance Characteristics',
'## Critical Modules to Name Explicitly',
]:
assert heading in text
def test_genome_contains_mermaid_diagram():
text = read_genome()
assert '```mermaid' in text
assert 'flowchart TD' in text
def test_genome_mentions_control_plane_modules():
text = read_genome()
for token in [
'run_agent.py',
'model_tools.py',
'tools/registry.py',
'toolsets.py',
'cli.py',
'hermes_cli/main.py',
'hermes_state.py',
'gateway/run.py',
'acp_adapter/server.py',
'cron/scheduler.py',
]:
assert token in text
def test_genome_mentions_test_gap_and_collection_findings():
text = read_genome()
for token in [
'11,470 tests collected',
'6 collection errors',
'ModuleNotFoundError: No module named `acp`',
'trajectory_compressor.py',
'batch_runner.py',
]:
assert token in text
def test_genome_mentions_security_and_performance_layers():
text = read_genome()
for token in [
'prompt_builder.py',
'approval.py',
'file_tools.py',
'mcp_tool.py',
'WAL mode',
'prompt caching',
'context compression',
'parallel tool execution',
]:
assert token in text
def test_genome_is_substantial():
text = read_genome()
assert len(text) >= 10000