timmy-home/GENOME.md

# GENOME.md — hermes-agent

Repository-wide facts in this document come from two grounded passes over `/Users/apayne/hermes-agent` on 2026-04-15:
- `python3 ~/.hermes/pipelines/codebase-genome.py --path /Users/apayne/hermes-agent --dry-run`
- targeted manual inspection of the core runtime, tooling, gateway, ACP, cron, and persistence modules

This is the Timmy Foundation fork of `hermes-agent`, not a generic upstream summary.

## Project Overview

`hermes-agent` is a multi-surface AI agent runtime, not just a terminal chatbot.
It combines:
- a rich interactive CLI/TUI
- a synchronous core agent loop
- a large tool registry with terminal, file, web, browser, MCP, memory, cron, delegation, and code-execution tools
- a multi-platform messaging gateway
- ACP editor integration
- an OpenAI-compatible API server
- cron scheduling
- persistent session/memory/state stores
- batch and RL-adjacent research surfaces

The product promise in `README.md` is that Hermes is a self-improving agent:
- it creates and updates skills
- persists memory across sessions
- searches past conversations
- delegates to subagents
- runs scheduled automations
- can operate through multiple runtime backends and communication surfaces

Grounded quick facts from the analyzed checkout:
- pipeline scan: 395 source files, 561 test files, 11 config files, 331,794 total lines
- Python-only pass: 307 non-test `.py` modules and 561 test Python files
- Python LOC split: 211,709 source LOC / 184,512 test LOC
- current branch: `main`
- current commit: `95d11dfd`
- last commit seen by pipeline: `95d11dfd docs: automation templates gallery + comparison post (#9821)`
- total commits reported by pipeline: 4140
- largest Python modules observed:
  - `run_agent.py` — 10,871 LOC
  - `cli.py` — 10,017 LOC
  - `gateway/run.py` — 9,289 LOC
  - `hermes_cli/main.py` — 6,056 LOC

That size profile matters. Hermes is architecturally broad, but a few very large orchestration files still dominate the control plane.

## Architecture Diagram

```mermaid
flowchart TD
    A[CLI / Gateway / ACP / API / Cron / Batch] --> B[AIAgent in run_agent.py]
    B --> C[agent/prompt_builder.py]
    B --> D[agent/memory_manager.py]
    B --> E[agent/context_compressor.py]
    B --> F[model_tools.py]

    F --> G[tools/registry.py]
    G --> H[tools/*.py built-in tools]
    G --> I[tools/mcp_tool.py imported MCP tools]
    G --> J[delegate / execute_code / cron / browser / terminal / file tools]

    B --> K[hermes_state.py SQLite SessionDB]
    B --> L[toolsets.py toolset selection]

    M[cli.py + hermes_cli/main.py] --> B
    N[gateway/run.py] --> B
    O[acp_adapter/server.py] --> B
    P[gateway/platforms/api_server.py] --> B
    Q[cron/scheduler.py + cron/jobs.py] --> B
    R[batch_runner.py] --> B

    N --> S[gateway/session.py]
    N --> T[gateway/platforms/* adapters]
    P --> U[Responses API store]
    O --> V[ACP session/event server]
    Q --> W[cron job persistence + delivery]

    K --> X[state.db / FTS5 search]
    S --> Y[sessions.json mapping]
    J --> Z[local shell, files, web, browser, subprocesses, remote MCP servers]
```

## Entry Points and Data Flow

### Primary entry points

1. `hermes` → `hermes_cli.main:main`
   - canonical CLI entry point
   - preloads profile context and builds the argparse/subcommand shell
   - hands interactive chat to `cli.py`

2. `hermes-agent` → `run_agent:main`
   - direct runner around the core agent loop
   - closest entry point to the raw agent runtime

3. `hermes-acp` → `acp_adapter.entry:main`
   - ACP server for VS Code / Zed / JetBrains style integrations

4. `gateway/run.py`
   - async orchestration loop for Telegram, Discord, Slack, WhatsApp, Signal, Matrix, webhook, email, SMS, and other adapters

5. `gateway/platforms/api_server.py`
   - OpenAI-compatible HTTP surface
   - exposes `/v1/chat/completions`, `/v1/responses`, `/v1/models`, `/v1/runs`, and `/health`

6. `cron/scheduler.py` + `cron/jobs.py`
   - scheduled job execution and delivery

7. `batch_runner.py`
   - parallel batch trajectory and research workloads

### Core data flow

1. An entry surface receives input:
   - terminal prompt
   - incoming platform message
   - ACP editor request
   - HTTP request
   - scheduled cron job
   - batch input

2. The surface resolves runtime state:
   - profile/config
   - platform identity
   - model/provider settings
   - toolset selection
   - current session ID and conversation history

3. `run_agent.py` assembles the effective prompt:
   - persona/system directives
   - platform hints
   - context files (`AGENTS.md`, `SOUL.md`, repo-local context)
   - skill content
   - memory blocks from `agent/memory_manager.py`
   - compression summaries from `agent/context_compressor.py`

4. `model_tools.py` discovers and filters tools:
   - imports tool modules so they self-register into `tools/registry.py`
   - resolves enabled toolsets from `toolsets.py`
   - returns tool schemas to the active model provider

5. The model responds with either:
   - final assistant text
   - tool calls

6. Tool calls are dispatched through:
   - `model_tools.py`
   - `tools/registry.py`
   - the concrete tool handler

7. Tool outputs are appended back into the conversation and the loop continues until a final answer is produced.

8. State is persisted through:
   - `hermes_state.py` for sessions/messages/search
   - `gateway/session.py` for gateway session routing state
   - dedicated stores for response APIs, background processes, and cron jobs

This is a layered architecture: many user-facing surfaces, one central agent runtime, one central tool registry, and several specialized persistence layers.

## Key Abstractions

### 1. `AIAgent` (`run_agent.py`)
This is the heart of Hermes.
It owns:
- provider/model invocation
- tool-loop orchestration
- prompt assembly
- memory integration
- compression and token budgeting
- final response construction

### 2. `IterationBudget` (`run_agent.py`)
A guardrail abstraction around how much work a turn may do.
It matters because Hermes is not just text generation — it may launch tools, spawn subagents, or recurse through internal workflows.

### 3. `ToolRegistry` / tool self-registration (`tools/registry.py`)
Every major tool advertises itself into a central registry.
That gives Hermes one place to manage:
- schemas
- handlers
- availability checks
- environment requirements
- dispatch behavior

This is a defining architectural trait of the codebase.

### 4. Toolsets (`toolsets.py`)
Tool exposure is not hardcoded per surface.
Instead, Hermes uses named toolsets and platform-specific aliases such as CLI, gateway, ACP, and API-server presets.
This is how one agent runtime can safely shape different operating surfaces.

### 5. `MemoryManager` (`agent/memory_manager.py`)
Hermes supports both built-in memory and external memory providers.
The abstraction here is not “a markdown note” but a memory multiplexor that decides what memory context gets injected and how memory tools behave.

### 6. `ContextCompressor` (`agent/context_compressor.py`)
Compression is a first-class subsystem.
Hermes treats long-context management as part of the runtime architecture, not an afterthought.

### 7. `SessionDB` (`hermes_state.py`)
SQLite + FTS5 session persistence is core infrastructure.
This is what makes cross-session recall, search, billing/accounting, and agent continuity practical.

### 8. `SessionStore` / `SessionContext` (`gateway/session.py`)
The gateway needs a routing abstraction different from raw message history.
It tracks home channels, session keys, reset policy, and platform-specific mapping.

### 9. `HermesACPAgent` (`acp_adapter/server.py`)
ACP is not bolted on as a thin shim.
It wraps Hermes as an editor-native agent with its own session/event lifecycle.

### 10. `ProcessRegistry` (`tools/process_registry.py`)
Long-running background commands are first-class managed resources.
Hermes tracks them explicitly rather than treating subprocesses as disposable side effects.

## API Surface

### CLI and shell API
Important surfaces exposed by packaging and command routing:
- `hermes`
- `hermes-agent`
- `hermes-acp`
- subcommands in `hermes_cli/main.py`
- slash commands defined centrally in `hermes_cli/commands.py`

The slash-command registry is a notable design choice because the same command metadata feeds:
- CLI help
- gateway help
- Telegram bot command menus
- Slack subcommand routing
- autocomplete

### HTTP API surface
From `gateway/platforms/api_server.py`, the major routes are:
- `POST /v1/chat/completions`
- `POST /v1/responses`
- `GET /v1/responses/{response_id}`
- `DELETE /v1/responses/{response_id}`
- `GET /v1/models`
- `POST /v1/runs`
- `GET /v1/runs/{run_id}/events`
- `GET /health`

This makes Hermes usable as an OpenAI-compatible backend for external clients and web UIs.

### Messaging platform API surface
The gateway platform abstraction exposes Hermes across many adapters under `gateway/platforms/`.
Observed adapters include:
- Telegram
- Discord
- Slack
- WhatsApp
- Signal
- Matrix
- Home Assistant
- webhook
- email
- SMS
- Mattermost
- QQBot
- WeCom / Weixin
- DingTalk
- BlueBubbles

### Tool API surface
The tool surface is broad and central to the product:
- terminal execution
- process management
- file IO / search / patch
- browser automation
- web search/extract
- cron jobs
- memory and session search
- subagent delegation
- execute_code sandbox
- MCP tool import
- TTS / vision / image generation
- smart-home integrations

### MCP / ACP surface
Hermes participates on both sides:
- as an MCP client via `tools/mcp_tool.py`
- as an MCP server for messaging/session capabilities via `mcp_serve.py`
- as an ACP server via `acp_adapter/*`

That makes Hermes an orchestration hub, not just a single runtime process.

## Test Coverage Gaps

### Current observed test posture
A live collection pass on the analyzed checkout produced:
- 11,470 tests collected
- 50 deselected
- 6 collection errors

The collection errors are all ACP-related:
- `tests/acp/test_entry.py`
- `tests/acp/test_events.py`
- `tests/acp/test_mcp_e2e.py`
- `tests/acp/test_permissions.py`
- `tests/acp/test_server.py`
- `tests/acp/test_tools.py`

Root cause from the live run:
- `ModuleNotFoundError: No module named 'acp'`
- equivalently: `ModuleNotFoundError: No module named `acp`` in the failing ACP collection lane
- this lines up with `pyproject.toml`, where ACP support is optional and gated behind the `acp` extra (`agent-client-protocol>=0.9.0,<1.0`)

A secondary signal from collection:
- `tests/tools/test_file_sync_perf.py` emits `PytestUnknownMarkWarning: Unknown pytest.mark.ssh`

This specific collection problem is now tracked in hermes-agent issue `#779`.

### Where coverage looks strong
By file distribution, the codebase is heavily tested around:
- `gateway/`
- `tools/`
- `hermes_cli/`
- `run_agent`
- `cli`
- `agent`

That matches the product center of gravity: runtime orchestration, tool dispatch, and communication surfaces.

### Highest-value remaining gaps
The biggest gaps are not in total test count. They are in critical-path complexity.

1. `run_agent.py`
   - the most important file in the repo and also the largest
   - likely has broad behavior coverage, but branch-level completeness is improbable at 10k+ LOC

2. `cli.py`
   - extremely large UI/orchestration surface
   - high risk of hidden regressions across streaming, voice, slash-command routing, and interaction state

3. `gateway/run.py`
   - core async gateway brain
   - many platform-specific edge cases converge here

4. `hermes_cli/main.py`
   - main command shell is huge and mixes parsing, routing, setup, and environment behavior

5. ACP end-to-end coverage under optional dependency installation
   - current collection failure proves this lane is environment-sensitive
   - ACP deserves a reliable extras-aware CI lane so collection failures are surfaced intentionally, not accidentally

6. `batch_runner.py` and `trajectory_compressor.py`
   - research/training surfaces appear lighter and deserve more explicit contract tests

7. cron lifecycle and delivery failure behavior
   - `cron/scheduler.py` and `cron/jobs.py` are safety-critical for unattended automation

8. optional or integration-heavy backends
   - platform adapters like Feishu / Discord / Telegram
   - container/cloud terminal environments
   - MCP server interop
   - API server streaming edge cases

### Missing tests for critical paths
The next high-leverage test work should target:
- ACP extras-enabled collection and smoke execution
- `run_agent.py` happy-path + interruption + compression + delegate + approval interaction boundaries
- `gateway/run.py` cache/interrupt/restart/session-boundary behavior at integration level
- `cron/scheduler.py` delivery error recovery, stale-job cleanup, and due-job fairness
- `batch_runner.py` and `trajectory_compressor.py` contract tests
- API-server Responses lifecycle and streaming segmentation behavior

## Security Considerations

Hermes is security-sensitive because it can run commands, read files, talk to platforms, call browsers, and broker MCP tools.
The codebase already contains several strong defensive layers.

### 1. Prompt-injection defense for context files
`agent/prompt_builder.py` scans context files such as `AGENTS.md`, `SOUL.md`, and similar instructions for:
- prompt-override language
- hidden comment/HTML tricks
- invisible unicode
- secret exfiltration patterns

That is an important architectural guardrail because Hermes explicitly ingests repository-local instruction files.

### 2. Dangerous-command approval system
`tools/approval.py` centralizes detection of destructive commands and risky shell behavior.
The repo treats command approval as a core policy subsystem, not a UI nicety.

### 3. File-path and device protections
`tools/file_tools.py` blocks dangerous device paths and sensitive system writes.
It also redacts sensitive content in read/search results and blocks reads from internal Hermes-sensitive locations.

### 4. Terminal/workdir sanitization
`tools/terminal_tool.py` constrains workdir handling and shell execution boundaries.
This matters because terminal access is one of the highest-risk capabilities Hermes exposes.

### 5. MCP subprocess hygiene
`tools/mcp_tool.py` filters environment variables passed to MCP servers and strips credentials from surfaced errors.
Given that MCP introduces third-party subprocesses into the tool graph, this is a critical boundary.

### 6. Gateway privacy and pairing controls
Gateway code includes pairing, session routing, and ID-redaction logic.
That is important because Hermes operates across public and semi-public communication surfaces.

### 7. HTTP/API hardening
`gateway/platforms/api_server.py` includes auth, CORS handling, and response-store boundaries.
This makes the API server a real production surface, not just a convenience wrapper.

### 8. Supply-chain awareness
`pyproject.toml` pins many dependencies to constrained ranges and includes security notes for selected packages.
That indicates explicit supply-chain thinking in dependency management.

## Performance Characteristics

### 1. prompt caching is a first-class optimization
Hermes preserves long-lived agent instances and supports provider-specific prompt caching for compatible providers.
That is essential because repeated system prompts and tool schemas are expensive.

### 2. context compression is built into the runtime
Compression is not a manual rescue path only.
Hermes estimates token budgets, prunes old tool noise, and can summarize prior context when needed.

### 3. parallel tool execution exists, but selectively
The runtime can batch safe tool calls in parallel rather than serializing every read-only action.
This improves latency without giving up all control over side effects.

### 4. Async loop reuse reduces orchestration overhead
The runtime avoids constantly recreating event loops for async tools, which matters when many tool calls are issued inside otherwise synchronous agent flows.

### 5. SQLite is tuned for agent workloads
`hermes_state.py` uses WAL mode, short lock windows, and retry logic instead of pretending SQLite is magically contention-free.
This is a sensible tradeoff for sovereign local persistence.

### 6. Background processes are explicitly managed
`ProcessRegistry` maintains output windows, state, and watcher behavior so long-running commands do not become invisible resource leaks.

### 7. Large control-plane files are a real performance and maintenance cost
The repo has broad feature coverage, but a few huge orchestration files dominate complexity:
- `run_agent.py`
- `cli.py`
- `gateway/run.py`
- `hermes_cli/main.py`

These files are not just maintainability debt; they also create higher reasoning and regression load for both humans and agents working in the codebase.

## Critical Modules to Name Explicitly

The following files define the real control plane of Hermes and should always be named in any serious architecture summary:
- `run_agent.py`
- `model_tools.py`
- `tools/registry.py`
- `toolsets.py`
- `cli.py`
- `hermes_cli/main.py`
- `hermes_cli/commands.py`
- `hermes_state.py`
- `agent/prompt_builder.py`
- `agent/context_compressor.py`
- `agent/memory_manager.py`
- `tools/terminal_tool.py`
- `tools/file_tools.py`
- `tools/mcp_tool.py`
- `gateway/run.py`
- `gateway/session.py`
- `gateway/platforms/api_server.py`
- `acp_adapter/server.py`
- `cron/scheduler.py`
- `cron/jobs.py`
- `batch_runner.py`
- `trajectory_compressor.py`

## Practical Takeaway

Hermes Agent is best understood as a sovereign agent operating system.
The CLI, gateway, ACP server, API server, cron scheduler, and tool graph are all frontends onto one core runtime.

The strongest qualities of the codebase are:
- broad feature coverage
- a central tool-registry design
- serious persistence/memory infrastructure
- strong security thinking around prompts, tools, files, and approvals
- a deep test surface across gateway/tools/CLI behavior

The most important risks are:
- extremely large orchestration files
- optional-surface fragility, especially ACP extras and integration-heavy adapters
- under-tested research/batch lanes relative to the core runtime
- growing complexity at the boundaries where multiple surfaces reuse the same agent loop