docs: add ACP and internal systems implementation guides

- add ACP user and developer docs covering setup, lifecycle, callbacks,
  permissions, tool rendering, and runtime behavior
- add developer guides for agent loop, provider runtime resolution,
  prompt assembly, context caching/compression, gateway internals,
  session storage, tools runtime, trajectories, and cron internals
- refresh architecture, quickstart, installation, CLI reference, and
  environments docs to link the new implementation pages and ACP support
This commit is contained in:
teknium1
2026-03-14 00:29:48 -07:00
parent 29176f302e
commit d87a1615ce
17 changed files with 1256 additions and 170 deletions

View File

@@ -0,0 +1,182 @@
---
sidebar_position: 2
title: "ACP Internals"
description: "How the ACP adapter works: lifecycle, sessions, event bridge, approvals, and tool rendering"
---
# ACP Internals
The ACP adapter wraps Hermes' synchronous `AIAgent` in an async JSON-RPC stdio server.
Key implementation files:
- `acp_adapter/entry.py`
- `acp_adapter/server.py`
- `acp_adapter/session.py`
- `acp_adapter/events.py`
- `acp_adapter/permissions.py`
- `acp_adapter/tools.py`
- `acp_adapter/auth.py`
- `acp_registry/agent.json`
## Boot flow
```text
hermes acp / hermes-acp / python -m acp_adapter
-> acp_adapter.entry.main()
-> load ~/.hermes/.env
-> configure stderr logging
-> construct HermesACPAgent
-> acp.run_agent(agent)
```
Stdout is reserved for ACP JSON-RPC transport. Human-readable logs go to stderr.
## Major components
### `HermesACPAgent`
`acp_adapter/server.py` implements the ACP agent protocol.
Responsibilities:
- initialize / authenticate
- new/load/resume/fork/list/cancel session methods
- prompt execution
- session model switching
- wiring sync AIAgent callbacks into ACP async notifications
### `SessionManager`
`acp_adapter/session.py` tracks live ACP sessions.
Each session stores:
- `session_id`
- `agent`
- `cwd`
- `model`
- `history`
- `cancel_event`
The manager is thread-safe and supports:
- create
- get
- remove
- fork
- list
- cleanup
- cwd updates
### Event bridge
`acp_adapter/events.py` converts AIAgent callbacks into ACP `session_update` events.
Bridged callbacks:
- `tool_progress_callback`
- `thinking_callback`
- `step_callback`
- `message_callback`
Because `AIAgent` runs in a worker thread while ACP I/O lives on the main event loop, the bridge uses:
```python
asyncio.run_coroutine_threadsafe(...)
```
### Permission bridge
`acp_adapter/permissions.py` adapts dangerous terminal approval prompts into ACP permission requests.
Mapping:
- `allow_once` -> Hermes `once`
- `allow_always` -> Hermes `always`
- reject options -> Hermes `deny`
Timeouts and bridge failures deny by default.
### Tool rendering helpers
`acp_adapter/tools.py` maps Hermes tools to ACP tool kinds and builds editor-facing content.
Examples:
- `patch` / `write_file` -> file diffs
- `terminal` -> shell command text
- `read_file` / `search_files` -> text previews
- large results -> truncated text blocks for UI safety
## Session lifecycle
```text
new_session(cwd)
-> create SessionState
-> create AIAgent(platform="acp", enabled_toolsets=["hermes-acp"])
-> bind task_id/session_id to cwd override
prompt(..., session_id)
-> extract text from ACP content blocks
-> reset cancel event
-> install callbacks + approval bridge
-> run AIAgent in ThreadPoolExecutor
-> update session history
-> emit final agent message chunk
```
### Cancelation
`cancel(session_id)`:
- sets the session cancel event
- calls `agent.interrupt()` when available
- causes the prompt response to return `stop_reason="cancelled"`
### Forking
`fork_session()` deep-copies message history into a new live session, preserving conversation state while giving the fork its own session ID and cwd.
## Provider/auth behavior
ACP does not implement its own auth store.
Instead it reuses Hermes' runtime resolver:
- `acp_adapter/auth.py`
- `hermes_cli/runtime_provider.py`
So ACP advertises and uses the currently configured Hermes provider/credentials.
## Working directory binding
ACP sessions carry an editor cwd.
The session manager binds that cwd to the ACP session ID via task-scoped terminal/file overrides, so file and terminal tools operate relative to the editor workspace.
## Duplicate same-name tool calls
The event bridge tracks tool IDs FIFO per tool name, not just one ID per name. This is important for:
- parallel same-name calls
- repeated same-name calls in one step
Without FIFO queues, completion events would attach to the wrong tool invocation.
## Approval callback restoration
ACP temporarily installs an approval callback on the terminal tool during prompt execution, then restores the previous callback afterward. This avoids leaving ACP session-specific approval handlers installed globally forever.
## Current limitations
- ACP sessions are process-local from the ACP server's point of view
- non-text prompt blocks are currently ignored for request text extraction
- editor-specific UX varies by ACP client implementation
## Related files
- `tests/acp/` — ACP test suite
- `toolsets.py``hermes-acp` toolset definition
- `hermes_cli/main.py``hermes acp` CLI subcommand
- `pyproject.toml``[acp]` optional dependency + `hermes-acp` script

View File

@@ -0,0 +1,110 @@
---
sidebar_position: 3
title: "Agent Loop Internals"
description: "Detailed walkthrough of AIAgent execution, API modes, tools, callbacks, and fallback behavior"
---
# Agent Loop Internals
The core orchestration engine is `run_agent.py`'s `AIAgent`.
## Core responsibilities
`AIAgent` is responsible for:
- assembling the effective prompt and tool schemas
- selecting the correct provider/API mode
- making interruptible model calls
- executing tool calls (sequentially or concurrently)
- maintaining session history
- handling compression, retries, and fallback models
## API modes
Hermes currently supports three API execution modes:
| API mode | Used for |
|----------|----------|
| `chat_completions` | OpenAI-compatible chat endpoints, including OpenRouter and most custom endpoints |
| `codex_responses` | OpenAI Codex / Responses API path |
| `anthropic_messages` | Native Anthropic Messages API |
The mode is resolved from explicit args, provider selection, and base URL heuristics.
## Turn lifecycle
```text
run_conversation()
-> generate effective task_id
-> append current user message
-> load or build cached system prompt
-> maybe preflight-compress
-> build api_messages
-> inject ephemeral prompt layers
-> apply prompt caching if appropriate
-> make interruptible API call
-> if tool calls: execute them, append tool results, loop
-> if final text: persist, cleanup, return response
```
## Interruptible API calls
Hermes wraps API requests so they can be interrupted from the CLI or gateway.
This matters because:
- the agent may be in a long LLM call
- the user may send a new message mid-flight
- background systems may need cancellation semantics
## Tool execution modes
Hermes uses two execution strategies:
- sequential execution for single or interactive tools
- concurrent execution for multiple non-interactive tools
Concurrent tool execution preserves message/result ordering when reinserting tool responses into conversation history.
## Callback surfaces
`AIAgent` supports platform/integration callbacks such as:
- `tool_progress_callback`
- `thinking_callback`
- `reasoning_callback`
- `clarify_callback`
- `step_callback`
- `message_callback`
These are how the CLI, gateway, and ACP integrations stream intermediate progress and interactive approval/clarification flows.
## Budget and fallback behavior
Hermes tracks a shared iteration budget across parent and subagents. It also injects budget pressure hints near the end of the available iteration window.
Fallback model support allows the agent to switch providers/models when the primary route fails in supported failure paths.
## Compression and persistence
Before and during long runs, Hermes may:
- flush memory before context loss
- compress middle conversation turns
- split the session lineage into a new session ID after compression
- preserve recent context and structural tool-call/result consistency
## Key files to read next
- `run_agent.py`
- `agent/prompt_builder.py`
- `agent/context_compressor.py`
- `agent/prompt_caching.py`
- `model_tools.py`
## Related docs
- [Provider Runtime Resolution](./provider-runtime.md)
- [Prompt Assembly](./prompt-assembly.md)
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
- [Tools Runtime](./tools-runtime.md)

View File

@@ -1,218 +1,151 @@
---
sidebar_position: 1
title: "Architecture"
description: "Hermes Agent internals — project structure, agent loop, key classes, and design patterns"
description: "Hermes Agent internals — major subsystems, execution paths, and where to read next"
---
# Architecture
This guide covers the internal architecture of Hermes Agent for developers contributing to the project.
This page is the top-level map of Hermes Agent internals. The project has grown beyond a single monolithic loop, so the best way to understand it is by subsystem.
## Project Structure
## High-level structure
```
```text
hermes-agent/
├── run_agent.py # AIAgent class — core conversation loop, tool dispatch
├── cli.py # HermesCLI class — interactive TUI, prompt_toolkit
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
├── toolsets.py # Tool groupings and presets
├── hermes_state.py # SQLite session database with FTS5 full-text search
├── batch_runner.py # Parallel batch processing for trajectory generation
├── run_agent.py # AIAgent core loop
├── cli.py # interactive terminal UI
├── model_tools.py # tool discovery/orchestration
├── toolsets.py # tool groupings and presets
├── hermes_state.py # SQLite session/state database
├── batch_runner.py # batch trajectory generation
├── agent/ # Agent internals (extracted modules)
│ ├── prompt_builder.py # System prompt assembly (identity, skills, memory)
│ ├── context_compressor.py # Auto-summarization when approaching context limits
│ ├── auxiliary_client.py # Resolves auxiliary OpenAI clients (summarization, vision)
│ ├── display.py # KawaiiSpinner, tool progress formatting
│ ├── model_metadata.py # Model context lengths, token estimation
│ └── trajectory.py # Trajectory saving helpers
├── hermes_cli/ # CLI command implementations
│ ├── main.py # Entry point, argument parsing, command dispatch
│ ├── config.py # Config management, migration, env var definitions
│ ├── setup.py # Interactive setup wizard
│ ├── auth.py # Provider resolution, OAuth, Nous Portal
│ ├── models.py # OpenRouter model selection lists
│ ├── banner.py # Welcome banner, ASCII art
│ ├── commands.py # Slash command definitions + autocomplete
│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval)
│ ├── doctor.py # Diagnostics
│ └── skills_hub.py # Skills Hub CLI + /skills slash command handler
├── tools/ # Tool implementations (self-registering)
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
│ ├── approval.py # Dangerous command detection + per-session approval
│ ├── terminal_tool.py # Terminal orchestration (sudo, env lifecycle, backends)
│ ├── file_operations.py # File tool implementations (read, write, search, patch)
│ ├── file_tools.py # File tool registration
│ ├── web_tools.py # web_search, web_extract
│ ├── vision_tools.py # Image analysis via multimodal models
│ ├── delegate_tool.py # Subagent spawning and parallel task execution
│ ├── code_execution_tool.py # Sandboxed Python with RPC tool access
│ ├── session_search_tool.py # Search past conversations
│ ├── cronjob_tools.py # Scheduled task management
│ ├── skills_tool.py # Skill search and load
│ ├── skill_manager_tool.py # Skill management
│ └── environments/ # Terminal execution backends
│ ├── base.py # BaseEnvironment ABC
│ ├── local.py, docker.py, ssh.py, singularity.py, modal.py, daytona.py
├── gateway/ # Messaging gateway
│ ├── run.py # GatewayRunner — platform lifecycle, message routing
│ ├── config.py # Platform configuration resolution
│ ├── session.py # Session store, context prompts, reset policies
│ └── platforms/ # Platform adapters
│ ├── telegram.py, discord_adapter.py, slack.py, whatsapp.py
├── scripts/ # Installer and bridge scripts
│ ├── install.sh # Linux/macOS installer
│ ├── install.ps1 # Windows PowerShell installer
│ └── whatsapp-bridge/ # Node.js WhatsApp bridge (Baileys)
├── skills/ # Bundled skills (copied to ~/.hermes/skills/)
├── optional-skills/ # Official optional skills (discoverable via hub, not activated by default)
├── environments/ # RL training environments (Atropos integration)
└── tests/ # Test suite
├── agent/ # prompt building, compression, caching, metadata, trajectories
├── hermes_cli/ # command entrypoints, auth, setup, models, config, doctor
├── tools/ # tool implementations and terminal environments
├── gateway/ # messaging gateway, session routing, delivery, pairing, hooks
├── cron/ # scheduled job storage and scheduler
├── honcho_integration/ # Honcho memory integration
├── acp_adapter/ # ACP editor integration server
├── acp_registry/ # ACP registry manifest + icon
├── environments/ # Hermes RL / benchmark environment framework
├── skills/ # bundled skills
├── optional-skills/ # official optional skills
└── tests/ # test suite
```
## Core Loop
## Recommended reading order
The main agent loop lives in `run_agent.py`:
If you are new to the codebase, read in this order:
```
User message → AIAgent._run_agent_loop()
├── Build system prompt (prompt_builder.py)
├── Build API kwargs (model, messages, tools, reasoning config)
├── Call LLM (OpenAI-compatible API)
├── If tool_calls in response:
│ ├── Execute each tool via registry dispatch
│ ├── Add tool results to conversation
│ └── Loop back to LLM call
├── If text response:
│ ├── Persist session to DB
│ └── Return final_response
└── Context compression if approaching token limit
```
1. this page
2. [Agent Loop Internals](./agent-loop.md)
3. [Prompt Assembly](./prompt-assembly.md)
4. [Provider Runtime Resolution](./provider-runtime.md)
5. [Tools Runtime](./tools-runtime.md)
6. [Session Storage](./session-storage.md)
7. [Gateway Internals](./gateway-internals.md)
8. [Context Compression & Prompt Caching](./context-compression-and-caching.md)
9. [ACP Internals](./acp-internals.md)
10. [Environments, Benchmarks & Data Generation](./environments.md)
```python
while turns < max_turns:
response = client.chat.completions.create(
model=model,
messages=messages,
tools=tool_schemas,
)
## Major subsystems
if response.tool_calls:
for tool_call in response.tool_calls:
result = execute_tool(tool_call)
messages.append(tool_result_message(result))
turns += 1
else:
return response.content
```
### Agent loop
## AIAgent Class
The core synchronous orchestration engine is `AIAgent` in `run_agent.py`.
```python
class AIAgent:
def __init__(
self,
model: str = "anthropic/claude-opus-4.6",
api_key: str = None,
base_url: str = None, # Resolved internally based on provider
max_iterations: int = 60,
enabled_toolsets: list = None,
disabled_toolsets: list = None,
verbose_logging: bool = False,
quiet_mode: bool = False,
tool_progress_callback: callable = None,
):
...
It is responsible for:
def chat(self, message: str) -> str:
# Main entry point - runs the agent loop
...
```
- provider/API-mode selection
- prompt construction
- tool execution
- retries and fallback
- callbacks
- compression and persistence
## File Dependency Chain
See [Agent Loop Internals](./agent-loop.md).
```
tools/registry.py (no deps — imported by all tool files)
tools/*.py (each calls registry.register() at import time)
model_tools.py (imports tools/registry + triggers tool discovery)
run_agent.py, cli.py, batch_runner.py, environments/
```
### Prompt system
Each tool file co-locates its schema, handler, and registration. `model_tools.py` is a thin orchestration layer.
Prompt-building logic is split between:
## Key Design Patterns
- `run_agent.py`
- `agent/prompt_builder.py`
- `agent/prompt_caching.py`
- `agent/context_compressor.py`
### Self-Registering Tools
See:
Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules.
- [Prompt Assembly](./prompt-assembly.md)
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
### Toolset Grouping
### Provider/runtime resolution
Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform.
Hermes has a shared runtime provider resolver used by CLI, gateway, cron, ACP, and auxiliary calls.
### Session Persistence
See [Provider Runtime Resolution](./provider-runtime.md).
All conversations are stored in SQLite (`hermes_state.py`) with full-text search. JSON logs go to `~/.hermes/sessions/`.
### Tooling runtime
### Ephemeral Injection
The tool registry, toolsets, terminal backends, process manager, and dispatch rules form a subsystem of their own.
System prompts and prefill messages are injected at API call time, never persisted to the database or logs.
See [Tools Runtime](./tools-runtime.md).
### Provider Abstraction
### Session persistence
The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint).
Historical session state is stored primarily in SQLite, with lineage preserved across compression splits.
### Conversation Format
See [Session Storage](./session-storage.md).
Messages follow the OpenAI format:
### Messaging gateway
```python
messages = [
{"role": "system", "content": "You are a helpful assistant..."},
{"role": "user", "content": "Search for Python tutorials"},
{"role": "assistant", "content": None, "tool_calls": [...]},
{"role": "tool", "tool_call_id": "...", "content": "..."},
{"role": "assistant", "content": "Here's what I found..."},
]
```
The gateway is a long-running orchestration layer for platform adapters, session routing, pairing, delivery, and cron ticking.
## CLI Architecture
See [Gateway Internals](./gateway-internals.md).
The interactive CLI (`cli.py`) uses:
### ACP integration
- **Rich** — Welcome banner and styled panels
- **prompt_toolkit** — Fixed input area with history, `patch_stdout`, slash command autocomplete
- **KawaiiSpinner** — Animated kawaii faces during API calls; clean activity feed for tool results
ACP exposes Hermes as an editor-native agent over stdio/JSON-RPC.
Key UX behaviors:
See:
- Thinking spinner shows animated kawaii face + verb (`(⌐■_■) deliberating...`)
- Tool execution results appear as `┊ {emoji} {verb} {detail} {duration}`
- Prompt shows `⚕ ` when working, `` when idle
- Multi-line paste support with automatic formatting
- [ACP Editor Integration](../user-guide/features/acp.md)
- [ACP Internals](./acp-internals.md)
## Messaging Gateway Architecture
### Cron
The gateway (`gateway/run.py`) uses `GatewayRunner` to:
Cron jobs are implemented as first-class agent tasks, not just shell tasks.
1. Connect to all configured platforms
2. Route messages through per-chat session stores
3. Dispatch to AIAgent instances
4. Run the cron scheduler (ticks every 60s)
5. Handle interrupts and tool progress notifications
See [Cron Internals](./cron-internals.md).
Each platform adapter conforms to `BasePlatformAdapter`.
### RL / environments / trajectories
## Configuration System
Hermes ships a full environment framework for evaluation, RL integration, and SFT data generation.
- `~/.hermes/config.yaml` — All settings
- `~/.hermes/.env` — API keys and secrets
- `_config_version` in `DEFAULT_CONFIG` — Bumped when required fields are added, triggers migration prompts
See:
- [Environments, Benchmarks & Data Generation](./environments.md)
- [Trajectories & Training Format](./trajectory-format.md)
## Design themes
Several cross-cutting design themes appear throughout the codebase:
- prompt stability matters
- tool execution must be observable and interruptible
- session persistence must survive long-running use
- platform frontends should share one agent core
- optional subsystems should remain loosely coupled where possible
## Implementation notes
The older mental model of Hermes as “one OpenAI-compatible chat loop plus some tools” is no longer sufficient. Current Hermes includes:
- multiple API modes
- auxiliary model routing
- ACP editor integration
- gateway-specific session and delivery semantics
- RL environment infrastructure
- prompt-caching and compression logic with lineage-aware persistence
Use this page as the map, then dive into subsystem-specific docs for the real implementation details.

View File

@@ -0,0 +1,72 @@
---
sidebar_position: 6
title: "Context Compression & Prompt Caching"
description: "How Hermes compresses long conversations and applies provider-side prompt caching"
---
# Context Compression & Prompt Caching
Hermes manages long conversations with two complementary mechanisms:
- prompt caching
- context compression
Primary files:
- `agent/prompt_caching.py`
- `agent/context_compressor.py`
- `run_agent.py`
## Prompt caching
For Anthropic/native and Claude-via-OpenRouter flows, Hermes applies Anthropic-style cache markers.
Current strategy:
- cache the system prompt
- cache the last 3 non-system messages
- default TTL is 5 minutes unless explicitly extended
This is implemented in `agent/prompt_caching.py`.
## Why prompt stability matters
Prompt caching only helps when the stable prefix remains stable. That is why Hermes avoids rebuilding or mutating the core system prompt mid-session unless it has to.
## Compression trigger
Hermes can compress context when conversations become large. Configuration defaults live in `config.yaml`, and the compressor also has runtime checks based on actual prompt token counts.
## Compression algorithm
The compressor protects:
- the first N turns
- the last N turns
and summarizes the middle section.
It also cleans up structural issues such as orphaned tool-call/result pairs so the API never receives invalid conversation structure after compression.
## Pre-compression memory flush
Before compression, Hermes can give the model one last chance to persist memory so facts are not lost when middle turns are summarized away.
## Session lineage after compression
Compression can split the session into a new session ID while preserving parent lineage in the state DB.
This lets Hermes continue operating with a smaller active context while retaining a searchable ancestry chain.
## Re-injected state after compression
After compression, Hermes may re-inject compact operational state such as:
- todo snapshot
- prior-read-files summary
## Related docs
- [Prompt Assembly](./prompt-assembly.md)
- [Session Storage](./session-storage.md)
- [Agent Loop Internals](./agent-loop.md)

View File

@@ -0,0 +1,56 @@
---
sidebar_position: 11
title: "Cron Internals"
description: "How Hermes stores, schedules, locks, and delivers cron jobs"
---
# Cron Internals
Hermes cron support is implemented primarily in:
- `cron/jobs.py`
- `cron/scheduler.py`
- `gateway/run.py`
## Scheduling model
Hermes supports:
- one-shot delays
- intervals
- cron expressions
- explicit timestamps
## Job storage
Cron jobs are stored in Hermes-managed local state with atomic save/update semantics.
## Runtime behavior
The scheduler:
- loads jobs
- computes due work
- executes jobs in fresh agent sessions
- handles repeat counters
- updates next-run metadata
In gateway mode, cron ticking is integrated into the long-running gateway loop.
## Delivery model
Cron jobs can deliver to:
- origin chat
- local files
- platform home channels
- explicit platform/chat IDs
## Locking
Hermes uses lock-based protections so concurrent cron ticks or overlapping scheduler processes do not corrupt job state.
## Related docs
- [Cron feature guide](../user-guide/features/cron.md)
- [Gateway Internals](./gateway-internals.md)

View File

@@ -14,6 +14,10 @@ Hermes Agent includes a full environment framework that connects its tool-callin
All three share the same core: an **environment** class that defines tasks, runs an agent loop, and scores the output.
:::info Repo environments vs RL training tools
The Python environment framework documented here lives under the repo's `environments/` directory and is the implementation-level API for Hermes/Atropos integration. This is separate from the user-facing `rl_*` tools, which operate as an orchestration surface for remote RL training workflows.
:::
:::tip Quick Links
- **Want to run benchmarks?** Jump to [Available Benchmarks](#available-benchmarks)
- **Want to train with RL?** See [RL Training Tools](/user-guide/features/rl-training) for the agent-driven interface, or [Running Environments](#running-environments) for manual execution

View File

@@ -0,0 +1,95 @@
---
sidebar_position: 7
title: "Gateway Internals"
description: "How the messaging gateway boots, authorizes users, routes sessions, and delivers messages"
---
# Gateway Internals
The messaging gateway is the long-running process that connects Hermes to external platforms.
Key files:
- `gateway/run.py`
- `gateway/config.py`
- `gateway/session.py`
- `gateway/delivery.py`
- `gateway/pairing.py`
- `gateway/channel_directory.py`
- `gateway/hooks.py`
- `gateway/mirror.py`
- `gateway/platforms/*`
## Core responsibilities
The gateway process is responsible for:
- loading configuration from `.env`, `config.yaml`, and `gateway.json`
- starting platform adapters
- authorizing users
- routing incoming events to sessions
- maintaining per-chat session continuity
- dispatching messages to `AIAgent`
- running cron ticks and background maintenance tasks
- mirroring/proactively delivering output to configured channels
## Config sources
The gateway has a multi-source config model:
- environment variables
- `~/.hermes/gateway.json`
- selected bridged values from `~/.hermes/config.yaml`
## Session routing
`gateway/session.py` and `GatewayRunner` cooperate to map incoming messages to active session IDs.
Session keying can depend on:
- platform
- user/chat identity
- thread/topic identity
- special platform-specific routing behavior
## Authorization layers
The gateway can authorize through:
- platform allowlists
- gateway-wide allowlists
- DM pairing flows
- explicit allow-all settings
Pairing support is implemented in `gateway/pairing.py`.
## Delivery path
Outgoing deliveries are handled by `gateway/delivery.py`, which knows how to:
- deliver to a home channel
- resolve explicit targets
- mirror some remote deliveries back into local history/session tracking
## Hooks
Gateway events emit hook callbacks through `gateway/hooks.py`. Hooks are local trusted Python code and can observe or extend gateway lifecycle events.
## Background maintenance
The gateway also runs maintenance tasks such as:
- cron ticking
- cache refreshes
- session expiry checks
- proactive memory flush before reset/expiry
## Honcho interaction
When Honcho is enabled, the gateway can keep persistent Honcho managers aligned with session lifetimes and platform-specific session keys.
## Related docs
- [Session Storage](./session-storage.md)
- [Cron Internals](./cron-internals.md)
- [ACP Internals](./acp-internals.md)

View File

@@ -0,0 +1,85 @@
---
sidebar_position: 5
title: "Prompt Assembly"
description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers"
---
# Prompt Assembly
Hermes deliberately separates:
- **cached system prompt state**
- **ephemeral API-call-time additions**
This is one of the most important design choices in the project because it affects:
- token usage
- prompt caching effectiveness
- session continuity
- memory correctness
Primary files:
- `run_agent.py`
- `agent/prompt_builder.py`
- `tools/memory_tool.py`
## Cached system prompt layers
The cached system prompt is assembled in roughly this order:
1. default agent identity
2. tool-aware behavior guidance
3. Honcho static block (when active)
4. optional system message
5. frozen MEMORY snapshot
6. frozen USER profile snapshot
7. skills index
8. context files (`AGENTS.md`, `SOUL.md`, `.cursorrules`, `.cursor/rules/*.mdc`)
9. timestamp / optional session ID
10. platform hint
## API-call-time-only layers
These are intentionally *not* persisted as part of the cached system prompt:
- `ephemeral_system_prompt`
- prefill messages
- gateway-derived session context overlays
- later-turn Honcho recall injected into the current-turn user message
This separation keeps the stable prefix stable for caching.
## Memory snapshots
Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs.
## Context files
`agent/prompt_builder.py` scans and sanitizes:
- `AGENTS.md`
- `SOUL.md`
- `.cursorrules`
- `.cursor/rules/*.mdc`
Long files are truncated before injection.
## Skills index
The skills system contributes a compact skills index to the prompt when skills tooling is available.
## Why prompt assembly is split this way
The architecture is intentionally optimized to:
- preserve provider-side prompt caching
- avoid mutating history unnecessarily
- keep memory semantics understandable
- let gateway/ACP/CLI add context without poisoning persistent prompt state
## Related docs
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
- [Session Storage](./session-storage.md)
- [Gateway Internals](./gateway-internals.md)

View File

@@ -0,0 +1,116 @@
---
sidebar_position: 4
title: "Provider Runtime Resolution"
description: "How Hermes resolves providers, credentials, API modes, and auxiliary models at runtime"
---
# Provider Runtime Resolution
Hermes has a shared provider runtime resolver used across:
- CLI
- gateway
- cron jobs
- ACP
- auxiliary model calls
Primary implementation:
- `hermes_cli/runtime_provider.py`
- `hermes_cli/auth.py`
- `agent/auxiliary_client.py`
## Resolution precedence
At a high level, provider resolution uses:
1. explicit CLI/runtime request
2. environment variables
3. `config.yaml` model/provider config
4. provider-specific defaults or auto resolution
## Providers
Current provider families include:
- OpenRouter
- Nous Portal
- OpenAI Codex
- Anthropic (native)
- Z.AI
- Kimi / Moonshot
- MiniMax
- MiniMax China
- custom OpenAI-compatible endpoints
## Output of runtime resolution
The runtime resolver returns data such as:
- `provider`
- `api_mode`
- `base_url`
- `api_key`
- `source`
- provider-specific metadata like expiry/refresh info
## Why this matters
This resolver is the main reason Hermes can share auth/runtime logic between:
- `hermes chat`
- gateway message handling
- cron jobs running in fresh sessions
- ACP editor sessions
- auxiliary model tasks
## OpenRouter vs custom OpenAI-compatible base URLs
Hermes contains logic to avoid leaking the wrong API key to a custom endpoint when both `OPENROUTER_API_KEY` and `OPENAI_API_KEY` exist.
That distinction is especially important for:
- local model servers
- non-OpenRouter OpenAI-compatible APIs
- switching providers without re-running setup
## Native Anthropic path
Anthropic is not just "via OpenRouter" anymore.
When provider resolution selects `anthropic`, Hermes uses:
- `api_mode = anthropic_messages`
- the native Anthropic Messages API
- `agent/anthropic_adapter.py` for translation
## OpenAI Codex path
Codex uses a separate Responses API path:
- `api_mode = codex_responses`
- dedicated credential resolution and auth store support
## Auxiliary model routing
Auxiliary tasks such as:
- vision
- web extraction summarization
- context compression summaries
- session search summarization
- skills hub operations
- MCP helper operations
- memory flushes
can use their own provider/model routing rather than the main conversational model.
## Fallback models
Hermes also supports a configured fallback model/provider, allowing runtime failover in supported error paths.
## Related docs
- [Agent Loop Internals](./agent-loop.md)
- [ACP Internals](./acp-internals.md)
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)

View File

@@ -0,0 +1,66 @@
---
sidebar_position: 8
title: "Session Storage"
description: "How Hermes stores sessions in SQLite, maintains lineage, and exposes recall/search"
---
# Session Storage
Hermes uses a SQLite-backed session store as the main source of truth for historical conversation state.
Primary files:
- `hermes_state.py`
- `gateway/session.py`
- `tools/session_search_tool.py`
## Main database
The primary store lives at:
```text
~/.hermes/state.db
```
It contains:
- sessions
- messages
- metadata such as token counts and titles
- lineage relationships
- full-text search indexes
## What is stored per session
Examples of important session metadata:
- session ID
- source/platform
- title
- created/updated timestamps
- token counts
- tool call counts
- stored system prompt snapshot
- parent session ID after compression splits
## Lineage
When Hermes compresses a conversation, it can continue in a new session ID while preserving ancestry via `parent_session_id`.
This means resuming/searching can follow session families instead of treating each compressed shard as unrelated.
## Gateway vs CLI persistence
- CLI uses the state DB directly for resume/history/search
- gateway keeps active-session mappings and may also maintain additional platform transcript/state files
- some legacy JSON/JSONL artifacts still exist for compatibility, but SQLite is the main historical store
## Session search
The `session_search` tool uses the session DB's search features to retrieve and summarize relevant past work.
## Related docs
- [Gateway Internals](./gateway-internals.md)
- [Prompt Assembly](./prompt-assembly.md)
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)

View File

@@ -0,0 +1,65 @@
---
sidebar_position: 9
title: "Tools Runtime"
description: "Runtime behavior of the tool registry, toolsets, dispatch, and terminal environments"
---
# Tools Runtime
Hermes tools are self-registering functions grouped into toolsets and executed through a central registry/dispatch system.
Primary files:
- `tools/registry.py`
- `model_tools.py`
- `toolsets.py`
- `tools/terminal_tool.py`
- `tools/environments/*`
## Tool registration model
Each tool module calls `registry.register(...)` at import time.
`model_tools.py` is responsible for importing/discovering tool modules and building the schema list used by the model.
## Toolset resolution
Toolsets are named bundles of tools. Hermes resolves them through:
- explicit enabled/disabled toolset lists
- platform presets (`hermes-cli`, `hermes-telegram`, etc.)
- dynamic MCP toolsets
- curated special-purpose sets like `hermes-acp`
## Dispatch
At runtime, tools are dispatched through the central registry, with agent-loop exceptions for some agent-level tools such as memory/todo/session-search handling.
## Terminal/runtime environments
The terminal system supports multiple backends:
- local
- docker
- ssh
- singularity
- modal
- daytona
It also supports:
- per-task cwd overrides
- background process management
- PTY mode
- approval callbacks for dangerous commands
## Concurrency
Tool calls may execute sequentially or concurrently depending on the tool mix and interaction requirements.
## Related docs
- [Toolsets Reference](../reference/toolsets-reference.md)
- [Built-in Tools Reference](../reference/tools-reference.md)
- [Agent Loop Internals](./agent-loop.md)
- [ACP Internals](./acp-internals.md)

View File

@@ -0,0 +1,56 @@
---
sidebar_position: 10
title: "Trajectories & Training Format"
description: "How Hermes saves trajectories, normalizes tool calls, and produces training-friendly outputs"
---
# Trajectories & Training Format
Hermes can save conversation trajectories for training, evaluation, and batch data generation workflows.
Primary files:
- `agent/trajectory.py`
- `run_agent.py`
- `batch_runner.py`
- `trajectory_compressor.py`
## What trajectories are for
Trajectory outputs are used for:
- SFT data generation
- debugging agent behavior
- benchmark/evaluation artifact capture
- post-processing and compression pipelines
## Normalization strategy
Hermes converts live conversation structure into a training-friendly format.
Important behaviors include:
- representing reasoning in explicit markup
- converting tool calls into structured XML-like regions for dataset compatibility
- grouping tool outputs appropriately
- separating successful and failed trajectories
## Persistence boundaries
Trajectory files do **not** blindly mirror all runtime prompt state.
Some prompt-time-only layers are intentionally excluded from persisted trajectory content so datasets are cleaner and less environment-specific.
## Batch runner
`batch_runner.py` emits richer metadata than single-session trajectory saving, including:
- model/provider metadata
- toolset info
- partial/failure markers
- tool statistics
## Related docs
- [Environments, Benchmarks & Data Generation](./environments.md)
- [Agent Loop Internals](./agent-loop.md)

View File

@@ -123,6 +123,7 @@ uv pip install -e "."
| `honcho` | AI-native memory (Honcho integration) | `uv pip install -e ".[honcho]"` |
| `mcp` | Model Context Protocol support | `uv pip install -e ".[mcp]"` |
| `homeassistant` | Home Assistant integration | `uv pip install -e ".[homeassistant]"` |
| `acp` | ACP editor integration support | `uv pip install -e ".[acp]"` |
| `slack` | Slack messaging | `uv pip install -e ".[slack]"` |
| `dev` | pytest & test utilities | `uv pip install -e ".[dev]"` |

View File

@@ -147,6 +147,17 @@ hermes skills install official/security/1password
Or use the `/skills` slash command inside chat.
### Use Hermes inside an editor via ACP
Hermes can also run as an ACP server for ACP-compatible editors like VS Code, Zed, and JetBrains:
```bash
pip install -e '.[acp]'
hermes acp
```
See [ACP Editor Integration](../user-guide/features/acp.md) for setup details.
### Try MCP servers
Connect to external tools via the Model Context Protocol:

View File

@@ -44,6 +44,7 @@ hermes [global-options] <command> [subcommand/options]
| `hermes pairing` | Approve or revoke messaging pairing codes. |
| `hermes skills` | Browse, install, publish, audit, and configure skills. |
| `hermes honcho` | Manage Honcho cross-session memory integration. |
| `hermes acp` | Run Hermes as an ACP server for editor integration. |
| `hermes tools` | Configure enabled tools per platform. |
| `hermes sessions` | Browse, export, prune, rename, and delete sessions. |
| `hermes insights` | Show token/cost/activity analytics. |
@@ -283,6 +284,29 @@ Subcommands:
| `identity` | Seed or show the AI peer identity representation. |
| `migrate` | Migration guide from openclaw-honcho to Hermes Honcho. |
## `hermes acp`
```bash
hermes acp
```
Starts Hermes as an ACP (Agent Client Protocol) stdio server for editor integration.
Related entrypoints:
```bash
hermes-acp
python -m acp_adapter
```
Install support first:
```bash
pip install -e '.[acp]'
```
See [ACP Editor Integration](../user-guide/features/acp.md) and [ACP Internals](../developer-guide/acp-internals.md).
## `hermes tools`
```bash

View File

@@ -0,0 +1,197 @@
---
sidebar_position: 11
title: "ACP Editor Integration"
description: "Use Hermes Agent inside ACP-compatible editors such as VS Code, Zed, and JetBrains"
---
# ACP Editor Integration
Hermes Agent can run as an ACP server, letting ACP-compatible editors talk to Hermes over stdio and render:
- chat messages
- tool activity
- file diffs
- terminal commands
- approval prompts
- streamed thinking / response chunks
ACP is a good fit when you want Hermes to behave like an editor-native coding agent instead of a standalone CLI or messaging bot.
## What Hermes exposes in ACP mode
Hermes runs with a curated `hermes-acp` toolset designed for editor workflows. It includes:
- file tools: `read_file`, `write_file`, `patch`, `search_files`
- terminal tools: `terminal`, `process`
- web/browser tools
- memory, todo, session search
- skills
- execute_code and delegate_task
- vision
It intentionally excludes things that do not fit typical editor UX, such as messaging delivery and cronjob management.
## Installation
Install Hermes normally, then add the ACP extra:
```bash
pip install -e '.[acp]'
```
This installs the `agent-client-protocol` dependency and enables:
- `hermes acp`
- `hermes-acp`
- `python -m acp_adapter`
## Launching the ACP server
Any of the following starts Hermes in ACP mode:
```bash
hermes acp
```
```bash
hermes-acp
```
```bash
python -m acp_adapter
```
Hermes logs to stderr so stdout remains reserved for ACP JSON-RPC traffic.
## Editor setup
### VS Code
Install an ACP client extension, then point it at the repo's `acp_registry/` directory.
Example settings snippet:
```json
{
"acpClient.agents": [
{
"name": "hermes-agent",
"registryDir": "/path/to/hermes-agent/acp_registry"
}
]
}
```
### Zed
Example settings snippet:
```json
{
"acp": {
"agents": [
{
"name": "hermes-agent",
"registry_dir": "/path/to/hermes-agent/acp_registry"
}
]
}
}
```
### JetBrains
Use an ACP-compatible plugin and point it at:
```text
/path/to/hermes-agent/acp_registry
```
## Registry manifest
The ACP registry manifest lives at:
```text
acp_registry/agent.json
```
It advertises a command-based agent whose launch command is:
```text
hermes acp
```
## Configuration and credentials
ACP mode uses the same Hermes configuration as the CLI:
- `~/.hermes/.env`
- `~/.hermes/config.yaml`
- `~/.hermes/skills/`
- `~/.hermes/state.db`
Provider resolution uses Hermes' normal runtime resolver, so ACP inherits the currently configured provider and credentials.
## Session behavior
ACP sessions are tracked by the ACP adapter's in-memory session manager while the server is running.
Each session stores:
- session ID
- working directory
- selected model
- current conversation history
- cancel event
The underlying `AIAgent` still uses Hermes' normal persistence/logging paths, but ACP `list/load/resume/fork` are scoped to the currently running ACP server process.
## Working directory behavior
ACP sessions bind the editor's cwd to the Hermes task ID so file and terminal tools run relative to the editor workspace, not the server process cwd.
## Approvals
Dangerous terminal commands can be routed back to the editor as approval prompts. ACP approval options are simpler than the CLI flow:
- allow once
- allow always
- deny
On timeout or error, the approval bridge denies the request.
## Troubleshooting
### ACP agent does not appear in the editor
Check:
- the editor is pointed at the correct `acp_registry/` path
- Hermes is installed and on your PATH
- the ACP extra is installed (`pip install -e '.[acp]'`)
### ACP starts but immediately errors
Try these checks:
```bash
hermes doctor
hermes status
hermes acp
```
### Missing credentials
ACP mode does not have its own login flow. It uses Hermes' existing provider setup. Configure credentials with:
```bash
hermes model
```
or by editing `~/.hermes/.env`.
## See also
- [ACP Internals](../../developer-guide/acp-internals.md)
- [Provider Runtime Resolution](../../developer-guide/provider-runtime.md)
- [Tools Runtime](../../developer-guide/tools-runtime.md)

View File

@@ -42,6 +42,8 @@ const sidebars: SidebarsConfig = {
'user-guide/messaging/discord',
'user-guide/messaging/slack',
'user-guide/messaging/whatsapp',
'user-guide/messaging/signal',
'user-guide/messaging/email',
'user-guide/messaging/homeassistant',
],
},
@@ -81,6 +83,7 @@ const sidebars: SidebarsConfig = {
type: 'category',
label: 'Integrations',
items: [
'user-guide/features/acp',
'user-guide/features/mcp',
'user-guide/features/honcho',
'user-guide/features/provider-routing',
@@ -101,6 +104,16 @@ const sidebars: SidebarsConfig = {
label: 'Developer Guide',
items: [
'developer-guide/architecture',
'developer-guide/agent-loop',
'developer-guide/provider-runtime',
'developer-guide/prompt-assembly',
'developer-guide/context-compression-and-caching',
'developer-guide/gateway-internals',
'developer-guide/session-storage',
'developer-guide/tools-runtime',
'developer-guide/acp-internals',
'developer-guide/trajectory-format',
'developer-guide/cron-internals',
'developer-guide/environments',
'developer-guide/adding-tools',
'developer-guide/creating-skills',