Major changes across 20 documentation pages: Staleness fixes: - Fix FAQ: wrong import path (hermes.agent → run_agent) - Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash - Fix integrations/index: missing MiniMax TTS provider - Fix integrations/index: web_crawl is not a registered tool - Fix sessions: add all 19 session sources (was only 5) - Fix cron: add all 18 delivery targets (was only telegram/discord) - Fix webhooks: add all delivery targets - Fix overview: add missing MCP, memory providers, credential pools - Fix all line-number references → use function name searches instead - Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500) Expanded thin pages (< 150 lines → substantial depth): - honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI - overview.md: 49 → 55 lines — added MCP, memory providers, credential pools - toolsets-reference.md: 57 → 175 lines — added explanations, config examples, custom toolsets, wildcards, platform differences table - optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across communication, devops, mlops (18!), productivity, research categories - integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections - cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states, tick cycle, delivery targets, script-backed jobs, CLI interface - gateway-internals.md: 111 → 250 lines — added architecture diagram, message flow, two-level guard, platform adapters, token locks, process management - agent-loop.md: 112 → 235 lines — added entry points, API mode resolution, turn lifecycle detail, message alternation rules, tool execution flow, callback table, budget tracking, compression details - architecture.md: 152 → 295 lines — added system overview diagram, data flow diagrams, design principles table, dependency chain Other depth additions: - context-references.md: added platform availability, compression interaction, common patterns sections - slash-commands.md: added quick commands config example, alias resolution - image-generation.md: added platform delivery table - tools-reference.md: added tool counts, MCP tools note - index.md: updated platform count (5 → 14+), tool count (40+ → 47)
275 lines
16 KiB
Markdown
275 lines
16 KiB
Markdown
---
|
|
sidebar_position: 1
|
|
title: "Architecture"
|
|
description: "Hermes Agent internals — major subsystems, execution paths, data flow, and where to read next"
|
|
---
|
|
|
|
# Architecture
|
|
|
|
This page is the top-level map of Hermes Agent internals. Use it to orient yourself in the codebase, then dive into subsystem-specific docs for implementation details.
|
|
|
|
## System Overview
|
|
|
|
```text
|
|
┌─────────────────────────────────────────────────────────────────────┐
|
|
│ Entry Points │
|
|
│ │
|
|
│ CLI (cli.py) Gateway (gateway/run.py) ACP (acp_adapter/) │
|
|
│ Batch Runner API Server Python Library │
|
|
└──────────┬──────────────┬───────────────────────┬────────────────────┘
|
|
│ │ │
|
|
▼ ▼ ▼
|
|
┌─────────────────────────────────────────────────────────────────────┐
|
|
│ AIAgent (run_agent.py) │
|
|
│ │
|
|
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
|
│ │ Prompt │ │ Provider │ │ Tool │ │
|
|
│ │ Builder │ │ Resolution │ │ Dispatch │ │
|
|
│ │ (prompt_ │ │ (runtime_ │ │ (model_ │ │
|
|
│ │ builder.py) │ │ provider.py)│ │ tools.py) │ │
|
|
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
|
|
│ │ │ │ │
|
|
│ ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐ │
|
|
│ │ Compression │ │ 3 API Modes │ │ Tool Registry│ │
|
|
│ │ & Caching │ │ chat_compl. │ │ (registry.py)│ │
|
|
│ │ │ │ codex_resp. │ │ 47 tools │ │
|
|
│ │ │ │ anthropic │ │ 37 toolsets │ │
|
|
│ └──────────────┘ └──────────────┘ └──────────────┘ │
|
|
└─────────────────────────────────────────────────────────────────────┘
|
|
│ │
|
|
▼ ▼
|
|
┌───────────────────┐ ┌──────────────────────┐
|
|
│ Session Storage │ │ Tool Backends │
|
|
│ (SQLite + FTS5) │ │ Terminal (6 backends) │
|
|
│ hermes_state.py │ │ Browser (5 backends) │
|
|
│ gateway/session.py│ │ Web (4 backends) │
|
|
└───────────────────┘ │ MCP (dynamic) │
|
|
│ File, Vision, etc. │
|
|
└──────────────────────┘
|
|
```
|
|
|
|
## Directory Structure
|
|
|
|
```text
|
|
hermes-agent/
|
|
├── run_agent.py # AIAgent — core conversation loop (~9,200 lines)
|
|
├── cli.py # HermesCLI — interactive terminal UI (~8,500 lines)
|
|
├── model_tools.py # Tool discovery, schema collection, dispatch
|
|
├── toolsets.py # Tool groupings and platform presets
|
|
├── hermes_state.py # SQLite session/state database with FTS5
|
|
├── hermes_constants.py # HERMES_HOME, profile-aware paths
|
|
├── batch_runner.py # Batch trajectory generation
|
|
│
|
|
├── agent/ # Agent internals
|
|
│ ├── prompt_builder.py # System prompt assembly
|
|
│ ├── context_compressor.py # Conversation compression algorithm
|
|
│ ├── prompt_caching.py # Anthropic prompt caching
|
|
│ ├── auxiliary_client.py # Auxiliary LLM for side tasks (vision, summarization)
|
|
│ ├── model_metadata.py # Model context lengths, token estimation
|
|
│ ├── models_dev.py # models.dev registry integration
|
|
│ ├── anthropic_adapter.py # Anthropic Messages API format conversion
|
|
│ ├── display.py # KawaiiSpinner, tool preview formatting
|
|
│ ├── skill_commands.py # Skill slash commands
|
|
│ ├── memory_store.py # Persistent memory read/write
|
|
│ └── trajectory.py # Trajectory saving helpers
|
|
│
|
|
├── hermes_cli/ # CLI subcommands and setup
|
|
│ ├── main.py # Entry point — all `hermes` subcommands (~4,200 lines)
|
|
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
|
|
│ ├── commands.py # COMMAND_REGISTRY — central slash command definitions
|
|
│ ├── auth.py # PROVIDER_REGISTRY, credential resolution
|
|
│ ├── runtime_provider.py # Provider → api_mode + credentials
|
|
│ ├── models.py # Model catalog, provider model lists
|
|
│ ├── model_switch.py # /model command logic (CLI + gateway shared)
|
|
│ ├── setup.py # Interactive setup wizard (~3,500 lines)
|
|
│ ├── skin_engine.py # CLI theming engine
|
|
│ ├── skills_config.py # hermes skills — enable/disable per platform
|
|
│ ├── skills_hub.py # /skills slash command
|
|
│ ├── tools_config.py # hermes tools — enable/disable per platform
|
|
│ ├── plugins.py # PluginManager — discovery, loading, hooks
|
|
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
|
|
│ └── gateway.py # hermes gateway start/stop
|
|
│
|
|
├── tools/ # Tool implementations (one file per tool)
|
|
│ ├── registry.py # Central tool registry
|
|
│ ├── approval.py # Dangerous command detection
|
|
│ ├── terminal_tool.py # Terminal orchestration
|
|
│ ├── process_registry.py # Background process management
|
|
│ ├── file_tools.py # read_file, write_file, patch, search_files
|
|
│ ├── web_tools.py # web_search, web_extract
|
|
│ ├── browser_tool.py # 11 browser automation tools
|
|
│ ├── code_execution_tool.py # execute_code sandbox
|
|
│ ├── delegate_tool.py # Subagent delegation
|
|
│ ├── mcp_tool.py # MCP client (~1,050 lines)
|
|
│ ├── credential_files.py # File-based credential passthrough
|
|
│ ├── env_passthrough.py # Env var passthrough for sandboxes
|
|
│ ├── ansi_strip.py # ANSI escape stripping
|
|
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
|
|
│
|
|
├── gateway/ # Messaging platform gateway
|
|
│ ├── run.py # GatewayRunner — message dispatch (~5,800 lines)
|
|
│ ├── session.py # SessionStore — conversation persistence
|
|
│ ├── delivery.py # Outbound message delivery
|
|
│ ├── pairing.py # DM pairing authorization
|
|
│ ├── hooks.py # Hook discovery and lifecycle events
|
|
│ ├── mirror.py # Cross-session message mirroring
|
|
│ ├── status.py # Token locks, profile-scoped process tracking
|
|
│ ├── builtin_hooks/ # Always-registered hooks
|
|
│ └── platforms/ # 14 adapters: telegram, discord, slack, whatsapp,
|
|
│ # signal, matrix, mattermost, email, sms,
|
|
│ # dingtalk, feishu, wecom, homeassistant, webhook
|
|
│
|
|
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains)
|
|
├── cron/ # Scheduler (jobs.py, scheduler.py)
|
|
├── plugins/memory/ # Memory provider plugins
|
|
├── environments/ # RL training environments (Atropos)
|
|
├── skills/ # Bundled skills (always available)
|
|
├── optional-skills/ # Official optional skills (install explicitly)
|
|
├── website/ # Docusaurus documentation site
|
|
└── tests/ # Pytest suite (~3,000+ tests)
|
|
```
|
|
|
|
## Data Flow
|
|
|
|
### CLI Session
|
|
|
|
```text
|
|
User input → HermesCLI.process_input()
|
|
→ AIAgent.run_conversation()
|
|
→ prompt_builder.build_system_prompt()
|
|
→ runtime_provider.resolve_runtime_provider()
|
|
→ API call (chat_completions / codex_responses / anthropic_messages)
|
|
→ tool_calls? → model_tools.handle_function_call() → loop
|
|
→ final response → display → save to SessionDB
|
|
```
|
|
|
|
### Gateway Message
|
|
|
|
```text
|
|
Platform event → Adapter.on_message() → MessageEvent
|
|
→ GatewayRunner._handle_message()
|
|
→ authorize user
|
|
→ resolve session key
|
|
→ create AIAgent with session history
|
|
→ AIAgent.run_conversation()
|
|
→ deliver response back through adapter
|
|
```
|
|
|
|
### Cron Job
|
|
|
|
```text
|
|
Scheduler tick → load due jobs from jobs.json
|
|
→ create fresh AIAgent (no history)
|
|
→ inject attached skills as context
|
|
→ run job prompt
|
|
→ deliver response to target platform
|
|
→ update job state and next_run
|
|
```
|
|
|
|
## Recommended Reading Order
|
|
|
|
If you are new to the codebase:
|
|
|
|
1. **This page** — orient yourself
|
|
2. **[Agent Loop Internals](./agent-loop.md)** — how AIAgent works
|
|
3. **[Prompt Assembly](./prompt-assembly.md)** — system prompt construction
|
|
4. **[Provider Runtime Resolution](./provider-runtime.md)** — how providers are selected
|
|
5. **[Adding Providers](./adding-providers.md)** — practical guide to adding a new provider
|
|
6. **[Tools Runtime](./tools-runtime.md)** — tool registry, dispatch, environments
|
|
7. **[Session Storage](./session-storage.md)** — SQLite schema, FTS5, session lineage
|
|
8. **[Gateway Internals](./gateway-internals.md)** — messaging platform gateway
|
|
9. **[Context Compression & Prompt Caching](./context-compression-and-caching.md)** — compression and caching
|
|
10. **[ACP Internals](./acp-internals.md)** — IDE integration
|
|
11. **[Environments, Benchmarks & Data Generation](./environments.md)** — RL training
|
|
|
|
## Major Subsystems
|
|
|
|
### Agent Loop
|
|
|
|
The synchronous orchestration engine (`AIAgent` in `run_agent.py`). Handles provider selection, prompt construction, tool execution, retries, fallback, callbacks, compression, and persistence. Supports three API modes for different provider backends.
|
|
|
|
→ [Agent Loop Internals](./agent-loop.md)
|
|
|
|
### Prompt System
|
|
|
|
Prompt construction and maintenance across the conversation lifecycle:
|
|
|
|
- **`prompt_builder.py`** — Assembles the system prompt from: personality (SOUL.md), memory (MEMORY.md, USER.md), skills, context files (AGENTS.md, .hermes.md), tool-use guidance, and model-specific instructions
|
|
- **`prompt_caching.py`** — Applies Anthropic cache breakpoints for prefix caching
|
|
- **`context_compressor.py`** — Summarizes middle conversation turns when context exceeds thresholds
|
|
|
|
→ [Prompt Assembly](./prompt-assembly.md), [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
|
|
|
### Provider Resolution
|
|
|
|
A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls. Maps `(provider, model)` tuples to `(api_mode, api_key, base_url)`. Handles 18+ providers, OAuth flows, credential pools, and alias resolution.
|
|
|
|
→ [Provider Runtime Resolution](./provider-runtime.md)
|
|
|
|
### Tool System
|
|
|
|
Central tool registry (`tools/registry.py`) with 47 registered tools across 20 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 6 backends (local, Docker, SSH, Daytona, Modal, Singularity).
|
|
|
|
→ [Tools Runtime](./tools-runtime.md)
|
|
|
|
### Session Persistence
|
|
|
|
SQLite-based session storage with FTS5 full-text search. Sessions have lineage tracking (parent/child across compressions), per-platform isolation, and atomic writes with contention handling.
|
|
|
|
→ [Session Storage](./session-storage.md)
|
|
|
|
### Messaging Gateway
|
|
|
|
Long-running process with 14 platform adapters, unified session routing, user authorization (allowlists + DM pairing), slash command dispatch, hook system, cron ticking, and background maintenance.
|
|
|
|
→ [Gateway Internals](./gateway-internals.md)
|
|
|
|
### Plugin System
|
|
|
|
Three discovery sources: `~/.hermes/plugins/` (user), `.hermes/plugins/` (project), and pip entry points. Plugins register tools, hooks, and CLI commands through a context API. Memory providers are a specialized plugin type under `plugins/memory/`.
|
|
|
|
→ [Plugin Guide](/docs/guides/build-a-hermes-plugin), [Memory Provider Plugin](./memory-provider-plugin.md)
|
|
|
|
### Cron
|
|
|
|
First-class agent tasks (not shell tasks). Jobs store in JSON, support multiple schedule formats, can attach skills and scripts, and deliver to any platform.
|
|
|
|
→ [Cron Internals](./cron-internals.md)
|
|
|
|
### ACP Integration
|
|
|
|
Exposes Hermes as an editor-native agent over stdio/JSON-RPC for VS Code, Zed, and JetBrains.
|
|
|
|
→ [ACP Internals](./acp-internals.md)
|
|
|
|
### RL / Environments / Trajectories
|
|
|
|
Full environment framework for evaluation and RL training. Integrates with Atropos, supports multiple tool-call parsers, and generates ShareGPT-format trajectories.
|
|
|
|
→ [Environments, Benchmarks & Data Generation](./environments.md), [Trajectories & Training Format](./trajectory-format.md)
|
|
|
|
## Design Principles
|
|
|
|
| Principle | What it means in practice |
|
|
|-----------|--------------------------|
|
|
| **Prompt stability** | System prompt doesn't change mid-conversation. No cache-breaking mutations except explicit user actions (`/model`). |
|
|
| **Observable execution** | Every tool call is visible to the user via callbacks. Progress updates in CLI (spinner) and gateway (chat messages). |
|
|
| **Interruptible** | API calls and tool execution can be cancelled mid-flight by user input or signals. |
|
|
| **Platform-agnostic core** | One AIAgent class serves CLI, gateway, ACP, batch, and API server. Platform differences live in the entry point, not the agent. |
|
|
| **Loose coupling** | Optional subsystems (MCP, plugins, memory providers, RL environments) use registry patterns and check_fn gating, not hard dependencies. |
|
|
| **Profile isolation** | Each profile (`hermes -p <name>`) gets its own HERMES_HOME, config, memory, sessions, and gateway PID. Multiple profiles run concurrently. |
|
|
|
|
## File Dependency Chain
|
|
|
|
```text
|
|
tools/registry.py (no deps — imported by all tool files)
|
|
↑
|
|
tools/*.py (each calls registry.register() at import time)
|
|
↑
|
|
model_tools.py (imports tools/registry + triggers tool discovery)
|
|
↑
|
|
run_agent.py, cli.py, batch_runner.py, environments/
|
|
```
|
|
|
|
This chain means tool registration happens at import time, before any agent instance is created. Adding a new tool requires an import in `model_tools.py`'s `_discover_tools()` list.
|