Files
hermes-agent/website/docs/developer-guide/architecture.md
teknium1 19016497ef docs: fix all remaining minor accuracy issues
- updating.md: Note that 'hermes update' auto-handles config migration
- cli.md: Add summary_model to compression config, fix display config
  (add personality/compact), remove unverified pastes/ claim
- configuration.md: Add 5 missing config sections (stt, human_delay,
  code_execution, delegation, clarify), fix display defaults,
  fix reasoning_effort default to empty/unset
- messaging/index.md: Add GATEWAY_ALLOWED_USERS to security section
- skills.md: Add category field to skills_list return value
- mcp.md: Document auto-registered utility tools (resources/prompts)
- architecture.md: Fix file_tools.py reference, base_url default to None,
  synchronous agent loop pseudocode
- cli-commands.md: Fix hermes logout description
- environment-variables.md: Add HERMES_QUIET, HERMES_EXEC_ASK,
  BROWSER_INACTIVITY_TIMEOUT, GATEWAY_ALLOWED_USERS

Verification scan: 27/27 checks passed, zero issues remaining.
2026-03-05 07:00:51 -08:00

218 lines
8.9 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
sidebar_position: 1
title: "Architecture"
description: "Hermes Agent internals — project structure, agent loop, key classes, and design patterns"
---
# Architecture
This guide covers the internal architecture of Hermes Agent for developers contributing to the project.
## Project Structure
```
hermes-agent/
├── run_agent.py # AIAgent class — core conversation loop, tool dispatch
├── cli.py # HermesCLI class — interactive TUI, prompt_toolkit
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
├── toolsets.py # Tool groupings and presets
├── hermes_state.py # SQLite session database with FTS5 full-text search
├── batch_runner.py # Parallel batch processing for trajectory generation
├── agent/ # Agent internals (extracted modules)
│ ├── prompt_builder.py # System prompt assembly (identity, skills, memory)
│ ├── context_compressor.py # Auto-summarization when approaching context limits
│ ├── auxiliary_client.py # Resolves auxiliary OpenAI clients (summarization, vision)
│ ├── display.py # KawaiiSpinner, tool progress formatting
│ ├── model_metadata.py # Model context lengths, token estimation
│ └── trajectory.py # Trajectory saving helpers
├── hermes_cli/ # CLI command implementations
│ ├── main.py # Entry point, argument parsing, command dispatch
│ ├── config.py # Config management, migration, env var definitions
│ ├── setup.py # Interactive setup wizard
│ ├── auth.py # Provider resolution, OAuth, Nous Portal
│ ├── models.py # OpenRouter model selection lists
│ ├── banner.py # Welcome banner, ASCII art
│ ├── commands.py # Slash command definitions + autocomplete
│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval)
│ ├── doctor.py # Diagnostics
│ └── skills_hub.py # Skills Hub CLI + /skills slash command handler
├── tools/ # Tool implementations (self-registering)
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
│ ├── approval.py # Dangerous command detection + per-session approval
│ ├── terminal_tool.py # Terminal orchestration (sudo, env lifecycle, backends)
│ ├── file_operations.py # File tool implementations (read, write, search, patch)
│ ├── file_tools.py # File tool registration
│ ├── web_tools.py # web_search, web_extract
│ ├── vision_tools.py # Image analysis via multimodal models
│ ├── delegate_tool.py # Subagent spawning and parallel task execution
│ ├── code_execution_tool.py # Sandboxed Python with RPC tool access
│ ├── session_search_tool.py # Search past conversations
│ ├── cronjob_tools.py # Scheduled task management
│ ├── skills_tool.py # Skill search and load
│ ├── skill_manager_tool.py # Skill management
│ └── environments/ # Terminal execution backends
│ ├── base.py # BaseEnvironment ABC
│ ├── local.py, docker.py, ssh.py, singularity.py, modal.py
├── gateway/ # Messaging gateway
│ ├── run.py # GatewayRunner — platform lifecycle, message routing
│ ├── config.py # Platform configuration resolution
│ ├── session.py # Session store, context prompts, reset policies
│ └── platforms/ # Platform adapters
│ ├── telegram.py, discord_adapter.py, slack.py, whatsapp.py
├── scripts/ # Installer and bridge scripts
│ ├── install.sh # Linux/macOS installer
│ ├── install.ps1 # Windows PowerShell installer
│ └── whatsapp-bridge/ # Node.js WhatsApp bridge (Baileys)
├── skills/ # Bundled skills (copied to ~/.hermes/skills/)
├── environments/ # RL training environments (Atropos integration)
└── tests/ # Test suite
```
## Core Loop
The main agent loop lives in `run_agent.py`:
```
User message → AIAgent._run_agent_loop()
├── Build system prompt (prompt_builder.py)
├── Build API kwargs (model, messages, tools, reasoning config)
├── Call LLM (OpenAI-compatible API)
├── If tool_calls in response:
│ ├── Execute each tool via registry dispatch
│ ├── Add tool results to conversation
│ └── Loop back to LLM call
├── If text response:
│ ├── Persist session to DB
│ └── Return final_response
└── Context compression if approaching token limit
```
```python
while turns < max_turns:
response = client.chat.completions.create(
model=model,
messages=messages,
tools=tool_schemas,
)
if response.tool_calls:
for tool_call in response.tool_calls:
result = execute_tool(tool_call)
messages.append(tool_result_message(result))
turns += 1
else:
return response.content
```
## AIAgent Class
```python
class AIAgent:
def __init__(
self,
model: str = "anthropic/claude-opus-4.6",
api_key: str = None,
base_url: str = None, # Resolved internally based on provider
max_iterations: int = 60,
enabled_toolsets: list = None,
disabled_toolsets: list = None,
verbose_logging: bool = False,
quiet_mode: bool = False,
tool_progress_callback: callable = None,
):
...
def chat(self, message: str) -> str:
# Main entry point - runs the agent loop
...
```
## File Dependency Chain
```
tools/registry.py (no deps — imported by all tool files)
tools/*.py (each calls registry.register() at import time)
model_tools.py (imports tools/registry + triggers tool discovery)
run_agent.py, cli.py, batch_runner.py, environments/
```
Each tool file co-locates its schema, handler, and registration. `model_tools.py` is a thin orchestration layer.
## Key Design Patterns
### Self-Registering Tools
Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules.
### Toolset Grouping
Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform.
### Session Persistence
All conversations are stored in SQLite (`hermes_state.py`) with full-text search. JSON logs go to `~/.hermes/sessions/`.
### Ephemeral Injection
System prompts and prefill messages are injected at API call time, never persisted to the database or logs.
### Provider Abstraction
The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint).
### Conversation Format
Messages follow the OpenAI format:
```python
messages = [
{"role": "system", "content": "You are a helpful assistant..."},
{"role": "user", "content": "Search for Python tutorials"},
{"role": "assistant", "content": None, "tool_calls": [...]},
{"role": "tool", "tool_call_id": "...", "content": "..."},
{"role": "assistant", "content": "Here's what I found..."},
]
```
## CLI Architecture
The interactive CLI (`cli.py`) uses:
- **Rich** — Welcome banner and styled panels
- **prompt_toolkit** — Fixed input area with history, `patch_stdout`, slash command autocomplete
- **KawaiiSpinner** — Animated kawaii faces during API calls; clean activity feed for tool results
Key UX behaviors:
- Thinking spinner shows animated kawaii face + verb (`(⌐■_■) deliberating...`)
- Tool execution results appear as `┊ {emoji} {verb} {detail} {duration}`
- Prompt shows `⚕ ` when working, `` when idle
- Multi-line paste support with automatic formatting
## Messaging Gateway Architecture
The gateway (`gateway/run.py`) uses `GatewayRunner` to:
1. Connect to all configured platforms
2. Route messages through per-chat session stores
3. Dispatch to AIAgent instances
4. Run the cron scheduler (ticks every 60s)
5. Handle interrupts and tool progress notifications
Each platform adapter conforms to `BasePlatformAdapter`.
## Configuration System
- `~/.hermes/config.yaml` — All settings
- `~/.hermes/.env` — API keys and secrets
- `_config_version` in `DEFAULT_CONFIG` — Bumped when required fields are added, triggers migration prompts