teknium1 ada3713e77 feat: add documentation website (Docusaurus)

- 25 documentation pages covering Getting Started, User Guide, Developer Guide, and Reference
- Docusaurus with custom amber/gold theme matching the landing page branding
- GitHub Actions workflow to deploy landing page + docs to GitHub Pages
- Landing page at root, docs at /docs/ on hermes-agent.nousresearch.com
- Content extracted and restructured from existing repo docs (README, AGENTS.md, CONTRIBUTING.md, docs/)
- Auto-deploy on push to main when website/ or landingpage/ changes

2026-03-05 05:24:55 -08:00

8.8 KiB

sidebar_position	title	description
1	Architecture	Hermes Agent internals — project structure, agent loop, key classes, and design patterns

Architecture

This guide covers the internal architecture of Hermes Agent for developers contributing to the project.

Project Structure

hermes-agent/
├── run_agent.py              # AIAgent class — core conversation loop, tool dispatch
├── cli.py                    # HermesCLI class — interactive TUI, prompt_toolkit
├── model_tools.py            # Tool orchestration (thin layer over tools/registry.py)
├── toolsets.py               # Tool groupings and presets
├── hermes_state.py           # SQLite session database with FTS5 full-text search
├── batch_runner.py           # Parallel batch processing for trajectory generation
│
├── agent/                    # Agent internals (extracted modules)
│   ├── prompt_builder.py         # System prompt assembly (identity, skills, memory)
│   ├── context_compressor.py     # Auto-summarization when approaching context limits
│   ├── auxiliary_client.py       # Resolves auxiliary OpenAI clients (summarization, vision)
│   ├── display.py                # KawaiiSpinner, tool progress formatting
│   ├── model_metadata.py         # Model context lengths, token estimation
│   └── trajectory.py             # Trajectory saving helpers
│
├── hermes_cli/               # CLI command implementations
│   ├── main.py                   # Entry point, argument parsing, command dispatch
│   ├── config.py                 # Config management, migration, env var definitions
│   ├── setup.py                  # Interactive setup wizard
│   ├── auth.py                   # Provider resolution, OAuth, Nous Portal
│   ├── models.py                 # OpenRouter model selection lists
│   ├── banner.py                 # Welcome banner, ASCII art
│   ├── commands.py               # Slash command definitions + autocomplete
│   ├── callbacks.py              # Interactive callbacks (clarify, sudo, approval)
│   ├── doctor.py                 # Diagnostics
│   └── skills_hub.py             # Skills Hub CLI + /skills slash command handler
│
├── tools/                    # Tool implementations (self-registering)
│   ├── registry.py               # Central tool registry (schemas, handlers, dispatch)
│   ├── approval.py               # Dangerous command detection + per-session approval
│   ├── terminal_tool.py          # Terminal orchestration (sudo, env lifecycle, backends)
│   ├── file_operations.py        # read_file, write_file, search, patch
│   ├── web_tools.py              # web_search, web_extract
│   ├── vision_tools.py           # Image analysis via multimodal models
│   ├── delegate_tool.py          # Subagent spawning and parallel task execution
│   ├── code_execution_tool.py    # Sandboxed Python with RPC tool access
│   ├── session_search_tool.py    # Search past conversations
│   ├── cronjob_tools.py          # Scheduled task management
│   ├── skill_tools.py            # Skill search, load, manage
│   └── environments/             # Terminal execution backends
│       ├── base.py                   # BaseEnvironment ABC
│       ├── local.py, docker.py, ssh.py, singularity.py, modal.py
│
├── gateway/                  # Messaging gateway
│   ├── run.py                    # GatewayRunner — platform lifecycle, message routing
│   ├── config.py                 # Platform configuration resolution
│   ├── session.py                # Session store, context prompts, reset policies
│   └── platforms/                # Platform adapters
│       ├── telegram.py, discord_adapter.py, slack.py, whatsapp.py
│
├── scripts/                  # Installer and bridge scripts
│   ├── install.sh                # Linux/macOS installer
│   ├── install.ps1               # Windows PowerShell installer
│   └── whatsapp-bridge/          # Node.js WhatsApp bridge (Baileys)
│
├── skills/                   # Bundled skills (copied to ~/.hermes/skills/)
├── environments/             # RL training environments (Atropos integration)
└── tests/                    # Test suite

Core Loop

The main agent loop lives in run_agent.py:

User message → AIAgent._run_agent_loop()
  ├── Build system prompt (prompt_builder.py)
  ├── Build API kwargs (model, messages, tools, reasoning config)
  ├── Call LLM (OpenAI-compatible API)
  ├── If tool_calls in response:
  │     ├── Execute each tool via registry dispatch
  │     ├── Add tool results to conversation
  │     └── Loop back to LLM call
  ├── If text response:
  │     ├── Persist session to DB
  │     └── Return final_response
  └── Context compression if approaching token limit

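# Simplified excerpt from inside the loop method. client, model, messages,
# tool_schemas, max_turns, execute_tool, and tool_result_message are
# defined by the surrounding code.
turns = 0
while turns < max_turns:
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tool_schemas,
    )
    # OpenAI-compatible responses wrap the reply in choices[0].message
    message = response.choices[0].message

    if message.tool_calls:
        # Record the assistant turn before appending its tool results
        messages.append(message)
        for tool_call in message.tool_calls:
            result = execute_tool(tool_call)
            messages.append(tool_result_message(result))
        turns += 1
    else:
        # Plain text reply: the loop is done
        return message.content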
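
The last branch of the diagram above, context compression when approaching the token limit, needs some way to decide when the conversation is "close enough" to the limit. The sketch below is one possible trigger check, not the logic in context_compressor.py: the function names, the four-characters-per-token estimate, and the 0.85 threshold are all assumptions made for illustration.

```python
from typing import List


def estimate_tokens(messages: List[dict]) -> int:
    """Very rough estimate: about four characters per token (assumption)."""
    chars = sum(len(str(m.get("content") or "")) for m in messages)
    return chars // 4


def should_compress(
    messages: List[dict],
    context_length: int,
    threshold: float = 0.85,  # hypothetical fraction of the window
) -> bool:
    """Return True once the estimated size reaches the threshold."""
    return estimate_tokens(messages) >= int(context_length * threshold)


# 400 characters of content is estimated at 100 tokens
history = [{"role": "user", "content": "x" * 400}]
near_limit = should_compress(history, context_length=100)
far_from_limit = should_compress(history, context_length=1000)
```

A real implementation would more likely use the model's tokenizer or the usage counts returned by the API, since a character-based estimate can be far off for code or non-English text.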

AIAgent Class

from typing import Callable, Optional


class AIAgent:
    def __init__(
        self,
        model: str = "anthropic/claude-sonnet-4",
        api_key: Optional[str] = None,
        base_url: str = "https://openrouter.ai/api/v1",
        max_iterations: int = 60,
        enabled_toolsets: Optional[list] = None,
        disabled_toolsets: Optional[list] = None,
        verbose_logging: bool = False,
        quiet_mode: bool = False,
        tool_progress_callback: Optional[Callable] = None,
    ):
        ...

    def chat(self, user_message: str, task_id: Optional[str] = None) -> str:
        # Main entry point: runs the agent loop and returns the final response
        ...

File Dependency Chain

tools/registry.py  (no deps — imported by all tool files)
       ↑
tools/*.py  (each calls registry.register() at import time)
       ↑
model_tools.py  (imports tools/registry + triggers tool discovery)
       ↑
run_agent.py, cli.py, batch_runner.py, environments/

Each tool file co-locates its schema, handler, and registration. model_tools.py is a thin orchestration layer.

Key Design Patterns

Self-Registering Tools

Each tool file calls registry.register() at import time. model_tools.py triggers discovery by importing all tool modules.
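
To make the pattern concrete, here is a minimal sketch of a registry that tool modules can register with at import time. It is a stand-in, not the code in tools/registry.py: the ToolRegistry class, its method names, and the echo tool are hypothetical.

```python
from typing import Any, Callable, Dict, List


class ToolRegistry:
    """Hypothetical stand-in for the central registry."""

    def __init__(self) -> None:
        self._schemas: Dict[str, dict] = {}
        self._handlers: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, schema: dict, handler: Callable[..., Any]) -> None:
        """Record a tool's schema and handler under its name."""
        self._schemas[name] = schema
        self._handlers[name] = handler

    def schemas(self) -> List[dict]:
        """Schemas to pass to the LLM as the tools argument."""
        return list(self._schemas.values())

    def dispatch(self, name: str, arguments: dict) -> Any:
        """Look up the handler by name and call it with the arguments."""
        if name not in self._handlers:
            raise KeyError(f"unknown tool: {name}")
        return self._handlers[name](**arguments)


registry = ToolRegistry()


# What a tool module might contain: handler, schema, and registration together.
def _echo(text: str) -> str:
    return text.upper()


registry.register(
    name="echo",
    schema={
        "name": "echo",
        "parameters": {"type": "object", "properties": {"text": {"type": "string"}}},
    },
    handler=_echo,
)

result = registry.dispatch("echo", {"text": "hi"})
```

Because registration runs as a side effect of importing the module, adding a tool under this pattern only requires that something imports the new file; no central list of handlers has to be edited.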

Toolset Grouping

Tools are grouped into toolsets (web, terminal, file, browser, etc.) that can be enabled/disabled per platform.
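
A sketch of how enabled and disabled toolsets could resolve to a flat tool list follows. The TOOLSETS table and resolve_tools function are hypothetical; the web and file tool names come from the project tree above, while the "terminal" tool name is an assumption. The real groupings live in toolsets.py.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical toolset table for illustration only.
TOOLSETS: Dict[str, List[str]] = {
    "web": ["web_search", "web_extract"],
    "file": ["read_file", "write_file", "search", "patch"],
    "terminal": ["terminal"],
}


def resolve_tools(
    enabled: Optional[Iterable[str]] = None,
    disabled: Optional[Iterable[str]] = None,
) -> List[str]:
    """Expand toolset names into tool names.

    enabled=None means "all toolsets"; disabled always wins over enabled.
    """
    names = list(TOOLSETS) if enabled is None else list(enabled)
    blocked: Set[str] = set(disabled or ())
    tools: List[str] = []
    for name in names:
        if name in blocked:
            continue
        for tool in TOOLSETS.get(name, []):
            if tool not in tools:  # keep order, avoid duplicates
                tools.append(tool)
    return tools


web_only = resolve_tools(enabled=["web"])
no_terminal = resolve_tools(disabled=["terminal"])
```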

Session Persistence

All conversations are stored in a SQLite database (hermes_state.py) with FTS5 full-text search. JSON logs are written to ~/.hermes/sessions/.
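
As a rough illustration of SQLite full-text search over stored messages, the sketch below uses Python's built-in sqlite3 module with an FTS5 virtual table. The table name, columns, and search_sessions helper are assumptions, not the schema in hermes_state.py, and the example requires a SQLite build with FTS5 enabled.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")

# UNINDEXED columns are stored but not searched; only content is tokenized.
conn.execute(
    "CREATE VIRTUAL TABLE messages_fts "
    "USING fts5(session_id UNINDEXED, role UNINDEXED, content)"
)
conn.executemany(
    "INSERT INTO messages_fts VALUES (?, ?, ?)",
    [
        ("s1", "user", "Search for Python tutorials"),
        ("s1", "assistant", "Here are three Python tutorials"),
        ("s2", "user", "Schedule a cron job for backups"),
    ],
)


def search_sessions(db: sqlite3.Connection, query: str) -> List[str]:
    """Return the ids of sessions with at least one matching message."""
    cur = db.execute(
        "SELECT DISTINCT session_id FROM messages_fts "
        "WHERE messages_fts MATCH ? ORDER BY session_id",
        (query,),
    )
    return [row[0] for row in cur.fetchall()]


python_sessions = search_sessions(conn, "python")
cron_sessions = search_sessions(conn, "cron")
```

The default FTS5 tokenizer is case-insensitive, so "python" matches "Python". User-supplied queries containing FTS5 operators or punctuation would need quoting before being passed to MATCH.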

Ephemeral Injection

System prompts and prefill messages are injected at API call time, never persisted to the database or logs.
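
One way to keep injected content out of storage is to build a fresh message list for each API call and leave the persisted list untouched. The sketch below assumes that approach; build_api_messages is a hypothetical helper, and placing prefill messages between the system prompt and the history is an assumption.

```python
from typing import Dict, List, Optional

Message = Dict[str, object]


def build_api_messages(
    stored: List[Message],
    system_prompt: str,
    prefill: Optional[List[Message]] = None,
) -> List[Message]:
    """Return a new list for the API call; `stored` is never mutated."""
    api_messages: List[Message] = [{"role": "system", "content": system_prompt}]
    api_messages.extend(prefill or [])
    api_messages.extend(stored)
    return api_messages


stored_history: List[Message] = [{"role": "user", "content": "Search for Python tutorials"}]
outgoing = build_api_messages(stored_history, "You are a helpful assistant...")
```

Only stored_history would be written to the database or logs; outgoing exists for the duration of the call.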

Provider Abstraction

The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint).
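
Since every provider is reached through an OpenAI-compatible endpoint, resolution can reduce to choosing a base URL and a credential. The sketch below is hypothetical throughout: the Provider dataclass, the resolve_provider function, its parameters, and especially the precedence order (custom endpoint, then Nous Portal, then OpenRouter) are assumptions, not the logic in hermes_cli/auth.py. The OpenRouter URL is the default from the AIAgent signature above.

```python
from dataclasses import dataclass
from typing import Optional

OPENROUTER_URL = "https://openrouter.ai/api/v1"


@dataclass(frozen=True)
class Provider:
    name: str
    base_url: str
    api_key: str


def resolve_provider(
    custom_base_url: Optional[str] = None,
    custom_api_key: Optional[str] = None,
    portal_base_url: Optional[str] = None,
    portal_token: Optional[str] = None,
    openrouter_api_key: Optional[str] = None,
) -> Provider:
    """Pick one provider; the precedence here is an assumed order."""
    if custom_base_url:
        return Provider("custom", custom_base_url, custom_api_key or "")
    if portal_base_url and portal_token:
        return Provider("nous-portal", portal_base_url, portal_token)
    if openrouter_api_key:
        return Provider("openrouter", OPENROUTER_URL, openrouter_api_key)
    raise ValueError("no provider credentials configured")


provider = resolve_provider(openrouter_api_key="sk-example")
```

The resulting base_url and api_key are all an OpenAI-compatible client needs, which is what lets the rest of the agent stay provider-agnostic.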

Conversation Format

Messages follow the OpenAI format:

messages = [
    {"role": "system", "content": "You are a helpful assistant..."},
    {"role": "user", "content": "Search for Python tutorials"},
    {"role": "assistant", "content": None, "tool_calls": [...]},
    {"role": "tool", "tool_call_id": "...", "content": "..."},
    {"role": "assistant", "content": "Here's what I found..."},
]
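
In this format, each tool message answers an earlier assistant tool call through its tool_call_id. The sketch below checks that pairing; unanswered_tool_calls is a hypothetical helper, and the call id and arguments in the example are made-up values.

```python
from typing import Dict, List


def unanswered_tool_calls(messages: List[dict]) -> List[str]:
    """Return ids of assistant tool calls that have no matching tool message."""
    pending: Dict[str, None] = {}  # dict keeps insertion order
    for msg in messages:
        if msg["role"] == "assistant":
            for call in msg.get("tool_calls") or []:
                pending[call["id"]] = None
        elif msg["role"] == "tool":
            pending.pop(msg["tool_call_id"], None)
    return list(pending)


conversation = [
    {"role": "system", "content": "You are a helpful assistant..."},
    {"role": "user", "content": "Search for Python tutorials"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_1",  # made-up id
                "type": "function",
                "function": {"name": "web_search", "arguments": '{"query": "Python tutorials"}'},
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "..."},
    {"role": "assistant", "content": "Here's what I found..."},
]

complete = unanswered_tool_calls(conversation)
truncated = unanswered_tool_calls(conversation[:3])
```

A check like this is useful before sending history back to the API, because OpenAI-compatible endpoints generally reject an assistant tool call that is not followed by its tool result.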

CLI Architecture

The interactive CLI (cli.py) uses:

- Rich: welcome banner and styled panels
- prompt_toolkit: fixed input area with history, patch_stdout, and slash command autocomplete
- KawaiiSpinner: animated kawaii faces during API calls, plus a clean activity feed for tool results

Key UX behaviors:

- The thinking spinner shows an animated kawaii face plus a verb, for example (⌐■_■) deliberating...
- Tool execution results appear as ┊ {emoji} {verb} {detail} {duration}
- The prompt shows ⚕ ❯ when working and ❯ when idle
- Pasting 5+ lines auto-saves the text to ~/.hermes/pastes/ and collapses it
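
Two of the behaviors above are simple enough to sketch: the activity-feed line template and the 5-line paste threshold. Both functions are hypothetical, and rendering the duration as seconds with one decimal place is an assumption about the {duration} field.

```python
def format_tool_line(emoji: str, verb: str, detail: str, seconds: float) -> str:
    """Fill the ┊ {emoji} {verb} {detail} {duration} template."""
    return f"┊ {emoji} {verb} {detail} {seconds:.1f}s"


def should_collapse_paste(text: str, threshold: int = 5) -> bool:
    """True when a paste has at least `threshold` lines."""
    return len(text.splitlines()) >= threshold


line = format_tool_line("🔍", "search", "python tutorials", 1.234)
five_lines = should_collapse_paste("a\nb\nc\nd\ne")
four_lines = should_collapse_paste("a\nb\nc\nd")
```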

Messaging Gateway Architecture

The gateway (gateway/run.py) uses GatewayRunner to:

- Connect to all configured platforms
- Route messages through per-chat session stores
- Dispatch to AIAgent instances
- Run the cron scheduler (ticks every 60s)
- Handle interrupts and tool progress notifications

Each platform adapter conforms to BasePlatformAdapter.
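
The routing idea can be sketched with an abstract adapter and a runner that keeps one history per (platform, chat) pair. Everything here is a stand-in: the real BasePlatformAdapter interface may define different methods, and MiniRunner, EchoAdapter, and the reply function are hypothetical.

```python
import abc
from typing import Callable, Dict, List, Tuple


class BasePlatformAdapter(abc.ABC):
    """Stand-in interface; the method set is an assumption."""

    name: str = "base"

    @abc.abstractmethod
    def send(self, chat_id: str, text: str) -> None:
        """Deliver a reply to the given chat."""


class EchoAdapter(BasePlatformAdapter):
    """Test adapter that records outgoing messages instead of sending them."""

    name = "echo"

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def send(self, chat_id: str, text: str) -> None:
        self.sent.append((chat_id, text))


class MiniRunner:
    """Routes each incoming message through a per-chat session history."""

    def __init__(self, reply_fn: Callable[[List[dict]], str]) -> None:
        self._reply_fn = reply_fn
        self.sessions: Dict[Tuple[str, str], List[dict]] = {}

    def handle(self, adapter: BasePlatformAdapter, chat_id: str, text: str) -> None:
        history = self.sessions.setdefault((adapter.name, chat_id), [])
        history.append({"role": "user", "content": text})
        reply = self._reply_fn(history)  # an agent call would go here
        history.append({"role": "assistant", "content": reply})
        adapter.send(chat_id, reply)


adapter = EchoAdapter()
runner = MiniRunner(lambda history: f"seen {len(history)} message(s)")
runner.handle(adapter, "chat-a", "hello")
runner.handle(adapter, "chat-a", "again")
runner.handle(adapter, "chat-b", "hi")
```

Keying sessions by platform and chat id keeps two chats from sharing context even when they arrive through the same adapter.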

Configuration System

- ~/.hermes/config.yaml: all settings
- ~/.hermes/.env: API keys and secrets
- _config_version in DEFAULT_CONFIG: bumped when required fields are added, which triggers migration prompts
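
The version check behind migration prompts might look like the sketch below. Only the _config_version key and the DEFAULT_CONFIG name come from the text above; the version number, the other keys, and both helper functions are hypothetical, and treating a missing version as 0 is an assumption.

```python
from typing import List

# Hypothetical defaults for illustration; not the project's real config.
DEFAULT_CONFIG = {
    "_config_version": 3,
    "model": "anthropic/claude-sonnet-4",
    "max_iterations": 60,
}


def needs_migration(user_config: dict) -> bool:
    """True when the user's config predates the current default version."""
    return user_config.get("_config_version", 0) < DEFAULT_CONFIG["_config_version"]


def missing_keys(user_config: dict) -> List[str]:
    """Keys present in the defaults but absent from the user's config."""
    return sorted(key for key in DEFAULT_CONFIG if key not in user_config)


old_config = {"_config_version": 1, "model": "anthropic/claude-sonnet-4"}
migrate = needs_migration(old_config)
to_prompt_for = missing_keys(old_config)
```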
