docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393)

Major changes across 20 documentation pages:

Staleness fixes:
- Fix FAQ: wrong import path (hermes.agent → run_agent)
- Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash
- Fix integrations/index: missing MiniMax TTS provider
- Fix integrations/index: web_crawl is not a registered tool
- Fix sessions: add all 19 session sources (was only 5)
- Fix cron: add all 18 delivery targets (was only telegram/discord)
- Fix webhooks: add all delivery targets
- Fix overview: add missing MCP, memory providers, credential pools
- Fix all line-number references → use function name searches instead
- Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500)

Expanded thin pages (< 150 lines → substantial depth):
- honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI
- overview.md: 49 → 55 lines — added MCP, memory providers, credential pools
- toolsets-reference.md: 57 → 175 lines — added explanations, config examples,
  custom toolsets, wildcards, platform differences table
- optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across
  communication, devops, mlops (18!), productivity, research categories
- integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections
- cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states,
  tick cycle, delivery targets, script-backed jobs, CLI interface
- gateway-internals.md: 111 → 250 lines — added architecture diagram, message
  flow, two-level guard, platform adapters, token locks, process management
- agent-loop.md: 112 → 235 lines — added entry points, API mode resolution,
  turn lifecycle detail, message alternation rules, tool execution flow,
  callback table, budget tracking, compression details
- architecture.md: 152 → 295 lines — added system overview diagram, data flow
  diagrams, design principles table, dependency chain

Other depth additions:
- context-references.md: added platform availability, compression interaction,
  common patterns sections
- slash-commands.md: added quick commands config example, alias resolution
- image-generation.md: added platform delivery table
- tools-reference.md: added tool counts, MCP tools note
- index.md: updated platform count (5 → 14+), tool count (40+ → 47)
This commit is contained in:
Teknium
2026-04-05 19:45:50 -07:00
committed by GitHub
parent fec58ad99e
commit 43d468cea8
20 changed files with 1243 additions and 406 deletions

View File

@@ -6,107 +6,231 @@ description: "Detailed walkthrough of AIAgent execution, API modes, tools, callb
# Agent Loop Internals
The core orchestration engine is `run_agent.py`'s `AIAgent` class — roughly 9,200 lines that handle everything from prompt assembly to tool dispatch to provider failover.
## Core Responsibilities

`AIAgent` is responsible for:

- Assembling the effective system prompt and tool schemas via `prompt_builder.py`
- Selecting the correct provider/API mode (chat_completions, codex_responses, anthropic_messages)
- Making interruptible model calls with cancellation support
- Executing tool calls (sequentially or concurrently via thread pool)
- Maintaining conversation history in OpenAI message format
- Handling compression, retries, and fallback model switching
- Tracking iteration budgets across parent and child agents
- Flushing persistent memory before context is lost
## Two Entry Points

```python
# Simple interface — returns final response string
response = agent.chat("Fix the bug in main.py")

# Full interface — returns dict with messages, metadata, usage stats
result = agent.run_conversation(
    user_message="Fix the bug in main.py",
    system_message=None,         # auto-built if omitted
    conversation_history=None,   # auto-loaded from session if omitted
    task_id="task_abc123"
)
```

`chat()` is a thin wrapper around `run_conversation()` that extracts the `final_response` field from the result dict.
## API Modes

Hermes supports three API execution modes, resolved from provider selection, explicit args, and base URL heuristics:
| API mode | Used for | Client type |
|----------|----------|-------------|
| `chat_completions` | OpenAI-compatible endpoints (OpenRouter, custom, most providers) | `openai.OpenAI` |
| `codex_responses` | OpenAI Codex / Responses API | `openai.OpenAI` with Responses format |
| `anthropic_messages` | Native Anthropic Messages API | `anthropic.Anthropic` via adapter |
The mode determines how messages are formatted, how tool calls are structured, how responses are parsed, and how caching/streaming works. All three converge on the same internal message format (OpenAI-style `role`/`content`/`tool_calls` dicts) before and after API calls.
**Mode resolution order:**
1. Explicit `api_mode` constructor arg (highest priority)
2. Provider-specific detection (e.g., `anthropic` provider → `anthropic_messages`)
3. Base URL heuristics (e.g., `api.anthropic.com` → `anthropic_messages`)
4. Default: `chat_completions`
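As a sketch, the resolution order can be expressed as a small helper. The function name and signature here are illustrative, not the actual Hermes API:

```python
def resolve_api_mode(api_mode=None, provider=None, base_url=None):
    """Sketch of the priority order; not the real Hermes implementation."""
    # 1. Explicit constructor arg wins
    if api_mode:
        return api_mode
    # 2. Provider-specific detection
    if provider == "anthropic":
        return "anthropic_messages"
    # 3. Base URL heuristics
    if base_url and "api.anthropic.com" in base_url:
        return "anthropic_messages"
    # 4. Default
    return "chat_completions"
```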
## Turn Lifecycle
Each iteration of the agent loop follows this sequence:
```text
run_conversation()
-> generate effective task_id
-> append current user message
-> load or build cached system prompt
-> maybe preflight-compress
-> build api_messages
-> inject ephemeral prompt layers
-> apply prompt caching if appropriate
-> make interruptible API call
-> if tool calls: execute them, append tool results, loop
-> if final text: persist, cleanup, return response
1. Generate task_id if not provided
2. Append user message to conversation history
3. Build or reuse cached system prompt (prompt_builder.py)
4. Check if preflight compression is needed (>50% context)
5. Build API messages from conversation history
- chat_completions: OpenAI format as-is
- codex_responses: convert to Responses API input items
- anthropic_messages: convert via anthropic_adapter.py
6. Inject ephemeral prompt layers (budget warnings, context pressure)
7. Apply prompt caching markers if on Anthropic
8. Make interruptible API call (_api_call_with_interrupt)
9. Parse response:
- If tool_calls: execute them, append results, loop back to step 5
- If text response: persist session, flush memory if needed, return
```
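The lifecycle above reduces to a short skeleton. This is a hedged sketch: `call_model` and `execute_tools` stand in for the interruptible API call and tool dispatch, and the real loop adds compression, caching, and budget tracking:

```python
def agent_loop(messages, call_model, execute_tools, max_turns=90):
    """Bare skeleton of the turn lifecycle (illustrative, not the real loop)."""
    for _ in range(max_turns):
        response = call_model(messages)      # step 8: interruptible API call
        messages.append(response)
        if response.get("tool_calls"):       # step 9a: execute tools and loop
            messages.extend(execute_tools(response["tool_calls"]))
            continue
        return response["content"]           # step 9b: final text response
    return None                              # budget exhausted
```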
### Message Format

All messages use OpenAI-compatible format internally:

```python
{"role": "system", "content": "..."}
{"role": "user", "content": "..."}
{"role": "assistant", "content": "...", "tool_calls": [...]}
{"role": "tool", "tool_call_id": "...", "content": "..."}
```

Reasoning content (from models that support extended thinking) is stored in `assistant_msg["reasoning"]` and optionally displayed via the `reasoning_callback`.
### Message Alternation Rules

The agent loop enforces strict message role alternation:

- After the system message: `User → Assistant → User → Assistant → ...`
- During tool calling: `Assistant (with tool_calls) → Tool → Tool → ... → Assistant`
- **Never** two assistant messages in a row
- **Never** two user messages in a row
- **Only** `tool` role can have consecutive entries (parallel tool results)

Providers validate these sequences and will reject malformed histories.
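A provider-style validator for these rules can be sketched in a few lines (illustrative only; providers implement their own checks server-side):

```python
def validate_alternation(messages):
    """Check the role-alternation rules a provider would enforce (sketch)."""
    prev = None
    for msg in messages:
        role = msg["role"]
        if role == prev and role != "tool":
            return False  # only consecutive tool results are allowed
        if role == "tool" and prev not in ("assistant", "tool"):
            return False  # tool results must follow an assistant tool call
        prev = role
    return True
```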
## Interruptible API Calls

API requests are wrapped in `_api_call_with_interrupt()`, which runs the actual HTTP call in a background thread while monitoring an interrupt event:

```text
┌──────────────────────┐      ┌──────────────┐
│ Main thread          │      │ API thread   │
│ wait on:             │─────▶│ HTTP POST    │
│  - response ready    │      │ to provider  │
│  - interrupt event   │      └──────────────┘
│  - timeout           │
└──────────────────────┘
```

When interrupted (user sends a new message, `/stop` command, or signal):

- The API thread is abandoned (response discarded)
- The agent can process the new input or shut down cleanly
- No partial response is injected into conversation history
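A minimal version of this pattern, using only the standard library (the real `_api_call_with_interrupt()` has more states and callbacks), looks like:

```python
import threading
import time

def call_with_interrupt(api_call, interrupt_event, timeout=300.0):
    """Run api_call in a background thread; abandon it if interrupted (sketch)."""
    result = {}
    done = threading.Event()

    def worker():
        result["response"] = api_call()
        done.set()

    threading.Thread(target=worker, daemon=True).start()
    deadline = time.monotonic() + timeout
    while not done.wait(timeout=0.05):
        if interrupt_event.is_set() or time.monotonic() > deadline:
            return None  # API thread is abandoned; response discarded
    return result.get("response")
```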
## Tool Execution

### Sequential vs Concurrent

When the model returns tool calls:

- **Single tool call** → executed directly in the main thread
- **Multiple tool calls** → executed concurrently via `ThreadPoolExecutor`
- Exception: tools marked as interactive (e.g., `clarify`) force sequential execution
- Results are reinserted in the original tool call order regardless of completion order

### Execution Flow

```text
for each tool_call in response.tool_calls:
  1. Resolve handler from tools/registry.py
  2. Fire pre_tool_call plugin hook
  3. Check if dangerous command (tools/approval.py)
     - If dangerous: invoke approval_callback, wait for user
  4. Execute handler with args + task_id
  5. Fire post_tool_call plugin hook
  6. Append {"role": "tool", "content": result} to history
```
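The concurrent path with order preservation can be sketched with `ThreadPoolExecutor.map`, which returns results in input order regardless of completion order (function names here are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def run_tool_calls(tool_calls, execute):
    """Execute tool calls concurrently; reinsert results in call order (sketch)."""
    if len(tool_calls) == 1:
        results = [execute(tool_calls[0])]  # single call stays on the main thread
    else:
        with ThreadPoolExecutor(max_workers=len(tool_calls)) as pool:
            results = list(pool.map(execute, tool_calls))  # map preserves order
    return [
        {"role": "tool", "tool_call_id": call["id"], "content": result}
        for call, result in zip(tool_calls, results)
    ]
```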
### Agent-Level Tools

Some tools are intercepted by `run_agent.py` *before* reaching `handle_function_call()`:

| Tool | Why intercepted |
|------|-----------------|
| `todo` | Reads/writes agent-local task state |
| `memory` | Writes to persistent memory files with character limits |

These tools modify agent state directly and return synthetic tool results without going through the registry.
## Callback Surfaces
`AIAgent` supports platform-specific callbacks that enable real-time progress in the CLI, gateway, and ACP integrations:
| Callback | When fired | Used by |
|----------|-----------|---------|
| `tool_progress_callback` | Before/after each tool execution | CLI spinner, gateway progress messages |
| `thinking_callback` | When model starts/stops thinking | CLI "thinking..." indicator |
| `reasoning_callback` | When model returns reasoning content | CLI reasoning display, gateway reasoning blocks |
| `clarify_callback` | When `clarify` tool is called | CLI input prompt, gateway interactive message |
| `step_callback` | After each complete agent turn | Gateway step tracking, ACP progress |
| `stream_delta_callback` | Each streaming token (when enabled) | CLI streaming display |
| `tool_gen_callback` | When tool call is parsed from stream | CLI tool preview in spinner |
| `status_callback` | State changes (thinking, executing, etc.) | ACP status updates |
## Budget and Fallback Behavior
### Iteration Budget
The agent tracks iterations via `IterationBudget`:
- Default: 90 iterations (configurable via `agent.max_turns`)
- Shared across parent and child agents — a subagent consumes from the parent's budget
- At 70%+ usage, `_get_budget_warning()` appends a `[BUDGET WARNING: ...]` to the last tool result
- At 100%, the agent stops and returns a summary of work done
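A toy model of this budget logic (class and method names are illustrative, not the real `IterationBudget` API):

```python
class IterationBudgetSketch:
    """Minimal model of the shared parent/child iteration budget."""

    def __init__(self, max_turns=90):
        self.max_turns = max_turns
        self.used = 0

    def consume(self):
        self.used += 1

    def warning(self):
        # At 70%+ usage, a budget warning is appended to the last tool result
        if self.used >= 0.7 * self.max_turns:
            return f"[BUDGET WARNING: {self.max_turns - self.used} iterations left]"
        return None

    def exhausted(self):
        # At 100%, the agent stops and summarizes work done
        return self.used >= self.max_turns
```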
### Fallback Model
When the primary model fails (429 rate limit, 5xx server error, 401/403 auth error):
1. Check `fallback_providers` list in config
2. Try each fallback in order
3. On success, continue the conversation with the new provider
4. On 401/403, attempt credential refresh before failing over
The fallback system also covers auxiliary tasks independently — vision, compression, web extraction, and session search each have their own fallback chain configurable via the `auxiliary.*` config section.
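The failover order can be sketched as follows; the exception classes stand in for provider-specific error types and are not Hermes names:

```python
class AuthError(Exception): pass        # stands in for 401/403
class RateLimitError(Exception): pass   # stands in for 429
class ServerError(Exception): pass      # stands in for 5xx

def call_with_fallback(primary, fallbacks, make_call, refresh_credentials):
    """Try the primary provider, then each fallback in order (sketch)."""
    for provider in [primary] + list(fallbacks):
        try:
            return make_call(provider)
        except AuthError:
            try:
                refresh_credentials(provider)  # attempt credential refresh first
                return make_call(provider)     # one retry after refresh
            except Exception:
                continue                       # then fail over
        except (RateLimitError, ServerError):
            continue                           # fail over immediately
    raise RuntimeError("all providers failed")
```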
## Compression and Persistence
### When Compression Triggers
- **Preflight** (before API call): If conversation exceeds 50% of model's context window
- **Gateway auto-compression**: If conversation exceeds 85% (more aggressive, runs between turns)
### What Happens During Compression
1. Memory is flushed to disk first (preventing data loss)
2. Middle conversation turns are summarized into a compact summary
3. The last N messages are preserved intact (`compression.protect_last_n`, default: 20)
4. Tool call/result message pairs are kept together (never split)
5. A new session lineage ID is generated (compression creates a "child" session)
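Steps 3 and 4 can be sketched as a slicing helper (hypothetical name; the real algorithm in `context_compressor.py` also summarizes the middle span):

```python
def split_for_compression(messages, protect_last_n=20):
    """Pick the span to summarize: keep system msg + recent tail (sketch)."""
    if len(messages) <= protect_last_n + 1:
        return [], messages[1:]  # nothing to compress
    middle = messages[1:-protect_last_n]
    tail = messages[-protect_last_n:]
    # Never split an assistant tool_call from its tool results: if the tail
    # starts with tool results, pull messages back until it starts cleanly.
    while middle and tail[0]["role"] == "tool":
        tail.insert(0, middle.pop())
    return middle, tail
```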
### Session Persistence
After each turn:
- Messages are saved to the session store (SQLite via `hermes_state.py`)
- Memory changes are flushed to `MEMORY.md` / `USER.md`
- The session can be resumed later via `/resume` or `hermes chat --resume`
## Key Source Files
| File | Purpose |
|------|---------|
| `run_agent.py` | AIAgent class — the complete agent loop (~9,200 lines) |
| `agent/prompt_builder.py` | System prompt assembly from memory, skills, context files, personality |
| `agent/context_compressor.py` | Conversation compression algorithm |
| `agent/prompt_caching.py` | Anthropic prompt caching markers and cache metrics |
| `agent/auxiliary_client.py` | Auxiliary LLM client for side tasks (vision, summarization) |
| `model_tools.py` | Tool schema collection, `handle_function_call()` dispatch |
## Related Docs
- [Provider Runtime Resolution](./provider-runtime.md)
- [Prompt Assembly](./prompt-assembly.md)
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
- [Tools Runtime](./tools-runtime.md)
- [Architecture Overview](./architecture.md)

View File

@@ -1,152 +1,274 @@
---
sidebar_position: 1
title: "Architecture"
description: "Hermes Agent internals — major subsystems, execution paths, data flow, and where to read next"
---
# Architecture
This page is the top-level map of Hermes Agent internals. Use it to orient yourself in the codebase, then dive into subsystem-specific docs for implementation details.

## System Overview
```text
┌─────────────────────────────────────────────────────────────────────┐
│ Entry Points │
│ │
│ CLI (cli.py) Gateway (gateway/run.py) ACP (acp_adapter/) │
│ Batch Runner API Server Python Library │
└──────────┬──────────────┬───────────────────────┬────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────────┐
│ AIAgent (run_agent.py) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Prompt │ │ Provider │ │ Tool │ │
│ │ Builder │ │ Resolution │ │ Dispatch │ │
│ │ (prompt_ │ │ (runtime_ │ │ (model_ │ │
│ │ builder.py) │ │ provider.py)│ │ tools.py) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐ │
│ │ Compression │ │ 3 API Modes │ │ Tool Registry│ │
│ │ & Caching │ │ chat_compl. │ │ (registry.py)│ │
│ │ │ │ codex_resp. │ │ 47 tools │ │
│ │ │ │ anthropic │ │ 37 toolsets │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
│ │
▼ ▼
┌───────────────────┐ ┌──────────────────────┐
│ Session Storage │ │ Tool Backends │
│ (SQLite + FTS5) │ │ Terminal (6 backends) │
│ hermes_state.py │ │ Browser (5 backends) │
│ gateway/session.py│ │ Web (4 backends) │
└───────────────────┘ │ MCP (dynamic) │
│ File, Vision, etc. │
└──────────────────────┘
```
## Directory Structure
```text
hermes-agent/
├── run_agent.py           # AIAgent core conversation loop (~9,200 lines)
├── cli.py                 # HermesCLI — interactive terminal UI (~8,500 lines)
├── model_tools.py         # Tool discovery, schema collection, dispatch
├── toolsets.py            # Tool groupings and platform presets
├── hermes_state.py        # SQLite session/state database with FTS5
├── hermes_constants.py    # HERMES_HOME, profile-aware paths
├── batch_runner.py        # Batch trajectory generation
├── agent/                 # Agent internals
│   ├── prompt_builder.py      # System prompt assembly
│   ├── context_compressor.py  # Conversation compression algorithm
│   ├── prompt_caching.py      # Anthropic prompt caching
│   ├── auxiliary_client.py    # Auxiliary LLM for side tasks (vision, summarization)
│   ├── model_metadata.py      # Model context lengths, token estimation
│   ├── models_dev.py          # models.dev registry integration
│   ├── anthropic_adapter.py   # Anthropic Messages API format conversion
│   ├── display.py             # KawaiiSpinner, tool preview formatting
│   ├── skill_commands.py      # Skill slash commands
│   ├── memory_store.py        # Persistent memory read/write
│   └── trajectory.py          # Trajectory saving helpers
├── hermes_cli/ # CLI subcommands and setup
│ ├── main.py # Entry point — all `hermes` subcommands (~4,200 lines)
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
│ ├── commands.py # COMMAND_REGISTRY — central slash command definitions
│ ├── auth.py # PROVIDER_REGISTRY, credential resolution
│ ├── runtime_provider.py # Provider → api_mode + credentials
│ ├── models.py # Model catalog, provider model lists
│ ├── model_switch.py # /model command logic (CLI + gateway shared)
│ ├── setup.py # Interactive setup wizard (~3,500 lines)
│ ├── skin_engine.py # CLI theming engine
│ ├── skills_config.py # hermes skills — enable/disable per platform
│ ├── skills_hub.py # /skills slash command
│ ├── tools_config.py # hermes tools — enable/disable per platform
│ ├── plugins.py # PluginManager — discovery, loading, hooks
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
│ └── gateway.py # hermes gateway start/stop
├── tools/ # Tool implementations (one file per tool)
│ ├── registry.py # Central tool registry
│ ├── approval.py # Dangerous command detection
│ ├── terminal_tool.py # Terminal orchestration
│ ├── process_registry.py # Background process management
│ ├── file_tools.py # read_file, write_file, patch, search_files
│ ├── web_tools.py # web_search, web_extract
│ ├── browser_tool.py # 11 browser automation tools
│ ├── code_execution_tool.py # execute_code sandbox
│ ├── delegate_tool.py # Subagent delegation
│ ├── mcp_tool.py # MCP client (~1,050 lines)
│ ├── credential_files.py # File-based credential passthrough
│ ├── env_passthrough.py # Env var passthrough for sandboxes
│ ├── ansi_strip.py # ANSI escape stripping
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
├── gateway/ # Messaging platform gateway
│ ├── run.py # GatewayRunner — message dispatch (~5,800 lines)
│ ├── session.py # SessionStore — conversation persistence
│ ├── delivery.py # Outbound message delivery
│ ├── pairing.py # DM pairing authorization
│ ├── hooks.py # Hook discovery and lifecycle events
│ ├── mirror.py # Cross-session message mirroring
│ ├── status.py # Token locks, profile-scoped process tracking
│ ├── builtin_hooks/ # Always-registered hooks
│ └── platforms/ # 14 adapters: telegram, discord, slack, whatsapp,
│ # signal, matrix, mattermost, email, sms,
│ # dingtalk, feishu, wecom, homeassistant, webhook
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains)
├── cron/ # Scheduler (jobs.py, scheduler.py)
├── plugins/memory/ # Memory provider plugins
├── environments/ # RL training environments (Atropos)
├── skills/ # Bundled skills (always available)
├── optional-skills/ # Official optional skills (install explicitly)
├── website/ # Docusaurus documentation site
└── tests/ # Pytest suite (~3,000+ tests)
```
## Data Flow

### CLI Session
```text
User input → HermesCLI.process_input()
→ AIAgent.run_conversation()
→ prompt_builder.build_system_prompt()
→ runtime_provider.resolve_runtime_provider()
→ API call (chat_completions / codex_responses / anthropic_messages)
→ tool_calls? → model_tools.handle_function_call() → loop
→ final response → display → save to SessionDB
```
### Gateway Message
```text
Platform event → Adapter.on_message() → MessageEvent
→ GatewayRunner._handle_message()
→ authorize user
→ resolve session key
→ create AIAgent with session history
→ AIAgent.run_conversation()
→ deliver response back through adapter
```
### Cron Job
```text
Scheduler tick → load due jobs from jobs.json
→ create fresh AIAgent (no history)
→ inject attached skills as context
→ run job prompt
→ deliver response to target platform
→ update job state and next_run
```
## Recommended Reading Order

If you are new to the codebase:
1. **This page** — orient yourself
2. **[Agent Loop Internals](./agent-loop.md)** — how AIAgent works
3. **[Prompt Assembly](./prompt-assembly.md)** — system prompt construction
4. **[Provider Runtime Resolution](./provider-runtime.md)** — how providers are selected
5. **[Adding Providers](./adding-providers.md)** — practical guide to adding a new provider
6. **[Tools Runtime](./tools-runtime.md)** — tool registry, dispatch, environments
7. **[Session Storage](./session-storage.md)** — SQLite schema, FTS5, session lineage
8. **[Gateway Internals](./gateway-internals.md)** — messaging platform gateway
9. **[Context Compression & Prompt Caching](./context-compression-and-caching.md)** — compression and caching
10. **[ACP Internals](./acp-internals.md)** — IDE integration
11. **[Environments, Benchmarks & Data Generation](./environments.md)** — RL training
## Major Subsystems

### Agent Loop

The synchronous orchestration engine (`AIAgent` in `run_agent.py`). Handles provider selection, prompt construction, tool execution, retries, fallback, callbacks, compression, and persistence. Supports three API modes for different provider backends.

→ [Agent Loop Internals](./agent-loop.md)

### Prompt System

Prompt construction and maintenance across the conversation lifecycle:
- **`prompt_builder.py`** — Assembles the system prompt from: personality (SOUL.md), memory (MEMORY.md, USER.md), skills, context files (AGENTS.md, .hermes.md), tool-use guidance, and model-specific instructions
- **`prompt_caching.py`** — Applies Anthropic cache breakpoints for prefix caching
- **`context_compressor.py`** — Summarizes middle conversation turns when context exceeds thresholds
→ [Prompt Assembly](./prompt-assembly.md), [Context Compression & Prompt Caching](./context-compression-and-caching.md)

### Provider Resolution

A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls. Maps `(provider, model)` tuples to `(api_mode, api_key, base_url)`. Handles 18+ providers, OAuth flows, credential pools, and alias resolution.

→ [Provider Runtime Resolution](./provider-runtime.md)

### Tool System

Central tool registry (`tools/registry.py`) with 47 registered tools across 20 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 6 backends (local, Docker, SSH, Daytona, Modal, Singularity).

→ [Tools Runtime](./tools-runtime.md)

### Session Persistence

SQLite-based session storage with FTS5 full-text search. Sessions have lineage tracking (parent/child across compressions), per-platform isolation, and atomic writes with contention handling.

→ [Session Storage](./session-storage.md)

### Messaging Gateway

Long-running process with 14 platform adapters, unified session routing, user authorization (allowlists + DM pairing), slash command dispatch, hook system, cron ticking, and background maintenance.

→ [Gateway Internals](./gateway-internals.md)
### Plugin System
Three discovery sources: `~/.hermes/plugins/` (user), `.hermes/plugins/` (project), and pip entry points. Plugins register tools, hooks, and CLI commands through a context API. Memory providers are a specialized plugin type under `plugins/memory/`.
→ [Plugin Guide](/docs/guides/build-a-hermes-plugin), [Memory Provider Plugin](./memory-provider-plugin.md)
### Cron
First-class agent tasks (not shell tasks). Jobs are stored in JSON, support multiple schedule formats, can attach skills and scripts, and deliver to any platform.

→ [Cron Internals](./cron-internals.md)
### ACP Integration

Exposes Hermes as an editor-native agent over stdio/JSON-RPC for VS Code, Zed, and JetBrains.

→ [ACP Internals](./acp-internals.md)
### RL / Environments / Trajectories

Full environment framework for evaluation and RL training. Integrates with Atropos, supports multiple tool-call parsers, and generates ShareGPT-format trajectories.

→ [Environments, Benchmarks & Data Generation](./environments.md), [Trajectories & Training Format](./trajectory-format.md)

## Design Principles
| Principle | What it means in practice |
|-----------|--------------------------|
| **Prompt stability** | System prompt doesn't change mid-conversation. No cache-breaking mutations except explicit user actions (`/model`). |
| **Observable execution** | Every tool call is visible to the user via callbacks. Progress updates in CLI (spinner) and gateway (chat messages). |
| **Interruptible** | API calls and tool execution can be cancelled mid-flight by user input or signals. |
| **Platform-agnostic core** | One AIAgent class serves CLI, gateway, ACP, batch, and API server. Platform differences live in the entry point, not the agent. |
| **Loose coupling** | Optional subsystems (MCP, plugins, memory providers, RL environments) use registry patterns and check_fn gating, not hard dependencies. |
| **Profile isolation** | Each profile (`hermes -p <name>`) gets its own HERMES_HOME, config, memory, sessions, and gateway PID. Multiple profiles run concurrently. |
## File Dependency Chain
```text
tools/registry.py (no deps — imported by all tool files)
tools/*.py (each calls registry.register() at import time)
model_tools.py (imports tools/registry + triggers tool discovery)
run_agent.py, cli.py, batch_runner.py, environments/
```
This chain means tool registration happens at import time, before any agent instance is created. Adding a new tool requires an import in `model_tools.py`'s `_discover_tools()` list.
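The registration pattern can be condensed into one file as a sketch (the real registry spans `tools/registry.py` and per-tool modules; names below follow the docs but the bodies are illustrative):

```python
# tools/registry.py — central registry, no dependencies
TOOLS = {}

def register(name, handler, schema=None):
    """Tool files call this at import time to self-register."""
    TOOLS[name] = {"handler": handler, "schema": schema or {}}

# tools/some_tool.py — registration runs as a side effect of import
def _handle_echo(args, task_id=None):
    return args["text"]

register("echo", _handle_echo)

# model_tools.py — dispatch resolves handlers from the registry
def handle_function_call(name, args, task_id=None):
    return TOOLS[name]["handler"](args, task_id=task_id)
```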

View File

@@ -4,7 +4,7 @@ Hermes Agent uses a dual compression system and Anthropic prompt caching to
manage context window usage efficiently across long conversations.
Source files: `agent/context_compressor.py`, `agent/prompt_caching.py`,
`gateway/run.py` (session hygiene), `run_agent.py` (search for `_compress_context`)
## Dual Compression System
@@ -26,7 +26,7 @@ Hermes has two separate compression layers that operate independently:
### 1. Gateway Session Hygiene (85% threshold)
Located in `gateway/run.py` (search for `_maybe_compress_session`). This is a **safety net** that
runs before the agent processes a message. It prevents API failures when sessions
grow too large between turns (e.g., overnight accumulation in Telegram/Discord).


@@ -6,85 +6,195 @@ description: "How Hermes stores, schedules, edits, pauses, skill-loads, and deli
# Cron Internals
The cron subsystem provides scheduled task execution — from simple one-shot delays to recurring cron-expression jobs with skill injection and cross-platform delivery.
## Key Files
| File | Purpose |
|------|---------|
| `cron/jobs.py` | Job model, storage, atomic read/write to `jobs.json` |
| `cron/scheduler.py` | Scheduler loop — due-job detection, execution, repeat tracking |
| `tools/cronjob_tools.py` | Model-facing `cronjob` tool registration and handler |
| `gateway/run.py` | Gateway integration — cron ticking in the long-running loop |
| `hermes_cli/cron.py` | CLI `hermes cron` subcommands |
## Scheduling Model
Four schedule formats are supported:
| Format | Example | Behavior |
|--------|---------|----------|
| **Relative delay** | `30m`, `2h`, `1d` | One-shot, fires after the specified duration |
| **Interval** | `every 2h`, `every 30m` | Recurring, fires at regular intervals |
| **Cron expression** | `0 9 * * *` | Standard 5-field cron syntax (minute, hour, day, month, weekday) |
| **ISO timestamp** | `2025-01-15T09:00:00` | One-shot, fires at the exact time |
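The relative-delay format reduces to simple unit parsing. A minimal stdlib sketch (the `parse_delay` helper is illustrative, not the actual Hermes parser):

```python
from datetime import datetime, timedelta

def parse_delay(spec: str) -> timedelta:
    """Parse relative-delay specs like '30m', '2h', '1d' (illustrative)."""
    units = {"m": "minutes", "h": "hours", "d": "days"}
    value, unit = int(spec[:-1]), spec[-1]
    return timedelta(**{units[unit]: value})

created = datetime(2025, 1, 15, 9, 0)
print(created + parse_delay("2h"))  # 2025-01-15 11:00:00
```

Interval schedules (`every 2h`) behave the same way, except the delay is re-applied after each firing instead of once.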
The model-facing surface is a single `cronjob` tool with action-style operations: `create`, `list`, `update`, `pause`, `resume`, `run`, `remove`.
## Job Storage
Jobs are stored in `~/.hermes/cron/jobs.json` with atomic write semantics (write to temp file, then rename). Each job record contains:
```json
{
"id": "job_abc123",
"name": "Daily briefing",
"prompt": "Summarize today's AI news and funding rounds",
"schedule": "0 9 * * *",
"skills": ["ai-funding-daily-report"],
"deliver": "telegram:-1001234567890",
"repeat": null,
"state": "scheduled",
"next_run": "2025-01-16T09:00:00Z",
"run_count": 42,
"created_at": "2025-01-01T00:00:00Z",
"model": null,
"provider": null,
"script": null
}
```
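The atomic write semantics follow the standard write-to-temp-then-rename pattern. A sketch of what that looks like (the function name is illustrative):

```python
import json
import os
import tempfile

def atomic_write_json(path: str, data) -> None:
    """Write JSON so readers never observe a half-written jobs file."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename on POSIX and Windows
    except BaseException:
        os.unlink(tmp)  # never leave a partial temp file behind
        raise
```

The temp file must live in the same directory as the destination; `os.replace` is only atomic within a single filesystem.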
### Job Lifecycle States
| State | Meaning |
|-------|---------|
| `scheduled` | Active, will fire at next scheduled time |
| `paused` | Suspended — won't fire until resumed |
| `completed` | Repeat count exhausted or one-shot that has fired |
| `running` | Currently executing (transient state) |
### Backward Compatibility
Older jobs may have a single `skill` field instead of the `skills` array. The scheduler normalizes this at load time — single `skill` is promoted to `skills: [skill]`.
## Scheduler Runtime
### Tick Cycle
The scheduler runs on a periodic tick (default: every 60 seconds):
```text
tick()
1. Acquire scheduler lock (prevents overlapping ticks)
2. Load all jobs from jobs.json
3. Filter to due jobs (next_run <= now AND state == "scheduled")
4. For each due job:
a. Set state to "running"
b. Create fresh AIAgent session (no conversation history)
c. Load attached skills in order (injected as user messages)
d. Run the job prompt through the agent
e. Deliver the response to the configured target
f. Update run_count, compute next_run
g. If repeat count exhausted → state = "completed"
h. Otherwise → state = "scheduled"
5. Write updated jobs back to jobs.json
6. Release scheduler lock
```
### Gateway Integration
In gateway mode, the scheduler tick is integrated into the gateway's main event loop. The gateway calls `scheduler.tick()` on its periodic maintenance cycle, which runs alongside message handling.
In CLI mode, cron jobs only fire when `hermes cron` commands are run or during active CLI sessions.
### Fresh Session Isolation
Each cron job runs in a completely fresh agent session:
- No conversation history from previous runs
- No memory of previous cron executions (unless persisted to memory/files)
- The prompt must be self-contained — cron jobs cannot ask clarifying questions
- The `cronjob` toolset is disabled (recursion guard)
## Skill-Backed Jobs
A cron job can attach one or more skills via the `skills` field. At execution time:
1. Skills are loaded in the specified order
2. Each skill's SKILL.md content is injected as context
3. The job's prompt is appended as the task instruction
4. The agent processes the combined skill context + prompt
This enables reusable, tested workflows without pasting full instructions into cron prompts. For example:
```text
Create a daily funding report → attach "ai-funding-daily-report" skill
```
### Script-Backed Jobs
Jobs can also attach a Python script via the `script` field. The script runs *before* each agent turn, and its stdout is injected into the prompt as context. This enables data collection and change detection patterns:
```python
# ~/.hermes/scripts/check_competitors.py
import json
import requests

# Fetch competitor release notes; diff against the previous run if you
# persist state between runs. Whatever this script prints to stdout is
# injected into the cron prompt for the agent to analyze and report.
resp = requests.get("https://example.com/releases.json", timeout=30)
print(json.dumps(resp.json()[:5], indent=2))
```
## Delivery Model
Cron job results can be delivered to any supported platform:
| Target | Syntax | Example |
|--------|--------|---------|
| Origin chat | `origin` | Deliver to the chat where the job was created |
| Local file | `local` | Save to `~/.hermes/cron/output/` |
| Telegram | `telegram` or `telegram:<chat_id>` | `telegram:-1001234567890` |
| Discord | `discord` or `discord:#channel` | `discord:#engineering` |
| Slack | `slack` | Deliver to Slack home channel |
| WhatsApp | `whatsapp` | Deliver to WhatsApp home |
| Signal | `signal` | Deliver to Signal |
| Matrix | `matrix` | Deliver to Matrix home room |
| Mattermost | `mattermost` | Deliver to Mattermost home |
| Email | `email` | Deliver via email |
| SMS | `sms` | Deliver via SMS |
| Home Assistant | `homeassistant` | Deliver to HA conversation |
| DingTalk | `dingtalk` | Deliver to DingTalk |
| Feishu | `feishu` | Deliver to Feishu |
| WeCom | `wecom` | Deliver to WeCom |
For Telegram topics, use the format `telegram:<chat_id>:<thread_id>` (e.g., `telegram:-1001234567890:17585`).
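Target strings follow a `platform[:chat_id[:thread_id]]` shape, so parsing reduces to two splits. An illustrative parser (not the actual Hermes implementation):

```python
def parse_target(spec: str):
    """Split 'platform[:chat_id[:thread_id]]' into parts (illustrative)."""
    platform, _, rest = spec.partition(":")
    chat_id, _, thread_id = rest.partition(":")
    return platform, chat_id or None, thread_id or None

print(parse_target("telegram:-1001234567890:17585"))
# ('telegram', '-1001234567890', '17585')
print(parse_target("slack"))
# ('slack', None, None)
```

Missing parts come back as `None`, which is where home-channel defaults (`slack`, `whatsapp`, etc.) kick in.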
### Response Wrapping
By default (`cron.wrap_response: true`), cron deliveries are wrapped with:
- A header identifying the cron job name and task
- A footer noting the agent cannot see the delivered message in conversation
The `[SILENT]` prefix in a cron response suppresses delivery entirely — useful for jobs that only need to write to files or perform side effects.
### Session Isolation
Cron deliveries are NOT mirrored into gateway session conversation history. They exist only in the cron job's own session. This prevents message alternation violations in the target chat's conversation.
## Recursion Guard
Cron-run sessions have the `cronjob` toolset disabled. This prevents:
- A scheduled job from creating new cron jobs
- Recursive scheduling that could explode token usage
- Accidental mutation of the job schedule from within a job
## Locking
The scheduler uses file-based locking to prevent overlapping ticks from executing the same due-job batch twice. This is important in gateway mode where multiple maintenance cycles could overlap if a previous tick takes longer than the tick interval.
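Non-overlapping ticks of this kind are commonly enforced with a non-blocking `flock` attempt. A POSIX-only sketch under that assumption (whether Hermes uses `fcntl.flock` specifically is not confirmed here):

```python
import fcntl

def try_acquire_lock(path: str):
    """Return a locked file handle, or None if another tick holds the lock."""
    f = open(path, "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except BlockingIOError:
        f.close()
        return None

lock = try_acquire_lock("/tmp/hermes-cron.lock")
if lock is not None:
    try:
        pass  # run the tick: load jobs, execute due ones, persist results
    finally:
        fcntl.flock(lock, fcntl.LOCK_UN)
        lock.close()
```

If the previous tick still holds the lock, the new tick simply skips this cycle rather than double-executing the due-job batch.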
## CLI Interface
The `hermes cron` CLI provides direct job management:
```bash
hermes cron list # Show all jobs
hermes cron add # Interactive job creation
hermes cron edit <job_id> # Edit job configuration
hermes cron pause <job_id> # Pause a running job
hermes cron resume <job_id> # Resume a paused job
hermes cron run <job_id> # Trigger immediate execution
hermes cron remove <job_id> # Delete a job
```
## Related Docs
- [Cron Feature Guide](/docs/user-guide/features/cron)
- [Gateway Internals](./gateway-internals.md)
- [Agent Loop Internals](./agent-loop.md)

View File

@@ -6,106 +6,248 @@ description: "How the messaging gateway boots, authorizes users, routes sessions
# Gateway Internals
The messaging gateway is the long-running process that connects Hermes to 14+ external messaging platforms through a unified architecture.
## Key Files
| File | Purpose |
|------|---------|
| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (~7,200 lines) |
| `gateway/session.py` | `SessionStore` — conversation persistence and session key construction |
| `gateway/delivery.py` | Outbound message delivery to target platforms/channels |
| `gateway/pairing.py` | DM pairing flow for user authorization |
| `gateway/channel_directory.py` | Maps chat IDs to human-readable names for cron delivery |
| `gateway/hooks.py` | Hook discovery, loading, and lifecycle event dispatch |
| `gateway/mirror.py` | Cross-session message mirroring for `send_message` |
| `gateway/status.py` | Token lock management for profile-scoped gateway instances |
| `gateway/builtin_hooks/` | Always-registered hooks (e.g., BOOT.md system prompt hook) |
| `gateway/platforms/` | Platform adapters (one per messaging platform) |
## Architecture Overview
```text
┌─────────────────────────────────────────────────┐
│ GatewayRunner │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Telegram │ │ Discord │ │ Slack │ ... │
│ │ Adapter │ │ Adapter │ │ Adapter │ │
│ └─────┬─────┘ └─────┬────┘ └─────┬────┘ │
│ │ │ │ │
│ └──────────────┼──────────────┘ │
│ ▼ │
│ _handle_message() │
│ │ │
│ ┌────────────┼────────────┐ │
│ ▼ ▼ ▼ │
│ Slash command AIAgent Queue/BG │
│ dispatch creation sessions │
│ │ │
│ ▼ │
│ SessionStore │
│ (SQLite persistence) │
└─────────────────────────────────────────────────┘
```
## Message Flow
When a message arrives from any platform:
1. **Platform adapter** receives raw event, normalizes it into a `MessageEvent`
2. **Base adapter** checks active session guard:
- If agent is running for this session → queue message, set interrupt event
- If `/approve`, `/deny`, `/stop` → bypass guard (dispatched inline)
3. **GatewayRunner._handle_message()** receives the event:
- Resolve session key via `_session_key_for_source()` (format: `agent:main:{platform}:{chat_type}:{chat_id}`)
- Check authorization (see Authorization below)
- Check if it's a slash command → dispatch to command handler
- Check if agent is already running → intercept commands like `/stop`, `/status`
- Otherwise → create `AIAgent` instance and run conversation
4. **Response** is sent back through the platform adapter
### Session Key Format
Session keys encode the full routing context:
```text
agent:main:{platform}:{chat_type}:{chat_id}
```
For example: `agent:main:telegram:private:123456789`
Thread-aware platforms (Telegram forum topics, Discord threads, Slack threads) may include thread IDs in the chat_id portion. **Never construct session keys manually** — always use `build_session_key()` from `gateway/session.py`.
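Conceptually, the key builder is just string assembly plus thread handling. A simplified sketch of what `build_session_key()` does (the real signature in `gateway/session.py` may differ):

```python
def build_session_key(platform, chat_type, chat_id, thread_id=None):
    """Simplified sketch — the real helper lives in gateway/session.py."""
    chat = f"{chat_id}:{thread_id}" if thread_id else str(chat_id)
    return f"agent:main:{platform}:{chat_type}:{chat}"

print(build_session_key("telegram", "private", 123456789))
# agent:main:telegram:private:123456789
```

Centralizing this in one helper is what keeps thread-aware and thread-less platforms from drifting into incompatible key formats.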
### Two-Level Message Guard
When an agent is actively running, incoming messages pass through two sequential guards:
1. **Level 1 — Base adapter** (`gateway/platforms/base.py`): Checks `_active_sessions`. If the session is active, queues the message in `_pending_messages` and sets an interrupt event. This catches messages *before* they reach the gateway runner.
2. **Level 2 — Gateway runner** (`gateway/run.py`): Checks `_running_agents`. Intercepts specific commands (`/stop`, `/new`, `/queue`, `/status`, `/approve`, `/deny`) and routes them appropriately. Everything else triggers `running_agent.interrupt()`.
Commands that must reach the runner while the agent is blocked (like `/approve`) are dispatched **inline** via `await self._message_handler(event)` — they bypass the background task system to avoid race conditions.
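The level-2 guard boils down to a three-way dispatch. An illustrative sketch (function name and return values are invented for clarity, not taken from the gateway code):

```python
# Commands allowed through while an agent is running for this session.
BYPASS = {"/stop", "/new", "/queue", "/status", "/approve", "/deny"}

def guard(running_agents, session_key, text):
    """Illustrative level-2 guard: dispatch, bypass, or interrupt."""
    agent = running_agents.get(session_key)
    if agent is None:
        return "dispatch"            # no agent running: normal turn
    cmd = text.split()[0] if text.strip() else ""
    if cmd in BYPASS:
        return f"inline:{cmd}"       # dispatched inline, bypassing queues
    agent.interrupt()                # anything else interrupts the agent
    return "interrupted"
```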
## Authorization
The gateway uses a multi-layer authorization check, evaluated in order:
1. **Gateway-wide allow-all** (`GATEWAY_ALLOW_ALL_USERS`) — if set, all users are authorized
2. **Platform allowlist** (e.g., `TELEGRAM_ALLOWED_USERS`) — comma-separated user IDs
3. **DM pairing** — authenticated users can pair new users via a pairing code
4. **Admin escalation** — some commands require admin status beyond basic authorization
### DM Pairing Flow
```text
Admin: /pair
Gateway: "Pairing code: ABC123. Share with the user."
New user: ABC123
Gateway: "Paired! You're now authorized."
```
Pairing state is persisted in `gateway/pairing.py` and survives restarts.
## Slash Command Dispatch
All slash commands in the gateway flow through the same resolution pipeline:
1. `resolve_command()` from `hermes_cli/commands.py` maps input to canonical name (handles aliases, prefix matching)
2. The canonical name is checked against `GATEWAY_KNOWN_COMMANDS`
3. Handler in `_handle_message()` dispatches based on canonical name
4. Some commands are gated on config (`gateway_config_gate` on `CommandDef`)
### Running-Agent Guard
Commands that must NOT execute while the agent is processing are rejected early:
```python
if _quick_key in self._running_agents:
if canonical == "model":
return "⏳ Agent is running — wait for it to finish or /stop first."
```
Bypass commands (`/stop`, `/new`, `/approve`, `/deny`, `/queue`, `/status`) have special handling.
## Config Sources
The gateway reads configuration from multiple sources:
| Source | What it provides |
|--------|-----------------|
| `~/.hermes/.env` | API keys, bot tokens, platform credentials |
| `~/.hermes/config.yaml` | Model settings, tool configuration, display options |
| Environment variables | Override any of the above |
Unlike the CLI (which uses `load_cli_config()` with hardcoded defaults), the gateway reads `config.yaml` directly via YAML loader. This means config keys that exist in the CLI's defaults dict but not in the user's config file may behave differently between CLI and gateway.
## Platform Adapters
Each messaging platform has an adapter in `gateway/platforms/`:
```text
gateway/platforms/
├── base.py # BaseAdapter — shared logic for all platforms
├── telegram.py # Telegram Bot API (long polling or webhook)
├── discord.py # Discord bot via discord.py
├── slack.py # Slack Socket Mode
├── whatsapp.py # WhatsApp Business Cloud API
├── signal.py # Signal via signal-cli REST API
├── matrix.py # Matrix via matrix-nio (optional E2EE)
├── mattermost.py # Mattermost WebSocket API
├── email_adapter.py # Email via IMAP/SMTP
├── sms.py # SMS via Twilio
├── dingtalk.py # DingTalk WebSocket
├── feishu.py # Feishu/Lark WebSocket or webhook
├── wecom.py # WeCom (WeChat Work) callback
└── homeassistant.py # Home Assistant conversation integration
```
Adapters implement a common interface:
- `connect()` / `disconnect()` — lifecycle management
- `send_message()` — outbound message delivery
- `on_message()` — inbound message normalization → `MessageEvent`
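That common interface can be pictured as an abstract base class (illustrative; the real `BaseAdapter` in `gateway/platforms/base.py` carries more state, such as the pending-message queue):

```python
from abc import ABC, abstractmethod

class BaseAdapter(ABC):
    """Illustrative shape of a platform adapter (not the Hermes class)."""

    @abstractmethod
    async def connect(self) -> None:
        """Open the platform connection (polling, websocket, etc.)."""

    @abstractmethod
    async def disconnect(self) -> None:
        """Tear down the connection and release any token locks."""

    @abstractmethod
    async def send_message(self, chat_id: str, text: str) -> None:
        """Deliver an outbound message to the platform."""

    @abstractmethod
    def on_message(self, raw_event) -> "MessageEvent":
        """Normalize a raw platform event into a MessageEvent."""
```

A new platform only has to fill in these four methods; routing, authorization, and session handling stay in the shared layers above.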
### Token Locks
Adapters that connect with unique credentials call `acquire_scoped_lock()` in `connect()` and `release_scoped_lock()` in `disconnect()`. This prevents two profiles from using the same bot token simultaneously.
## Delivery Path
Outgoing deliveries (`gateway/delivery.py`) handle:
- **Direct reply** — send response back to the originating chat
- **Home channel delivery** — route cron job outputs and background results to a configured home channel
- **Explicit target delivery** — `send_message` tool specifying `telegram:-1001234567890`
- **Cross-platform delivery** — deliver to a different platform than the originating message
Cron job deliveries are NOT mirrored into gateway session history — they live in their own cron session only. This is a deliberate design choice to avoid message alternation violations.
## Hooks
Gateway hooks are Python modules that respond to lifecycle events:
### Gateway Hook Events
| Event | When fired |
|-------|-----------|
| `gateway:startup` | Gateway process starts |
| `session:start` | New conversation session begins |
| `session:end` | Session completes or times out |
| `session:reset` | User resets session with `/new` |
| `agent:start` | Agent begins processing a message |
| `agent:step` | Agent completes one tool-calling iteration |
| `agent:end` | Agent finishes and returns response |
| `command:*` | Any slash command is executed |
Hooks are discovered from `gateway/builtin_hooks/` (always active) and `~/.hermes/hooks/` (user-installed). Each hook is a directory with a `HOOK.yaml` manifest and `handler.py`.
## Memory Provider Integration
When a memory provider plugin (e.g., Honcho) is enabled:
1. Gateway creates an `AIAgent` per message with the session ID
2. The `MemoryManager` initializes the provider with the session context
3. Provider tools (e.g., `honcho_profile`, `viking_search`) are routed through:
```text
AIAgent._invoke_tool()
→ self._memory_manager.handle_tool_call(name, args)
→ provider.handle_tool_call(name, args)
```
4. On session end/reset, `on_session_end()` fires for cleanup and final data flush
### Memory Flush Lifecycle
When a session is reset, resumed, or expires:
1. Built-in memories are flushed to disk
2. Memory provider's `on_session_end()` hook fires
3. A temporary `AIAgent` runs a memory-only conversation turn
4. Context is then discarded or archived
## Background Maintenance
The gateway runs periodic maintenance alongside message handling:
- **Cron ticking** — checks job schedules and fires due jobs
- **Session expiry** — cleans up abandoned sessions after timeout
- **Memory flush** — proactively flushes memory before session expiry
- **Cache refresh** — refreshes model lists and provider status
## Process Management
The gateway runs as a long-lived process, managed via:
- `hermes gateway start` / `hermes gateway stop` — manual control
- `systemctl` (Linux) or `launchctl` (macOS) — service management
- PID file at `~/.hermes/gateway.pid` — profile-scoped process tracking
**Profile-scoped vs global**: `start_gateway()` uses profile-scoped PID files. `hermes gateway stop` stops only the current profile's gateway. `hermes gateway stop --all` uses global `ps aux` scanning to kill all gateway processes (used during updates).
## Related Docs
- [Session Storage](./session-storage.md)
- [Cron Internals](./cron-internals.md)
- [ACP Internals](./acp-internals.md)
- [Agent Loop Internals](./agent-loop.md)
- [Messaging Gateway (User Guide)](/docs/user-guide/messaging)


@@ -3,7 +3,7 @@
Hermes Agent saves conversation trajectories in ShareGPT-compatible JSONL format
for use as training data, debugging artifacts, and reinforcement learning datasets.
Source files: `agent/trajectory.py`, `run_agent.py` (search for `_save_trajectory`), `batch_runner.py`
## File Naming Convention


@@ -28,7 +28,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl
| 🗺️ **[Learning Path](/docs/getting-started/learning-path)** | Find the right docs for your experience level |
| ⚙️ **[Configuration](/docs/user-guide/configuration)** | Config file, providers, models, and options |
| 💬 **[Messaging Gateway](/docs/user-guide/messaging)** | Set up Telegram, Discord, Slack, or WhatsApp |
| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 47 built-in tools and how to configure them |
| 🧠 **[Memory System](/docs/user-guide/features/memory)** | Persistent memory that grows across sessions |
| 📚 **[Skills System](/docs/user-guide/features/skills)** | Procedural memory the agent creates and reuses |
| 🔌 **[MCP Integration](/docs/user-guide/features/mcp)** | Connect to MCP servers, filter their tools, and extend Hermes safely |
@@ -46,7 +46,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl
- **A closed learning loop** — Agent-curated memory with periodic nudges, autonomous skill creation, skill self-improvement during use, FTS5 cross-session recall with LLM summarization, and [Honcho](https://github.com/plastic-labs/honcho) dialectic user modeling
- **Runs anywhere, not just your laptop** — 6 terminal backends: local, Docker, SSH, Daytona, Singularity, Modal. Daytona and Modal offer serverless persistence — your environment hibernates when idle, costing nearly nothing
- **Lives where you do** — CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Mattermost, Email, SMS, DingTalk, Feishu, WeCom, Home Assistant — 14+ platforms from one gateway
- **Built by model trainers** — Created by [Nous Research](https://nousresearch.com), the lab behind Hermes, Nomos, and Psyche. Works with [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai), OpenAI, or any endpoint
- **Scheduled automations** — Built-in cron with delivery to any platform
- **Delegates & parallelizes** — Spawn isolated subagents for parallel workstreams. Programmatic Tool Calling via `execute_code` collapses multi-step pipelines into single inference calls


@@ -22,7 +22,7 @@ Hermes supports multiple AI inference providers out of the box. Use `hermes mode
## Web Search Backends
The `web_search` and `web_extract` tools support four backend providers, configured via `config.yaml` or `hermes tools`:
| Backend | Env Var | Search | Extract | Crawl |
|---------|---------|--------|---------|-------|
@@ -56,13 +56,14 @@ See [Browser Automation](/docs/user-guide/features/browser) for setup and usage.
Text-to-speech and speech-to-text across all messaging platforms:
| Provider | Quality | Cost | API Key |
|----------|---------|------|---------|
| **Edge TTS** (default) | Good | Free | None needed |
| **ElevenLabs** | Excellent | Paid | `ELEVENLABS_API_KEY` |
| **OpenAI TTS** | Good | Paid | `VOICE_TOOLS_OPENAI_KEY` |
| **MiniMax** | Good | Paid | `MINIMAX_API_KEY` |
| **NeuTTS** | Good | Free | None needed |
Speech-to-text supports three providers: local Whisper (free, runs on-device), Groq (fast cloud), and OpenAI Whisper API. Voice message transcription works across Telegram, Discord, WhatsApp, and other messaging platforms. See [Voice & TTS](/docs/user-guide/features/tts) and [Voice Mode](/docs/user-guide/features/voice-mode) for details.
## IDE & Editor Integration
@@ -74,9 +75,27 @@ Speech-to-text uses Whisper for voice message transcription on Telegram, Discord
## Memory & Personalization
- **[Honcho Memory](/docs/user-guide/features/honcho)** — AI-native persistent memory for cross-session user modeling and personalization. Honcho adds deep user modeling via dialectic reasoning on top of Hermes's built-in memory system.
- **[Built-in Memory](/docs/user-guide/features/memory)** — Persistent, curated memory via `MEMORY.md` and `USER.md` files. The agent maintains bounded stores of personal notes and user profile data that survive across sessions.
- **[Memory Providers](/docs/user-guide/features/memory-providers)** — Plug in external memory backends for deeper personalization. Seven providers are supported: Honcho (dialectic reasoning), OpenViking (tiered retrieval), Mem0 (cloud extraction), Hindsight (knowledge graphs), Holographic (local SQLite), RetainDB (hybrid search), and ByteRover (CLI-based).
## Messaging Platforms
Hermes runs as a gateway bot on 14+ messaging platforms, all configured through the same `gateway` subsystem:
- **[Telegram](/docs/user-guide/messaging/telegram)**, **[Discord](/docs/user-guide/messaging/discord)**, **[Slack](/docs/user-guide/messaging/slack)**, **[WhatsApp](/docs/user-guide/messaging/whatsapp)**, **[Signal](/docs/user-guide/messaging/signal)**, **[Matrix](/docs/user-guide/messaging/matrix)**, **[Mattermost](/docs/user-guide/messaging/mattermost)**, **[Email](/docs/user-guide/messaging/email)**, **[SMS](/docs/user-guide/messaging/sms)**, **[DingTalk](/docs/user-guide/messaging/dingtalk)**, **[Feishu/Lark](/docs/user-guide/messaging/feishu)**, **[WeCom](/docs/user-guide/messaging/wecom)**, **[Home Assistant](/docs/user-guide/messaging/homeassistant)**, **[Webhooks](/docs/user-guide/messaging/webhooks)**
See the [Messaging Gateway overview](/docs/user-guide/messaging) for the platform comparison table and setup guide.
## Home Automation
- **[Home Assistant](/docs/user-guide/messaging/homeassistant)** — Control smart home devices via four dedicated tools (`ha_list_entities`, `ha_get_state`, `ha_list_services`, `ha_call_service`). The Home Assistant toolset activates automatically when `HASS_TOKEN` is configured.
## Plugins
- **[Plugin System](/docs/user-guide/features/plugins)** — Extend Hermes with custom tools, lifecycle hooks, and CLI commands without modifying core code. Plugins are discovered from `~/.hermes/plugins/`, project-local `.hermes/plugins/`, and pip-installed entry points.
- **[Build a Plugin](/docs/guides/build-a-hermes-plugin)** — Step-by-step guide for creating Hermes plugins with tools, hooks, and CLI commands.
## Training & Evaluation
- **[RL Training](/docs/user-guide/features/rl-training)** — Generate trajectory data from agent sessions for reinforcement learning and model fine-tuning. Supports Atropos environments with customizable reward functions.
- **[Batch Processing](/docs/user-guide/features/batch-processing)** — Run the agent across hundreds of prompts in parallel, generating structured ShareGPT-format trajectory data for training data generation or evaluation.


@@ -90,7 +90,7 @@ Both persist across sessions. See [Memory](../user-guide/features/memory.md) and
Yes. Import the `AIAgent` class and use Hermes programmatically:
```python
from run_agent import AIAgent
agent = AIAgent(model="openrouter/nous/hermes-3-llama-3.1-70b")
response = agent.chat("Explain quantum computing briefly")
@@ -227,7 +227,7 @@ hermes chat --model openrouter/meta-llama/llama-3.1-70b-instruct
hermes chat
# Use a model with a larger context window
hermes chat --model openrouter/google/gemini-3-flash-preview
```
If this happens on the first long conversation, Hermes may have the wrong context length for your model. Check what it detected:


@@ -1,74 +1,153 @@
---
sidebar_position: 9
title: "Optional Skills Catalog"
description: "Official optional skills shipped with hermes-agent — install via hermes skills install official/<category>/<skill>"
---
# Optional Skills Catalog
Official optional skills ship with the hermes-agent repository under `optional-skills/` but are **not active by default**. Install them explicitly:
```bash
hermes skills install official/<category>/<skill>
```
For example:
```bash
hermes skills install official/blockchain/solana
hermes skills install official/mlops/flash-attention
```
Once installed, the skill appears in the agent's skill list and can be loaded automatically when relevant tasks are detected.
To uninstall:
```bash
hermes skills uninstall <skill-name>
```
---
## Autonomous AI Agents
| Skill | Description |
|-------|-------------|
| **blackbox** | Delegate coding tasks to Blackbox AI CLI agent. Multi-model agent with built-in judge that runs tasks through multiple LLMs and picks the best result. |
| **honcho** | Configure and use Honcho memory with Hermes — cross-session user modeling, multi-profile peer isolation, observation config, and dialectic reasoning. |
## Blockchain
| Skill | Description |
|-------|-------------|
| **base** | Query Base (Ethereum L2) blockchain data with USD pricing — wallet balances, token info, transaction details, gas analysis, contract inspection, whale detection, and live network stats. No API key required. |
| **solana** | Query Solana blockchain data with USD pricing — wallet balances, token portfolios, transaction details, NFTs, whale detection, and live network stats. No API key required. |
## Communication
| Skill | Description |
|-------|-------------|
| **one-three-one-rule** | Structured communication framework for proposals and decision-making. |
## Creative
| Skill | Description |
|-------|-------------|
| **blender-mcp** | Control Blender directly from Hermes via socket connection to the blender-mcp addon. Create 3D objects, materials, animations, and run arbitrary Blender Python (bpy) code. |
| **meme-generation** | Generate real meme images by picking a template and overlaying text with Pillow. Produces actual `.png` meme files. |
## DevOps
| Skill | Description |
|-------|-------------|
| **cli** | Run 150+ AI apps via inference.sh CLI (infsh) — image generation, video creation, LLMs, search, 3D, and social automation. |
| **docker-management** | Manage Docker containers, images, volumes, networks, and Compose stacks — lifecycle ops, debugging, cleanup, and Dockerfile optimization. |
## Email
| Skill | Description |
|-------|-------------|
| **agentmail** | Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses. |
## Health
| Skill | Description |
|-------|-------------|
| **neuroskill-bci** | Brain-Computer Interface (BCI) integration for neuroscience research workflows. |
## MCP
| Skill | Description |
|-------|-------------|
| **fastmcp** | Build, test, inspect, install, and deploy MCP servers with FastMCP in Python. Covers wrapping APIs or databases as MCP tools, exposing resources or prompts, and deployment. |
## Migration
| Skill | Description |
|-------|-------------|
| **openclaw-migration** | Migrate a user's OpenClaw customization footprint into Hermes Agent. Imports memories, SOUL.md, command allowlists, user skills, and selected workspace assets. |
## MLOps
The largest optional category — covers the full ML pipeline from data curation to production inference.
| Skill | Description |
|-------|-------------|
| **accelerate** | Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. |
| **chroma** | Open-source embedding database. Store embeddings and metadata, perform vector and full-text search. Simple 4-function API for RAG and semantic search. |
| **faiss** | Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). |
| **flash-attention** | Optimize transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Supports PyTorch SDPA, flash-attn library, H100 FP8, and sliding window. |
| **hermes-atropos-environments** | Build, test, and debug Hermes Agent RL environments for Atropos training. Covers the HermesAgentBaseEnv interface, reward functions, agent loop integration, and evaluation. |
| **huggingface-tokenizers** | Fast Rust-based tokenizers for research and production. Tokenizes 1GB in under 20 seconds. Supports BPE, WordPiece, and Unigram algorithms. |
| **instructor** | Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, and stream partial results. |
| **lambda-labs** | Reserved and on-demand GPU cloud instances for ML training and inference. SSH access, persistent filesystems, and multi-node clusters. |
| **llava** | Large Language and Vision Assistant — visual instruction tuning and image-based conversations combining CLIP vision with LLaMA language models. |
| **nemo-curator** | GPU-accelerated data curation for LLM training. Fuzzy deduplication (16x faster), quality filtering (30+ heuristics), semantic dedup, PII redaction. Scales with RAPIDS. |
| **pinecone** | Managed vector database for production AI. Auto-scaling, hybrid search (dense + sparse), metadata filtering, and low latency (under 100ms p95). |
| **pytorch-lightning** | High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks, and minimal boilerplate. |
| **qdrant** | High-performance vector similarity search engine. Rust-powered with fast nearest neighbor search, hybrid search with filtering, and scalable vector storage. |
| **saelens** | Train and analyze Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. |
| **simpo** | Simple Preference Optimization — reference-free alternative to DPO with better performance (+6.4 pts on AlpacaEval 2.0). No reference model needed. |
| **slime** | LLM post-training with RL using Megatron+SGLang framework. Custom data generation workflows and tight Megatron-LM integration for RL scaling. |
| **tensorrt-llm** | Optimize LLM inference with NVIDIA TensorRT for maximum throughput. 10-100x faster than PyTorch on A100/H100 with quantization (FP8/INT4) and in-flight batching. |
| **torchtitan** | PyTorch-native distributed LLM pretraining with 4D parallelism (FSDP2, TP, PP, CP). Scale from 8 to 512+ GPUs with Float8 and torch.compile. |
## Productivity
| Skill | Description |
|-------|-------------|
| **canvas** | Canvas LMS integration — fetch enrolled courses and assignments using API token authentication. |
| **memento-flashcards** | Spaced repetition flashcard system for learning and knowledge retention. |
| **siyuan** | SiYuan Note API for searching, reading, creating, and managing blocks and documents in a self-hosted knowledge base. |
| **telephony** | Give Hermes phone capabilities — provision a Twilio number, send/receive SMS/MMS, make calls, and place AI-driven outbound calls through Bland.ai or Vapi. |
## Research
| Skill | Description |
|-------|-------------|
| **bioinformatics** | Gateway to 400+ bioinformatics skills from bioSkills and ClawBio. Covers genomics, transcriptomics, single-cell, variant calling, pharmacogenomics, metagenomics, and structural biology. |
| **domain-intel** | Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, and bulk multi-domain analysis. No API keys required. |
| **duckduckgo-search** | Free web search via DuckDuckGo — text, news, images, videos. No API key needed. |
| **gitnexus-explorer** | Index a codebase with GitNexus and serve an interactive knowledge graph via web UI and Cloudflare tunnel. |
| **parallel-cli** | Vendor skill for Parallel CLI — agent-native web search, extraction, deep research, enrichment, and monitoring. |
| **qmd** | Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. |
| **scrapling** | Web scraping with Scrapling — HTTP fetching, stealth browser automation, Cloudflare bypass, and spider crawling via CLI and Python. |
## Security
| Skill | Description |
|-------|-------------|
| **1password** | Set up and use 1Password CLI (op). Install the CLI, enable desktop app integration, sign in, and read/inject secrets for commands. |
| **oss-forensics** | Open-source software forensics — analyze packages, dependencies, and supply chain risks. |
| **sherlock** | OSINT username search across 400+ social networks. Hunt down social media accounts by username. |
---
## Contributing Optional Skills
To add a new optional skill to the repository:
1. Create a directory under `optional-skills/<category>/<skill-name>/`
2. Add a `SKILL.md` with standard frontmatter (name, description, version, author)
3. Include any supporting files in `references/`, `templates/`, or `scripts/` subdirectories
4. Submit a pull request — the skill will appear in this catalog once merged


@@ -89,9 +89,22 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/<skill-name>` | Load any installed skill as an on-demand command. Example: `/gif-search`, `/github-pr-workflow`, `/excalidraw`. |
| `/skills ...` | Search, browse, inspect, install, audit, publish, and configure skills from registries and the official optional-skills catalog. |
### Quick Commands
User-defined quick commands map a short alias to a longer prompt. Configure them in `~/.hermes/config.yaml`:
```yaml
quick_commands:
  review: "Review my latest git diff and suggest improvements"
  deploy: "Run the deployment script at scripts/deploy.sh and verify the output"
  morning: "Check my calendar, unread emails, and summarize today's priorities"
```
Then type `/review`, `/deploy`, or `/morning` in the CLI. Quick commands are resolved at dispatch time and are not shown in the built-in autocomplete/help tables.
### Alias Resolution
Commands support prefix matching: typing `/h` resolves to `/help`, `/mod` resolves to `/model`. When a prefix is ambiguous (matches multiple commands), the first match in registry order wins. Full command names and registered aliases always take priority over prefix matches.
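The resolution order above can be sketched as a small resolver. The registry contents and function name here are hypothetical, not the actual Hermes implementation:

```python
# Illustrative sketch of slash-command resolution: exact name, then
# registered alias, then first prefix match in registry order.
# COMMANDS/ALIASES contents and resolve_command are hypothetical.
COMMANDS = ["help", "model", "memory", "tools"]  # registry order
ALIASES = {"h": "help"}

def resolve_command(token: str):
    if token in COMMANDS:           # full names take priority
        return token
    if token in ALIASES:            # registered aliases next
        return ALIASES[token]
    for name in COMMANDS:           # first prefix match wins on ambiguity
        if name.startswith(token):
            return name
    return None
```

With this registry, `mod` resolves to `model`, and the ambiguous prefix `m` also resolves to `model` because it appears first in registry order.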
## Messaging slash commands


@@ -6,7 +6,13 @@ description: "Authoritative reference for Hermes built-in tools, grouped by tool
# Built-in Tools Reference
This page documents all 47 built-in tools in the Hermes tool registry, grouped by toolset. Availability varies by platform, credentials, and enabled toolsets.
**Quick counts:** 11 browser tools, 4 file tools, 10 RL tools, 4 Home Assistant tools, 2 terminal tools, 2 web tools, and 14 standalone tools across other toolsets.
:::tip MCP Tools
In addition to built-in tools, Hermes can load tools dynamically from MCP servers. MCP tools appear with a server-name prefix (e.g., `github_create_issue` for the `github` MCP server). See [MCP Integration](/docs/user-guide/features/mcp) for configuration.
:::
## `browser` toolset


@@ -6,53 +6,150 @@ description: "Reference for Hermes core, composite, platform, and dynamic toolse
# Toolsets Reference
Toolsets are named bundles of tools that control what the agent can do. They're the primary mechanism for configuring tool availability per platform, per session, or per task.
## How Toolsets Work
Every tool belongs to exactly one toolset. When you enable a toolset, all tools in that bundle become available to the agent. Toolsets come in three kinds:
- **Core** — A single logical group of related tools (e.g., `file` bundles `read_file`, `write_file`, `patch`, `search_files`)
- **Composite** — Combines multiple core toolsets for a common scenario (e.g., `debugging` bundles file, terminal, and web tools)
- **Platform** — A complete tool configuration for a specific deployment context (e.g., `hermes-cli` is the default for interactive CLI sessions)
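As a rough mental model, resolution can be sketched like this. The toolset contents are abridged from the tables on this page; the resolver itself is an illustrative simplification, not the actual Hermes code:

```python
# Simplified model of toolset resolution — not the actual Hermes resolver.
# Core toolsets map directly to tools; composites expand to core toolsets;
# "all"/"*" expand to every registered toolset.
CORE = {
    "file": {"read_file", "write_file", "patch", "search_files"},
    "terminal": {"process", "terminal"},
    "web": {"web_extract", "web_search"},
}
COMPOSITE = {"debugging": ["file", "terminal", "web"]}

def resolve(names):
    if "all" in names or "*" in names:
        names = list(CORE) + list(COMPOSITE)
    tools = set()
    for name in names:
        # A composite expands to its member core toolsets; a core
        # toolset "expands" to itself.
        for core in COMPOSITE.get(name, [name]):
            tools |= CORE[core]
    return tools
```

Under this model, `resolve(["debugging"])` yields the same flat tool set as `resolve(["file", "terminal", "web"])`.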
## Configuring Toolsets
### Per-session (CLI)
```bash
hermes chat --toolsets web,file,terminal
hermes chat --toolsets debugging # composite — expands to file + terminal + web
hermes chat --toolsets all # everything
```
### Per-platform (config.yaml)
```yaml
toolsets:
  - hermes-cli           # default for CLI
  # - hermes-telegram    # override for Telegram gateway
```
### Interactive management
```bash
hermes tools # curses UI to enable/disable per platform
```
Or in-session:
```
/tools list
/tools disable browser
/tools enable rl
```
## Core Toolsets
| Toolset | Tools | Purpose |
|---------|-------|---------|
| `browser` | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` | Full browser automation. Includes `web_search` as a fallback for quick lookups. |
| `clarify` | `clarify` | Ask the user a question when the agent needs clarification. |
| `code_execution` | `execute_code` | Run Python scripts that call Hermes tools programmatically. |
| `cronjob` | `cronjob` | Schedule and manage recurring tasks. |
| `delegation` | `delegate_task` | Spawn isolated subagent instances for parallel work. |
| `file` | `patch`, `read_file`, `search_files`, `write_file` | File reading, writing, searching, and editing. |
| `homeassistant` | `ha_call_service`, `ha_get_state`, `ha_list_entities`, `ha_list_services` | Smart home control via Home Assistant. Only available when `HASS_TOKEN` is set. |
| `image_gen` | `image_generate` | Text-to-image generation via FAL.ai. |
| `memory` | `memory` | Persistent cross-session memory management. |
| `messaging` | `send_message` | Send messages to other platforms (Telegram, Discord, etc.) from within a session. |
| `moa` | `mixture_of_agents` | Multi-model consensus via Mixture of Agents. |
| `rl` | `rl_check_status`, `rl_edit_config`, `rl_get_current_config`, `rl_get_results`, `rl_list_environments`, `rl_list_runs`, `rl_select_environment`, `rl_start_training`, `rl_stop_training`, `rl_test_inference` | RL training environment management (Atropos). |
| `search` | `web_search` | Web search only (without extract). |
| `session_search` | `session_search` | Search past conversation sessions. |
| `skills` | `skill_manage`, `skill_view`, `skills_list` | Skill CRUD and browsing. |
| `terminal` | `process`, `terminal` | Shell command execution and background process management. |
| `todo` | `todo` | Task list management within a session. |
| `tts` | `text_to_speech` | Text-to-speech audio generation. |
| `vision` | `vision_analyze` | Image analysis via vision-capable models. |
| `web` | `web_extract`, `web_search` | Web search and page content extraction. |
## Composite Toolsets
These expand to multiple core toolsets, providing a convenient shorthand for common scenarios:
| Toolset | Expands to | Use case |
|---------|-----------|----------|
| `debugging` | `patch`, `process`, `read_file`, `search_files`, `terminal`, `web_extract`, `web_search`, `write_file` | Debug sessions — file access, terminal, and web research without browser or delegation overhead. |
| `safe` | `image_generate`, `mixture_of_agents`, `vision_analyze`, `web_extract`, `web_search` | Read-only research and media generation. No file writes, no terminal access, no code execution. Good for untrusted or constrained environments. |
## Platform Toolsets
Platform toolsets define the complete tool configuration for a deployment target. Most messaging platforms use the same set as `hermes-cli`:
| Toolset | Differences from `hermes-cli` |
|---------|-------------------------------|
| `hermes-cli` | Full toolset — all 39 tools including `clarify`. The default for interactive CLI sessions. |
| `hermes-acp` | Drops `clarify`, `cronjob`, `image_generate`, `mixture_of_agents`, `send_message`, `text_to_speech`, homeassistant tools. Focused on coding tasks in IDE context. |
| `hermes-api-server` | Drops `clarify` and `send_message`. Adds everything else — suitable for programmatic access where user interaction isn't possible. |
| `hermes-telegram` | Same as `hermes-cli`. |
| `hermes-discord` | Same as `hermes-cli`. |
| `hermes-slack` | Same as `hermes-cli`. |
| `hermes-whatsapp` | Same as `hermes-cli`. |
| `hermes-signal` | Same as `hermes-cli`. |
| `hermes-matrix` | Same as `hermes-cli`. |
| `hermes-mattermost` | Same as `hermes-cli`. |
| `hermes-email` | Same as `hermes-cli`. |
| `hermes-sms` | Same as `hermes-cli`. |
| `hermes-dingtalk` | Same as `hermes-cli`. |
| `hermes-feishu` | Same as `hermes-cli`. |
| `hermes-wecom` | Same as `hermes-cli`. |
| `hermes-homeassistant` | Same as `hermes-cli`. |
| `hermes-webhook` | Same as `hermes-cli`. |
| `hermes-gateway` | Union of all messaging platform toolsets. Used internally when the gateway needs the broadest possible tool set. |
## Dynamic Toolsets
### MCP server toolsets
Each configured MCP server generates a `mcp-<server>` toolset at runtime. For example, if you configure a `github` MCP server, a `mcp-github` toolset is created containing all tools that server exposes.
```yaml
# config.yaml
mcp:
  servers:
    github:
      command: npx
      args: ["-y", "@modelcontextprotocol/server-github"]
```
This creates a `mcp-github` toolset you can reference in `--toolsets` or platform configs.
### Plugin toolsets
Plugins can register their own toolsets via `ctx.register_tool()` during plugin initialization. These appear alongside built-in toolsets and can be enabled/disabled the same way.
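A minimal plugin module might look like the following sketch. Only `ctx.register_tool()` comes from the source above; the hook name and the shape of the context object are assumptions for illustration:

```python
# Hypothetical plugin sketch — the init hook name and context object are
# illustrative assumptions; only ctx.register_tool() comes from the docs.
def init_plugin(ctx):
    def word_count(text: str) -> int:
        """Count whitespace-separated words in a string."""
        return len(text.split())

    # Registered tools appear in a plugin toolset alongside built-ins
    # and can be enabled/disabled the same way.
    ctx.register_tool(word_count)
```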
### Custom toolsets
Define custom toolsets in `config.yaml` to create project-specific bundles:
```yaml
toolsets:
  - hermes-cli
custom_toolsets:
  data-science:
    - file
    - terminal
    - code_execution
    - web
    - vision
```
### Wildcards
- `all` or `*` — expands to every registered toolset (built-in + dynamic + plugin)
## Relationship to `hermes tools`
The `hermes tools` command provides a curses-based UI for toggling individual tools on or off per platform. This operates at the tool level (finer than toolsets) and persists to `config.yaml`. Disabled tools are filtered out even if their toolset is enabled.
See also: [Tools Reference](./tools-reference.md) for the complete list of individual tools and their parameters.


@@ -95,6 +95,38 @@ All paths are resolved relative to the working directory. References that resolv
Binary files are detected via MIME type and null-byte scanning. Known text extensions (`.py`, `.md`, `.json`, `.yaml`, `.toml`, `.js`, `.ts`, etc.) bypass MIME-based detection. Binary files are rejected with a warning.
## Platform Availability
Context references are primarily a **CLI feature**. They work in the interactive CLI where `@` triggers tab completion and references are expanded before the message is sent to the agent.
In **messaging platforms** (Telegram, Discord, etc.), the `@` syntax is not expanded by the gateway — messages are passed through as-is. The agent itself can still reference files via the `read_file`, `search_files`, and `web_extract` tools.
## Interaction with Context Compression
When conversation context is compressed, the expanded reference content is included in the compression summary. This means:
- Large file contents injected via `@file:` contribute to context usage
- If the conversation is later compressed, the file content is summarized (not preserved verbatim)
- For very large files, consider using line ranges (`@file:main.py:100-200`) to inject only relevant sections
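For intuition, the line-range form can be split apart like this. This is a sketch of the `@file:` syntax documented on this page, not the actual Hermes expansion code:

```python
import re

# Illustrative parser for @file:path[:start-end] — a sketch of the syntax
# documented above, not the actual Hermes reference expander.
FILE_REF = re.compile(r"@file:(?P<path>[^\s:]+)(?::(?P<start>\d+)-(?P<end>\d+))?")

def parse_file_ref(token):
    m = FILE_REF.fullmatch(token)
    if m is None:
        return None
    start = int(m["start"]) if m["start"] else None
    end = int(m["end"]) if m["end"] else None
    return m["path"], start, end
```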
## Common Patterns
```text
# Code review workflow
Review @diff and check for security issues
# Debug with context
This test is failing. Here's the test @file:tests/test_auth.py
and the implementation @file:src/auth.py:50-80
# Project exploration
What does this project do? @folder:src @file:README.md
# Research
Compare the approaches in @url:https://arxiv.org/abs/2301.00001
and @url:https://arxiv.org/abs/2301.00002
```
## Error Handling
Invalid references produce inline warnings rather than failures:


@@ -187,9 +187,21 @@ When scheduling jobs, you specify where the output goes:
| `"origin"` | Back to where the job was created | Default on messaging platforms |
| `"local"` | Save to local files only (`~/.hermes/cron/output/`) | Default on CLI |
| `"telegram"` | Telegram home channel | Uses `TELEGRAM_HOME_CHANNEL` |
| `"telegram:123456"` | Specific Telegram chat by ID | Direct delivery |
| `"discord:987654"` | Specific Discord channel by ID | Direct delivery |
| `"telegram:-100123:17585"` | Specific Telegram topic | `chat_id:thread_id` format |
| `"discord"` | Discord home channel | Uses `DISCORD_HOME_CHANNEL` |
| `"discord:#engineering"` | Specific Discord channel | By channel name |
| `"slack"` | Slack home channel | |
| `"whatsapp"` | WhatsApp home | |
| `"signal"` | Signal | |
| `"matrix"` | Matrix home room | |
| `"mattermost"` | Mattermost home channel | |
| `"email"` | Email | |
| `"sms"` | SMS via Twilio | |
| `"homeassistant"` | Home Assistant | |
| `"dingtalk"` | DingTalk | |
| `"feishu"` | Feishu/Lark | |
| `"wecom"` | WeCom | |
The agent's final response is automatically delivered. You do not need to call `send_message` in the cron prompt.
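The target strings above follow a `platform[:chat_id[:thread_id]]` shape, which could be parsed as in this sketch (illustrative only, not the actual Hermes cron delivery code):

```python
# Illustrative parser for delivery-target strings such as
# "telegram:-100123:17585" (platform:chat_id:thread_id) — a sketch,
# not the actual Hermes implementation.
def parse_target(target: str) -> dict:
    platform, _, rest = target.partition(":")
    chat_id, _, thread_id = rest.partition(":")
    return {
        "platform": platform,
        "chat_id": chat_id or None,     # empty string -> None
        "thread_id": thread_id or None,
    }
```

Bare targets like `"local"` or `"telegram"` yield only a platform; `"discord:#engineering"` yields a platform plus channel.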


@@ -1,22 +1,39 @@
---
sidebar_position: 99
title: "Honcho Memory"
description: "AI-native persistent memory via Honcho — dialectic reasoning, multi-agent user modeling, and deep personalization"
---
# Honcho Memory
[Honcho](https://github.com/plastic-labs/honcho) is an AI-native memory backend that adds dialectic reasoning and deep user modeling on top of Hermes's built-in memory system. Instead of simple key-value storage, Honcho maintains a running model of who the user is — their preferences, communication style, goals, and patterns — by reasoning about conversations after they happen.
:::info Honcho is a Memory Provider Plugin
Honcho is integrated into the [Memory Providers](./memory-providers.md) system. All features below are available through the unified memory provider interface.
:::
## What Honcho Adds
| Capability | Built-in Memory | Honcho |
|-----------|----------------|--------|
| Cross-session persistence | ✔ File-based MEMORY.md/USER.md | ✔ Server-side with API |
| User profile | ✔ Manual agent curation | ✔ Automatic dialectic reasoning |
| Multi-agent isolation | — | ✔ Per-peer profile separation |
| Observation modes | — | ✔ Unified or directional observation |
| Conclusions (derived insights) | — | ✔ Server-side reasoning about patterns |
| Search across history | ✔ FTS5 session search | ✔ Semantic search over conclusions |
**Dialectic reasoning**: After each conversation, Honcho analyzes the exchange and derives "conclusions" — insights about the user's preferences, habits, and goals. These conclusions accumulate over time, giving the agent a deepening understanding that goes beyond what the user explicitly stated.
**Multi-agent profiles**: When multiple Hermes instances talk to the same user (e.g., a coding assistant and a personal assistant), Honcho maintains separate "peer" profiles. Each peer sees only its own observations and conclusions, preventing cross-contamination of context.
## Setup
```bash
hermes memory setup # select "honcho" from the provider list
```
Or configure manually:
```yaml
# ~/.hermes/config.yaml
memory:
  provider: honcho
```
```bash
echo "HONCHO_API_KEY=your-key" >> ~/.hermes/.env
```
Get an API key at [honcho.dev](https://honcho.dev).
## Configuration Options
```yaml
# ~/.hermes/config.yaml
honcho:
  observation: directional # "unified" (default for new installs) or "directional"
  peer_name: "" # auto-detected from platform, or set manually
```
**Observation modes:**
- `unified` — All observations go into a single pool. Simpler, good for single-agent setups.
- `directional` — Observations are tagged with direction (user→agent, agent→user). Enables richer analysis of conversation dynamics.
## Tools
When Honcho is active as the memory provider, four additional tools become available:
| Tool | Purpose |
|------|---------|
| `honcho_conclude` | Trigger server-side dialectic reasoning on recent conversations |
| `honcho_context` | Retrieve relevant context from Honcho's memory for the current conversation |
| `honcho_profile` | View or update the user's Honcho profile |
| `honcho_search` | Semantic search across all stored conclusions and observations |
## CLI Commands
```bash
hermes honcho status # Show connection status and config
hermes honcho peer # Update peer names for multi-agent setups
```
## Migrating from `hermes honcho`
If you previously used the standalone `hermes honcho setup`:
1. Your existing configuration (`honcho.json` or `~/.honcho/config.json`) is preserved
2. Your server-side data (memories, conclusions, user profiles) is intact
3. Set `memory.provider: honcho` in config.yaml to reactivate
No re-login or re-setup needed. Run `hermes memory setup` and select "honcho" — the wizard detects your existing config.
## Full Documentation
See [Memory Providers — Honcho](./memory-providers.md#honcho) for the complete reference.

View File

Debug logs are saved to `./logs/image_tools_debug_<session_id>.json` with details of each request.
The image generation tool runs with safety checks disabled by default (`safety_tolerance: 5`, the most permissive setting). This is configured at the code level and is not user-adjustable.
## Platform Delivery
Generated images are delivered differently depending on the platform:
| Platform | Delivery method |
|----------|----------------|
| **CLI** | Image URL printed as markdown `![description](url)` — click to open in browser |
| **Telegram** | Image sent as a photo message with the prompt as caption |
| **Discord** | Image embedded in a message |
| **Slack** | Image URL in message (Slack unfurls it) |
| **WhatsApp** | Image sent as a media message |
| **Other platforms** | Image URL in plain text |
The agent uses `MEDIA:<url>` syntax in its response, which the platform adapter converts to the appropriate format.
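To illustrate the `MEDIA:<url>` convention, here is a minimal, hypothetical parser in the style a platform adapter might use — the token syntax comes from the docs above, but the splitting logic is illustrative, not Hermes's actual code:

```python
import re

# MEDIA:<url> tokens are stripped out of the response text so the
# adapter can send the media separately (photo upload, embed, etc.).
MEDIA_RE = re.compile(r"MEDIA:(\S+)")

def split_media(response: str) -> tuple[str, list[str]]:
    """Return the text with MEDIA tokens removed, plus the media URLs."""
    urls = MEDIA_RE.findall(response)
    text = MEDIA_RE.sub("", response).strip()
    return text, urls

text, urls = split_media("Here you go! MEDIA:https://fal.ai/img/abc.png")
# text == "Here you go!", urls == ["https://fal.ai/img/abc.png"]
```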
## Limitations
- **Requires FAL API key** — image generation incurs API costs on your FAL.ai account
- **No image editing** — this is text-to-image only, no inpainting or img2img
- **URL-based delivery** — images are returned as temporary FAL.ai URLs, not saved locally
- **URL-based delivery** — images are returned as temporary FAL.ai URLs, not saved locally. URLs expire after a period (typically hours)
- **Upscaling adds latency** — the automatic 2x upscale step adds processing time
- **Max 4 images per request** — `num_images` is capped at 4

View File

Hermes Agent includes a rich set of capabilities that extend far beyond basic chat.
- **[Browser Automation](browser.md)** — Full browser automation with multiple backends: Browserbase cloud, Browser Use cloud, local Chrome via CDP, or local Chromium. Navigate websites, fill forms, and extract information.
- **[Vision & Image Paste](vision.md)** — Multimodal vision support. Paste images from your clipboard into the CLI and ask the agent to analyze, describe, or work with them using any vision-capable model.
- **[Image Generation](image-generation.md)** — Generate images from text prompts using FAL.ai's FLUX 2 Pro model with automatic 2x upscaling via the Clarity Upscaler.
- **[Voice & TTS](tts.md)** — Text-to-speech output and voice message transcription across all messaging platforms, with five provider options: Edge TTS (free), ElevenLabs, OpenAI TTS, MiniMax, and NeuTTS.
## Integrations
- **[MCP Integration](mcp.md)** — Connect to any MCP server via stdio or HTTP transport. Access external tools from GitHub, databases, file systems, and internal APIs without writing native Hermes tools. Includes per-server tool filtering and sampling support.
- **[Provider Routing](provider-routing.md)** — Fine-grained control over which AI providers handle your requests. Optimize for cost, speed, or quality with sorting, whitelists, blacklists, and priority ordering.
- **[Fallback Providers](fallback-providers.md)** — Automatic failover to backup LLM providers when your primary model encounters errors, including independent fallback for auxiliary tasks like vision and compression.
- **[Credential Pools](credential-pools.md)** — Distribute API calls across multiple keys for the same provider. Automatic rotation on rate limits or failures.
- **[Memory Providers](memory-providers.md)** — Plug in external memory backends (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover) for cross-session user modeling and personalization beyond the built-in memory system.
- **[API Server](api-server.md)** — Expose Hermes as an OpenAI-compatible HTTP endpoint. Connect any frontend that speaks the OpenAI format — Open WebUI, LobeChat, LibreChat, and more.
- **[IDE Integration (ACP)](acp.md)** — Use Hermes inside ACP-compatible editors such as VS Code, Zed, and JetBrains. Chat, tool activity, file diffs, and terminal commands render inside your editor.
- **[Honcho Memory](honcho.md)** — AI-native persistent memory for cross-session user modeling and personalization via dialectic reasoning.
- **[RL Training](rl-training.md)** — Generate trajectory data from agent sessions for reinforcement learning and model fine-tuning.
## Customization

View File

Routes define how different webhook sources are handled. Each route is a named entry.
| `secret` | **Yes** | HMAC secret for signature validation. Falls back to the global `secret` if not set on the route. Set to `"INSECURE_NO_AUTH"` for testing only (skips validation). |
| `prompt` | No | Template string with dot-notation payload access (e.g. `{pull_request.title}`). If omitted, the full JSON payload is dumped into the prompt. |
| `skills` | No | List of skill names to load for the agent run. |
| `deliver` | No | Where to send the response: `github_comment`, `telegram`, `discord`, `slack`, `signal`, `matrix`, `mattermost`, `email`, `sms`, `dingtalk`, `feishu`, `wecom`, or `log` (default). |
| `deliver_extra` | No | Additional delivery config — keys depend on `deliver` type (e.g. `repo`, `pr_number`, `chat_id`). Values support the same `{dot.notation}` templates as `prompt`. |
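As an illustration of the `{dot.notation}` templating described above, here is a minimal, hypothetical resolver — not Hermes's actual implementation, just a sketch of how nested payload keys map into the prompt string:

```python
import re

def render(template: str, payload: dict) -> str:
    """Replace {a.b.c} placeholders with nested values from the payload."""
    def resolve(match: re.Match) -> str:
        value = payload
        for key in match.group(1).split("."):
            value = value[key]  # walk one level per dotted segment
        return str(value)
    return re.sub(r"\{([\w.]+)\}", resolve, template)

payload = {"pull_request": {"title": "Fix race in gateway", "number": 42}}
rendered = render("New PR #{pull_request.number}: {pull_request.title}", payload)
print(rendered)
# → New PR #42: Fix race in gateway
```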
### Full example

View File

Hermes Agent automatically saves every conversation as a session. Sessions enable resuming conversations and searching past history.
## How Sessions Work
Every conversation — whether from the CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, or any other messaging platform — is stored as a session with full message history. Sessions are tracked in two complementary systems:
1. **SQLite database** (`~/.hermes/state.db`) — structured session metadata with FTS5 full-text search
2. **JSONL transcripts** (`~/.hermes/sessions/`) — raw conversation transcripts including tool calls (gateway)
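Since transcripts are plain JSONL, they are easy to post-process. The sketch below reads one line-per-message; the `"role"`/`"content"` record shape is an assumption for illustration, not Hermes's documented schema:

```python
import json
from pathlib import Path

def read_transcript(path: Path) -> list[dict]:
    """Parse a JSONL transcript: one JSON object per non-empty line."""
    with path.open() as fh:
        return [json.loads(line) for line in fh if line.strip()]

# Demo against a throwaway file (real transcripts live in ~/.hermes/sessions/).
demo = Path("demo_transcript.jsonl")
demo.write_text('{"role": "user", "content": "hi"}\n'
                '{"role": "assistant", "content": "hello"}\n')
messages = read_transcript(demo)
demo.unlink()
```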
Each session is tagged with its source platform:
| `cli` | Interactive CLI (`hermes` or `hermes chat`) |
| `telegram` | Telegram messenger |
| `discord` | Discord server/DM |
| `slack` | Slack workspace |
| `whatsapp` | WhatsApp messenger |
| `signal` | Signal messenger |
| `matrix` | Matrix rooms and DMs |
| `mattermost` | Mattermost channels |
| `email` | Email (IMAP/SMTP) |
| `sms` | SMS via Twilio |
| `dingtalk` | DingTalk messenger |
| `feishu` | Feishu/Lark messenger |
| `wecom` | WeCom (WeChat Work) |
| `homeassistant` | Home Assistant conversation |
| `webhook` | Incoming webhooks |
| `api-server` | API server requests |
| `acp` | ACP editor integration |
| `cron` | Scheduled cron jobs |
| `batch` | Batch processing runs |
## CLI Session Resume