docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393)

Major changes across 20 documentation pages:

Staleness fixes:
- Fix FAQ: wrong import path (hermes.agent → run_agent)
- Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash
- Fix integrations/index: missing MiniMax TTS provider
- Fix integrations/index: web_crawl is not a registered tool
- Fix sessions: add all 19 session sources (was only 5)
- Fix cron: add all 18 delivery targets (was only telegram/discord)
- Fix webhooks: add all delivery targets
- Fix overview: add missing MCP, memory providers, credential pools
- Fix all line-number references → use function name searches instead
- Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500)

Expanded thin pages (< 150 lines → substantial depth):
- honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI
- overview.md: 49 → 55 lines — added MCP, memory providers, credential pools
- toolsets-reference.md: 57 → 175 lines — added explanations, config examples,
  custom toolsets, wildcards, platform differences table
- optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across
  communication, devops, mlops (18!), productivity, research categories
- integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections
- cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states,
  tick cycle, delivery targets, script-backed jobs, CLI interface
- gateway-internals.md: 111 → 250 lines — added architecture diagram, message
  flow, two-level guard, platform adapters, token locks, process management
- agent-loop.md: 112 → 235 lines — added entry points, API mode resolution,
  turn lifecycle detail, message alternation rules, tool execution flow,
  callback table, budget tracking, compression details
- architecture.md: 152 → 295 lines — added system overview diagram, data flow
  diagrams, design principles table, dependency chain

Other depth additions:
- context-references.md: added platform availability, compression interaction,
  common patterns sections
- slash-commands.md: added quick commands config example, alias resolution
- image-generation.md: added platform delivery table
- tools-reference.md: added tool counts, MCP tools note
- index.md: updated platform count (5 → 14+), tool count (40+ → 47)
This commit is contained in:
Teknium
2026-04-05 19:45:50 -07:00
committed by GitHub
parent fec58ad99e
commit 43d468cea8
20 changed files with 1243 additions and 406 deletions

View File

@@ -6,107 +6,231 @@ description: "Detailed walkthrough of AIAgent execution, API modes, tools, callb
# Agent Loop Internals
The core orchestration engine is `run_agent.py`'s `AIAgent` class — roughly 9,200 lines that handle everything from prompt assembly to tool dispatch to provider failover.
## Core Responsibilities

`AIAgent` is responsible for:

- Assembling the effective system prompt and tool schemas via `prompt_builder.py`
- Selecting the correct provider/API mode (chat_completions, codex_responses, anthropic_messages)
- Making interruptible model calls with cancellation support
- Executing tool calls (sequentially or concurrently via thread pool)
- Maintaining conversation history in OpenAI message format
- Handling compression, retries, and fallback model switching
- Tracking iteration budgets across parent and child agents
- Flushing persistent memory before context is lost
## Two Entry Points

```python
# Simple interface — returns final response string
response = agent.chat("Fix the bug in main.py")

# Full interface — returns dict with messages, metadata, usage stats
result = agent.run_conversation(
    user_message="Fix the bug in main.py",
    system_message=None,         # auto-built if omitted
    conversation_history=None,   # auto-loaded from session if omitted
    task_id="task_abc123"
)
```

`chat()` is a thin wrapper around `run_conversation()` that extracts the `final_response` field from the result dict.
## API Modes

Hermes supports three API execution modes, resolved from provider selection, explicit args, and base URL heuristics:
| API mode | Used for | Client type |
|----------|----------|-------------|
| `chat_completions` | OpenAI-compatible endpoints (OpenRouter, custom, most providers) | `openai.OpenAI` |
| `codex_responses` | OpenAI Codex / Responses API | `openai.OpenAI` with Responses format |
| `anthropic_messages` | Native Anthropic Messages API | `anthropic.Anthropic` via adapter |
The mode determines how messages are formatted, how tool calls are structured, how responses are parsed, and how caching/streaming works. All three converge on the same internal message format (OpenAI-style `role`/`content`/`tool_calls` dicts) before and after API calls.
**Mode resolution order:**
1. Explicit `api_mode` constructor arg (highest priority)
2. Provider-specific detection (e.g., `anthropic` provider → `anthropic_messages`)
3. Base URL heuristics (e.g., `api.anthropic.com` → `anthropic_messages`)
4. Default: `chat_completions`
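As a sketch, the resolution order can be expressed as a small helper. The function name and signature here are illustrative, not the actual Hermes API:

```python
def resolve_api_mode(api_mode=None, provider=None, base_url=None):
    """Sketch of the priority order; not the real Hermes implementation."""
    # 1. Explicit constructor arg wins
    if api_mode:
        return api_mode
    # 2. Provider-specific detection
    if provider == "anthropic":
        return "anthropic_messages"
    # 3. Base URL heuristics
    if base_url and "api.anthropic.com" in base_url:
        return "anthropic_messages"
    # 4. Default
    return "chat_completions"
```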
## Turn Lifecycle
Each iteration of the agent loop follows this sequence:
```text
run_conversation()
-> generate effective task_id
-> append current user message
-> load or build cached system prompt
-> maybe preflight-compress
-> build api_messages
-> inject ephemeral prompt layers
-> apply prompt caching if appropriate
-> make interruptible API call
-> if tool calls: execute them, append tool results, loop
-> if final text: persist, cleanup, return response
1. Generate task_id if not provided
2. Append user message to conversation history
3. Build or reuse cached system prompt (prompt_builder.py)
4. Check if preflight compression is needed (>50% context)
5. Build API messages from conversation history
- chat_completions: OpenAI format as-is
- codex_responses: convert to Responses API input items
- anthropic_messages: convert via anthropic_adapter.py
6. Inject ephemeral prompt layers (budget warnings, context pressure)
7. Apply prompt caching markers if on Anthropic
8. Make interruptible API call (_api_call_with_interrupt)
9. Parse response:
- If tool_calls: execute them, append results, loop back to step 5
- If text response: persist session, flush memory if needed, return
```
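The lifecycle above reduces to a short skeleton. This is a hedged sketch: `call_model` and `execute_tools` stand in for the interruptible API call and tool dispatch, and the real loop adds compression, caching, and budget tracking:

```python
def agent_loop(messages, call_model, execute_tools, max_turns=90):
    """Bare skeleton of the turn lifecycle (illustrative, not the real loop)."""
    for _ in range(max_turns):
        response = call_model(messages)      # step 8: interruptible API call
        messages.append(response)
        if response.get("tool_calls"):       # step 9a: execute tools and loop
            messages.extend(execute_tools(response["tool_calls"]))
            continue
        return response["content"]           # step 9b: final text response
    return None                              # budget exhausted
```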
### Message Format

All messages use OpenAI-compatible format internally:

```python
{"role": "system", "content": "..."}
{"role": "user", "content": "..."}
{"role": "assistant", "content": "...", "tool_calls": [...]}
{"role": "tool", "tool_call_id": "...", "content": "..."}
```

Reasoning content (from models that support extended thinking) is stored in `assistant_msg["reasoning"]` and optionally displayed via the `reasoning_callback`.
### Message Alternation Rules

The agent loop enforces strict message role alternation:

- After the system message: `User → Assistant → User → Assistant → ...`
- During tool calling: `Assistant (with tool_calls) → Tool → Tool → ... → Assistant`
- **Never** two assistant messages in a row
- **Never** two user messages in a row
- **Only** `tool` role can have consecutive entries (parallel tool results)

Providers validate these sequences and will reject malformed histories.
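A provider-style validator for these rules can be sketched in a few lines (illustrative only; providers implement their own checks server-side):

```python
def validate_alternation(messages):
    """Check the role-alternation rules a provider would enforce (sketch)."""
    prev = None
    for msg in messages:
        role = msg["role"]
        if role == prev and role != "tool":
            return False  # only consecutive tool results are allowed
        if role == "tool" and prev not in ("assistant", "tool"):
            return False  # tool results must follow an assistant tool call
        prev = role
    return True
```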
## Interruptible API Calls

API requests are wrapped in `_api_call_with_interrupt()`, which runs the actual HTTP call in a background thread while monitoring an interrupt event:

```text
┌──────────────────────┐      ┌──────────────┐
│ Main thread          │      │ API thread   │
│ wait on:             │─────▶│ HTTP POST    │
│  - response ready    │      │ to provider  │
│  - interrupt event   │      └──────────────┘
│  - timeout           │
└──────────────────────┘
```

When interrupted (user sends a new message, `/stop` command, or signal):

- The API thread is abandoned (response discarded)
- The agent can process the new input or shut down cleanly
- No partial response is injected into conversation history
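A minimal version of this pattern, using only the standard library (the real `_api_call_with_interrupt()` has more states and callbacks), looks like:

```python
import threading
import time

def call_with_interrupt(api_call, interrupt_event, timeout=300.0):
    """Run api_call in a background thread; abandon it if interrupted (sketch)."""
    result = {}
    done = threading.Event()

    def worker():
        result["response"] = api_call()
        done.set()

    threading.Thread(target=worker, daemon=True).start()
    deadline = time.monotonic() + timeout
    while not done.wait(timeout=0.05):
        if interrupt_event.is_set() or time.monotonic() > deadline:
            return None  # API thread is abandoned; response discarded
    return result.get("response")
```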
## Tool Execution

### Sequential vs Concurrent

When the model returns tool calls:

- **Single tool call** → executed directly in the main thread
- **Multiple tool calls** → executed concurrently via `ThreadPoolExecutor`
- Exception: tools marked as interactive (e.g., `clarify`) force sequential execution
- Results are reinserted in the original tool call order regardless of completion order

### Execution Flow

```text
for each tool_call in response.tool_calls:
  1. Resolve handler from tools/registry.py
  2. Fire pre_tool_call plugin hook
  3. Check if dangerous command (tools/approval.py)
     - If dangerous: invoke approval_callback, wait for user
  4. Execute handler with args + task_id
  5. Fire post_tool_call plugin hook
  6. Append {"role": "tool", "content": result} to history
```
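The concurrent path with order preservation can be sketched with `ThreadPoolExecutor.map`, which returns results in input order regardless of completion order (function names here are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def run_tool_calls(tool_calls, execute):
    """Execute tool calls concurrently; reinsert results in call order (sketch)."""
    if len(tool_calls) == 1:
        results = [execute(tool_calls[0])]  # single call stays on the main thread
    else:
        with ThreadPoolExecutor(max_workers=len(tool_calls)) as pool:
            results = list(pool.map(execute, tool_calls))  # map preserves order
    return [
        {"role": "tool", "tool_call_id": call["id"], "content": result}
        for call, result in zip(tool_calls, results)
    ]
```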
### Agent-Level Tools

Some tools are intercepted by `run_agent.py` *before* reaching `handle_function_call()`:

| Tool | Why intercepted |
|------|-----------------|
| `todo` | Reads/writes agent-local task state |
| `memory` | Writes to persistent memory files with character limits |

These tools modify agent state directly and return synthetic tool results without going through the registry.
## Callback Surfaces
`AIAgent` supports platform-specific callbacks that enable real-time progress in the CLI, gateway, and ACP integrations:
| Callback | When fired | Used by |
|----------|-----------|---------|
| `tool_progress_callback` | Before/after each tool execution | CLI spinner, gateway progress messages |
| `thinking_callback` | When model starts/stops thinking | CLI "thinking..." indicator |
| `reasoning_callback` | When model returns reasoning content | CLI reasoning display, gateway reasoning blocks |
| `clarify_callback` | When `clarify` tool is called | CLI input prompt, gateway interactive message |
| `step_callback` | After each complete agent turn | Gateway step tracking, ACP progress |
| `stream_delta_callback` | Each streaming token (when enabled) | CLI streaming display |
| `tool_gen_callback` | When tool call is parsed from stream | CLI tool preview in spinner |
| `status_callback` | State changes (thinking, executing, etc.) | ACP status updates |
## Budget and Fallback Behavior
### Iteration Budget
The agent tracks iterations via `IterationBudget`:
- Default: 90 iterations (configurable via `agent.max_turns`)
- Shared across parent and child agents — a subagent consumes from the parent's budget
- At 70%+ usage, `_get_budget_warning()` appends a `[BUDGET WARNING: ...]` to the last tool result
- At 100%, the agent stops and returns a summary of work done
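A toy model of this budget logic (class and method names are illustrative, not the real `IterationBudget` API):

```python
class IterationBudgetSketch:
    """Minimal model of the shared parent/child iteration budget."""

    def __init__(self, max_turns=90):
        self.max_turns = max_turns
        self.used = 0

    def consume(self):
        self.used += 1

    def warning(self):
        # At 70%+ usage, a budget warning is appended to the last tool result
        if self.used >= 0.7 * self.max_turns:
            return f"[BUDGET WARNING: {self.max_turns - self.used} iterations left]"
        return None

    def exhausted(self):
        # At 100%, the agent stops and summarizes work done
        return self.used >= self.max_turns
```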
### Fallback Model
When the primary model fails (429 rate limit, 5xx server error, 401/403 auth error):
1. Check `fallback_providers` list in config
2. Try each fallback in order
3. On success, continue the conversation with the new provider
4. On 401/403, attempt credential refresh before failing over
The fallback system also covers auxiliary tasks independently — vision, compression, web extraction, and session search each have their own fallback chain configurable via the `auxiliary.*` config section.
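The failover order can be sketched as follows; the exception classes stand in for provider-specific error types and are not Hermes names:

```python
class AuthError(Exception): pass        # stands in for 401/403
class RateLimitError(Exception): pass   # stands in for 429
class ServerError(Exception): pass      # stands in for 5xx

def call_with_fallback(primary, fallbacks, make_call, refresh_credentials):
    """Try the primary provider, then each fallback in order (sketch)."""
    for provider in [primary] + list(fallbacks):
        try:
            return make_call(provider)
        except AuthError:
            try:
                refresh_credentials(provider)  # attempt credential refresh first
                return make_call(provider)     # one retry after refresh
            except Exception:
                continue                       # then fail over
        except (RateLimitError, ServerError):
            continue                           # fail over immediately
    raise RuntimeError("all providers failed")
```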
## Compression and Persistence
### When Compression Triggers
- **Preflight** (before API call): If conversation exceeds 50% of model's context window
- **Gateway auto-compression**: If conversation exceeds 85% (more aggressive, runs between turns)
### What Happens During Compression
1. Memory is flushed to disk first (preventing data loss)
2. Middle conversation turns are summarized into a compact summary
3. The last N messages are preserved intact (`compression.protect_last_n`, default: 20)
4. Tool call/result message pairs are kept together (never split)
5. A new session lineage ID is generated (compression creates a "child" session)
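Steps 3 and 4 can be sketched as a slicing helper (hypothetical name; the real algorithm in `context_compressor.py` also summarizes the middle span):

```python
def split_for_compression(messages, protect_last_n=20):
    """Pick the span to summarize: keep system msg + recent tail (sketch)."""
    if len(messages) <= protect_last_n + 1:
        return [], messages[1:]  # nothing to compress
    middle = messages[1:-protect_last_n]
    tail = messages[-protect_last_n:]
    # Never split an assistant tool_call from its tool results: if the tail
    # starts with tool results, pull messages back until it starts cleanly.
    while middle and tail[0]["role"] == "tool":
        tail.insert(0, middle.pop())
    return middle, tail
```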
### Session Persistence
After each turn:
- Messages are saved to the session store (SQLite via `hermes_state.py`)
- Memory changes are flushed to `MEMORY.md` / `USER.md`
- The session can be resumed later via `/resume` or `hermes chat --resume`
## Key Source Files
| File | Purpose |
|------|---------|
| `run_agent.py` | AIAgent class — the complete agent loop (~9,200 lines) |
| `agent/prompt_builder.py` | System prompt assembly from memory, skills, context files, personality |
| `agent/context_compressor.py` | Conversation compression algorithm |
| `agent/prompt_caching.py` | Anthropic prompt caching markers and cache metrics |
| `agent/auxiliary_client.py` | Auxiliary LLM client for side tasks (vision, summarization) |
| `model_tools.py` | Tool schema collection, `handle_function_call()` dispatch |
## Related Docs
- [Provider Runtime Resolution](./provider-runtime.md)
- [Prompt Assembly](./prompt-assembly.md)
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
- [Tools Runtime](./tools-runtime.md)
- [Architecture Overview](./architecture.md)

View File

@@ -1,152 +1,274 @@
---
sidebar_position: 1
title: "Architecture"
description: "Hermes Agent internals — major subsystems, execution paths, data flow, and where to read next"
---
# Architecture
This page is the top-level map of Hermes Agent internals. Use it to orient yourself in the codebase, then dive into subsystem-specific docs for implementation details.

## System Overview
```text
┌─────────────────────────────────────────────────────────────────────┐
│ Entry Points │
│ │
│ CLI (cli.py) Gateway (gateway/run.py) ACP (acp_adapter/) │
│ Batch Runner API Server Python Library │
└──────────┬──────────────┬───────────────────────┬────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────────┐
│ AIAgent (run_agent.py) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Prompt │ │ Provider │ │ Tool │ │
│ │ Builder │ │ Resolution │ │ Dispatch │ │
│ │ (prompt_ │ │ (runtime_ │ │ (model_ │ │
│ │ builder.py) │ │ provider.py)│ │ tools.py) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ┌──────┴───────┐ ┌──────┴───────┐ ┌──────┴───────┐ │
│ │ Compression │ │ 3 API Modes │ │ Tool Registry│ │
│ │ & Caching │ │ chat_compl. │ │ (registry.py)│ │
│ │ │ │ codex_resp. │ │ 47 tools │ │
│ │ │ │ anthropic │ │ 37 toolsets │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
│ │
▼ ▼
┌───────────────────┐ ┌──────────────────────┐
│ Session Storage │ │ Tool Backends │
│ (SQLite + FTS5) │ │ Terminal (6 backends) │
│ hermes_state.py │ │ Browser (5 backends) │
│ gateway/session.py│ │ Web (4 backends) │
└───────────────────┘ │ MCP (dynamic) │
│ File, Vision, etc. │
└──────────────────────┘
```
## Directory Structure
```text
hermes-agent/
├── run_agent.py           # AIAgent core conversation loop (~9,200 lines)
├── cli.py                 # HermesCLI — interactive terminal UI (~8,500 lines)
├── model_tools.py         # Tool discovery, schema collection, dispatch
├── toolsets.py            # Tool groupings and platform presets
├── hermes_state.py        # SQLite session/state database with FTS5
├── hermes_constants.py    # HERMES_HOME, profile-aware paths
├── batch_runner.py        # Batch trajectory generation
├── agent/                 # Agent internals
│   ├── prompt_builder.py      # System prompt assembly
│   ├── context_compressor.py  # Conversation compression algorithm
│   ├── prompt_caching.py      # Anthropic prompt caching
│   ├── auxiliary_client.py    # Auxiliary LLM for side tasks (vision, summarization)
│   ├── model_metadata.py      # Model context lengths, token estimation
│   ├── models_dev.py          # models.dev registry integration
│   ├── anthropic_adapter.py   # Anthropic Messages API format conversion
│   ├── display.py             # KawaiiSpinner, tool preview formatting
│   ├── skill_commands.py      # Skill slash commands
│   ├── memory_store.py        # Persistent memory read/write
│   └── trajectory.py          # Trajectory saving helpers
├── hermes_cli/ # CLI subcommands and setup
│ ├── main.py # Entry point — all `hermes` subcommands (~4,200 lines)
│ ├── config.py # DEFAULT_CONFIG, OPTIONAL_ENV_VARS, migration
│ ├── commands.py # COMMAND_REGISTRY — central slash command definitions
│ ├── auth.py # PROVIDER_REGISTRY, credential resolution
│ ├── runtime_provider.py # Provider → api_mode + credentials
│ ├── models.py # Model catalog, provider model lists
│ ├── model_switch.py # /model command logic (CLI + gateway shared)
│ ├── setup.py # Interactive setup wizard (~3,500 lines)
│ ├── skin_engine.py # CLI theming engine
│ ├── skills_config.py # hermes skills — enable/disable per platform
│ ├── skills_hub.py # /skills slash command
│ ├── tools_config.py # hermes tools — enable/disable per platform
│ ├── plugins.py # PluginManager — discovery, loading, hooks
│ ├── callbacks.py # Terminal callbacks (clarify, sudo, approval)
│ └── gateway.py # hermes gateway start/stop
├── tools/ # Tool implementations (one file per tool)
│ ├── registry.py # Central tool registry
│ ├── approval.py # Dangerous command detection
│ ├── terminal_tool.py # Terminal orchestration
│ ├── process_registry.py # Background process management
│ ├── file_tools.py # read_file, write_file, patch, search_files
│ ├── web_tools.py # web_search, web_extract
│ ├── browser_tool.py # 11 browser automation tools
│ ├── code_execution_tool.py # execute_code sandbox
│ ├── delegate_tool.py # Subagent delegation
│ ├── mcp_tool.py # MCP client (~1,050 lines)
│ ├── credential_files.py # File-based credential passthrough
│ ├── env_passthrough.py # Env var passthrough for sandboxes
│ ├── ansi_strip.py # ANSI escape stripping
│ └── environments/ # Terminal backends (local, docker, ssh, modal, daytona, singularity)
├── gateway/ # Messaging platform gateway
│ ├── run.py # GatewayRunner — message dispatch (~5,800 lines)
│ ├── session.py # SessionStore — conversation persistence
│ ├── delivery.py # Outbound message delivery
│ ├── pairing.py # DM pairing authorization
│ ├── hooks.py # Hook discovery and lifecycle events
│ ├── mirror.py # Cross-session message mirroring
│ ├── status.py # Token locks, profile-scoped process tracking
│ ├── builtin_hooks/ # Always-registered hooks
│ └── platforms/ # 14 adapters: telegram, discord, slack, whatsapp,
│ # signal, matrix, mattermost, email, sms,
│ # dingtalk, feishu, wecom, homeassistant, webhook
├── acp_adapter/ # ACP server (VS Code / Zed / JetBrains)
├── cron/ # Scheduler (jobs.py, scheduler.py)
├── plugins/memory/ # Memory provider plugins
├── environments/ # RL training environments (Atropos)
├── skills/ # Bundled skills (always available)
├── optional-skills/ # Official optional skills (install explicitly)
├── website/ # Docusaurus documentation site
└── tests/ # Pytest suite (~3,000+ tests)
```
## Data Flow

### CLI Session
```text
User input → HermesCLI.process_input()
→ AIAgent.run_conversation()
→ prompt_builder.build_system_prompt()
→ runtime_provider.resolve_runtime_provider()
→ API call (chat_completions / codex_responses / anthropic_messages)
→ tool_calls? → model_tools.handle_function_call() → loop
→ final response → display → save to SessionDB
```
### Gateway Message
```text
Platform event → Adapter.on_message() → MessageEvent
→ GatewayRunner._handle_message()
→ authorize user
→ resolve session key
→ create AIAgent with session history
→ AIAgent.run_conversation()
→ deliver response back through adapter
```
### Cron Job
```text
Scheduler tick → load due jobs from jobs.json
→ create fresh AIAgent (no history)
→ inject attached skills as context
→ run job prompt
→ deliver response to target platform
→ update job state and next_run
```
## Recommended Reading Order

If you are new to the codebase:
1. **This page** — orient yourself
2. **[Agent Loop Internals](./agent-loop.md)** — how AIAgent works
3. **[Prompt Assembly](./prompt-assembly.md)** — system prompt construction
4. **[Provider Runtime Resolution](./provider-runtime.md)** — how providers are selected
5. **[Adding Providers](./adding-providers.md)** — practical guide to adding a new provider
6. **[Tools Runtime](./tools-runtime.md)** — tool registry, dispatch, environments
7. **[Session Storage](./session-storage.md)** — SQLite schema, FTS5, session lineage
8. **[Gateway Internals](./gateway-internals.md)** — messaging platform gateway
9. **[Context Compression & Prompt Caching](./context-compression-and-caching.md)** — compression and caching
10. **[ACP Internals](./acp-internals.md)** — IDE integration
11. **[Environments, Benchmarks & Data Generation](./environments.md)** — RL training
## Major Subsystems

### Agent Loop

The synchronous orchestration engine (`AIAgent` in `run_agent.py`). Handles provider selection, prompt construction, tool execution, retries, fallback, callbacks, compression, and persistence. Supports three API modes for different provider backends.

→ [Agent Loop Internals](./agent-loop.md)

### Prompt System

Prompt construction and maintenance across the conversation lifecycle:
- **`prompt_builder.py`** — Assembles the system prompt from: personality (SOUL.md), memory (MEMORY.md, USER.md), skills, context files (AGENTS.md, .hermes.md), tool-use guidance, and model-specific instructions
- **`prompt_caching.py`** — Applies Anthropic cache breakpoints for prefix caching
- **`context_compressor.py`** — Summarizes middle conversation turns when context exceeds thresholds
→ [Prompt Assembly](./prompt-assembly.md), [Context Compression & Prompt Caching](./context-compression-and-caching.md)

### Provider Resolution

A shared runtime resolver used by CLI, gateway, cron, ACP, and auxiliary calls. Maps `(provider, model)` tuples to `(api_mode, api_key, base_url)`. Handles 18+ providers, OAuth flows, credential pools, and alias resolution.

→ [Provider Runtime Resolution](./provider-runtime.md)

### Tool System

Central tool registry (`tools/registry.py`) with 47 registered tools across 20 toolsets. Each tool file self-registers at import time. The registry handles schema collection, dispatch, availability checking, and error wrapping. Terminal tools support 6 backends (local, Docker, SSH, Daytona, Modal, Singularity).

→ [Tools Runtime](./tools-runtime.md)

### Session Persistence

SQLite-based session storage with FTS5 full-text search. Sessions have lineage tracking (parent/child across compressions), per-platform isolation, and atomic writes with contention handling.

→ [Session Storage](./session-storage.md)

### Messaging Gateway

Long-running process with 14 platform adapters, unified session routing, user authorization (allowlists + DM pairing), slash command dispatch, hook system, cron ticking, and background maintenance.

→ [Gateway Internals](./gateway-internals.md)
### Plugin System
Three discovery sources: `~/.hermes/plugins/` (user), `.hermes/plugins/` (project), and pip entry points. Plugins register tools, hooks, and CLI commands through a context API. Memory providers are a specialized plugin type under `plugins/memory/`.
→ [Plugin Guide](/docs/guides/build-a-hermes-plugin), [Memory Provider Plugin](./memory-provider-plugin.md)
### Cron
First-class agent tasks (not shell tasks). Jobs are stored in JSON, support multiple schedule formats, can attach skills and scripts, and deliver to any platform.

→ [Cron Internals](./cron-internals.md)
### ACP Integration

Exposes Hermes as an editor-native agent over stdio/JSON-RPC for VS Code, Zed, and JetBrains.

→ [ACP Internals](./acp-internals.md)
### RL / Environments / Trajectories

Full environment framework for evaluation and RL training. Integrates with Atropos, supports multiple tool-call parsers, and generates ShareGPT-format trajectories.

→ [Environments, Benchmarks & Data Generation](./environments.md), [Trajectories & Training Format](./trajectory-format.md)

## Design Principles
| Principle | What it means in practice |
|-----------|--------------------------|
| **Prompt stability** | System prompt doesn't change mid-conversation. No cache-breaking mutations except explicit user actions (`/model`). |
| **Observable execution** | Every tool call is visible to the user via callbacks. Progress updates in CLI (spinner) and gateway (chat messages). |
| **Interruptible** | API calls and tool execution can be cancelled mid-flight by user input or signals. |
| **Platform-agnostic core** | One AIAgent class serves CLI, gateway, ACP, batch, and API server. Platform differences live in the entry point, not the agent. |
| **Loose coupling** | Optional subsystems (MCP, plugins, memory providers, RL environments) use registry patterns and check_fn gating, not hard dependencies. |
| **Profile isolation** | Each profile (`hermes -p <name>`) gets its own HERMES_HOME, config, memory, sessions, and gateway PID. Multiple profiles run concurrently. |
## File Dependency Chain
```text
tools/registry.py (no deps — imported by all tool files)
tools/*.py (each calls registry.register() at import time)
model_tools.py (imports tools/registry + triggers tool discovery)
run_agent.py, cli.py, batch_runner.py, environments/
```
This chain means tool registration happens at import time, before any agent instance is created. Adding a new tool requires an import in `model_tools.py`'s `_discover_tools()` list.
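The registration pattern can be condensed into one file as a sketch (the real registry spans `tools/registry.py` and per-tool modules; names below follow the docs but the bodies are illustrative):

```python
# tools/registry.py — central registry, no dependencies
TOOLS = {}

def register(name, handler, schema=None):
    """Tool files call this at import time to self-register."""
    TOOLS[name] = {"handler": handler, "schema": schema or {}}

# tools/some_tool.py — registration runs as a side effect of import
def _handle_echo(args, task_id=None):
    return args["text"]

register("echo", _handle_echo)

# model_tools.py — dispatch resolves handlers from the registry
def handle_function_call(name, args, task_id=None):
    return TOOLS[name]["handler"](args, task_id=task_id)
```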

View File

@@ -4,7 +4,7 @@ Hermes Agent uses a dual compression system and Anthropic prompt caching to
manage context window usage efficiently across long conversations.
Source files: `agent/context_compressor.py`, `agent/prompt_caching.py`,
`gateway/run.py` (session hygiene), `run_agent.py` (search for `_compress_context`)
## Dual Compression System
@@ -26,7 +26,7 @@ Hermes has two separate compression layers that operate independently:
### 1. Gateway Session Hygiene (85% threshold)
Located in `gateway/run.py` (search for `_maybe_compress_session`). This is a **safety net** that
runs before the agent processes a message. It prevents API failures when sessions
grow too large between turns (e.g., overnight accumulation in Telegram/Discord).


@@ -6,85 +6,195 @@ description: "How Hermes stores, schedules, edits, pauses, skill-loads, and deli
# Cron Internals
The cron subsystem provides scheduled task execution — from simple one-shot delays to recurring cron-expression jobs with skill injection and cross-platform delivery.
## Key Files
| File | Purpose |
|------|---------|
| `cron/jobs.py` | Job model, storage, atomic read/write to `jobs.json` |
| `cron/scheduler.py` | Scheduler loop — due-job detection, execution, repeat tracking |
| `tools/cronjob_tools.py` | Model-facing `cronjob` tool registration and handler |
| `gateway/run.py` | Gateway integration — cron ticking in the long-running loop |
| `hermes_cli/cron.py` | CLI `hermes cron` subcommands |
## Scheduling Model
Four schedule formats are supported:
| Format | Example | Behavior |
|--------|---------|----------|
| **Relative delay** | `30m`, `2h`, `1d` | One-shot, fires after the specified duration |
| **Interval** | `every 2h`, `every 30m` | Recurring, fires at regular intervals |
| **Cron expression** | `0 9 * * *` | Standard 5-field cron syntax (minute, hour, day, month, weekday) |
| **ISO timestamp** | `2025-01-15T09:00:00` | One-shot, fires at the exact time |
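The relative-delay format reduces to simple unit parsing. A minimal stdlib sketch (the `parse_delay` helper is illustrative, not the actual Hermes parser):

```python
from datetime import datetime, timedelta

def parse_delay(spec: str) -> timedelta:
    """Parse relative-delay specs like '30m', '2h', '1d' (illustrative)."""
    units = {"m": "minutes", "h": "hours", "d": "days"}
    value, unit = int(spec[:-1]), spec[-1]
    return timedelta(**{units[unit]: value})

created = datetime(2025, 1, 15, 9, 0)
print(created + parse_delay("2h"))  # 2025-01-15 11:00:00
```

Interval schedules (`every 2h`) behave the same way, except the delay is re-applied after each firing instead of once.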
The model-facing surface is a single `cronjob` tool with action-style operations: `create`, `list`, `update`, `pause`, `resume`, `run`, `remove`.
## Job Storage
Jobs are stored in `~/.hermes/cron/jobs.json` with atomic write semantics (write to temp file, then rename). Each job record contains:
```json
{
"id": "job_abc123",
"name": "Daily briefing",
"prompt": "Summarize today's AI news and funding rounds",
"schedule": "0 9 * * *",
"skills": ["ai-funding-daily-report"],
"deliver": "telegram:-1001234567890",
"repeat": null,
"state": "scheduled",
"next_run": "2025-01-16T09:00:00Z",
"run_count": 42,
"created_at": "2025-01-01T00:00:00Z",
"model": null,
"provider": null,
"script": null
}
```
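The atomic write semantics follow the standard write-to-temp-then-rename pattern. A sketch of what that looks like (the function name is illustrative):

```python
import json
import os
import tempfile

def atomic_write_json(path: str, data) -> None:
    """Write JSON so readers never observe a half-written jobs file."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename on POSIX and Windows
    except BaseException:
        os.unlink(tmp)  # never leave a partial temp file behind
        raise
```

The temp file must live in the same directory as the destination; `os.replace` is only atomic within a single filesystem.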
### Job Lifecycle States
| State | Meaning |
|-------|---------|
| `scheduled` | Active, will fire at next scheduled time |
| `paused` | Suspended — won't fire until resumed |
| `completed` | Repeat count exhausted or one-shot that has fired |
| `running` | Currently executing (transient state) |
### Backward Compatibility
Older jobs may have a single `skill` field instead of the `skills` array. The scheduler normalizes this at load time — single `skill` is promoted to `skills: [skill]`.
## Scheduler Runtime
### Tick Cycle
The scheduler runs on a periodic tick (default: every 60 seconds):
```text
tick()
1. Acquire scheduler lock (prevents overlapping ticks)
2. Load all jobs from jobs.json
3. Filter to due jobs (next_run <= now AND state == "scheduled")
4. For each due job:
a. Set state to "running"
b. Create fresh AIAgent session (no conversation history)
c. Load attached skills in order (injected as user messages)
d. Run the job prompt through the agent
e. Deliver the response to the configured target
f. Update run_count, compute next_run
g. If repeat count exhausted → state = "completed"
h. Otherwise → state = "scheduled"
5. Write updated jobs back to jobs.json
6. Release scheduler lock
```
### Gateway Integration
In gateway mode, the scheduler tick is integrated into the gateway's main event loop. The gateway calls `scheduler.tick()` on its periodic maintenance cycle, which runs alongside message handling.
In CLI mode, cron jobs only fire when `hermes cron` commands are run or during active CLI sessions.
### Fresh Session Isolation
Each cron job runs in a completely fresh agent session:
- No conversation history from previous runs
- No memory of previous cron executions (unless persisted to memory/files)
- The prompt must be self-contained — cron jobs cannot ask clarifying questions
- The `cronjob` toolset is disabled (recursion guard)
## Skill-Backed Jobs
A cron job can attach one or more skills via the `skills` field. At execution time:
1. Skills are loaded in the specified order
2. Each skill's SKILL.md content is injected as context
3. The job's prompt is appended as the task instruction
4. The agent processes the combined skill context + prompt
This enables reusable, tested workflows without pasting full instructions into cron prompts. For example:
```text
Create a daily funding report → attach "ai-funding-daily-report" skill
```
### Script-Backed Jobs
Jobs can also attach a Python script via the `script` field. The script runs *before* each agent turn, and its stdout is injected into the prompt as context. This enables data collection and change detection patterns:
```python
# ~/.hermes/scripts/check_competitors.py
import json
import requests

# Fetch competitor release notes; diff against the previous run if you
# persist state between runs. Whatever this script prints to stdout is
# injected into the cron prompt for the agent to analyze and report.
resp = requests.get("https://example.com/releases.json", timeout=30)
print(json.dumps(resp.json()[:5], indent=2))
```
## Delivery Model
Cron job results can be delivered to any supported platform:
| Target | Syntax | Example |
|--------|--------|---------|
| Origin chat | `origin` | Deliver to the chat where the job was created |
| Local file | `local` | Save to `~/.hermes/cron/output/` |
| Telegram | `telegram` or `telegram:<chat_id>` | `telegram:-1001234567890` |
| Discord | `discord` or `discord:#channel` | `discord:#engineering` |
| Slack | `slack` | Deliver to Slack home channel |
| WhatsApp | `whatsapp` | Deliver to WhatsApp home |
| Signal | `signal` | Deliver to Signal |
| Matrix | `matrix` | Deliver to Matrix home room |
| Mattermost | `mattermost` | Deliver to Mattermost home |
| Email | `email` | Deliver via email |
| SMS | `sms` | Deliver via SMS |
| Home Assistant | `homeassistant` | Deliver to HA conversation |
| DingTalk | `dingtalk` | Deliver to DingTalk |
| Feishu | `feishu` | Deliver to Feishu |
| WeCom | `wecom` | Deliver to WeCom |
For Telegram topics, use the format `telegram:<chat_id>:<thread_id>` (e.g., `telegram:-1001234567890:17585`).
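Target strings follow a `platform[:chat_id[:thread_id]]` shape, so parsing reduces to two splits. An illustrative parser (not the actual Hermes implementation):

```python
def parse_target(spec: str):
    """Split 'platform[:chat_id[:thread_id]]' into parts (illustrative)."""
    platform, _, rest = spec.partition(":")
    chat_id, _, thread_id = rest.partition(":")
    return platform, chat_id or None, thread_id or None

print(parse_target("telegram:-1001234567890:17585"))
# ('telegram', '-1001234567890', '17585')
print(parse_target("slack"))
# ('slack', None, None)
```

Missing parts come back as `None`, which is where home-channel defaults (`slack`, `whatsapp`, etc.) kick in.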
### Response Wrapping
By default (`cron.wrap_response: true`), cron deliveries are wrapped with:
- A header identifying the cron job name and task
- A footer noting the agent cannot see the delivered message in conversation
The `[SILENT]` prefix in a cron response suppresses delivery entirely — useful for jobs that only need to write to files or perform side effects.
### Session Isolation
Cron deliveries are NOT mirrored into gateway session conversation history. They exist only in the cron job's own session. This prevents message alternation violations in the target chat's conversation.
## Recursion Guard
Cron-run sessions have the `cronjob` toolset disabled. This prevents:
- A scheduled job from creating new cron jobs
- Recursive scheduling that could explode token usage
- Accidental mutation of the job schedule from within a job
## Locking
The scheduler uses file-based locking to prevent overlapping ticks from executing the same due-job batch twice. This is important in gateway mode where multiple maintenance cycles could overlap if a previous tick takes longer than the tick interval.
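Non-overlapping ticks of this kind are commonly enforced with a non-blocking `flock` attempt. A POSIX-only sketch under that assumption (whether Hermes uses `fcntl.flock` specifically is not confirmed here):

```python
import fcntl

def try_acquire_lock(path: str):
    """Return a locked file handle, or None if another tick holds the lock."""
    f = open(path, "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except BlockingIOError:
        f.close()
        return None

lock = try_acquire_lock("/tmp/hermes-cron.lock")
if lock is not None:
    try:
        pass  # run the tick: load jobs, execute due ones, persist results
    finally:
        fcntl.flock(lock, fcntl.LOCK_UN)
        lock.close()
```

If the previous tick still holds the lock, the new tick simply skips this cycle rather than double-executing the due-job batch.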
## CLI Interface
The `hermes cron` CLI provides direct job management:
```bash
hermes cron list # Show all jobs
hermes cron add # Interactive job creation
hermes cron edit <job_id> # Edit job configuration
hermes cron pause <job_id> # Pause a running job
hermes cron resume <job_id> # Resume a paused job
hermes cron run <job_id> # Trigger immediate execution
hermes cron remove <job_id> # Delete a job
```
## Related Docs
- [Cron Feature Guide](/docs/user-guide/features/cron)
- [Gateway Internals](./gateway-internals.md)
- [Agent Loop Internals](./agent-loop.md)

View File

@@ -6,106 +6,248 @@ description: "How the messaging gateway boots, authorizes users, routes sessions
# Gateway Internals
The messaging gateway is the long-running process that connects Hermes to 14+ external messaging platforms through a unified architecture.
## Key Files
| File | Purpose |
|------|---------|
| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (~7,200 lines) |
| `gateway/session.py` | `SessionStore` — conversation persistence and session key construction |
| `gateway/delivery.py` | Outbound message delivery to target platforms/channels |
| `gateway/pairing.py` | DM pairing flow for user authorization |
| `gateway/channel_directory.py` | Maps chat IDs to human-readable names for cron delivery |
| `gateway/hooks.py` | Hook discovery, loading, and lifecycle event dispatch |
| `gateway/mirror.py` | Cross-session message mirroring for `send_message` |
| `gateway/status.py` | Token lock management for profile-scoped gateway instances |
| `gateway/builtin_hooks/` | Always-registered hooks (e.g., BOOT.md system prompt hook) |
| `gateway/platforms/` | Platform adapters (one per messaging platform) |
## Architecture Overview
```text
┌─────────────────────────────────────────────────┐
│ GatewayRunner │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Telegram │ │ Discord │ │ Slack │ ... │
│ │ Adapter │ │ Adapter │ │ Adapter │ │
│ └─────┬─────┘ └─────┬────┘ └─────┬────┘ │
│ │ │ │ │
│ └──────────────┼──────────────┘ │
│ ▼ │
│ _handle_message() │
│ │ │
│ ┌────────────┼────────────┐ │
│ ▼ ▼ ▼ │
│ Slash command AIAgent Queue/BG │
│ dispatch creation sessions │
│ │ │
│ ▼ │
│ SessionStore │
│ (SQLite persistence) │
└─────────────────────────────────────────────────┘
```
## Message Flow
When a message arrives from any platform:
1. **Platform adapter** receives raw event, normalizes it into a `MessageEvent`
2. **Base adapter** checks active session guard:
- If agent is running for this session → queue message, set interrupt event
- If `/approve`, `/deny`, `/stop` → bypass guard (dispatched inline)
3. **GatewayRunner._handle_message()** receives the event:
- Resolve session key via `_session_key_for_source()` (format: `agent:main:{platform}:{chat_type}:{chat_id}`)
- Check authorization (see Authorization below)
- Check if it's a slash command → dispatch to command handler
- Check if agent is already running → intercept commands like `/stop`, `/status`
- Otherwise → create `AIAgent` instance and run conversation
4. **Response** is sent back through the platform adapter
### Session Key Format
Session keys encode the full routing context:
```text
agent:main:{platform}:{chat_type}:{chat_id}
```
For example: `agent:main:telegram:private:123456789`
Thread-aware platforms (Telegram forum topics, Discord threads, Slack threads) may include thread IDs in the chat_id portion. **Never construct session keys manually** — always use `build_session_key()` from `gateway/session.py`.
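Conceptually, the key builder is just string assembly plus thread handling. A simplified sketch of what `build_session_key()` does (the real signature in `gateway/session.py` may differ):

```python
def build_session_key(platform, chat_type, chat_id, thread_id=None):
    """Simplified sketch — the real helper lives in gateway/session.py."""
    chat = f"{chat_id}:{thread_id}" if thread_id else str(chat_id)
    return f"agent:main:{platform}:{chat_type}:{chat}"

print(build_session_key("telegram", "private", 123456789))
# agent:main:telegram:private:123456789
```

Centralizing this in one helper is what keeps thread-aware and thread-less platforms from drifting into incompatible key formats.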
### Two-Level Message Guard
When an agent is actively running, incoming messages pass through two sequential guards:
1. **Level 1 — Base adapter** (`gateway/platforms/base.py`): Checks `_active_sessions`. If the session is active, queues the message in `_pending_messages` and sets an interrupt event. This catches messages *before* they reach the gateway runner.
2. **Level 2 — Gateway runner** (`gateway/run.py`): Checks `_running_agents`. Intercepts specific commands (`/stop`, `/new`, `/queue`, `/status`, `/approve`, `/deny`) and routes them appropriately. Everything else triggers `running_agent.interrupt()`.
Commands that must reach the runner while the agent is blocked (like `/approve`) are dispatched **inline** via `await self._message_handler(event)` — they bypass the background task system to avoid race conditions.
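The level-2 guard boils down to a three-way dispatch. An illustrative sketch (function name and return values are invented for clarity, not taken from the gateway code):

```python
# Commands allowed through while an agent is running for this session.
BYPASS = {"/stop", "/new", "/queue", "/status", "/approve", "/deny"}

def guard(running_agents, session_key, text):
    """Illustrative level-2 guard: dispatch, bypass, or interrupt."""
    agent = running_agents.get(session_key)
    if agent is None:
        return "dispatch"            # no agent running: normal turn
    cmd = text.split()[0] if text.strip() else ""
    if cmd in BYPASS:
        return f"inline:{cmd}"       # dispatched inline, bypassing queues
    agent.interrupt()                # anything else interrupts the agent
    return "interrupted"
```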
## Authorization
The gateway uses a multi-layer authorization check, evaluated in order:
1. **Gateway-wide allow-all** (`GATEWAY_ALLOW_ALL_USERS`) — if set, all users are authorized
2. **Platform allowlist** (e.g., `TELEGRAM_ALLOWED_USERS`) — comma-separated user IDs
3. **DM pairing** — authenticated users can pair new users via a pairing code
4. **Admin escalation** — some commands require admin status beyond basic authorization
### DM Pairing Flow
```text
Admin: /pair
Gateway: "Pairing code: ABC123. Share with the user."
New user: ABC123
Gateway: "Paired! You're now authorized."
```
Pairing state is persisted in `gateway/pairing.py` and survives restarts.
## Slash Command Dispatch
All slash commands in the gateway flow through the same resolution pipeline:
1. `resolve_command()` from `hermes_cli/commands.py` maps input to canonical name (handles aliases, prefix matching)
2. The canonical name is checked against `GATEWAY_KNOWN_COMMANDS`
3. Handler in `_handle_message()` dispatches based on canonical name
4. Some commands are gated on config (`gateway_config_gate` on `CommandDef`)
### Running-Agent Guard
Commands that must NOT execute while the agent is processing are rejected early:
```python
if _quick_key in self._running_agents:
if canonical == "model":
return "⏳ Agent is running — wait for it to finish or /stop first."
```
Bypass commands (`/stop`, `/new`, `/approve`, `/deny`, `/queue`, `/status`) have special handling.
## Config Sources
The gateway reads configuration from multiple sources:
| Source | What it provides |
|--------|-----------------|
| `~/.hermes/.env` | API keys, bot tokens, platform credentials |
| `~/.hermes/config.yaml` | Model settings, tool configuration, display options |
| Environment variables | Override any of the above |
Unlike the CLI (which uses `load_cli_config()` with hardcoded defaults), the gateway reads `config.yaml` directly via YAML loader. This means config keys that exist in the CLI's defaults dict but not in the user's config file may behave differently between CLI and gateway.
## Platform Adapters
Each messaging platform has an adapter in `gateway/platforms/`:
```text
gateway/platforms/
├── base.py # BaseAdapter — shared logic for all platforms
├── telegram.py # Telegram Bot API (long polling or webhook)
├── discord.py # Discord bot via discord.py
├── slack.py # Slack Socket Mode
├── whatsapp.py # WhatsApp Business Cloud API
├── signal.py # Signal via signal-cli REST API
├── matrix.py # Matrix via matrix-nio (optional E2EE)
├── mattermost.py # Mattermost WebSocket API
├── email_adapter.py # Email via IMAP/SMTP
├── sms.py # SMS via Twilio
├── dingtalk.py # DingTalk WebSocket
├── feishu.py # Feishu/Lark WebSocket or webhook
├── wecom.py # WeCom (WeChat Work) callback
└── homeassistant.py # Home Assistant conversation integration
```
Adapters implement a common interface:
- `connect()` / `disconnect()` — lifecycle management
- `send_message()` — outbound message delivery
- `on_message()` — inbound message normalization → `MessageEvent`
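That common interface can be pictured as an abstract base class (illustrative; the real `BaseAdapter` in `gateway/platforms/base.py` carries more state, such as the pending-message queue):

```python
from abc import ABC, abstractmethod

class BaseAdapter(ABC):
    """Illustrative shape of a platform adapter (not the Hermes class)."""

    @abstractmethod
    async def connect(self) -> None:
        """Open the platform connection (polling, websocket, etc.)."""

    @abstractmethod
    async def disconnect(self) -> None:
        """Tear down the connection and release any token locks."""

    @abstractmethod
    async def send_message(self, chat_id: str, text: str) -> None:
        """Deliver an outbound message to the platform."""

    @abstractmethod
    def on_message(self, raw_event) -> "MessageEvent":
        """Normalize a raw platform event into a MessageEvent."""
```

A new platform only has to fill in these four methods; routing, authorization, and session handling stay in the shared layers above.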
### Token Locks
Adapters that connect with unique credentials call `acquire_scoped_lock()` in `connect()` and `release_scoped_lock()` in `disconnect()`. This prevents two profiles from using the same bot token simultaneously.
## Delivery Path
Outgoing deliveries (`gateway/delivery.py`) handle:
- **Direct reply** — send response back to the originating chat
- **Home channel delivery** — route cron job outputs and background results to a configured home channel
- **Explicit target delivery** — `send_message` tool specifying `telegram:-1001234567890`
- **Cross-platform delivery** — deliver to a different platform than the originating message
Cron job deliveries are NOT mirrored into gateway session history — they live in their own cron session only. This is a deliberate design choice to avoid message alternation violations.
## Hooks
Gateway hooks are Python modules that respond to lifecycle events:
### Gateway Hook Events
| Event | When fired |
|-------|-----------|
| `gateway:startup` | Gateway process starts |
| `session:start` | New conversation session begins |
| `session:end` | Session completes or times out |
| `session:reset` | User resets session with `/new` |
| `agent:start` | Agent begins processing a message |
| `agent:step` | Agent completes one tool-calling iteration |
| `agent:end` | Agent finishes and returns response |
| `command:*` | Any slash command is executed |
Hooks are discovered from `gateway/builtin_hooks/` (always active) and `~/.hermes/hooks/` (user-installed). Each hook is a directory with a `HOOK.yaml` manifest and `handler.py`.
## Memory Provider Integration
When a memory provider plugin (e.g., Honcho) is enabled:
1. Gateway creates an `AIAgent` per message with the session ID
2. The `MemoryManager` initializes the provider with the session context
3. Provider tools (e.g., `honcho_profile`, `viking_search`) are routed through:
```text
AIAgent._invoke_tool()
→ self._memory_manager.handle_tool_call(name, args)
→ provider.handle_tool_call(name, args)
```
4. On session end/reset, `on_session_end()` fires for cleanup and final data flush
### Memory Flush Lifecycle
When a session is reset, resumed, or expires:
1. Built-in memories are flushed to disk
2. Memory provider's `on_session_end()` hook fires
3. A temporary `AIAgent` runs a memory-only conversation turn
4. Context is then discarded or archived
## Background Maintenance
The gateway runs periodic maintenance alongside message handling:
- **Cron ticking** — checks job schedules and fires due jobs
- **Session expiry** — cleans up abandoned sessions after timeout
- **Memory flush** — proactively flushes memory before session expiry
- **Cache refresh** — refreshes model lists and provider status
## Process Management
The gateway runs as a long-lived process, managed via:
- `hermes gateway start` / `hermes gateway stop` — manual control
- `systemctl` (Linux) or `launchctl` (macOS) — service management
- PID file at `~/.hermes/gateway.pid` — profile-scoped process tracking
**Profile-scoped vs global**: `start_gateway()` uses profile-scoped PID files. `hermes gateway stop` stops only the current profile's gateway. `hermes gateway stop --all` uses global `ps aux` scanning to kill all gateway processes (used during updates).
## Related Docs
- [Session Storage](./session-storage.md)
- [Cron Internals](./cron-internals.md)
- [ACP Internals](./acp-internals.md)
- [Agent Loop Internals](./agent-loop.md)
- [Messaging Gateway (User Guide)](/docs/user-guide/messaging)


@@ -3,7 +3,7 @@
Hermes Agent saves conversation trajectories in ShareGPT-compatible JSONL format
for use as training data, debugging artifacts, and reinforcement learning datasets.
Source files: `agent/trajectory.py`, `run_agent.py` (search for `_save_trajectory`), `batch_runner.py`
## File Naming Convention


@@ -28,7 +28,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl
| 🗺️ **[Learning Path](/docs/getting-started/learning-path)** | Find the right docs for your experience level |
| ⚙️ **[Configuration](/docs/user-guide/configuration)** | Config file, providers, models, and options |
| 💬 **[Messaging Gateway](/docs/user-guide/messaging)** | Set up Telegram, Discord, Slack, or WhatsApp |
| 🔧 **[Tools & Toolsets](/docs/user-guide/features/tools)** | 47 built-in tools and how to configure them |
| 🧠 **[Memory System](/docs/user-guide/features/memory)** | Persistent memory that grows across sessions |
| 📚 **[Skills System](/docs/user-guide/features/skills)** | Procedural memory the agent creates and reuses |
| 🔌 **[MCP Integration](/docs/user-guide/features/mcp)** | Connect to MCP servers, filter their tools, and extend Hermes safely |
@@ -46,7 +46,7 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl
- **A closed learning loop** — Agent-curated memory with periodic nudges, autonomous skill creation, skill self-improvement during use, FTS5 cross-session recall with LLM summarization, and [Honcho](https://github.com/plastic-labs/honcho) dialectic user modeling
- **Runs anywhere, not just your laptop** — 6 terminal backends: local, Docker, SSH, Daytona, Singularity, Modal. Daytona and Modal offer serverless persistence — your environment hibernates when idle, costing nearly nothing
- **Lives where you do** — CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Mattermost, Email, SMS, DingTalk, Feishu, WeCom, Home Assistant — 14+ platforms from one gateway
- **Built by model trainers** — Created by [Nous Research](https://nousresearch.com), the lab behind Hermes, Nomos, and Psyche. Works with [Nous Portal](https://portal.nousresearch.com), [OpenRouter](https://openrouter.ai), OpenAI, or any endpoint
- **Scheduled automations** — Built-in cron with delivery to any platform
- **Delegates & parallelizes** — Spawn isolated subagents for parallel workstreams. Programmatic Tool Calling via `execute_code` collapses multi-step pipelines into single inference calls


@@ -22,7 +22,7 @@ Hermes supports multiple AI inference providers out of the box. Use `hermes mode
## Web Search Backends
The `web_search` and `web_extract` tools support four backend providers, configured via `config.yaml` or `hermes tools`:
| Backend | Env Var | Search | Extract | Crawl |
|---------|---------|--------|---------|-------|
@@ -56,13 +56,14 @@ See [Browser Automation](/docs/user-guide/features/browser) for setup and usage.
Text-to-speech and speech-to-text across all messaging platforms:
| Provider | Quality | Cost | API Key |
|----------|---------|------|---------|
| **Edge TTS** (default) | Good | Free | None needed |
| **ElevenLabs** | Excellent | Paid | `ELEVENLABS_API_KEY` |
| **OpenAI TTS** | Good | Paid | `VOICE_TOOLS_OPENAI_KEY` |
| **MiniMax** | Good | Paid | `MINIMAX_API_KEY` |
| **NeuTTS** | Good | Free | None needed |
Speech-to-text supports three providers: local Whisper (free, runs on-device), Groq (fast cloud), and OpenAI Whisper API. Voice message transcription works across Telegram, Discord, WhatsApp, and other messaging platforms. See [Voice & TTS](/docs/user-guide/features/tts) and [Voice Mode](/docs/user-guide/features/voice-mode) for details.
## IDE & Editor Integration
@@ -74,9 +75,27 @@ Speech-to-text uses Whisper for voice message transcription on Telegram, Discord
## Memory & Personalization
- **[Honcho Memory](/docs/user-guide/features/honcho)** — AI-native persistent memory for cross-session user modeling and personalization. Honcho adds deep user modeling via dialectic reasoning on top of Hermes's built-in memory system.
- **[Built-in Memory](/docs/user-guide/features/memory)** — Persistent, curated memory via `MEMORY.md` and `USER.md` files. The agent maintains bounded stores of personal notes and user profile data that survive across sessions.
- **[Memory Providers](/docs/user-guide/features/memory-providers)** — Plug in external memory backends for deeper personalization. Seven providers are supported: Honcho (dialectic reasoning), OpenViking (tiered retrieval), Mem0 (cloud extraction), Hindsight (knowledge graphs), Holographic (local SQLite), RetainDB (hybrid search), and ByteRover (CLI-based).
## Messaging Platforms
Hermes runs as a gateway bot on 14+ messaging platforms, all configured through the same `gateway` subsystem:
- **[Telegram](/docs/user-guide/messaging/telegram)**, **[Discord](/docs/user-guide/messaging/discord)**, **[Slack](/docs/user-guide/messaging/slack)**, **[WhatsApp](/docs/user-guide/messaging/whatsapp)**, **[Signal](/docs/user-guide/messaging/signal)**, **[Matrix](/docs/user-guide/messaging/matrix)**, **[Mattermost](/docs/user-guide/messaging/mattermost)**, **[Email](/docs/user-guide/messaging/email)**, **[SMS](/docs/user-guide/messaging/sms)**, **[DingTalk](/docs/user-guide/messaging/dingtalk)**, **[Feishu/Lark](/docs/user-guide/messaging/feishu)**, **[WeCom](/docs/user-guide/messaging/wecom)**, **[Home Assistant](/docs/user-guide/messaging/homeassistant)**, **[Webhooks](/docs/user-guide/messaging/webhooks)**
See the [Messaging Gateway overview](/docs/user-guide/messaging) for the platform comparison table and setup guide.
## Home Automation
- **[Home Assistant](/docs/user-guide/messaging/homeassistant)** — Control smart home devices via four dedicated tools (`ha_list_entities`, `ha_get_state`, `ha_list_services`, `ha_call_service`). The Home Assistant toolset activates automatically when `HASS_TOKEN` is configured.
## Plugins
- **[Plugin System](/docs/user-guide/features/plugins)** — Extend Hermes with custom tools, lifecycle hooks, and CLI commands without modifying core code. Plugins are discovered from `~/.hermes/plugins/`, project-local `.hermes/plugins/`, and pip-installed entry points.
- **[Build a Plugin](/docs/guides/build-a-hermes-plugin)** — Step-by-step guide for creating Hermes plugins with tools, hooks, and CLI commands.
## Training & Evaluation
- **[RL Training](/docs/user-guide/features/rl-training)** — Generate trajectory data from agent sessions for reinforcement learning and model fine-tuning. Supports Atropos environments with customizable reward functions.
- **[Batch Processing](/docs/user-guide/features/batch-processing)** — Run the agent across hundreds of prompts in parallel, generating structured ShareGPT-format trajectory data for training data generation or evaluation.


@@ -90,7 +90,7 @@ Both persist across sessions. See [Memory](../user-guide/features/memory.md) and
Yes. Import the `AIAgent` class and use Hermes programmatically:
```python
from run_agent import AIAgent
agent = AIAgent(model="openrouter/nous/hermes-3-llama-3.1-70b")
response = agent.chat("Explain quantum computing briefly")
@@ -227,7 +227,7 @@ hermes chat --model openrouter/meta-llama/llama-3.1-70b-instruct
hermes chat
# Use a model with a larger context window
hermes chat --model openrouter/google/gemini-3-flash-preview
```
If this happens on the first long conversation, Hermes may have the wrong context length for your model. Check what it detected:


@@ -1,74 +1,153 @@
---
sidebar_position: 9
title: "Optional Skills Catalog"
description: "Official optional skills shipped with hermes-agent — install via hermes skills install official/<category>/<skill>"
---
# Optional Skills Catalog
Official optional skills ship with the hermes-agent repository under `optional-skills/` but are **not active by default**. Install them explicitly:
```bash
hermes skills install official/<category>/<skill>
```
For example:
```bash
hermes skills install official/blockchain/solana
hermes skills install official/mlops/flash-attention
```
Once installed, the skill appears in the agent's skill list and can be loaded automatically when relevant tasks are detected.
To uninstall:
```bash
hermes skills uninstall <skill-name>
```
---
## Autonomous AI Agents
| Skill | Description |
|-------|-------------|
| **blackbox** | Delegate coding tasks to Blackbox AI CLI agent. Multi-model agent with built-in judge that runs tasks through multiple LLMs and picks the best result. |
| **honcho** | Configure and use Honcho memory with Hermes — cross-session user modeling, multi-profile peer isolation, observation config, and dialectic reasoning. |
## Blockchain
| Skill | Description |
|-------|-------------|
| **base** | Query Base (Ethereum L2) blockchain data with USD pricing — wallet balances, token info, transaction details, gas analysis, contract inspection, whale detection, and live network stats. No API key required. |
| **solana** | Query Solana blockchain data with USD pricing — wallet balances, token portfolios, transaction details, NFTs, whale detection, and live network stats. No API key required. |
## Communication
| Skill | Description |
|-------|-------------|
| **one-three-one-rule** | Structured communication framework for proposals and decision-making. |
## Creative
| Skill | Description |
|-------|-------------|
| **blender-mcp** | Control Blender directly from Hermes via socket connection to the blender-mcp addon. Create 3D objects, materials, animations, and run arbitrary Blender Python (bpy) code. |
| **meme-generation** | Generate real meme images by picking a template and overlaying text with Pillow. Produces actual `.png` meme files. |
## DevOps
| Skill | Description |
|-------|-------------|
| **cli** | Run 150+ AI apps via inference.sh CLI (infsh) — image generation, video creation, LLMs, search, 3D, and social automation. |
| **docker-management** | Manage Docker containers, images, volumes, networks, and Compose stacks — lifecycle ops, debugging, cleanup, and Dockerfile optimization. |
## Email
| Skill | Description |
|-------|-------------|
| **agentmail** | Give the agent its own dedicated email inbox via AgentMail. Send, receive, and manage email autonomously using agent-owned email addresses. |
## Health
| Skill | Description |
|-------|-------------|
| **neuroskill-bci** | Brain-Computer Interface (BCI) integration for neuroscience research workflows. |
## MCP
| Skill | Description |
|-------|-------------|
| **fastmcp** | Build, test, inspect, install, and deploy MCP servers with FastMCP in Python. Covers wrapping APIs or databases as MCP tools, exposing resources or prompts, and deployment. |
## Migration
| Skill | Description |
|-------|-------------|
| **openclaw-migration** | Migrate a user's OpenClaw customization footprint into Hermes Agent. Imports memories, SOUL.md, command allowlists, user skills, and selected workspace assets. |
## MLOps
The largest optional category — covers the full ML pipeline from data curation to production inference.
| Skill | Description |
|-------|-------------|
| **accelerate** | Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. |
| **chroma** | Open-source embedding database. Store embeddings and metadata, perform vector and full-text search. Simple 4-function API for RAG and semantic search. |
| **faiss** | Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). |
| **flash-attention** | Optimize transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Supports PyTorch SDPA, flash-attn library, H100 FP8, and sliding window. |
| **hermes-atropos-environments** | Build, test, and debug Hermes Agent RL environments for Atropos training. Covers the HermesAgentBaseEnv interface, reward functions, agent loop integration, and evaluation. |
| **huggingface-tokenizers** | Fast Rust-based tokenizers for research and production. Tokenizes 1GB in under 20 seconds. Supports BPE, WordPiece, and Unigram algorithms. |
| **instructor** | Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, and stream partial results. |
| **lambda-labs** | Reserved and on-demand GPU cloud instances for ML training and inference. SSH access, persistent filesystems, and multi-node clusters. |
| **llava** | Large Language and Vision Assistant — visual instruction tuning and image-based conversations combining CLIP vision with LLaMA language models. |
| **nemo-curator** | GPU-accelerated data curation for LLM training. Fuzzy deduplication (16x faster), quality filtering (30+ heuristics), semantic dedup, PII redaction. Scales with RAPIDS. |
| **pinecone** | Managed vector database for production AI. Auto-scaling, hybrid search (dense + sparse), metadata filtering, and low latency (under 100ms p95). |
| **pytorch-lightning** | High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks, and minimal boilerplate. |
| **qdrant** | High-performance vector similarity search engine. Rust-powered with fast nearest neighbor search, hybrid search with filtering, and scalable vector storage. |
| **saelens** | Train and analyze Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. |
| **simpo** | Simple Preference Optimization — reference-free alternative to DPO with better performance (+6.4 pts on AlpacaEval 2.0). No reference model needed. |
| **slime** | LLM post-training with RL using Megatron+SGLang framework. Custom data generation workflows and tight Megatron-LM integration for RL scaling. |
| **tensorrt-llm** | Optimize LLM inference with NVIDIA TensorRT for maximum throughput. 10-100x faster than PyTorch on A100/H100 with quantization (FP8/INT4) and in-flight batching. |
| **torchtitan** | PyTorch-native distributed LLM pretraining with 4D parallelism (FSDP2, TP, PP, CP). Scale from 8 to 512+ GPUs with Float8 and torch.compile. |
## Productivity
| Skill | Description |
|-------|-------------|
| **canvas** | Canvas LMS integration — fetch enrolled courses and assignments using API token authentication. |
| **memento-flashcards** | Spaced repetition flashcard system for learning and knowledge retention. |
| **siyuan** | SiYuan Note API for searching, reading, creating, and managing blocks and documents in a self-hosted knowledge base. |
| **telephony** | Give Hermes phone capabilities — provision a Twilio number, send/receive SMS/MMS, make calls, and place AI-driven outbound calls through Bland.ai or Vapi. |
## Research
| Skill | Description |
|-------|-------------|
| **bioinformatics** | Gateway to 400+ bioinformatics skills from bioSkills and ClawBio. Covers genomics, transcriptomics, single-cell, variant calling, pharmacogenomics, metagenomics, and structural biology. |
| **domain-intel** | Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, and bulk multi-domain analysis. No API keys required. |
| **duckduckgo-search** | Free web search via DuckDuckGo — text, news, images, videos. No API key needed. |
| **gitnexus-explorer** | Index a codebase with GitNexus and serve an interactive knowledge graph via web UI and Cloudflare tunnel. |
| **parallel-cli** | Vendor skill for Parallel CLI — agent-native web search, extraction, deep research, enrichment, and monitoring. |
| **qmd** | Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. |
| **scrapling** | Web scraping with Scrapling — HTTP fetching, stealth browser automation, Cloudflare bypass, and spider crawling via CLI and Python. |
## Security
| Skill | Description |
|-------|-------------|
| **1password** | Set up and use 1Password CLI (op). Install the CLI, enable desktop app integration, sign in, and read/inject secrets for commands. |
| **oss-forensics** | Open-source software forensics — analyze packages, dependencies, and supply chain risks. |
| **sherlock** | OSINT username search across 400+ social networks. Hunt down social media accounts by username. |
---
## Contributing Optional Skills
To add a new optional skill to the repository:
1. Create a directory under `optional-skills/<category>/<skill-name>/`
2. Add a `SKILL.md` with standard frontmatter (name, description, version, author)
3. Include any supporting files in `references/`, `templates/`, or `scripts/` subdirectories
4. Submit a pull request — the skill will appear in this catalog once merged


@@ -89,9 +89,22 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/<skill-name>` | Load any installed skill as an on-demand command. Example: `/gif-search`, `/github-pr-workflow`, `/excalidraw`. |
| `/skills ...` | Search, browse, inspect, install, audit, publish, and configure skills from registries and the official optional-skills catalog. |
### Quick Commands
User-defined quick commands map a short alias to a longer prompt. Configure them in `~/.hermes/config.yaml`:
```yaml
quick_commands:
  review: "Review my latest git diff and suggest improvements"
  deploy: "Run the deployment script at scripts/deploy.sh and verify the output"
  morning: "Check my calendar, unread emails, and summarize today's priorities"
```
Then type `/review`, `/deploy`, or `/morning` in the CLI. Quick commands are resolved at dispatch time and are not shown in the built-in autocomplete/help tables.
### Alias Resolution
Commands support prefix matching: typing `/h` resolves to `/help`, `/mod` resolves to `/model`. When a prefix is ambiguous (matches multiple commands), the first match in registry order wins. Full command names and registered aliases always take priority over prefix matches.
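The resolution order above can be sketched as a small resolver. The registry contents and function name here are hypothetical, not the actual Hermes implementation:

```python
# Illustrative sketch of slash-command resolution: exact name, then
# registered alias, then first prefix match in registry order.
# COMMANDS/ALIASES contents and resolve_command are hypothetical.
COMMANDS = ["help", "model", "memory", "tools"]  # registry order
ALIASES = {"h": "help"}

def resolve_command(token: str):
    if token in COMMANDS:           # full names take priority
        return token
    if token in ALIASES:            # registered aliases next
        return ALIASES[token]
    for name in COMMANDS:           # first prefix match wins on ambiguity
        if name.startswith(token):
            return name
    return None
```

With this registry, `mod` resolves to `model`, and the ambiguous prefix `m` also resolves to `model` because it appears first in registry order.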
## Messaging slash commands


@@ -6,7 +6,13 @@ description: "Authoritative reference for Hermes built-in tools, grouped by tool
# Built-in Tools Reference
This page documents all 47 built-in tools in the Hermes tool registry, grouped by toolset. Availability varies by platform, credentials, and enabled toolsets.
**Quick counts:** 11 browser tools, 4 file tools, 10 RL tools, 4 Home Assistant tools, 2 terminal tools, 2 web tools, and 14 standalone tools across other toolsets.
:::tip MCP Tools
In addition to built-in tools, Hermes can load tools dynamically from MCP servers. MCP tools appear with a server-name prefix (e.g., `github_create_issue` for the `github` MCP server). See [MCP Integration](/docs/user-guide/features/mcp) for configuration.
:::
## `browser` toolset


@@ -6,53 +6,150 @@ description: "Reference for Hermes core, composite, platform, and dynamic toolse
# Toolsets Reference
Toolsets are named bundles of tools that control what the agent can do. They're the primary mechanism for configuring tool availability per platform, per session, or per task.
## How Toolsets Work
Every tool belongs to exactly one toolset. When you enable a toolset, all tools in that bundle become available to the agent. Toolsets come in three kinds:
- **Core** — A single logical group of related tools (e.g., `file` bundles `read_file`, `write_file`, `patch`, `search_files`)
- **Composite** — Combines multiple core toolsets for a common scenario (e.g., `debugging` bundles file, terminal, and web tools)
- **Platform** — A complete tool configuration for a specific deployment context (e.g., `hermes-cli` is the default for interactive CLI sessions)
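As a rough mental model, resolution can be sketched like this. The toolset contents are abridged from the tables on this page; the resolver itself is an illustrative simplification, not the actual Hermes code:

```python
# Simplified model of toolset resolution — not the actual Hermes resolver.
# Core toolsets map directly to tools; composites expand to core toolsets;
# "all"/"*" expand to every registered toolset.
CORE = {
    "file": {"read_file", "write_file", "patch", "search_files"},
    "terminal": {"process", "terminal"},
    "web": {"web_extract", "web_search"},
}
COMPOSITE = {"debugging": ["file", "terminal", "web"]}

def resolve(names):
    if "all" in names or "*" in names:
        names = list(CORE) + list(COMPOSITE)
    tools = set()
    for name in names:
        # A composite expands to its member core toolsets; a core
        # toolset "expands" to itself.
        for core in COMPOSITE.get(name, [name]):
            tools |= CORE[core]
    return tools
```

Under this model, `resolve(["debugging"])` yields the same flat tool set as `resolve(["file", "terminal", "web"])`.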
## Configuring Toolsets
### Per-session (CLI)
```bash
hermes chat --toolsets web,file,terminal
hermes chat --toolsets debugging # composite — expands to file + terminal + web
hermes chat --toolsets all # everything
```
### Per-platform (config.yaml)
```yaml
toolsets:
  - hermes-cli           # default for CLI
  # - hermes-telegram    # override for Telegram gateway
```
### Interactive management
```bash
hermes tools # curses UI to enable/disable per platform
```
Or in-session:
```
/tools list
/tools disable browser
/tools enable rl
```
## Core Toolsets
| Toolset | Tools | Purpose |
|---------|-------|---------|
| `browser` | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `web_search` | Full browser automation. Includes `web_search` as a fallback for quick lookups. |
| `clarify` | `clarify` | Ask the user a question when the agent needs clarification. |
| `code_execution` | `execute_code` | Run Python scripts that call Hermes tools programmatically. |
| `cronjob` | `cronjob` | Schedule and manage recurring tasks. |
| `delegation` | `delegate_task` | Spawn isolated subagent instances for parallel work. |
| `file` | `patch`, `read_file`, `search_files`, `write_file` | File reading, writing, searching, and editing. |
| `homeassistant` | `ha_call_service`, `ha_get_state`, `ha_list_entities`, `ha_list_services` | Smart home control via Home Assistant. Only available when `HASS_TOKEN` is set. |
| `image_gen` | `image_generate` | Text-to-image generation via FAL.ai. |
| `memory` | `memory` | Persistent cross-session memory management. |
| `messaging` | `send_message` | Send messages to other platforms (Telegram, Discord, etc.) from within a session. |
| `moa` | `mixture_of_agents` | Multi-model consensus via Mixture of Agents. |
| `rl` | `rl_check_status`, `rl_edit_config`, `rl_get_current_config`, `rl_get_results`, `rl_list_environments`, `rl_list_runs`, `rl_select_environment`, `rl_start_training`, `rl_stop_training`, `rl_test_inference` | RL training environment management (Atropos). |
| `search` | `web_search` | Web search only (without extract). |
| `session_search` | `session_search` | Search past conversation sessions. |
| `skills` | `skill_manage`, `skill_view`, `skills_list` | Skill CRUD and browsing. |
| `terminal` | `process`, `terminal` | Shell command execution and background process management. |
| `todo` | `todo` | Task list management within a session. |
| `tts` | `text_to_speech` | Text-to-speech audio generation. |
| `vision` | `vision_analyze` | Image analysis via vision-capable models. |
| `web` | `web_extract`, `web_search` | Web search and page content extraction. |
## Composite Toolsets
These expand to multiple core toolsets, providing a convenient shorthand for common scenarios:
| Toolset | Expands to | Use case |
|---------|-----------|----------|
| `debugging` | `patch`, `process`, `read_file`, `search_files`, `terminal`, `web_extract`, `web_search`, `write_file` | Debug sessions — file access, terminal, and web research without browser or delegation overhead. |
| `safe` | `image_generate`, `mixture_of_agents`, `vision_analyze`, `web_extract`, `web_search` | Read-only research and media generation. No file writes, no terminal access, no code execution. Good for untrusted or constrained environments. |
## Platform Toolsets
Platform toolsets define the complete tool configuration for a deployment target. Most messaging platforms use the same set as `hermes-cli`:
| Toolset | Differences from `hermes-cli` |
|---------|-------------------------------|
| `hermes-cli` | Full toolset — all 39 tools including `clarify`. The default for interactive CLI sessions. |
| `hermes-acp` | Drops `clarify`, `cronjob`, `image_generate`, `mixture_of_agents`, `send_message`, `text_to_speech`, homeassistant tools. Focused on coding tasks in IDE context. |
| `hermes-api-server` | Drops `clarify` and `send_message`. Adds everything else — suitable for programmatic access where user interaction isn't possible. |
| `hermes-telegram` | Same as `hermes-cli`. |
| `hermes-discord` | Same as `hermes-cli`. |
| `hermes-slack` | Same as `hermes-cli`. |
| `hermes-whatsapp` | Same as `hermes-cli`. |
| `hermes-signal` | Same as `hermes-cli`. |
| `hermes-matrix` | Same as `hermes-cli`. |
| `hermes-mattermost` | Same as `hermes-cli`. |
| `hermes-email` | Same as `hermes-cli`. |
| `hermes-sms` | Same as `hermes-cli`. |
| `hermes-dingtalk` | Same as `hermes-cli`. |
| `hermes-feishu` | Same as `hermes-cli`. |
| `hermes-wecom` | Same as `hermes-cli`. |
| `hermes-homeassistant` | Same as `hermes-cli`. |
| `hermes-webhook` | Same as `hermes-cli`. |
| `hermes-gateway` | Union of all messaging platform toolsets. Used internally when the gateway needs the broadest possible tool set. |
## Dynamic Toolsets
### MCP server toolsets
Each configured MCP server generates a `mcp-<server>` toolset at runtime. For example, if you configure a `github` MCP server, a `mcp-github` toolset is created containing all tools that server exposes.
```yaml
# config.yaml
mcp:
  servers:
    github:
      command: npx
      args: ["-y", "@modelcontextprotocol/server-github"]
```
This creates a `mcp-github` toolset you can reference in `--toolsets` or platform configs.
### Plugin toolsets
Plugins can register their own toolsets via `ctx.register_tool()` during plugin initialization. These appear alongside built-in toolsets and can be enabled/disabled the same way.
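A minimal plugin module might look like the following sketch. Only `ctx.register_tool()` comes from the source above; the hook name and the shape of the context object are assumptions for illustration:

```python
# Hypothetical plugin sketch — the init hook name and context object are
# illustrative assumptions; only ctx.register_tool() comes from the docs.
def init_plugin(ctx):
    def word_count(text: str) -> int:
        """Count whitespace-separated words in a string."""
        return len(text.split())

    # Registered tools appear in a plugin toolset alongside built-ins
    # and can be enabled/disabled the same way.
    ctx.register_tool(word_count)
```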
### Custom toolsets
Define custom toolsets in `config.yaml` to create project-specific bundles:
```yaml
toolsets:
  - hermes-cli
custom_toolsets:
  data-science:
    - file
    - terminal
    - code_execution
    - web
    - vision
```
### Wildcards
- `all` or `*` — expands to every registered toolset (built-in + dynamic + plugin)
## Relationship to `hermes tools`
The `hermes tools` command provides a curses-based UI for toggling individual tools on or off per platform. This operates at the tool level (finer than toolsets) and persists to `config.yaml`. Disabled tools are filtered out even if their toolset is enabled.
See also: [Tools Reference](./tools-reference.md) for the complete list of individual tools and their parameters.


@@ -95,6 +95,38 @@ All paths are resolved relative to the working directory. References that resolv
Binary files are detected via MIME type and null-byte scanning. Known text extensions (`.py`, `.md`, `.json`, `.yaml`, `.toml`, `.js`, `.ts`, etc.) bypass MIME-based detection. Binary files are rejected with a warning.
## Platform Availability
Context references are primarily a **CLI feature**. They work in the interactive CLI where `@` triggers tab completion and references are expanded before the message is sent to the agent.
In **messaging platforms** (Telegram, Discord, etc.), the `@` syntax is not expanded by the gateway — messages are passed through as-is. The agent itself can still reference files via the `read_file`, `search_files`, and `web_extract` tools.
## Interaction with Context Compression
When conversation context is compressed, the expanded reference content is included in the compression summary. This means:
- Large file contents injected via `@file:` contribute to context usage
- If the conversation is later compressed, the file content is summarized (not preserved verbatim)
- For very large files, consider using line ranges (`@file:main.py:100-200`) to inject only relevant sections
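For intuition, the line-range form can be split apart like this. This is a sketch of the `@file:` syntax documented on this page, not the actual Hermes expansion code:

```python
import re

# Illustrative parser for @file:path[:start-end] — a sketch of the syntax
# documented above, not the actual Hermes reference expander.
FILE_REF = re.compile(r"@file:(?P<path>[^\s:]+)(?::(?P<start>\d+)-(?P<end>\d+))?")

def parse_file_ref(token):
    m = FILE_REF.fullmatch(token)
    if m is None:
        return None
    start = int(m["start"]) if m["start"] else None
    end = int(m["end"]) if m["end"] else None
    return m["path"], start, end
```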
## Common Patterns
```text
# Code review workflow
Review @diff and check for security issues
# Debug with context
This test is failing. Here's the test @file:tests/test_auth.py
and the implementation @file:src/auth.py:50-80
# Project exploration
What does this project do? @folder:src @file:README.md
# Research
Compare the approaches in @url:https://arxiv.org/abs/2301.00001
and @url:https://arxiv.org/abs/2301.00002
```
## Error Handling
Invalid references produce inline warnings rather than failures:


@@ -187,9 +187,21 @@ When scheduling jobs, you specify where the output goes:
| `"origin"` | Back to where the job was created | Default on messaging platforms |
| `"local"` | Save to local files only (`~/.hermes/cron/output/`) | Default on CLI |
| `"telegram"` | Telegram home channel | Uses `TELEGRAM_HOME_CHANNEL` |
| `"telegram:123456"` | Specific Telegram chat by ID | Direct delivery |
| `"discord:987654"` | Specific Discord channel by ID | Direct delivery |
| `"telegram:-100123:17585"` | Specific Telegram topic | `chat_id:thread_id` format |
| `"discord"` | Discord home channel | Uses `DISCORD_HOME_CHANNEL` |
| `"discord:#engineering"` | Specific Discord channel | By channel name |
| `"slack"` | Slack home channel | |
| `"whatsapp"` | WhatsApp home | |
| `"signal"` | Signal | |
| `"matrix"` | Matrix home room | |
| `"mattermost"` | Mattermost home channel | |
| `"email"` | Email | |
| `"sms"` | SMS via Twilio | |
| `"homeassistant"` | Home Assistant | |
| `"dingtalk"` | DingTalk | |
| `"feishu"` | Feishu/Lark | |
| `"wecom"` | WeCom | |
The agent's final response is automatically delivered. You do not need to call `send_message` in the cron prompt.
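The target strings above follow a `platform[:chat_id[:thread_id]]` shape, which could be parsed as in this sketch (illustrative only, not the actual Hermes cron delivery code):

```python
# Illustrative parser for delivery-target strings such as
# "telegram:-100123:17585" (platform:chat_id:thread_id) — a sketch,
# not the actual Hermes implementation.
def parse_target(target: str) -> dict:
    platform, _, rest = target.partition(":")
    chat_id, _, thread_id = rest.partition(":")
    return {
        "platform": platform,
        "chat_id": chat_id or None,     # empty string -> None
        "thread_id": thread_id or None,
    }
```

Bare targets like `"local"` or `"telegram"` yield only a platform; `"discord:#engineering"` yields a platform plus channel.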


@@ -1,22 +1,39 @@
---
sidebar_position: 99
title: "Honcho Memory"
description: "AI-native persistent memory via Honcho — dialectic reasoning, multi-agent user modeling, and deep personalization"
---
# Honcho Memory
[Honcho](https://github.com/plastic-labs/honcho) is an AI-native memory backend that adds dialectic reasoning and deep user modeling on top of Hermes's built-in memory system. Instead of simple key-value storage, Honcho maintains a running model of who the user is — their preferences, communication style, goals, and patterns — by reasoning about conversations after they happen.
:::info Honcho is a Memory Provider Plugin
Honcho is integrated into the [Memory Providers](./memory-providers.md) system. All features below are available through the unified memory provider interface.
:::
## What Honcho Adds
| Capability | Built-in Memory | Honcho |
|-----------|----------------|--------|
| Cross-session persistence | ✔ File-based MEMORY.md/USER.md | ✔ Server-side with API |
| User profile | ✔ Manual agent curation | ✔ Automatic dialectic reasoning |
| Multi-agent isolation | — | ✔ Per-peer profile separation |
| Observation modes | — | ✔ Unified or directional observation |
| Conclusions (derived insights) | — | ✔ Server-side reasoning about patterns |
| Search across history | ✔ FTS5 session search | ✔ Semantic search over conclusions |
**Dialectic reasoning**: After each conversation, Honcho analyzes the exchange and derives "conclusions" — insights about the user's preferences, habits, and goals. These conclusions accumulate over time, giving the agent a deepening understanding that goes beyond what the user explicitly stated.
**Multi-agent profiles**: When multiple Hermes instances talk to the same user (e.g., a coding assistant and a personal assistant), Honcho maintains separate "peer" profiles. Each peer sees only its own observations and conclusions, preventing cross-contamination of context.
## Setup
```bash
hermes memory setup # select "honcho" from the provider list
```
Or configure manually:
```yaml
# ~/.hermes/config.yaml
memory:
  provider: honcho
```
```bash
echo "HONCHO_API_KEY=your-key" >> ~/.hermes/.env
```
Get an API key at [honcho.dev](https://honcho.dev).
## Configuration Options
```yaml
# ~/.hermes/config.yaml
honcho:
  observation: directional # "unified" (default for new installs) or "directional"
  peer_name: "" # auto-detected from platform, or set manually
```
**Observation modes:**
- `unified` — All observations go into a single pool. Simpler, good for single-agent setups.
- `directional` — Observations are tagged with direction (user→agent, agent→user). Enables richer analysis of conversation dynamics.
## Tools
When Honcho is active as the memory provider, four additional tools become available:
| Tool | Purpose |
|------|---------|
| `honcho_conclude` | Trigger server-side dialectic reasoning on recent conversations |
| `honcho_context` | Retrieve relevant context from Honcho's memory for the current conversation |
| `honcho_profile` | View or update the user's Honcho profile |
| `honcho_search` | Semantic search across all stored conclusions and observations |
## CLI Commands
```bash
hermes honcho status # Show connection status and config
hermes honcho peer # Update peer names for multi-agent setups
```
## Migrating from `hermes honcho`
If you previously used the standalone `hermes honcho setup`:
1. Your existing configuration (`honcho.json` or `~/.honcho/config.json`) is preserved
2. Your server-side data (memories, conclusions, user profiles) is intact
3. Set `memory.provider: honcho` in config.yaml to reactivate
No re-login or re-setup needed. Run `hermes memory setup` and select "honcho" — the wizard detects your existing config.
## Full Documentation
See [Memory Providers — Honcho](./memory-providers.md#honcho) for the complete reference.

View File

Debug logs are saved to `./logs/image_tools_debug_<session_id>.json` with details of each request.
The image generation tool runs with safety checks disabled by default (`safety_tolerance: 5`, the most permissive setting). This is configured at the code level and is not user-adjustable.
## Platform Delivery
Generated images are delivered differently depending on the platform:
| Platform | Delivery method |
|----------|----------------|
| **CLI** | Image URL printed as markdown `![description](url)` — click to open in browser |
| **Telegram** | Image sent as a photo message with the prompt as caption |
| **Discord** | Image embedded in a message |
| **Slack** | Image URL in message (Slack unfurls it) |
| **WhatsApp** | Image sent as a media message |
| **Other platforms** | Image URL in plain text |
The agent uses `MEDIA:<url>` syntax in its response, which the platform adapter converts to the appropriate format.
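To illustrate the `MEDIA:<url>` convention, here is a minimal, hypothetical parser in the style a platform adapter might use — the token syntax comes from the docs above, but the splitting logic is illustrative, not Hermes's actual code:

```python
import re

# MEDIA:<url> tokens are stripped out of the response text so the
# adapter can send the media separately (photo upload, embed, etc.).
MEDIA_RE = re.compile(r"MEDIA:(\S+)")

def split_media(response: str) -> tuple[str, list[str]]:
    """Return the text with MEDIA tokens removed, plus the media URLs."""
    urls = MEDIA_RE.findall(response)
    text = MEDIA_RE.sub("", response).strip()
    return text, urls

text, urls = split_media("Here you go! MEDIA:https://fal.ai/img/abc.png")
# text == "Here you go!", urls == ["https://fal.ai/img/abc.png"]
```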
## Limitations
- **Requires FAL API key** — image generation incurs API costs on your FAL.ai account
- **No image editing** — this is text-to-image only, no inpainting or img2img
- **URL-based delivery** — images are returned as temporary FAL.ai URLs, not saved locally
- **URL-based delivery** — images are returned as temporary FAL.ai URLs, not saved locally. URLs expire after a period (typically hours)
- **Upscaling adds latency** — the automatic 2x upscale step adds processing time
- **Max 4 images per request** — `num_images` is capped at 4

View File

Hermes Agent includes a rich set of capabilities that extend far beyond basic chat.
- **[Browser Automation](browser.md)** — Full browser automation with multiple backends: Browserbase cloud, Browser Use cloud, local Chrome via CDP, or local Chromium. Navigate websites, fill forms, and extract information.
- **[Vision & Image Paste](vision.md)** — Multimodal vision support. Paste images from your clipboard into the CLI and ask the agent to analyze, describe, or work with them using any vision-capable model.
- **[Image Generation](image-generation.md)** — Generate images from text prompts using FAL.ai's FLUX 2 Pro model with automatic 2x upscaling via the Clarity Upscaler.
- **[Voice & TTS](tts.md)** — Text-to-speech output and voice message transcription across all messaging platforms, with five provider options: Edge TTS (free), ElevenLabs, OpenAI TTS, MiniMax, and NeuTTS.
## Integrations
- **[MCP Integration](mcp.md)** — Connect to any MCP server via stdio or HTTP transport. Access external tools from GitHub, databases, file systems, and internal APIs without writing native Hermes tools. Includes per-server tool filtering and sampling support.
- **[Provider Routing](provider-routing.md)** — Fine-grained control over which AI providers handle your requests. Optimize for cost, speed, or quality with sorting, whitelists, blacklists, and priority ordering.
- **[Fallback Providers](fallback-providers.md)** — Automatic failover to backup LLM providers when your primary model encounters errors, including independent fallback for auxiliary tasks like vision and compression.
- **[Credential Pools](credential-pools.md)** — Distribute API calls across multiple keys for the same provider. Automatic rotation on rate limits or failures.
- **[Memory Providers](memory-providers.md)** — Plug in external memory backends (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover) for cross-session user modeling and personalization beyond the built-in memory system.
- **[API Server](api-server.md)** — Expose Hermes as an OpenAI-compatible HTTP endpoint. Connect any frontend that speaks the OpenAI format — Open WebUI, LobeChat, LibreChat, and more.
- **[IDE Integration (ACP)](acp.md)** — Use Hermes inside ACP-compatible editors such as VS Code, Zed, and JetBrains. Chat, tool activity, file diffs, and terminal commands render inside your editor.
- **[Honcho Memory](honcho.md)** — AI-native persistent memory for cross-session user modeling and personalization via dialectic reasoning.
- **[RL Training](rl-training.md)** — Generate trajectory data from agent sessions for reinforcement learning and model fine-tuning.
## Customization

View File

Routes define how different webhook sources are handled. Each route is a named entry.
| `secret` | **Yes** | HMAC secret for signature validation. Falls back to the global `secret` if not set on the route. Set to `"INSECURE_NO_AUTH"` for testing only (skips validation). |
| `prompt` | No | Template string with dot-notation payload access (e.g. `{pull_request.title}`). If omitted, the full JSON payload is dumped into the prompt. |
| `skills` | No | List of skill names to load for the agent run. |
| `deliver` | No | Where to send the response: `github_comment`, `telegram`, `discord`, `slack`, `signal`, `matrix`, `mattermost`, `email`, `sms`, `dingtalk`, `feishu`, `wecom`, or `log` (default). |
| `deliver_extra` | No | Additional delivery config — keys depend on `deliver` type (e.g. `repo`, `pr_number`, `chat_id`). Values support the same `{dot.notation}` templates as `prompt`. |
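As an illustration of the `{dot.notation}` templating described above, here is a minimal, hypothetical resolver — not Hermes's actual implementation, just a sketch of how nested payload keys map into the prompt string:

```python
import re

def render(template: str, payload: dict) -> str:
    """Replace {a.b.c} placeholders with nested values from the payload."""
    def resolve(match: re.Match) -> str:
        value = payload
        for key in match.group(1).split("."):
            value = value[key]  # walk one level per dotted segment
        return str(value)
    return re.sub(r"\{([\w.]+)\}", resolve, template)

payload = {"pull_request": {"title": "Fix race in gateway", "number": 42}}
rendered = render("New PR #{pull_request.number}: {pull_request.title}", payload)
print(rendered)
# → New PR #42: Fix race in gateway
```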
### Full example

View File

Hermes Agent automatically saves every conversation as a session. Sessions enable resuming conversations and searching past history.
## How Sessions Work
Every conversation — whether from the CLI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, or any other messaging platform — is stored as a session with full message history. Sessions are tracked in two complementary systems:
1. **SQLite database** (`~/.hermes/state.db`) — structured session metadata with FTS5 full-text search
2. **JSONL transcripts** (`~/.hermes/sessions/`) — raw conversation transcripts including tool calls (gateway)
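Since transcripts are plain JSONL, they are easy to post-process. The sketch below reads one line-per-message; the `"role"`/`"content"` record shape is an assumption for illustration, not Hermes's documented schema:

```python
import json
from pathlib import Path

def read_transcript(path: Path) -> list[dict]:
    """Parse a JSONL transcript: one JSON object per non-empty line."""
    with path.open() as fh:
        return [json.loads(line) for line in fh if line.strip()]

# Demo against a throwaway file (real transcripts live in ~/.hermes/sessions/).
demo = Path("demo_transcript.jsonl")
demo.write_text('{"role": "user", "content": "hi"}\n'
                '{"role": "assistant", "content": "hello"}\n')
messages = read_transcript(demo)
demo.unlink()
```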
Each session is tagged with its source platform:
| `cli` | Interactive CLI (`hermes` or `hermes chat`) |
| `telegram` | Telegram messenger |
| `discord` | Discord server/DM |
| `slack` | Slack workspace |
| `whatsapp` | WhatsApp messenger |
| `signal` | Signal messenger |
| `matrix` | Matrix rooms and DMs |
| `mattermost` | Mattermost channels |
| `email` | Email (IMAP/SMTP) |
| `sms` | SMS via Twilio |
| `dingtalk` | DingTalk messenger |
| `feishu` | Feishu/Lark messenger |
| `wecom` | WeCom (WeChat Work) |
| `homeassistant` | Home Assistant conversation |
| `webhook` | Incoming webhooks |
| `api-server` | API server requests |
| `acp` | ACP editor integration |
| `cron` | Scheduled cron jobs |
| `batch` | Batch processing runs |
## CLI Session Resume