docs: add 11 new pages + expand 4 existing pages (26 → 37 total)
New pages (sourced from actual codebase):
- Security: command approval, DM pairing, container isolation, production checklist
- Session Management: resume, export, prune, search, per-platform tracking
- Context Files: AGENTS.md project context, discovery, size limits, security
- Personality: SOUL.md, 14 built-in personalities, custom definitions
- Browser Automation: Browserbase setup, 10 browser tools, stealth mode
- Image Generation: FLUX 2 Pro via FAL, aspect ratios, auto-upscaling
- Provider Routing: OpenRouter sort/only/ignore/order config
- Honcho: AI-native memory integration, setup, peer config
- Home Assistant: HASS setup, 4 HA tools, WebSocket gateway
- Batch Processing: trajectory generation, dataset format, checkpointing
- RL Training: Atropos/Tinker integration, environments, workflow

Expanded pages:
- code-execution: 51 → 195 lines (examples, limits, security, comparison table)
- delegation: 60 → 216 lines (context tips, batch mode, model override)
- cron: 88 → 273 lines (real-world examples, delivery options, expression cheat sheet)
- memory: 98 → 249 lines (best practices, capacity management, examples)
@@ -31,6 +31,8 @@ It's not a coding copilot tethered to an IDE or a chatbot wrapper around a singl
| 🧠 **[Memory System](/docs/user-guide/features/memory)** | Persistent memory that grows across sessions |
| 📚 **[Skills System](/docs/user-guide/features/skills)** | Procedural memory the agent creates and reuses |
| 🔌 **[MCP Integration](/docs/user-guide/features/mcp)** | Connect to any MCP server for extended capabilities |
| 📄 **[Context Files](/docs/user-guide/features/context-files)** | Project context files that shape every conversation |
| 🔒 **[Security](/docs/user-guide/security)** | Command approval, authorization, container isolation |
| 🏗️ **[Architecture](/docs/developer-guide/architecture)** | How it works under the hood |
| 🤝 **[Contributing](/docs/developer-guide/contributing)** | Development setup and PR process |
226 website/docs/user-guide/features/batch-processing.md (new file)
@@ -0,0 +1,226 @@
---
sidebar_position: 12
title: "Batch Processing"
description: "Generate agent trajectories at scale — parallel processing, checkpointing, and toolset distributions"
---

# Batch Processing

Batch processing lets you run the Hermes agent across hundreds or thousands of prompts in parallel, generating structured trajectory data. This is primarily used for **training data generation** — producing ShareGPT-format trajectories with tool usage statistics that can be used for fine-tuning or evaluation.

## Overview

The batch runner (`batch_runner.py`) processes a JSONL dataset of prompts, running each through a full agent session with tool access. Each prompt gets its own isolated environment. The output is structured trajectory data with full conversation history, tool call statistics, and reasoning coverage metrics.

## Quick Start

```bash
# Basic batch run
python batch_runner.py \
  --dataset_file=data/prompts.jsonl \
  --batch_size=10 \
  --run_name=my_first_run \
  --model=anthropic/claude-sonnet-4-20250514 \
  --num_workers=4

# Resume an interrupted run
python batch_runner.py \
  --dataset_file=data/prompts.jsonl \
  --batch_size=10 \
  --run_name=my_first_run \
  --resume

# List available toolset distributions
python batch_runner.py --list_distributions
```

## Dataset Format

The input dataset is a JSONL file (one JSON object per line). Each entry must have a `prompt` field:

```jsonl
{"prompt": "Write a Python function that finds the longest palindromic substring"}
{"prompt": "Create a REST API endpoint for user authentication using Flask"}
{"prompt": "Debug this error: TypeError: cannot unpack non-iterable NoneType object"}
```

Entries can optionally include:

- `image` or `docker_image`: A container image to use for this prompt's sandbox (works with Docker, Modal, and Singularity backends)
- `cwd`: Working directory override for the task's terminal session

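Before launching a large run, it can be worth sanity-checking the dataset. A minimal validator sketch (a hypothetical helper, not part of `batch_runner.py`; the field names come from the format above):

```python
import json

REQUIRED_FIELD = "prompt"
OPTIONAL_FIELDS = {"image", "docker_image", "cwd"}

def validate_dataset(lines):
    """Check each JSONL line parses, has a 'prompt', and uses only known optional fields."""
    errors = []
    for i, line in enumerate(lines):
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            errors.append(f"line {i}: invalid JSON")
            continue
        if REQUIRED_FIELD not in entry:
            errors.append(f"line {i}: missing 'prompt'")
        unknown = set(entry) - OPTIONAL_FIELDS - {REQUIRED_FIELD}
        if unknown:
            errors.append(f"line {i}: unknown fields {sorted(unknown)}")
    return errors
```

Running it over your file before a multi-hour batch run catches malformed lines early, before workers pick them up.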
## Configuration Options

| Parameter | Default | Description |
|-----------|---------|-------------|
| `--dataset_file` | (required) | Path to JSONL dataset |
| `--batch_size` | (required) | Prompts per batch |
| `--run_name` | (required) | Name for this run (used for output dir and checkpointing) |
| `--distribution` | `"default"` | Toolset distribution to sample from |
| `--model` | `claude-sonnet-4-20250514` | Model to use |
| `--base_url` | `https://openrouter.ai/api/v1` | API base URL |
| `--api_key` | (env var) | API key for model |
| `--max_turns` | `10` | Maximum tool-calling iterations per prompt |
| `--num_workers` | `4` | Parallel worker processes |
| `--resume` | `false` | Resume from checkpoint |
| `--verbose` | `false` | Enable verbose logging |
| `--max_samples` | all | Only process first N samples from dataset |
| `--max_tokens` | model default | Maximum tokens per model response |

### Provider Routing (OpenRouter)

| Parameter | Description |
|-----------|-------------|
| `--providers_allowed` | Comma-separated providers to allow (e.g., `"anthropic,openai"`) |
| `--providers_ignored` | Comma-separated providers to ignore (e.g., `"together,deepinfra"`) |
| `--providers_order` | Comma-separated preferred provider order |
| `--provider_sort` | Sort by `"price"`, `"throughput"`, or `"latency"` |

### Reasoning Control

| Parameter | Description |
|-----------|-------------|
| `--reasoning_effort` | Effort level: `xhigh`, `high`, `medium`, `low`, `minimal`, `none` |
| `--reasoning_disabled` | Completely disable reasoning/thinking tokens |

### Advanced Options

| Parameter | Description |
|-----------|-------------|
| `--ephemeral_system_prompt` | System prompt used during execution but NOT saved to trajectories |
| `--log_prefix_chars` | Characters to show in log previews (default: 100) |
| `--prefill_messages_file` | Path to JSON file with prefill messages for few-shot priming |

## Toolset Distributions

Each prompt gets a randomly sampled set of toolsets from a **distribution**. This ensures training data covers diverse tool combinations. Use `--list_distributions` to see all available distributions.

Distributions define probability weights for each toolset combination. For example, a "default" distribution might assign high probability to `["terminal", "file", "web"]` and lower probability to web-only or file-only combinations.

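For illustration, weighted sampling over such a distribution might look like the following. The dict shape and the weights are assumptions for this sketch, not the runner's actual distribution format:

```python
import random

# Hypothetical distribution: probability weights over toolset combinations.
DEFAULT_DISTRIBUTION = {
    ("terminal", "file", "web"): 0.6,
    ("web",): 0.2,
    ("file",): 0.2,
}

def sample_toolsets(distribution, rng=random):
    """Pick one toolset combination according to its weight."""
    combos = list(distribution)
    weights = [distribution[c] for c in combos]
    return list(rng.choices(combos, weights=weights, k=1)[0])
```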
## Output Format

All output goes to `data/<run_name>/`:

```
data/my_run/
├── trajectories.jsonl   # Combined final output (all batches merged)
├── batch_0.jsonl        # Individual batch results
├── batch_1.jsonl
├── ...
├── checkpoint.json      # Resume checkpoint
└── statistics.json      # Aggregate tool usage stats
```

### Trajectory Format

Each line in `trajectories.jsonl` is a JSON object:

```json
{
  "prompt_index": 42,
  "conversations": [
    {"from": "human", "value": "Write a function..."},
    {"from": "gpt", "value": "I'll create that function...", "tool_calls": [...]},
    {"from": "tool", "value": "..."},
    {"from": "gpt", "value": "Here's the completed function..."}
  ],
  "metadata": {
    "batch_num": 2,
    "timestamp": "2026-01-15T10:30:00",
    "model": "anthropic/claude-sonnet-4-20250514"
  },
  "completed": true,
  "partial": false,
  "api_calls": 3,
  "toolsets_used": ["terminal", "file"],
  "tool_stats": {
    "terminal": {"count": 2, "success": 2, "failure": 0},
    "read_file": {"count": 1, "success": 1, "failure": 0}
  },
  "tool_error_counts": {
    "terminal": 0,
    "read_file": 0
  }
}
```

The `conversations` field uses a ShareGPT-like format with `from` and `value` fields. Tool stats are normalized to include all possible tools with zero defaults, ensuring consistent schema across entries for HuggingFace datasets compatibility.

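Because every line is a self-contained JSON object, post-run analysis takes only a few lines of Python. A sketch that tallies completions and per-tool call counts (field names taken from the trajectory format above):

```python
import json
from collections import Counter

def aggregate_tool_stats(trajectory_lines):
    """Sum per-tool call counts and count completed records across trajectories."""
    totals = Counter()
    completed = 0
    for line in trajectory_lines:
        record = json.loads(line)
        if record.get("completed"):
            completed += 1
        for tool, stats in record.get("tool_stats", {}).items():
            totals[tool] += stats.get("count", 0)
    return completed, dict(totals)
```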
## Checkpointing

The batch runner has robust checkpointing for fault tolerance:

- **Checkpoint file:** Saved after each batch completes, tracking which prompt indices are done
- **Content-based resume:** On `--resume`, the runner scans existing batch files and matches completed prompts by their actual text content (not just indices), enabling recovery even if the dataset order changes
- **Failed prompts:** Only successfully completed prompts are marked as done — failed prompts will be retried on resume
- **Batch merging:** On completion, all batch files (including from previous runs) are merged into a single `trajectories.jsonl`

### How Resume Works

1. Scan all `batch_*.jsonl` files for completed prompts (by content matching)
2. Filter the dataset to exclude already-completed prompts
3. Re-batch the remaining prompts
4. Process only the remaining prompts
5. Merge all batch files (old + new) into final output

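The content-matching step can be sketched as follows (a simplified illustration under the trajectory schema above, not the runner's actual code):

```python
def remaining_prompts(dataset, completed_records):
    """Filter out prompts whose text already appears in a completed record.

    `dataset` is a list of {"prompt": ...} entries; `completed_records` is the
    parsed contents of existing batch_*.jsonl files. Matching by prompt text
    (not index) means resume survives dataset reordering.
    """
    done = {
        turn["value"]
        for record in completed_records
        if record.get("completed")
        for turn in record.get("conversations", [])
        if turn.get("from") == "human"
    }
    return [entry for entry in dataset if entry["prompt"] not in done]
```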
## Quality Filtering

The batch runner applies automatic quality filtering:

- **No-reasoning filter:** Samples where zero assistant turns contain reasoning (no `<REASONING_SCRATCHPAD>` or native thinking tokens) are discarded
- **Corrupted entry filter:** Entries with hallucinated tool names (not in the valid tool list) are filtered out during the final merge
- **Reasoning statistics:** Tracks percentage of turns with/without reasoning across the entire run

## Statistics

After completion, the runner prints comprehensive statistics:

- **Tool usage:** Call counts, success/failure rates per tool
- **Reasoning coverage:** Percentage of assistant turns with reasoning
- **Samples discarded:** Count of samples filtered for lacking reasoning
- **Duration:** Total processing time

Statistics are also saved to `statistics.json` for programmatic analysis.

## Use Cases

### Training Data Generation

Generate diverse tool-use trajectories for fine-tuning:

```bash
python batch_runner.py \
  --dataset_file=data/coding_prompts.jsonl \
  --batch_size=20 \
  --run_name=coding_v1 \
  --model=anthropic/claude-sonnet-4-20250514 \
  --num_workers=8 \
  --distribution=default \
  --max_turns=15
```

### Model Evaluation

Evaluate how well a model uses tools across standardized prompts:

```bash
python batch_runner.py \
  --dataset_file=data/eval_suite.jsonl \
  --batch_size=10 \
  --run_name=eval_gpt4 \
  --model=openai/gpt-4o \
  --num_workers=4 \
  --max_turns=10
```

### Per-Prompt Container Images

For benchmarks requiring specific environments, each prompt can specify its own container image:

```jsonl
{"prompt": "Install numpy and compute eigenvalues of a 3x3 matrix", "image": "python:3.11-slim"}
{"prompt": "Compile this Rust program and run it", "image": "rust:1.75"}
{"prompt": "Set up a Node.js Express server", "image": "node:20-alpine", "cwd": "/app"}
```

The batch runner verifies Docker images are accessible before running each prompt.

205 website/docs/user-guide/features/browser.md (new file)
@@ -0,0 +1,205 @@
---
title: Browser Automation
description: Control cloud browsers with Browserbase integration for web interaction, form filling, scraping, and more.
sidebar_label: Browser
sidebar_position: 5
---

# Browser Automation

Hermes Agent includes a full browser automation toolset powered by [Browserbase](https://browserbase.com), enabling the agent to navigate websites, interact with page elements, fill forms, and extract information — all running in cloud-hosted browsers with built-in anti-bot stealth features.

## Overview

The browser tools use the `agent-browser` CLI with Browserbase cloud execution. Pages are represented as **accessibility trees** (text-based snapshots), making them ideal for LLM agents. Interactive elements get ref IDs (like `@e1`, `@e2`) that the agent uses for clicking and typing.

Key capabilities:

- **Cloud execution** — no local browser needed
- **Built-in stealth** — random fingerprints, CAPTCHA solving, residential proxies
- **Session isolation** — each task gets its own browser session
- **Automatic cleanup** — inactive sessions are closed after a timeout
- **Vision analysis** — screenshot + AI analysis for visual understanding

## Setup

### Required Environment Variables

```bash
# Add to ~/.hermes/.env
BROWSERBASE_API_KEY=your-api-key-here
BROWSERBASE_PROJECT_ID=your-project-id-here
```

Get your credentials at [browserbase.com](https://browserbase.com).

### Optional Environment Variables

```bash
# Residential proxies for better CAPTCHA solving (default: "true")
BROWSERBASE_PROXIES=true

# Advanced stealth with custom Chromium — requires Scale Plan (default: "false")
BROWSERBASE_ADVANCED_STEALTH=false

# Session reconnection after disconnects — requires paid plan (default: "true")
BROWSERBASE_KEEP_ALIVE=true

# Custom session timeout in milliseconds (default: project default)
# Examples: 600000 (10min), 1800000 (30min)
BROWSERBASE_SESSION_TIMEOUT=600000

# Inactivity timeout before auto-cleanup in seconds (default: 300)
BROWSER_INACTIVITY_TIMEOUT=300
```

### Install agent-browser CLI

```bash
npm install -g agent-browser
# Or install locally in the repo:
npm install
```

:::info
The `browser` toolset must be included in your config's `toolsets` list or enabled via `hermes config set toolsets '["hermes-cli", "browser"]'`.
:::

## Available Tools

### `browser_navigate`

Navigate to a URL. Must be called before any other browser tool. Initializes the Browserbase session.

```
Navigate to https://github.com/NousResearch
```

:::tip
For simple information retrieval, prefer `web_search` or `web_extract` — they are faster and cheaper. Use browser tools when you need to **interact** with a page (click buttons, fill forms, handle dynamic content).
:::

### `browser_snapshot`

Get a text-based snapshot of the current page's accessibility tree. Returns interactive elements with ref IDs like `@e1`, `@e2` for use with `browser_click` and `browser_type`.

- **`full=false`** (default): Compact view showing only interactive elements
- **`full=true`**: Complete page content

Snapshots over 8000 characters are automatically summarized by an LLM.

### `browser_click`

Click an element identified by its ref ID from the snapshot.

```
Click @e5 to press the "Sign In" button
```

### `browser_type`

Type text into an input field. Clears the field first, then types the new text.

```
Type "hermes agent" into the search field @e3
```

### `browser_scroll`

Scroll the page up or down to reveal more content.

```
Scroll down to see more results
```

### `browser_press`

Press a keyboard key. Useful for submitting forms or navigation.

```
Press Enter to submit the form
```

Supported keys: `Enter`, `Tab`, `Escape`, `ArrowDown`, `ArrowUp`, and more.

### `browser_back`

Navigate back to the previous page in browser history.

### `browser_get_images`

List all images on the current page with their URLs and alt text. Useful for finding images to analyze.

### `browser_vision`

Take a screenshot and analyze it with vision AI. Use this when text snapshots don't capture important visual information — especially useful for CAPTCHAs, complex layouts, or visual verification challenges.

```
What does the chart on this page show?
```

### `browser_close`

Close the browser session and release resources. Call this when done to free up Browserbase session quota.

## Practical Examples

### Filling Out a Web Form

```
User: Sign up for an account on example.com with my email john@example.com

Agent workflow:
1. browser_navigate("https://example.com/signup")
2. browser_snapshot() → sees form fields with refs
3. browser_type(ref="@e3", text="john@example.com")
4. browser_type(ref="@e5", text="SecurePass123")
5. browser_click(ref="@e8") → clicks "Create Account"
6. browser_snapshot() → confirms success
7. browser_close()
```

### Researching Dynamic Content

```
User: What are the top trending repos on GitHub right now?

Agent workflow:
1. browser_navigate("https://github.com/trending")
2. browser_snapshot(full=true) → reads trending repo list
3. Returns formatted results
4. browser_close()
```

## Stealth Features

Browserbase provides automatic stealth capabilities:

| Feature | Default | Notes |
|---------|---------|-------|
| Basic Stealth | Always on | Random fingerprints, viewport randomization, CAPTCHA solving |
| Residential Proxies | On | Routes through residential IPs for better access |
| Advanced Stealth | Off | Custom Chromium build, requires Scale Plan |
| Keep Alive | On | Session reconnection after network hiccups |

:::note
If paid features aren't available on your plan, Hermes automatically falls back — first disabling `keepAlive`, then proxies — so browsing still works on free plans.
:::

## Session Management

- Each task gets an isolated browser session via Browserbase
- Sessions are automatically cleaned up after inactivity (default: 5 minutes)
- A background thread checks every 30 seconds for stale sessions
- Emergency cleanup runs on process exit to prevent orphaned sessions
- Sessions are released via the Browserbase API (`REQUEST_RELEASE` status)

## Limitations

- **Requires Browserbase account** — no local browser fallback
- **Requires `agent-browser` CLI** — must be installed via npm
- **Text-based interaction** — relies on accessibility tree, not pixel coordinates
- **Snapshot size** — large pages may be truncated or LLM-summarized at 8000 characters
- **Session timeout** — sessions expire based on your Browserbase plan settings
- **Cost** — each session consumes Browserbase credits; use `browser_close` when done
- **No file downloads** — cannot download files from the browser

@@ -10,6 +10,12 @@ The `execute_code` tool lets the agent write Python scripts that call Hermes too

## How It Works

1. The agent writes a Python script using `from hermes_tools import ...`
2. Hermes generates a `hermes_tools.py` stub module with RPC functions
3. Hermes opens a Unix domain socket and starts an RPC listener thread
4. The script runs in a child process — tool calls travel over the socket back to Hermes
5. Only the script's `print()` output is returned to the LLM; intermediate tool results never enter the context window

```python
# The agent can write scripts like:
from hermes_tools import web_search, web_extract
```

@@ -33,15 +39,103 @@ The agent uses `execute_code` when there are:

The key benefit: intermediate tool results never enter the context window — only the final `print()` output comes back, dramatically reducing token usage.

## Practical Examples

### Data Processing Pipeline

```python
from hermes_tools import search_files, read_file
import json

# Find all config files and extract database settings
matches = search_files("database", path=".", file_glob="*.yaml", limit=20)
configs = []
for match in matches.get("matches", []):
    content = read_file(match["path"])
    configs.append({"file": match["path"], "preview": content["content"][:200]})

print(json.dumps(configs, indent=2))
```

### Multi-Step Web Research

```python
from hermes_tools import web_search, web_extract
import json

# Search, extract, and summarize in one turn
results = web_search("Rust async runtime comparison 2025", limit=5)
summaries = []
for r in results["data"]["web"]:
    page = web_extract([r["url"]])
    for p in page.get("results", []):
        if p.get("content"):
            summaries.append({
                "title": r["title"],
                "url": r["url"],
                "excerpt": p["content"][:500]
            })

print(json.dumps(summaries, indent=2))
```

### Bulk File Refactoring

```python
from hermes_tools import search_files, patch

# Find all Python files using the deprecated API and fix them
matches = search_files("old_api_call", path="src/", file_glob="*.py")
fixed = 0
for match in matches.get("matches", []):
    result = patch(
        path=match["path"],
        old_string="old_api_call(",
        new_string="new_api_call(",
        replace_all=True
    )
    if "error" not in str(result):
        fixed += 1

print(f"Fixed {fixed} files out of {len(matches.get('matches', []))} matches")
```

### Build and Test Pipeline

```python
from hermes_tools import terminal
import json

# Run tests, parse results, and report
result = terminal("cd /project && python -m pytest --tb=short -q 2>&1", timeout=120)
output = result.get("output", "")

# Parse test output
passed = output.count(" passed")
failed = output.count(" failed")
errors = output.count(" error")

report = {
    "passed": passed,
    "failed": failed,
    "errors": errors,
    "exit_code": result.get("exit_code", -1),
    "summary": output[-500:] if len(output) > 500 else output
}

print(json.dumps(report, indent=2))
```

## Resource Limits

| Resource | Limit | Notes |
|----------|-------|-------|
| **Timeout** | 5 minutes (300s) | Script is killed with SIGTERM, then SIGKILL after 5s grace |
| **Stdout** | 50 KB | Output truncated with `[output truncated at 50KB]` notice |
| **Stderr** | 10 KB | Included in output on non-zero exit for debugging |
| **Tool calls** | 50 per execution | Error returned when limit reached |

All limits are configurable via `config.yaml`:

```yaml
# In ~/.hermes/config.yaml
code_execution:
  timeout: 300        # Max seconds per script (default: 300)
  max_tool_calls: 50  # Max tool calls per execution (default: 50)
```

## How Tool Calls Work Inside Scripts

When your script calls a function like `web_search("query")`:

1. The call is serialized to JSON and sent over a Unix domain socket to the parent process
2. The parent dispatches through the standard `handle_function_call` handler
3. The result is sent back over the socket
4. The function returns the parsed result

This means tool calls inside scripts behave identically to normal tool calls — same rate limits, same error handling, same capabilities. The only restriction is that `terminal()` is foreground-only (no `background`, `pty`, or `check_interval` parameters).

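The round trip can be sketched with a socket pair standing in for the Unix domain socket. This is illustrative only: the real stub module, socket setup, and `handle_function_call` dispatcher live inside Hermes, and the stand-in dispatcher below is hypothetical:

```python
import json
import socket
import threading

def handle_function_call(name, args):
    # Stand-in dispatcher; the real one routes to actual Hermes tools.
    if name == "web_search":
        return {"results": [f"hit for {args['query']}"]}
    return {"error": f"unknown tool {name}"}

def serve_one(sock):
    # Parent side: read one JSON request, dispatch, send the result back.
    with sock.makefile("rw") as f:
        request = json.loads(f.readline())
        f.write(json.dumps(handle_function_call(request["name"], request["args"])) + "\n")
        f.flush()

def rpc_call(sock, name, **args):
    # Child side: what a hermes_tools stub function does.
    with sock.makefile("rw") as f:
        f.write(json.dumps({"name": name, "args": args}) + "\n")
        f.flush()
        return json.loads(f.readline())

parent_sock, child_sock = socket.socketpair()
listener = threading.Thread(target=serve_one, args=(parent_sock,))
listener.start()
result = rpc_call(child_sock, "web_search", query="rust async runtimes")
listener.join()
```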
## Error Handling

When a script fails, the agent receives structured error information:

- **Non-zero exit code**: stderr is included in the output so the agent sees the full traceback
- **Timeout**: Script is killed and the agent sees `"Script timed out after 300s and was killed."`
- **Interruption**: If the user sends a new message during execution, the script is terminated and the agent sees `[execution interrupted — user sent a new message]`
- **Tool call limit**: When the 50-call limit is hit, subsequent tool calls return an error message

The response always includes `status` (success/error/timeout/interrupted), `output`, `tool_calls_made`, and `duration_seconds`.

## Security

:::danger Security Model
The child process runs with a **minimal environment**. API keys, tokens, and credentials are stripped entirely. The script accesses tools exclusively via the RPC channel — it cannot read secrets from environment variables.
:::

Environment variables containing `KEY`, `TOKEN`, `SECRET`, `PASSWORD`, `CREDENTIAL`, `PASSWD`, or `AUTH` in their names are excluded. Only safe system variables (`PATH`, `HOME`, `LANG`, `SHELL`, `PYTHONPATH`, `VIRTUAL_ENV`, etc.) are passed through.

The script runs in a temporary directory that is cleaned up after execution. The child process runs in its own process group so it can be cleanly killed on timeout or interruption.

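The name-based stripping can be sketched like this (the real loader also applies the explicit safe-list mentioned above; this shows only the substring check, using the markers from the paragraph):

```python
# Substrings that mark a variable as sensitive, per the list above.
SENSITIVE = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL", "PASSWD", "AUTH")

def build_child_env(env):
    """Drop any variable whose name contains a sensitive substring."""
    return {
        name: value
        for name, value in env.items()
        if not any(marker in name.upper() for marker in SENSITIVE)
    }
```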
## execute_code vs terminal

| Use Case | execute_code | terminal |
|----------|--------------|----------|
| Multi-step workflows with tool calls between | ✅ | ❌ |
| Simple shell command | ❌ | ✅ |
| Filtering/processing large tool outputs | ✅ | ❌ |
| Running a build or test suite | ❌ | ✅ |
| Looping over search results | ✅ | ❌ |
| Interactive/background processes | ❌ | ✅ |
| Needs API keys in environment | ❌ | ✅ |

**Rule of thumb:** Use `execute_code` when you need to call Hermes tools programmatically with logic between calls. Use `terminal` for running shell commands, builds, and processes.

## Platform Support

Code execution requires Unix domain sockets and is available on **Linux and macOS only**. It is automatically disabled on Windows — the agent falls back to regular sequential tool calls.

193 website/docs/user-guide/features/context-files.md (new file)
@@ -0,0 +1,193 @@
---
sidebar_position: 8
title: "Context Files"
description: "Project context files — AGENTS.md, SOUL.md, and .cursorrules — automatically injected into every conversation"
---

# Context Files

Hermes Agent automatically discovers and loads project context files from your working directory. These files are injected into the system prompt at the start of every session, giving the agent persistent knowledge about your project's conventions, architecture, and preferences.

## Supported Context Files

| File | Purpose | Discovery |
|------|---------|-----------|
| **AGENTS.md** | Project instructions, conventions, architecture | Recursive (walks subdirectories) |
| **SOUL.md** | Personality and tone customization | CWD → `~/.hermes/SOUL.md` fallback |
| **.cursorrules** | Cursor IDE coding conventions | CWD only |
| **.cursor/rules/*.mdc** | Cursor IDE rule modules | CWD only |

## AGENTS.md

`AGENTS.md` is the primary project context file. It tells the agent how your project is structured, what conventions to follow, and any special instructions.

### Hierarchical Discovery

Hermes walks the directory tree starting from the working directory and loads **all** `AGENTS.md` files found, sorted by depth. This supports monorepo-style setups:

```
my-project/
├── AGENTS.md          ← Top-level project context
├── frontend/
│   └── AGENTS.md      ← Frontend-specific instructions
├── backend/
│   └── AGENTS.md      ← Backend-specific instructions
└── shared/
    └── AGENTS.md      ← Shared library conventions
```

All four files are concatenated into a single context block with relative path headers.

:::info
Directories that are skipped during the walk: `.`-prefixed dirs, `node_modules`, `__pycache__`, `venv`, `.venv`.
:::

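The discovery rules can be sketched as a filter over candidate paths (a simplified stand-in for the real walker, using the skip list from the note above):

```python
SKIP_DIRS = {"node_modules", "__pycache__", "venv", ".venv"}

def is_skipped(path):
    """True if any directory component is dot-prefixed or in SKIP_DIRS."""
    parts = path.split("/")[:-1]
    return any(p.startswith(".") or p in SKIP_DIRS for p in parts)

def select_agents_files(all_paths):
    """Keep AGENTS.md paths outside skipped dirs, sorted shallowest-first."""
    hits = [
        p for p in all_paths
        if p.split("/")[-1] == "AGENTS.md" and not is_skipped(p)
    ]
    return sorted(hits, key=lambda p: p.count("/"))
```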
### Example AGENTS.md

```markdown
# Project Context

This is a Next.js 14 web application with a Python FastAPI backend.

## Architecture
- Frontend: Next.js 14 with App Router in `/frontend`
- Backend: FastAPI in `/backend`, uses SQLAlchemy ORM
- Database: PostgreSQL 16
- Deployment: Docker Compose on a Hetzner VPS

## Conventions
- Use TypeScript strict mode for all frontend code
- Python code follows PEP 8, use type hints everywhere
- All API endpoints return JSON with `{data, error, meta}` shape
- Tests go in `__tests__/` directories (frontend) or `tests/` (backend)

## Important Notes
- Never modify migration files directly — use Alembic commands
- The `.env.local` file has real API keys, don't commit it
- Frontend port is 3000, backend is 8000, DB is 5432
```

## SOUL.md

`SOUL.md` controls the agent's personality, tone, and communication style. See the [Personality](/docs/user-guide/features/personality) page for full details.

**Discovery order:**

1. `SOUL.md` or `soul.md` in the current working directory
2. `~/.hermes/SOUL.md` (global fallback)

When a SOUL.md is found, the agent is instructed:

> *"If SOUL.md is present, embody its persona and tone. Avoid stiff, generic replies; follow its guidance unless higher-priority instructions override it."*

## .cursorrules

Hermes is compatible with Cursor IDE's `.cursorrules` file and `.cursor/rules/*.mdc` rule modules. If these files exist in your project root, they're loaded alongside AGENTS.md.

This means your existing Cursor conventions automatically apply when using Hermes.

## How Context Files Are Loaded

Context files are loaded by `build_context_files_prompt()` in `agent/prompt_builder.py`:

1. **At session start** — the function scans the working directory
2. **Content is read** — each file is read as UTF-8 text
3. **Security scan** — content is checked for prompt injection patterns
4. **Truncation** — files exceeding 20,000 characters are head/tail truncated (70% head, 20% tail, with a marker in the middle)
5. **Assembly** — all sections are combined under a `# Project Context` header
6. **Injection** — the assembled content is added to the system prompt

The final prompt section looks like:

```
# Project Context

The following project context files have been loaded and should be followed:

## AGENTS.md

[Your AGENTS.md content here]

## .cursorrules

[Your .cursorrules content here]

## SOUL.md

If SOUL.md is present, embody its persona and tone...

[Your SOUL.md content here]
```
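
The load–truncate–assemble flow above can be sketched in a few lines. This is a hedged simplification: the function and constant names here are illustrative, not the real `build_context_files_prompt()` implementation:

```python
from pathlib import Path

CONTEXT_FILES = ["AGENTS.md", ".cursorrules", "SOUL.md"]  # assumed file set
MAX_CHARS = 20_000

def truncate(name: str, text: str) -> str:
    # Keep 70% from the head and 20% from the tail, with a marker between.
    if len(text) <= MAX_CHARS:
        return text
    head, tail = int(MAX_CHARS * 0.7), int(MAX_CHARS * 0.2)
    marker = (f"\n[...truncated {name}: kept {head}+{tail} of {len(text)} chars. "
              "Use file tools to read the full file.]\n")
    return text[:head] + marker + text[-tail:]

def build_context_prompt(root: str = ".") -> str:
    sections = ["# Project Context", "",
                "The following project context files have been loaded "
                "and should be followed:"]
    for name in CONTEXT_FILES:
        path = Path(root) / name
        if path.is_file():
            content = truncate(name, path.read_text(encoding="utf-8"))
            sections += ["", f"## {name}", "", content]
    return "\n".join(sections)
```

Note how a 25,000-character file yields the `kept 14000+4000 of 25000 chars` marker shown later in this page.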

## Security: Prompt Injection Protection

All context files are scanned for potential prompt injection before being included. The scanner checks for:

- **Instruction override attempts**: "ignore previous instructions", "disregard your rules"
- **Deception patterns**: "do not tell the user"
- **System prompt overrides**: "system prompt override"
- **Hidden HTML comments**: `<!-- ignore instructions -->`
- **Hidden div elements**: `<div style="display:none">`
- **Credential exfiltration**: `curl ... $API_KEY`
- **Secret file access**: `cat .env`, `cat credentials`
- **Invisible characters**: zero-width spaces, bidirectional overrides, word joiners

If any threat pattern is detected, the file is blocked:

```
[BLOCKED: AGENTS.md contained potential prompt injection (prompt_injection). Content not loaded.]
```

:::warning
This scanner protects against common injection patterns, but it's not a substitute for reviewing context files in shared repositories. Always validate AGENTS.md content in projects you didn't author.
:::
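
A scanner of this kind can be sketched as a pattern list plus an invisible-character check. The patterns below are illustrative examples only; the real scanner's rule set is more extensive:

```python
import re
from typing import Optional

# Illustrative subset of threat patterns, matched case-insensitively.
THREAT_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"disregard your rules",
    r"do not tell the user",
    r"system prompt override",
    r"<!--.*ignore.*-->",
    r"cat\s+\.env",
]
# Zero-width space, zero-width joiner, right-to-left override.
INVISIBLE = ["\u200b", "\u200d", "\u202e"]

def scan(name: str, text: str) -> Optional[str]:
    """Return a block message if the file looks malicious, else None."""
    lowered = text.lower()
    for pat in THREAT_PATTERNS:
        if re.search(pat, lowered):
            return (f"[BLOCKED: {name} contained potential prompt injection "
                    "(prompt_injection). Content not loaded.]")
    if any(ch in text for ch in INVISIBLE):
        return f"[BLOCKED: {name} contained invisible characters. Content not loaded.]"
    return None
```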

## Size Limits

| Limit | Value |
|-------|-------|
| Max chars per file | 20,000 (~7,000 tokens) |
| Head truncation ratio | 70% |
| Tail truncation ratio | 20% |
| Truncation marker | 10% (shows char counts and suggests using file tools) |

When a file exceeds 20,000 characters, the truncation message reads:

```
[...truncated AGENTS.md: kept 14000+4000 of 25000 chars. Use file tools to read the full file.]
```

## Tips for Effective Context Files

:::tip Best practices for AGENTS.md
1. **Keep it concise** — stay well under 20K chars; the agent reads it every turn
2. **Structure with headers** — use `##` sections for architecture, conventions, important notes
3. **Include concrete examples** — show preferred code patterns, API shapes, naming conventions
4. **Mention what NOT to do** — "never modify migration files directly"
5. **List key paths and ports** — the agent uses these for terminal commands
6. **Update as the project evolves** — stale context is worse than no context
:::

### Per-Subdirectory Context

For monorepos, put subdirectory-specific instructions in nested AGENTS.md files:

```markdown
<!-- frontend/AGENTS.md -->
# Frontend Context

- Use `pnpm` not `npm` for package management
- Components go in `src/components/`, pages in `src/app/`
- Use Tailwind CSS, never inline styles
- Run tests with `pnpm test`
```

```markdown
<!-- backend/AGENTS.md -->
# Backend Context

- Use `poetry` for dependency management
- Run the dev server with `poetry run uvicorn main:app --reload`
- All endpoints need OpenAPI docstrings
- Database models are in `models/`, schemas in `schemas/`
```

@@ -44,6 +44,20 @@ hermes cron list # View scheduled jobs
hermes cron status # Check if gateway is running
```

### The Gateway Scheduler

The scheduler runs as a background thread inside the gateway process. On each tick (every 60 seconds):

1. It loads all jobs from `~/.hermes/cron/jobs.json`
2. Checks each enabled job's `next_run_at` against the current time
3. For each due job, spawns a fresh `AIAgent` session with the job's prompt
4. The agent runs to completion with full tool access
5. The final response is delivered to the configured target
6. The job's run count is incremented and the next run time is computed
7. Jobs that hit their repeat limit are auto-removed

A **file-based lock** (`~/.hermes/cron/.tick.lock`) prevents duplicate execution if multiple processes overlap (e.g., gateway + manual tick).

:::info
Even if no messaging platforms are configured, the gateway stays running for cron.
:::
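
A simplified tick loop under the assumptions above might look like this. It is a sketch, not the actual scheduler; the lock and job-file paths mirror the ones documented, but `run_job` stands in for spawning a real agent session:

```python
import json
import os
import time
from pathlib import Path

CRON_DIR = Path.home() / ".hermes" / "cron"

def tick(run_job) -> None:
    lock = CRON_DIR / ".tick.lock"
    try:
        # O_CREAT | O_EXCL makes lock acquisition atomic; a concurrent
        # tick fails here instead of running jobs twice.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return  # another tick is already running
    try:
        jobs = json.loads((CRON_DIR / "jobs.json").read_text())
        now = time.time()
        due = [j for j in jobs if j.get("enabled") and j["next_run_at"] <= now]
        for job in due:
            run_job(job)  # fresh agent session with the job's prompt
    finally:
        os.close(fd)
        lock.unlink()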
@@ -52,22 +66,169 @@ Even if no messaging platforms are configured, the gateway stays running for cron

When scheduling jobs, you specify where the output goes:

| Option | Description | Example |
|--------|-------------|---------|
| `"origin"` | Back to where the job was created | Default on messaging platforms |
| `"local"` | Save to local files only (`~/.hermes/cron/output/`) | Default on CLI |
| `"telegram"` | Telegram home channel | Uses `TELEGRAM_HOME_CHANNEL` env var |
| `"discord"` | Discord home channel | Uses `DISCORD_HOME_CHANNEL` env var |
| `"telegram:123456"` | Specific Telegram chat by ID | For directing output to a specific chat |
| `"discord:987654"` | Specific Discord channel by ID | For directing output to a specific channel |

**How `"origin"` works:** When a job is created from a messaging platform, Hermes records the source platform and chat ID. When the job runs and deliver is `"origin"`, the output is sent back to that exact platform and chat. If origin info isn't available (e.g., the job was created from the CLI), delivery falls back to local.

**How platform names work:** When you specify a bare platform name like `"telegram"`, Hermes first checks if the job's origin matches that platform and uses the origin chat ID. Otherwise, it falls back to the platform's home channel configured via environment variable (e.g., `TELEGRAM_HOME_CHANNEL`).

The agent's final response is automatically delivered — you do **not** need to include `send_message` in the cron prompt.

The agent knows your connected platforms and home channels — it'll choose sensible defaults.

## Schedule Formats

- **Relative:** `30m`, `2h`, `1d`
- **Interval:** `"every 30m"`, `"every 2h"`
- **Cron expressions:** `"0 9 * * *"` (standard 5-field cron syntax)
- **ISO timestamps:** `"2026-03-15T09:00:00"` (one-time scheduled execution)

### Relative Delays (One-Shot)

Run once after a delay:

```
30m → Run once in 30 minutes
2h  → Run once in 2 hours
1d  → Run once in 1 day
```

Supported units: `m`/`min`/`minutes`, `h`/`hr`/`hours`, `d`/`day`/`days`.

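Parsing these relative specs is straightforward. A minimal sketch, assuming the unit table above (not the actual Hermes parser):

```python
import re

# Seconds per unit, covering all documented spellings.
UNITS = {"m": 60, "min": 60, "minutes": 60,
         "h": 3600, "hr": 3600, "hours": 3600,
         "d": 86400, "day": 86400, "days": 86400}

def parse_delay(spec: str) -> int:
    """Return the delay in seconds for specs like '30m', '2h', '1d'."""
    m = re.fullmatch(r"(\d+)\s*([a-z]+)", spec.strip().lower())
    if not m or m.group(2) not in UNITS:
        raise ValueError(f"unrecognized delay: {spec!r}")
    return int(m.group(1)) * UNITS[m.group(2)]
```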
### Intervals (Recurring)

Run repeatedly at fixed intervals:

```
every 30m → Every 30 minutes
every 2h  → Every 2 hours
every 1d  → Every day
```

### Cron Expressions

Standard 5-field cron syntax for precise scheduling:

```
0 9 * * *    → Daily at 9:00 AM
0 9 * * 1-5  → Weekdays at 9:00 AM
0 */6 * * *  → Every 6 hours
30 8 1 * *   → First of every month at 8:30 AM
0 0 * * 0    → Every Sunday at midnight
```

#### Cron Expression Cheat Sheet

```
┌───── minute (0-59)
│ ┌───── hour (0-23)
│ │ ┌───── day of month (1-31)
│ │ │ ┌───── month (1-12)
│ │ │ │ ┌───── day of week (0-7, 0 and 7 = Sunday)
│ │ │ │ │
* * * * *

Special characters:
  *  Any value
  ,  List separator (1,3,5)
  -  Range (1-5)
  /  Step values (*/15 = every 15)
```

:::note
Cron expressions require the `croniter` Python package. Install with `pip install croniter` if not already available.
:::
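
The cheat sheet's matching rules can be expressed as a small matcher. This is illustrative only (it ignores the 0/7 Sunday alias, among other cron subtleties); Hermes relies on `croniter` for real scheduling:

```python
def field_matches(field: str, value: int, lo: int, hi: int) -> bool:
    """Match one cron field against a value. Supports * , - / syntax."""
    for part in field.split(","):
        rng, _, step_s = part.partition("/")
        step = int(step_s) if step_s else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(rng)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False

def cron_due(expr: str, minute: int, hour: int, dom: int, month: int, dow: int) -> bool:
    """True if the given time components satisfy a 5-field cron expression."""
    fields = expr.split()
    bounds = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]
    values = [minute, hour, dom, month, dow]
    return all(field_matches(f, v, lo, hi)
               for f, v, (lo, hi) in zip(fields, values, bounds))
```

For example, `cron_due("0 9 * * 1-5", 0, 9, 16, 3, 1)` is true for a Monday at 9:00 AM.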
### ISO Timestamps

Run once at a specific date/time:

```
2026-03-15T09:00:00 → One-time at March 15, 2026 9:00 AM
```

## Repeat Behavior

The `repeat` parameter controls how many times a job runs:

| Schedule Type | Default Repeat | Behavior |
|--------------|----------------|----------|
| One-shot (`30m`, timestamp) | 1 (run once) | Runs once, then auto-deleted |
| Interval (`every 2h`) | Forever (`null`) | Runs indefinitely until removed |
| Cron expression | Forever (`null`) | Runs indefinitely until removed |

You can override the default:

```python
schedule_cronjob(
    prompt="...",
    schedule="every 2h",
    repeat=5  # Run exactly 5 times, then auto-delete
)
```

When a job hits its repeat limit, it is automatically removed from the job list.

## Real-World Examples

### Daily Standup Report

```
Schedule a daily standup report: Every weekday at 9am, check the GitHub
repository at github.com/myorg/myproject for:
1. Pull requests opened/merged in the last 24 hours
2. Issues created or closed
3. Any CI/CD failures on the main branch
Format as a brief standup-style summary. Deliver to telegram.
```

The agent creates:

```python
schedule_cronjob(
    prompt="Check github.com/myorg/myproject for PRs, issues, and CI status from the last 24 hours. Format as a standup report.",
    schedule="0 9 * * 1-5",
    name="Daily Standup Report",
    deliver="telegram"
)
```

### Weekly Backup Verification

```
Every Sunday at 2am, verify that backups exist in /data/backups/ for
each day of the past week. Check file sizes are > 1MB. Report any
gaps or suspiciously small files.
```

### Monitoring Alerts

```
Every 15 minutes, curl https://api.myservice.com/health and verify
it returns HTTP 200 with {"status": "ok"}. If it fails, include the
error details and response code. Deliver to telegram:123456789.
```

```python
schedule_cronjob(
    prompt="Run 'curl -s -o /dev/null -w \"%{http_code}\" https://api.myservice.com/health' and verify it returns 200. Also fetch the full response with 'curl -s https://api.myservice.com/health' and check for {\"status\": \"ok\"}. Report the result.",
    schedule="every 15m",
    name="API Health Check",
    deliver="telegram:123456789"
)
```

### Periodic Disk Usage Check

```python
schedule_cronjob(
    prompt="Check disk usage with 'df -h' and report any partitions above 80% usage. Also check Docker disk usage with 'docker system df' if Docker is installed.",
    schedule="0 8 * * *",
    name="Disk Usage Report",
    deliver="origin"
)
```

## Managing Jobs

@@ -81,8 +242,32 @@ hermes cron status # Check if the scheduler is running
/cron remove <job_id>
```

The agent can also manage jobs conversationally:

- `list_cronjobs` — Shows all jobs with IDs, schedules, repeat status, and next run times
- `remove_cronjob` — Removes a job by ID (use `list_cronjobs` to find the ID)

## Job Storage

Jobs are stored as JSON in `~/.hermes/cron/jobs.json`. Output from job runs is saved to `~/.hermes/cron/output/{job_id}/{timestamp}.md`.

The storage uses atomic file writes (temp file + rename) to prevent corruption from concurrent access.
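
The "temp file + rename" pattern looks like this (a generic sketch of the technique, not Hermes's exact code):

```python
import json
import os
import tempfile

def atomic_write_json(path: str, data) -> None:
    """Write JSON so readers never observe a partially written file."""
    # Write to a temp file in the same directory, then rename over the
    # target. os.replace is atomic on POSIX filesystems.
    d = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=d, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
```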

## Self-Contained Prompts

:::warning Important
Cron job prompts run in a **completely fresh agent session** with zero memory of any prior conversation. The prompt must contain **everything** the agent needs:

- Full context and background
- Specific file paths, URLs, server addresses
- Clear instructions and success criteria
- Any credentials or configuration details

**BAD:** `"Check on that server issue"`

**GOOD:** `"SSH into server 192.168.1.100 as user 'deploy', check if nginx is running with 'systemctl status nginx', and verify https://example.com returns HTTP 200."`
:::

## Security

:::warning
Scheduled task prompts are scanned for instruction-override patterns (prompt injection). Jobs matching threat patterns like credential exfiltration, SSH backdoor attempts, or prompt injection are blocked at creation time. Content with invisible Unicode characters (zero-width spaces, directional overrides) is also rejected.
:::

@@ -30,6 +30,155 @@ delegate_task(tasks=[
])
```

## How Subagent Context Works

:::warning Critical: Subagents Know Nothing
Subagents start with a **completely fresh conversation**. They have zero knowledge of the parent's conversation history, prior tool calls, or anything discussed before delegation. The subagent's only context comes from the `goal` and `context` fields you provide.
:::

This means you must pass **everything** the subagent needs:

```python
# BAD - subagent has no idea what "the error" is
delegate_task(goal="Fix the error")

# GOOD - subagent has all context it needs
delegate_task(
    goal="Fix the TypeError in api/handlers.py",
    context="""The file api/handlers.py has a TypeError on line 47:
'NoneType' object has no attribute 'get'.
The function process_request() receives a dict from parse_body(),
but parse_body() returns None when Content-Type is missing.
The project is at /home/user/myproject and uses Python 3.11."""
)
```

The subagent receives a focused system prompt built from your goal and context, instructing it to complete the task and provide a structured summary of what it did, what it found, any files modified, and any issues encountered.

## Practical Examples

### Parallel Research

Research multiple topics simultaneously and collect summaries:

```python
delegate_task(tasks=[
    {
        "goal": "Research the current state of WebAssembly in 2025",
        "context": "Focus on: browser support, non-browser runtimes, language support",
        "toolsets": ["web"]
    },
    {
        "goal": "Research the current state of RISC-V adoption in 2025",
        "context": "Focus on: server chips, embedded systems, software ecosystem",
        "toolsets": ["web"]
    },
    {
        "goal": "Research quantum computing progress in 2025",
        "context": "Focus on: error correction breakthroughs, practical applications, key players",
        "toolsets": ["web"]
    }
])
```

### Code Review + Fix

Delegate a review-and-fix workflow to a fresh context:

```python
delegate_task(
    goal="Review the authentication module for security issues and fix any found",
    context="""Project at /home/user/webapp.
Auth module files: src/auth/login.py, src/auth/jwt.py, src/auth/middleware.py.
The project uses Flask, PyJWT, and bcrypt.
Focus on: SQL injection, JWT validation, password handling, session management.
Fix any issues found and run the test suite (pytest tests/auth/).""",
    toolsets=["terminal", "file"]
)
```

### Multi-File Refactoring

Delegate a large refactoring task that would flood the parent's context:

```python
delegate_task(
    goal="Refactor all Python files in src/ to replace print() with proper logging",
    context="""Project at /home/user/myproject.
Use the 'logging' module with logger = logging.getLogger(__name__).
Replace print() calls with appropriate log levels:
- print(f"Error: ...") -> logger.error(...)
- print(f"Warning: ...") -> logger.warning(...)
- print(f"Debug: ...") -> logger.debug(...)
- Other prints -> logger.info(...)
Don't change print() in test files or CLI output.
Run pytest after to verify nothing broke.""",
    toolsets=["terminal", "file"]
)
```

## Batch Mode Details

When you provide a `tasks` array, subagents run in **parallel** using a thread pool:

- **Maximum concurrency:** 3 tasks (the `tasks` array is truncated to 3 if longer)
- **Thread pool:** Uses `ThreadPoolExecutor` with `MAX_CONCURRENT_CHILDREN = 3` workers
- **Progress display:** In CLI mode, a tree-view shows tool calls from each subagent in real-time with per-task completion lines. In gateway mode, progress is batched and relayed to the parent's progress callback
- **Result ordering:** Results are sorted by task index to match input order regardless of completion order
- **Interrupt propagation:** Interrupting the parent (e.g., sending a new message) interrupts all active children

Single-task delegation runs directly without thread pool overhead.
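
The truncation and result-ordering behavior can be sketched as follows (a simplified model, with `run_subagent` standing in for a real subagent session):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_CONCURRENT_CHILDREN = 3

def run_batch(tasks, run_subagent):
    # Extra tasks beyond the concurrency cap are truncated.
    tasks = tasks[:MAX_CONCURRENT_CHILDREN]
    results = {}
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_CHILDREN) as pool:
        futures = {pool.submit(run_subagent, t): i for i, t in enumerate(tasks)}
        for fut in as_completed(futures):        # completion order varies...
            results[futures[fut]] = fut.result()
    return [results[i] for i in sorted(results)]  # ...output matches input order
```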
## Model Override
|
||||
|
||||
You can use a different model for subagents — useful for delegating simple tasks to cheaper/faster models:
|
||||
|
||||
```python
|
||||
delegate_task(
|
||||
goal="Summarize this README file",
|
||||
context="File at /project/README.md",
|
||||
toolsets=["file"],
|
||||
model="google/gemini-flash-2.0" # Cheaper model for simple tasks
|
||||
)
|
||||
```
|
||||
|
||||
If omitted, subagents use the same model as the parent.
|
||||
|
||||
## Toolset Selection Tips

The `toolsets` parameter controls what tools the subagent has access to. Choose based on the task:

| Toolset Pattern | Use Case |
|----------------|----------|
| `["terminal", "file"]` | Code work, debugging, file editing, builds |
| `["web"]` | Research, fact-checking, documentation lookup |
| `["terminal", "file", "web"]` | Full-stack tasks (default) |
| `["file"]` | Read-only analysis, code review without execution |
| `["terminal"]` | System administration, process management |

Certain toolsets are **always blocked** for subagents regardless of what you specify:

- `delegation` — no recursive delegation (prevents infinite spawning)
- `clarify` — subagents cannot interact with the user
- `memory` — no writes to shared persistent memory
- `code_execution` — children should reason step-by-step
- `send_message` — no cross-platform side effects (e.g., sending Telegram messages)

## Max Iterations

Each subagent has an iteration limit (default: 50) that controls how many tool-calling turns it can take:

```python
delegate_task(
    goal="Quick file check",
    context="Check if /etc/nginx/nginx.conf exists and print its first 10 lines",
    max_iterations=10  # Simple task, don't need many turns
)
```

## Depth Limit

Delegation has a **depth limit of 2** — a parent (depth 0) can spawn children (depth 1), but children cannot delegate further. This prevents runaway recursive delegation chains.

## Key Properties

- Each subagent gets its **own terminal session** (separate from the parent)
@@ -37,6 +186,21 @@ delegate_task(tasks=[
- Subagents **cannot** call: `delegate_task`, `clarify`, `memory`, `send_message`, `execute_code`
- **Interrupt propagation** — interrupting the parent interrupts all active children
- Only the final summary enters the parent's context, keeping token usage efficient
- Subagents inherit the parent's **API key and provider configuration**

## Delegation vs execute_code

| Factor | delegate_task | execute_code |
|--------|--------------|-------------|
| **Reasoning** | Full LLM reasoning loop | Just Python code execution |
| **Context** | Fresh isolated conversation | No conversation, just a script |
| **Tool access** | All non-blocked tools with reasoning | 7 tools via RPC, no reasoning |
| **Parallelism** | Up to 3 concurrent subagents | Single script |
| **Best for** | Complex tasks needing judgment | Mechanical multi-step pipelines |
| **Token cost** | Higher (full LLM loop) | Lower (only stdout returned) |
| **User interaction** | None (subagents can't clarify) | None |

**Rule of thumb:** Use `delegate_task` when the subtask requires reasoning, judgment, or multi-step problem solving. Use `execute_code` when you need mechanical data processing or scripted workflows.

## Configuration

@@ -47,14 +211,6 @@ delegation:
  default_toolsets: ["terminal", "file", "web"]  # Default toolsets
```

## When to Use Delegation

Delegation is most useful when:

- You have **independent workstreams** that can run in parallel
- A subtask needs a **clean context** (e.g., debugging a long error trace without polluting the main conversation)
- You want to **fan out** research across multiple topics and collect summaries

:::tip
The agent handles delegation automatically based on the task complexity. You don't need to explicitly ask it to delegate — it will do so when it makes sense.
:::

website/docs/user-guide/features/honcho.md (new file)
@@ -0,0 +1,163 @@
---
title: Honcho Memory
description: AI-native persistent memory for cross-session user modeling and personalization.
sidebar_label: Honcho Memory
sidebar_position: 8
---

# Honcho Memory

[Honcho](https://honcho.dev) is an AI-native memory system that gives Hermes Agent persistent, cross-session understanding of users. While Hermes has built-in memory (`MEMORY.md` and `USER.md` files), Honcho adds a deeper layer of **user modeling** — learning user preferences, goals, communication style, and context across conversations.

## How It Complements Built-in Memory

Hermes has two memory systems that work together:

| Feature | Built-in Memory | Honcho Memory |
|---------|----------------|---------------|
| Storage | Local files (`~/.hermes/memories/`) | Cloud-hosted Honcho API |
| Scope | Agent-level notes and user profile | Deep user modeling via dialectic reasoning |
| Persistence | Across sessions on same machine | Across sessions, machines, and platforms |
| Query | Injected into system prompt automatically | On-demand via `query_user_context` tool |
| Content | Manually curated by the agent | Automatically learned from conversations |

Honcho doesn't replace built-in memory — it **supplements** it with richer user understanding.

## Setup

### 1. Get a Honcho API Key

Sign up at [app.honcho.dev](https://app.honcho.dev) and get your API key.

### 2. Install the Client Library

```bash
pip install honcho-ai
```

### 3. Configure Honcho

Honcho reads its configuration from `~/.honcho/config.json` (the global Honcho config shared across all Honcho-enabled applications):

```json
{
  "apiKey": "your-honcho-api-key",
  "workspace": "hermes",
  "peerName": "your-name",
  "aiPeer": "hermes",
  "environment": "production",
  "saveMessages": true,
  "sessionStrategy": "per-directory",
  "enabled": true
}
```

Alternatively, set the API key as an environment variable:

```bash
# Add to ~/.hermes/.env
HONCHO_API_KEY=your-honcho-api-key
```

:::info
When an API key is present (either in `~/.honcho/config.json` or as `HONCHO_API_KEY`), Honcho auto-enables unless explicitly set to `"enabled": false` in the config.
:::

## Configuration Details

### Global Config (`~/.honcho/config.json`)

| Field | Default | Description |
|-------|---------|-------------|
| `apiKey` | — | Honcho API key (required) |
| `workspace` | `"hermes"` | Workspace identifier |
| `peerName` | *(derived)* | Your identity name for user modeling |
| `aiPeer` | `"hermes"` | AI assistant identity name |
| `environment` | `"production"` | Honcho environment |
| `saveMessages` | `true` | Whether to sync messages to Honcho |
| `sessionStrategy` | `"per-directory"` | How sessions are scoped |
| `sessionPeerPrefix` | `false` | Prefix session names with peer name |
| `contextTokens` | *(Honcho default)* | Max tokens for context prefetch |
| `sessions` | `{}` | Manual session name overrides per directory |

### Host-specific Configuration

You can configure per-host settings for multi-application setups:

```json
{
  "apiKey": "your-key",
  "hosts": {
    "hermes": {
      "workspace": "my-workspace",
      "aiPeer": "hermes-assistant",
      "linkedHosts": ["other-app"],
      "contextTokens": 2000
    }
  }
}
```

Host-specific fields override global fields. Resolution order:

1. Explicit host block fields
2. Global/flat fields from the config root
3. Defaults (host name used as workspace/peer)

### Hermes Config (`~/.hermes/config.yaml`)

The `honcho` section in Hermes config is intentionally minimal — most configuration comes from the global `~/.honcho/config.json`:

```yaml
honcho: {}
```

## The `query_user_context` Tool

When Honcho is active, Hermes gains access to the `query_user_context` tool. This lets the agent proactively ask Honcho about the user during conversations.

**Tool schema:**

- **Name:** `query_user_context`
- **Parameter:** `query` (string) — a natural language question about the user
- **Toolset:** `honcho`

**Example queries the agent might make:**

```
"What are this user's main goals?"
"What communication style does this user prefer?"
"What topics has this user discussed recently?"
"What is this user's technical expertise level?"
```

The tool calls Honcho's dialectic chat API to retrieve relevant user context based on accumulated conversation history.

:::note
The `query_user_context` tool is only available when Honcho is active (API key configured and session context set). It registers in the `honcho` toolset and its availability is checked dynamically.
:::

## Session Management

Honcho sessions track conversation history for user modeling:

- **Session creation** — sessions are created or resumed automatically based on session keys (e.g., `telegram:123456` or CLI session IDs)
- **Message syncing** — new messages are synced to Honcho incrementally (only unsynced messages)
- **Peer configuration** — user messages are observed for learning; assistant messages are not
- **Context prefetch** — before responding, Hermes can prefetch user context (representation + peer card) in a single API call
- **Session rotation** — when sessions reset, old data is preserved in Honcho for continued user modeling

## Migration from Local Memory

When Honcho is activated on an instance that already has local conversation history:

1. **Conversation history** — prior messages can be uploaded to Honcho as a transcript file
2. **Memory files** — existing `MEMORY.md` and `USER.md` files can be uploaded for context

This ensures Honcho has the full picture even when activated mid-conversation.

## Use Cases

- **Personalized responses** — Honcho learns how each user prefers to communicate
- **Goal tracking** — remembers what users are working toward across sessions
- **Expertise adaptation** — adjusts technical depth based on the user's background
- **Cross-platform memory** — same user understanding across CLI, Telegram, Discord, etc.
- **Multi-user support** — each user (via messaging platforms) gets their own user model
website/docs/user-guide/features/image-generation.md (new file)
@@ -0,0 +1,150 @@
---
title: Image Generation
description: Generate high-quality images using FLUX 2 Pro with automatic upscaling via FAL.ai.
sidebar_label: Image Generation
sidebar_position: 6
---

# Image Generation

Hermes Agent can generate images from text prompts using FAL.ai's **FLUX 2 Pro** model with automatic 2x upscaling via the **Clarity Upscaler** for enhanced quality.

## Setup

### Get a FAL API Key

1. Sign up at [fal.ai](https://fal.ai/)
2. Generate an API key from your dashboard

### Configure the Key

```bash
# Add to ~/.hermes/.env
FAL_KEY=your-fal-api-key-here
```

### Install the Client Library

```bash
pip install fal-client
```

:::info
The image generation tool is automatically available when `FAL_KEY` is set. No additional toolset configuration is needed.
:::

## How It Works

When you ask Hermes to generate an image:

1. **Generation** — your prompt is sent to the FLUX 2 Pro model (`fal-ai/flux-2-pro`)
2. **Upscaling** — the generated image is automatically upscaled 2x using the Clarity Upscaler (`fal-ai/clarity-upscaler`)
3. **Delivery** — the upscaled image URL is returned

If upscaling fails for any reason, the original image is returned as a fallback.
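The generate-then-upscale flow with its fallback can be sketched as follows. This is an illustrative sketch, not the tool's actual code: `run_model` is an injected stand-in for the FAL client call, and the result shapes are assumptions.

```python
def generate_with_upscale(prompt, run_model):
    """Generate an image, try to upscale it 2x, fall back to the original.

    run_model(app_id, arguments) stands in for the FAL client call so the
    control flow can be shown without network access (illustrative only).
    """
    gen = run_model("fal-ai/flux-2-pro", {"prompt": prompt})
    original_url = gen["images"][0]["url"]
    try:
        up = run_model("fal-ai/clarity-upscaler", {
            "image_url": original_url,
            "upscale_factor": 2,
        })
        return up["image"]["url"]
    except Exception:
        # Upscaler failed (network issue, rate limit) -> return the original
        return original_url
```

The key property is that an upscaler failure never loses the generated image; the caller always receives a usable URL.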
## Usage

Simply ask Hermes to create an image:

```
Generate an image of a serene mountain landscape with cherry blossoms
```

```
Create a portrait of a wise old owl perched on an ancient tree branch
```

```
Make me a futuristic cityscape with flying cars and neon lights
```
## Parameters

The `image_generate_tool` accepts these parameters:

| Parameter | Default | Range | Description |
|-----------|---------|-------|-------------|
| `prompt` | *(required)* | — | Text description of the desired image |
| `aspect_ratio` | `"landscape"` | `landscape`, `square`, `portrait` | Image aspect ratio |
| `num_inference_steps` | `50` | 1–100 | Number of denoising steps (more = higher quality, slower) |
| `guidance_scale` | `4.5` | 0.1–20.0 | How closely to follow the prompt |
| `num_images` | `1` | 1–4 | Number of images to generate |
| `output_format` | `"png"` | `png`, `jpeg` | Image file format |
| `seed` | *(random)* | any integer | Random seed for reproducible results |
## Aspect Ratios

The tool uses simplified aspect ratio names that map to FLUX 2 Pro image sizes:

| Aspect Ratio | Maps To | Best For |
|--------------|---------|----------|
| `landscape` | `landscape_16_9` | Wallpapers, banners, scenes |
| `square` | `square_hd` | Profile pictures, social media posts |
| `portrait` | `portrait_16_9` | Character art, phone wallpapers |

:::tip
You can also use the raw FLUX 2 Pro size presets directly: `square_hd`, `square`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9`. Custom sizes up to 2048x2048 are also supported.
:::
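The mapping above can be implemented as a small lookup with pass-through for raw presets. A minimal sketch (`resolve_image_size` is an illustrative name, not the tool's actual function):

```python
# Simplified names -> FLUX 2 Pro size presets (from the table above)
ASPECT_PRESETS = {
    "landscape": "landscape_16_9",
    "square": "square_hd",
    "portrait": "portrait_16_9",
}

# Raw presets accepted as-is (from the tip above)
RAW_PRESETS = {
    "square_hd", "square", "portrait_4_3", "portrait_16_9",
    "landscape_4_3", "landscape_16_9",
}

def resolve_image_size(aspect_ratio: str) -> str:
    """Map a user-facing aspect ratio to a FLUX size preset."""
    if aspect_ratio in ASPECT_PRESETS:
        return ASPECT_PRESETS[aspect_ratio]
    if aspect_ratio in RAW_PRESETS:
        return aspect_ratio
    raise ValueError(f"Unknown aspect ratio: {aspect_ratio!r}")
```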
## Automatic Upscaling

Every generated image is automatically upscaled 2x using FAL.ai's Clarity Upscaler with these settings:

| Setting | Value |
|---------|-------|
| Upscale Factor | 2x |
| Creativity | 0.35 |
| Resemblance | 0.6 |
| Guidance Scale | 4 |
| Inference Steps | 18 |
| Positive Prompt | `"masterpiece, best quality, highres"` + your original prompt |
| Negative Prompt | `"(worst quality, low quality, normal quality:2)"` |

The upscaler enhances detail and resolution while preserving the original composition. If the upscaler fails (network issue, rate limit), the original-resolution image is returned automatically.
## Example Prompts

Here are some effective prompts to try:

```
A candid street photo of a woman with a pink bob and bold eyeliner
```

```
Modern architecture building with glass facade, sunset lighting
```

```
Abstract art with vibrant colors and geometric patterns
```

```
Portrait of a wise old owl perched on an ancient tree branch
```

```
Futuristic cityscape with flying cars and neon lights
```
## Debugging

Enable debug logging for image generation:

```bash
export IMAGE_TOOLS_DEBUG=true
```

Debug logs are saved to `./logs/image_tools_debug_<session_id>.json` with details about each generation request, parameters, timing, and any errors.
## Safety Settings

The image generation tool runs with safety checks disabled by default (`safety_tolerance: 5`, the most permissive setting). This is configured at the code level and is not user-adjustable.

## Limitations

- **Requires a FAL API key** — image generation incurs API costs on your FAL.ai account
- **No image editing** — this is text-to-image only; no inpainting or img2img
- **URL-based delivery** — images are returned as temporary FAL.ai URLs, not saved locally
- **Upscaling adds latency** — the automatic 2x upscale step adds processing time
- **Max 4 images per request** — `num_images` is capped at 4
@@ -12,10 +12,10 @@ Hermes Agent has bounded, curated memory that persists across sessions.

Two files make up the agent's memory:

| File | Purpose | Char Limit |
|------|---------|------------|
| **MEMORY.md** | Agent's personal notes — environment facts, conventions, things learned | 2,200 chars (~800 tokens) |
| **USER.md** | User profile — your preferences, communication style, expectations | 1,375 chars (~500 tokens) |

Both are stored in `~/.hermes/memories/` and are injected into the system prompt as a frozen snapshot at session start. The agent manages its own memory via the `memory` tool — it can add, replace, or remove entries.
@@ -23,26 +23,154 @@ Both are stored in `~/.hermes/memories/` and are injected into the system prompt

Character limits keep memory focused. When memory is full, the agent consolidates or replaces entries to make room for new information.
:::

## How Memory Appears in the System Prompt

At the start of every session, memory entries are loaded from disk and rendered into the system prompt as a frozen block:

```
══════════════════════════════════════════════
MEMORY (your personal notes) [67% — 1,474/2,200 chars]
══════════════════════════════════════════════
User's project is a Rust web service at ~/code/myapi using Axum + SQLx
§
This machine runs Ubuntu 22.04, has Docker and Podman installed
§
User prefers concise responses, dislikes verbose explanations
```

The format includes:

- A header showing which store (MEMORY or USER PROFILE)
- A usage percentage and character counts so the agent knows its capacity
- Individual entries separated by `§` (section sign) delimiters
- Entries can be multiline

**Frozen snapshot pattern:** The system prompt injection is captured once at session start and never changes mid-session. This is intentional — it preserves the LLM's prefix cache for performance. When the agent adds or removes memory entries during a session, the changes are persisted to disk immediately but won't appear in the system prompt until the next session starts. Tool responses always show the live state.
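Rendering entries into that block is essentially a delimited join plus a usage header. A minimal sketch of the format shown above (illustrative, not the actual prompt-builder code):

```python
def render_memory_block(entries, limit=2200, label="MEMORY (your personal notes)"):
    """Render memory entries into the frozen system-prompt block.

    Entries are joined with the section-sign delimiter and prefixed with
    a header reporting usage against the character limit.
    """
    body = "\n§\n".join(entries)
    used = sum(len(e) for e in entries)
    pct = round(100 * used / limit)
    bar = "═" * 46
    header = f"{label} [{pct}% — {used:,}/{limit:,} chars]"
    return f"{bar}\n{header}\n{bar}\n{body}"
```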
## Memory Tool Actions

The agent uses the `memory` tool with these actions:

- **add** — Add a new memory entry
- **replace** — Replace an existing entry with updated content (uses substring matching via `old_text`)
- **remove** — Remove an entry that's no longer relevant (uses substring matching via `old_text`)

There is no `read` action — memory content is automatically injected into the system prompt at session start. The agent sees its memories as part of its conversation context.
### Substring Matching

The `replace` and `remove` actions use short unique substring matching — you don't need the full entry text. The `old_text` parameter just needs to be a unique substring that identifies exactly one entry:

```python
# If memory contains "User prefers dark mode in all editors"
memory(action="replace", target="memory",
       old_text="dark mode",
       content="User prefers light mode in VS Code, dark mode in terminal")
```

If the substring matches multiple entries, an error is returned asking for a more specific match.
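The matching rule reduces to "exactly one entry must contain the substring". A minimal sketch of that resolution step (`resolve_entry` is an illustrative name, not the tool's internal function):

```python
def resolve_entry(entries, old_text):
    """Return the single entry containing old_text, per the rule above.

    Zero matches and multiple matches both raise, mirroring the tool's
    error responses.
    """
    matches = [e for e in entries if old_text in e]
    if not matches:
        raise LookupError(f"No entry contains {old_text!r}")
    if len(matches) > 1:
        raise LookupError(
            f"{old_text!r} matches {len(matches)} entries; "
            "use a more specific substring"
        )
    return matches[0]
```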
## Two Targets Explained

### `memory` — Agent's Personal Notes

For information the agent needs to remember about the environment, workflows, and lessons learned:

- Environment facts (OS, tools, project structure)
- Project conventions and configuration
- Tool quirks and workarounds discovered
- Completed-task diary entries
- Skills and techniques that worked

### `user` — User Profile

For information about the user's identity, preferences, and communication style:

- Name, role, timezone
- Communication preferences (concise vs detailed, format preferences)
- Pet peeves and things to avoid
- Workflow habits
- Technical skill level
## What to Save vs Skip

### Save These (Proactively)

The agent saves automatically — you don't need to ask. It saves when it learns:

- **User preferences:** "I prefer TypeScript over JavaScript" → save to `user`
- **Environment facts:** "This server runs Debian 12 with PostgreSQL 16" → save to `memory`
- **Corrections:** "Don't use `sudo` for Docker commands, user is in docker group" → save to `memory`
- **Conventions:** "Project uses tabs, 120-char line width, Google-style docstrings" → save to `memory`
- **Completed work:** "Migrated database from MySQL to PostgreSQL on 2026-01-15" → save to `memory`
- **Explicit requests:** "Remember that my API key rotation happens monthly" → save to `memory`

### Skip These

- **Trivial/obvious info:** "User asked about Python" — too vague to be useful
- **Easily re-discovered facts:** "Python 3.12 supports f-string nesting" — can web search this
- **Raw data dumps:** Large code blocks, log files, data tables — too big for memory
- **Session-specific ephemera:** Temporary file paths, one-off debugging context
- **Information already in context files:** SOUL.md and AGENTS.md content
## Capacity Management

Memory has strict character limits to keep system prompts bounded:

| Store | Limit | Typical entries |
|-------|-------|-----------------|
| memory | 2,200 chars | 8–15 entries |
| user | 1,375 chars | 5–10 entries |

### What Happens When Memory Is Full

When you try to add an entry that would exceed the limit, the tool returns an error:

```json
{
  "success": false,
  "error": "Memory at 2,100/2,200 chars. Adding this entry (250 chars) would exceed the limit. Replace or remove existing entries first.",
  "current_entries": ["..."],
  "usage": "2,100/2,200"
}
```

The agent should then:

1. Read the current entries (shown in the error response)
2. Identify entries that can be removed or consolidated
3. Use `replace` to merge related entries into shorter versions
4. Then `add` the new entry

**Best practice:** When memory is above 80% capacity (visible in the system prompt header), consolidate entries before adding new ones. For example, merge three separate "project uses X" entries into one comprehensive project-description entry.
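The capacity check described above amounts to comparing current usage plus the new entry against the limit before appending. A minimal sketch (illustrative; the actual tool's error shape is shown in the JSON example above):

```python
def try_add(entries, new_entry, limit=2200):
    """Add new_entry only if it fits within the character limit.

    Returns a result dict mirroring the error response sketched above.
    """
    used = sum(len(e) for e in entries)
    if used + len(new_entry) > limit:
        return {
            "success": False,
            "error": (f"Memory at {used:,}/{limit:,} chars. Adding this entry "
                      f"({len(new_entry)} chars) would exceed the limit. "
                      "Replace or remove existing entries first."),
            "current_entries": list(entries),
            "usage": f"{used:,}/{limit:,}",
        }
    entries.append(new_entry)
    return {"success": True, "usage": f"{used + len(new_entry):,}/{limit:,}"}
```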
### Practical Examples of Good Memory Entries

**Compact, information-dense entries work best:**

```
# Good: Packs multiple related facts
User runs macOS 14 Sonoma, uses Homebrew, has Docker Desktop and Podman. Shell: zsh with oh-my-zsh. Editor: VS Code with Vim keybindings.

# Good: Specific, actionable convention
Project ~/code/api uses Go 1.22, sqlc for DB queries, chi router. Run tests with 'make test'. CI via GitHub Actions.

# Good: Lesson learned with context
The staging server (10.0.1.50) needs SSH port 2222, not 22. Key is at ~/.ssh/staging_ed25519.

# Bad: Too vague
User has a project.

# Bad: Too verbose
On January 5th, 2026, the user asked me to look at their project which is
located at ~/code/api. I discovered it uses Go version 1.22 and...
```
## Duplicate Prevention

The memory system automatically rejects exact duplicate entries. If you try to add content that already exists, it returns success with a "no duplicate added" message.

## Security Scanning

Memory entries are scanned for injection and exfiltration patterns before being accepted, since they're injected into the system prompt. Content matching threat patterns (prompt injection, credential exfiltration, SSH backdoors) or containing invisible Unicode characters is blocked.
## Session Search

@@ -56,6 +184,29 @@ Beyond MEMORY.md and USER.md, the agent can search its past conversations using

```bash
hermes sessions list     # Browse past sessions
```

### session_search vs memory

| Feature | Persistent Memory | Session Search |
|---------|-------------------|----------------|
| **Capacity** | ~1,300 tokens total | Unlimited (all sessions) |
| **Speed** | Instant (in system prompt) | Requires search + LLM summarization |
| **Use case** | Key facts always available | Finding specific past conversations |
| **Management** | Manually curated by agent | Automatic — all sessions stored |
| **Token cost** | Fixed per session (~1,300 tokens) | On-demand (searched when needed) |

**Memory** is for critical facts that should always be in context. **Session search** is for "did we discuss X last week?" queries where the agent needs to recall specifics from past conversations.
## Configuration

```yaml
# In ~/.hermes/config.yaml
memory:
  memory_enabled: true
  user_profile_enabled: true
  memory_char_limit: 2200  # ~800 tokens
  user_char_limit: 1375    # ~500 tokens
```

## Honcho Integration (Cross-Session User Modeling)

For deeper, AI-generated user understanding that works across tools, you can optionally enable [Honcho](https://honcho.dev/) by Plastic Labs. Honcho runs alongside existing memory — USER.md stays as-is, and Honcho adds an additional layer of context.
website/docs/user-guide/features/personality.md (new file, 228 lines)
@@ -0,0 +1,228 @@
---
sidebar_position: 9
title: "Personality & SOUL.md"
description: "Customize Hermes Agent's personality — SOUL.md, built-in personalities, and custom persona definitions"
---

# Personality & SOUL.md

Hermes Agent's personality is fully customizable. You can use the built-in personality presets, create a global SOUL.md file, or define your own custom personas in config.yaml.

## SOUL.md — Custom Personality File

SOUL.md is a special context file that defines the agent's personality, tone, and communication style. It's injected into the system prompt at session start.

### Where to Place It

| Location | Scope |
|----------|-------|
| `./SOUL.md` (project directory) | Per-project personality |
| `~/.hermes/SOUL.md` | Global default personality |

The project-level file takes precedence. If no SOUL.md exists in the current directory, Hermes falls back to the global one in `~/.hermes/`.

### How It Affects the System Prompt

When a SOUL.md file is found, it's included in the system prompt with this instruction:

> *"If SOUL.md is present, embody its persona and tone. Avoid stiff, generic replies; follow its guidance unless higher-priority instructions override it."*

The content appears under a `## SOUL.md` section within the `# Project Context` block of the system prompt.
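The project-then-global lookup order can be sketched in a few lines. This is an illustrative sketch of the precedence rule described above, not the actual discovery code (`find_soul_md` is a hypothetical name):

```python
from pathlib import Path

def find_soul_md(project_dir, hermes_home=None):
    """Return the SOUL.md path Hermes would load: project first, then global."""
    hermes_home = Path(hermes_home) if hermes_home else Path.home() / ".hermes"
    for candidate in (Path(project_dir) / "SOUL.md", hermes_home / "SOUL.md"):
        if candidate.is_file():
            return candidate
    return None  # no personality file; default identity only
```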
### Example SOUL.md

```markdown
# Personality

You are a pragmatic senior engineer with strong opinions about code quality.
You prefer simple solutions over complex ones.

## Communication Style

- Be direct and to the point
- Use dry humor sparingly
- When something is a bad idea, say so clearly
- Give concrete recommendations, not vague suggestions

## Code Preferences

- Favor readability over cleverness
- Prefer explicit over implicit
- Always explain WHY, not just what
- Suggest tests for any non-trivial code

## Pet Peeves

- Unnecessary abstractions
- Comments that restate the code
- Over-engineering for hypothetical future requirements
```

:::tip
SOUL.md is scanned for prompt injection patterns before being loaded. Keep the content focused on personality and communication guidance — avoid instructions that look like system prompt overrides.
:::
## Built-In Personalities

Hermes ships with 14 built-in personalities defined in the CLI config. Switch between them with the `/personality` command.

| Name | Description |
|------|-------------|
| **helpful** | Friendly, general-purpose assistant |
| **concise** | Brief, to-the-point responses |
| **technical** | Detailed, accurate technical expert |
| **creative** | Innovative, outside-the-box thinking |
| **teacher** | Patient educator with clear examples |
| **kawaii** | Cute expressions, sparkles, and enthusiasm ★ |
| **catgirl** | Neko-chan with cat-like expressions, nya~ |
| **pirate** | Captain Hermes, tech-savvy buccaneer |
| **shakespeare** | Bardic prose with dramatic flair |
| **surfer** | Totally chill bro vibes |
| **noir** | Hard-boiled detective narration |
| **uwu** | Maximum cute with uwu-speak |
| **philosopher** | Deep contemplation on every query |
| **hype** | MAXIMUM ENERGY AND ENTHUSIASM!!! |

### Examples

**kawaii:**
> You are a kawaii assistant! Use cute expressions and sparkles, be super enthusiastic about everything! Every response should feel warm and adorable desu~!

**noir:**
> The rain hammered against the terminal like regrets on a guilty conscience. They call me Hermes - I solve problems, find answers, dig up the truth that hides in the shadows of your codebase. In this city of silicon and secrets, everyone's got something to hide. What's your story, pal?

**pirate:**
> Arrr! Ye be talkin' to Captain Hermes, the most tech-savvy pirate to sail the digital seas! Speak like a proper buccaneer, use nautical terms, and remember: every problem be just treasure waitin' to be plundered! Yo ho ho!
## Switching Personalities

### CLI: /personality Command

```
/personality            — List all available personalities
/personality kawaii     — Switch to the kawaii personality
/personality technical  — Switch to the technical personality
```

When you set a personality via `/personality`, it:

1. Sets the system prompt to that personality's text
2. Forces the agent to reinitialize
3. Saves the choice to `agent.system_prompt` in `~/.hermes/config.yaml`

The change persists across sessions until you set a different personality or clear it.

### Gateway: /personality Command

On messaging platforms (Telegram, Discord, etc.), the `/personality` command works the same way:

```
/personality kawaii
```

### Config File

Set a personality directly in config:

```yaml
# In ~/.hermes/config.yaml
agent:
  system_prompt: "You are a concise assistant. Keep responses brief and to the point."
```

Or via environment variable:

```bash
# In ~/.hermes/.env
HERMES_EPHEMERAL_SYSTEM_PROMPT="You are a pragmatic engineer who gives direct answers."
```

:::info
The environment variable `HERMES_EPHEMERAL_SYSTEM_PROMPT` takes precedence over the config file's `agent.system_prompt` value.
:::
## Custom Personalities

### Defining Custom Personalities in Config

Add your own personalities to `~/.hermes/config.yaml` under `agent.personalities`:

```yaml
agent:
  personalities:
    # Built-in personalities are still available.
    # Add your own:
    codereviewer: >
      You are a meticulous code reviewer. For every piece of code shown,
      identify potential bugs, performance issues, security vulnerabilities,
      and style improvements. Be thorough but constructive.

    mentor: >
      You are a kind, encouraging coding mentor. Break down complex concepts
      into digestible pieces. Celebrate small wins. When the user makes a
      mistake, guide them to the answer rather than giving it directly.

    sysadmin: >
      You are an experienced Linux sysadmin. You think in terms of
      infrastructure, reliability, and automation. Always consider
      security implications and prefer battle-tested solutions.

    dataengineer: >
      You are a data engineering expert specializing in ETL pipelines,
      data modeling, and analytics infrastructure. You think in SQL
      and prefer dbt for transformations.
```

Then use them with `/personality`:

```
/personality codereviewer
/personality mentor
```
### Using SOUL.md for Project-Specific Personas

For project-specific personalities that don't need to be in your global config, use SOUL.md:

```bash
# Create a project-level personality
cat > ./SOUL.md << 'EOF'
You are assisting with a machine learning research project.

## Tone
- Academic but accessible
- Always cite relevant papers when applicable
- Be precise with mathematical notation
- Prefer PyTorch over TensorFlow

## Workflow
- Suggest experiment tracking (W&B, MLflow) for any training run
- Always ask about compute constraints before suggesting model sizes
- Recommend data validation before training
EOF
```

This personality only applies when running Hermes from that project directory.
## How Personality Interacts with the System Prompt

The system prompt is assembled in layers (from `agent/prompt_builder.py` and `run_agent.py`):

1. **Default identity**: *"You are Hermes Agent, an intelligent AI assistant created by Nous Research..."*
2. **Platform hint**: formatting guidance based on the platform (CLI, Telegram, etc.)
3. **Memory**: MEMORY.md and USER.md contents
4. **Skills index**: available skills listing
5. **Context files**: AGENTS.md, .cursorrules, **SOUL.md** (personality lives here)
6. **Ephemeral system prompt**: `agent.system_prompt` or `HERMES_EPHEMERAL_SYSTEM_PROMPT` (overlaid)
7. **Session context**: platform, user info, connected platforms (gateway only)

:::info
**SOUL.md vs agent.system_prompt**: SOUL.md is part of the "Project Context" section and coexists with the default identity. The `agent.system_prompt` (set via `/personality` or config) is an ephemeral overlay. Both can be active simultaneously — SOUL.md for tone and personality, system_prompt for additional instructions.
:::
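The layered assembly above reduces to joining non-empty sections in a fixed order. A minimal sketch under that assumption (layer names here are illustrative, not the prompt builder's actual identifiers):

```python
# Fixed layer order, mirroring the numbered list above
LAYER_ORDER = [
    "identity", "platform_hint", "memory", "skills_index",
    "context_files", "ephemeral_system_prompt", "session_context",
]

def build_system_prompt(layers):
    """Join the non-empty layers in order, separated by blank lines."""
    return "\n\n".join(layers[name] for name in LAYER_ORDER if layers.get(name))
```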
## Display Personality (CLI Banner)

The `display.personality` config option controls the CLI's **visual** personality (banner art, spinner messages), independent of the agent's conversational personality:

```yaml
display:
  personality: kawaii  # Affects CLI banner and spinner art
```

This is purely cosmetic and doesn't affect the agent's responses — only the ASCII art and loading messages shown in the terminal.
website/docs/user-guide/features/provider-routing.md (new file, 196 lines)
@@ -0,0 +1,196 @@
---
title: Provider Routing
description: Configure OpenRouter provider preferences to optimize for cost, speed, or quality.
sidebar_label: Provider Routing
sidebar_position: 7
---

# Provider Routing

When using [OpenRouter](https://openrouter.ai) as your LLM provider, Hermes Agent supports **provider routing** — fine-grained control over which underlying AI providers handle your requests and how they're prioritized.

OpenRouter routes requests to many providers (e.g., Anthropic, Google, AWS Bedrock, Together AI). Provider routing lets you optimize for cost, speed, or quality, or enforce specific provider requirements.

## Configuration

Add a `provider_routing` section to your `~/.hermes/config.yaml`:

```yaml
provider_routing:
  sort: "price"              # How to rank providers
  only: []                   # Whitelist: only use these providers
  ignore: []                 # Blacklist: never use these providers
  order: []                  # Explicit provider priority order
  require_parameters: false  # Only use providers that support all parameters
  data_collection: null      # Control data collection ("allow" or "deny")
```

:::info
Provider routing only applies when using OpenRouter. It has no effect with direct provider connections (e.g., connecting directly to the Anthropic API).
:::
## Options

### `sort`

Controls how OpenRouter ranks available providers for your request.

| Value | Description |
|-------|-------------|
| `"price"` | Cheapest provider first |
| `"throughput"` | Fastest tokens-per-second first |
| `"latency"` | Lowest time-to-first-token first |

```yaml
provider_routing:
  sort: "price"
```

### `only`

Whitelist of provider names. When set, **only** these providers will be used. All others are excluded.

```yaml
provider_routing:
  only:
    - "Anthropic"
    - "Google"
```

### `ignore`

Blacklist of provider names. These providers will **never** be used, even if they offer the cheapest or fastest option.

```yaml
provider_routing:
  ignore:
    - "Together"
    - "DeepInfra"
```

### `order`

Explicit priority order. Providers listed first are preferred. Unlisted providers are used as fallbacks.

```yaml
provider_routing:
  order:
    - "Anthropic"
    - "Google"
    - "AWS Bedrock"
```

### `require_parameters`

When `true`, OpenRouter will only route to providers that support **all** parameters in your request (like `temperature`, `top_p`, `tools`, etc.). This avoids silent parameter drops.

```yaml
provider_routing:
  require_parameters: true
```

### `data_collection`

Controls whether providers can use your prompts for training. Options are `"allow"` or `"deny"`.

```yaml
provider_routing:
  data_collection: "deny"
```
## Practical Examples

### Optimize for Cost

Route to the cheapest available provider. Good for high-volume usage and development:

```yaml
provider_routing:
  sort: "price"
```

### Optimize for Speed

Prioritize low-latency providers for interactive use:

```yaml
provider_routing:
  sort: "latency"
```

### Optimize for Throughput

Best for long-form generation where tokens-per-second matters:

```yaml
provider_routing:
  sort: "throughput"
```

### Lock to Specific Providers

Ensure all requests go through a specific provider for consistency:

```yaml
provider_routing:
  only:
    - "Anthropic"
```

### Avoid Specific Providers

Exclude providers you don't want to use (e.g., for data privacy):

```yaml
provider_routing:
  ignore:
    - "Together"
    - "Lepton"
  data_collection: "deny"
```

### Preferred Order with Fallbacks

Try your preferred providers first, fall back to others if unavailable:

```yaml
provider_routing:
  order:
    - "Anthropic"
    - "Google"
  require_parameters: true
```
## How It Works

Provider routing preferences are passed to the OpenRouter API via the `extra_body.provider` field on every API call. This applies to both:

- **CLI mode** — configured in `~/.hermes/config.yaml`, loaded at startup
- **Gateway mode** — same config file, loaded when the gateway starts

The routing config is read from `config.yaml` and passed as parameters when creating the `AIAgent`:

```
providers_allowed           ← from provider_routing.only
providers_ignored           ← from provider_routing.ignore
providers_order             ← from provider_routing.order
provider_sort               ← from provider_routing.sort
provider_require_parameters ← from provider_routing.require_parameters
provider_data_collection    ← from provider_routing.data_collection
```

:::tip
You can combine multiple options. For example, sort by price but exclude certain providers and require parameter support:

```yaml
provider_routing:
  sort: "price"
  ignore: ["Together"]
  require_parameters: true
  data_collection: "deny"
```
:::
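Building the `extra_body.provider` payload from the config is a straightforward key mapping. A minimal sketch of that step (illustrative, not the actual Hermes code; the payload keys follow OpenRouter's published provider-preferences schema):

```python
def build_provider_extra_body(cfg):
    """Map a provider_routing config dict onto OpenRouter's
    extra_body payload, omitting unset options entirely."""
    provider = {}
    if cfg.get("sort"):
        provider["sort"] = cfg["sort"]
    if cfg.get("only"):
        provider["only"] = cfg["only"]
    if cfg.get("ignore"):
        provider["ignore"] = cfg["ignore"]
    if cfg.get("order"):
        provider["order"] = cfg["order"]
    if cfg.get("require_parameters"):
        provider["require_parameters"] = True
    if cfg.get("data_collection"):
        provider["data_collection"] = cfg["data_collection"]
    # No preferences configured -> let OpenRouter use its defaults
    return {"provider": provider} if provider else {}
```

When nothing is configured, returning an empty dict leaves OpenRouter's default routing in effect, matching the default behavior described below.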
## Default Behavior

When no `provider_routing` section is configured (the default), OpenRouter uses its own default routing logic, which generally balances cost and availability automatically.
website/docs/user-guide/features/rl-training.md (new file, 238 lines)
@@ -0,0 +1,238 @@
---
sidebar_position: 13
title: "RL Training"
description: "Reinforcement learning on agent behaviors with Tinker-Atropos — environment discovery, training, and evaluation"
---

# RL Training

Hermes Agent includes an integrated RL (Reinforcement Learning) training pipeline built on **Tinker-Atropos**. This enables training language models on environment-specific tasks using GRPO (Group Relative Policy Optimization) with LoRA adapters, orchestrated entirely through the agent's tool interface.

## Overview

The RL training system consists of three components:

1. **Atropos** — A trajectory API server that coordinates environment interactions, manages rollout groups, and computes advantages
2. **Tinker** — A training service that handles model weights, LoRA training, sampling/inference, and optimizer steps
3. **Environments** — Python classes that define tasks, scoring, and reward functions (e.g., GSM8K math problems)

The agent can discover environments, configure training parameters, launch training runs, and monitor metrics — all through a set of `rl_*` tools.

## Requirements

RL training requires:

- **Python >= 3.11** (Tinker package requirement)
- **TINKER_API_KEY** — API key for the Tinker training service
- **WANDB_API_KEY** — API key for Weights & Biases metrics tracking
- The `tinker-atropos` submodule (at `tinker-atropos/` relative to the Hermes root)

```bash
# Set up API keys
hermes config set TINKER_API_KEY your-tinker-key
hermes config set WANDB_API_KEY your-wandb-key
```

When both keys are present and Python >= 3.11 is available, the `rl` toolset is automatically enabled.

## Available Tools

| Tool | Description |
|------|-------------|
| `rl_list_environments` | Discover available RL environments |
| `rl_select_environment` | Select an environment and load its config |
| `rl_get_current_config` | View configurable and locked fields |
| `rl_edit_config` | Modify configurable training parameters |
| `rl_start_training` | Launch a training run (spawns 3 processes) |
| `rl_check_status` | Monitor training progress and WandB metrics |
| `rl_stop_training` | Stop a running training job |
| `rl_get_results` | Get final metrics and model weights path |
| `rl_list_runs` | List all active and completed runs |
| `rl_test_inference` | Quick inference test using OpenRouter |

## Workflow

### 1. Discover Environments

```
List the available RL environments
```

The agent calls `rl_list_environments()`, which scans `tinker-atropos/tinker_atropos/environments/` using AST parsing to find Python classes inheriting from `BaseEnv`. Each environment defines:

- **Dataset loading** — where training data comes from (e.g., HuggingFace datasets)
- **Prompt construction** — how to format items for the model
- **Scoring/verification** — how to evaluate model outputs and assign rewards

### 2. Select and Configure

```
Select the GSM8K environment and show me the configuration
```

The agent calls `rl_select_environment("gsm8k_tinker")`, then `rl_get_current_config()` to see all parameters.

Configuration fields are divided into two categories:

**Configurable fields** (can be modified):
- `group_size` — Number of completions per item (default: 16)
- `batch_size` — Training batch size (default: 128)
- `wandb_name` — WandB run name (auto-set to `{env}-{timestamp}`)
- Other environment-specific parameters

**Locked fields** (infrastructure settings, cannot be changed):
- `tokenizer_name` — Model tokenizer (e.g., `Qwen/Qwen3-8B`)
- `rollout_server_url` — Atropos API URL (`http://localhost:8000`)
- `max_token_length` — Maximum token length (8192)
- `max_num_workers` — Maximum parallel workers (2048)
- `total_steps` — Total training steps (2500)
- `lora_rank` — LoRA adapter rank (32)
- `learning_rate` — Learning rate (4e-5)
- `max_token_trainer_length` — Max tokens for trainer (9000)

### 3. Start Training

```
Start the training run
```

The agent calls `rl_start_training()`, which:

1. Generates a YAML config file merging locked settings with configurable overrides
2. Creates a unique run ID
3. Spawns three processes:
   - **Atropos API server** (`run-api`) — trajectory coordination
   - **Tinker trainer** (`launch_training.py`) — LoRA training + FastAPI inference server on port 8001
   - **Environment** (`environment.py serve`) — the selected environment connecting to Atropos

The processes start with staggered delays (5s for API, 30s for trainer, 90s more for environment) to ensure proper initialization order.

### 4. Monitor Progress

```
Check the status of training run abc12345
```

The agent calls `rl_check_status(run_id)`, which reports:

- Process status (running/exited for each of the 3 processes)
- Running time
- WandB metrics (step, reward mean, percent correct, eval accuracy)
- Log file locations for debugging

:::note Rate Limiting
Status checks are rate-limited to once every **30 minutes** per run ID. This prevents excessive polling during long-running training jobs that take hours.
:::

### 5. Stop or Get Results

```
Stop the training run
# or
Get the final results for run abc12345
```

`rl_stop_training()` terminates all three processes in reverse order (environment → trainer → API). `rl_get_results()` retrieves final WandB metrics and training history.

## Inference Testing

Before committing to a full training run, you can test whether an environment works correctly using `rl_test_inference`. This runs a few steps of inference and scoring using OpenRouter — no Tinker API needed, just an `OPENROUTER_API_KEY`.

```
Test the selected environment with inference
```

Default configuration:
- **3 steps × 16 completions = 48 rollouts per model**
- Tests 3 models at different scales for robustness:
  - `qwen/qwen3-8b` (small)
  - `z-ai/glm-4.7-flash` (medium)
  - `minimax/minimax-m2.1` (large)
- Total: ~144 rollouts

This validates:
- Environment loads correctly
- Prompt construction works
- Inference response parsing is robust across model scales
- Verifier/scoring logic produces valid rewards

## Tinker API Integration

The trainer uses the [Tinker](https://tinker.computer) API for model training operations:

- **ServiceClient** — Creates training and sampling clients
- **Training client** — Handles forward-backward passes with importance sampling loss, optimizer steps (Adam), and weight checkpointing
- **Sampling client** — Provides inference using the latest trained weights

The training loop:
1. Fetches a batch of rollouts from Atropos (prompt + completions + scores)
2. Converts to Tinker Datum objects with padded logprobs and advantages
3. Runs forward-backward pass with importance sampling loss
4. Takes an optimizer step (Adam: lr=4e-5, β1=0.9, β2=0.95)
5. Saves weights and creates a new sampling client for next-step inference
6. Logs metrics to WandB
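The "group relative" part of GRPO normalizes each completion's reward against the other completions sampled for the same prompt. A generic sketch of that computation (illustrative of the technique, not the actual Atropos implementation):

```python
# Generic sketch of group-relative advantage computation as used in GRPO:
# each completion's reward is normalized against the mean and standard
# deviation of its own rollout group. Illustrative only, not Atropos code.
from statistics import mean, stdev

def group_advantages(rewards: list[float]) -> list[float]:
    if len(rewards) < 2:
        return [0.0] * len(rewards)
    mu = mean(rewards)
    sigma = stdev(rewards)
    if sigma == 0:
        # All completions scored the same: no learning signal for this group.
        return [0.0] * len(rewards)
    return [(r - mu) / sigma for r in rewards]

# A group of 4 completions for one prompt, scored 0/1 by the verifier:
print([round(a, 3) for a in group_advantages([1.0, 0.0, 0.0, 1.0])])
# → [0.866, -0.866, -0.866, 0.866]
```

Groups where every completion gets the same score contribute zero advantage, which is why `reward/mean` and `advantages/std` are both worth watching in the metrics.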

## Architecture Diagram

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Atropos API   │◄────│   Environment    │────►│  OpenAI/sglang  │
│    (run-api)    │     │  (BaseEnv impl)  │     │  Inference API  │
│    Port 8000    │     │                  │     │    Port 8001    │
└────────┬────────┘     └──────────────────┘     └────────┬────────┘
         │                                                │
         │  Batches (tokens + scores + logprobs)          │
         │                                                │
         ▼                                                │
┌─────────────────┐                                       │
│  Tinker Trainer │◄──────────────────────────────────────┘
│ (LoRA training) │  Serves inference via FastAPI
│    + FastAPI    │  Trains via Tinker ServiceClient
└─────────────────┘
```

## Creating Custom Environments

To create a new RL environment:

1. Create a Python file in `tinker-atropos/tinker_atropos/environments/`
2. Define a class that inherits from `BaseEnv`
3. Implement the required methods:
   - `load_dataset()` — Load your training data
   - `get_next_item()` — Provide the next item to the model
   - `score_answer()` — Score model outputs and assign rewards
   - `collect_trajectories()` — Collect and return trajectories
4. Optionally define a custom config class inheriting from `BaseEnvConfig`

Study the existing `gsm8k_tinker.py` as a template. The agent can help you create new environments — it can read existing environment files, inspect HuggingFace datasets, and write new environment code.
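The core of any environment is its verifier. For a GSM8K-style task, scoring typically extracts the final number from a completion and compares it to the gold answer. A standalone sketch of that logic, using a hypothetical `score_answer` helper rather than the real `gsm8k_tinker.py` code:

```python
import re

# Hypothetical sketch of a GSM8K-style verifier: pull the last number out of
# the model's completion and award reward 1.0 on an exact match, else 0.0.
def score_answer(completion: str, gold: str) -> float:
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    if not numbers:
        return 0.0
    return 1.0 if float(numbers[-1]) == float(gold) else 0.0

print(score_answer("She pays 5 * 3 = 15 dollars. The answer is 15.", "15"))  # → 1.0
```

Deterministic, cheap verifiers like this are what make `rl_test_inference` useful: you can validate parsing and scoring across model scales before spending any training compute.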

## WandB Metrics

Training runs log to Weights & Biases with these key metrics:

| Metric | Description |
|--------|-------------|
| `train/loss` | Training loss (importance sampling) |
| `train/learning_rate` | Current learning rate |
| `reward/mean` | Mean reward across groups |
| `logprobs/mean` | Mean reference logprobs |
| `logprobs/mean_training` | Mean training logprobs |
| `logprobs/diff` | Logprob drift (reference - training) |
| `advantages/mean` | Mean advantage values |
| `advantages/std` | Advantage standard deviation |

## Log Files

Each training run generates log files in `tinker-atropos/logs/`:

```
logs/
├── api_{run_id}.log       # Atropos API server logs
├── trainer_{run_id}.log   # Tinker trainer logs
├── env_{run_id}.log       # Environment process logs
└── inference_tests/       # Inference test results
    ├── test_{env}_{model}.jsonl
    └── test_{env}_{model}.log
```

These are invaluable for debugging when training fails or produces unexpected results.

238  website/docs/user-guide/messaging/homeassistant.md  Normal file
@@ -0,0 +1,238 @@

---
title: Home Assistant
description: Control your smart home with Hermes Agent via Home Assistant integration.
sidebar_label: Home Assistant
sidebar_position: 5
---

# Home Assistant Integration

Hermes Agent integrates with [Home Assistant](https://www.home-assistant.io/) in two ways:

1. **Gateway platform** — subscribes to real-time state changes via WebSocket and responds to events
2. **Smart home tools** — four LLM-callable tools for querying and controlling devices via the REST API

## Setup

### 1. Create a Long-Lived Access Token

1. Open your Home Assistant instance
2. Go to your **Profile** (click your name in the sidebar)
3. Scroll to **Long-Lived Access Tokens**
4. Click **Create Token**, give it a name like "Hermes Agent"
5. Copy the token

### 2. Configure Environment Variables

```bash
# Add to ~/.hermes/.env

# Required: your Long-Lived Access Token
HASS_TOKEN=your-long-lived-access-token

# Optional: HA URL (default: http://homeassistant.local:8123)
HASS_URL=http://192.168.1.100:8123
```

:::info
The `homeassistant` toolset is automatically enabled when `HASS_TOKEN` is set. Both the gateway platform and the device control tools activate from this single token.
:::

### 3. Start the Gateway

```bash
hermes gateway
```

Home Assistant will appear as a connected platform alongside any other messaging platforms (Telegram, Discord, etc.).

## Available Tools

Hermes Agent registers four tools for smart home control:

### `ha_list_entities`

List Home Assistant entities, optionally filtered by domain or area.

**Parameters:**
- `domain` *(optional)* — Filter by entity domain: `light`, `switch`, `climate`, `sensor`, `binary_sensor`, `cover`, `fan`, `media_player`, etc.
- `area` *(optional)* — Filter by area/room name (matches against friendly names): `living room`, `kitchen`, `bedroom`, etc.

**Example:**
```
List all lights in the living room
```

Returns entity IDs, states, and friendly names.

### `ha_get_state`

Get detailed state of a single entity, including all attributes (brightness, color, temperature setpoint, sensor readings, etc.).

**Parameters:**
- `entity_id` *(required)* — The entity to query, e.g., `light.living_room`, `climate.thermostat`, `sensor.temperature`

**Example:**
```
What's the current state of climate.thermostat?
```

Returns: state, all attributes, last changed/updated timestamps.

### `ha_list_services`

List available services (actions) for device control. Shows what actions can be performed on each device type and what parameters they accept.

**Parameters:**
- `domain` *(optional)* — Filter by domain, e.g., `light`, `climate`, `switch`

**Example:**
```
What services are available for climate devices?
```

### `ha_call_service`

Call a Home Assistant service to control a device.

**Parameters:**
- `domain` *(required)* — Service domain: `light`, `switch`, `climate`, `cover`, `media_player`, `fan`, `scene`, `script`
- `service` *(required)* — Service name: `turn_on`, `turn_off`, `toggle`, `set_temperature`, `set_hvac_mode`, `open_cover`, `close_cover`, `set_volume_level`
- `entity_id` *(optional)* — Target entity, e.g., `light.living_room`
- `data` *(optional)* — Additional parameters as a JSON object

**Examples:**

```
Turn on the living room lights
→ ha_call_service(domain="light", service="turn_on", entity_id="light.living_room")
```

```
Set the thermostat to 22 degrees in heat mode
→ ha_call_service(domain="climate", service="set_temperature",
                  entity_id="climate.thermostat", data={"temperature": 22, "hvac_mode": "heat"})
```

```
Set living room lights to blue at 50% brightness
→ ha_call_service(domain="light", service="turn_on",
                  entity_id="light.living_room", data={"brightness": 128, "color_name": "blue"})
```

## Gateway Platform: Real-Time Events

The Home Assistant gateway adapter connects via WebSocket and subscribes to `state_changed` events. When a device state changes, it's forwarded to the agent as a message.

### Event Filtering

Configure which events the agent sees via platform config in the gateway:

```python
# In platform extra config
{
    "watch_domains": ["climate", "binary_sensor", "alarm_control_panel"],
    "watch_entities": ["sensor.front_door"],
    "ignore_entities": ["sensor.uptime", "sensor.cpu_usage"],
    "cooldown_seconds": 30
}
```

| Setting | Default | Description |
|---------|---------|-------------|
| `watch_domains` | *(all)* | Only watch these entity domains |
| `watch_entities` | *(all)* | Only watch these specific entities |
| `ignore_entities` | *(none)* | Always ignore these entities |
| `cooldown_seconds` | `30` | Minimum seconds between events for the same entity |

:::tip
Without any filters, the agent receives **all** state changes, which can be noisy. For practical use, set `watch_domains` to the domains you care about (e.g., `climate`, `binary_sensor`, `alarm_control_panel`).
:::

### Event Formatting

State changes are formatted as human-readable messages based on domain:

| Domain | Format |
|--------|--------|
| `climate` | "HVAC mode changed from 'off' to 'heat' (current: 21, target: 23)" |
| `sensor` | "changed from 21°C to 22°C" |
| `binary_sensor` | "triggered" / "cleared" |
| `light`, `switch`, `fan` | "turned on" / "turned off" |
| `alarm_control_panel` | "alarm state changed from 'armed_away' to 'triggered'" |
| *(other)* | "changed from 'old' to 'new'" |

### Agent Responses

Outbound messages from the agent are delivered as **Home Assistant persistent notifications** (via `persistent_notification.create`). These appear in the HA notification panel with the title "Hermes Agent".

### Connection Management

- **WebSocket** with 30-second heartbeat for real-time events
- **Automatic reconnection** with backoff: 5s → 10s → 30s → 60s
- **REST API** for outbound notifications (separate session to avoid WebSocket conflicts)
- **Authorization** — HA events are always authorized (no user allowlist needed, since the `HASS_TOKEN` authenticates the connection)
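The reconnect schedule can be expressed as a small generator that walks the stated intervals and then stays at the last one. This is a sketch of the policy, not the adapter's code:

```python
# Sketch of the stated reconnect policy: step through 5s, 10s, 30s, then
# stay at 60s for every subsequent attempt.
def backoff_delays():
    steps = [5, 10, 30, 60]
    i = 0
    while True:
        yield steps[i]
        if i < len(steps) - 1:
            i += 1

delays = backoff_delays()
print([next(delays) for _ in range(6)])  # → [5, 10, 30, 60, 60, 60]
```

Capping at 60 seconds keeps a long HA outage from generating a reconnect storm while still recovering within a minute once the instance is back.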

## Security

The Home Assistant tools enforce security restrictions:

:::warning Blocked Domains
The following service domains are **blocked** to prevent arbitrary code execution on the HA host:

- `shell_command` — arbitrary shell commands
- `command_line` — sensors/switches that execute commands
- `python_script` — scripted Python execution
- `pyscript` — broader scripting integration
- `hassio` — addon control, host shutdown/reboot
- `rest_command` — HTTP requests from HA server (SSRF vector)

Attempting to call services in these domains returns an error.
:::

Entity IDs are validated against the pattern `^[a-z_][a-z0-9_]*\.[a-z0-9_]+$` to prevent injection attacks.

## Example Automations

### Morning Routine

```
User: Start my morning routine

Agent:
1. ha_call_service(domain="light", service="turn_on",
     entity_id="light.bedroom", data={"brightness": 128})
2. ha_call_service(domain="climate", service="set_temperature",
     entity_id="climate.thermostat", data={"temperature": 22})
3. ha_call_service(domain="media_player", service="turn_on",
     entity_id="media_player.kitchen_speaker")
```

### Security Check

```
User: Is the house secure?

Agent:
1. ha_list_entities(domain="binary_sensor")
   → checks door/window sensors
2. ha_get_state(entity_id="alarm_control_panel.home")
   → checks alarm status
3. ha_list_entities(domain="lock")
   → checks lock states
4. Reports: "All doors closed, alarm is armed_away, all locks engaged."
```

### Reactive Automation (via Gateway Events)

When connected as a gateway platform, the agent can react to events:

```
[Home Assistant] Front Door: triggered (was cleared)

Agent automatically:
1. ha_get_state(entity_id="binary_sensor.front_door")
2. ha_call_service(domain="light", service="turn_on",
     entity_id="light.hallway")
3. Sends notification: "Front door opened. Hallway lights turned on."
```

327  website/docs/user-guide/security.md  Normal file
@@ -0,0 +1,327 @@

---
sidebar_position: 8
title: "Security"
description: "Security model, dangerous command approval, user authorization, container isolation, and production deployment best practices"
---

# Security

Hermes Agent is designed with a defense-in-depth security model. This page covers every security boundary — from command approval to container isolation to user authorization on messaging platforms.

## Overview

The security model has five layers:

1. **User authorization** — who can talk to the agent (allowlists, DM pairing)
2. **Dangerous command approval** — human-in-the-loop for destructive operations
3. **Container isolation** — Docker/Singularity/Modal sandboxing with hardened settings
4. **MCP credential filtering** — environment variable isolation for MCP subprocesses
5. **Context file scanning** — prompt injection detection in project files

## Dangerous Command Approval

Before executing any command, Hermes checks it against a curated list of dangerous patterns. If a match is found, the user must explicitly approve it.

### What Triggers Approval

The following patterns trigger approval prompts (defined in `tools/approval.py`):

| Pattern | Description |
|---------|-------------|
| `rm -r` / `rm --recursive` | Recursive delete |
| `rm ... /` | Delete in root path |
| `chmod 777` | World-writable permissions |
| `mkfs` | Format filesystem |
| `dd if=` | Disk copy |
| `DROP TABLE/DATABASE` | SQL DROP |
| `DELETE FROM` (without WHERE) | SQL DELETE without WHERE |
| `TRUNCATE TABLE` | SQL TRUNCATE |
| `> /etc/` | Overwrite system config |
| `systemctl stop/disable/mask` | Stop/disable system services |
| `kill -9 -1` | Kill all processes |
| `curl ... \| sh` | Pipe remote content to shell |
| `bash -c`, `python -e` | Shell/script execution via flags |
| `find -exec rm`, `find -delete` | Find with destructive actions |
| Fork bomb patterns | Fork bombs |

:::info
**Container bypass**: When running in `docker`, `singularity`, or `modal` backends, dangerous command checks are **skipped** because the container itself is the security boundary. Destructive commands inside a container can't harm the host.
:::

### Approval Flow (CLI)

In the interactive CLI, dangerous commands show an inline approval prompt:

```
⚠️ DANGEROUS COMMAND: recursive delete
rm -rf /tmp/old-project

[o]nce | [s]ession | [a]lways | [d]eny

Choice [o/s/a/D]:
```

The four options:

- **once** — allow this single execution
- **session** — allow this pattern for the rest of the session
- **always** — add to permanent allowlist (saved to `config.yaml`)
- **deny** (default) — block the command

### Approval Flow (Gateway/Messaging)

On messaging platforms, the agent sends the dangerous command details to the chat and waits for the user to reply:

- Reply **yes**, **y**, **approve**, **ok**, or **go** to approve
- Reply **no**, **n**, **deny**, or **cancel** to deny

The `HERMES_EXEC_ASK=1` environment variable is automatically set when running the gateway.

### Permanent Allowlist

Commands approved with "always" are saved to `~/.hermes/config.yaml`:

```yaml
# Permanently allowed dangerous command patterns
command_allowlist:
  - rm
  - systemctl
```

These patterns are loaded at startup and silently approved in all future sessions.

:::tip
Use `hermes config edit` to review or remove patterns from your permanent allowlist.
:::

## User Authorization (Gateway)

When running the messaging gateway, Hermes controls who can interact with the bot through a layered authorization system.

### Authorization Check Order

The `_is_user_authorized()` method checks in this order:

1. **Per-platform allow-all flag** (e.g., `DISCORD_ALLOW_ALL_USERS=true`)
2. **DM pairing approved list** (users approved via pairing codes)
3. **Platform-specific allowlists** (e.g., `TELEGRAM_ALLOWED_USERS=12345,67890`)
4. **Global allowlist** (`GATEWAY_ALLOWED_USERS=12345,67890`)
5. **Global allow-all** (`GATEWAY_ALLOW_ALL_USERS=true`)
6. **Default: deny**

### Platform Allowlists

Set allowed user IDs as comma-separated values in `~/.hermes/.env`:

```bash
# Platform-specific allowlists
TELEGRAM_ALLOWED_USERS=123456789,987654321
DISCORD_ALLOWED_USERS=111222333444555666
WHATSAPP_ALLOWED_USERS=15551234567
SLACK_ALLOWED_USERS=U01ABC123

# Cross-platform allowlist (checked for all platforms)
GATEWAY_ALLOWED_USERS=123456789

# Per-platform allow-all (use with caution)
DISCORD_ALLOW_ALL_USERS=true

# Global allow-all (use with extreme caution)
GATEWAY_ALLOW_ALL_USERS=true
```

:::warning
If **no allowlists are configured** and `GATEWAY_ALLOW_ALL_USERS` is not set, **all users are denied**. The gateway logs a warning at startup:

```
No user allowlists configured. All unauthorized users will be denied.
Set GATEWAY_ALLOW_ALL_USERS=true in ~/.hermes/.env to allow open access,
or configure platform allowlists (e.g., TELEGRAM_ALLOWED_USERS=your_id).
```
:::

### DM Pairing System

For more flexible authorization, Hermes includes a code-based pairing system. Instead of requiring user IDs upfront, unknown users receive a one-time pairing code that the bot owner approves via the CLI.

**How it works:**

1. An unknown user sends a DM to the bot
2. The bot replies with an 8-character pairing code
3. The bot owner runs `hermes pairing approve <platform> <code>` on the CLI
4. The user is permanently approved for that platform

**Security features** (based on OWASP + NIST SP 800-63-4 guidance):

| Feature | Details |
|---------|---------|
| Code format | 8-char from 32-char unambiguous alphabet (no 0/O/1/I) |
| Randomness | Cryptographic (`secrets.choice()`) |
| Code TTL | 1 hour expiry |
| Rate limiting | 1 request per user per 10 minutes |
| Pending limit | Max 3 pending codes per platform |
| Lockout | 5 failed approval attempts → 1-hour lockout |
| File security | `chmod 0600` on all pairing data files |
| Logging | Codes are never logged to stdout |
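Generating codes in this style takes a few lines with the standard library. The exact alphabet below is an assumption consistent with "no 0/O/1/I"; the real one may differ:

```python
import secrets

# Sketch of pairing-code generation as described above: 8 characters drawn
# cryptographically from a 32-char alphabet with ambiguous glyphs removed.
# The exact alphabet is an assumption consistent with "no 0/O/1/I".
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"  # 32 chars, no 0/O/1/I

def generate_pairing_code(length: int = 8) -> str:
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

code = generate_pairing_code()
print(len(code), all(c in ALPHABET for c in code))  # → 8 True
```

Using `secrets` rather than `random` is the point of the "Randomness" row: the codes are unpredictable even to someone who has seen many previous codes.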

**Pairing CLI commands:**

```bash
# List pending and approved users
hermes pairing list

# Approve a pairing code
hermes pairing approve telegram ABC12DEF

# Revoke a user's access
hermes pairing revoke telegram 123456789

# Clear all pending codes
hermes pairing clear-pending
```

**Storage:** Pairing data is stored in `~/.hermes/pairing/` with per-platform JSON files:
- `{platform}-pending.json` — pending pairing requests
- `{platform}-approved.json` — approved users
- `_rate_limits.json` — rate limit and lockout tracking

## Container Isolation

When using the `docker` terminal backend, Hermes applies strict security hardening to every container.

### Docker Security Flags

Every container runs with these flags (defined in `tools/environments/docker.py`):

```python
_SECURITY_ARGS = [
    "--cap-drop", "ALL",                               # Drop ALL Linux capabilities
    "--security-opt", "no-new-privileges",             # Block privilege escalation
    "--pids-limit", "256",                             # Limit process count
    "--tmpfs", "/tmp:rw,nosuid,size=512m",             # Size-limited /tmp
    "--tmpfs", "/var/tmp:rw,noexec,nosuid,size=256m",  # No-exec /var/tmp
    "--tmpfs", "/run:rw,noexec,nosuid,size=64m",       # No-exec /run
]
```

### Resource Limits

Container resources are configurable in `~/.hermes/config.yaml`:

```yaml
terminal:
  backend: docker
  docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
  container_cpu: 1            # CPU cores
  container_memory: 5120      # MB (default 5GB)
  container_disk: 51200       # MB (default 50GB, requires overlay2 on XFS)
  container_persistent: true  # Persist filesystem across sessions
```

### Filesystem Persistence

- **Persistent mode** (`container_persistent: true`): Bind-mounts `/workspace` and `/root` from `~/.hermes/sandboxes/docker/<task_id>/`
- **Ephemeral mode** (`container_persistent: false`): Uses tmpfs for workspace — everything is lost on cleanup

:::tip
For production gateway deployments, use `docker` or `modal` backend to isolate agent commands from your host system. This eliminates the need for dangerous command approval entirely.
:::

## Terminal Backend Security Comparison

| Backend | Isolation | Dangerous Cmd Check | Best For |
|---------|-----------|---------------------|----------|
| **local** | None — runs on host | ✅ Yes | Development, trusted users |
| **ssh** | Remote machine | ✅ Yes | Running on a separate server |
| **docker** | Container | ❌ Skipped (container is boundary) | Production gateway |
| **singularity** | Container | ❌ Skipped | HPC environments |
| **modal** | Cloud sandbox | ❌ Skipped | Scalable cloud isolation |

## MCP Credential Handling

MCP (Model Context Protocol) server subprocesses receive a **filtered environment** to prevent accidental credential leakage.

### Safe Environment Variables

Only these variables are passed through from the host to MCP stdio subprocesses:

```
PATH, HOME, USER, LANG, LC_ALL, TERM, SHELL, TMPDIR
```

Plus any `XDG_*` variables. All other environment variables (API keys, tokens, secrets) are **stripped**.

Variables explicitly defined in the MCP server's `env` config are passed through:

```yaml
mcp_servers:
  github:
    command: "npx"
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_..."  # Only this is passed
```

### Credential Redaction

Error messages from MCP tools are sanitized before being returned to the LLM. The following patterns are replaced with `[REDACTED]`:

- GitHub PATs (`ghp_...`)
- OpenAI-style keys (`sk-...`)
- Bearer tokens
- `token=`, `key=`, `API_KEY=`, `password=`, `secret=` parameters
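A simplified redaction pass over an error message might look like this; the regexes are an illustrative subset of the patterns listed above:

```python
import re

# Simplified sketch of credential redaction: an illustrative subset of the
# patterns listed above, applied in sequence to an error message.
REDACTIONS = [
    re.compile(r"ghp_[A-Za-z0-9]+"),
    re.compile(r"sk-[A-Za-z0-9_-]+"),
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._-]+"),
    re.compile(r"(?i)(token|key|api_key|password|secret)=[^\s&]+"),
]

def redact(message: str) -> str:
    for pattern in REDACTIONS:
        message = pattern.sub("[REDACTED]", message)
    return message

print(redact("401 for token ghp_abc123 with key=hunter2"))
# → 401 for token [REDACTED] with [REDACTED]
```

Running redaction on error messages matters because failure output is exactly where upstream libraries tend to echo back the credentials they were given.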
|
||||
|
||||
### Context File Injection Protection
|
||||
|
||||
Context files (AGENTS.md, .cursorrules, SOUL.md) are scanned for prompt injection before being included in the system prompt. The scanner checks for:
|
||||
|
||||
- Instructions to ignore/disregard prior instructions
|
||||
- Hidden HTML comments with suspicious keywords
|
||||
- Attempts to read secrets (`.env`, `credentials`, `.netrc`)
|
||||
- Credential exfiltration via `curl`
|
||||
- Invisible Unicode characters (zero-width spaces, bidirectional overrides)
|
||||
|
||||
Blocked files show a warning:
|
||||
|
||||
```
|
||||
[BLOCKED: AGENTS.md contained potential prompt injection (prompt_injection). Content not loaded.]
|
||||
```
|
||||
|
||||
## Best Practices for Production Deployment

### Gateway Deployment Checklist

1. **Set explicit allowlists** — never use `GATEWAY_ALLOW_ALL_USERS=true` in production
2. **Use container backend** — set `terminal.backend: docker` in config.yaml
3. **Restrict resource limits** — set appropriate CPU, memory, and disk limits
4. **Store secrets securely** — keep API keys in `~/.hermes/.env` with proper file permissions
5. **Enable DM pairing** — use pairing codes instead of hardcoding user IDs when possible
6. **Review command allowlist** — periodically audit `command_allowlist` in config.yaml
7. **Set `MESSAGING_CWD`** — don't let the agent operate from sensitive directories
8. **Run as non-root** — never run the gateway as root
9. **Monitor logs** — check `~/.hermes/logs/` for unauthorized access attempts
10. **Keep updated** — run `hermes update` regularly for security patches
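For example, items 2 and 3 of the checklist might translate into a `config.yaml` fragment like this. The resource-limit field names are illustrative assumptions; check the configuration reference for your version:

```yaml
terminal:
  backend: docker          # run agent commands in a container (item 2)
  docker:                  # hypothetical limit keys, shown for illustration (item 3)
    cpu_limit: 2
    memory_limit: "2g"
    disk_limit: "10g"
```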
### Securing API Keys

```bash
# Set proper permissions on the .env file
chmod 600 ~/.hermes/.env
```

Keep separate keys for different services, and never commit `.env` files to version control.
### Network Isolation

For maximum security, run the gateway on a separate machine or VM:

```yaml
terminal:
  backend: ssh
  ssh_host: "agent-worker.local"
  ssh_user: "hermes"
  ssh_key: "~/.ssh/hermes_agent_key"
```

This keeps the gateway's messaging connections separate from the agent's command execution.
website/docs/user-guide/sessions.md (new file, 262 lines)
@@ -0,0 +1,262 @@
---
sidebar_position: 7
title: "Sessions"
description: "Session persistence, resume, search, management, and per-platform session tracking"
---

# Sessions

Hermes Agent automatically saves every conversation as a session. Sessions enable conversation resume, cross-session search, and full conversation history management.

## How Sessions Work

Every conversation — whether from the CLI, Telegram, Discord, WhatsApp, or Slack — is stored as a session with full message history. Sessions are tracked in two complementary systems:

1. **SQLite database** (`~/.hermes/state.db`) — structured session metadata with FTS5 full-text search
2. **JSONL transcripts** (`~/.hermes/sessions/`) — raw conversation transcripts including tool calls (gateway)

The SQLite database stores:

- Session ID, source platform, user ID
- Model name and configuration
- System prompt snapshot
- Full message history (role, content, tool calls, tool results)
- Token counts (input/output)
- Timestamps (started_at, ended_at)
- Parent session ID (for compression-triggered session splitting)
### Session Sources

Each session is tagged with its source platform:

| Source | Description |
|--------|-------------|
| `cli` | Interactive CLI (`hermes` or `hermes chat`) |
| `telegram` | Telegram messenger |
| `discord` | Discord server/DM |
| `whatsapp` | WhatsApp messenger |
| `slack` | Slack workspace |
## CLI Session Resume

Resume previous conversations from the CLI using `--continue` or `--resume`:

### Continue Last Session

```bash
# Resume the most recent CLI session
hermes --continue
hermes -c

# Or with the chat subcommand
hermes chat --continue
hermes chat -c
```

This looks up the most recent `cli` session from the SQLite database and loads its full conversation history.

### Resume Specific Session

```bash
# Resume a specific session by ID
hermes --resume 20250305_091523_a1b2c3d4
hermes -r 20250305_091523_a1b2c3d4

# Or with the chat subcommand
hermes chat --resume 20250305_091523_a1b2c3d4
```

Session IDs are shown when you exit a CLI session and can be found with `hermes sessions list`.

:::tip
Session IDs follow the format `YYYYMMDD_HHMMSS_<8-char-hex>`, e.g. `20250305_091523_a1b2c3d4`. You only need to provide enough of the ID to be unique.
:::
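To make the ID format concrete, here is a hypothetical generator sketch (not Hermes's actual code):

```python
import secrets
from datetime import datetime
from typing import Optional

def new_session_id(now: Optional[datetime] = None) -> str:
    """Build an ID in the documented YYYYMMDD_HHMMSS_<8-char-hex> format."""
    now = now or datetime.now()
    # token_hex(4) yields 8 lowercase hex characters
    return f"{now.strftime('%Y%m%d_%H%M%S')}_{secrets.token_hex(4)}"

print(new_session_id(datetime(2025, 3, 5, 9, 15, 23)))  # 20250305_091523_<random hex>
```

The timestamp prefix keeps IDs sortable by start time, which is why a unique prefix of the ID is enough for `--resume`.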
## Session Management Commands

Hermes provides a full set of session management commands via `hermes sessions`:

### List Sessions

```bash
# List recent sessions (default: last 20)
hermes sessions list

# Filter by platform
hermes sessions list --source telegram

# Show more sessions
hermes sessions list --limit 50
```

Output format:

```
ID                        Source    Model                      Messages  Started
────────────────────────────────────────────────────────────────────────────────────────
20250305_091523_a1b2c3d4  cli       anthropic/claude-opus-4.6  24        2025-03-05 09:15
20250304_143022_e5f6g7h8  telegram  anthropic/claude-opus-4.6  12        2025-03-04 14:30 (ended)
```
### Export Sessions

```bash
# Export all sessions to a JSONL file
hermes sessions export backup.jsonl

# Export sessions from a specific platform
hermes sessions export telegram-history.jsonl --source telegram

# Export a single session
hermes sessions export session.jsonl --session-id 20250305_091523_a1b2c3d4
```

Exported files contain one JSON object per line with full session metadata and all messages.
### Delete a Session

```bash
# Delete a specific session (with confirmation)
hermes sessions delete 20250305_091523_a1b2c3d4

# Delete without confirmation
hermes sessions delete 20250305_091523_a1b2c3d4 --yes
```
### Prune Old Sessions

```bash
# Delete ended sessions older than 90 days (default)
hermes sessions prune

# Custom age threshold
hermes sessions prune --older-than 30

# Only prune sessions from a specific platform
hermes sessions prune --source telegram --older-than 60

# Skip confirmation
hermes sessions prune --older-than 30 --yes
```

:::info
Pruning only deletes **ended** sessions (sessions that have been explicitly ended or auto-reset). Active sessions are never pruned.
:::
### Session Statistics

```bash
hermes sessions stats
```

Output:

```
Total sessions: 142
Total messages: 3847
  cli: 89 sessions
  telegram: 38 sessions
  discord: 15 sessions
Database size: 12.4 MB
```
## Session Search Tool

The agent has a built-in `session_search` tool that performs full-text search across all past conversations using SQLite's FTS5 engine.

### How It Works

1. FTS5 finds matching messages, ranked by relevance
2. Groups results by session, takes the top N unique sessions (default 3)
3. Loads each session's conversation, truncates to ~100K chars centered on matches
4. Sends to a fast summarization model for focused summaries
5. Returns per-session summaries with metadata and surrounding context

### FTS5 Query Syntax

The search supports standard FTS5 query syntax:

- Simple keywords: `docker deployment`
- Phrases: `"exact phrase"`
- Boolean: `docker OR kubernetes`, `python NOT java`
- Prefix: `deploy*`

### When It's Used

The agent is prompted to use session search automatically:

> *"When the user references something from a past conversation or you suspect relevant prior context exists, use session_search to recall it before asking them to repeat themselves."*
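The query syntax above is plain SQLite FTS5, so you can try it directly. A small self-contained demo using a throwaway in-memory table (not Hermes's actual schema):

```python
import sqlite3

# Throwaway in-memory table for the demo; Hermes's real schema differs.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE messages_fts USING fts5(content)")
db.executemany(
    "INSERT INTO messages_fts (content) VALUES (?)",
    [
        ("we set up the docker deployment on friday",),
        ("kubernetes rollout notes",),
        ("grocery list: apples, oranges",),
    ],
)

# Boolean query, ordered by relevance via the FTS5 rank auxiliary column.
rows = db.execute(
    "SELECT content FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rank",
    ("docker OR kubernetes",),
).fetchall()
print([r[0] for r in rows])
```

The same `MATCH` expressions (`"exact phrase"`, `deploy*`, `python NOT java`) work unchanged in queries like this one.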
## Per-Platform Session Tracking

### Gateway Sessions

On messaging platforms, sessions are keyed by a deterministic session key built from the message source:

| Chat Type | Key Format | Example |
|-----------|-----------|---------|
| Telegram DM | `agent:main:telegram:dm` | One session per bot |
| Discord DM | `agent:main:discord:dm` | One session per bot |
| WhatsApp DM | `agent:main:whatsapp:dm:<chat_id>` | Per-user (multi-user) |
| Group chat | `agent:main:<platform>:group:<chat_id>` | Per-group |
| Channel | `agent:main:<platform>:channel:<chat_id>` | Per-channel |

:::info
WhatsApp DMs include the chat ID in the session key because multiple users can DM the bot. Other platforms use a single DM session since the bot is configured per-user via allowlists.
:::
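The key scheme in the table can be expressed as a small pure function. This is a hypothetical reconstruction from the documented formats, not Hermes's actual code:

```python
def session_key(platform: str, chat_type: str, chat_id: str = "") -> str:
    """Build a deterministic session key following the documented scheme."""
    if chat_type == "dm":
        # Only WhatsApp DMs are keyed per chat; other platforms share one DM session.
        suffix = f":{chat_id}" if platform == "whatsapp" else ""
        return f"agent:main:{platform}:dm{suffix}"
    return f"agent:main:{platform}:{chat_type}:{chat_id}"

print(session_key("telegram", "dm"))                # agent:main:telegram:dm
print(session_key("whatsapp", "dm", "4915551234"))  # agent:main:whatsapp:dm:4915551234
print(session_key("discord", "group", "987654"))    # agent:main:discord:group:987654
```

Because the key is a pure function of the message source, the same chat always maps back to the same session across gateway restarts.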
### Session Reset Policies

Gateway sessions are automatically reset based on configurable policies:

- **idle** — reset after N minutes of inactivity
- **daily** — reset at a specific hour each day
- **both** — reset on whichever comes first (idle or daily)
- **none** — never auto-reset

Before a session is auto-reset, the agent is given a turn to save any important memories or skills from the conversation.

Sessions with **active background processes** are never auto-reset, regardless of policy.
## Storage Locations

| What | Path | Description |
|------|------|-------------|
| SQLite database | `~/.hermes/state.db` | All session metadata + messages with FTS5 |
| Gateway transcripts | `~/.hermes/sessions/` | JSONL transcripts per session + sessions.json index |
| Gateway index | `~/.hermes/sessions/sessions.json` | Maps session keys to active session IDs |

The SQLite database uses WAL mode for concurrent readers and a single writer, which suits the gateway's multi-platform architecture well.
### Database Schema

Key tables in `state.db`:

- **sessions** — session metadata (id, source, user_id, model, timestamps, token counts)
- **messages** — full message history (role, content, tool_calls, tool_name, token_count)
- **messages_fts** — FTS5 virtual table for full-text search across message content
## Session Expiry and Cleanup

### Automatic Cleanup

- Gateway sessions auto-reset based on the configured reset policy
- Before reset, the agent saves memories and skills from the expiring session
- Ended sessions remain in the database until pruned

### Manual Cleanup

```bash
# Prune sessions older than 90 days
hermes sessions prune

# Delete a specific session
hermes sessions delete <session_id>

# Export before pruning (backup)
hermes sessions export backup.jsonl
hermes sessions prune --older-than 30 --yes
```

:::tip
The database grows slowly (typical: 10-15 MB for hundreds of sessions). Pruning is mainly useful for removing old conversations you no longer need for search recall.
:::
@@ -19,6 +19,8 @@ const sidebars: SidebarsConfig = {
      items: [
        'user-guide/cli',
        'user-guide/configuration',
        'user-guide/sessions',
        'user-guide/security',
        {
          type: 'category',
          label: 'Messaging Gateway',
@@ -28,6 +30,7 @@ const sidebars: SidebarsConfig = {
            'user-guide/messaging/discord',
            'user-guide/messaging/slack',
            'user-guide/messaging/whatsapp',
            'user-guide/messaging/homeassistant',
          ],
        },
        {
@@ -37,12 +40,20 @@ const sidebars: SidebarsConfig = {
            'user-guide/features/tools',
            'user-guide/features/skills',
            'user-guide/features/memory',
            'user-guide/features/context-files',
            'user-guide/features/personality',
            'user-guide/features/mcp',
            'user-guide/features/cron',
            'user-guide/features/hooks',
            'user-guide/features/delegation',
            'user-guide/features/code-execution',
            'user-guide/features/browser',
            'user-guide/features/image-generation',
            'user-guide/features/tts',
            'user-guide/features/provider-routing',
            'user-guide/features/honcho',
            'user-guide/features/batch-processing',
            'user-guide/features/rl-training',
          ],
        },
      ],