feat: add documentation website (Docusaurus)
- 25 documentation pages covering Getting Started, User Guide, Developer Guide, and Reference - Docusaurus with custom amber/gold theme matching the landing page branding - GitHub Actions workflow to deploy landing page + docs to GitHub Pages - Landing page at root, docs at /docs/ on hermes-agent.nousresearch.com - Content extracted and restructured from existing repo docs (README, AGENTS.md, CONTRIBUTING.md, docs/) - Auto-deploy on push to main when website/ or landingpage/ changes
This commit is contained in:
8
website/docs/user-guide/_category_.json
Normal file
8
website/docs/user-guide/_category_.json
Normal file
@@ -0,0 +1,8 @@
|
||||
{
|
||||
"label": "User Guide",
|
||||
"position": 2,
|
||||
"link": {
|
||||
"type": "generated-index",
|
||||
"description": "Learn how to use Hermes Agent effectively."
|
||||
}
|
||||
}
|
||||
268
website/docs/user-guide/cli.md
Normal file
268
website/docs/user-guide/cli.md
Normal file
@@ -0,0 +1,268 @@
|
||||
---
|
||||
sidebar_position: 1
|
||||
title: "CLI Interface"
|
||||
description: "Master the Hermes Agent terminal interface — commands, keybindings, personalities, and more"
|
||||
---
|
||||
|
||||
# CLI Interface
|
||||
|
||||
Hermes Agent's CLI is a full terminal user interface (TUI) — not a web UI. It features multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, and streaming tool output. Built for people who live in the terminal.
|
||||
|
||||
## Running the CLI
|
||||
|
||||
```bash
|
||||
# Start an interactive session (default)
|
||||
hermes
|
||||
|
||||
# Single query mode (non-interactive)
|
||||
hermes chat -q "Hello"
|
||||
|
||||
# With a specific model
|
||||
hermes --model "anthropic/claude-sonnet-4"
|
||||
|
||||
# With a specific provider
|
||||
hermes --provider nous # Use Nous Portal
|
||||
hermes --provider openrouter # Force OpenRouter
|
||||
|
||||
# With specific toolsets
|
||||
hermes --toolsets "web,terminal,skills"
|
||||
|
||||
# Resume previous sessions
|
||||
hermes --continue # Resume the most recent CLI session (-c)
|
||||
hermes --resume <session_id> # Resume a specific session by ID (-r)
|
||||
|
||||
# Verbose mode (debug output)
|
||||
hermes --verbose
|
||||
```
|
||||
|
||||
## Interface Layout
|
||||
|
||||
```text
|
||||
┌─────────────────────────────────────────────────┐
|
||||
│ HERMES-AGENT ASCII Logo │
|
||||
│ ┌─────────────┐ ┌────────────────────────────┐ │
|
||||
│ │ Caduceus │ │ Model: claude-sonnet-4 │ │
|
||||
│ │ ASCII Art │ │ Terminal: local │ │
|
||||
│ │ │ │ Working Dir: /home/user │ │
|
||||
│ │ │ │ Available Tools: 19 │ │
|
||||
│ │ │ │ Available Skills: 12 │ │
|
||||
│ └─────────────┘ └────────────────────────────┘ │
|
||||
├─────────────────────────────────────────────────┤
|
||||
│ Conversation output scrolls here... │
|
||||
│ │
|
||||
│ (◕‿◕✿) 🧠 pondering... (2.3s) │
|
||||
│ ✧٩(ˊᗜˋ*)و✧ got it! (2.3s) │
|
||||
│ │
|
||||
│ Assistant: Hello! How can I help you today? │
|
||||
├─────────────────────────────────────────────────┤
|
||||
│ ❯ [Fixed input area at bottom] │
|
||||
└─────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
The welcome banner shows your model, terminal backend, working directory, available tools, and installed skills at a glance.
|
||||
|
||||
## Keybindings
|
||||
|
||||
| Key | Action |
|
||||
|-----|--------|
|
||||
| `Enter` | Send message |
|
||||
| `Alt+Enter` or `Ctrl+J` | New line (multi-line input) |
|
||||
| `Ctrl+C` | Interrupt agent (double-press within 2s to force exit) |
|
||||
| `Ctrl+D` | Exit |
|
||||
| `Tab` | Autocomplete slash commands |
|
||||
|
||||
## Slash Commands
|
||||
|
||||
Type `/` to see an autocomplete dropdown of all available commands.
|
||||
|
||||
### Navigation & Control
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/help` | Show available commands |
|
||||
| `/quit` | Exit the CLI (also: `/exit`, `/q`) |
|
||||
| `/clear` | Clear screen and reset conversation |
|
||||
| `/new` | Start a new conversation |
|
||||
| `/reset` | Reset conversation only (keep screen) |
|
||||
|
||||
### Tools & Configuration
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/tools` | List all available tools grouped by toolset |
|
||||
| `/toolsets` | List available toolsets with descriptions |
|
||||
| `/model [name]` | Show or change the current model |
|
||||
| `/config` | Show current configuration |
|
||||
| `/prompt [text]` | View/set/clear custom system prompt |
|
||||
| `/personality [name]` | Set a predefined personality |
|
||||
|
||||
### Conversation Management
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/history` | Show conversation history |
|
||||
| `/retry` | Retry the last message |
|
||||
| `/undo` | Remove the last user/assistant exchange |
|
||||
| `/save` | Save the current conversation |
|
||||
| `/compress` | Manually compress conversation context |
|
||||
| `/usage` | Show token usage for this session |
|
||||
|
||||
### Skills & Scheduling
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/cron` | Manage scheduled tasks |
|
||||
| `/skills` | Search, install, inspect, or manage skills |
|
||||
| `/platforms` | Show gateway/messaging platform status |
|
||||
| `/verbose` | Cycle tool progress display: off → new → all → verbose |
|
||||
| `/<skill-name>` | Invoke any installed skill (e.g., `/axolotl`, `/gif-search`) |
|
||||
|
||||
:::tip
|
||||
Commands are case-insensitive — `/HELP` works the same as `/help`. Most commands work mid-conversation.
|
||||
:::
|
||||
|
||||
## Skill Slash Commands
|
||||
|
||||
Every installed skill in `~/.hermes/skills/` is automatically registered as a slash command. The skill name becomes the command:
|
||||
|
||||
```
|
||||
/gif-search funny cats
|
||||
/axolotl help me fine-tune Llama 3 on my dataset
|
||||
/github-pr-workflow create a PR for the auth refactor
|
||||
|
||||
# Just the skill name loads it and lets the agent ask what you need:
|
||||
/excalidraw
|
||||
```
|
||||
|
||||
## Personalities
|
||||
|
||||
Set a predefined personality to change the agent's tone:
|
||||
|
||||
```
|
||||
/personality pirate
|
||||
/personality kawaii
|
||||
/personality concise
|
||||
```
|
||||
|
||||
Built-in personalities include: `helpful`, `concise`, `technical`, `creative`, `teacher`, `kawaii`, `catgirl`, `pirate`, `shakespeare`, `surfer`, `noir`, `uwu`, `philosopher`, `hype`.
|
||||
|
||||
You can also define custom personalities in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
agent:
|
||||
personalities:
|
||||
helpful: "You are a helpful, friendly AI assistant."
|
||||
kawaii: "You are a kawaii assistant! Use cute expressions..."
|
||||
pirate: "Arrr! Ye be talkin' to Captain Hermes..."
|
||||
# Add your own!
|
||||
```
|
||||
|
||||
## Multi-line Input
|
||||
|
||||
There are two ways to enter multi-line messages:
|
||||
|
||||
1. **`Alt+Enter` or `Ctrl+J`** — inserts a new line
|
||||
2. **Backslash continuation** — end a line with `\` to continue:
|
||||
|
||||
```
|
||||
❯ Write a function that:\
|
||||
1. Takes a list of numbers\
|
||||
2. Returns the sum
|
||||
```
|
||||
|
||||
:::info
|
||||
Pasting 5+ lines of text automatically saves to `~/.hermes/pastes/` and collapses to a reference, keeping your prompt clean.
|
||||
:::
|
||||
|
||||
## Interrupting the Agent
|
||||
|
||||
You can interrupt the agent at any point:
|
||||
|
||||
- **Type a new message + Enter** while the agent is working — it interrupts and processes your new instructions
|
||||
- **`Ctrl+C`** — interrupt the current operation (press twice within 2s to force exit)
|
||||
- In-progress terminal commands are killed immediately (SIGTERM, then SIGKILL after 1s)
|
||||
- Multiple messages typed during interrupt are combined into one prompt
|
||||
|
||||
## Tool Progress Display
|
||||
|
||||
The CLI shows animated feedback as the agent works:
|
||||
|
||||
**Thinking animation** (during API calls):
|
||||
```
|
||||
◜ (。•́︿•̀。) pondering... (1.2s)
|
||||
◠ (⊙_⊙) contemplating... (2.4s)
|
||||
✧٩(ˊᗜˋ*)و✧ got it! (3.1s)
|
||||
```
|
||||
|
||||
**Tool execution feed:**
|
||||
```
|
||||
┊ 💻 terminal `ls -la` (0.3s)
|
||||
┊ 🔍 web_search (1.2s)
|
||||
┊ 📄 web_extract (2.1s)
|
||||
```
|
||||
|
||||
Cycle through display modes with `/verbose`: `off → new → all → verbose`.
|
||||
|
||||
## Session Management
|
||||
|
||||
### Resuming Sessions
|
||||
|
||||
When you exit a CLI session, a resume command is printed:
|
||||
|
||||
```
|
||||
Resume this session with:
|
||||
hermes --resume 20260225_143052_a1b2c3
|
||||
|
||||
Session: 20260225_143052_a1b2c3
|
||||
Duration: 12m 34s
|
||||
Messages: 28 (5 user, 18 tool calls)
|
||||
```
|
||||
|
||||
Resume options:
|
||||
|
||||
```bash
|
||||
hermes --continue # Resume the most recent CLI session
|
||||
hermes -c # Short form
|
||||
hermes --resume 20260225_143052_a1b2c3 # Resume a specific session by ID
|
||||
hermes -r 20260225_143052_a1b2c3 # Short form
|
||||
```
|
||||
|
||||
Resuming restores the full conversation history from SQLite. The agent sees all previous messages, tool calls, and responses — just as if you never left.
|
||||
|
||||
Use `hermes sessions list` to browse past sessions.
|
||||
|
||||
### Session Logging
|
||||
|
||||
Sessions are automatically logged to `~/.hermes/sessions/`:
|
||||
|
||||
```
|
||||
sessions/
|
||||
├── session_20260201_143052_a1b2c3.json
|
||||
├── session_20260201_150217_d4e5f6.json
|
||||
└── ...
|
||||
```
|
||||
|
||||
### Context Compression
|
||||
|
||||
Long conversations are automatically summarized when approaching context limits:
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
compression:
|
||||
enabled: true
|
||||
threshold: 0.85 # Compress at 85% of context limit
|
||||
```
|
||||
|
||||
When compression triggers, middle turns are summarized while the first 3 and last 4 turns are always preserved.
|
||||
|
||||
## Quiet Mode
|
||||
|
||||
By default, the CLI runs in quiet mode which:
|
||||
- Suppresses verbose logging from tools
|
||||
- Enables kawaii-style animated feedback
|
||||
- Keeps output clean and user-friendly
|
||||
|
||||
For debug output:
|
||||
```bash
|
||||
hermes --verbose
|
||||
```
|
||||
204
website/docs/user-guide/configuration.md
Normal file
204
website/docs/user-guide/configuration.md
Normal file
@@ -0,0 +1,204 @@
|
||||
---
|
||||
sidebar_position: 2
|
||||
title: "Configuration"
|
||||
description: "Configure Hermes Agent — config.yaml, providers, models, API keys, and more"
|
||||
---
|
||||
|
||||
# Configuration
|
||||
|
||||
All settings are stored in the `~/.hermes/` directory for easy access.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```text
|
||||
~/.hermes/
|
||||
├── config.yaml # Settings (model, terminal, TTS, compression, etc.)
|
||||
├── .env # API keys and secrets
|
||||
├── auth.json # OAuth provider credentials (Nous Portal, etc.)
|
||||
├── SOUL.md # Optional: global persona (agent embodies this personality)
|
||||
├── memories/ # Persistent memory (MEMORY.md, USER.md)
|
||||
├── skills/ # Agent-created skills (managed via skill_manage tool)
|
||||
├── cron/ # Scheduled jobs
|
||||
├── sessions/ # Gateway sessions
|
||||
└── logs/ # Logs (errors.log, gateway.log — secrets auto-redacted)
|
||||
```
|
||||
|
||||
## Managing Configuration
|
||||
|
||||
```bash
|
||||
hermes config # View current configuration
|
||||
hermes config edit # Open config.yaml in your editor
|
||||
hermes config set KEY VAL # Set a specific value
|
||||
hermes config check # Check for missing options (after updates)
|
||||
hermes config migrate # Interactively add missing options
|
||||
|
||||
# Examples:
|
||||
hermes config set model anthropic/claude-opus-4
|
||||
hermes config set terminal.backend docker
|
||||
hermes config set OPENROUTER_API_KEY sk-or-... # Saves to .env
|
||||
```
|
||||
|
||||
:::tip
|
||||
The `hermes config set` command automatically routes values to the right file — API keys are saved to `.env`, everything else to `config.yaml`.
|
||||
:::
|
||||
|
||||
## Configuration Precedence
|
||||
|
||||
Settings are resolved in this order (highest priority first):
|
||||
|
||||
1. **CLI arguments** — `hermes chat --max-turns 100` (per-invocation override)
|
||||
2. **`~/.hermes/config.yaml`** — the primary config file for all non-secret settings
|
||||
3. **`~/.hermes/.env`** — fallback for env vars; **required** for secrets (API keys, tokens, passwords)
|
||||
4. **Built-in defaults** — hardcoded safe defaults when nothing else is set
|
||||
|
||||
:::info Rule of Thumb
|
||||
Secrets (API keys, bot tokens, passwords) go in `.env`. Everything else (model, terminal backend, compression settings, memory limits, toolsets) goes in `config.yaml`. When both are set, `config.yaml` wins for non-secret settings.
|
||||
:::
|
||||
|
||||
## Inference Providers
|
||||
|
||||
You need at least one way to connect to an LLM. Use `hermes model` to switch providers and models interactively, or configure directly:
|
||||
|
||||
| Provider | Setup |
|
||||
|----------|-------|
|
||||
| **Nous Portal** | `hermes model` (OAuth, subscription-based) |
|
||||
| **OpenAI Codex** | `hermes model` (ChatGPT OAuth, uses Codex models) |
|
||||
| **OpenRouter** | `OPENROUTER_API_KEY` in `~/.hermes/.env` |
|
||||
| **Custom Endpoint** | `OPENAI_BASE_URL` + `OPENAI_API_KEY` in `~/.hermes/.env` |
|
||||
|
||||
:::info Codex Note
|
||||
The OpenAI Codex provider authenticates via device code (open a URL, enter a code). Credentials are stored at `~/.codex/auth.json` and auto-refresh. No Codex CLI installation required.
|
||||
:::
|
||||
|
||||
:::warning
|
||||
Even when using Nous Portal, Codex, or a custom endpoint, some tools (vision, web summarization, MoA) use OpenRouter independently. An `OPENROUTER_API_KEY` enables these tools.
|
||||
:::
|
||||
|
||||
## Optional API Keys
|
||||
|
||||
| Feature | Provider | Env Variable |
|
||||
|---------|----------|--------------|
|
||||
| Web scraping | [Firecrawl](https://firecrawl.dev/) | `FIRECRAWL_API_KEY` |
|
||||
| Browser automation | [Browserbase](https://browserbase.com/) | `BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID` |
|
||||
| Image generation | [FAL](https://fal.ai/) | `FAL_KEY` |
|
||||
| Premium TTS voices | [ElevenLabs](https://elevenlabs.io/) | `ELEVENLABS_API_KEY` |
|
||||
| OpenAI TTS + voice transcription | [OpenAI](https://platform.openai.com/api-keys) | `VOICE_TOOLS_OPENAI_KEY` |
|
||||
| RL Training | [Tinker](https://tinker-console.thinkingmachines.ai/) + [WandB](https://wandb.ai/) | `TINKER_API_KEY`, `WANDB_API_KEY` |
|
||||
| Cross-session user modeling | [Honcho](https://honcho.dev/) | `HONCHO_API_KEY` |
|
||||
|
||||
## OpenRouter Provider Routing
|
||||
|
||||
When using OpenRouter, you can control how requests are routed across providers. Add a `provider_routing` section to `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
provider_routing:
|
||||
sort: "throughput" # "price" (default), "throughput", or "latency"
|
||||
# only: ["anthropic"] # Only use these providers
|
||||
# ignore: ["deepinfra"] # Skip these providers
|
||||
# order: ["anthropic", "google"] # Try providers in this order
|
||||
# require_parameters: true # Only use providers that support all request params
|
||||
# data_collection: "deny" # Exclude providers that may store/train on data
|
||||
```
|
||||
|
||||
**Shortcuts:** Append `:nitro` to any model name for throughput sorting (e.g., `anthropic/claude-sonnet-4:nitro`), or `:floor` for price sorting.
|
||||
|
||||
## Terminal Backend Configuration
|
||||
|
||||
Configure which environment the agent uses for terminal commands:
|
||||
|
||||
```yaml
|
||||
terminal:
|
||||
backend: local # or: docker, ssh, singularity, modal
|
||||
cwd: "." # Working directory ("." = current dir)
|
||||
timeout: 180 # Command timeout in seconds
|
||||
```
|
||||
|
||||
See [Code Execution](features/code-execution.md) and the [Terminal section of the README](features/tools.md) for details on each backend.
|
||||
|
||||
## Memory Configuration
|
||||
|
||||
```yaml
|
||||
memory:
|
||||
memory_enabled: true
|
||||
user_profile_enabled: true
|
||||
memory_char_limit: 2200 # ~800 tokens
|
||||
user_char_limit: 1375 # ~500 tokens
|
||||
```
|
||||
|
||||
## Context Compression
|
||||
|
||||
```yaml
|
||||
compression:
|
||||
enabled: true
|
||||
threshold: 0.85 # Compress at 85% of context limit
|
||||
```
|
||||
|
||||
## Reasoning Effort
|
||||
|
||||
Control how much "thinking" the model does before responding:
|
||||
|
||||
```yaml
|
||||
agent:
|
||||
reasoning_effort: "xhigh" # xhigh (max), high, medium, low, minimal, none
|
||||
```
|
||||
|
||||
Higher reasoning effort gives better results on complex tasks at the cost of more tokens and latency.
|
||||
|
||||
## TTS Configuration
|
||||
|
||||
```yaml
|
||||
tts:
|
||||
provider: "edge" # "edge" | "elevenlabs" | "openai"
|
||||
edge:
|
||||
voice: "en-US-AriaNeural" # 322 voices, 74 languages
|
||||
elevenlabs:
|
||||
voice_id: "pNInz6obpgDQGcFmaJgB"
|
||||
model_id: "eleven_multilingual_v2"
|
||||
openai:
|
||||
model: "gpt-4o-mini-tts"
|
||||
voice: "alloy" # alloy, echo, fable, onyx, nova, shimmer
|
||||
```
|
||||
|
||||
## Display Settings
|
||||
|
||||
```yaml
|
||||
display:
|
||||
tool_progress: all # off | new | all | verbose
|
||||
```
|
||||
|
||||
| Mode | What you see |
|
||||
|------|-------------|
|
||||
| `off` | Silent — just the final response |
|
||||
| `new` | Tool indicator only when the tool changes |
|
||||
| `all` | Every tool call with a short preview (default) |
|
||||
| `verbose` | Full args, results, and debug logs |
|
||||
|
||||
## Context Files (SOUL.md, AGENTS.md)
|
||||
|
||||
Drop these files in your project directory and the agent automatically picks them up:
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `AGENTS.md` | Project-specific instructions, coding conventions |
|
||||
| `SOUL.md` | Persona definition — the agent embodies this personality |
|
||||
| `.cursorrules` | Cursor IDE rules (also detected) |
|
||||
| `.cursor/rules/*.mdc` | Cursor rule files (also detected) |
|
||||
|
||||
- **AGENTS.md** is hierarchical: if subdirectories also have AGENTS.md, all are combined.
|
||||
- **SOUL.md** checks cwd first, then `~/.hermes/SOUL.md` as a global fallback.
|
||||
- All context files are capped at 20,000 characters with smart truncation.
|
||||
|
||||
## Working Directory
|
||||
|
||||
| Context | Default |
|
||||
|---------|---------|
|
||||
| **CLI (`hermes`)** | Current directory where you run the command |
|
||||
| **Messaging gateway** | Home directory `~` (override with `MESSAGING_CWD`) |
|
||||
| **Docker / Singularity / Modal / SSH** | User's home directory inside the container or remote machine |
|
||||
|
||||
Override the working directory:
|
||||
```bash
|
||||
# In ~/.hermes/.env or ~/.hermes/config.yaml:
|
||||
MESSAGING_CWD=/home/myuser/projects # Gateway sessions
|
||||
TERMINAL_CWD=/workspace # All terminal sessions
|
||||
```
|
||||
8
website/docs/user-guide/features/_category_.json
Normal file
8
website/docs/user-guide/features/_category_.json
Normal file
@@ -0,0 +1,8 @@
|
||||
{
|
||||
"label": "Features",
|
||||
"position": 4,
|
||||
"link": {
|
||||
"type": "generated-index",
|
||||
"description": "Explore the powerful features of Hermes Agent."
|
||||
}
|
||||
}
|
||||
51
website/docs/user-guide/features/code-execution.md
Normal file
51
website/docs/user-guide/features/code-execution.md
Normal file
@@ -0,0 +1,51 @@
|
||||
---
|
||||
sidebar_position: 8
|
||||
title: "Code Execution"
|
||||
description: "Sandboxed Python execution with RPC tool access — collapse multi-step workflows into a single turn"
|
||||
---
|
||||
|
||||
# Code Execution (Programmatic Tool Calling)
|
||||
|
||||
The `execute_code` tool lets the agent write Python scripts that call Hermes tools programmatically, collapsing multi-step workflows into a single LLM turn. The script runs in a sandboxed child process on the agent host, communicating via Unix domain socket RPC.
|
||||
|
||||
## How It Works
|
||||
|
||||
```python
|
||||
# The agent can write scripts like:
|
||||
from hermes_tools import web_search, web_extract
|
||||
|
||||
results = web_search("Python 3.13 features", limit=5)
|
||||
for r in results["data"]["web"]:
|
||||
content = web_extract([r["url"]])
|
||||
# ... filter and process ...
|
||||
print(summary)
|
||||
```
|
||||
|
||||
**Available tools in sandbox:** `web_search`, `web_extract`, `read_file`, `write_file`, `search`, `patch`, `terminal` (foreground only).
|
||||
|
||||
## When the Agent Uses This
|
||||
|
||||
The agent uses `execute_code` when there are:
|
||||
|
||||
- **3+ tool calls** with processing logic between them
|
||||
- Bulk data filtering or conditional branching
|
||||
- Loops over results
|
||||
|
||||
The key benefit: intermediate tool results never enter the context window — only the final `print()` output comes back, dramatically reducing token usage.
|
||||
|
||||
## Security
|
||||
|
||||
:::danger Security Model
|
||||
The child process runs with a **minimal environment**. API keys, tokens, and credentials are stripped entirely. The script accesses tools exclusively via the RPC channel — it cannot read secrets from environment variables.
|
||||
:::
|
||||
|
||||
Only safe system variables (`PATH`, `HOME`, `LANG`, etc.) are passed through.
|
||||
|
||||
## Configuration
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
code_execution:
|
||||
timeout: 300 # Max seconds per script (default: 300)
|
||||
max_tool_calls: 50 # Max tool calls per execution (default: 50)
|
||||
```
|
||||
87
website/docs/user-guide/features/cron.md
Normal file
87
website/docs/user-guide/features/cron.md
Normal file
@@ -0,0 +1,87 @@
|
||||
---
|
||||
sidebar_position: 5
|
||||
title: "Scheduled Tasks (Cron)"
|
||||
description: "Schedule automated tasks with natural language — cron jobs, delivery options, and the gateway scheduler"
|
||||
---
|
||||
|
||||
# Scheduled Tasks (Cron)
|
||||
|
||||
Schedule tasks to run automatically with natural language or cron expressions. The agent can self-schedule using the `schedule_cronjob` tool from any platform.
|
||||
|
||||
## Creating Scheduled Tasks
|
||||
|
||||
### In the CLI
|
||||
|
||||
Use the `/cron` slash command:
|
||||
|
||||
```
|
||||
/cron add 30m "Remind me to check the build"
|
||||
/cron add "every 2h" "Check server status"
|
||||
/cron add "0 9 * * *" "Morning briefing"
|
||||
/cron list
|
||||
/cron remove <job_id>
|
||||
```
|
||||
|
||||
### Through Natural Conversation
|
||||
|
||||
Simply ask the agent on any platform:
|
||||
|
||||
```
|
||||
Every morning at 9am, check Hacker News for AI news and send me a summary on Telegram.
|
||||
```
|
||||
|
||||
The agent will use the `schedule_cronjob` tool to set it up.
|
||||
|
||||
## How It Works
|
||||
|
||||
**Cron execution is handled by the gateway daemon.** The gateway ticks the scheduler every 60 seconds, running any due jobs in isolated agent sessions:
|
||||
|
||||
```bash
|
||||
hermes gateway install # Install as system service (recommended)
|
||||
hermes gateway # Or run in foreground
|
||||
|
||||
hermes cron list # View scheduled jobs
|
||||
hermes cron status # Check if gateway is running
|
||||
```
|
||||
|
||||
:::info
|
||||
Even if no messaging platforms are configured, the gateway stays running for cron. A file lock prevents duplicate execution if multiple processes overlap.
|
||||
:::
|
||||
|
||||
## Delivery Options
|
||||
|
||||
When scheduling jobs, you specify where the output goes:
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| `"origin"` | Back to where the job was created |
|
||||
| `"local"` | Save to local files only |
|
||||
| `"telegram"` | Telegram home channel |
|
||||
| `"discord"` | Discord home channel |
|
||||
| `"telegram:123456"` | Specific Telegram chat |
|
||||
|
||||
The agent knows your connected platforms and home channels — it'll choose sensible defaults.
|
||||
|
||||
## Schedule Formats
|
||||
|
||||
- **Relative:** `30m`, `2h`, `1d`
|
||||
- **Human-readable:** `"every 2 hours"`, `"daily at 9am"`
|
||||
- **Cron expressions:** `"0 9 * * *"` (standard 5-field cron syntax)
|
||||
|
||||
## Managing Jobs
|
||||
|
||||
```bash
|
||||
# CLI commands
|
||||
hermes cron list # View all scheduled jobs
|
||||
hermes cron status # Check if the scheduler is running
|
||||
|
||||
# Slash commands (inside chat)
|
||||
/cron list
|
||||
/cron remove <job_id>
|
||||
```
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
Scheduled task prompts are scanned for instruction-override patterns (prompt injection). Jobs with suspicious content are blocked.
|
||||
:::
|
||||
60
website/docs/user-guide/features/delegation.md
Normal file
60
website/docs/user-guide/features/delegation.md
Normal file
@@ -0,0 +1,60 @@
|
||||
---
|
||||
sidebar_position: 7
|
||||
title: "Subagent Delegation"
|
||||
description: "Spawn isolated child agents for parallel workstreams with delegate_task"
|
||||
---
|
||||
|
||||
# Subagent Delegation
|
||||
|
||||
The `delegate_task` tool spawns child AIAgent instances with isolated context, restricted toolsets, and their own terminal sessions. Each child gets a fresh conversation and works independently — only its final summary enters the parent's context.
|
||||
|
||||
## Single Task
|
||||
|
||||
```python
|
||||
delegate_task(
|
||||
goal="Debug why tests fail",
|
||||
context="Error: assertion in test_foo.py line 42",
|
||||
toolsets=["terminal", "file"]
|
||||
)
|
||||
```
|
||||
|
||||
## Parallel Batch
|
||||
|
||||
Up to 3 concurrent subagents:
|
||||
|
||||
```python
|
||||
delegate_task(tasks=[
|
||||
{"goal": "Research topic A", "toolsets": ["web"]},
|
||||
{"goal": "Research topic B", "toolsets": ["web"]},
|
||||
{"goal": "Fix the build", "toolsets": ["terminal", "file"]}
|
||||
])
|
||||
```
|
||||
|
||||
## Key Properties
|
||||
|
||||
- Each subagent gets its **own terminal session** (separate from the parent)
|
||||
- **Depth limit of 2** — no grandchildren
|
||||
- Subagents **cannot** call: `delegate_task`, `clarify`, `memory`, `send_message`, `execute_code`
|
||||
- **Interrupt propagation** — interrupting the parent interrupts all active children
|
||||
- Only the final summary enters the parent's context, keeping token usage efficient
|
||||
|
||||
## Configuration
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
delegation:
|
||||
max_iterations: 50 # Max turns per child (default: 50)
|
||||
default_toolsets: ["terminal", "file", "web"] # Default toolsets
|
||||
```
|
||||
|
||||
## When to Use Delegation
|
||||
|
||||
Delegation is most useful when:
|
||||
|
||||
- You have **independent workstreams** that can run in parallel
|
||||
- A subtask needs a **clean context** (e.g., debugging a long error trace without polluting the main conversation)
|
||||
- You want to **fan out** research across multiple topics and collect summaries
|
||||
|
||||
:::tip
|
||||
The agent handles delegation automatically based on the task complexity. You don't need to explicitly ask it to delegate — it will do so when it makes sense.
|
||||
:::
|
||||
182
website/docs/user-guide/features/hooks.md
Normal file
182
website/docs/user-guide/features/hooks.md
Normal file
@@ -0,0 +1,182 @@
|
||||
---
|
||||
sidebar_position: 6
|
||||
title: "Event Hooks"
|
||||
description: "Run custom code at key lifecycle points — log activity, send alerts, post to webhooks"
|
||||
---
|
||||
|
||||
# Event Hooks
|
||||
|
||||
The hooks system lets you run custom code at key points in the agent lifecycle — session creation, slash commands, each tool-calling step, and more. Hooks fire automatically during gateway operation without blocking the main agent pipeline.
|
||||
|
||||
## Creating a Hook
|
||||
|
||||
Each hook is a directory under `~/.hermes/hooks/` containing two files:
|
||||
|
||||
```
|
||||
~/.hermes/hooks/
|
||||
└── my-hook/
|
||||
├── HOOK.yaml # Declares which events to listen for
|
||||
└── handler.py # Python handler function
|
||||
```
|
||||
|
||||
### HOOK.yaml
|
||||
|
||||
```yaml
|
||||
name: my-hook
|
||||
description: Log all agent activity to a file
|
||||
events:
|
||||
- agent:start
|
||||
- agent:end
|
||||
- agent:step
|
||||
```
|
||||
|
||||
The `events` list determines which events trigger your handler. You can subscribe to any combination of events, including wildcards like `command:*`.
|
||||
|
||||
### handler.py
|
||||
|
||||
```python
|
||||
import json
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
|
||||
LOG_FILE = Path.home() / ".hermes" / "hooks" / "my-hook" / "activity.log"
|
||||
|
||||
async def handle(event_type: str, context: dict):
|
||||
"""Called for each subscribed event. Must be named 'handle'."""
|
||||
entry = {
|
||||
"timestamp": datetime.now().isoformat(),
|
||||
"event": event_type,
|
||||
**context,
|
||||
}
|
||||
with open(LOG_FILE, "a") as f:
|
||||
f.write(json.dumps(entry) + "\n")
|
||||
```
|
||||
|
||||
**Handler rules:**
|
||||
- Must be named `handle`
|
||||
- Receives `event_type` (string) and `context` (dict)
|
||||
- Can be `async def` or regular `def` — both work
|
||||
- Errors are caught and logged, never crashing the agent
|
||||
|
||||
## Available Events
|
||||
|
||||
| Event | When it fires | Context keys |
|
||||
|-------|---------------|--------------|
|
||||
| `gateway:startup` | Gateway process starts | `platforms` (list of active platform names) |
|
||||
| `session:start` | New messaging session created | `platform`, `user_id`, `session_id`, `session_key` |
|
||||
| `session:reset` | User ran `/new` or `/reset` | `platform`, `user_id`, `session_key` |
|
||||
| `agent:start` | Agent begins processing a message | `platform`, `user_id`, `session_id`, `message` |
|
||||
| `agent:step` | Each iteration of the tool-calling loop | `platform`, `user_id`, `session_id`, `iteration`, `tool_names` |
|
||||
| `agent:end` | Agent finishes processing | `platform`, `user_id`, `session_id`, `message`, `response` |
|
||||
| `command:*` | Any slash command executed | `platform`, `user_id`, `command`, `args` |
|
||||
|
||||
### Wildcard Matching
|
||||
|
||||
Handlers registered for `command:*` fire for any `command:` event (`command:model`, `command:reset`, etc.). Monitor all slash commands with a single subscription.
|
||||
|
||||
## Examples
|
||||
|
||||
### Telegram Alert on Long Tasks
|
||||
|
||||
Send yourself a message when the agent takes more than 10 steps:
|
||||
|
||||
```yaml
|
||||
# ~/.hermes/hooks/long-task-alert/HOOK.yaml
|
||||
name: long-task-alert
|
||||
description: Alert when agent is taking many steps
|
||||
events:
|
||||
- agent:step
|
||||
```
|
||||
|
||||
```python
|
||||
# ~/.hermes/hooks/long-task-alert/handler.py
|
||||
import os
|
||||
import httpx
|
||||
|
||||
THRESHOLD = 10
|
||||
BOT_TOKEN = os.getenv("TELEGRAM_BOT_TOKEN")
|
||||
CHAT_ID = os.getenv("TELEGRAM_HOME_CHANNEL")
|
||||
|
||||
async def handle(event_type: str, context: dict):
|
||||
iteration = context.get("iteration", 0)
|
||||
if iteration == THRESHOLD and BOT_TOKEN and CHAT_ID:
|
||||
tools = ", ".join(context.get("tool_names", []))
|
||||
text = f"⚠️ Agent has been running for {iteration} steps. Last tools: {tools}"
|
||||
async with httpx.AsyncClient() as client:
|
||||
await client.post(
|
||||
f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
|
||||
json={"chat_id": CHAT_ID, "text": text},
|
||||
)
|
||||
```
|
||||
|
||||
### Command Usage Logger
|
||||
|
||||
Track which slash commands are used:
|
||||
|
||||
```yaml
|
||||
# ~/.hermes/hooks/command-logger/HOOK.yaml
|
||||
name: command-logger
|
||||
description: Log slash command usage
|
||||
events:
|
||||
- command:*
|
||||
```
|
||||
|
||||
```python
|
||||
# ~/.hermes/hooks/command-logger/handler.py
|
||||
import json
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
|
||||
LOG = Path.home() / ".hermes" / "logs" / "command_usage.jsonl"
|
||||
|
||||
def handle(event_type: str, context: dict):
|
||||
LOG.parent.mkdir(parents=True, exist_ok=True)
|
||||
entry = {
|
||||
"ts": datetime.now().isoformat(),
|
||||
"command": context.get("command"),
|
||||
"args": context.get("args"),
|
||||
"platform": context.get("platform"),
|
||||
"user": context.get("user_id"),
|
||||
}
|
||||
with open(LOG, "a") as f:
|
||||
f.write(json.dumps(entry) + "\n")
|
||||
```
|
||||
|
||||
### Session Start Webhook
|
||||
|
||||
POST to an external service on new sessions:
|
||||
|
||||
```yaml
|
||||
# ~/.hermes/hooks/session-webhook/HOOK.yaml
|
||||
name: session-webhook
|
||||
description: Notify external service on new sessions
|
||||
events:
|
||||
- session:start
|
||||
- session:reset
|
||||
```
|
||||
|
||||
```python
|
||||
# ~/.hermes/hooks/session-webhook/handler.py
|
||||
import httpx
|
||||
|
||||
WEBHOOK_URL = "https://your-service.example.com/hermes-events"
|
||||
|
||||
async def handle(event_type: str, context: dict):
|
||||
async with httpx.AsyncClient() as client:
|
||||
await client.post(WEBHOOK_URL, json={
|
||||
"event": event_type,
|
||||
**context,
|
||||
}, timeout=5)
|
||||
```
|
||||
|
||||
## How It Works
|
||||
|
||||
1. On gateway startup, `HookRegistry.discover_and_load()` scans `~/.hermes/hooks/`
|
||||
2. Each subdirectory with `HOOK.yaml` + `handler.py` is loaded dynamically
|
||||
3. Handlers are registered for their declared events
|
||||
4. At each lifecycle point, `hooks.emit()` fires all matching handlers
|
||||
5. Errors in any handler are caught and logged — a broken hook never crashes the agent
|
||||
|
||||
:::info
|
||||
Hooks only fire in the **gateway** (Telegram, Discord, Slack, WhatsApp). The CLI does not currently load hooks.
|
||||
:::
|
||||
269
website/docs/user-guide/features/mcp.md
Normal file
269
website/docs/user-guide/features/mcp.md
Normal file
@@ -0,0 +1,269 @@
|
||||
---
|
||||
sidebar_position: 4
|
||||
title: "MCP (Model Context Protocol)"
|
||||
description: "Connect Hermes Agent to external tool servers via MCP — databases, APIs, filesystems, and more"
|
||||
---
|
||||
|
||||
# MCP (Model Context Protocol)
|
||||
|
||||
MCP lets Hermes Agent connect to external tool servers — giving the agent access to databases, APIs, filesystems, and more without any code changes.
|
||||
|
||||
## Overview
|
||||
|
||||
The [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) is an open standard for connecting AI agents to external tools and data sources. MCP servers expose tools over a lightweight RPC protocol, and Hermes Agent can connect to any compliant server automatically.
|
||||
|
||||
What this means for you:
|
||||
|
||||
- **Thousands of ready-made tools** — browse the [MCP server directory](https://github.com/modelcontextprotocol/servers) for servers covering GitHub, Slack, databases, file systems, web scraping, and more
|
||||
- **No code changes needed** — add a few lines to `~/.hermes/config.yaml` and the tools appear alongside built-in ones
|
||||
- **Mix and match** — run multiple MCP servers simultaneously, combining stdio-based and HTTP-based servers
|
||||
- **Secure by default** — environment variables are filtered and credentials are stripped from error messages
|
||||
|
||||
## Prerequisites
|
||||
|
||||
```bash
|
||||
pip install hermes-agent[mcp]
|
||||
```
|
||||
|
||||
| Server Type | Runtime Needed | Example |
|
||||
|-------------|---------------|---------|
|
||||
| HTTP/remote | Nothing extra | `url: "https://mcp.example.com"` |
|
||||
| npm-based (npx) | Node.js 18+ | `command: "npx"` |
|
||||
| Python-based | uv (recommended) | `command: "uvx"` |
|
||||
|
||||
## Configuration
|
||||
|
||||
MCP servers are configured in `~/.hermes/config.yaml` under the `mcp_servers` key.
|
||||
|
||||
### Stdio Servers
|
||||
|
||||
Stdio servers run as local subprocesses, communicating over stdin/stdout:
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
filesystem:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"]
|
||||
env: {}
|
||||
|
||||
github:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-github"]
|
||||
env:
|
||||
GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_xxxxxxxxxxxx"
|
||||
```
|
||||
|
||||
| Key | Required | Description |
|
||||
|-----|----------|-------------|
|
||||
| `command` | Yes | Executable to run (`npx`, `uvx`, `python`) |
|
||||
| `args` | No | Command-line arguments |
|
||||
| `env` | No | Environment variables for the subprocess |
|
||||
|
||||
:::info Security
|
||||
Only explicitly listed `env` variables plus a safe baseline (`PATH`, `HOME`, `USER`, `LANG`, `SHELL`, `TMPDIR`, `XDG_*`) are passed to the subprocess. Your API keys and secrets are **not** leaked.
|
||||
:::
|
||||
|
||||
### HTTP Servers
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
remote_api:
|
||||
url: "https://my-mcp-server.example.com/mcp"
|
||||
headers:
|
||||
Authorization: "Bearer sk-xxxxxxxxxxxx"
|
||||
```
|
||||
|
||||
### Per-Server Timeouts
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
slow_database:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-postgres"]
|
||||
env:
|
||||
DATABASE_URL: "postgres://user:pass@localhost/mydb"
|
||||
timeout: 300 # Tool call timeout (default: 120s)
|
||||
connect_timeout: 90 # Initial connection timeout (default: 60s)
|
||||
```
|
||||
|
||||
### Mixed Configuration Example
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
# Local filesystem via stdio
|
||||
filesystem:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
|
||||
|
||||
# GitHub API via stdio with auth
|
||||
github:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-github"]
|
||||
env:
|
||||
GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_xxxxxxxxxxxx"
|
||||
|
||||
# Remote database via HTTP
|
||||
company_db:
|
||||
url: "https://mcp.internal.company.com/db"
|
||||
headers:
|
||||
Authorization: "Bearer sk-xxxxxxxxxxxx"
|
||||
timeout: 180
|
||||
|
||||
# Python-based server via uvx
|
||||
memory:
|
||||
command: "uvx"
|
||||
args: ["mcp-server-memory"]
|
||||
```
|
||||
|
||||
## Translating from Claude Desktop Config
|
||||
|
||||
Many MCP server docs show Claude Desktop JSON format. Here's the translation:
|
||||
|
||||
**Claude Desktop JSON:**
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"filesystem": {
|
||||
"command": "npx",
|
||||
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Hermes YAML:**
|
||||
```yaml
|
||||
mcp_servers: # mcpServers → mcp_servers (snake_case)
|
||||
filesystem:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
|
||||
```
|
||||
|
||||
Rules: `mcpServers` → `mcp_servers` (snake_case), JSON → YAML. Keys like `command`, `args`, `env` are identical.
|
||||
|
||||
## How It Works
|
||||
|
||||
### Tool Registration
|
||||
|
||||
Each MCP tool is registered with a prefixed name:
|
||||
|
||||
```
|
||||
mcp_{server_name}_{tool_name}
|
||||
```
|
||||
|
||||
| Server Name | MCP Tool Name | Registered As |
|
||||
|-------------|--------------|---------------|
|
||||
| `filesystem` | `read_file` | `mcp_filesystem_read_file` |
|
||||
| `github` | `create-issue` | `mcp_github_create_issue` |
|
||||
| `my-api` | `query.data` | `mcp_my_api_query_data` |
|
||||
|
||||
Tools appear alongside built-in tools — the agent calls them like any other tool.
|
||||
|
||||
### Reconnection
|
||||
|
||||
If an MCP server disconnects, Hermes automatically reconnects with exponential backoff (1s, 2s, 4s, 8s, 16s — max 5 attempts). Initial connection failures are reported immediately.
|
||||
|
||||
### Shutdown
|
||||
|
||||
On agent exit, all MCP server connections are cleanly shut down.
|
||||
|
||||
## Popular MCP Servers
|
||||
|
||||
| Server | Package | Description |
|
||||
|--------|---------|-------------|
|
||||
| Filesystem | `@modelcontextprotocol/server-filesystem` | Read/write/search local files |
|
||||
| GitHub | `@modelcontextprotocol/server-github` | Issues, PRs, repos, code search |
|
||||
| Git | `@modelcontextprotocol/server-git` | Git operations on local repos |
|
||||
| Fetch | `@modelcontextprotocol/server-fetch` | HTTP fetching and web content |
|
||||
| Memory | `@modelcontextprotocol/server-memory` | Persistent key-value memory |
|
||||
| SQLite | `@modelcontextprotocol/server-sqlite` | Query SQLite databases |
|
||||
| PostgreSQL | `@modelcontextprotocol/server-postgres` | Query PostgreSQL databases |
|
||||
| Brave Search | `@modelcontextprotocol/server-brave-search` | Web search via Brave API |
|
||||
| Puppeteer | `@modelcontextprotocol/server-puppeteer` | Browser automation |
|
||||
|
||||
### Example Configs
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
# No API key needed
|
||||
filesystem:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"]
|
||||
|
||||
git:
|
||||
command: "uvx"
|
||||
args: ["mcp-server-git", "--repository", "/home/user/my-repo"]
|
||||
|
||||
fetch:
|
||||
command: "uvx"
|
||||
args: ["mcp-server-fetch"]
|
||||
|
||||
sqlite:
|
||||
command: "uvx"
|
||||
args: ["mcp-server-sqlite", "--db-path", "/home/user/data.db"]
|
||||
|
||||
# Requires API key
|
||||
github:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-github"]
|
||||
env:
|
||||
GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_xxxxxxxxxxxx"
|
||||
|
||||
brave_search:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-brave-search"]
|
||||
env:
|
||||
BRAVE_API_KEY: "BSA_xxxxxxxxxxxx"
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "MCP SDK not available"
|
||||
|
||||
```bash
|
||||
pip install hermes-agent[mcp]
|
||||
```
|
||||
|
||||
### Server fails to start
|
||||
|
||||
The MCP server command (`npx`, `uvx`) is not on PATH. Install the required runtime:
|
||||
|
||||
```bash
|
||||
# For npm-based servers
|
||||
npm install -g npx # or ensure Node.js 18+ is installed
|
||||
|
||||
# For Python-based servers
|
||||
pip install uv # then use "uvx" as the command
|
||||
```
|
||||
|
||||
### Server connects but tools fail with auth errors
|
||||
|
||||
Ensure the key is in the server's `env` block:
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
github:
|
||||
command: "npx"
|
||||
args: ["-y", "@modelcontextprotocol/server-github"]
|
||||
env:
|
||||
GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_your_actual_token" # Check this
|
||||
```
|
||||
|
||||
### Connection timeout
|
||||
|
||||
Increase `connect_timeout` for slow-starting servers:
|
||||
|
||||
```yaml
|
||||
mcp_servers:
|
||||
slow_server:
|
||||
command: "npx"
|
||||
args: ["-y", "heavy-server-package"]
|
||||
connect_timeout: 120 # default is 60
|
||||
```
|
||||
|
||||
### Reload MCP Servers
|
||||
|
||||
You can reload MCP servers without restarting Hermes:
|
||||
|
||||
- In the CLI: the agent reconnects automatically
|
||||
- In messaging: send `/reload-mcp`
|
||||
97
website/docs/user-guide/features/memory.md
Normal file
97
website/docs/user-guide/features/memory.md
Normal file
@@ -0,0 +1,97 @@
|
||||
---
|
||||
sidebar_position: 3
|
||||
title: "Persistent Memory"
|
||||
description: "How Hermes Agent remembers across sessions — MEMORY.md, USER.md, and session search"
|
||||
---
|
||||
|
||||
# Persistent Memory
|
||||
|
||||
Hermes Agent has bounded, curated memory that persists across sessions. This lets it remember your preferences, your projects, your environment, and things it has learned.
|
||||
|
||||
## How It Works
|
||||
|
||||
Two files make up the agent's memory:
|
||||
|
||||
| File | Purpose | Token Budget |
|
||||
|------|---------|-------------|
|
||||
| **MEMORY.md** | Agent's personal notes — environment facts, conventions, things learned | ~800 tokens (~2200 chars) |
|
||||
| **USER.md** | User profile — your preferences, communication style, expectations | ~500 tokens (~1375 chars) |
|
||||
|
||||
Both are stored in `~/.hermes/memories/` and are injected into the system prompt as a frozen snapshot at session start. The agent manages its own memory via the `memory` tool — it can add, replace, or remove entries.
|
||||
|
||||
:::info
|
||||
Character limits keep memory focused. When memory is full, the agent consolidates or replaces entries to make room for new information.
|
||||
:::
|
||||
|
||||
## Configuration
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
memory:
|
||||
memory_enabled: true
|
||||
user_profile_enabled: true
|
||||
memory_char_limit: 2200 # ~800 tokens
|
||||
user_char_limit: 1375 # ~500 tokens
|
||||
```
|
||||
|
||||
## Memory Tool Actions
|
||||
|
||||
The agent uses the `memory` tool with these actions:
|
||||
|
||||
- **add** — Add a new memory entry
|
||||
- **replace** — Replace an existing entry with updated content
|
||||
- **remove** — Remove an entry that's no longer relevant
|
||||
- **read** — Read current memory contents
|
||||
|
||||
## Session Search
|
||||
|
||||
Beyond MEMORY.md and USER.md, the agent can search its past conversations using the `session_search` tool:
|
||||
|
||||
- All CLI and messaging sessions are stored in SQLite (`~/.hermes/state.db`) with FTS5 full-text search
|
||||
- Search queries return relevant past conversations with Gemini Flash summarization
|
||||
- The agent can find things it discussed weeks ago, even if they're not in its active memory
|
||||
|
||||
```bash
|
||||
hermes sessions list # Browse past sessions
|
||||
```
|
||||
|
||||
## Honcho Integration (Cross-Session User Modeling)
|
||||
|
||||
For deeper, AI-generated user understanding that works across tools, you can optionally enable [Honcho](https://honcho.dev/) by Plastic Labs. Honcho runs alongside existing memory — USER.md stays as-is, and Honcho adds an additional layer of context.
|
||||
|
||||
When enabled:
|
||||
- **Prefetch**: Each turn, Honcho's user representation is injected into the system prompt
|
||||
- **Sync**: After each conversation, messages are synced to Honcho
|
||||
- **Query tool**: The agent can actively query its understanding of you via `query_user_context`
|
||||
|
||||
**Setup:**
|
||||
|
||||
```bash
|
||||
# 1. Install the optional dependency
|
||||
uv pip install honcho-ai
|
||||
|
||||
# 2. Get an API key from https://app.honcho.dev
|
||||
|
||||
# 3. Create ~/.honcho/config.json
|
||||
cat > ~/.honcho/config.json << 'EOF'
|
||||
{
|
||||
"enabled": true,
|
||||
"apiKey": "your-honcho-api-key",
|
||||
"peerName": "your-name",
|
||||
"hosts": {
|
||||
"hermes": {
|
||||
"workspace": "hermes"
|
||||
}
|
||||
}
|
||||
}
|
||||
EOF
|
||||
```
|
||||
|
||||
Or via environment variable:
|
||||
```bash
|
||||
hermes config set HONCHO_API_KEY your-key
|
||||
```
|
||||
|
||||
:::tip
|
||||
Honcho is fully opt-in — zero behavior change when disabled or unconfigured. All Honcho calls are non-fatal; if the service is unreachable, the agent continues normally.
|
||||
:::
|
||||
159
website/docs/user-guide/features/skills.md
Normal file
159
website/docs/user-guide/features/skills.md
Normal file
@@ -0,0 +1,159 @@
|
||||
---
|
||||
sidebar_position: 2
|
||||
title: "Skills System"
|
||||
description: "On-demand knowledge documents — progressive disclosure, agent-managed skills, and the Skills Hub"
|
||||
---
|
||||
|
||||
# Skills System
|
||||
|
||||
Skills are on-demand knowledge documents the agent can load when needed. They follow a **progressive disclosure** pattern to minimize token usage and are compatible with the [agentskills.io](https://agentskills.io/specification) open standard.
|
||||
|
||||
All skills live in **`~/.hermes/skills/`** — a single directory that serves as the source of truth. On fresh install, bundled skills are copied from the repo. Hub-installed and agent-created skills also go here. The agent can modify or delete any skill.
|
||||
|
||||
## Using Skills
|
||||
|
||||
Every installed skill is automatically available as a slash command:
|
||||
|
||||
```bash
|
||||
# In the CLI or any messaging platform:
|
||||
/gif-search funny cats
|
||||
/axolotl help me fine-tune Llama 3 on my dataset
|
||||
/github-pr-workflow create a PR for the auth refactor
|
||||
|
||||
# Just the skill name loads it and lets the agent ask what you need:
|
||||
/excalidraw
|
||||
```
|
||||
|
||||
You can also interact with skills through natural conversation:
|
||||
|
||||
```bash
|
||||
hermes --toolsets skills -q "What skills do you have?"
|
||||
hermes --toolsets skills -q "Show me the axolotl skill"
|
||||
```
|
||||
|
||||
## Progressive Disclosure
|
||||
|
||||
Skills use a token-efficient loading pattern:
|
||||
|
||||
```
|
||||
Level 0: skills_categories() → ["mlops", "devops"] (~50 tokens)
|
||||
Level 1: skills_list(category) → [{name, description}, ...] (~3k tokens)
|
||||
Level 2: skill_view(name) → Full content + metadata (varies)
|
||||
Level 3: skill_view(name, path) → Specific reference file (varies)
|
||||
```
|
||||
|
||||
The agent only loads the full skill content when it actually needs it.
|
||||
|
||||
## SKILL.md Format
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: my-skill
|
||||
description: Brief description of what this skill does
|
||||
version: 1.0.0
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [python, automation]
|
||||
category: devops
|
||||
---
|
||||
|
||||
# Skill Title
|
||||
|
||||
## When to Use
|
||||
Trigger conditions for this skill.
|
||||
|
||||
## Procedure
|
||||
1. Step one
|
||||
2. Step two
|
||||
|
||||
## Pitfalls
|
||||
- Known failure modes and fixes
|
||||
|
||||
## Verification
|
||||
How to confirm it worked.
|
||||
```
|
||||
|
||||
## Skill Directory Structure
|
||||
|
||||
```
|
||||
~/.hermes/skills/ # Single source of truth
|
||||
├── mlops/ # Category directory
|
||||
│ ├── axolotl/
|
||||
│ │ ├── SKILL.md # Main instructions (required)
|
||||
│ │ ├── references/ # Additional docs
|
||||
│ │ ├── templates/ # Output formats
|
||||
│ │ └── assets/ # Supplementary files
|
||||
│ └── vllm/
|
||||
│ └── SKILL.md
|
||||
├── devops/
|
||||
│ └── deploy-k8s/ # Agent-created skill
|
||||
│ ├── SKILL.md
|
||||
│ └── references/
|
||||
├── .hub/ # Skills Hub state
|
||||
│ ├── lock.json
|
||||
│ ├── quarantine/
|
||||
│ └── audit.log
|
||||
└── .bundled_manifest # Tracks seeded bundled skills
|
||||
```
|
||||
|
||||
## Agent-Managed Skills (skill_manage tool)
|
||||
|
||||
The agent can create, update, and delete its own skills via the `skill_manage` tool. This is the agent's **procedural memory** — when it figures out a non-trivial workflow, it saves the approach as a skill for future reuse.
|
||||
|
||||
### When the Agent Creates Skills
|
||||
|
||||
- After completing a complex task (5+ tool calls) successfully
|
||||
- When it hit errors or dead ends and found the working path
|
||||
- When the user corrected its approach
|
||||
- When it discovered a non-trivial workflow
|
||||
|
||||
### Actions
|
||||
|
||||
| Action | Use for | Key params |
|
||||
|--------|---------|------------|
|
||||
| `create` | New skill from scratch | `name`, `content` (full SKILL.md), optional `category` |
|
||||
| `patch` | Targeted fixes (preferred) | `name`, `old_string`, `new_string` |
|
||||
| `edit` | Major structural rewrites | `name`, `content` (full SKILL.md replacement) |
|
||||
| `delete` | Remove a skill entirely | `name` |
|
||||
| `write_file` | Add/update supporting files | `name`, `file_path`, `file_content` |
|
||||
| `remove_file` | Remove a supporting file | `name`, `file_path` |
|
||||
|
||||
:::tip
|
||||
The `patch` action is preferred for updates — it's more token-efficient than `edit` because only the changed text appears in the tool call.
|
||||
:::
|
||||
|
||||
## Skills Hub
|
||||
|
||||
Search, install, and manage skills from online registries:
|
||||
|
||||
```bash
|
||||
hermes skills search kubernetes # Search all sources
|
||||
hermes skills install openai/skills/k8s # Install with security scan
|
||||
hermes skills inspect openai/skills/k8s # Preview before installing
|
||||
hermes skills list --source hub # List hub-installed skills
|
||||
hermes skills audit # Re-scan all hub skills
|
||||
hermes skills uninstall k8s # Remove a hub skill
|
||||
hermes skills publish skills/my-skill --to github --repo owner/repo
|
||||
hermes skills snapshot export setup.json # Export skill config
|
||||
hermes skills tap add myorg/skills-repo # Add a custom source
|
||||
```
|
||||
|
||||
All hub-installed skills go through a **security scanner** that checks for data exfiltration, prompt injection, destructive commands, and other threats.
|
||||
|
||||
### Trust Levels
|
||||
|
||||
| Level | Source | Policy |
|
||||
|-------|--------|--------|
|
||||
| `builtin` | Ships with Hermes | Always trusted |
|
||||
| `trusted` | openai/skills, anthropics/skills | Trusted sources |
|
||||
| `community` | Everything else | Any findings = blocked unless `--force` |
|
||||
|
||||
### Slash Commands (Inside Chat)
|
||||
|
||||
All the same commands work with `/skills` prefix:
|
||||
|
||||
```
|
||||
/skills search kubernetes
|
||||
/skills install openai/skills/skill-creator
|
||||
/skills list
|
||||
```
|
||||
163
website/docs/user-guide/features/tools.md
Normal file
163
website/docs/user-guide/features/tools.md
Normal file
@@ -0,0 +1,163 @@
|
||||
---
|
||||
sidebar_position: 1
|
||||
title: "Tools & Toolsets"
|
||||
description: "Overview of Hermes Agent's tools — what's available, how toolsets work, and terminal backends"
|
||||
---
|
||||
|
||||
# Tools & Toolsets
|
||||
|
||||
Tools are functions that extend the agent's capabilities. They're organized into logical **toolsets** that can be enabled or disabled per platform.
|
||||
|
||||
## Available Tools
|
||||
|
||||
| Category | Tools | Description |
|
||||
|----------|-------|-------------|
|
||||
| **Web** | `web_search`, `web_extract`, `web_crawl` | Search the web, extract page content, crawl sites |
|
||||
| **Terminal** | `terminal` | Execute commands (local/docker/singularity/modal/ssh backends) |
|
||||
| **File** | `read_file`, `write_file`, `patch`, `search` | Read, write, edit, and search files |
|
||||
| **Browser** | `browser_navigate`, `browser_click`, `browser_type`, etc. | Full browser automation via Browserbase |
|
||||
| **Vision** | `vision_analyze` | Image analysis via multimodal models |
|
||||
| **Image Gen** | `image_generate` | Generate images (FLUX via FAL) |
|
||||
| **TTS** | `text_to_speech` | Text-to-speech (Edge TTS / ElevenLabs / OpenAI) |
|
||||
| **Reasoning** | `mixture_of_agents` | Multi-model reasoning |
|
||||
| **Skills** | `skills_list`, `skill_view`, `skill_manage` | Find, view, create, and manage skills |
|
||||
| **Todo** | `todo` | Read/write task list for multi-step planning |
|
||||
| **Memory** | `memory` | Persistent notes + user profile across sessions |
|
||||
| **Session Search** | `session_search` | Search + summarize past conversations (FTS5) |
|
||||
| **Cronjob** | `schedule_cronjob`, `list_cronjobs`, `remove_cronjob` | Scheduled task management |
|
||||
| **Code Execution** | `execute_code` | Run Python scripts that call tools via RPC sandbox |
|
||||
| **Delegation** | `delegate_task` | Spawn subagents with isolated context |
|
||||
| **Clarify** | `clarify` | Ask the user multiple-choice or open-ended questions (CLI-only) |
|
||||
| **MCP** | Auto-discovered | External tools from MCP servers |
|
||||
|
||||
## Using Toolsets
|
||||
|
||||
```bash
|
||||
# Use specific toolsets
|
||||
hermes --toolsets "web,terminal"
|
||||
|
||||
# See all available tools
|
||||
hermes tools
|
||||
|
||||
# Configure tools per platform (interactive)
|
||||
hermes tools
|
||||
```
|
||||
|
||||
**Available toolsets:** `web`, `terminal`, `file`, `browser`, `vision`, `image_gen`, `moa`, `skills`, `tts`, `todo`, `memory`, `session_search`, `cronjob`, `code_execution`, `delegation`, `clarify`, and more.
|
||||
|
||||
## Terminal Backends
|
||||
|
||||
The terminal tool can execute commands in different environments:
|
||||
|
||||
| Backend | Description | Use Case |
|
||||
|---------|-------------|----------|
|
||||
| `local` | Run on your machine (default) | Development, trusted tasks |
|
||||
| `docker` | Isolated containers | Security, reproducibility |
|
||||
| `ssh` | Remote server | Sandboxing, keep agent away from its own code |
|
||||
| `singularity` | HPC containers | Cluster computing, rootless |
|
||||
| `modal` | Cloud execution | Serverless, scale |
|
||||
|
||||
### Configuration
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
terminal:
|
||||
backend: local # or: docker, ssh, singularity, modal
|
||||
cwd: "." # Working directory
|
||||
timeout: 180 # Command timeout in seconds
|
||||
```
|
||||
|
||||
### Docker Backend
|
||||
|
||||
```yaml
|
||||
terminal:
|
||||
backend: docker
|
||||
docker_image: python:3.11-slim
|
||||
```
|
||||
|
||||
### SSH Backend
|
||||
|
||||
Recommended for security — agent can't modify its own code:
|
||||
|
||||
```yaml
|
||||
terminal:
|
||||
backend: ssh
|
||||
```
|
||||
```bash
|
||||
# Set credentials in ~/.hermes/.env
|
||||
TERMINAL_SSH_HOST=my-server.example.com
|
||||
TERMINAL_SSH_USER=myuser
|
||||
TERMINAL_SSH_KEY=~/.ssh/id_rsa
|
||||
```
|
||||
|
||||
### Singularity/Apptainer
|
||||
|
||||
```bash
|
||||
# Pre-build SIF for parallel workers
|
||||
apptainer build ~/python.sif docker://python:3.11-slim
|
||||
|
||||
# Configure
|
||||
hermes config set terminal.backend singularity
|
||||
hermes config set terminal.singularity_image ~/python.sif
|
||||
```
|
||||
|
||||
### Modal (Serverless Cloud)
|
||||
|
||||
```bash
|
||||
uv pip install "swe-rex[modal]"
|
||||
modal setup
|
||||
hermes config set terminal.backend modal
|
||||
```
|
||||
|
||||
### Container Resources
|
||||
|
||||
Configure CPU, memory, disk, and persistence for all container backends:
|
||||
|
||||
```yaml
|
||||
terminal:
|
||||
backend: docker # or singularity, modal
|
||||
container_cpu: 1 # CPU cores (default: 1)
|
||||
container_memory: 5120 # Memory in MB (default: 5GB)
|
||||
container_disk: 51200 # Disk in MB (default: 50GB)
|
||||
container_persistent: true # Persist filesystem across sessions (default: true)
|
||||
```
|
||||
|
||||
When `container_persistent: true`, installed packages, files, and config survive across sessions.
|
||||
|
||||
### Container Security
|
||||
|
||||
All container backends run with security hardening:
|
||||
|
||||
- Read-only root filesystem (Docker)
|
||||
- All Linux capabilities dropped
|
||||
- No privilege escalation
|
||||
- PID limits (256 processes)
|
||||
- Full namespace isolation
|
||||
- Persistent workspace via volumes, not writable root layer
|
||||
|
||||
## Background Process Management
|
||||
|
||||
Start background processes and manage them:
|
||||
|
||||
```python
|
||||
terminal(command="pytest -v tests/", background=true)
|
||||
# Returns: {"session_id": "proc_abc123", "pid": 12345}
|
||||
|
||||
# Then manage with the process tool:
|
||||
process(action="list") # Show all running processes
|
||||
process(action="poll", session_id="proc_abc123") # Check status
|
||||
process(action="wait", session_id="proc_abc123") # Block until done
|
||||
process(action="log", session_id="proc_abc123") # Full output
|
||||
process(action="kill", session_id="proc_abc123") # Terminate
|
||||
process(action="write", session_id="proc_abc123", data="y") # Send input
|
||||
```
|
||||
|
||||
PTY mode (`pty=true`) enables interactive CLI tools like Codex and Claude Code.
|
||||
|
||||
## Sudo Support
|
||||
|
||||
If a command needs sudo, you'll be prompted for your password (cached for the session). Or set `SUDO_PASSWORD` in `~/.hermes/.env`.
|
||||
|
||||
:::warning
|
||||
On messaging platforms, if sudo fails, the output includes a tip to add `SUDO_PASSWORD` to `~/.hermes/.env`.
|
||||
:::
|
||||
89
website/docs/user-guide/features/tts.md
Normal file
89
website/docs/user-guide/features/tts.md
Normal file
@@ -0,0 +1,89 @@
|
||||
---
|
||||
sidebar_position: 9
|
||||
title: "Voice & TTS"
|
||||
description: "Text-to-speech and voice message transcription across all platforms"
|
||||
---
|
||||
|
||||
# Voice & TTS
|
||||
|
||||
Hermes Agent supports both text-to-speech output and voice message transcription across all messaging platforms.
|
||||
|
||||
## Text-to-Speech
|
||||
|
||||
Convert text to speech with three providers:
|
||||
|
||||
| Provider | Quality | Cost | API Key |
|
||||
|----------|---------|------|---------|
|
||||
| **Edge TTS** (default) | Good | Free | None needed |
|
||||
| **ElevenLabs** | Excellent | Paid | `ELEVENLABS_API_KEY` |
|
||||
| **OpenAI TTS** | Good | Paid | `VOICE_TOOLS_OPENAI_KEY` |
|
||||
|
||||
### Platform Delivery
|
||||
|
||||
| Platform | Delivery | Format |
|
||||
|----------|----------|--------|
|
||||
| Telegram | Voice bubble (plays inline) | Opus `.ogg` |
|
||||
| Discord | Audio file attachment | MP3 |
|
||||
| WhatsApp | Audio file attachment | MP3 |
|
||||
| CLI | Saved to `~/voice-memos/` | MP3 |
|
||||
|
||||
### Configuration
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
tts:
|
||||
provider: "edge" # "edge" | "elevenlabs" | "openai"
|
||||
edge:
|
||||
voice: "en-US-AriaNeural" # 322 voices, 74 languages
|
||||
elevenlabs:
|
||||
voice_id: "pNInz6obpgDQGcFmaJgB" # Adam
|
||||
model_id: "eleven_multilingual_v2"
|
||||
openai:
|
||||
model: "gpt-4o-mini-tts"
|
||||
voice: "alloy" # alloy, echo, fable, onyx, nova, shimmer
|
||||
```
|
||||
|
||||
### Telegram Voice Bubbles & ffmpeg
|
||||
|
||||
Telegram voice bubbles require Opus/OGG audio format:
|
||||
|
||||
- **OpenAI and ElevenLabs** produce Opus natively — no extra setup
|
||||
- **Edge TTS** (default) outputs MP3 and needs **ffmpeg** to convert:
|
||||
|
||||
```bash
|
||||
# Ubuntu/Debian
|
||||
sudo apt install ffmpeg
|
||||
|
||||
# macOS
|
||||
brew install ffmpeg
|
||||
|
||||
# Fedora
|
||||
sudo dnf install ffmpeg
|
||||
```
|
||||
|
||||
Without ffmpeg, Edge TTS audio is sent as a regular audio file (playable, but shows as a rectangular player instead of a voice bubble).
|
||||
|
||||
:::tip
|
||||
If you want voice bubbles without installing ffmpeg, switch to the OpenAI or ElevenLabs provider.
|
||||
:::
|
||||
|
||||
## Voice Message Transcription
|
||||
|
||||
Voice messages sent on Telegram, Discord, WhatsApp, or Slack are automatically transcribed and injected as text into the conversation. The agent sees the transcript as normal text.
|
||||
|
||||
| Provider | Model | Quality | Cost |
|
||||
|----------|-------|---------|------|
|
||||
| **OpenAI Whisper** | `whisper-1` (default) | Good | Low |
|
||||
| **OpenAI GPT-4o** | `gpt-4o-mini-transcribe` | Better | Medium |
|
||||
| **OpenAI GPT-4o** | `gpt-4o-transcribe` | Best | Higher |
|
||||
|
||||
Requires `VOICE_TOOLS_OPENAI_KEY` in `~/.hermes/.env`.
|
||||
|
||||
### Configuration
|
||||
|
||||
```yaml
|
||||
# In ~/.hermes/config.yaml
|
||||
stt:
|
||||
enabled: true
|
||||
model: "whisper-1"
|
||||
```
|
||||
8
website/docs/user-guide/messaging/_category_.json
Normal file
8
website/docs/user-guide/messaging/_category_.json
Normal file
@@ -0,0 +1,8 @@
|
||||
{
|
||||
"label": "Messaging Gateway",
|
||||
"position": 3,
|
||||
"link": {
|
||||
"type": "doc",
|
||||
"id": "user-guide/messaging/index"
|
||||
}
|
||||
}
|
||||
57
website/docs/user-guide/messaging/discord.md
Normal file
57
website/docs/user-guide/messaging/discord.md
Normal file
@@ -0,0 +1,57 @@
|
||||
---
|
||||
sidebar_position: 3
|
||||
title: "Discord"
|
||||
description: "Set up Hermes Agent as a Discord bot"
|
||||
---
|
||||
|
||||
# Discord Setup
|
||||
|
||||
Connect Hermes Agent to Discord to chat with it in DMs or server channels.
|
||||
|
||||
## Setup Steps
|
||||
|
||||
1. **Create a bot:** Go to the [Discord Developer Portal](https://discord.com/developers/applications)
|
||||
2. **Enable intents:** Bot → Privileged Gateway Intents → enable **Message Content Intent**
|
||||
3. **Get your user ID:** Enable Developer Mode in Discord settings, right-click your name → Copy ID
|
||||
4. **Invite to your server:** OAuth2 → URL Generator → scopes: `bot`, `applications.commands` → permissions: Send Messages, Read Message History, Attach Files
|
||||
5. **Configure:** Run `hermes gateway setup` and select Discord, or add to `~/.hermes/.env` manually:
|
||||
|
||||
```bash
|
||||
DISCORD_BOT_TOKEN=MTIz...
|
||||
DISCORD_ALLOWED_USERS=YOUR_USER_ID
|
||||
```
|
||||
|
||||
6. **Start the gateway:**
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
## Optional: Home Channel
|
||||
|
||||
Set a default channel for cron job delivery:
|
||||
|
||||
```bash
|
||||
DISCORD_HOME_CHANNEL=123456789012345678
|
||||
DISCORD_HOME_CHANNEL_NAME="#bot-updates"
|
||||
```
|
||||
|
||||
Or use `/sethome` in any Discord channel.
|
||||
|
||||
## Required Bot Permissions
|
||||
|
||||
When generating the invite URL, make sure to include:
|
||||
|
||||
- **Send Messages** — bot needs to reply
|
||||
- **Read Message History** — for context
|
||||
- **Attach Files** — for audio, images, and file outputs
|
||||
|
||||
## Voice Messages
|
||||
|
||||
Voice messages on Discord are automatically transcribed (requires `VOICE_TOOLS_OPENAI_KEY`). TTS audio is sent as MP3 file attachments.
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
Always set `DISCORD_ALLOWED_USERS` to restrict who can use the bot. Without it, the gateway denies all users by default.
|
||||
:::
|
||||
204
website/docs/user-guide/messaging/index.md
Normal file
204
website/docs/user-guide/messaging/index.md
Normal file
@@ -0,0 +1,204 @@
|
||||
---
|
||||
sidebar_position: 1
|
||||
title: "Messaging Gateway"
|
||||
description: "Chat with Hermes from Telegram, Discord, Slack, or WhatsApp — architecture and setup overview"
|
||||
---
|
||||
|
||||
# Messaging Gateway
|
||||
|
||||
Chat with Hermes from Telegram, Discord, Slack, or WhatsApp. The gateway is a single background process that connects to all your configured platforms, handles sessions, runs cron jobs, and delivers voice messages.
|
||||
|
||||
## Architecture
|
||||
|
||||
```text
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Hermes Gateway │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
|
||||
│ │ Telegram │ │ Discord │ │ WhatsApp │ │ Slack │ │
|
||||
│ │ Adapter │ │ Adapter │ │ Adapter │ │ Adapter │ │
|
||||
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
|
||||
│ │ │ │ │ │
|
||||
│ └─────────────┼────────────┼─────────────┘ │
|
||||
│ │ │
|
||||
│ ┌────────▼────────┐ │
|
||||
│ │ Session Store │ │
|
||||
│ │ (per-chat) │ │
|
||||
│ └────────┬────────┘ │
|
||||
│ │ │
|
||||
│ ┌────────▼────────┐ │
|
||||
│ │ AIAgent │ │
|
||||
│ │ (run_agent) │ │
|
||||
│ └─────────────────┘ │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
Each platform adapter receives messages, routes them through a per-chat session store, and dispatches them to the AIAgent for processing. The gateway also runs the cron scheduler, ticking every 60 seconds to execute any due jobs.
|
||||
|
||||
## Quick Setup
|
||||
|
||||
The easiest way to configure messaging platforms is the interactive wizard:
|
||||
|
||||
```bash
|
||||
hermes gateway setup # Interactive setup for all messaging platforms
|
||||
```
|
||||
|
||||
This walks you through configuring each platform with arrow-key selection, shows which platforms are already configured, and offers to start/restart the gateway when done.
|
||||
|
||||
## Gateway Commands
|
||||
|
||||
```bash
|
||||
hermes gateway # Run in foreground
|
||||
hermes gateway setup # Configure messaging platforms interactively
|
||||
hermes gateway install # Install as systemd service (Linux) / launchd (macOS)
|
||||
hermes gateway start # Start the service
|
||||
hermes gateway stop # Stop the service
|
||||
hermes gateway status # Check service status
|
||||
```
|
||||
|
||||
## Chat Commands (Inside Messaging)
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/new` or `/reset` | Start fresh conversation |
|
||||
| `/model [name]` | Show or change the model |
|
||||
| `/personality [name]` | Set a personality |
|
||||
| `/retry` | Retry the last message |
|
||||
| `/undo` | Remove the last exchange |
|
||||
| `/status` | Show session info |
|
||||
| `/stop` | Stop the running agent |
|
||||
| `/sethome` | Set this chat as the home channel |
|
||||
| `/compress` | Manually compress conversation context |
|
||||
| `/usage` | Show token usage for this session |
|
||||
| `/reload-mcp` | Reload MCP servers from config |
|
||||
| `/update` | Update Hermes Agent to the latest version |
|
||||
| `/help` | Show available commands |
|
||||
| `/<skill-name>` | Invoke any installed skill |
|
||||
|
||||
## Session Management
|
||||
|
||||
### Session Persistence
|
||||
|
||||
Sessions persist across messages until they reset. The agent remembers your conversation context.
|
||||
|
||||
### Reset Policies
|
||||
|
||||
Sessions reset based on configurable policies:
|
||||
|
||||
| Policy | Default | Description |
|
||||
|--------|---------|-------------|
|
||||
| Daily | 4:00 AM | Reset at a specific hour each day |
|
||||
| Idle | 120 min | Reset after N minutes of inactivity |
|
||||
| Both | (combined) | Whichever triggers first |
|
||||
|
||||
Configure per-platform overrides in `~/.hermes/gateway.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"reset_by_platform": {
|
||||
"telegram": { "mode": "idle", "idle_minutes": 240 },
|
||||
"discord": { "mode": "idle", "idle_minutes": 60 }
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Security
|
||||
|
||||
**By default, the gateway denies all users who are not in an allowlist or paired via DM.** This is the safe default for a bot with terminal access.
|
||||
|
||||
```bash
|
||||
# Restrict to specific users (recommended):
|
||||
TELEGRAM_ALLOWED_USERS=123456789,987654321
|
||||
DISCORD_ALLOWED_USERS=123456789012345678
|
||||
|
||||
# Or explicitly allow all users (NOT recommended for bots with terminal access):
|
||||
GATEWAY_ALLOW_ALL_USERS=true
|
||||
```
|
||||
|
||||
### DM Pairing (Alternative to Allowlists)
|
||||
|
||||
Instead of manually configuring user IDs, unknown users receive a one-time pairing code when they DM the bot:
|
||||
|
||||
```bash
|
||||
# The user sees: "Pairing code: XKGH5N7P"
|
||||
# You approve them with:
|
||||
hermes pairing approve telegram XKGH5N7P
|
||||
|
||||
# Other pairing commands:
|
||||
hermes pairing list # View pending + approved users
|
||||
hermes pairing revoke telegram 123456789 # Remove access
|
||||
```
|
||||
|
||||
Pairing codes expire after 1 hour, are rate-limited, and use cryptographic randomness.
|
||||
|
||||
## Interrupting the Agent
|
||||
|
||||
Send any message while the agent is working to interrupt it. Key behaviors:
|
||||
|
||||
- **In-progress terminal commands are killed immediately** (SIGTERM, then SIGKILL after 1s)
|
||||
- **Tool calls are cancelled** — only the currently-executing one runs, the rest are skipped
|
||||
- **Multiple messages are combined** — messages sent during interruption are joined into one prompt
|
||||
- **`/stop` command** — interrupts without queuing a follow-up message
|
||||
|
||||
## Tool Progress Notifications
|
||||
|
||||
Control how much tool activity is displayed in `~/.hermes/config.yaml`:
|
||||
|
||||
```yaml
|
||||
display:
|
||||
tool_progress: all # off | new | all | verbose
|
||||
```
|
||||
|
||||
When enabled, the bot sends status messages as it works:
|
||||
|
||||
```text
|
||||
💻 `ls -la`...
|
||||
🔍 web_search...
|
||||
📄 web_extract...
|
||||
🐍 execute_code...
|
||||
```
|
||||
|
||||
## Service Management
|
||||
|
||||
### Linux (systemd)
|
||||
|
||||
```bash
|
||||
hermes gateway install # Install as user service
|
||||
systemctl --user start hermes-gateway
|
||||
systemctl --user stop hermes-gateway
|
||||
systemctl --user status hermes-gateway
|
||||
journalctl --user -u hermes-gateway -f
|
||||
|
||||
# Enable lingering (keeps running after logout)
|
||||
sudo loginctl enable-linger $USER
|
||||
```
|
||||
|
||||
### macOS (launchd)
|
||||
|
||||
```bash
|
||||
hermes gateway install
|
||||
launchctl start ai.hermes.gateway
|
||||
launchctl stop ai.hermes.gateway
|
||||
tail -f ~/.hermes/logs/gateway.log
|
||||
```
|
||||
|
||||
## Platform-Specific Toolsets
|
||||
|
||||
Each platform has its own toolset:
|
||||
|
||||
| Platform | Toolset | Capabilities |
|
||||
|----------|---------|--------------|
|
||||
| CLI | `hermes-cli` | Full access |
|
||||
| Telegram | `hermes-telegram` | Full tools including terminal |
|
||||
| Discord | `hermes-discord` | Full tools including terminal |
|
||||
| WhatsApp | `hermes-whatsapp` | Full tools including terminal |
|
||||
| Slack | `hermes-slack` | Full tools including terminal |
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Telegram Setup](telegram.md)
|
||||
- [Discord Setup](discord.md)
|
||||
- [Slack Setup](slack.md)
|
||||
- [WhatsApp Setup](whatsapp.md)
|
||||
57
website/docs/user-guide/messaging/slack.md
Normal file
57
website/docs/user-guide/messaging/slack.md
Normal file
@@ -0,0 +1,57 @@
|
||||
---
|
||||
sidebar_position: 4
|
||||
title: "Slack"
|
||||
description: "Set up Hermes Agent as a Slack bot"
|
||||
---
|
||||
|
||||
# Slack Setup
|
||||
|
||||
Connect Hermes Agent to Slack using Socket Mode for real-time communication.
|
||||
|
||||
## Setup Steps
|
||||
|
||||
1. **Create an app:** Go to [Slack API](https://api.slack.com/apps), create a new app
|
||||
2. **Enable Socket Mode:** In app settings → Socket Mode → Enable
|
||||
3. **Get tokens:**
|
||||
- Bot Token (`xoxb-...`): OAuth & Permissions → Install to Workspace
|
||||
- App Token (`xapp-...`): Basic Information → App-Level Tokens → Generate (with `connections:write` scope)
|
||||
4. **Configure:** Run `hermes gateway setup` and select Slack, or add to `~/.hermes/.env` manually:
|
||||
|
||||
```bash
|
||||
SLACK_BOT_TOKEN=xoxb-...
|
||||
SLACK_APP_TOKEN=xapp-...
|
||||
SLACK_ALLOWED_USERS=U01234ABCDE # Comma-separated Slack user IDs
|
||||
```
|
||||
|
||||
5. **Start the gateway:**
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
## Optional: Home Channel
|
||||
|
||||
Set a default channel for cron job delivery:
|
||||
|
||||
```bash
|
||||
SLACK_HOME_CHANNEL=C01234567890
|
||||
```
|
||||
|
||||
## Required Bot Scopes
|
||||
|
||||
Make sure your Slack app has these OAuth scopes:
|
||||
|
||||
- `chat:write` — Send messages
|
||||
- `channels:history` — Read channel messages
|
||||
- `im:history` — Read DM messages
|
||||
- `files:write` — Upload files (audio, images)
|
||||
|
||||
## Voice Messages
|
||||
|
||||
Voice messages on Slack are automatically transcribed (requires `VOICE_TOOLS_OPENAI_KEY`). TTS audio is sent as file attachments.
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
Always set `SLACK_ALLOWED_USERS` to restrict who can use the bot. Without it, the gateway denies all users by default.
|
||||
:::
|
||||
74
website/docs/user-guide/messaging/telegram.md
Normal file
74
website/docs/user-guide/messaging/telegram.md
Normal file
@@ -0,0 +1,74 @@
|
||||
---
|
||||
sidebar_position: 2
|
||||
title: "Telegram"
|
||||
description: "Set up Hermes Agent as a Telegram bot"
|
||||
---
|
||||
|
||||
# Telegram Setup
|
||||
|
||||
Connect Hermes Agent to Telegram so you can chat from your phone, send voice memos, and receive scheduled task results.
|
||||
|
||||
## Setup Steps
|
||||
|
||||
1. **Create a bot:** Message [@BotFather](https://t.me/BotFather) on Telegram, use `/newbot`
|
||||
2. **Get your user ID:** Message [@userinfobot](https://t.me/userinfobot) — it replies with your numeric ID
|
||||
3. **Configure:** Run `hermes gateway setup` and select Telegram, or add to `~/.hermes/.env` manually:
|
||||
|
||||
```bash
|
||||
TELEGRAM_BOT_TOKEN=123456:ABC-DEF...
|
||||
TELEGRAM_ALLOWED_USERS=YOUR_USER_ID # Comma-separated for multiple users
|
||||
```
|
||||
|
||||
4. **Start the gateway:**
|
||||
|
||||
```bash
|
||||
hermes gateway
|
||||
```
|
||||
|
||||
## Optional: Home Channel
|
||||
|
||||
Set a home channel for cron job delivery:
|
||||
|
||||
```bash
|
||||
TELEGRAM_HOME_CHANNEL=-1001234567890
|
||||
TELEGRAM_HOME_CHANNEL_NAME="My Notes"
|
||||
```
|
||||
|
||||
Or use the `/sethome` command in any Telegram chat to set it dynamically.
|
||||
|
||||
## Voice Messages
|
||||
|
||||
Voice messages sent on Telegram are automatically transcribed using OpenAI's Whisper API and injected as text into the conversation. Requires `VOICE_TOOLS_OPENAI_KEY` in `~/.hermes/.env`.
|
||||
|
||||
### Voice Bubbles (TTS)
|
||||
|
||||
When the agent generates audio via text-to-speech, it's delivered as native Telegram voice bubbles (the round, inline-playable kind).
|
||||
|
||||
- **OpenAI and ElevenLabs** produce Opus natively — no extra setup needed
|
||||
- **Edge TTS** (the default free provider) outputs MP3 and needs **ffmpeg** to convert to Opus:
|
||||
|
||||
```bash
|
||||
# Ubuntu/Debian
|
||||
sudo apt install ffmpeg
|
||||
|
||||
# macOS
|
||||
brew install ffmpeg
|
||||
```
|
||||
|
||||
Without ffmpeg, Edge TTS audio is sent as a regular audio file (still playable, but rectangular player instead of voice bubble).
|
||||
|
||||
## Exec Approval
|
||||
|
||||
When the agent tries to run a potentially dangerous command, it asks you for approval in the chat:
|
||||
|
||||
> ⚠️ This command is potentially dangerous (recursive delete). Reply "yes" to approve.
|
||||
|
||||
Reply "yes"/"y" to approve or "no"/"n" to deny.
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
Always set `TELEGRAM_ALLOWED_USERS` to restrict who can use the bot. Without it, the gateway denies all users by default.
|
||||
:::
|
||||
|
||||
You can also use [DM pairing](/user-guide/messaging#dm-pairing-alternative-to-allowlists) for a more dynamic approach.
|
||||
77
website/docs/user-guide/messaging/whatsapp.md
Normal file
77
website/docs/user-guide/messaging/whatsapp.md
Normal file
@@ -0,0 +1,77 @@
|
||||
---
|
||||
sidebar_position: 5
|
||||
title: "WhatsApp"
|
||||
description: "Set up Hermes Agent as a WhatsApp bot via the built-in Baileys bridge"
|
||||
---
|
||||
|
||||
# WhatsApp Setup
|
||||
|
||||
WhatsApp doesn't have a simple bot API like Telegram or Discord. Hermes includes a built-in bridge using [Baileys](https://github.com/WhiskeySockets/Baileys) that connects via WhatsApp Web.
|
||||
|
||||
## Two Modes
|
||||
|
||||
| Mode | How it works | Best for |
|
||||
|------|-------------|----------|
|
||||
| **Separate bot number** (recommended) | Dedicate a phone number to the bot. People message that number directly. | Clean UX, multiple users |
|
||||
| **Personal self-chat** | Use your own WhatsApp. You message yourself to talk to the agent. | Quick setup, single user |
|
||||
|
||||
## Setup
|
||||
|
||||
```bash
|
||||
hermes whatsapp
|
||||
```
|
||||
|
||||
The wizard will:
|
||||
|
||||
1. Ask which mode you want
|
||||
2. For **bot mode**: guide you through getting a second number
|
||||
3. Configure the allowlist
|
||||
4. Install bridge dependencies (Node.js required)
|
||||
5. Display a QR code — scan from WhatsApp → Settings → Linked Devices → Link a Device
|
||||
6. Exit once paired
|
||||
|
||||
## Getting a Second Number (Bot Mode)
|
||||
|
||||
| Option | Cost | Notes |
|
||||
|--------|------|-------|
|
||||
| WhatsApp Business app + dual-SIM | Free (if you have dual-SIM) | Install alongside personal WhatsApp, no second phone needed |
|
||||
| Google Voice | Free (US only) | voice.google.com, verify WhatsApp via the Google Voice app |
|
||||
| Prepaid SIM | $3-10/month | Any carrier; verify once, phone can go in a drawer on WiFi |
|
||||
|
||||
## Starting the Gateway
|
||||
|
||||
```bash
|
||||
hermes gateway # Foreground
|
||||
hermes gateway install # Or install as a system service
|
||||
```
|
||||
|
||||
The gateway starts the WhatsApp bridge automatically using the saved session.
|
||||
|
||||
## Environment Variables
|
||||
|
||||
```bash
|
||||
WHATSAPP_ENABLED=true
|
||||
WHATSAPP_MODE=bot # "bot" or "self-chat"
|
||||
WHATSAPP_ALLOWED_USERS=15551234567 # Comma-separated phone numbers with country code
|
||||
```
|
||||
|
||||
## Important Notes
|
||||
|
||||
- Agent responses are prefixed with "⚕ **Hermes Agent**" for easy identification
|
||||
- WhatsApp Web sessions can disconnect if WhatsApp updates their protocol
|
||||
- The gateway reconnects automatically
|
||||
- If you see persistent failures, re-pair with `hermes whatsapp`
|
||||
|
||||
:::info Re-pairing
|
||||
If WhatsApp Web sessions disconnect (protocol updates, phone reset), re-pair with `hermes whatsapp`. The gateway handles temporary disconnections automatically.
|
||||
:::
|
||||
|
||||
## Voice Messages
|
||||
|
||||
Voice messages sent on WhatsApp are automatically transcribed (requires `VOICE_TOOLS_OPENAI_KEY`). TTS audio is sent as MP3 file attachments.
|
||||
|
||||
## Security
|
||||
|
||||
:::warning
|
||||
Always set `WHATSAPP_ALLOWED_USERS` with phone numbers (including country code) to restrict who can use the bot.
|
||||
:::
|
||||
Reference in New Issue
Block a user