docs: comprehensive documentation audit — fix 9 HIGH, 20+ MEDIUM gaps (#4087)

Reference docs fixes:
- cli-commands.md: remove non-existent --provider alibaba, add hermes
  profile/completion/plugins/mcp to top-level table, add --profile/-p
  global flag, add --source chat option
- slash-commands.md: add /yolo and /commands, fix /q alias conflict
  (resolves to /queue not /quit), add missing aliases (/bg, /set-home,
  /reload_mcp, /gateway)
- toolsets-reference.md: fix hermes-api-server (not same as hermes-cli,
  omits clarify/send_message/text_to_speech)
- profile-commands.md: fix show name required not optional, --clone-from
  not --from, add --remove/--name to alias, fix alias path, fix export/
  import arg types, remove non-existent fish completion
- tools-reference.md: add EXA_API_KEY to web tools requires_env
- mcp-config-reference.md: add auth key for OAuth, tool name sanitization
- environment-variables.md: add EXA_API_KEY, update provider values
- plugins.md: remove non-existent ctx.register_command(), add
  ctx.inject_message()

Feature docs additions:
- security.md: add /yolo mode, approval modes (manual/smart/off),
  configurable timeout, expanded dangerous patterns table
- cron.md: add wrap_response config, [SILENT] suppression
- mcp.md: add dynamic tool discovery, MCP sampling support
- cli.md: add Ctrl+Z suspend, busy_input_mode, tool_preview_length
- docker.md: add skills/credential file mounting

Messaging platform docs:
- telegram.md: add webhook mode, DoH fallback IPs
- slack.md: add multi-workspace OAuth support
- discord.md: add DISCORD_IGNORE_NO_MENTION
- matrix.md: add MSC3245 native voice messages
- feishu.md: expand from 129 to 365 lines (encrypt key, verification
  token, group policy, card actions, media, rate limiting, markdown,
  troubleshooting)
- wecom.md: expand from 86 to 264 lines (per-group allowlists, media,
  AES decryption, stream replies, reconnection, troubleshooting)

Configuration docs:
- quickstart.md: add DeepSeek, Copilot, Copilot ACP providers
- configuration.md: add DeepSeek provider, Exa web backend, terminal
  env_passthrough/images, browser.command_timeout, compression params,
  discord config, security/tirith config, timezone, auxiliary models

21 files changed, ~1000 lines added
This commit is contained in:
Teknium
2026-03-30 17:15:21 -07:00
committed by GitHub
parent 3c8f910973
commit 7e0c2c3ce3
21 changed files with 1004 additions and 83 deletions

View File

@@ -54,6 +54,9 @@ hermes setup # Or configure everything at once
| **Kilo Code** | KiloCode-hosted models | Set `KILOCODE_API_KEY` |
| **OpenCode Zen** | Pay-as-you-go access to curated models | Set `OPENCODE_ZEN_API_KEY` |
| **OpenCode Go** | $10/month subscription for open models | Set `OPENCODE_GO_API_KEY` |
| **DeepSeek** | Direct DeepSeek API access | Set `DEEPSEEK_API_KEY` |
| **GitHub Copilot** | GitHub Copilot subscription (GPT-5.x, Claude, Gemini, etc.) | OAuth via `hermes model`, or `COPILOT_GITHUB_TOKEN` / `GH_TOKEN` |
| **GitHub Copilot ACP** | Copilot ACP agent backend (spawns local `copilot` CLI) | `hermes model` (requires `copilot` CLI + `copilot login`) |
| **Vercel AI Gateway** | Vercel AI Gateway routing | Set `AI_GATEWAY_API_KEY` |
| **Custom Endpoint** | VLLM, SGLang, Ollama, or any OpenAI-compatible API | Set base URL + API key |

View File

@@ -21,6 +21,7 @@ hermes [global-options] <command> [subcommand/options]
| Option | Description |
|--------|-------------|
| `--version`, `-V` | Show version and exit. |
| `--profile <name>`, `-p <name>` | Select which Hermes profile to use for this invocation. Overrides the sticky default set by `hermes profile use`. |
| `--resume <session>`, `-r <session>` | Resume a previous session by ID or title. |
| `--continue [name]`, `-c [name]` | Resume the most recent session, or the most recent session matching a title. |
| `--worktree`, `-w` | Start in an isolated git worktree for parallel-agent workflows. |
@@ -46,10 +47,14 @@ hermes [global-options] <command> [subcommand/options]
| `hermes skills` | Browse, install, publish, audit, and configure skills. |
| `hermes honcho` | Manage Honcho cross-session memory integration. |
| `hermes acp` | Run Hermes as an ACP server for editor integration. |
| `hermes mcp` | Manage MCP server configurations and run Hermes as an MCP server. |
| `hermes plugins` | Manage Hermes Agent plugins (install, enable, disable, remove). |
| `hermes tools` | Configure enabled tools per platform. |
| `hermes sessions` | Browse, export, prune, rename, and delete sessions. |
| `hermes insights` | Show token/cost/activity analytics. |
| `hermes claw` | OpenClaw migration helpers. |
| `hermes profile` | Manage profiles — multiple isolated Hermes instances. |
| `hermes completion` | Print shell completion scripts (bash/zsh). |
| `hermes version` | Show version information. |
| `hermes update` | Pull latest code and reinstall dependencies. |
| `hermes uninstall` | Remove Hermes from the system. |
@@ -67,7 +72,7 @@ Common options:
| `-q`, `--query "..."` | One-shot, non-interactive prompt. |
| `-m`, `--model <model>` | Override the model for this run. |
| `-t`, `--toolsets <csv>` | Enable a comma-separated set of toolsets. |
| `--provider <provider>` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `alibaba`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `kilocode`. |
| `--provider <provider>` | Force a provider: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot-acp`, `copilot`, `anthropic`, `huggingface`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `kilocode`. |
| `-s`, `--skills <name>` | Preload one or more skills for the session (can be repeated or comma-separated). |
| `-v`, `--verbose` | Verbose output. |
| `-Q`, `--quiet` | Programmatic mode: suppress banner/spinner/tool previews. |
@@ -76,6 +81,7 @@ Common options:
| `--checkpoints` | Enable filesystem checkpoints before destructive file changes. |
| `--yolo` | Skip approval prompts. |
| `--pass-session-id` | Pass the session ID into the system prompt. |
| `--source <tag>` | Session source tag for filtering (default: `cli`). Use `tool` for third-party integrations that should not appear in user session lists. |
Examples:
@@ -507,6 +513,56 @@ hermes claw migrate --preset user-data --overwrite
hermes claw migrate --source /home/user/old-openclaw
```
## `hermes profile`
```bash
hermes profile <subcommand>
```
Manage profiles — multiple isolated Hermes instances, each with its own config, sessions, skills, and home directory.
| Subcommand | Description |
|------------|-------------|
| `list` | List all profiles. |
| `use <name>` | Set a sticky default profile. |
| `create <name> [--clone] [--no-alias]` | Create a new profile. `--clone` copies config, `.env`, and `SOUL.md` from the active profile. |
| `delete <name> [-y]` | Delete a profile. |
| `show <name>` | Show profile details (home directory, config, etc.). |
| `alias <name> [--remove] [--name NAME]` | Manage wrapper scripts for quick profile access. |
| `rename <old> <new>` | Rename a profile. |
| `export <name> [-o FILE]` | Export a profile to a `.tar.gz` archive. |
| `import <archive> [--name NAME]` | Import a profile from a `.tar.gz` archive. |
Examples:
```bash
hermes profile list
hermes profile create work --clone
hermes profile use work
hermes profile alias work --name h-work
hermes profile export work -o work-backup.tar.gz
hermes profile import work-backup.tar.gz --name restored
hermes -p work chat -q "Hello from work profile"
```
## `hermes completion`
```bash
hermes completion [bash|zsh]
```
Print a shell completion script to stdout. Source the output in your shell profile for tab-completion of Hermes commands, subcommands, and profile names.
Examples:
```bash
# Bash
hermes completion bash >> ~/.bashrc
# Zsh
hermes completion zsh >> ~/.zshrc
```
## Maintenance commands
| Command | Description |

View File

@@ -63,7 +63,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
| Variable | Description |
|----------|-------------|
| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `kilocode`, `alibaba` (default: `auto`) |
| `HERMES_INFERENCE_PROVIDER` | Override provider selection: `auto`, `openrouter`, `nous`, `openai-codex`, `copilot`, `copilot-acp`, `anthropic`, `huggingface`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`, `kilocode`, `alibaba`, `deepseek`, `opencode-zen`, `opencode-go`, `ai-gateway` (default: `auto`) |
| `HERMES_PORTAL_BASE_URL` | Override Nous Portal URL (for development/testing) |
| `NOUS_INFERENCE_BASE_URL` | Override Nous inference API URL |
| `HERMES_NOUS_MIN_KEY_TTL_SECONDS` | Min agent key TTL before re-mint (default: 1800 = 30min) |
@@ -80,6 +80,7 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
| `FIRECRAWL_API_KEY` | Web scraping ([firecrawl.dev](https://firecrawl.dev/)) |
| `FIRECRAWL_API_URL` | Custom Firecrawl API endpoint for self-hosted instances (optional) |
| `TAVILY_API_KEY` | Tavily API key for AI-native web search, extract, and crawl ([app.tavily.com](https://app.tavily.com/home)) |
| `EXA_API_KEY` | Exa API key for AI-native web search and contents ([exa.ai](https://exa.ai/)) |
| `BROWSERBASE_API_KEY` | Browser automation ([browserbase.com](https://browserbase.com/)) |
| `BROWSERBASE_PROJECT_ID` | Browserbase project ID |
| `BROWSER_USE_API_KEY` | Browser Use cloud browser API key ([browser-use.com](https://browser-use.com/)) |

View File

@@ -48,6 +48,8 @@ mcp_servers:
| `timeout` | number | both | Tool call timeout |
| `connect_timeout` | number | both | Initial connection timeout |
| `tools` | mapping | both | Filtering and utility-tool policy |
| `auth` | string | HTTP | Authentication method. Set to `oauth` to enable OAuth 2.1 with PKCE |
| `sampling` | mapping | both | Server-initiated LLM request policy (see MCP guide) |
## `tools` policy keys
@@ -213,3 +215,33 @@ Utility tools follow the same prefixing pattern:
- `mcp_<server>_read_resource`
- `mcp_<server>_list_prompts`
- `mcp_<server>_get_prompt`
### Name sanitization
Hyphens (`-`) and dots (`.`) in both server names and tool names are replaced with underscores before registration. This ensures tool names are valid identifiers for LLM function-calling APIs.
For example, a server named `my-api` exposing a tool called `list-items.v2` becomes:
```text
mcp_my_api_list_items_v2
```
Keep this in mind when writing `include` / `exclude` filters — use the **original** MCP tool name (with hyphens/dots), not the sanitized version.
## OAuth 2.1 authentication
For HTTP servers that require OAuth, set `auth: oauth` on the server entry:
```yaml
mcp_servers:
protected_api:
url: "https://mcp.example.com/mcp"
auth: oauth
```
Behavior:
- Hermes uses the MCP SDK's OAuth 2.1 PKCE flow (metadata discovery, dynamic client registration, token exchange, and refresh)
- On first connect, a browser window opens for authorization
- Tokens are persisted to `~/.hermes/mcp-tokens/<server>.json` and reused across sessions
- Token refresh is automatic; re-authorization only happens when refresh fails
- Only applies to HTTP/StreamableHTTP transport (`url`-based servers)

View File

@@ -78,7 +78,7 @@ Creates a new profile.
| `<name>` | Name for the new profile. Must be a valid directory name (alphanumeric, hyphens, underscores). |
| `--clone` | Copy `config.yaml`, `.env`, and `SOUL.md` from the current profile. |
| `--clone-all` | Copy everything (config, memories, skills, sessions, state) from the current profile. |
| `--from <profile>` | Clone from a specific profile instead of the current one. Used with `--clone` or `--clone-all`. |
| `--clone-from <profile>` | Clone from a specific profile instead of the current one. Used with `--clone` or `--clone-all`. |
**Examples:**
@@ -93,7 +93,7 @@ hermes profile create work --clone
hermes profile create backup --clone-all
# Clone config from a specific profile
hermes profile create work2 --clone --from work
hermes profile create work2 --clone --clone-from work
```
## `hermes profile delete`
@@ -123,14 +123,14 @@ This permanently deletes the profile's entire directory including all config, me
## `hermes profile show`
```bash
hermes profile show [name]
hermes profile show <name>
```
Displays details about a profile including its home directory, configured model, active platforms, and disk usage.
| Argument | Description |
|----------|-------------|
| `[name]` | Profile to inspect. Defaults to the current active profile if omitted. |
| `<name>` | Profile to inspect. |
**Example:**
@@ -147,20 +147,28 @@ Disk: 48 MB
## `hermes profile alias`
```bash
hermes profile alias <name>
hermes profile alias <name> [options]
```
Regenerates the shell alias script at `~/.local/bin/hermes-<name>`. Useful if the alias was accidentally deleted or if you need to update it after moving your Hermes installation.
Regenerates the shell alias script at `~/.local/bin/<name>`. Useful if the alias was accidentally deleted or if you need to update it after moving your Hermes installation.
| Argument | Description |
|----------|-------------|
| Argument / Option | Description |
|-------------------|-------------|
| `<name>` | Profile to create/update the alias for. |
| `--remove` | Remove the wrapper script instead of creating it. |
| `--name <alias>` | Custom alias name (default: profile name). |
**Example:**
```bash
hermes profile alias work
# Creates/updates ~/.local/bin/work
hermes profile alias work --name mywork
# Creates ~/.local/bin/mywork
hermes profile alias work --remove
# Removes the wrapper script
```
## `hermes profile rename`
@@ -187,39 +195,45 @@ hermes profile rename mybot assistant
## `hermes profile export`
```bash
hermes profile export <name> <output-path>
hermes profile export <name> [options]
```
Exports a profile as a compressed tar.gz archive.
| Argument | Description |
|----------|-------------|
| Argument / Option | Description |
|-------------------|-------------|
| `<name>` | Profile to export. |
| `<output-path>` | Path for the output archive (e.g., `./work-backup.tar.gz`). |
| `-o`, `--output <path>` | Output file path (default: `<name>.tar.gz`). |
**Example:**
```bash
hermes profile export work ./work-2026-03-29.tar.gz
hermes profile export work
# Creates work.tar.gz in the current directory
hermes profile export work -o ./work-2026-03-29.tar.gz
```
## `hermes profile import`
```bash
hermes profile import <archive-path> [name]
hermes profile import <archive> [options]
```
Imports a profile from a tar.gz archive.
| Argument | Description |
|----------|-------------|
| `<archive-path>` | Path to the tar.gz archive to import. |
| `[name]` | Name for the imported profile. Defaults to the original profile name from the archive. |
| Argument / Option | Description |
|-------------------|-------------|
| `<archive>` | Path to the tar.gz archive to import. |
| `--name <name>` | Name for the imported profile (default: inferred from archive). |
**Example:**
```bash
hermes profile import ./work-2026-03-29.tar.gz work-restored
hermes profile import ./work-2026-03-29.tar.gz
# Infers profile name from the archive
hermes profile import ./work-2026-03-29.tar.gz --name work-restored
```
## `hermes -p` / `hermes --profile`
@@ -254,7 +268,7 @@ Generates shell completion scripts. Includes completions for profile names and p
| Argument | Description |
|----------|-------------|
| `<shell>` | Shell to generate completions for: `bash`, `zsh`, or `fish`. |
| `<shell>` | Shell to generate completions for: `bash` or `zsh`. |
**Examples:**
@@ -262,7 +276,6 @@ Generates shell completion scripts. Includes completions for profile names and p
# Install completions
hermes completion bash >> ~/.bashrc
hermes completion zsh >> ~/.zshrc
hermes completion fish > ~/.config/fish/completions/hermes.fish
# Reload shell
source ~/.bashrc

View File

@@ -31,10 +31,10 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/compress` | Manually compress conversation context (flush memories + summarize) |
| `/rollback` | List or restore filesystem checkpoints (usage: /rollback [number]) |
| `/stop` | Kill all running background processes |
| `/queue <prompt>` (alias: `/q`) | Queue a prompt for the next turn (doesn't interrupt the current agent response) |
| `/queue <prompt>` (alias: `/q`) | Queue a prompt for the next turn (doesn't interrupt the current agent response). **Note:** `/q` is claimed by both `/queue` and `/quit`; the last registration wins, so `/q` resolves to `/quit` in practice. Use `/queue` explicitly. |
| `/resume [name]` | Resume a previously-named session |
| `/statusbar` (alias: `/sb`) | Toggle the context/model status bar on or off |
| `/background <prompt>` | Run a prompt in a separate background session. The agent processes your prompt independently — your current session stays free for other work. Results appear as a panel when the task finishes. See [CLI Background Sessions](/docs/user-guide/cli#background-sessions). |
| `/background <prompt>` (alias: `/bg`) | Run a prompt in a separate background session. The agent processes your prompt independently — your current session stays free for other work. Results appear as a panel when the task finishes. See [CLI Background Sessions](/docs/user-guide/cli#background-sessions). |
| `/plan [request]` | Load the bundled `plan` skill to write a markdown plan instead of executing the work. Plans are saved under `.hermes/plans/` relative to the active workspace/backend working directory. |
### Configuration
@@ -50,6 +50,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/reasoning` | Manage reasoning effort and display (usage: /reasoning [level\|show\|hide]) |
| `/skin` | Show or change the display skin/theme |
| `/voice [on\|off\|tts\|status]` | Toggle CLI voice mode and spoken playback. Recording uses `voice.record_key` (default: `Ctrl+B`). |
| `/yolo` | Toggle YOLO mode — skip all dangerous command approval prompts. |
### Tools & Skills
@@ -60,7 +61,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/browser [connect\|disconnect\|status]` | Manage local Chrome CDP connection. `connect` attaches browser tools to a running Chrome instance (default: `ws://localhost:9222`). `disconnect` detaches. `status` shows current connection. Auto-launches Chrome if no debugger is detected. |
| `/skills` | Search, install, inspect, or manage skills from online registries |
| `/cron` | Manage scheduled tasks (list, add/create, edit, pause, resume, run, remove) |
| `/reload-mcp` | Reload MCP servers from config.yaml |
| `/reload-mcp` (alias: `/reload_mcp`) | Reload MCP servers from config.yaml |
| `/plugins` | List installed plugins and their status |
### Info
@@ -70,14 +71,15 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/help` | Show this help message |
| `/usage` | Show token usage, cost breakdown, and session duration |
| `/insights` | Show usage insights and analytics (last 30 days) |
| `/platforms` | Show gateway/messaging platform status |
| `/platforms` (alias: `/gateway`) | Show gateway/messaging platform status |
| `/paste` | Check clipboard for an image and attach it |
| `/profile` | Show active profile name and home directory |
### Exit
| Command | Description |
|---------|-------------|
| `/quit` | Exit the CLI (also: /exit, /q) |
| `/quit` | Exit the CLI (also: `/exit`). See note on `/q` under `/queue` above. |
### Dynamic CLI slash commands
@@ -105,7 +107,7 @@ The messaging gateway supports the following built-in commands inside Telegram,
| `/personality [name]` | Set a personality overlay for the session. |
| `/retry` | Retry the last message. |
| `/undo` | Remove the last exchange. |
| `/sethome` | Mark the current chat as the platform home channel for deliveries. |
| `/sethome` (alias: `/set-home`) | Mark the current chat as the platform home channel for deliveries. |
| `/compress` | Manually compress conversation context. |
| `/title [name]` | Set or show the session title. |
| `/resume [name]` | Resume a previously named session. |
@@ -116,7 +118,9 @@ The messaging gateway supports the following built-in commands inside Telegram,
| `/rollback [number]` | List or restore filesystem checkpoints. |
| `/background <prompt>` | Run a prompt in a separate background session. Results are delivered back to the same chat when the task finishes. See [Messaging Background Sessions](/docs/user-guide/messaging/#background-sessions). |
| `/plan [request]` | Load the bundled `plan` skill to write a markdown plan instead of executing the work. Plans are saved under `.hermes/plans/` relative to the active workspace/backend working directory. |
| `/reload-mcp` | Reload MCP servers from config. |
| `/reload-mcp` (alias: `/reload_mcp`) | Reload MCP servers from config. |
| `/yolo` | Toggle YOLO mode — skip all dangerous command approval prompts. |
| `/commands [page]` | Browse all commands and skills (paginated). |
| `/approve [session\|always]` | Approve and execute a pending dangerous command. `session` approves for this session only; `always` adds to permanent allowlist. |
| `/deny` | Reject a pending dangerous command. |
| `/update` | Update Hermes Agent to the latest version. |
@@ -127,6 +131,6 @@ The messaging gateway supports the following built-in commands inside Telegram,
- `/skin`, `/tools`, `/toolsets`, `/browser`, `/config`, `/prompt`, `/cron`, `/skills`, `/platforms`, `/paste`, `/statusbar`, and `/plugins` are **CLI-only** commands.
- `/verbose` is **CLI-only by default**, but can be enabled for messaging platforms by setting `display.tool_progress_command: true` in `config.yaml`. When enabled, it cycles the `display.tool_progress` mode and saves to config.
- `/status`, `/sethome`, `/update`, `/approve`, and `/deny` are **messaging-only** commands.
- `/background`, `/voice`, `/reload-mcp`, and `/rollback` work in **both** the CLI and the messaging gateway.
- `/status`, `/sethome`, `/update`, `/approve`, `/deny`, and `/commands` are **messaging-only** commands.
- `/background`, `/voice`, `/reload-mcp`, `/rollback`, and `/yolo` work in **both** the CLI and the messaging gateway.
- `/voice join`, `/voice channel`, and `/voice leave` are only meaningful on Discord.

View File

@@ -151,8 +151,8 @@ This page documents the built-in Hermes tool registry as it exists in code. Avai
| Tool | Description | Requires environment |
|------|-------------|----------------------|
| `web_search` | Search the web for information on any topic. Returns up to 5 relevant results with titles, URLs, and descriptions. | PARALLEL_API_KEY or FIRECRAWL_API_KEY or TAVILY_API_KEY |
| `web_extract` | Extract content from web page URLs. Returns page content in markdown format. Also works with PDF URLs — pass the PDF link directly and it converts to markdown text. Pages under 5000 chars return full markdown; larger pages are LLM-summarized. | PARALLEL_API_KEY or FIRECRAWL_API_KEY or TAVILY_API_KEY |
| `web_search` | Search the web for information on any topic. Returns up to 5 relevant results with titles, URLs, and descriptions. | EXA_API_KEY or PARALLEL_API_KEY or FIRECRAWL_API_KEY or TAVILY_API_KEY |
| `web_extract` | Extract content from web page URLs. Returns page content in markdown format. Also works with PDF URLs — pass the PDF link directly and it converts to markdown text. Pages under 5000 chars return full markdown; larger pages are LLM-summarized. | EXA_API_KEY or PARALLEL_API_KEY or FIRECRAWL_API_KEY or TAVILY_API_KEY |
## `tts` toolset

View File

@@ -19,7 +19,7 @@ Toolsets are named bundles of tools that you can enable with `hermes chat --tool
| `file` | core | `patch`, `read_file`, `search_files`, `write_file` |
| `hermes-acp` | platform | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `delegate_task`, `execute_code`, `memory`, `patch`, `process`, `read_file`, `search_files`, `session_search`, `skill_manage`, `skill_view`, `skills_list`, `terminal`, `todo`, `vision_analyze`, `web_extract`, `web_search`, `write_file` |
| `hermes-cli` | platform | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `clarify`, `cronjob`, `delegate_task`, `execute_code`, `ha_call_service`, `ha_get_state`, `ha_list_entities`, `ha_list_services`, `honcho_conclude`, `honcho_context`, `honcho_profile`, `honcho_search`, `image_generate`, `memory`, `mixture_of_agents`, `patch`, `process`, `read_file`, `search_files`, `send_message`, `session_search`, `skill_manage`, `skill_view`, `skills_list`, `terminal`, `text_to_speech`, `todo`, `vision_analyze`, `web_extract`, `web_search`, `write_file` |
| `hermes-api-server` | platform | _(same as hermes-cli)_ |
| `hermes-api-server` | platform | `browser_back`, `browser_click`, `browser_close`, `browser_console`, `browser_get_images`, `browser_navigate`, `browser_press`, `browser_scroll`, `browser_snapshot`, `browser_type`, `browser_vision`, `cronjob`, `delegate_task`, `execute_code`, `ha_call_service`, `ha_get_state`, `ha_list_entities`, `ha_list_services`, `honcho_conclude`, `honcho_context`, `honcho_profile`, `honcho_search`, `image_generate`, `memory`, `mixture_of_agents`, `patch`, `process`, `read_file`, `search_files`, `session_search`, `skill_manage`, `skill_view`, `skills_list`, `terminal`, `todo`, `vision_analyze`, `web_extract`, `web_search`, `write_file` |
| `hermes-dingtalk` | platform | _(same as hermes-cli)_ |
| `hermes-feishu` | platform | _(same as hermes-cli)_ |
| `hermes-wecom` | platform | _(same as hermes-cli)_ |

View File

@@ -94,6 +94,7 @@ When resuming a previous session (`hermes -c` or `hermes --resume <id>`), a "Pre
| `Ctrl+B` | Start/stop voice recording when voice mode is enabled (`voice.record_key`, default: `ctrl+b`) |
| `Ctrl+C` | Interrupt agent (double-press within 2s to force exit) |
| `Ctrl+D` | Exit |
| `Ctrl+Z` | Suspend Hermes to background (Unix only). Run `fg` in the shell to resume. |
| `Tab` | Accept auto-suggestion (ghost text) or autocomplete slash commands |
## Slash Commands
@@ -212,6 +213,33 @@ You can interrupt the agent at any point:
- In-progress terminal commands are killed immediately (SIGTERM, then SIGKILL after 1s)
- Multiple messages typed during interrupt are combined into one prompt
### Busy Input Mode
The `display.busy_input_mode` config key controls what happens when you press Enter while the agent is working:
| Mode | Behavior |
|------|----------|
| `"interrupt"` (default) | Your message interrupts the current operation and is processed immediately |
| `"queue"` | Your message is silently queued and sent as the next turn after the agent finishes |
```yaml
# ~/.hermes/config.yaml
display:
busy_input_mode: "queue" # or "interrupt" (default)
```
Queue mode is useful when you want to prepare follow-up messages without accidentally canceling in-flight work. Unknown values fall back to `"interrupt"`.
### Suspending to Background
On Unix systems, press **`Ctrl+Z`** to suspend Hermes to the background — just like any terminal process. The shell prints a confirmation:
```
Hermes Agent has been suspended. Run `fg` to bring Hermes Agent back.
```
Type `fg` in your shell to resume the session exactly where you left off. This is not supported on Windows.
## Tool Progress Display
The CLI shows animated feedback as the agent works:
@@ -232,6 +260,18 @@ The CLI shows animated feedback as the agent works:
Cycle through display modes with `/verbose`: `off → new → all → verbose`. This command can also be enabled for messaging platforms — see [configuration](/docs/user-guide/configuration#display-settings).
### Tool Preview Length
The `display.tool_preview_length` config key controls the maximum number of characters shown in tool call preview lines (e.g. file paths, terminal commands). The default is `0`, which means no limit — full paths and commands are shown.
```yaml
# ~/.hermes/config.yaml
display:
tool_preview_length: 80 # Truncate tool previews to 80 chars (0 = no limit)
```
This is useful on narrow terminals or when tool arguments contain very long file paths.
## Session Management
### Resuming Sessions

View File

@@ -92,6 +92,7 @@ You need at least one way to connect to an LLM. Use `hermes model` to switch pro
| **Kilo Code** | `KILOCODE_API_KEY` in `~/.hermes/.env` (provider: `kilocode`) |
| **OpenCode Zen** | `OPENCODE_ZEN_API_KEY` in `~/.hermes/.env` (provider: `opencode-zen`) |
| **OpenCode Go** | `OPENCODE_GO_API_KEY` in `~/.hermes/.env` (provider: `opencode-go`) |
| **DeepSeek** | `DEEPSEEK_API_KEY` in `~/.hermes/.env` (provider: `deepseek`) |
| **Hugging Face** | `HF_TOKEN` in `~/.hermes/.env` (provider: `huggingface`, aliases: `hf`) |
| **Custom Endpoint** | `hermes model` (saved in `config.yaml`) or `OPENAI_BASE_URL` + `OPENAI_API_KEY` in `~/.hermes/.env` |
@@ -706,6 +707,10 @@ terminal:
backend: local # local | docker | ssh | modal | daytona | singularity
cwd: "." # Working directory ("." = current dir for local, "/root" for containers)
timeout: 180 # Per-command timeout in seconds
env_passthrough: [] # Env var names to forward to sandboxed execution (terminal + execute_code)
singularity_image: "docker://nikolaik/python-nodejs:python3.11-nodejs20" # Container image for Singularity backend
modal_image: "nikolaik/python-nodejs:python3.11-nodejs20" # Container image for Modal backend
daytona_image: "nikolaik/python-nodejs:python3.11-nodejs20" # Container image for Daytona backend
```
### Backend Overview
@@ -1012,6 +1017,8 @@ All compression settings live in `config.yaml` (no environment variables).
compression:
enabled: true # Toggle compression on/off
threshold: 0.50 # Compress at this % of context limit
target_ratio: 0.20 # Fraction of threshold to preserve as recent tail
protect_last_n: 20 # Min recent messages to keep uncompressed
summary_model: "google/gemini-3-flash-preview" # Model for summarization
summary_provider: "auto" # Provider: "auto", "openrouter", "nous", "codex", "main", etc.
summary_base_url: null # Custom OpenAI-compatible endpoint (overrides provider)
@@ -1146,6 +1153,38 @@ auxiliary:
# Context compression timeout (separate from compression.* config)
compression:
timeout: 120 # seconds — compression summarizes long conversations, needs more time
# Session search — summarizes past session matches
session_search:
provider: "auto"
model: ""
base_url: ""
api_key: ""
timeout: 30
# Skills hub — skill matching and search
skills_hub:
provider: "auto"
model: ""
base_url: ""
api_key: ""
timeout: 30
# MCP tool dispatch
mcp:
provider: "auto"
model: ""
base_url: ""
api_key: ""
timeout: 30
# Memory flush — summarizes conversation for persistent memory
flush_memories:
provider: "auto"
model: ""
base_url: ""
api_key: ""
timeout: 30
```
:::tip
@@ -1340,6 +1379,7 @@ display:
streaming: false # Stream tokens to terminal as they arrive (real-time output)
background_process_notifications: all # all | result | error | off (gateway only)
show_cost: false # Show estimated $ cost in the CLI status bar
tool_preview_length: 0 # Max chars for tool call previews (0 = no limit, show full paths/commands)
```
### Theme mode
@@ -1554,11 +1594,11 @@ code_execution:
## Web Search Backends
The `web_search`, `web_extract`, and `web_crawl` tools support three backend providers. Configure the backend in `config.yaml` or via `hermes tools`:
The `web_search`, `web_extract`, and `web_crawl` tools support four backend providers. Configure the backend in `config.yaml` or via `hermes tools`:
```yaml
web:
backend: firecrawl # firecrawl | parallel | tavily
backend: firecrawl # firecrawl | parallel | tavily | exa
```
| Backend | Env Var | Search | Extract | Crawl |
@@ -1566,8 +1606,9 @@ web:
| **Firecrawl** (default) | `FIRECRAWL_API_KEY` | ✔ | ✔ | ✔ |
| **Parallel** | `PARALLEL_API_KEY` | ✔ | ✔ | — |
| **Tavily** | `TAVILY_API_KEY` | ✔ | ✔ | ✔ |
| **Exa** | `EXA_API_KEY` | ✔ | ✔ | — |
**Backend selection:** If `web.backend` is not set, the backend is auto-detected from available API keys. If only `TAVILY_API_KEY` is set, Tavily is used. If only `PARALLEL_API_KEY` is set, Parallel is used. Otherwise Firecrawl is the default.
**Backend selection:** If `web.backend` is not set, the backend is auto-detected from available API keys. If only `EXA_API_KEY` is set, Exa is used. If only `TAVILY_API_KEY` is set, Tavily is used. If only `PARALLEL_API_KEY` is set, Parallel is used. Otherwise Firecrawl is the default.
**Self-hosted Firecrawl:** Set `FIRECRAWL_API_URL` to point at your own instance. When a custom URL is set, the API key becomes optional (set `USE_DB_AUTHENTICATION=false` on the server to disable auth).
@@ -1580,11 +1621,60 @@ Configure browser automation behavior:
```yaml
browser:
inactivity_timeout: 120 # Seconds before auto-closing idle sessions
command_timeout: 30 # Timeout in seconds for browser commands (screenshot, navigate, etc.)
record_sessions: false # Auto-record browser sessions as WebM videos to ~/.hermes/browser_recordings/
```
The browser toolset supports multiple providers. See the [Browser feature page](/docs/user-guide/features/browser) for details on Browserbase, Browser Use, and local Chrome CDP setup.
## Timezone
Override the server-local timezone with an IANA timezone string. Affects timestamps in logs, cron scheduling, and system prompt time injection.
```yaml
timezone: "America/New_York" # IANA timezone (default: "" = server-local time)
```
Supported values: any IANA timezone identifier (e.g. `America/New_York`, `Europe/London`, `Asia/Kolkata`, `UTC`). Leave empty or omit for server-local time.
## Discord
Configure Discord-specific behavior for the messaging gateway:
```yaml
discord:
require_mention: true # Require @mention to respond in server channels
free_response_channels: "" # Comma-separated channel IDs where bot responds without @mention
auto_thread: true # Auto-create threads on @mention in channels
```
- `require_mention` — when `true` (default), the bot only responds in server channels when mentioned with `@BotName`. DMs always work without mention.
- `free_response_channels` — comma-separated list of channel IDs where the bot responds to every message without requiring a mention.
- `auto_thread` — when `true` (default), mentions in channels automatically create a thread for the conversation, keeping channels clean (similar to Slack threading).
## Security
Pre-execution security scanning and secret redaction:
```yaml
security:
redact_secrets: true # Redact API key patterns in tool output and logs
tirith_enabled: true # Enable Tirith security scanning for terminal commands
tirith_path: "tirith" # Path to tirith binary (default: "tirith" in $PATH)
tirith_timeout: 5 # Seconds to wait for tirith scan before timing out
tirith_fail_open: true # Allow command execution if tirith is unavailable
website_blocklist: # See Website Blocklist section below
enabled: false
domains: []
shared_files: []
```
- `redact_secrets` — automatically detects and redacts patterns that look like API keys, tokens, and passwords in tool output before it enters the conversation context and logs.
- `tirith_enabled` — when `true`, terminal commands are scanned by [Tirith](https://github.com/StackGuardian/tirith) before execution to detect potentially dangerous operations.
- `tirith_path` — path to the tirith binary. Set this if tirith is installed in a non-standard location.
- `tirith_timeout` — maximum seconds to wait for a tirith scan. Commands proceed if the scan times out.
- `tirith_fail_open` — when `true` (default), commands are allowed to execute if tirith is unavailable or fails. Set to `false` to block commands when tirith cannot verify them.
## Website Blocklist
Block specific domains from being accessed by the agent's web and browser tools:

View File

@@ -54,3 +54,9 @@ docker run -d \
-v ~/.hermes:/opt/data \
nousresearch/hermes-agent
```
## Skills and credential files
When using Docker as the execution environment (not the methods above, but when the agent runs commands inside a Docker sandbox), Hermes automatically bind-mounts the skills directory (`~/.hermes/skills/`) and any credential files declared by skills into the container as read-only volumes. This means skill scripts, templates, and references are available inside the sandbox without manual configuration.
The same syncing happens for SSH and Modal backends — skills and credential files are uploaded via rsync or the Modal mount API before each command.

View File

@@ -193,6 +193,40 @@ When scheduling jobs, you specify where the output goes:
The agent's final response is automatically delivered. You do not need to call `send_message` in the cron prompt.
### Response wrapping
By default, delivered cron output is wrapped with a header and footer so the recipient knows it came from a scheduled task:
```
Cronjob Response: Morning feeds
-------------
<agent output here>
Note: The agent cannot see this message, and therefore cannot respond to it.
```
To deliver the raw agent output without the wrapper, set `cron.wrap_response` to `false`:
```yaml
# ~/.hermes/config.yaml
cron:
wrap_response: false
```
### Silent suppression
If the agent's final response starts with `[SILENT]`, delivery is suppressed entirely. The output is still saved locally for audit (in `~/.hermes/cron/output/`), but no message is sent to the delivery target.
This is useful for monitoring jobs that should only report when something is wrong:
```text
Check if nginx is running. If everything is healthy, respond with only [SILENT].
Otherwise, report the issue.
```
Failed jobs always deliver regardless of the `[SILENT]` marker — only successful runs can be silenced.
## Schedule formats
The agent's final response is automatically delivered — you do **not** need to include `send_message` in the cron prompt for that same destination. If a cron run calls `send_message` to the exact target the scheduler will already deliver to, Hermes skips that duplicate send and tells the model to put the user-facing content in the final response instead. Use `send_message` only for additional or different targets.

View File

@@ -277,6 +277,14 @@ That keeps the tool list clean.
Hermes discovers MCP servers at startup and registers their tools into the normal tool registry.
### Dynamic Tool Discovery
MCP servers can notify Hermes when their available tools change at runtime by sending a `notifications/tools/list_changed` notification. When Hermes receives this notification, it automatically re-fetches the server's tool list and updates the registry — no manual `/reload-mcp` required.
This is useful for MCP servers whose capabilities change dynamically (e.g. a server that adds tools when a new database schema is loaded, or removes tools when a service goes offline).
The refresh is lock-protected so rapid-fire notifications from the same server don't cause overlapping refreshes. Prompt and resource change notifications (`prompts/list_changed`, `resources/list_changed`) are received but not yet acted on.
### Reloading
If you change MCP config, use:
@@ -285,7 +293,7 @@ If you change MCP config, use:
/reload-mcp
```
This reloads MCP servers from config and refreshes the available tool list.
This reloads MCP servers from config and refreshes the available tool list. For runtime tool changes pushed by the server itself, see [Dynamic Tool Discovery](#dynamic-tool-discovery) above.
### Toolsets
@@ -403,6 +411,39 @@ Because Hermes now only registers those wrappers when both are true:
This is intentional and keeps the tool list honest.
## MCP Sampling Support
MCP servers can request LLM inference from Hermes via the `sampling/createMessage` protocol. This allows an MCP server to ask Hermes to generate text on its behalf — useful for servers that need LLM capabilities but don't have their own model access.
Sampling is **enabled by default** for all MCP servers (when the MCP SDK supports it). Configure it per-server under the `sampling` key:
```yaml
mcp_servers:
my_server:
command: "my-mcp-server"
sampling:
enabled: true # Enable sampling (default: true)
model: "openai/gpt-4o" # Override model for sampling requests (optional)
max_tokens_cap: 4096 # Max tokens per sampling response (default: 4096)
timeout: 30 # Timeout in seconds per request (default: 30)
max_rpm: 10 # Rate limit: max requests per minute (default: 10)
max_tool_rounds: 5 # Max tool-use rounds in sampling loops (default: 5)
allowed_models: [] # Allowlist of model names the server may request (empty = any)
log_level: "info" # Audit log level: debug, info, or warning (default: info)
```
The sampling handler includes a sliding-window rate limiter, per-request timeouts, and tool-loop depth limits to prevent runaway usage. Metrics (request count, errors, tokens used) are tracked per server instance.
To disable sampling for a specific server:
```yaml
mcp_servers:
untrusted_server:
url: "https://mcp.example.com"
sampling:
enabled: false
```
## Running Hermes as an MCP server
In addition to connecting **to** MCP servers, Hermes can also **be** an MCP server. This lets other MCP-capable agents (Claude Code, Cursor, Codex, or any MCP client) use Hermes's messaging capabilities — list conversations, read message history, and send messages across all your connected platforms.

View File

@@ -4,7 +4,7 @@ sidebar_position: 20
# Plugins
Hermes has a plugin system for adding custom tools, hooks, slash commands, and integrations without modifying core code.
Hermes has a plugin system for adding custom tools, hooks, and integrations without modifying core code.
**→ [Build a Hermes Plugin](/docs/guides/build-a-hermes-plugin)** — step-by-step guide with a complete working example.
@@ -30,7 +30,7 @@ Project-local plugins under `./.hermes/plugins/` are disabled by default. Enable
|-----------|-----|
| Add tools | `ctx.register_tool(name, schema, handler)` |
| Add hooks | `ctx.register_hook("post_tool_call", callback)` |
| Add slash commands | `ctx.register_command("mycommand", handler)` |
| Inject messages | `ctx.inject_message(content, role="user")` — see [Injecting Messages](#injecting-messages) |
| Ship data files | `Path(__file__).parent / "data" / "file.yaml"` |
| Bundle skills | Copy `skill.md` to `~/.hermes/skills/` at load time |
| Gate on env vars | `requires_env: [API_KEY]` in plugin.yaml |
@@ -57,34 +57,6 @@ Plugins can register callbacks for these lifecycle events. See the **[Event Hook
| `on_session_start` | New session created (first turn only) |
| `on_session_end` | End of every `run_conversation` call |
## Slash commands
Plugins can register slash commands that work in both CLI and messaging platforms:
```python
def register(ctx):
ctx.register_command(
name="greet",
handler=lambda args: f"Hello, {args or 'world'}!",
description="Greet someone",
args_hint="[name]",
aliases=("hi",),
)
```
The handler receives the argument string (everything after `/greet`) and returns a string to display. Registered commands automatically appear in `/help`, tab autocomplete, Telegram bot menu, and Slack subcommand mapping.
| Parameter | Description |
|-----------|-------------|
| `name` | Command name without slash |
| `handler` | Callable that takes `args: str` and returns `str | None` |
| `description` | Shown in `/help` |
| `args_hint` | Usage hint, e.g. `"[name]"` |
| `aliases` | Tuple of alternative names |
| `cli_only` | Only available in CLI |
| `gateway_only` | Only available in messaging platforms |
| `gateway_config_gate` | Config dotpath (e.g. `"display.my_option"`). When set on a `cli_only` command, the command becomes available in the gateway if the config value is truthy. |
## Managing plugins
```bash
@@ -109,4 +81,27 @@ plugins:
In a running session, `/plugins` shows which plugins are currently loaded.
## Injecting Messages
Plugins can inject messages into the active conversation using `ctx.inject_message()`:
```python
ctx.inject_message("New data arrived from the webhook", role="user")
```
**Signature:** `ctx.inject_message(content: str, role: str = "user") -> bool`
How it works:
- If the agent is **idle** (waiting for user input), the message is queued as the next input and starts a new turn.
- If the agent is **mid-turn** (actively running), the message interrupts the current operation — the same as a user typing a new message and pressing Enter.
- For non-`"user"` roles, the content is prefixed with `[role]` (e.g. `[system] ...`).
- Returns `True` if the message was queued successfully, `False` if no CLI reference is available (e.g. in gateway mode).
This enables plugins like remote control viewers, messaging bridges, or webhook receivers to feed messages into the conversation from external sources.
:::note
`inject_message` is only available in CLI mode. In gateway mode, there is no CLI reference and the method returns `False`.
:::
See the **[full guide](/docs/guides/build-a-hermes-plugin)** for handler contracts, schema format, hook behavior, error handling, and common mistakes.

View File

@@ -19,6 +19,7 @@ Before setup, here's the part most people want to know: how Hermes behaves once
| **Free-response channels** | You can make specific channels mention-free with `DISCORD_FREE_RESPONSE_CHANNELS`, or disable mentions globally with `DISCORD_REQUIRE_MENTION=false`. |
| **Threads** | Hermes replies in the same thread. Mention rules still apply unless that thread or its parent channel is configured as free-response. Threads stay isolated from the parent channel for session history. |
| **Shared channels with multiple users** | By default, Hermes isolates session history per user inside the channel for safety and clarity. Two people talking in the same channel do not share one transcript unless you explicitly disable that. |
| **Messages mentioning other users** | When `DISCORD_IGNORE_NO_MENTION` is `true` (the default), Hermes stays silent if a message @mentions other users but does **not** mention the bot. This prevents the bot from jumping into conversations directed at other people. Set to `false` if you want the bot to respond to all messages regardless of who is mentioned. This only applies in server channels, not DMs. |
:::tip
If you want a normal bot-help channel where people can talk to Hermes without tagging it every time, add that channel to `DISCORD_FREE_RESPONSE_CHANNELS`.
@@ -253,6 +254,9 @@ DISCORD_ALLOWED_USERS=284102345871466496
# Optional: channels where bot responds without @mention (comma-separated channel IDs)
# DISCORD_FREE_RESPONSE_CHANNELS=1234567890,9876543210
# Optional: ignore messages that @mention other users but NOT the bot (default: true)
# DISCORD_IGNORE_NO_MENTION=true
```
Optional behavior settings in `~/.hermes/config.yaml`:

View File

@@ -18,7 +18,7 @@ The integration supports both connection modes:
| Context | Behavior |
|---------|----------|
| Direct messages | Hermes responds to every message. |
| Group chats | Hermes responds when the bot is addressed in the chat. |
| Group chats | Hermes responds only when the bot is @mentioned in the chat. |
| Shared group chats | By default, session history is isolated per user inside a shared chat. |
This shared-chat behavior is controlled by `config.yaml`:
@@ -46,12 +46,16 @@ Keep the App Secret private. Anyone with it can impersonate your app.
### Recommended: WebSocket mode
Use WebSocket mode when Hermes runs on your laptop, workstation, or a private server. No public URL is required.
Use WebSocket mode when Hermes runs on your laptop, workstation, or a private server. No public URL is required. The official Lark SDK opens and maintains a persistent outbound WebSocket connection with automatic reconnection.
```bash
FEISHU_CONNECTION_MODE=websocket
```
**Requirements:** The `websockets` Python package must be installed. The SDK handles connection lifecycle, heartbeats, and auto-reconnection internally.
**How it works:** The adapter runs the Lark SDK's WebSocket client in a background executor thread. Inbound events (messages, reactions, card actions) are dispatched to the main asyncio loop. On disconnect, the SDK will attempt to reconnect automatically.
### Optional: Webhook mode
Use webhook mode only when you already run Hermes behind a reachable HTTP endpoint.
@@ -60,12 +64,24 @@ Use webhook mode only when you already run Hermes behind a reachable HTTP endpoi
FEISHU_CONNECTION_MODE=webhook
```
In webhook mode, Hermes serves a Feishu endpoint at:
In webhook mode, Hermes starts an HTTP server (via `aiohttp`) and serves a Feishu endpoint at:
```text
/feishu/webhook
```
**Requirements:** The `aiohttp` Python package must be installed.
You can customize the webhook server bind address and path:
```bash
FEISHU_WEBHOOK_HOST=127.0.0.1 # default: 127.0.0.1
FEISHU_WEBHOOK_PORT=8765 # default: 8765
FEISHU_WEBHOOK_PATH=/feishu/webhook # default: /feishu/webhook
```
When Feishu sends a URL verification challenge (`type: url_verification`), the webhook responds automatically so you can complete the subscription setup in the Feishu developer console.
## Step 3: Configure Hermes
### Option A: Interactive Setup
@@ -116,13 +132,233 @@ FEISHU_HOME_CHANNEL=oc_xxx
## Security
For production use, set an allowlist:
### User Allowlist
For production use, set an allowlist of Feishu Open IDs:
```bash
FEISHU_ALLOWED_USERS=ou_xxx,ou_yyy
```
If you leave the allowlist empty, anyone who can reach the bot may be able to use it.
If you leave the allowlist empty, anyone who can reach the bot may be able to use it. In group chats, the allowlist is checked against the sender's open_id before the message is processed.
### Webhook Encryption Key
When running in webhook mode, set an encryption key to enable signature verification of inbound webhook payloads:
```bash
FEISHU_ENCRYPT_KEY=your-encrypt-key
```
This key is found in the **Event Subscriptions** section of your Feishu app configuration. When set, the adapter verifies every webhook request using the signature algorithm:
```
SHA256(timestamp + nonce + encrypt_key + body)
```
The computed hash is compared against the `x-lark-signature` header using timing-safe comparison. Requests with invalid or missing signatures are rejected with HTTP 401.
:::tip
In WebSocket mode, signature verification is handled by the SDK itself, so `FEISHU_ENCRYPT_KEY` is optional. In webhook mode, it is strongly recommended for production.
:::
### Verification Token
An additional layer of authentication that checks the `token` field inside webhook payloads:
```bash
FEISHU_VERIFICATION_TOKEN=your-verification-token
```
This token is also found in the **Event Subscriptions** section of your Feishu app. When set, every inbound webhook payload must contain a matching `token` in its `header` object. Mismatched tokens are rejected with HTTP 401.
Both `FEISHU_ENCRYPT_KEY` and `FEISHU_VERIFICATION_TOKEN` can be used together for defense in depth.
## Group Message Policy
The `FEISHU_GROUP_POLICY` environment variable controls whether and how Hermes responds in group chats:
```bash
FEISHU_GROUP_POLICY=allowlist # default
```
| Value | Behavior |
|-------|----------|
| `open` | Hermes responds to @mentions from any user in any group. |
| `allowlist` | Hermes only responds to @mentions from users listed in `FEISHU_ALLOWED_USERS`. |
| `disabled` | Hermes ignores all group messages entirely. |
In all modes, the bot must be explicitly @mentioned (or @all) in the group before the message is processed. Direct messages bypass this gate.
### Bot Identity for @Mention Gating
For precise @mention detection in groups, the adapter needs to know the bot's identity. It can be provided explicitly:
```bash
FEISHU_BOT_OPEN_ID=ou_xxx
FEISHU_BOT_USER_ID=xxx
FEISHU_BOT_NAME=MyBot
```
If none of these are set, the adapter will attempt to auto-discover the bot name via the Application Info API on startup. For this to work, grant the `admin:app.info:readonly` or `application:application:self_manage` permission scope.
## Interactive Card Actions
When users click buttons or interact with interactive cards sent by the bot, the adapter routes these as synthetic `/card` command events:
- Button clicks become: `/card button {"key": "value", ...}`
- The action's `value` payload from the card definition is included as JSON.
- Card actions are deduplicated with a 15-minute window to prevent double processing.
Card action events are dispatched with `MessageType.COMMAND`, so they flow through the normal command processing pipeline.
To use this feature, enable the **Interactive Card** event in your Feishu app's event subscriptions (`card.action.trigger`).
## Media Support
### Inbound (receiving)
The adapter receives and caches the following media types from users:
| Type | Extensions | How it's processed |
|------|-----------|-------------------|
| **Images** | .jpg, .jpeg, .png, .gif, .webp, .bmp | Downloaded via Feishu API and cached locally |
| **Audio** | .ogg, .mp3, .wav, .m4a, .aac, .flac, .opus, .webm | Downloaded and cached; small text files are auto-extracted |
| **Video** | .mp4, .mov, .avi, .mkv, .webm, .m4v, .3gp | Downloaded and cached as documents |
| **Files** | .pdf, .doc, .docx, .xls, .xlsx, .ppt, .pptx, and more | Downloaded and cached as documents |
Media from rich-text (post) messages, including inline images and file attachments, is also extracted and cached.
For small text-based documents (.txt, .md), the file content is automatically injected into the message text so the agent can read it directly without needing tools.
### Outbound (sending)
| Method | What it sends |
|--------|--------------|
| `send` | Text or rich post messages (auto-detected based on markdown content) |
| `send_image` / `send_image_file` | Uploads image to Feishu, then sends as native image bubble (with optional caption) |
| `send_document` | Uploads file to Feishu API, then sends as file attachment |
| `send_voice` | Uploads audio file as a Feishu file attachment |
| `send_video` | Uploads video and sends as native media message |
| `send_animation` | GIFs are downgraded to file attachments (Feishu has no native GIF bubble) |
File upload routing is automatic based on extension:
- `.ogg`, `.opus` → uploaded as `opus` audio
- `.mp4`, `.mov`, `.avi`, `.m4v` → uploaded as `mp4` media
- `.pdf`, `.doc(x)`, `.xls(x)`, `.ppt(x)` → uploaded with their document type
- Everything else → uploaded as a generic stream file
## Markdown Rendering and Post Fallback
When outbound text contains markdown formatting (headings, bold, lists, code blocks, links, etc.), the adapter automatically sends it as a Feishu **post** message with an embedded `md` tag rather than as plain text. This enables rich rendering in the Feishu client.
If the Feishu API rejects the post payload (e.g., due to unsupported markdown constructs), the adapter automatically falls back to sending as plain text with markdown stripped. This two-stage fallback ensures messages are always delivered.
Plain text messages (no markdown detected) are sent as the simple `text` message type.
## ACK Emoji Reactions
When the adapter receives an inbound message, it immediately adds an ✅ (OK) emoji reaction to signal that the message was received and is being processed. This provides visual feedback before the agent completes its response.
The reaction is persistent — it remains on the message after the response is sent, serving as a receipt marker.
User reactions on bot messages are also tracked. If a user adds or removes an emoji reaction on a message sent by the bot, it is routed as a synthetic text event (`reaction:added:EMOJI_TYPE` or `reaction:removed:EMOJI_TYPE`) so the agent can respond to feedback.
## Burst Protection and Batching
The adapter includes debouncing for rapid message bursts to avoid overwhelming the agent:
### Text Batching
When a user sends multiple text messages in quick succession, they are merged into a single event before being dispatched:
| Setting | Env Var | Default |
|---------|---------|---------|
| Quiet period | `HERMES_FEISHU_TEXT_BATCH_DELAY_SECONDS` | 0.6s |
| Max messages per batch | `HERMES_FEISHU_TEXT_BATCH_MAX_MESSAGES` | 8 |
| Max characters per batch | `HERMES_FEISHU_TEXT_BATCH_MAX_CHARS` | 4000 |
### Media Batching
Multiple media attachments sent in quick succession (e.g., dragging several images) are merged into a single event:
| Setting | Env Var | Default |
|---------|---------|---------|
| Quiet period | `HERMES_FEISHU_MEDIA_BATCH_DELAY_SECONDS` | 0.8s |
### Per-Chat Serialization
Messages within the same chat are processed serially (one at a time) to maintain conversation coherence. Each chat has its own lock, so messages in different chats are processed concurrently.
## Rate Limiting (Webhook Mode)
In webhook mode, the adapter enforces per-IP rate limiting to protect against abuse:
- **Window:** 60-second sliding window
- **Limit:** 120 requests per window per (app_id, path, IP) triple
- **Tracking cap:** Up to 4096 unique keys tracked (prevents unbounded memory growth)
Requests that exceed the limit receive HTTP 429 (Too Many Requests).
### Webhook Anomaly Tracking
The adapter tracks consecutive error responses per IP address. After 25 consecutive errors from the same IP within a 6-hour window, a warning is logged. This helps detect misconfigured clients or probing attempts.
Additional webhook protections:
- **Body size limit:** 1 MB maximum
- **Body read timeout:** 30 seconds
- **Content-Type enforcement:** Only `application/json` is accepted
## Deduplication
Inbound messages are deduplicated using message IDs with a 24-hour TTL. The dedup state is persisted across restarts to `~/.hermes/feishu_seen_message_ids.json`.
| Setting | Env Var | Default |
|---------|---------|---------|
| Cache size | `HERMES_FEISHU_DEDUP_CACHE_SIZE` | 2048 entries |
## All Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `FEISHU_APP_ID` | ✅ | — | Feishu/Lark App ID |
| `FEISHU_APP_SECRET` | ✅ | — | Feishu/Lark App Secret |
| `FEISHU_DOMAIN` | — | `feishu` | `feishu` (China) or `lark` (international) |
| `FEISHU_CONNECTION_MODE` | — | `websocket` | `websocket` or `webhook` |
| `FEISHU_ALLOWED_USERS` | — | _(empty)_ | Comma-separated open_id list for user allowlist |
| `FEISHU_HOME_CHANNEL` | — | — | Chat ID for cron/notification output |
| `FEISHU_ENCRYPT_KEY` | — | _(empty)_ | Encrypt key for webhook signature verification |
| `FEISHU_VERIFICATION_TOKEN` | — | _(empty)_ | Verification token for webhook payload auth |
| `FEISHU_GROUP_POLICY` | — | `allowlist` | Group message policy: `open`, `allowlist`, `disabled` |
| `FEISHU_BOT_OPEN_ID` | — | _(empty)_ | Bot's open_id (for @mention detection) |
| `FEISHU_BOT_USER_ID` | — | _(empty)_ | Bot's user_id (for @mention detection) |
| `FEISHU_BOT_NAME` | — | _(empty)_ | Bot's display name (for @mention detection) |
| `FEISHU_WEBHOOK_HOST` | — | `127.0.0.1` | Webhook server bind address |
| `FEISHU_WEBHOOK_PORT` | — | `8765` | Webhook server port |
| `FEISHU_WEBHOOK_PATH` | — | `/feishu/webhook` | Webhook endpoint path |
| `HERMES_FEISHU_DEDUP_CACHE_SIZE` | — | `2048` | Max deduplicated message IDs to track |
| `HERMES_FEISHU_TEXT_BATCH_DELAY_SECONDS` | — | `0.6` | Text burst debounce quiet period |
| `HERMES_FEISHU_TEXT_BATCH_MAX_MESSAGES` | — | `8` | Max messages merged per text batch |
| `HERMES_FEISHU_TEXT_BATCH_MAX_CHARS` | — | `4000` | Max characters merged per text batch |
| `HERMES_FEISHU_MEDIA_BATCH_DELAY_SECONDS` | — | `0.8` | Media burst debounce quiet period |
## Troubleshooting
| Problem | Fix |
|---------|-----|
| `lark-oapi not installed` | Install the SDK: `pip install lark-oapi` |
| `websockets not installed; websocket mode unavailable` | Install websockets: `pip install websockets` |
| `aiohttp not installed; webhook mode unavailable` | Install aiohttp: `pip install aiohttp` |
| `FEISHU_APP_ID or FEISHU_APP_SECRET not set` | Set both env vars or configure via `hermes gateway setup` |
| `Another local Hermes gateway is already using this Feishu app_id` | Only one Hermes instance can use the same app_id at a time. Stop the other gateway first. |
| Bot doesn't respond in groups | Ensure the bot is @mentioned, check `FEISHU_GROUP_POLICY`, and verify the sender is in `FEISHU_ALLOWED_USERS` if policy is `allowlist` |
| `Webhook rejected: invalid verification token` | Ensure `FEISHU_VERIFICATION_TOKEN` matches the token in your Feishu app's Event Subscriptions config |
| `Webhook rejected: invalid signature` | Ensure `FEISHU_ENCRYPT_KEY` matches the encrypt key in your Feishu app config |
| Post messages show as plain text | The Feishu API rejected the post payload; this is normal fallback behavior. Check logs for details. |
| Images/files not received by bot | Grant `im:message` and `im:resource` permission scopes to your Feishu app |
| Bot identity not auto-detected | Grant `admin:app.info:readonly` scope, or set `FEISHU_BOT_OPEN_ID` / `FEISHU_BOT_NAME` manually |
| `Webhook rate limit exceeded` | More than 120 requests/minute from the same IP. This is usually a misconfiguration or loop. |
## Toolset

View File

@@ -352,3 +352,4 @@ For more information on securing your Hermes Agent deployment, see the [Security
- **Federation**: If you're on a federated homeserver, the bot can communicate with users from other servers — just add their full `@user:server` IDs to `MATRIX_ALLOWED_USERS`.
- **Auto-join**: The bot automatically accepts room invites and joins. It starts responding immediately after joining.
- **Media support**: Hermes can send and receive images, audio, video, and file attachments. Media is uploaded to your homeserver using the Matrix content repository API.
- **Native voice messages (MSC3245)**: The Matrix adapter automatically tags outgoing voice messages with the `org.matrix.msc3245.voice` flag. This means TTS responses and voice audio are rendered as **native voice bubbles** in Element and other clients that support MSC3245, rather than as generic audio file attachments. Incoming voice messages with the MSC3245 flag are also correctly identified and routed to speech-to-text transcription. No configuration is needed — this works automatically.

View File

@@ -237,6 +237,60 @@ Make sure the bot has been **invited to the channel** (`/invite @Hermes Agent`).
---
## Multi-Workspace Support
Hermes can connect to **multiple Slack workspaces** simultaneously using a single gateway instance. Each workspace is authenticated independently with its own bot user ID.
### Configuration
Provide multiple bot tokens as a **comma-separated list** in `SLACK_BOT_TOKEN`:
```bash
# Multiple bot tokens — one per workspace
SLACK_BOT_TOKEN=xoxb-workspace1-token,xoxb-workspace2-token,xoxb-workspace3-token
# A single app-level token is still used for Socket Mode
SLACK_APP_TOKEN=xapp-your-app-token
```
Or in `~/.hermes/config.yaml`:
```yaml
platforms:
slack:
token: "xoxb-workspace1-token,xoxb-workspace2-token"
```
### OAuth Token File
In addition to tokens in the environment or config, Hermes also loads tokens from an **OAuth token file** at:
```
~/.hermes/platforms/slack/slack_tokens.json
```
This file is a JSON object mapping team IDs to token entries:
```json
{
"T01ABC2DEF3": {
"token": "xoxb-workspace-token-here",
"team_name": "My Workspace"
}
}
```
Tokens from this file are merged with any tokens specified via `SLACK_BOT_TOKEN`. Duplicate tokens are automatically deduplicated.
### How it works
- The **first token** in the list is the primary token, used for the Socket Mode connection (AsyncApp).
- Each token is authenticated via `auth.test` on startup. The gateway maps each `team_id` to its own `WebClient` and `bot_user_id`.
- When a message arrives, Hermes uses the correct workspace-specific client to respond.
- The primary `bot_user_id` (from the first token) is used for backward compatibility with features that expect a single bot identity.
---
## Voice Messages
Hermes supports voice on Slack:

View File

@@ -258,6 +258,73 @@ Topics created outside of the config (e.g., by manually calling the Telegram API
- **Privacy policy:** Telegram now requires bots to have a privacy policy. Set one via BotFather with `/setprivacy_policy`, or Telegram may auto-generate a placeholder. This is particularly important if your bot is public-facing.
- **Message streaming:** Bot API 9.x added support for streaming long responses, which can improve perceived latency for lengthy agent replies.
## Webhook Mode
By default, the Telegram adapter connects via **long polling** — the gateway makes outbound connections to Telegram's servers. This works everywhere but keeps a persistent connection open.
**Webhook mode** is an alternative where Telegram pushes updates to your server over HTTPS. This is ideal for **serverless and cloud deployments** (Fly.io, Railway, etc.) where inbound HTTP can wake a suspended machine.
### Configuration
Set the `TELEGRAM_WEBHOOK_URL` environment variable to enable webhook mode:
```bash
# Required — your public HTTPS endpoint
TELEGRAM_WEBHOOK_URL=https://app.fly.dev/telegram
# Optional — local listen port (default: 8443)
TELEGRAM_WEBHOOK_PORT=8443
# Optional — secret token for update verification (auto-generated if not set)
TELEGRAM_WEBHOOK_SECRET=my-secret-token
```
Or in `~/.hermes/config.yaml`:
```yaml
telegram:
webhook_mode: true
```
When `TELEGRAM_WEBHOOK_URL` is set, the gateway starts an HTTP server listening on `0.0.0.0:<port>` and registers the webhook URL with Telegram. The URL path is extracted from the webhook URL (defaults to `/telegram`).
:::warning
Telegram requires a **valid TLS certificate** on the webhook endpoint. Self-signed certificates will be rejected. Use a reverse proxy (nginx, Caddy) or a platform that provides TLS termination (Fly.io, Railway, Cloudflare Tunnel).
:::
## DNS-over-HTTPS Fallback IPs
In some restricted networks, `api.telegram.org` may resolve to an IP that is unreachable. The Telegram adapter includes a **fallback IP** mechanism that transparently retries connections against alternative IPs while preserving the correct TLS hostname and SNI.
### How it works
1. If `TELEGRAM_FALLBACK_IPS` is set, those IPs are used directly.
2. Otherwise, the adapter automatically queries **Google DNS** and **Cloudflare DNS** via DNS-over-HTTPS (DoH) to discover alternative IPs for `api.telegram.org`.
3. IPs returned by DoH that differ from the system DNS result are used as fallbacks.
4. If DoH is also blocked, a hardcoded seed IP (`149.154.167.220`) is used as a last resort.
5. Once a fallback IP succeeds, it becomes "sticky" — subsequent requests use it directly without retrying the primary path first.
### Configuration
```bash
# Explicit fallback IPs (comma-separated)
TELEGRAM_FALLBACK_IPS=149.154.167.220,149.154.167.221
```
Or in `~/.hermes/config.yaml`:
```yaml
platforms:
telegram:
extra:
fallback_ips:
- "149.154.167.220"
```
:::tip
You usually don't need to configure this manually. The auto-discovery via DoH handles most restricted-network scenarios. The `TELEGRAM_FALLBACK_IPS` env var is only needed if DoH is also blocked on your network.
:::
## Troubleshooting
| Problem | Solution |

View File

@@ -13,6 +13,7 @@ Connect Hermes to [WeCom](https://work.weixin.qq.com/) (企业微信), Tencent's
- A WeCom organization account
- An AI Bot created in the WeCom Admin Console
- The Bot ID and Secret from the bot's credentials page
- Python packages: `aiohttp` and `httpx`
## Setup
@@ -56,10 +57,12 @@ hermes gateway start
- **WebSocket transport** — persistent connection, no public endpoint needed
- **DM and group messaging** — configurable access policies
- **Per-group sender allowlists** — fine-grained control over who can interact in each group
- **Media support** — images, files, voice, video upload and download
- **AES-encrypted media** — automatic decryption for inbound attachments
- **Quote context** — preserves reply threading
- **Markdown rendering** — rich text responses
- **Reply-mode streaming** — correlates responses to inbound message context
- **Auto-reconnect** — exponential backoff on connection drops
## Configuration Options
@@ -75,12 +78,187 @@ Set these in `config.yaml` under `platforms.wecom.extra`:
| `group_policy` | `open` | Group access: `open`, `allowlist`, `disabled` |
| `allow_from` | `[]` | User IDs allowed for DMs (when dm_policy=allowlist) |
| `group_allow_from` | `[]` | Group IDs allowed (when group_policy=allowlist) |
| `groups` | `{}` | Per-group configuration (see below) |
## Access Policies
### DM Policy
Controls who can send direct messages to the bot:
| Value | Behavior |
|-------|----------|
| `open` | Anyone can DM the bot (default) |
| `allowlist` | Only user IDs in `allow_from` can DM |
| `disabled` | All DMs are ignored |
| `pairing` | Pairing mode (for initial setup) |
```bash
WECOM_DM_POLICY=allowlist
```
### Group Policy
Controls which groups the bot responds in:
| Value | Behavior |
|-------|----------|
| `open` | Bot responds in all groups (default) |
| `allowlist` | Bot only responds in group IDs listed in `group_allow_from` |
| `disabled` | All group messages are ignored |
```bash
WECOM_GROUP_POLICY=allowlist
```
### Per-Group Sender Allowlists
For fine-grained control, you can restrict which users are allowed to interact with the bot within specific groups. This is configured in `config.yaml`:
```yaml
platforms:
wecom:
enabled: true
extra:
bot_id: "your-bot-id"
secret: "your-secret"
group_policy: "allowlist"
group_allow_from:
- "group_id_1"
- "group_id_2"
groups:
group_id_1:
allow_from:
- "user_alice"
- "user_bob"
group_id_2:
allow_from:
- "user_charlie"
"*":
allow_from:
- "user_admin"
```
**How it works:**
1. The `group_policy` and `group_allow_from` controls determine whether a group is allowed at all.
2. If a group passes the top-level check, the `groups.<group_id>.allow_from` list (if present) further restricts which senders within that group can interact with the bot.
3. A wildcard `"*"` group entry serves as a default for groups not explicitly listed.
4. Allowlist entries support the `*` wildcard to allow all users, and entries are case-insensitive.
5. Entries can optionally use the `wecom:user:` or `wecom:group:` prefix format — the prefix is stripped automatically.
If no `allow_from` is configured for a group, all users in that group are allowed (assuming the group itself passes the top-level policy check).
## Media Support
### Inbound (receiving)
The adapter receives media attachments from users and caches them locally for agent processing:
| Type | How it's handled |
|------|-----------------|
| **Images** | Downloaded and cached locally. Supports both URL-based and base64-encoded images. |
| **Files** | Downloaded and cached. Filename is preserved from the original message. |
| **Voice** | Voice message text transcription is extracted if available. |
| **Mixed messages** | WeCom mixed-type messages (text + images) are parsed and all components extracted. |
**Quoted messages:** Media from quoted (replied-to) messages is also extracted, so the agent has context about what the user is replying to.
### AES-Encrypted Media Decryption
WeCom encrypts some inbound media attachments with AES-256-CBC. The adapter handles this automatically:
- When an inbound media item includes an `aeskey` field, the adapter downloads the encrypted bytes and decrypts them using AES-256-CBC with PKCS#7 padding.
- The AES key is the base64-decoded value of the `aeskey` field (must be exactly 32 bytes).
- The IV is derived from the first 16 bytes of the key.
- This requires the `cryptography` Python package (`pip install cryptography`).
No configuration is needed — decryption happens transparently when encrypted media is received.
### Outbound (sending)
| Method | What it sends | Size limit |
|--------|--------------|------------|
| `send` | Markdown text messages | 4000 chars |
| `send_image` / `send_image_file` | Native image messages | 10 MB |
| `send_document` | File attachments | 20 MB |
| `send_voice` | Voice messages (AMR format only for native voice) | 2 MB |
| `send_video` | Video messages | 10 MB |
**Chunked upload:** Files are uploaded in 512 KB chunks through a three-step protocol (init → chunks → finish). The adapter handles this automatically.
**Automatic downgrade:** When media exceeds the native type's size limit but is under the absolute 20 MB file limit, it is automatically sent as a generic file attachment instead:
- Images > 10 MB → sent as file
- Videos > 10 MB → sent as file
- Voice > 2 MB → sent as file
- Non-AMR audio → sent as file (WeCom only supports AMR for native voice)
Files exceeding the absolute 20 MB limit are rejected with an informational message sent to the chat.
## Reply-Mode Stream Responses
When the bot receives a message via the WeCom callback, the adapter remembers the inbound request ID. If a response is sent while the request context is still active, the adapter uses WeCom's reply-mode (`aibot_respond_msg`) with streaming to correlate the response directly to the inbound message. This provides a more natural conversation experience in the WeCom client.
If the inbound request context has expired or is unavailable, the adapter falls back to proactive message sending via `aibot_send_msg`.
Reply-mode also works for media: uploaded media can be sent as a reply to the originating message.
## Connection and Reconnection
The adapter maintains a persistent WebSocket connection to WeCom's gateway at `wss://openws.work.weixin.qq.com`.
### Connection Lifecycle
1. **Connect:** Opens a WebSocket connection and sends an `aibot_subscribe` authentication frame with the bot_id and secret.
2. **Heartbeat:** Sends application-level ping frames every 30 seconds to keep the connection alive.
3. **Listen:** Continuously reads inbound frames and dispatches message callbacks.
### Reconnection Behavior
On connection loss, the adapter uses exponential backoff to reconnect:
| Attempt | Delay |
|---------|-------|
| 1st retry | 2 seconds |
| 2nd retry | 5 seconds |
| 3rd retry | 10 seconds |
| 4th retry | 30 seconds |
| 5th+ retry | 60 seconds |
After each successful reconnection, the backoff counter resets to zero. All pending request futures are failed on disconnect so callers don't hang indefinitely.
### Deduplication
Inbound messages are deduplicated using message IDs with a 5-minute window and a maximum cache of 1000 entries. This prevents double-processing of messages during reconnection or network hiccups.
## All Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `WECOM_BOT_ID` | ✅ | — | WeCom AI Bot ID |
| `WECOM_SECRET` | ✅ | — | WeCom AI Bot Secret |
| `WECOM_ALLOWED_USERS` | — | _(empty)_ | Comma-separated user IDs for the gateway-level allowlist |
| `WECOM_HOME_CHANNEL` | — | — | Chat ID for cron/notification output |
| `WECOM_WEBSOCKET_URL` | — | `wss://openws.work.weixin.qq.com` | WebSocket gateway URL |
| `WECOM_DM_POLICY` | — | `open` | DM access policy |
| `WECOM_GROUP_POLICY` | — | `open` | Group access policy |
## Troubleshooting
| Problem | Fix |
|---------|-----|
| "WECOM_BOT_ID and WECOM_SECRET are required" | Set both env vars or configure in setup wizard |
| "invalid secret (errcode=40013)" | Verify the secret matches your bot's credentials |
| "Timed out waiting for subscribe acknowledgement" | Check network connectivity to `openws.work.weixin.qq.com` |
| Bot doesn't respond in groups | Check `group_policy` setting and group allowlist |
| `WECOM_BOT_ID and WECOM_SECRET are required` | Set both env vars or configure in setup wizard |
| `WeCom startup failed: aiohttp not installed` | Install aiohttp: `pip install aiohttp` |
| `WeCom startup failed: httpx not installed` | Install httpx: `pip install httpx` |
| `invalid secret (errcode=40013)` | Verify the secret matches your bot's credentials |
| `Timed out waiting for subscribe acknowledgement` | Check network connectivity to `openws.work.weixin.qq.com` |
| Bot doesn't respond in groups | Check `group_policy` setting and ensure the group ID is in `group_allow_from` |
| Bot ignores certain users in a group | Check per-group `allow_from` lists in the `groups` config section |
| Media decryption fails | Install `cryptography`: `pip install cryptography` |
| `cryptography is required for WeCom media decryption` | The inbound media is AES-encrypted. Install: `pip install cryptography` |
| Voice messages sent as files | WeCom only supports AMR format for native voice. Other formats are auto-downgraded to file. |
| `File too large` error | WeCom has a 20 MB absolute limit on all file uploads. Compress or split the file. |
| Images sent as files | Images > 10 MB exceed the native image limit and are auto-downgraded to file attachments. |
| `Timeout sending message to WeCom` | The WebSocket may have disconnected. Check logs for reconnection messages. |
| `WeCom websocket closed during authentication` | Network issue or incorrect credentials. Verify bot_id and secret. |

View File

@@ -22,6 +22,61 @@ The security model has five layers:
Before executing any command, Hermes checks it against a curated list of dangerous patterns. If a match is found, the user must explicitly approve it.
### Approval Modes
The approval system supports three modes, configured via `approvals.mode` in `~/.hermes/config.yaml`:
```yaml
approvals:
mode: manual # manual | smart | off
timeout: 60 # seconds to wait for user response (default: 60)
```
| Mode | Behavior |
|------|----------|
| **manual** (default) | Always prompt the user for approval on dangerous commands |
| **smart** | Use an auxiliary LLM to assess risk. Low-risk commands (e.g., `python -c "print('hello')"`) are auto-approved. Genuinely dangerous commands are auto-denied. Uncertain cases escalate to a manual prompt. |
| **off** | Disable all approval checks — equivalent to running with `--yolo`. All commands execute without prompts. |
:::warning
Setting `approvals.mode: off` disables all safety prompts. Use only in trusted environments (CI/CD, containers, etc.).
:::
### YOLO Mode
YOLO mode bypasses **all** dangerous command approval prompts for the current session. It can be activated three ways:
1. **CLI flag**: Start a session with `hermes --yolo` or `hermes chat --yolo`
2. **Slash command**: Type `/yolo` during a session to toggle it on/off
3. **Environment variable**: Set `HERMES_YOLO_MODE=1`
The `/yolo` command is a **toggle** — each use flips the mode on or off:
```
> /yolo
⚡ YOLO mode ON — all commands auto-approved. Use with caution.
> /yolo
⚠ YOLO mode OFF — dangerous commands will require approval.
```
YOLO mode is available in both CLI and gateway sessions. Internally, it sets the `HERMES_YOLO_MODE` environment variable which is checked before every command execution.
:::danger
YOLO mode disables **all** dangerous command safety checks for the session. Use only when you fully trust the commands being generated (e.g., well-tested automation scripts in disposable environments).
:::
### Approval Timeout
When a dangerous command prompt appears, the user has a configurable amount of time to respond. If no response is given within the timeout, the command is **denied** by default (fail-closed).
Configure the timeout in `~/.hermes/config.yaml`:
```yaml
approvals:
timeout: 60 # seconds (default: 60)
```
### What Triggers Approval
The following patterns trigger approval prompts (defined in `tools/approval.py`):
@@ -30,21 +85,32 @@ The following patterns trigger approval prompts (defined in `tools/approval.py`)
|---------|-------------|
| `rm -r` / `rm --recursive` | Recursive delete |
| `rm ... /` | Delete in root path |
| `chmod 777` | World-writable permissions |
| `chmod 777/666` / `o+w` / `a+w` | World/other-writable permissions |
| `chmod --recursive` with unsafe perms | Recursive world/other-writable (long flag) |
| `chown -R root` / `chown --recursive root` | Recursive chown to root |
| `mkfs` | Format filesystem |
| `dd if=` | Disk copy |
| `> /dev/sd` | Write to block device |
| `DROP TABLE/DATABASE` | SQL DROP |
| `DELETE FROM` (without WHERE) | SQL DELETE without WHERE |
| `TRUNCATE TABLE` | SQL TRUNCATE |
| `> /etc/` | Overwrite system config |
| `systemctl stop/disable/mask` | Stop/disable system services |
| `kill -9 -1` | Kill all processes |
| `curl ... \| sh` | Pipe remote content to shell |
| `bash -c`, `python -e` | Shell/script execution via flags |
| `find -exec rm`, `find -delete` | Find with destructive actions |
| `pkill -9` | Force kill processes |
| Fork bomb patterns | Fork bombs |
| `bash -c` / `sh -c` / `zsh -c` / `ksh -c` | Shell command execution via `-c` flag (including combined flags like `-lc`) |
| `python -e` / `perl -e` / `ruby -e` / `node -c` | Script execution via `-e`/`-c` flag |
| `curl ... \| sh` / `wget ... \| sh` | Pipe remote content to shell |
| `bash <(curl ...)` / `sh <(wget ...)` | Execute remote script via process substitution |
| `tee` to `/etc/`, `~/.ssh/`, `~/.hermes/.env` | Overwrite sensitive file via tee |
| `>` / `>>` to `/etc/`, `~/.ssh/`, `~/.hermes/.env` | Overwrite sensitive file via redirection |
| `xargs rm` | xargs with rm |
| `find -exec rm` / `find -delete` | Find with destructive actions |
| `cp`/`mv`/`install` to `/etc/` | Copy/move file into system config |
| `sed -i` / `sed --in-place` on `/etc/` | In-place edit of system config |
| `pkill`/`killall` hermes/gateway | Self-termination prevention |
| `gateway run` with `&`/`disown`/`nohup` | Prevents starting gateway outside service manager |
| `gateway run` with `&`/`disown`/`nohup`/`setsid` | Prevents starting gateway outside service manager |
:::info
**Container bypass**: When running in `docker`, `singularity`, `modal`, or `daytona` backends, dangerous command checks are **skipped** because the container itself is the security boundary. Destructive commands inside a container can't harm the host.