docs: comprehensive update for recent merged PRs (#9019)
Audit and update documentation across 12 files to match changes from
~50 recently merged PRs. Key updates:
Slash commands (slash-commands.md):
- Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart
- Fix /status incorrectly labeled as messaging-only (available in both)
- Add --global flag to /model docs
- Add [focus topic] arg to /compress docs
CLI commands (cli-commands.md):
- Add hermes debug share section with options and examples
- Add hermes backup section with --quick and --label flags
- Add hermes import section
Feature docs:
- TTS: document global tts.speed and per-provider speed for Edge/OpenAI
- Web dashboard: add docs for 5 missing pages (Sessions, Logs,
Analytics, Cron, Skills) and 15+ API endpoints
- WhatsApp: add streaming, 4K chunking, and markdown formatting docs
- Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip
- Budget: document CLI notification on iteration budget exhaustion
Config migration (compression.summary_* → auxiliary.compression.*):
- Update configuration.md, environment-variables.md,
fallback-providers.md, cli.md, and context-compression-and-caching.md
- Replace legacy compression.summary_model/provider/base_url references
with auxiliary.compression.model/provider/base_url
- Add legacy migration info boxes explaining auto-migration
Minor fixes:
- wecom-callback.md: clarify 'text only' limitation (input only)
- Escape {session_id}/{job_id} in web-dashboard.md headings for MDX
@@ -84,7 +84,13 @@ compression:
  threshold: 0.50      # Fraction of context window (default: 0.50 = 50%)
  target_ratio: 0.20   # How much of threshold to keep as tail (default: 0.20)
  protect_last_n: 20   # Minimum protected tail messages (default: 20)

# Summarization model/provider configured under auxiliary:
auxiliary:
  compression:
    model: null        # Override model for summaries (default: auto-detect)
    provider: auto     # Provider: "auto", "openrouter", "nous", "main", etc.
    base_url: null     # Custom OpenAI-compatible endpoint
```

### Parameter Details
@@ -44,6 +44,9 @@ hermes [global-options] <command> [subcommand/options]
| `hermes webhook` | Manage dynamic webhook subscriptions for event-driven activation. |
| `hermes doctor` | Diagnose config and dependency issues. |
| `hermes dump` | Copy-pasteable setup summary for support/debugging. |
| `hermes debug` | Debug tools — upload logs and system info for support. |
| `hermes backup` | Back up Hermes home directory to a zip file. |
| `hermes import` | Restore a Hermes backup from a zip file. |
| `hermes logs` | View, tail, and filter agent/gateway/error log files. |
| `hermes config` | Show, edit, migrate, and query configuration files. |
| `hermes pairing` | Approve or revoke messaging pairing codes. |
@@ -355,6 +358,70 @@ config_overrides:
`hermes dump` is specifically designed for sharing. For interactive diagnostics, use `hermes doctor`. For a visual overview, use `hermes status`.
:::

## `hermes debug`

```bash
hermes debug share [options]
```

Upload a debug report (system info + recent logs) to a paste service and get a shareable URL. Useful for quick support requests — includes everything a helper needs to diagnose your issue.

| Option | Description |
|--------|-------------|
| `--lines <N>` | Number of log lines to include per log file (default: 200). |
| `--expire <days>` | Paste expiry in days (default: 7). |
| `--local` | Print the report locally instead of uploading. |

The report includes system info (OS, Python version, Hermes version), recent agent and gateway logs (512 KB limit per file), and redacted API key status. Keys are always redacted — no secrets are uploaded.

Paste services are tried in order: paste.rs first, then dpaste.com as a fallback.

### Examples

```bash
hermes debug share              # Upload debug report, print URL
hermes debug share --lines 500  # Include more log lines
hermes debug share --expire 30  # Keep paste for 30 days
hermes debug share --local      # Print report to terminal (no upload)
```

## `hermes backup`

```bash
hermes backup [options]
```

Create a zip archive of your Hermes configuration, skills, sessions, and data. The backup excludes the hermes-agent codebase itself.

| Option | Description |
|--------|-------------|
| `-o`, `--output <path>` | Output path for the zip file (default: `~/hermes-backup-<timestamp>.zip`). |
| `-q`, `--quick` | Quick snapshot: only critical state files (config.yaml, state.db, .env, auth, cron jobs). Much faster than a full backup. |
| `-l`, `--label <name>` | Label for the snapshot (only used with `--quick`). |

The backup uses SQLite's `backup()` API for safe copying, so it works correctly even when Hermes is running (WAL-mode safe).

### Examples

```bash
hermes backup                                # Full backup to ~/hermes-backup-*.zip
hermes backup -o /tmp/hermes.zip             # Full backup to specific path
hermes backup --quick                        # Quick state-only snapshot
hermes backup --quick --label "pre-upgrade"  # Quick snapshot with label
```

## `hermes import`

```bash
hermes import <zipfile> [options]
```

Restore a previously created Hermes backup into your Hermes home directory.

| Option | Description |
|--------|-------------|
| `-f`, `--force` | Overwrite existing files without confirmation. |

## `hermes logs`

```bash
@@ -328,17 +328,24 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI

## Context Compression (config.yaml only)

Context compression is configured exclusively through `config.yaml` — there are no environment variables for it. Threshold settings live in the `compression:` block, while the summarization model/provider lives under `auxiliary.compression:`.

```yaml
compression:
  enabled: true
  threshold: 0.50

auxiliary:
  compression:
    model: ""       # empty = auto-detect
    provider: auto
    base_url: null  # Custom OpenAI-compatible endpoint for summaries
```

:::info Legacy migration
Older configs with `compression.summary_model`, `compression.summary_provider`, and `compression.summary_base_url` are automatically migrated to `auxiliary.compression.*` on first load.
:::

## Auxiliary Task Overrides

| Variable | Description |
@@ -28,8 +28,9 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/retry` | Retry the last message (resend to agent) |
| `/undo` | Remove the last user/assistant exchange |
| `/title` | Set a title for the current session (usage: /title My Session Name) |
| `/compress [focus topic]` | Manually compress conversation context (flush memories + summarize). An optional focus topic narrows what the summary preserves. |
| `/rollback` | List or restore filesystem checkpoints (usage: /rollback [number]) |
| `/snapshot [create\|restore <id>\|prune]` (alias: `/snap`) | Create or restore state snapshots of Hermes config/state. `create [label]` saves a snapshot, `restore <id>` reverts to it, `prune [N]` removes old snapshots; with no args, lists all snapshots. |
| `/stop` | Kill all running background processes |
| `/queue <prompt>` (alias: `/q`) | Queue a prompt for the next turn (doesn't interrupt the current agent response). **Note:** `/q` is claimed by both `/queue` and `/quit`; the last registration wins, so `/q` resolves to `/quit` in practice. Use `/queue` explicitly. |
| `/resume [name]` | Resume a previously-named session |
@@ -44,11 +45,12 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| Command | Description |
|---------|-------------|
| `/config` | Show current configuration |
| `/model [model-name]` | Show or change the current model. Supports: `/model claude-sonnet-4`, `/model provider:model` (switch providers), `/model custom:model` (custom endpoint), `/model custom:name:model` (named custom provider), `/model custom` (auto-detect from endpoint). Use `--global` to persist the change to config.yaml. |
| `/provider` | Show available providers and current provider |
| `/personality` | Set a predefined personality |
| `/verbose` | Cycle tool progress display: off → new → all → verbose. Can be [enabled for messaging](#notes) via config. |
| `/reasoning` | Manage reasoning effort and display (usage: /reasoning [level\|show\|hide]) |
| `/fast [normal\|fast\|status]` | Toggle fast mode — OpenAI Priority Processing / Anthropic Fast Mode. Options: `normal`, `fast`, `status`, `on`, `off`. |
| `/skin` | Show or change the display skin/theme |
| `/voice [on\|off\|tts\|status]` | Toggle CLI voice mode and spoken playback. Recording uses `voice.record_key` (default: `Ctrl+B`). |
| `/yolo` | Toggle YOLO mode — skip all dangerous command approval prompts. |
@@ -75,6 +77,8 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/insights` | Show usage insights and analytics (last 30 days) |
| `/platforms` (alias: `/gateway`) | Show gateway/messaging platform status |
| `/paste` | Check clipboard for an image and attach it |
| `/image <path>` | Attach a local image file for your next prompt. |
| `/debug` | Upload debug report (system info + logs) and get shareable links. Also available in messaging. |
| `/profile` | Show active profile name and home directory |

### Exit
@@ -117,13 +121,14 @@ The messaging gateway supports the following built-in commands inside Telegram,
| `/reset` | Reset conversation history. |
| `/status` | Show session info. |
| `/stop` | Kill all running background processes and interrupt the running agent. |
| `/model [provider:model]` | Show or change the model. Supports provider switches (`/model zai:glm-5`), custom endpoints (`/model custom:model`), named custom providers (`/model custom:local:qwen`), and auto-detect (`/model custom`). Use `--global` to persist the change to config.yaml. |
| `/provider` | Show provider availability and auth status. |
| `/personality [name]` | Set a personality overlay for the session. |
| `/fast [normal\|fast\|status]` | Toggle fast mode — OpenAI Priority Processing / Anthropic Fast Mode. |
| `/retry` | Retry the last message. |
| `/undo` | Remove the last exchange. |
| `/sethome` (alias: `/set-home`) | Mark the current chat as the platform home channel for deliveries. |
| `/compress [focus topic]` | Manually compress conversation context. An optional focus topic narrows what the summary preserves. |
| `/title [name]` | Set or show the session title. |
| `/resume [name]` | Resume a previously named session. |
| `/usage` | Show token usage, estimated cost breakdown (input/output), context window state, and session duration. |
@@ -131,6 +136,7 @@ The messaging gateway supports the following built-in commands inside Telegram,
| `/reasoning [level\|show\|hide]` | Change reasoning effort or toggle reasoning display. |
| `/voice [on\|off\|tts\|join\|channel\|leave\|status]` | Control spoken replies in chat. `join`/`channel`/`leave` manage Discord voice-channel mode. |
| `/rollback [number]` | List or restore filesystem checkpoints. |
| `/snapshot [create\|restore <id>\|prune]` (alias: `/snap`) | Create or restore state snapshots of Hermes config/state. |
| `/background <prompt>` | Run a prompt in a separate background session. Results are delivered back to the same chat when the task finishes. See [Messaging Background Sessions](/docs/user-guide/messaging/#background-sessions). |
| `/plan [request]` | Load the bundled `plan` skill to write a markdown plan instead of executing the work. Plans are saved under `.hermes/plans/` relative to the active workspace/backend working directory. |
| `/reload-mcp` (alias: `/reload_mcp`) | Reload MCP servers from config. |
@@ -140,13 +146,15 @@ The messaging gateway supports the following built-in commands inside Telegram,
| `/approve [session\|always]` | Approve and execute a pending dangerous command. `session` approves for this session only; `always` adds to permanent allowlist. |
| `/deny` | Reject a pending dangerous command. |
| `/update` | Update Hermes Agent to the latest version. |
| `/restart` | Gracefully restart the gateway after draining active runs. When the gateway comes back online, it sends a confirmation to the requester's chat/thread. |
| `/debug` | Upload debug report (system info + logs) and get shareable links. |
| `/help` | Show messaging help. |
| `/<skill-name>` | Invoke any installed skill by name. |

## Notes

- `/skin`, `/tools`, `/toolsets`, `/browser`, `/config`, `/cron`, `/skills`, `/platforms`, `/paste`, `/image`, `/statusbar`, and `/plugins` are **CLI-only** commands.
- `/verbose` is **CLI-only by default**, but can be enabled for messaging platforms by setting `display.tool_progress_command: true` in `config.yaml`. When enabled, it cycles the `display.tool_progress` mode and saves to config.
- `/sethome`, `/update`, `/restart`, `/approve`, `/deny`, and `/commands` are **messaging-only** commands.
- `/status`, `/background`, `/voice`, `/reload-mcp`, `/rollback`, `/snapshot`, `/debug`, `/fast`, and `/yolo` work in **both** the CLI and the messaging gateway.
- `/voice join`, `/voice channel`, and `/voice leave` are only meaningful on Discord.
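The `/verbose` cycle noted above (off → new → all → verbose) is a simple ring over the four `display.tool_progress` modes. A sketch of the rotation (illustrative, not Hermes source):

```python
VERBOSE_MODES = ["off", "new", "all", "verbose"]

def next_verbose_mode(current: str) -> str:
    """Advance to the next tool-progress display mode, wrapping around."""
    i = VERBOSE_MODES.index(current)
    return VERBOSE_MODES[(i + 1) % len(VERBOSE_MODES)]
```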
@@ -322,7 +322,11 @@ Long conversations are automatically summarized when approaching context limits:
compression:
  enabled: true
  threshold: 0.50  # Compress at 50% of context limit by default

# Summarization model configured under auxiliary:
auxiliary:
  compression:
    model: "google/gemini-3-flash-preview"  # Model used for summarization
```

When compression triggers, middle turns are summarized while the first 3 and last 4 turns are always preserved.
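The keep-head/keep-tail split can be sketched as follows (illustrative only; the turn counts mirror the defaults described above):

```python
def split_for_compression(turns, head=3, tail=4):
    """Partition conversation turns into (head, middle, tail).

    The head and tail are preserved verbatim; only the middle
    is sent to the summary model.
    """
    if len(turns) <= head + tail:
        return turns, [], []  # too short: nothing to compress
    return turns[:head], turns[head:-tail], turns[-tail:]
```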
@@ -441,11 +441,19 @@ compression:
  threshold: 0.50      # Compress at this % of context limit
  target_ratio: 0.20   # Fraction of threshold to preserve as recent tail
  protect_last_n: 20   # Min recent messages to keep uncompressed

# The summarization model/provider is configured under auxiliary:
auxiliary:
  compression:
    model: "google/gemini-3-flash-preview"  # Model for summarization
    provider: "auto"   # Provider: "auto", "openrouter", "nous", "codex", "main", etc.
    base_url: null     # Custom OpenAI-compatible endpoint (overrides provider)
```

:::info Legacy config migration
Older configs with `compression.summary_model`, `compression.summary_provider`, and `compression.summary_base_url` are automatically migrated to `auxiliary.compression.*` on first load (config version 17). No manual action needed.
:::

### Common setups

**Default (auto-detect) — no configuration needed:**
@@ -458,30 +466,32 @@ Uses the first available provider (OpenRouter → Nous → Codex) with Gemini Fl

**Force a specific provider** (OAuth or API-key based):
```yaml
auxiliary:
  compression:
    provider: nous
    model: gemini-3-flash
```
Works with any provider: `nous`, `openrouter`, `codex`, `anthropic`, `main`, etc.

**Custom endpoint** (self-hosted, Ollama, zai, DeepSeek, etc.):
```yaml
auxiliary:
  compression:
    model: glm-4.7
    base_url: https://api.z.ai/api/coding/paas/v4
```
Points at a custom OpenAI-compatible endpoint. Uses `OPENAI_API_KEY` for auth.
### How the three knobs interact

| `auxiliary.compression.provider` | `auxiliary.compression.base_url` | Result |
|----------------------------------|----------------------------------|--------|
| `auto` (default) | not set | Auto-detect best available provider |
| `nous` / `openrouter` / etc. | not set | Force that provider, use its auth |
| any | set | Use the custom endpoint directly (provider ignored) |
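That precedence can be expressed as a small decision function (a hypothetical sketch of the table's logic; the names are illustrative, not Hermes internals):

```python
def resolve_summary_endpoint(provider="auto", base_url=None):
    """Pick the summarization backend per the precedence table above."""
    if base_url:
        # An explicit endpoint always wins; the provider setting is ignored
        return ("custom", base_url)
    if provider != "auto":
        # Forced provider: use it with its own auth
        return (provider, None)
    # Default: fall back to the best available provider
    return ("auto-detect", None)
```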

:::warning Summary model context length requirement
The summary model **must** have a context window at least as large as your main agent model's. The compressor sends the full middle section of the conversation to the summary model — if that model's context window is smaller than the main model's, the summarization call will fail with a context length error. When this happens, the middle turns are **dropped without a summary**, losing conversation context silently. If you override the model, verify its context length meets or exceeds your main model's.
:::

## Context Engine
@@ -522,6 +532,8 @@ agent:

Budget pressure is enabled by default. The agent sees warnings naturally as part of tool results, encouraging it to consolidate its work and deliver a response before running out of iterations.

When the iteration budget is fully exhausted, the CLI shows a notification to the user: `⚠ Iteration budget reached (90/90) — response may be incomplete`. If the budget runs out during active work, the agent generates a summary of what was accomplished before stopping.
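The exhaustion check behind that notification amounts to this (a hypothetical sketch; only the message format is taken from the docs above):

```python
def budget_notice(used: int, limit: int):
    """Return the CLI warning once the iteration budget is spent, else None."""
    if used >= limit:
        return f"⚠ Iteration budget reached ({used}/{limit}) — response may be incomplete"
    return None
```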

### Streaming Timeouts

The LLM streaming connection has two timeout layers. Both auto-adjust for local providers (localhost, LAN IPs) — no configuration needed for most setups.
@@ -666,7 +678,7 @@ Each auxiliary task has a configurable `timeout` (in seconds). Defaults: vision
:::

:::info
Context compression has its own `compression:` block for thresholds and an `auxiliary.compression:` block for model/provider settings — see [Context Compression](#context-compression) above. The fallback model uses a `fallback_model:` block — see [Fallback Model](/docs/integrations/providers#fallback-model). All three follow the same provider/model/base_url pattern.
:::

### Changing the Vision Model
@@ -839,16 +851,21 @@ agent:

```yaml
tts:
  provider: "edge"  # "edge" | "elevenlabs" | "openai" | "neutts" | "minimax"
  speed: 1.0        # Global speed multiplier (fallback for all providers)
  edge:
    voice: "en-US-AriaNeural"  # 322 voices, 74 languages
    speed: 1.0      # Speed multiplier (converted to rate percentage, e.g. 1.5 → +50%)
  elevenlabs:
    voice_id: "pNInz6obpgDQGcFmaJgB"
    model_id: "eleven_multilingual_v2"
  openai:
    model: "gpt-4o-mini-tts"
    voice: "alloy"  # alloy, echo, fable, onyx, nova, shimmer
    speed: 1.0      # Speed multiplier (clamped to 0.25–4.0 by the API)
    base_url: "https://api.openai.com/v1"  # Override for OpenAI-compatible TTS endpoints
  minimax:
    speed: 1.0      # Speech speed multiplier
  neutts:
    ref_audio: ''
    ref_text: ''
@@ -858,6 +875,8 @@ tts:

This controls both the `text_to_speech` tool and spoken replies in voice mode (`/voice tts` in the CLI or messaging gateway).

**Speed fallback hierarchy:** provider-specific speed (e.g. `tts.edge.speed`) → global `tts.speed` → `1.0` default. Set the global `tts.speed` to apply a uniform speed across all providers, or override per-provider for fine-grained control.
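The fallback hierarchy can be sketched as a lookup over the parsed `tts:` config (illustrative only, assuming the config is available as a plain dict):

```python
def resolve_tts_speed(tts_config: dict, provider: str) -> float:
    """Resolve playback speed: provider-specific -> global tts.speed -> 1.0."""
    provider_speed = tts_config.get(provider, {}).get("speed")
    if provider_speed is not None:
        return provider_speed
    return tts_config.get("speed", 1.0)
```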

## Display Settings

```yaml
@@ -156,7 +156,7 @@ Hermes uses separate lightweight models for side tasks. Each task has its own pr
|------|-------------|-----------|
| Vision | Image analysis, browser screenshots | `auxiliary.vision` |
| Web Extract | Web page summarization | `auxiliary.web_extract` |
| Compression | Context compression summaries | `auxiliary.compression` |
| Session Search | Past session summarization | `auxiliary.session_search` |
| Skills Hub | Skill search and discovery | `auxiliary.skills_hub` |
| MCP | MCP helper operations | `auxiliary.mcp` |

@@ -219,13 +219,14 @@ auxiliary:
    model: ""
```

Every task above follows the same **provider / model / base_url** pattern. Context compression is configured under `auxiliary.compression`:

```yaml
auxiliary:
  compression:
    provider: main  # Same provider options as other auxiliary tasks
    model: google/gemini-3-flash-preview
    base_url: null  # Custom OpenAI-compatible endpoint
```

And the fallback model uses:
@@ -270,15 +271,18 @@ auxiliary:

## Context Compression Fallback

Context compression uses the `auxiliary.compression` config block to control which model and provider handles summarization:

```yaml
auxiliary:
  compression:
    provider: "auto"  # auto | openrouter | nous | main
    model: "google/gemini-3-flash-preview"
```

:::info Legacy migration
Older configs with `compression.summary_model` / `compression.summary_provider` / `compression.summary_base_url` are automatically migrated to `auxiliary.compression.*` on first load (config version 17).
:::

If no provider is available for compression, Hermes drops middle conversation turns without generating a summary rather than failing the session.
@@ -325,7 +329,7 @@ See [Scheduled Tasks (Cron)](/docs/user-guide/features/cron) for full configurat
| Main agent model | `fallback_model` in config.yaml — one-shot failover on errors | `fallback_model:` (top-level) |
| Vision | Auto-detection chain + internal OpenRouter retry | `auxiliary.vision` |
| Web extraction | Auto-detection chain + internal OpenRouter retry | `auxiliary.web_extract` |
| Context compression | Auto-detection chain, degrades to no-summary if unavailable | `auxiliary.compression` |
| Session search | Auto-detection chain | `auxiliary.session_search` |
| Skills hub | Auto-detection chain | `auxiliary.skills_hub` |
| MCP helpers | Auto-detection chain | `auxiliary.mcp` |
@@ -426,6 +426,10 @@ hermes skills update react # Update one specific installed hub skill

This uses the stored source identifier plus the current upstream bundle content hash to detect drift.

:::tip GitHub rate limits
Skills hub operations use the GitHub API, which has a rate limit of 60 requests/hour for unauthenticated users. If you see rate-limit errors during install or search, set `GITHUB_TOKEN` in your `.env` file to increase the limit to 5,000 requests/hour. The error message includes an actionable hint when this happens.
:::
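For context on what the token changes: an authenticated GitHub API request simply carries an `Authorization` header. A minimal sketch (hypothetical helper, not Hermes code):

```python
import os
import urllib.request

def github_request(url: str) -> urllib.request.Request:
    """Build a GitHub API request, authenticated when GITHUB_TOKEN is set."""
    headers = {"Accept": "application/vnd.github+json"}
    token = os.environ.get("GITHUB_TOKEN")
    if token:
        # Raises the rate limit from 60 to 5,000 requests/hour
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)
```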

### Slash commands (inside chat)

All the same commands work with `/skills`:
@@ -36,8 +36,10 @@ Convert text to speech with six providers:
# In ~/.hermes/config.yaml
tts:
  provider: "edge"  # "edge" | "elevenlabs" | "openai" | "minimax" | "mistral" | "neutts"
  speed: 1.0        # Global speed multiplier (provider-specific settings override this)
  edge:
    voice: "en-US-AriaNeural"  # 322 voices, 74 languages
    speed: 1.0      # Converted to rate percentage (+/-%)
  elevenlabs:
    voice_id: "pNInz6obpgDQGcFmaJgB"  # Adam
    model_id: "eleven_multilingual_v2"

@@ -45,6 +47,7 @@ tts:
    model: "gpt-4o-mini-tts"
    voice: "alloy"  # alloy, echo, fable, onyx, nova, shimmer
    base_url: "https://api.openai.com/v1"  # Override for OpenAI-compatible TTS endpoints
    speed: 1.0      # 0.25 - 4.0
  minimax:
    model: "speech-2.8-hd"  # speech-2.8-hd (default), speech-2.8-turbo
    voice_id: "English_Graceful_Lady"  # See https://platform.minimax.io/faq/system-voice-id

@@ -61,6 +64,8 @@ tts:
    device: cpu
```

**Speed control**: The global `tts.speed` value applies to all providers by default. Each provider can override it with its own `speed` setting (e.g., `tts.openai.speed: 1.5`). Provider-specific speed takes precedence over the global value. Default is `1.0` (normal speed).

### Telegram Voice Bubbles & ffmpeg

Telegram voice bubbles require Opus/OGG audio format:
@@ -1,7 +1,7 @@
---
sidebar_position: 15
title: "Web Dashboard"
description: "Browser-based dashboard for managing configuration, API keys, sessions, logs, analytics, cron jobs, and skills"
---

# Web Dashboard
@@ -104,6 +104,54 @@ Each key shows:
|
||||
|
||||
Advanced/rarely-used keys are hidden by default behind a toggle.
|
||||
|
||||
### Sessions
|
||||
|
||||
Browse and inspect all agent sessions. Each row shows the session title, source platform icon (CLI, Telegram, Discord, Slack, cron), model name, message count, tool call count, and how long ago it was active. Live sessions are marked with a pulsing badge.
|
||||
|
||||
- **Search** — full-text search across all message content using FTS5. Results show highlighted snippets and auto-scroll to the first matching message when expanded.
|
||||
- **Expand** — click a session to load its full message history. Messages are color-coded by role (user, assistant, system, tool) and rendered as Markdown with syntax highlighting.
|
||||
- **Tool calls** — assistant messages with tool calls show collapsible blocks with the function name and JSON arguments.
|
||||
- **Delete** — remove a session and its message history with the trash icon.
|
||||
|
||||
### Logs
|
||||
|
||||
View agent, gateway, and error log files with filtering and live tailing.
|
||||
|
||||
- **File** — switch between `agent`, `errors`, and `gateway` log files
|
||||
- **Level** — filter by log level: ALL, DEBUG, INFO, WARNING, or ERROR
|
||||
- **Component** — filter by source component: all, gateway, agent, tools, cli, or cron
|
||||
- **Lines** — choose how many lines to display (50, 100, 200, or 500)
|
||||
- **Auto-refresh** — toggle live tailing that polls for new log lines every 5 seconds
|
||||
- **Color-coded** — log lines are colored by severity (red for errors, yellow for warnings, dim for debug)
|
||||
|
||||
### Analytics
|
||||
|
||||
Usage and cost analytics computed from session history. Select a time period (7, 30, or 90 days) to see:
|
||||
|
||||
- **Summary cards** — total tokens (input/output), cache hit percentage, total estimated or actual cost, and total session count with daily average
|
||||
- **Daily token chart** — stacked bar chart showing input and output token usage per day, with hover tooltips showing breakdowns and cost
|
||||
- **Daily breakdown table** — date, session count, input tokens, output tokens, cache hit rate, and cost for each day
|
||||
- **Per-model breakdown** — table showing each model used, its session count, token usage, and estimated cost

### Cron

Create and manage scheduled cron jobs that run agent prompts on a recurring schedule.

- **Create** — fill in a name (optional), prompt, cron expression (e.g. `0 9 * * *`), and delivery target (local, Telegram, Discord, Slack, or email)
- **Job list** — each job shows its name, prompt preview, schedule expression, state badge (enabled/paused/error), delivery target, last run time, and next run time
- **Pause / Resume** — toggle a job between active and paused states
- **Trigger now** — immediately execute a job outside its normal schedule
- **Delete** — permanently remove a cron job
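Schedules use standard 5-field cron expressions (minute, hour, day-of-month, month, day-of-week). If you want to sanity-check an expression before submitting it, a simplified validator looks like this (illustrative only; it accepts numbers, `*`, ranges, steps, and lists, but does not range-check field values the way a real scheduler does):

```python
import re

# One cron field: '*' or a number, optionally with a range and/or step.
FIELD = r"(\*|\d+)(-\d+)?(/\d+)?"
# Four space-terminated fields followed by a final one; each field may
# be a comma-separated list.
CRON_RE = re.compile(rf"^({FIELD}(,{FIELD})*\s+){{4}}{FIELD}(,{FIELD})*$")

def looks_like_cron(expr: str) -> bool:
    """Rough shape check, not a full crontab parser."""
    return bool(CRON_RE.match(expr.strip()))

print(looks_like_cron("0 9 * * *"))      # every day at 09:00
print(looks_like_cron("*/15 * * * *"))   # every 15 minutes
print(looks_like_cron("0 9 * *"))        # only 4 fields, invalid
```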

### Skills

Browse, search, and toggle skills and toolsets. Skills are loaded from `~/.hermes/skills/` and grouped by category.

- **Search** — filter skills and toolsets by name, description, or category
- **Category filter** — click category pills to narrow the list (e.g. MLOps, MCP, Red Teaming, AI)
- **Toggle** — enable or disable individual skills with a switch. Changes take effect on the next session.
- **Toolsets** — a separate section shows built-in toolsets (file operations, web browsing, etc.) with their active/inactive status, setup requirements, and list of included tools

:::warning Security
The web dashboard reads and writes your `.env` file, which contains API keys and secrets. It binds to `127.0.0.1` by default — only accessible from your local machine. If you bind to `0.0.0.0`, anyone on your network can view and modify your credentials. The dashboard has no authentication of its own.
:::

@@ -159,6 +207,66 @@ Sets an environment variable. Body: `{"key": "VAR_NAME", "value": "secret"}`.

Removes an environment variable. Body: `{"key": "VAR_NAME"}`.

### GET /api/sessions/\{session_id\}

Returns metadata for a single session.

### GET /api/sessions/\{session_id\}/messages

Returns the full message history for a session, including tool calls and timestamps.

### GET /api/sessions/search

Full-text search across message content. Query parameter: `q`. Returns matching session IDs with highlighted snippets.

### DELETE /api/sessions/\{session_id\}

Deletes a session and its message history.

### GET /api/logs

Returns log lines. Query parameters: `file` (agent/errors/gateway), `lines` (count), `level`, `component`.

### GET /api/analytics/usage

Returns token usage, cost, and session analytics. Query parameter: `days` (default 30). Response includes daily breakdowns and per-model aggregates.

### GET /api/cron/jobs

Returns all configured cron jobs with their state, schedule, and run history.

### POST /api/cron/jobs

Creates a new cron job. Body: `{"prompt": "...", "schedule": "0 9 * * *", "name": "...", "deliver": "local"}`.

### POST /api/cron/jobs/\{job_id\}/pause

Pauses a cron job.

### POST /api/cron/jobs/\{job_id\}/resume

Resumes a paused cron job.

### POST /api/cron/jobs/\{job_id\}/trigger

Immediately triggers a cron job outside its schedule.

### DELETE /api/cron/jobs/\{job_id\}

Deletes a cron job.

### GET /api/skills

Returns all skills with their name, description, category, and enabled status.

### PUT /api/skills/toggle

Enables or disables a skill. Body: `{"name": "skill-name", "enabled": true}`.

### GET /api/tools/toolsets

Returns all toolsets with their label, description, tools list, and active/configured status.

## CORS

The web server restricts CORS to localhost origins only:

@@ -143,5 +143,5 @@ The crypto implementation is compatible with Tencent's official WXBizMsgCrypt SD

- **No streaming** — replies arrive as complete messages after the agent finishes
- **No typing indicators** — the callback model doesn't support typing status
- **Text only** — currently supports text messages for input; image/file/voice input not yet implemented. The agent is aware of outbound media capabilities via the WeCom platform hint (images, documents, video, voice).
- **Response latency** — agent sessions take 3–30 minutes; users see the reply when processing completes

@@ -174,6 +174,33 @@ whatsapp:

---

## Message Formatting & Delivery

WhatsApp supports **streaming (progressive) responses** — the bot edits its message in real-time as the AI generates text, just like Discord and Telegram. Internally, WhatsApp is classified as a TIER_MEDIUM platform for delivery capabilities.

### Chunking

Long responses are automatically split into multiple messages at **4,096 characters** per chunk (WhatsApp's practical display limit). You don't need to configure anything — the gateway handles splitting and sends chunks sequentially.
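If you are curious what the splitting looks like, here is a minimal sketch (the gateway's actual splitter may choose different break points):

```python
def chunk_message(text: str, limit: int = 4096) -> list[str]:
    """Split text into chunks no longer than `limit` characters,
    preferring newline boundaries so lines aren't cut mid-way."""
    chunks = []
    while len(text) > limit:
        # Break at the last newline inside the limit when possible.
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks

parts = chunk_message("line\n" * 2000)   # ~10,000 characters
print(len(parts), [len(p) for p in parts])
```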

### WhatsApp-Compatible Markdown

Standard Markdown in AI responses is automatically converted to WhatsApp's native formatting:

| Markdown | WhatsApp | Renders as |
|----------|----------|------------|
| `**bold**` | `*bold*` | **bold** |
| `~~strikethrough~~` | `~strikethrough~` | ~~strikethrough~~ |
| `# Heading` | `*Heading*` | Bold text (no native headings) |
| `[link text](url)` | `link text (url)` | Inline URL |

Code blocks and inline code are preserved as-is since WhatsApp supports triple-backtick formatting natively.
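The conversions in the table can be approximated with a few regex substitutions. This is a simplified sketch, not the gateway's actual converter (which also has to handle nesting and protect code spans from rewriting):

```python
import re

def to_whatsapp(text: str) -> str:
    """Approximate the Markdown-to-WhatsApp conversions above."""
    text = re.sub(r"\*\*(.+?)\*\*", r"*\1*", text)                 # bold
    text = re.sub(r"~~(.+?)~~", r"~\1~", text)                     # strikethrough
    text = re.sub(r"^#{1,6}\s+(.+)$", r"*\1*", text, flags=re.M)   # headings
    text = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"\1 (\2)", text)    # links
    return text

print(to_whatsapp("# Status\n**done**, see [docs](https://example.com)"))
```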

### Tool Progress

When the agent calls tools (web search, file operations, etc.), WhatsApp displays real-time progress indicators showing which tool is running. This is enabled by default — no configuration needed.

---

## Troubleshooting

| Problem | Solution |