docs: comprehensive update for recent merged PRs (#9019)

Audit and update documentation across 12 files to match changes from
~50 recently merged PRs. Key updates:

Slash commands (slash-commands.md):
- Add 5 missing commands: /snapshot, /fast, /image, /debug, /restart
- Fix /status incorrectly labeled as messaging-only (available in both)
- Add --global flag to /model docs
- Add [focus topic] arg to /compress docs

CLI commands (cli-commands.md):
- Add hermes debug share section with options and examples
- Add hermes backup section with --quick and --label flags
- Add hermes import section

Feature docs:
- TTS: document global tts.speed and per-provider speed for Edge/OpenAI
- Web dashboard: add docs for 5 missing pages (Sessions, Logs,
  Analytics, Cron, Skills) and 15+ API endpoints
- WhatsApp: add streaming, 4K chunking, and markdown formatting docs
- Skills: add GitHub rate-limit/GITHUB_TOKEN troubleshooting tip
- Budget: document CLI notification on iteration budget exhaustion

Config migration (compression.summary_* → auxiliary.compression.*):
- Update configuration.md, environment-variables.md,
  fallback-providers.md, cli.md, and context-compression-and-caching.md
- Replace legacy compression.summary_model/provider/base_url references
  with auxiliary.compression.model/provider/base_url
- Add legacy migration info boxes explaining auto-migration

Minor fixes:
- wecom-callback.md: clarify 'text only' limitation (input only)
- Escape {session_id}/{job_id} in web-dashboard.md headings for MDX
This commit is contained in:
Teknium
2026-04-13 10:50:59 -07:00
committed by GitHub
parent c449cd1af5
commit 4ca6668daf
12 changed files with 299 additions and 40 deletions

View File

@@ -84,7 +84,13 @@ compression:
  threshold: 0.50        # Fraction of context window (default: 0.50 = 50%)
  target_ratio: 0.20     # How much of threshold to keep as tail (default: 0.20)
  protect_last_n: 20     # Minimum protected tail messages (default: 20)

# Summarization model/provider configured under auxiliary:
auxiliary:
  compression:
    model: null          # Override model for summaries (default: auto-detect)
    provider: auto       # Provider: "auto", "openrouter", "nous", "main", etc.
    base_url: null       # Custom OpenAI-compatible endpoint
```
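As a worked example of how `threshold` and `target_ratio` combine (the 128k context window below is illustrative, not a Hermes default):

```python
# Illustrative arithmetic for the knobs above.
context_window = 128_000   # tokens (example model)
threshold = 0.50           # compress once usage reaches 50% of the window
target_ratio = 0.20        # keep 20% of the threshold as the recent tail

trigger_at = int(context_window * threshold)   # compression starts here
tail_budget = int(trigger_at * target_ratio)   # tokens kept uncompressed
```

With these numbers, compression triggers at 64,000 tokens and preserves roughly 12,800 tokens of recent conversation as the tail.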
### Parameter Details

View File

@@ -44,6 +44,9 @@ hermes [global-options] <command> [subcommand/options]
| `hermes webhook` | Manage dynamic webhook subscriptions for event-driven activation. |
| `hermes doctor` | Diagnose config and dependency issues. |
| `hermes dump` | Copy-pasteable setup summary for support/debugging. |
| `hermes debug` | Debug tools — upload logs and system info for support. |
| `hermes backup` | Back up Hermes home directory to a zip file. |
| `hermes import` | Restore a Hermes backup from a zip file. |
| `hermes logs` | View, tail, and filter agent/gateway/error log files. |
| `hermes config` | Show, edit, migrate, and query configuration files. |
| `hermes pairing` | Approve or revoke messaging pairing codes. |
@@ -355,6 +358,70 @@ config_overrides:
`hermes dump` is specifically designed for sharing. For interactive diagnostics, use `hermes doctor`. For a visual overview, use `hermes status`.
:::
## `hermes debug`
```bash
hermes debug share [options]
```
Upload a debug report (system info + recent logs) to a paste service and get a shareable URL. Useful for quick support requests — includes everything a helper needs to diagnose your issue.
| Option | Description |
|--------|-------------|
| `--lines <N>` | Number of log lines to include per log file (default: 200). |
| `--expire <days>` | Paste expiry in days (default: 7). |
| `--local` | Print the report locally instead of uploading. |
The report includes system info (OS, Python version, Hermes version), recent agent and gateway logs (512 KB limit per file), and redacted API key status. Keys are always redacted — no secrets are uploaded.
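Key redaction of this kind can be sketched as follows (a hypothetical helper with an illustrative pattern, not the actual Hermes code):

```python
import re

def redact_keys(text: str) -> str:
    """Mask anything that looks like an API-key or token assignment."""
    # Illustrative pattern: keeps the variable name, masks the value.
    return re.sub(r"(?i)((?:api[_-]?key|token)\s*[=:]\s*)\S+",
                  r"\1[REDACTED]", text)

redact_keys("OPENAI_API_KEY=sk-abc123")  # → 'OPENAI_API_KEY=[REDACTED]'
```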
Paste services tried in order: paste.rs, dpaste.com.
### Examples
```bash
hermes debug share # Upload debug report, print URL
hermes debug share --lines 500 # Include more log lines
hermes debug share --expire 30 # Keep paste for 30 days
hermes debug share --local # Print report to terminal (no upload)
```
## `hermes backup`
```bash
hermes backup [options]
```
Create a zip archive of your Hermes configuration, skills, sessions, and data. The backup excludes the hermes-agent codebase itself.
| Option | Description |
|--------|-------------|
| `-o`, `--output <path>` | Output path for the zip file (default: `~/hermes-backup-<timestamp>.zip`). |
| `-q`, `--quick` | Quick snapshot: only critical state files (config.yaml, state.db, .env, auth, cron jobs). Much faster than a full backup. |
| `-l`, `--label <name>` | Label for the snapshot (only used with `--quick`). |
The backup uses SQLite's `backup()` API for safe copying, so it works correctly even when Hermes is running (WAL-mode safe).
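The WAL-safe copy can be sketched with Python's built-in `sqlite3.Connection.backup()` (paths and the helper name are illustrative, not Hermes internals):

```python
import sqlite3

def safe_copy(src_path: str, dst_path: str) -> None:
    """Copy a live SQLite database page-by-page without corruption."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    with dst:
        src.backup(dst)   # online backup API — safe while writers are active
    src.close()
    dst.close()
```

Unlike a plain file copy, the backup API coordinates with SQLite's locking, which is why the backup works even while Hermes is running.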
### Examples
```bash
hermes backup # Full backup to ~/hermes-backup-*.zip
hermes backup -o /tmp/hermes.zip # Full backup to specific path
hermes backup --quick # Quick state-only snapshot
hermes backup --quick --label "pre-upgrade" # Quick snapshot with label
```
## `hermes import`
```bash
hermes import <zipfile> [options]
```
Restore a previously created Hermes backup into your Hermes home directory.
| Option | Description |
|--------|-------------|
| `-f`, `--force` | Overwrite existing files without confirmation. |
## `hermes logs`
```bash

View File

@@ -328,17 +328,24 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
## Context Compression (config.yaml only)
Context compression is configured exclusively through `config.yaml` — there are no environment variables for it. Threshold settings live in the `compression:` block, while the summarization model/provider lives under `auxiliary.compression:`.
```yaml
compression:
  enabled: true
  threshold: 0.50

auxiliary:
  compression:
    model: ""        # empty = auto-detect
    provider: auto
    base_url: null   # Custom OpenAI-compatible endpoint for summaries
```
:::info Legacy migration
Older configs with `compression.summary_model`, `compression.summary_provider`, and `compression.summary_base_url` are automatically migrated to `auxiliary.compression.*` on first load.
:::
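A minimal sketch of what such an auto-migration does to a loaded config dict (key names are from the docs above; the function itself is hypothetical, not the Hermes migration code):

```python
def migrate_compression(cfg: dict) -> dict:
    """Move legacy compression.summary_* keys to auxiliary.compression.*."""
    comp = cfg.get("compression", {})
    aux = cfg.setdefault("auxiliary", {}).setdefault("compression", {})
    for old, new in [("summary_model", "model"),
                     ("summary_provider", "provider"),
                     ("summary_base_url", "base_url")]:
        if old in comp:
            # Existing auxiliary.compression values win over legacy keys.
            aux.setdefault(new, comp.pop(old))
    return cfg

cfg = {"compression": {"enabled": True, "summary_model": "gemini-3-flash"}}
migrate_compression(cfg)
# cfg["auxiliary"]["compression"]["model"] is now "gemini-3-flash"
```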
## Auxiliary Task Overrides
| Variable | Description |

View File

@@ -28,8 +28,9 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/retry` | Retry the last message (resend to agent) |
| `/undo` | Remove the last user/assistant exchange |
| `/title` | Set a title for the current session (usage: /title My Session Name) |
| `/compress [focus topic]` | Manually compress conversation context (flush memories + summarize). Optional focus topic narrows what the summary preserves. |
| `/rollback` | List or restore filesystem checkpoints (usage: /rollback [number]) |
| `/snapshot [create\|restore <id>\|prune]` (alias: `/snap`) | Create or restore state snapshots of Hermes config/state. `create [label]` saves a snapshot, `restore <id>` reverts to it, `prune [N]` removes old snapshots, or list all with no args. |
| `/stop` | Kill all running background processes |
| `/queue <prompt>` (alias: `/q`) | Queue a prompt for the next turn (doesn't interrupt the current agent response). **Note:** `/q` is claimed by both `/queue` and `/quit`; the last registration wins, so `/q` resolves to `/quit` in practice. Use `/queue` explicitly. |
| `/resume [name]` | Resume a previously-named session |
@@ -44,11 +45,12 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| Command | Description |
|---------|-------------|
| `/config` | Show current configuration |
| `/model [model-name]` | Show or change the current model. Supports: `/model claude-sonnet-4`, `/model provider:model` (switch providers), `/model custom:model` (custom endpoint), `/model custom:name:model` (named custom provider), `/model custom` (auto-detect from endpoint). Use `--global` to persist the change to config.yaml. |
| `/provider` | Show available providers and current provider |
| `/personality` | Set a predefined personality |
| `/verbose` | Cycle tool progress display: off → new → all → verbose. Can be [enabled for messaging](#notes) via config. |
| `/reasoning` | Manage reasoning effort and display (usage: /reasoning [level\|show\|hide]) |
| `/fast [normal\|fast\|status]` | Toggle fast mode — OpenAI Priority Processing / Anthropic Fast Mode. Options: `normal`, `fast`, `status`, `on`, `off`. |
| `/skin` | Show or change the display skin/theme |
| `/voice [on\|off\|tts\|status]` | Toggle CLI voice mode and spoken playback. Recording uses `voice.record_key` (default: `Ctrl+B`). |
| `/yolo` | Toggle YOLO mode — skip all dangerous command approval prompts. |
@@ -75,6 +77,8 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in
| `/insights` | Show usage insights and analytics (last 30 days) |
| `/platforms` (alias: `/gateway`) | Show gateway/messaging platform status |
| `/paste` | Check clipboard for an image and attach it |
| `/image <path>` | Attach a local image file for your next prompt. |
| `/debug` | Upload debug report (system info + logs) and get shareable links. Also available in messaging. |
| `/profile` | Show active profile name and home directory |
### Exit
@@ -117,13 +121,14 @@ The messaging gateway supports the following built-in commands inside Telegram,
| `/reset` | Reset conversation history. |
| `/status` | Show session info. |
| `/stop` | Kill all running background processes and interrupt the running agent. |
| `/model [provider:model]` | Show or change the model. Supports provider switches (`/model zai:glm-5`), custom endpoints (`/model custom:model`), named custom providers (`/model custom:local:qwen`), and auto-detect (`/model custom`). Use `--global` to persist the change to config.yaml. |
| `/provider` | Show provider availability and auth status. |
| `/personality [name]` | Set a personality overlay for the session. |
| `/fast [normal\|fast\|status]` | Toggle fast mode — OpenAI Priority Processing / Anthropic Fast Mode. |
| `/retry` | Retry the last message. |
| `/undo` | Remove the last exchange. |
| `/sethome` (alias: `/set-home`) | Mark the current chat as the platform home channel for deliveries. |
| `/compress [focus topic]` | Manually compress conversation context. Optional focus topic narrows what the summary preserves. |
| `/title [name]` | Set or show the session title. |
| `/resume [name]` | Resume a previously named session. |
| `/usage` | Show token usage, estimated cost breakdown (input/output), context window state, and session duration. |
@@ -131,6 +136,7 @@ The messaging gateway supports the following built-in commands inside Telegram,
| `/reasoning [level\|show\|hide]` | Change reasoning effort or toggle reasoning display. |
| `/voice [on\|off\|tts\|join\|channel\|leave\|status]` | Control spoken replies in chat. `join`/`channel`/`leave` manage Discord voice-channel mode. |
| `/rollback [number]` | List or restore filesystem checkpoints. |
| `/snapshot [create\|restore <id>\|prune]` (alias: `/snap`) | Create or restore state snapshots of Hermes config/state. |
| `/background <prompt>` | Run a prompt in a separate background session. Results are delivered back to the same chat when the task finishes. See [Messaging Background Sessions](/docs/user-guide/messaging/#background-sessions). |
| `/plan [request]` | Load the bundled `plan` skill to write a markdown plan instead of executing the work. Plans are saved under `.hermes/plans/` relative to the active workspace/backend working directory. |
| `/reload-mcp` (alias: `/reload_mcp`) | Reload MCP servers from config. |
@@ -140,13 +146,15 @@ The messaging gateway supports the following built-in commands inside Telegram,
| `/approve [session\|always]` | Approve and execute a pending dangerous command. `session` approves for this session only; `always` adds to permanent allowlist. |
| `/deny` | Reject a pending dangerous command. |
| `/update` | Update Hermes Agent to the latest version. |
| `/restart` | Gracefully restart the gateway after draining active runs. When the gateway comes back online, it sends a confirmation to the requester's chat/thread. |
| `/debug` | Upload debug report (system info + logs) and get shareable links. |
| `/help` | Show messaging help. |
| `/<skill-name>` | Invoke any installed skill by name. |
## Notes
- `/skin`, `/tools`, `/toolsets`, `/browser`, `/config`, `/cron`, `/skills`, `/platforms`, `/paste`, `/image`, `/statusbar`, and `/plugins` are **CLI-only** commands.
- `/verbose` is **CLI-only by default**, but can be enabled for messaging platforms by setting `display.tool_progress_command: true` in `config.yaml`. When enabled, it cycles the `display.tool_progress` mode and saves to config.
- `/sethome`, `/update`, `/restart`, `/approve`, `/deny`, and `/commands` are **messaging-only** commands.
- `/status`, `/background`, `/voice`, `/reload-mcp`, `/rollback`, `/snapshot`, `/debug`, `/fast`, and `/yolo` work in **both** the CLI and the messaging gateway.
- `/voice join`, `/voice channel`, and `/voice leave` are only meaningful on Discord.

View File

@@ -322,7 +322,11 @@ Long conversations are automatically summarized when approaching context limits:
compression:
  enabled: true
  threshold: 0.50   # Compress at 50% of context limit by default

# Summarization model configured under auxiliary:
auxiliary:
  compression:
    model: "google/gemini-3-flash-preview"   # Model used for summarization
```
When compression triggers, middle turns are summarized while the first 3 and last 4 turns are always preserved.
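The head/middle/tail split described above can be sketched as simple list slicing (an illustration assuming more than seven turns, not the actual implementation):

```python
def split_for_compression(turns: list) -> tuple:
    """Return (head, middle, tail): middle is summarized, head/tail kept."""
    head = turns[:3]     # first 3 turns always preserved
    tail = turns[-4:]    # last 4 turns always preserved
    middle = turns[3:-4] # everything in between gets summarized
    return head, middle, tail

head, middle, tail = split_for_compression(list(range(10)))
# head=[0, 1, 2], middle=[3, 4, 5], tail=[6, 7, 8, 9]
```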

View File

@@ -441,11 +441,19 @@ compression:
  threshold: 0.50      # Compress at this % of context limit
  target_ratio: 0.20   # Fraction of threshold to preserve as recent tail
  protect_last_n: 20   # Min recent messages to keep uncompressed

# The summarization model/provider is configured under auxiliary:
auxiliary:
  compression:
    model: "google/gemini-3-flash-preview"   # Model for summarization
    provider: "auto"   # Provider: "auto", "openrouter", "nous", "codex", "main", etc.
    base_url: null     # Custom OpenAI-compatible endpoint (overrides provider)
```
:::info Legacy config migration
Older configs with `compression.summary_model`, `compression.summary_provider`, and `compression.summary_base_url` are automatically migrated to `auxiliary.compression.*` on first load (config version 17). No manual action needed.
:::
### Common setups
**Default (auto-detect) — no configuration needed:**
@@ -458,30 +466,32 @@ Uses the first available provider (OpenRouter → Nous → Codex) with Gemini Fl
**Force a specific provider** (OAuth or API-key based):
```yaml
auxiliary:
  compression:
    provider: nous
    model: gemini-3-flash
```
Works with any provider: `nous`, `openrouter`, `codex`, `anthropic`, `main`, etc.
**Custom endpoint** (self-hosted, Ollama, zai, DeepSeek, etc.):
```yaml
auxiliary:
  compression:
    model: glm-4.7
    base_url: https://api.z.ai/api/coding/paas/v4
```
Points at a custom OpenAI-compatible endpoint. Uses `OPENAI_API_KEY` for auth.
### How the three knobs interact
| `auxiliary.compression.provider` | `auxiliary.compression.base_url` | Result |
|---------------------|---------------------|--------|
| `auto` (default) | not set | Auto-detect best available provider |
| `nous` / `openrouter` / etc. | not set | Force that provider, use its auth |
| any | set | Use the custom endpoint directly (provider ignored) |
:::warning Summary model context length requirement
The summary model **must** have a context window at least as large as your main agent model's. The compressor sends the full middle section of the conversation to the summary model — if that model's context window is smaller than the main model's, the summarization call will fail with a context length error. When this happens, the middle turns are **dropped without a summary**, losing conversation context silently. If you override the model, verify its context length meets or exceeds your main model's.
:::
## Context Engine
@@ -522,6 +532,8 @@ agent:
Budget pressure is enabled by default. The agent sees warnings naturally as part of tool results, encouraging it to consolidate its work and deliver a response before running out of iterations.
When the iteration budget is fully exhausted, the CLI shows a notification to the user: `⚠ Iteration budget reached (90/90) — response may be incomplete`. If the budget runs out during active work, the agent generates a summary of what was accomplished before stopping.
### Streaming Timeouts
The LLM streaming connection has two timeout layers. Both auto-adjust for local providers (localhost, LAN IPs) — no configuration needed for most setups.
@@ -666,7 +678,7 @@ Each auxiliary task has a configurable `timeout` (in seconds). Defaults: vision
:::
:::info
Context compression has its own `compression:` block for thresholds and an `auxiliary.compression:` block for model/provider settings — see [Context Compression](#context-compression) above. The fallback model uses a `fallback_model:` block — see [Fallback Model](/docs/integrations/providers#fallback-model). All three follow the same provider/model/base_url pattern.
:::
### Changing the Vision Model
@@ -839,16 +851,21 @@ agent:
```yaml
tts:
  provider: "edge"   # "edge" | "elevenlabs" | "openai" | "neutts" | "minimax"
  speed: 1.0         # Global speed multiplier (fallback for all providers)
  edge:
    voice: "en-US-AriaNeural"   # 322 voices, 74 languages
    speed: 1.0       # Speed multiplier (converted to rate percentage, e.g. 1.5 → +50%)
  elevenlabs:
    voice_id: "pNInz6obpgDQGcFmaJgB"
    model_id: "eleven_multilingual_v2"
  openai:
    model: "gpt-4o-mini-tts"
    voice: "alloy"   # alloy, echo, fable, onyx, nova, shimmer
    speed: 1.0       # Speed multiplier (clamped to 0.25-4.0 by the API)
    base_url: "https://api.openai.com/v1"   # Override for OpenAI-compatible TTS endpoints
  minimax:
    speed: 1.0       # Speech speed multiplier
  neutts:
    ref_audio: ''
    ref_text: ''
@@ -858,6 +875,8 @@ tts:
This controls both the `text_to_speech` tool and spoken replies in voice mode (`/voice tts` in the CLI or messaging gateway).
**Speed fallback hierarchy:** provider-specific speed (e.g. `tts.edge.speed`) → global `tts.speed` → `1.0` default. Set the global `tts.speed` to apply a uniform speed across all providers, or override per-provider for fine-grained control.
## Display Settings
```yaml

View File

@@ -156,7 +156,7 @@ Hermes uses separate lightweight models for side tasks. Each task has its own pr
|------|-------------|-----------|
| Vision | Image analysis, browser screenshots | `auxiliary.vision` |
| Web Extract | Web page summarization | `auxiliary.web_extract` |
| Compression | Context compression summaries | `auxiliary.compression` |
| Session Search | Past session summarization | `auxiliary.session_search` |
| Skills Hub | Skill search and discovery | `auxiliary.skills_hub` |
| MCP | MCP helper operations | `auxiliary.mcp` |
@@ -219,13 +219,14 @@ auxiliary:
model: ""
```
Every task above follows the same **provider / model / base_url** pattern. Context compression is configured under `auxiliary.compression`:
```yaml
auxiliary:
  compression:
    provider: main   # Same provider options as other auxiliary tasks
    model: google/gemini-3-flash-preview
    base_url: null   # Custom OpenAI-compatible endpoint
```
And the fallback model uses:
@@ -270,15 +271,18 @@ auxiliary:
## Context Compression Fallback
Context compression uses the `auxiliary.compression` config block to control which model and provider handles summarization:
```yaml
auxiliary:
  compression:
    provider: "auto"   # auto | openrouter | nous | main
    model: "google/gemini-3-flash-preview"
```
:::info Legacy migration
Older configs with `compression.summary_model` / `compression.summary_provider` / `compression.summary_base_url` are automatically migrated to `auxiliary.compression.*` on first load (config version 17).
:::
If no provider is available for compression, Hermes drops middle conversation turns without generating a summary rather than failing the session.
@@ -325,7 +329,7 @@ See [Scheduled Tasks (Cron)](/docs/user-guide/features/cron) for full configurat
| Main agent model | `fallback_model` in config.yaml — one-shot failover on errors | `fallback_model:` (top-level) |
| Vision | Auto-detection chain + internal OpenRouter retry | `auxiliary.vision` |
| Web extraction | Auto-detection chain + internal OpenRouter retry | `auxiliary.web_extract` |
| Context compression | Auto-detection chain, degrades to no-summary if unavailable | `auxiliary.compression` |
| Session search | Auto-detection chain | `auxiliary.session_search` |
| Skills hub | Auto-detection chain | `auxiliary.skills_hub` |
| MCP helpers | Auto-detection chain | `auxiliary.mcp` |

View File

@@ -426,6 +426,10 @@ hermes skills update react # Update one specific installed hub skill
This uses the stored source identifier plus the current upstream bundle content hash to detect drift.
:::tip GitHub rate limits
Skills hub operations use the GitHub API, which has a rate limit of 60 requests/hour for unauthenticated users. If you see rate-limit errors during install or search, set `GITHUB_TOKEN` in your `.env` file to increase the limit to 5,000 requests/hour. The error message includes an actionable hint when this happens.
:::
### Slash commands (inside chat)
All the same commands work with `/skills`:

View File

@@ -36,8 +36,10 @@ Convert text to speech with six providers:
# In ~/.hermes/config.yaml
tts:
  provider: "edge"   # "edge" | "elevenlabs" | "openai" | "minimax" | "mistral" | "neutts"
  speed: 1.0         # Global speed multiplier (provider-specific settings override this)
  edge:
    voice: "en-US-AriaNeural"   # 322 voices, 74 languages
    speed: 1.0       # Converted to rate percentage (+/-%)
  elevenlabs:
    voice_id: "pNInz6obpgDQGcFmaJgB"   # Adam
    model_id: "eleven_multilingual_v2"
@@ -45,6 +47,7 @@ tts:
    model: "gpt-4o-mini-tts"
    voice: "alloy"           # alloy, echo, fable, onyx, nova, shimmer
    base_url: "https://api.openai.com/v1"   # Override for OpenAI-compatible TTS endpoints
    speed: 1.0               # 0.25 - 4.0
  minimax:
    model: "speech-2.8-hd"   # speech-2.8-hd (default), speech-2.8-turbo
    voice_id: "English_Graceful_Lady"   # See https://platform.minimax.io/faq/system-voice-id
@@ -61,6 +64,8 @@ tts:
    device: cpu
```
**Speed control**: The global `tts.speed` value applies to all providers by default. Each provider can override it with its own `speed` setting (e.g., `tts.openai.speed: 1.5`). Provider-specific speed takes precedence over the global value. Default is `1.0` (normal speed).
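The precedence rule can be sketched as a small resolver (a hypothetical helper operating on the parsed `tts:` section, not the Hermes API):

```python
def resolve_speed(tts_cfg: dict, provider: str) -> float:
    """Provider-specific speed > global tts.speed > 1.0 default."""
    provider_speed = tts_cfg.get(provider, {}).get("speed")
    if provider_speed is not None:
        return provider_speed
    return tts_cfg.get("speed", 1.0)

cfg = {"speed": 1.2, "openai": {"speed": 1.5}, "edge": {}}
resolve_speed(cfg, "openai")  # provider override wins: 1.5
resolve_speed(cfg, "edge")    # falls back to global: 1.2
```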
### Telegram Voice Bubbles & ffmpeg
Telegram voice bubbles require Opus/OGG audio format:

View File

@@ -1,7 +1,7 @@
---
sidebar_position: 15
title: "Web Dashboard"
description: "Browser-based dashboard for managing configuration, API keys, sessions, logs, analytics, cron jobs, and skills"
---
# Web Dashboard
@@ -104,6 +104,54 @@ Each key shows:
Advanced/rarely-used keys are hidden by default behind a toggle.
### Sessions
Browse and inspect all agent sessions. Each row shows the session title, source platform icon (CLI, Telegram, Discord, Slack, cron), model name, message count, tool call count, and how long ago it was active. Live sessions are marked with a pulsing badge.
- **Search** — full-text search across all message content using FTS5. Results show highlighted snippets and auto-scroll to the first matching message when expanded.
- **Expand** — click a session to load its full message history. Messages are color-coded by role (user, assistant, system, tool) and rendered as Markdown with syntax highlighting.
- **Tool calls** — assistant messages with tool calls show collapsible blocks with the function name and JSON arguments.
- **Delete** — remove a session and its message history with the trash icon.
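Full-text search of this kind can be sketched with SQLite's FTS5 extension (assuming your `sqlite3` build includes FTS5; the table and column names are illustrative, not Hermes's actual schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE messages USING fts5(session_id, content)")
con.execute("INSERT INTO messages VALUES ('s1', 'deploy the staging cluster')")
con.execute("INSERT INTO messages VALUES ('s2', 'write unit tests')")

# MATCH runs the full-text query; snippet() returns a highlighted excerpt.
rows = con.execute(
    "SELECT session_id, snippet(messages, 1, '[', ']', '…', 8) "
    "FROM messages WHERE messages MATCH 'staging'").fetchall()
# rows[0] points at session s1 with the match bracketed in the snippet
```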
### Logs
View agent, gateway, and error log files with filtering and live tailing.
- **File** — switch between `agent`, `errors`, and `gateway` log files
- **Level** — filter by log level: ALL, DEBUG, INFO, WARNING, or ERROR
- **Component** — filter by source component: all, gateway, agent, tools, cli, or cron
- **Lines** — choose how many lines to display (50, 100, 200, or 500)
- **Auto-refresh** — toggle live tailing that polls for new log lines every 5 seconds
- **Color-coded** — log lines are colored by severity (red for errors, yellow for warnings, dim for debug)
### Analytics
Usage and cost analytics computed from session history. Select a time period (7, 30, or 90 days) to see:
- **Summary cards** — total tokens (input/output), cache hit percentage, total estimated or actual cost, and total session count with daily average
- **Daily token chart** — stacked bar chart showing input and output token usage per day, with hover tooltips showing breakdowns and cost
- **Daily breakdown table** — date, session count, input tokens, output tokens, cache hit rate, and cost for each day
- **Per-model breakdown** — table showing each model used, its session count, token usage, and estimated cost
### Cron
Create and manage scheduled cron jobs that run agent prompts on a recurring schedule.
- **Create** — fill in a name (optional), prompt, cron expression (e.g. `0 9 * * *`), and delivery target (local, Telegram, Discord, Slack, or email)
- **Job list** — each job shows its name, prompt preview, schedule expression, state badge (enabled/paused/error), delivery target, last run time, and next run time
- **Pause / Resume** — toggle a job between active and paused states
- **Trigger now** — immediately execute a job outside its normal schedule
- **Delete** — permanently remove a cron job
### Skills
Browse, search, and toggle skills and toolsets. Skills are loaded from `~/.hermes/skills/` and grouped by category.
- **Search** — filter skills and toolsets by name, description, or category
- **Category filter** — click category pills to narrow the list (e.g. MLOps, MCP, Red Teaming, AI)
- **Toggle** — enable or disable individual skills with a switch. Changes take effect on the next session.
- **Toolsets** — a separate section shows built-in toolsets (file operations, web browsing, etc.) with their active/inactive status, setup requirements, and list of included tools
:::warning Security
The web dashboard reads and writes your `.env` file, which contains API keys and secrets. It binds to `127.0.0.1` by default — only accessible from your local machine. If you bind to `0.0.0.0`, anyone on your network can view and modify your credentials. The dashboard has no authentication of its own.
:::
@@ -159,6 +207,66 @@ Sets an environment variable. Body: `{"key": "VAR_NAME", "value": "secret"}`.
Removes an environment variable. Body: `{"key": "VAR_NAME"}`.
### GET /api/sessions/\{session_id\}
Returns metadata for a single session.
### GET /api/sessions/\{session_id\}/messages
Returns the full message history for a session, including tool calls and timestamps.
### GET /api/sessions/search
Full-text search across message content. Query parameter: `q`. Returns matching session IDs with highlighted snippets.
### DELETE /api/sessions/\{session_id\}
Deletes a session and its message history.
### GET /api/logs
Returns log lines. Query parameters: `file` (agent/errors/gateway), `lines` (count), `level`, `component`.
### GET /api/analytics/usage
Returns token usage, cost, and session analytics. Query parameter: `days` (default 30). Response includes daily breakdowns and per-model aggregates.
### GET /api/cron/jobs
Returns all configured cron jobs with their state, schedule, and run history.
### POST /api/cron/jobs
Creates a new cron job. Body: `{"prompt": "...", "schedule": "0 9 * * *", "name": "...", "deliver": "local"}`.
### POST /api/cron/jobs/\{job_id\}/pause
Pauses a cron job.
### POST /api/cron/jobs/\{job_id\}/resume
Resumes a paused cron job.
### POST /api/cron/jobs/\{job_id\}/trigger
Immediately triggers a cron job outside its schedule.
### DELETE /api/cron/jobs/\{job_id\}
Deletes a cron job.
### GET /api/skills
Returns all skills with their name, description, category, and enabled status.
### PUT /api/skills/toggle
Enables or disables a skill. Body: `{"name": "skill-name", "enabled": true}`.
### GET /api/tools/toolsets
Returns all toolsets with their label, description, tools list, and active/configured status.
## CORS
The web server restricts CORS to localhost origins only:

View File

@@ -143,5 +143,5 @@ The crypto implementation is compatible with Tencent's official WXBizMsgCrypt SD
- **No streaming** — replies arrive as complete messages after the agent finishes
- **No typing indicators** — the callback model doesn't support typing status
- **Text only** — currently supports text messages for input; image/file/voice input not yet implemented. The agent is aware of outbound media capabilities via the WeCom platform hint (images, documents, video, voice).
- **Response latency** — agent sessions take 3–30 minutes; users see the reply when processing completes

View File

@@ -174,6 +174,33 @@ whatsapp:
---
## Message Formatting & Delivery
WhatsApp supports **streaming (progressive) responses** — the bot edits its message in real-time as the AI generates text, just like Discord and Telegram. Internally, WhatsApp is classified as a TIER_MEDIUM platform for delivery capabilities.
### Chunking
Long responses are automatically split into multiple messages at **4,096 characters** per chunk (WhatsApp's practical display limit). You don't need to configure anything — the gateway handles splitting and sends chunks sequentially.
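The split can be sketched as follows (a naive fixed-width illustration; the real gateway likely prefers word or line boundaries):

```python
def chunk_message(text: str, limit: int = 4096) -> list:
    """Split a long reply into chunks of at most `limit` characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]

chunks = chunk_message("x" * 9000)
# → three chunks of 4096, 4096, and 808 characters
```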
### WhatsApp-Compatible Markdown
Standard Markdown in AI responses is automatically converted to WhatsApp's native formatting:
| Markdown | WhatsApp | Renders as |
|----------|----------|------------|
| `**bold**` | `*bold*` | **bold** |
| `~~strikethrough~~` | `~strikethrough~` | ~~strikethrough~~ |
| `# Heading` | `*Heading*` | Bold text (no native headings) |
| `[link text](url)` | `link text (url)` | Inline URL |
Code blocks and inline code are preserved as-is since WhatsApp supports triple-backtick formatting natively.
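A regex sketch of the conversions in the table above (illustrative only — a production converter must also skip code spans and handle nesting):

```python
import re

def to_whatsapp(md: str) -> str:
    md = re.sub(r"\*\*(.+?)\*\*", r"*\1*", md)                 # **bold** -> *bold*
    md = re.sub(r"~~(.+?)~~", r"~\1~", md)                     # ~~strike~~ -> ~strike~
    md = re.sub(r"^#+\s*(.+)$", r"*\1*", md, flags=re.M)       # headings -> bold line
    md = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"\1 (\2)", md)    # [text](url) -> text (url)
    return md

to_whatsapp("**hi** ~~old~~ [docs](https://x.y)")
# → '*hi* ~old~ docs (https://x.y)'
```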
### Tool Progress
When the agent calls tools (web search, file operations, etc.), WhatsApp displays real-time progress indicators showing which tool is running. This is enabled by default — no configuration needed.
---
## Troubleshooting
| Problem | Solution |