Teknium dd60bcbfb7 feat: OpenAI-compatible API server + WhatsApp configurable reply prefix (#1756)

* feat: OpenAI-compatible API server platform adapter

Salvaged from PR #956, updated for current main.

Adds an HTTP API server as a gateway platform adapter that exposes
hermes-agent via the OpenAI Chat Completions and Responses APIs.
Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat,
AnythingLLM, NextChat, ChatBox, etc.) can connect by pointing at
http://localhost:8642/v1.

Endpoints:
- POST /v1/chat/completions  — stateless Chat Completions API
- POST /v1/responses         — stateful Responses API with chaining
- GET  /v1/responses/{id}    — retrieve stored response
- DELETE /v1/responses/{id}  — delete stored response
- GET  /v1/models            — list hermes-agent as available model
- GET  /health               — health check

Features:
- Real SSE streaming via stream_delta_callback (uses main's streaming)
- In-memory LRU response store for Responses API conversation chaining
- Named conversations via 'conversation' parameter
- Bearer token auth (optional, via API_SERVER_KEY)
- CORS support for browser-based frontends
- System prompt layering (frontend system messages on top of core)
- Real token usage tracking in responses

Integration points:
- Platform.API_SERVER in gateway/config.py
- _create_adapter() branch in gateway/run.py
- API_SERVER_* env vars in hermes_cli/config.py
- Env var overrides in gateway/config.py _apply_env_overrides()

Changes vs original PR #956:
- Removed streaming infrastructure (already on main via stream_consumer.py)
- Removed Telegram reply_to_mode (separate feature, not included)
- Updated _resolve_model() -> _resolve_gateway_model()
- Updated stream_callback -> stream_delta_callback
- Updated connect()/disconnect() to use _mark_connected()/_mark_disconnected()
- Adapted to current Platform enum (includes MATTERMOST, MATRIX, DINGTALK)

Tests: 72 new tests, all passing
Docs: API server guide, Open WebUI integration guide, env var reference

* feat(whatsapp): make reply prefix configurable via config.yaml

Reworked from PR #1764 (ifrederico) to use config.yaml instead of .env.

The WhatsApp bridge prepends a header to every outgoing message.
This was hardcoded to '⚕ *Hermes Agent*'. Users can now customize
or disable it via config.yaml:

  whatsapp:
    reply_prefix: ''                     # disable header
    reply_prefix: '🤖 *My Bot*\n───\n'  # custom prefix

How it works:
- load_gateway_config() reads whatsapp.reply_prefix from config.yaml
  and stores it in PlatformConfig.extra['reply_prefix']
- WhatsAppAdapter reads it from config.extra at init
- When spawning bridge.js, the adapter passes it as
  WHATSAPP_REPLY_PREFIX in the subprocess environment
- bridge.js handles undefined (default), empty (no header),
  or custom values with \\n escape support
- Self-chat echo suppression uses the configured prefix

Also fixes _config_version: was 9 but ENV_VARS_BY_VERSION had a
key 10 (TAVILY_API_KEY), so existing users at v9 would never be
prompted for Tavily. Bumped to 10 to close the gap. Added a
regression test to prevent this from happening again.

Credit: ifrederico (PR #1764) for the bridge.js implementation
and the config version gap discovery.

---------

Co-authored-by: Test <test@test.com>

2026-03-17 10:44:37 -07:00

12 KiB

sidebar_position	title	description
1	Messaging Gateway	Chat with Hermes from Telegram, Discord, Slack, WhatsApp, Signal, SMS, Email, Home Assistant, Mattermost, Matrix, DingTalk, or any OpenAI-compatible frontend via the API server — architecture and setup overview

Messaging Gateway

Chat with Hermes from Telegram, Discord, Slack, WhatsApp, Signal, SMS, Email, Home Assistant, Mattermost, Matrix, DingTalk, or your browser. The gateway is a single background process that connects to all your configured platforms, handles sessions, runs cron jobs, and delivers voice messages.

For the full voice feature set — including CLI microphone mode, spoken replies in messaging, and Discord voice-channel conversations — see Voice Mode and Use Voice Mode with Hermes.

Architecture

flowchart TB
    subgraph Gateway["Hermes Gateway"]
        subgraph Adapters["Platform adapters"]
            tg[Telegram]
            dc[Discord]
            wa[WhatsApp]
            sl[Slack]
            sig[Signal]
            sms[SMS]
            em[Email]
            ha[Home Assistant]
            mm[Mattermost]
            mx[Matrix]
            dt[DingTalk]
            api["API Server<br/>(OpenAI-compatible)"]
        end

        store["Session store<br/>per chat"]
        agent["AIAgent<br/>run_agent.py"]
        cron["Cron scheduler<br/>ticks every 60s"]
    end

    tg --> store
    dc --> store
    wa --> store
    sl --> store
    sig --> store
    sms --> store
    em --> store
    ha --> store
    mm --> store
    mx --> store
    dt --> store
    api --> store
    store --> agent
    cron --> store

Each platform adapter receives messages, routes them through a per-chat session store, and dispatches them to the AIAgent for processing. The gateway also runs the cron scheduler, ticking every 60 seconds to execute any due jobs.
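
The routing described above can be sketched in a few lines of Python. This is only an illustration of the adapter, session store, and agent flow: `Session`, `SessionStore`, `echo_agent`, and `handle_incoming` are hypothetical names, not the gateway's actual classes, and the stand-in agent simply echoes the message.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Conversation history for one chat (hypothetical shape)."""
    history: list = field(default_factory=list)


class SessionStore:
    """Keeps one Session per (platform, chat_id) pair."""

    def __init__(self):
        self._sessions = {}

    def get(self, platform, chat_id):
        # Create the session on first contact, reuse it afterwards.
        return self._sessions.setdefault((platform, chat_id), Session())


def echo_agent(session, text):
    """Stand-in for the real agent: records the message and replies."""
    session.history.append(text)
    return f"[{len(session.history)}] {text}"


def handle_incoming(store, platform, chat_id, text, agent=echo_agent):
    """What an adapter does with each message: look up the session, call the agent."""
    session = store.get(platform, chat_id)
    return agent(session, text)
```

Two messages from the same Telegram chat share one session, while a Discord chat with the same ID gets its own.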

Quick Setup

The easiest way to configure messaging platforms is the interactive wizard:

hermes gateway setup        # Interactive setup for all messaging platforms

This walks you through configuring each platform with arrow-key selection, shows which platforms are already configured, and offers to start/restart the gateway when done.

Gateway Commands

hermes gateway              # Run in foreground
hermes gateway setup        # Configure messaging platforms interactively
hermes gateway install      # Install as a user service (Linux) / launchd service (macOS)
sudo hermes gateway install --system   # Linux only: install a boot-time system service
hermes gateway start        # Start the default service
hermes gateway stop         # Stop the default service
hermes gateway status       # Check default service status
hermes gateway status --system         # Linux only: inspect the system service explicitly

Chat Commands (Inside Messaging)

Command	Description
`/new` or `/reset`	Start a fresh conversation
`/model [provider:model]`	Show or change the model (supports `provider:model` syntax)
`/provider`	Show available providers with auth status
`/personality [name]`	Set a personality
`/retry`	Retry the last message
`/undo`	Remove the last exchange
`/status`	Show session info
`/stop`	Stop the running agent
`/sethome`	Set this chat as the home channel
`/compress`	Manually compress conversation context
`/title [name]`	Set or show the session title
`/resume [name]`	Resume a previously named session
`/usage`	Show token usage for this session
`/insights [days]`	Show usage insights and analytics
`/reasoning [level|show|hide]`	Change reasoning effort or toggle reasoning display
`/voice [on|off|tts|join|leave|status]`	Control messaging voice replies and Discord voice-channel behavior
`/rollback [number]`	List or restore filesystem checkpoints
`/background <prompt>`	Run a prompt in a separate background session
`/reload-mcp`	Reload MCP servers from config
`/update`	Update Hermes Agent to the latest version
`/help`	Show available commands
`/<skill-name>`	Invoke any installed skill

Session Management

Session Persistence

A session persists across messages until a reset policy triggers or you send `/new` or `/reset`, so the agent keeps your conversation context from one message to the next.

Reset Policies

Sessions reset based on configurable policies:

Policy	Default	Description
Daily	4:00 AM	Reset at a specific hour each day
Idle	1440 min	Reset after N minutes of inactivity
Both	(combined)	Whichever triggers first

Configure per-platform overrides in ~/.hermes/gateway.json:

{
  "reset_by_platform": {
    "telegram": { "mode": "idle", "idle_minutes": 240 },
    "discord": { "mode": "idle", "idle_minutes": 60 }
  }
}
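
To make the three policies concrete, here is a minimal sketch of the reset decision. It assumes the mode strings are `daily`, `idle`, and `both` (only `idle` appears in the example config above) and that the daily policy fires once the most recent reset hour falls after the last activity. `should_reset` is a hypothetical helper, not the gateway's own function.

```python
from datetime import datetime, timedelta


def should_reset(mode, last_activity, now, daily_hour=4, idle_minutes=1440):
    """Return True if a session last used at `last_activity` should reset at `now`."""
    # Idle policy: N minutes have passed without activity.
    idle_hit = now - last_activity >= timedelta(minutes=idle_minutes)

    # Daily policy: find the most recent reset boundary at or before `now`.
    boundary = now.replace(hour=daily_hour, minute=0, second=0, microsecond=0)
    if boundary > now:
        boundary -= timedelta(days=1)
    daily_hit = last_activity < boundary

    if mode == "idle":
        return idle_hit
    if mode == "daily":
        return daily_hit
    if mode == "both":
        # Whichever triggers first.
        return idle_hit or daily_hit
    raise ValueError(f"unknown reset mode: {mode!r}")
```

With the Telegram override above (`idle_minutes=240`), a chat that has been quiet for just over four hours would reset on the next message.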

Security

By default, the gateway denies all users who are not in an allowlist or paired via DM. This is the safe default for a bot with terminal access.

# Restrict to specific users (recommended):
TELEGRAM_ALLOWED_USERS=123456789,987654321
DISCORD_ALLOWED_USERS=123456789012345678
SIGNAL_ALLOWED_USERS=+155****4567,+155****6543
SMS_ALLOWED_USERS=+155****4567,+155****6543
EMAIL_ALLOWED_USERS=trusted@example.com,colleague@work.com
MATTERMOST_ALLOWED_USERS=3uo8dkh1p7g1mfk49ear5fzs5c
MATRIX_ALLOWED_USERS=@alice:matrix.org
DINGTALK_ALLOWED_USERS=user-id-1

# Or allow specific users across all platforms:
GATEWAY_ALLOWED_USERS=123456789,987654321

# Or explicitly allow all users (NOT recommended for bots with terminal access):
GATEWAY_ALLOW_ALL_USERS=true
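
As an illustration of the deny-by-default rule, the sketch below checks a user against these variables. It assumes `GATEWAY_ALLOWED_USERS` applies to every platform and that the lists are merged; the gateway's real precedence rules are not shown here, and DM pairing is left out. `parse_allowlist` and `is_authorized` are hypothetical helpers.

```python
def parse_allowlist(raw):
    """Split a comma-separated env value into a set of user IDs."""
    return {item.strip() for item in (raw or "").split(",") if item.strip()}


def is_authorized(user_id, platform, env):
    """Deny by default; allow only listed users unless allow-all is set."""
    if env.get("GATEWAY_ALLOW_ALL_USERS", "").strip().lower() == "true":
        return True
    per_platform = parse_allowlist(env.get(f"{platform.upper()}_ALLOWED_USERS"))
    gateway_wide = parse_allowlist(env.get("GATEWAY_ALLOWED_USERS"))  # assumed cross-platform
    return user_id in (per_platform | gateway_wide)
```

Pass `os.environ` as `env` to check against the live environment; an empty mapping denies everyone.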

DM Pairing (Alternative to Allowlists)

Instead of manually configuring user IDs, unknown users receive a one-time pairing code when they DM the bot:

# The user sees: "Pairing code: XKGH5N7P"
# You approve them with:
hermes pairing approve telegram XKGH5N7P

# Other pairing commands:
hermes pairing list          # View pending + approved users
hermes pairing revoke telegram 123456789  # Remove access

Pairing codes expire after 1 hour, are rate-limited, and use cryptographic randomness.
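
A rough sketch of how such codes could be issued and checked, assuming an 8-character uppercase alphanumeric format (inferred from the example code above) and omitting rate limiting. `PairingTable` and `new_pairing_code` are hypothetical names, not the gateway's implementation.

```python
import secrets
import string
import time

ALPHABET = string.ascii_uppercase + string.digits
CODE_TTL_SECONDS = 3600  # codes expire after 1 hour


def new_pairing_code(length=8):
    """Draw each character from the OS cryptographic random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


class PairingTable:
    """Pending one-time codes, keyed by code."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._pending = {}  # code -> (user_id, expires_at)

    def issue(self, user_id):
        code = new_pairing_code()
        self._pending[code] = (user_id, self._clock() + CODE_TTL_SECONDS)
        return code

    def approve(self, code):
        """Return the paired user ID, or None if the code is unknown, used, or expired."""
        entry = self._pending.pop(code, None)
        if entry is None:
            return None
        user_id, expires_at = entry
        return user_id if self._clock() < expires_at else None
```

The injectable `clock` is only there so that expiry can be exercised without waiting an hour.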

Interrupting the Agent

Send any message while the agent is working to interrupt it. Key behaviors:

- In-progress terminal commands are killed immediately (SIGTERM, then SIGKILL after 1s)
- Tool calls are cancelled: only the currently executing one runs; the rest are skipped
- Multiple messages are combined: messages sent during the interruption are joined into one prompt
- /stop command: interrupts without queuing a follow-up message
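
The SIGTERM-then-SIGKILL escalation and the message joining can be sketched with the standard library. This is an illustration under assumptions: `stop_process` and `combine_messages` are hypothetical helpers, and joining with newlines is a guess, since the separator is not specified above.

```python
import subprocess


def stop_process(proc, grace_seconds=1.0):
    """Ask the process to exit, then force-kill it if it ignores the request."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        return proc.wait()


def combine_messages(messages):
    """Join messages received during an interruption into one prompt."""
    return "\n".join(m.strip() for m in messages if m.strip())
```

On POSIX systems a process stopped by SIGTERM reports a negative return code, which makes it easy to tell an interrupted command from one that finished on its own.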

Tool Progress Notifications

Control how much tool activity is displayed in ~/.hermes/config.yaml:

display:
  tool_progress: all    # off | new | all | verbose

When enabled, the bot sends status messages as it works:

💻 `ls -la`...
🔍 web_search...
📄 web_extract...
🐍 execute_code...

Background Sessions

Run a prompt in a separate background session so the agent works on it independently while your main chat stays responsive:

/background Check all servers in the cluster and report any that are down

Hermes confirms immediately:

🔄 Background task started: "Check all servers in the cluster..."
   Task ID: bg_143022_a1b2c3

How It Works

Each /background prompt spawns a separate agent instance that runs asynchronously:

- Isolated session: the background agent has its own session with its own conversation history. It has no knowledge of your current chat context and receives only the prompt you provide.
- Same configuration: inherits your model, provider, toolsets, reasoning settings, and provider routing from the current gateway setup.
- Non-blocking: your main chat stays fully interactive. Send messages, run other commands, or start more background tasks while it works.
- Result delivery: when the task finishes, the result is sent back to the same chat or channel where you issued the command, prefixed with "✅ Background task complete". If it fails, you'll see "❌ Background task failed" with the error.
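
The non-blocking pattern above maps naturally onto `asyncio` tasks. The sketch below is an assumption-heavy illustration: `run_agent` and `send` are hypothetical async callables supplied by the caller, and the task ID format is only guessed from the example `bg_143022_a1b2c3`.

```python
import asyncio
import secrets
from datetime import datetime


def new_task_id(now=None):
    """Build an ID shaped like the example above (format is a guess)."""
    now = now or datetime.now()
    return f"bg_{now:%H%M%S}_{secrets.token_hex(3)}"


async def run_in_background(prompt, run_agent, send):
    """Run the prompt in isolation and deliver the outcome to the chat."""
    try:
        result = await run_agent(prompt)
        await send(f"✅ Background task complete\n{result}")
    except Exception as exc:
        await send(f"❌ Background task failed\n{exc}")


def start_background(prompt, run_agent, send):
    """Schedule the task and return at once; requires a running event loop."""
    return asyncio.create_task(run_in_background(prompt, run_agent, send))
```

Because `start_background` returns as soon as the task is scheduled, the caller can keep handling chat messages while the task runs.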

Background Process Notifications

When the agent running a background session uses terminal(background=true) to start long-running processes (servers, builds, etc.), the gateway can push status updates to your chat. Control this with display.background_process_notifications in ~/.hermes/config.yaml:

display:
  background_process_notifications: all    # all | result | error | off

Mode	What you receive
`all`	Running-output updates and the final completion message (default)
`result`	Only the final completion message (regardless of exit code)
`error`	Only the final message when the exit code is non-zero
`off`	No process watcher messages at all

You can also set this via environment variable:

HERMES_BACKGROUND_NOTIFICATIONS=result
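
The four modes amount to a small filter over process events. The sketch below shows one way to express it, assuming the environment variable takes precedence over `config.yaml` (the precedence is not stated above). `resolve_mode` and `should_notify` are hypothetical helpers.

```python
import os

MODES = ("all", "result", "error", "off")


def resolve_mode(config, env=None):
    """Pick the mode from the env var if set, else from config, else 'all'."""
    env = os.environ if env is None else env
    mode = env.get("HERMES_BACKGROUND_NOTIFICATIONS") or config.get(
        "display", {}
    ).get("background_process_notifications", "all")
    if mode not in MODES:
        raise ValueError(f"unknown notification mode: {mode!r}")
    return mode


def should_notify(mode, is_final, exit_code=None):
    """Decide whether a process event should be pushed to the chat."""
    if mode == "all":
        return True  # running-output updates and the final message
    if mode == "result":
        return is_final  # final message regardless of exit code
    if mode == "error":
        return is_final and exit_code is not None and exit_code != 0
    return False  # "off"
```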

Use Cases

- Server monitoring: "/background Check the health of all services and alert me if anything is down"
- Long builds: "/background Build and deploy the staging environment" while you continue chatting
- Research tasks: "/background Research competitor pricing and summarize in a table"
- File operations: "/background Organize the photos in ~/Downloads by date into folders"

:::tip
Background tasks on messaging platforms are fire-and-forget: you don't need to wait or check on them. Results arrive in the same chat automatically when the task finishes.
:::

Service Management

Linux (systemd)

hermes gateway install               # Install as user service
hermes gateway start                 # Start the service
hermes gateway stop                  # Stop the service
hermes gateway status                # Check status
journalctl --user -u hermes-gateway -f  # View logs

# Enable lingering (keeps running after logout)
sudo loginctl enable-linger $USER

# Or install a boot-time system service that still runs as your user
sudo hermes gateway install --system
sudo hermes gateway start --system
sudo hermes gateway status --system
journalctl -u hermes-gateway -f

Use the user service on laptops and dev boxes. Use the system service on VPS or headless hosts that should come back at boot without relying on systemd linger.

Avoid keeping both the user and system gateway units installed at once unless you really mean to. Hermes will warn if it detects both because start/stop/status behavior gets ambiguous.

:::info Multiple installations
If you run multiple Hermes installations on the same machine (with different HERMES_HOME directories), each gets its own systemd service name. The default ~/.hermes uses hermes-gateway; other installations use hermes-gateway-<hash>. The hermes gateway commands automatically target the correct service for your current HERMES_HOME.
:::

macOS (launchd)

hermes gateway install
launchctl start ai.hermes.gateway
launchctl stop ai.hermes.gateway
tail -f ~/.hermes/logs/gateway.log

Platform-Specific Toolsets

Each platform has its own toolset:

Platform	Toolset	Capabilities
CLI	`hermes-cli`	Full access
Telegram	`hermes-telegram`	Full tools including terminal
Discord	`hermes-discord`	Full tools including terminal
WhatsApp	`hermes-whatsapp`	Full tools including terminal
Slack	`hermes-slack`	Full tools including terminal
Signal	`hermes-signal`	Full tools including terminal
SMS	`hermes-sms`	Full tools including terminal
Email	`hermes-email`	Full tools including terminal
Home Assistant	`hermes-homeassistant`	Full tools + HA device control (ha_list_entities, ha_get_state, ha_call_service, ha_list_services)
Mattermost	`hermes-mattermost`	Full tools including terminal
Matrix	`hermes-matrix`	Full tools including terminal
DingTalk	`hermes-dingtalk`	Full tools including terminal
API Server	`hermes` (default)	Full tools including terminal
