Files
hermes-agent/website/docs/user-guide/messaging/open-webui.md
Teknium dd60bcbfb7 feat: OpenAI-compatible API server + WhatsApp configurable reply prefix (#1756)
* feat: OpenAI-compatible API server platform adapter

Salvaged from PR #956, updated for current main.

Adds an HTTP API server as a gateway platform adapter that exposes
hermes-agent via the OpenAI Chat Completions and Responses APIs.
Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat,
AnythingLLM, NextChat, ChatBox, etc.) can connect by pointing at
http://localhost:8642/v1.

Endpoints:
- POST /v1/chat/completions  — stateless Chat Completions API
- POST /v1/responses         — stateful Responses API with chaining
- GET  /v1/responses/{id}    — retrieve stored response
- DELETE /v1/responses/{id}  — delete stored response
- GET  /v1/models            — list hermes-agent as available model
- GET  /health               — health check

Features:
- Real SSE streaming via stream_delta_callback (uses main's streaming)
- In-memory LRU response store for Responses API conversation chaining
- Named conversations via 'conversation' parameter
- Bearer token auth (optional, via API_SERVER_KEY)
- CORS support for browser-based frontends
- System prompt layering (frontend system messages on top of core)
- Real token usage tracking in responses

Integration points:
- Platform.API_SERVER in gateway/config.py
- _create_adapter() branch in gateway/run.py
- API_SERVER_* env vars in hermes_cli/config.py
- Env var overrides in gateway/config.py _apply_env_overrides()

Changes vs original PR #956:
- Removed streaming infrastructure (already on main via stream_consumer.py)
- Removed Telegram reply_to_mode (separate feature, not included)
- Updated _resolve_model() -> _resolve_gateway_model()
- Updated stream_callback -> stream_delta_callback
- Updated connect()/disconnect() to use _mark_connected()/_mark_disconnected()
- Adapted to current Platform enum (includes MATTERMOST, MATRIX, DINGTALK)

Tests: 72 new tests, all passing
Docs: API server guide, Open WebUI integration guide, env var reference

* feat(whatsapp): make reply prefix configurable via config.yaml

Reworked from PR #1764 (ifrederico) to use config.yaml instead of .env.

The WhatsApp bridge prepends a header to every outgoing message.
This was hardcoded to '⚕ *Hermes Agent*'. Users can now customize
or disable it via config.yaml:

  whatsapp:
    reply_prefix: ''                     # disable header
    reply_prefix: '🤖 *My Bot*\n───\n'  # custom prefix

How it works:
- load_gateway_config() reads whatsapp.reply_prefix from config.yaml
  and stores it in PlatformConfig.extra['reply_prefix']
- WhatsAppAdapter reads it from config.extra at init
- When spawning bridge.js, the adapter passes it as
  WHATSAPP_REPLY_PREFIX in the subprocess environment
- bridge.js handles undefined (default), empty (no header),
  or custom values with \\n escape support
- Self-chat echo suppression uses the configured prefix

Also fixes _config_version: was 9 but ENV_VARS_BY_VERSION had a
key 10 (TAVILY_API_KEY), so existing users at v9 would never be
prompted for Tavily. Bumped to 10 to close the gap. Added a
regression test to prevent this from happening again.

Credit: ifrederico (PR #1764) for the bridge.js implementation
and the config version gap discovery.

---------

Co-authored-by: Test <test@test.com>
2026-03-17 10:44:37 -07:00

214 lines
7.8 KiB
Markdown

---
sidebar_position: 8
title: "Open WebUI"
description: "Connect Open WebUI to Hermes Agent via the OpenAI-compatible API server"
---
# Open WebUI Integration
[Open WebUI](https://github.com/open-webui/open-webui) (126k★) is the most popular self-hosted chat interface for AI. With Hermes Agent's built-in API server, you can use Open WebUI as a polished web frontend for your agent — complete with conversation management, user accounts, and a modern chat interface.
## Architecture
```
┌──────────────────┐ POST /v1/chat/completions ┌──────────────────────┐
│ Open WebUI │ ──────────────────────────────► │ hermes-agent │
│ (browser UI) │ SSE streaming response │ gateway API server │
│ port 3000 │ ◄────────────────────────────── │ port 8642 │
└──────────────────┘ └──────────────────────┘
```
Open WebUI connects to Hermes Agent's API server just like it would connect to OpenAI. Your agent handles the requests with its full toolset — terminal, file operations, web search, memory, skills — and returns the final response.
## Quick Setup
### 1. Enable the API server
Add to `~/.hermes/.env`:
```bash
API_SERVER_ENABLED=true
# Optional: set a key for auth (recommended if accessible beyond localhost)
# API_SERVER_KEY=your-secret-key
```
### 2. Start Hermes Agent gateway
```bash
hermes gateway
```
You should see:
```
[API Server] API server listening on http://127.0.0.1:8642
```
### 3. Start Open WebUI
```bash
docker run -d -p 3000:8080 \
-e OPENAI_API_BASE_URL=http://host.docker.internal:8642/v1 \
-e OPENAI_API_KEY=not-needed \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
```
If you set an `API_SERVER_KEY`, use it instead of `not-needed`:
```bash
-e OPENAI_API_KEY=your-secret-key
```
### 4. Open the UI
Go to **http://localhost:3000**. Create your admin account (the first user becomes admin). You should see **hermes-agent** in the model dropdown. Start chatting!
## Docker Compose Setup
For a more permanent setup, create a `docker-compose.yml`:
```yaml
services:
open-webui:
image: ghcr.io/open-webui/open-webui:main
ports:
- "3000:8080"
volumes:
- open-webui:/app/backend/data
environment:
- OPENAI_API_BASE_URL=http://host.docker.internal:8642/v1
- OPENAI_API_KEY=not-needed
extra_hosts:
- "host.docker.internal:host-gateway"
restart: always
volumes:
open-webui:
```
Then:
```bash
docker compose up -d
```
## Configuring via the Admin UI
If you prefer to configure the connection through the UI instead of environment variables:
1. Log in to Open WebUI at **http://localhost:3000**
2. Click your **profile avatar****Admin Settings**
3. Go to **Connections**
4. Under **OpenAI API**, click the **wrench icon** (Manage)
5. Click **+ Add New Connection**
6. Enter:
- **URL**: `http://host.docker.internal:8642/v1`
- **API Key**: your key or any non-empty value (e.g., `not-needed`)
7. Click the **checkmark** to verify the connection
8. **Save**
The **hermes-agent** model should now appear in the model dropdown.
:::warning
Environment variables only take effect on Open WebUI's **first launch**. After that, connection settings are stored in its internal database. To change them later, use the Admin UI or delete the Docker volume and start fresh.
:::
## API Type: Chat Completions vs Responses
Open WebUI supports two API modes when connecting to a backend:
| Mode | Format | When to use |
|------|--------|-------------|
| **Chat Completions** (default) | `/v1/chat/completions` | Recommended. Works out of the box. |
| **Responses** (experimental) | `/v1/responses` | For server-side conversation state via `previous_response_id`. |
### Using Chat Completions (recommended)
This is the default and requires no extra configuration. Open WebUI sends standard OpenAI-format requests and Hermes Agent responds accordingly. Each request includes the full conversation history.
### Using Responses API
To use the Responses API mode:
1. Go to **Admin Settings****Connections****OpenAI****Manage**
2. Edit your hermes-agent connection
3. Change **API Type** from "Chat Completions" to **"Responses (Experimental)"**
4. Save
With the Responses API, Open WebUI sends requests in the Responses format (`input` array + `instructions`), and Hermes Agent can preserve full tool call history across turns via `previous_response_id`.
:::note
Open WebUI currently manages conversation history client-side even in Responses mode — it sends the full message history in each request rather than using `previous_response_id`. The Responses API mode is mainly useful for future compatibility as frontends evolve.
:::
## How It Works
When you send a message in Open WebUI:
1. Open WebUI sends a `POST /v1/chat/completions` request with your message and conversation history
2. Hermes Agent creates an AIAgent instance with its full toolset
3. The agent processes your request — it may call tools (terminal, file operations, web search, etc.)
4. Tool calls happen invisibly server-side
5. The agent's final text response is returned to Open WebUI
6. Open WebUI displays the response in its chat interface
Your agent has access to all the same tools and capabilities as when using the CLI or Telegram — the only difference is the frontend.
## Configuration Reference
### Hermes Agent (API server)
| Variable | Default | Description |
|----------|---------|-------------|
| `API_SERVER_ENABLED` | `false` | Enable the API server |
| `API_SERVER_PORT` | `8642` | HTTP server port |
| `API_SERVER_HOST` | `127.0.0.1` | Bind address |
| `API_SERVER_KEY` | _(none)_ | Bearer token for auth. No key = allow all. |
### Open WebUI
| Variable | Description |
|----------|-------------|
| `OPENAI_API_BASE_URL` | Hermes Agent's API URL (include `/v1`) |
| `OPENAI_API_KEY` | Must be non-empty. Match your `API_SERVER_KEY`. |
## Troubleshooting
### No models appear in the dropdown
- **Check the URL has `/v1` suffix**: `http://host.docker.internal:8642/v1` (not just `:8642`)
- **Verify the gateway is running**: `curl http://localhost:8642/health` should return `{"status": "ok"}`
- **Check model listing**: `curl http://localhost:8642/v1/models` should return a list with `hermes-agent`
- **Docker networking**: From inside Docker, `localhost` means the container, not your host. Use `host.docker.internal` or `--network=host`.
### Connection test passes but no models load
This is almost always the missing `/v1` suffix. Open WebUI's connection test is a basic connectivity check — it doesn't verify model listing works.
### Response takes a long time
Hermes Agent may be executing multiple tool calls (reading files, running commands, searching the web) before producing its final response. This is normal for complex queries. The response appears all at once when the agent finishes.
### "Invalid API key" errors
Make sure your `OPENAI_API_KEY` in Open WebUI matches the `API_SERVER_KEY` in Hermes Agent. If no key is configured on the Hermes side, any non-empty value works.
## Linux Docker (no Docker Desktop)
On Linux without Docker Desktop, `host.docker.internal` doesn't resolve by default. Options:
```bash
# Option 1: Add host mapping
docker run --add-host=host.docker.internal:host-gateway ...
# Option 2: Use host networking
docker run --network=host -e OPENAI_API_BASE_URL=http://localhost:8642/v1 ...
# Option 3: Use Docker bridge IP
docker run -e OPENAI_API_BASE_URL=http://172.17.0.1:8642/v1 ...
```