diff --git a/website/docs/user-guide/features/api-server.md b/website/docs/user-guide/features/api-server.md
index 6739ad7ab..71732285e 100644
--- a/website/docs/user-guide/features/api-server.md
+++ b/website/docs/user-guide/features/api-server.md
@@ -8,7 +8,7 @@ description: "Expose hermes-agent as an OpenAI-compatible API for any frontend"
 
 The API server exposes hermes-agent as an OpenAI-compatible HTTP endpoint. Any frontend that speaks the OpenAI format — Open WebUI, LobeChat, LibreChat, NextChat, ChatBox, and hundreds more — can connect to hermes-agent and use it as a backend.
 
-Your agent handles requests with its full toolset (terminal, file operations, web search, memory, skills) and returns the final response. Tool calls execute invisibly server-side.
+Your agent handles requests with its full toolset (terminal, file operations, web search, memory, skills) and returns the final response. When streaming, tool progress indicators appear inline so frontends can show what the agent is doing.
 
 ## Quick Start
 
@@ -85,6 +85,8 @@ Standard OpenAI Chat Completions format. Stateless — the full conversation is
 
 **Streaming** (`"stream": true`): Returns Server-Sent Events (SSE) with token-by-token response chunks. When streaming is enabled in config, tokens are emitted live as the LLM generates them. When disabled, the full response is sent as a single SSE chunk.
 
+**Tool progress in streams**: When the agent calls tools during a streaming request, brief progress indicators are injected into the content stream as the tools start executing (e.g. `` `💻 pwd` ``, `` `🔍 Python docs` ``). These appear as inline markdown before the agent's response text, giving frontends like Open WebUI real-time visibility into tool execution.
+
 ### POST /v1/responses
 
 OpenAI Responses API format. Supports server-side conversation state via `previous_response_id` — the server stores full conversation history (including tool calls and results) so multi-turn context is preserved without the client managing it.
diff --git a/website/docs/user-guide/messaging/open-webui.md b/website/docs/user-guide/messaging/open-webui.md
index a3eb5fbc0..7d4eaee36 100644
--- a/website/docs/user-guide/messaging/open-webui.md
+++ b/website/docs/user-guide/messaging/open-webui.md
@@ -147,12 +147,16 @@ When you send a message in Open WebUI:
 1. Open WebUI sends a `POST /v1/chat/completions` request with your message and conversation history
 2. Hermes Agent creates an AIAgent instance with its full toolset
 3. The agent processes your request — it may call tools (terminal, file operations, web search, etc.)
-4. Tool calls happen invisibly server-side
-5. The agent's final text response is returned to Open WebUI
+4. As tools execute, **inline progress messages stream to the UI** so you can see what the agent is doing (e.g. `` `💻 ls -la` ``, `` `🔍 Python 3.12 release` ``)
+5. The agent's final text response streams back to Open WebUI
 6. Open WebUI displays the response in its chat interface
 
 Your agent has access to all the same tools and capabilities as when using the CLI or Telegram — the only difference is the frontend.
 
+:::tip Tool Progress
+With streaming enabled (the default), you'll see brief inline indicators as tools run — the tool emoji and its key argument. These appear in the response stream before the agent's final answer, giving you visibility into what's happening behind the scenes.
+:::
+
 ## Configuration Reference
 
 ### Hermes Agent (API server)
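Reviewer note, not part of the patch: the streaming behavior documented here is easy to demonstrate. The sketch below shows how a frontend might accumulate OpenAI-style SSE `data:` lines into a final message, with a tool-progress indicator arriving as an ordinary content delta ahead of the answer, as the new docs describe. The sample chunk payloads are hypothetical, modeled on the standard Chat Completions streaming format.

```python
import json

def accumulate_sse_content(sse_lines):
    """Join the content deltas from OpenAI-style chat-completion SSE lines."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip SSE comments / keep-alives
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            parts.append(delta)
    return "".join(parts)

# Hypothetical chunks: an inline tool-progress indicator, then the reply text.
stream = [
    'data: {"choices": [{"delta": {"content": "`💻 pwd`\\n\\n"}}]}',
    'data: {"choices": [{"delta": {"content": "You are in "}}]}',
    'data: {"choices": [{"delta": {"content": "/root."}}]}',
    "data: [DONE]",
]
print(accumulate_sse_content(stream))
```

Because the progress indicators are plain markdown in the content stream, any OpenAI-compatible frontend renders them with no special handling.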