From 773d3bb4dfe6e594986db8fa0208446b93cb8af8 Mon Sep 17 00:00:00 2001 From: Teknium <127238744+teknium1@users.noreply.github.com> Date: Tue, 24 Mar 2026 07:19:26 -0700 Subject: [PATCH] docs: update all docs for /model command overhaul and custom provider support Documents the full /model command overhaul across 6 files: AGENTS.md: - Add model_switch.py to project structure tree configuration.md: - Rewrite General Setup with 3 config methods (interactive, config.yaml, env vars) - Add new 'Switching Models with /model' section documenting all syntax variants - Add 'Named Custom Providers' section with config.yaml examples and custom:name:model triple syntax slash-commands.md: - Update /model descriptions in both CLI and messaging tables with full syntax examples (provider:model, custom:model, custom:name:model, bare custom auto-detect) cli-commands.md: - Add /model slash command subsection under hermes model with syntax table - Add custom endpoint config to hermes model use cases faq.md: - Add config.yaml example for offline/local model setup - Note that provider: custom is a first-class provider - Document /model custom auto-detect provider-runtime.md: - Add model_switch.py to implementation file list - Update provider families to show Custom as first-class with named variants --- AGENTS.md | 1 + .../docs/developer-guide/provider-runtime.md | 10 ++- website/docs/reference/cli-commands.md | 17 +++++ website/docs/reference/faq.md | 11 ++- website/docs/reference/slash-commands.md | 4 +- website/docs/user-guide/configuration.md | 74 +++++++++++++++++-- 6 files changed, 105 insertions(+), 12 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index fa733bc00..a25393ad9 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -38,6 +38,7 @@ hermes-agent/ │ ├── tools_config.py # `hermes tools` — enable/disable tools per platform │ ├── skills_hub.py # `/skills` slash command (search, browse, install) │ ├── models.py # Model catalog, provider model lists +│ ├── model_switch.py # Shared /model 
switch pipeline (CLI + gateway) │ └── auth.py # Provider credential resolution ├── tools/ # Tool implementations (one file per tool) │ ├── registry.py # Central tool registry (schemas, handlers, dispatch) diff --git a/website/docs/developer-guide/provider-runtime.md b/website/docs/developer-guide/provider-runtime.md index faa84d5f6..007729595 100644 --- a/website/docs/developer-guide/provider-runtime.md +++ b/website/docs/developer-guide/provider-runtime.md @@ -16,9 +16,10 @@ Hermes has a shared provider runtime resolver used across: Primary implementation: -- `hermes_cli/runtime_provider.py` -- `hermes_cli/auth.py` -- `agent/auxiliary_client.py` +- `hermes_cli/runtime_provider.py` — credential resolution, `_resolve_custom_runtime()` +- `hermes_cli/auth.py` — provider registry, `resolve_provider()` +- `hermes_cli/model_switch.py` — shared `/model` switch pipeline (CLI + gateway) +- `agent/auxiliary_client.py` — auxiliary model routing If you are trying to add a new first-class inference provider, read [Adding Providers](./adding-providers.md) alongside this page. 
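The output of runtime resolution can be pictured as a small record tying together provider, model, endpoint, and credentials. The sketch below is illustrative only — the field names and structure are assumptions for this page, not the actual types in `hermes_cli/runtime_provider.py`:

```python
# Illustrative sketch only: the real resolver in hermes_cli/runtime_provider.py
# may use different names and fields. This shows the kind of record a
# provider-runtime resolution step typically produces.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RuntimeResolution:
    provider: str                        # e.g. "custom", "openrouter"
    model: str                           # model identifier sent to the API
    base_url: str                        # OpenAI-compatible endpoint root
    api_key: Optional[str] = None        # None for keyless local servers
    api_mode: str = "chat_completions"   # or "anthropic_messages"

# A named custom provider ("local") resolved for a session:
resolution = RuntimeResolution(
    provider="custom",
    model="qwen-2.5",
    base_url="http://localhost:8080/v1",
)
```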
@@ -46,7 +47,8 @@ Current provider families include: - Kimi / Moonshot - MiniMax - MiniMax China -- custom OpenAI-compatible endpoints +- Custom (`provider: custom`) — first-class provider for any OpenAI-compatible endpoint +- Named custom providers (`custom_providers` list in config.yaml) ## Output of runtime resolution diff --git a/website/docs/reference/cli-commands.md b/website/docs/reference/cli-commands.md index b76859081..db8a0d314 100644 --- a/website/docs/reference/cli-commands.md +++ b/website/docs/reference/cli-commands.md @@ -98,8 +98,25 @@ Use this when you want to: - switch default providers - log into OAuth-backed providers during model selection - pick from provider-specific model lists +- configure a custom/self-hosted endpoint - save the new default into config +### `/model` slash command (mid-session) + +Switch models without leaving a session: + +``` +/model # Show current model and available options +/model claude-sonnet-4 # Switch model (auto-detects provider) +/model zai:glm-5 # Switch provider and model +/model custom:qwen-2.5 # Use model on your custom endpoint +/model custom # Auto-detect model from custom endpoint +/model custom:local:qwen-2.5 # Use a named custom provider +/model openrouter:anthropic/claude-sonnet-4 # Switch back to cloud +``` + +Provider and base URL changes are persisted to `config.yaml` automatically. When switching away from a custom endpoint, the stale base URL is cleared to prevent it leaking into other providers. + ## `hermes gateway` ```bash diff --git a/website/docs/reference/faq.md b/website/docs/reference/faq.md index 97051fcee..5e8326ff9 100644 --- a/website/docs/reference/faq.md +++ b/website/docs/reference/faq.md @@ -53,7 +53,16 @@ hermes model # Context length: 32768 ← set this to match your server's actual context window ``` -Hermes persists the endpoint in `config.yaml` and prompts for the context window size so compression triggers at the right time. 
If you leave context length blank, Hermes auto-detects it from the server's `/models` endpoint or [models.dev](https://models.dev). +Or configure it directly in `config.yaml`: + +```yaml +model: + default: qwen3.5:27b + provider: custom + base_url: http://localhost:11434/v1 +``` + +Hermes persists the endpoint, provider, and base URL in `config.yaml` so it survives restarts. If your local server has exactly one model loaded, `/model custom` auto-detects it. You can also set `provider: custom` in config.yaml — it's a first-class provider, not an alias for anything else. This works with Ollama, vLLM, llama.cpp server, SGLang, LocalAI, and others. See the [Configuration guide](../user-guide/configuration.md) for details. diff --git a/website/docs/reference/slash-commands.md b/website/docs/reference/slash-commands.md index 0ccf116fc..9c9b42cbe 100644 --- a/website/docs/reference/slash-commands.md +++ b/website/docs/reference/slash-commands.md @@ -40,7 +40,7 @@ Type `/` in the CLI to open the autocomplete menu. Built-in commands are case-in | Command | Description | |---------|-------------| | `/config` | Show current configuration | -| `/model` | Show or change the current model | +| `/model [model-name]` | Show or change the current model. Supports: `/model claude-sonnet-4`, `/model provider:model` (switch providers), `/model custom:model` (custom endpoint), `/model custom:name:model` (named custom provider), `/model custom` (auto-detect from endpoint) | | `/provider` | Show available providers and current provider | | `/prompt` | View/set custom system prompt | | `/personality` | Set a predefined personality | @@ -98,7 +98,7 @@ The messaging gateway supports the following built-in commands inside Telegram, | `/reset` | Reset conversation history. | | `/status` | Show session info. | | `/stop` | Kill all running background processes and interrupt the running agent. | -| `/model [provider:model]` | Show or change the model, including provider switches. 
| +| `/model [provider:model]` | Show or change the model. Supports provider switches (`/model zai:glm-5`), custom endpoints (`/model custom:model`), named custom providers (`/model custom:local:qwen`), and auto-detect (`/model custom`). | | `/provider` | Show provider availability and auth status. | | `/personality [name]` | Set a personality overlay for the session. | | `/retry` | Retry the last message. | diff --git a/website/docs/user-guide/configuration.md b/website/docs/user-guide/configuration.md index 10cb92b48..8e97cf99f 100644 --- a/website/docs/user-guide/configuration.md +++ b/website/docs/user-guide/configuration.md @@ -214,24 +214,57 @@ Hermes Agent works with **any OpenAI-compatible API endpoint**. If a server impl ### General Setup -Two ways to configure a custom endpoint: +Three ways to configure a custom endpoint: -**Interactive (recommended):** +**Interactive setup (recommended):** ```bash hermes model # Select "Custom endpoint (self-hosted / VLLM / etc.)" # Enter: API base URL, API key, Model name ``` -**Manual (`.env` file):** +**Manual config (`config.yaml`):** +```yaml +# In ~/.hermes/config.yaml +model: + default: your-model-name + provider: custom + base_url: http://localhost:8000/v1 + api_key: your-key-or-leave-empty-for-local +``` + +**Environment variables (`.env` file):** ```bash # Add to ~/.hermes/.env OPENAI_BASE_URL=http://localhost:8000/v1 -OPENAI_API_KEY=*** +OPENAI_API_KEY=your-key # Any non-empty string for local servers LLM_MODEL=your-model-name ``` -`hermes model` and the manual `.env` approach end up in the same runtime path. If you save a custom endpoint through `hermes model`, Hermes persists the provider + base URL in `config.yaml` so later sessions keep using that endpoint even if `OPENAI_BASE_URL` is not exported in your current shell. +All three approaches end up in the same runtime path. 
`hermes model` persists provider, model, and base URL to `config.yaml` so later sessions keep using that endpoint even if env vars are not set. + +### Switching Models with `/model` + +Once a custom endpoint is configured, you can switch models mid-session: + +``` +/model custom:qwen-2.5 # Switch to a model on your custom endpoint +/model custom # Auto-detect the model from the endpoint +/model openrouter:claude-sonnet-4 # Switch back to a cloud provider +``` + +If you have **named custom providers** configured (see below), use the triple syntax: + +``` +/model custom:local:qwen-2.5 # Use the "local" custom provider with model qwen-2.5 +/model custom:work:llama3 # Use the "work" custom provider with llama3 +``` + +When switching providers, Hermes persists the base URL and provider to config so the change survives restarts. When switching away from a custom endpoint to a built-in provider, the stale base URL is automatically cleared. + +:::tip +`/model custom` (bare, no model name) queries your endpoint's `/models` API and auto-selects the model if exactly one is loaded. Useful for local servers running a single model. +::: Everything below follows this same pattern — just change the URL, key, and model name. 
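The argument forms above can be sketched as a small parser. This is an illustrative sketch under stated assumptions — the actual switch logic lives in `hermes_cli/model_switch.py` and may differ (in particular, a real implementation would consult the provider registry to disambiguate bare model names that contain a colon, such as Ollama-style `qwen3.5:27b`):

```python
# Hypothetical sketch of parsing the /model argument forms documented
# above; not the actual implementation in model_switch.py.

def parse_model_arg(arg: str):
    """Split a /model argument into (provider, endpoint_name, model).

    endpoint_name is only set for the custom:name:model triple syntax;
    missing parts are None.
    """
    if not arg:
        return (None, None, None)           # bare /model: show current model
    parts = arg.split(":", 2)
    if parts[0] != "custom":
        if len(parts) == 1:
            return (None, None, parts[0])   # /model claude-sonnet-4
        # Split on the first colon only, so models like
        # "anthropic/claude-sonnet-4" pass through intact.
        provider, model = arg.split(":", 1)
        return (provider, None, model)      # /model zai:glm-5
    if len(parts) == 1:
        return ("custom", None, None)       # /model custom (auto-detect)
    if len(parts) == 2:
        return ("custom", None, parts[1])   # /model custom:qwen-2.5
    return ("custom", parts[1], parts[2])   # /model custom:local:qwen-2.5
```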
@@ -462,6 +495,37 @@ custom_providers: --- +### Named Custom Providers + +If you work with multiple custom endpoints (e.g., a local dev server and a remote GPU server), you can define them as named custom providers in `config.yaml`: + +```yaml +custom_providers: + - name: local + base_url: http://localhost:8080/v1 + # api_key omitted — Hermes uses "no-key-required" for keyless local servers + - name: work + base_url: https://gpu-server.internal.corp/v1 + api_key: corp-api-key + api_mode: chat_completions # optional, auto-detected from URL + - name: anthropic-proxy + base_url: https://proxy.example.com/anthropic + api_key: proxy-key + api_mode: anthropic_messages # for Anthropic-compatible proxies +``` + +Switch between them mid-session with the triple syntax: + +``` +/model custom:local:qwen-2.5 # Use the "local" endpoint with qwen-2.5 +/model custom:work:llama3-70b # Use the "work" endpoint with llama3-70b +/model custom:anthropic-proxy:claude-sonnet-4 # Use the proxy +``` + +You can also select named custom providers from the interactive `hermes model` menu. + +--- + ### Choosing the Right Setup | Use Case | Recommended |