feat: add direct endpoint overrides for auxiliary and delegation
Add base_url/api_key overrides for auxiliary tasks and delegation so users can route those flows straight to a custom OpenAI-compatible endpoint without having to rely on provider=main or named custom providers. Also clear gateway session env vars in test isolation so the full suite stays deterministic when run from a messaging-backed agent session.
This commit is contained in:
@@ -180,6 +180,23 @@ For native Anthropic auth, Hermes prefers Claude Code's own credential files whe
|
||||
| `CONTEXT_COMPRESSION_THRESHOLD` | Trigger at this % of limit (default: 0.50) |
|
||||
| `CONTEXT_COMPRESSION_MODEL` | Model for summaries |
|
||||
|
||||
## Auxiliary Task Overrides
|
||||
|
||||
| Variable | Description |
|
||||
|----------|-------------|
|
||||
| `AUXILIARY_VISION_PROVIDER` | Override provider for vision tasks |
|
||||
| `AUXILIARY_VISION_MODEL` | Override model for vision tasks |
|
||||
| `AUXILIARY_VISION_BASE_URL` | Direct OpenAI-compatible endpoint for vision tasks |
|
||||
| `AUXILIARY_VISION_API_KEY` | API key paired with `AUXILIARY_VISION_BASE_URL` |
|
||||
| `AUXILIARY_WEB_EXTRACT_PROVIDER` | Override provider for web extraction/summarization |
|
||||
| `AUXILIARY_WEB_EXTRACT_MODEL` | Override model for web extraction/summarization |
|
||||
| `AUXILIARY_WEB_EXTRACT_BASE_URL` | Direct OpenAI-compatible endpoint for web extraction/summarization |
|
||||
| `AUXILIARY_WEB_EXTRACT_API_KEY` | API key paired with `AUXILIARY_WEB_EXTRACT_BASE_URL` |
|
||||
| `CONTEXT_COMPRESSION_PROVIDER` | Override provider for context compression summaries |
|
||||
| `CONTEXT_COMPRESSION_MODEL` | Override model for context compression summaries |
|
||||
|
||||
For task-specific direct endpoints, Hermes uses the task's configured API key or `OPENAI_API_KEY`. It does not reuse `OPENROUTER_API_KEY` for those custom endpoints.
|
||||
|
||||
## Provider Routing (config.yaml only)
|
||||
|
||||
These go in `~/.hermes/config.yaml` under the `provider_routing` section:
|
||||
|
||||
@@ -569,11 +569,15 @@ auxiliary:
|
||||
vision:
|
||||
provider: "auto" # "auto", "openrouter", "nous", "main"
|
||||
model: "" # e.g. "openai/gpt-4o", "google/gemini-2.5-flash"
|
||||
base_url: "" # direct OpenAI-compatible endpoint (takes precedence over provider)
|
||||
api_key: "" # API key for base_url (falls back to OPENAI_API_KEY)
|
||||
|
||||
# Web page summarization + browser page text extraction
|
||||
web_extract:
|
||||
provider: "auto"
|
||||
model: "" # e.g. "google/gemini-2.5-flash"
|
||||
base_url: ""
|
||||
api_key: ""
|
||||
```
|
||||
|
||||
### Changing the Vision Model
|
||||
@@ -604,6 +608,17 @@ AUXILIARY_VISION_MODEL=openai/gpt-4o
|
||||
|
||||
### Common Setups
|
||||
|
||||
**Using a direct custom endpoint** (clearer than `provider: "main"` for local/self-hosted APIs):
|
||||
```yaml
|
||||
auxiliary:
|
||||
vision:
|
||||
base_url: "http://localhost:1234/v1"
|
||||
api_key: "local-key"
|
||||
model: "qwen2.5-vl"
|
||||
```
|
||||
|
||||
`base_url` takes precedence over `provider`, so this is the most explicit way to route an auxiliary task to a specific endpoint. For direct endpoint overrides, Hermes uses the configured `api_key` or falls back to `OPENAI_API_KEY`; it does not reuse `OPENROUTER_API_KEY` for that custom endpoint.
|
||||
|
||||
**Using OpenAI API key for vision:**
|
||||
```yaml
|
||||
# In ~/.hermes/.env:
|
||||
@@ -848,13 +863,17 @@ delegation:
|
||||
- web
|
||||
# model: "google/gemini-3-flash-preview" # Override model (empty = inherit parent)
|
||||
# provider: "openrouter" # Override provider (empty = inherit parent)
|
||||
# base_url: "http://localhost:1234/v1" # Direct OpenAI-compatible endpoint (takes precedence over provider)
|
||||
# api_key: "local-key" # API key for base_url (falls back to OPENAI_API_KEY)
|
||||
```
|
||||
|
||||
**Subagent provider:model override:** By default, subagents inherit the parent agent's provider and model. Set `delegation.provider` and `delegation.model` to route subagents to a different provider:model pair — e.g., use a cheap/fast model for narrowly-scoped subtasks while your primary agent runs an expensive reasoning model.
|
||||
|
||||
**Direct endpoint override:** If you want the obvious custom-endpoint path, set `delegation.base_url`, `delegation.api_key`, and `delegation.model`. That sends subagents directly to that OpenAI-compatible endpoint and takes precedence over `delegation.provider`. If `delegation.api_key` is omitted, Hermes falls back to `OPENAI_API_KEY` only.
|
||||
|
||||
The delegation provider uses the same credential resolution as CLI/gateway startup. All configured providers are supported: `openrouter`, `nous`, `zai`, `kimi-coding`, `minimax`, `minimax-cn`. When a provider is set, the system automatically resolves the correct base URL, API key, and API mode — no manual credential wiring needed.
|
||||
|
||||
**Precedence:** `delegation.provider` in config → parent provider (inherited). `delegation.model` in config → parent model (inherited). Setting just `model` without `provider` changes only the model name while keeping the parent's credentials (useful for switching models within the same provider like OpenRouter).
|
||||
**Precedence:** `delegation.base_url` in config → `delegation.provider` in config → parent provider (inherited). `delegation.model` in config → parent model (inherited). Setting just `model` without `provider` changes only the model name while keeping the parent's credentials (useful for switching models within the same provider like OpenRouter).
|
||||
|
||||
## Clarify
|
||||
|
||||
|
||||
@@ -209,6 +209,14 @@ Delegation has a **depth limit of 2** — a parent (depth 0) can spawn children
|
||||
delegation:
|
||||
max_iterations: 50 # Max turns per child (default: 50)
|
||||
default_toolsets: ["terminal", "file", "web"] # Default toolsets
|
||||
model: "google/gemini-3-flash-preview" # Optional provider/model override
|
||||
provider: "openrouter" # Optional built-in provider
|
||||
|
||||
# Or use a direct custom endpoint instead of provider:
|
||||
delegation:
|
||||
model: "qwen2.5-coder"
|
||||
base_url: "http://localhost:1234/v1"
|
||||
api_key: "local-key"
|
||||
```
|
||||
|
||||
:::tip
|
||||
|
||||
Reference in New Issue
Block a user