fix: harden auxiliary model config — gateway bridge, vision safety, tests

Improvements on top of PR #606 (auxiliary model configuration):

1. Gateway bridge: Added auxiliary.* and compression.summary_provider
   config bridging to gateway/run.py so config.yaml settings work from
   messaging platforms (not just CLI). Matches the pattern in cli.py.

2. Vision auto-fallback safety: In auto mode, vision now only tries
   OpenRouter + Nous Portal (known multimodal-capable providers).
   Custom endpoints, Codex, and API-key providers are skipped to avoid
   confusing errors from providers that don't support vision input.
   Explicit provider override (AUXILIARY_VISION_PROVIDER=main) still
   allows using any provider.

3. Comprehensive tests (46 new):
   - _get_auxiliary_provider env var resolution (8 tests)
   - _resolve_forced_provider with all provider types (8 tests)
   - Per-task provider routing integration (4 tests)
   - Vision auto-fallback safety (7 tests)
   - Config bridging logic (11 tests)
   - Gateway/CLI bridge parity (2 tests)
   - Vision model override via env var (2 tests)
   - DEFAULT_CONFIG shape validation (4 tests)

4. Docs: Added auxiliary_client.py to AGENTS.md project structure.
   Updated module docstring with separate text/vision resolution chains.

Tests: 2429 passed (was 2383).
This commit is contained in:
teknium1
2026-03-08 18:06:40 -07:00
parent d9f373654b
commit 5ae0b731d0
5 changed files with 534 additions and 5 deletions

View File

@@ -4,7 +4,7 @@ Provides a single resolution chain so every consumer (context compression,
session search, web extraction, vision analysis, browser vision) picks up
the best available backend without duplicating fallback logic.
Resolution order (same for text and vision tasks):
Resolution order for text tasks (auto mode):
1. OpenRouter (OPENROUTER_API_KEY)
2. Nous Portal (~/.hermes/auth.json active provider)
3. Custom endpoint (OPENAI_BASE_URL + OPENAI_API_KEY)
@@ -14,10 +14,19 @@ Resolution order (same for text and vision tasks):
— checked via PROVIDER_REGISTRY entries with auth_type='api_key'
6. None
Resolution order for vision/multimodal tasks (auto mode):
1. OpenRouter
2. Nous Portal
3. None (steps 3-5 are skipped — they may not support multimodal)
Per-task provider overrides (e.g. AUXILIARY_VISION_PROVIDER,
CONTEXT_COMPRESSION_PROVIDER) can force a specific provider for each task:
"openrouter", "nous", or "main" (= steps 3-5).
Default "auto" follows the full chain above.
Default "auto" follows the chains above.
Per-task model overrides (e.g. AUXILIARY_VISION_MODEL,
AUXILIARY_WEB_EXTRACT_MODEL) let callers use a different model slug
than the provider's default.
"""
import json
@@ -485,11 +494,23 @@ def get_vision_auxiliary_client() -> Tuple[Optional[OpenAI], Optional[str]]:
Checks AUXILIARY_VISION_PROVIDER for a forced provider, otherwise
auto-detects. Callers may override the returned model with
AUXILIARY_VISION_MODEL.
In auto mode, only OpenRouter and Nous Portal are tried because they
are known to support multimodal (Gemini). Custom endpoints, Codex,
and API-key providers are skipped — they may not handle vision input
and would produce confusing errors. To use one of those providers
for vision, set AUXILIARY_VISION_PROVIDER explicitly.
"""
forced = _get_auxiliary_provider("vision")
if forced != "auto":
return _resolve_forced_provider(forced)
return _resolve_auto()
# Auto: only multimodal-capable providers
for try_fn in (_try_openrouter, _try_nous):
client, model = try_fn()
if client is not None:
return client, model
logger.debug("Auxiliary vision client: none available (auto only tries OpenRouter/Nous)")
return None, None
def get_auxiliary_extra_body() -> dict: