docs: fallback providers + /background command documentation

* docs: comprehensive fallback providers documentation - New dedicated page: user-guide/features/fallback-providers.md covering both primary model fallback and auxiliary task fallback systems - Updated configuration.md with fallback_model config section - Updated environment-variables.md noting fallback is config-only - Fleshed out developer-guide/provider-runtime.md fallback section with internal architecture details (trigger points, activation flow, config flow) - Added cross-reference from provider-routing.md distinguishing OpenRouter sub-provider routing from Hermes-level model fallback - Added new page to sidebar under Integrations * docs: comprehensive /background command documentation - Added Background Sessions section to cli.md covering how it works (daemon threads, isolated sessions, config inheritance, Rich panel output, bell notification, concurrent tasks) - Added Background Sessions section to messaging/index.md covering messaging-specific behavior (async execution, result delivery back to same chat, fire-and-forget pattern) - Documented background_process_notifications config (all/result/error/off) in messaging docs and configuration.md - Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page - Fixed inconsistency in slash-commands.md: /background was listed as messaging-only but works in both CLI and messaging. Moved it to the 'both surfaces' note. - Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
parent 60cce9ca6d
commit 463239ed85
9 changed files with 495 additions and 5 deletions
--- a/website/docs/developer-guide/provider-runtime.md
+++ b/website/docs/developer-guide/provider-runtime.md
@@ -130,7 +130,41 @@ When an auxiliary task is configured with provider `main`, Hermes resolves that

 ## Fallback models

-Hermes also supports a configured fallback model/provider, allowing runtime failover in supported error paths.
+Hermes supports a configured fallback model/provider pair, allowing runtime failover when the primary model encounters errors.
+
+### How it works internally
+
+1. **Storage**: `AIAgent.__init__` stores the `fallback_model` dict and sets `_fallback_activated = False`.
+
+2. **Trigger points**: `_try_activate_fallback()` is called from three places in the main retry loop in `run_agent.py`:
+   - After max retries on invalid API responses (None choices, missing content)
+   - On non-retryable client errors (HTTP 401, 403, 404)
+   - After max retries on transient errors (HTTP 429, 500, 502, 503)
+
+3. **Activation flow** (`_try_activate_fallback`):
+   - Returns `False` immediately if already activated or not configured
+   - Calls `resolve_provider_client()` from `auxiliary_client.py` to build a new client with proper auth
+   - Determines `api_mode`: `codex_responses` for openai-codex, `anthropic_messages` for anthropic, `chat_completions` for everything else
+   - Swaps in-place: `self.model`, `self.provider`, `self.base_url`, `self.api_mode`, `self.client`, `self._client_kwargs`
+   - For anthropic fallback: builds a native Anthropic client instead of OpenAI-compatible
+   - Re-evaluates prompt caching (enabled for Claude models on OpenRouter)
+   - Sets `_fallback_activated = True` — prevents firing again
+   - Resets retry count to 0 and continues the loop
+
+4. **Config flow**:
+   - CLI: `cli.py` reads `CLI_CONFIG["fallback_model"]` → passes to `AIAgent(fallback_model=...)`
+   - Gateway: `gateway/run.py._load_fallback_model()` reads `config.yaml` → passes to `AIAgent`
+   - Validation: both `provider` and `model` keys must be non-empty, or fallback is disabled
+
+### What does NOT support fallback
+
+- **Subagent delegation** (`tools/delegate_tool.py`): subagents inherit the parent's provider but not the fallback config
+- **Cron jobs** (`cron/`): run with a fixed provider, no fallback mechanism
+- **Auxiliary tasks**: use their own independent provider auto-detection chain (see Auxiliary model routing above)
+
+### Test coverage
+
+See `tests/test_fallback_model.py` for comprehensive tests covering all supported providers, one-shot semantics, and edge cases.

 ## Related docs