docs: fallback providers + /background command documentation

* docs: comprehensive fallback providers documentation

- New dedicated page: user-guide/features/fallback-providers.md covering
  both primary model fallback and auxiliary task fallback systems
- Updated configuration.md with fallback_model config section
- Updated environment-variables.md noting fallback is config-only
- Fleshed out developer-guide/provider-runtime.md fallback section with
  internal architecture details (trigger points, activation flow, config flow)
- Added cross-reference from provider-routing.md distinguishing OpenRouter
  sub-provider routing from Hermes-level model fallback
- Added new page to sidebar under Integrations

* docs: comprehensive /background command documentation

- Added Background Sessions section to cli.md covering how it works
  (daemon threads, isolated sessions, config inheritance, Rich panel
  output, bell notification, concurrent tasks)
- Added Background Sessions section to messaging/index.md covering
  messaging-specific behavior (async execution, result delivery back
  to same chat, fire-and-forget pattern)
- Documented background_process_notifications config
  (all/result/error/off) in messaging docs and configuration.md
- Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page
- Fixed inconsistency in slash-commands.md: /background was listed as
  messaging-only but works in both CLI and messaging. Moved it to the
  'both surfaces' note.
- Expanded one-liner table descriptions with detail and cross-references
This commit is contained in:
Teknium
2026-03-15 06:24:28 -07:00
committed by GitHub
parent 60cce9ca6d
commit 463239ed85
9 changed files with 495 additions and 5 deletions

View File

@@ -130,7 +130,41 @@ When an auxiliary task is configured with provider `main`, Hermes resolves that
## Fallback models
Hermes also supports a configured fallback model/provider, allowing runtime failover in supported error paths.
Hermes supports a configured fallback model/provider pair, allowing runtime failover when the primary model encounters errors.
### How it works internally
1. **Storage**: `AIAgent.__init__` stores the `fallback_model` dict and sets `_fallback_activated = False`.
2. **Trigger points**: `_try_activate_fallback()` is called from three places in the main retry loop in `run_agent.py`:
- After max retries on invalid API responses (None choices, missing content)
- On non-retryable client errors (HTTP 401, 403, 404)
- After max retries on transient errors (HTTP 429, 500, 502, 503)
3. **Activation flow** (`_try_activate_fallback`):
- Returns `False` immediately if already activated or not configured
- Calls `resolve_provider_client()` from `auxiliary_client.py` to build a new client with proper auth
- Determines `api_mode`: `codex_responses` for openai-codex, `anthropic_messages` for anthropic, `chat_completions` for everything else
- Swaps in-place: `self.model`, `self.provider`, `self.base_url`, `self.api_mode`, `self.client`, `self._client_kwargs`
- For anthropic fallback: builds a native Anthropic client instead of OpenAI-compatible
- Re-evaluates prompt caching (enabled for Claude models on OpenRouter)
- Sets `_fallback_activated = True` — prevents firing again
- Resets retry count to 0 and continues the loop
4. **Config flow**:
- CLI: `cli.py` reads `CLI_CONFIG["fallback_model"]` → passes to `AIAgent(fallback_model=...)`
- Gateway: `gateway/run.py._load_fallback_model()` reads `config.yaml` → passes to `AIAgent`
- Validation: both `provider` and `model` keys must be non-empty, or fallback is disabled
### What does NOT support fallback
- **Subagent delegation** (`tools/delegate_tool.py`): subagents inherit the parent's provider but not the fallback config
- **Cron jobs** (`cron/`): run with a fixed provider, no fallback mechanism
- **Auxiliary tasks**: use their own independent provider auto-detection chain (see Auxiliary model routing above)
### Test coverage
See `tests/test_fallback_model.py` for comprehensive tests covering all supported providers, one-shot semantics, and edge cases.
## Related docs