feat: simple fallback model for provider resilience

When the primary model/provider fails after retries (rate limit, overload, auth errors, connection failures), Hermes automatically switches to a configured fallback model for the remainder of the session.

Config (in ~/.hermes/config.yaml):

    fallback_model:
      provider: openrouter
      model: anthropic/claude-sonnet-4

Supports all major providers: OpenRouter, OpenAI, Nous, DeepSeek, Together, Groq, Fireworks, Mistral, Gemini, plus custom endpoints via base_url and api_key_env overrides.

Design principles:
- Dead simple: one fallback model, not a chain
- One-shot: switches once, doesn't ping-pong back
- Zero new dependencies: uses existing OpenAI client
- Minimal code: ~100 lines in run_agent.py, ~5 lines in cli.py/gateway
- Three trigger points: max retries exhausted, non-retryable client errors, and invalid response exhaustion

Does NOT trigger on context overflow or payload-too-large errors (those are handled by the existing compression system).

Addresses #737.

25 new tests, 2492 total passing.
2026-03-08 20:22:33 -07:00
parent 4d7d9d9715
commit 161436cfdd
6 changed files with 410 additions and 0 deletions
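
The one-shot behaviour described in the commit message can be sketched roughly as follows. This is an illustrative sketch only, not the code in run_agent.py: `Session`, `ProviderError`, `ContextOverflowError`, and `fallback_enabled` are hypothetical names, and treating an empty `provider`/`model` as "disabled" is an assumption based on the empty-string defaults in the config.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ProviderError(Exception):
    """Hypothetical: raised once the provider has failed after retries."""


class ContextOverflowError(Exception):
    """Hypothetical: context-overflow / payload-too-large failure."""


def fallback_enabled(cfg: Optional[dict]) -> bool:
    # Assumption: a missing section or empty provider/model means "disabled".
    return bool(cfg and cfg.get("provider") and cfg.get("model"))


@dataclass
class Session:
    primary: Callable[[str], str]
    fallback: Optional[Callable[[str], str]] = None
    fallback_activated: bool = False  # one-shot latch, never reset

    def call(self, prompt: str) -> str:
        active = self.fallback if self.fallback_activated else self.primary
        try:
            return active(prompt)
        except ProviderError:
            # Switch at most once; if the fallback itself fails, re-raise.
            if self.fallback is None or self.fallback_activated:
                raise
            self.fallback_activated = True
            return self.fallback(prompt)
        # ContextOverflowError is deliberately not caught here, so it
        # propagates to whatever handles compression.
```

Once `fallback_activated` is set, every later call in the session goes straight to the fallback, which mirrors the "switches once, doesn't ping-pong back" principle. Keeping the latch as a plain boolean also makes the single-fallback (no chain) design explicit.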
--- a/hermes_cli/config.py
+++ b/hermes_cli/config.py
@@ -103,6 +103,18 @@ DEFAULT_CONFIG = {
        },
    },
    
+    # Fallback model — used when the primary model/provider fails after retries.
+    # When the primary hits rate limits (429), overload (529), or service errors (503),
+    # Hermes will automatically switch to this model for the remainder of the session.
+    # Set to None / omit to disable fallback.
+    "fallback_model": {
+        "provider": "",   # e.g. "openrouter", "openai", "nous", "deepseek", "together", "groq"
+        "model": "",      # e.g. "anthropic/claude-sonnet-4", "gpt-4.1", "deepseek-chat"
+        # Optional overrides (usually auto-resolved from provider):
+        # "base_url": "",       # custom endpoint URL
+        # "api_key_env": "",    # env var name for API key (e.g. "MY_CUSTOM_KEY")
+    },
+
    "display": {
        "compact": False,
        "personality": "kawaii",