[Feature] Graceful Degradation: Fallback to Cheaper LLM If Primary Hits Rate Limits #1414

Closed
opened 2026-03-24 13:04:28 +00:00 by Timmy · 1 comment
Owner

Context: Complete halts on API 429 timeouts reduce overall orchestration output.

Acceptance Criteria:

  • If claude CLI returns an exception related to API exhaustion, hand off the context dynamically to DeepSeek or Gemini routines to finish trivial unit tests or format actions.
**Context:** Complete halts on API 429 timeouts reduce overall orchestration output. **Acceptance Criteria:** - If `claude` CLI returns an exception related to API exhaustion, hand off the context dynamically to DeepSeek or Gemini routines to finish trivial unit tests or format actions.
Author
Owner

KIMI IMPLEMENTATION INSTRUCTIONS - Graceful Degradation Feature

Implementation Plan

Phase 1: Exception Detection & Routing (Priority: HIGH)

Files to modify:

  • src/infrastructure/router/cascade.py - Add rate limit exception handling
  • src/infrastructure/models/provider_config.py - Add fallback provider configuration
  • src/timmy/cli/main.py - Update CLI to handle graceful degradation

Phase 2: Fallback Logic Implementation

Core Requirements:

  1. Exception Detection: Catch claude CLI 429/rate limit exceptions
  2. Context Preservation: Maintain full conversation context during handoff
  3. Dynamic Routing: Switch to DeepSeek/Gemini for continuation
  4. Graceful Recovery: Return to primary provider when available

Phase 3: Specific Implementation Details

Exception Handling Pattern:

try:
    result = claude_provider.execute(context)
except RateLimitException as e:
    logger.info(f"Claude rate limited, falling back to {fallback_provider}")
    result = fallback_provider.execute(context)
except APIExhaustionException as e:
    # Handle API quota exhaustion similarly
    result = handle_fallback(context, original_provider="claude")

Provider Priority Chain:

  1. Claude (primary)
  2. DeepSeek (first fallback - cost-effective)
  3. Gemini (second fallback - reliable)

Phase 4: Testing Requirements

Acceptance Tests:

  • Mock 429 exceptions from Claude API
  • Verify context preservation across providers
  • Test with "trivial unit tests" and "format actions"
  • Ensure no data loss during provider switch

Phase 5: Configuration

Add to config:

graceful_degradation:
  enabled: true
  fallback_providers: ["deepseek", "gemini"]
  retry_primary_after: 300  # seconds

This addresses the critical orchestration bottleneck where API limits completely halt agent loops. Priority implementation for reducing system downtime.

**KIMI IMPLEMENTATION INSTRUCTIONS - Graceful Degradation Feature** ## Implementation Plan ### Phase 1: Exception Detection & Routing (Priority: HIGH) **Files to modify:** - `src/infrastructure/router/cascade.py` - Add rate limit exception handling - `src/infrastructure/models/provider_config.py` - Add fallback provider configuration - `src/timmy/cli/main.py` - Update CLI to handle graceful degradation ### Phase 2: Fallback Logic Implementation **Core Requirements:** 1. **Exception Detection:** Catch `claude` CLI 429/rate limit exceptions 2. **Context Preservation:** Maintain full conversation context during handoff 3. **Dynamic Routing:** Switch to DeepSeek/Gemini for continuation 4. **Graceful Recovery:** Return to primary provider when available ### Phase 3: Specific Implementation Details **Exception Handling Pattern:** ```python try: result = claude_provider.execute(context) except RateLimitException as e: logger.info(f"Claude rate limited, falling back to {fallback_provider}") result = fallback_provider.execute(context) except APIExhaustionException as e: # Handle API quota exhaustion similarly result = handle_fallback(context, original_provider="claude") ``` **Provider Priority Chain:** 1. Claude (primary) 2. DeepSeek (first fallback - cost-effective) 3. Gemini (second fallback - reliable) ### Phase 4: Testing Requirements **Acceptance Tests:** - Mock 429 exceptions from Claude API - Verify context preservation across providers - Test with "trivial unit tests" and "format actions" - Ensure no data loss during provider switch ### Phase 5: Configuration **Add to config:** ```yaml graceful_degradation: enabled: true fallback_providers: ["deepseek", "gemini"] retry_primary_after: 300 # seconds ``` This addresses the critical orchestration bottleneck where API limits completely halt agent loops. Priority implementation for reducing system downtime.
kimi was assigned by Timmy 2026-03-24 15:12:03 +00:00
kimi was unassigned by Timmy 2026-03-24 19:32:20 +00:00
Timmy closed this issue 2026-03-24 21:54:10 +00:00
Sign in to join this conversation.
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: Rockachopa/Timmy-time-dashboard#1414