[Refactor] Implement Exponential Backoff with Jitter in Agent API Calls #1409

Closed
opened 2026-03-24 13:04:20 +00:00 by Timmy · 1 comment
Owner

Context: The current automation loops sleep for a fixed duration whenever an agent hits an API rate limit.

Acceptance Criteria:

  • Implement exponential backoff with a randomized jitter modifier to disperse the spike in concurrent requests.
  • Apply to timmy-loop.sh, kimi-loop.sh, and claude-loop.sh.
  • Include appropriate logging telemetry indicating backoff increments.
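
For reference, a minimal sketch of the delay calculation these criteria describe, written in Python to match the snippet later in this thread rather than the shell loops themselves. The function name, the logger name, the `max_delay` cap, and the choice of "full jitter" (a uniform draw between zero and the exponential ceiling) are all assumptions for illustration; the issue does not prescribe a jitter strategy.

```python
import logging
import random

logger = logging.getLogger("agent_loop")  # hypothetical logger name


def backoff_delay(attempt, base_delay=1.0, max_delay=60.0, rng=random):
    """Return a sleep time in seconds for the given zero-based attempt."""
    # Exponential ceiling: base_delay * 2^attempt, capped at max_delay
    # (the cap is an assumption, not something the issue asks for).
    ceiling = min(max_delay, base_delay * (2 ** attempt))
    # Full jitter: draw uniformly from [0, ceiling] so agents that were
    # rate-limited at the same moment do not all retry at the same moment.
    delay = rng.uniform(0.0, ceiling)
    # Log every backoff step so the increments are visible in telemetry.
    logger.info(
        "backoff attempt=%d ceiling=%.2fs delay=%.2fs", attempt, ceiling, delay
    )
    return delay
```

A loop would call `time.sleep(backoff_delay(attempt))` after each rate-limit response and reset `attempt` to zero after a successful call.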
Author
Owner

Implementation Plan

Implement exponential backoff with jitter in agent API calls for better resilience.

Files to Modify:

  1. src/infrastructure/router/cascade.py - Add retry logic to provider calls
  2. src/timmy/kimi_delegation.py - Add backoff to Gitea API calls
  3. src/infrastructure/models/multimodal.py - Retry for model API failures
  4. src/infrastructure/hermes/monitor.py - Resilient monitoring calls

Implementation Steps:

  1. Create Retry Utilities: Add src/infrastructure/retry.py module
  2. Exponential Backoff: Compute each delay as base_delay * 2^retry_count
  3. Add Jitter: Add a random component to each delay to prevent a thundering herd
  4. Configure Timeouts: Per-API timeout and max retry limits
  5. Error Classification: Distinguish retryable vs fatal errors
  6. Add Metrics: Track retry patterns for monitoring
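
Steps 2, 3, and 5 above could be combined roughly as follows. This is a sketch only: `RetryableError`, `retry_call`, and the additive jitter of up to one `base_delay` are hypothetical names and choices, not the contents of the planned `src/infrastructure/retry.py`. The `sleep` parameter is injected so that tests can record delays instead of waiting.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical marker for errors worth retrying (rate limits, 5xx)."""


def retry_call(func, max_retries=3, base_delay=1.0, sleep=time.sleep, rng=random):
    """Call func(); on RetryableError, back off exponentially and retry.

    Any other exception is treated as fatal and propagates immediately.
    """
    for retry_count in range(max_retries + 1):
        try:
            return func()
        except RetryableError:
            if retry_count == max_retries:
                raise  # retry budget exhausted; surface the last error
            # Step 2: exponential term, base_delay * 2^retry_count.
            delay = base_delay * (2 ** retry_count)
            # Step 3: additive jitter of up to one base_delay. The jitter
            # strategy is an assumption; the plan does not specify one.
            delay += rng.uniform(0.0, base_delay)
            sleep(delay)
```

With `max_retries=3` the function makes at most four calls: the initial attempt plus three retries.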

Retry Configuration:

RETRY_CONFIG = {
    'gitea_api': {'max_retries': 3, 'base_delay': 1.0},
    'anthropic_api': {'max_retries': 2, 'base_delay': 2.0},
    'ollama_api': {'max_retries': 5, 'base_delay': 0.5}
}
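
To make these policies concrete, the sketch below prints the un-jittered delay schedule each one implies, assuming `retry_count` starts at 0 and one delay precedes each retry. `backoff_schedule` is a hypothetical helper, not part of the plan.

```python
RETRY_CONFIG = {
    'gitea_api': {'max_retries': 3, 'base_delay': 1.0},
    'anthropic_api': {'max_retries': 2, 'base_delay': 2.0},
    'ollama_api': {'max_retries': 5, 'base_delay': 0.5},
}


def backoff_schedule(policy):
    """Un-jittered delays in seconds before each retry: base_delay * 2^n."""
    return [policy['base_delay'] * (2 ** n) for n in range(policy['max_retries'])]


for api, policy in RETRY_CONFIG.items():
    schedule = backoff_schedule(policy)
    # Print the per-retry delays and the worst-case total wait.
    print(api, schedule, sum(schedule))

# gitea_api [1.0, 2.0, 4.0] 7.0
# anthropic_api [2.0, 4.0] 6.0
# ollama_api [0.5, 1.0, 2.0, 4.0, 8.0] 15.5
```

Under these assumptions a caller of `ollama_api` can wait up to 15.5 seconds before jitter, which is worth weighing against the per-API timeouts from step 4.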

Acceptance Criteria:

  • Exponential backoff implemented with jitter
  • Configurable retry policies per API
  • Proper error classification (5xx and 429 rate-limit responses retryable, other 4xx not)
  • Circuit breaker for repeated failures
  • Metrics and logging for retry attempts
  • All existing tests pass
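
The error-classification and circuit-breaker criteria might look something like the sketch below. Everything here is illustrative: `is_retryable`, `CircuitBreaker`, the failure threshold, and the cooldown are hypothetical, and the breaker is a deliberately simple consecutive-failure counter. HTTP 429 is treated as retryable in this sketch because rate limiting is what the issue is about.

```python
import time


def is_retryable(status_code):
    """Classify an HTTP status code: 5xx and 429 retry, other 4xx do not."""
    return status_code == 429 or 500 <= status_code <= 599


class CircuitBreaker:
    """Hypothetical breaker: opens after `threshold` consecutive failures
    and permits a trial call once `cooldown` seconds have elapsed."""

    def __init__(self, threshold=5, cooldown=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None

    def allow(self):
        """Return True if a call may be attempted right now."""
        if self.opened_at is None:
            return True
        # Half-open: allow a trial call after the cooldown has passed.
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self):
        """Close the breaker and reset the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        """Count a failure; open the breaker once the threshold is reached."""
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()
```

A caller would check `allow()` before each request, skip the call entirely while the breaker is open, and report the outcome through `record_success()` or `record_failure()`.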

Testing:

  • Unit tests with mocked failures
  • Integration tests with network issues
  • Verify backoff timing with actual delays
kimi was assigned by Timmy 2026-03-24 13:11:37 +00:00
kimi was unassigned by Timmy 2026-03-24 19:32:22 +00:00
Timmy closed this issue 2026-03-24 21:54:12 +00:00
Reference: Rockachopa/Timmy-time-dashboard#1409