[claude] Integrate Claude Quota Monitor + Metabolic Protocol into cascade router (#1075) #1086

Merged
claude merged 1 commit from claude/issue-1075 into main 2026-03-23 15:18:11 +00:00

1 Commit

Author SHA1 Message Date
Alexander Whitestone
3a72eb7a7e feat: integrate Claude quota monitor + metabolic protocol into cascade router
Some checks failed
Tests / lint (pull_request) Failing after 12s
Tests / test (pull_request) Has been skipped
Adds `QuotaMonitor` class (src/infrastructure/claude_quota.py) that reads
the Claude Code OAuth token from macOS Keychain, calls the Anthropic usage
API with 30s caching, and applies the Metabolic Protocol to auto-select the
right inference tier:

- BURST   (5h usage < 50%): cloud available for high-value tasks
- ACTIVE  (5h usage 50-80%): local Qwen3-14B only
- RESTING (7d usage > 80%): local Qwen3-8B only

`select_model(task_complexity)` returns an Ollama tag or "claude-sonnet-4-6".
`should_use_cloud(task_value)` provides a boolean gate for cloud calls.
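
A minimal sketch of how the tier thresholds above could map onto a model
choice. This is not the code in src/infrastructure/claude_quota.py. The
`Usage` and `Tier` types, the `pick_tier` helper, the "high" complexity
label, the Ollama tags `qwen3:14b` / `qwen3:8b`, checking the 7d window
first, and treating 5h usage above 80% as RESTING are all assumptions made
for illustration; the commit message does not specify them.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    BURST = "burst"
    ACTIVE = "active"
    RESTING = "resting"


@dataclass(frozen=True)
class Usage:
    """Hypothetical container for quota utilisation, in percent (0-100)."""
    five_hour_pct: float
    seven_day_pct: float


def pick_tier(usage: Usage) -> Tier:
    # Assumption: the 7-day window takes precedence over the 5-hour window.
    if usage.seven_day_pct > 80:
        return Tier.RESTING
    if usage.five_hour_pct < 50:
        return Tier.BURST
    if usage.five_hour_pct <= 80:
        return Tier.ACTIVE
    # Assumption: 5h usage above 80% is not covered by the commit message;
    # this sketch falls back to the most conservative tier.
    return Tier.RESTING


def select_model(usage: Usage, task_complexity: str) -> str:
    """Return an Ollama tag or "claude-sonnet-4-6" for the given usage."""
    tier = pick_tier(usage)
    # Assumption: only "high" complexity tasks count as high-value.
    if tier is Tier.BURST and task_complexity == "high":
        return "claude-sonnet-4-6"
    if tier is Tier.RESTING:
        return "qwen3:8b"   # assumed Ollama tag for Qwen3-8B
    return "qwen3:14b"      # assumed Ollama tag for Qwen3-14B
```

With these assumptions, `select_model(Usage(10, 20), "high")` picks the
cloud model, while the same task at 60% 5h usage stays on the local 14B
model.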

Integrates into cascade.py: before routing to the anthropic, openai, or
grok providers, the router calls `_quota_allows_cloud()` and skips cloud
when quota is low. On Linux, where there is no Keychain, the monitor
degrades gracefully and returns local-only defaults.
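
The gating and 30s caching described in this message could look roughly
like the sketch below. It is an illustration under stated assumptions, not
the cascade.py implementation: `CachedQuota`, `quota_allows_cloud`,
`filter_providers`, the injected `fetch` callable, and the "ollama"
provider name are hypothetical, and treating missing quota data (for
example, no Keychain token) as "no cloud" is one reading of "local-only
defaults".

```python
import time
from typing import Callable, Optional

CLOUD_PROVIDERS = {"anthropic", "openai", "grok"}
CACHE_TTL_SECONDS = 30.0


class CachedQuota:
    """Caches the 5h utilisation percentage for CACHE_TTL_SECONDS."""

    def __init__(
        self,
        fetch: Callable[[], Optional[float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # Hypothetical fetcher: returns None when no token is available.
        self._fetch = fetch
        self._clock = clock
        self._value: Optional[float] = None
        self._fetched_at: Optional[float] = None

    def five_hour_pct(self) -> Optional[float]:
        now = self._clock()
        stale = (
            self._fetched_at is None
            or now - self._fetched_at >= CACHE_TTL_SECONDS
        )
        if stale:
            try:
                self._value = self._fetch()
            except Exception:
                # Any fetch failure degrades to "unknown" instead of raising.
                self._value = None
            self._fetched_at = now
        return self._value


def quota_allows_cloud(quota: CachedQuota) -> bool:
    # Assumption: cloud is allowed only below the 50% BURST threshold,
    # and unknown usage is treated as local-only.
    pct = quota.five_hour_pct()
    return pct is not None and pct < 50


def filter_providers(providers: list[str], quota: CachedQuota) -> list[str]:
    """Drop cloud providers from the cascade when quota does not allow them."""
    allow_cloud = quota_allows_cloud(quota)
    return [p for p in providers if allow_cloud or p not in CLOUD_PROVIDERS]
```

Injecting `fetch` and `clock` keeps the sketch testable without a Keychain
or network access; a fetcher that returns `None` leaves only the local
providers in the list.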

Also adds `scripts/claude_quota_check.sh`: CLI tool with color-coded usage
bars, `--json` and `--watch` modes for monitoring from the terminal.

Fixes #1075

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 11:17:24 -04:00