Adds `QuotaMonitor` class (src/infrastructure/claude_quota.py) that reads
the Claude Code OAuth token from macOS Keychain, calls the Anthropic usage
API with 30s caching, and applies the Metabolic Protocol to auto-select the
right inference tier:
- BURST (5h usage < 50%): cloud available for high-value tasks
- ACTIVE (5h usage 50-80%): local Qwen3-14B only
- RESTING (7d usage > 80%): local Qwen3-8B only
`select_model(task_complexity)` returns an Ollama tag or "claude-sonnet-4-6".
`should_use_cloud(task_value)` provides a boolean gate for cloud calls.
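
The tier rules and the two helpers can be sketched roughly as below. This is an illustration under stated assumptions, not the code in claude_quota.py: the `Tier` enum, the `classify` function, the explicit `tier` argument, the precedence of the 7d check, the "high" labels, the Ollama tags `qwen3:14b` and `qwen3:8b`, and the handling of 5h usage above 80% (which the list does not cover) are all hypothetical.

```python
from enum import Enum


class Tier(Enum):
    BURST = "burst"
    ACTIVE = "active"
    RESTING = "resting"


def classify(five_hour_pct: float, seven_day_pct: float) -> Tier:
    """Map usage percentages (0-100) for the 5h and 7d windows to a tier."""
    # Assumption: the 7d rule is checked first and overrides the 5h rules.
    if seven_day_pct > 80:
        return Tier.RESTING
    if five_hour_pct < 50:
        return Tier.BURST
    if five_hour_pct <= 80:
        return Tier.ACTIVE
    # Assumption: 5h usage above 80% is not listed; fall back to RESTING.
    return Tier.RESTING


def select_model(tier: Tier, task_complexity: str) -> str:
    """Return an Ollama tag or the cloud model name."""
    # Assumption: only "high" complexity tasks go to the cloud, and only in BURST.
    if tier is Tier.BURST and task_complexity == "high":
        return "claude-sonnet-4-6"
    if tier is Tier.RESTING:
        return "qwen3:8b"  # hypothetical tag for Qwen3-8B
    return "qwen3:14b"  # hypothetical tag for Qwen3-14B


def should_use_cloud(tier: Tier, task_value: str) -> bool:
    """Boolean gate: cloud is allowed only for high-value tasks in BURST."""
    return tier is Tier.BURST and task_value == "high"
```

For example, `select_model(classify(65, 10), "high")` stays on the local 14B tag because 65% 5h usage falls in ACTIVE.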
Integrates into cascade.py: before routing to the anthropic/openai/grok
providers, the router calls `_quota_allows_cloud()` and skips cloud when the
remaining quota is low.
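
A minimal sketch of that gate, assuming the router walks an ordered provider list. The names `route`, `CLOUD_PROVIDERS`, and the "ollama" provider are hypothetical; the real router's structure in cascade.py may differ. The quota check is passed in as a callable standing in for `_quota_allows_cloud()`.

```python
from typing import Callable, Iterable, List

# Providers the description treats as cloud-backed.
CLOUD_PROVIDERS = {"anthropic", "openai", "grok"}


def route(providers: Iterable[str], quota_allows_cloud: Callable[[], bool]) -> List[str]:
    """Return the providers to try, in order, dropping cloud ones when gated."""
    allowed = []
    for name in providers:
        # Consult the quota gate only for cloud providers; local ones always pass.
        if name in CLOUD_PROVIDERS and not quota_allows_cloud():
            continue
        allowed.append(name)
    return allowed
```

With the gate closed, `route(["ollama", "anthropic"], lambda: False)` keeps only the local provider.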
Degrades gracefully on Linux (no Keychain) by returning local-only defaults.
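
One way the fallback could look, as a sketch only. It shells out to the macOS `security` CLI and returns `None` anywhere else, so callers can treat `None` as "local only". The function name, the `system` override (added for testing), and the Keychain service name are assumptions; the source does not state them.

```python
import platform
import subprocess
from typing import Optional

# Assumption: the Keychain service name is not given in the source.
KEYCHAIN_SERVICE = "Claude Code-credentials"


def read_keychain_secret(system: Optional[str] = None) -> Optional[str]:
    """Return the stored secret on macOS, or None when it is unavailable."""
    system = system or platform.system()
    if system != "Darwin":
        # No Keychain on Linux or Windows: signal "local only" to the caller.
        return None
    try:
        result = subprocess.run(
            ["security", "find-generic-password", "-s", KEYCHAIN_SERVICE, "-w"],
            capture_output=True,
            text=True,
            timeout=5,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Missing binary, missing entry, or timeout: degrade the same way.
        return None
    return result.stdout.strip() or None
```

Returning `None` instead of raising keeps the non-macOS path and the missing-entry path identical for the caller.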
Also adds `scripts/claude_quota_check.sh`: CLI tool with color-coded usage
bars, `--json` and `--watch` modes for monitoring from the terminal.
Fixes #1075
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>