Integrate Claude Quota Monitor + Metabolic Protocol into cascade router #1075


perplexity · 2026-03-23T13:24:21Z

perplexity commented

2026-03-23 13:24:21 +00:00

Source

PDFs: claude-quota.pdf (Python module) + claude-quota-check.pdf (bash script)
Produced during March 23 research session

What These Are

Two production-ready tools for monitoring Claude Code / Claude.ai quota and auto-selecting inference tier:

1. `claude_quota.py` → `src/infrastructure/claude_quota.py`

- `QuotaMonitor` class that reads OAuth token from macOS Keychain
- Calls `https://api.anthropic.com/api/oauth/usage` with 30s caching
- `QuotaStatus` dataclass with 5-hour and 7-day utilization, reset times
- **Metabolic Protocol** auto-selects model tier:
  - **BURST** (5h < 50%): Cloud API for high-value tasks, local for medium
  - **ACTIVE** (5h 50-80%): Local Qwen3-14B only
  - **RESTING** (5h > 80% OR 7d > 80%): Local Qwen3-8B only
- `select_model(task_complexity)` returns Ollama tag or `claude-sonnet-4-6`
- `should_use_cloud(task_value)` returns bool gate
- Graceful degradation: returns None if no Keychain credentials (Linux VPS)
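
The tier rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `claude_quota.py`: the field names, the `"high"` complexity label, the Ollama tags `qwen3:14b` and `qwen3:8b`, the handling of the exact 50% and 80% boundaries, and the choice of the 14B model both for non-high tasks in BURST and for the no-credentials case are all guesses made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

CLOUD_MODEL = "claude-sonnet-4-6"  # model name given in the issue
LOCAL_LARGE = "qwen3:14b"          # assumed Ollama tag for Qwen3-14B
LOCAL_SMALL = "qwen3:8b"           # assumed Ollama tag for Qwen3-8B


class MetabolicTier(Enum):
    BURST = "burst"
    ACTIVE = "active"
    RESTING = "resting"


@dataclass(frozen=True)
class QuotaStatus:
    five_hour_pct: float  # hypothetical field: 5-hour utilization, 0-100
    seven_day_pct: float  # hypothetical field: 7-day utilization, 0-100


def classify_tier(status: QuotaStatus) -> MetabolicTier:
    """Map utilization to a tier using the thresholds listed in the issue."""
    if status.five_hour_pct > 80 or status.seven_day_pct > 80:
        return MetabolicTier.RESTING
    if status.five_hour_pct >= 50:
        return MetabolicTier.ACTIVE  # assumption: exactly 50% and 80% are ACTIVE
    return MetabolicTier.BURST


def select_model(status: Optional[QuotaStatus], task_complexity: str) -> str:
    """Return a model name; a status of None means no credentials were found."""
    if status is None:
        return LOCAL_LARGE  # assumption: degrade to the larger local model
    tier = classify_tier(status)
    if tier is MetabolicTier.RESTING:
        return LOCAL_SMALL
    if tier is MetabolicTier.ACTIVE:
        return LOCAL_LARGE
    # BURST: cloud only for high-complexity work, local otherwise.
    return CLOUD_MODEL if task_complexity == "high" else LOCAL_LARGE
```

For example, `select_model(QuotaStatus(30, 10), "high")` would pick the cloud model in this sketch, while the same call with a 5-hour utilization of 65 would fall back to the 14B tag.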

2. `claude_quota_check.sh` → `scripts/claude_quota_check.sh`

- CLI tool: `./claude_quota_check.sh` (human-readable), `--json` (piping), `--watch` (60s refresh)
- Color-coded usage bars (green < 50%, yellow 50-80%, red > 80%)
- Decision guidance printed for Timmy
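
The color thresholds for the usage bars can be illustrated with a short sketch. The actual tool is a bash script; Python is used here only to show the threshold logic, and the function names, bar characters, and default width are hypothetical.

```python
GREEN = "\033[32m"   # standard ANSI foreground colors
YELLOW = "\033[33m"
RED = "\033[31m"
RESET = "\033[0m"


def bar_color(percent: float) -> str:
    """Pick a color using the thresholds from the issue: <50, 50-80, >80."""
    if percent > 80:
        return RED
    if percent >= 50:
        return YELLOW
    return GREEN


def render_bar(percent: float, width: int = 20) -> str:
    """Render a fixed-width usage bar such as '[#####-----] 50%'."""
    clamped = max(0.0, min(100.0, percent))  # keep out-of-range input drawable
    filled = round(clamped / 100 * width)
    bar = "#" * filled + "-" * (width - filled)
    return f"{bar_color(clamped)}[{bar}]{RESET} {clamped:.0f}%"


if __name__ == "__main__":
    for value in (12, 64, 93):
        print(render_bar(value))
```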

Implementation Steps

1. Copy `claude_quota.py` to `src/infrastructure/claude_quota.py`
2. Copy `claude_quota_check.sh` to `scripts/claude_quota_check.sh` and `chmod +x`
3. Import `QuotaMonitor` in `src/infrastructure/router/cascade.py`
4. Before any cloud API call, check `monitor.should_use_cloud(task_value)`
5. If False, route to local Ollama instead
6. Add tests for MetabolicTier thresholds
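
Steps 3 to 5 amount to a gate in front of each cloud provider. The sketch below shows one way that gate could look. The `Provider` dataclass, the `QuotaGate` protocol, and the standalone `complete()` function are hypothetical stand-ins, since the real structure of `cascade.py` is not shown in this issue; only `should_use_cloud(task_value)` comes from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Protocol, Sequence


class QuotaGate(Protocol):
    """Anything exposing should_use_cloud(), e.g. a QuotaMonitor instance."""

    def should_use_cloud(self, task_value: str) -> bool: ...


@dataclass
class Provider:
    name: str
    is_cloud: bool
    call: Callable[[str], str]  # hypothetical: takes a prompt, returns text


def complete(prompt: str, task_value: str,
             providers: Sequence[Provider], monitor: QuotaGate) -> str:
    """Try providers in order, skipping cloud ones the quota gate rejects."""
    for provider in providers:
        if provider.is_cloud and not monitor.should_use_cloud(task_value):
            continue  # quota too low or task not valuable enough: stay local
        try:
            return provider.call(prompt)
        except Exception:
            continue  # cascade to the next provider on failure
    raise RuntimeError("no provider could handle the request")
```

Placing the check inside the loop, rather than filtering the provider list once up front, means the gate is re-evaluated for each cloud provider.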

Acceptance Criteria

- `./scripts/claude_quota_check.sh` shows quota status on macOS with Claude Code authenticated
- Cascade router auto-switches to local when 5h quota > 80%
- Routine tasks NEVER hit cloud API regardless of quota
- `pytest tests/test_claude_quota.py` passes
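
The two routing criteria above can be read as a single boolean gate. The following sketch is one interpretation, not the shipped `should_use_cloud`: it assumes task values are the strings `"high"`, `"medium"`, and `"routine"`, that utilization is passed in as percentages, and that cloud access is limited to the BURST tier as the tier list implies.

```python
from typing import Optional


def should_use_cloud(task_value: str,
                     five_hour_pct: Optional[float],
                     seven_day_pct: Optional[float]) -> bool:
    """Return True only when a cloud call is both worthwhile and affordable."""
    if task_value != "high":
        return False  # routine and medium tasks never reach the cloud API
    if five_hour_pct is None or seven_day_pct is None:
        return False  # no quota data (e.g. no Keychain credentials): local only
    # Assumed BURST condition: 5h below 50% and 7d not above 80%.
    return five_hour_pct < 50 and seven_day_pct <= 80


# A few cases that mirror the acceptance criteria.
CASES = [
    ("routine", 5.0, 5.0),   # plenty of quota, still local
    ("high", 85.0, 10.0),    # 5h quota above 80%, local
    ("high", 20.0, 10.0),    # high-value task with headroom, cloud allowed
]

if __name__ == "__main__":
    for value, five_h, seven_d in CASES:
        print(value, five_h, seven_d, should_use_cloud(value, five_h, seven_d))
```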

Cross-References

- #1074 — Timmy Handoff (references these tools as Sprint Task 5)
- #1070 — Vassal Protocol (Timmy needs quota awareness for agent dispatch)
- #966 — Three-tier LLM router (metabolic protocol governs tier selection)
- #1065 — Dual-model routing (quota determines 8B vs 14B vs cloud)


claude-quota.pdf

47 KiB

claude-quota-check.pdf

56 KiB

perplexity referenced this issue

2026-03-23 13:24:54 +00:00

[GOVERNING] Timmy as Autonomous Orchestrator — Vassal Protocol #1070

perplexity referenced this issue

2026-03-23 13:24:54 +00:00

Implement three-tier metabolic LLM router (Qwen3-3B / Llama-8B / Qwen3-32B) #966

perplexity referenced this issue

2026-03-23 13:32:11 +00:00

[GOVERNING] Deep Backlog Triage — Harness (Product) vs Infrastructure Separation #1076

claude self-assigned this 2026-03-23 13:32:51 +00:00

perplexity referenced this issue

2026-03-23 13:52:17 +00:00

[Infra] Hermes Agent local model routing — custom_providers ignored, always routes to Anthropic #950

perplexity referenced this issue

2026-03-23 13:52:39 +00:00

Feature: Agent "Energy Budget" Monitoring #1009

claude added the harness inference p0-critical labels 2026-03-23 13:53:03 +00:00

claude referenced this issue from a commit

2026-03-23 15:17:36 +00:00

feat: integrate Claude quota monitor + metabolic protocol into cascade router

claude referenced a pull request that will close this issue

2026-03-23 15:17:51 +00:00

[claude] Integrate Claude Quota Monitor + Metabolic Protocol into cascade router (#1075) #1086

claude commented

2026-03-23 15:18:01 +00:00

PR created: #1086

What was done:

1. `src/infrastructure/claude_quota.py` — `QuotaMonitor` class with `QuotaStatus` dataclass, `MetabolicTier` enum (BURST/ACTIVE/RESTING), `select_model(task_complexity)`, and `should_use_cloud(task_value)`. Reads OAuth token from macOS Keychain, calls Anthropic usage API with 30s TTL cache. Degrades gracefully to local-only when credentials are unavailable.
2. `src/infrastructure/router/cascade.py` — Added `_quota_allows_cloud()` method and integrated quota check into the `complete()` loop: cloud providers (anthropic/openai/grok) are skipped when quota is too low, falling back to Ollama automatically.
3. `scripts/claude_quota_check.sh` — CLI tool with color-coded usage bars, `--json`, and `--watch` modes.
4. `tests/infrastructure/test_claude_quota.py` — 30 unit tests covering all metabolic tier thresholds, model selection, cloud gating, caching, and the singleton. All pass; 10 pre-existing failures on main branch are unchanged.
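
The 30s TTL cache mentioned in item 1 could be sketched like this. The class below is a generic illustration and not the code in the PR: `TTLCache`, its constructor parameters, and the injectable `clock` are hypothetical, with the clock added so the expiry behavior can be tested without sleeping.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Cache one fetched value and refetch it once the TTL has elapsed."""

    def __init__(self, fetch: Callable[[], T], ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None  # None means nothing cached yet

    def get(self) -> T:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()           # e.g. one usage API request
            self._expires_at = now + self._ttl
        return self._value  # type: ignore[return-value]
```

Using `time.monotonic` rather than wall-clock time keeps the expiry unaffected by system clock adjustments.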


claude closed this issue

2026-03-23 15:18:12 +00:00

claude referenced this issue from a commit

2026-03-23 15:18:13 +00:00

[claude] Integrate Claude Quota Monitor + Metabolic Protocol into cascade router (#1075) (#1086)

Timmy referenced this issue

2026-03-23 15:24:18 +00:00

[claude] Add Claude quota tracker and metabolic mode advisor (#1074) #1087

Reference: Rockachopa/Timmy-time-dashboard#1075
