[claude] Implement Qwen3-8B / Qwen3-14B dual-model routing strategy (#1065) #1157

Closed
claude wants to merge 2 commits from claude/issue-1065 into main
Collaborator

Fixes #1065

Summary

  • src/infrastructure/router/classifier.py — New TaskComplexity enum and classify_task() heuristic classifier. Routes SIMPLE tasks (short, keyword-matched: status/list/run/etc.) to Qwen3-8B and COMPLEX tasks (long, code blocks, plan/review/analyze keywords) to Qwen3-14B. No LLM inference required.

  • src/infrastructure/router/cascade.py — Added _get_model_for_complexity() helper and wired it into complete(). When no explicit model is given, task complexity is auto-classified and the appropriate model is selected from the routine/complex fallback chains. Fixed a bug where the secondary capability lookup incorrectly fell back to the provider default model instead of returning None.

  • config/providers.yaml — Added qwen3:8b and qwen3:14b model entries with routine/complex capability tags; added routine and complex fallback chains to the capability section.

  • src/config.py — Added ollama_max_loaded_models: int = 2 setting (env: OLLAMA_MAX_LOADED_MODELS) enabling both models to stay loaded simultaneously (~17 GB combined on Apple Silicon).

  • tests/infrastructure/test_router_classifier.py — 21 tests covering SIMPLE/COMPLEX classification heuristics.

  • tests/infrastructure/test_router_cascade.py — 9 new TestComplexityRouting tests covering explicit hints, auto-classification, fallback chain selection, and the None-return fix.
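
For readers without the diff at hand, the keyword-and-length heuristic described in the classifier bullet might look roughly like the sketch below. Only the names TaskComplexity, classify_task, and the qwen3:8b / qwen3:14b model tags come from this PR. The keyword sets, the 500-character cutoff, the SIMPLE default for unmatched short prompts, and the MODEL_FOR_COMPLEXITY mapping are assumptions made for illustration and may differ from the actual classifier.py.

```python
import re
from enum import Enum


class TaskComplexity(Enum):
    SIMPLE = "simple"
    COMPLEX = "complex"


# Assumed keyword sets; the PR description names only these examples.
SIMPLE_KEYWORDS = {"status", "list", "run"}
COMPLEX_KEYWORDS = {"plan", "review", "analyze"}

# Assumed cutoff for what counts as a "long" prompt.
LONG_PROMPT_CHARS = 500

# Markdown code-fence marker, built indirectly so this example nests cleanly.
FENCE = "`" * 3

# Hypothetical mapping from complexity to the model tags added in providers.yaml.
MODEL_FOR_COMPLEXITY = {
    TaskComplexity.SIMPLE: "qwen3:8b",
    TaskComplexity.COMPLEX: "qwen3:14b",
}


def classify_task(prompt: str) -> TaskComplexity:
    """Classify a prompt with cheap string heuristics (no LLM call)."""
    # Long prompts and prompts containing code blocks are treated as complex.
    if FENCE in prompt or len(prompt) > LONG_PROMPT_CHARS:
        return TaskComplexity.COMPLEX

    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & COMPLEX_KEYWORDS:
        return TaskComplexity.COMPLEX
    if words & SIMPLE_KEYWORDS:
        return TaskComplexity.SIMPLE

    # Assumption: short prompts with no keyword match default to SIMPLE.
    return TaskComplexity.SIMPLE
```

In this sketch the complex signals are checked first, so a short prompt that mentions both "run" and "review" is sent to the larger model. Whether the real classifier resolves such ties the same way is not stated in the PR.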

Test plan

  • All 75 infrastructure tests pass (pytest tests/infrastructure/)
  • Full unit suite green (tox -e unit)
  • Both models configured in providers.yaml with correct capability tags
  • OLLAMA_MAX_LOADED_MODELS=2 documented in config
claude added 2 commits 2026-03-23 19:34:31 +00:00
Automated salvage commit — agent session ended (exit 124).
Work in progress, may need continuation.
fix: correct complexity routing to not fall back to default model
Some checks failed
Tests / lint (pull_request) Failing after 17s
Tests / test (pull_request) Has been skipped
0b284972cb
`_get_model_for_complexity` was calling `get_model_with_capability`,
which silently falls back to the provider default when no model has the
requested capability tag.  This caused the method to return a generic
model instead of None when neither the fallback chain nor any explicit
capability tag matched, misleading callers into skipping the provider
default logic.

Replace the call with an explicit next() comprehension that returns None
when no model explicitly carries the 'routine' or 'complex' capability.

Refs #1065

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
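
The lookup pattern the commit message describes, next() with a None default in place of a helper that silently falls back, can be sketched as follows. The model list layout and the function name find_model_with_capability are hypothetical. Only the 'routine' and 'complex' capability tags and the two model names are taken from the PR.

```python
from typing import Optional

# Hypothetical model entries, loosely shaped after the providers.yaml description.
MODELS = [
    {"name": "qwen3:8b", "capabilities": ["routine"]},
    {"name": "qwen3:14b", "capabilities": ["complex"]},
]


def find_model_with_capability(models: list, capability: str) -> Optional[str]:
    """Return the first model that explicitly carries the tag, else None."""
    # next() with a default of None never substitutes a provider default,
    # so the caller can tell "no match" apart from a real match.
    return next(
        (m["name"] for m in models if capability in m.get("capabilities", [])),
        None,
    )
```

Returning None leaves the decision about the provider default to the caller, which is the behavior the commit says the earlier helper was masking.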
Owner

Closing: this PR has merge conflicts and is stale. The issue remains open for a fresh attempt.

Timmy closed this pull request 2026-03-23 19:40:07 +00:00

Pull request closed

Reference: Rockachopa/Timmy-time-dashboard#1157