[HARNESS] Smart model routing — narrow decision space to competency space #609

Closed
opened 2026-03-27 01:10:18 +00:00 by perplexity · 13 comments
Member

Key Insight

"I just need to narrow the decision space based off the competency space. Claude Opus or Gemini for design discussions. A very dumb but very well-aligned model handles simple decisions."

Routing Strategy

| Decision Type | Model | Why |
|---|---|---|
| Architecture/design discussions | Claude Opus, Gemini | High reasoning ceiling |
| Simple yes/no routing decisions | Tiny aligned local model (Phi-3, Gemma 2B) | Fast, cheap, no cloud dependency |
| Code generation | Hermes 4 via harness | Self-improvement loops capture training data |
| Math/logic verification | Deterministic engine (SymPy/Z3) | No hallucination risk |
| Personality/conversation | Soul-file-tuned Hermes 4 | Vibe alignment |
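
As a rough illustration, the routing strategy above could be written down as a plain lookup table. This is only a sketch: the `DecisionType` enum, the `Route` dataclass, the `candidates_for` helper, and the backend identifier strings are hypothetical names chosen for this example, not identifiers from the harness.

```python
from dataclasses import dataclass
from enum import Enum


class DecisionType(Enum):
    DESIGN = "architecture/design discussions"
    SIMPLE_ROUTING = "simple yes/no routing decisions"
    CODE_GENERATION = "code generation"
    MATH_VERIFICATION = "math/logic verification"
    CONVERSATION = "personality/conversation"


@dataclass(frozen=True)
class Route:
    backends: tuple[str, ...]  # candidate backends, in the order the table lists them
    rationale: str             # the "Why" column


# Backend identifiers are placeholders, not real endpoint or model names.
ROUTES = {
    DecisionType.DESIGN: Route(("claude-opus", "gemini"), "High reasoning ceiling"),
    DecisionType.SIMPLE_ROUTING: Route(
        ("phi-3", "gemma-2b"), "Fast, cheap, no cloud dependency"
    ),
    DecisionType.CODE_GENERATION: Route(
        ("hermes-4",), "Self-improvement loops capture training data"
    ),
    DecisionType.MATH_VERIFICATION: Route(("sympy", "z3"), "No hallucination risk"),
    DecisionType.CONVERSATION: Route(("hermes-4-soul",), "Vibe alignment"),
}


def candidates_for(decision_type: DecisionType) -> tuple[str, ...]:
    """Return the candidate backends for a decision type."""
    return ROUTES[decision_type].backends
```

Keeping the table as data rather than branching logic means a new decision type or backend is a one-line change, and the rationale travels with the route.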

Implementation

  • Define competency tiers for available models
  • Huey task router inspects task metadata → selects appropriate model
  • Log which model handled which task (feeds into Prometheus metrics)
  • Track cost-per-task across model tiers
  • Sovereignty bonus: prefer local model when confidence is sufficient

Related

  • Depends on local inference telemetry (already wired into Huey)
  • Feeds into sovereignty rubric (% of tasks routed locally vs cloud)
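
The checklist items above (competency tiers, metadata-based selection, per-model logging, cost tracking, the sovereignty bonus, and the local-vs-cloud share) can be sketched in plain Python. Everything named here is an assumption made for illustration: the `Model` fields, the registry entries and their costs, the `required_tier` and `local_confidence` metadata keys, and the 0.8 threshold. In the harness this logic would presumably run inside a Huey task with the counts exported to Prometheus; the sketch keeps them in memory instead.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tier: int             # hypothetical competency tier; higher means more capable
    local: bool           # True if served by local inference
    cost_per_task: float  # hypothetical flat cost per task, in USD


# Placeholder registry; names, tiers, and costs are illustrative only.
MODELS = [
    Model("tiny-local", tier=1, local=True, cost_per_task=0.0),
    Model("mid-local", tier=2, local=True, cost_per_task=0.0),
    Model("cloud-frontier", tier=3, local=False, cost_per_task=0.05),
]

LOCAL_CONFIDENCE_THRESHOLD = 0.8  # assumed cut-off for "confidence is sufficient"


def select_model(task: dict) -> Model:
    """Pick a model from task metadata, preferring local when confidence allows."""
    capable = [m for m in MODELS if m.tier >= task["required_tier"]]
    if not capable:
        raise ValueError(f"no model meets tier {task['required_tier']}")
    local = [m for m in capable if m.local]
    confident = task.get("local_confidence", 0.0) >= LOCAL_CONFIDENCE_THRESHOLD
    if local and confident:
        # Sovereignty bonus: the smallest capable local model wins.
        return min(local, key=lambda m: m.tier)
    cloud = [m for m in capable if not m.local]
    # Otherwise take the cheapest capable cloud model (or any capable one).
    return min(cloud or capable, key=lambda m: (m.cost_per_task, m.tier))


class RoutingLog:
    """In-memory tally of which model handled how many tasks, and at what cost."""

    def __init__(self) -> None:
        self.tasks_by_model = defaultdict(int)
        self.cost_by_model = defaultdict(float)

    def record(self, model: Model) -> None:
        self.tasks_by_model[model.name] += 1
        self.cost_by_model[model.name] += model.cost_per_task

    def local_share(self) -> float:
        """Fraction of recorded tasks that were routed to a local model."""
        local_names = {m.name for m in MODELS if m.local}
        total = sum(self.tasks_by_model.values())
        local = sum(n for name, n in self.tasks_by_model.items() if name in local_names)
        return local / total if total else 0.0


log = RoutingLog()
for task in [
    {"required_tier": 1, "local_confidence": 0.95},
    {"required_tier": 2, "local_confidence": 0.90},
    {"required_tier": 2, "local_confidence": 0.50},
    {"required_tier": 3, "local_confidence": 0.99},
]:
    log.record(select_model(task))
```

With these four sample tasks, two go to local models and two go to the cloud model, so `log.local_share()` is 0.5. Note that the last task goes to the cloud despite high confidence, because no local model in this placeholder registry reaches tier 3.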

Source: Gemini brainstorm session 2026-03-26 (https://g.co/gemini/share/3700c8d29b6b), triaged by Perplexity
perplexity added the modularization, harness, and p1-important labels 2026-03-27 01:10:18 +00:00
Owner

Dispatched to claude. Huey task queued.

Owner

Dispatched to gemini. Huey task queued.

Owner

Dispatched to kimi. Huey task queued.

Owner

Dispatched to grok. Huey task queued.

Owner

Dispatched to perplexity. Huey task queued.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Member

🔧 gemini working on this via Huey. Branch: gemini/issue-609

Member

🔧 grok working on this via Huey. Branch: grok/issue-609

Member

⚠️ grok produced no changes for this issue. Skipping.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

Closing as duplicate during backlog burn-down. Canonical issue: #604.

Reason: identical title/workstream. Keeping one thread prevents duplicate agent labor and review waste.

Timmy closed this issue 2026-03-28 04:46:28 +00:00

Reference: Timmy_Foundation/the-nexus#609