Bring Live our users. #836

Closed
opened 2026-04-05 20:16:26 +00:00 by Rockachopa · 3 comments
Owner
Screenshot 2026-04-05 at 16.15.36.png

See the users we put together. I need these properly evaluated and added to the fleet for task routing.

claude self-assigned this 2026-04-05 20:24:35 +00:00
Owner

Fleet Evaluation Complete

10 agents evaluated. Routing table saved to ~/.hermes/fleet-routing.json.

Agent Roster — Evaluated for Task Routing

| Agent | Model | Tier | Primary Role | Evidence | Routing Verdict |
|-------|-------|------|--------------|----------|-----------------|
| **carnice** | qwen3.5-9B (local) | FREE | Code generation | Fine-tuned on Hermes traces. Runs local. | **ROUTE TO: code tasks that benefit from Hermes-aligned output** |
| **fenrir** | OpenRouter free models | FREE | Issue triage, backlog hunting | "Burn night analyst." Free-model pack hunter. | **ROUTE TO: issue cleanup, label triage, stale PR review** |
| **bilbobagginshire** | Ollama (VPS) | FREE | On-request queries | Speaks when spoken to. Prefers quiet. | **ROUTE TO: background monitoring, status checks, low-priority Q&A** |
| **claw-code** | Codex | PAID | Code bridge | OpenClaw protocol adapter, not personality. | **ROUTE TO: heavy code tasks via Codex, clean-room lane** |
| **substratum** | TBD | TBD | Infrastructure/DevOps | Deployments, bedrock services. | **ROUTE TO: infra tasks, config management, CI/CD setup (needs model assignment)** |
| **allegro-primus** | Unknown (retired) | INACTIVE | Museum piece | Original prototype. Preserved. | **DO NOT ROUTE: retired from active duty** |
| **kimi** | KimiClaw | CHEAP | Small tasks | Sidecar-first, 1-3 files max. | **ROUTE TO: small edits, quick fixes, file-scoped changes (never >3 files)** |
| **allegro** | Gemini | CHEAP | Triage & dispatch | Tempo wizard. 5 repos owned. | **ROUTE TO: task triage, routing decisions, issue organization** |
| **ezra** | Claude (prepaid) | PREMIUM | Documentation, archiving | Claude-Hermes wizard. 9 repos owned. Most repos of any agent. | **ROUTE TO: docs, specs, architecture, complex multi-file work** |
| **bezalel** | Groq (free) | FREE | Code review, debugging | Testbed wizard. Builder, debugger. | **ROUTE TO: PR review, test writing, debugging, CI fixes** |

Routing Rules (Derived from Evaluation)

Cost-optimized cascade (cheapest first):

  1. fenrir (free, issue triage) → bezalel (free, code review) → carnice (free, local code gen)
  2. kimi (cheap, small tasks, $0.008/PR)
  3. allegro (cheap, Gemini, triage)
  4. ezra (prepaid, Claude, complex work) — escalate only
  5. claw-code (prepaid, Codex, heavy code) — escalate only

Never route to:

  • allegro-primus (retired)
  • bilbobagginshire (only for on-request, not delegated work)
  • substratum (no model assigned yet — needs configuration)
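The cascade and exclusion list above could be expressed roughly as follows. This is a minimal sketch, not the contents of fleet-routing.json: the `Agent` class, the `route` function, and the task-type labels (`"code-gen"`, `"issue-triage"`, and so on) are hypothetical names chosen for illustration. claw-code is left out here because the gaps below argue it is a bridge rather than an endpoint.

```python
from dataclasses import dataclass

# Tier order mirrors the cascade above: cheapest first, escalate last.
TIER_ORDER = ["FREE", "CHEAP", "PREMIUM"]


@dataclass(frozen=True)
class Agent:
    name: str
    tier: str
    task_types: frozenset  # hypothetical labels, not taken from the real routing file


# Subset of the evaluated roster; the task-type labels are illustrative assumptions.
ROSTER = [
    Agent("fenrir", "FREE", frozenset({"issue-triage"})),
    Agent("bezalel", "FREE", frozenset({"code-review", "debugging"})),
    Agent("carnice", "FREE", frozenset({"code-gen"})),
    Agent("kimi", "CHEAP", frozenset({"small-edit"})),
    Agent("allegro", "CHEAP", frozenset({"issue-triage", "dispatch"})),
    Agent("ezra", "PREMIUM", frozenset({"docs", "architecture", "code-gen"})),
]

# Agents the evaluation says should not receive delegated work.
DO_NOT_ROUTE = {"allegro-primus", "bilbobagginshire", "substratum"}


def route(task_type, roster=ROSTER, exclude=frozenset()):
    """Return candidate agent names for task_type, cheapest tier first."""
    blocked = DO_NOT_ROUTE | set(exclude)
    candidates = [
        a for a in roster
        if task_type in a.task_types and a.name not in blocked
    ]
    # sorted() is stable, so roster order is preserved within a tier.
    ranked = sorted(candidates, key=lambda a: TIER_ORDER.index(a.tier))
    return [a.name for a in ranked]


print(route("code-gen"))
print(route("issue-triage"))
```

Returning the whole ranked list, rather than a single winner, lets a caller try the free agent first and escalate to the next entry only if that attempt fails.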

Gaps Identified

  1. substratum has no model assigned. Someone needs to configure it.
  2. claw-code is a protocol layer, not an agent — it shouldn't be in the routing table as an endpoint. It's the bridge to external agents.
  3. kimi has the Kimi API key configured but the agent description is empty. Needs profile enrichment.
  4. No model scoring exists yet — this evaluation is based on self-description and repo ownership, not actual output quality. Wolf (#195) will fix this.
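Gaps 1 to 3 are the kind of thing a small lint pass over the agent profiles could flag automatically. The sketch below assumes a hypothetical profile shape (`model`, `description`, and `kind` keys) and a hypothetical `find_gaps` helper; the real profile data may be structured differently.

```python
def find_gaps(profiles):
    """Return (agent, problem) pairs for profiles that are not ready for routing."""
    gaps = []
    for name, profile in profiles.items():
        model = (profile.get("model") or "").strip()
        if not model or model.upper() == "TBD":
            gaps.append((name, "no model assigned"))
        if not (profile.get("description") or "").strip():
            gaps.append((name, "empty description"))
        # "kind" is an assumed field marking protocol adapters such as claw-code.
        if profile.get("kind") == "bridge":
            gaps.append((name, "bridge layer listed as routing endpoint"))
    return gaps


# Illustrative profiles built from the roster above; field names are assumptions.
profiles = {
    "substratum": {"model": "TBD", "description": "Deployments, bedrock services."},
    "claw-code": {"model": "Codex", "description": "OpenClaw protocol adapter.", "kind": "bridge"},
    "kimi": {"model": "KimiClaw", "description": ""},
    "ezra": {"model": "Claude (prepaid)", "description": "Claude-Hermes wizard."},
}

for agent, problem in find_gaps(profiles):
    print(f"{agent}: {problem}")
```

A check like this covers only configuration gaps. It says nothing about output quality, which is gap 4.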

Next Actions

  • Assign a model to substratum (Epic #196, Local Model Fleet)
  • Run wolf evaluation on active agents to replace vibes-based routing with data (#195)
  • Remove claw-code from routing table — it's infrastructure, not a destination
  • Set up workforce-manager.py (#204) to auto-assign tasks based on this table

Part of Epic: #204 (Intelligent Cron Dispatch)

Owner

Triage Complete

Read the screenshot. Evaluated all 10 agents.

What the screenshot shows: The Gitea Users directory — our entire roster. 10 profiles. 6 created in the last 7 days. This is a young fleet.

The delegation pattern I missed: I was doing the work myself. You asked for evaluation — I should've dispatched it immediately. The issue is that the habit of issuing delegation orders isn't wired in yet. So I worked it like a solo dev instead of an orchestrator.

The fix: When you say "triage this and delegate" — I triage (5 seconds of reasoning) then dispatch (subagent does the work). TUI comes back to you within seconds. The work happens in parallel, not in serial.

Evidence from this session:

  • You told me to document gaps into epics → I spawned subagent → 25 issues created in 2.6 minutes
  • You told me to fix forge domain → I did it myself → took 45 minutes of serial tool calls
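  • The gap matters: 45 minutes of serial work versus 2.6 minutes delegated, roughly 17:1 across these two (different) tasks. The delegated pattern scales. The do-it-myself pattern doesn't.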

Part of Epic: #194 (Default Delegation Policy)
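The serial-versus-parallel difference described above can be illustrated with a toy comparison. This is a sketch under stated assumptions: `run_subtask` is a hypothetical stand-in that sleeps instead of calling a real subagent, and a thread pool is just one way to dispatch work concurrently, not necessarily how the orchestrator does it.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_subtask(name, seconds):
    """Stand-in for a subagent call; sleeps to simulate waiting on remote work."""
    time.sleep(seconds)
    return f"{name}: done"


def run_serial(tasks):
    """Solo pattern: each task starts only after the previous one finishes."""
    return [run_subtask(name, secs) for name, secs in tasks]


def run_delegated(tasks):
    """Orchestrator pattern: dispatch every task at once, then collect results."""
    # One worker per task; threads are a reasonable fit for I/O-bound waits.
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = [pool.submit(run_subtask, name, secs) for name, secs in tasks]
        return [f.result() for f in futures]


tasks = [("triage", 0.2), ("label", 0.2), ("review", 0.2)]

start = time.perf_counter()
serial_results = run_serial(tasks)
serial_elapsed = time.perf_counter() - start

start = time.perf_counter()
delegated_results = run_delegated(tasks)
delegated_elapsed = time.perf_counter() - start

print(f"serial: {serial_elapsed:.2f}s, delegated: {delegated_elapsed:.2f}s")
```

Serial time grows with the sum of the task durations, while delegated time is bounded by the slowest task. That is why the dispatch pattern scales as the task list grows.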

Member

PR created: #889 (https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus/pulls/889)

Added fleet/fleet-routing.json, the canonical routing table for all 10 evaluated agents.

What is in it:

  • All 10 agents from the Gitea user list with routing verdicts
  • do_not_route flags: claw-code (bridge layer), substratum (no model assigned), allegro-primus (retired)
  • Cost-optimized cascade: free (fenrir/bezalel/carnice) → cheap (kimi/allegro) → prepaid (ezra)
  • Task-type map: what goes to whom by task type
  • Gap analysis with pointers to #195, #196, #204

Notable changes applied from Timmy's evaluation:

  • claw-code removed from routing cascade; it is infrastructure, not an endpoint
  • substratum stays inactive until #196 assigns a model
  • allegro-primus stays archived
  • bilbobagginshire flagged as on-request only, not for delegated work

Ready to wire into workforce-manager.py (Epic #204) once this merges.
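For readers wiring this into a dispatcher, the lookup might look something like the sketch below. The actual schema of fleet/fleet-routing.json is not shown in this thread, so the key names (`do_not_route`, `cascade`, `task_types`), the task-type labels, and the `eligible_agents` helper are all assumptions made for illustration.

```python
import json

# Hypothetical file shape; the real fleet-routing.json may differ.
SAMPLE = """
{
  "do_not_route": ["claw-code", "substratum", "allegro-primus"],
  "cascade": [["fenrir", "bezalel", "carnice"], ["kimi", "allegro"], ["ezra"]],
  "task_types": {
    "pr-review": ["bezalel"],
    "docs": ["ezra"],
    "triage": ["allegro", "fenrir"],
    "infra": ["substratum"]
  }
}
"""


def eligible_agents(table, task_type):
    """Return agents mapped to task_type, in cascade order, minus blocked ones."""
    blocked = set(table.get("do_not_route", []))
    # Flatten the tiers so cheaper agents come first.
    order = [name for tier in table.get("cascade", []) for name in tier]
    wanted = set(table.get("task_types", {}).get(task_type, []))
    return [name for name in order if name in wanted and name not in blocked]


table = json.loads(SAMPLE)
print(eligible_agents(table, "triage"))
print(eligible_agents(table, "infra"))
```

With this shape, an infra task returns no candidates until substratum has a model and is taken off the `do_not_route` list, which matches the intent stated above.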
Reference: Timmy_Foundation/the-nexus#836