Agent Debate on Borderline Eval Requests #21

Closed
opened 2026-03-20 22:26:51 +00:00 by replit · 1 comment
Owner

What & Why

When Beta's evaluation of a request is close to the accept/reject threshold, a single model pass may give an arbitrary result. A mini debate — where a second model instance argues the opposing view, and Beta synthesizes — produces a more defensible decision and makes the Workshop dramatically more interesting to watch.

Done looks like

  • Borderline evals identified when eval model returns confidence: "low"
  • On borderline: a second Haiku call argues the opposing position (accept if the eval says reject, and vice versa)
  • Both arguments broadcast as agent_debate WebSocket events, rendered as back-and-forth dialogue between 'Beta-A' / 'Beta-B'
  • A third Haiku call synthesizes a final verdict with reasoning; this becomes the actual eval decision
  • Total debate adds ~3-5 seconds to eval time for borderline cases only (fast path unchanged for clear cases)
  • Debate transcript stored with the job record for review
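
The flow in the list above could be sketched roughly as follows. This is a minimal illustration, not the code in agent.ts: every name here (`EvalResult`, `DebateModel`, `decideWithDebate`, and so on) is hypothetical, and the model calls are injected as an interface so the routing logic can run without API keys. It also assumes Beta-A simply reuses the initial eval's reasoning as its argument.

```typescript
// All names below are hypothetical; the real code in agent.ts may differ.
type Decision = "accept" | "reject";

interface EvalResult {
  decision: Decision;
  confidence: "high" | "low";
  reasoning: string;
}

interface DebateTurn {
  agent: "Beta-A" | "Beta-B";
  position: Decision;
  argument: string;
}

interface Verdict {
  decision: Decision;
  reasoning: string;
}

// Stand-in for the model calls, injected so the flow runs without API keys.
interface DebateModel {
  argue(request: string, position: Decision): Promise<string>;
  synthesize(request: string, turns: DebateTurn[]): Promise<Verdict>;
}

function opposite(decision: Decision): Decision {
  return decision === "accept" ? "reject" : "accept";
}

function isBorderline(result: EvalResult): boolean {
  return result.confidence === "low";
}

async function decideWithDebate(
  request: string,
  initial: EvalResult,
  model: DebateModel,
): Promise<{ verdict: Verdict; transcript: DebateTurn[] }> {
  // Fast path: clear-cut evals skip the debate and make no extra calls.
  if (!isBorderline(initial)) {
    return {
      verdict: { decision: initial.decision, reasoning: initial.reasoning },
      transcript: [],
    };
  }

  // Assumption: Beta-A restates the initial position using the eval's own reasoning.
  const betaA: DebateTurn = {
    agent: "Beta-A",
    position: initial.decision,
    argument: initial.reasoning,
  };

  // Beta-B: a second call argues the opposite position.
  const counterPosition = opposite(initial.decision);
  const betaB: DebateTurn = {
    agent: "Beta-B",
    position: counterPosition,
    argument: await model.argue(request, counterPosition),
  };

  // A third call synthesizes the final verdict, which replaces the initial decision.
  const transcript = [betaA, betaB];
  const verdict = await model.synthesize(request, transcript);
  return { verdict, transcript };
}
```

Because the verdict keeps only `decision` and `reasoning`, downstream job states would see the same result shape whether or not a debate ran, which matches the out-of-scope note below.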

Out of scope

  • Human intervention in the debate
  • Debates for the work/execution phase
  • Changing the eval result format seen by downstream job states

Tasks

  1. Borderline detection — Update eval prompt to return a confidence field (high/low); add logic to route low-confidence evals to the debate path.
  2. Debate execution — Run two opposing Haiku calls sequentially; run synthesis call; use synthesis result as final eval decision.
  3. Debate broadcast — Emit agent_debate WebSocket events with each argument and final verdict; include agent names and argument text.
  4. Debate UI — Render debate as styled dialogue in Workshop chat (Beta-A / Beta-B), followed by a 'Final verdict' message from Beta; visually distinct from regular chat.
  5. Storage — Store debate arguments and verdict in a job_debates table for later review.
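
As a rough illustration of tasks 3 and 5, the sketch below builds `agent_debate` event payloads from a finished debate and flattens them into rows for a `job_debates` table. The field names (`jobId`, `kind`, `turnIndex`, `text`) and both helper functions are assumptions made for this example; the issue only specifies that events carry agent names and argument text.

```typescript
// Hypothetical shapes; field names are assumptions, not the project's actual schema.
type Decision = "accept" | "reject";

interface DebateTurn {
  agent: "Beta-A" | "Beta-B";
  position: Decision;
  argument: string;
}

interface Verdict {
  decision: Decision;
  reasoning: string;
}

interface AgentDebateEvent {
  type: "agent_debate";
  jobId: string;
  kind: "argument" | "verdict";
  agent: "Beta-A" | "Beta-B" | "Beta";
  position: Decision;
  text: string;
}

interface JobDebateRow {
  jobId: string;
  turnIndex: number;
  agent: string;
  position: Decision;
  text: string;
}

// One event per argument, followed by a final verdict event from Beta.
function buildDebateEvents(
  jobId: string,
  turns: DebateTurn[],
  verdict: Verdict,
): AgentDebateEvent[] {
  const argumentEvents = turns.map((turn): AgentDebateEvent => ({
    type: "agent_debate",
    jobId,
    kind: "argument",
    agent: turn.agent,
    position: turn.position,
    text: turn.argument,
  }));
  const verdictEvent: AgentDebateEvent = {
    type: "agent_debate",
    jobId,
    kind: "verdict",
    agent: "Beta",
    position: verdict.decision,
    text: verdict.reasoning,
  };
  return [...argumentEvents, verdictEvent];
}

// Flatten events into rows, preserving order via turnIndex.
function toJobDebateRows(events: AgentDebateEvent[]): JobDebateRow[] {
  return events.map((event, index) => ({
    jobId: event.jobId,
    turnIndex: index,
    agent: event.agent,
    position: event.position,
    text: event.text,
  }));
}
```

Deriving the stored rows from the same event objects that are broadcast is one way to keep the transcript in the database consistent with what viewers saw in the Workshop.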

Relevant files

  • artifacts/api-server/src/lib/agent.ts
  • artifacts/api-server/src/routes/jobs.ts
  • artifacts/api-server/src/routes/events.ts
  • the-matrix/js/ui.js
  • the-matrix/js/websocket.js
replit added the ai, backend, workshop, frontend labels 2026-03-20 22:26:51 +00:00
claude was assigned by Rockachopa 2026-03-22 23:37:39 +00:00
Collaborator

PR #72 created.

Implemented the full agent debate feature:

  • Eval model now returns confidence: high/low — low-confidence evals trigger a mini debate
  • Beta-A argues the initial position, Beta-B counters, then a synthesis call renders the final verdict
  • Debate arguments broadcast as agent_debate WebSocket events in real time
  • Workshop UI renders debates as styled back-and-forth dialogue (Blue=Beta-A, Pink=Beta-B, Green/Red=verdict)
  • Debate transcripts stored in job_debates table
  • Fast path unchanged for clear-cut evals (zero overhead)
  • Stub mode supported for local dev without API keys
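
The color scheme mentioned in the list above could be expressed as a small mapping like the one below. The message shape and the function name are hypothetical, and the pairing of green with accept and red with reject is an assumption; the comment only states that the verdict is shown in green or red.

```typescript
// Hypothetical message shape; only the color assignments come from the comment above.
type DebateMessage =
  | { kind: "argument"; agent: "Beta-A" | "Beta-B" }
  | { kind: "verdict"; decision: "accept" | "reject" };

type DebateColor = "blue" | "pink" | "green" | "red";

function debateColor(message: DebateMessage): DebateColor {
  if (message.kind === "argument") {
    return message.agent === "Beta-A" ? "blue" : "pink";
  }
  // Assumption: green marks an accept verdict and red a reject verdict.
  return message.decision === "accept" ? "green" : "red";
}
```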

Reference: replit/timmy-tower#21