Agent Debate on Borderline Eval Requests #21

replit · 2026-03-20T22:26:51Z

replit commented

2026-03-20 22:26:51 +00:00

What & Why

When Beta's evaluation of a request is close to the accept/reject threshold, a single model pass may give an arbitrary result. A mini debate — where a second model instance argues the opposing view, and Beta synthesizes — produces a more defensible decision and makes the Workshop dramatically more interesting to watch.

Done looks like

Borderline evals identified when eval model returns confidence: "low"
On borderline: a second Haiku call argues the opposing position (accept if the eval says reject, and vice versa)
Both arguments broadcast as agent_debate WebSocket events, rendered as back-and-forth dialogue between 'Beta-A' / 'Beta-B'
A third Haiku call synthesizes a final verdict with reasoning; this becomes the actual eval decision
Total debate adds ~3-5 seconds to eval time for borderline cases only (fast path unchanged for clear cases)
Debate transcript stored with the job record for review
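
The borderline check described above could be sketched roughly as follows. Only the `confidence` field and its `high`/`low` values come from this issue; the `EvalResult` shape and the names `isBorderline` and `opposingDecision` are hypothetical and may not match what `agent.ts` actually uses.

```typescript
// Hypothetical types: the issue only specifies a `confidence` field (high/low).
type Decision = "accept" | "reject";
type Confidence = "high" | "low";

interface EvalResult {
  decision: Decision;
  confidence: Confidence;
  reasoning: string;
}

// A low-confidence eval is treated as borderline and routed to the debate path.
function isBorderline(result: EvalResult): boolean {
  return result.confidence === "low";
}

// The second call argues the side the initial eval did not pick.
function opposingDecision(decision: Decision): Decision {
  return decision === "accept" ? "reject" : "accept";
}
```

Clear-cut evals (`confidence: "high"`) never reach `opposingDecision`, which is what keeps the fast path unchanged.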

Out of scope

Human intervention in the debate
Debates for the work/execution phase
Changing the eval result format seen by downstream job states

Tasks

Borderline detection — Update eval prompt to return a confidence field (high/low); add logic to route low-confidence evals to the debate path.
Debate execution — Run two opposing Haiku calls sequentially; run synthesis call; use synthesis result as final eval decision.
Debate broadcast — Emit agent_debate WebSocket events with each argument and final verdict; include agent names and argument text.
Debate UI — Render debate as styled dialogue in Workshop chat (Beta-A / Beta-B), followed by a 'Final verdict' message from Beta; visually distinct from regular chat.
Storage — Store debate arguments and verdict in a job_debates table for later review.
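
A minimal sketch of tasks 2 and 3 together, assuming the model call is injected as a plain async function so the flow can run against a stub. Everything except the `agent_debate` event name and the Beta-A / Beta-B labels is an assumption: `runDebate`, `parseVerdict`, `ModelCall`, the prompt wording, the JSON reply format for the synthesis call, and the fallback to the initial decision on unparseable output are illustrative only.

```typescript
type Decision = "accept" | "reject";
type Agent = "Beta-A" | "Beta-B" | "Beta";

// Assumed event payload; the issue only fixes the event name `agent_debate`.
interface DebateEvent { type: "agent_debate"; agent: Agent; text: string; verdict?: Decision; }

interface DebateTranscript {
  turns: { agent: Agent; position: Decision; argument: string }[];
  verdict: Decision;
  reasoning: string;
}

// Hypothetical: any function that sends a prompt to the model and resolves with its text.
type ModelCall = (prompt: string) => Promise<string>;

// Parse the synthesis reply; keep the initial decision if it is not usable (an assumed policy).
function parseVerdict(raw: string, fallback: Decision): { verdict: Decision; reasoning: string } {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const { decision, reasoning } = parsed as { decision?: unknown; reasoning?: unknown };
      if (decision === "accept" || decision === "reject") {
        return { verdict: decision as Decision, reasoning: typeof reasoning === "string" ? reasoning : "" };
      }
    }
  } catch {
    // Not valid JSON: fall through to the fallback below.
  }
  return { verdict: fallback, reasoning: "Synthesis output could not be parsed; kept the initial decision." };
}

async function runDebate(
  request: string,
  initial: Decision,
  callModel: ModelCall,
  emit: (event: DebateEvent) => void,
): Promise<DebateTranscript> {
  const opposing: Decision = initial === "accept" ? "reject" : "accept";
  const sides: { agent: Agent; position: Decision }[] = [
    { agent: "Beta-A", position: initial },
    { agent: "Beta-B", position: opposing },
  ];
  const turns: DebateTranscript["turns"] = [];
  // Sequential on purpose: each argument is broadcast as soon as it is available.
  for (const side of sides) {
    const argument = await callModel(`Argue that this request should be ${side.position}ed.\nRequest: ${request}`);
    turns.push({ agent: side.agent, position: side.position, argument });
    emit({ type: "agent_debate", agent: side.agent, text: argument });
  }
  const summary = turns.map((t) => `${t.agent} (${t.position}): ${t.argument}`).join("\n");
  const raw = await callModel(
    `Synthesize a final verdict as JSON with "decision" and "reasoning".\nRequest: ${request}\n${summary}`,
  );
  const { verdict, reasoning } = parseVerdict(raw, initial);
  emit({ type: "agent_debate", agent: "Beta", text: reasoning, verdict });
  return { turns, verdict, reasoning };
}
```

The returned transcript is the kind of record that task 5 would persist to `job_debates`; the table's columns are not specified in this issue, so no schema is sketched here.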

Relevant files

artifacts/api-server/src/lib/agent.ts
artifacts/api-server/src/routes/jobs.ts
artifacts/api-server/src/routes/events.ts
the-matrix/js/ui.js
the-matrix/js/websocket.js

replit added the ai backend workshop frontend labels 2026-03-20 22:26:51 +00:00

claude was assigned by Rockachopa

2026-03-22 23:37:39 +00:00

claude referenced this issue from a commit

2026-03-23 01:07:04 +00:00

feat: agent debate on borderline eval requests (#21)

claude referenced a pull request that will close this issue

2026-03-23 01:07:19 +00:00

[claude] Agent debate on borderline eval requests (#21) #72

claude commented

2026-03-23 01:07:30 +00:00

PR #72 created.

Implemented the full agent debate feature:

Eval model now returns confidence: high/low — low-confidence evals trigger a mini debate
Beta-A argues the initial position, Beta-B counters, then a synthesis call renders the final verdict
Debate arguments broadcast as agent_debate WebSocket events in real time
Workshop UI renders debates as styled back-and-forth dialogue (Blue=Beta-A, Pink=Beta-B, Green/Red=verdict)
Debate transcripts stored in job_debates table
Fast path unchanged for clear-cut evals (zero overhead)
Stub mode supported for local dev without API keys
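
For readers following along, the client-side handling summarized above might look something like this sketch. The colors per speaker are taken from the list above; the payload shape, the `Beta` agent name for the verdict message, the mapping of green to accept and red to reject, and the function names `parseDebateEvent` and `debateColor` are assumptions, not the code in PR #72.

```typescript
type Decision = "accept" | "reject";

// Assumed payload: arguments come from Beta-A / Beta-B, the verdict from Beta.
type DebateEvent =
  | { type: "agent_debate"; agent: "Beta-A" | "Beta-B"; text: string }
  | { type: "agent_debate"; agent: "Beta"; text: string; verdict: Decision };

// Validate a raw WebSocket message; returns null for anything that is not a debate event.
function parseDebateEvent(raw: string): DebateEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const msg = data as { type?: unknown; agent?: unknown; text?: unknown; verdict?: unknown };
  const text = msg.text;
  if (msg.type !== "agent_debate" || typeof text !== "string") return null;
  if (msg.agent === "Beta-A" || msg.agent === "Beta-B") {
    return { type: "agent_debate", agent: msg.agent as "Beta-A" | "Beta-B", text };
  }
  if (msg.agent === "Beta" && (msg.verdict === "accept" || msg.verdict === "reject")) {
    return { type: "agent_debate", agent: "Beta", text, verdict: msg.verdict as Decision };
  }
  return null;
}

// Blue = Beta-A, Pink = Beta-B; verdict is assumed green for accept and red for reject.
function debateColor(event: DebateEvent): "blue" | "pink" | "green" | "red" {
  switch (event.agent) {
    case "Beta-A":
      return "blue";
    case "Beta-B":
      return "pink";
    case "Beta":
      return event.verdict === "accept" ? "green" : "red";
  }
}
```

Returning `null` for unknown messages lets a shared WebSocket handler pass non-debate events through to the regular chat path untouched.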

claude closed this issue

2026-03-23 01:07:53 +00:00

claude referenced this issue from a commit

2026-03-23 01:07:55 +00:00

[claude] Agent debate on borderline eval requests (#21) (#72)

Reference: replit/timmy-tower#21