[claude] Agent debate on borderline eval requests (#21) #72

Merged
claude merged 1 commit from claude/issue-21 into main 2026-03-23 01:07:53 +00:00

1 Commit

Author SHA1 Message Date
Alexander Whitestone
06c152d296 feat: agent debate on borderline eval requests (#21)
Some checks failed
CI / Typecheck & Lint (pull_request) Failing after 1s
When the eval model returns confidence: "low", a mini debate is triggered:
- Beta-A argues the initial position, Beta-B argues the opposing view
- A third synthesis call renders the final verdict
- Debate arguments broadcast as agent_debate WebSocket events
- Frontend renders debate as styled dialogue (Beta-A/Beta-B) in event log
- Debate transcript stored in job_debates table for review
- Fast path unchanged for high-confidence evals
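
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `resolveVerdict`, `DebateDeps`, `argue`, `synthesize`, `broadcast`, and `saveTranscript` are hypothetical, the binary accept/reject verdict is an assumption, and the model calls, WebSocket broadcast, and `job_debates` write are injected as plain functions so the sketch stays self-contained.

```typescript
// Hypothetical types; only "agent_debate", "Beta-A", "Beta-B" and
// confidence: "low" come from the commit message.
type Verdict = "accept" | "reject"; // assumption: verdicts are binary
type Speaker = "Beta-A" | "Beta-B";

interface EvalResult {
  verdict: Verdict;
  confidence: "high" | "low";
}

interface DebateEvent {
  type: "agent_debate";
  jobId: string;
  speaker: Speaker;
  argument: string;
}

// Injected side effects (all hypothetical stand-ins).
interface DebateDeps {
  // Model call that argues for the given position.
  argue: (speaker: Speaker, position: Verdict, request: string) => Promise<string>;
  // Third model call that renders the final verdict.
  synthesize: (request: string, argA: string, argB: string) => Promise<Verdict>;
  // Pushes an event to connected WebSocket clients.
  broadcast: (event: DebateEvent) => void;
  // Persists the transcript, e.g. into a job_debates table.
  saveTranscript: (jobId: string, transcript: DebateEvent[], final: Verdict) => Promise<void>;
}

const opposite = (v: Verdict): Verdict => (v === "accept" ? "reject" : "accept");

async function resolveVerdict(
  jobId: string,
  request: string,
  initial: EvalResult,
  deps: DebateDeps,
): Promise<Verdict> {
  // Fast path: anything other than low confidence skips the debate.
  if (initial.confidence !== "low") return initial.verdict;

  const transcript: DebateEvent[] = [];

  // One debate turn: get the argument, record it, and broadcast it.
  const turn = async (speaker: Speaker, position: Verdict): Promise<string> => {
    const argument = await deps.argue(speaker, position, request);
    const event: DebateEvent = { type: "agent_debate", jobId, speaker, argument };
    transcript.push(event);
    deps.broadcast(event);
    return argument;
  };

  // Beta-A defends the initial verdict, Beta-B the opposing one.
  const argA = await turn("Beta-A", initial.verdict);
  const argB = await turn("Beta-B", opposite(initial.verdict));

  // Synthesis call decides, then the transcript is stored for review.
  const final = await deps.synthesize(request, argA, argB);
  await deps.saveTranscript(jobId, transcript, final);
  return final;
}
```

In this sketch the two turns run sequentially so that the broadcast order matches the dialogue order shown in the event log; running them in parallel would also work if ordering is handled on the client.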

Fixes #21

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 21:06:56 -04:00