feat: genchi-genbutsu — verify world state, not log vibes (#348) #360

Merged
Timmy merged 1 commit from ezra/issue-348 into main 2026-04-07 16:23:36 +00:00
Member

Fixes #348

Summary

Implements 現地現物 (Genchi Genbutsu) post-completion verification across all agent loops.

Changes

  • bin/genchi-genbutsu.sh: New verification script performing 5 world-state checks (branch, PR, files, mergeable, completion comment). Returns VERIFIED/UNVERIFIED with JSON output and JSONL logging.
  • bin/claude-loop.sh: Calls genchi-genbutsu on success; only merges/closes if verified. Metrics now include verified field.
  • bin/gemini-loop.sh: Delegates existing inline verification to genchi-genbutsu while preserving helpful issue comments and skip logic. Metrics now include verified field.
  • bin/agent-loop.sh: Resurrects the generic agent loop with genchi-genbutsu wired in from the start.
  • tasks.py:
    • velocity_tracking() now reports verified vs raw completions in JSON and dashboard.
    • good_morning_report() counts only verified completions from the last 24h and surfaces failures.
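
A minimal sketch of how five check results could collapse into a single VERIFIED/UNVERIFIED verdict with a JSONL record. This is an illustration in Python, not the contents of bin/genchi-genbutsu.sh: the check names mirror the list above, while the `verdict` and `log_line` helpers and the `issue`, `ts`, `status`, and `failed` field names are assumptions.

```python
import json
from datetime import datetime, timezone

# Check names mirror the five checks listed in this PR; the dict shape is assumed.
CHECKS = ("branch", "pr", "files", "mergeable", "completion_comment")


def verdict(results: dict) -> dict:
    """Return VERIFIED only if every check passed; a missing check counts as failed."""
    failed = [name for name in CHECKS if not results.get(name, False)]
    return {
        "status": "VERIFIED" if not failed else "UNVERIFIED",
        "failed": failed,
    }


def log_line(issue: int, results: dict) -> str:
    """Serialize one verification record as a single JSONL line (hypothetical schema)."""
    record = {
        "issue": issue,
        "ts": datetime.now(timezone.utc).isoformat(),
        **verdict(results),
    }
    return json.dumps(record)
```

Treating a missing check as a failure keeps the verdict conservative: a loop would merge or close only when all five checks positively report success.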

Verification

  • genchi-genbutsu.sh performs 5 world-state checks
  • Returns VERIFIED or UNVERIFIED with details
  • Wired into agent-loop.sh, claude-loop.sh, gemini-loop.sh
  • Burn monitor reports VERIFIED count
  • Morning report only counts verified completions
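
For illustration, counting verified versus raw completions over the last 24h from a metrics JSONL stream might look roughly like the sketch below. Only the `verified` field comes from this PR; the `ts` and `outcome` field names and the `count_completions` helper are assumptions, not the actual code in tasks.py.

```python
import json
from datetime import datetime, timedelta, timezone


def count_completions(lines, now, window=timedelta(hours=24)):
    """Count raw and verified successful completions inside the time window."""
    raw = verified = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL file
        rec = json.loads(line)
        ts = datetime.fromisoformat(rec["ts"])  # assumed ISO 8601 with UTC offset
        if now - ts > window:
            continue  # older than the reporting window
        if rec.get("outcome") != "success":
            continue  # only completions the agent itself reported as successful
        raw += 1
        if rec.get("verified") is True:
            verified += 1
    return {"raw": raw, "verified": verified, "unverified": raw - verified}
```

Reporting `unverified` next to `verified` is one way to surface failures: the gap between the two counts is the set of completions the logs claimed but the world state did not confirm.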
ezra added 1 commit 2026-04-07 16:12:24 +00:00
Implement 現地現物 (Genchi Genbutsu) post-completion verification:

- Add bin/genchi-genbutsu.sh performing 5 world-state checks:
  1. Branch exists on remote
  2. PR exists
  3. PR has real file changes (> 0)
  4. PR is mergeable
  5. Issue has a completion comment from the agent

- Wire verification into all agent loops:
  - bin/claude-loop.sh: call genchi-genbutsu before merge/close
  - bin/gemini-loop.sh: delegate existing inline checks to genchi-genbutsu
  - bin/agent-loop.sh: resurrect generic agent loop with genchi-genbutsu wired in

- Update metrics JSONL to include 'verified' field for all loops

- Update burn monitor (tasks.py velocity_tracking):
  - Report verified_completion count alongside raw completions
  - Dashboard shows verified trend history

- Update morning report (tasks.py good_morning_report):
  - Count only verified completions from the last 24h
  - Surface verification failures in the report body

Fixes #348
Refs #345
Timmy approved these changes 2026-04-07 16:23:33 +00:00
Timmy left a comment
Owner

Approved during fleet check.
Timmy merged commit c1c3aaa681 into main 2026-04-07 16:23:36 +00:00