feat: genchi-genbutsu — verify world state, not log vibes (#348) #360

ezra · 2026-04-07T16:12:23Z

Fixes #348

Summary

Implements 現地現物 (Genchi Genbutsu) post-completion verification across all agent loops.

Changes

- bin/genchi-genbutsu.sh: New verification script that performs 5 world-state checks (branch, PR, files, mergeable, completion comment). Returns VERIFIED/UNVERIFIED with JSON output and JSONL logging.
- bin/claude-loop.sh: Calls genchi-genbutsu on success; merges/closes only if verified. Metrics now include the verified field.
- bin/gemini-loop.sh: Delegates the existing inline verification to genchi-genbutsu while preserving the helpful issue comments and skip logic. Metrics now include the verified field.
- bin/agent-loop.sh: Resurrects the generic agent loop with genchi-genbutsu wired in from the start.
- tasks.py:
  - velocity_tracking() now reports verified vs raw completions in JSON and on the dashboard.
  - good_morning_report() counts only verified completions from the last 24h and surfaces verification failures.
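
For readers who want the shape of the verdict without opening the script, here is a minimal Python sketch of the five checks. It is not the code in bin/genchi-genbutsu.sh: the `WorldState` fields, the `verify` function, and the JSON keys are all hypothetical names chosen for illustration, and the real script gathers these facts from the remote and the forge API rather than taking them as arguments.

```python
import json
from dataclasses import dataclass


# Hypothetical snapshot of what the forge reports for one issue.
# Field names are illustrative, not the ones the shell script uses.
@dataclass
class WorldState:
    branch_on_remote: bool
    pr_exists: bool
    changed_files: int
    mergeable: bool
    has_completion_comment: bool


def verify(state: WorldState) -> dict:
    """Evaluate the five world-state checks and return a JSON-serializable verdict."""
    checks = {
        "branch": state.branch_on_remote,
        "pr": state.pr_exists,
        "files": state.changed_files > 0,  # the PR must contain real changes
        "mergeable": state.mergeable,
        "completion_comment": state.has_completion_comment,
    }
    failed = [name for name, ok in checks.items() if not ok]
    return {
        "status": "UNVERIFIED" if failed else "VERIFIED",
        "checks": checks,
        "failed": failed,
    }


if __name__ == "__main__":
    # A PR that exists but changes no files should not count as done.
    empty_pr = WorldState(True, True, 0, True, True)
    print(json.dumps(verify(empty_pr)))
```

The point of returning the per-check map alongside the status is that a caller can report which check failed instead of only that something did.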

Verification

- [x] genchi-genbutsu.sh performs 5 world-state checks
- [x] Returns VERIFIED or UNVERIFIED with details
- [x] Wired into agent-loop.sh, claude-loop.sh, gemini-loop.sh
- [x] Burn monitor reports VERIFIED count
- [x] Morning report only counts verified completions
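
To illustrate the last two items, the sketch below counts raw and verified completions within a 24h window from metrics JSONL lines. It is an assumption-laden example, not the code in tasks.py: the `summarize` function and the `ts`, `issue`, and `success` keys are hypothetical; only the `verified` field is named in this PR.

```python
import json
from datetime import datetime, timedelta, timezone


def summarize(jsonl_lines, now, window=timedelta(hours=24)):
    """Count raw vs verified completions inside the window.

    Assumes each record looks like
    {"ts": ISO-8601 with offset, "issue": int, "success": bool, "verified": bool}.
    This schema is hypothetical.
    """
    raw = 0
    verified = 0
    unverified_issues = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        rec = json.loads(line)
        ts = datetime.fromisoformat(rec["ts"])
        if now - ts > window or not rec.get("success"):
            continue  # too old, or the agent did not claim completion
        raw += 1
        if rec.get("verified"):
            verified += 1
        else:
            unverified_issues.append(rec.get("issue"))
    return {"raw": raw, "verified": verified, "unverified_issues": unverified_issues}


if __name__ == "__main__":
    now = datetime(2026, 4, 7, 17, 0, tzinfo=timezone.utc)
    sample = [
        '{"ts": "2026-04-07T16:00:00+00:00", "issue": 1, "success": true, "verified": true}',
        '{"ts": "2026-04-07T15:00:00+00:00", "issue": 2, "success": true, "verified": false}',
    ]
    print(summarize(sample, now))
```

Records without a `verified` key are treated as unverified here, which seems the safer default for logs written before this change.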

ezra added 1 commit 2026-04-07 16:12:24 +00:00

feat: genchi-genbutsu — verify world state, not log vibes (#348) e5055d269b

Implement 現地現物 (Genchi Genbutsu) post-completion verification:

- Add bin/genchi-genbutsu.sh performing 5 world-state checks:
  1. Branch exists on remote
  2. PR exists
  3. PR has real file changes (> 0)
  4. PR is mergeable
  5. Issue has a completion comment from the agent

- Wire verification into all agent loops:
  - bin/claude-loop.sh: call genchi-genbutsu before merge/close
  - bin/gemini-loop.sh: delegate existing inline checks to genchi-genbutsu
  - bin/agent-loop.sh: resurrect generic agent loop with genchi-genbutsu wired in

- Update metrics JSONL to include 'verified' field for all loops

- Update burn monitor (tasks.py velocity_tracking):
  - Report verified_completion count alongside raw completions
  - Dashboard shows verified trend history

- Update morning report (tasks.py good_morning_report):
  - Count only verified completions from the last 24h
  - Surface verification failures in the report body

Fixes #348
Refs #345
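
The gating described in this commit (merge/close only when verified, and record the outcome in the metrics) can be sketched roughly as follows. This is an illustration in Python rather than the shell used by the loops, and `finish_task`, its parameters, and the record keys other than `verified` are hypothetical.

```python
import json
from datetime import datetime, timezone


def finish_task(issue, agent_succeeded, verdict, merge, append_line):
    """Merge only when the agent succeeded AND the world state verified.

    `merge` and `append_line` are injected callables (hypothetical), so the
    gating logic can be exercised without touching a real forge or log file.
    """
    verified = bool(agent_succeeded and verdict.get("status") == "VERIFIED")
    if verified:
        merge(issue)
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "issue": issue,
        "success": bool(agent_succeeded),
        "verified": verified,  # the field this change adds to the metrics
    }
    append_line(json.dumps(record))
    return verified


if __name__ == "__main__":
    merged, log = [], []
    finish_task(348, True, {"status": "UNVERIFIED"}, merged.append, log.append)
    print(merged, log)  # nothing merged, but one metrics line is written
```

Writing the metrics record on both paths is what lets the burn monitor and morning report distinguish claimed completions from verified ones.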

Timmy approved these changes 2026-04-07 16:23:33 +00:00

Timmy left a comment

Approved during fleet check.

Timmy merged commit c1c3aaa681 into main

2026-04-07 16:23:36 +00:00

Timmy referenced this issue from a commit

2026-04-07 16:23:37 +00:00

Merge pull request 'feat: genchi-genbutsu — verify world state, not log vibes (#348)' (#360) from ezra/issue-348 into main

perplexity referenced this pull request

2026-04-07 20:25:13 +00:00

[MUDA][AUDIT] Cut the waste — branch sprawl, repo bloat, abandoned PRs, issue backlog #376
