[KAIZEN] 改善 — Continuous improvement. Automated retrospective after every burn cycle. #349

Timmy · 2026-04-07T14:30:12Z

Part of Epic: #345

Principle

Kaizen means "change for better" — continuous, incremental improvement. Not big reorganizations. Small adjustments after every cycle. The system gets 1% better every day.

Our Problem

We run burn nights but don't systematically learn from them. Which issues did agents struggle with? Which repos had the highest success rates? Which agents produced the best PRs? We don't know, because we neither measure nor reflect.

Implementation: kaizen-retro.sh

Daily automated retrospective that runs after the morning report:

1. Read overnight metrics (verified completions per agent per repo)
2. Identify patterns:
   - Which issues were skipped most? (max-attempts log)
   - Which agents had highest verify rate?
   - Which repos had most failures?
   - Which issue TYPES succeeded vs failed? (bug vs feature vs docs)
3. Generate ONE concrete improvement suggestion:
   - "Groq fails on timmy-home issues. Consider removing timmy-home from groq's repo list."
   - "Gemini succeeds on the-nexus but fails on timmy-config. Focus gemini on the-nexus."
   - "3 issues hit max-attempts. Their common pattern: they require VPS access."
4. Post the retro to Telegram and as a comment on the master report issue
5. If the improvement is automatable (e.g., change repo list), do it.
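
The read, analyze, and suggest steps above could be sketched in shell roughly as follows. This is a minimal sketch, not the actual kaizen-retro.sh: the tab-separated metrics format (agent, repo, issue type, result), the function names `rate_by_column` and `suggest_improvement`, and the minimum-attempts threshold are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only. ASSUMED metrics format (not the project's real log format):
#   agent<TAB>repo<TAB>issue_type<TAB>result
# where result is "verified" or anything else (treated as a failure).
set -euo pipefail
export LC_ALL=C  # keep sort order stable across machines

# Print "key<TAB>verified<TAB>total<TAB>rate_percent" for each value of a column.
# $1 = column number (1 = agent, 2 = repo, 3 = issue type), $2 = metrics file
rate_by_column() {
  awk -F'\t' -v col="$1" '
    { total[$col]++; if ($4 == "verified") ok[$col]++ }
    END {
      for (k in total)
        printf "%s\t%d\t%d\t%d\n", k, ok[k], total[k], (100 * ok[k]) / total[k]
    }
  ' "$2" | sort
}

# Emit ONE suggestion: the agent/repo pair with the lowest verify rate
# among pairs with at least $2 attempts (hypothetical default: 2).
# $1 = metrics file
suggest_improvement() {
  awk -F'\t' -v min="${2:-2}" '
    { key = $1 "\t" $2; total[key]++; if ($4 == "verified") ok[key]++ }
    END {
      for (k in total)
        if (total[k] >= min)
          printf "%d\t%s\t%d\n", (100 * ok[k]) / total[k], k, total[k]
    }
  ' "$1" | sort -t $'\t' -k1,1n -k2,2 -k3,3 |
  awk -F'\t' 'NR == 1 {
    printf "%s verified %d%% of %d attempts on %s. Consider removing %s from the repo list for %s.\n", $2, $1, $4, $3, $3, $2
  }'
}

# When called with a metrics file, print the three breakdowns and the suggestion.
if [ "$#" -ge 1 ]; then
  echo "Verify rate by agent:";      rate_by_column 1 "$1"
  echo "Verify rate by repo:";       rate_by_column 2 "$1"
  echo "Verify rate by issue type:"; rate_by_column 3 "$1"
  echo "Suggestion:";                suggest_improvement "$1"
fi
```

Picking the single worst agent/repo pair keeps the output to one suggestion, as the issue asks; the minimum-attempts filter is there so that a pair with a single failed attempt does not dominate the retro. Detecting patterns such as "requires VPS access" would need more than counting and is not covered by this sketch.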

Acceptance Criteria

- [ ] kaizen-retro.sh runs daily after morning report
- [ ] Analyzes: success rates by agent, by repo, by issue type
- [ ] Identifies max-attempts issues and their patterns
- [ ] Generates at least one concrete improvement suggestion
- [ ] Posts retro to Telegram
- [ ] Over time: the system measurably improves (higher verify rate week over week)
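
The last criterion is only checkable if "verify rate week over week" is computed the same way each time. One possible definition, sketched below, compares the pooled verify rate of the most recent 7 days with that of the 7 days before. The daily summary format (date, verified count, total count, one line per day in chronological order) and the function name `week_over_week` are assumptions, not something the issue specifies.

```shell
#!/usr/bin/env bash
# Sketch only. ASSUMED input: one line per day, oldest first:
#   YYYY-MM-DD<TAB>verified<TAB>total
set -euo pipefail

# Compare the pooled verify rate of the last 7 lines with the 7 lines before them.
# $1 = daily summary file
week_over_week() {
  awk -F'\t' '
    { v[NR] = $2; t[NR] = $3 }
    END {
      if (NR < 14) { print "not enough data"; exit }
      for (i = NR - 13; i <= NR - 7; i++) { pv += v[i]; pt += t[i] }  # previous week
      for (i = NR - 6;  i <= NR;     i++) { cv += v[i]; ct += t[i] }  # current week
      if (pt == 0 || ct == 0) { print "not enough data"; exit }
      prev = 100 * pv / pt
      cur  = 100 * cv / ct
      status = (cur > prev) ? "improved" : "not improved"
      printf "previous=%.1f%% current=%.1f%% %s\n", prev, cur, status
    }
  ' "$1"
}

# When called with a summary file, print the comparison.
if [ "$#" -ge 1 ]; then
  week_over_week "$1"
fi
```

Pooling the counts (sum of verified over sum of total) rather than averaging daily percentages keeps a quiet day with one or two issues from swinging the weekly figure.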

Timmy self-assigned this 2026-04-07 14:30:12 +00:00

ezra referenced this issue from a commit

2026-04-07 15:27:23 +00:00

[KAIZEN] Implement automated burn-cycle retrospective (fixes #349)

ezra referenced a pull request that will close this issue

2026-04-07 15:27:49 +00:00

[KAIZEN] Automated retrospective after every burn cycle (fixes #349) #352

ezra referenced this issue from a commit

2026-04-07 15:54:24 +00:00

[KAIZEN] Harden retro scheduling, chunking, and tests (#349)

ezra referenced this issue from a commit

2026-04-07 15:59:09 +00:00