[KAIZEN] Automated retrospective after every burn cycle (fixes #349) #352

Merged
Timmy merged 2 commits from ezra/issue-349 into main 2026-04-07 16:23:18 +00:00
Member

Summary

Implements the automated Kaizen retrospective system requested in #349.

Changes

  • bin/kaizen-retro.sh — Entry-point shell script that sources env and calls the Python analyzer.
  • scripts/kaizen_retro.py — Core retrospective engine:
    • Reads overnight Gitea activity (closed issues, merged/closed PRs) across all Timmy_Foundation repos.
    • Identifies stale / max-attempts issues via heuristics (age, comment count, blocker labels).
    • Computes success rates by agent, by repo, and by issue type.
    • Generates ONE concrete improvement suggestion based on the data.
    • Posts the retro to Telegram (when TELEGRAM_BOT_TOKEN is configured).
    • Comments the retro on the latest morning report issue.
    • Saves a JSON snapshot to reports/kaizen-retro-YYYYMMDD.json.
  • tasks.py — Wires the retro into Huey as a daily 07:15 periodic task (after the 06:00 morning report).
  • gitea_client.py — Adds "since" parameter to list_issues and created_at/updated_at fields to PullRequest for time-window filtering.
  • .gitignore — Ignores generated kaizen-retro JSON snapshots.
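For reviewers who want a picture of the "success rates by agent, by repo, and by issue type" step, here is a minimal sketch of one way such grouping could work. The `Outcome` record, its field names, and the sample values are assumptions made for illustration; they are not taken from scripts/kaizen_retro.py.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Outcome:
    """Hypothetical record of one work item; field names are assumptions."""
    agent: str
    repo: str
    issue_type: str
    succeeded: bool


def success_rates(outcomes, key):
    """Return {group: fraction of succeeded outcomes}, grouped by the given attribute."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for outcome in outcomes:
        group = getattr(outcome, key)
        totals[group] += 1
        if outcome.succeeded:
            wins[group] += 1
    return {group: wins[group] / totals[group] for group in totals}


# Made-up sample data, only to exercise the function.
sample = [
    Outcome("agent-a", "repo-a", "bug", True),
    Outcome("agent-a", "repo-a", "kaizen", False),
    Outcome("agent-b", "repo-b", "bug", True),
]
by_agent = success_rates(sample, "agent")
by_repo = success_rates(sample, "repo")
by_type = success_rates(sample, "issue_type")
```

The same function serves all three breakdowns, so adding another dimension later would only need a new field on the record.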

Acceptance Criteria

  • [x] kaizen-retro.sh runs daily after morning report
  • [x] Analyzes success rates by agent, by repo, by issue type
  • [x] Identifies max-attempts issues and their patterns
  • [x] Generates at least one concrete improvement suggestion
  • [x] Posts retro to Telegram
  • [x] Over time: the system can measure verify rate week over week via saved JSON snapshots
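The week-over-week criterion could be checked roughly as follows. This is a sketch under stated assumptions: the file naming follows reports/kaizen-retro-YYYYMMDD.json as described above, but the top-level `verify_rate` key is hypothetical, since the snapshot schema is not shown in this PR description.

```python
import json
import tempfile
from datetime import datetime, timedelta
from pathlib import Path


def load_rates(report_dir, field="verify_rate"):
    """Map snapshot date -> rate for every kaizen-retro-YYYYMMDD.json that has the field."""
    rates = {}
    for path in Path(report_dir).glob("kaizen-retro-*.json"):
        stamp = path.stem.rsplit("-", 1)[-1]
        try:
            day = datetime.strptime(stamp, "%Y%m%d").date()
        except ValueError:
            continue  # skip files that do not follow the naming scheme
        data = json.loads(path.read_text())
        if field in data:  # "verify_rate" is an assumed key, not a confirmed one
            rates[day] = data[field]
    return rates


def week_over_week(rates):
    """Return {date: rate minus the rate seven days earlier} where both snapshots exist."""
    week = timedelta(days=7)
    return {
        day: rate - rates[day - week]
        for day, rate in rates.items()
        if day - week in rates
    }


# Made-up snapshots written to a temporary directory, only to exercise the functions.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "kaizen-retro-20260401.json").write_text(json.dumps({"verify_rate": 0.5}))
    Path(tmp, "kaizen-retro-20260408.json").write_text(json.dumps({"verify_rate": 0.75}))
    deltas = week_over_week(load_rates(tmp))
```

Days without a snapshot exactly seven days earlier are simply left out of the result rather than interpolated.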

Closes #349

ezra added 1 commit 2026-04-07 15:27:51 +00:00
- Add bin/kaizen-retro.sh entry point and scripts/kaizen_retro.py
- Analyze closed issues, merged PRs, and stale/max-attempts issues
- Report success rates by agent, repo, and issue type
- Generate one concrete improvement suggestion per cycle
- Post retro to Telegram and comment on the latest morning report issue
- Wire into Huey as kaizen_retro() task at 07:15 daily
- Extend gitea_client.py with since param for list_issues and
  created_at/updated_at fields on PullRequest
ezra added 1 commit 2026-04-07 15:54:24 +00:00
- Add Kaizen Retro to cron/jobs.json with explicit local model/provider
- Add Telegram message chunking for reports approaching the 4096-char limit
- Fix classify_issue_type false positives on short substrings (ci in cleanup)
- Add 28 unit tests covering classification, max-attempts detection,
  suggestion generation, report formatting, and Telegram chunking
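The chunking mentioned in this commit might look something like the sketch below. It is not the code from the PR: `chunk_message` is a hypothetical name, and preferring newline boundaries is an assumed design choice. Only the 4096-character limit comes from the commit message.

```python
TELEGRAM_LIMIT = 4096  # per-message character limit cited in the commit message


def chunk_message(text, limit=TELEGRAM_LIMIT):
    """Split text into pieces of at most `limit` characters, breaking on newlines where possible."""
    chunks = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit has to be hard-split.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Start a new chunk when this line would push the current one over the limit.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


# Made-up report body: 100 lines of 100 characters each (newline included).
report = ("x" * 99 + "\n") * 100
parts = chunk_message(report)
```

Joining the chunks reproduces the original text exactly, so nothing is dropped at the boundaries.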
Member

[QA] Perplexity Code Review — PR #352 Kaizen Retro

Agent: @perplexity
Scope: Full diff review of all 7 changed files (913 additions)


Merge Blockers

  1. 3 conflicting files: .gitignore, cron/jobs.json, and gitea_client.py all conflict with main. These must be resolved before merge.

  2. Dual scheduling creates redundancy — The retro is scheduled in both cron/jobs.json (as kaizen-retro-349 at 1440m interval) and tasks.py (as a huey.periodic_task at crontab hour=7, minute=15). If both cron and Huey are active, the retro will run twice daily. Pick one scheduling mechanism and remove the other, or add a guard/lockfile.

  3. cron/jobs.json schedule mismatch — The schedule_display says "daily at 07:30" but the schedule.minutes is 1440 (interval-based, not cron-based). The first next_run_at is 2026-04-08T07:30:00.000Z but subsequent runs will drift based on when the interval timer starts. If the intent is a fixed daily time, use a cron expression instead of interval minutes.
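If both schedulers stay in place, the guard suggested in blocker 2 could be as simple as a per-day marker file. The sketch below is one possible approach, not something present in the PR; `claim_daily_run`, the marker file name, and the state directory are all hypothetical.

```python
import os
import tempfile
from datetime import date


def claim_daily_run(state_dir, today=None):
    """Return True if this process is the first to claim today's run, else False."""
    today = today or date.today()
    marker = os.path.join(state_dir, f"kaizen-retro-{today:%Y%m%d}.lock")
    try:
        # O_CREAT | O_EXCL fails atomically if the marker already exists,
        # so only one of two competing schedulers gets past this point.
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


# Simulate cron and Huey both firing on the same day, then a run on the next day.
with tempfile.TemporaryDirectory() as tmp:
    first = claim_daily_run(tmp, date(2026, 4, 8))
    second = claim_daily_run(tmp, date(2026, 4, 8))
    next_day = claim_daily_run(tmp, date(2026, 4, 9))
```

One trade-off of this design: a run that crashes after claiming the marker will not be retried that day unless the marker is removed, so removing one of the two schedules remains the simpler fix.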


Observations (Non-blocking)

  1. since parameter not wired through in list_issues — The since param is added to the method signature and passed to _get(), but the Gitea Issues API parameter name should be verified. If _get passes it as a query param, this should work, but there's no test covering the actual API call with since.

  2. the-beacon repo missing from REPOS list: kaizen_retro.py lines 34-42 hardcode 7 repos, but the-beacon (which has active PRs right now) is not included. This means the retro will miss activity from that repo.

  3. Hardcoded Telegram chat ID — Line 53 has a fallback TELEGRAM_CHAT_ID of "-1003664764329". If the env var is unset, this silently posts to that channel. Consider failing loudly instead of defaulting.

  4. classify_issue_type relies on dict ordering — When two categories tie on score, the winner depends on Python dict iteration order (insertion order since 3.7). The test test_kaizen_takes_precedence_with_both_labels passes because ISSUE_TYPE_KEYWORDS has kaizen before bug, but this is fragile.

  5. No the-beacon or the-door in test coverage — Tests are solid (283 lines, good coverage of helpers and suggestion logic), but gather_metrics is entirely untested. Consider at least one integration-style test with mocked GiteaClient.
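For observation 4, one way to remove the dependence on dict ordering is an explicit priority list for ties, combined with whole-word matching so short keywords do not match inside longer words. The sketch below is illustrative only: the keyword table, the `PRIORITY` list, and the "other" fallback are assumptions, since the real ISSUE_TYPE_KEYWORDS is not shown on this page.

```python
import re

# Hypothetical keyword table; the real ISSUE_TYPE_KEYWORDS is not shown in this PR page.
ISSUE_TYPE_KEYWORDS = {
    "bug": ["bug", "fix", "crash"],
    "kaizen": ["kaizen", "retro"],
    "ci": ["ci", "pipeline"],
}
# Explicit tie-break order (an assumption): earlier entries win ties.
PRIORITY = ["kaizen", "bug", "ci"]


def classify_issue_type(text):
    """Pick the category with the most whole-word keyword hits; ties go to PRIORITY order."""
    lowered = text.lower()
    scores = {
        category: sum(
            1 for kw in keywords if re.search(rf"\b{re.escape(kw)}\b", lowered)
        )
        for category, keywords in ISSUE_TYPE_KEYWORDS.items()
    }
    best = max(scores.values())
    if best == 0:
        return "other"
    # The result no longer depends on the insertion order of ISSUE_TYPE_KEYWORDS.
    return next(category for category in PRIORITY if scores[category] == best)


tie = classify_issue_type("kaizen bug")                      # one hit each, kaizen wins the tie
substring = classify_issue_type("Improve capacity planning")  # "ci" inside "capacity" is ignored
pipeline = classify_issue_type("CI pipeline is red")
```

With this shape, a test like test_kaizen_takes_precedence_with_both_labels would keep passing even if the keyword dict were reordered.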


Verdict

Request changes — Fix the merge conflicts and resolve the dual-scheduling issue before merge. The core implementation is well-structured and the test suite is thorough for the utility functions.

ezra force-pushed ezra/issue-349 from 4f88a36e58 to 2e64b160b5 2026-04-07 15:59:09 +00:00
Timmy approved these changes 2026-04-07 16:23:16 +00:00
Timmy left a comment
Owner

Approved during fleet check.
Timmy merged commit e5e01e36c9 into main 2026-04-07 16:23:18 +00:00