From ad01b4ca78c0648c081d2630cb1ae85e334eac01 Mon Sep 17 00:00:00 2001
From: Alexander Whitestone
Date: Tue, 21 Apr 2026 14:14:31 -0400
Subject: [PATCH] docs: Allegro burn-mode validator rules (#242)

Defines tangible artifact criteria, stop compliance checks, lane boundaries, proof structure, and pass/fail validation examples. Any agent can read this doc and evaluate whether a burn cycle was productive. References cycle_guard.py for implementation details.

Closes #242
---
 allegro/burn-mode-validator.md | 284 +++++++++++++++++++++++++++++++++
 1 file changed, 284 insertions(+)
 create mode 100644 allegro/burn-mode-validator.md

diff --git a/allegro/burn-mode-validator.md b/allegro/burn-mode-validator.md
new file mode 100644
index 00000000..ccee4d82
--- /dev/null
+++ b/allegro/burn-mode-validator.md
@@ -0,0 +1,284 @@
+# Allegro Burn-Mode Validator Rules
+
+**Epic:** #842 M7 — Autonomous Burn-Mode Hardening
+**Issue:** #242
+**Source:** `allegro/cycle_guard.py`
+
+Any agent can read this doc and evaluate whether a burn cycle was productive.
+
+---
+
+## Cycle Lifecycle
+
+```
+start_cycle(target)
+  → start_slice("clone") → end_slice(artifact="repo cloned")
+  → start_slice("implement") → end_slice(artifact="PR #42")
+  → commit_cycle(proof={...})
+```
+
+A cycle has three terminal states:
+- **complete** — `commit_cycle()` called with proof
+- **aborted** — `abort_cycle(reason)` called, reason recorded
+- **stale** — crashed or hung; auto-aborted by `resume_or_abort()` after 30 min
+
+---
+
+## 1. Tangible Artifact Criteria
+
+A cycle is **productive** if at least one slice produces a tangible artifact.
+
+### What Counts as Tangible
+
+| Artifact Type | Example | Valid |
+|---|---|---|
+| Git commit | `abc1234 — fix: resolve import collision` | ✅ |
+| Pull request | `PR #42: https://forge.../pulls/42` | ✅ |
+| Issue closure | `Closed #17 with comment explaining resolution` | ✅ |
+| Test file | `tests/test_new_feature.py — 5 passing` | ✅ |
+| Report | `reports/2026-04-15-audit.md` committed to repo | ✅ |
+| Config change | Modified `config.yaml`, pushed to branch | ✅ |
+| Documentation | New or updated `.md` file committed | ✅ |
+| Skill created | `skill_manage(create)` with SKILL.md | ✅ |
+| Memory updated | Facts saved via `memory()` tool | ✅ |
+
+### What Does NOT Count
+
+| Non-Artifact | Why | Invalid |
+|---|---|---|
+| Log output only | Ephemeral, not durable | ❌ |
+| "I analyzed the code" | No file touched, no commit made | ❌ |
+| Conversation summary | Not a repo artifact | ❌ |
+| Plan without execution | Intent without delivery | ❌ |
+| Duplicate of existing work | No new value produced | ❌ |
+| Deleted work with no record | Net-zero artifact | ❌ |
+
+### Rule
+
+> **Every cycle must produce at least one tangible artifact or a documented abort reason.**
+> A cycle that ends with `status: complete` but `proof: null` and zero commits is **invalid**.
+
+---
+
+## 2. Stop Compliance Checks
+
+The cycle guard enforces time discipline via `cycle_guard.py`.
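
As a rough illustration of the kind of time check involved, the sketch below compares a start timestamp against a limit. It is not taken from `cycle_guard.py`: the helper names `minutes_elapsed` and `is_over_limit` are hypothetical, and the 10- and 30-minute constants are assumed to correspond to the slice timeout and stale-cycle rules in this document.

```python
from datetime import datetime, timezone

SLICE_MAX_MINUTES = 10  # assumed default slice limit
STALE_MAX_MINUTES = 30  # assumed stale-cycle threshold


def minutes_elapsed(started_at: str, now: datetime) -> float:
    """Minutes between an ISO-8601 start timestamp and `now` (both timezone-aware)."""
    started = datetime.fromisoformat(started_at)
    return (now - started).total_seconds() / 60.0


def is_over_limit(started_at: str, now: datetime, max_minutes: float) -> bool:
    """True when more than `max_minutes` have passed since `started_at`."""
    return minutes_elapsed(started_at, now) > max_minutes


if __name__ == "__main__":
    # Hypothetical usage: check a slice that started at 14:00 UTC.
    start = "2026-04-21T14:00:00+00:00"
    check_time = datetime(2026, 4, 21, 14, 12, tzinfo=timezone.utc)
    print(is_over_limit(start, check_time, SLICE_MAX_MINUTES))
    print(is_over_limit(start, check_time, STALE_MAX_MINUTES))
```

For a slice started at 14:00 UTC and checked at 14:12, this sketch would flag the slice limit but not the stale threshold.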
+
+### Slice Timeout (10-minute rule)
+
+- Each slice has a 10-minute default max (`check_slice_timeout(max_minutes=10)`)
+- If a slice exceeds 10 minutes, the agent should either:
+  - End it with a partial artifact and start a new slice
+  - Abort with reason: `"slice timeout — {description}"`
+
+### Crash Recovery (30-minute rule)
+
+- `resume_or_abort()` auto-aborts cycles stuck for >30 minutes
+- If `cycle_guard.py resume` returns `aborted`, the agent must not continue the old cycle
+- Start a fresh cycle instead
+
+### Stop Signals
+
+An agent MUST stop and abort when:
+
+| Signal | Action |
+|---|---|
+| `check_slice_timeout` returns `True` | End slice, start new or abort |
+| `resume_or_abort` returns `aborted` | Do not resume; start fresh |
+| Issue already closed/implemented | Abort: `"already resolved"` |
+| Authentication failure | Abort: `"auth failure — {detail}"` |
+| Clone timeout > 120s | Abort: `"clone timeout"` |
+| Unresolvable merge conflict | Abort: `"merge conflict — manual intervention needed"` |
+| Human says stop | Abort: `"human override"` |
+
+### Lane Boundary Checks
+
+Agents must stay in their lane:
+
+| Agent | Lane | Out-of-Bounds |
+|---|---|---|
+| Allegro | Dispatch, reporting, infra, docs | Direct model training, UI changes |
+| Claude | Architecture, complex bugs | Simple config edits (use Gemini) |
+| Gemini | Issue burn, simple fixes | Architecture decisions |
+| Codex | Code generation, test writing | Operational dispatch |
+| Ezra | Analysis, pattern recognition | Implementation |
+
+If an agent detects it's working out-of-lane:
+```
+abort_cycle("out-of-lane — {what} should be done by {correct_agent}")
+```
+
+---
+
+## 3. Proof Structure
+
+The `proof` field in `commit_cycle()` should contain:
+
+```json
+{
+  "artifacts": [
+    {
+      "type": "pr",
+      "number": 42,
+      "url": "https://forge.../pulls/42",
+      "repo": "Timmy_Foundation/timmy-config"
+    }
+  ],
+  "commits": ["abc1234", "def5678"],
+  "files_changed": ["allegro/burn-mode-validator.md"],
+  "summary": "Documented burn-loop validator rules per #242"
+}
+```
+
+Minimal valid proof (single commit, no PR):
+```json
+{
+  "commits": ["abc1234"],
+  "summary": "Fixed import collision in scripts/ci_automation_gate.py"
+}
+```
+
+Invalid proof (empty):
+```json
+null
+```
+A cycle with `proof: null` and no commits is **invalid** — it should have been aborted.
+
+---
+
+## 4. Validation Examples
+
+### PASS: Productive Cycle
+
+```json
+{
+  "cycle_id": "2026-04-21T14:00:00+00:00",
+  "status": "complete",
+  "target": "timmy-config issue #242",
+  "slices": [
+    {"name": "clone", "status": "complete", "artifact": "repo cloned"},
+    {"name": "implement", "status": "complete", "artifact": "PR #301"},
+    {"name": "verify", "status": "complete", "artifact": "tests passing"}
+  ],
+  "proof": {"commits": ["abc1234"], "summary": "Documented validator rules"}
+}
+```
+**Verdict: PASS** — Three slices, each with artifact, proof present, committed.
+
+### PASS: Productive Abort
+
+```json
+{
+  "cycle_id": "2026-04-21T14:00:00+00:00",
+  "status": "aborted",
+  "target": "hermes-agent issue #999",
+  "slices": [
+    {"name": "clone", "status": "complete", "artifact": "repo cloned"},
+    {"name": "investigate", "status": "aborted", "artifact": null}
+  ],
+  "abort_reason": "already resolved — PR #888 merged this fix yesterday"
+}
+```
+**Verdict: PASS** — Legitimate abort with clear reason. Investigation was productive (discovered duplicate).
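
The proof rules and the example verdicts in this section can be approximated by a small checker. This is a minimal sketch, not part of `cycle_guard.py`: the function names `has_durable_proof` and `validate_cycle` are hypothetical, and it assumes that auto-aborted stale cycles carry an `abort_reason` beginning with `crash recovery`, as in the stale-cycle example in this section.

```python
from typing import Optional


def has_durable_proof(proof: Optional[dict]) -> bool:
    """A proof counts only if it lists at least one commit or artifact."""
    if not proof:
        return False
    return bool(proof.get("commits")) or bool(proof.get("artifacts"))


def validate_cycle(record: dict) -> str:
    """Return "PASS" or "FAIL" for a cycle record shaped like the examples here."""
    status = record.get("status")
    if status == "complete":
        # A summary alone is not durable; commits or artifacts are required.
        return "PASS" if has_durable_proof(record.get("proof")) else "FAIL"
    if status == "aborted":
        reason = (record.get("abort_reason") or "").strip()
        # Assumption: stale auto-aborts are marked with a "crash recovery" reason.
        if not reason or reason.startswith("crash recovery"):
            return "FAIL"
        return "PASS"
    # Any other status (still running, unknown) is not a finished cycle.
    return "FAIL"


if __name__ == "__main__":
    # Hypothetical record mirroring the minimal valid proof.
    record = {"status": "complete", "proof": {"commits": ["abc1234"], "summary": "Done"}}
    print(validate_cycle(record))
```

The sketch deliberately ignores slice-level `artifact` strings, since a value such as "analysis output to stdout" is present but not durable; judging tangibility from free text would need rules this document does not define.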
+
+### FAIL: Empty Cycle
+
+```json
+{
+  "cycle_id": "2026-04-21T14:00:00+00:00",
+  "status": "complete",
+  "target": "timmy-config issue #123",
+  "slices": [],
+  "proof": null
+}
+```
+**Verdict: FAIL** — No slices, no artifacts, no proof. Agent started but produced nothing.
+
+### FAIL: Log-Only Cycle
+
+```json
+{
+  "cycle_id": "2026-04-21T14:00:00+00:00",
+  "status": "complete",
+  "target": "timmy-config issue #456",
+  "slices": [
+    {"name": "analyze", "status": "complete", "artifact": "analysis output to stdout"}
+  ],
+  "proof": {"summary": "Analyzed the codebase and identified 3 patterns"}
+}
+```
+**Verdict: FAIL** — Analysis produced no durable artifact. Should have written findings to a file or created an issue.
+
+### FAIL: Stale Cycle (Auto-Aborted)
+
+```json
+{
+  "cycle_id": "2026-04-21T14:00:00+00:00",
+  "status": "aborted",
+  "target": "timmy-config issue #789",
+  "slices": [
+    {"name": "clone", "status": "in_progress", "artifact": null}
+  ],
+  "abort_reason": "crash recovery — stale cycle detected (45m old)"
+}
+```
+**Verdict: FAIL** — Clone hung or agent crashed. `resume_or_abort()` caught it. Start fresh.
+
+---
+
+## 5. Integration Points
+
+### In Burn-Loop Prompts
+
+Add to dispatch prompts:
+```
+Before starting: python3 allegro/cycle_guard.py resume
+After each slice: python3 allegro/cycle_guard.py end --artifact "description"
+After all work: python3 allegro/cycle_guard.py commit --proof '{"commits":["..."],"summary":"..."}'
+If stuck: python3 allegro/cycle_guard.py abort "reason"
+```
+
+### In Reporting
+
+Morning reports should include cycle validation:
+```
+Cycles last night: 12
+- Complete with proof: 9
+- Productive aborts: 2
+- Failed (empty/stale): 1 ← RCA filed
+```
+
+### In Metrics
+
+Track as fleet health metric:
+- **Cycle completion rate:** complete / total cycles
+- **Artifact density:** artifacts per cycle (target: ≥1)
+- **Abort quality:** % of aborts with descriptive reasons
+- **Stale detection rate:** auto-aborted / total cycles (target: <5%)
+
+---
+
+## Quick Reference
+
+```
+# Start
+python3 allegro/cycle_guard.py start "timmy-config #242"
+
+# Work slices
+python3 allegro/cycle_guard.py slice "clone"
+# ... do work ...
+python3 allegro/cycle_guard.py end --artifact "repo cloned"
+
+python3 allegro/cycle_guard.py slice "implement"
+# ... do work ...
+python3 allegro/cycle_guard.py end --artifact "PR #301"
+
+# Finish
+python3 allegro/cycle_guard.py commit --proof '{"commits":["abc1234"],"summary":"Done"}'
+
+# Or abort
+python3 allegro/cycle_guard.py abort "already resolved"
+
+# Recovery check
+python3 allegro/cycle_guard.py resume
+```
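
The four fleet health metrics listed under "In Metrics" could be computed from a batch of cycle records roughly as follows. This is a sketch under stated assumptions, not existing tooling: `fleet_metrics` and `_ratio` are hypothetical names, records are assumed to follow the JSON shape of the validation examples, a "descriptive" abort is taken to mean any non-empty `abort_reason`, and stale cycles are assumed to be identifiable by a reason starting with `crash recovery`.

```python
from typing import Dict, List


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def fleet_metrics(cycles: List[dict]) -> Dict[str, float]:
    """Compute the four fleet health ratios over a list of cycle records."""
    total = len(cycles)
    complete = [c for c in cycles if c.get("status") == "complete"]
    aborted = [c for c in cycles if c.get("status") == "aborted"]
    # Assumption: a non-empty reason counts as descriptive.
    described = [c for c in aborted if (c.get("abort_reason") or "").strip()]
    # Assumption: stale auto-aborts carry a "crash recovery" reason.
    stale = [
        c for c in aborted
        if (c.get("abort_reason") or "").startswith("crash recovery")
    ]
    # Counts every non-empty slice artifact; it does not judge tangibility.
    artifacts = sum(
        1 for c in cycles for s in c.get("slices", []) if s.get("artifact")
    )
    return {
        "completion_rate": _ratio(len(complete), total),
        "artifact_density": _ratio(artifacts, total),
        "abort_quality": _ratio(len(described), len(aborted)),
        "stale_rate": _ratio(len(stale), total),
    }


if __name__ == "__main__":
    # Hypothetical batch of two cycles.
    sample = [
        {"status": "complete", "slices": [{"artifact": "PR #301"}]},
        {"status": "aborted", "slices": [], "abort_reason": "already resolved"},
    ]
    print(fleet_metrics(sample))
```

Note that with no aborted cycles in the batch, `abort_quality` comes out as 0.0 in this sketch; a report consuming it may prefer to show that case as "n/a".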