Merge pull request 'docs: Allegro burn-mode validator rules (#242)' (#843) from sprint/issue-242 into main
Some checks failed
Validate Config / YAML Lint (push) Failing after 16s
Smoke Test / smoke (push) Failing after 22s
Architecture Lint / Linter Tests (push) Successful in 27s
Validate Config / JSON Validate (push) Successful in 18s
Validate Config / Cron Syntax Check (push) Successful in 13s
Validate Config / Deploy Script Dry Run (push) Successful in 13s
Validate Config / Python Syntax & Import Check (push) Failing after 54s
Validate Config / Python Test Suite (push) Has been skipped
Validate Config / Playbook Schema Validation (push) Successful in 24s
Validate Config / Shell Script Lint (push) Failing after 1m0s
Architecture Lint / Lint Repository (push) Failing after 15s
This commit was merged in pull request #843.
284
allegro/burn-mode-validator.md
Normal file
@@ -0,0 +1,284 @@
# Allegro Burn-Mode Validator Rules

**Epic:** #842 M7 — Autonomous Burn-Mode Hardening
**Issue:** #242
**Source:** `allegro/cycle_guard.py`

Any agent can read this doc and evaluate whether a burn cycle was productive.

---

## Cycle Lifecycle

```
start_cycle(target)
  → start_slice("clone") → end_slice(artifact="repo cloned")
  → start_slice("implement") → end_slice(artifact="PR #42")
  → commit_cycle(proof={...})
```

A cycle has three terminal states:
- **complete** — `commit_cycle()` called with proof
- **aborted** — `abort_cycle(reason)` called, reason recorded
- **stale** — crashed or hung; auto-aborted by `resume_or_abort()` after 30 min

---

## 1. Tangible Artifact Criteria

A cycle is **productive** if at least one slice produces a tangible artifact.

### What Counts as Tangible

| Artifact Type | Example | Valid |
|---|---|---|
| Git commit | `abc1234 — fix: resolve import collision` | ✅ |
| Pull request | `PR #42: https://forge.../pulls/42` | ✅ |
| Issue closure | `Closed #17 with comment explaining resolution` | ✅ |
| Test file | `tests/test_new_feature.py — 5 passing` | ✅ |
| Report | `reports/2026-04-15-audit.md` committed to repo | ✅ |
| Config change | Modified `config.yaml`, pushed to branch | ✅ |
| Documentation | New or updated `.md` file committed | ✅ |
| Skill created | `skill_manage(create)` with SKILL.md | ✅ |
| Memory updated | Facts saved via `memory()` tool | ✅ |

### What Does NOT Count

| Non-Artifact | Why | Invalid |
|---|---|---|
| Log output only | Ephemeral, not durable | ❌ |
| "I analyzed the code" | No file touched, no commit made | ❌ |
| Conversation summary | Not a repo artifact | ❌ |
| Plan without execution | Intent without delivery | ❌ |
| Duplicate of existing work | No new value produced | ❌ |
| Deleted work with no record | Net-zero artifact | ❌ |

### Rule

> **Every cycle must produce at least one tangible artifact or a documented abort reason.**
> A cycle that ends with `status: complete` but `proof: null` and zero commits is **invalid**.

---

## 2. Stop Compliance Checks

The cycle guard enforces time discipline via `cycle_guard.py`.

### Slice Timeout (10-minute rule)

- Each slice has a 10-minute default max (`check_slice_timeout(max_minutes=10)`)
- If a slice exceeds 10 minutes, the agent should either:
  - End it with a partial artifact and start a new slice
  - Abort with reason: `"slice timeout — {description}"`
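
To make the 10-minute rule concrete, here is a minimal sketch of how such a check could work. The names `slice_timed_out` and `timeout_abort_reason` are hypothetical and are not taken from `cycle_guard.py`; the real `check_slice_timeout` may track start times and report results differently.

```python
from datetime import datetime, timedelta, timezone


def slice_timed_out(started_at: datetime, now: datetime, max_minutes: int = 10) -> bool:
    """Return True when a slice has run longer than max_minutes."""
    return now - started_at > timedelta(minutes=max_minutes)


def timeout_abort_reason(description: str) -> str:
    """Build an abort reason in the format the rule above prescribes."""
    # \u2014 is the dash character used in the doc's abort-reason template.
    return f"slice timeout \u2014 {description}"


start = datetime(2026, 4, 21, 14, 0, tzinfo=timezone.utc)
print(slice_timed_out(start, start + timedelta(minutes=9)))   # False
print(slice_timed_out(start, start + timedelta(minutes=11)))  # True
```

The comparison is strict, so a slice that has run for exactly 10 minutes is not yet treated as timed out; that boundary choice is an assumption of this sketch.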

### Crash Recovery (30-minute rule)

- `resume_or_abort()` auto-aborts cycles stuck for >30 minutes
- If `cycle_guard.py resume` returns `aborted`, the agent must not continue the old cycle
  - Start a fresh cycle instead
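
The recovery decision can be sketched as a small function over a saved cycle state. This is only an illustration: the state fields `status` and `last_updated`, the `in_progress` cycle status, and the return values `"none"` and `"resumed"` are assumptions. Only the `aborted` outcome and the 30-minute threshold come from the rules above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STALE_AFTER = timedelta(minutes=30)  # the 30-minute rule


def resume_decision(state: Optional[dict], now: datetime) -> str:
    """Decide what to do with a saved cycle state (hypothetical schema)."""
    if state is None or state.get("status") != "in_progress":
        return "none"     # nothing to resume
    last_seen = datetime.fromisoformat(state["last_updated"])
    if now - last_seen > STALE_AFTER:
        return "aborted"  # stale: do not continue, start a fresh cycle
    return "resumed"      # recent enough to continue


now = datetime(2026, 4, 21, 14, 45, tzinfo=timezone.utc)
state = {"status": "in_progress", "last_updated": "2026-04-21T14:00:00+00:00"}
print(resume_decision(state, now))  # aborted
```

A caller that sees `aborted` would then start a new cycle rather than reuse the old state.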

### Stop Signals

An agent MUST stop and take the listed action when:

| Signal | Action |
|---|---|
| `check_slice_timeout` returns `True` | End slice, start new or abort |
| `resume_or_abort` returns `aborted` | Do not resume; start fresh |
| Issue already closed/implemented | Abort: `"already resolved"` |
| Authentication failure | Abort: `"auth failure — {detail}"` |
| Clone timeout > 120s | Abort: `"clone timeout"` |
| Unresolvable merge conflict | Abort: `"merge conflict — manual intervention needed"` |
| Human says stop | Abort: `"human override"` |

### Lane Boundary Checks

Agents must stay in their lane:

| Agent | Lane | Out-of-Bounds |
|---|---|---|
| Allegro | Dispatch, reporting, infra, docs | Direct model training, UI changes |
| Claude | Architecture, complex bugs | Simple config edits (use Gemini) |
| Gemini | Issue burn, simple fixes | Architecture decisions |
| Codex | Code generation, test writing | Operational dispatch |
| Ezra | Analysis, pattern recognition | Implementation |

If an agent detects it's working out-of-lane:
```
abort_cycle("out-of-lane — {what} should be done by {correct_agent}")
```
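
One way an agent could detect out-of-lane work is a lookup against the lane table. The sketch below is illustrative only: the lowercase task tags, the `LANES` mapping, and the helper names are assumptions, and nothing here is part of `cycle_guard.py`. It returns the abort reason string rather than calling `abort_cycle` itself.

```python
from typing import Optional

# Lane column of the table above, as hypothetical lowercase task tags.
LANES = {
    "Allegro": {"dispatch", "reporting", "infra", "docs"},
    "Claude": {"architecture", "complex bugs"},
    "Gemini": {"issue burn", "simple fixes"},
    "Codex": {"code generation", "test writing"},
    "Ezra": {"analysis", "pattern recognition"},
}


def lane_owner(task_kind: str) -> Optional[str]:
    """Return the agent whose lane contains task_kind, if any."""
    for agent, kinds in LANES.items():
        if task_kind in kinds:
            return agent
    return None


def out_of_lane_reason(agent: str, task_kind: str, what: str) -> Optional[str]:
    """Return an abort reason if agent is out of lane, else None."""
    owner = lane_owner(task_kind)
    if owner is None or owner == agent:
        return None  # unknown task kinds are not flagged in this sketch
    # \u2014 is the dash character used in the doc's abort template.
    return f"out-of-lane \u2014 {what} should be done by {owner}"


print(out_of_lane_reason("Gemini", "architecture", "queue redesign"))
print(out_of_lane_reason("Allegro", "docs", "validator rules"))  # None
```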

---

## 3. Proof Structure

The `proof` field in `commit_cycle()` should contain:

```json
{
  "artifacts": [
    {
      "type": "pr",
      "number": 42,
      "url": "https://forge.../pulls/42",
      "repo": "Timmy_Foundation/timmy-config"
    }
  ],
  "commits": ["abc1234", "def5678"],
  "files_changed": ["allegro/burn-mode-validator.md"],
  "summary": "Documented burn-loop validator rules per #242"
}
```

Minimal valid proof (single commit, no PR):
```json
{
  "commits": ["abc1234"],
  "summary": "Fixed import collision in scripts/ci_automation_gate.py"
}
```

Invalid proof (empty):
```json
null
```
A cycle with `proof: null` and no commits is **invalid** — it should have been aborted.
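
A proof object shaped like the ones above could be checked with a few lines of Python. This is a sketch under stated assumptions: the function name `proof_problems` is hypothetical, and treating `summary` as required and "at least one commit or artifact" as the validity test is an interpretation of the examples, not a rule taken from `cycle_guard.py`.

```python
from typing import Any, List


def proof_problems(proof: Any) -> List[str]:
    """Return a list of problems with a proof object; empty means valid."""
    if proof is None:
        return ["proof is null"]
    if not isinstance(proof, dict):
        return ["proof is not a JSON object"]
    problems = []
    commits = proof.get("commits", [])
    artifacts = proof.get("artifacts", [])
    if not commits and not artifacts:
        problems.append("no commits or artifacts listed")
    if not all(isinstance(c, str) and c for c in commits):
        problems.append("commits must be non-empty strings")
    if not str(proof.get("summary", "")).strip():
        problems.append("summary is missing or empty")
    return problems


minimal = {"commits": ["abc1234"], "summary": "Fixed import collision"}
print(proof_problems(minimal))  # []
print(proof_problems(None))     # ['proof is null']
```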

---

## 4. Validation Examples

### PASS: Productive Cycle

```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "complete",
  "target": "timmy-config issue #242",
  "slices": [
    {"name": "clone", "status": "complete", "artifact": "repo cloned"},
    {"name": "implement", "status": "complete", "artifact": "PR #301"},
    {"name": "verify", "status": "complete", "artifact": "tests passing"}
  ],
  "proof": {"commits": ["abc1234"], "summary": "Documented validator rules"}
}
```
**Verdict: PASS** — Three slices, each with artifact, proof present, committed.

### PASS: Productive Abort

```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "aborted",
  "target": "hermes-agent issue #999",
  "slices": [
    {"name": "clone", "status": "complete", "artifact": "repo cloned"},
    {"name": "investigate", "status": "aborted", "artifact": null}
  ],
  "abort_reason": "already resolved — PR #888 merged this fix yesterday"
}
```
**Verdict: PASS** — Legitimate abort with clear reason. Investigation was productive (discovered duplicate).

### FAIL: Empty Cycle

```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "complete",
  "target": "timmy-config issue #123",
  "slices": [],
  "proof": null
}
```
**Verdict: FAIL** — No slices, no artifacts, no proof. Agent started but produced nothing.

### FAIL: Log-Only Cycle

```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "complete",
  "target": "timmy-config issue #456",
  "slices": [
    {"name": "analyze", "status": "complete", "artifact": "analysis output to stdout"}
  ],
  "proof": {"summary": "Analyzed the codebase and identified 3 patterns"}
}
```
**Verdict: FAIL** — Analysis produced no durable artifact. Should have written findings to a file or created an issue.

### FAIL: Stale Cycle (Auto-Aborted)

```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "aborted",
  "target": "timmy-config issue #789",
  "slices": [
    {"name": "clone", "status": "in_progress", "artifact": null}
  ],
  "abort_reason": "crash recovery — stale cycle detected (45m old)"
}
```
**Verdict: FAIL** — Clone hung or agent crashed. `resume_or_abort()` caught it. Start fresh.
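
The five verdicts above can be reproduced by a small rule-based check. The sketch below is one possible reading of the rules, not the validator shipped with `cycle_guard.py`: the function name `validate_cycle` is hypothetical, and recognizing stale cycles by an `abort_reason` that starts with `crash recovery` is an assumption based on the last example.

```python
from typing import Tuple


def validate_cycle(cycle: dict) -> Tuple[str, str]:
    """Return (verdict, reason) for a cycle record shaped like the examples."""
    status = cycle.get("status")
    if status == "complete":
        proof = cycle.get("proof") or {}
        if proof.get("commits") or proof.get("artifacts"):
            return "PASS", "complete with durable proof"
        return "FAIL", "complete but proof lists no commits or artifacts"
    if status == "aborted":
        reason = (cycle.get("abort_reason") or "").strip()
        if not reason:
            return "FAIL", "aborted without a documented reason"
        if reason.startswith("crash recovery"):
            return "FAIL", "stale cycle auto-aborted"
        return "PASS", "documented abort"
    return "FAIL", f"unexpected status: {status!r}"


empty = {"status": "complete", "slices": [], "proof": None}
print(validate_cycle(empty))
# ('FAIL', 'complete but proof lists no commits or artifacts')
```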

---

## 5. Integration Points

### In Burn-Loop Prompts

Add to dispatch prompts:
```
Before starting: python3 allegro/cycle_guard.py resume
After each slice: python3 allegro/cycle_guard.py end --artifact "description"
After all work: python3 allegro/cycle_guard.py commit --proof '{"commits":["..."],"summary":"..."}'
If stuck: python3 allegro/cycle_guard.py abort "reason"
```

### In Reporting

Morning reports should include cycle validation:
```
Cycles last night: 12
- Complete with proof: 9
- Productive aborts: 2
- Failed (empty/stale): 1 ← RCA filed
```

### In Metrics

Track as fleet health metrics:
- **Cycle completion rate:** complete / total cycles
- **Artifact density:** artifacts per cycle (target: ≥1)
- **Abort quality:** % of aborts with descriptive reasons
- **Stale detection rate:** auto-aborted / total cycles (target: <5%)
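
As a rough illustration, the four metrics could be computed from a list of cycle records like those in section 4. Several choices here are assumptions rather than documented behavior: artifacts are counted as slices with a non-null `artifact` field, an abort counts as "descriptive" when its reason is non-empty, and stale cycles are recognized by a reason starting with `crash recovery`. The name `fleet_metrics` is hypothetical.

```python
from typing import Dict, List


def fleet_metrics(cycles: List[dict]) -> Dict[str, float]:
    """Compute fleet health ratios from cycle records (hypothetical helper)."""
    total = len(cycles)
    if total == 0:
        return {"completion_rate": 0.0, "artifact_density": 0.0,
                "abort_quality": 0.0, "stale_rate": 0.0}
    complete = sum(1 for c in cycles if c.get("status") == "complete")
    aborts = [c for c in cycles if c.get("status") == "aborted"]
    reasons = [(c.get("abort_reason") or "").strip() for c in aborts]
    stale = sum(1 for r in reasons if r.startswith("crash recovery"))
    described = sum(1 for r in reasons if r)
    artifacts = sum(1 for c in cycles
                    for s in c.get("slices", []) if s.get("artifact"))
    return {
        "completion_rate": complete / total,
        "artifact_density": artifacts / total,
        # With no aborts at all, abort quality is reported as perfect.
        "abort_quality": described / len(aborts) if aborts else 1.0,
        "stale_rate": stale / total,
    }


sample = [
    {"status": "complete", "slices": [{"artifact": "repo cloned"}, {"artifact": "PR #301"}]},
    {"status": "aborted", "slices": [], "abort_reason": "already resolved"},
    {"status": "aborted", "slices": [], "abort_reason": "crash recovery (45m old)"},
    {"status": "complete", "slices": []},
]
print(fleet_metrics(sample))
```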

---

## Quick Reference

```
# Start
python3 allegro/cycle_guard.py start "timmy-config #242"

# Work slices
python3 allegro/cycle_guard.py slice "clone"
# ... do work ...
python3 allegro/cycle_guard.py end --artifact "repo cloned"

python3 allegro/cycle_guard.py slice "implement"
# ... do work ...
python3 allegro/cycle_guard.py end --artifact "PR #301"

# Finish
python3 allegro/cycle_guard.py commit --proof '{"commits":["abc1234"],"summary":"Done"}'

# Or abort
python3 allegro/cycle_guard.py abort "already resolved"

# Recovery check
python3 allegro/cycle_guard.py resume
```