timmy-config/allegro/burn-mode-validator.md
Alexander Whitestone ad01b4ca78
docs: Allegro burn-mode validator rules (#242)
Defines tangible artifact criteria, stop compliance checks, lane
boundaries, proof structure, and pass/fail validation examples.

Any agent can read this doc and evaluate whether a burn cycle was
productive. References cycle_guard.py for implementation details.

Closes #242
2026-04-21 14:14:31 -04:00


# Allegro Burn-Mode Validator Rules
**Epic:** #842 M7 — Autonomous Burn-Mode Hardening
**Issue:** #242
**Source:** `allegro/cycle_guard.py`

Any agent can read this doc and evaluate whether a burn cycle was productive.

---
## Cycle Lifecycle
```
start_cycle(target)
  → start_slice("clone") → end_slice(artifact="repo cloned")
  → start_slice("implement") → end_slice(artifact="PR #42")
  → commit_cycle(proof={...})
```
A cycle has three terminal states:
- **complete** — `commit_cycle()` called with proof
- **aborted** — `abort_cycle(reason)` called, reason recorded
- **stale** — crashed or hung; `resume_or_abort()` auto-aborts it after 30 min, so it is recorded with `status: aborted`
---
## 1. Tangible Artifact Criteria
A cycle is **productive** if at least one slice produces a tangible artifact.
### What Counts as Tangible
| Artifact Type | Example | Valid |
|---|---|---|
| Git commit | `abc1234 — fix: resolve import collision` | ✅ |
| Pull request | `PR #42: https://forge.../pulls/42` | ✅ |
| Issue closure | `Closed #17 with comment explaining resolution` | ✅ |
| Test file | `tests/test_new_feature.py — 5 passing` | ✅ |
| Report | `reports/2026-04-15-audit.md` committed to repo | ✅ |
| Config change | Modified `config.yaml`, pushed to branch | ✅ |
| Documentation | New or updated `.md` file committed | ✅ |
| Skill created | `skill_manage(create)` with SKILL.md | ✅ |
| Memory updated | Facts saved via `memory()` tool | ✅ |
### What Does NOT Count
| Non-Artifact | Why | Invalid |
|---|---|---|
| Log output only | Ephemeral, not durable | ❌ |
| "I analyzed the code" | No file touched, no commit made | ❌ |
| Conversation summary | Not a repo artifact | ❌ |
| Plan without execution | Intent without delivery | ❌ |
| Duplicate of existing work | No new value produced | ❌ |
| Deleted work with no record | Net-zero artifact | ❌ |
### Rule
> **Every cycle must produce at least one tangible artifact or a documented abort reason.**
> A cycle that ends with `status: complete` but `proof: null` and zero commits is **invalid**.
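
The rule above can be sketched as a small check over a cycle record. This is a minimal illustration, not code from `cycle_guard.py`: the field names follow the JSON examples in this document, while the function name `satisfies_artifact_rule` and the choice to look for durable output in `proof.commits` or `proof.artifacts` are assumptions.

```python
from typing import Any, Dict


def satisfies_artifact_rule(cycle: Dict[str, Any]) -> bool:
    """Return True if the cycle record meets the artifact-or-abort-reason rule."""
    status = cycle.get("status")
    if status == "aborted":
        # An abort is acceptable only when its reason is documented.
        return bool((cycle.get("abort_reason") or "").strip())
    if status == "complete":
        # Assumption: durable output appears as commits or artifacts in the proof.
        proof = cycle.get("proof") or {}
        return bool(proof.get("commits") or proof.get("artifacts"))
    # Any other status (for example a cycle still running) is not yet valid.
    return False
```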
---
## 2. Stop Compliance Checks
The cycle guard enforces time discipline via `cycle_guard.py`.
### Slice Timeout (10-minute rule)
- Each slice has a 10-minute default max (`check_slice_timeout(max_minutes=10)`)
- If a slice exceeds 10 minutes, the agent should either:
  - End it with a partial artifact and start a new slice
  - Abort with reason: `"slice timeout — {description}"`
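
A minimal sketch of the 10-minute rule, assuming the slice start time is available as a timezone-aware `datetime`. The helpers `slice_timed_out` and `timeout_abort_reason` are hypothetical names for illustration; they are not the implementation of `check_slice_timeout` in `cycle_guard.py`.

```python
from datetime import datetime, timedelta, timezone


def slice_timed_out(started_at: datetime, now: datetime, max_minutes: int = 10) -> bool:
    """True once the slice has run strictly longer than max_minutes."""
    return now - started_at > timedelta(minutes=max_minutes)


def timeout_abort_reason(description: str) -> str:
    """Format the abort reason in the shape this document prescribes."""
    return f"slice timeout — {description}"


if __name__ == "__main__":
    start = datetime(2026, 4, 21, 14, 0, tzinfo=timezone.utc)
    now = start + timedelta(minutes=12)
    if slice_timed_out(start, now):
        print(timeout_abort_reason("clone still running"))
```

Passing `now` explicitly rather than reading the clock inside the function keeps the check easy to test.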
### Crash Recovery (30-minute rule)
- `resume_or_abort()` auto-aborts cycles stuck for >30 minutes
- If `cycle_guard.py resume` returns `aborted`, the agent must not continue the old cycle; it should start a fresh cycle instead
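
The 30-minute rule could be sketched as follows. Two assumptions are made here and are not confirmed by this document: that `cycle_id` is the cycle's start time in ISO 8601 form (as the examples suggest), and that any status other than `complete` or `aborted` means the cycle is still active. The function `stale_abort_reason` is a hypothetical helper, not the body of `resume_or_abort()`.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional

STALE_AFTER = timedelta(minutes=30)
TERMINAL = {"complete", "aborted"}


def stale_abort_reason(cycle: Dict[str, Any], now: datetime) -> Optional[str]:
    """Return an abort reason if the cycle is active and older than 30 minutes."""
    if cycle.get("status") in TERMINAL:
        return None  # already finished; nothing to recover
    # Assumption: cycle_id doubles as the start timestamp.
    started = datetime.fromisoformat(cycle["cycle_id"])
    age = now - started
    if age > STALE_AFTER:
        minutes = int(age.total_seconds() // 60)
        return f"crash recovery — stale cycle detected ({minutes}m old)"
    return None
```

A caller that gets a reason back would record the abort and begin a new cycle rather than resuming the old one.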
### Stop Signals
An agent MUST stop and take the listed action on any of these signals:
| Signal | Action |
|---|---|
| `check_slice_timeout` returns `True` | End slice, start new or abort |
| `resume_or_abort` returns `aborted` | Do not resume; start fresh |
| Issue already closed/implemented | Abort: `"already resolved"` |
| Authentication failure | Abort: `"auth failure — {detail}"` |
| Clone timeout > 120s | Abort: `"clone timeout"` |
| Unresolvable merge conflict | Abort: `"merge conflict — manual intervention needed"` |
| Human says stop | Abort: `"human override"` |
### Lane Boundary Checks
Agents must stay in their lane:
| Agent | Lane | Out-of-Bounds |
|---|---|---|
| Allegro | Dispatch, reporting, infra, docs | Direct model training, UI changes |
| Claude | Architecture, complex bugs | Simple config edits (use Gemini) |
| Gemini | Issue burn, simple fixes | Architecture decisions |
| Codex | Code generation, test writing | Operational dispatch |
| Ezra | Analysis, pattern recognition | Implementation |
If an agent detects it's working out-of-lane:
```
abort_cycle("out-of-lane — {what} should be done by {correct_agent}")
```
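One way to make the lane table checkable is to encode it as data. The sketch below is illustrative only: the `LANES` mapping transcribes the Lane column as lowercase tags, and both the tag vocabulary and the helper `out_of_lane_reason` are assumptions rather than part of `cycle_guard.py`.

```python
from typing import Dict, Optional, Set

# Lane column from the table above, as lowercase tags (assumed vocabulary).
LANES: Dict[str, Set[str]] = {
    "Allegro": {"dispatch", "reporting", "infra", "docs"},
    "Claude": {"architecture", "complex bugs"},
    "Gemini": {"issue burn", "simple fixes"},
    "Codex": {"code generation", "test writing"},
    "Ezra": {"analysis", "pattern recognition"},
}


def out_of_lane_reason(agent: str, task_kind: str) -> Optional[str]:
    """Return an abort reason if task_kind belongs to another agent's lane."""
    if task_kind in LANES.get(agent, set()):
        return None  # in lane
    owner = next((name for name, kinds in LANES.items() if task_kind in kinds), None)
    if owner is None:
        return None  # the table does not assign this kind of task to anyone
    return f"out-of-lane — {task_kind} should be done by {owner}"
```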
---
## 3. Proof Structure
The `proof` field in `commit_cycle()` should contain:
```json
{
  "artifacts": [
    {
      "type": "pr",
      "number": 42,
      "url": "https://forge.../pulls/42",
      "repo": "Timmy_Foundation/timmy-config"
    }
  ],
  "commits": ["abc1234", "def5678"],
  "files_changed": ["allegro/burn-mode-validator.md"],
  "summary": "Documented burn-loop validator rules per #242"
}
```
Minimal valid proof (single commit, no PR):
```json
{
  "commits": ["abc1234"],
  "summary": "Fixed import collision in scripts/ci_automation_gate.py"
}
```
Invalid proof (empty):
```json
null
```
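The three proof shapes above suggest a simple structural check. The sketch below is a hedged illustration: `proof_problems` is a hypothetical helper, and treating `summary` as required is an assumption drawn from the fact that both valid examples include one.

```python
from typing import Any, List


def proof_problems(proof: Any) -> List[str]:
    """Return a list of problems with a proof value; an empty list means it looks valid."""
    if proof is None:
        return ["proof is null"]
    if not isinstance(proof, dict):
        return ["proof is not an object"]
    problems = []
    # At least one durable reference is needed: a commit or a listed artifact.
    if not proof.get("commits") and not proof.get("artifacts"):
        problems.append("no commits or artifacts listed")
    # Assumption: a non-empty summary is required.
    if not str(proof.get("summary") or "").strip():
        problems.append("missing summary")
    return problems
```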
A cycle with `proof: null` and no commits is **invalid** — it should have been aborted.

---
## 4. Validation Examples
### PASS: Productive Cycle
```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "complete",
  "target": "timmy-config issue #242",
  "slices": [
    {"name": "clone", "status": "complete", "artifact": "repo cloned"},
    {"name": "implement", "status": "complete", "artifact": "PR #301"},
    {"name": "verify", "status": "complete", "artifact": "tests passing"}
  ],
  "proof": {"commits": ["abc1234"], "summary": "Documented validator rules"}
}
```
**Verdict: PASS** — Three slices, each with an artifact; the proof is present and lists a commit.
### PASS: Productive Abort
```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "aborted",
  "target": "hermes-agent issue #999",
  "slices": [
    {"name": "clone", "status": "complete", "artifact": "repo cloned"},
    {"name": "investigate", "status": "aborted", "artifact": null}
  ],
  "abort_reason": "already resolved — PR #888 merged this fix yesterday"
}
```
**Verdict: PASS** — Legitimate abort with clear reason. Investigation was productive (discovered duplicate).
### FAIL: Empty Cycle
```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "complete",
  "target": "timmy-config issue #123",
  "slices": [],
  "proof": null
}
```
**Verdict: FAIL** — No slices, no artifacts, no proof. Agent started but produced nothing.
### FAIL: Log-Only Cycle
```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "complete",
  "target": "timmy-config issue #456",
  "slices": [
    {"name": "analyze", "status": "complete", "artifact": "analysis output to stdout"}
  ],
  "proof": {"summary": "Analyzed the codebase and identified 3 patterns"}
}
```
**Verdict: FAIL** — Analysis produced no durable artifact. Should have written findings to a file or created an issue.
### FAIL: Stale Cycle (Auto-Aborted)
```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "aborted",
  "target": "timmy-config issue #789",
  "slices": [
    {"name": "clone", "status": "in_progress", "artifact": null}
  ],
  "abort_reason": "crash recovery — stale cycle detected (45m old)"
}
```
**Verdict: FAIL** — Clone hung or agent crashed. `resume_or_abort()` caught it. Start fresh.

---
## 5. Integration Points
### In Burn-Loop Prompts
Add to dispatch prompts:
```
Before starting: python3 allegro/cycle_guard.py resume
After each slice: python3 allegro/cycle_guard.py end --artifact "description"
After all work: python3 allegro/cycle_guard.py commit --proof '{"commits":["..."],"summary":"..."}'
If stuck: python3 allegro/cycle_guard.py abort "reason"
```
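When an agent drives these commands from Python instead of a shell, building the argument list directly avoids quoting the JSON proof by hand. This is a minimal sketch: the subcommand and `--proof` flag are the ones shown in this document, while `commit_argv` is a hypothetical helper name.

```python
import json
from typing import List, Sequence

GUARD = ["python3", "allegro/cycle_guard.py"]


def commit_argv(commits: Sequence[str], summary: str) -> List[str]:
    """Build the argv for the commit step with a JSON-encoded proof."""
    proof = {"commits": list(commits), "summary": summary}
    # json.dumps produces valid JSON, so no manual shell quoting is needed.
    return GUARD + ["commit", "--proof", json.dumps(proof)]


if __name__ == "__main__":
    print(commit_argv(["abc1234"], "Done"))
```

The resulting list can be handed to `subprocess.run(argv, check=True)` from the repository root.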
### In Reporting
Morning reports should include cycle validation:
```
Cycles last night: 12
- Complete with proof: 9
- Productive aborts: 2
- Failed (empty/stale): 1 ← RCA filed
```
### In Metrics
Track these as fleet health metrics:
- **Cycle completion rate:** complete / total cycles
- **Artifact density:** artifacts per cycle (target: ≥1)
- **Abort quality:** % of aborts with descriptive reasons
- **Stale detection rate:** auto-aborted / total cycles (target: <5%)
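
The four metrics above could be computed from a list of cycle records roughly as follows. Several choices here are assumptions, not definitions from this document: an artifact is counted as any slice with a non-empty `artifact` field, a "descriptive" abort reason is any non-empty one, and auto-aborted cycles are recognised by the `crash recovery` prefix used in the stale-cycle example. `fleet_metrics` is a hypothetical name.

```python
from typing import Any, Dict, List


def fleet_metrics(cycles: List[Dict[str, Any]]) -> Dict[str, float]:
    """Compute fleet health metrics from cycle records shaped like the examples above."""
    total = len(cycles)
    if total == 0:
        return {"completion_rate": 0.0, "artifact_density": 0.0,
                "abort_quality_pct": 0.0, "stale_rate": 0.0}
    complete = sum(1 for c in cycles if c.get("status") == "complete")
    aborts = [c for c in cycles if c.get("status") == "aborted"]
    reasons = [(c.get("abort_reason") or "").strip() for c in aborts]
    described = sum(1 for r in reasons if r)
    # Assumption: auto-aborts carry the "crash recovery" prefix.
    stale = sum(1 for r in reasons if r.startswith("crash recovery"))
    # Assumption: every slice with a non-empty artifact field counts once.
    artifacts = sum(1 for c in cycles for s in c.get("slices", []) if s.get("artifact"))
    return {
        "completion_rate": complete / total,
        "artifact_density": artifacts / total,
        # With no aborts there is nothing to penalise, so report 100%.
        "abort_quality_pct": 100.0 * described / len(aborts) if aborts else 100.0,
        "stale_rate": stale / total,
    }
```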
---
## Quick Reference
```
# Start
python3 allegro/cycle_guard.py start "timmy-config #242"
# Work slices
python3 allegro/cycle_guard.py slice "clone"
# ... do work ...
python3 allegro/cycle_guard.py end --artifact "repo cloned"
python3 allegro/cycle_guard.py slice "implement"
# ... do work ...
python3 allegro/cycle_guard.py end --artifact "PR #301"
# Finish
python3 allegro/cycle_guard.py commit --proof '{"commits":["abc1234"],"summary":"Done"}'
# Or abort
python3 allegro/cycle_guard.py abort "already resolved"
# Recovery check
python3 allegro/cycle_guard.py resume
```