Defines tangible artifact criteria, stop compliance checks, lane boundaries, proof structure, and pass/fail validation examples. Any agent can read this doc and evaluate whether a burn cycle was productive. References cycle_guard.py for implementation details. Closes #242
# Allegro Burn-Mode Validator Rules

**Epic:** #842 M7 — Autonomous Burn-Mode Hardening
**Issue:** #242
**Source:** `allegro/cycle_guard.py`

Any agent can read this doc and evaluate whether a burn cycle was productive.

---

## Cycle Lifecycle

```
start_cycle(target)
  → start_slice("clone") → end_slice(artifact="repo cloned")
  → start_slice("implement") → end_slice(artifact="PR #42")
  → commit_cycle(proof={...})
```

A cycle has three terminal states:
- **complete** — `commit_cycle()` called with proof
- **aborted** — `abort_cycle(reason)` called, reason recorded
- **stale** — crashed or hung; auto-aborted by `resume_or_abort()` after 30 min


---

## 1. Tangible Artifact Criteria

A cycle is **productive** if at least one slice produces a tangible artifact.

### What Counts as Tangible

| Artifact Type | Example | Valid |
|---|---|---|
| Git commit | `abc1234 — fix: resolve import collision` | ✅ |
| Pull request | `PR #42: https://forge.../pulls/42` | ✅ |
| Issue closure | `Closed #17 with comment explaining resolution` | ✅ |
| Test file | `tests/test_new_feature.py — 5 passing` | ✅ |
| Report | `reports/2026-04-15-audit.md` committed to repo | ✅ |
| Config change | Modified `config.yaml`, pushed to branch | ✅ |
| Documentation | New or updated `.md` file committed | ✅ |
| Skill created | `skill_manage(create)` with SKILL.md | ✅ |
| Memory updated | Facts saved via `memory()` tool | ✅ |

### What Does NOT Count

| Non-Artifact | Why | Invalid |
|---|---|---|
| Log output only | Ephemeral, not durable | ❌ |
| "I analyzed the code" | No file touched, no commit made | ❌ |
| Conversation summary | Not a repo artifact | ❌ |
| Plan without execution | Intent without delivery | ❌ |
| Duplicate of existing work | No new value produced | ❌ |
| Deleted work with no record | Net-zero artifact | ❌ |

### Rule

> **Every cycle must produce at least one tangible artifact or a documented abort reason.**
> A cycle that ends with `status: complete` but `proof: null` and zero commits is **invalid**.


---

## 2. Stop Compliance Checks

The cycle guard enforces time discipline via `cycle_guard.py`.

### Slice Timeout (10-minute rule)

- Each slice has a 10-minute default max (`check_slice_timeout(max_minutes=10)`)
- If a slice exceeds 10 minutes, the agent should either:
  - End it with a partial artifact and start a new slice
  - Abort with reason: `"slice timeout — {description}"`

### Crash Recovery (30-minute rule)

- `resume_or_abort()` auto-aborts cycles stuck for >30 minutes
- If `cycle_guard.py resume` returns `aborted`, the agent must not continue the old cycle
- Start a fresh cycle instead

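
The two time rules above can be sketched as a single decision function. This is an illustration only, not the code in `cycle_guard.py`: the names `exceeded` and `next_action`, the action labels, and the choice to measure staleness from a last-update timestamp in ISO-8601 form are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

SLICE_MAX_MINUTES = 10  # slice timeout from the 10-minute rule
STALE_MAX_MINUTES = 30  # crash-recovery threshold from the 30-minute rule


def exceeded(started_at: str, now: datetime, max_minutes: int) -> bool:
    """True when more than max_minutes have passed since started_at (ISO-8601)."""
    return now - datetime.fromisoformat(started_at) > timedelta(minutes=max_minutes)


def next_action(cycle_updated: str, slice_started: str, now: datetime) -> str:
    """Pick the action the time rules call for (hypothetical labels)."""
    if exceeded(cycle_updated, now, STALE_MAX_MINUTES):
        return "abort-stale"  # do not resume; start a fresh cycle
    if exceeded(slice_started, now, SLICE_MAX_MINUTES):
        return "end-slice"    # end with a partial artifact, or abort
    return "continue"


now = datetime(2026, 4, 21, 15, 0, tzinfo=timezone.utc)
# Cycle last updated 45 minutes ago, so the stale rule wins.
print(next_action("2026-04-21T14:15:00+00:00", "2026-04-21T14:15:00+00:00", now))  # abort-stale
```
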

### Stop Signals

An agent MUST stop and take the listed action when:

| Signal | Action |
|---|---|
| `check_slice_timeout` returns `True` | End slice, start new or abort |
| `resume_or_abort` returns `aborted` | Do not resume; start fresh |
| Issue already closed/implemented | Abort: `"already resolved"` |
| Authentication failure | Abort: `"auth failure — {detail}"` |
| Clone timeout > 120s | Abort: `"clone timeout"` |
| Unresolvable merge conflict | Abort: `"merge conflict — manual intervention needed"` |
| Human says stop | Abort: `"human override"` |


### Lane Boundary Checks

Agents must stay in their lane:

| Agent | Lane | Out-of-Bounds |
|---|---|---|
| Allegro | Dispatch, reporting, infra, docs | Direct model training, UI changes |
| Claude | Architecture, complex bugs | Simple config edits (use Gemini) |
| Gemini | Issue burn, simple fixes | Architecture decisions |
| Codex | Code generation, test writing | Operational dispatch |
| Ezra | Analysis, pattern recognition | Implementation |

If an agent detects it's working out-of-lane:
```
abort_cycle("out-of-lane — {what} should be done by {correct_agent}")
```

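
One way to find the `{correct_agent}` value for that abort reason is a lookup over the lane table. The sketch below is hypothetical: the `LANES` mapping, the lowercase task labels, and the `lane_violation` helper are invented for illustration and are not part of `cycle_guard.py`.

```python
from typing import Optional

# Lane column of the table above, as lowercase task labels (assumed spelling).
LANES = {
    "Allegro": {"dispatch", "reporting", "infra", "docs"},
    "Claude": {"architecture", "complex bugs"},
    "Gemini": {"issue burn", "simple fixes"},
    "Codex": {"code generation", "test writing"},
    "Ezra": {"analysis", "pattern recognition"},
}


def lane_violation(agent: str, task_kind: str) -> Optional[str]:
    """Return None when the task is in the agent's lane.

    Otherwise return the agent whose lane lists the task, or "unknown"
    when no lane lists it.
    """
    if task_kind in LANES.get(agent, set()):
        return None
    for owner, kinds in LANES.items():
        if task_kind in kinds:
            return owner
    return "unknown"


print(lane_violation("Gemini", "architecture"))  # Claude
```
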

---

## 3. Proof Structure

The `proof` field in `commit_cycle()` should contain:

```json
{
  "artifacts": [
    {
      "type": "pr",
      "number": 42,
      "url": "https://forge.../pulls/42",
      "repo": "Timmy_Foundation/timmy-config"
    }
  ],
  "commits": ["abc1234", "def5678"],
  "files_changed": ["allegro/burn-mode-validator.md"],
  "summary": "Documented burn-loop validator rules per #242"
}
```

Minimal valid proof (single commit, no PR):
```json
{
  "commits": ["abc1234"],
  "summary": "Fixed import collision in scripts/ci_automation_gate.py"
}
```

Invalid proof (empty):
```json
null
```
A cycle with `proof: null` and no commits is **invalid** — it should have been aborted.

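
A proof object of this shape could be checked with a few lines of Python. This is a minimal sketch under stated assumptions: `proof_is_valid` is a hypothetical helper, and requiring a non-empty `summary` alongside at least one commit or artifact is an interpretation of the examples above, not a rule confirmed by `cycle_guard.py`.

```python
from typing import Any


def proof_is_valid(proof: Any) -> bool:
    """Accept a proof only if it names durable evidence and describes it.

    Assumed rule: a non-empty "summary" plus at least one entry in
    "commits" or "artifacts".
    """
    if not isinstance(proof, dict):
        return False  # covers proof: null
    has_summary = bool(str(proof.get("summary") or "").strip())
    has_evidence = bool(proof.get("commits")) or bool(proof.get("artifacts"))
    return has_summary and has_evidence


print(proof_is_valid({"commits": ["abc1234"], "summary": "Fixed import collision"}))  # True
print(proof_is_valid(None))  # False
```
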

---

## 4. Validation Examples

### PASS: Productive Cycle

```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "complete",
  "target": "timmy-config issue #242",
  "slices": [
    {"name": "clone", "status": "complete", "artifact": "repo cloned"},
    {"name": "implement", "status": "complete", "artifact": "PR #301"},
    {"name": "verify", "status": "complete", "artifact": "tests passing"}
  ],
  "proof": {"commits": ["abc1234"], "summary": "Documented validator rules"}
}
```
**Verdict: PASS** — Three slices, each with artifact, proof present, committed.


### PASS: Productive Abort

```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "aborted",
  "target": "hermes-agent issue #999",
  "slices": [
    {"name": "clone", "status": "complete", "artifact": "repo cloned"},
    {"name": "investigate", "status": "aborted", "artifact": null}
  ],
  "abort_reason": "already resolved — PR #888 merged this fix yesterday"
}
```
**Verdict: PASS** — Legitimate abort with clear reason. Investigation was productive (discovered duplicate).


### FAIL: Empty Cycle

```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "complete",
  "target": "timmy-config issue #123",
  "slices": [],
  "proof": null
}
```
**Verdict: FAIL** — No slices, no artifacts, no proof. Agent started but produced nothing.


### FAIL: Log-Only Cycle

```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "complete",
  "target": "timmy-config issue #456",
  "slices": [
    {"name": "analyze", "status": "complete", "artifact": "analysis output to stdout"}
  ],
  "proof": {"summary": "Analyzed the codebase and identified 3 patterns"}
}
```
**Verdict: FAIL** — Analysis produced no durable artifact. Should have written findings to a file or created an issue.


### FAIL: Stale Cycle (Auto-Aborted)

```json
{
  "cycle_id": "2026-04-21T14:00:00+00:00",
  "status": "aborted",
  "target": "timmy-config issue #789",
  "slices": [
    {"name": "clone", "status": "in_progress", "artifact": null}
  ],
  "abort_reason": "crash recovery — stale cycle detected (45m old)"
}
```
**Verdict: FAIL** — Clone hung or agent crashed. `resume_or_abort()` caught it. Start fresh.

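
The five verdicts above follow a pattern that fits in one small function. The sketch below is hypothetical and deliberately simple: `verdict` is not a function in `cycle_guard.py`, and recognising an auto-abort by the `"crash recovery"` prefix of `abort_reason` is an assumption taken from the stale example.

```python
def verdict(cycle: dict) -> str:
    """Return "PASS" or "FAIL" for a cycle record shaped like the examples."""
    status = cycle.get("status")
    if status == "aborted":
        reason = (cycle.get("abort_reason") or "").strip()
        # An abort passes only with a documented, agent-chosen reason.
        if not reason or reason.startswith("crash recovery"):
            return "FAIL"
        return "PASS"
    if status == "complete":
        proof = cycle.get("proof") or {}
        # A summary alone is not durable; require commits or artifacts.
        durable = bool(proof.get("commits") or proof.get("artifacts"))
        return "PASS" if durable else "FAIL"
    return "FAIL"  # still running or unknown status


print(verdict({"status": "complete", "slices": [], "proof": None}))  # FAIL
```
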

---

## 5. Integration Points

### In Burn-Loop Prompts

Add to dispatch prompts:
```
Before starting: python3 allegro/cycle_guard.py resume
After each slice: python3 allegro/cycle_guard.py end --artifact "description"
After all work: python3 allegro/cycle_guard.py commit --proof '{"commits":["..."],"summary":"..."}'
If stuck: python3 allegro/cycle_guard.py abort "reason"
```

### In Reporting

Morning reports should include cycle validation:
```
Cycles last night: 12
- Complete with proof: 9
- Productive aborts: 2
- Failed (empty/stale): 1 ← RCA filed
```

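
A report block in that layout could be rendered from per-cycle outcome labels. The sketch is illustrative: `morning_report` and the labels `"complete"`, `"productive_abort"`, and `"failed"` are hypothetical, and the RCA annotation is left out because how it is produced is not described here.

```python
from collections import Counter


def morning_report(outcomes: list[str]) -> str:
    """Render the summary block from per-cycle outcome labels (assumed labels)."""
    counts = Counter(outcomes)
    return "\n".join([
        f"Cycles last night: {len(outcomes)}",
        f"- Complete with proof: {counts['complete']}",
        f"- Productive aborts: {counts['productive_abort']}",
        f"- Failed (empty/stale): {counts['failed']}",
    ])


print(morning_report(["complete"] * 9 + ["productive_abort"] * 2 + ["failed"]))
```
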

### In Metrics

Track as fleet health metrics:
- **Cycle completion rate:** complete / total cycles
- **Artifact density:** artifacts per cycle (target: ≥1)
- **Abort quality:** % of aborts with descriptive reasons
- **Stale detection rate:** auto-aborted / total cycles (target: <5%)

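
These four ratios can be computed directly from a list of cycle records. The sketch below rests on assumptions not confirmed by the source: `fleet_metrics` is a hypothetical helper, an artifact is counted as any slice with a non-null `artifact`, a "descriptive" reason is taken to mean any non-empty `abort_reason`, and auto-aborts are detected by the `"crash recovery"` prefix.

```python
def fleet_metrics(cycles: list[dict]) -> dict[str, float]:
    """Compute the four health ratios; returns {} when there are no cycles."""
    total = len(cycles)
    if total == 0:
        return {}
    complete = sum(c.get("status") == "complete" for c in cycles)
    aborts = [c for c in cycles if c.get("status") == "aborted"]
    reasons = [(c.get("abort_reason") or "").strip() for c in aborts]
    described = sum(bool(r) for r in reasons)
    stale = sum(r.startswith("crash recovery") for r in reasons)
    artifacts = sum(
        1 for c in cycles for s in c.get("slices", []) if s.get("artifact")
    )
    return {
        "completion_rate": complete / total,
        "artifact_density": artifacts / total,
        # With no aborts there is nothing to grade, so report 1.0.
        "abort_quality": described / len(aborts) if aborts else 1.0,
        "stale_rate": stale / total,
    }
```
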

---

## Quick Reference

```
# Start
python3 allegro/cycle_guard.py start "timmy-config #242"

# Work slices
python3 allegro/cycle_guard.py slice "clone"
# ... do work ...
python3 allegro/cycle_guard.py end --artifact "repo cloned"

python3 allegro/cycle_guard.py slice "implement"
# ... do work ...
python3 allegro/cycle_guard.py end --artifact "PR #301"

# Finish
python3 allegro/cycle_guard.py commit --proof '{"commits":["abc1234"],"summary":"Done"}'

# Or abort
python3 allegro/cycle_guard.py abort "already resolved"

# Recovery check
python3 allegro/cycle_guard.py resume
```