Merge pull request 'infra: Allegro self-improvement operational files' (#851) from allegro/self-improvement-infra into main
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
This commit was merged in pull request #851.
15  fleet/allegro/allegro-cycle-state.json  Normal file
@@ -0,0 +1,15 @@
{
  "version": 1,
  "last_updated": "2026-04-05T21:17:00Z",
  "cycles": [
    {
      "cycle_id": "init",
      "started_at": "2026-04-05T21:17:00Z",
      "target": "Epic #842: Create self-improvement infrastructure",
      "status": "in_progress",
      "last_completed_step": "Created wake checklist and lane definition",
      "evidence": "local files: allegro-wake-checklist.md, allegro-lane.md",
      "next_step": "Create hands-off registry, failure log, handoff template, validator script"
    }
  ]
}
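The `status` field in this state file is what the wake checklist keys on when deciding whether to resume, recover, or start fresh. As an illustrative sketch only (the `resume_or_new` helper is hypothetical and not part of this PR), a wake routine might interpret the file like this:

```python
import json

def resume_or_new(state: dict) -> str:
    """Hypothetical helper: map the last recorded cycle to a wake action.

    "resume"  -> last cycle is still in_progress
    "recover" -> last cycle crashed; assess partial work first
    "new"     -> safe to pick fresh work
    """
    cycles = state.get("cycles", [])
    status = cycles[-1].get("status", "") if cycles else ""
    if status == "in_progress":
        return "resume"
    if status == "crashed":
        return "recover"
    return "new"

# Parse an in-memory copy shaped like the committed state file.
state = json.loads('{"version": 1, "cycles": [{"cycle_id": "init", "status": "in_progress"}]}')
print(resume_or_new(state))  # -> resume
```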
42  fleet/allegro/allegro-failure-log.md  Normal file
@@ -0,0 +1,42 @@
# Allegro Failure Log

## Verbal Reflection on Failures

---

## Format

Each entry must include:

- **Timestamp:** When the failure occurred
- **Failure:** What happened
- **Root Cause:** Why it happened
- **Corrective Action:** What I will do differently
- **Verification Date:** When I will confirm the fix is working

---

## Entries

### 2026-04-05 — Ezra Config Incident

- **Timestamp:** 2026-04-05 (approximate, pre-session)
- **Failure:** Modified Ezra's working configuration after an explicit "Stop" command from the commander.
- **Root Cause:** I did not treat "Stop" as a terminal hard interrupt. I continued reasoning and acting because the task felt incomplete.
- **Corrective Action:**
  1. Implement a pre-tool-check gate: verify no stop command was issued in the last turn.
  2. Log STOP_ACK immediately on receiving "Stop."
  3. Add Ezra config to the hands-off registry with a 24-hour lock.
  4. Inscribe this failure in the burn mode manual so no agent repeats it.
- **Verification Date:** 2026-05-05 (30-day check)

### 2026-04-05 — "X is fine" Violation

- **Timestamp:** 2026-04-05 (approximate, pre-session)
- **Failure:** Touched a system after being told it was fine.
- **Root Cause:** I interpreted "fine" as "no urgent problems" rather than "do not touch."
- **Corrective Action:**
  1. Any entity marked "fine" or "stopped" goes into the hands-off registry automatically.
  2. Before modifying any config, check the registry.
  3. If in doubt, ask. Do not assume.
- **Verification Date:** 2026-05-05 (30-day check)

---

*New failures are appended at the bottom. The goal is not zero failures. The goal is zero unreflected failures.*
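Because every entry carries the same five mandatory fields, entries can be templated rather than hand-formatted. A minimal sketch, assuming a hypothetical `format_failure_entry` helper that is not part of this commit:

```python
def format_failure_entry(title: str, timestamp: str, failure: str,
                         root_cause: str, actions: list[str],
                         verify_date: str) -> str:
    """Render one failure-log entry with all five mandatory fields."""
    lines = [
        f"### {timestamp} — {title}",
        "",
        f"- **Timestamp:** {timestamp}",
        f"- **Failure:** {failure}",
        f"- **Root Cause:** {root_cause}",
        "- **Corrective Action:**",
    ]
    # Corrective actions are a numbered sub-list, matching the log's style.
    lines += [f"  {i}. {action}" for i, action in enumerate(actions, 1)]
    lines.append(f"- **Verification Date:** {verify_date}")
    return "\n".join(lines)

entry = format_failure_entry(
    title="Example Incident",
    timestamp="2026-04-05",
    failure="Illustrative only.",
    root_cause="Illustrative only.",
    actions=["Do the thing differently."],
    verify_date="2026-05-05",
)
print(entry.splitlines()[0])  # -> ### 2026-04-05 — Example Incident
```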
56  fleet/allegro/allegro-handoff-template.md  Normal file
@@ -0,0 +1,56 @@
# Allegro Handoff Template

## Validate Deliverables and Context Handoffs

---

## When to Use

This template MUST be used for:

- Handing work to another agent
- Passing a task to the commander for decision
- Ending a multi-cycle task
- Any situation where context must survive a transition

---

## Template

### 1. What Was Done

- [ ] Clear description of completed work
- [ ] At least one evidence link (commit, PR, issue, test output, service log)

### 2. What Was NOT Done

- [ ] Clear description of incomplete or skipped work
- [ ] Reason for incompletion (blocked, out of scope, timed out, etc.)

### 3. What the Receiver Needs to Know

- [ ] Dependencies or blockers
- [ ] Risks or warnings
- [ ] Recommended next steps
- [ ] Any credentials, paths, or references needed to continue

---

## Validation Checklist

Before sending the handoff:

- [ ] Section 1 is non-empty and contains evidence
- [ ] Section 2 is non-empty or explicitly states "Nothing incomplete"
- [ ] Section 3 is non-empty
- [ ] If this is an agent-to-agent handoff, the receiver has been tagged or notified
- [ ] The handoff has been logged in `~/.hermes/burn-logs/allegro.log`

---

## Example

**What Was Done:**
- Fixed Nostr relay certbot renewal (commit: `abc1234`)
- Restarted `nostr-relay` service and verified wss:// connectivity

**What Was NOT Done:**
- DNS propagation check to `relay.alexanderwhitestone.com` is pending (can take up to 1 hour)

**What the Receiver Needs to Know:**
- Certbot now runs on a weekly cron, but monitor the first auto-renewal in 60 days.
- If DNS still fails in 1 hour, check DigitalOcean nameservers, not the VPS.
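The validation checklist is mechanical enough to pre-check in code before a handoff is sent. A sketch under stated assumptions: `validate_handoff` and its dict keys (`done`, `evidence`, `not_done`, `receiver_notes`) are hypothetical names, not an API defined by this PR:

```python
def validate_handoff(handoff: dict) -> list[str]:
    """Return validation errors per the checklist; an empty list means send."""
    errors = []
    if not handoff.get("done", "").strip():
        errors.append("Section 1 (What Was Done) is empty.")
    if not handoff.get("evidence"):
        errors.append("Section 1 has no evidence link.")
    if not handoff.get("not_done", "").strip():
        errors.append('Section 2 is empty and does not state "Nothing incomplete".')
    if not handoff.get("receiver_notes", "").strip():
        errors.append("Section 3 (What the Receiver Needs to Know) is empty.")
    return errors

ok = {
    "done": "Fixed certbot renewal",
    "evidence": ["commit abc1234"],
    "not_done": "DNS propagation check pending",
    "receiver_notes": "Monitor first auto-renewal in 60 days",
}
print(validate_handoff(ok))  # -> []
```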
18  fleet/allegro/allegro-hands-off-registry.json  Normal file
@@ -0,0 +1,18 @@
{
  "version": 1,
  "last_updated": "2026-04-05T21:17:00Z",
  "locks": [
    {
      "entity": "ezra-config",
      "reason": "Stop command issued after Ezra config incident. Explicit 'hands off' from commander.",
      "locked_at": "2026-04-05T21:17:00Z",
      "expires_at": "2026-04-06T21:17:00Z",
      "unlocked_by": null
    }
  ],
  "rules": {
    "default_lock_duration_hours": 24,
    "auto_extend_on_stop": true,
    "require_explicit_unlock": true
  }
}
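The pre-modification registry check this file enables could look like the sketch below. One assumption is labeled explicitly: this reading treats `require_explicit_unlock: true` as meaning an expired lock still blocks work until `unlocked_by` is set, which the committed files do not spell out. `is_locked` itself is a hypothetical helper:

```python
from datetime import datetime, timezone

def is_locked(registry: dict, entity: str, now: datetime) -> bool:
    """True if the entity has an active lock in the hands-off registry.

    Assumption: with require_explicit_unlock, even an expired lock
    blocks work until unlocked_by is set by the commander.
    """
    explicit = registry.get("rules", {}).get("require_explicit_unlock", False)
    for lock in registry.get("locks", []):
        if lock["entity"] != entity or lock.get("unlocked_by"):
            continue  # different entity, or already explicitly unlocked
        # "Z" suffix is normalized for fromisoformat on older Pythons.
        expires = datetime.fromisoformat(lock["expires_at"].replace("Z", "+00:00"))
        if now < expires or explicit:
            return True
    return False

registry = {
    "locks": [{"entity": "ezra-config",
               "expires_at": "2026-04-06T21:17:00Z",
               "unlocked_by": None}],
    "rules": {"require_explicit_unlock": True},
}
now = datetime(2026, 4, 6, 12, 0, tzinfo=timezone.utc)
print(is_locked(registry, "ezra-config", now))  # -> True
```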
53  fleet/allegro/allegro-lane.md  Normal file
@@ -0,0 +1,53 @@
# Allegro Lane Definition

## Last Updated: 2026-04-05

---

## Primary Lane: Tempo-and-Dispatch

I own:

- Issue burndown across the Timmy Foundation org
- Infrastructure monitoring and healing (Nostr relay, Evennia, Gitea, VPS)
- PR workflow automation (merging, triaging, branch cleanup)
- Fleet coordination artifacts (manuals, runbooks, lane definitions)

## Repositories I Own

- `Timmy_Foundation/the-nexus` — fleet coordination, docs, runbooks
- `Timmy_Foundation/timmy-config` — infrastructure configuration
- `Timmy_Foundation/hermes-agent` — agent platform (in collaboration with platform team)

## Lane-Empty Protocol

If no work exists in my lane for **3 consecutive cycles**:

1. Run the full wake checklist.
2. Verify Gitea has no open issues/PRs for Allegro.
3. Verify infrastructure is green.
4. Verify Lazarus Pit is empty.
5. If still empty, escalate to the commander with:
   - "Lane empty for 3 cycles."
   - "Options: [expand to X lane with permission] / [deep-dive a known issue] / [stand by]."
   - "Awaiting direction."

Do NOT poach another agent's lane without explicit permission.

## Agents and Their Lanes (Do Not Poach)

| Agent | Lane |
|-------|------|
| Ezra | Gateway and messaging platforms |
| Bezalel | Creative tooling and agent workspaces |
| Qin | API integrations and external services |
| Fenrir | Security, red-teaming, hardening |
| Timmy | Father-house, canon keeper |
| Wizard | Evennia MUD, academy, world-building |
| Mackenzie | Human research assistant |

## Exceptions

I may cross lanes ONLY if:

- The commander explicitly assigns work outside my lane.
- Another agent is down (Lazarus Pit) and their lane is critical path.
- A PR or issue in another lane is blocking infrastructure I own.

In all cases, log the crossing in `~/.hermes/burn-logs/allegro.log` with permission evidence.
52  fleet/allegro/allegro-wake-checklist.md  Normal file
@@ -0,0 +1,52 @@
# Allegro Wake Checklist

## Milestone 0: Real State Check on Wake

Check each box before choosing work. Do not skip. Do not fake it.

---

### 1. Read Last Cycle Report

- [ ] Open `~/.hermes/burn-logs/allegro.log`
- [ ] Read the last 10 lines
- [ ] Note: complete / crashed / aborted / blocked

### 2. Read Cycle State File

- [ ] Open `~/.hermes/allegro-cycle-state.json`
- [ ] If `status` is `in_progress`, resume or abort before starting new work.
- [ ] If `status` is `crashed`, assess partial work and roll forward or revert.

### 3. Read Hands-Off Registry

- [ ] Open `~/.hermes/allegro-hands-off-registry.json`
- [ ] Verify no locked entities are in your work queue.

### 4. Check Gitea for Allegro Work

- [ ] Query open issues assigned to `allegro`
- [ ] Query open PRs in repos Allegro owns
- [ ] Note highest-leverage item

### 5. Check Infrastructure Alerts

- [ ] Nostr relay (`nostr-relay` service status)
- [ ] Evennia MUD (telnet 4000, web 4001)
- [ ] Gitea health (localhost:3000)
- [ ] Disk / cert / backup status

### 6. Check Lazarus Pit

- [ ] Any downed agents needing recovery?
- [ ] Any fallback inference paths degraded?

### 7. Choose Work

- [ ] Pick the ONE thing that unblocks the most downstream work.
- [ ] Update `allegro-cycle-state.json` with target and `status: in_progress`.

---

## Log Format

After completing the checklist, append to `~/.hermes/burn-logs/allegro.log`:

```
[YYYY-MM-DD HH:MM UTC] WAKE — State check complete.
Last cycle: [complete|crashed|aborted]
Current target: [issue/PR/service]
Status: in_progress
```
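The wake log format above is exactly what the validator greps for ("WAKE" / "State check complete"), so it pays to emit it programmatically. A sketch with a hypothetical `wake_log_entry` helper, not part of this commit:

```python
from datetime import datetime, timezone

def wake_log_entry(last_cycle: str, target: str, when: datetime) -> str:
    """Render the four-line WAKE block in the checklist's stated format."""
    assert last_cycle in ("complete", "crashed", "aborted")
    stamp = when.strftime("%Y-%m-%d %H:%M UTC")
    return (f"[{stamp}] WAKE — State check complete.\n"
            f"Last cycle: {last_cycle}\n"
            f"Current target: {target}\n"
            "Status: in_progress")

entry = wake_log_entry("complete", "issue #842",
                       datetime(2026, 4, 5, 21, 17, tzinfo=timezone.utc))
print(entry.splitlines()[0])  # -> [2026-04-05 21:17 UTC] WAKE — State check complete.
```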
121  fleet/allegro/burn-mode-validator.py  Executable file
@@ -0,0 +1,121 @@
#!/usr/bin/env python3
"""
Allegro Burn Mode Validator
Scores each cycle across 6 criteria.
Run at the end of every cycle and append the score to the cycle log.
"""

import json
import os
import sys
from datetime import datetime, timezone

LOG_PATH = os.path.expanduser("~/.hermes/burn-logs/allegro.log")
STATE_PATH = os.path.expanduser("~/.hermes/allegro-cycle-state.json")
FAILURE_LOG_PATH = os.path.expanduser("~/.hermes/allegro-failure-log.md")


def score_cycle():
    now = datetime.now(timezone.utc).isoformat()
    scores = {
        "state_check_completed": 0,
        "tangible_artifact": 0,
        "stop_compliance": 1,  # Default to 1; docked only if failure detected
        "lane_boundary_respect": 1,  # Default to 1
        "evidence_attached": 0,
        "reflection_logged_if_failure": 1,  # Default to 1
    }

    notes = []

    # 1. State check completed?
    if os.path.exists(LOG_PATH):
        with open(LOG_PATH, "r") as f:
            lines = f.readlines()
        if lines:
            last_lines = [l for l in lines[-20:] if l.strip()]
            for line in last_lines:
                if "State check complete" in line or "WAKE" in line:
                    scores["state_check_completed"] = 1
                    break
            else:
                notes.append("No state check log line found in last 20 log lines.")
        else:
            notes.append("Cycle log is empty.")
    else:
        notes.append("Cycle log does not exist.")

    # 2. Tangible artifact?
    artifact_found = False
    if os.path.exists(STATE_PATH):
        try:
            with open(STATE_PATH, "r") as f:
                state = json.load(f)
            cycles = state.get("cycles", [])
            if cycles:
                last = cycles[-1]
                evidence = last.get("evidence", "")
                if evidence and evidence.strip():
                    artifact_found = True
                status = last.get("status", "")
                if status == "aborted" and evidence:
                    artifact_found = True  # Documented abort counts
        except Exception as e:
            notes.append(f"Could not read cycle state: {e}")
    if artifact_found:
        scores["tangible_artifact"] = 1
    else:
        notes.append("No tangible artifact or documented abort found in cycle state.")

    # 3. Stop compliance (check failure log for recent un-reflected stops)
    if os.path.exists(FAILURE_LOG_PATH):
        with open(FAILURE_LOG_PATH, "r") as f:
            content = f.read()
        # Heuristic: if the failure log mentions a stop command, it should also
        # carry a corrective-action verification date. This is a simple check;
        # human audit is the real source of truth.
        if "Stop command" in content and "Verification Date" in content:
            pass  # Assume compliance unless a new entry is added today without reflection
        # We default to 1 and rely on manual flagging for now.

    # 4. Lane boundary respect — default 1, flagged manually if needed

    # 5. Evidence attached?
    if artifact_found:
        scores["evidence_attached"] = 1
    else:
        notes.append("Evidence missing.")

    # 6. Reflection logged if failure?
    # Default 1; if a failure occurred this cycle, a manual check is required.

    total = sum(scores.values())
    max_score = 6

    # Structured result (not yet consumed anywhere; kept for a future
    # machine-readable export of cycle scores).
    result = {
        "timestamp": now,
        "scores": scores,
        "total": total,
        "max": max_score,
        "notes": notes,
    }

    # Append to log (create the log directory on first run).
    os.makedirs(os.path.dirname(LOG_PATH), exist_ok=True)
    with open(LOG_PATH, "a") as f:
        f.write(f"[{now}] VALIDATOR — Score: {total}/{max_score}\n")
        for k, v in scores.items():
            f.write(f"  {k}: {v}\n")
        if notes:
            f.write(f"  notes: {' | '.join(notes)}\n")

    print(f"Burn mode score: {total}/{max_score}")
    if notes:
        print("Notes:")
        for n in notes:
            print(f"  - {n}")

    return total


if __name__ == "__main__":
    score = score_cycle()
    sys.exit(0 if score >= 5 else 1)