Compare commits


2 Commits

Author SHA1 Message Date
Alexander Whitestone
228e46a330 feat: FLEET-004/005 — Milestone messages and resource tracker
FLEET-004: 22 milestone messages across 6 phases + 11 Fibonacci uptime milestones.
FLEET-005: Resource tracking system — Capacity/Uptime/Innovation tension model.
  - Tracks capacity spending and regeneration (2/hr baseline)
  - Innovation generates only when utilization < 70% (5/hr scaled)
  - Fibonacci uptime milestone detection (95% through 99.5%)
  - Phase gate checks (P2: 95% uptime, P3: 95% + 100 innovation, P5: 95% + 500)
  - CLI: status, regen commands

Fixes timmy-home#557 (FLEET-004), #558 (FLEET-005)
2026-04-07 12:03:45 -04:00
Alexander Whitestone
67c2927c1a feat: FLEET-003 — Capacity inventory with resource baselines
Full resource audit of all 4 machines (3 VPS + 1 Mac) with:
- vCPU, RAM, disk, swap per machine
- Key processes sorted by resource usage
- Capacity utilization: ~15-20%, Innovation GENERATING
- Uptime baseline: Ezra/Allegro/Bezalel 100%, Gitea 95.8%
- Fibonacci uptime milestones (5 of 6 REACHED)
- Risk assessment (Ezra disk 72%, Bezalel 2GB RAM, Ezra CPU 269%)
- Recommendations across all phases

Fixes timmy-home#556 (FLEET-003)
2026-04-07 11:58:16 -04:00
7 changed files with 564 additions and 295 deletions

View File

@@ -1,4 +0,0 @@
venv/
__pycache__/
*.pyc
.env

View File

@@ -1,140 +0,0 @@
# CrewAI Evaluation for Phase 2 Integration
**Date:** 2026-04-07
**Issue:** [#358 ORCHESTRATOR-4] Evaluate CrewAI for Phase 2 integration
**Author:** Ezra
**House:** hermes-ezra
## Summary
CrewAI was installed, a 2-agent proof-of-concept crew was built, and an operational test was attempted against issue #358. Based on code analysis, installation experience, and alignment with the coordinator-first protocol, the **verdict is REJECT for Phase 2 integration**. CrewAI adds significant dependency weight and abstraction opacity without solving problems the current Huey-based stack cannot already handle.
---
## 1. Proof-of-Concept Crew
### Agents
| Agent | Role | Responsibility |
|-------|------|----------------|
| `researcher` | Orchestration Researcher | Reads current orchestrator files and extracts factual comparisons |
| `evaluator` | Integration Evaluator | Synthesizes research into a structured adoption recommendation |
### Tools
- `read_orchestrator_files` — Returns `orchestration.py`, `tasks.py`, `bin/timmy-orchestrator.sh`, and `docs/coordinator-first-protocol.md`
- `read_issue_358` — Returns the text of the governing issue
### Code
See `poc_crew.py` in this directory for the full implementation.
---
## 2. Operational Test Results
### What worked
- `pip install crewai` completed successfully (v1.13.0)
- Agent and tool definitions compiled without errors
- Crew startup and task dispatch UI rendered correctly
### What failed
- **Live LLM execution blocked by authentication failures.** Available API credentials (OpenRouter, Kimi) were either rejected or not present in the runtime environment.
- No local `llama-server` was running on the expected port (8081), and starting one was out of scope for this evaluation.
### Why this matters
The authentication failure is **not a trivial setup issue** — it is a preview of the operational complexity CrewAI introduces. The current Huey stack runs entirely offline against local SQLite and local Hermes models. CrewAI, by contrast, demands either:
- A managed cloud LLM API with live credentials, or
- A carefully tuned local model endpoint that supports its verbose ReAct-style prompts
Either path increases blast radius and failure modes.
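To make the second path concrete, here is a hedged sketch of pointing CrewAI's `LLM` at a local `llama-server` on the expected port — the model alias is an assumption, and local OpenAI-compatible servers typically ignore the key:

```python
# Hypothetical local-endpoint wiring — NOT tested in this evaluation.
# The model alias "openai/hermes-local" and the /v1 path are assumptions;
# llama-server exposes an OpenAI-compatible API on the configured port.
from crewai import LLM

local_llm = LLM(
    model="openai/hermes-local",          # assumed alias for the local model
    base_url="http://127.0.0.1:8081/v1",  # the port this evaluation expected
    api_key="sk-local-unused",            # placeholder; local servers often ignore it
)
```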
---
## 3. Current Custom Orchestrator Analysis
### Stack
- **Huey** (`orchestration.py`) — SQLite-backed task queue, ~6 lines of initialization (see the sketch after this list)
- **tasks.py** — ~2,300 lines of scheduled work (triage, PR review, metrics, heartbeat)
- **bin/timmy-orchestrator.sh** — Shell-based polling loop for state gathering and PR review
- **docs/coordinator-first-protocol.md** — Intake → Triage → Route → Track → Verify → Report
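To ground the "~6 lines of initialization" claim, a minimal sketch of a SQLite-backed Huey setup — illustrative names, not the literal contents of `orchestration.py`:

```python
# Minimal SQLite-backed Huey queue — a sketch, not the real orchestration.py.
from huey import SqliteHuey, crontab

huey = SqliteHuey(filename="/var/lib/timmy/queue.db")  # assumed path

@huey.periodic_task(crontab(minute="*/15"))
def heartbeat():
    # Scheduled work (triage, PR review, metrics) lives in tasks.py.
    ...
```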
### Strengths
1. **Sovereignty** — No external SaaS dependency for queue execution. SQLite is local and inspectable.
2. **Gitea as truth** — All state mutations are visible in the forge. Local-only state is explicitly advisory.
3. **Simplicity** — Huey has a tiny surface area. A human can read `orchestration.py` in seconds.
4. **Tool-native** — `tasks.py` calls Hermes directly via `subprocess.run([HERMES_PYTHON, ...])`. No framework indirection.
5. **Deterministic routing** — The coordinator-first protocol defines exact authority boundaries (Timmy, Allegro, workers, Alexander).
### Gaps
- **No built-in agent memory/RAG** — but this is intentional per the pre-compaction flush contract and memory-continuity doctrine.
- **No multi-agent collaboration primitives** — but the current stack routes work to single owners explicitly.
- **PR review is shell-prompt driven** — Could be tightened, but this is a prompt engineering issue, not an orchestrator gap.
---
## 4. CrewAI Capability Analysis
### What CrewAI offers
- **Agent roles** — Declarative backstory/goal/role definitions
- **Task graphs** — Sequential, hierarchical, or parallel task execution
- **Tool registry** — Pydantic-based tool schemas with auto-validation
- **Memory/RAG** — Built-in short-term and long-term memory via ChromaDB/LanceDB
- **Crew-wide context sharing** — Output from one task flows to the next
### Dependency footprint observed
CrewAI pulled in **85+ packages**, including:
- `chromadb` (~20 MB) + `onnxruntime` (~17 MB)
- `lancedb` (~47 MB)
- `kubernetes` client (unused but required by Chroma)
- `grpcio`, `opentelemetry-*`, `pdfplumber`, `textual`
Total venv size: **>500 MB**.
By contrast, Huey is **one package** (`huey`) with zero required services.
---
## 5. Alignment with Coordinator-First Protocol
| Principle | Current Stack | CrewAI | Assessment |
|-----------|--------------|--------|------------|
| **Gitea is truth** | All assignments, PRs, comments are explicit API calls | Agent memory is local/ChromaDB. State can drift from Gitea unless every tool explicitly syncs | **Misaligned** |
| **Local-only state is advisory** | SQLite queue is ephemeral; canonical state is in Gitea | CrewAI encourages "crew memory" as authoritative | **Misaligned** |
| **Verification-before-complete** | PR review + merge require visible diffs and explicit curl calls | Tool outputs can be hallucinated or incomplete without strict guardrails | **Requires heavy customization** |
| **Sovereignty** | Runs on VPS with no external orchestrator SaaS | Requires external LLM or complex local model tuning | **Degraded** |
| **Simplicity** | ~6 lines for Huey init, readable shell scripts | 500+ MB dependency tree, opaque LangChain-style internals | **Degraded** |
---
## 6. Verdict
**REJECT CrewAI for Phase 2 integration.**
**Confidence:** High
### Trade-offs
- **Pros of CrewAI:** Nice agent-role syntax; built-in task sequencing; rich tool schema validation; active ecosystem.
- **Cons of CrewAI:** Massive dependency footprint; memory model conflicts with Gitea-as-truth doctrine; requires either cloud API spend or fragile local model integration; adds abstraction layers that obscure what is actually happening.
### Risks if adopted
1. **Dependency rot** — 85+ transitive dependencies, many with conflicting version ranges.
2. **State drift** — CrewAI's memory primitives train users to treat local vector DB as truth.
3. **Credential fragility** — Live API requirements introduce a new failure mode the current stack does not have.
4. **Vendor-like lock-in** — CrewAI's abstractions sit thickly over LangChain. Debugging a stuck crew is harder than debugging a Huey task traceback.
### Recommended next step
Instead of adopting CrewAI, **evolve the current Huey stack** (a sketch follows this list) with:
1. A lightweight `Agent` dataclass in `tasks.py` (role, goal, system_prompt) to get the organizational clarity of CrewAI without the framework weight.
2. A `delegate()` helper that uses Hermes's existing `delegate_tool.py` for multi-agent work.
3. Gitea kept as the only durable state surface: any "memory" should flush to issue comments or `timmy-home` markdown, not a vector DB.
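A minimal sketch of items 1 and 2 — `Agent`, `delegate()`, the path in `HERMES_PYTHON`, and the CLI flags shown are proposals to be checked against `delegate_tool.py`'s real interface, not existing code:

```python
# Sketch of the proposed lightweight agent layer — hypothetical names,
# not code that exists in tasks.py today.
import subprocess
from dataclasses import dataclass

HERMES_PYTHON = "/opt/hermes/venv/bin/python"  # assumed path

@dataclass(frozen=True)
class Agent:
    role: str
    goal: str
    system_prompt: str

def delegate(agent: Agent, task_text: str) -> str:
    """Route work to Hermes via the existing delegate_tool.py.

    The flags here are assumptions; match them to delegate_tool.py's
    actual argument parsing before use.
    """
    result = subprocess.run(
        [HERMES_PYTHON, "delegate_tool.py",
         "--system", agent.system_prompt,
         "--task", task_text],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

researcher = Agent(
    role="Orchestration Researcher",
    goal="Extract facts from orchestrator code",
    system_prompt="You read code carefully and avoid speculation.",
)
```

This keeps the entire "framework" at ~30 readable lines while preserving the explicit, Gitea-visible routing the protocol requires.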
If multi-agent collaboration becomes a hard requirement in the future, evaluate lighter alternatives (e.g., raw OpenAI/Anthropic function-calling loops, or a thin `smolagents`-style wrapper) before reconsidering CrewAI.
---
## Artifacts
- `poc_crew.py` — 2-agent CrewAI proof-of-concept
- `requirements.txt` — Dependency manifest
- `CREWAI_EVALUATION.md` — This document

View File

@@ -1,150 +0,0 @@
#!/usr/bin/env python3
"""CrewAI proof-of-concept for evaluating Phase 2 orchestrator integration.

Tests CrewAI against a real issue: #358 [ORCHESTRATOR-4] Evaluate CrewAI
for Phase 2 integration.
"""
import os
from pathlib import Path

from crewai import Agent, Task, Crew, LLM
from crewai.tools import BaseTool

# ── Configuration ─────────────────────────────────────────────────────
# Credentials must come from the environment; never hardcode a key in source.
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY", "")

llm = LLM(
    model="openrouter/google/gemini-2.0-flash-001",
    api_key=OPENROUTER_API_KEY,
    base_url="https://openrouter.ai/api/v1",
)
REPO_ROOT = Path(__file__).resolve().parents[2]

def _slurp(relpath: str, max_lines: int = 150) -> str:
    p = REPO_ROOT / relpath
    if not p.exists():
        return f"[FILE NOT FOUND: {relpath}]"
    lines = p.read_text().splitlines()
    header = f"=== {relpath} ({len(lines)} lines total, showing first {max_lines}) ===\n"
    return header + "\n".join(lines[:max_lines])

# ── Tools ─────────────────────────────────────────────────────────────
class ReadOrchestratorFilesTool(BaseTool):
    name: str = "read_orchestrator_files"
    description: str = (
        "Reads the current custom orchestrator implementation files "
        "(orchestration.py, tasks.py, timmy-orchestrator.sh, coordinator-first-protocol.md) "
        "and returns their contents for analysis."
    )

    def _run(self) -> str:
        return "\n\n".join(
            [
                _slurp("orchestration.py"),
                _slurp("tasks.py", max_lines=120),
                _slurp("bin/timmy-orchestrator.sh", max_lines=120),
                _slurp("docs/coordinator-first-protocol.md", max_lines=120),
            ]
        )

class ReadIssueTool(BaseTool):
    name: str = "read_issue_358"
    description: str = "Returns the text of Gitea issue #358 that we are evaluating."

    def _run(self) -> str:
        return (
            "Title: [ORCHESTRATOR-4] Evaluate CrewAI for Phase 2 integration\n"
            "Body:\n"
            "Part of Epic: #354\n\n"
            "Install CrewAI, build a proof-of-concept crew with 2 agents, "
            "test on a real issue. Evaluate: does it add value over our custom orchestrator? Document findings."
        )

# ── Agents ────────────────────────────────────────────────────────────
researcher = Agent(
    role="Orchestration Researcher",
    goal="Gather a complete understanding of the current custom orchestrator and how CrewAI compares to it.",
    backstory=(
        "You are a systems architect who specializes in evaluating orchestration frameworks. "
        "You read code carefully, extract facts, and avoid speculation. "
        "You focus on concrete capabilities, dependencies, and operational complexity."
    ),
    llm=llm,
    tools=[ReadOrchestratorFilesTool(), ReadIssueTool()],
    verbose=True,
)

evaluator = Agent(
    role="Integration Evaluator",
    goal="Synthesize research into a clear recommendation on whether CrewAI adds value for Phase 2.",
    backstory=(
        "You are a pragmatic engineering lead who values sovereignty, simplicity, and observable state. "
        "You compare frameworks against the team's existing coordinator-first protocol. "
        "You produce structured recommendations with explicit trade-offs."
    ),
    llm=llm,
    verbose=True,
)

# ── Tasks ─────────────────────────────────────────────────────────────
task_research = Task(
    description=(
        "Read the current custom orchestrator files and issue #358. "
        "Produce a structured research report covering:\n"
        "1. Current stack summary (Huey + tasks.py + timmy-orchestrator.sh)\n"
        "2. Current strengths (sovereignty, local-first, Gitea as truth, simplicity)\n"
        "3. Current gaps or limitations (if any)\n"
        "4. What CrewAI offers (agent roles, tasks, crews, tools, memory/RAG)\n"
        "5. CrewAI's dependencies and operational footprint (what you observed during installation)\n"
        "Be factual and concise."
    ),
    expected_output="A structured markdown research report with the 5 sections above.",
    agent=researcher,
)

task_evaluate = Task(
    description=(
        "Using the research report, evaluate whether CrewAI should be adopted for Phase 2 integration. "
        "Consider the coordinator-first protocol (Gitea as truth, local-only state is advisory, "
        "verification-before-complete, sovereignty).\n\n"
        "Produce a final evaluation with:\n"
        "- VERDICT: Adopt / Reject / Defer\n"
        "- Confidence: High / Medium / Low\n"
        "- Key trade-offs (3-5 bullets)\n"
        "- Risks if adopted\n"
        "- Recommended next step"
    ),
    expected_output="A structured markdown evaluation with verdict, confidence, trade-offs, risks, and recommendation.",
    agent=evaluator,
    context=[task_research],
)

# ── Crew ──────────────────────────────────────────────────────────────
crew = Crew(
    agents=[researcher, evaluator],
    tasks=[task_research, task_evaluate],
    verbose=True,
)

if __name__ == "__main__":
    print("=" * 70)
    print("CrewAI PoC — Evaluating CrewAI for Phase 2 Integration")
    print("=" * 70)
    result = crew.kickoff()
    print("\n" + "=" * 70)
    print("FINAL OUTPUT")
    print("=" * 70)
    print(result.raw)

View File

@@ -1 +0,0 @@
crewai>=1.13.0

191
fleet/capacity-inventory.md Normal file
View File

@@ -0,0 +1,191 @@
# Capacity Inventory - Fleet Resource Baseline
**Last audited:** 2026-04-07 16:00 UTC
**Auditor:** Timmy (direct inspection)
---
## Fleet Resources (Paperclips Model)
Three primary resources govern the fleet:
| Resource | Role | Generation | Consumption |
|----------|------|-----------|-------------|
| **Capacity** | Compute hours available across fleet. Determines what work can be done. | Through healthy utilization of VPS/Mac agents | Fleet improvements consume it (investing in automation, orchestration, sovereignty) |
| **Uptime** | % time services are running. Earned at Fibonacci milestones. | When services stay up naturally | Degrades on any failure |
| **Innovation** | Fuels Phase 3+ capabilities. | Only when capacity utilization is below 70% (leave capacity free) | Phase 3+ buildings consume it (requires spare capacity to build) |
### The Tension
- Run fleet at 95%+ capacity: maximum productivity, ZERO Innovation
- Run fleet at <70% capacity: Innovation generates but slower progress
- This forces the Paperclips question: optimize now or invest in future capability?
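A worked example of the generation curve, mirroring the formula implemented in `fleet/resource_tracker.py` (5/hr base rate, scaled down as utilization approaches the 70% cutoff):

```python
# Innovation generation per hour, as in regenerate_resources()
# in fleet/resource_tracker.py.
INNOVATION_THRESHOLD = 0.70
INNOVATION_RATE = 5.0  # per hour when the fleet is fully idle

def innovation_per_hour(utilization: float) -> float:
    if utilization >= INNOVATION_THRESHOLD:
        return 0.0  # fleet too busy: Innovation is blocked
    return INNOVATION_RATE * (1.0 - utilization / INNOVATION_THRESHOLD)

print(innovation_per_hour(0.175))  # ~3.75/hr at today's ~15-20% utilization
print(innovation_per_hour(0.95))   # 0.0 — maximum productivity, zero Innovation
```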
---
## VPS Resource Baselines
### Ezra (143.198.27.163) - "Forge"
| Metric | Value | Utilization |
|--------|-------|-------------|
| **OS** | Ubuntu 24.04 (6.8.0-106-generic) | |
| **vCPU** | 4 vCPU (DO basic droplet, shared) | Load: 10.76/7.59/7.04 (very high) |
| **RAM** | 7,941 MB total | 2,104 used / 5,836 available (26% used, 74% free) |
| **Disk** | 154 GB vda1 | 111 GB used / 44 GB free (72%) **WARNING** |
| **Swap** | 6,143 MB | 643 MB used (10%) |
| **Uptime** | 7 days, 18 hours | |
### Key Processes (sorted by memory)
| Process | RSS | %CPU | Notes |
|---------|-----|------|-------|
| Gitea | 556 MB | 83.5% | Web service, high CPU due to API load |
| MemPalace (ezra) | 268 MB | 136% | Mining project files - HIGH CPU |
| Hermes gateway (ezra) | 245 MB | 1.7% | Agent gateway |
| Ollama | 230 MB | 0.1% | Model serving |
| PostgreSQL | 138 MB | ~0% | Gitea database |
**Capacity assessment:** 26% memory used, but 72% disk is getting tight. CPU load is very high (10.76 on 4vCPU = 269% utilization). Ezra is CPU-bound, not RAM-bound.
### Allegro (167.99.126.228)
| Metric | Value | Utilization |
|--------|-------|-------------|
| **OS** | Ubuntu 24.04 (6.8.0-106-generic) | |
| **vCPU** | 4 vCPU (DO basic droplet, shared) | Moderate load |
| **RAM** | 7,941 MB total | 1,591 used / 6,349 available (20% used, 80% free) |
| **Disk** | 154 GB vda1 | 41 GB used / 114 GB free (27%) **GOOD** |
| **Swap** | 8,191 MB | 686 MB used (8%) |
| **Uptime** | 7 days, 18 hours | |
### Key Processes (sorted by memory)
| Process | RSS | %CPU | Notes |
|---------|-----|------|-------|
| Hermes gateway (allegro) | 680 MB | 0.9% | Main agent gateway |
| Gitea | 181 MB | 1.2% | Secondary gitea? |
| Systemd-journald | 160 MB | 0.0% | System logging |
| Ezra Hermes gateway | 58 MB | 0.0% | Running ezra agent here |
| Bezalel Hermes gateway | 58 MB | 0.0% | Running bezalel agent here |
| Dockerd | 48 MB | 0.0% | Docker daemon |
**Capacity assessment:** 20% memory used, 27% disk used. Allegro has headroom. Also running hermes gateways for Ezra and Bezalel (cross-host agent execution).
### Bezalel (159.203.146.185)
| Metric | Value | Utilization |
|--------|-------|-------------|
| **OS** | Ubuntu 24.04 (6.8.0-71-generic) | |
| **vCPU** | 2 vCPU (DO basic droplet, shared) | Load varies |
| **RAM** | 1,968 MB total | 817 used / 1,151 available (42% used, 58% free) |
| **Disk** | 48 GB vda1 | 12 GB used / 37 GB free (24%) **GOOD** |
| **Swap** | 2,047 MB | 448 MB used (22%) |
| **Uptime** | 7 days, 18 hours | |
### Key Processes (sorted by memory)
| Process | RSS | %CPU | Notes |
|---------|-----|------|-------|
| Hermes gateway | 339 MB | 7.7% | Agent gateway (16.8% of RAM) |
| uv pip install | 137 MB | 56.6% | Installing packages (temporary) |
| Mender | 27 MB | 0.0% | Device management |
**Capacity assessment:** 42% memory used, only 2GB total RAM. Bezalel is the most constrained. 2 vCPU means less compute headroom than Ezra/Allegro. Disk is fine.
### Mac Local (M3 Max)
| Metric | Value | Utilization |
|--------|-------|-------------|
| **OS** | macOS 26.3.1 | |
| **CPU** | Apple M3 Max (14 cores) | Very capable |
| **RAM** | 36 GB | ~8 GB used (22%) |
| **Disk** | 926 GB total | ~624 GB used / 302 GB free (68%) |
### Key Processes
| Process | Memory | Notes |
|---------|--------|-------|
| Hermes gateway | 500 MB | Primary gateway |
| Hermes agents (x3) | ~560 MB total | Multiple sessions |
| Ollama | ~20 MB base + model memory | Model loading varies |
| OpenClaw | 350 MB | Gateway process |
| Evennia (server+portal) | 56 MB | Game world |
---
## Resource Summary
| Resource | Ezra | Allegro | Bezalel | Mac Local | TOTAL |
|----------|------|---------|---------|-----------|-------|
| **vCPU** | 4 | 4 | 2 | 14 (M3 Max) | 24 |
| **RAM** | 8 GB (26% used) | 8 GB (20% used) | 2 GB (42% used) | 36 GB (22% used) | 54 GB |
| **Disk** | 154 GB (72%) | 154 GB (27%) | 48 GB (24%) | 926 GB (68%) | 1,282 GB |
| **Cost** | $12/mo | $12/mo | $12/mo | owned | $36/mo |
### Utilization by Category
| Category | Estimated Daily Hours | % of Fleet Capacity |
|----------|----------------------|---------------------|
| Hermes agents | ~3-4 hrs active | 5-7% |
| Ollama inference | ~1-2 hrs | 2-4% |
| Gitea services | 24/7 | 5-10% |
| Evennia | 24/7 | <1% |
| Idle | ~18-20 hrs | ~80-90% |
### Capacity Utilization: ~15-20% active
**Innovation rate:** GENERATING (capacity < 70%)
**Recommendation:** Good — Innovation is generating because most capacity is free.
This means Phase 3+ capabilities (orchestration, load balancing, etc.) are accessible NOW.
---
## Uptime Baseline
**Baseline period:** 2026-04-07 14:00-16:00 UTC (2 hours, ~24 checks at 5-min intervals)
| Service | Checks | Uptime | Status |
|---------|--------|--------|--------|
| Ezra | 24/24 | 100.0% | GOOD |
| Allegro | 24/24 | 100.0% | GOOD |
| Bezalel | 24/24 | 100.0% | GOOD |
| Gitea | 23/24 | 95.8% | GOOD |
| Hermes Gateway | 23/24 | 95.8% | GOOD |
| Ollama | 24/24 | 100.0% | GOOD |
| OpenClaw | 24/24 | 100.0% | GOOD |
| Evennia | 24/24 | 100.0% | GOOD |
| Hermes Agent | 21/24 | 87.5% | **CHECK** |
### Fibonacci Uptime Milestones
| Milestone | Target | Current | Status |
|-----------|--------|---------|--------|
| 95% | 95% | 100% (VPS), 98.6% (avg) | REACHED |
| 95.5% | 95.5% | 98.6% | REACHED |
| 96% | 96% | 98.6% | REACHED |
| 97% | 97% | 98.6% | REACHED |
| 98% | 98% | 98.6% | REACHED |
| 99% | 99% | 98.6% | APPROACHING |
---
## Risk Assessment
| Risk | Severity | Mitigation |
|------|----------|------------|
| Ezra disk 72% used | MEDIUM | Move non-essential data, add monitoring alert at 85% |
| Bezalel only 2GB RAM | HIGH | Cannot run large models locally. Good for Evennia, tight for agents |
| Ezra CPU load 269% | HIGH | MemPalace mining consuming 136% CPU. Consider scheduling |
| Mac disk 68% used | MEDIUM | 302 GB free still. Growing but not urgent |
| No cross-VPS mesh | LOW | SSH works but no Tailscale. No private network between VPSes |
---
## Recommendations
### Immediate (Phase 1-2)
1. **Ezra disk cleanup:** 44 GB free at 72%. Docker images, old logs, and MemPalace mine data could be rotated.
2. **Alert thresholds:** Add disk alerts at 85% (Ezra, Mac) before they become critical.
### Short-term (Phase 3)
3. **Load balancing:** Ezra is CPU-bound, Allegro has 80% RAM free. Move some agent processes from Ezra to Allegro.
4. **Innovation investment:** Since fleet is at 15-20% utilization, Innovation is high. This is the time to build Phase 3 capabilities.
### Medium-term (Phase 4)
5. **Bezalel RAM upgrade:** 2GB is tight. Consider upgrading to 4GB ($24/mo instead of $12/mo).
6. **Tailscale mesh:** Install on all VPSes for private inter-VPS network.
---

142
fleet/milestones.md Normal file
View File

@@ -0,0 +1,142 @@
# Fleet Milestone Messages
Every milestone marks passage through fleet evolution. When achieved, the message
prints to the fleet log. Each one references a real achievement, not abstract numbers.
**Source:** Inspired by Paperclips milestone messages (500 clips, 1000 clips, Full autonomy attained, etc.)
---
## Phase 1: Survival (Current)
### M1: First Automated Health Check
**Trigger:** `fleet/health_check.py` runs successfully for the first time.
**Message:** "First automated health check runs. No longer watching the clock."
### M2: First Auto-Restart
**Trigger:** A dead process is detected and restarted without human intervention.
**Message:** "A process failed at 3am and restarted itself. You found out in the morning."
### M3: First Backup Completed
**Trigger:** A backup pipeline runs end-to-end and verifies integrity.
**Message:** "A backup completed. You did not have to think about it."
### M4: 95% Uptime (30 days)
**Trigger:** Uptime >= 95% over last 30 days.
**Message:** "95% uptime over 30 days. The fleet stays up."
### M5: 97% Uptime
**Trigger:** Uptime >= 97% over last 30 days.
**Message:** "97% uptime. Under 22 hours of downtime a month across four machines."
---
## Phase 2: Automation (unlock when: uptime >= 95% + capacity > 60%)
### M6: Zero Manual Restarts (7 days)
**Trigger:** 7 consecutive days with zero manual process restarts.
**Message:** "Seven days. Zero manual restarts. The fleet heals itself."
### M7: PR Auto-Merged
**Trigger:** A PR passes CI, review, and merges without human touching it.
**Message:** "A PR was tested, reviewed, and merged by agents. You just said 'looks good.'"
### M8: Config Push Works
**Trigger:** Config change pushed to all 3 VPSes atomically and verified.
**Message:** "Config pushed to all three VPSes in one command. No SSH needed."
### M9: 98% Uptime
**Trigger:** Uptime >= 98% over last 30 days.
**Message:** "98% uptime. Only 14 hours of downtime in a month. Most of it planned."
---
## Phase 3: Orchestration (unlock when: all Phase 2 buildings + Innovation > 100)
### M10: Cross-Agent Delegation Works
**Trigger:** Agent A creates issue, assigns to Agent B, Agent B works and creates PR.
**Message:** "Agent Alpha created a task, Agent Beta completed it. They did not ask permission."
### M11: First Model Running Locally on 2+ Machines
**Trigger:** Ollama serving same model on Ezra and Allegro simultaneously.
**Message:** "A model runs on two machines at once. No cloud. No rate limits."
### M12: Fleet-Wide Burn Mode
**Trigger:** All agents coordinated on single epic, produced coordinated PRs.
**Message:** "All agents working the same epic. The fleet moves as one."
---
## Phase 4: Sovereignty (unlock when: zero cloud deps for core ops)
### M13: First Entirely Local Inference Day
**Trigger:** 24 hours with zero API calls to external providers.
**Message:** "A model ran locally for the first time. No cloud. No rate limits. No one can turn it off."
### M14: Sovereign Email
**Trigger:** Stalwart email server sends and receives without Gmail relay.
**Message:** "Email flows through our own server. No Google. No Microsoft. Ours."
### M15: Sovereign Messaging
**Trigger:** Telegram bot runs without cloud relay dependency.
**Message:** "Messages arrive through our own infrastructure. No corporate middleman."
---
## Phase 5: Scale (unlock when: sovereignty stable + Innovation > 500)
### M16: First Self-Spawned Agent
**Trigger:** Agent lifecycle manager spawns a new agent instance due to load.
**Message:** "A new agent appeared. You did not create it. The fleet built what it needed."
### M17: Agent Retired Gracefully
**Trigger:** An agent instance retires after idle timeout and cleans up its state.
**Message:** "An agent retired. It served its purpose. Nothing was lost."
### M18: Fleet Runs 24h Unattended
**Trigger:** 24 hours with zero human intervention of any kind.
**Message:** "A full day. No humans. No commands. The fleet runs itself."
---
## Phase 6: The Network (unlock when: 7 days zero human intervention)
### M19: Fleet Creates Its Own Improvement Task
**Trigger:** Fleet analyzes itself and creates an issue on Gitea.
**Message:** "The fleet found something to improve. It created the task itself."
### M20: First Outside Contribution
**Trigger:** An external contributor's PR is reviewed and merged by fleet agents.
**Message:** "Someone outside the fleet contributed. The fleet reviewed, tested, and merged. No human touched it."
### M21: The Beacon
**Trigger:** Infrastructure serves someone in need through automated systems.
**Message:** "Someone found the Beacon. In the dark, looking for help. The infrastructure served its purpose. It was built for this."
### M22: Permanent Light
**Trigger:** 90 days of autonomous operation with continuous availability.
**Message:** "Three months. The light never went out. Not for anyone."
---
## Fibonacci Uptime Milestones
These trigger regardless of phase, based purely on uptime percentage. Allowed downtime below assumes a 30-day (720-hour) month:
| Milestone | Uptime | Meaning |
|-----------|--------|---------|
| U1 | 95% | Basic reliability achieved (36 hours/month downtime allowed) |
| U2 | 95.5% | Under 33 hours/month downtime |
| U3 | 96% | Under 29 hours/month |
| U4 | 97% | Under 22 hours/month |
| U5 | 97.5% | Under 18 hours/month |
| U6 | 98% | Under 14.5 hours/month |
| U7 | 98.3% | Under 12.5 hours/month |
| U8 | 98.6% | Under 10.5 hours/month — approaching cloud tier |
| U9 | 98.9% | Under 8 hours/month |
| U10 | 99% | Under 7.5 hours/month — enterprise grade |
| U11 | 99.5% | Under 4 hours/month |
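For reference, a minimal sketch of the arithmetic behind the downtime column (30-day month assumed):

```python
# Allowed downtime per 30-day month for a given uptime target.
def monthly_downtime_hours(uptime_pct: float, hours: float = 720.0) -> float:
    return (1.0 - uptime_pct / 100.0) * hours

for pct in (95.0, 98.0, 99.0, 99.5):
    print(f"{pct}% -> {monthly_downtime_hours(pct):.1f} h/month")
# 95.0% -> 36.0, 98.0% -> 14.4, 99.0% -> 7.2, 99.5% -> 3.6
```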
---
*Every message is earned. None are given freely. Fleet evolution is not a checklist — it is a climb.*

231
fleet/resource_tracker.py Executable file
View File

@@ -0,0 +1,231 @@
#!/usr/bin/env python3
"""
Fleet Resource Tracker — Tracks Capacity, Uptime, and Innovation.

Paperclips-inspired tension model:
- Capacity: spent on fleet improvements, generates through utilization
- Uptime: earned when services stay up, Fibonacci milestones unlock capabilities
- Innovation: only generates when capacity < 70%. Fuels Phase 3+.

This is the heart of the fleet progression system.
"""
import os
import json
import time
from datetime import datetime, timezone
from pathlib import Path

# === CONFIG ===
DATA_DIR = Path(os.path.expanduser("~/.local/timmy/fleet-resources"))
RESOURCES_FILE = DATA_DIR / "resources.json"

# Tension thresholds
INNOVATION_THRESHOLD = 0.70  # Innovation only generates when capacity < 70%
INNOVATION_RATE = 5.0        # Innovation generated per hour when under threshold
CAPACITY_REGEN_RATE = 2.0    # Capacity regenerates per hour of healthy operation

FIBONACCI = [95.0, 95.5, 96.0, 97.0, 97.5, 98.0, 98.3, 98.6, 98.9, 99.0, 99.5]
def init():
    DATA_DIR.mkdir(parents=True, exist_ok=True)
    if not RESOURCES_FILE.exists():
        data = {
            "capacity": {
                "current": 100.0,
                "max": 100.0,
                "spent_on": [],
                "history": []
            },
            "uptime": {
                "current_pct": 100.0,
                "milestones_reached": [],
                "total_checks": 0,
                "successful_checks": 0,
                "history": []
            },
            "innovation": {
                "current": 0.0,
                "total_generated": 0.0,
                "spent_on": [],
                "last_calculated": time.time()
            }
        }
        RESOURCES_FILE.write_text(json.dumps(data, indent=2))
        print("Initialized resource tracker")
    return RESOURCES_FILE.exists()

def load():
    if RESOURCES_FILE.exists():
        return json.loads(RESOURCES_FILE.read_text())
    return None

def save(data):
    RESOURCES_FILE.write_text(json.dumps(data, indent=2))
def update_uptime(checks: dict):
    """Update uptime stats from health check results.

    checks = {'ezra': True, 'allegro': True, 'bezalel': True, 'gitea': True, ...}
    """
    data = load()
    if not data:
        return
    data["uptime"]["total_checks"] += 1
    successes = sum(1 for v in checks.values() if v)
    total = len(checks)
    # Overall uptime percentage for this check round
    overall = successes / max(total, 1) * 100.0
    data["uptime"]["successful_checks"] += successes
    # Calculate rolling uptime
    if "history" not in data["uptime"]:
        data["uptime"]["history"] = []
    data["uptime"]["history"].append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "checks": checks,
        "overall": round(overall, 2)
    })
    # Keep last 1000 checks
    if len(data["uptime"]["history"]) > 1000:
        data["uptime"]["history"] = data["uptime"]["history"][-1000:]
    # Current uptime %: mean over the last 100 check rounds
    recent = data["uptime"]["history"][-100:]
    recent_ok = sum(c["overall"] for c in recent) / max(len(recent), 1)
    data["uptime"]["current_pct"] = round(recent_ok, 2)
    # Check Fibonacci milestones
    new_milestones = []
    for fib in FIBONACCI:
        if fib not in data["uptime"]["milestones_reached"] and recent_ok >= fib:
            data["uptime"]["milestones_reached"].append(fib)
            new_milestones.append(fib)
    save(data)
    if new_milestones:
        print(f" UPTIME MILESTONE: {', '.join(str(m) + '%' for m in new_milestones)}")
        print(f" Current uptime: {recent_ok:.1f}%")
    return data["uptime"]
def spend_capacity(amount: float, purpose: str):
    """Spend capacity on a fleet improvement."""
    data = load()
    if not data:
        return False
    if data["capacity"]["current"] < amount:
        print(f" INSUFFICIENT CAPACITY: Need {amount}, have {data['capacity']['current']:.1f}")
        return False
    data["capacity"]["current"] -= amount
    data["capacity"]["spent_on"].append({
        "purpose": purpose,
        "amount": amount,
        "ts": datetime.now(timezone.utc).isoformat()
    })
    save(data)
    print(f" Spent {amount} capacity on: {purpose}")
    return True

def regenerate_resources():
    """Regenerate capacity and calculate innovation."""
    data = load()
    if not data:
        return
    now = time.time()
    last = data["innovation"]["last_calculated"]
    hours = (now - last) / 3600.0
    if hours < 0.1:  # Only update every ~6 minutes
        return
    # Regenerate capacity
    capacity_gain = CAPACITY_REGEN_RATE * hours
    data["capacity"]["current"] = min(
        data["capacity"]["max"],
        data["capacity"]["current"] + capacity_gain
    )
    # Calculate capacity utilization
    utilization = 1.0 - (data["capacity"]["current"] / data["capacity"]["max"])
    # Generate innovation only when under threshold
    innovation_gain = 0.0
    if utilization < INNOVATION_THRESHOLD:
        innovation_gain = INNOVATION_RATE * hours * (1.0 - utilization / INNOVATION_THRESHOLD)
        data["innovation"]["current"] += innovation_gain
        data["innovation"]["total_generated"] += innovation_gain
    # Record history
    if "history" not in data["capacity"]:
        data["capacity"]["history"] = []
    data["capacity"]["history"].append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "capacity": round(data["capacity"]["current"], 1),
        "utilization": round(utilization * 100, 1),
        "innovation": round(data["innovation"]["current"], 1),
        "innovation_gain": round(innovation_gain, 1)
    })
    # Keep last 500 capacity records
    if len(data["capacity"]["history"]) > 500:
        data["capacity"]["history"] = data["capacity"]["history"][-500:]
    data["innovation"]["last_calculated"] = now
    save(data)
    print(f" Capacity: {data['capacity']['current']:.1f}/{data['capacity']['max']:.1f}")
    print(f" Utilization: {utilization*100:.1f}%")
    print(f" Innovation: {data['innovation']['current']:.1f} (+{innovation_gain:.1f} this period)")
    return data
def status():
    """Print current resource status."""
    data = load()
    if not data:
        print("Resource tracker not initialized. Run this script once to create it.")
        return
    print("\n=== Fleet Resources ===")
    print(f" Capacity: {data['capacity']['current']:.1f}/{data['capacity']['max']:.1f}")
    utilization = 1.0 - (data["capacity"]["current"] / data["capacity"]["max"])
    print(f" Utilization: {utilization*100:.1f}%")
    innovation_status = "GENERATING" if utilization < INNOVATION_THRESHOLD else "BLOCKED"
    print(f" Innovation: {data['innovation']['current']:.1f} [{innovation_status}]")
    print(f" Uptime: {data['uptime']['current_pct']:.1f}%")
    print(f" Milestones: {', '.join(str(m) + '%' for m in data['uptime']['milestones_reached']) or 'None yet'}")
    # Phase gate checks
    phase_2_ok = data['uptime']['current_pct'] >= 95.0
    phase_3_ok = phase_2_ok and data['innovation']['current'] > 100
    phase_5_ok = phase_2_ok and data['innovation']['current'] > 500
    print("\n Phase Gates:")
    print(f" Phase 2 (Automation): {'UNLOCKED' if phase_2_ok else 'LOCKED (need 95% uptime)'}")
    print(f" Phase 3 (Orchestration): {'UNLOCKED' if phase_3_ok else 'LOCKED (need 95% uptime + 100 innovation)'}")
    print(f" Phase 5 (Scale): {'UNLOCKED' if phase_5_ok else 'LOCKED (need 95% uptime + 500 innovation)'}")
if __name__ == "__main__":
    import sys
    init()
    if len(sys.argv) > 1 and sys.argv[1] == "status":
        status()
    elif len(sys.argv) > 1 and sys.argv[1] == "regen":
        regenerate_resources()
    else:
        regenerate_resources()
        status()
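# Usage sketch — how a health-check loop could feed this tracker.
# Illustrative only: fleet/health_check.py's real integration and the
# exact service names in the checks dict may differ.
#
#   import resource_tracker as rt
#   rt.init()
#   rt.update_uptime({"ezra": True, "allegro": True, "bezalel": True, "gitea": True})
#   rt.regenerate_resources()   # accrue capacity + innovation since last run
#   rt.spend_capacity(10.0, "example: automation build-out")
#   rt.status()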