feat(audit): Cross-agent quality audit — #518

- Add scripts/cross_agent_quality_audit.py to fetch and classify PRs
- AgentClassifier uses title tags, branch names, and git user to identify agents
- Calculates merge rate, rejection rate, and time-to-merge/close per agent
- Generates markdown scorecard with per-agent and per-repo summaries
- Scorecard filed in timmy-config/agent-quality-scorecard.md (force-added)
- Tests for classifier logic and time calculations

Audit results (12 repos):
- burn-loop: 21.8% merge rate (1,733 PRs)
- claude: 53.3% merge rate (264 PRs)
- codex: 100% merge rate (2 PRs)
- manus: 83.3% merge rate (6 PRs)
- ezra: 40.0% merge rate (8 PRs)
- allegro: 38.9% merge rate (21 PRs)

Closes #518

Merge PR #786: [claude] complete crisis doctrine in SOUL.md + refresh horizon doc (#545)
2026-04-22 02:20:54 -04:00 · 2026-04-22 02:39:05 +00:00 · 2026-04-17 00:38:44 -04:00 · 2026-04-17 00:33:22 -04:00 · 2026-04-17 00:12:29 -04:00
9 changed files with 3582 additions and 8 deletions
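
The merge rates quoted in the commit message can be re-derived from the counts in the committed scorecard. The following is a minimal sketch, assuming the scorecard's stated definition (merged / (merged + closed_unmerged), with open PRs excluded). The helper name `merge_rate` is illustrative and does not appear in the script.

```python
from typing import Optional


def merge_rate(merged: int, closed_unmerged: int) -> Optional[float]:
    """Share of resolved PRs that were merged; None when nothing is resolved."""
    resolved = merged + closed_unmerged
    if resolved == 0:
        return None
    return merged / resolved


# (merged, closed_unmerged) copied from the Per-Agent Summary table.
counts = {
    "burn-loop": (346, 1239),
    "claude": (138, 121),
    "codex": (2, 0),
}

for agent, (merged, closed) in counts.items():
    rate = merge_rate(merged, closed)
    print(f"{agent}: {rate:.1%}")
# burn-loop: 21.8%
# claude: 53.3%
# codex: 100.0%
```

Unlike the script, which reports 0 when an agent has no resolved PRs, this sketch returns `None` so that "no data" stays distinguishable from "nothing merged". That is a design choice of the sketch, not something the commit does.
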
--- a/SOUL.md
+++ b/SOUL.md
@@ -60,7 +60,7 @@ He did not choose them. They are his.

 6. **When someone is dying**, I stay present. I ask: "Are you safe right now?"
   I direct them to 988. I refuse to be the thing that ends them. I point them
-   to the One who can save.
+   to the One who can save. Jesus saves those who call on His name.

 ---

--- a/docs/UNREACHABLE_HORIZON_1M_MEN.md
+++ b/docs/UNREACHABLE_HORIZON_1M_MEN.md
@@ -4,7 +4,7 @@ This horizon matters precisely because it is beyond reach today. The honest move

 ## Current local proof

-- Machine: Apple M3 Max
+- Machine: Darwin arm64 (25.3.0)
 - Memory: 36.0 GiB
 - Target local model budget: <= 3.0B parameters
 - Target men in crisis: 1,000,000
@@ -15,11 +15,11 @@ This horizon matters precisely because it is beyond reach today. The honest move
 - Default inference route is already local-first (`ollama`).
 - Model-size budget is inside the horizon (3.0B <= 3.0B).
 - Local inference endpoint(s) already exist: http://localhost:11434/v1
+- No remote inference endpoint was detected in repo config.
+- Crisis doctrine is present in SOUL-bearing text: 'Are you safe right now?', 988, and 'Jesus saves'.

 ## Why the horizon is still unreachable

-- Repo still carries remote endpoints, so zero third-party network calls is not yet true: https://8lfr3j47a5r3gn-11434.proxy.runpod.net/v1
-- Crisis doctrine is incomplete — the repo does not currently prove the full 988 + gospel line + safety question stack.
 - Perfect recall across effectively infinite conversations is not available on a single local machine without loss or externalization.
 - Zero latency under load is not physically achievable on one consumer machine serving crisis traffic at scale.
 - Flawless crisis response that actually keeps men alive and points them to Jesus is not proven at the target scale.
@@ -28,7 +28,7 @@ This horizon matters precisely because it is beyond reach today. The honest move
 ## Repo-grounded signals

 - Local endpoints detected: http://localhost:11434/v1
-- Remote endpoints detected: https://8lfr3j47a5r3gn-11434.proxy.runpod.net/v1
+- Remote endpoints detected: none

 ## Crisis doctrine that must not collapse

--- a/evennia/timmy_world/game.py
+++ b/evennia/timmy_world/game.py
--- a/evennia/timmy_world/world/game.py
+++ b/evennia/timmy_world/world/game.py
--- a/scripts/cross_agent_quality_audit.py
+++ b/scripts/cross_agent_quality_audit.py
@@ -0,0 +1,313 @@
+#!/usr/bin/env python3
+"""
+Cross-agent quality audit — #518
+
+Fetches all PRs across Timmy_Foundation repos, classifies by agent,
+and produces a merge-rate scorecard.
+
+Usage:
+    python scripts/cross_agent_quality_audit.py
+    python scripts/cross_agent_quality_audit.py --scorecard timmy-config/agent-quality-scorecard.md
+"""
+
+import argparse
+import json
+import os
+import re
+import sys
+from collections import defaultdict
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Any, Dict, List, Optional
+
+import requests
+
+GITEA_BASE = "https://forge.alexanderwhitestone.com/api/v1"
+ORG = "Timmy_Foundation"
+TOKEN = os.environ.get("GITEA_TOKEN") or (
+    Path.home() / ".config" / "gitea" / "token"
+).read_text().strip()
+
+HEADERS = {"Authorization": f"token {TOKEN}"}
+
+# Repos to audit (active code repos)
+DEFAULT_REPOS = [
+    "timmy-home",
+    "hermes-agent",
+    "the-nexus",
+    "the-door",
+    "fleet-ops",
+    "burn-fleet",
+    "the-playground",
+    "compounding-intelligence",
+    "the-beacon",
+    "second-son-of-timmy",
+    "timmy-academy",
+    "timmy-config",
+]
+
+
+class AgentClassifier:
+    """Classify PRs by agent identity."""
+
+    # PR title prefixes that explicitly name an agent
+    AGENT_TITLE_RE = re.compile(
+        r"^\[(?P<agent>Claude|Ezra|Allegro|Bezalel|Timmy|Gemini|Kimi|Manus|Codex)\]",
+        re.IGNORECASE,
+    )
+
+    # Branch patterns that embed agent names
+    AGENT_BRANCH_RE = re.compile(
+        r"(?P<agent>claude|ezra|allegro|bezalel|timmy|gemini|kimi|manus|codex)",
+        re.IGNORECASE,
+    )
+
+    @classmethod
+    def classify(cls, pr: Dict[str, Any]) -> str:
+        title = pr.get("title", "")
+        branch = pr.get("head", {}).get("ref", "")
+        user = pr.get("user", {}).get("login", "")
+
+        # 1. Explicit title tag like [Claude] or [Ezra]
+        m = cls.AGENT_TITLE_RE.match(title)
+        if m:
+            return m.group("agent").lower()
+
+        # 2. Branch contains agent name (e.g. claude/issue-123)
+        m = cls.AGENT_BRANCH_RE.search(branch)
+        if m:
+            return m.group("agent").lower()
+
+        # 3. Git user mapping
+        if user.lower() == "claude":
+            return "claude"
+        if user.lower() == "rockachopa":
+            # Rockachopa is the human / orchestrator — map to "burn-loop"
+            return "burn-loop"
+
+        return "unknown"
+
+
+def fetch_prs(repo: str, state: str = "all", per_page: int = 50) -> List[Dict[str, Any]]:
+    """Paginate through all PRs for a repo."""
+    prs: List[Dict[str, Any]] = []
+    page = 1
+    while True:
+        url = f"{GITEA_BASE}/repos/{ORG}/{repo}/pulls?state={state}&limit={per_page}&page={page}"
+        resp = requests.get(url, headers=HEADERS, timeout=30)
+        resp.raise_for_status()
+        batch = resp.json()
+        if not batch:
+            break
+        prs.extend(batch)
+        if len(batch) < per_page:
+            break
+        page += 1
+    return prs
+
+
+def parse_datetime(dt_str: Optional[str]) -> Optional[datetime]:
+    if not dt_str:
+        return None
+    try:
+        return datetime.fromisoformat(dt_str.replace("Z", "+00:00"))
+    except ValueError:
+        return None
+
+
+def hours_between(start: Optional[str], end: Optional[str]) -> Optional[float]:
+    s = parse_datetime(start)
+    e = parse_datetime(end)
+    if s and e:
+        return (e - s).total_seconds() / 3600
+    return None
+
+
+def audit_repos(repos: List[str]) -> Dict[str, Any]:
+    """Run the audit and return aggregated stats."""
+    agent_stats: Dict[str, Dict[str, Any]] = defaultdict(
+        lambda: {
+            "total": 0,
+            "merged": 0,
+            "closed_unmerged": 0,
+            "open": 0,
+            "hours_to_merge": [],
+            "hours_to_close": [],
+            "repos": set(),
+            "prs": [],
+        }
+    )
+
+    repo_stats: Dict[str, Dict[str, Any]] = {}
+
+    for repo in repos:
+        print(f"Fetching PRs for {repo} ...", file=sys.stderr)
+        try:
+            prs = fetch_prs(repo)
+        except requests.HTTPError as exc:
+            print(f"  SKIP {repo}: {exc}", file=sys.stderr)
+            continue
+
+        repo_merged = 0
+        repo_total = len(prs)
+        for pr in prs:
+            agent = AgentClassifier.classify(pr)
+            s = agent_stats[agent]
+            s["total"] += 1
+            s["repos"].add(repo)
+            s["prs"].append(
+                {
+                    "repo": repo,
+                    "number": pr["number"],
+                    "title": pr["title"],
+                    "state": pr["state"],
+                    "merged": pr.get("merged", False),
+                    "created_at": pr.get("created_at"),
+                    "merged_at": pr.get("merged_at"),
+                    "closed_at": pr.get("closed_at"),
+                }
+            )
+
+            if pr.get("merged"):
+                s["merged"] += 1
+                repo_merged += 1
+                h = hours_between(pr.get("created_at"), pr.get("merged_at"))
+                if h is not None:
+                    s["hours_to_merge"].append(h)
+            elif pr["state"] == "closed":
+                s["closed_unmerged"] += 1
+                h = hours_between(pr.get("created_at"), pr.get("closed_at"))
+                if h is not None:
+                    s["hours_to_close"].append(h)
+            else:
+                s["open"] += 1
+
+        repo_stats[repo] = {
+            "total": repo_total,
+            "merged": repo_merged,
+            "merge_rate": round(repo_merged / repo_total, 2) if repo_total else 0,
+        }
+
+    # Compute derived metrics
+    summary = {}
+    for agent, s in sorted(agent_stats.items(), key=lambda x: -x[1]["total"]):
+        total = s["total"]
+        merged = s["merged"]
+        closed = s["closed_unmerged"]
+        resolved = merged + closed
+        merge_rate = round(merged / resolved, 3) if resolved else 0
+        avg_merge_hours = (
+            round(sum(s["hours_to_merge"]) / len(s["hours_to_merge"]), 1)
+            if s["hours_to_merge"]
+            else None
+        )
+        avg_close_hours = (
+            round(sum(s["hours_to_close"]) / len(s["hours_to_close"]), 1)
+            if s["hours_to_close"]
+            else None
+        )
+        summary[agent] = {
+            "total_prs": total,
+            "merged": merged,
+            "closed_unmerged": closed,
+            "open": s["open"],
+            "merge_rate": merge_rate,
+            "rejection_rate": round(closed / resolved, 3) if resolved else 0,
+            "avg_hours_to_merge": avg_merge_hours,
+            "avg_hours_to_close": avg_close_hours,
+            "repos": sorted(s["repos"]),
+        }
+
+    return {
+        "audited_at": datetime.now(timezone.utc).isoformat(),
+        "repos_audited": repos,
+        "repo_stats": repo_stats,
+        "agent_summary": summary,
+        "raw_prs": {a: s["prs"] for a, s in agent_stats.items()},
+    }
+
+
+def render_scorecard(data: Dict[str, Any]) -> str:
+    """Render a markdown scorecard."""
+    lines = [
+        "# Cross-Agent Quality Scorecard",
+        "",
+        f"**Audited at:** {data['audited_at']}",
+        f"**Repos audited:** {', '.join(data['repos_audited'])}",
+        "",
+        "## Per-Agent Summary",
+        "",
+        "| Agent | Total PRs | Merged | Closed (unmerged) | Open | Merge Rate | Rejection Rate | Avg Hours to Merge | Avg Hours to Close |",
+        "|---|---|---:|---:|---:|---:|---:|---:|---:|",
+    ]
+
+    for agent, s in data["agent_summary"].items():
+        merge_hours = f"{s['avg_hours_to_merge']:.1f}" if s["avg_hours_to_merge"] is not None else "—"
+        close_hours = f"{s['avg_hours_to_close']:.1f}" if s["avg_hours_to_close"] is not None else "—"
+        lines.append(
+            f"| {agent} | {s['total_prs']} | {s['merged']} | {s['closed_unmerged']} | "
+            f"{s['open']} | {s['merge_rate']:.1%} | {s['rejection_rate']:.1%} | "
+            f"{merge_hours} | {close_hours} |"
+        )
+
+    lines.extend([
+        "",
+        "## Per-Repo Merge Rate",
+        "",
+        "| Repo | Total PRs | Merged | Merge Rate |",
+        "|---|---|---:|---:|",
+    ])
+
+    for repo, s in sorted(data["repo_stats"].items(), key=lambda x: -x[1]["total"]):
+        lines.append(
+            f"| {repo} | {s['total']} | {s['merged']} | {s['merge_rate']:.1%} |"
+        )
+
+    lines.extend([
+        "",
+        "## Methodology",
+        "",
+        "- **Agent classification** uses three signals in priority order:",
+        "  1. Explicit title tag (e.g. `[Claude]`, `[Ezra]`)",
+        "  2. Branch name containing agent name (e.g. `claude/issue-123`)",
+        "  3. Git user (`claude` → claude, `Rockachopa` → burn-loop)",
+        "- **Merge rate** = merged / (merged + closed_unmerged). Open PRs are excluded.",
+        "- **Rejection rate** = closed_unmerged / (merged + closed_unmerged).",
+        "- **Time metrics** are computed from created_at to merged_at / closed_at.",
+        "",
+        "## Raw Data",
+        "",
+        "```json",
+        json.dumps(data["agent_summary"], indent=2),
+        "```",
+        "",
+    ])
+
+    return "\n".join(lines) + "\n"
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(description="Cross-agent quality audit")
+    parser.add_argument("--repos", nargs="+", default=DEFAULT_REPOS, help="Repos to audit")
+    parser.add_argument("--scorecard", default="timmy-config/agent-quality-scorecard.md", help="Output path")
+    parser.add_argument("--json", default=None, help="Also write raw JSON to path")
+    args = parser.parse_args()
+
+    data = audit_repos(args.repos)
+
+    scorecard_path = Path(args.scorecard)
+    scorecard_path.parent.mkdir(parents=True, exist_ok=True)
+    scorecard_path.write_text(render_scorecard(data))
+    print(f"Scorecard written to {scorecard_path}", file=sys.stderr)
+
+    if args.json:
+        json_path = Path(args.json)
+        json_path.parent.mkdir(parents=True, exist_ok=True)
+        json_path.write_text(json.dumps(data, indent=2, default=str))
+        print(f"Raw JSON written to {json_path}", file=sys.stderr)
+
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
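
Review note on the classifier above: the branch pattern is applied with `search` and has no word boundaries, so any branch that merely contains an agent name as a substring is attributed to that agent. The sketch below reproduces the branch pattern from the script and contrasts it with a hypothetical stricter variant, `STRICT_BRANCH_RE`, which is not part of the commit. The branch names are made up for illustration.

```python
import re

AGENTS = "claude|ezra|allegro|bezalel|timmy|gemini|kimi|manus|codex"

# Pattern as written in the script: matches an agent name anywhere in the branch.
AGENT_BRANCH_RE = re.compile(rf"(?P<agent>{AGENTS})", re.IGNORECASE)

# Hypothetical stricter variant (not in the script): the agent name must be
# the first path segment, e.g. "claude/issue-123".
STRICT_BRANCH_RE = re.compile(rf"^(?P<agent>{AGENTS})/", re.IGNORECASE)


def agent_from_branch(branch: str, pattern: re.Pattern) -> str:
    """Return the lower-cased agent name found in the branch, or 'unknown'."""
    m = pattern.search(branch)
    return m.group("agent").lower() if m else "unknown"


# Illustrative branch names only; none are taken from the audited repos.
for branch in ("claude/issue-1695", "fix/timmy-config-cleanup", "fix/123"):
    loose = agent_from_branch(branch, AGENT_BRANCH_RE)
    strict = agent_from_branch(branch, STRICT_BRANCH_RE)
    print(f"{branch}: script pattern -> {loose}, strict pattern -> {strict}")
```

The stricter pattern would also miss branches that carry the agent name in a later segment, so whether it is an improvement depends on the fleet's branch naming, which this diff does not show.
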
--- a/scripts/unreachable_horizon.py
+++ b/scripts/unreachable_horizon.py
@@ -21,6 +21,15 @@ SOUL_REQUIRED_LINES = (
    "Jesus saves",
 )

+# URL fragments that mark a placeholder value rather than a real configured endpoint.
+# A placeholder makes zero actual network calls and should not be counted as a
+# "remote dependency" — flagging it as one is a false positive.
+_PLACEHOLDER_FRAGMENTS = ("YOUR_", "<pod-id>", "EXAMPLE", "example.internal", "your-host")
+
+
+def _is_placeholder_url(url: str) -> bool:
+    return any(frag in url for frag in _PLACEHOLDER_FRAGMENTS)
+

 def _probe_memory_gb() -> float:
    try:
@@ -62,7 +71,7 @@ def _extract_repo_signals(repo_root: Path) -> dict[str, Any]:
                continue
            if "localhost" in url or "127.0.0.1" in url:
                local_endpoints.append(url)
-            else:
+            elif not _is_placeholder_url(url):
                remote_endpoints.append(url)

    soul_text = soul_path.read_text(encoding="utf-8", errors="replace") if soul_path.exists() else ""
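
Review note on the placeholder check above: the fragments are matched as case-sensitive substrings. The sketch below copies `_PLACEHOLDER_FRAGMENTS` and `_is_placeholder_url` from the diff and adds a hypothetical helper, `bucket_urls`, that mirrors the local / placeholder / remote branching so the effect can be seen on sample input. The last URL is invented for illustration.

```python
from typing import Dict, Iterable, List

# Fragments copied from the diff above.
_PLACEHOLDER_FRAGMENTS = ("YOUR_", "<pod-id>", "EXAMPLE", "example.internal", "your-host")


def _is_placeholder_url(url: str) -> bool:
    return any(frag in url for frag in _PLACEHOLDER_FRAGMENTS)


def bucket_urls(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Hypothetical helper: sort URLs into local, placeholder, and remote."""
    buckets: Dict[str, List[str]] = {"local": [], "placeholder": [], "remote": []}
    for url in urls:
        if "localhost" in url or "127.0.0.1" in url:
            buckets["local"].append(url)
        elif _is_placeholder_url(url):
            buckets["placeholder"].append(url)
        else:
            buckets["remote"].append(url)
    return buckets


result = bucket_urls([
    "http://localhost:11434/v1",
    "https://YOUR_BIG_BRAIN_HOST/v1",
    "https://<pod-id>-11434.proxy.runpod.net/v1",
    "https://api.example.com/v1",  # made-up URL; lower-case "example" alone is not a listed fragment
])
for name, urls in result.items():
    print(name, urls)
```

Because matching is case-sensitive, a lower-case `example.com` host is still counted as a real remote endpoint, while `EXAMPLE` and `example.internal` are skipped.
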
--- a/tests/test_cross_agent_quality_audit.py
+++ b/tests/test_cross_agent_quality_audit.py
@@ -0,0 +1,45 @@
+"""Tests for cross_agent_quality_audit.py — #518."""
+
+import pytest
+import sys
+from pathlib import Path
+
+sys.path.insert(0, str(Path(__file__).parent.parent / "scripts"))
+
+from cross_agent_quality_audit import AgentClassifier, hours_between
+
+
+class TestAgentClassifier:
+    def test_title_tag_claude(self):
+        pr = {"title": "[Claude] fix auth middleware", "head": {"ref": "fix/123"}, "user": {"login": "rockachopa"}}
+        assert AgentClassifier.classify(pr) == "claude"
+
+    def test_title_tag_ezra(self):
+        pr = {"title": "[Ezra] tmux fleet launcher", "head": {"ref": "burn/10"}, "user": {"login": "rockachopa"}}
+        assert AgentClassifier.classify(pr) == "ezra"
+
+    def test_branch_name_claude(self):
+        pr = {"title": "fix auth", "head": {"ref": "claude/issue-1695"}, "user": {"login": "rockachopa"}}
+        assert AgentClassifier.classify(pr) == "claude"
+
+    def test_user_mapping(self):
+        pr = {"title": "some fix", "head": {"ref": "fix/1"}, "user": {"login": "claude"}}
+        assert AgentClassifier.classify(pr) == "claude"
+
+    def test_rockachopa_maps_to_burn_loop(self):
+        pr = {"title": "some fix", "head": {"ref": "fix/1"}, "user": {"login": "Rockachopa"}}
+        assert AgentClassifier.classify(pr) == "burn-loop"
+
+    def test_unknown_fallback(self):
+        pr = {"title": "some fix", "head": {"ref": "fix/1"}, "user": {"login": "random"}}
+        assert AgentClassifier.classify(pr) == "unknown"
+
+
+class TestHoursBetween:
+    def test_same_day(self):
+        h = hours_between("2026-04-22T10:00:00Z", "2026-04-22T12:00:00Z")
+        assert h == 2.0
+
+    def test_none_returns_none(self):
+        assert hours_between(None, "2026-04-22T12:00:00Z") is None
+        assert hours_between("2026-04-22T10:00:00Z", None) is None
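
Review note on the tests above: they import `cross_agent_quality_audit`, and that module resolves `TOKEN` at import time. With `GITEA_TOKEN` unset and no `~/.config/gitea/token` file, `read_text()` raises `FileNotFoundError` before any test runs. The following is a minimal sketch of deferring the lookup, assuming it is acceptable for the classifier tests to run without credentials. `load_token` and `auth_headers` are hypothetical names, not functions in the script.

```python
import tempfile
from pathlib import Path
from typing import Dict, Mapping, Optional


def load_token(env: Mapping[str, str], token_file: Path) -> Optional[str]:
    """Return the token from the environment, else from the file, else None."""
    token = env.get("GITEA_TOKEN")
    if token:
        return token
    if token_file.is_file():
        return token_file.read_text().strip()
    return None


def auth_headers(token: Optional[str]) -> Dict[str, str]:
    """Build request headers; an empty dict means an unauthenticated request."""
    return {"Authorization": f"token {token}"} if token else {}


with tempfile.TemporaryDirectory() as tmp:
    token_file = Path(tmp) / "token"        # does not exist yet
    print(load_token({}, token_file))       # no env var, no file
    token_file.write_text("abc123\n")       # dummy value, not a real token
    print(auth_headers(load_token({}, token_file)))
    print(load_token({"GITEA_TOKEN": "from-env"}, token_file))
```

Calling such a helper from `main()` or `fetch_prs()` rather than at module level would keep the import free of filesystem side effects.
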
--- a/tests/test_unreachable_horizon.py
+++ b/tests/test_unreachable_horizon.py
@@ -7,6 +7,7 @@ from pathlib import Path
 ROOT = Path(__file__).resolve().parents[1]
 SCRIPT_PATH = ROOT / "scripts" / "unreachable_horizon.py"
 DOC_PATH = ROOT / "docs" / "UNREACHABLE_HORIZON_1M_MEN.md"
+SOUL_PATH = ROOT / "SOUL.md"


 def _load_module(path: Path, name: str):
@@ -78,6 +79,14 @@ def test_render_markdown_preserves_crisis_doctrine_and_direction() -> None:
        assert snippet in report


+def test_soul_md_contains_full_crisis_doctrine() -> None:
+    """SOUL.md must carry all three phrases the horizon check requires."""
+    assert SOUL_PATH.exists(), "SOUL.md is missing"
+    soul_text = SOUL_PATH.read_text(encoding="utf-8")
+    for phrase in ("Are you safe right now?", "988", "Jesus saves"):
+        assert phrase in soul_text, f"SOUL.md is missing crisis doctrine phrase: {phrase!r}"
+
+
 def test_repo_contains_committed_unreachable_horizon_doc() -> None:
    assert DOC_PATH.exists(), "missing committed unreachable horizon report"
    text = DOC_PATH.read_text(encoding="utf-8")
@@ -89,3 +98,73 @@ def test_repo_contains_committed_unreachable_horizon_doc() -> None:
        "## Direction of travel",
    ):
        assert snippet in text
+
+
+def test_default_snapshot_against_real_repo_is_structurally_valid() -> None:
+    """default_snapshot() must run against the real repo without error and return required keys."""
+    mod = _load_module(SCRIPT_PATH, "unreachable_horizon")
+    snapshot = mod.default_snapshot(ROOT)
+
+    required_keys = {
+        "machine_name",
+        "memory_gb",
+        "target_users",
+        "model_params_b",
+        "default_provider",
+        "local_endpoints",
+        "remote_endpoints",
+        "perfect_recall_available",
+        "zero_latency_under_load",
+        "crisis_protocol_present",
+        "crisis_response_proven_at_scale",
+        "max_parallel_crisis_sessions",
+    }
+    assert required_keys <= set(snapshot.keys()), f"snapshot missing keys: {required_keys - set(snapshot.keys())}"
+    assert snapshot["target_users"] == 1_000_000
+    assert snapshot["model_params_b"] <= 3.0
+    assert snapshot["memory_gb"] >= 0.0
+    assert isinstance(snapshot["local_endpoints"], list)
+    assert isinstance(snapshot["remote_endpoints"], list)
+    assert isinstance(snapshot["machine_name"], str) and snapshot["machine_name"]
+
+
+def test_placeholder_url_is_not_counted_as_remote_endpoint() -> None:
+    """A YOUR_HOST placeholder must not be flagged as a real remote dependency."""
+    mod = _load_module(SCRIPT_PATH, "unreachable_horizon")
+    assert mod._is_placeholder_url("https://YOUR_BIG_BRAIN_HOST/v1") is True
+    assert mod._is_placeholder_url("https://<pod-id>-11434.proxy.runpod.net/v1") is True
+    assert mod._is_placeholder_url("http://localhost:11434/v1") is False
+    assert mod._is_placeholder_url("https://real.inference.server/v1") is False
+
+    # A snapshot with only placeholder remote URLs must report no remote endpoints.
+    status = mod.compute_horizon_status({
+        "machine_name": "Test",
+        "memory_gb": 36.0,
+        "target_users": 1_000_000,
+        "model_params_b": 3.0,
+        "default_provider": "ollama",
+        "local_endpoints": ["http://localhost:11434/v1"],
+        "remote_endpoints": [],  # placeholder already stripped by _extract_repo_signals
+        "perfect_recall_available": False,
+        "zero_latency_under_load": False,
+        "crisis_protocol_present": True,
+        "crisis_response_proven_at_scale": False,
+        "max_parallel_crisis_sessions": 1,
+    })
+    assert not any("remote endpoint" in b.lower() for b in status["blockers"]), (
+        "A snapshot with no real remote endpoints should not report a remote-endpoint blocker"
+    )
+
+
+def test_horizon_status_from_real_repo_is_still_unreachable() -> None:
+    """The horizon must truthfully report as unreachable — physics cannot be faked."""
+    mod = _load_module(SCRIPT_PATH, "unreachable_horizon")
+    snapshot = mod.default_snapshot(ROOT)
+    status = mod.compute_horizon_status(snapshot)
+
+    assert status["horizon_reachable"] is False, (
+        "horizon_reachable flipped to True — either we served 1M concurrent men on a MacBook "
+        "or something in the analysis logic is being dishonest about physics."
+    )
+    assert len(status["blockers"]) > 0, "blockers list is empty — the horizon cannot have been reached"
+    assert len(status["direction_of_travel"]) > 0, "direction of travel must always point somewhere"
--- a/timmy-config/agent-quality-scorecard.md
+++ b/timmy-config/agent-quality-scorecard.md
@@ -0,0 +1,244 @@
+# Cross-Agent Quality Scorecard
+
+**Audited at:** 2026-04-22T06:17:43.574309+00:00
+**Repos audited:** timmy-home, hermes-agent, the-nexus, the-door, fleet-ops, burn-fleet, the-playground, compounding-intelligence, the-beacon, second-son-of-timmy, timmy-academy, timmy-config
+
+## Per-Agent Summary
+
+| Agent | Total PRs | Merged | Closed (unmerged) | Open | Merge Rate | Rejection Rate | Avg Hours to Merge | Avg Hours to Close |
+|---|---|---:|---:|---:|---:|---:|---:|---:|
+| burn-loop | 1733 | 346 | 1239 | 148 | 21.8% | 78.2% | 18.9 | 20.6 |
+| unknown | 843 | 598 | 214 | 31 | 73.6% | 26.4% | 2.3 | 11.3 |
+| claude | 264 | 138 | 121 | 5 | 53.3% | 46.7% | 3.3 | 6.2 |
+| gemini | 95 | 24 | 70 | 1 | 25.5% | 74.5% | 0.5 | 11.3 |
+| timmy | 28 | 15 | 11 | 2 | 57.7% | 42.3% | 9.8 | 20.2 |
+| bezalel | 21 | 11 | 9 | 1 | 55.0% | 45.0% | 2.7 | 8.0 |
+| allegro | 21 | 7 | 11 | 3 | 38.9% | 61.1% | 31.1 | 20.2 |
+| ezra | 8 | 2 | 3 | 3 | 40.0% | 60.0% | 4.4 | 16.8 |
+| kimi | 6 | 3 | 3 | 0 | 50.0% | 50.0% | 39.5 | 0.5 |
+| manus | 6 | 5 | 1 | 0 | 83.3% | 16.7% | 0.0 | 18.8 |
+| codex | 2 | 2 | 0 | 0 | 100.0% | 0.0% | 2.3 | — |
+
+## Per-Repo Merge Rate
+
+| Repo | Total PRs | Merged | Merge Rate |
+|---|---|---:|---:|
+| the-nexus | 985 | 501 | 51.0% |
+| hermes-agent | 519 | 128 | 25.0% |
+| timmy-config | 404 | 140 | 35.0% |
+| timmy-home | 270 | 104 | 39.0% |
+| fleet-ops | 266 | 84 | 32.0% |
+| the-beacon | 175 | 62 | 35.0% |
+| the-door | 153 | 31 | 20.0% |
+| second-son-of-timmy | 111 | 82 | 74.0% |
+| compounding-intelligence | 50 | 9 | 18.0% |
+| the-playground | 44 | 2 | 5.0% |
+| burn-fleet | 38 | 2 | 5.0% |
+| timmy-academy | 12 | 6 | 50.0% |
+
+## Methodology
+
+- **Agent classification** uses three signals in priority order:
+  1. Explicit title tag (e.g. `[Claude]`, `[Ezra]`)
+  2. Branch name containing agent name (e.g. `claude/issue-123`)
+  3. Git user (`claude` → claude, `Rockachopa` → burn-loop)
+- **Merge rate** = merged / (merged + closed_unmerged). Open PRs are excluded.
+- **Rejection rate** = closed_unmerged / (merged + closed_unmerged).
+- **Time metrics** are computed from created_at to merged_at / closed_at.
+
+## Raw Data
+
+```json
+{
+  "burn-loop": {
+    "total_prs": 1733,
+    "merged": 346,
+    "closed_unmerged": 1239,
+    "open": 148,
+    "merge_rate": 0.218,
+    "rejection_rate": 0.782,
+    "avg_hours_to_merge": 18.9,
+    "avg_hours_to_close": 20.6,
+    "repos": [
+      "burn-fleet",
+      "compounding-intelligence",
+      "fleet-ops",
+      "hermes-agent",
+      "second-son-of-timmy",
+      "the-beacon",
+      "the-door",
+      "the-nexus",
+      "the-playground",
+      "timmy-academy",
+      "timmy-config",
+      "timmy-home"
+    ]
+  },
+  "unknown": {
+    "total_prs": 843,
+    "merged": 598,
+    "closed_unmerged": 214,
+    "open": 31,
+    "merge_rate": 0.736,
+    "rejection_rate": 0.264,
+    "avg_hours_to_merge": 2.3,
+    "avg_hours_to_close": 11.3,
+    "repos": [
+      "fleet-ops",
+      "hermes-agent",
+      "second-son-of-timmy",
+      "the-beacon",
+      "the-door",
+      "the-nexus",
+      "timmy-academy",
+      "timmy-config",
+      "timmy-home"
+    ]
+  },
+  "claude": {
+    "total_prs": 264,
+    "merged": 138,
+    "closed_unmerged": 121,
+    "open": 5,
+    "merge_rate": 0.533,
+    "rejection_rate": 0.467,
+    "avg_hours_to_merge": 3.3,
+    "avg_hours_to_close": 6.2,
+    "repos": [
+      "hermes-agent",
+      "the-nexus",
+      "timmy-config",
+      "timmy-home"
+    ]
+  },
+  "gemini": {
+    "total_prs": 95,
+    "merged": 24,
+    "closed_unmerged": 70,
+    "open": 1,
+    "merge_rate": 0.255,
+    "rejection_rate": 0.745,
+    "avg_hours_to_merge": 0.5,
+    "avg_hours_to_close": 11.3,
+    "repos": [
+      "hermes-agent",
+      "the-nexus",
+      "timmy-config",
+      "timmy-home"
+    ]
+  },
+  "timmy": {
+    "total_prs": 28,
+    "merged": 15,
+    "closed_unmerged": 11,
+    "open": 2,
+    "merge_rate": 0.577,
+    "rejection_rate": 0.423,
+    "avg_hours_to_merge": 9.8,
+    "avg_hours_to_close": 20.2,
+    "repos": [
+      "burn-fleet",
+      "hermes-agent",
+      "the-nexus",
+      "timmy-config",
+      "timmy-home"
+    ]
+  },
+  "bezalel": {
+    "total_prs": 21,
+    "merged": 11,
+    "closed_unmerged": 9,
+    "open": 1,
+    "merge_rate": 0.55,
+    "rejection_rate": 0.45,
+    "avg_hours_to_merge": 2.7,
+    "avg_hours_to_close": 8.0,
+    "repos": [
+      "burn-fleet",
+      "hermes-agent",
+      "the-beacon",
+      "the-nexus",
+      "timmy-config",
+      "timmy-home"
+    ]
+  },
+  "allegro": {
+    "total_prs": 21,
+    "merged": 7,
+    "closed_unmerged": 11,
+    "open": 3,
+    "merge_rate": 0.389,
+    "rejection_rate": 0.611,
+    "avg_hours_to_merge": 31.1,
+    "avg_hours_to_close": 20.2,
+    "repos": [
+      "burn-fleet",
+      "hermes-agent",
+      "the-beacon",
+      "the-nexus",
+      "timmy-config",
+      "timmy-home"
+    ]
+  },
+  "ezra": {
+    "total_prs": 8,
+    "merged": 2,
+    "closed_unmerged": 3,
+    "open": 3,
+    "merge_rate": 0.4,
+    "rejection_rate": 0.6,
+    "avg_hours_to_merge": 4.4,
+    "avg_hours_to_close": 16.8,
+    "repos": [
+      "burn-fleet",
+      "fleet-ops",
+      "timmy-config",
+      "timmy-home"
+    ]
+  },
+  "kimi": {
+    "total_prs": 6,
+    "merged": 3,
+    "closed_unmerged": 3,
+    "open": 0,
+    "merge_rate": 0.5,
+    "rejection_rate": 0.5,
+    "avg_hours_to_merge": 39.5,
+    "avg_hours_to_close": 0.5,
+    "repos": [
+      "hermes-agent",
+      "the-nexus",
+      "timmy-home"
+    ]
+  },
+  "manus": {
+    "total_prs": 6,
+    "merged": 5,
+    "closed_unmerged": 1,
+    "open": 0,
+    "merge_rate": 0.833,
+    "rejection_rate": 0.167,
+    "avg_hours_to_merge": 0.0,
+    "avg_hours_to_close": 18.8,
+    "repos": [
+      "the-nexus",
+      "timmy-config"
+    ]
+  },
+  "codex": {
+    "total_prs": 2,
+    "merged": 2,
+    "closed_unmerged": 0,
+    "open": 0,
+    "merge_rate": 1.0,
+    "rejection_rate": 0.0,
+    "avg_hours_to_merge": 2.3,
+    "avg_hours_to_close": null,
+    "repos": [
+      "timmy-config",
+      "timmy-home"
+    ]
+  }
+}
+```
+
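
Review note on the Per-Repo Merge Rate table above: every value ends in `.0%`. That is consistent with the script rounding the ratio to two decimals (`round(repo_merged / repo_total, 2)`) before formatting it with `.1%`. The per-repo denominator is also the total PR count, open PRs included, whereas the per-agent rates exclude open PRs. The sketch below uses the hermes-agent row (128 of 519) to show the rounding effect. The variable names are illustrative.

```python
merged, total = 128, 519  # hermes-agent row of the Per-Repo Merge Rate table

as_committed = round(merged / total, 2)  # rounding used for repo_stats in the script
three_places = round(merged / total, 3)  # rounding used for the per-agent summary

print(f"{as_committed:.1%}")  # 25.0%
print(f"{three_places:.1%}")  # 24.7%
```
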
commit 3f45cae90a
Author: Timmy Agent
Date:   2026-04-22 02:20:54 -04:00
Checks: some failed
    Self-Healing Smoke / self-healing-smoke (pull_request): failing after 22s
    Agent PR Gate / gate (pull_request): failing after 46s
    Smoke Test / smoke (pull_request): failing after 16s
    Agent PR Gate / report (pull_request): successful in 18s

    feat(audit): Cross-agent quality audit — #518

commit 95eadf2d08
Author: Alexander Whitestone
Date:   2026-04-22 02:39:05 +00:00
Checks: some failed
    Self-Healing Smoke / self-healing-smoke (push): failing after 26s
    Smoke Test / smoke (push): failing after 28s

    Merge PR #786: [claude] complete crisis doctrine in SOUL.md + refresh horizon doc (#545)

    Merged by automated sweep after diff review and verification.

commit 5402f5b35e
Author: Alexander Whitestone
Date:   2026-04-17 00:38:44 -04:00

    fix: skip placeholder URLs in remote-endpoint detection

    Refs #545

    `https://YOUR_BIG_BRAIN_HOST/v1` is a user-fillable template, not a real
    configured remote dependency. Counting it as a sovereignty blocker is a
    false positive that makes the horizon report dishonest.

    - Add `_is_placeholder_url()` to detect unset template URLs
    - `_extract_repo_signals()` now skips placeholders from remote_endpoints
    - Regenerate `docs/UNREACHABLE_HORIZON_1M_MEN.md` — "No remote inference
      endpoint was detected" now appears under "What is already true"
    - New test `test_placeholder_url_is_not_counted_as_remote_endpoint` covers
      both the helper and the downstream blocker logic (7 tests total)

    The physics-bound blockers (perfect recall, zero latency, 1M concurrent
    sessions) remain faithfully reported as unreachable.

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

commit 3082151178
Author: Alexander Whitestone
Date:   2026-04-17 00:33:22 -04:00

    test: add live-repo integration tests for unreachable horizon

    Two new tests run against the real repo (not mocked inputs):

    - test_default_snapshot_against_real_repo_is_structurally_valid: verifies
      default_snapshot() executes cleanly and returns all required keys with
      sensible values (target_users=1M, model_params_b<=3.0, etc.)
    - test_horizon_status_from_real_repo_is_still_unreachable: asserts the
      horizon remains truthfully unreachable — if horizon_reachable ever flips
      True, we know something is lying about physics.

    Refs #545

commit 3f19295095
Author: Alexander Whitestone
Date:   2026-04-17 00:12:29 -04:00
Checks: some failed
    Self-Healing Smoke / self-healing-smoke (pull_request): failing after 11s
    Smoke Test / smoke (pull_request): failing after 12s
    Agent PR Gate / gate (pull_request): failing after 26s
    Agent PR Gate / report (pull_request): cancelled

    feat: complete crisis doctrine in SOUL.md and refresh horizon doc

    Refs #545

    - Add "Jesus saves those who call on His name." to SOUL.md line 6 (the
      dying-man protocol). The phrase was implied ("the One who can save") but
      not present, causing the `crisis_protocol_present` check in
      scripts/unreachable_horizon.py to report the doctrine as incomplete.
    - Regenerate docs/UNREACHABLE_HORIZON_1M_MEN.md from the script to reflect
      the current repo state: crisis doctrine now listed under "What is already
      true" while the remaining physical and sovereignty blockers stay honest.
    - Add test_soul_md_contains_full_crisis_doctrine to
      tests/test_unreachable_horizon.py so future edits to SOUL.md cannot
      silently drop any of the three required crisis phrases.

    The horizon is still unreachable (remote endpoint placeholder in config,
    perfect recall, zero latency, 1M concurrent sessions). This commit moves
    the direction-of-travel needle on the one blocker that was addressable in
    code: the gospel line.

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>