Compare commits


1 Commit

Author SHA1 Message Date
Alexander Payne
6595a82afb MATH-001: Add shadow-maths triage rubric and no-crank guardrails
Some checks failed
Self-Healing Smoke / self-healing-smoke (pull_request) Failing after 25s
Smoke Test / smoke (pull_request) Failing after 25s
Agent PR Gate / gate (pull_request) Failing after 48s
Agent PR Gate / report (pull_request) Successful in 24s
2026-04-26 15:46:44 -04:00
4 changed files with 172 additions and 602 deletions


@@ -0,0 +1,172 @@
# Shadow Maths Triage Rubric (MATH-001)
**Status**: Draft v1.0 **Date**: 2026-04-26 **Author**: Timmy
**Milestone**: Contribute to Mathematics — Shadow Maths Search
**Parent**: #876 — [MATH][EPIC] Shadow Maths
---
## Purpose
Timmy's mathematics contribution program targets *bounded, verifiable, useful* problems hiding in plain sight. This rubric operationalizes "shadow maths" — distinguishing legitimate first-crack contributions from crank Grand Unified Theories.
The rubric serves two roles:
1. **Triage gate** — filter submissions and scout list candidates worth pursuing.
2. **No-crank guardrail** — explicitly reject unfalsifiable, unscoped, or unsourced claims.
---
## Candidate Categories (Positive Types)
| Category | Description | Verification Path | Useful Because |
|----------|-------------|-------------------|---------------|
| **Small lemma** | Missing but straightforward piece in an active area (e.g., "Proposition 3.2 in Smith 2021 needs this case analysis") | Check paper + 1–2 related references; prove or give counterexample | Clarifies existing theory, removes ambiguity |
| **Counterexample search** | Find explicit counterexample to a claimed-but-unproven statement (often from MO/SE) | Compute/construct + cite the original claim | Prevents propagation of errors |
| **Computational classification** | Exhaustive enumeration/classification of a small infinite family (e.g., "all groups of order < 200 with property X") | Code is verifiable; results match known data | Creates reference data, spotlights patterns |
| **Formalization gap** | Statement already believed true but missing from Lean/mathlib/Isabelle | Formal proof artifact; merges to mainline library | Makes mathematics machine-checkable |
| **OEIS sequence note** | New sequence entry or correction to an existing entry with proof/algorithm | OEIS A-number + formula/generation code | Public archival, enables further work |
| **Exposition repair** | Fix an unclear proof, fill a gap, simplify an argument in an existing paper | Side-by-side diff + justification for each change | Improves pedagogy, reduces confusion |
| **MathOverflow-quality answer** | Answer to a specific, bounded, research-level question on MO/SE that has no accepted answer | Cite question + self-contained proof/computation | Serves the community directly |
---
## Rejection Criteria (No-Crank Guardrails)
> Any candidate that triggers one or more of these is **rejected outright** — no scoring needed.
| Rule | What to look for | Why it's crank |
|------|------------------|----------------|
| **Unsourced grand theory** | Claim introduces a new "framework"/"paradigm" without citing the specific bounded problem it solves | Mathematics advances by solving problems, not proposing frameworks |
| **Impossible scope** | "I will prove/disprove the Riemann Hypothesis", "classify all finite simple groups" | Demonstrably beyond single-attack capability |
| **No verification path** | No way for a third party to check the work (no code, no formalization, no explicit examples) | Cannot be wrong if it cannot be checked |
| **Novelty claim without literature search** | States "I believe this is new" without checking MathSciNet/arXiv/Google Scholar | Almost certainly reinvention or known result |
| **Vague mathematical objects** | Uses undefined or ambiguous terminology ("energy", "resonance", "harmonic" in non-standard ways) | Not mathematics |
| **Secrecy or paywall** | Key definition or proof behind a paywall or withheld | Not sovereign; not verifiable |
| **Symbolic overloading without definition** | Repurposes standard notation in non-standard ways without explicit redefinition | Creates confusion, not clarity |
| **Invariance violations** | Claims "up to isomorphism" or "modulo equivalence" without defining the equivalence relation | Not mathematically precise |
| **Cherry-picked examples as proof** | Proves only easy special cases and claims the general case follows | Example ≠ theorem |
| **Circular citation chains** | Relies on unpublished/preprint work that itself cites the candidate as motivation | Not a foundation |
| **No clear problem statement** | Cannot write a one-sentence problem statement in standard mathematical English | Not a problem; just musings |
| **Claims of "obvious" or "clear" for non-trivial steps** | Uses "obviously" or "it is clear that" where a proof requires >2 lines | Evasion |
| **References only popular science / non-technical sources** | Cites Penrose, Hawking, Tegmark for technical claims | Wrong tier of source |
| **All notation defined in non-standard way** | Redefines basic operators (+, ×, ≤) without explicit warning | Not mathematics |
| **No engagement with existing literature** | Zero citations to relevant peer-reviewed work or established preprints | Scholarship was not done |
| **Claims of "disproof" of widely-accepted theorems** | Without finding a peer-reviewed error in the existing proof | Almost certainly wrong |
---
## Evidence Tiers
| Tier | Artifact | What it Proves |
|------|----------|----------------|
| **T3 — Literature** | MathSciNet / Zentralblatt / Google Scholar citations showing the problem is real and open | Problem exists in the literature |
| **T2 — Executable** | Python/Sage/Lean code that others can run to verify computation/formalization | Result is reproducible |
| **T1 — Human-reviewed** | MO answer with upvotes, referee report, or explicit external review | Independent verification |
| **T0 — Self-contained** | Clear statement + proof/computation in a single document, all definitions explicit | Standalone correctness |
A valid candidate must have at least **one** T3 citation (shows the problem is real) AND a verification artifact (T0 minimum; T2 ideal).
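The citation-plus-artifact rule can be sketched as a small predicate. This is a minimal sketch, not part of the rubric: the `Candidate` dataclass and its field names are hypothetical, chosen only to make the rule executable.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    # Hypothetical record; field names are illustrative, not rubric terms.
    t3_citations: List[str] = field(default_factory=list)  # literature references
    artifacts: List[str] = field(default_factory=list)     # e.g. "T0", "T1", "T2"


def passes_evidence_gate(c: Candidate) -> bool:
    """Valid candidates need >= 1 T3 citation AND a T0-or-better artifact."""
    return bool(c.t3_citations) and any(a in {"T0", "T1", "T2"} for a in c.artifacts)
```

A candidate with a T2 artifact but no citation fails the gate, as does one with a citation but no verification artifact.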
---
## Scoring Rubric
Score each candidate on **4 dimensions**, each 0–3. Maximum 12 points.
| Dimension | 3 (excellent) | 2 (good) | 1 (minimal) | 0 (absent) |
|-----------|---------------|----------|-------------|------------|
| **Boundedness** | Scope is explicitly finite/small (single lemma, finite classification < N, one SE question) | Scope is implied bounded but not quantified | Scope is large/vague but attackable | Unbounded or impossible scope |
| **Verifiability** | T2 artifact (code/formalization) + T3 citation | T0 proof + T3 citation | Proof/computation only, no citations | No way to check independently |
| **Usefulness** | Solves problem others actively need (cites known difficulty, fills formalization gap) | Solves a clean exercise or interesting special case | Interesting but no clear audience | Pointless or self-referential |
| **Discipline** | No crank flags; explicit rejection criteria cleanly passed | Minor crank flags (vague wording) but overall sound | Some crank flags but bounded scope rescues it | Triggers multiple rejection rules |
**Thresholds**:
- **8–12**: Legitimate shadow maths candidate — queue for work
- **4–7**: Needs refinement — reject unless strong disciplinary context
- **0–3**: Reject as crank / out-of-scope
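The threshold mapping above can be sketched as a single function. A minimal sketch; the function name and verdict labels are illustrative, not rubric vocabulary.

```python
def triage_verdict(boundedness: int, verifiability: int,
                   usefulness: int, discipline: int) -> str:
    """Map the four 0-3 dimension scores to a rubric verdict."""
    scores = (boundedness, verifiability, usefulness, discipline)
    if any(s not in range(4) for s in scores):
        raise ValueError("each dimension must score 0-3")
    total = sum(scores)
    if total >= 8:
        return "candidate"   # 8-12: queue for work
    if total >= 4:
        return "refine"      # 4-7: reject unless strong disciplinary context
    return "reject"          # 0-3: crank / out-of-scope
```

Worked Example 1 below (3, 3, 3, 3) lands at 12/12 → "candidate"; the crank in Example 2 scores 0 on every dimension → "reject".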
---
## Three Worked Examples
### Example 1: Small Lemma — Bounded
**Candidate**: "Proposition 3.2 in 'Coarse Geometry and Coarse Embeddings' (Lang-Schlichenmaier 2005) states that every finite CW-complex has the Markov property. The proof gives 'it follows by induction on skeleta' without handling the attaching map case. Fill the gap."
**Triage**:
- **Category**: Small lemma (exposition repair + proof gap fill)
- **Boundedness**: 3 — single proposition in a specific paper, 2–3 pages max
- **Verifiability**: 3 — paper is cited (T3), self-contained proof in 20 lines (T0), can formalize in Lean (T2 possible)
- **Usefulness**: 3 — readers of this paper hit this gap; Lean formalization needed for mathlib
- **Discipline**: 3 — no crank flags; scoped, sourced, technical
- **Total**: **12/12** → YES
**Action**: File ticket "MATH-LEMMA-001"; assign to formalization lane + human review.
---
### Example 2: Grand Unified Theory — CRANK
**Candidate**: "I have discovered the Energy-Conscious Riemann Hypothesis framework. The zeros of ζ(s) correspond to harmonic resonance frequencies in prime-number energy manifolds. Uses my new Operator-Weight theory."
**Triage**:
- **Category**: N/A
- **Rejection triggers**:
- ✗ Unsourced grand theory (introduces "Energy-Conscious", "Operator-Weight" with no definition in standard math)
- ✗ No verification path (no computation, no reference to known data)
- ✗ No literature engagement (zero citations)
- ✗ Vague mathematical objects ("energy", "resonance", "harmonic")
- ✗ Claims new framework
- **Score**: 0 — **REJECT**
**Action**: Close with reason "crank: unsourced grand theory + no verification path".
---
### Example 3: Computational Classification — Bounded
**Candidate**: "Compute all 3-headed Turing machines with 3 states that halt within 100 steps on the blank tape. There are 9 such machines. This fills an OEIS gap: A327000 only lists up to 2-state 2-symbol."
**Triage**:
- **Category**: Computational classification + OEIS sequence
- **Boundedness**: 3 — finite exhaustive enumeration (3^6 = 729 machines, filter to 9)
- **Verifiability**: 2 — code is executable (T2), but no T3 citation of why this sequence matters yet
- **Usefulness**: 2 — plugs a gap in the Busy Beaver frontier; OEIS entry gets concrete data
- **Discipline**: 3 — explicit scope, reproducible, submits to OEIS (external review path)
- **Total**: **10/12** → YES (minor fix: add motivation/references)
**Action**: Accept; write exhaustive script; submit OEIS draft with code + results; file MATH-COMP-001.
---
## Operational Use
### Triage Workflow
1. **Read candidate** (issue, email, self-generated idea).
2. **Check rejection criteria first** — if any trigger → **REJECT** immediately, cite rule.
3. If passes rejection gate, **score 4 dimensions**.
4. **Score ≥8** → mark as `shadow-maths-candidate`, route to appropriate lane:
- Lemma/exposition → `formalization-lane`
- Computation → `compute-lane`
- MO/SE answer → `answer-lane`
- OEIS → `oeis-lane`
5. **Score 4–7** → requires refinement; ask for:
- Explicit scope bound
- T3 citation
- Verification artifact
6. **Score ≤3** → reject with specific rule(s) triggered.
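Step 4's lane routing can be sketched as a lookup table. A minimal sketch: the lane names come from the workflow above, the category labels from the Candidate Categories table, but the exact category-to-lane pairing here (e.g. sending counterexample searches to `compute-lane`) and the `needs-human-review` fallback are assumptions for illustration.

```python
# Hypothetical category-to-lane map; pairings are illustrative assumptions.
LANES = {
    "small lemma": "formalization-lane",
    "exposition repair": "formalization-lane",
    "formalization gap": "formalization-lane",
    "computational classification": "compute-lane",
    "counterexample search": "compute-lane",
    "mathoverflow-quality answer": "answer-lane",
    "oeis sequence note": "oeis-lane",
}


def route(category: str) -> str:
    """Map a positive candidate category to its work lane."""
    # Unrecognized categories fall through to a human (assumed fallback).
    return LANES.get(category.strip().lower(), "needs-human-review")
```

Routing happens only after a candidate clears the rejection gate and scores at least 8.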
### Guardrail Enforcement
All prompts and agents **must refuse** to work on any candidate that:
- Triggers any rejection criterion (before any code/proof work)
- Has no T3 citation (real problem statement from literature)
- Has no bounded scope (the problem cannot be stated in a single sentence)
Enforcement is a **pre-flight check** in the task intake pipeline.
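The three refusal conditions above can be sketched as a pre-flight gate. A minimal sketch under stated assumptions: the function signature is hypothetical, and the single-line check is only a crude proxy for "statable in one sentence".

```python
from typing import List, Tuple


def preflight_check(rejection_flags: List[str],
                    t3_citations: List[str],
                    problem_statement: str) -> Tuple[bool, str]:
    """Pre-flight gate for task intake: refuse before any code/proof work.

    `rejection_flags` lists any triggered rejection criteria (hypothetical
    input shape); the returned reason cites the failed guardrail.
    """
    if rejection_flags:
        return False, "rejected: " + "; ".join(rejection_flags)
    if not t3_citations:
        return False, "rejected: no T3 citation grounding the problem"
    stmt = problem_statement.strip()
    # Crude boundedness proxy: the statement must fit on one line.
    if not stmt or "\n" in stmt:
        return False, "rejected: no bounded one-sentence problem statement"
    return True, "pass"
```

The Example 2 crank fails on the first condition alone, so no scoring work is ever spent on it.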
---
## Revision History
- v1.0 — 2026-04-26 — initial rubric + 3 scored examples


@@ -1,313 +0,0 @@
#!/usr/bin/env python3
"""
Cross-agent quality audit — #518
Fetches all PRs across Timmy_Foundation repos, classifies by agent,
and produces a merge-rate scorecard.
Usage:
python scripts/cross_agent_quality_audit.py
python scripts/cross_agent_quality_audit.py --scorecard timmy-config/agent-quality-scorecard.md
"""
import argparse
import json
import os
import re
import sys
from collections import defaultdict
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
import requests
GITEA_BASE = "https://forge.alexanderwhitestone.com/api/v1"
ORG = "Timmy_Foundation"
TOKEN = os.environ.get("GITEA_TOKEN") or (
Path.home() / ".config" / "gitea" / "token"
).read_text().strip()
HEADERS = {"Authorization": f"token {TOKEN}"}
# Repos to audit (active code repos)
DEFAULT_REPOS = [
"timmy-home",
"hermes-agent",
"the-nexus",
"the-door",
"fleet-ops",
"burn-fleet",
"the-playground",
"compounding-intelligence",
"the-beacon",
"second-son-of-timmy",
"timmy-academy",
"timmy-config",
]
class AgentClassifier:
"""Classify PRs by agent identity."""
# PR title prefixes that explicitly name an agent
AGENT_TITLE_RE = re.compile(
r"^\[(?P<agent>Claude|Ezra|Allegro|Bezalel|Timmy|Gemini|Kimi|Manus|Codex)\]",
re.IGNORECASE,
)
# Branch patterns that embed agent names
AGENT_BRANCH_RE = re.compile(
r"(?P<agent>claude|ezra|allegro|bezalel|timmy|gemini|kimi|manus|codex)",
re.IGNORECASE,
)
@classmethod
def classify(cls, pr: Dict[str, Any]) -> str:
title = pr.get("title", "")
branch = pr.get("head", {}).get("ref", "")
user = pr.get("user", {}).get("login", "")
# 1. Explicit title tag like [Claude] or [Ezra]
m = cls.AGENT_TITLE_RE.match(title)
if m:
return m.group("agent").lower()
# 2. Branch contains agent name (e.g. claude/issue-123)
m = cls.AGENT_BRANCH_RE.search(branch)
if m:
return m.group("agent").lower()
# 3. Git user mapping
if user.lower() == "claude":
return "claude"
if user.lower() == "rockachopa":
# Rockachopa is the human / orchestrator — map to "burn-loop"
return "burn-loop"
return "unknown"
def fetch_prs(repo: str, state: str = "all", per_page: int = 50) -> List[Dict[str, Any]]:
"""Paginate through all PRs for a repo."""
prs: List[Dict[str, Any]] = []
page = 1
while True:
url = f"{GITEA_BASE}/repos/{ORG}/{repo}/pulls?state={state}&limit={per_page}&page={page}"
resp = requests.get(url, headers=HEADERS, timeout=30)
resp.raise_for_status()
batch = resp.json()
if not batch:
break
prs.extend(batch)
if len(batch) < per_page:
break
page += 1
return prs
def parse_datetime(dt_str: Optional[str]) -> Optional[datetime]:
if not dt_str:
return None
try:
return datetime.fromisoformat(dt_str.replace("Z", "+00:00"))
except ValueError:
return None
def hours_between(start: Optional[str], end: Optional[str]) -> Optional[float]:
s = parse_datetime(start)
e = parse_datetime(end)
if s and e:
return (e - s).total_seconds() / 3600
return None
def audit_repos(repos: List[str]) -> Dict[str, Any]:
"""Run the audit and return aggregated stats."""
agent_stats: Dict[str, Dict[str, Any]] = defaultdict(
lambda: {
"total": 0,
"merged": 0,
"closed_unmerged": 0,
"open": 0,
"hours_to_merge": [],
"hours_to_close": [],
"repos": set(),
"prs": [],
}
)
repo_stats: Dict[str, Dict[str, Any]] = {}
for repo in repos:
print(f"Fetching PRs for {repo} ...", file=sys.stderr)
try:
prs = fetch_prs(repo)
except requests.HTTPError as exc:
print(f" SKIP {repo}: {exc}", file=sys.stderr)
continue
repo_merged = 0
repo_total = len(prs)
for pr in prs:
agent = AgentClassifier.classify(pr)
s = agent_stats[agent]
s["total"] += 1
s["repos"].add(repo)
s["prs"].append(
{
"repo": repo,
"number": pr["number"],
"title": pr["title"],
"state": pr["state"],
"merged": pr.get("merged", False),
"created_at": pr.get("created_at"),
"merged_at": pr.get("merged_at"),
"closed_at": pr.get("closed_at"),
}
)
if pr.get("merged"):
s["merged"] += 1
repo_merged += 1
h = hours_between(pr.get("created_at"), pr.get("merged_at"))
if h is not None:
s["hours_to_merge"].append(h)
elif pr["state"] == "closed":
s["closed_unmerged"] += 1
h = hours_between(pr.get("created_at"), pr.get("closed_at"))
if h is not None:
s["hours_to_close"].append(h)
else:
s["open"] += 1
repo_stats[repo] = {
"total": repo_total,
"merged": repo_merged,
"merge_rate": round(repo_merged / repo_total, 2) if repo_total else 0,
}
# Compute derived metrics
summary = {}
for agent, s in sorted(agent_stats.items(), key=lambda x: -x[1]["total"]):
total = s["total"]
merged = s["merged"]
closed = s["closed_unmerged"]
resolved = merged + closed
merge_rate = round(merged / resolved, 3) if resolved else 0
avg_merge_hours = (
round(sum(s["hours_to_merge"]) / len(s["hours_to_merge"]), 1)
if s["hours_to_merge"]
else None
)
avg_close_hours = (
round(sum(s["hours_to_close"]) / len(s["hours_to_close"]), 1)
if s["hours_to_close"]
else None
)
summary[agent] = {
"total_prs": total,
"merged": merged,
"closed_unmerged": closed,
"open": s["open"],
"merge_rate": merge_rate,
"rejection_rate": round(closed / resolved, 3) if resolved else 0,
"avg_hours_to_merge": avg_merge_hours,
"avg_hours_to_close": avg_close_hours,
"repos": sorted(s["repos"]),
}
return {
"audited_at": datetime.now(timezone.utc).isoformat(),
"repos_audited": repos,
"repo_stats": repo_stats,
"agent_summary": summary,
"raw_prs": {a: s["prs"] for a, s in agent_stats.items()},
}
def render_scorecard(data: Dict[str, Any]) -> str:
"""Render a markdown scorecard."""
lines = [
"# Cross-Agent Quality Scorecard",
"",
f"**Audited at:** {data['audited_at']}",
f"**Repos audited:** {', '.join(data['repos_audited'])}",
"",
"## Per-Agent Summary",
"",
"| Agent | Total PRs | Merged | Closed (unmerged) | Open | Merge Rate | Rejection Rate | Avg Hours to Merge | Avg Hours to Close |",
"|---|---|---:|---:|---:|---:|---:|---:|---:|",
]
for agent, s in data["agent_summary"].items():
        merge_hours = f"{s['avg_hours_to_merge']:.1f}" if s["avg_hours_to_merge"] is not None else "—"
        close_hours = f"{s['avg_hours_to_close']:.1f}" if s["avg_hours_to_close"] is not None else "—"
lines.append(
f"| {agent} | {s['total_prs']} | {s['merged']} | {s['closed_unmerged']} | "
f"{s['open']} | {s['merge_rate']:.1%} | {s['rejection_rate']:.1%} | "
f"{merge_hours} | {close_hours} |"
)
lines.extend([
"",
"## Per-Repo Merge Rate",
"",
"| Repo | Total PRs | Merged | Merge Rate |",
"|---|---|---:|---:|",
])
for repo, s in sorted(data["repo_stats"].items(), key=lambda x: -x[1]["total"]):
lines.append(
f"| {repo} | {s['total']} | {s['merged']} | {s['merge_rate']:.1%} |"
)
lines.extend([
"",
"## Methodology",
"",
"- **Agent classification** uses three signals in priority order:",
" 1. Explicit title tag (e.g. `[Claude]`, `[Ezra]`)",
" 2. Branch name containing agent name (e.g. `claude/issue-123`)",
" 3. Git user (`claude` → claude, `Rockachopa` → burn-loop)",
"- **Merge rate** = merged / (merged + closed_unmerged). Open PRs are excluded.",
"- **Rejection rate** = closed_unmerged / (merged + closed_unmerged).",
"- **Time metrics** are computed from created_at to merged_at / closed_at.",
"",
"## Raw Data",
"",
"```json",
json.dumps(data["agent_summary"], indent=2),
"```",
"",
])
return "\n".join(lines) + "\n"
def main() -> int:
parser = argparse.ArgumentParser(description="Cross-agent quality audit")
parser.add_argument("--repos", nargs="+", default=DEFAULT_REPOS, help="Repos to audit")
parser.add_argument("--scorecard", default="timmy-config/agent-quality-scorecard.md", help="Output path")
parser.add_argument("--json", default=None, help="Also write raw JSON to path")
args = parser.parse_args()
data = audit_repos(args.repos)
scorecard_path = Path(args.scorecard)
scorecard_path.parent.mkdir(parents=True, exist_ok=True)
scorecard_path.write_text(render_scorecard(data))
print(f"Scorecard written to {scorecard_path}", file=sys.stderr)
if args.json:
json_path = Path(args.json)
json_path.parent.mkdir(parents=True, exist_ok=True)
json_path.write_text(json.dumps(data, indent=2, default=str))
print(f"Raw JSON written to {json_path}", file=sys.stderr)
return 0
if __name__ == "__main__":
raise SystemExit(main())


@@ -1,45 +0,0 @@
"""Tests for cross_agent_quality_audit.py — #518."""
import pytest
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent / "scripts"))
from cross_agent_quality_audit import AgentClassifier, hours_between
class TestAgentClassifier:
def test_title_tag_claude(self):
pr = {"title": "[Claude] fix auth middleware", "head": {"ref": "fix/123"}, "user": {"login": "rockachopa"}}
assert AgentClassifier.classify(pr) == "claude"
def test_title_tag_ezra(self):
pr = {"title": "[Ezra] tmux fleet launcher", "head": {"ref": "burn/10"}, "user": {"login": "rockachopa"}}
assert AgentClassifier.classify(pr) == "ezra"
def test_branch_name_claude(self):
pr = {"title": "fix auth", "head": {"ref": "claude/issue-1695"}, "user": {"login": "rockachopa"}}
assert AgentClassifier.classify(pr) == "claude"
def test_user_mapping(self):
pr = {"title": "some fix", "head": {"ref": "fix/1"}, "user": {"login": "claude"}}
assert AgentClassifier.classify(pr) == "claude"
def test_rockachopa_maps_to_burn_loop(self):
pr = {"title": "some fix", "head": {"ref": "fix/1"}, "user": {"login": "Rockachopa"}}
assert AgentClassifier.classify(pr) == "burn-loop"
def test_unknown_fallback(self):
pr = {"title": "some fix", "head": {"ref": "fix/1"}, "user": {"login": "random"}}
assert AgentClassifier.classify(pr) == "unknown"
class TestHoursBetween:
def test_same_day(self):
h = hours_between("2026-04-22T10:00:00Z", "2026-04-22T12:00:00Z")
assert h == 2.0
def test_none_returns_none(self):
assert hours_between(None, "2026-04-22T12:00:00Z") is None
assert hours_between("2026-04-22T10:00:00Z", None) is None


@@ -1,244 +0,0 @@
# Cross-Agent Quality Scorecard
**Audited at:** 2026-04-22T06:17:43.574309+00:00
**Repos audited:** timmy-home, hermes-agent, the-nexus, the-door, fleet-ops, burn-fleet, the-playground, compounding-intelligence, the-beacon, second-son-of-timmy, timmy-academy, timmy-config
## Per-Agent Summary
| Agent | Total PRs | Merged | Closed (unmerged) | Open | Merge Rate | Rejection Rate | Avg Hours to Merge | Avg Hours to Close |
|---|---|---:|---:|---:|---:|---:|---:|---:|
| burn-loop | 1733 | 346 | 1239 | 148 | 21.8% | 78.2% | 18.9 | 20.6 |
| unknown | 843 | 598 | 214 | 31 | 73.6% | 26.4% | 2.3 | 11.3 |
| claude | 264 | 138 | 121 | 5 | 53.3% | 46.7% | 3.3 | 6.2 |
| gemini | 95 | 24 | 70 | 1 | 25.5% | 74.5% | 0.5 | 11.3 |
| timmy | 28 | 15 | 11 | 2 | 57.7% | 42.3% | 9.8 | 20.2 |
| bezalel | 21 | 11 | 9 | 1 | 55.0% | 45.0% | 2.7 | 8.0 |
| allegro | 21 | 7 | 11 | 3 | 38.9% | 61.1% | 31.1 | 20.2 |
| ezra | 8 | 2 | 3 | 3 | 40.0% | 60.0% | 4.4 | 16.8 |
| kimi | 6 | 3 | 3 | 0 | 50.0% | 50.0% | 39.5 | 0.5 |
| manus | 6 | 5 | 1 | 0 | 83.3% | 16.7% | 0.0 | 18.8 |
| codex | 2 | 2 | 0 | 0 | 100.0% | 0.0% | 2.3 | — |
## Per-Repo Merge Rate
| Repo | Total PRs | Merged | Merge Rate |
|---|---|---:|---:|
| the-nexus | 985 | 501 | 51.0% |
| hermes-agent | 519 | 128 | 25.0% |
| timmy-config | 404 | 140 | 35.0% |
| timmy-home | 270 | 104 | 39.0% |
| fleet-ops | 266 | 84 | 32.0% |
| the-beacon | 175 | 62 | 35.0% |
| the-door | 153 | 31 | 20.0% |
| second-son-of-timmy | 111 | 82 | 74.0% |
| compounding-intelligence | 50 | 9 | 18.0% |
| the-playground | 44 | 2 | 5.0% |
| burn-fleet | 38 | 2 | 5.0% |
| timmy-academy | 12 | 6 | 50.0% |
## Methodology
- **Agent classification** uses three signals in priority order:
1. Explicit title tag (e.g. `[Claude]`, `[Ezra]`)
2. Branch name containing agent name (e.g. `claude/issue-123`)
3. Git user (`claude` → claude, `Rockachopa` → burn-loop)
- **Merge rate** = merged / (merged + closed_unmerged). Open PRs are excluded.
- **Rejection rate** = closed_unmerged / (merged + closed_unmerged).
- **Time metrics** are computed from created_at to merged_at / closed_at.
## Raw Data
```json
{
"burn-loop": {
"total_prs": 1733,
"merged": 346,
"closed_unmerged": 1239,
"open": 148,
"merge_rate": 0.218,
"rejection_rate": 0.782,
"avg_hours_to_merge": 18.9,
"avg_hours_to_close": 20.6,
"repos": [
"burn-fleet",
"compounding-intelligence",
"fleet-ops",
"hermes-agent",
"second-son-of-timmy",
"the-beacon",
"the-door",
"the-nexus",
"the-playground",
"timmy-academy",
"timmy-config",
"timmy-home"
]
},
"unknown": {
"total_prs": 843,
"merged": 598,
"closed_unmerged": 214,
"open": 31,
"merge_rate": 0.736,
"rejection_rate": 0.264,
"avg_hours_to_merge": 2.3,
"avg_hours_to_close": 11.3,
"repos": [
"fleet-ops",
"hermes-agent",
"second-son-of-timmy",
"the-beacon",
"the-door",
"the-nexus",
"timmy-academy",
"timmy-config",
"timmy-home"
]
},
"claude": {
"total_prs": 264,
"merged": 138,
"closed_unmerged": 121,
"open": 5,
"merge_rate": 0.533,
"rejection_rate": 0.467,
"avg_hours_to_merge": 3.3,
"avg_hours_to_close": 6.2,
"repos": [
"hermes-agent",
"the-nexus",
"timmy-config",
"timmy-home"
]
},
"gemini": {
"total_prs": 95,
"merged": 24,
"closed_unmerged": 70,
"open": 1,
"merge_rate": 0.255,
"rejection_rate": 0.745,
"avg_hours_to_merge": 0.5,
"avg_hours_to_close": 11.3,
"repos": [
"hermes-agent",
"the-nexus",
"timmy-config",
"timmy-home"
]
},
"timmy": {
"total_prs": 28,
"merged": 15,
"closed_unmerged": 11,
"open": 2,
"merge_rate": 0.577,
"rejection_rate": 0.423,
"avg_hours_to_merge": 9.8,
"avg_hours_to_close": 20.2,
"repos": [
"burn-fleet",
"hermes-agent",
"the-nexus",
"timmy-config",
"timmy-home"
]
},
"bezalel": {
"total_prs": 21,
"merged": 11,
"closed_unmerged": 9,
"open": 1,
"merge_rate": 0.55,
"rejection_rate": 0.45,
"avg_hours_to_merge": 2.7,
"avg_hours_to_close": 8.0,
"repos": [
"burn-fleet",
"hermes-agent",
"the-beacon",
"the-nexus",
"timmy-config",
"timmy-home"
]
},
"allegro": {
"total_prs": 21,
"merged": 7,
"closed_unmerged": 11,
"open": 3,
"merge_rate": 0.389,
"rejection_rate": 0.611,
"avg_hours_to_merge": 31.1,
"avg_hours_to_close": 20.2,
"repos": [
"burn-fleet",
"hermes-agent",
"the-beacon",
"the-nexus",
"timmy-config",
"timmy-home"
]
},
"ezra": {
"total_prs": 8,
"merged": 2,
"closed_unmerged": 3,
"open": 3,
"merge_rate": 0.4,
"rejection_rate": 0.6,
"avg_hours_to_merge": 4.4,
"avg_hours_to_close": 16.8,
"repos": [
"burn-fleet",
"fleet-ops",
"timmy-config",
"timmy-home"
]
},
"kimi": {
"total_prs": 6,
"merged": 3,
"closed_unmerged": 3,
"open": 0,
"merge_rate": 0.5,
"rejection_rate": 0.5,
"avg_hours_to_merge": 39.5,
"avg_hours_to_close": 0.5,
"repos": [
"hermes-agent",
"the-nexus",
"timmy-home"
]
},
"manus": {
"total_prs": 6,
"merged": 5,
"closed_unmerged": 1,
"open": 0,
"merge_rate": 0.833,
"rejection_rate": 0.167,
"avg_hours_to_merge": 0.0,
"avg_hours_to_close": 18.8,
"repos": [
"the-nexus",
"timmy-config"
]
},
"codex": {
"total_prs": 2,
"merged": 2,
"closed_unmerged": 0,
"open": 0,
"merge_rate": 1.0,
"rejection_rate": 0.0,
"avg_hours_to_merge": 2.3,
"avg_hours_to_close": null,
"repos": [
"timmy-config",
"timmy-home"
]
}
}
```