Compare commits
1 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 3f45cae90a | | |
@@ -1,107 +0,0 @@
# [DIRECTIVE] Unified Fleet Sovereignty & Comms Migration

Grounding report for `timmy-home #524`.

Issue #524 is a multi-lane directive, not a one-commit feature. This report grounds the directive in repo evidence, highlights stale cross-links, and names the missing operator bundles that still need real execution.

This remains a `Refs #524` artifact. The directive spans multiple repos and operator actions, so this report makes the current repo-side state executable without pretending the whole migration is complete.

## Directive Snapshot

- Repo-grounded workstreams: 0
- Partial workstreams: 4
- Missing workstreams: 1
- Drifted references: 4

## Reference Drift

- #813 is cited for Nostr Migration Leadership, but its current title is 'docs: refresh the-playground genome analysis (#671)'.
- #819 is cited for Nostr Migration Leadership, but its current title is 'docs: verify #648 already implemented (closes #818)'.
- #139 is cited for v0.7.0 Feature Audit, but its current title is '🐣 Allegro-Primus is born'.
- #103 is cited for Morrowind Local-First Benchmark, but its current title is 'Build comprehensive caching layer — cache everywhere'.
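
The drift flags above appear to come from a keyword check on each cited issue's current title. A minimal sketch of that check, using the keyword list the report generator declares for the Nostr lane; `is_aligned` is a hypothetical helper name chosen for illustration:

```python
def is_aligned(title, expected_keywords):
    """Return True when the title mentions at least one expected keyword.

    A lane with no expected keywords is treated as aligned by default.
    """
    if not expected_keywords:
        return True
    title_lower = title.lower()
    return any(kw.lower() in title_lower for kw in expected_keywords)


NOSTR_KEYWORDS = ["nostr", "relay", "telegram", "comms", "messenger"]

# Title of #813 as recorded in the report.
drifted = not is_aligned(
    "docs: refresh the-playground genome analysis (#671)", NOSTR_KEYWORDS
)
print(drifted)  # True: none of the Nostr keywords occur in the title
```

A substring match of this kind is cheap but coarse: a title that merely mentions "local" or "audit" in passing would count as aligned.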

## Workstream Matrix

### 1. Nostr Migration Leadership — PARTIAL

- Requirement: Replace Telegram with relay-based sovereign comms, verify wizard keypairs, and prove the NIP-29 group path is stable.
- Referenced issues:
  - #813 (closed) — docs: refresh the-playground genome analysis (#671) [DRIFT]
  - #819 (open) — docs: verify #648 already implemented (closes #818) [DRIFT]
- Repo evidence present:
  - `infrastructure/timmy-bridge/client/timmy_client.py` — Nostr event client scaffold already exists
  - `infrastructure/timmy-bridge/monitor/timmy_monitor.py` — Nostr relay monitor already exists
  - `specs/wizard-telegram-bot-cutover.md` — Telegram cutover planning exists, so the migration lane is real
- Missing operator deliverables:
  - wizard keypair inventory and ownership matrix
  - NIP-29 relay group verification report
  - operator runbook for cutting traffic off Telegram
- Why this lane remains open: The repo has Nostr-adjacent scaffolding, but the directive still lacks a verified migration packet and the cited issue links drift away from the stated Nostr scope.
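
The first missing deliverable, a wizard keypair inventory, could start as a small validation pass over a list of records. The sketch below is an assumption about what such an inventory might look like: the record fields (`wizard`, `pubkey`, `owner`), the sample values, and the `validate_inventory` helper are all hypothetical. The only outside fact relied on is that Nostr public keys are written as 64 lowercase hex characters (NIP-01).

```python
import re

# Nostr public keys are 32 bytes, written as 64 lowercase hex characters.
HEX_PUBKEY_RE = re.compile(r"^[0-9a-f]{64}$")


def validate_inventory(entries):
    """Return a list of human-readable problems found in the inventory."""
    problems = []
    seen = {}  # pubkey -> wizard that first claimed it
    for entry in entries:
        wizard = entry.get("wizard", "<unnamed>")
        pubkey = entry.get("pubkey", "")
        if not HEX_PUBKEY_RE.match(pubkey):
            problems.append(f"{wizard}: pubkey is not 64 lowercase hex characters")
        elif pubkey in seen:
            problems.append(f"{wizard}: pubkey already assigned to {seen[pubkey]}")
        else:
            seen[pubkey] = wizard
        if not entry.get("owner"):
            problems.append(f"{wizard}: no owner recorded")
    return problems


# Placeholder records, not real keys or real owners.
inventory = [
    {"wizard": "ezra", "pubkey": "ab" * 32, "owner": "operator-1"},
    {"wizard": "bezalel", "pubkey": "not-a-key", "owner": ""},
]
for problem in validate_inventory(inventory):
    print(problem)
```

This only checks the shape of the inventory; proving that an owner actually controls a key would need a signed challenge, which is out of scope for the sketch.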

### 2. Lexicon Enforcement — PARTIAL

- Requirement: Enforce the Fleet Lexicon in PR review and issue triage so the team uses one shared language.
- Referenced issues:
  - #388 (closed) — [KT] Fleet Lexicon & Techniques — Shared Vocabulary, Patterns, and Standards for All Agents [aligned]
- Repo evidence present:
  - `docs/WIZARD_APPRENTICESHIP_CHARTER.md` — The repo already uses wizard-language canon in docs
  - `specs/timmy-ezra-bezalel-canon-sheet.md` — Canonical agent naming already exists
  - `docs/OPERATIONS_DASHBOARD.md` — Operational roles are already described in repo language
- Missing operator deliverables:
  - machine-checkable lexicon policy for review/triage
  - terminology lint or reviewer checklist tied to the lexicon
- Why this lane remains open: The naming canon exists, but there is still no executable enforcement bundle that would catch drift during future reviews and triage passes.
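
One of the missing deliverables above is a terminology lint. A minimal sketch of what such a check could look like, assuming the lexicon can be expressed as a mapping from discouraged terms to preferred ones; the `LEXICON` entries and the `lint_text` name are illustrative placeholders, not the actual Fleet Lexicon:

```python
import re

# Hypothetical lexicon: discouraged term -> preferred term.
LEXICON = {
    "bot": "wizard",
    "chat group": "relay group",
}


def lint_text(text):
    """Return (line_number, discouraged, preferred) for each lexicon hit."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for discouraged, preferred in LEXICON.items():
            # Word boundaries keep "bot" from matching inside "robot".
            pattern = rf"\b{re.escape(discouraged)}\b"
            if re.search(pattern, line, re.IGNORECASE):
                findings.append((lineno, discouraged, preferred))
    return findings


sample = "The bot joined the channel.\nA robot is not a lexicon hit."
for lineno, discouraged, preferred in lint_text(sample):
    print(f"line {lineno}: prefer '{preferred}' over '{discouraged}'")
```

Run against PR descriptions or issue bodies, a check like this would give reviewers a concrete list instead of relying on memory of the canon.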

### 3. v0.7.0 Feature Audit — PARTIAL

- Requirement: Audit Hermes features that can reduce cloud dependency and turn the findings into a sovereignty implementation plan.
- Referenced issues:
  - #139 (open) — 🐣 Allegro-Primus is born [DRIFT]
- Repo evidence present:
  - `scripts/sovereignty_audit.py` — Cloud-vs-local audit machinery already exists
  - `reports/evaluations/2026-04-15-phase-4-sovereignty-audit.md` — Recent sovereignty audit report is committed
  - `timmy-local/README.md` — Local-first status is already documented for operators
- Missing operator deliverables:
  - Hermes v0.7.0 feature inventory linked to cloud-reduction leverage
  - Sovereignty Implementation Plan derived from that feature audit
- Why this lane remains open: The repo has sovereignty-audit infrastructure, but it does not yet contain the requested v0.7.0 feature inventory or the plan that turns those findings into rollout steps.

### 4. Morrowind Local-First Benchmark — PARTIAL

- Requirement: Compare cloud and local Morrowind agents, prove local parity where possible, and document the reasoning gap when it fails.
- Referenced issues:
  - #103 (open) — Build comprehensive caching layer — cache everywhere [DRIFT]
- Repo evidence present:
  - `morrowind/local_brain.py` — Local Morrowind control loop already exists
  - `morrowind/mcp_server.py` — Morrowind MCP control surface is already wired
  - `morrowind/pilot.py` — Trajectory logging for evaluation already exists
- Missing operator deliverables:
  - cloud-vs-local benchmark report for the combat loop
  - reasoning-gap writeup tied to a proposed LoRA/fine-tune path
- Why this lane remains open: The repo has a local Morrowind stack, but it does not yet contain the requested benchmark artifact; the cited issue number also points at an unrelated caching task.

### 5. Infrastructure Hardening / Syntax Guard — MISSING

- Requirement: Verify Syntax Guard pre-receive protection across Gitea repos so syntax failures stop earlier.
- Referenced issues: none listed in the directive body
- Repo evidence present: none
- Missing operator deliverables:
  - repo inventory of Gitea targets that should carry Syntax Guard
  - deployment verifier for hook presence across those repos
  - operator report proving installation state instead of assuming it
- Why this lane remains open: No repo-managed syntax-guard verifier is present yet, so this directive still depends on manual trust rather than auditable proof.
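
The "deployment verifier for hook presence" deliverable could begin as a filesystem check on the server that hosts the bare repositories. The sketch below assumes direct read access to a directory of `*.git` repos and assumes the hook lives at `hooks/pre-receive` inside each one; both are assumptions, and the real Gitea layout or Syntax Guard install path may differ, which is why the hook path is a parameter. `find_missing_hooks` is a hypothetical name.

```python
import os
from pathlib import Path


def find_missing_hooks(repo_root, hook_relpath="hooks/pre-receive"):
    """Return names of bare repos under repo_root lacking an executable hook."""
    missing = []
    for repo_dir in sorted(Path(repo_root).glob("*.git")):
        hook = repo_dir / hook_relpath
        # The hook must exist as a regular file and be executable to take effect.
        if not (hook.is_file() and os.access(hook, os.X_OK)):
            missing.append(repo_dir.name)
    return missing


if __name__ == "__main__":
    import sys

    # Usage (path is an example only): python verify_hooks.py /path/to/org/repos
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in find_missing_hooks(root):
        print(f"MISSING hook: {name}")
```

Presence and the executable bit are necessary but not sufficient; confirming that the hook is actually Syntax Guard would need a content or checksum comparison on top of this.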

## Highest-Leverage Next Actions

- Nostr Migration Leadership: wizard keypair inventory and ownership matrix
- Lexicon Enforcement: machine-checkable lexicon policy for review/triage
- v0.7.0 Feature Audit: Hermes v0.7.0 feature inventory linked to cloud-reduction leverage
- Morrowind Local-First Benchmark: cloud-vs-local benchmark report for the combat loop
- Infrastructure Hardening / Syntax Guard: repo inventory of Gitea targets that should carry Syntax Guard

## Why #524 Remains Open

- The directive bundles five separate workstreams with different evidence surfaces.
- Multiple cited issue numbers have drifted away from the work they are supposed to anchor.
- Repo scaffolding exists for Nostr, sovereignty audits, and Morrowind, but the operator-facing bundles are still missing.
- Syntax Guard verification is still undocumented and unproven inside this repo.
313
scripts/cross_agent_quality_audit.py
Normal file
@@ -0,0 +1,313 @@
#!/usr/bin/env python3
"""
Cross-agent quality audit — #518

Fetches all PRs across Timmy_Foundation repos, classifies by agent,
and produces a merge-rate scorecard.

Usage:
    python scripts/cross_agent_quality_audit.py
    python scripts/cross_agent_quality_audit.py --scorecard timmy-config/agent-quality-scorecard.md
"""

import argparse
import json
import os
import re
import sys
from collections import defaultdict
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional

import requests

GITEA_BASE = "https://forge.alexanderwhitestone.com/api/v1"
ORG = "Timmy_Foundation"
TOKEN = os.environ.get("GITEA_TOKEN") or (
    Path.home() / ".config" / "gitea" / "token"
).read_text().strip()

HEADERS = {"Authorization": f"token {TOKEN}"}

# Repos to audit (active code repos)
DEFAULT_REPOS = [
    "timmy-home",
    "hermes-agent",
    "the-nexus",
    "the-door",
    "fleet-ops",
    "burn-fleet",
    "the-playground",
    "compounding-intelligence",
    "the-beacon",
    "second-son-of-timmy",
    "timmy-academy",
    "timmy-config",
]


class AgentClassifier:
    """Classify PRs by agent identity."""

    # PR title prefixes that explicitly name an agent
    AGENT_TITLE_RE = re.compile(
        r"^\[(?P<agent>Claude|Ezra|Allegro|Bezalel|Timmy|Gemini|Kimi|Manus|Codex)\]",
        re.IGNORECASE,
    )

    # Branch patterns that embed agent names
    AGENT_BRANCH_RE = re.compile(
        r"(?P<agent>claude|ezra|allegro|bezalel|timmy|gemini|kimi|manus|codex)",
        re.IGNORECASE,
    )

    @classmethod
    def classify(cls, pr: Dict[str, Any]) -> str:
        title = pr.get("title", "")
        branch = pr.get("head", {}).get("ref", "")
        user = pr.get("user", {}).get("login", "")

        # 1. Explicit title tag like [Claude] or [Ezra]
        m = cls.AGENT_TITLE_RE.match(title)
        if m:
            return m.group("agent").lower()

        # 2. Branch contains agent name (e.g. claude/issue-123)
        m = cls.AGENT_BRANCH_RE.search(branch)
        if m:
            return m.group("agent").lower()

        # 3. Git user mapping
        if user.lower() == "claude":
            return "claude"
        if user.lower() == "rockachopa":
            # Rockachopa is the human / orchestrator — map to "burn-loop"
            return "burn-loop"

        return "unknown"


def fetch_prs(repo: str, state: str = "all", per_page: int = 50) -> List[Dict[str, Any]]:
    """Paginate through all PRs for a repo."""
    prs: List[Dict[str, Any]] = []
    page = 1
    while True:
        url = f"{GITEA_BASE}/repos/{ORG}/{repo}/pulls?state={state}&limit={per_page}&page={page}"
        resp = requests.get(url, headers=HEADERS, timeout=30)
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            break
        prs.extend(batch)
        if len(batch) < per_page:
            break
        page += 1
    return prs


def parse_datetime(dt_str: Optional[str]) -> Optional[datetime]:
    if not dt_str:
        return None
    try:
        return datetime.fromisoformat(dt_str.replace("Z", "+00:00"))
    except ValueError:
        return None


def hours_between(start: Optional[str], end: Optional[str]) -> Optional[float]:
    s = parse_datetime(start)
    e = parse_datetime(end)
    if s and e:
        return (e - s).total_seconds() / 3600
    return None


def audit_repos(repos: List[str]) -> Dict[str, Any]:
    """Run the audit and return aggregated stats."""
    agent_stats: Dict[str, Dict[str, Any]] = defaultdict(
        lambda: {
            "total": 0,
            "merged": 0,
            "closed_unmerged": 0,
            "open": 0,
            "hours_to_merge": [],
            "hours_to_close": [],
            "repos": set(),
            "prs": [],
        }
    )

    repo_stats: Dict[str, Dict[str, Any]] = {}

    for repo in repos:
        print(f"Fetching PRs for {repo} ...", file=sys.stderr)
        try:
            prs = fetch_prs(repo)
        except requests.HTTPError as exc:
            print(f"  SKIP {repo}: {exc}", file=sys.stderr)
            continue

        repo_merged = 0
        repo_total = len(prs)
        for pr in prs:
            agent = AgentClassifier.classify(pr)
            s = agent_stats[agent]
            s["total"] += 1
            s["repos"].add(repo)
            s["prs"].append(
                {
                    "repo": repo,
                    "number": pr["number"],
                    "title": pr["title"],
                    "state": pr["state"],
                    "merged": pr.get("merged", False),
                    "created_at": pr.get("created_at"),
                    "merged_at": pr.get("merged_at"),
                    "closed_at": pr.get("closed_at"),
                }
            )

            if pr.get("merged"):
                s["merged"] += 1
                repo_merged += 1
                h = hours_between(pr.get("created_at"), pr.get("merged_at"))
                if h is not None:
                    s["hours_to_merge"].append(h)
            elif pr["state"] == "closed":
                s["closed_unmerged"] += 1
                h = hours_between(pr.get("created_at"), pr.get("closed_at"))
                if h is not None:
                    s["hours_to_close"].append(h)
            else:
                s["open"] += 1

        repo_stats[repo] = {
            "total": repo_total,
            "merged": repo_merged,
            "merge_rate": round(repo_merged / repo_total, 2) if repo_total else 0,
        }

    # Compute derived metrics
    summary = {}
    for agent, s in sorted(agent_stats.items(), key=lambda x: -x[1]["total"]):
        total = s["total"]
        merged = s["merged"]
        closed = s["closed_unmerged"]
        resolved = merged + closed
        merge_rate = round(merged / resolved, 3) if resolved else 0
        avg_merge_hours = (
            round(sum(s["hours_to_merge"]) / len(s["hours_to_merge"]), 1)
            if s["hours_to_merge"]
            else None
        )
        avg_close_hours = (
            round(sum(s["hours_to_close"]) / len(s["hours_to_close"]), 1)
            if s["hours_to_close"]
            else None
        )
        summary[agent] = {
            "total_prs": total,
            "merged": merged,
            "closed_unmerged": closed,
            "open": s["open"],
            "merge_rate": merge_rate,
            "rejection_rate": round(closed / resolved, 3) if resolved else 0,
            "avg_hours_to_merge": avg_merge_hours,
            "avg_hours_to_close": avg_close_hours,
            "repos": sorted(s["repos"]),
        }

    return {
        "audited_at": datetime.now(timezone.utc).isoformat(),
        "repos_audited": repos,
        "repo_stats": repo_stats,
        "agent_summary": summary,
        "raw_prs": {a: s["prs"] for a, s in agent_stats.items()},
    }


def render_scorecard(data: Dict[str, Any]) -> str:
    """Render a markdown scorecard."""
    lines = [
        "# Cross-Agent Quality Scorecard",
        "",
        f"**Audited at:** {data['audited_at']}",
        f"**Repos audited:** {', '.join(data['repos_audited'])}",
        "",
        "## Per-Agent Summary",
        "",
        "| Agent | Total PRs | Merged | Closed (unmerged) | Open | Merge Rate | Rejection Rate | Avg Hours to Merge | Avg Hours to Close |",
        "|---|---|---:|---:|---:|---:|---:|---:|---:|",
    ]

    for agent, s in data["agent_summary"].items():
        merge_hours = f"{s['avg_hours_to_merge']:.1f}" if s["avg_hours_to_merge"] is not None else "—"
        close_hours = f"{s['avg_hours_to_close']:.1f}" if s["avg_hours_to_close"] is not None else "—"
        lines.append(
            f"| {agent} | {s['total_prs']} | {s['merged']} | {s['closed_unmerged']} | "
            f"{s['open']} | {s['merge_rate']:.1%} | {s['rejection_rate']:.1%} | "
            f"{merge_hours} | {close_hours} |"
        )

    lines.extend([
        "",
        "## Per-Repo Merge Rate",
        "",
        "| Repo | Total PRs | Merged | Merge Rate |",
        "|---|---|---:|---:|",
    ])

    for repo, s in sorted(data["repo_stats"].items(), key=lambda x: -x[1]["total"]):
        lines.append(
            f"| {repo} | {s['total']} | {s['merged']} | {s['merge_rate']:.1%} |"
        )

    lines.extend([
        "",
        "## Methodology",
        "",
        "- **Agent classification** uses three signals in priority order:",
        "  1. Explicit title tag (e.g. `[Claude]`, `[Ezra]`)",
        "  2. Branch name containing agent name (e.g. `claude/issue-123`)",
        "  3. Git user (`claude` → claude, `Rockachopa` → burn-loop)",
        "- **Merge rate** = merged / (merged + closed_unmerged). Open PRs are excluded.",
        "- **Rejection rate** = closed_unmerged / (merged + closed_unmerged).",
        "- **Time metrics** are computed from created_at to merged_at / closed_at.",
        "",
        "## Raw Data",
        "",
        "```json",
        json.dumps(data["agent_summary"], indent=2),
        "```",
        "",
    ])

    return "\n".join(lines) + "\n"


def main() -> int:
    parser = argparse.ArgumentParser(description="Cross-agent quality audit")
    parser.add_argument("--repos", nargs="+", default=DEFAULT_REPOS, help="Repos to audit")
    parser.add_argument("--scorecard", default="timmy-config/agent-quality-scorecard.md", help="Output path")
    parser.add_argument("--json", default=None, help="Also write raw JSON to path")
    args = parser.parse_args()

    data = audit_repos(args.repos)

    scorecard_path = Path(args.scorecard)
    scorecard_path.parent.mkdir(parents=True, exist_ok=True)
    scorecard_path.write_text(render_scorecard(data))
    print(f"Scorecard written to {scorecard_path}", file=sys.stderr)

    if args.json:
        json_path = Path(args.json)
        json_path.parent.mkdir(parents=True, exist_ok=True)
        json_path.write_text(json.dumps(data, indent=2, default=str))
        print(f"Raw JSON written to {json_path}", file=sys.stderr)

    return 0


if __name__ == "__main__":
    raise SystemExit(main())
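
The merge-rate and rejection-rate definitions in the script's Methodology strings can be checked by hand. A small worked sketch, using made-up counts (18 merged, 6 closed unmerged) that are not taken from any real audit; `merge_and_rejection_rate` is a hypothetical helper that mirrors the rounding used in `audit_repos`:

```python
def merge_and_rejection_rate(merged, closed_unmerged):
    """Return (merge_rate, rejection_rate) over resolved PRs only.

    Open PRs are excluded because they are neither merged nor rejected yet.
    """
    resolved = merged + closed_unmerged
    if resolved == 0:
        return 0.0, 0.0
    return round(merged / resolved, 3), round(closed_unmerged / resolved, 3)


# 18 merged + 6 closed unmerged = 24 resolved PRs.
print(merge_and_rejection_rate(merged=18, closed_unmerged=6))  # (0.75, 0.25)
```

Because both rates share the same denominator, they sum to 1 for any agent with at least one resolved PR, up to rounding. The per-repo merge rate in the script is computed differently: it divides by all PRs, including open ones.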
@@ -1,418 +0,0 @@
#!/usr/bin/env python3
"""Ground timmy-home #524 as an executable status report.

Refs: timmy-home #524
"""

from __future__ import annotations

import argparse
import json
from copy import deepcopy
from pathlib import Path
from typing import Any
from urllib import request

DEFAULT_BASE_URL = "https://forge.alexanderwhitestone.com/api/v1"
DEFAULT_OWNER = "Timmy_Foundation"
DEFAULT_REPO = "timmy-home"
DEFAULT_TOKEN_FILE = Path.home() / ".config" / "gitea" / "token"
DEFAULT_REPO_ROOT = Path(__file__).resolve().parents[1]
DEFAULT_DOC_PATH = DEFAULT_REPO_ROOT / "docs" / "UNIFIED_FLEET_SOVEREIGNTY_STATUS.md"

DIRECTIVE_TITLE = "[DIRECTIVE] Unified Fleet Sovereignty & Comms Migration"
DIRECTIVE_SUMMARY = (
    "Issue #524 is a multi-lane directive, not a one-commit feature. "
    "This report grounds the directive in repo evidence, highlights stale cross-links, "
    "and names the missing operator bundles that still need real execution."
)

DEFAULT_REFERENCE_SNAPSHOT = {
    388: {
        "title": "[KT] Fleet Lexicon & Techniques — Shared Vocabulary, Patterns, and Standards for All Agents",
        "state": "closed",
    },
    103: {
        "title": "Build comprehensive caching layer — cache everywhere",
        "state": "open",
    },
    139: {
        "title": "🐣 Allegro-Primus is born",
        "state": "open",
    },
    813: {
        "title": "docs: refresh the-playground genome analysis (#671)",
        "state": "closed",
    },
    819: {
        "title": "docs: verify #648 already implemented (closes #818)",
        "state": "open",
    },
}

WORKSTREAMS = [
    {
        "key": "nostr-migration",
        "name": "Nostr Migration Leadership",
        "requirement": "Replace Telegram with relay-based sovereign comms, verify wizard keypairs, and prove the NIP-29 group path is stable.",
        "references": [813, 819],
        "expected_keywords": ["nostr", "relay", "telegram", "comms", "messenger"],
        "repo_evidence": [
            {
                "path": "infrastructure/timmy-bridge/client/timmy_client.py",
                "description": "Nostr event client scaffold already exists",
            },
            {
                "path": "infrastructure/timmy-bridge/monitor/timmy_monitor.py",
                "description": "Nostr relay monitor already exists",
            },
            {
                "path": "specs/wizard-telegram-bot-cutover.md",
                "description": "Telegram cutover planning exists, so the migration lane is real",
            },
        ],
        "missing_deliverables": [
            "wizard keypair inventory and ownership matrix",
            "NIP-29 relay group verification report",
            "operator runbook for cutting traffic off Telegram",
        ],
        "why_open": "The repo has Nostr-adjacent scaffolding, but the directive still lacks a verified migration packet and the cited issue links drift away from the stated Nostr scope.",
    },
    {
        "key": "lexicon-enforcement",
        "name": "Lexicon Enforcement",
        "requirement": "Enforce the Fleet Lexicon in PR review and issue triage so the team uses one shared language.",
        "references": [388],
        "expected_keywords": ["lexicon", "vocabulary", "standards", "shared vocabulary"],
        "repo_evidence": [
            {
                "path": "docs/WIZARD_APPRENTICESHIP_CHARTER.md",
                "description": "The repo already uses wizard-language canon in docs",
            },
            {
                "path": "specs/timmy-ezra-bezalel-canon-sheet.md",
                "description": "Canonical agent naming already exists",
            },
            {
                "path": "docs/OPERATIONS_DASHBOARD.md",
                "description": "Operational roles are already described in repo language",
            },
        ],
        "missing_deliverables": [
            "machine-checkable lexicon policy for review/triage",
            "terminology lint or reviewer checklist tied to the lexicon",
        ],
        "why_open": "The naming canon exists, but there is still no executable enforcement bundle that would catch drift during future reviews and triage passes.",
    },
    {
        "key": "feature-audit",
        "name": "v0.7.0 Feature Audit",
        "requirement": "Audit Hermes features that can reduce cloud dependency and turn the findings into a sovereignty implementation plan.",
        "references": [139],
        "expected_keywords": ["hermes", "feature", "audit", "v0.7.0", "sovereignty"],
        "repo_evidence": [
            {
                "path": "scripts/sovereignty_audit.py",
                "description": "Cloud-vs-local audit machinery already exists",
            },
            {
                "path": "reports/evaluations/2026-04-15-phase-4-sovereignty-audit.md",
                "description": "Recent sovereignty audit report is committed",
            },
            {
                "path": "timmy-local/README.md",
                "description": "Local-first status is already documented for operators",
            },
        ],
        "missing_deliverables": [
            "Hermes v0.7.0 feature inventory linked to cloud-reduction leverage",
            "Sovereignty Implementation Plan derived from that feature audit",
        ],
        "why_open": "The repo has sovereignty-audit infrastructure, but it does not yet contain the requested v0.7.0 feature inventory or the plan that turns those findings into rollout steps.",
    },
    {
        "key": "morrowind-benchmark",
        "name": "Morrowind Local-First Benchmark",
        "requirement": "Compare cloud and local Morrowind agents, prove local parity where possible, and document the reasoning gap when it fails.",
        "references": [103],
        "expected_keywords": ["morrowind", "combat", "benchmark", "local", "cloud"],
        "repo_evidence": [
            {
                "path": "morrowind/local_brain.py",
                "description": "Local Morrowind control loop already exists",
            },
            {
                "path": "morrowind/mcp_server.py",
                "description": "Morrowind MCP control surface is already wired",
            },
            {
                "path": "morrowind/pilot.py",
                "description": "Trajectory logging for evaluation already exists",
            },
        ],
        "missing_deliverables": [
            "cloud-vs-local benchmark report for the combat loop",
            "reasoning-gap writeup tied to a proposed LoRA/fine-tune path",
        ],
        "why_open": "The repo has a local Morrowind stack, but it does not yet contain the requested benchmark artifact; the cited issue number also points at an unrelated caching task.",
    },
    {
        "key": "syntax-guard",
        "name": "Infrastructure Hardening / Syntax Guard",
        "requirement": "Verify Syntax Guard pre-receive protection across Gitea repos so syntax failures stop earlier.",
        "references": [],
        "expected_keywords": [],
        "repo_evidence": [],
        "missing_deliverables": [
            "repo inventory of Gitea targets that should carry Syntax Guard",
            "deployment verifier for hook presence across those repos",
            "operator report proving installation state instead of assuming it",
        ],
        "why_open": "No repo-managed syntax-guard verifier is present yet, so this directive still depends on manual trust rather than auditable proof.",
    },
]


def default_snapshot() -> dict[int, dict[str, str]]:
    return deepcopy(DEFAULT_REFERENCE_SNAPSHOT)


class GiteaClient:
    def __init__(self, token: str, owner: str = DEFAULT_OWNER, repo: str = DEFAULT_REPO, base_url: str = DEFAULT_BASE_URL):
        self.token = token
        self.owner = owner
        self.repo = repo
        self.base_url = base_url.rstrip("/")

    def get_issue(self, issue_number: int) -> dict[str, Any]:
        req = request.Request(
            f"{self.base_url}/repos/{self.owner}/{self.repo}/issues/{issue_number}",
            headers={"Authorization": f"token {self.token}", "Accept": "application/json"},
        )
        with request.urlopen(req, timeout=30) as resp:
            return json.loads(resp.read().decode())


def load_snapshot(path: Path | None = None) -> dict[int, dict[str, str]]:
    if path is None:
        return default_snapshot()
    data = json.loads(path.read_text(encoding="utf-8"))
    return {int(k): v for k, v in data.items()}


def refresh_snapshot(token_file: Path = DEFAULT_TOKEN_FILE) -> dict[int, dict[str, str]]:
    token = token_file.read_text(encoding="utf-8").strip()
    client = GiteaClient(token=token)
    snapshot: dict[int, dict[str, str]] = {}
    for issue_number in sorted(DEFAULT_REFERENCE_SNAPSHOT):
        issue = client.get_issue(issue_number)
        snapshot[issue_number] = {
            "title": issue["title"],
            "state": issue["state"],
        }
    return snapshot


def collect_repo_evidence(entries: list[dict[str, str]], repo_root: Path) -> tuple[list[str], list[str]]:
    present: list[str] = []
    missing: list[str] = []
    for entry in entries:
        label = f"`{entry['path']}` — {entry['description']}"
        if (repo_root / entry["path"]).exists():
            present.append(label)
        else:
            missing.append(label)
    return present, missing


def evaluate_reference(issue_number: int, snapshot: dict[int, dict[str, str]], expected_keywords: list[str]) -> dict[str, Any]:
    record = snapshot.get(issue_number, {"title": "missing from snapshot", "state": "unknown"})
    title = record["title"]
    title_lower = title.lower()
    matched_keywords = [kw for kw in expected_keywords if kw.lower() in title_lower]
    aligned = bool(matched_keywords) if expected_keywords else True
    return {
        "number": issue_number,
        "title": title,
        "state": record["state"],
        "aligned": aligned,
        "matched_keywords": matched_keywords,
    }


def classify_workstream(reference_results: list[dict[str, Any]], evidence_present: list[str], missing_deliverables: list[str]) -> str:
    has_drift = any(not item["aligned"] for item in reference_results)
    if not evidence_present:
        return "MISSING"
    if has_drift or missing_deliverables:
        return "PARTIAL"
    return "GROUNDED"


def evaluate_directive(snapshot: dict[int, dict[str, str]] | None = None, repo_root: Path | None = None) -> dict[str, Any]:
    snapshot = snapshot or default_snapshot()
    repo_root = repo_root or DEFAULT_REPO_ROOT
    workstreams: list[dict[str, Any]] = []
    drift_items: list[str] = []

    for lane in WORKSTREAMS:
        reference_results = [
            evaluate_reference(issue_number, snapshot, lane["expected_keywords"])
            for issue_number in lane["references"]
        ]
        present, missing = collect_repo_evidence(lane["repo_evidence"], repo_root)
        for item in reference_results:
            if not item["aligned"]:
                drift_items.append(
                    f"#{item['number']} is cited for {lane['name']}, but its current title is '{item['title']}'."
                )
        workstream = {
            "key": lane["key"],
            "name": lane["name"],
            "requirement": lane["requirement"],
            "reference_results": reference_results,
            "repo_evidence_present": present,
            "repo_evidence_missing": missing,
            "missing_deliverables": list(lane["missing_deliverables"]),
            "why_open": lane["why_open"],
        }
        workstream["status"] = classify_workstream(
            reference_results=reference_results,
            evidence_present=present,
            missing_deliverables=workstream["missing_deliverables"],
        )
        workstreams.append(workstream)

    next_actions: list[str] = []
    for workstream in workstreams:
        if workstream["missing_deliverables"]:
            next_actions.append(f"{workstream['name']}: {workstream['missing_deliverables'][0]}")

    return {
        "issue_number": 524,
        "title": DIRECTIVE_TITLE,
        "summary": DIRECTIVE_SUMMARY,
        "reference_snapshot": {str(k): v for k, v in sorted(snapshot.items())},
        "workstreams": workstreams,
        "reference_drift": drift_items,
        "grounded_workstreams": sum(1 for item in workstreams if item["status"] == "GROUNDED"),
        "partial_workstreams": sum(1 for item in workstreams if item["status"] == "PARTIAL"),
        "missing_workstreams": sum(1 for item in workstreams if item["status"] == "MISSING"),
        "next_actions": next_actions,
    }


def render_markdown(result: dict[str, Any]) -> str:
    lines = [
        f"# {result['title']}",
        "",
        "Grounding report for `timmy-home #524`.",
        "",
        result["summary"],
        "",
        "This remains a `Refs #524` artifact. The directive spans multiple repos and operator actions, so this report makes the current repo-side state executable without pretending the whole migration is complete.",
        "",
        "## Directive Snapshot",
        "",
        f"- Repo-grounded workstreams: {result['grounded_workstreams']}",
        f"- Partial workstreams: {result['partial_workstreams']}",
        f"- Missing workstreams: {result['missing_workstreams']}",
        f"- Drifted references: {len(result['reference_drift'])}",
        "",
        "## Reference Drift",
        "",
    ]
    if result["reference_drift"]:
        lines.extend(f"- {item}" for item in result["reference_drift"])
    else:
        lines.append("- No stale cross-links detected in the directive snapshot.")

    lines.extend(["", "## Workstream Matrix", ""])
    for index, workstream in enumerate(result["workstreams"], start=1):
        lines.extend(
            [
                f"### {index}. {workstream['name']} — {workstream['status']}",
                "",
                f"- Requirement: {workstream['requirement']}",
            ]
        )
        if workstream["reference_results"]:
            lines.append("- Referenced issues:")
            for ref in workstream["reference_results"]:
                alignment = "aligned" if ref["aligned"] else "DRIFT"
                lines.append(
                    f"  - #{ref['number']} ({ref['state']}) — {ref['title']} [{alignment}]"
                )
        else:
            lines.append("- Referenced issues: none listed in the directive body")

        if workstream["repo_evidence_present"]:
            lines.append("- Repo evidence present:")
            lines.extend(f"  - {item}" for item in workstream["repo_evidence_present"])
        else:
            lines.append("- Repo evidence present: none")

        if workstream["repo_evidence_missing"]:
            lines.append("- Repo evidence expected but missing:")
            lines.extend(f"  - {item}" for item in workstream["repo_evidence_missing"])

        if workstream["missing_deliverables"]:
            lines.append("- Missing operator deliverables:")
            lines.extend(f"  - {item}" for item in workstream["missing_deliverables"])
        else:
            lines.append("- Missing operator deliverables: none")

        lines.append(f"- Why this lane remains open: {workstream['why_open']}")
        lines.append("")

    lines.extend(["## Highest-Leverage Next Actions", ""])
    lines.extend(f"- {item}" for item in result["next_actions"])

    lines.extend(
        [
            "",
            "## Why #524 Remains Open",
            "",
            "- The directive bundles five separate workstreams with different evidence surfaces.",
            "- Multiple cited issue numbers have drifted away from the work they are supposed to anchor.",
            "- Repo scaffolding exists for Nostr, sovereignty audits, and Morrowind, but the operator-facing bundles are still missing.",
            "- Syntax Guard verification is still undocumented and unproven inside this repo.",
        ]
    )

    return "\n".join(lines).rstrip() + "\n"
|
||||
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = argparse.ArgumentParser(description="Render the unified fleet sovereignty status report for issue #524")
|
||||
parser.add_argument("--snapshot", help="Optional JSON snapshot file overriding the default issue-title/state snapshot")
|
||||
parser.add_argument("--live", action="store_true", help="Refresh the issue snapshot from Gitea before rendering")
|
||||
parser.add_argument("--token-file", default=str(DEFAULT_TOKEN_FILE), help="Token file used with --live")
|
||||
parser.add_argument("--output", help="Optional path to write the rendered report")
|
||||
parser.add_argument("--json", action="store_true", help="Print computed JSON instead of markdown")
|
||||
args = parser.parse_args()
|
||||
|
||||
if args.live:
|
||||
snapshot = refresh_snapshot(Path(args.token_file).expanduser())
|
||||
else:
|
||||
snapshot = load_snapshot(Path(args.snapshot).expanduser() if args.snapshot else None)
|
||||
|
||||
result = evaluate_directive(snapshot=snapshot, repo_root=DEFAULT_REPO_ROOT)
|
||||
rendered = json.dumps(result, indent=2) if args.json else render_markdown(result)
|
||||
|
||||
if args.output:
|
||||
output_path = Path(args.output).expanduser()
|
||||
output_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
output_path.write_text(rendered, encoding="utf-8")
|
||||
print(f"Directive status written to {output_path}")
|
||||
else:
|
||||
print(rendered)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
45
tests/test_cross_agent_quality_audit.py
Normal file
@@ -0,0 +1,45 @@
"""Tests for cross_agent_quality_audit.py — #518."""

import pytest
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent / "scripts"))

from cross_agent_quality_audit import AgentClassifier, hours_between


class TestAgentClassifier:
    def test_title_tag_claude(self):
        pr = {"title": "[Claude] fix auth middleware", "head": {"ref": "fix/123"}, "user": {"login": "rockachopa"}}
        assert AgentClassifier.classify(pr) == "claude"

    def test_title_tag_ezra(self):
        pr = {"title": "[Ezra] tmux fleet launcher", "head": {"ref": "burn/10"}, "user": {"login": "rockachopa"}}
        assert AgentClassifier.classify(pr) == "ezra"

    def test_branch_name_claude(self):
        pr = {"title": "fix auth", "head": {"ref": "claude/issue-1695"}, "user": {"login": "rockachopa"}}
        assert AgentClassifier.classify(pr) == "claude"

    def test_user_mapping(self):
        pr = {"title": "some fix", "head": {"ref": "fix/1"}, "user": {"login": "claude"}}
        assert AgentClassifier.classify(pr) == "claude"

    def test_rockachopa_maps_to_burn_loop(self):
        pr = {"title": "some fix", "head": {"ref": "fix/1"}, "user": {"login": "Rockachopa"}}
        assert AgentClassifier.classify(pr) == "burn-loop"

    def test_unknown_fallback(self):
        pr = {"title": "some fix", "head": {"ref": "fix/1"}, "user": {"login": "random"}}
        assert AgentClassifier.classify(pr) == "unknown"


class TestHoursBetween:
    def test_same_day(self):
        h = hours_between("2026-04-22T10:00:00Z", "2026-04-22T12:00:00Z")
        assert h == 2.0

    def test_none_returns_none(self):
        assert hours_between(None, "2026-04-22T12:00:00Z") is None
        assert hours_between("2026-04-22T10:00:00Z", None) is None
@@ -1,77 +0,0 @@
from __future__ import annotations

import importlib.util
from pathlib import Path


ROOT = Path(__file__).resolve().parents[1]
SCRIPT_PATH = ROOT / "scripts" / "unified_fleet_sovereignty_status.py"
DOC_PATH = ROOT / "docs" / "UNIFIED_FLEET_SOVEREIGNTY_STATUS.md"


def _load_module(path: Path, name: str):
    assert path.exists(), f"missing {path.relative_to(ROOT)}"
    spec = importlib.util.spec_from_file_location(name, path)
    assert spec and spec.loader
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def _workstream(result: dict, key: str) -> dict:
    for workstream in result["workstreams"]:
        if workstream["key"] == key:
            return workstream
    raise AssertionError(f"missing workstream {key}")


def test_evaluate_directive_flags_reference_drift_without_faking_completion() -> None:
    mod = _load_module(SCRIPT_PATH, "unified_fleet_sovereignty_status")
    result = mod.evaluate_directive(snapshot=mod.default_snapshot(), repo_root=ROOT)

    assert len(result["reference_drift"]) == 4
    assert any("#813" in item for item in result["reference_drift"])
    assert any("#103" in item for item in result["reference_drift"])

    nostr = _workstream(result, "nostr-migration")
    assert nostr["status"] == "PARTIAL"
    assert any("timmy_client.py" in item for item in nostr["repo_evidence_present"])

    lexicon = _workstream(result, "lexicon-enforcement")
    assert all(item["aligned"] for item in lexicon["reference_results"])
    assert lexicon["status"] == "PARTIAL"

    syntax_guard = _workstream(result, "syntax-guard")
    assert syntax_guard["status"] == "MISSING"
    assert any("deployment verifier" in item for item in syntax_guard["missing_deliverables"])


def test_render_markdown_includes_required_sections_and_grounding_evidence() -> None:
    mod = _load_module(SCRIPT_PATH, "unified_fleet_sovereignty_status")
    result = mod.evaluate_directive(snapshot=mod.default_snapshot(), repo_root=ROOT)
    report = mod.render_markdown(result)

    for snippet in (
        "# [DIRECTIVE] Unified Fleet Sovereignty & Comms Migration",
        "## Directive Snapshot",
        "## Reference Drift",
        "## Workstream Matrix",
        "### 5. Infrastructure Hardening / Syntax Guard — MISSING",
        "`infrastructure/timmy-bridge/client/timmy_client.py`",
        "machine-checkable lexicon policy for review/triage",
        "## Why #524 Remains Open",
    ):
        assert snippet in report


def test_repo_contains_committed_issue_524_grounding_doc() -> None:
    assert DOC_PATH.exists(), "missing committed directive grounding doc"
    text = DOC_PATH.read_text(encoding="utf-8")
    for snippet in (
        "# [DIRECTIVE] Unified Fleet Sovereignty & Comms Migration",
        "## Reference Drift",
        "## Workstream Matrix",
        "## Highest-Leverage Next Actions",
        "## Why #524 Remains Open",
    ):
        assert snippet in text
244
timmy-config/agent-quality-scorecard.md
Normal file
@@ -0,0 +1,244 @@
# Cross-Agent Quality Scorecard

**Audited at:** 2026-04-22T06:17:43.574309+00:00
**Repos audited:** timmy-home, hermes-agent, the-nexus, the-door, fleet-ops, burn-fleet, the-playground, compounding-intelligence, the-beacon, second-son-of-timmy, timmy-academy, timmy-config

## Per-Agent Summary

| Agent | Total PRs | Merged | Closed (unmerged) | Open | Merge Rate | Rejection Rate | Avg Hours to Merge | Avg Hours to Close |
|---|---|---:|---:|---:|---:|---:|---:|---:|
| burn-loop | 1733 | 346 | 1239 | 148 | 21.8% | 78.2% | 18.9 | 20.6 |
| unknown | 843 | 598 | 214 | 31 | 73.6% | 26.4% | 2.3 | 11.3 |
| claude | 264 | 138 | 121 | 5 | 53.3% | 46.7% | 3.3 | 6.2 |
| gemini | 95 | 24 | 70 | 1 | 25.5% | 74.5% | 0.5 | 11.3 |
| timmy | 28 | 15 | 11 | 2 | 57.7% | 42.3% | 9.8 | 20.2 |
| bezalel | 21 | 11 | 9 | 1 | 55.0% | 45.0% | 2.7 | 8.0 |
| allegro | 21 | 7 | 11 | 3 | 38.9% | 61.1% | 31.1 | 20.2 |
| ezra | 8 | 2 | 3 | 3 | 40.0% | 60.0% | 4.4 | 16.8 |
| kimi | 6 | 3 | 3 | 0 | 50.0% | 50.0% | 39.5 | 0.5 |
| manus | 6 | 5 | 1 | 0 | 83.3% | 16.7% | 0.0 | 18.8 |
| codex | 2 | 2 | 0 | 0 | 100.0% | 0.0% | 2.3 | — |

## Per-Repo Merge Rate

| Repo | Total PRs | Merged | Merge Rate |
|---|---|---:|---:|
| the-nexus | 985 | 501 | 51.0% |
| hermes-agent | 519 | 128 | 25.0% |
| timmy-config | 404 | 140 | 35.0% |
| timmy-home | 270 | 104 | 39.0% |
| fleet-ops | 266 | 84 | 32.0% |
| the-beacon | 175 | 62 | 35.0% |
| the-door | 153 | 31 | 20.0% |
| second-son-of-timmy | 111 | 82 | 74.0% |
| compounding-intelligence | 50 | 9 | 18.0% |
| the-playground | 44 | 2 | 5.0% |
| burn-fleet | 38 | 2 | 5.0% |
| timmy-academy | 12 | 6 | 50.0% |

## Methodology

- **Agent classification** uses three signals in priority order:
  1. Explicit title tag (e.g. `[Claude]`, `[Ezra]`)
  2. Branch name containing agent name (e.g. `claude/issue-123`)
  3. Git user (`claude` → claude, `Rockachopa` → burn-loop)
- **Merge rate** = merged / (merged + closed_unmerged). Open PRs are excluded.
- **Rejection rate** = closed_unmerged / (merged + closed_unmerged).
- **Time metrics** are computed from created_at to merged_at / closed_at.
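
The rules above can be sketched in a few lines of Python. This is an illustration only, not the audit script's actual `AgentClassifier`: the names `KNOWN_AGENTS`, `USER_TO_AGENT`, `classify_pr`, `rates`, and `elapsed_hours` are hypothetical, the agent list is taken from the summary table, and the case-sensitive user lookup and the timestamp format are assumptions.

```python
from __future__ import annotations

import re
from datetime import datetime

# Hypothetical lookup tables; the audit script's real tables are not shown here.
KNOWN_AGENTS = ("claude", "gemini", "timmy", "bezalel", "allegro", "ezra", "kimi", "manus", "codex")
USER_TO_AGENT = {"claude": "claude", "Rockachopa": "burn-loop"}


def classify_pr(pr: dict) -> str:
    """Label a PR by title tag first, then branch name, then git user."""
    # 1. Explicit title tag such as "[Claude] fix auth middleware".
    tag = re.match(r"\s*\[([^\]]+)\]", pr.get("title", ""))
    if tag and tag.group(1).lower() in KNOWN_AGENTS:
        return tag.group(1).lower()
    # 2. Branch name containing an agent name, e.g. "claude/issue-123".
    #    A plain substring test is an assumption and can over-match.
    branch = pr.get("head", {}).get("ref", "").lower()
    for agent in KNOWN_AGENTS:
        if agent in branch:
            return agent
    # 3. Git user mapping; anything unmapped falls back to "unknown".
    login = pr.get("user", {}).get("login", "")
    return USER_TO_AGENT.get(login, "unknown")


def rates(merged: int, closed_unmerged: int) -> tuple[float | None, float | None]:
    """Return (merge_rate, rejection_rate); open PRs never enter the denominator."""
    decided = merged + closed_unmerged
    if decided == 0:
        return None, None
    return round(merged / decided, 3), round(closed_unmerged / decided, 3)


def elapsed_hours(start: str | None, end: str | None) -> float | None:
    """Hours between two UTC timestamps like "2026-04-22T10:00:00Z"."""
    if not start or not end:
        return None
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # assumed timestamp format
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return round(delta.total_seconds() / 3600, 1)
```

With the burn-loop counts from the table, `rates(346, 1239)` gives `(0.218, 0.782)`, matching the 21.8% and 78.2% shown. The per-repo table appears to divide by total PRs instead (for example, 501 / 985 is about 51%), so this formula reproduces only the per-agent rates.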

## Raw Data

```json
{
  "burn-loop": {
    "total_prs": 1733,
    "merged": 346,
    "closed_unmerged": 1239,
    "open": 148,
    "merge_rate": 0.218,
    "rejection_rate": 0.782,
    "avg_hours_to_merge": 18.9,
    "avg_hours_to_close": 20.6,
    "repos": [
      "burn-fleet",
      "compounding-intelligence",
      "fleet-ops",
      "hermes-agent",
      "second-son-of-timmy",
      "the-beacon",
      "the-door",
      "the-nexus",
      "the-playground",
      "timmy-academy",
      "timmy-config",
      "timmy-home"
    ]
  },
  "unknown": {
    "total_prs": 843,
    "merged": 598,
    "closed_unmerged": 214,
    "open": 31,
    "merge_rate": 0.736,
    "rejection_rate": 0.264,
    "avg_hours_to_merge": 2.3,
    "avg_hours_to_close": 11.3,
    "repos": [
      "fleet-ops",
      "hermes-agent",
      "second-son-of-timmy",
      "the-beacon",
      "the-door",
      "the-nexus",
      "timmy-academy",
      "timmy-config",
      "timmy-home"
    ]
  },
  "claude": {
    "total_prs": 264,
    "merged": 138,
    "closed_unmerged": 121,
    "open": 5,
    "merge_rate": 0.533,
    "rejection_rate": 0.467,
    "avg_hours_to_merge": 3.3,
    "avg_hours_to_close": 6.2,
    "repos": [
      "hermes-agent",
      "the-nexus",
      "timmy-config",
      "timmy-home"
    ]
  },
  "gemini": {
    "total_prs": 95,
    "merged": 24,
    "closed_unmerged": 70,
    "open": 1,
    "merge_rate": 0.255,
    "rejection_rate": 0.745,
    "avg_hours_to_merge": 0.5,
    "avg_hours_to_close": 11.3,
    "repos": [
      "hermes-agent",
      "the-nexus",
      "timmy-config",
      "timmy-home"
    ]
  },
  "timmy": {
    "total_prs": 28,
    "merged": 15,
    "closed_unmerged": 11,
    "open": 2,
    "merge_rate": 0.577,
    "rejection_rate": 0.423,
    "avg_hours_to_merge": 9.8,
    "avg_hours_to_close": 20.2,
    "repos": [
      "burn-fleet",
      "hermes-agent",
      "the-nexus",
      "timmy-config",
      "timmy-home"
    ]
  },
  "bezalel": {
    "total_prs": 21,
    "merged": 11,
    "closed_unmerged": 9,
    "open": 1,
    "merge_rate": 0.55,
    "rejection_rate": 0.45,
    "avg_hours_to_merge": 2.7,
    "avg_hours_to_close": 8.0,
    "repos": [
      "burn-fleet",
      "hermes-agent",
      "the-beacon",
      "the-nexus",
      "timmy-config",
      "timmy-home"
    ]
  },
  "allegro": {
    "total_prs": 21,
    "merged": 7,
    "closed_unmerged": 11,
    "open": 3,
    "merge_rate": 0.389,
    "rejection_rate": 0.611,
    "avg_hours_to_merge": 31.1,
    "avg_hours_to_close": 20.2,
    "repos": [
      "burn-fleet",
      "hermes-agent",
      "the-beacon",
      "the-nexus",
      "timmy-config",
      "timmy-home"
    ]
  },
  "ezra": {
    "total_prs": 8,
    "merged": 2,
    "closed_unmerged": 3,
    "open": 3,
    "merge_rate": 0.4,
    "rejection_rate": 0.6,
    "avg_hours_to_merge": 4.4,
    "avg_hours_to_close": 16.8,
    "repos": [
      "burn-fleet",
      "fleet-ops",
      "timmy-config",
      "timmy-home"
    ]
  },
  "kimi": {
    "total_prs": 6,
    "merged": 3,
    "closed_unmerged": 3,
    "open": 0,
    "merge_rate": 0.5,
    "rejection_rate": 0.5,
    "avg_hours_to_merge": 39.5,
    "avg_hours_to_close": 0.5,
    "repos": [
      "hermes-agent",
      "the-nexus",
      "timmy-home"
    ]
  },
  "manus": {
    "total_prs": 6,
    "merged": 5,
    "closed_unmerged": 1,
    "open": 0,
    "merge_rate": 0.833,
    "rejection_rate": 0.167,
    "avg_hours_to_merge": 0.0,
    "avg_hours_to_close": 18.8,
    "repos": [
      "the-nexus",
      "timmy-config"
    ]
  },
  "codex": {
    "total_prs": 2,
    "merged": 2,
    "closed_unmerged": 0,
    "open": 0,
    "merge_rate": 1.0,
    "rejection_rate": 0.0,
    "avg_hours_to_merge": 2.3,
    "avg_hours_to_close": null,
    "repos": [
      "timmy-config",
      "timmy-home"
    ]
  }
}
```