Compare commits

1 Commits

Author SHA1 Message Date
Alexander Whitestone
38ce7e4bc5 feat: Add PR backlog management process (#1470)
Some checks failed
CI / test (pull_request) Failing after 1m14s
CI / validate (pull_request) Failing after 50s
Review Approval Gate / verify-review (pull_request) Failing after 5s
## Summary
Added tools and process for managing PR backlog in timmy-config.

## Problem
timmy-config has 31+ open PRs, the highest in the organization.
This creates confusion, slows down development, and increases
merge conflicts.

## Solution
Created automated tools and process for PR backlog management:

### 1. PR Backlog Analyzer (`scripts/pr-backlog-analyzer.py`)
- Fetches all open PRs from timmy-config
- Analyzes age, review status, labels
- Generates markdown report
- Categorizes PRs: stale, needs review, approved, changes requested
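
Fetching every open PR generally means following pagination, because a single request returns at most one page. The sketch below assumes a hypothetical `fetch_page(page)` callable that returns one page of PR dicts; `fetch_all_pages` and `fetch_page` are illustrative names, not part of the analyzer.

```python
from typing import Callable


def fetch_all_pages(fetch_page: Callable[[int], list]) -> list:
    """Collect items from a paginated listing until an empty page appears."""
    items = []
    page = 1
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        items.extend(batch)
        page += 1
    return items


# In-memory stand-in for the API: 120 made-up PRs served 50 per page.
fake_prs = [{"number": n} for n in range(1, 121)]


def fake_fetch(page: int) -> list:
    start = (page - 1) * 50
    return fake_prs[start:start + 50]


all_prs = fetch_all_pages(fake_fetch)
print(len(all_prs))
```

Stopping only on an empty page costs one extra request, but it stays correct when the server caps the page size below the requested `limit`.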

### 2. GitHub Actions Workflow (`.github/workflows/pr-backlog-management.yml`)
- Runs weekly on Monday at 10 AM UTC
- Analyzes PR backlog
- Creates issue if backlog is high (>10 stale PRs)
- Uploads report as artifact
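
The threshold check depends on reading the stale count back out of the generated report. A minimal Python sketch of that step, assuming the summary line keeps the form `- **Stale (>30 days)**: N` used by the analyzer's report; the function names are illustrative:

```python
import re

STALE_THRESHOLD = 10  # an issue is opened above this many stale PRs


def stale_count(report_text: str) -> int:
    """Extract the stale-PR count from the report summary, or 0 if it is absent."""
    # The label is wrapped in bold markers, so the pattern has to include them.
    match = re.search(r"\*\*Stale \(>30 days\)\*\*: (\d+)", report_text)
    return int(match.group(1)) if match else 0


def backlog_is_high(report_text: str, threshold: int = STALE_THRESHOLD) -> bool:
    """True when the report lists more stale PRs than the threshold."""
    return stale_count(report_text) > threshold


sample = "## Summary\n- **Total Open PRs**: 32\n- **Stale (>30 days)**: 12\n"
print(stale_count(sample), backlog_is_high(sample))
```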

### 3. Documentation (`docs/pr-backlog-process.md`)
- Weekly analysis process
- Review stale PRs procedure
- Merge approved PRs workflow
- Review pending PRs SLA
- Close duplicate PRs process
- Metrics to track
- Escalation procedures

## Usage

### Run Analyzer
```bash
python scripts/pr-backlog-analyzer.py
```

### View Report
```bash
cat reports/pr-backlog-$(date +%Y%m%d).md
```

## Metrics
- **Current**: 32 open PRs in timmy-config
- **Target**: <20 open PRs
- **SLA**: Review within 48 hours, merge within 7 days

Issue: #1470
2026-04-14 21:14:55 -04:00
6 changed files with 412 additions and 196 deletions

.github/workflows/pr-backlog-management.yml

@@ -0,0 +1,70 @@
name: PR Backlog Management

on:
  schedule:
    # Run weekly on Monday at 10 AM UTC
    - cron: '0 10 * * 1'
  workflow_dispatch: # Allow manual trigger

jobs:
  analyze-backlog:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          pip install requests
      - name: Analyze PR backlog
        env:
          GITEA_TOKEN: ${{ secrets.GITEA_TOKEN }}
        run: |
          python scripts/pr-backlog-analyzer.py
      - name: Upload report
        uses: actions/upload-artifact@v4
        with:
          name: pr-backlog-report
          path: reports/
      - name: Create issue if backlog is high
        # Run only when the analyzer succeeded, so the report file exists
        if: success()
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            // Reports are named pr-backlog-YYYYMMDD.md, so drop the dashes from the ISO date
            const stamp = new Date().toISOString().slice(0, 10).replace(/-/g, '');
            const report = fs.readFileSync('reports/pr-backlog-' + stamp + '.md', 'utf8');
            // Check if backlog is high (more than 10 stale PRs).
            // The summary line reads "- **Stale (>30 days)**: N", so match the bold markers too.
            const staleMatch = report.match(/Stale \(>30 days\)\*\*: (\d+)/);
            const staleCount = staleMatch ? parseInt(staleMatch[1], 10) : 0;
            if (staleCount > 10) {
              const title = 'PR Backlog Alert: ' + staleCount + ' stale PRs';
              const body = `## PR Backlog Alert
            The PR backlog analysis found ${staleCount} stale PRs (>30 days old).
            ### Recommendation
            Review and close stale PRs to reduce backlog.
            ### Report
            See attached artifact for full analysis.
            This issue was automatically created by the PR backlog management workflow.`;
              await github.rest.issues.create({
                owner: context.repo.owner,
                repo: context.repo.repo,
                title,
                body,
                labels: ['process-improvement', 'p2-backlog']
              });
            }

docs/pr-backlog-process.md Normal file

@@ -0,0 +1,126 @@
# PR Backlog Management Process
## Overview
This document outlines the process for managing PR backlog in the Timmy Foundation repositories, specifically addressing the high PR backlog in timmy-config.
## Current State
As of the latest analysis:
- **timmy-config**: 31 open PRs (highest in org)
- **the-nexus**: Multiple PRs for the same issues
- **hermes-agent**: Moderate PR count
## Process
### 1. Weekly Analysis
Run the PR backlog analyzer weekly:
```bash
python scripts/pr-backlog-analyzer.py
```
This generates a report in `reports/pr-backlog-YYYYMMDD.md`.
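
The report name carries a compact date stamp with no separators. A small sketch of how the path for a given day can be derived; `report_path` is an illustrative helper, not a function in the analyzer:

```python
from datetime import date
from pathlib import Path


def report_path(day: date, output_dir: Path = Path("reports")) -> Path:
    """Return the report path for a given day, e.g. pr-backlog-20260414.md."""
    return output_dir / f"pr-backlog-{day.strftime('%Y%m%d')}.md"


print(report_path(date(2026, 4, 14)).name)
```

Any consumer that looks the report up by date needs the same `YYYYMMDD` form; an ISO date with dashes (`2026-04-14`) would not match the file name.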
### 2. Review Stale PRs
PRs older than 30 days are considered stale. For each stale PR:
1. **Check relevance**: Is the PR still needed?
2. **Check conflicts**: Does it conflict with current main?
3. **Check activity**: Has there been recent activity?
4. **Action**: Close, update, or merge
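
The 30-day rule can be expressed as a small age check. This is a sketch under the assumption that `created_at` is an ISO 8601 timestamp with a `Z` suffix or an explicit UTC offset; the helper names are illustrative:

```python
from datetime import datetime, timezone

STALE_AFTER_DAYS = 30


def pr_age_days(created_at: str, now: datetime) -> int:
    """Whole days between the PR's creation timestamp and `now`."""
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return (now - created).days


def is_stale(created_at: str, now: datetime, threshold: int = STALE_AFTER_DAYS) -> bool:
    """A PR is stale once it is more than `threshold` days old."""
    return pr_age_days(created_at, now) > threshold


now = datetime(2026, 4, 14, 12, 0, tzinfo=timezone.utc)
print(is_stale("2026-03-01T09:00:00Z", now))
print(is_stale("2026-04-10T09:00:00Z", now))
```

Passing `now` in explicitly keeps the check deterministic, which makes it easy to test against fixed dates.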
### 3. Merge Approved PRs
PRs with approvals should be merged within 7 days:
1. **Verify CI**: Ensure all checks pass
2. **Verify review**: At least 1 approval
3. **Merge**: Use squash merge for clean history
4. **Delete branch**: Clean up after merge
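
The two verification steps reduce to a simple gate. The sketch below assumes reviews arrive as a list of dicts with a `state` field, mirroring the analyzer's `has_approvals` check, and that CI status is already known as a boolean. Reviews may need to be fetched separately from the PR listing, so they are taken as an argument here. All names are illustrative.

```python
def count_approvals(reviews: list) -> int:
    """Number of reviews whose state is APPROVED."""
    return sum(1 for review in reviews if review.get("state") == "APPROVED")


def ready_to_merge(reviews: list, ci_passed: bool, min_approvals: int = 1) -> bool:
    """A PR is ready when CI passed and it has at least `min_approvals` approvals."""
    return ci_passed and count_approvals(reviews) >= min_approvals


# Made-up review data for illustration.
reviews = [{"state": "COMMENT"}, {"state": "APPROVED"}]
print(ready_to_merge(reviews, ci_passed=True))
print(ready_to_merge(reviews, ci_passed=False))
```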
### 4. Review Pending PRs
PRs waiting for review should be reviewed within 48 hours:
1. **Assign reviewer**: Ensure someone is responsible
2. **Review**: Check code quality, tests, documentation
3. **Approve or request changes**: Don't leave PRs in limbo
4. **Follow up**: If no response in 48 hours, escalate
### 5. Close Duplicate PRs
Multiple PRs for the same issue should be consolidated:
1. **Identify duplicates**: Same issue number or similar changes
2. **Keep newest**: Usually the most up-to-date
3. **Close older**: With explanatory comments
4. **Document**: Update issue with which PR was kept
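
Duplicate detection can be sketched as grouping PRs by the issue numbers they mention and keeping the highest-numbered PR in each group. The example assumes each PR is a dict with `number`, `title`, and `body`. The sample data is invented, and the helper names are illustrative.

```python
import re
from collections import defaultdict


def group_by_issue(prs: list) -> dict:
    """Map issue number -> PRs that mention it, keeping only issues with 2+ PRs."""
    groups = defaultdict(list)
    for pr in prs:
        text = f"{pr.get('title') or ''} {pr.get('body') or ''}"
        for issue in set(re.findall(r"#(\d+)", text)):
            groups[int(issue)].append(pr)
    return {issue: group for issue, group in groups.items() if len(group) > 1}


def plan_cleanup(prs: list) -> dict:
    """For each duplicated issue, keep the newest PR and list the rest for closing."""
    plan = {}
    for issue, group in group_by_issue(prs).items():
        keep = max(group, key=lambda pr: pr["number"])
        close = sorted(pr["number"] for pr in group if pr["number"] != keep["number"])
        plan[issue] = {"keep": keep["number"], "close": close}
    return plan


# Hypothetical PRs: two reference issue 100, one references issue 200.
prs = [
    {"number": 101, "title": "fix: first attempt (#100)", "body": ""},
    {"number": 105, "title": "fix: second attempt", "body": "Closes #100"},
    {"number": 110, "title": "feat: unrelated change #200", "body": ""},
]
print(plan_cleanup(prs))
```

Matching on `#<number>` is a heuristic: a PR that merely mentions another issue or PR in passing is grouped with it, so the resulting plan is best reviewed before anything is closed.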
## Automation
### GitHub Actions Workflow
The `pr-backlog-management.yml` workflow runs weekly to:
1. Analyze all open PRs
2. Generate a report
3. Create an issue if backlog is high (>10 stale PRs)
### Manual Trigger
The workflow can be triggered manually via GitHub Actions UI.
## Metrics
Track these metrics weekly:
- **Total open PRs**: Should be <20 per repo
- **Stale PRs**: Should be <5 per repo
- **Average PR age**: Should be <14 days
- **Time to review**: Should be <48 hours
- **Time to merge**: Should be <7 days after approval
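
The count-based and age-based targets can be checked mechanically from a list of PR ages. A sketch, assuming ages in whole days; time to review and time to merge are left out because they need review and merge timestamps. `TARGETS` and the function names are illustrative.

```python
from statistics import mean

# Upper bounds taken from the list above; a metric must stay strictly below its bound.
TARGETS = {"total_open": 20, "stale": 5, "average_age_days": 14}


def backlog_metrics(ages_days: list, stale_after: int = 30) -> dict:
    """Compute total open PRs, stale PRs, and average age from per-PR ages."""
    return {
        "total_open": len(ages_days),
        "stale": sum(1 for age in ages_days if age > stale_after),
        "average_age_days": mean(ages_days) if ages_days else 0,
    }


def missed_targets(metrics: dict, targets: dict = TARGETS) -> list:
    """Names of the metrics that are at or above their upper bound."""
    return [name for name, limit in targets.items() if metrics[name] >= limit]


ages = [1, 2, 3, 40, 45]  # made-up ages in days
metrics = backlog_metrics(ages)
print(metrics, missed_targets(metrics))
```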
## Escalation
If backlog exceeds thresholds:
1. **Level 1**: Automated issue created
2. **Level 2**: Team lead notified
3. **Level 3**: Organization-wide cleanup sprint
## Tools
### PR Backlog Analyzer
```bash
# Run analysis
python scripts/pr-backlog-analyzer.py
# View report
cat reports/pr-backlog-$(date +%Y%m%d).md
```
### Manual Cleanup
```bash
# List stale PRs
curl -s -H "Authorization: token $GITEA_TOKEN" "https://forge.alexanderwhitestone.com/api/v1/repos/Timmy_Foundation/timmy-config/pulls?state=open" | jq -r '.[] | select(.created_at < "'$(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%SZ)'") | .number'
# Close a PR
curl -s -X PATCH -H "Authorization: token $GITEA_TOKEN" -H "Content-Type: application/json" -d '{"state": "closed"}' "https://forge.alexanderwhitestone.com/api/v1/repos/Timmy_Foundation/timmy-config/pulls/123"
```
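
The close call can also be prepared from Python with the standard library. The sketch below only builds the request, mirroring the `curl -X PATCH` command above; `build_close_request` is an illustrative name and the token is a placeholder.

```python
import json
import urllib.request

GITEA_URL = "https://forge.alexanderwhitestone.com"


def build_close_request(repo: str, number: int, token: str) -> urllib.request.Request:
    """Build (but do not send) the PATCH request that closes a pull request."""
    return urllib.request.Request(
        f"{GITEA_URL}/api/v1/repos/{repo}/pulls/{number}",
        data=json.dumps({"state": "closed"}).encode(),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


req = build_close_request("Timmy_Foundation/timmy-config", 123, "example-token")
print(req.get_method(), req.full_url)
# Sending is a separate, deliberate step:
# urllib.request.urlopen(req, timeout=15)
```

Keeping construction and sending apart makes a dry run trivial: print the request and stop.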
## Success Criteria
- **Short-term**: Reduce timmy-config PRs from 31 to <20
- **Medium-term**: Maintain <15 open PRs across all repos
- **Long-term**: Automated PR lifecycle management
## Related
- Issue #1470: process: Address timmy-config PR backlog (9 PRs - highest in org)
- Issue #1127: Evening triage pass
- Issue #1128: Forge Cleanup


@@ -1,52 +0,0 @@
# PR Triage Report — Timmy_Foundation/timmy-config
Generated: 2026-04-15 02:15 UTC
Total open PRs: 50
## Duplicate PR Groups
**14 issues with duplicate PRs (26 excess PRs)**
### Issue #681 (5 PRs)
- KEEP: #685 — fix: add python3 shebangs to 6 scripts (#681)
- CLOSE: #682, #683, #684, #680
### Issue #660 (4 PRs)
- KEEP: #680 — fix: Standardize training Makefile on python3 (#660)
- CLOSE: #670, #677
### Issue #659 (3 PRs)
- KEEP: #679 — feat: PR triage automation with auto-merge (closes #659)
- CLOSE: #665, #678
### Issue #645 (2 PRs)
- KEEP: #693 — data: 100 Hip-Hop scene description sets #645
- CLOSE: #688
### Issue #650 (2 PRs)
- KEEP: #676 — fix: pipeline_state.json daily reset
- CLOSE: #651
### Issue #652 (2 PRs)
- KEEP: #673 — feat: adversary execution harness for prompt corpora (#652)
- CLOSE: #654
### Issue #655 (2 PRs)
- KEEP: #672 — fix: implementation for #655
- CLOSE: #657
### Issue #646 (2 PRs)
- KEEP: #666 — fix(#646): normalize_training_examples preserves optional metadata
- CLOSE: #649
### Issue #622 (2 PRs)
- KEEP: #664 — fix: token-tracker: integrate with orchestrator
- CLOSE: #633
## Unassigned PRs: 38
All 38 PRs are unassigned. Recommend batch assignment to available reviewers.
## Recommendations
1. Close 26 duplicate PRs (keep newest for each issue)
2. Assign reviewers to all PRs
3. Add duplicate-PR prevention check to CI
4. Run this tool weekly to maintain backlog health


@@ -0,0 +1,35 @@
# PR Backlog Report — Timmy_Foundation/timmy-config
Generated: 2026-04-14 21:13:34
## Summary
- **Total Open PRs**: 32
- **Stale (>30 days)**: 0
- **Needs Review**: 0
- **Approved**: 0
- **Changes Requested**: 0
- **Recent (<7 days)**: 32
## Recommendations
### Immediate Actions
1. **Merge approved PRs**: 0 PRs are ready to merge
2. **Review stale PRs**: 0 PRs are >30 days old
3. **Address changes requested**: 0 PRs need updates
### Process Improvements
1. **Assign reviewers**: Ensure each PR has a reviewer within 24 hours
2. **Set SLAs**:
   - Review within 48 hours
   - Merge within 7 days of approval
   - Close stale PRs after 30 days
3. **Automate**: Add CI checks to prevent backlog
## Detailed Analysis
### Stale PRs (>30 days)
### Approved PRs (Ready to Merge)
### Needs Review

scripts/pr-backlog-analyzer.py Executable file

@@ -0,0 +1,181 @@
#!/usr/bin/env python3
"""
PR Backlog Analyzer for timmy-config

Analyzes open PRs and provides recommendations for cleanup.
"""
import json
import os
import subprocess
import sys
from datetime import datetime
from pathlib import Path


def get_open_prs(repo: str, token: str) -> list:
    """Get open PRs from a repository (first page of results only)."""
    result = subprocess.run([
        "curl", "-s", "-H", f"Authorization: token {token}",
        f"https://forge.alexanderwhitestone.com/api/v1/repos/{repo}/pulls?state=open&limit=100"
    ], capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Error fetching PRs: {result.stderr}")
        return []
    try:
        data = json.loads(result.stdout)
    except json.JSONDecodeError as exc:
        print(f"Error parsing API response: {exc}")
        return []
    # API errors come back as a JSON object rather than a list of PRs
    if not isinstance(data, list):
        print(f"Unexpected API response: {data}")
        return []
    return data


def analyze_pr(pr: dict) -> dict:
    """Analyze a single PR."""
    created = datetime.fromisoformat(pr['created_at'].replace('Z', '+00:00'))
    age_days = (datetime.now(created.tzinfo) - created).days
    # Check for reviews; if the payload has no 'reviews' key, both flags stay False
    reviews = pr.get('reviews') or []
    has_approvals = any(r.get('state') == 'APPROVED' for r in reviews)
    has_changes_requested = any(r.get('state') == 'CHANGES_REQUESTED' for r in reviews)
    # Check labels
    labels = [label['name'] for label in pr.get('labels', [])]
    return {
        'number': pr['number'],
        'title': pr['title'],
        'branch': pr['head']['ref'],
        'created': pr['created_at'],
        'age_days': age_days,
        'user': pr['user']['login'],
        'has_approvals': has_approvals,
        'has_changes_requested': has_changes_requested,
        'labels': labels,
        'url': pr['html_url'],
    }


def categorize_prs(prs: list) -> dict:
    """Categorize PRs by status. Each PR lands in exactly one category."""
    categories = {
        'stale': [],              # > 30 days old
        'needs_review': [],       # 7-30 days old, no approval or change request
        'approved': [],           # Approved but not merged
        'changes_requested': [],  # Changes requested
        'recent': [],             # < 7 days old
    }
    for pr in prs:
        if pr['age_days'] > 30:
            categories['stale'].append(pr)
        elif pr['has_approvals']:
            categories['approved'].append(pr)
        elif pr['has_changes_requested']:
            categories['changes_requested'].append(pr)
        elif pr['age_days'] < 7:
            categories['recent'].append(pr)
        else:
            categories['needs_review'].append(pr)
    return categories


def format_pr_list(prs: list) -> str:
    """Render PRs as a markdown bullet list with age, author, and URL."""
    section = ""
    for pr in prs:
        section += f"- **#{pr['number']}**: {pr['title']}\n"
        section += f"  - Age: {pr['age_days']} days\n"
        section += f"  - Author: {pr['user']}\n"
        section += f"  - URL: {pr['url']}\n\n"
    return section


def generate_report(repo: str, prs: list, categories: dict) -> str:
    """Generate a markdown report."""
    report = f"""# PR Backlog Report — {repo}
Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
## Summary
- **Total Open PRs**: {len(prs)}
- **Stale (>30 days)**: {len(categories['stale'])}
- **Needs Review**: {len(categories['needs_review'])}
- **Approved**: {len(categories['approved'])}
- **Changes Requested**: {len(categories['changes_requested'])}
- **Recent (<7 days)**: {len(categories['recent'])}
## Recommendations
### Immediate Actions
1. **Merge approved PRs**: {len(categories['approved'])} PRs are ready to merge
2. **Review stale PRs**: {len(categories['stale'])} PRs are >30 days old
3. **Address changes requested**: {len(categories['changes_requested'])} PRs need updates
### Process Improvements
1. **Assign reviewers**: Ensure each PR has a reviewer within 24 hours
2. **Set SLAs**:
   - Review within 48 hours
   - Merge within 7 days of approval
   - Close stale PRs after 30 days
3. **Automate**: Add CI checks to prevent backlog
## Detailed Analysis
### Stale PRs (>30 days)
"""
    report += format_pr_list(categories['stale'])
    report += "\n### Approved PRs (Ready to Merge)\n"
    report += format_pr_list(categories['approved'])
    report += "\n### Needs Review\n"
    report += format_pr_list(categories['needs_review'])
    return report


def main():
    """Main function."""
    # Prefer the GITEA_TOKEN environment variable (set by the workflow),
    # falling back to the local token file.
    token_str = os.environ.get("GITEA_TOKEN", "").strip()
    if not token_str:
        token = Path.home() / '.config' / 'gitea' / 'token'
        if not token.exists():
            print("Error: Gitea token not found")
            sys.exit(1)
        token_str = token.read_text().strip()
    repo = "Timmy_Foundation/timmy-config"
    print(f"Fetching PRs for {repo}...")
    prs = get_open_prs(repo, token_str)
    if not prs:
        print("No open PRs found")
        return
    print(f"Found {len(prs)} open PRs")
    # Analyze PRs
    analyzed = [analyze_pr(pr) for pr in prs]
    categories = categorize_prs(analyzed)
    # Generate report
    report = generate_report(repo, analyzed, categories)
    # Save report
    output_dir = Path("reports")
    output_dir.mkdir(exist_ok=True)
    report_file = output_dir / f"pr-backlog-{datetime.now().strftime('%Y%m%d')}.md"
    report_file.write_text(report)
    print(f"\nReport saved to: {report_file}")
    print("\nSummary:")
    print(f"  Total PRs: {len(prs)}")
    print(f"  Stale: {len(categories['stale'])}")
    print(f"  Approved: {len(categories['approved'])}")
    print(f"  Needs Review: {len(categories['needs_review'])}")


if __name__ == "__main__":
    main()


@@ -1,144 +0,0 @@
#!/usr/bin/env python3
"""
pr_triage.py — Triage PR backlog for timmy-config.

Identifies duplicate PRs for the same issue, unassigned PRs,
and recommends which to close/merge.

Usage:
    python3 scripts/pr_triage.py --repo Timmy_Foundation/timmy-config
    python3 scripts/pr_triage.py --repo Timmy_Foundation/timmy-config --close-duplicates --dry-run
"""
import argparse
import json
import re
import urllib.request
from collections import defaultdict
from datetime import datetime, timezone
from pathlib import Path

GITEA_URL = "https://forge.alexanderwhitestone.com"


def get_token():
    return (Path.home() / ".config" / "gitea" / "token").read_text().strip()


def fetch_open_prs(repo, headers):
    """Fetch all open PRs, following pagination until a short or empty page."""
    all_prs = []
    page = 1
    while True:
        url = f"{GITEA_URL}/api/v1/repos/{repo}/pulls?state=open&limit=100&page={page}"
        req = urllib.request.Request(url, headers=headers)
        resp = urllib.request.urlopen(req, timeout=15)
        data = json.loads(resp.read())
        if not data:
            break
        all_prs.extend(data)
        if len(data) < 100:
            break
        page += 1
    return all_prs


def find_duplicate_groups(prs):
    """Group PRs by every issue number mentioned in their title or body."""
    issue_prs = defaultdict(list)
    for pr in prs:
        text = (pr.get("body") or "") + " " + (pr.get("title") or "")
        issues = set(re.findall(r"#(\d+)", text))
        for iss in issues:
            issue_prs[iss].append(pr)
    return {k: v for k, v in issue_prs.items() if len(v) > 1}


def generate_report(repo, prs):
    now = datetime.now(timezone.utc)
    lines = [f"# PR Triage Report — {repo}",
             f"\nGenerated: {now.strftime('%Y-%m-%d %H:%M UTC')}",
             f"Total open PRs: {len(prs)}", ""]
    duplicates = find_duplicate_groups(prs)
    unassigned = [p for p in prs if not p.get("assignee")]
    lines.append("## Duplicate PR Groups")
    if duplicates:
        total_dupes = sum(len(v) - 1 for v in duplicates.values())
        lines.append(f"**{len(duplicates)} issues with duplicate PRs ({total_dupes} excess PRs)**")
        for issue, pr_group in sorted(duplicates.items(), key=lambda x: -len(x[1])):
            # Keep the highest-numbered (newest) PR in each group
            keep = max(pr_group, key=lambda p: p["number"])
            close = [p for p in pr_group if p["number"] != keep["number"]]
            lines.append(f"\n### Issue #{issue} ({len(pr_group)} PRs)")
            lines.append(f"- **KEEP:** #{keep['number']}: {keep['title'][:60]}")
            for p in close:
                lines.append(f"- CLOSE: #{p['number']}: {p['title'][:60]}")
    else:
        lines.append("No duplicate PR groups found.")
    lines.append("")
    lines.append(f"## Unassigned PRs: {len(unassigned)}")
    for p in unassigned[:10]:
        lines.append(f"- #{p['number']}: {p['title'][:70]}")
    if len(unassigned) > 10:
        lines.append(f"- ... and {len(unassigned) - 10} more")
    lines.append("")
    lines.append("## Recommendations")
    excess = sum(len(v) - 1 for v in duplicates.values())
    lines.append(f"1. Close {excess} duplicate PRs (keep newest for each issue)")
    lines.append(f"2. Assign reviewers to {len(unassigned)} unassigned PRs")
    lines.append("3. Consider adding duplicate-PR prevention to CI")
    return "\n".join(lines)


def close_duplicate_prs(repo, prs, headers, dry_run=True):
    """Close (or, in a dry run, list) every duplicate PR except the newest per issue."""
    duplicates = find_duplicate_groups(prs)
    closed = 0
    for issue, pr_group in duplicates.items():
        keep = max(pr_group, key=lambda p: p["number"])
        for pr in pr_group:
            if pr["number"] == keep["number"]:
                continue
            if dry_run:
                print(f"Would close PR #{pr['number']}: {pr['title'][:60]}")
                # Count dry-run candidates so the final summary is not always 0
                closed += 1
            else:
                url = f"{GITEA_URL}/api/v1/repos/{repo}/pulls/{pr['number']}"
                data = json.dumps({"state": "closed"}).encode()
                req = urllib.request.Request(
                    url, data=data,
                    headers={**headers, "Content-Type": "application/json"},
                    method="PATCH")
                try:
                    urllib.request.urlopen(req)
                    print(f"Closed PR #{pr['number']}")
                    closed += 1
                except Exception as e:
                    print(f"Failed to close #{pr['number']}: {e}")
    return closed


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--repo", default="Timmy_Foundation/timmy-config")
    parser.add_argument("--close-duplicates", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args()
    token = get_token()
    headers = {"Authorization": f"token {token}"}
    prs = fetch_open_prs(args.repo, headers)
    if args.close_duplicates:
        closed = close_duplicate_prs(args.repo, prs, headers, args.dry_run)
        print(f"\n{'Would close' if args.dry_run else 'Closed'} {closed} duplicate PRs")
    else:
        report = generate_report(args.repo, prs)
        print(report)
        docs_dir = Path(__file__).resolve().parent.parent / "docs"
        docs_dir.mkdir(exist_ok=True)
        (docs_dir / "pr-triage-report.md").write_text(report)


if __name__ == "__main__":
    main()