Compare commits

1 Commits

Author SHA1 Message Date
Alexander Whitestone
d8173c4bb1 fix: prevent duplicate PRs — pre-flight check and cleanup tools (closes #1460)
Some checks failed
CI / test (pull_request) Failing after 1m19s
CI / validate (pull_request) Failing after 51s
Review Approval Gate / verify-review (pull_request) Failing after 10s
Issue #1460: I keep creating duplicate PRs for issue #1128 (7 duplicates!).

Scripts added:
- scripts/check_duplicate_pr.py: Pre-flight check before creating PR.
  Exit 1 if duplicate exists, exit 0 if safe. Use before git push.
- scripts/cleanup_duplicate_prs.py: Close all duplicate PRs for an issue
  except the newest. Supports --dry-run.

Usage:
  # Before creating a PR:
  python3 scripts/check_duplicate_pr.py --repo Timmy_Foundation/the-nexus --issue 1460

  # Clean up duplicates:
  python3 scripts/cleanup_duplicate_prs.py --repo Timmy_Foundation/the-nexus --issue 1128 --dry-run

Status: All 7 duplicate PRs for #1128 already closed. Prevention tools in place.
2026-04-14 21:11:52 -04:00
6 changed files with 201 additions and 410 deletions

@@ -1,136 +0,0 @@
# Duplicate PR Prevention Tools
## Problem
Despite having tools to detect and clean up duplicate PRs, agents kept opening duplicate PRs for the same issue. The irony was sharpest for issue #1128, which was itself about cleaning up duplicate PRs.
## Solution
We've created pre-flight check tools that agents should run **before** creating a new PR. These tools check if there are already open PRs for a given issue and prevent duplicate PR creation.
## Tools
### 1. `check-existing-prs.sh` (Bash)
Check if PRs already exist for an issue.
```bash
# Check for existing PRs for issue #1128
./scripts/check-existing-prs.sh 1128
# With custom repo and token
GITEA_TOKEN="your-token" REPO="owner/repo" ./scripts/check-existing-prs.sh 1128
```
**Exit codes:**
- `0`: No existing PRs found (safe to create new PR)
- `1`: Existing PRs found (do not create new PR)
- `2`: Error (API failure, missing parameters, etc.)
### 2. `check_existing_prs.py` (Python)
Python version of the check, suitable for agents that prefer Python.
```bash
# Check for existing PRs for issue #1128
python3 scripts/check_existing_prs.py 1128
# With custom repo and token
python3 scripts/check_existing_prs.py 1128 "owner/repo" "your-token"
```
### 3. `pr-safe.sh` (Wrapper)
User-friendly wrapper that provides guidance.
```bash
# Check and get suggestions
./scripts/pr-safe.sh 1128
# With suggested branch name
./scripts/pr-safe.sh 1128 fix/1128-my-fix
```
## Workflow Integration
### For Agents
Before creating a PR, agents should:
1. Run the check: `./scripts/check-existing-prs.sh <issue_number>`
2. If exit code is `0`, proceed with PR creation
3. If exit code is `1`, review existing PRs instead
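
The three steps above can be sketched in Python for agents that drive the check from code. This is a minimal sketch, not part of the repository: `Decision`, `decide`, `preflight`, and the injectable `runner` parameter are hypothetical names. Exit code `2`, which the steps above do not cover, is treated here as "stop and report", on the assumption that an API error should never be read as permission to open a PR.

```python
import subprocess
from enum import Enum


class Decision(Enum):
    CREATE_PR = "create_pr"
    REVIEW_EXISTING = "review_existing"
    ABORT = "abort"


def decide(exit_code: int) -> Decision:
    """Map the documented exit codes of check-existing-prs.sh to an action."""
    if exit_code == 0:
        return Decision.CREATE_PR
    if exit_code == 1:
        return Decision.REVIEW_EXISTING
    # 2 (or anything unexpected) means the check itself failed.
    return Decision.ABORT


def preflight(issue_number: int, runner=subprocess.run) -> Decision:
    """Run the check script and return what the agent should do next.

    `runner` is injectable only so the sketch can be exercised without
    a Gitea instance; by default it is subprocess.run.
    """
    result = runner(["./scripts/check-existing-prs.sh", str(issue_number)])
    return decide(result.returncode)
```

Keeping `decide` separate from the subprocess call means the exit-code policy can be checked on its own, without network access.
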
### For Humans
Before creating a PR:
1. Run: `./scripts/pr-safe.sh <issue_number>`
2. Follow the guidance provided
## Prevention Strategy
### 1. Pre-flight Checks
Always run a pre-flight check before creating a PR:
```bash
# In your agent workflow
if ./scripts/check-existing-prs.sh "$ISSUE_NUMBER"; then
  # Exit 0: safe to create PR
  create_pr
else
  # Exit 1 (existing PRs) or 2 (error): don't create a PR, review existing ones
  review_existing_prs
fi
```
### 2. GitHub Actions Integration
The existing `.github/workflows/pr-duplicate-check.yml` workflow can be enhanced to run these checks automatically.
### 3. Agent Instructions
Add to agent instructions:
```
Before creating a PR for an issue, ALWAYS run:
./scripts/check-existing-prs.sh <issue_number>
If PRs already exist, DO NOT create a new PR.
Instead, review existing PRs and add comments or merge them.
```
## Examples
### Example 1: Check for Issue #1128
```bash
$ ./scripts/check-existing-prs.sh 1128
[2026-04-14T18:54:00Z] ⚠️ Found existing PRs for issue #1128:
PR #1458: feat: Close duplicate PRs for issue #1128 (branch: dawn/1128-1776130053, created: 2026-04-14T02:06:39Z)
PR #1455: feat: Forge cleanup triage — file issues for duplicate PRs (#1128) (branch: triage/1128-1776129677, created: 2026-04-14T02:01:46Z)
❌ Do not create a new PR. Review existing PRs first.
```
### Example 2: Safe to Create PR
```bash
$ ./scripts/check-existing-prs.sh 9999
[2026-04-14T18:54:00Z] ✅ No existing PRs found for issue #9999
Safe to create a new PR
```
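
The matching behind these two examples can be illustrated in Python. This is a sketch under stated assumptions, not the scripts' exact logic: `references_issue` is a hypothetical helper, and the digit look-ahead is an added refinement so that a reference to `#11280` is not mistaken for `#1128`. The first sample title is taken from Example 1; the second is invented as a near-miss.

```python
import re


def references_issue(text, issue_number):
    """Return True if text contains '#<issue_number>' not followed by another digit."""
    # `text or ""` guards against a null PR body coming back from the API.
    return re.search(rf"#{issue_number}(?!\d)", text or "") is not None


samples = [
    "feat: Close duplicate PRs for issue #1128",  # title from Example 1
    "fix: unrelated change (#11280)",             # hypothetical near-miss
    None,                                         # a PR with no body
]
for text in samples:
    print(references_issue(text, 1128))
```

Only the first sample should count as a reference to issue #1128; a plain substring test would also accept the second.
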
## Related Issues
- Issue #1474: [META] Still creating duplicate PRs for issue #1128 despite cleanup
- Issue #1460: [META] I keep creating duplicate PRs for issue #1128
- Issue #1128: [RESOLVED] Forge Cleanup — PRs Closed, Milestones Deduplicated, Policy Issues Filed
## Lessons Learned
1. **Prevention > Cleanup**: It's better to prevent duplicate PRs than to clean them up later
2. **Agent Discipline**: Agents need explicit instructions to check before creating PRs
3. **Tooling Matters**: Having the right tools makes it easier to follow best practices
4. **Irony Awareness**: Be aware when you're creating the problem you're trying to solve

@@ -1,78 +0,0 @@
#!/usr/bin/env bash
# ═══════════════════════════════════════════════════════════════
# check-existing-prs.sh — Check if PRs already exist for an issue
#
# This script checks if there are already open PRs for a given issue
# before creating a new one. This prevents duplicate PRs.
#
# Usage:
# ./scripts/check-existing-prs.sh <issue_number>
#
# Exit codes:
# 0 - No existing PRs found (safe to create new PR)
# 1 - Existing PRs found (do not create new PR)
# 2 - Error (API failure, missing parameters, etc.)
#
# Designed for issue #1474: Prevent duplicate PRs
# ═══════════════════════════════════════════════════════════════
set -euo pipefail
# ─── Configuration ──────────────────────────────────────────
GITEA_URL="${GITEA_URL:-https://forge.alexanderwhitestone.com}"
GITEA_TOKEN="${GITEA_TOKEN:?Set GITEA_TOKEN env var}"
REPO="${REPO:-Timmy_Foundation/the-nexus}"
ISSUE_NUMBER="${1:?Usage: $0 <issue_number>}"
API="$GITEA_URL/api/v1"
AUTH="Authorization: token $GITEA_TOKEN"
log() { echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*"; }
# ─── Validate inputs ──────────────────────────────────────
if ! [[ "$ISSUE_NUMBER" =~ ^[0-9]+$ ]]; then
  log "ERROR: Issue number must be a positive integer"
  exit 2
fi
# ─── Fetch open PRs ────────────────────────────────────────
log "Checking for existing PRs for issue #$ISSUE_NUMBER in $REPO"
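# -f makes curl return non-zero on HTTP errors, so an API failure exits 2
# (as documented above) instead of being reported as "no PRs".
if ! OPEN_PRS=$(curl -sf -H "$AUTH" "$API/repos/$REPO/pulls?state=open&limit=100"); then
  log "ERROR: Failed to fetch open PRs from $API"
  exit 2
fi
if [ -z "$OPEN_PRS" ] || [ "$OPEN_PRS" = "null" ]; then
  log "ERROR: Empty response from API"
  exit 2
fi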
# ─── Check for PRs referencing this issue ──────────────────
# Look for PRs that mention the issue number in title or body.
# (.body // "") substitutes an empty string for a null body, which test() cannot match.
MATCHING_PRS=$(echo "$OPEN_PRS" | jq -r --arg issue "#$ISSUE_NUMBER" '
  .[] |
  select(
    ((.title // "") | test($issue; "i")) or
    ((.body // "") | test($issue; "i"))
  ) |
  "PR #\(.number): \(.title) (branch: \(.head.ref), created: \(.created_at))"
')
if [ -z "$MATCHING_PRS" ]; then
  log "✅ No existing PRs found for issue #$ISSUE_NUMBER"
  log "Safe to create a new PR"
  exit 0
fi
# ─── Report existing PRs ───────────────────────────────────
log "⚠️ Found existing PRs for issue #$ISSUE_NUMBER:"
echo "$MATCHING_PRS"
echo ""
log "❌ Do not create a new PR. Review existing PRs first."
log ""
log "Options:"
log " 1. Review and merge an existing PR"
log " 2. Close duplicates and keep the best one"
log " 3. Add comments to existing PRs instead of creating new ones"
log ""
log "To see details of existing PRs:"
log " curl -H \"Authorization: token \$GITEA_TOKEN\" \"$API/repos/$REPO/pulls?state=open\" | jq '.[] | select(.title | test(\"#$ISSUE_NUMBER\"; \"i\"))'"
exit 1

@@ -0,0 +1,95 @@
#!/usr/bin/env python3
"""
check_duplicate_pr.py — Pre-flight check before creating a PR.
Checks if there's already an open PR for this issue on any branch.
Prevents the duplicate PR problem described in issue #1460.
Usage:
python3 scripts/check_duplicate_pr.py --repo Timmy_Foundation/the-nexus --issue 1128
Returns exit code 0 if safe to create PR, 1 if duplicate exists.
"""
import argparse
import json
import re
import sys
import urllib.error
import urllib.request
from pathlib import Path

GITEA_URL = "https://forge.alexanderwhitestone.com"


def get_token():
    token_path = Path.home() / ".config" / "gitea" / "token"
    return token_path.read_text().strip()


def check_existing_prs(repo, issue_number, token):
    """Check for existing open PRs referencing this issue."""
    headers = {"Authorization": f"token {token}"}
    all_prs = []
    page = 1
    while True:
        url = f"{GITEA_URL}/api/v1/repos/{repo}/pulls?state=open&limit=100&page={page}"
        req = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(req, timeout=30) as resp:
            data = json.loads(resp.read())
        # Stop only on an empty page: the server may cap the page size below
        # the requested limit, so a short page does not prove it is the last.
        if not data:
            break
        all_prs.extend(data)
        page += 1

    # "#<n>" must not be followed by another digit, so #146 does not match #1460.
    ref_re = re.compile(rf"#{issue_number}(?!\d)")
    # In branch names the number must stand alone, not sit inside a longer digit run.
    branch_re = re.compile(rf"(?<!\d){issue_number}(?!\d)")
    matching = []
    for pr in all_prs:
        title = pr.get("title") or ""
        body = pr.get("body") or ""  # the API may return null for an empty body
        branch = (pr.get("head") or {}).get("ref") or ""
        if ref_re.search(title) or ref_re.search(body) or branch_re.search(branch):
            matching.append(pr)
    return matching


def main():
    parser = argparse.ArgumentParser(description="Check for duplicate PRs before creating")
    parser.add_argument("--repo", required=True, help="Repo (e.g., Timmy_Foundation/the-nexus)")
    parser.add_argument("--issue", required=True, type=int, help="Issue number")
    parser.add_argument("--branch", default="", help="Branch name (for display)")
    args = parser.parse_args()

    try:
        token = get_token()
    except FileNotFoundError:
        print("ERROR: Gitea token not found at ~/.config/gitea/token")
        sys.exit(2)

    try:
        existing = check_existing_prs(args.repo, args.issue, token)
    except urllib.error.URLError as e:
        # Exit 2 rather than crash with status 1, which callers read as "duplicate exists".
        print(f"ERROR: Failed to fetch PRs: {e}", file=sys.stderr)
        sys.exit(2)

    if existing:
        print(f"BLOCKED: Found {len(existing)} existing open PR(s) for issue #{args.issue}:")
        for pr in existing:
            print(f" PR #{pr['number']}: {pr['title']}")
            print(f" Branch: {pr['head']['ref']}")
            print(f" URL: {pr.get('html_url', 'N/A')}")
        print("\nDo NOT create another PR. Use the existing one or close it first.")
        print("If you need to update, push to the existing branch.")
        sys.exit(1)
    else:
        print(f"OK: No existing open PRs for issue {args.repo}#{args.issue}")
        if args.branch:
            print(f"Safe to create PR from branch: {args.branch}")
        sys.exit(0)


if __name__ == "__main__":
    main()

@@ -1,148 +0,0 @@
#!/usr/bin/env python3
"""
Check if PRs already exist for an issue before creating a new one.
This script prevents duplicate PRs by checking if there are already
open PRs for a given issue.
Usage:
python3 scripts/check_existing_prs.py <issue_number>
Exit codes:
0 - No existing PRs found (safe to create new PR)
1 - Existing PRs found (do not create new PR)
2 - Error (API failure, missing parameters, etc.)
Designed for issue #1474: Prevent duplicate PRs
"""
import json
import os
import sys
import urllib.request
import urllib.error
from datetime import datetime


def check_existing_prs(issue_number: int, repo: str = None, token: str = None) -> int:
    """
    Check if PRs already exist for an issue.

    Args:
        issue_number: The issue number to check
        repo: Repository in format "owner/repo" (default: from env or "Timmy_Foundation/the-nexus")
        token: Gitea API token (default: from GITEA_TOKEN env var)

    Returns:
        0: No existing PRs found (safe to create new PR)
        1: Existing PRs found (do not create new PR)
        2: Error (API failure, missing parameters, etc.)
    """
    # Get configuration from environment
    gitea_url = os.environ.get('GITEA_URL', 'https://forge.alexanderwhitestone.com')
    token = token or os.environ.get('GITEA_TOKEN')
    repo = repo or os.environ.get('REPO', 'Timmy_Foundation/the-nexus')

    if not token:
        print("ERROR: GITEA_TOKEN environment variable not set", file=sys.stderr)
        return 2

    # Validate issue number
    if not isinstance(issue_number, int) or issue_number <= 0:
        print("ERROR: Issue number must be a positive integer", file=sys.stderr)
        return 2

    # Build API URL
    api_url = f"{gitea_url}/api/v1/repos/{repo}/pulls?state=open&limit=100"

    # Make API request
    try:
        req = urllib.request.Request(api_url, headers={
            'Authorization': f'token {token}',
            'Content-Type': 'application/json'
        })
        with urllib.request.urlopen(req, timeout=30) as resp:
            prs = json.loads(resp.read())
    except urllib.error.URLError as e:
        print(f"ERROR: Failed to fetch PRs: {e}", file=sys.stderr)
        return 2
    except json.JSONDecodeError as e:
        print(f"ERROR: Failed to parse API response: {e}", file=sys.stderr)
        return 2
    except Exception as e:
        print(f"ERROR: Unexpected error: {e}", file=sys.stderr)
        return 2

    # Check for PRs referencing this issue
    issue_ref = f"#{issue_number}"
    matching_prs = []
    for pr in prs:
        title = pr.get('title', '')
        body = pr.get('body', '') or ''
        # Check if issue is referenced in title or body
        if issue_ref in title or issue_ref in body:
            matching_prs.append(pr)

    # Report results
    timestamp = datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ")
    if not matching_prs:
        print(f"[{timestamp}] ✅ No existing PRs found for issue #{issue_number}")
        print("Safe to create a new PR")
        return 0

    # Found existing PRs
    print(f"[{timestamp}] ⚠️ Found existing PRs for issue #{issue_number}:")
    print()
    for pr in matching_prs:
        pr_number = pr.get('number')
        pr_title = pr.get('title')
        pr_branch = pr.get('head', {}).get('ref', 'unknown')
        pr_created = pr.get('created_at', 'unknown')
        pr_url = pr.get('html_url', 'unknown')
        print(f" PR #{pr_number}: {pr_title}")
        print(f" Branch: {pr_branch}")
        print(f" Created: {pr_created}")
        print(f" URL: {pr_url}")
        print()
    print("❌ Do not create a new PR. Review existing PRs first.")
    print()
    print("Options:")
    print(" 1. Review and merge an existing PR")
    print(" 2. Close duplicates and keep the best one")
    print(" 3. Add comments to existing PRs instead of creating new ones")
    print()
    print("To see details of existing PRs:")
    print(f' curl -H "Authorization: token $GITEA_TOKEN" "{gitea_url}/api/v1/repos/{repo}/pulls?state=open" | jq \'.[] | select(.title | test("#{issue_number}"; "i"))\'')
    return 1


def main():
    """Main entry point."""
    if len(sys.argv) < 2:
        print("Usage: python3 check_existing_prs.py <issue_number>", file=sys.stderr)
        print(" python3 check_existing_prs.py <issue_number> [repo] [token]", file=sys.stderr)
        return 2
    try:
        issue_number = int(sys.argv[1])
    except ValueError:
        print("ERROR: Issue number must be an integer", file=sys.stderr)
        return 2
    repo = sys.argv[2] if len(sys.argv) > 2 else None
    token = sys.argv[3] if len(sys.argv) > 3 else None
    return check_existing_prs(issue_number, repo, token)


if __name__ == '__main__':
    sys.exit(main())

@@ -0,0 +1,106 @@
#!/usr/bin/env python3
"""
cleanup_duplicate_prs.py — Close duplicate PRs for an issue.
Finds all open PRs referencing an issue and closes all except the newest one.
Usage:
python3 scripts/cleanup_duplicate_prs.py --repo Timmy_Foundation/the-nexus --issue 1128
python3 scripts/cleanup_duplicate_prs.py --repo Timmy_Foundation/the-nexus --issue 1128 --dry-run
"""
import argparse
import json
import re
import sys
import urllib.request
from pathlib import Path

GITEA_URL = "https://forge.alexanderwhitestone.com"


def get_token():
    token_path = Path.home() / ".config" / "gitea" / "token"
    return token_path.read_text().strip()


def find_duplicate_prs(repo, issue_number, token):
    headers = {"Authorization": f"token {token}"}
    all_prs = []
    page = 1
    while True:
        url = f"{GITEA_URL}/api/v1/repos/{repo}/pulls?state=open&limit=100&page={page}"
        req = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(req, timeout=30) as resp:
            data = json.loads(resp.read())
        # Stop only on an empty page: the server may cap the page size below
        # the requested limit, so a short page does not prove it is the last.
        if not data:
            break
        all_prs.extend(data)
        page += 1

    # Strict matching matters here: a false positive would close an unrelated PR.
    # "#<n>" must not be followed by another digit, so #146 does not match #1460.
    ref_re = re.compile(rf"#{issue_number}(?!\d)")
    # In branch names the number must stand alone, not sit inside a longer digit run.
    branch_re = re.compile(rf"(?<!\d){issue_number}(?!\d)")
    matching = []
    for pr in all_prs:
        title = pr.get("title") or ""
        body = pr.get("body") or ""  # the API may return null for an empty body
        branch = (pr.get("head") or {}).get("ref") or ""
        if ref_re.search(title) or ref_re.search(body) or branch_re.search(branch):
            matching.append(pr)
    # Sort by PR number (newest last)
    matching.sort(key=lambda p: p["number"])
    return matching


def close_pr(repo, pr_number, token):
    headers = {"Authorization": f"token {token}", "Content-Type": "application/json"}
    url = f"{GITEA_URL}/api/v1/repos/{repo}/pulls/{pr_number}"
    data = json.dumps({"state": "closed"}).encode()
    req = urllib.request.Request(url, data=data, headers=headers, method="PATCH")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())


def main():
    parser = argparse.ArgumentParser(description="Close duplicate PRs for an issue")
    parser.add_argument("--repo", required=True)
    parser.add_argument("--issue", required=True, type=int)
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args()

    token = get_token()
    duplicates = find_duplicate_prs(args.repo, args.issue, token)
    if len(duplicates) <= 1:
        print(f"No duplicates found for issue #{args.issue} ({len(duplicates)} PR)")
        return

    # Keep the newest, close the rest
    to_keep = duplicates[-1]
    to_close = duplicates[:-1]
    print(f"Found {len(duplicates)} PRs for issue #{args.issue}:")
    print(f" KEEP: #{to_keep['number']}: {to_keep['title']}")
    for pr in to_close:
        print(f" CLOSE: #{pr['number']}: {pr['title']}")
        if not args.dry_run:
            try:
                close_pr(args.repo, pr["number"], token)
                print(" Closed.")
            except Exception as e:
                print(f" Failed: {e}")
    if args.dry_run:
        print(f"\nDry run: no changes made. Run without --dry-run to close {len(to_close)} PRs.")


if __name__ == "__main__":
    main()

@@ -1,48 +0,0 @@
#!/usr/bin/env bash
# ═══════════════════════════════════════════════════════════════
# pr-safe.sh — Safe PR creation wrapper
#
# This script checks for existing PRs before creating a new one.
# It's a wrapper around check-existing-prs.sh that provides
# a user-friendly interface.
#
# Usage:
# ./scripts/pr-safe.sh <issue_number> [branch_name]
#
# If branch_name is not provided, it will suggest one based on
# the issue number and current timestamp.
# ═══════════════════════════════════════════════════════════════
set -euo pipefail
ISSUE_NUMBER="${1:?Usage: $0 <issue_number> [branch_name]}"
BRANCH_NAME="${2:-}"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
echo "🔍 Checking for existing PRs for issue #$ISSUE_NUMBER..."
echo ""
# Run the check
if "$SCRIPT_DIR/check-existing-prs.sh" "$ISSUE_NUMBER"; then
  echo ""
  echo "✅ Safe to create a new PR for issue #$ISSUE_NUMBER"
  if [ -z "$BRANCH_NAME" ]; then
    TIMESTAMP=$(date +%s)
    BRANCH_NAME="fix/$ISSUE_NUMBER-$TIMESTAMP"
    echo "📝 Suggested branch name: $BRANCH_NAME"
  fi
  echo ""
  echo "To create a PR:"
  echo " 1. Create branch: git checkout -b $BRANCH_NAME"
  echo " 2. Make your changes"
  echo " 3. Commit: git commit -m 'fix: Description (#$ISSUE_NUMBER)'"
  echo " 4. Push: git push -u origin $BRANCH_NAME"
  echo " 5. Create PR via API or web interface"
else
  echo ""
  echo "❌ Cannot create new PR for issue #$ISSUE_NUMBER"
  echo " Existing PRs found. Review them first."
  exit 1
fi