diff --git a/docs/duplicate-pr-prevention.md b/docs/duplicate-pr-prevention.md
new file mode 100644
index 00000000..7a5b50fa
--- /dev/null
+++ b/docs/duplicate-pr-prevention.md
@@ -0,0 +1,136 @@
+# Duplicate PR Prevention Tools
+
+## Problem
+
+Despite having tools to detect and clean up duplicate PRs, agents were still creating duplicate PRs for the same issue. The irony was sharpest for issue #1128, which was itself about cleaning up duplicate PRs.
+
+## Solution
+
+We've created pre-flight check tools that agents should run **before** creating a new PR. These tools check whether open PRs already exist for a given issue, so the agent can stop before opening a duplicate.
+
+## Tools
+
+### 1. `check-existing-prs.sh` (Bash)
+
+Check if PRs already exist for an issue.
+
+```bash
+# Check for existing PRs for issue #1128
+./scripts/check-existing-prs.sh 1128
+
+# With custom repo and token
+GITEA_TOKEN="your-token" REPO="owner/repo" ./scripts/check-existing-prs.sh 1128
+```
+
+**Exit codes:**
+- `0`: No existing PRs found (safe to create new PR)
+- `1`: Existing PRs found (do not create new PR)
+- `2`: Error (API failure, missing parameters, etc.)
+
+### 2. `check_existing_prs.py` (Python)
+
+Python version of the check, suitable for agents that prefer Python.
+
+```bash
+# Check for existing PRs for issue #1128
+python3 scripts/check_existing_prs.py 1128
+
+# With custom repo and token
+python3 scripts/check_existing_prs.py 1128 "owner/repo" "your-token"
+```
+
+### 3. `pr-safe.sh` (Wrapper)
+
+User-friendly wrapper that provides guidance.
+
+```bash
+# Check and get suggestions
+./scripts/pr-safe.sh 1128
+
+# With suggested branch name
+./scripts/pr-safe.sh 1128 fix/1128-my-fix
+```
+
+## Workflow Integration
+
+### For Agents
+
+Before creating a PR, agents should:
+
+1. Run the check: `./scripts/check-existing-prs.sh <issue_number>`
+2. If exit code is `0`, proceed with PR creation
+3. If exit code is `1`, review the existing PRs instead; if it is `2`, resolve the error and re-run the check
+
+### For Humans
+
+Before creating a PR:
+
+1. Run: `./scripts/pr-safe.sh <issue_number>`
+2. Follow the guidance provided
+
+## Prevention Strategy
+
+### 1. Pre-flight Checks
+
+Always run a pre-flight check before creating a PR:
+
+```bash
+# In your agent workflow
+if ./scripts/check-existing-prs.sh "$ISSUE_NUMBER"; then
+    # Safe to create PR
+    create_pr
+else
+    # Existing PRs found (or the check failed): review instead of creating a PR
+    review_existing_prs
+fi
+```
+
+### 2. GitHub Actions Integration
+
+The existing `.github/workflows/pr-duplicate-check.yml` workflow can be enhanced to run these checks automatically.
+
+### 3. Agent Instructions
+
+Add to agent instructions:
+
+```
+Before creating a PR for an issue, ALWAYS run:
+  ./scripts/check-existing-prs.sh <issue_number>
+
+If PRs already exist, DO NOT create a new PR.
+Instead, review existing PRs and add comments or merge them.
+```
+
+## Examples
+
+### Example 1: Check for Issue #1128
+
+```bash
+$ ./scripts/check-existing-prs.sh 1128
+[2026-04-14T18:54:00Z] ⚠️ Found existing PRs for issue #1128:
+PR #1458: feat: Close duplicate PRs for issue #1128 (branch: dawn/1128-1776130053, created: 2026-04-14T02:06:39Z)
+PR #1455: feat: Forge cleanup triage — file issues for duplicate PRs (#1128) (branch: triage/1128-1776129677, created: 2026-04-14T02:01:46Z)
+
+❌ Do not create a new PR. Review existing PRs first.
+```
+
+### Example 2: Safe to Create PR
+
+```bash
+$ ./scripts/check-existing-prs.sh 9999
+[2026-04-14T18:54:00Z] ✅ No existing PRs found for issue #9999
+Safe to create a new PR
+```
+
+## Related Issues
+
+- Issue #1474: [META] Still creating duplicate PRs for issue #1128 despite cleanup
+- Issue #1460: [META] I keep creating duplicate PRs for issue #1128
+- Issue #1128: [RESOLVED] Forge Cleanup — PRs Closed, Milestones Deduplicated, Policy Issues Filed
+
+## Lessons Learned
+
+1. **Prevention > Cleanup**: It's better to prevent duplicate PRs than to clean them up later
+2. **Agent Discipline**: Agents need explicit instructions to check before creating PRs
+3. **Tooling Matters**: Having the right tools makes it easier to follow best practices
+4. **Irony Awareness**: Be aware when you're creating the problem you're trying to solve
diff --git a/scripts/check-existing-prs.sh b/scripts/check-existing-prs.sh
new file mode 100755
index 00000000..85dd6459
--- /dev/null
+++ b/scripts/check-existing-prs.sh
@@ -0,0 +1,78 @@
+#!/usr/bin/env bash
+# ═══════════════════════════════════════════════════════════════
+# check-existing-prs.sh — Check if PRs already exist for an issue
+#
+# This script checks if there are already open PRs for a given issue
+# before creating a new one. This prevents duplicate PRs.
+#
+# Usage:
+#   ./scripts/check-existing-prs.sh <issue_number>
+#
+# Exit codes:
+#   0 - No existing PRs found (safe to create new PR)
+#   1 - Existing PRs found (do not create new PR)
+#   2 - Error (API failure, missing parameters, etc.)
+#
+# Designed for issue #1474: Prevent duplicate PRs
+# ═══════════════════════════════════════════════════════════════
+set -euo pipefail
+
+# ─── Configuration ──────────────────────────────────────────
+GITEA_URL="${GITEA_URL:-https://forge.alexanderwhitestone.com}"
+GITEA_TOKEN="${GITEA_TOKEN:?Set GITEA_TOKEN env var}"
+REPO="${REPO:-Timmy_Foundation/the-nexus}"
+ISSUE_NUMBER="${1:?Usage: $0 <issue_number>}"
+
+API="$GITEA_URL/api/v1"
+AUTH="Authorization: token $GITEA_TOKEN"
+
+log() { echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*"; }
+
+# ─── Validate inputs ──────────────────────────────────────
+if ! [[ "$ISSUE_NUMBER" =~ ^[1-9][0-9]*$ ]]; then
+    log "ERROR: Issue number must be a positive integer"
+    exit 2
+fi
+
+# ─── Fetch open PRs ────────────────────────────────────────
+log "Checking for existing PRs for issue #$ISSUE_NUMBER in $REPO"
+
+OPEN_PRS=$(curl -sf -H "$AUTH" "$API/repos/$REPO/pulls?state=open&limit=100" || true)
+
+if [ -z "$OPEN_PRS" ] || [ "$OPEN_PRS" = "null" ]; then
+    log "ERROR: Failed to fetch open PRs (empty or null API response)"
+    exit 2
+fi
+
+# ─── Check for PRs referencing this issue ──────────────────
+# Look for PRs that mention the issue number in title or body
+MATCHING_PRS=$(echo "$OPEN_PRS" | jq -r --arg issue "#$ISSUE_NUMBER" '
+  .[] |
+  select(
+    (.title | test($issue; "i")) or
+    ((.body // "") | test($issue; "i"))
+  ) |
+  "PR #\(.number): \(.title) (branch: \(.head.ref), created: \(.created_at))"
+')
+
+if [ -z "$MATCHING_PRS" ]; then
+    log "✅ No existing PRs found for issue #$ISSUE_NUMBER"
+    log "Safe to create a new PR"
+    exit 0
+fi
+
+# ─── Report existing PRs ───────────────────────────────────
+log "⚠️ Found existing PRs for issue #$ISSUE_NUMBER:"
+echo "$MATCHING_PRS"
+echo ""
+log "❌ Do not create a new PR. Review existing PRs first."
+log ""
+log "Options:"
+log "  1. Review and merge an existing PR"
+log "  2. Close duplicates and keep the best one"
+log "  3. Add comments to existing PRs instead of creating new ones"
+log ""
+log "To see details of existing PRs:"
+log "  curl -H \"Authorization: token \$GITEA_TOKEN\" \"$API/repos/$REPO/pulls?state=open\" | jq '.[] | select(.title | test(\"#$ISSUE_NUMBER\"; \"i\"))'"
+
+exit 1
\ No newline at end of file
diff --git a/scripts/check_existing_prs.py b/scripts/check_existing_prs.py
new file mode 100755
index 00000000..086cb136
--- /dev/null
+++ b/scripts/check_existing_prs.py
@@ -0,0 +1,148 @@
+#!/usr/bin/env python3
+"""
+Check if PRs already exist for an issue before creating a new one.
+
+This script prevents duplicate PRs by checking if there are already
+open PRs for a given issue.
+
+Usage:
+    python3 scripts/check_existing_prs.py <issue_number>
+
+Exit codes:
+    0 - No existing PRs found (safe to create new PR)
+    1 - Existing PRs found (do not create new PR)
+    2 - Error (API failure, missing parameters, etc.)
+
+Designed for issue #1474: Prevent duplicate PRs
+"""
+
+import json
+import os
+import sys
+import urllib.request
+import urllib.error
+from datetime import datetime, timezone
+
+
+def check_existing_prs(issue_number: int, repo: str = None, token: str = None) -> int:
+    """
+    Check if PRs already exist for an issue.
+
+    Args:
+        issue_number: The issue number to check
+        repo: Repository in format "owner/repo" (default: from env or "Timmy_Foundation/the-nexus")
+        token: Gitea API token (default: from GITEA_TOKEN env var)
+
+    Returns:
+        0: No existing PRs found (safe to create new PR)
+        1: Existing PRs found (do not create new PR)
+        2: Error (API failure, missing parameters, etc.)
+    """
+    # Get configuration from environment
+    gitea_url = os.environ.get('GITEA_URL', 'https://forge.alexanderwhitestone.com')
+    token = token or os.environ.get('GITEA_TOKEN')
+    repo = repo or os.environ.get('REPO', 'Timmy_Foundation/the-nexus')
+
+    if not token:
+        print("ERROR: GITEA_TOKEN environment variable not set", file=sys.stderr)
+        return 2
+
+    # Validate issue number
+    if not isinstance(issue_number, int) or issue_number <= 0:
+        print("ERROR: Issue number must be a positive integer", file=sys.stderr)
+        return 2
+
+    # Build API URL
+    api_url = f"{gitea_url}/api/v1/repos/{repo}/pulls?state=open&limit=100"
+
+    # Make API request
+    try:
+        req = urllib.request.Request(api_url, headers={
+            'Authorization': f'token {token}',
+            'Content-Type': 'application/json'
+        })
+
+        with urllib.request.urlopen(req, timeout=30) as resp:
+            prs = json.loads(resp.read())
+
+    except urllib.error.URLError as e:
+        print(f"ERROR: Failed to fetch PRs: {e}", file=sys.stderr)
+        return 2
+    except json.JSONDecodeError as e:
+        print(f"ERROR: Failed to parse API response: {e}", file=sys.stderr)
+        return 2
+    except Exception as e:
+        print(f"ERROR: Unexpected error: {e}", file=sys.stderr)
+        return 2
+
+    # Check for PRs referencing this issue
+    issue_ref = f"#{issue_number}"
+    matching_prs = []
+
+    for pr in prs:
+        title = pr.get('title', '')
+        body = pr.get('body', '') or ''
+
+        # Check if issue is referenced in title or body
+        if issue_ref in title or issue_ref in body:
+            matching_prs.append(pr)
+
+    # Report results
+    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
+
+    if not matching_prs:
+        print(f"[{timestamp}] ✅ No existing PRs found for issue #{issue_number}")
+        print("Safe to create a new PR")
+        return 0
+
+    # Found existing PRs
+    print(f"[{timestamp}] ⚠️ Found existing PRs for issue #{issue_number}:")
+    print()
+
+    for pr in matching_prs:
+        pr_number = pr.get('number')
+        pr_title = pr.get('title')
+        pr_branch = pr.get('head', {}).get('ref', 'unknown')
+        pr_created = pr.get('created_at', 'unknown')
+        pr_url = pr.get('html_url', 'unknown')
+
+        print(f"  PR #{pr_number}: {pr_title}")
+        print(f"    Branch: {pr_branch}")
+        print(f"    Created: {pr_created}")
+        print(f"    URL: {pr_url}")
+        print()
+
+    print("❌ Do not create a new PR. Review existing PRs first.")
+    print()
+    print("Options:")
+    print("  1. Review and merge an existing PR")
+    print("  2. Close duplicates and keep the best one")
+    print("  3. Add comments to existing PRs instead of creating new ones")
+    print()
+    print("To see details of existing PRs:")
+    print(f'  curl -H "Authorization: token $GITEA_TOKEN" "{gitea_url}/api/v1/repos/{repo}/pulls?state=open" | jq \'.[] | select(.title | test("#{issue_number}"; "i"))\'')
+
+    return 1
+
+
+def main():
+    """Main entry point."""
+    if len(sys.argv) < 2:
+        print("Usage: python3 check_existing_prs.py <issue_number>", file=sys.stderr)
+        print("       python3 check_existing_prs.py <issue_number> [repo] [token]", file=sys.stderr)
+        return 2
+
+    try:
+        issue_number = int(sys.argv[1])
+    except ValueError:
+        print("ERROR: Issue number must be an integer", file=sys.stderr)
+        return 2
+
+    repo = sys.argv[2] if len(sys.argv) > 2 else None
+    token = sys.argv[3] if len(sys.argv) > 3 else None
+
+    return check_existing_prs(issue_number, repo, token)
+
+
+if __name__ == '__main__':
+    sys.exit(main())
\ No newline at end of file
diff --git a/scripts/pr-safe.sh b/scripts/pr-safe.sh
new file mode 100755
index 00000000..6d52a90a
--- /dev/null
+++ b/scripts/pr-safe.sh
@@ -0,0 +1,48 @@
+#!/usr/bin/env bash
+# ═══════════════════════════════════════════════════════════════
+# pr-safe.sh — Safe PR creation wrapper
+#
+# This script checks for existing PRs before creating a new one.
+# It's a wrapper around check-existing-prs.sh that provides
+# a user-friendly interface.
+#
+# Usage:
+#   ./scripts/pr-safe.sh <issue_number> [branch_name]
+#
+# If branch_name is not provided, it will suggest one based on
+# the issue number and current timestamp.
+# ═══════════════════════════════════════════════════════════════
+set -euo pipefail
+
+ISSUE_NUMBER="${1:?Usage: $0 <issue_number> [branch_name]}"
+BRANCH_NAME="${2:-}"
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+echo "🔍 Checking for existing PRs for issue #$ISSUE_NUMBER..."
+echo ""
+
+# Run the check
+if "$SCRIPT_DIR/check-existing-prs.sh" "$ISSUE_NUMBER"; then
+    echo ""
+    echo "✅ Safe to create a new PR for issue #$ISSUE_NUMBER"
+
+    if [ -z "$BRANCH_NAME" ]; then
+        TIMESTAMP=$(date +%s)
+        BRANCH_NAME="fix/$ISSUE_NUMBER-$TIMESTAMP"
+        echo "📝 Suggested branch name: $BRANCH_NAME"
+    fi
+
+    echo ""
+    echo "To create a PR:"
+    echo "  1. Create branch: git checkout -b $BRANCH_NAME"
+    echo "  2. Make your changes"
+    echo "  3. Commit: git commit -m 'fix: Description (#$ISSUE_NUMBER)'"
+    echo "  4. Push: git push -u origin $BRANCH_NAME"
+    echo "  5. Create PR via API or web interface"
+else
+    echo ""
+    echo "❌ Cannot create new PR for issue #$ISSUE_NUMBER"
+    echo "   Existing PRs found or the check failed. Review the output above."
+    exit 1
+fi
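
Note on matching: both `check-existing-prs.sh` and `check_existing_prs.py` treat the issue reference as a plain substring, so a PR that mentions `#11280` would also count as a match for issue `#1128`. Below is a minimal sketch of stricter matching in Python, assuming a hypothetical helper named `references_issue` that is not part of this change; it may be worth adapting if false positives become a problem.

```python
import re


def references_issue(text, issue_number):
    """Return True if text mentions #<issue_number> as a complete reference.

    Hypothetical helper for illustration; not part of the scripts in this diff.
    """
    # The negative lookahead (?!\d) stops "#1128" from matching inside "#11280".
    pattern = re.compile(rf"#{issue_number}(?!\d)")
    # PR bodies can be null in the API response, so treat None as empty text.
    return bool(pattern.search(text or ""))
```

In `check_existing_prs.py`, a helper like this could stand in for the `issue_ref in title or issue_ref in body` test, called once for the title and once for the body.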