Compare commits

1 Commits

Author SHA1 Message Date
Alexander Whitestone
d1ebfe00cf feat: Add duplicate PR prevention tools (#1474)
Some checks failed
CI / test (pull_request) Failing after 49s
CI / validate (pull_request) Failing after 46s
Review Approval Gate / verify-review (pull_request) Successful in 5s
Add pre-flight check tools to prevent duplicate PRs from being created
for the same issue. This addresses the irony of creating duplicate PRs
for issue #1128 which was about cleaning up duplicate PRs.

Changes:
- Added scripts/check-existing-prs.sh - Bash script to check for existing PRs
- Added scripts/check_existing_prs.py - Python version of the check
- Added scripts/pr-safe.sh - User-friendly wrapper with guidance
- Added docs/duplicate-pr-prevention.md - Documentation and usage guide

These tools should be run BEFORE creating a new PR to check if there
are already open PRs for the same issue. This prevents the ironic
situation of creating duplicate PRs while trying to clean up duplicate PRs.

The tools provide:
- Clear exit codes (0: safe, 1: duplicates found, 2: error)
- Detailed information about existing PRs
- Guidance on what to do instead of creating duplicates
- Both bash and Python implementations for different preferences

Closes #1474
2026-04-14 19:00:10 -04:00
5 changed files with 411 additions and 2 deletions

@@ -0,0 +1,136 @@
# Duplicate PR Prevention Tools
## Problem
Despite having tools to detect and clean up duplicate PRs, agents kept creating duplicate PRs for the same issue. The irony was sharpest for issue #1128, which was itself about cleaning up duplicate PRs.
## Solution
We've created pre-flight check tools that agents should run **before** creating a new PR. These tools check whether open PRs already reference a given issue and report the result through their exit code, so a duplicate PR is not created.
## Tools
### 1. `check-existing-prs.sh` (Bash)
Check if PRs already exist for an issue.
```bash
# Check for existing PRs for issue #1128
./scripts/check-existing-prs.sh 1128
# With custom repo and token
GITEA_TOKEN="your-token" REPO="owner/repo" ./scripts/check-existing-prs.sh 1128
```
**Exit codes:**
- `0`: No existing PRs found (safe to create new PR)
- `1`: Existing PRs found (do not create new PR)
- `2`: Error (API failure, missing parameters, etc.)
### 2. `check_existing_prs.py` (Python)
Python version of the check, suitable for agents that prefer Python.
```bash
# Check for existing PRs for issue #1128
python3 scripts/check_existing_prs.py 1128
# With custom repo and token
python3 scripts/check_existing_prs.py 1128 "owner/repo" "your-token"
```
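Both checks decide whether a PR relates to an issue by looking for the `#<issue_number>` reference in the PR title or body. The matching step can be sketched as below. The helper name `references_issue` is hypothetical, and the trailing-digit guard is one possible way to avoid finding `#1128` inside `#11280`.

```python
import re


def references_issue(text, issue_number):
    """Return True if text mentions #<issue_number> as a whole reference.

    Hypothetical helper for illustration only.
    """
    if not text:  # a PR body may be empty or missing
        return False
    # (?!\d) rejects longer numbers such as #11280 when looking for #1128
    pattern = rf"#{issue_number}(?!\d)"
    return re.search(pattern, text) is not None


print(references_issue("feat: Close duplicate PRs for issue #1128", 1128))
print(references_issue("Fixes #11280", 1128))
```

Matching on the textual reference is a heuristic: a PR that mentions an issue only in passing is still reported, so the result is best treated as a prompt to review rather than as proof of a duplicate.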
### 3. `pr-safe.sh` (Wrapper)
User-friendly wrapper that provides guidance.
```bash
# Check and get suggestions
./scripts/pr-safe.sh 1128
# With suggested branch name
./scripts/pr-safe.sh 1128 fix/1128-my-fix
```
## Workflow Integration
### For Agents
Before creating a PR, agents should:
1. Run the check: `./scripts/check-existing-prs.sh <issue_number>`
2. If exit code is `0`, proceed with PR creation
3. If exit code is `1`, review existing PRs instead
4. If exit code is `2`, fix the reported error and re-run the check before creating a PR
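The agent steps above could be wrapped in a small Python helper. This is a sketch under assumptions: the function name `preflight`, the returned action strings, and the injectable `runner` parameter are hypothetical. Only the script path and the exit codes `0`, `1`, and `2` come from this document.

```python
import subprocess


def preflight(issue_number, runner=subprocess.run):
    """Run the pre-flight check and map its exit code to an action.

    Hypothetical helper: the name, the action strings, and the `runner`
    parameter are assumptions made for this sketch.
    """
    result = runner(["./scripts/check-existing-prs.sh", str(issue_number)])
    if result.returncode == 0:
        return "create_pr"        # no open PRs reference the issue
    if result.returncode == 1:
        return "review_existing"  # duplicates found: do not open a new PR
    return "abort"                # 2 or anything else: the check itself failed
```

Treating every exit code other than `0` and `1` as a failure keeps an API error from being mistaken for either "safe" or "duplicates found". The script still needs `GITEA_TOKEN` in the environment when run for real; the `runner` parameter exists only so the mapping can be exercised without network access.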
### For Humans
Before creating a PR:
1. Run: `./scripts/pr-safe.sh <issue_number>`
2. Follow the guidance provided
## Prevention Strategy
### 1. Pre-flight Checks
Always run a pre-flight check before creating a PR:
```bash
# In your agent workflow
rc=0
./scripts/check-existing-prs.sh "$ISSUE_NUMBER" || rc=$?
case "$rc" in
    0) create_pr ;;            # Safe to create PR
    1) review_existing_prs ;;  # Don't create PR, review existing ones
    *) echo "Pre-flight check failed (exit code $rc)" >&2; exit "$rc" ;;
esac
```
### 2. GitHub Actions Integration
The existing `.github/workflows/pr-duplicate-check.yml` workflow can be enhanced to run these checks automatically.
### 3. Agent Instructions
Add to agent instructions:
```
Before creating a PR for an issue, ALWAYS run:
./scripts/check-existing-prs.sh <issue_number>
If PRs already exist, DO NOT create a new PR.
Instead, review existing PRs and add comments or merge them.
```
## Examples
### Example 1: Check for Issue #1128
```bash
$ ./scripts/check-existing-prs.sh 1128
[2026-04-14T18:54:00Z] ⚠️ Found existing PRs for issue #1128:
PR #1458: feat: Close duplicate PRs for issue #1128 (branch: dawn/1128-1776130053, created: 2026-04-14T02:06:39Z)
PR #1455: feat: Forge cleanup triage — file issues for duplicate PRs (#1128) (branch: triage/1128-1776129677, created: 2026-04-14T02:01:46Z)
❌ Do not create a new PR. Review existing PRs first.
```
### Example 2: Safe to Create PR
```bash
$ ./scripts/check-existing-prs.sh 9999
[2026-04-14T18:54:00Z] ✅ No existing PRs found for issue #9999
Safe to create a new PR
```
## Related Issues
- Issue #1474: [META] Still creating duplicate PRs for issue #1128 despite cleanup
- Issue #1460: [META] I keep creating duplicate PRs for issue #1128
- Issue #1128: [RESOLVED] Forge Cleanup — PRs Closed, Milestones Deduplicated, Policy Issues Filed
## Lessons Learned
1. **Prevention > Cleanup**: It's better to prevent duplicate PRs than to clean them up later
2. **Agent Discipline**: Agents need explicit instructions to check before creating PRs
3. **Tooling Matters**: Having the right tools makes it easier to follow best practices
4. **Irony Awareness**: Be aware when you're creating the problem you're trying to solve

78
scripts/check-existing-prs.sh Executable file
@@ -0,0 +1,78 @@
#!/usr/bin/env bash
# ═══════════════════════════════════════════════════════════════
# check-existing-prs.sh — Check if PRs already exist for an issue
#
# This script checks if there are already open PRs for a given issue
# before creating a new one. This prevents duplicate PRs.
#
# Usage:
# ./scripts/check-existing-prs.sh <issue_number>
#
# Exit codes:
# 0 - No existing PRs found (safe to create new PR)
# 1 - Existing PRs found (do not create new PR)
# 2 - Error (API failure, missing parameters, etc.)
#
# Designed for issue #1474: Prevent duplicate PRs
# ═══════════════════════════════════════════════════════════════
set -euo pipefail
# ─── Configuration ──────────────────────────────────────────
GITEA_URL="${GITEA_URL:-https://forge.alexanderwhitestone.com}"
GITEA_TOKEN="${GITEA_TOKEN:?Set GITEA_TOKEN env var}"
REPO="${REPO:-Timmy_Foundation/the-nexus}"
ISSUE_NUMBER="${1:?Usage: $0 <issue_number>}"
API="$GITEA_URL/api/v1"
AUTH="Authorization: token $GITEA_TOKEN"
log() { echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*"; }
# ─── Validate inputs ──────────────────────────────────────
if ! [[ "$ISSUE_NUMBER" =~ ^[0-9]+$ ]]; then
    log "ERROR: Issue number must be a positive integer"
    exit 2
fi
# ─── Fetch open PRs ────────────────────────────────────────
log "Checking for existing PRs for issue #$ISSUE_NUMBER in $REPO"
# -f makes curl fail on HTTP errors so they are reported as exit code 2
OPEN_PRS=$(curl -sf -H "$AUTH" "$API/repos/$REPO/pulls?state=open&limit=100") || {
    log "ERROR: Failed to fetch open PRs from $API"
    exit 2
}
if [ -z "$OPEN_PRS" ] || [ "$OPEN_PRS" = "null" ]; then
    log "ERROR: Empty response from API"
    exit 2
fi
# ─── Check for PRs referencing this issue ──────────────────
# Look for PRs that mention the issue number in title or body.
# The (?![0-9]) lookahead keeps "#1128" from matching "#11280";
# `// ""` guards against a null title or body.
MATCHING_PRS=$(echo "$OPEN_PRS" | jq -r --arg issue "#${ISSUE_NUMBER}(?![0-9])" '
    .[] |
    select(
        ((.title // "") | test($issue; "i")) or
        ((.body // "") | test($issue; "i"))
    ) |
    "PR #\(.number): \(.title) (branch: \(.head.ref), created: \(.created_at))"
') || {
    log "ERROR: Failed to parse API response"
    exit 2
}
if [ -z "$MATCHING_PRS" ]; then
    log "✅ No existing PRs found for issue #$ISSUE_NUMBER"
    log "Safe to create a new PR"
    exit 0
fi
# ─── Report existing PRs ───────────────────────────────────
log "⚠️ Found existing PRs for issue #$ISSUE_NUMBER:"
echo "$MATCHING_PRS"
echo ""
log "❌ Do not create a new PR. Review existing PRs first."
log ""
log "Options:"
log " 1. Review and merge an existing PR"
log " 2. Close duplicates and keep the best one"
log " 3. Add comments to existing PRs instead of creating new ones"
log ""
log "To see details of existing PRs:"
log " curl -H \"Authorization: token \$GITEA_TOKEN\" \"$API/repos/$REPO/pulls?state=open\" | jq '.[] | select(.title | test(\"#$ISSUE_NUMBER\"; \"i\"))'"
exit 1

148
scripts/check_existing_prs.py Executable file
@@ -0,0 +1,148 @@
#!/usr/bin/env python3
"""
Check if PRs already exist for an issue before creating a new one.

This script prevents duplicate PRs by checking if there are already
open PRs for a given issue.

Usage:
    python3 scripts/check_existing_prs.py <issue_number>

Exit codes:
    0 - No existing PRs found (safe to create new PR)
    1 - Existing PRs found (do not create new PR)
    2 - Error (API failure, missing parameters, etc.)

Designed for issue #1474: Prevent duplicate PRs
"""
import json
import os
import re
import sys
import urllib.error
import urllib.request
from datetime import datetime, timezone


def check_existing_prs(issue_number: int, repo: str = None, token: str = None) -> int:
    """
    Check if PRs already exist for an issue.

    Args:
        issue_number: The issue number to check
        repo: Repository in format "owner/repo" (default: from env or "Timmy_Foundation/the-nexus")
        token: Gitea API token (default: from GITEA_TOKEN env var)

    Returns:
        0: No existing PRs found (safe to create new PR)
        1: Existing PRs found (do not create new PR)
        2: Error (API failure, missing parameters, etc.)
    """
    # Get configuration from environment
    gitea_url = os.environ.get('GITEA_URL', 'https://forge.alexanderwhitestone.com')
    token = token or os.environ.get('GITEA_TOKEN')
    repo = repo or os.environ.get('REPO', 'Timmy_Foundation/the-nexus')

    if not token:
        print("ERROR: GITEA_TOKEN environment variable not set", file=sys.stderr)
        return 2

    # Validate issue number
    if not isinstance(issue_number, int) or issue_number <= 0:
        print("ERROR: Issue number must be a positive integer", file=sys.stderr)
        return 2

    # Build API URL
    api_url = f"{gitea_url}/api/v1/repos/{repo}/pulls?state=open&limit=100"

    # Make API request
    try:
        req = urllib.request.Request(api_url, headers={
            'Authorization': f'token {token}',
            'Content-Type': 'application/json'
        })
        with urllib.request.urlopen(req, timeout=30) as resp:
            prs = json.loads(resp.read())
    except urllib.error.URLError as e:
        print(f"ERROR: Failed to fetch PRs: {e}", file=sys.stderr)
        return 2
    except json.JSONDecodeError as e:
        print(f"ERROR: Failed to parse API response: {e}", file=sys.stderr)
        return 2
    except Exception as e:
        print(f"ERROR: Unexpected error: {e}", file=sys.stderr)
        return 2

    # Check for PRs referencing this issue.
    # The (?!\d) guard keeps "#1128" from matching "#11280".
    issue_ref = re.compile(rf"#{issue_number}(?!\d)")
    matching_prs = []
    for pr in prs:
        # The API may return null for either field
        title = pr.get('title') or ''
        body = pr.get('body') or ''
        # Check if issue is referenced in title or body
        if issue_ref.search(title) or issue_ref.search(body):
            matching_prs.append(pr)

    # Report results (timezone-aware replacement for datetime.utcnow())
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    if not matching_prs:
        print(f"[{timestamp}] ✅ No existing PRs found for issue #{issue_number}")
        print("Safe to create a new PR")
        return 0

    # Found existing PRs
    print(f"[{timestamp}] ⚠️ Found existing PRs for issue #{issue_number}:")
    print()
    for pr in matching_prs:
        pr_number = pr.get('number')
        pr_title = pr.get('title')
        pr_branch = pr.get('head', {}).get('ref', 'unknown')
        pr_created = pr.get('created_at', 'unknown')
        pr_url = pr.get('html_url', 'unknown')
        print(f"  PR #{pr_number}: {pr_title}")
        print(f"    Branch: {pr_branch}")
        print(f"    Created: {pr_created}")
        print(f"    URL: {pr_url}")
        print()

    print("❌ Do not create a new PR. Review existing PRs first.")
    print()
    print("Options:")
    print("  1. Review and merge an existing PR")
    print("  2. Close duplicates and keep the best one")
    print("  3. Add comments to existing PRs instead of creating new ones")
    print()
    print("To see details of existing PRs:")
    print(f'  curl -H "Authorization: token $GITEA_TOKEN" "{gitea_url}/api/v1/repos/{repo}/pulls?state=open" | jq \'.[] | select(.title | test("#{issue_number}"; "i"))\'')
    return 1


def main():
    """Main entry point."""
    if len(sys.argv) < 2:
        print("Usage: python3 check_existing_prs.py <issue_number>", file=sys.stderr)
        print("       python3 check_existing_prs.py <issue_number> [repo] [token]", file=sys.stderr)
        return 2

    try:
        issue_number = int(sys.argv[1])
    except ValueError:
        print("ERROR: Issue number must be an integer", file=sys.stderr)
        return 2

    repo = sys.argv[2] if len(sys.argv) > 2 else None
    token = sys.argv[3] if len(sys.argv) > 3 else None
    return check_existing_prs(issue_number, repo, token)


if __name__ == '__main__':
    sys.exit(main())

48
scripts/pr-safe.sh Executable file
@@ -0,0 +1,48 @@
#!/usr/bin/env bash
# ═══════════════════════════════════════════════════════════════
# pr-safe.sh — Safe PR creation wrapper
#
# This script checks for existing PRs before creating a new one.
# It's a wrapper around check-existing-prs.sh that provides
# a user-friendly interface.
#
# Usage:
# ./scripts/pr-safe.sh <issue_number> [branch_name]
#
# If branch_name is not provided, it will suggest one based on
# the issue number and current timestamp.
# ═══════════════════════════════════════════════════════════════
set -euo pipefail
ISSUE_NUMBER="${1:?Usage: $0 <issue_number> [branch_name]}"
BRANCH_NAME="${2:-}"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
echo "🔍 Checking for existing PRs for issue #$ISSUE_NUMBER..."
echo ""
# Run the check
if "$SCRIPT_DIR/check-existing-prs.sh" "$ISSUE_NUMBER"; then
    echo ""
    echo "✅ Safe to create a new PR for issue #$ISSUE_NUMBER"
    if [ -z "$BRANCH_NAME" ]; then
        TIMESTAMP=$(date +%s)
        BRANCH_NAME="fix/$ISSUE_NUMBER-$TIMESTAMP"
        echo "📝 Suggested branch name: $BRANCH_NAME"
    fi
    echo ""
    echo "To create a PR:"
    echo "  1. Create branch: git checkout -b $BRANCH_NAME"
    echo "  2. Make your changes"
    echo "  3. Commit: git commit -m 'fix: Description (#$ISSUE_NUMBER)'"
    echo "  4. Push: git push -u origin $BRANCH_NAME"
    echo "  5. Create PR via API or web interface"
else
    # Capture the check's exit code: 1 means duplicates, anything else is an error
    rc=$?
    echo ""
    if [ "$rc" -eq 1 ]; then
        echo "❌ Cannot create new PR for issue #$ISSUE_NUMBER"
        echo "   Existing PRs found. Review them first."
    else
        echo "❌ Pre-flight check failed (exit code $rc). Fix the error and retry."
    fi
    exit "$rc"
fi

@@ -303,10 +303,9 @@ class TestRunHealthChecks:
heartbeat_path=tmp_path / "missing.json",
)
assert len(report.checks) == 5
assert len(report.checks) == 4
check_names = {c.name for c in report.checks}
assert "WebSocket Gateway" in check_names
assert "Consciousness Loop" in check_names
assert "Heartbeat" in check_names
assert "Syntax Health" in check_names
assert "Kimi Heartbeat" in check_names