Compare commits
1 Commits
fix/882
...
burn/1460-
| Author | SHA1 | Date | |
|---|---|---|---|
| | 34e004e842 | | |
72
.gitea/workflows/duplicate-pr-check.yml
Normal file
@@ -0,0 +1,72 @@
# .gitea/workflows/duplicate-pr-check.yml
# CI workflow to check for duplicate PRs

name: Check for Duplicate PRs

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  check-duplicates:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          # No additional dependencies needed

      - name: Check for duplicate PRs
        env:
          GITEA_TOKEN: ${{ secrets.GITEA_TOKEN }}
          # Pass PR metadata through the environment instead of interpolating
          # it into the script, so a crafted title cannot inject shell commands.
          PR_TITLE: ${{ github.event.pull_request.title }}
          BRANCH_NAME: ${{ github.head_ref }}
        run: |
          # Try to extract the issue number from the PR title first (e.g. "#1460")
          ISSUE_NUM=$(echo "$PR_TITLE" | grep -oE '#[0-9]+' | head -1 | tr -d '#')

          # Fall back to the first number in the branch name
          if [ -z "$ISSUE_NUM" ]; then
            ISSUE_NUM=$(echo "$BRANCH_NAME" | grep -oE '[0-9]+' | head -1)
          fi

          if [ -z "$ISSUE_NUM" ]; then
            echo "No issue number found in PR title or branch name"
            echo "Skipping duplicate check"
            exit 0
          fi

          echo "Checking for duplicate PRs for issue #$ISSUE_NUM"

          # Save token to file for the script
          echo "$GITEA_TOKEN" > /tmp/gitea_token.txt
          export TOKEN_PATH=/tmp/gitea_token.txt

          # Run the duplicate checker. The command is tested directly because a
          # shell started with -e exits on the first failing command, so a
          # separate `$?` check after it would never run.
          if ! python bin/duplicate_pr_prevention.py --repo the-nexus --issue "$ISSUE_NUM" --check; then
            echo ""
            echo "❌ Duplicate PRs detected for issue #$ISSUE_NUM"
            echo "This PR should be closed in favor of an existing one."
            echo ""
            echo "To see details, run:"
            echo " python bin/duplicate_pr_prevention.py --repo the-nexus --issue $ISSUE_NUM --report"
            exit 1
          fi

          echo "✅ No duplicate PRs found"

      - name: Clean up
        if: always()
        run: |
          rm -f /tmp/gitea_token.txt
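One thing worth checking in this workflow: it runs on the PR under review, and `check_for_duplicate_prs` matches every open PR that references the issue, so the PR being checked may itself be reported as a duplicate. A minimal sketch of one way to handle that, assuming the current PR number is available to the caller. `exclude_current_pr` and its `current_pr` parameter are hypothetical; the script has no such option.

```python
from typing import Any, Dict, List, Optional


def exclude_current_pr(duplicates: List[Dict[str, Any]],
                       current_pr: Optional[int]) -> List[Dict[str, Any]]:
    """Drop the PR under test from a list shaped like result["duplicates"]."""
    if current_pr is None:
        return list(duplicates)
    return [pr for pr in duplicates if pr["number"] != current_pr]


# Hypothetical data in the same shape the checker builds.
found = [{"number": 1458, "title": "..."}, {"number": 1455, "title": "..."}]
others = exclude_current_pr(found, current_pr=1458)
```

With the current PR filtered out, the check would fail only when some other open PR targets the same issue.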
241
DUPLICATE_PR_PREVENTION.md
Normal file
@@ -0,0 +1,241 @@
# Duplicate PR Prevention System

**Issue:** #1460 - [META] I keep creating duplicate PRs for issue #1128
**Solution:** Comprehensive prevention system with tools, hooks, and CI checks

## Problem Statement

Issue #1460 describes a meta-problem: 7 duplicate PRs were created for issue #1128, which was itself about cleaning up duplicate PRs. Duplicates cause:
- Reviewer confusion
- Branch clutter
- Risk of merge conflicts
- Wasted CI/CD resources

## Solution Overview

This system prevents duplicate PRs at three levels:
1. **Local Prevention** — Git hooks that check before pushing
2. **CI/CD Prevention** — Workflows that check when PRs are created
3. **Manual Tools** — Scripts for checking and cleaning up duplicates

## Components

### 1. `bin/duplicate_pr_prevention.py`
Main prevention script with three modes:

**Check for duplicates:**
```bash
python3 bin/duplicate_pr_prevention.py --repo the-nexus --issue 1128 --check
```

**Clean up duplicates:**
```bash
# Dry run (see what would be closed)
python3 bin/duplicate_pr_prevention.py --repo the-nexus --issue 1128 --cleanup --dry-run

# Actually close duplicates
python3 bin/duplicate_pr_prevention.py --repo the-nexus --issue 1128 --cleanup
```

**Generate report:**
```bash
python3 bin/duplicate_pr_prevention.py --repo the-nexus --issue 1128 --report
```

### 2. `hooks/pre-push` Git Hook
Local prevention that runs before every push:

**Installation:**
```bash
cp hooks/pre-push .git/hooks/pre-push
chmod +x .git/hooks/pre-push
```

**How it works:**
1. Extracts the issue number from the branch name (e.g., `fix/1128-something` → `1128`)
2. Checks for existing PRs for that issue
3. Blocks the push if duplicates are found
4. Provides instructions for resolution

### 3. `.gitea/workflows/duplicate-pr-check.yml`
CI workflow that checks PRs automatically:

**Triggers:**
- PR opened
- PR synchronized (new commits)
- PR reopened

**What it does:**
1. Extracts the issue number from the PR title or branch name
2. Checks for existing PRs
3. Fails CI if duplicates are found
4. Provides a clear error message
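
The extraction in step 1 can be sketched in Python as follows. This mirrors the workflow's shell logic (first `#N` in the title, otherwise the first run of digits in the branch name); `extract_issue_number` is a hypothetical helper for illustration, not part of the script.

```python
import re
from typing import Optional


def extract_issue_number(pr_title: str, branch_name: str) -> Optional[int]:
    """Return the first '#N' in the title, else the first number in the branch."""
    match = re.search(r"#(\d+)", pr_title)
    if match is None:
        match = re.search(r"(\d+)", branch_name)
    return int(match.group(1)) if match else None


from_title = extract_issue_number("feat: Close duplicate PRs for issue #1128", "fix/1128-cleanup")
from_branch = extract_issue_number("Tidy up", "fix/1460-prevent-duplicates")
nothing = extract_issue_number("Tidy up", "fix/something")
```

Note that the branch fallback takes the first number it sees, so a branch such as `triage/1128-1776129677` resolves to `1128`.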

## Usage Guide

### For Agents (AI Workers)
Before creating any PR:
```bash
# Step 1: Check for duplicates
python3 bin/duplicate_pr_prevention.py --repo the-nexus --issue 1460 --check

# Step 2: If safe (exit 0), create PR
# Step 3: If duplicates exist (exit 1), use existing PR instead
```

### For Developers
Install the Git hook for automatic prevention:
```bash
# One-time setup
cp hooks/pre-push .git/hooks/pre-push
chmod +x .git/hooks/pre-push

# Now git push will automatically check for duplicates
git push  # Will be blocked if duplicates exist
```

### For CI/CD
The workflow runs automatically on all PRs. No per-PR setup is needed once the `GITEA_TOKEN` secret is configured.

## Examples

### Check for duplicates:
```bash
$ python3 bin/duplicate_pr_prevention.py --repo the-nexus --issue 1128 --check
⚠️ Found 2 duplicate PR(s) for issue #1128:
  - PR #1458: feat: Close duplicate PRs for issue #1128
  - PR #1455: feat: Forge cleanup triage — file issues for duplicate PRs (#1128)
```

### Clean up duplicates:
```bash
$ python3 bin/duplicate_pr_prevention.py --repo the-nexus --issue 1128 --cleanup
Cleanup complete:
  Kept PR: #1458
  Closed PRs: [1455]
```

### Generate report:
```bash
$ python3 bin/duplicate_pr_prevention.py --repo the-nexus --issue 1128 --report
# Duplicate PR Prevention Report

**Repository:** the-nexus
**Issue:** #1128
**Generated:** 2026-04-14T23:30:00

## Current Status

⚠️ **Found 2 duplicate PR(s)**

- **PR #1458**: feat: Close duplicate PRs for issue #1128
  - Branch: fix/1128-cleanup
  - Created: 2026-04-14T22:00:00
  - Author: agent

- **PR #1455**: feat: Forge cleanup triage — file issues for duplicate PRs (#1128)
  - Branch: triage/1128-1776129677
  - Created: 2026-04-14T20:00:00
  - Author: agent

## Recommendations

1. **Review existing PRs** — Check which one is the best solution
2. **Keep the newest** — Usually the most up-to-date
3. **Close duplicates** — Use cleanup_duplicate_prs.py
4. **Prevent future duplicates** — Use check_duplicate_pr.py
```

## Branch Naming Conventions

For automatic issue extraction, use these patterns:
- `fix/123-description` → Issue #123
- `burn/123-description` → Issue #123
- `ch/123-description` → Issue #123
- `feature/123-description` → Issue #123

If the branch name contains no issue number, the check is skipped.

## Integration with Existing Tools

This system complements existing tools:
- **PR #1493:** Has `pr_preflight_check.py` — similar functionality
- **PR #1497:** Has `check_duplicate_pr.py` — similar functionality

This system provides additional features:
1. **Git hooks** for local prevention
2. **CI workflows** for automated checking
3. **Cleanup tools** for closing duplicates
4. **Comprehensive reporting**

## Troubleshooting

### Hook not working?
```bash
# Check if hook is installed
ls -la .git/hooks/pre-push

# Make sure it's executable
chmod +x .git/hooks/pre-push

# Test it manually
./.git/hooks/pre-push
```

### CI failing?
1. Check if the `GITEA_TOKEN` secret is set
2. Verify that an issue number can be extracted from the PR title or branch name
3. Check workflow logs for details

### False positives?
If the script incorrectly identifies duplicates:
1. Check PR titles and bodies for issue references
2. Use `--report` to see what's being detected
3. Manually close incorrect PRs if needed
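
One likely source of false positives is that the script matches issue references by plain substring, so a search for `#112` also hits a PR that mentions `#1128`. A minimal sketch of a stricter match, assuming a regular expression is acceptable here; `references_issue` is a hypothetical helper, not a function in the script.

```python
import re


def references_issue(text: str, issue_number: int) -> bool:
    """True if text mentions '#N' or 'issue N' where N is not part of a longer number."""
    pattern = rf"(?:#|issue\s+#?){issue_number}(?!\d)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Plain substring matching treats "#112" as present in "#1128".
substring_hit = "#112" in "fixes #1128"
regex_hit = references_issue("fixes #1128", 112)
```

The negative lookahead `(?!\d)` is what rejects longer numbers that merely start with the same digits.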

## Prevention Strategy

### 1. **Always Check First**
```bash
# Before creating any PR
python3 bin/duplicate_pr_prevention.py --repo the-nexus --issue 1460 --check
```

### 2. **Use Descriptive Branch Names**
```bash
git checkout -b fix/1460-prevent-duplicates  # Good: the issue number can be extracted
git checkout -b fix/something  # Bad: no issue number, so the check is skipped
```

### 3. **Reference Issue in PR**
```markdown
## Summary
Fixes #1460: Prevent duplicate PRs
```

### 4. **Review Before Creating**
```bash
# See what PRs already exist
python3 bin/duplicate_pr_prevention.py --repo the-nexus --issue 1460 --report
```

## Related Issues

- **Issue #1460:** This implementation
- **Issue #1128:** Original issue that had 7 duplicate PRs
- **Issue #1449:** [URGENT] 5 duplicate PRs for issue #1128 need cleanup
- **Issue #1474:** [META] Still creating duplicate PRs for issue #1128 despite cleanup
- **Issue #1480:** [META] 4th duplicate PR for issue #1128 — need intervention

## Files

```
bin/duplicate_pr_prevention.py           # Main prevention script
hooks/pre-push                           # Git hook for local prevention
.gitea/workflows/duplicate-pr-check.yml  # CI workflow
DUPLICATE_PR_PREVENTION.md               # This documentation
```

## License

Part of the Timmy Foundation project.
230
bin/duplicate_pr_prevention.py
Executable file
@@ -0,0 +1,230 @@
#!/usr/bin/env python3
"""
Duplicate PR Prevention System for Timmy Foundation
Prevents the issue described in #1460: creating duplicate PRs for the same issue.
"""

import json
import os
import sys
import urllib.error
import urllib.request
import subprocess
from typing import Dict, List, Any, Optional
from datetime import datetime

# Configuration
GITEA_BASE = "https://forge.alexanderwhitestone.com/api/v1"
# Honor the TOKEN_PATH environment variable (exported by the CI workflow),
# falling back to the default per-user location.
TOKEN_PATH = os.environ.get("TOKEN_PATH", os.path.expanduser("~/.config/gitea/token"))
ORG = "Timmy_Foundation"


class DuplicatePRPrevention:
    def __init__(self):
        self.token = self._load_token()

    def _load_token(self) -> str:
        """Load Gitea API token."""
        try:
            with open(TOKEN_PATH, "r") as f:
                return f.read().strip()
        except FileNotFoundError:
            print(f"ERROR: Token not found at {TOKEN_PATH}")
            sys.exit(1)

    def _api_request(self, endpoint: str, method: str = "GET", data: Optional[Dict] = None) -> Any:
        """Make authenticated Gitea API request."""
        url = f"{GITEA_BASE}{endpoint}"
        headers = {
            "Authorization": f"token {self.token}",
            "Content-Type": "application/json"
        }

        req = urllib.request.Request(url, headers=headers, method=method)
        if data:
            req.data = json.dumps(data).encode()

        try:
            with urllib.request.urlopen(req) as resp:
                if resp.status == 204:  # No content
                    return {"status": "success", "code": resp.status}
                return json.loads(resp.read())
        except urllib.error.HTTPError as e:
            error_body = e.read().decode() if e.fp else "No error body"
            print(f"API Error {e.code}: {error_body}")
            return {"error": e.code, "message": error_body}

    def check_for_duplicate_prs(self, repo: str, issue_number: int) -> Dict[str, Any]:
        """Check for existing PRs that reference a specific issue."""
        # Get all open PRs
        endpoint = f"/repos/{ORG}/{repo}/pulls?state=open"
        prs = self._api_request(endpoint)

        if not isinstance(prs, list):
            # Return the same keys as the success case so callers that index
            # "has_duplicates" or "count" do not raise KeyError.
            return {
                "error": "Could not fetch PRs",
                "has_duplicates": False,
                "count": 0,
                "duplicates": []
            }

        duplicates = []

        for pr in prs:
            # Check if PR title or body references the issue
            title = pr.get('title', '').lower()
            body = pr.get('body', '').lower() if pr.get('body') else ''

            # Look for issue references
            issue_refs = [
                f"#{issue_number}",
                f"issue {issue_number}",
                f"issue #{issue_number}",
                f"fixes #{issue_number}",
                f"closes #{issue_number}",
                f"resolves #{issue_number}",
                f"for #{issue_number}",
                f"for issue #{issue_number}",
            ]

            for ref in issue_refs:
                if ref in title or ref in body:
                    duplicates.append({
                        'number': pr['number'],
                        'title': pr['title'],
                        'branch': pr['head']['ref'],
                        'created': pr['created_at'],
                        'user': pr['user']['login'],
                        'url': pr['html_url']
                    })
                    break

        return {
            "has_duplicates": len(duplicates) > 0,
            "count": len(duplicates),
            "duplicates": duplicates
        }

    def cleanup_duplicate_prs(self, repo: str, issue_number: int, dry_run: bool = True) -> Dict[str, Any]:
        """Close duplicate PRs for an issue, keeping the newest."""
        duplicates = self.check_for_duplicate_prs(repo, issue_number)

        if not duplicates["has_duplicates"]:
            return {"status": "no_duplicates", "closed": []}

        # Sort by creation date (newest first)
        sorted_prs = sorted(duplicates["duplicates"],
                            key=lambda x: x['created'],
                            reverse=True)

        # Keep the newest, close the rest
        to_keep = sorted_prs[0] if sorted_prs else None
        to_close = sorted_prs[1:] if len(sorted_prs) > 1 else []

        closed = []

        if not dry_run:
            for pr in to_close:
                # Add comment explaining why it's being closed
                comment_data = {
                    "body": f"**Closing as duplicate** — This PR is a duplicate for issue #{issue_number}.\n\n"
                            f"Keeping PR #{to_keep['number']} instead.\n\n"
                            f"This is an automated cleanup to prevent duplicate PRs.\n"
                            f"See issue #1460 for context."
                }

                # Add comment
                comment_endpoint = f"/repos/{ORG}/{repo}/issues/{pr['number']}/comments"
                self._api_request(comment_endpoint, "POST", comment_data)

                # Close the PR
                close_data = {"state": "closed"}
                close_endpoint = f"/repos/{ORG}/{repo}/pulls/{pr['number']}"
                result = self._api_request(close_endpoint, "PATCH", close_data)

                if "error" not in result:
                    closed.append(pr['number'])

        return {
            "status": "success",
            "kept": to_keep['number'] if to_keep else None,
            "closed": closed,
            "dry_run": dry_run
        }

    def generate_prevention_report(self, repo: str, issue_number: int) -> str:
        """Generate a report on duplicate prevention status."""
        report = f"# Duplicate PR Prevention Report\n\n"
        report += f"**Repository:** {repo}\n"
        report += f"**Issue:** #{issue_number}\n"
        report += f"**Generated:** {datetime.now().isoformat()}\n\n"

        # Check for duplicates
        duplicates = self.check_for_duplicate_prs(repo, issue_number)

        report += "## Current Status\n\n"
        if duplicates["has_duplicates"]:
            report += f"⚠️ **Found {duplicates['count']} duplicate PR(s)**\n\n"
            for dup in duplicates["duplicates"]:
                report += f"- **PR #{dup['number']}**: {dup['title']}\n"
                report += f" - Branch: {dup['branch']}\n"
                report += f" - Created: {dup['created']}\n"
                report += f" - Author: {dup['user']}\n"
                report += f" - URL: {dup['url']}\n\n"
        else:
            report += "✅ **No duplicate PRs found**\n\n"

        # Recommendations
        report += "## Recommendations\n\n"
        if duplicates["has_duplicates"]:
            report += "1. **Review existing PRs** — Check which one is the best solution\n"
            report += "2. **Keep the newest** — Usually the most up-to-date\n"
            report += "3. **Close duplicates** — Use cleanup_duplicate_prs.py\n"
            report += "4. **Prevent future duplicates** — Use check_duplicate_pr.py\n"
        else:
            report += "1. **Safe to create PR** — No duplicates exist\n"
            report += "2. **Use prevention tools** — Always check before creating PRs\n"
            report += "3. **Install hooks** — Use Git hooks for automatic prevention\n"

        return report


def main():
    """Main entry point."""
    import argparse

    parser = argparse.ArgumentParser(description="Duplicate PR Prevention System")
    parser.add_argument("--repo", required=True, help="Repository name (e.g., the-nexus)")
    parser.add_argument("--issue", required=True, type=int, help="Issue number")
    parser.add_argument("--check", action="store_true", help="Check for duplicates")
    parser.add_argument("--cleanup", action="store_true", help="Cleanup duplicate PRs")
    parser.add_argument("--dry-run", action="store_true", help="Dry run for cleanup")
    parser.add_argument("--report", action="store_true", help="Generate report")

    args = parser.parse_args()

    prevention = DuplicatePRPrevention()

    if args.check:
        result = prevention.check_for_duplicate_prs(args.repo, args.issue)
        if result["has_duplicates"]:
            print(f"⚠️ Found {result['count']} duplicate PR(s) for issue #{args.issue}:")
            for dup in result["duplicates"]:
                print(f" - PR #{dup['number']}: {dup['title']}")
            sys.exit(1)
        else:
            print(f"✅ No duplicate PRs found for issue #{args.issue}")
            sys.exit(0)

    elif args.cleanup:
        result = prevention.cleanup_duplicate_prs(args.repo, args.issue, args.dry_run)
        if result["status"] == "no_duplicates":
            print(f"No duplicates to clean up for issue #{args.issue}")
        else:
            print(f"Cleanup {'(dry run) ' if args.dry_run else ''}complete:")
            print(f" Kept PR: #{result['kept']}")
            print(f" Closed PRs: {result['closed']}")

    elif args.report:
        report = prevention.generate_prevention_report(args.repo, args.issue)
        print(report)

    else:
        parser.print_help()


if __name__ == "__main__":
    main()
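`check_for_duplicate_prs` issues a single request for open PRs. Gitea list endpoints are paginated, so on a repository with many open PRs one page may not cover everything. A sketch of paging through results, assuming the endpoint accepts a `page` query parameter and returns an empty list past the last page; `fetch_all_pages` and `fetch_page` are hypothetical names, with `fetch_page` standing in for a call to `_api_request`.

```python
from typing import Any, Callable, Dict, List


def fetch_all_pages(fetch_page: Callable[[int], List[Dict[str, Any]]],
                    max_pages: int = 100) -> List[Dict[str, Any]]:
    """Collect items from page 1 onward until a page comes back empty."""
    items: List[Dict[str, Any]] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            break
        items.extend(batch)
    return items


# Stand-in for the API: two pages of PRs, then nothing.
fake_pages = {1: [{"number": 1458}, {"number": 1455}], 2: [{"number": 1449}]}
all_prs = fetch_all_pages(lambda page: fake_pages.get(page, []))
```

This sketch does not handle the error dictionary that `_api_request` returns on HTTP failures; a real integration would need to check for that before extending the list.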
@@ -1,55 +0,0 @@
{
  "dead_timeout_seconds": 600,
  "default_policy": {
    "mode": "ask"
  },
  "missions": {
    "forge": {
      "mode": "yes"
    },
    "archive": {
      "mode": "ask"
    },
    "sovereign-core": {
      "mode": "no"
    }
  },
  "agents": {
    "bezalel": {
      "mission": "forge"
    },
    "allegro": {
      "mission": "forge"
    },
    "ezra": {
      "mission": "archive",
      "mode": "ask"
    },
    "timmy": {
      "mission": "sovereign-core",
      "mode": "ask"
    }
  },
  "substitutions": {
    "bezalel": [
      "allegro",
      "timmy"
    ],
    "ezra": [
      "timmy"
    ],
    "allegro": [
      "timmy"
    ]
  },
  "approval_channels": {
    "telegram": {
      "enabled": true,
      "target": "ops-room"
    },
    "nostr": {
      "enabled": true,
      "target": "nostr-ops"
    }
  }
}
@@ -1,27 +0,0 @@
# Resurrection Pool

The Resurrection Pool is a mission-aware layer on top of the existing Lazarus registry.

It adds three concrete behaviors:
- configurable dead-agent detection timeout
- yes/no/ask revival policy resolution per mission or agent
- approval packet generation for Telegram / Nostr when human sign-off is required

## Files
- `scripts/resurrection_pool.py`
- `config/resurrection_pool.json`

## Example usage

```bash
python scripts/resurrection_pool.py --json --dry-run
python scripts/resurrection_pool.py --execute
```

## Policy model
- `yes` → local agents auto-restart; remote agents prefer a healthy substitute
- `ask` → generate an approval request packet with Telegram / Nostr targets
- `no` → suppress automatic revival
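
The precedence between agent-level and mission-level settings is not spelled out here. A minimal sketch, assuming an agent's own `mode` wins over its mission's `mode`, which in turn wins over `default_policy`. `resolve_mode` is a hypothetical helper for illustration, not a function taken from `scripts/resurrection_pool.py`.

```python
from typing import Any, Dict


def resolve_mode(policy: Dict[str, Any], agent: str) -> str:
    """Pick 'yes', 'no', or 'ask' for an agent: agent > mission > default (assumed order)."""
    agent_cfg = policy.get("agents", {}).get(agent, {})
    if "mode" in agent_cfg:
        return agent_cfg["mode"]
    mission_cfg = policy.get("missions", {}).get(agent_cfg.get("mission"), {})
    if "mode" in mission_cfg:
        return mission_cfg["mode"]
    return policy.get("default_policy", {}).get("mode", "ask")


# A trimmed policy in the same shape as config/resurrection_pool.json.
policy = {
    "default_policy": {"mode": "ask"},
    "missions": {"forge": {"mode": "yes"}, "sovereign-core": {"mode": "no"}},
    "agents": {
        "bezalel": {"mission": "forge"},
        "timmy": {"mission": "sovereign-core", "mode": "ask"},
    },
}
```

Under this assumed order, `bezalel` inherits `yes` from the `forge` mission, while `timmy` resolves to `ask` because the agent-level setting overrides the mission's `no`.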

## Notes
This grounds issue #882 in executable code, but it does not yet wire live Telegram or Nostr delivery. The current slice produces the approval packet and restart/substitution plan the surrounding ops loop can act on.
59
hooks/pre-push
Normal file
@@ -0,0 +1,59 @@
#!/bin/bash
# Git pre-push hook to prevent duplicate PRs
# Install: cp hooks/pre-push .git/hooks/pre-push && chmod +x .git/hooks/pre-push

set -e

echo "🔍 Checking for duplicate PRs before pushing..."

# Get the current branch name
BRANCH=$(git branch --show-current)

# Extract issue number from branch name
# Patterns: fix/123-xxx, burn/123-xxx, ch/123-xxx, etc.
ISSUE_NUM=$(echo "$BRANCH" | grep -oE '[0-9]+' | head -1)

if [ -z "$ISSUE_NUM" ]; then
    echo "ℹ️ No issue number found in branch name: $BRANCH"
    echo " Skipping duplicate check..."
    exit 0
fi

echo "📋 Found issue #$ISSUE_NUM in branch name"

# Get repository name from git remote
REMOTE_URL=$(git config --get remote.origin.url)
if [[ "$REMOTE_URL" == *"Timmy_Foundation/"* ]]; then
    REPO=$(echo "$REMOTE_URL" | sed 's/.*Timmy_Foundation\///' | sed 's/\.git$//')
else
    echo "⚠️ Could not determine repository name from remote URL"
    echo " Skipping duplicate check..."
    exit 0
fi

echo "📦 Repository: $REPO"

# Run the duplicate checker
if [ -f "bin/duplicate_pr_prevention.py" ]; then
    # Test the command directly: under `set -e`, a separate `$?` check would
    # never run because the script exits as soon as the checker fails.
    if ! python3 bin/duplicate_pr_prevention.py --repo "$REPO" --issue "$ISSUE_NUM" --check; then
        echo ""
        echo "❌ PUSH BLOCKED: Duplicate PRs exist for issue #$ISSUE_NUM"
        echo ""
        echo "To resolve:"
        echo " 1. Review existing PRs: python3 bin/duplicate_pr_prevention.py --repo $REPO --issue $ISSUE_NUM --report"
        echo " 2. Use existing PR instead of creating a new one"
        echo " 3. Or clean up duplicates: python3 bin/duplicate_pr_prevention.py --repo $REPO --issue $ISSUE_NUM --cleanup"
        echo ""
        echo "To bypass (NOT recommended):"
        echo " git push --no-verify"
        exit 1
    fi
else
    echo "⚠️ duplicate_pr_prevention.py not found in bin/"
    echo " Skipping duplicate check..."
    # Exit here so the "no duplicates" message below is not printed for a skipped check
    exit 0
fi

echo "✅ No duplicate PRs found. Proceeding with push..."
exit 0
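The hook derives the repository name from the remote URL with two `sed` substitutions. The same step can be sketched in Python as below; `repo_from_remote` is a hypothetical helper for illustration, and the example URLs in the usage lines are assumptions about how the remote might be written.

```python
import re
from typing import Optional


def repo_from_remote(remote_url: str, org: str = "Timmy_Foundation") -> Optional[str]:
    """Return the repository name following '<org>/' in a remote URL, if present."""
    marker = f"{org}/"
    if marker not in remote_url:
        return None
    # Like sed's greedy '.*', keep only what follows the last occurrence of the marker.
    name = remote_url.rsplit(marker, 1)[1]
    # Strip a trailing ".git" suffix, as the second sed expression does.
    return re.sub(r"\.git$", "", name)


https_repo = repo_from_remote("https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus.git")
ssh_repo = repo_from_remote("git@forge.alexanderwhitestone.com:Timmy_Foundation/the-nexus")
other = repo_from_remote("https://example.com/other/repo.git")
```

Returning `None` for an unrecognized remote corresponds to the hook's "skip the check" branch.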
@@ -1,111 +0,0 @@
# Night Shift Prediction Report — April 12-13, 2026

## Starting State (11:36 PM)

```
Time: 11:36 PM EDT
Automation: 13 burn loops × 3min + 1 explorer × 10min + 1 backlog × 30min
API: Nous/xiaomi/mimo-v2-pro (FREE)
Rate: 268 calls/hour
Duration: 7.5 hours until 7 AM
Total expected API calls: ~2,010
```
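
The rate and total above follow from the schedule. A short worked check, assuming each scheduled run makes exactly one API call (the report does not state this):

```python
# Scheduled jobs: (number of jobs, interval in minutes), as listed above.
jobs = {"burn loops": (13, 3), "explorer": (1, 10), "backlog": (1, 30)}

# Each job fires 60 / interval times per hour.
calls_per_hour = sum(count * (60 // interval) for count, interval in jobs.values())

hours = 7.5
total_calls = calls_per_hour * hours

print(calls_per_hour)  # 268
print(total_calls)     # 2010.0
```

The burn loops dominate: 13 × 20 = 260 of the 268 hourly calls.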

## Burn Loops Active (13 @ every 3 min)

| Loop | Repo | Focus |
|------|------|-------|
| Testament Burn | the-nexus | MUD bridge + paper |
| Foundation Burn | all repos | Gitea issues |
| beacon-sprint | the-nexus | paper iterations |
| timmy-home sprint | timmy-home | 226 issues |
| Beacon sprint | the-beacon | game issues |
| timmy-config sprint | timmy-config | config issues |
| the-door burn | the-door | crisis front door |
| the-testament burn | the-testament | book |
| the-nexus burn | the-nexus | 3D world + MUD |
| fleet-ops burn | fleet-ops | sovereign fleet |
| timmy-academy burn | timmy-academy | academy |
| turboquant burn | turboquant | KV-cache compression |
| wolf burn | wolf | model evaluation |

## Expected Outcomes by 7 AM

### API Calls
- Total calls: ~2,010
- Successful completions: ~1,400 (70%)
- API errors (rate limit, timeout): ~400 (20%)
- Iteration limits hit: ~210 (10%)

### Commits
- Total commits pushed: ~800-1,200
- Average per loop: ~60-90 commits
- Unique branches created: ~300-400

### Pull Requests
- Total PRs created: ~150-250
- Average per loop: ~12-19 PRs

### Issues Filed
- New issues created (QA, explorer): ~20-40
- Issues closed by PRs: ~50-100

### Code Written
- Estimated lines added: ~50,000-100,000
- Estimated files created/modified: ~2,000-3,000

### Paper Progress
- Research paper iterations: ~150 cycles
- Expected paper word count growth: ~5,000-10,000 words
- New experiment results: 2-4 additional experiments
- BibTeX citations: 10-20 verified citations

### MUD Bridge
- Bridge file: 2,875 → ~5,000+ lines
- New game systems: 5-10 (combat tested, economy, social graph, leaderboard)
- QA cycles: 15-30 exploration sessions
- Critical bugs found: 3-5
- Critical bugs fixed: 2-3

### Repository Activity (per repo)
| Repo | Expected PRs | Expected Commits |
|------|-------------|-----------------|
| the-nexus | 30-50 | 200-300 |
| the-beacon | 20-30 | 150-200 |
| timmy-config | 15-25 | 100-150 |
| the-testament | 10-20 | 80-120 |
| the-door | 5-10 | 40-60 |
| timmy-home | 10-20 | 80-120 |
| fleet-ops | 5-10 | 40-60 |
| timmy-academy | 5-10 | 40-60 |
| turboquant | 3-5 | 20-30 |
| wolf | 3-5 | 20-30 |

### Dream Cycle
- 5 dreams generated (11:30 PM, 1 AM, 2:30 AM, 4 AM, 5:30 AM)
- 1 reflection (10 PM)
- 1 timmy-dreams (5:30 AM)
- Total dream output: ~5,000-8,000 words of creative writing

### Explorer (every 10 min)
- ~45 exploration cycles
- Bugs found: 15-25
- Issues filed: 15-25

### Risk Factors
- API rate limiting: Possible after 500+ consecutive calls
- Large file patch failures: Bridge file too large for agents
- Branch conflicts: Multiple agents on same repo
- Iteration limits: 5-iteration agents can't push
- Repository cloning: May hit timeout on slow clones

### Confidence Level
- High confidence: 800+ commits, 150+ PRs
- Medium confidence: 1,000+ commits, 200+ PRs
- Low confidence: 1,200+ commits, 250+ PRs (requires all loops running clean)

---

*This report is a prediction. The 7 AM morning report will compare actual results.*
*Generated: 2026-04-12 23:36 EDT*
*Author: Timmy (pre-shift prediction)*
@@ -1,377 +0,0 @@
#!/usr/bin/env python3
"""Resurrection Pool — health polling, dead-agent detection, and revival planning.

Grounded implementation slice for #882.
Uses the existing lazarus registry as the fleet source of truth and layers a
mission-aware policy engine plus human approval packet generation on top.
"""

from __future__ import annotations

import argparse
import json
import subprocess
import urllib.request
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional

import yaml

ROOT = Path(__file__).resolve().parent.parent
REGISTRY_PATH = ROOT / "lazarus-registry.yaml"
POLICY_PATH = ROOT / "config" / "resurrection_pool.json"
STATE_PATH = Path("/var/lib/lazarus/resurrection_pool_state.json")
LOCAL_HOSTS = {"127.0.0.1", "localhost", "104.131.15.18"}
ISSUE_NUMBER = 882


def shell(cmd: str, timeout: int = 30) -> tuple[int, str, str]:
    try:
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout)
        return result.returncode, result.stdout.strip(), result.stderr.strip()
    except Exception as exc:  # pragma: no cover - defensive wrapper
        return -1, "", str(exc)


def is_local_host(host: Optional[str]) -> bool:
    if not host:
        return True
    return host in LOCAL_HOSTS or host.startswith("127.")


def ping_http(url: str, timeout: int = 10) -> tuple[bool, int]:
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return True, resp.status
    except urllib.error.HTTPError as err:
        return True, err.code
    except Exception:
        return False, 0


def load_registry(path: Path = REGISTRY_PATH) -> Dict[str, Any]:
    with open(path, "r", encoding="utf-8") as handle:
        return yaml.safe_load(handle) or {}


def load_policy(path: Path = POLICY_PATH) -> Dict[str, Any]:
    if not path.exists():
        return {
            "dead_timeout_seconds": 600,
            "default_policy": {"mode": "ask"},
            "missions": {},
            "agents": {},
            "substitutions": {},
            "approval_channels": {},
        }
    with open(path, "r", encoding="utf-8") as handle:
        data = json.load(handle)
    data.setdefault("dead_timeout_seconds", 600)
    data.setdefault("default_policy", {"mode": "ask"})
    data.setdefault("missions", {})
    data.setdefault("agents", {})
    data.setdefault("substitutions", {})
    data.setdefault("approval_channels", {})
    return data


def load_state(path: Path = STATE_PATH) -> Dict[str, Any]:
    if not path.exists():
        return {}
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def save_state(state: Dict[str, Any], path: Path = STATE_PATH) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2, sort_keys=True)
|
||||
|
||||
|
||||
def collect_health_snapshot(registry: Dict[str, Any]) -> Dict[str, Any]:
    provider_matrix = registry.get("provider_health_matrix", {})
    fleet = registry.get("fleet", {})
    snapshot: Dict[str, Any] = {}

    for agent_name, spec in fleet.items():
        primary = spec.get("primary", {})
        provider_name = primary.get("provider")
        provider_status = provider_matrix.get(provider_name, {}).get("status", "unknown")
        gateway_url = spec.get("health_endpoints", {}).get("gateway")
        gateway_reachable, gateway_status = (False, 0)
        if gateway_url:
            gateway_reachable, gateway_status = ping_http(gateway_url)

        service_active: Optional[bool] = None
        if is_local_host(spec.get("host")):
            service_code, _, _ = shell(f"systemctl is-active hermes-{agent_name}.service")
            service_active = service_code == 0

        reasons: List[str] = []
        if gateway_url and not gateway_reachable:
            reasons.append("gateway_unreachable")
        if service_active is False:
            reasons.append("service_inactive")
        if provider_status in {"dead", "degraded"}:
            reasons.append(f"primary_{provider_status}")

        snapshot[agent_name] = {
            "agent": agent_name,
            "host": spec.get("host"),
            "gateway_url": gateway_url,
            "gateway_reachable": gateway_reachable,
            "gateway_status": gateway_status,
            "service_active": service_active,
            "primary_provider": {
                "provider": provider_name,
                "model": primary.get("model"),
                "status": provider_status,
            },
            "healthy_now": not reasons,
            "reasons": reasons,
        }
    return snapshot


def update_state(snapshot: Dict[str, Any], state: Dict[str, Any], now_ts: float) -> Dict[str, Any]:
    updated = dict(state)
    for agent_name, info in snapshot.items():
        entry = dict(updated.get(agent_name, {}))
        entry["last_checked_at"] = now_ts
        entry["last_reasons"] = list(info.get("reasons", []))
        if info.get("healthy_now"):
            entry["last_healthy_at"] = now_ts
        else:
            entry.setdefault("last_healthy_at", None)
        updated[agent_name] = entry
    return updated


def detect_downed_agents(
    snapshot: Dict[str, Any],
    state: Dict[str, Any],
    policy: Dict[str, Any],
    now_ts: float,
) -> Dict[str, Any]:
    default_timeout = int(policy.get("dead_timeout_seconds", 600))
    agent_overrides = policy.get("agents", {})
    detected: Dict[str, Any] = {}

    for agent_name, info in snapshot.items():
        timeout_seconds = int(agent_overrides.get(agent_name, {}).get("dead_timeout_seconds", default_timeout))
        last_healthy_at = state.get(agent_name, {}).get("last_healthy_at")
        if info.get("healthy_now"):
            unhealthy_for_seconds = 0.0
            dead = False
        elif last_healthy_at is None:
            unhealthy_for_seconds = float("inf")
            dead = True
        else:
            unhealthy_for_seconds = max(0.0, now_ts - float(last_healthy_at))
            dead = unhealthy_for_seconds >= timeout_seconds

        detected[agent_name] = {
            **info,
            "last_healthy_at": last_healthy_at,
            "timeout_seconds": timeout_seconds,
            "unhealthy_for_seconds": unhealthy_for_seconds,
            "dead": dead,
        }
    return detected


def resolve_policy(agent_name: str, spec: Dict[str, Any], policy: Dict[str, Any]) -> Dict[str, Any]:
    resolved = dict(policy.get("default_policy", {}))
    spec_mission = spec.get("mission")
    agent_override = dict(policy.get("agents", {}).get(agent_name, {}))
    resolved_mission = agent_override.get("mission") or spec_mission or agent_name
    if resolved_mission in policy.get("missions", {}):
        resolved.update(policy["missions"][resolved_mission])
    resolved.update(agent_override)
    resolved.setdefault("mode", "ask")
    resolved["mission"] = resolved_mission
    return resolved


def choose_substitute(
    agent_name: str,
    spec: Dict[str, Any],
    health_snapshot: Dict[str, Any],
    policy: Dict[str, Any],
) -> Optional[str]:
    candidates = list(policy.get("substitutions", {}).get(agent_name, []))
    candidates.extend(spec.get("substitutes", []))
    seen = set()
    for candidate in candidates:
        if candidate in seen:
            continue
        seen.add(candidate)
        candidate_health = health_snapshot.get(candidate, {})
        if candidate_health.get("healthy_now"):
            return candidate
    return None


def build_restart_command(agent_name: str) -> str:
    return f"systemctl restart hermes-{agent_name}.service"


def build_approval_request(
    agent_name: str,
    policy_decision: Dict[str, Any],
    down_info: Dict[str, Any],
    substitute: Optional[str],
    policy: Dict[str, Any],
    now_ts: Optional[float] = None,
) -> Dict[str, Any]:
    if now_ts is None:
        now_ts = datetime.now(timezone.utc).timestamp()
    reasons = ", ".join(down_info.get("reasons", [])) or "no health signal"
    mission = policy_decision.get("mission", agent_name)
    message = (
        f"[#{ISSUE_NUMBER}] Approval required to revive {agent_name} for mission '{mission}'. "
        f"Reasons: {reasons}. "
        f"Suggested substitute: {substitute or 'none available'}."
    )
    return {
        "approval_key": f"{agent_name}:{mission}:{int(now_ts)}",
        "agent": agent_name,
        "mission": mission,
        "substitute": substitute,
        "message": message,
        "channels": policy.get("approval_channels", {}),
    }


def plan_resurrections(
    registry: Dict[str, Any],
    downed_agents: Dict[str, Any],
    health_snapshot: Dict[str, Any],
    policy: Dict[str, Any],
    now_ts: Optional[float] = None,
) -> List[Dict[str, Any]]:
    if now_ts is None:
        now_ts = datetime.now(timezone.utc).timestamp()
    fleet = registry.get("fleet", {})
    plan: List[Dict[str, Any]] = []

    for agent_name, down_info in sorted(downed_agents.items()):
        if not down_info.get("dead"):
            continue
        spec = fleet.get(agent_name, {})
        policy_decision = resolve_policy(agent_name, spec, policy)
        substitute = choose_substitute(agent_name, spec, health_snapshot, policy)
        action = "suppressed"
        restart_command = None
        approval_request = None

        if policy_decision.get("mode") == "yes":
            if is_local_host(spec.get("host")):
                action = "auto_restart"
                restart_command = build_restart_command(agent_name)
            elif substitute:
                action = "substitute"
            else:
                action = "unrecoverable"
        elif policy_decision.get("mode") == "ask":
            action = "approval_required"
            approval_request = build_approval_request(
                agent_name,
                policy_decision,
                down_info,
                substitute,
                policy,
                now_ts=now_ts,
            )

        plan.append(
            {
                "agent": agent_name,
                "mission": policy_decision.get("mission"),
                "policy": policy_decision,
                "reasons": list(down_info.get("reasons", [])),
                "timeout_seconds": down_info.get("timeout_seconds"),
                "action": action,
                "substitute": substitute,
                "restart_command": restart_command,
                "approval_request": approval_request,
            }
        )

    return plan


def execute_plan(plan: List[Dict[str, Any]], dry_run: bool = False) -> List[Dict[str, Any]]:
    executed: List[Dict[str, Any]] = []
    for entry in plan:
        if entry.get("action") != "auto_restart":
            executed.append({**entry, "executed": False})
            continue
        cmd = entry.get("restart_command")
        if dry_run or not cmd:
            executed.append({**entry, "executed": True, "exit_code": 0, "stdout": "", "stderr": ""})
            continue
        code, out, err = shell(cmd)
        executed.append({**entry, "executed": code == 0, "exit_code": code, "stdout": out, "stderr": err})
    return executed


def render_summary(snapshot: Dict[str, Any], plan: List[Dict[str, Any]]) -> str:
    healthy = sum(1 for info in snapshot.values() if info.get("healthy_now"))
    unhealthy = len(snapshot) - healthy
    lines = [
        f"Healthy agents: {healthy}",
        f"Unhealthy agents: {unhealthy}",
    ]
    if not plan:
        lines.append("Resurrection plan: no dead agents exceed timeout.")
        return "\n".join(lines)
    lines.append("Resurrection plan:")
    for entry in plan:
        lines.append(
            f"- {entry['agent']}: {entry['action']}"
            f" (mission={entry['mission']}, reasons={', '.join(entry['reasons']) or 'none'})"
        )
    return "\n".join(lines)


def main() -> int:
    parser = argparse.ArgumentParser(description="Resurrection Pool")
    parser.add_argument("--registry", type=Path, default=REGISTRY_PATH)
    parser.add_argument("--policy", type=Path, default=POLICY_PATH)
    parser.add_argument("--state", type=Path, default=STATE_PATH)
    parser.add_argument("--json", action="store_true")
    parser.add_argument("--execute", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args()

    now_ts = datetime.now(timezone.utc).timestamp()
    registry = load_registry(args.registry)
    policy = load_policy(args.policy)
    prior_state = load_state(args.state)
    snapshot = collect_health_snapshot(registry)
    next_state = update_state(snapshot, prior_state, now_ts)
    downed_agents = detect_downed_agents(snapshot, next_state, policy, now_ts)
    # Substitute selection reads per-agent health, so pass the health snapshot itself.
    plan = plan_resurrections(registry, downed_agents, snapshot, policy, now_ts=now_ts)
    if args.execute:
        plan = execute_plan(plan, dry_run=args.dry_run)
    if not args.dry_run:
        save_state(next_state, args.state)

    payload = {
        "checked_at": datetime.fromtimestamp(now_ts, tz=timezone.utc).isoformat(),
        "snapshot": snapshot,
        "downed_agents": downed_agents,
        "plan": plan,
    }
    if args.json:
        print(json.dumps(payload, indent=2, sort_keys=True))
    else:
        print(render_summary(snapshot, plan))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
@@ -1,25 +0,0 @@
from pathlib import Path


REPORT = Path("reports/night-shift-prediction-2026-04-12.md")


def test_prediction_report_exists_with_required_sections():
    assert REPORT.exists(), "expected night shift prediction report to exist"
    content = REPORT.read_text()
    assert "# Night Shift Prediction Report — April 12-13, 2026" in content
    assert "## Starting State (11:36 PM)" in content
    assert "## Burn Loops Active (13 @ every 3 min)" in content
    assert "## Expected Outcomes by 7 AM" in content
    assert "### Risk Factors" in content
    assert "### Confidence Level" in content
    assert "This report is a prediction" in content


def test_prediction_report_preserves_core_forecast_numbers():
    content = REPORT.read_text()
    assert "Total expected API calls: ~2,010" in content
    assert "Total commits pushed: ~800-1,200" in content
    assert "Total PRs created: ~150-250" in content
    assert "the-nexus | 30-50 | 200-300" in content
    assert "Generated: 2026-04-12 23:36 EDT" in content
@@ -1,118 +0,0 @@
from importlib import util
from pathlib import Path


ROOT = Path(__file__).resolve().parent.parent
MODULE_PATH = ROOT / "scripts" / "resurrection_pool.py"


def load_module():
    spec = util.spec_from_file_location("resurrection_pool", MODULE_PATH)
    module = util.module_from_spec(spec)
    assert spec.loader is not None
    spec.loader.exec_module(module)
    return module


def test_detect_downed_agents_respects_configurable_timeout():
    pool = load_module()
    snapshot = {
        "bezalel": {"healthy_now": False, "reasons": ["gateway_unreachable"]},
        "timmy": {"healthy_now": True, "reasons": []},
    }
    state = {
        "bezalel": {"last_healthy_at": 100.0},
        "timmy": {"last_healthy_at": 650.0},
    }
    policy = {"dead_timeout_seconds": 600, "agents": {}}

    not_dead = pool.detect_downed_agents(snapshot, state, policy, now_ts=650.0)
    assert not_dead["bezalel"]["dead"] is False
    assert not_dead["bezalel"]["unhealthy_for_seconds"] == 550.0

    dead = pool.detect_downed_agents(snapshot, state, policy, now_ts=701.0)
    assert dead["bezalel"]["dead"] is True
    assert dead["bezalel"]["timeout_seconds"] == 600
    assert "gateway_unreachable" in dead["bezalel"]["reasons"]


def test_update_state_records_last_healthy_timestamp():
    pool = load_module()
    snapshot = {
        "bezalel": {"healthy_now": True, "reasons": []},
        "ezra": {"healthy_now": False, "reasons": ["service_inactive"]},
    }
    updated = pool.update_state(snapshot, {}, now_ts=1234.5)
    assert updated["bezalel"]["last_healthy_at"] == 1234.5
    assert updated["ezra"]["last_healthy_at"] is None
    assert updated["ezra"]["last_reasons"] == ["service_inactive"]


def test_plan_resurrections_prefers_auto_restart_for_yes_policy():
    pool = load_module()
    registry = {
        "fleet": {
            "bezalel": {"mission": "forge", "host": "127.0.0.1"},
            "allegro": {"mission": "forge", "host": "203.0.113.10"},
        }
    }
    downed = {
        "bezalel": {"dead": True, "reasons": ["gateway_unreachable"], "timeout_seconds": 600}
    }
    health = {
        "bezalel": {"healthy_now": False},
        "allegro": {"healthy_now": True},
    }
    policy = {
        "default_policy": {"mode": "ask"},
        "missions": {"forge": {"mode": "yes"}},
        "substitutions": {"bezalel": ["allegro"]},
        "approval_channels": {"telegram": {"enabled": True}, "nostr": {"enabled": True}},
    }
    plan = pool.plan_resurrections(registry, downed, health, policy, now_ts=2000.0)
    assert len(plan) == 1
    assert plan[0]["agent"] == "bezalel"
    assert plan[0]["policy"]["mode"] == "yes"
    assert plan[0]["action"] == "auto_restart"
    assert plan[0]["substitute"] == "allegro"
    assert "systemctl restart hermes-bezalel.service" in plan[0]["restart_command"]


def test_resolve_policy_applies_mission_defaults_after_agent_override_sets_mission():
    pool = load_module()
    decision = pool.resolve_policy(
        "bezalel",
        {},
        {
            "default_policy": {"mode": "ask"},
            "missions": {"forge": {"mode": "yes"}},
            "agents": {"bezalel": {"mission": "forge"}},
        },
    )
    assert decision["mission"] == "forge"
    assert decision["mode"] == "yes"


def test_plan_resurrections_builds_approval_request_for_ask_policy():
    pool = load_module()
    registry = {"fleet": {"ezra": {"mission": "archive", "host": "203.0.113.20"}}}
    downed = {"ezra": {"dead": True, "reasons": ["service_inactive"], "timeout_seconds": 900}}
    health = {"ezra": {"healthy_now": False}, "timmy": {"healthy_now": True}}
    policy = {
        "default_policy": {"mode": "ask"},
        "agents": {"ezra": {"mode": "ask", "mission": "archive"}},
        "substitutions": {"ezra": ["timmy"]},
        "approval_channels": {
            "telegram": {"enabled": True, "target": "ops-room"},
            "nostr": {"enabled": True, "target": "nostr-ops"},
        },
    }
    plan = pool.plan_resurrections(registry, downed, health, policy, now_ts=3000.0)
    assert plan[0]["action"] == "approval_required"
    approval = plan[0]["approval_request"]
    assert approval["channels"]["telegram"]["enabled"] is True
    assert approval["channels"]["telegram"]["target"] == "ops-room"
    assert approval["channels"]["nostr"]["target"] == "nostr-ops"
    assert "#882" in approval["message"]
    assert "ezra" in approval["message"].lower()
    assert approval["substitute"] == "timmy"