Compare commits


4 Commits

Author SHA1 Message Date
Timmy Bot
4cfd1c2e10 Merge remote main + feedback on EPIC-202 2026-04-06 02:21:50 +00:00
Timmy Bot
a9ad1c8137 feedback: Allegro cross-epic review on EPIC-202 (claw-agent)
- Health: Yellow. Blocker: Gitea firewalled + no Primus RCA.
- Adds pre-flight checklist before Phase 1 start.
2026-04-06 02:20:55 +00:00
Timmy Bot
9952ce180c feat(uniwizard): standardized Tailscale IP detection module (timmy-home#385)
Create reusable tailscale-gitea.sh module for all auxiliary scripts:
- Automatically detects Tailscale (100.126.61.75) vs public IP (143.198.27.163)
- Sets GITEA_BASE_URL and GITEA_USING_TAILSCALE for sourcing scripts
- Configurable timeout, debug mode, and endpoint settings
- Maintains sovereignty: prefers private Tailscale network

Updated scripts:
- kimi-heartbeat.sh: now sources the module
- kimi-mention-watcher.sh: added fallback support via module

Files added:
- uniwizard/lib/tailscale-gitea.sh (reusable module)
- uniwizard/lib/example-usage.sh (usage documentation)

Acceptance criteria:
✓ Reusable module created and sourceable
✓ kimi-heartbeat.sh updated
✓ kimi-mention-watcher.sh updated (added fallback support)
✓ Example usage script provided
2026-04-05 07:07:05 +00:00
Timmy Bot
64a954f4d9 Enhance Kimi heartbeat with Nexus Watchdog alerting for stale lockfiles (#386)
- Add nexus_alert() function to send alerts to Nexus Watchdog
- Alerts are written as JSON files to $NEXUS_ALERT_DIR (default: /tmp/nexus-alerts)
- Alert includes: alert_id, timestamp, source, host, alert_type, severity, message, data
- Send 'stale_lock_reclaimed' warning alert when stale lock detected (age > 600s)
- Send 'heartbeat_resumed' info alert after successful recovery
- Include lock age, lockfile path, action taken, and stat info in alert data
- Add configurable NEXUS_ALERT_DIR and NEXUS_ALERT_ENABLED settings
- Add test script for validating alert functionality
2026-04-05 07:04:57 +00:00
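The stale-lock rule described in 64a954f4d9 (reclaim when the lock is older than 600s) might look roughly like the sketch below. The heartbeat script itself is not shown in this compare view, so the names `lock_age_seconds`, `is_stale_lock`, and `STALE_LOCK_THRESHOLD`, as well as the GNU-then-BSD `stat` fallback, are assumptions rather than the committed code.

```shell
#!/bin/bash
# Hypothetical sketch: NOT the code from kimi-heartbeat.sh.
STALE_LOCK_THRESHOLD="${STALE_LOCK_THRESHOLD:-600}"  # seconds, per the commit message

# Print the age of a file in seconds (tries GNU stat, then BSD/macOS stat).
lock_age_seconds() {
    local lockfile="$1" mtime now
    mtime=$(stat -c %Y "$lockfile" 2>/dev/null) \
        || mtime=$(stat -f %m "$lockfile" 2>/dev/null) \
        || return 1
    now=$(date +%s)
    echo $(( now - mtime ))
}

# Succeed only when the lockfile exists and is older than the threshold.
is_stale_lock() {
    local lockfile="$1" age
    [ -f "$lockfile" ] || return 1
    age=$(lock_age_seconds "$lockfile") || return 1
    [ "$age" -gt "$STALE_LOCK_THRESHOLD" ]
}
```

A caller could then gate the reclaim step on `is_stale_lock "/tmp/kimi-heartbeat.lock"` before removing the file and emitting the `stale_lock_reclaimed` alert.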
5 changed files with 295 additions and 1 deletion


@@ -136,3 +136,27 @@ def build_bootstrap_graph() -> Graph:
---
*This epic supersedes Allegro-Primus, who has been idle.*
---
## Feedback — 2026-04-06 (Allegro Cross-Epic Review)
**Health:** 🟡 Yellow
**Blocker:** Gitea externally firewalled + no Allegro-Primus RCA
### Critical Issues
1. **Dependency blindness.** Every Claw Code reference points to `143.198.27.163:3000`, which is currently firewalled and unreachable from this VM. If the mirror is not locally cached, development is blocked on external infrastructure.
2. **Root cause vs. replacement.** The epic jumps to "replace Allegro-Primus" without proving he is unfixable. Primus being idle could be the same provider/auth outage that took down Ezra and Bezalel. A 5-line RCA should precede a 5-phase rewrite.
3. **Timeline fantasy.** "Phase 1: 2 days" assumes stable infrastructure. Current reality: Gitea externally firewalled, Bezalel VPS down, Ezra needs webhook switch. This epic needs a "Blocked Until" section.
4. **Resource stalemate.** "Telegram bot: Need @BotFather" — the fleet already operates multiple bots. Reuse an existing bot profile or document why a new one is required.
### Recommended Action
Add a **Pre-Flight Checklist** to the epic:
- [ ] Verify Gitea/Claw Code mirror is reachable from the build VM
- [ ] Publish 1-paragraph RCA on why Allegro-Primus is idle
- [ ] Confirm target repo for the new agent code
Do not start Phase 1 until all three are checked.
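
The first checklist item could be automated with a small probe such as the sketch below. It reuses the `/version` check that `tailscale-gitea.sh` performs; the function names `gitea_reachable` and `preflight_gitea` and the timeout defaults are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical pre-flight helper; names and defaults are assumptions.

# Succeed when the Gitea API at base URL $1 answers /version within the timeout.
gitea_reachable() {
    local base_url="$1" timeout="${2:-2}"
    curl -sf --connect-timeout "$timeout" --max-time "$(( timeout * 2 ))" \
        "${base_url}/version" > /dev/null 2>&1
}

# Print a checklist-style line and return non-zero when the probe fails.
preflight_gitea() {
    local base_url="$1"
    if gitea_reachable "$base_url"; then
        echo "[x] Gitea reachable at $base_url"
    else
        echo "[ ] Gitea NOT reachable at $base_url"
        return 1
    fi
}
```

Running `preflight_gitea "http://143.198.27.163:3000/api/v1"` from the build VM would then give a pass/fail line that can be pasted into the epic.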

tests/test_nexus_alert.sh Executable file

@@ -0,0 +1,146 @@
#!/bin/bash
# Test script for Nexus Watchdog alerting functionality
set -euo pipefail
TEST_DIR="/tmp/test-nexus-alerts-$$"
export NEXUS_ALERT_DIR="$TEST_DIR"
export NEXUS_ALERT_ENABLED=true
echo "=== Nexus Watchdog Alert Test ==="
echo "Test alert directory: $TEST_DIR"
# Source the alert function from the heartbeat script
# Extract just the nexus_alert function for testing
cat > /tmp/test_alert_func.sh << 'ALEOF'
#!/bin/bash
NEXUS_ALERT_DIR="${NEXUS_ALERT_DIR:-/tmp/nexus-alerts}"
NEXUS_ALERT_ENABLED=true
HOSTNAME=$(hostname -s 2>/dev/null || echo "unknown")
SCRIPT_NAME="kimi-heartbeat-test"
nexus_alert() {
local alert_type="$1"
local message="$2"
local severity="${3:-info}"
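# Note: "${4:-{}}" would append a stray "}" whenever $4 is set, so default separately
local extra_data="${4:-}"
[ -n "$extra_data" ] || extra_data='{}'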
if [ "$NEXUS_ALERT_ENABLED" != "true" ]; then
return 0
fi
mkdir -p "$NEXUS_ALERT_DIR" 2>/dev/null || return 0
local timestamp
timestamp=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
local nanoseconds=$(date +%N 2>/dev/null || echo "$$")
local alert_id="${SCRIPT_NAME}_$(date +%s)_${nanoseconds}_$$"
local alert_file="$NEXUS_ALERT_DIR/${alert_id}.json"
cat > "$alert_file" << EOF
{
"alert_id": "$alert_id",
"timestamp": "$timestamp",
"source": "$SCRIPT_NAME",
"host": "$HOSTNAME",
"alert_type": "$alert_type",
"severity": "$severity",
"message": "$message",
"data": $extra_data
}
EOF
if [ -f "$alert_file" ]; then
echo "NEXUS_ALERT: $alert_type [$severity] - $message"
return 0
else
echo "NEXUS_ALERT_FAILED: Could not write alert"
return 1
fi
}
ALEOF
source /tmp/test_alert_func.sh
# Test 1: Basic alert
echo -e "\n[TEST 1] Sending basic info alert..."
nexus_alert "test_alert" "Test message from heartbeat" "info" '{"test": true}'
# Test 2: Stale lock alert simulation
echo -e "\n[TEST 2] Sending stale lock alert..."
nexus_alert \
"stale_lock_reclaimed" \
"Stale lockfile deadlock cleared after 650s" \
"warning" \
'{"lock_age_seconds": 650, "lockfile": "/tmp/kimi-heartbeat.lock", "action": "removed"}'
# Test 3: Heartbeat resumed alert
echo -e "\n[TEST 3] Sending heartbeat resumed alert..."
nexus_alert \
"heartbeat_resumed" \
"Kimi heartbeat resumed after clearing stale lock" \
"info" \
'{"recovery": "successful", "continuing": true}'
# Check results
echo -e "\n=== Alert Files Created ==="
alert_count=$(find "$TEST_DIR" -name "*.json" 2>/dev/null | wc -l)
echo "Total alert files: $alert_count"
if [ "$alert_count" -eq 3 ]; then
echo "✅ All 3 alerts were created successfully"
else
echo "❌ Expected 3 alerts, found $alert_count"
exit 1
fi
echo -e "\n=== Alert Contents ==="
for f in "$TEST_DIR"/*.json; do
echo -e "\n--- $(basename "$f") ---"
python3 -m json.tool "$f" 2>/dev/null || cat "$f"
done
# Validate JSON structure
echo -e "\n=== JSON Validation ==="
all_valid=true
for f in "$TEST_DIR"/*.json; do
if python3 -c 'import json, sys; json.load(open(sys.argv[1]))' "$f" 2>/dev/null; then
echo "$(basename "$f") - Valid JSON"
else
echo "$(basename "$f") - Invalid JSON"
all_valid=false
fi
done
# Check for required fields
echo -e "\n=== Required Fields Check ==="
for f in "$TEST_DIR"/*.json; do
name=$(basename "$f")
missing=()
# Skip files that are not valid JSON; the validation loop above already reported them
python3 -c 'import json, sys; json.load(open(sys.argv[1]))' "$f" 2>/dev/null || continue
for field in alert_id timestamp source host alert_type severity message data; do
# Pass the path and field as arguments instead of interpolating them into Python source
if ! python3 -c 'import json, sys; d = json.load(open(sys.argv[1])); sys.exit(0 if sys.argv[2] in d else 1)' "$f" "$field" 2>/dev/null; then
missing+=("$field")
fi
done
if [ ${#missing[@]} -eq 0 ]; then
echo "$name - All required fields present"
else
echo "$name - Missing fields: ${missing[*]}"
all_valid=false
fi
done
# Cleanup
rm -rf "$TEST_DIR" /tmp/test_alert_func.sh
echo -e "\n=== Test Summary ==="
if [ "$all_valid" = true ]; then
echo "✅ All tests passed!"
exit 0
else
echo "❌ Some tests failed"
exit 1
fi
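
The heredoc in `nexus_alert` interpolates `$message` directly, so a message containing a double quote or a backslash would produce invalid JSON. One possible way to guard against that, sketched here with the `python3` interpreter the test already relies on, is to encode the string before writing it. The helper name `json_string` is hypothetical and not part of the commit.

```shell
#!/bin/bash
# Hypothetical helper, not part of the commit:
# encode a shell string as a JSON string literal (including the surrounding quotes).
json_string() {
    python3 -c 'import json, sys; print(json.dumps(sys.argv[1]))' "$1"
}

message='Lock "kimi-heartbeat.lock" cleared'
# json_string already emits the quotes, so the template uses a bare %s.
printf '{"message": %s}\n' "$(json_string "$message")"
# prints {"message": "Lock \"kimi-heartbeat.lock\" cleared"}
```

With this approach the heredoc would use `"message": $(json_string "$message")` without surrounding quotes, at the cost of one extra `python3` call per alert.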


@@ -5,7 +5,12 @@
set -euo pipefail
KIMI_TOKEN=$(cat /Users/apayne/.timmy/kimi_gitea_token | tr -d '[:space:]')
BASE="http://100.126.61.75:3000/api/v1"
# --- Tailscale/IP Detection (timmy-home#385) ---
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/lib/tailscale-gitea.sh"
BASE="$GITEA_BASE_URL"
LOG="/tmp/kimi-mentions.log"
PROCESSED="/tmp/kimi-mentions-processed.txt"


@@ -0,0 +1,55 @@
#!/bin/bash
# example-usage.sh — Example showing how to use the tailscale-gitea module
# Issue: timmy-home#385 — Standardized Tailscale IP detection module
set -euo pipefail
# --- Basic Usage ---
# Source the module to automatically set GITEA_BASE_URL
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/tailscale-gitea.sh"
# Now use GITEA_BASE_URL in your API calls
echo "Using Gitea at: $GITEA_BASE_URL"
echo "Tailscale active: $GITEA_USING_TAILSCALE"
# --- Example API Call ---
# curl -sf -H "Authorization: token $TOKEN" \
# "$GITEA_BASE_URL/repos/myuser/myrepo/issues"
# --- Custom Configuration (Optional) ---
# You can customize behavior by setting variables BEFORE sourcing:
#
# TAILSCALE_TIMEOUT=5 # Wait 5 seconds instead of 2
# TAILSCALE_DEBUG=1 # Print which endpoint was selected
# source "${SCRIPT_DIR}/tailscale-gitea.sh"
# --- Advanced: Checking Network Mode ---
if [[ "$GITEA_USING_TAILSCALE" == "true" ]]; then
echo "✓ Connected via private Tailscale network"
else
echo "⚠ Using public internet fallback (Tailscale unavailable)"
fi
# --- Example: Polling with Retry Logic ---
poll_gitea() {
local endpoint="${1:-$GITEA_BASE_URL}"
local max_retries="${2:-3}"
local retry=0
while [[ $retry -lt $max_retries ]]; do
if curl -sf --connect-timeout 2 "${endpoint}/version" > /dev/null 2>&1; then
echo "Gitea is reachable"
return 0
fi
retry=$((retry + 1))
echo "Retry $retry/$max_retries..."
sleep 1
done
echo "Gitea unreachable after $max_retries attempts"
return 1
}
# Uncomment to test connectivity:
# poll_gitea "$GITEA_BASE_URL"


@@ -0,0 +1,64 @@
#!/bin/bash
# tailscale-gitea.sh — Standardized Tailscale IP detection module for Gitea API access
# Issue: timmy-home#385 — Standardize Tailscale IP detection across auxiliary scripts
#
# Usage (source this file in your script):
# source /path/to/tailscale-gitea.sh
# # Now use $GITEA_BASE_URL for API calls
#
# Configuration (set before sourcing to customize):
# TAILSCALE_IP - Tailscale IP to try first (default: 100.126.61.75)
# PUBLIC_IP - Public fallback IP (default: 143.198.27.163)
# GITEA_PORT - Gitea API port (default: 3000)
# TAILSCALE_TIMEOUT - Connection timeout in seconds (default: 2)
# GITEA_API_VERSION - API version path (default: api/v1)
#
# Sovereignty: Private Tailscale network preferred over public internet
# --- Default Configuration ---
: "${TAILSCALE_IP:=100.126.61.75}"
: "${PUBLIC_IP:=143.198.27.163}"
: "${GITEA_PORT:=3000}"
: "${TAILSCALE_TIMEOUT:=2}"
: "${GITEA_API_VERSION:=api/v1}"
# --- Detection Function ---
_detect_gitea_endpoint() {
local tailscale_url="http://${TAILSCALE_IP}:${GITEA_PORT}/${GITEA_API_VERSION}"
local public_url="http://${PUBLIC_IP}:${GITEA_PORT}/${GITEA_API_VERSION}"
# Prefer Tailscale (private network) over public IP
if curl -sf --connect-timeout "$TAILSCALE_TIMEOUT" \
"${tailscale_url}/version" > /dev/null 2>&1; then
echo "$tailscale_url"
return 0
else
echo "$public_url"
return 1
fi
}
# --- Main Detection ---
# Set GITEA_BASE_URL for use by sourcing scripts
# Also sets GITEA_USING_TAILSCALE=true/false for scripts that need to know
# _detect_gitea_endpoint prints the chosen URL and returns 0 only when Tailscale answered
if GITEA_BASE_URL="$(_detect_gitea_endpoint)"; then
GITEA_USING_TAILSCALE=true
else
GITEA_USING_TAILSCALE=false
fi
# Export for child processes
export GITEA_BASE_URL
export GITEA_USING_TAILSCALE
# Optional: log which endpoint was selected (set TAILSCALE_DEBUG=1 to enable)
if [[ "${TAILSCALE_DEBUG:-0}" == "1" ]]; then
if [[ "$GITEA_USING_TAILSCALE" == "true" ]]; then
echo "[tailscale-gitea] Using Tailscale endpoint: $GITEA_BASE_URL" >&2
else
echo "[tailscale-gitea] Tailscale unavailable, using public endpoint: $GITEA_BASE_URL" >&2
fi
fi
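
Because the module probes the network at source time, its fallback path is hard to exercise offline. The sketch below shows the same prefer-private-then-fallback rule with an injectable probe so it can be tested without `curl`. The function `choose_gitea_url` and the stub probes are hypothetical and do not appear in the commit.

```shell
#!/bin/bash
# Hypothetical sketch: same selection rule as tailscale-gitea.sh, with an injectable probe.

# choose_gitea_url PROBE_CMD PRIVATE_URL PUBLIC_URL
# Prints PRIVATE_URL and returns 0 when "PROBE_CMD PRIVATE_URL" succeeds;
# otherwise prints PUBLIC_URL and returns 1.
choose_gitea_url() {
    local probe="$1" private_url="$2" public_url="$3"
    if "$probe" "$private_url"; then
        echo "$private_url"
        return 0
    fi
    echo "$public_url"
    return 1
}

# Stub probes standing in for the curl check against /version.
probe_up()   { return 0; }
probe_down() { return 1; }

# Calling inside "if" keeps the non-zero fallback status from tripping "set -e".
if url=$(choose_gitea_url probe_down \
        "http://100.126.61.75:3000/api/v1" "http://143.198.27.163:3000/api/v1"); then
    echo "tailscale: $url"
else
    echo "public: $url"
fi
# prints public: http://143.198.27.163:3000/api/v1
```

Swapping `probe_down` for `probe_up` selects the Tailscale URL instead, which makes both branches checkable in a unit test.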