Merge remote main + feedback on EPIC-202

feedback: Allegro cross-epic review on EPIC-202 (claw-agent)
- Health: Yellow. Blocker: Gitea firewalled + no Primus RCA.
- Adds pre-flight checklist before Phase 1 start.
2026-04-06 02:21:50 +00:00 · 2026-04-06 02:20:55 +00:00 · 2026-04-05 07:07:05 +00:00 · 2026-04-05 07:04:57 +00:00
5 changed files with 295 additions and 1 deletion
--- a/epics/EPIC-202-claw-agent.md
+++ b/epics/EPIC-202-claw-agent.md
@@ -136,3 +136,27 @@ def build_bootstrap_graph() -> Graph:
 ---

 *This epic supersedes Allegro-Primus who has been idle.*
+
+---
+
+## Feedback — 2026-04-06 (Allegro Cross-Epic Review)
+
+**Health:** 🟡 Yellow  
+**Blocker:** Gitea externally firewalled + no Allegro-Primus RCA
+
+### Critical Issues
+
+1. **Dependency blindness.** Every Claw Code reference points to `143.198.27.163:3000`, which is currently firewalled and unreachable from this VM. If the mirror is not locally cached, development is blocked on external infrastructure.
+2. **Root cause vs. replacement.** The epic jumps to "replace Allegro-Primus" without proving he is unfixable. Primus being idle could be the same provider/auth outage that took down Ezra and Bezalel. A 5-line RCA should precede a 5-phase rewrite.
+3. **Timeline fantasy.** "Phase 1: 2 days" assumes stable infrastructure. Current reality: Gitea externally firewalled, Bezalel VPS down, Ezra needs webhook switch. This epic needs a "Blocked Until" section.
+4. **Resource stalemate.** "Telegram bot: Need @BotFather" — the fleet already operates multiple bots. Reuse an existing bot profile or document why a new one is required.
+
+### Recommended Action
+
+Add a **Pre-Flight Checklist** to the epic:
+- [ ] Verify Gitea/Claw Code mirror is reachable from the build VM
+- [ ] Publish 1-paragraph RCA on why Allegro-Primus is idle
+- [ ] Confirm target repo for the new agent code
+
+Do not start Phase 1 until all three are checked.
+
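The three checklist items above could be gated by a small script. The following is a minimal sketch, not part of the epic: the function names `report` and `preflight`, the idea that the RCA lives in a file, and the argument order are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical pre-flight gate for EPIC-202; every name here is illustrative.

# Print one checklist line: "[x]" when status is 0, "[ ]" otherwise.
report() {
  local status="$1" label="$2"
  if [ "$status" -eq 0 ]; then
    echo "[x] $label"
  else
    echo "[ ] $label"
  fi
}

# Run the three checks; return 0 only if all of them pass.
#   $1 mirror URL to probe, $2 path to the RCA file, $3 target repo name
preflight() {
  local mirror_url="$1" rca_file="$2" target_repo="$3"
  local failed=0 status

  status=0
  curl -sf --connect-timeout 2 "$mirror_url" > /dev/null 2>&1 || status=$?
  report "$status" "Mirror reachable: $mirror_url"
  [ "$status" -eq 0 ] || failed=1

  status=0
  [ -s "$rca_file" ] || status=1   # the RCA counts only if the file is non-empty
  report "$status" "RCA published: ${rca_file:-<none>}"
  [ "$status" -eq 0 ] || failed=1

  status=0
  [ -n "$target_repo" ] || status=1
  report "$status" "Target repo confirmed: ${target_repo:-<none>}"
  [ "$status" -eq 0 ] || failed=1

  return "$failed"
}
```

A Phase 1 entry script might call `preflight "$MIRROR_URL" "$RCA_FILE" "$TARGET_REPO" || exit 1`, where all three variables are placeholders to be filled in by whoever owns the epic.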
--- a/tests/test_nexus_alert.sh
+++ b/tests/test_nexus_alert.sh
@@ -0,0 +1,146 @@
+#!/bin/bash
+# Test script for Nexus Watchdog alerting functionality
+
+set -euo pipefail
+
+TEST_DIR="/tmp/test-nexus-alerts-$$"
+export NEXUS_ALERT_DIR="$TEST_DIR"
+export NEXUS_ALERT_ENABLED=true
+
+echo "=== Nexus Watchdog Alert Test ==="
+echo "Test alert directory: $TEST_DIR"
+
+# Define a standalone copy of nexus_alert for testing: write it to a temp file
+# and source it below (intended to mirror the function in the heartbeat script)
+cat > /tmp/test_alert_func.sh << 'ALEOF'
+#!/bin/bash
+NEXUS_ALERT_DIR="${NEXUS_ALERT_DIR:-/tmp/nexus-alerts}"
+NEXUS_ALERT_ENABLED=true
+HOSTNAME=$(hostname -s 2>/dev/null || echo "unknown")
+SCRIPT_NAME="kimi-heartbeat-test"
+
+nexus_alert() {
+  local alert_type="$1"
+  local message="$2"
+  local severity="${3:-info}"
+  local extra_data="${4:-}"; [ -n "$extra_data" ] || extra_data='{}'  # "${4:-{}}" would append a stray "}" to a supplied value
+  
+  if [ "$NEXUS_ALERT_ENABLED" != "true" ]; then
+    return 0
+  fi
+  
+  mkdir -p "$NEXUS_ALERT_DIR" 2>/dev/null || return 0
+  
+  local timestamp
+  timestamp=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
+  local nanoseconds=$(date +%N 2>/dev/null || echo "$$")
+  local alert_id="${SCRIPT_NAME}_$(date +%s)_${nanoseconds}_$$"
+  local alert_file="$NEXUS_ALERT_DIR/${alert_id}.json"
+  
+  cat > "$alert_file" << EOF
+{
+  "alert_id": "$alert_id",
+  "timestamp": "$timestamp",
+  "source": "$SCRIPT_NAME",
+  "host": "$HOSTNAME",
+  "alert_type": "$alert_type",
+  "severity": "$severity",
+  "message": "$message",
+  "data": $extra_data
+}
+EOF
+  
+  if [ -f "$alert_file" ]; then
+    echo "NEXUS_ALERT: $alert_type [$severity] - $message"
+    return 0
+  else
+    echo "NEXUS_ALERT_FAILED: Could not write alert"
+    return 1
+  fi
+}
+ALEOF
+
+source /tmp/test_alert_func.sh
+
+# Test 1: Basic alert
+echo -e "\n[TEST 1] Sending basic info alert..."
+nexus_alert "test_alert" "Test message from heartbeat" "info" '{"test": true}'
+
+# Test 2: Stale lock alert simulation
+echo -e "\n[TEST 2] Sending stale lock alert..."
+nexus_alert \
+  "stale_lock_reclaimed" \
+  "Stale lockfile deadlock cleared after 650s" \
+  "warning" \
+  '{"lock_age_seconds": 650, "lockfile": "/tmp/kimi-heartbeat.lock", "action": "removed"}'
+
+# Test 3: Heartbeat resumed alert
+echo -e "\n[TEST 3] Sending heartbeat resumed alert..."
+nexus_alert \
+  "heartbeat_resumed" \
+  "Kimi heartbeat resumed after clearing stale lock" \
+  "info" \
+  '{"recovery": "successful", "continuing": true}'
+
+# Check results
+echo -e "\n=== Alert Files Created ==="
+alert_count=$(find "$TEST_DIR" -name "*.json" 2>/dev/null | wc -l)
+echo "Total alert files: $alert_count"
+
+if [ "$alert_count" -eq 3 ]; then
+  echo "✅ All 3 alerts were created successfully"
+else
+  echo "❌ Expected 3 alerts, found $alert_count"
+  exit 1
+fi
+
+echo -e "\n=== Alert Contents ==="
+for f in "$TEST_DIR"/*.json; do
+  echo -e "\n--- $(basename "$f") ---"
+  python3 -m json.tool "$f" 2>/dev/null || cat "$f"
+done
+
+# Validate JSON structure
+echo -e "\n=== JSON Validation ==="
+all_valid=true
+for f in "$TEST_DIR"/*.json; do
+  if python3 -c "import json; json.load(open('$f'))" 2>/dev/null; then
+    echo "✅ $(basename "$f") - Valid JSON"
+  else
+    echo "❌ $(basename "$f") - Invalid JSON"
+    all_valid=false
+  fi
+done
+
+# Check for required fields
+echo -e "\n=== Required Fields Check ==="
+for f in "$TEST_DIR"/*.json; do
+  basename=$(basename "$f")
+  missing=()
+  python3 -c "import json; d=json.load(open('$f'))" 2>/dev/null || continue
+  
+  for field in alert_id timestamp source host alert_type severity message data; do
+    if ! python3 -c "import json; d=json.load(open('$f')); exit(0 if '$field' in d else 1)" 2>/dev/null; then
+      missing+=("$field")
+    fi
+  done
+  
+  if [ ${#missing[@]} -eq 0 ]; then
+    echo "✅ $basename - All required fields present"
+  else
+    echo "❌ $basename - Missing fields: ${missing[*]}"
+    all_valid=false
+  fi
+done
+
+# Cleanup
+rm -rf "$TEST_DIR" /tmp/test_alert_func.sh
+
+echo -e "\n=== Test Summary ==="
+if [ "$all_valid" = true ]; then
+  echo "✅ All tests passed!"
+  exit 0
+else
+  echo "❌ Some tests failed"
+  exit 1
+fi
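The `nexus_alert` heredoc in this test interpolates `$message` straight into a JSON string, so a message containing a double quote or backslash would produce an invalid alert file. One possible hardening is to escape the value first. The helper below is a sketch under that assumption; `json_escape` is a hypothetical name and is not part of the committed script.

```shell
#!/bin/bash
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Covers backslash, double quote, newline, carriage return, and tab; other
# control characters are out of scope for this sketch.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}      # backslash first, so later escapes are not doubled
  s=${s//\"/\\\"}      # double quote
  s=${s//$'\n'/\\n}    # newline
  s=${s//$'\r'/\\r}    # carriage return
  s=${s//$'\t'/\\t}    # tab
  printf '%s' "$s"
}

# Example: build a one-field JSON object from an awkward message.
printf '{"message": "%s"}\n' "$(json_escape 'Lock "stale" at C:\tmp')"
# prints {"message": "Lock \"stale\" at C:\\tmp"}
```

Inside `nexus_alert`, the same call could wrap `$message` and `$alert_type` before they reach the heredoc; `$extra_data` is already expected to be raw JSON and would be left as-is.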
--- a/uniwizard/kimi-mention-watcher.sh
+++ b/uniwizard/kimi-mention-watcher.sh
@@ -5,7 +5,12 @@
 set -euo pipefail

 KIMI_TOKEN=$(cat /Users/apayne/.timmy/kimi_gitea_token | tr -d '[:space:]')
-BASE="http://100.126.61.75:3000/api/v1"
+
+# --- Tailscale/IP Detection (timmy-home#385) ---
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+source "${SCRIPT_DIR}/lib/tailscale-gitea.sh"
+BASE="$GITEA_BASE_URL"
+
 LOG="/tmp/kimi-mentions.log"
 PROCESSED="/tmp/kimi-mentions-processed.txt"

--- a/uniwizard/lib/example-usage.sh
+++ b/uniwizard/lib/example-usage.sh
@@ -0,0 +1,55 @@
+#!/bin/bash
+# example-usage.sh — Example showing how to use the tailscale-gitea module
+# Issue: timmy-home#385 — Standardized Tailscale IP detection module
+
+set -euo pipefail
+
+# --- Basic Usage ---
+# Source the module to automatically set GITEA_BASE_URL
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+source "${SCRIPT_DIR}/tailscale-gitea.sh"
+
+# Now use GITEA_BASE_URL in your API calls
+echo "Using Gitea at: $GITEA_BASE_URL"
+echo "Tailscale active: $GITEA_USING_TAILSCALE"
+
+# --- Example API Call ---
+# curl -sf -H "Authorization: token $TOKEN" \
+#   "$GITEA_BASE_URL/repos/myuser/myrepo/issues"
+
+# --- Custom Configuration (Optional) ---
+# You can customize behavior by setting variables BEFORE sourcing:
+#
+#   TAILSCALE_TIMEOUT=5    # Wait 5 seconds instead of 2
+#   TAILSCALE_DEBUG=1      # Print which endpoint was selected
+#   source "${SCRIPT_DIR}/tailscale-gitea.sh"
+
+# --- Advanced: Checking Network Mode ---
+if [[ "$GITEA_USING_TAILSCALE" == "true" ]]; then
+  echo "✓ Connected via private Tailscale network"
+else
+  echo "⚠ Using public internet fallback (Tailscale unavailable)"
+fi
+
+# --- Example: Polling with Retry Logic ---
+poll_gitea() {
+  local endpoint="${1:-$GITEA_BASE_URL}"
+  local max_retries="${2:-3}"
+  local retry=0
+
+  while [[ $retry -lt $max_retries ]]; do
+    if curl -sf --connect-timeout 2 "${endpoint}/version" > /dev/null 2>&1; then
+      echo "Gitea is reachable"
+      return 0
+    fi
+    retry=$((retry + 1))
+    echo "Retry $retry/$max_retries..."
+    sleep 1
+  done
+
+  echo "Gitea unreachable after $max_retries attempts"
+  return 1
+}
+
+# Uncomment to test connectivity:
+# poll_gitea "$GITEA_BASE_URL"
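The commented-out API call in this example can be wrapped in a small function once `GITEA_BASE_URL` is set. This is a sketch only: `gitea_get` and the `GITEA_TOKEN` variable are hypothetical names, and the `Authorization: token` header simply follows the form used in the example above.

```shell
#!/bin/bash
# Hypothetical wrapper: GET a path under $GITEA_BASE_URL with token auth.
# Returns 2 if the required variables are missing, otherwise curl's status.
gitea_get() {
  local path="${1#/}"   # tolerate a leading slash in the requested path
  if [ -z "${GITEA_BASE_URL:-}" ] || [ -z "${GITEA_TOKEN:-}" ]; then
    echo "gitea_get: GITEA_BASE_URL and GITEA_TOKEN must be set" >&2
    return 2
  fi
  curl -sf --connect-timeout 2 \
    -H "Authorization: token ${GITEA_TOKEN}" \
    "${GITEA_BASE_URL}/${path}"
}
```

After sourcing `tailscale-gitea.sh`, a caller could run `gitea_get repos/myuser/myrepo/issues`; the explicit variable check returns an error instead of letting `set -u` abort the whole script.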
--- a/uniwizard/lib/tailscale-gitea.sh
+++ b/uniwizard/lib/tailscale-gitea.sh
@@ -0,0 +1,64 @@
+#!/bin/bash
+# tailscale-gitea.sh — Standardized Tailscale IP detection module for Gitea API access
+# Issue: timmy-home#385 — Standardize Tailscale IP detection across auxiliary scripts
+#
+# Usage (source this file in your script):
+#   source /path/to/tailscale-gitea.sh
+#   # Now use $GITEA_BASE_URL for API calls
+#
+# Configuration (set before sourcing to customize):
+#   TAILSCALE_IP       - Tailscale IP to try first (default: 100.126.61.75)
+#   PUBLIC_IP          - Public fallback IP (default: 143.198.27.163)
+#   GITEA_PORT         - Gitea API port (default: 3000)
+#   TAILSCALE_TIMEOUT  - Connection timeout in seconds (default: 2)
+#   GITEA_API_VERSION  - API version path (default: api/v1)
+#
+# Sovereignty: Private Tailscale network preferred over public internet
+
+# --- Default Configuration ---
+: "${TAILSCALE_IP:=100.126.61.75}"
+: "${PUBLIC_IP:=143.198.27.163}"
+: "${GITEA_PORT:=3000}"
+: "${TAILSCALE_TIMEOUT:=2}"
+: "${GITEA_API_VERSION:=api/v1}"
+
+# --- Detection Function ---
+_detect_gitea_endpoint() {
+    local tailscale_url="http://${TAILSCALE_IP}:${GITEA_PORT}/${GITEA_API_VERSION}"
+    local public_url="http://${PUBLIC_IP}:${GITEA_PORT}/${GITEA_API_VERSION}"
+
+    # Prefer Tailscale (private network) over public IP
+    if curl -sf --connect-timeout "$TAILSCALE_TIMEOUT" \
+        "${tailscale_url}/version" > /dev/null 2>&1; then
+        echo "$tailscale_url"
+        return 0
+    else
+        echo "$public_url"
+        return 1
+    fi
+}
+
+# --- Main Detection ---
+# Set GITEA_BASE_URL for use by sourcing scripts
+# Also sets GITEA_USING_TAILSCALE=true/false for scripts that need to know
+# Reuse _detect_gitea_endpoint so the probe logic lives in one place.
+# It prints the selected URL and returns 0 for Tailscale, 1 for the public
+# fallback; testing it in an `if` keeps the non-zero status safe under `set -e`.
+if GITEA_BASE_URL="$(_detect_gitea_endpoint)"; then
+    GITEA_USING_TAILSCALE=true
+else
+    GITEA_USING_TAILSCALE=false
+fi
+
+# Export for child processes
+export GITEA_BASE_URL
+export GITEA_USING_TAILSCALE
+
+# Optional: log which endpoint was selected (set TAILSCALE_DEBUG=1 to enable)
+if [[ "${TAILSCALE_DEBUG:-0}" == "1" ]]; then
+    if [[ "$GITEA_USING_TAILSCALE" == "true" ]]; then
+        echo "[tailscale-gitea] Using Tailscale endpoint: $GITEA_BASE_URL" >&2
+    else
+        echo "[tailscale-gitea] Tailscale unavailable, using public endpoint: $GITEA_BASE_URL" >&2
+    fi
+fi
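The module's defaults use the `: "${VAR:=default}"` idiom, which assigns only when the variable is unset or empty. That is why overrides have to be set before the file is sourced. The sketch below isolates that behaviour; `build_gitea_url` is a hypothetical name used only for this illustration.

```shell
#!/bin/bash
# Sketch of the ":=" default idiom; build_gitea_url is a hypothetical name.

build_gitea_url() {
  # ":" is a no-op command; the expansion assigns the default as a side
  # effect, but only when the variable is unset or empty.
  : "${TAILSCALE_IP:=100.126.61.75}"
  : "${GITEA_PORT:=3000}"
  : "${GITEA_API_VERSION:=api/v1}"
  echo "http://${TAILSCALE_IP}:${GITEA_PORT}/${GITEA_API_VERSION}"
}

# Each case runs in a subshell so assignments do not leak between them.
( unset TAILSCALE_IP GITEA_PORT GITEA_API_VERSION; build_gitea_url )
# prints http://100.126.61.75:3000/api/v1

( unset TAILSCALE_IP GITEA_API_VERSION; GITEA_PORT=8080; build_gitea_url )
# prints http://100.126.61.75:8080/api/v1
```

Because the colon form also treats an empty string as unset, exporting `GITEA_PORT=""` falls back to the default rather than producing a URL with no port.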
Author	SHA1	Message	Date
Timmy Bot	4cfd1c2e10	Merge remote main + feedback on EPIC-202	2026-04-06 02:21:50 +00:00
Timmy Bot	a9ad1c8137	feedback: Allegro cross-epic review on EPIC-202 (claw-agent) - Health: Yellow. Blocker: Gitea firewalled + no Primus RCA. - Adds pre-flight checklist before Phase 1 start.	2026-04-06 02:20:55 +00:00
Timmy Bot	9952ce180c	feat(uniwizard): standardized Tailscale IP detection module (timmy-home#385) Create reusable tailscale-gitea.sh module for all auxiliary scripts: - Automatically detects Tailscale (100.126.61.75) vs public IP (143.198.27.163) - Sets GITEA_BASE_URL and GITEA_USING_TAILSCALE for sourcing scripts - Configurable timeout, debug mode, and endpoint settings - Maintains sovereignty: prefers private Tailscale network Updated scripts: - kimi-heartbeat.sh: now sources the module - kimi-mention-watcher.sh: added fallback support via module Files added: - uniwizard/lib/tailscale-gitea.sh (reusable module) - uniwizard/lib/example-usage.sh (usage documentation) Acceptance criteria: ✓ Reusable module created and sourceable ✓ kimi-heartbeat.sh updated ✓ kimi-mention-watcher.sh updated (added fallback support) ✓ Example usage script provided	2026-04-05 07:07:05 +00:00
Timmy Bot	64a954f4d9	Enhance Kimi heartbeat with Nexus Watchdog alerting for stale lockfiles (#386) - Add nexus_alert() function to send alerts to Nexus Watchdog - Alerts are written as JSON files to $NEXUS_ALERT_DIR (default: /tmp/nexus-alerts) - Alert includes: alert_id, timestamp, source, host, alert_type, severity, message, data - Send 'stale_lock_reclaimed' warning alert when stale lock detected (age > 600s) - Send 'heartbeat_resumed' info alert after successful recovery - Include lock age, lockfile path, action taken, and stat info in alert data - Add configurable NEXUS_ALERT_DIR and NEXUS_ALERT_ENABLED settings - Add test script for validating alert functionality	2026-04-05 07:04:57 +00:00