Compare commits

5 Commits

Author SHA1 Message Date
Timmy Bot
4cfd1c2e10 Merge remote main + feedback on EPIC-202 2026-04-06 02:21:50 +00:00
Timmy Bot
a9ad1c8137 feedback: Allegro cross-epic review on EPIC-202 (claw-agent)
- Health: Yellow. Blocker: Gitea firewalled + no Primus RCA.
- Adds pre-flight checklist before Phase 1 start.
2026-04-06 02:20:55 +00:00
f708e45ae9 feat: Sovereign Health Dashboard — Operational Force Multiplication (#417)
Co-authored-by: Google AI Agent <gemini@hermes.local>
Co-committed-by: Google AI Agent <gemini@hermes.local>
2026-04-05 22:56:19 +00:00
Timmy Bot
9952ce180c feat(uniwizard): standardized Tailscale IP detection module (timmy-home#385)
Create reusable tailscale-gitea.sh module for all auxiliary scripts:
- Automatically detects Tailscale (100.126.61.75) vs public IP (143.198.27.163)
- Sets GITEA_BASE_URL and GITEA_USING_TAILSCALE for sourcing scripts
- Configurable timeout, debug mode, and endpoint settings
- Maintains sovereignty: prefers private Tailscale network

Updated scripts:
- kimi-heartbeat.sh: now sources the module
- kimi-mention-watcher.sh: added fallback support via module

Files added:
- uniwizard/lib/tailscale-gitea.sh (reusable module)
- uniwizard/lib/example-usage.sh (usage documentation)

Acceptance criteria:
✓ Reusable module created and sourceable
✓ kimi-heartbeat.sh updated
✓ kimi-mention-watcher.sh updated (added fallback support)
✓ Example usage script provided
2026-04-05 07:07:05 +00:00
Timmy Bot
64a954f4d9 Enhance Kimi heartbeat with Nexus Watchdog alerting for stale lockfiles (#386)
- Add nexus_alert() function to send alerts to Nexus Watchdog
- Alerts are written as JSON files to $NEXUS_ALERT_DIR (default: /tmp/nexus-alerts)
- Alert includes: alert_id, timestamp, source, host, alert_type, severity, message, data
- Send 'stale_lock_reclaimed' warning alert when stale lock detected (age > 600s)
- Send 'heartbeat_resumed' info alert after successful recovery
- Include lock age, lockfile path, action taken, and stat info in alert data
- Add configurable NEXUS_ALERT_DIR and NEXUS_ALERT_ENABLED settings
- Add test script for validating alert functionality
2026-04-05 07:04:57 +00:00
8 changed files with 403 additions and 21 deletions


@@ -1,6 +1,6 @@
 model:
-  default: claude-opus-4-6
-  provider: anthropic
+  default: hermes4:14b
+  provider: custom
 toolsets:
 - all
 agent:
@@ -27,7 +27,7 @@ browser:
   inactivity_timeout: 120
   record_sessions: false
 checkpoints:
-  enabled: false
+  enabled: true
   max_snapshots: 50
 compression:
   enabled: true
@@ -110,7 +110,7 @@ tts:
     device: cpu
 stt:
   enabled: true
-  provider: local
+  provider: openai
   local:
     model: base
   openai:


@@ -136,3 +136,27 @@ def build_bootstrap_graph() -> Graph:
---
*This epic supersedes Allegro-Primus who has been idle.*
---
## Feedback — 2026-04-06 (Allegro Cross-Epic Review)
**Health:** 🟡 Yellow
**Blocker:** Gitea externally firewalled + no Allegro-Primus RCA
### Critical Issues
1. **Dependency blindness.** Every Claw Code reference points to `143.198.27.163:3000`, which is currently firewalled and unreachable from this VM. If the mirror is not locally cached, development is blocked on external infrastructure.
2. **Root cause vs. replacement.** The epic jumps to "replace Allegro-Primus" without proving he is unfixable. Primus being idle could be the same provider/auth outage that took down Ezra and Bezalel. A 5-line RCA should precede a 5-phase rewrite.
3. **Timeline fantasy.** "Phase 1: 2 days" assumes stable infrastructure. Current reality: Gitea externally firewalled, Bezalel VPS down, Ezra needs webhook switch. This epic needs a "Blocked Until" section.
4. **Resource stalemate.** "Telegram bot: Need @BotFather" — the fleet already operates multiple bots. Reuse an existing bot profile or document why a new one is required.
### Recommended Action
Add a **Pre-Flight Checklist** to the epic:
- [ ] Verify Gitea/Claw Code mirror is reachable from the build VM
- [ ] Publish 1-paragraph RCA on why Allegro-Primus is idle
- [ ] Confirm target repo for the new agent code
Do not start Phase 1 until all three are checked.
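The gate in the last line ("Do not start Phase 1 until all three are checked") can be checked mechanically. A minimal sketch, assuming the checklist lives in the epic's markdown as task-list items (`- [ ]` / `- [x]`); the helper name `unchecked_items` is hypothetical and not part of the repository:

```python
import re

# Matches markdown task-list items such as "- [ ] text" or "- [x] text".
TASK_RE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def unchecked_items(markdown: str) -> list[str]:
    """Return the text of every task-list item that is still unchecked."""
    pending = []
    for line in markdown.splitlines():
        match = TASK_RE.match(line)
        if match and match.group(1) == " ":
            pending.append(match.group(2).strip())
    return pending


if __name__ == "__main__":
    epic = """\
- [x] Verify Gitea/Claw Code mirror is reachable from the build VM
- [ ] Publish 1-paragraph RCA on why Allegro-Primus is idle
- [ ] Confirm target repo for the new agent code
"""
    pending = unchecked_items(epic)
    if pending:
        print("Phase 1 blocked; unchecked items:")
        for item in pending:
            print(f"  - {item}")
    else:
        print("Pre-flight checklist complete")
```

A script like this could run before any Phase 1 automation and exit non-zero while items remain open.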


@@ -0,0 +1,68 @@
import sqlite3
import json
import os
from pathlib import Path
from datetime import datetime

DB_PATH = Path.home() / ".timmy" / "metrics" / "model_metrics.db"
REPORT_PATH = Path.home() / "timmy" / "SOVEREIGN_HEALTH.md"


def generate_report():
    if not DB_PATH.exists():
        return "No metrics database found."

    conn = sqlite3.connect(str(DB_PATH))

    # Get latest sovereignty score
    row = conn.execute("""
        SELECT local_pct, total_sessions, local_sessions, cloud_sessions, est_cloud_cost, est_saved
        FROM sovereignty_score ORDER BY timestamp DESC LIMIT 1
    """).fetchone()

    if not row:
        conn.close()
        return "No sovereignty data found."

    pct, total, local, cloud, cost, saved = row

    # Get model breakdown for the last 7 days
    models = conn.execute("""
        SELECT model, SUM(sessions), SUM(messages), is_local, SUM(est_cost_usd)
        FROM session_stats
        WHERE timestamp > ?
        GROUP BY model
        ORDER BY SUM(sessions) DESC
    """, (datetime.now().timestamp() - 86400 * 7,)).fetchall()
    conn.close()

    report = f"""# Sovereign Health Report — {datetime.now().strftime('%Y-%m-%d')}
## ◈ Sovereignty Score: {pct:.1f}%
**Status:** {"🟢 OPTIMAL" if pct > 90 else "🟡 WARNING" if pct > 50 else "🔴 COMPROMISED"}
- **Total Sessions:** {total}
- **Local Sessions:** {local} (Zero Cost, Total Privacy)
- **Cloud Sessions:** {cloud} (Token Leakage)
- **Est. Cloud Cost:** ${cost:.2f}
- **Est. Savings:** ${saved:.2f} (Sovereign Dividend)
## ◈ Fleet Composition (Last 7 Days)
| Model | Sessions | Messages | Local? | Est. Cost |
| :--- | :--- | :--- | :--- | :--- |
"""
    for m, s, msg, l, c in models:
        # Render the is_local flag as text so the "Local?" column is never empty
        local_flag = "yes" if l else "no"
        report += f"| {m} | {s} | {msg} | {local_flag} | ${c:.2f} |\n"

    report += """
---
*Generated by the Sovereign Health Daemon. Sovereignty is a right. Privacy is a duty.*
"""
    with open(REPORT_PATH, "w") as f:
        f.write(report)

    print(f"Report generated at {REPORT_PATH}")
    return report


if __name__ == "__main__":
    generate_report()
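The report reads two tables that this diff never defines. The sketch below builds an in-memory database so the queries and thresholds can be exercised without touching `~/.timmy`. The column names come from the queries above; the column types, the sample rows, and the helper names `status_label`, `make_demo_db`, and `latest_score` are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime


def status_label(pct: float) -> str:
    """Same thresholds as the report: >90 optimal, >50 warning, else compromised."""
    if pct > 90:
        return "OPTIMAL"
    if pct > 50:
        return "WARNING"
    return "COMPROMISED"


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory DB; the column types are assumed, not taken from the repo."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sovereignty_score (
            timestamp REAL, local_pct REAL, total_sessions INTEGER,
            local_sessions INTEGER, cloud_sessions INTEGER,
            est_cloud_cost REAL, est_saved REAL
        );
        CREATE TABLE session_stats (
            timestamp REAL, model TEXT, sessions INTEGER, messages INTEGER,
            is_local INTEGER, est_cost_usd REAL
        );
    """)
    now = datetime.now().timestamp()
    # Made-up sample data: 6 of 8 sessions local.
    conn.execute(
        "INSERT INTO sovereignty_score VALUES (?, 75.0, 8, 6, 2, 1.25, 4.50)", (now,)
    )
    conn.executemany(
        "INSERT INTO session_stats VALUES (?, ?, ?, ?, ?, ?)",
        [(now, "hermes4:14b", 6, 40, 1, 0.0), (now, "claude-opus-4-6", 2, 10, 0, 1.25)],
    )
    conn.commit()
    return conn


def latest_score(conn: sqlite3.Connection):
    """Return (local_pct, total_sessions) from the newest sovereignty_score row."""
    return conn.execute(
        "SELECT local_pct, total_sessions FROM sovereignty_score "
        "ORDER BY timestamp DESC LIMIT 1"
    ).fetchone()
```

Note that the comparisons are strict, so a score of exactly 90 is reported as a warning and exactly 50 as compromised.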

tests/test_nexus_alert.sh (executable file, 146 lines added)

@@ -0,0 +1,146 @@
#!/bin/bash
# Test script for Nexus Watchdog alerting functionality
set -euo pipefail
TEST_DIR="/tmp/test-nexus-alerts-$$"
export NEXUS_ALERT_DIR="$TEST_DIR"
export NEXUS_ALERT_ENABLED=true
echo "=== Nexus Watchdog Alert Test ==="
echo "Test alert directory: $TEST_DIR"
# Source the alert function from the heartbeat script
# Extract just the nexus_alert function for testing
cat > /tmp/test_alert_func.sh << 'ALEOF'
#!/bin/bash
NEXUS_ALERT_DIR="${NEXUS_ALERT_DIR:-/tmp/nexus-alerts}"
NEXUS_ALERT_ENABLED=true
HOSTNAME=$(hostname -s 2>/dev/null || echo "unknown")
SCRIPT_NAME="kimi-heartbeat-test"
nexus_alert() {
    local alert_type="$1"
    local message="$2"
    local severity="${3:-info}"
    local extra_data="${4:-}"
    # Default to an empty JSON object. Writing "${4:-{}}" would append a stray "}"
    # whenever $4 is set, because the expansion ends at the first "}".
    [ -n "$extra_data" ] || extra_data='{}'

    if [ "$NEXUS_ALERT_ENABLED" != "true" ]; then
        return 0
    fi

    mkdir -p "$NEXUS_ALERT_DIR" 2>/dev/null || return 0

    local timestamp
    timestamp=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
    local nanoseconds=$(date +%N 2>/dev/null || echo "$$")
    local alert_id="${SCRIPT_NAME}_$(date +%s)_${nanoseconds}_$$"
    local alert_file="$NEXUS_ALERT_DIR/${alert_id}.json"

    cat > "$alert_file" << EOF
{
  "alert_id": "$alert_id",
  "timestamp": "$timestamp",
  "source": "$SCRIPT_NAME",
  "host": "$HOSTNAME",
  "alert_type": "$alert_type",
  "severity": "$severity",
  "message": "$message",
  "data": $extra_data
}
EOF

    if [ -f "$alert_file" ]; then
        echo "NEXUS_ALERT: $alert_type [$severity] - $message"
        return 0
    else
        echo "NEXUS_ALERT_FAILED: Could not write alert"
        return 1
    fi
}
ALEOF
source /tmp/test_alert_func.sh
# Test 1: Basic alert
echo -e "\n[TEST 1] Sending basic info alert..."
nexus_alert "test_alert" "Test message from heartbeat" "info" '{"test": true}'

# Test 2: Stale lock alert simulation
echo -e "\n[TEST 2] Sending stale lock alert..."
nexus_alert \
    "stale_lock_reclaimed" \
    "Stale lockfile deadlock cleared after 650s" \
    "warning" \
    '{"lock_age_seconds": 650, "lockfile": "/tmp/kimi-heartbeat.lock", "action": "removed"}'

# Test 3: Heartbeat resumed alert
echo -e "\n[TEST 3] Sending heartbeat resumed alert..."
nexus_alert \
    "heartbeat_resumed" \
    "Kimi heartbeat resumed after clearing stale lock" \
    "info" \
    '{"recovery": "successful", "continuing": true}'

# Check results
echo -e "\n=== Alert Files Created ==="
alert_count=$(find "$TEST_DIR" -name "*.json" 2>/dev/null | wc -l)
echo "Total alert files: $alert_count"
if [ "$alert_count" -eq 3 ]; then
    echo "✅ All 3 alerts were created successfully"
else
    echo "❌ Expected 3 alerts, found $alert_count"
    exit 1
fi

echo -e "\n=== Alert Contents ==="
for f in "$TEST_DIR"/*.json; do
    echo -e "\n--- $(basename "$f") ---"
    cat "$f" | python3 -m json.tool 2>/dev/null || cat "$f"
done

# Validate JSON structure
echo -e "\n=== JSON Validation ==="
all_valid=true
for f in "$TEST_DIR"/*.json; do
    if python3 -c "import json; json.load(open('$f'))" 2>/dev/null; then
        echo "$(basename "$f") - Valid JSON"
    else
        echo "$(basename "$f") - Invalid JSON"
        all_valid=false
    fi
done

# Check for required fields
echo -e "\n=== Required Fields Check ==="
for f in "$TEST_DIR"/*.json; do
    basename=$(basename "$f")
    missing=()
    python3 -c "import json; d=json.load(open('$f'))" 2>/dev/null || continue
    for field in alert_id timestamp source host alert_type severity message data; do
        if ! python3 -c "import json; d=json.load(open('$f')); exit(0 if '$field' in d else 1)" 2>/dev/null; then
            missing+=("$field")
        fi
    done
    if [ ${#missing[@]} -eq 0 ]; then
        echo "$basename - All required fields present"
    else
        echo "$basename - Missing fields: ${missing[*]}"
        all_valid=false
    fi
done

# Cleanup
rm -rf "$TEST_DIR" /tmp/test_alert_func.sh

echo -e "\n=== Test Summary ==="
if [ "$all_valid" = true ]; then
    echo "✅ All tests passed!"
    exit 0
else
    echo "❌ Some tests failed"
    exit 1
fi
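The heredoc in `nexus_alert` interpolates `$message` directly into the JSON text, so a message containing a double quote or a backslash would yield a file that fails the validation step. A sketch of the same eight-field alert written through Python's `json` module, which escapes such characters. The field names follow the test above; the function name `write_alert`, its default `source`, and the ID scheme are assumptions, not the heartbeat script's actual code.

```python
import json
import os
import socket
import time
from datetime import datetime, timezone
from pathlib import Path


def write_alert(alert_dir, alert_type, message, severity="info", data=None,
                source="kimi-heartbeat-test"):
    """Write one alert as a JSON file and return its path."""
    alert_dir = Path(alert_dir)
    alert_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical ID scheme: source name, nanosecond clock, and PID.
    alert_id = f"{source}_{time.time_ns()}_{os.getpid()}"
    alert = {
        "alert_id": alert_id,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "source": source,
        "host": socket.gethostname().split(".")[0] or "unknown",
        "alert_type": alert_type,
        "severity": severity,
        "message": message,
        # An omitted payload becomes an empty JSON object, as in the shell version.
        "data": data if data is not None else {},
    }
    path = alert_dir / f"{alert_id}.json"
    path.write_text(json.dumps(alert, indent=2), encoding="utf-8")
    return path
```

Because the payload is passed as a dictionary rather than a pre-formatted string, callers cannot hand in malformed JSON for the `data` field.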


@@ -24,32 +24,52 @@ class HealthCheckHandler(BaseHTTPRequestHandler):
         # Suppress default logging
         pass
-    def do_GET(self):
+    def do_GET(self):
         """Handle GET requests"""
         if self.path == '/health':
             self.send_health_response()
         elif self.path == '/status':
             self.send_full_status()
+        elif self.path == '/metrics':
+            self.send_sovereign_metrics()
         else:
             self.send_error(404)
-    def send_health_response(self):
-        """Send simple health check"""
-        harness = get_harness()
-        result = harness.execute("health_check")
+    def send_sovereign_metrics(self):
+        """Send sovereign health metrics as JSON"""
         try:
-            health_data = json.loads(result)
-            status_code = 200 if health_data.get("overall") == "healthy" else 503
-        except:
-            status_code = 503
-            health_data = {"error": "Health check failed"}
-        self.send_response(status_code)
+            import sqlite3
+            db_path = Path.home() / ".timmy" / "metrics" / "model_metrics.db"
+            if not db_path.exists():
+                data = {"error": "No database found"}
+            else:
+                conn = sqlite3.connect(str(db_path))
+                row = conn.execute("""
+                    SELECT local_pct, total_sessions, local_sessions, cloud_sessions, est_cloud_cost, est_saved
+                    FROM sovereignty_score ORDER BY timestamp DESC LIMIT 1
+                """).fetchone()
+                if row:
+                    data = {
+                        "sovereignty_score": row[0],
+                        "total_sessions": row[1],
+                        "local_sessions": row[2],
+                        "cloud_sessions": row[3],
+                        "est_cloud_cost": row[4],
+                        "est_saved": row[5],
+                        "timestamp": datetime.now().isoformat()
+                    }
+                else:
+                    data = {"error": "No data"}
+                conn.close()
+        except Exception as e:
+            data = {"error": str(e)}
+        self.send_response(200)
         self.send_header('Content-Type', 'application/json')
         self.end_headers()
-        self.wfile.write(json.dumps(health_data).encode())
+        self.wfile.write(json.dumps(data).encode())
     def send_full_status(self):
         """Send full system status"""
         harness = get_harness()
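In this hunk the body of `send_health_response` is no longer present on the new side, while `do_GET` still routes `/health` to it. If that reading is right, a `/health` request would raise `AttributeError`. One way to keep both endpoints is sketched below as plain functions that return a status code and a body, so the routing can be tested without starting a server. The names `health_payload`, `metrics_payload`, and `dispatch` are hypothetical, and the metrics query is shortened to two columns.

```python
import json
import sqlite3
from pathlib import Path


def health_payload(run_check):
    """Build the /health response from a callable that returns a JSON string."""
    try:
        data = json.loads(run_check())
        code = 200 if data.get("overall") == "healthy" else 503
    except Exception:
        code, data = 503, {"error": "Health check failed"}
    return code, data


def metrics_payload(db_path):
    """Build the /metrics response from the latest sovereignty_score row."""
    db_path = Path(db_path)
    if not db_path.exists():
        return 200, {"error": "No database found"}
    conn = sqlite3.connect(str(db_path))
    try:
        row = conn.execute(
            "SELECT local_pct, total_sessions FROM sovereignty_score "
            "ORDER BY timestamp DESC LIMIT 1"
        ).fetchone()
    except sqlite3.Error as exc:
        return 200, {"error": str(exc)}
    finally:
        conn.close()
    if row is None:
        return 200, {"error": "No data"}
    return 200, {"sovereignty_score": row[0], "total_sessions": row[1]}


def dispatch(path, run_check, db_path):
    """Map a request path to (status_code, body); unknown paths get 404."""
    if path == "/health":
        return health_payload(run_check)
    if path == "/metrics":
        return metrics_payload(db_path)
    return 404, {"error": "Not found"}
```

The metrics branch mirrors the diff in answering 200 even on error; returning 503 there instead would let monitoring distinguish a missing database from a healthy one.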

kimi-mention-watcher.sh

@@ -5,7 +5,12 @@
 set -euo pipefail
 KIMI_TOKEN=$(cat /Users/apayne/.timmy/kimi_gitea_token | tr -d '[:space:]')
-BASE="http://100.126.61.75:3000/api/v1"
+# --- Tailscale/IP Detection (timmy-home#385) ---
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+source "${SCRIPT_DIR}/lib/tailscale-gitea.sh"
+BASE="$GITEA_BASE_URL"
 LOG="/tmp/kimi-mentions.log"
 PROCESSED="/tmp/kimi-mentions-processed.txt"

uniwizard/lib/example-usage.sh

@@ -0,0 +1,55 @@
#!/bin/bash
# example-usage.sh — Example showing how to use the tailscale-gitea module
# Issue: timmy-home#385 — Standardized Tailscale IP detection module
set -euo pipefail
# --- Basic Usage ---
# Source the module to automatically set GITEA_BASE_URL
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/tailscale-gitea.sh"
# Now use GITEA_BASE_URL in your API calls
echo "Using Gitea at: $GITEA_BASE_URL"
echo "Tailscale active: $GITEA_USING_TAILSCALE"
# --- Example API Call ---
# curl -sf -H "Authorization: token $TOKEN" \
# "$GITEA_BASE_URL/repos/myuser/myrepo/issues"
# --- Custom Configuration (Optional) ---
# You can customize behavior by setting variables BEFORE sourcing:
#
# TAILSCALE_TIMEOUT=5 # Wait 5 seconds instead of 2
# TAILSCALE_DEBUG=1 # Print which endpoint was selected
# source "${SCRIPT_DIR}/tailscale-gitea.sh"
# --- Advanced: Checking Network Mode ---
if [[ "$GITEA_USING_TAILSCALE" == "true" ]]; then
    echo "✓ Connected via private Tailscale network"
else
    echo "⚠ Using public internet fallback (Tailscale unavailable)"
fi

# --- Example: Polling with Retry Logic ---
poll_gitea() {
    local endpoint="${1:-$GITEA_BASE_URL}"
    local max_retries="${2:-3}"
    local retry=0
    while [[ $retry -lt $max_retries ]]; do
        if curl -sf --connect-timeout 2 "${endpoint}/version" > /dev/null 2>&1; then
            echo "Gitea is reachable"
            return 0
        fi
        retry=$((retry + 1))
        echo "Retry $retry/$max_retries..."
        sleep 1
    done
    echo "Gitea unreachable after $max_retries attempts"
    return 1
}

# Uncomment to test connectivity:
# poll_gitea "$GITEA_BASE_URL"

uniwizard/lib/tailscale-gitea.sh

@@ -0,0 +1,64 @@
#!/bin/bash
# tailscale-gitea.sh — Standardized Tailscale IP detection module for Gitea API access
# Issue: timmy-home#385 — Standardize Tailscale IP detection across auxiliary scripts
#
# Usage (source this file in your script):
# source /path/to/tailscale-gitea.sh
# # Now use $GITEA_BASE_URL for API calls
#
# Configuration (set before sourcing to customize):
# TAILSCALE_IP - Tailscale IP to try first (default: 100.126.61.75)
# PUBLIC_IP - Public fallback IP (default: 143.198.27.163)
# GITEA_PORT - Gitea API port (default: 3000)
# TAILSCALE_TIMEOUT - Connection timeout in seconds (default: 2)
# GITEA_API_VERSION - API version path (default: api/v1)
#
# Sovereignty: Private Tailscale network preferred over public internet
# --- Default Configuration ---
: "${TAILSCALE_IP:=100.126.61.75}"
: "${PUBLIC_IP:=143.198.27.163}"
: "${GITEA_PORT:=3000}"
: "${TAILSCALE_TIMEOUT:=2}"
: "${GITEA_API_VERSION:=api/v1}"
# --- Detection Function ---
_detect_gitea_endpoint() {
    local tailscale_url="http://${TAILSCALE_IP}:${GITEA_PORT}/${GITEA_API_VERSION}"
    local public_url="http://${PUBLIC_IP}:${GITEA_PORT}/${GITEA_API_VERSION}"

    # Prefer Tailscale (private network) over public IP
    if curl -sf --connect-timeout "$TAILSCALE_TIMEOUT" \
        "${tailscale_url}/version" > /dev/null 2>&1; then
        echo "$tailscale_url"
        return 0
    else
        echo "$public_url"
        return 1
    fi
}

# --- Main Detection ---
# Set GITEA_BASE_URL for use by sourcing scripts
# Also sets GITEA_USING_TAILSCALE=true/false for scripts that need to know
if curl -sf --connect-timeout "$TAILSCALE_TIMEOUT" \
    "http://${TAILSCALE_IP}:${GITEA_PORT}/${GITEA_API_VERSION}/version" > /dev/null 2>&1; then
    GITEA_BASE_URL="http://${TAILSCALE_IP}:${GITEA_PORT}/${GITEA_API_VERSION}"
    GITEA_USING_TAILSCALE=true
else
    GITEA_BASE_URL="http://${PUBLIC_IP}:${GITEA_PORT}/${GITEA_API_VERSION}"
    GITEA_USING_TAILSCALE=false
fi

# Export for child processes
export GITEA_BASE_URL
export GITEA_USING_TAILSCALE

# Optional: log which endpoint was selected (set TAILSCALE_DEBUG=1 to enable)
if [[ "${TAILSCALE_DEBUG:-0}" == "1" ]]; then
    if [[ "$GITEA_USING_TAILSCALE" == "true" ]]; then
        echo "[tailscale-gitea] Using Tailscale endpoint: $GITEA_BASE_URL" >&2
    else
        echo "[tailscale-gitea] Tailscale unavailable, using public endpoint: $GITEA_BASE_URL" >&2
    fi
fi
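For auxiliary tools written in Python rather than shell, the same preference order can be expressed as a small function. This is a sketch under assumptions, not a port that exists in the repository: the names `http_ok` and `select_gitea_base_url` are hypothetical, the defaults are copied from the module above, and the probe is injectable so the selection logic can be tested without network access.

```python
import urllib.request


def http_ok(url, timeout):
    """Return True if a GET to url succeeds with a 2xx status within timeout seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def select_gitea_base_url(tailscale_ip="100.126.61.75", public_ip="143.198.27.163",
                          port=3000, api_version="api/v1", timeout=2.0,
                          probe=http_ok):
    """Return (base_url, using_tailscale), preferring the Tailscale address."""
    tailscale_url = f"http://{tailscale_ip}:{port}/{api_version}"
    public_url = f"http://{public_ip}:{port}/{api_version}"
    # Only the Tailscale endpoint is probed; the public URL is an unverified fallback.
    if probe(f"{tailscale_url}/version", timeout):
        return tailscale_url, True
    return public_url, False
```

As in the shell module, a `False` second element only means the Tailscale probe failed; it does not confirm that the public endpoint is reachable.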