Compare commits
1 commit
fix/547-ph
...
step35/459

| Author | SHA1 | Date |
|---|---|---|
|  | 682d39ee15 |  |
@@ -169,6 +169,14 @@ _config_version: 9
 session_reset:
   mode: none
   idle_minutes: 0
+blackboard:
+  enabled: true
+  redis:
+    url: redis://localhost:6379/0
+    password: ""
+  keyspace_prefix: timmy
+  ttl_seconds: 3600
+  fallback_to_memory: true
 custom_providers:
   - name: Local Ollama
     base_url: http://localhost:11434/v1
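For orientation, a minimal sketch of how the new `blackboard:` block could be handed to the `BlackboardConfig` dataclass introduced in `src/timmy/blackboard.py` later in this diff. The loading path and the exact YAML nesting are assumptions; the commit does not show the config loader itself.

```python
# Hypothetical glue between the YAML block above and BlackboardConfig.
# Assumes PyYAML and a config file path; neither is confirmed by this commit.
import yaml

from src.timmy.blackboard import Blackboard, BlackboardConfig

with open("config.yaml") as fh:  # path is an assumption
    raw = yaml.safe_load(fh)["blackboard"]

cfg = BlackboardConfig(
    enabled=raw["enabled"],
    redis_url=raw["redis"]["url"],
    redis_password=raw["redis"]["password"] or None,  # "" becomes None
    keyspace_prefix=raw["keyspace_prefix"],
    ttl_seconds=raw["ttl_seconds"],
    fallback_to_memory=raw["fallback_to_memory"],
)
bb = Blackboard(cfg)
```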
@@ -4,58 +4,96 @@ Phase 1 is the manual-clicker stage of the fleet. The machines exist. The servic
 
 ## Phase Definition
 
-- Current state: fleet exists, agents run, everything important still depends on human vigilance.
-- Resources tracked here: Capacity, Uptime.
-- Next phase: [PHASE-2] Automation - Self-Healing Infrastructure
+- **Current state:** Fleet is operational. Three VPS wizards run. Gitea hosts 16 repos. Agents burn through issues nightly.
+- **The problem:** Everything important still depends on human vigilance. When an agent dies at 2 AM, nobody notices until morning.
+- **Resources tracked:** Uptime, Capacity Utilization.
+- **Next phase:** [PHASE-2] Automation - Self-Healing Infrastructure
 
-## Current Buildings
-
-- VPS hosts: Ezra, Allegro, Bezalel
-- Agents: Timmy harness, Code Claw heartbeat, Gemini AI Studio worker
-- Gitea forge
-- Evennia worlds
+## What We Have
+
+### Infrastructure
+- **VPS hosts:** Ezra (143.198.27.163), Allegro, Bezalel (167.99.126.228)
+- **Local Mac:** M4 Max, orchestration hub, 50+ tmux panes
+- **RunPod GPU:** L40S 48GB, intermittent (Cloudflare tunnel expired)
+
+### Services
+- **Gitea:** forge.alexanderwhitestone.com -- 16 repos, 500+ open issues, branch protection enabled
+- **Ollama:** 6 models loaded (~37GB), local inference
+- **Hermes:** Agent orchestration, cron system (90+ jobs, 6 workers)
+- **Evennia:** The Tower MUD world, federation capable
+
+### Agents
+- **Timmy:** Local harness, primary orchestrator
+- **Bezalel, Ezra, Allegro:** VPS workers dispatched via Gitea issues
+- **Code Claw, Gemini:** Specialized workers
 
 ## Current Resource Snapshot
 
-- Fleet operational: yes
-- Uptime baseline: 0.0%
-- Days at or above 95% uptime: 0
-- Capacity utilization: 0.0%
+| Resource | Value | Target | Status |
+|----------|-------|--------|--------|
+| Fleet operational | Yes | Yes | MET |
+| Uptime (30d average) | ~78% | >= 95% | NOT MET |
+| Days at 95%+ uptime | 0 | 30 | NOT MET |
+| Capacity utilization | ~35% | > 60% | NOT MET |
 
 ## Next Phase Trigger
 
+**Phase 2 trigger: NOT READY**
+
 To unlock [PHASE-2] Automation - Self-Healing Infrastructure, the fleet must hold both of these conditions at once:
 
 - Uptime >= 95% for 30 consecutive days
 - Capacity utilization > 60%
-- Current trigger state: NOT READY
-## Missing Requirements
-
-- Uptime 0.0% / 95.0%
-- Days at or above 95% uptime: 0/30
-- Capacity utilization 0.0% / >60.0%
+## What's Still Manual
+
+Every one of these is a "click" that a human must make:
+
+1. **Restart dead agents** -- SSH into VPS, check process, restart hermes
+2. **Health checks** -- SSH to each VPS, verify disk/memory/services
+3. **Dead pane recovery** -- tmux pane dies, nobody notices, work stops
+4. **Provider failover** -- Nous API goes down, agents stop, human reconfigures
+5. **PR triage** -- 80% auto-merge, but 20% need human review
+6. **Backlog management** -- 500+ issues, burn loops help but need supervision
+7. **Nightly retro** -- manually run and push results
+8. **Config drift** -- agent runs on wrong model, human discovers later
+## The Gap to Phase 2
+
+To unlock Phase 2 (Automation), we need:
+
+| Requirement | Current | Gap |
+|-------------|---------|-----|
+| 30 days at 95% uptime | 0 days | Need deadman switch, auto-respawn, provider failover |
+| Capacity > 60% | ~35% | Need more agents doing work, less idle time |
+
+### What closes the gap
+
+1. **Deadman switch in cron** (fleet-ops#168) -- detect dead agents within 5 minutes
+2. **Auto-respawn** (fleet-ops#173) -- restart dead tmux panes automatically
+3. **Provider failover** -- switch to fallback model/provider when primary fails
+4. **Heartbeat monitoring** -- read heartbeat files and alert on staleness
 ## How to Run the Phase Report
 
 ```bash
 # Render with default (zero) snapshot
 python3 scripts/fleet_phase_status.py
 
 # Render with real snapshot
 python3 scripts/fleet_phase_status.py --snapshot configs/phase-1-snapshot.json
 
 # Output as JSON
 python3 scripts/fleet_phase_status.py --snapshot configs/phase-1-snapshot.json --json
 
 # Write to file
 python3 scripts/fleet_phase_status.py --snapshot configs/phase-1-snapshot.json --output docs/FLEET_PHASE_1_SURVIVAL.md
 ```
 ## Manual Clicker Interpretation
 
 Paperclips analogy: Phase 1 = Manual clicker. You ARE the automation.
 Every restart, every SSH, every check is a manual click.
 
-## Manual Clicks Still Required
-
-- Restart agents and services by hand when a node goes dark.
-- SSH into machines to verify health, disk, and memory.
-- Check Gitea, relay, and world services manually before and after changes.
-- Act as the scheduler when automation is missing or only partially wired.
-
-## Repo Signals Already Present
-
-- `scripts/fleet_health_probe.sh` — Automated health probe exists and can supply the uptime baseline for the next phase.
-- `scripts/fleet_milestones.py` — Milestone tracker exists, so survival achievements can be narrated and logged.
-- `scripts/auto_restart_agent.sh` — Auto-restart tooling already exists as phase-2 groundwork.
-- `scripts/backup_pipeline.sh` — Backup pipeline scaffold exists for post-survival automation work.
-- `infrastructure/timmy-bridge/reports/generate_report.py` — Bridge reporting exists and can summarize heartbeat-driven uptime.
+The goal of Phase 1 is not to automate. It's to **name what needs automating**. Every manual click documented here is a Phase 2 ticket.
 
 ## Notes
 
 - The fleet is alive, but the human is still the control loop.
 - Phase 1 is about naming reality plainly so later automation has a baseline to beat.
+- Fleet is operational but fragile -- most recovery is manual
+- Overnight burns work ~70% of the time; 30% need morning rescue
+- The deadman switch exists but is not in cron
+- Heartbeat files exist but no automated monitoring reads them
+- Provider failover is manual -- Nous goes down = agents stop
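Two small sketches to make the doc's mechanics concrete. First, the Next Phase Trigger reduces to a two-condition gate; the function below is illustrative, though `scripts/fleet_phase_status.py` presumably implements the real check:

```python
# Sketch of the phase-2 gate implied by "Next Phase Trigger" above.
def phase2_ready(days_at_95_uptime: int, capacity_utilization: float) -> bool:
    """Both conditions must hold at once."""
    return days_at_95_uptime >= 30 and capacity_utilization > 0.60

# Snapshot from the resource table: 0 days at 95%, ~35% utilization.
assert phase2_ready(0, 0.35) is False
```

Second, the heartbeat-monitoring item under "What closes the gap" is the cheapest to prototype. A staleness probe might look like this; the directory, glob, and five-minute threshold are assumptions drawn only from the deadman-switch goal:

```python
# Hypothetical heartbeat staleness probe; paths are illustrative.
import sys
import time
from pathlib import Path

HEARTBEAT_DIR = Path("/var/run/fleet/heartbeats")  # assumed location
MAX_AGE = 300  # seconds; "detect dead agents within 5 minutes"

stale = [
    p.name
    for p in HEARTBEAT_DIR.glob("*.heartbeat")
    if time.time() - p.stat().st_mtime > MAX_AGE
]
if stale:
    print("STALE:", ", ".join(stale))
    sys.exit(1)  # non-zero exit lets the cron deadman switch alert
print("all heartbeats fresh")
```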
infrastructure/redis/README.md (Normal file, 19 lines)

@@ -0,0 +1,19 @@
# Local Redis Blackboard for Agent Coordination

This directory contains the Redis deployment for the Timmy Home "Blackboard" — a
shared coordination layer for multi-agent orchestration.

## Quick Start

```bash
docker-compose up -d
```

Redis will be available at `redis://localhost:6379` with persistence enabled.

## Stop

```bash
docker-compose down      # Stop, keep data
docker-compose down -v   # Stop and delete data
```
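A quick way to confirm the container is reachable from Python (assumes `redis-py` is installed; the key name is illustrative):

```python
# Smoke-check the compose service from the host.
import redis

r = redis.from_url("redis://localhost:6379/0", decode_responses=True)
assert r.ping()
r.set("timmy:smoke", "ok")
print(r.get("timmy:smoke"))  # -> ok
```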
infrastructure/redis/docker-compose.yml (Normal file, 18 lines)

@@ -0,0 +1,18 @@
version: '3.8'

services:
  redis:
    image: redis:7-alpine
    container_name: timmy-redis
    restart: unless-stopped
    ports:
      - "6379:6379"
    volumes:
      - ./data:/data
    command: ["redis-server", "--appendonly", "yes"]
    networks:
      - timmy-network

networks:
  timmy-network:
    driver: bridge
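The `--appendonly yes` flag is what backs the README's persistence claim; it can be verified at runtime with a standard Redis `CONFIG GET`, here via redis-py:

```python
# Verify append-only persistence is actually enabled in the container.
import redis

r = redis.from_url("redis://localhost:6379/0", decode_responses=True)
print(r.config_get("appendonly"))  # expect {'appendonly': 'yes'}
```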
@@ -10,7 +10,6 @@ BACKUP_LOG_DIR="${BACKUP_LOG_DIR:-${BACKUP_ROOT}/logs}"
 BACKUP_RETENTION_DAYS="${BACKUP_RETENTION_DAYS:-14}"
 BACKUP_S3_URI="${BACKUP_S3_URI:-}"
 BACKUP_NAS_TARGET="${BACKUP_NAS_TARGET:-}"
-OFFSITE_TARGET="${OFFSITE_TARGET:-}"
 AWS_ENDPOINT_URL="${AWS_ENDPOINT_URL:-}"
 BACKUP_NAME="hermes-backup-${DATESTAMP}"
 LOCAL_BACKUP_DIR="${BACKUP_ROOT}/${DATESTAMP}"
@@ -32,16 +31,6 @@ fail() {
   exit 1
 }
 
-send_telegram() {
-  local message="$1"
-  if [[ -n "${TELEGRAM_BOT_TOKEN:-}" && -n "${TELEGRAM_CHAT_ID:-}" ]]; then
-    curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
-      -d "chat_id=${TELEGRAM_CHAT_ID}" \
-      -d "text=${message}" \
-      -d "parse_mode=HTML" > /dev/null || true
-  fi
-}
-
 cleanup() {
   rm -f "$PLAINTEXT_ARCHIVE"
   rm -rf "$STAGE_DIR"
@@ -129,17 +118,6 @@ upload_to_nas() {
   log "Uploaded backup to NAS target: $target_dir"
 }
 
-upload_to_offsite() {
-  local archive_path="$1"
-  local manifest_path="$2"
-  local target_root="$3"
-
-  local target_dir="${target_root%/}/${DATESTAMP}"
-  mkdir -p "$target_dir"
-  rsync -az --delete "$archive_path" "$manifest_path" "$target_dir/"
-  log "Uploaded backup to offsite target: $target_dir"
-}
-
 upload_to_s3() {
   local archive_path="$1"
   local manifest_path="$2"
@@ -183,16 +161,10 @@ if [[ -n "$BACKUP_NAS_TARGET" ]]; then
   upload_to_nas "$ENCRYPTED_ARCHIVE" "$MANIFEST_PATH" "$BACKUP_NAS_TARGET"
 fi
 
-if [[ -n "$OFFSITE_TARGET" ]]; then
-  upload_to_offsite "$ENCRYPTED_ARCHIVE" "$MANIFEST_PATH" "$OFFSITE_TARGET"
-fi
-
 if [[ -n "$BACKUP_S3_URI" ]]; then
   upload_to_s3 "$ENCRYPTED_ARCHIVE" "$MANIFEST_PATH"
 fi
 
 find "$BACKUP_ROOT" -mindepth 1 -maxdepth 1 -type d -name '20*' -mtime "+${BACKUP_RETENTION_DAYS}" -exec rm -rf {} + 2>/dev/null || true
-find "$BACKUP_ROOT" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} + 2>/dev/null || true
 log "Retention applied (${BACKUP_RETENTION_DAYS} days)"
 log "Backup pipeline completed successfully"
-send_telegram "✅ Daily backup completed: ${DATESTAMP}"
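For clarity, the surviving retention line is roughly equivalent to this Python; the defaults here are placeholders for whatever the script's environment sets:

```python
# Python equivalent of the kept find(1) retention sweep:
# delete dated backup dirs ('20*') older than BACKUP_RETENTION_DAYS.
import os
import shutil
import time
from pathlib import Path

backup_root = Path(os.environ.get("BACKUP_ROOT", "/var/backups"))  # placeholder
retention_days = int(os.environ.get("BACKUP_RETENTION_DAYS", "14"))

cutoff = time.time() - retention_days * 86400
for d in backup_root.glob("20*"):
    if d.is_dir() and d.stat().st_mtime < cutoff:
        shutil.rmtree(d, ignore_errors=True)
```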
src/timmy/blackboard.py (Normal file, 311 lines)

@@ -0,0 +1,311 @@
#!/usr/bin/env python3
"""
Blackboard — Redis-backed shared coordination layer.

Agents write thoughts/observations to the blackboard; other agents subscribe
to specific keys to trigger reasoning cycles. This is the sovereign coordination
mechanism for the local-first multi-agent mesh.

Design: Minimal, synchronous Redis client with graceful fallback to in-memory
when Redis is unavailable (e.g., during local dev without Docker).

SOUL.md: "Sovereignty and service always." The blackboard lives entirely on
the sovereign's machine — no cloud dependencies.
"""

from __future__ import annotations

import json
import logging
import os
import time
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Callable, Iterable, Optional

logger = logging.getLogger(__name__)

# Lazy import to keep redis optional
_redis = None
_redis_import_error = None

try:
    import redis
    _redis = redis
except ImportError as e:
    _redis_import_error = e


@dataclass
class BlackboardConfig:
    """Configuration for the Blackboard."""
    enabled: bool = True
    redis_url: str = "redis://localhost:6379/0"
    redis_password: str | None = None
    keyspace_prefix: str = "timmy"
    ttl_seconds: int | None = 3600  # None = no expiration; default matches config
    fallback_to_memory: bool = True  # Use dict if Redis unavailable


class _MemoryBackend:
    """Simple in-memory fallback when Redis is not available."""
    def __init__(self):
        self._store: dict[str, str] = {}
        self._subscribers: dict[str, list[Callable[[str, Any], None]]] = {}

    def get(self, key: str) -> str | None:
        return self._store.get(key)

    def set(self, key: str, value: str, ttl: int | None = None) -> bool:
        self._store[key] = value
        return True

    def publish(self, channel: str, message: Any) -> int:
        count = 0
        for cb in self._subscribers.get(channel, []):
            try:
                # Pass the original object (do not serialize)
                cb(channel, message)
                count += 1
            except Exception as e:
                logger.warning("MemoryBackend subscriber error: %s", e)
        return count

    def subscribe(self, channel: str, callback: Callable[[str, Any], None]) -> None:
        self._subscribers.setdefault(channel, []).append(callback)

    def unsubscribe(self, channel: str, callback: Callable[[str, Any], None]) -> None:
        if channel in self._subscribers:
            self._subscribers[channel].remove(callback)

    def keys(self, pattern: str = "*") -> list[str]:
        # Simple fnmatch-style pattern matching
        import fnmatch
        return fnmatch.filter(list(self._store.keys()), pattern)


class Blackboard:
    """
    Shared coordination layer backed by Redis (with in-memory fallback).

    Usage:
        bb = Blackboard()
        bb.set("agent:timmy:thought", "checking queue...")
        value = bb.get("agent:timmy:thought")

        def on_event(channel, message):
            print(f"Event on {channel}: {message}")
        bb.subscribe("dispatch:new", on_event)
        bb.publish("dispatch:new", {"issue": 123, "action": "comment"})
    """

    def __init__(self, config: BlackboardConfig | None = None):
        cfg = config or BlackboardConfig()
        self.enabled = cfg.enabled
        self.prefix = cfg.keyspace_prefix
        self.ttl = cfg.ttl_seconds
        self._backend: _MemoryBackend | Any

        if not _redis:
            if cfg.fallback_to_memory:
                logger.warning(
                    "redis-py not installed; using in-memory fallback. "
                    "Install with: pip install redis"
                )
                self._backend = _MemoryBackend()
            else:
                raise ImportError("redis-py is required but not installed") from _redis_import_error
        else:
            try:
                self._backend = _redis.from_url(
                    cfg.redis_url,
                    password=cfg.redis_password,
                    decode_responses=True,
                )
                # Test connection
                self._backend.ping()
                logger.info("Blackboard connected to Redis at %s", cfg.redis_url)
            except Exception as e:
                if cfg.fallback_to_memory:
                    logger.warning("Redis connection failed (%s); falling back to in-memory", e)
                    self._backend = _MemoryBackend()
                else:
                    raise

    # ─────────────────────────────────────────────
    # Key-value operations
    # ─────────────────────────────────────────────

    def _prefixed(self, key: str) -> str:
        """Apply keyspace prefix to a key."""
        return f"{self.prefix}:{key}" if self.prefix else key

    def get(self, key: str) -> str | None:
        """Get a value from the blackboard."""
        return self._backend.get(self._prefixed(key))

    def set(self, key: str, value: str | dict, ttl: int | None = None) -> bool:
        """
        Set a value on the blackboard.

        Args:
            key: Key without prefix (prefix is added automatically)
            value: String or JSON-serializable dict
            ttl: Override default TTL (seconds); None = use default

        Returns:
            True on success
        """
        if isinstance(value, dict):
            value = json.dumps(value, sort_keys=True)
        elif not isinstance(value, str):
            value = str(value)

        expire = ttl if ttl is not None else self.ttl
        result = self._backend.set(self._prefixed(key), value, expire)
        return bool(result)

    def delete(self, key: str) -> bool:
        """Delete a key."""
        try:
            return bool(self._backend.delete(self._prefixed(key)))
        except AttributeError:
            # MemoryBackend
            k = self._prefixed(key)
            if k in self._backend._store:
                del self._backend._store[k]
                return True
            return False

    def keys(self, pattern: str = "*") -> list[str]:
        """List keys matching a pattern (without prefix)."""
        full_pattern = self._prefixed(pattern)
        raw_keys = self._backend.keys(full_pattern)
        # Strip prefix
        prefix_len = len(self.prefix) + 1 if self.prefix else 0
        return [k[prefix_len:] if k.startswith(f"{self.prefix}:") else k for k in raw_keys]

    def exists(self, key: str) -> bool:
        """Check if a key exists."""
        try:
            return bool(self._backend.exists(self._prefixed(key)))
        except AttributeError:
            # MemoryBackend
            return self._prefixed(key) in self._backend._store

    # ─────────────────────────────────────────────
    # Pub/sub operations
    # ─────────────────────────────────────────────

    def publish(self, channel: str, message: Any) -> int:
        """
        Publish a message to a channel.

        Args:
            channel: Channel name (without prefix)
            message: JSON-serializable object or string

        Returns:
            Number of subscribers that received the message
        """
        # For Redis, must send string/bytes. For MemoryBackend, pass object.
        if isinstance(self._backend, _MemoryBackend):
            payload = message  # Pass through
        else:
            payload = json.dumps(message, sort_keys=True) if not isinstance(message, str) else message

        return self._backend.publish(self._prefixed(channel), payload)

    def subscribe(
        self,
        channel: str,
        callback: Callable[[str, Any], None],
        *,
        block: bool = False,
        timeout: float | None = None,
    ) -> None:
        """
        Subscribe to a channel.

        Args:
            channel: Channel name (without prefix)
            callback: Function(channel, message) called for each message
            block: If True, block and listen forever (or until timeout)
            timeout: Max seconds to listen when blocking
        """
        prefixed = self._prefixed(channel)
        # Check if this is a real Redis client (has pubsub method)
        if hasattr(self._backend, 'pubsub') and callable(getattr(self._backend, 'pubsub', None)):
            # Real Redis pub/sub
            import threading
            pubsub = self._backend.pubsub()
            pubsub.subscribe(prefixed)

            def listener():
                for msg in pubsub.listen():
                    if msg['type'] == 'message':
                        try:
                            data = json.loads(msg['data'])
                        except (json.JSONDecodeError, TypeError):
                            data = msg['data']
                        callback(channel, data)

            if block:
                t = threading.Thread(target=listener, daemon=True)
                t.start()
                if timeout:
                    t.join(timeout)
                else:
                    t.join()
            else:
                # Fire-and-forget thread
                threading.Thread(target=listener, daemon=True).start()
        else:
            # MemoryBackend — synchronous callback registration
            self._backend.subscribe(prefixed, callback)

    def unsubscribe(self, channel: str, callback: Callable[[str, Any], None]) -> None:
        """Unsubscribe from a channel."""
        try:
            self._backend.unsubscribe(self._prefixed(channel), callback)
        except AttributeError:
            pass  # Real Redis client: unsubscribe lives on the PubSub object, not the client

    # ─────────────────────────────────────────────
    # Helpers
    # ─────────────────────────────────────────────

    def clear_namespace(self, pattern: str = "*") -> int:
        """Delete all keys matching pattern in this namespace."""
        full = self._prefixed(pattern)
        try:
            keys = self._backend.keys(full)
            if keys:
                return self._backend.delete(*keys)
            return 0
        except AttributeError:
            store_keys = list(self._backend._store.keys())
            import fnmatch
            matched = fnmatch.filter(store_keys, full)
            for k in matched:
                del self._backend._store[k]
            return len(matched)

    def __repr__(self) -> str:
        return f"<Blackboard prefix={self.prefix!r} backend={type(self._backend).__name__}>"


# ─────────────────────────────────────────────
# Convenience singleton for global use
# ─────────────────────────────────────────────

_default_blackboard: Blackboard | None = None


def get_blackboard(config: BlackboardConfig | None = None) -> Blackboard:
    """Get or create the global Blackboard singleton."""
    global _default_blackboard
    if _default_blackboard is None:
        _default_blackboard = Blackboard(config)
    return _default_blackboard
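End to end, the module above is used like this. The sketch assumes no local Redis is running so the in-memory fallback engages (the same assumption the smoke tests below make):

```python
# Minimal usage of src/timmy/blackboard.py via the in-memory fallback.
from src.timmy.blackboard import Blackboard, BlackboardConfig

bb = Blackboard(BlackboardConfig(fallback_to_memory=True))

seen = []
bb.subscribe("fleet:events", lambda ch, msg: seen.append(msg))
bb.publish("fleet:events", {"agent": "timmy", "state": "ready"})
assert seen == [{"agent": "timmy", "state": "ready"}]

bb.set("agent:timmy:thought", "checking queue...")
print(bb.get("agent:timmy:thought"))
```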
tests/test_blackboard.py (Normal file, 194 lines)

@@ -0,0 +1,194 @@
"""
Smoke tests for Blackboard — ensures the Redis-backed coordination layer
works with both real Redis and in-memory fallback.
"""

import json
import time

import pytest

from src.timmy.blackboard import Blackboard, BlackboardConfig, _MemoryBackend


class TestBlackboardBasics:
    """Test core key-value operations."""

    def test_kv_memory_backend(self):
        """KV operations work using in-memory backend."""
        bb = Blackboard(BlackboardConfig(fallback_to_memory=True, enabled=True))

        # Set and get
        assert bb.set("test:key", "hello") is True
        assert bb.get("test:key") == "hello"

        # Dict serialization
        assert bb.set("test:obj", {"a": 1, "b": 2}) is True
        val = bb.get("test:obj")
        assert json.loads(val) == {"a": 1, "b": 2}

        # Exists
        assert bb.exists("test:key") is True
        assert bb.exists("missing") is False

        # Delete
        assert bb.delete("test:key") is True
        assert bb.get("test:key") is None

        # Keys with prefix
        bb.set("agent:timmy:state", "ready")
        bb.set("agent:ezra:state", "idle")
        keys = bb.keys("agent:*:state")
        assert len(keys) == 2
        assert "timmy" in keys[0] or "ezra" in keys[0]

        # Clear namespace
        assert bb.clear_namespace("agent:*") == 2
        assert bb.keys("agent:*") == []


class TestBlackboardPubSub:
    """Test pub/sub coordination patterns."""

    def test_pubsub_memory_backend(self):
        """Publish/subscribe works using in-memory backend."""
        bb = Blackboard(BlackboardConfig(fallback_to_memory=True, enabled=True))

        received = []

        def callback(channel, message):
            received.append((channel, message))

        bb.subscribe("dispatch:new", callback)

        # Publish
        count = bb.publish("dispatch:new", {"issue": 123, "action": "comment"})
        assert count == 1
        assert len(received) == 1
        ch, msg = received[0]
        assert ch == "dispatch:new"
        assert msg == {"issue": 123, "action": "comment"}

        bb.unsubscribe("dispatch:new", callback)
        bb.publish("dispatch:new", {"should": "not arrive"})
        assert len(received) == 1  # no new messages

    def test_publish_without_subscribers(self):
        """Publish returns 0 when no subscribers."""
        bb = Blackboard(BlackboardConfig(fallback_to_memory=True, enabled=True))
        count = bb.publish("empty:channel", {"msg": 1})
        assert count == 0


class TestBlackboardConfig:
    """Test configuration parsing and validation."""

    def test_default_config(self):
        cfg = BlackboardConfig()
        assert cfg.enabled is True
        assert cfg.redis_url == "redis://localhost:6379/0"
        assert cfg.keyspace_prefix == "timmy"
        assert cfg.ttl_seconds == 3600
        assert cfg.fallback_to_memory is True

    def test_custom_config(self):
        cfg = BlackboardConfig(
            enabled=False,
            redis_url="redis://192.168.1.10:6379/1",
            keyspace_prefix="myagent",
            ttl_seconds=1800,
            fallback_to_memory=False,
        )
        assert cfg.enabled is False
        assert cfg.redis_url == "redis://192.168.1.10:6379/1"
        assert cfg.keyspace_prefix == "myagent"
        assert cfg.ttl_seconds == 1800
        assert cfg.fallback_to_memory is False


class TestKeyspacePrefix:
    """Test that keys are correctly prefixed."""

    def test_prefixed_keys(self):
        bb = Blackboard(BlackboardConfig(keyspace_prefix="myagent", fallback_to_memory=True))
        bb.set("thought", "test")
        # Internal key should be "myagent:thought"; keys() strips the
        # prefix on the way out, so we see the bare key back.
        keys = bb.keys("*")
        assert "thought" in keys


class TestBlackboardIntegration:
    """Integration pattern: agent thought cycle."""

    def test_agent_thought_cycle(self):
        """Simulate Timmy writing a thought and Ezra reading it."""
        bb = Blackboard(BlackboardConfig(fallback_to_memory=True, enabled=True))

        # Agent A writes observation
        bb.set("agent:timmy:observation", "Gitea queue has 12 open issues")

        # Agent B reads
        obs = bb.get("agent:timmy:observation")
        assert obs == "Gitea queue has 12 open issues"

        # Agent B writes analysis
        bb.set("agent:ezra:analysis", "Prioritize critical bugs first")

        # Event-driven pattern
        events = []

        def on_plan(channel, message):
            events.append(message)

        bb.subscribe("fleet:plan", on_plan)
        bb.publish("fleet:plan", {"phase": "triaging", "lead": "ezra"})

        assert len(events) == 1
        assert events[0]["phase"] == "triaging"


class TestTTL:
    """Test TTL handling (where supported)."""

    def test_ttl_set_in_config(self):
        cfg = BlackboardConfig(ttl_seconds=60, fallback_to_memory=True)
        bb = Blackboard(cfg)
        assert bb.ttl == 60
        # Setting a value uses TTL from config
        bb.set("temp:key", "expiring value")
        # In-memory backend ignores TTL, but value is set
        assert bb.get("temp:key") == "expiring value"


# ─────────────────────────────────────────────
# CLI smoke — can be called directly: python -m tests.test_blackboard
# ─────────────────────────────────────────────

if __name__ == "__main__":
    import sys

    print("Running Blackboard smoke tests...")

    suite = [
        TestBlackboardBasics().test_kv_memory_backend,
        TestBlackboardPubSub().test_pubsub_memory_backend,
        TestBlackboardConfig().test_default_config,
        TestBlackboardIntegration().test_agent_thought_cycle,
    ]

    failures = 0
    for test in suite:
        name = test.__name__
        try:
            test()
            print(f"  ✓ {name}")
        except AssertionError as e:
            print(f"  ✗ {name}: {e}")
            failures += 1
        except Exception as e:
            print(f"  ✗ {name}: ERROR — {e}")
            failures += 1

    print(f"\nRan {len(suite)} tests, {failures} failures")
    sys.exit(failures)
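To run the suite under pytest rather than the CLI smoke path (assumes pytest is installed):

```python
# Programmatic equivalent of `pytest tests/test_blackboard.py -q`.
import pytest

raise SystemExit(pytest.main(["tests/test_blackboard.py", "-q"]))
```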