feat: complete Event Log, Ledger, Memory, Cascade Router, Upgrade Queue, Activity Feed

This commit implements six major features:

1. Event Log System (src/swarm/event_log.py)
   - SQLite-based audit trail for all swarm events
   - Task lifecycle tracking (created, assigned, completed, failed)
   - Agent lifecycle tracking (joined, left, status changes)
   - Integrated with coordinator for automatic logging
   - Dashboard page at /swarm/events

2. Lightning Ledger (src/lightning/ledger.py)
   - Transaction tracking for Lightning Network payments
   - Balance calculations (incoming, outgoing, net, available)
   - Integrated with payment_handler for automatic logging
   - Dashboard page at /lightning/ledger

3. Semantic Memory / Vector Store (src/memory/vector_store.py)
   - Embedding-based similarity search for Echo agent
   - Fallback to keyword matching if sentence-transformers unavailable
   - Personal facts storage and retrieval
   - Dashboard page at /memory

4. Cascade Router Integration (src/timmy/cascade_adapter.py)
   - Automatic LLM failover between providers (Ollama → AirLLM → API)
   - Circuit breaker pattern for failing providers
   - Metrics tracking per provider (latency, error rates)
   - Dashboard status page at /router/status

5. Self-Upgrade Approval Queue (src/upgrades/)
   - State machine for self-modifications: proposed → approved/rejected → applied/failed
   - Human approval required before applying changes
   - Git integration for branch management
   - Dashboard queue at /self-modify/queue

6. Real-Time Activity Feed (src/events/broadcaster.py)
   - WebSocket-based live activity streaming
   - Bridges event_log to dashboard clients
   - Activity panel on /swarm/live

Tests:
- 101 unit tests passing
- 4 new E2E test files for Selenium testing
- Run with: SELENIUM_UI=1 pytest tests/functional/ -v --headed

Documentation:
- 6 ADRs (017-022) documenting architecture decisions
- Implementation summary in docs/IMPLEMENTATION_SUMMARY.md
- Architecture diagram in docs/architecture-v2.md
Author: Alexander Payne
Date: 2026-02-26 08:01:01 -05:00
Parent: 8d85f95ee5
Commit: d8d976aa60
41 changed files with 6735 additions and 254 deletions

# ADR 017: Event Logging System
## Status
Accepted
## Context
The swarm system needed a way to audit all agent actions, task lifecycle events, and system events. Without centralized logging, debugging failures and understanding system behavior required grepping through application logs.
## Decision
Implement a centralized event logging system in SQLite (`event_log` table) that captures all significant events with structured data.
## Event Types
| Type | Description |
|------|-------------|
| `task.created` | New task posted |
| `task.bidding` | Task opened for bidding |
| `task.assigned` | Task assigned to agent |
| `task.started` | Agent started working |
| `task.completed` | Task finished successfully |
| `task.failed` | Task failed |
| `agent.joined` | New agent registered |
| `agent.left` | Agent deregistered |
| `bid.submitted` | Agent submitted bid |
| `tool.called` | Tool execution started |
| `tool.completed` | Tool execution finished |
| `system.error` | System error occurred |
## Schema
```sql
CREATE TABLE event_log (
    id TEXT PRIMARY KEY,
    event_type TEXT NOT NULL,
    source TEXT NOT NULL,
    task_id TEXT,
    agent_id TEXT,
    data TEXT, -- JSON
    timestamp TEXT NOT NULL
);
```
## Usage
```python
from swarm.event_log import (
    log_event,
    EventType,
    get_task_events,
    get_event_summary,
)

# Log an event
log_event(
    event_type=EventType.TASK_ASSIGNED,
    source="coordinator",
    task_id=task.id,
    agent_id=winner.agent_id,
    data={"bid_sats": winner.bid_sats},
)

# Query events
events = get_task_events(task_id)
summary = get_event_summary(minutes=60)
```
## Integration
The coordinator automatically logs:
- Task creation, assignment, completion, failure
- Agent join/leave events
- System warnings and errors
## Consequences
- **Positive**: Complete audit trail, easy debugging, analytics support
- **Negative**: Additional database writes, storage growth over time
## Mitigations
- `prune_events()` function removes events older than N days
- Indexes on `task_id`, `agent_id`, and `timestamp` for fast queries
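The pruning helper could look like the following minimal sketch, assuming the `event_log` schema above; the real `prune_events()` signature (here taking an explicit database path and age limit) is an assumption:

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def prune_events(db_path: str, max_age_days: int = 30) -> int:
    """Delete events older than max_age_days; return the number removed.

    ISO-8601 timestamps compare correctly as strings, so a plain
    less-than on the `timestamp` column is enough.
    """
    cutoff = (datetime.now(timezone.utc) - timedelta(days=max_age_days)).isoformat()
    with sqlite3.connect(db_path) as conn:
        cur = conn.execute("DELETE FROM event_log WHERE timestamp < ?", (cutoff,))
        return cur.rowcount
```

With the `timestamp` index in place, the delete stays cheap even as the table grows.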

# ADR 018: Lightning Network Transaction Ledger
## Status
Accepted
## Context
The system needed to track all Lightning Network payments (incoming and outgoing) for accounting, dashboard display, and audit purposes. The existing payment handler created invoices but didn't persist transaction history.
## Decision
Implement a SQLite-based ledger (`ledger` table) that tracks all Lightning transactions with their lifecycle status.
## Transaction Types
| Type | Description |
|------|-------------|
| `incoming` | Invoice created (we're receiving payment) |
| `outgoing` | Payment sent (we're paying someone) |
## Transaction Status
| Status | Description |
|--------|-------------|
| `pending` | Awaiting settlement |
| `settled` | Payment completed |
| `failed` | Payment failed |
| `expired` | Invoice expired |
## Schema
```sql
CREATE TABLE ledger (
    id TEXT PRIMARY KEY,
    tx_type TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    payment_hash TEXT UNIQUE NOT NULL,
    amount_sats INTEGER NOT NULL,
    memo TEXT,
    invoice TEXT,
    preimage TEXT,
    source TEXT NOT NULL,
    task_id TEXT,
    agent_id TEXT,
    created_at TEXT NOT NULL,
    settled_at TEXT,
    fee_sats INTEGER DEFAULT 0
);
```
## Usage
```python
from lightning.ledger import (
    create_invoice_entry,
    mark_settled,
    get_balance,
)

# Create invoice record
entry = create_invoice_entry(
    payment_hash=invoice.payment_hash,
    amount_sats=1000,
    memo="API access",
    source="payment_handler",
    task_id=task.id,
)

# Mark as paid
mark_settled(payment_hash, preimage="secret")

# Get balance
balance = get_balance()
print(f"Net: {balance['net_sats']} sats")
```
## Integration
The `PaymentHandler` automatically:
- Creates ledger entries when invoices are created
- Updates status when payments are checked/settled
- Tracks fees for outgoing payments
## Balance Calculation
```python
{
"incoming_total_sats": total_received,
"outgoing_total_sats": total_sent,
"fees_paid_sats": total_fees,
"net_sats": incoming - outgoing - fees,
"pending_incoming_sats": pending_received,
"pending_outgoing_sats": pending_sent,
"available_sats": net - pending_outgoing,
}
```
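Given the `ledger` schema above, the balance can be computed with a few SUM queries. This is a sketch, not the module's actual implementation (the ADR's `get_balance()` takes no connection argument; the explicit-connection style here is for illustration):

```python
import sqlite3


def get_balance(conn: sqlite3.Connection) -> dict:
    """Aggregate the ledger into the balance fields described above."""

    def total(tx_type: str, status: str) -> int:
        # COALESCE turns the no-rows case into 0 instead of NULL
        row = conn.execute(
            "SELECT COALESCE(SUM(amount_sats), 0) FROM ledger "
            "WHERE tx_type = ? AND status = ?",
            (tx_type, status),
        ).fetchone()
        return row[0]

    fees = conn.execute(
        "SELECT COALESCE(SUM(fee_sats), 0) FROM ledger "
        "WHERE tx_type = 'outgoing' AND status = 'settled'"
    ).fetchone()[0]

    incoming = total("incoming", "settled")
    outgoing = total("outgoing", "settled")
    pending_in = total("incoming", "pending")
    pending_out = total("outgoing", "pending")
    net = incoming - outgoing - fees
    return {
        "incoming_total_sats": incoming,
        "outgoing_total_sats": outgoing,
        "fees_paid_sats": fees,
        "net_sats": net,
        "pending_incoming_sats": pending_in,
        "pending_outgoing_sats": pending_out,
        "available_sats": net - pending_out,
    }
```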
## Consequences
- **Positive**: Complete payment history, balance tracking, audit trail
- **Negative**: Additional DB writes, must keep in sync with actual Lightning node
## Future Work
- Reconciliation job to sync with LND node
- Export to accounting formats (CSV, QIF)

# ADR 019: Semantic Memory (Vector Store)
## Status
Accepted
## Context
The Echo agent needed the ability to remember conversations, facts, and context across sessions. Simple keyword search was insufficient for finding relevant historical context.
## Decision
Implement a vector-based semantic memory store using SQLite with optional sentence-transformers embeddings.
## Context Types
| Type | Description |
|------|-------------|
| `conversation` | User/agent dialogue |
| `fact` | Extracted facts about user/system |
| `document` | Uploaded documents |
## Schema
```sql
CREATE TABLE memory_entries (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT NOT NULL,
    context_type TEXT NOT NULL DEFAULT 'conversation',
    agent_id TEXT,
    task_id TEXT,
    session_id TEXT,
    metadata TEXT, -- JSON
    embedding TEXT, -- JSON array of floats
    timestamp TEXT NOT NULL
);
```
## Embedding Strategy
**Primary**: sentence-transformers `all-MiniLM-L6-v2` (384 dimensions)
- High quality semantic similarity
- Local execution (no cloud)
- ~80MB model download
**Fallback**: Character n-gram hash embedding
- No external dependencies
- Lower quality but functional
- Enables system to work without heavy ML deps
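One dependency-free way to realize the character n-gram hash fallback: hash each trigram into a fixed number of buckets and L2-normalize the counts. Dimension count, n-gram size, and hash choice below are illustrative assumptions, not the project's actual parameters:

```python
import hashlib
import math


def ngram_hash_embedding(text: str, dims: int = 128, n: int = 3) -> list:
    """Hash character n-grams into a fixed-size vector, then L2-normalize.

    Related strings share n-grams, so they accumulate mass in the same
    buckets; lower quality than learned embeddings, but zero dependencies.
    """
    vec = [0.0] * dims
    padded = f" {text.lower()} "  # pad so word boundaries form n-grams too
    for i in range(len(padded) - n + 1):
        gram = padded[i : i + n]
        bucket = int(hashlib.md5(gram.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list, b: list) -> float:
    # Vectors are already unit-length, so the dot product is the cosine
    return sum(x * y for x, y in zip(a, b))
```

Because both vectors are unit-normalized, cosine similarity reduces to a dot product, matching the scoring section below.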
## Usage
```python
from memory.vector_store import (
    store_memory,
    search_memories,
    get_memory_context,
)

# Store a memory
store_memory(
    content="User prefers dark mode",
    source="user",
    context_type="fact",
    agent_id="echo",
)

# Search for relevant context
results = search_memories(
    query="user preferences",
    agent_id="echo",
    limit=5,
)

# Get formatted context for LLM
context = get_memory_context(
    query="what does user like?",
    max_tokens=1000,
)
```
## Integration Points
### Echo Agent
Echo should store all conversations and retrieve relevant context when answering questions about "what we discussed" or "what we know".
### Task Context
Task handlers can query for similar past tasks:
```python
similar = search_memories(
    query=task.description,
    context_type="conversation",
    limit=3,
)
```
## Similarity Scoring
**Cosine Similarity** (when embeddings available):
```python
score = dot(a, b) / (norm(a) * norm(b)) # -1 to 1
```
**Keyword Overlap** (fallback):
```python
score = len(query_words & content_words) / len(query_words)
```
## Consequences
- **Positive**: Semantic search finds related content even without keyword matches
- **Negative**: Embedding computation adds latency (~10-100ms per query)
- **Mitigation**: Background embedding computation, caching
## Future Work
- sqlite-vss extension for vector similarity index
- Memory compression for long-term storage
- Automatic fact extraction from conversations

# ADR 020: Cascade Router Integration with Timmy Agent
## Status
Proposed
## Context
Currently, the Timmy agent (`src/timmy/agent.py`) uses `src/timmy/backends.py` which provides a simple abstraction over Ollama and AirLLM. However, this lacks:
- Automatic failover between multiple LLM providers
- Circuit breaker pattern for failing providers
- Cost and latency tracking per provider
- Priority-based routing (local first, then APIs)
The Cascade Router (`src/router/cascade.py`) already implements these features but is not integrated with Timmy.
## Decision
Integrate the Cascade Router as the primary LLM routing layer for Timmy, replacing the direct backend abstraction.
## Architecture
### Current Flow
```
User Request → Timmy Agent → backends.py → Ollama/AirLLM
```
### Proposed Flow
```
User Request → Timmy Agent → Cascade Router → Provider 1 (Ollama)
                                                   ↓ (if fail)
                                              Provider 2 (Local AirLLM)
                                                   ↓ (if fail)
                                              Provider 3 (API - optional)

Metrics are tracked per provider at each hop.
```
### Integration Points
1. **Timmy Agent** (`src/timmy/agent.py`)
- Replace `create_timmy()` backend initialization
- Use `CascadeRouter.complete()` instead of direct `agent.run()`
- Expose provider status in agent responses
2. **Cascade Router** (`src/router/cascade.py`)
- Already supports: Ollama, OpenAI, Anthropic, AirLLM
- Already has: Circuit breakers, metrics, failover logic
- Add: Integration with existing `src/timmy/prompts.py`
3. **Configuration** (`config.yaml` or `config.py`)
- Provider list with priorities
- API keys (optional, for cloud fallback)
- Circuit breaker thresholds
4. **Dashboard** (new route)
- `/router/status` - Show provider health, metrics, recent failures
- Real-time provider status indicator
### Provider Priority Order
1. **Ollama (local)** - Priority 1, always try first
2. **AirLLM (local)** - Priority 2, if Ollama unavailable
3. **API providers** - Priority 3+, only if configured
### Data Flow
```python
# Timmy Agent
async def respond(self, message: str) -> str:
    # Get cascade router
    router = get_cascade_router()

    # Route through cascade with automatic failover
    response = await router.complete(
        messages=[{"role": "user", "content": message}],
        system_prompt=TIMMY_SYSTEM_PROMPT,
    )

    # Response includes which provider was used
    return response.content
```
## Schema Additions
### Provider Status Table (new)
```sql
CREATE TABLE provider_metrics (
    provider_name TEXT PRIMARY KEY,
    total_requests INTEGER DEFAULT 0,
    successful_requests INTEGER DEFAULT 0,
    failed_requests INTEGER DEFAULT 0,
    avg_latency_ms REAL DEFAULT 0,
    last_error_time TEXT,
    circuit_state TEXT DEFAULT 'closed',
    updated_at TEXT
);
```
## Consequences
### Positive
- Automatic failover improves reliability
- Metrics enable data-driven provider selection
- Circuit breakers prevent cascade failures
- Configurable without code changes
### Negative
- Additional complexity in request path
- Potential latency increase from retries
- Requires careful circuit breaker tuning
### Mitigations
- Circuit breakers have short recovery timeouts (60s)
- Metrics exposed for monitoring
- Fallback to mock responses if all providers fail
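The breaker behavior described above (open after repeated failures, probe again after the 60s recovery timeout) can be sketched as a small per-provider state object. Class and method names here are illustrative, not the Cascade Router's actual API:

```python
import time
from typing import Optional


class CircuitBreaker:
    """Per-provider breaker: opens after N consecutive failures,
    half-opens after a recovery timeout (illustrative sketch)."""

    def __init__(self, failure_threshold: int = 3, recovery_timeout: float = 60.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failures = 0
        self.opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if time.monotonic() - self.opened_at >= self.recovery_timeout:
            return "half-open"  # allow one probe request through
        return "open"

    def allow_request(self) -> bool:
        return self.state != "open"

    def record_success(self) -> None:
        # Any success fully closes the breaker
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()
```

The router would consult `allow_request()` before trying a provider and fall through to the next priority when the breaker is open.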
## Implementation Plan
1. Create `src/timmy/cascade_adapter.py` - Adapter between Timmy and Cascade Router
2. Modify `src/timmy/agent.py` - Use adapter instead of direct backends
3. Create dashboard route `/router/status` - Provider health UI
4. Add provider metrics persistence to SQLite
5. Write tests for failover scenarios
## Dependencies
- Existing `src/router/cascade.py`
- Existing `src/timmy/agent.py`
- New dashboard route

# ADR 021: Self-Upgrade Approval Queue
## Status
Proposed
## Context
The self-modification system (`src/self_modify/loop.py`) can generate code changes autonomously. However, it currently either:
- Applies changes immediately (risky)
- Requires manual git review (slow)
We need an approval queue where changes are staged for human review before application.
## Decision
Implement a dashboard-based approval queue for self-modifications with the following states:
`proposed` → `approved` | `rejected` | `expired`; `approved` → `applied` | `failed`
## Architecture
### State Machine
```
                ┌─────────────┐
                │  PROPOSED   │
                └──────┬──────┘
       ┌───────────────┼───────────────┐
       │               │               │
       ▼               ▼               ▼
┌────────────┐  ┌────────────┐  ┌────────────┐
│  APPROVED  │  │  REJECTED  │  │  EXPIRED   │
└──────┬─────┘  └────────────┘  └────────────┘
       │
     ┌─┴─────────┐
     │           │
     ▼           ▼
┌─────────┐ ┌─────────┐
│ APPLIED │ │ FAILED  │
└─────────┘ └─────────┘
```
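The legal transitions in the state machine above can be enforced with a small lookup table; the `transition()` helper is a sketch, not the queue's actual API:

```python
# Legal transitions in the upgrade state machine (terminal states map to empty sets)
TRANSITIONS = {
    "proposed": {"approved", "rejected", "expired"},
    "approved": {"applied", "failed"},
    "rejected": set(),
    "expired": set(),
    "applied": set(),
    "failed": set(),
}


def transition(current: str, target: str) -> str:
    """Validate a state change; raise on anything the diagram forbids."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current!r} -> {target!r}")
    return target
```

Keeping the table next to the queue logic means an approved-but-never-applied upgrade can never silently jump to a terminal state.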
### Components
1. **Database Table** (`upgrades` table)
```sql
CREATE TABLE upgrades (
    id TEXT PRIMARY KEY,
    status TEXT NOT NULL,      -- proposed, approved, rejected, applied, failed
    proposed_at TEXT NOT NULL,
    approved_at TEXT,
    applied_at TEXT,
    rejected_at TEXT,
    branch_name TEXT NOT NULL,
    description TEXT NOT NULL,
    files_changed TEXT,        -- JSON array
    diff_preview TEXT,         -- Short diff for review
    test_results TEXT,         -- JSON: {passed: bool, output: str}
    error_message TEXT,
    approved_by TEXT           -- For audit
);
```
2. **Self-Modify Loop** (`src/self_modify/loop.py`)
- On change proposal: Create `proposed` entry, stop
- On approval: Checkout branch, apply changes, run tests, commit
- On rejection: Cleanup branch, mark `rejected`
3. **Dashboard UI** (`/self-modify/queue`)
- List all proposed changes
- Show diff preview
- Approve/Reject buttons
- Show test results
- History of past upgrades
4. **API Endpoints**
- `GET /self-modify/queue` - List pending upgrades
- `POST /self-modify/queue/{id}/approve` - Approve upgrade
- `POST /self-modify/queue/{id}/reject` - Reject upgrade
- `GET /self-modify/queue/{id}/diff` - View full diff
### Integration Points
**Existing: Self-Modify Loop**
- Currently: Proposes change → applies immediately (or fails)
- New: Proposes change → creates DB entry → waits for approval
**Existing: Dashboard**
- New page: Upgrade Queue
- New nav item: "UPGRADES" with badge showing pending count
**Existing: Event Log**
- Logs: `upgrade.proposed`, `upgrade.approved`, `upgrade.applied`, `upgrade.failed`
### Security Considerations
1. **Approval Authentication** - Consider requiring password/PIN for approval
2. **Diff Size Limits** - Reject diffs >10k lines (prevents DoS)
3. **Test Requirement** - Must pass tests before applying
4. **Rollback** - Keep previous commit SHA for rollback
### Approval Flow
```python
# 1. System proposes upgrade
upgrade = UpgradeQueue.propose(
    description="Fix bug in task assignment",
    branch_name="self-modify/fix-task-001",
    files_changed=["src/swarm/coordinator.py"],
    diff_preview="@@ -123,7 +123,7 @@...",
)
# Status: PROPOSED

# 2. Human reviews in dashboard
#    - Views diff
#    - Sees test results (auto-run on propose)
#    - Clicks APPROVE or REJECT

# 3. If approved
upgrade.apply()   # Status: APPLIED or FAILED

# 4. If rejected
upgrade.reject()  # Status: REJECTED, branch deleted
```
## UI Design
### Upgrade Queue Page (`/self-modify/queue`)
```
┌─────────────────────────────────────────┐
│ PENDING UPGRADES (2)                    │
├─────────────────────────────────────────┤
│                                         │
│ Fix bug in task assignment       [VIEW] │
│   Branch: self-modify/fix-task-001      │
│   Files: coordinator.py                 │
│   Tests: ✓ Passed                       │
│   [APPROVE]  [REJECT]                   │
│                                         │
│ Add memory search feature        [VIEW] │
│   Branch: self-modify/memory-002        │
│   Files: memory/vector_store.py         │
│   Tests: ✗ Failed (1 error)             │
│   [APPROVE]  [REJECT]                   │
│                                         │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ UPGRADE HISTORY                         │
├─────────────────────────────────────────┤
│ ✓ Fix auth bug      APPLIED    2h ago   │
│ ✗ Add new route     FAILED     5h ago   │
│ ✗ Change config     REJECTED   1d ago   │
└─────────────────────────────────────────┘
```
## Consequences
### Positive
- Human oversight prevents bad changes
- Audit trail of all modifications
- Test-before-apply prevents broken states
- Rejection is clean (no lingering branches)
### Negative
- Adds friction to self-modification
- Requires human availability for urgent fixes
- Database storage for upgrade history
### Mitigations
- Auto-approve after 24h for low-risk changes (configurable)
- Urgent changes can bypass queue (with logging)
- Prune old history after 90 days
## Implementation Plan
1. Create `src/upgrades/models.py` - Database schema and ORM
2. Create `src/upgrades/queue.py` - Queue management logic
3. Modify `src/self_modify/loop.py` - Integrate with queue
4. Create dashboard routes - UI for approval
5. Create templates - Queue page, diff view
6. Add event logging for upgrades
7. Write tests for full workflow
## Dependencies
- Existing `src/self_modify/loop.py`
- New database table `upgrades`
- Existing Event Log system

# ADR 022: Real-Time Activity Feed
## Status
Proposed
## Context
The dashboard currently shows static snapshots of swarm state. Users must refresh to see:
- New tasks being created
- Agents joining/leaving
- Bids being submitted
- Tasks being completed
This creates a poor UX for monitoring the swarm in real-time.
## Decision
Implement a WebSocket-based real-time activity feed that streams events from the Event Log to connected dashboard clients.
## Architecture
### Data Flow
```
Coordinator Event → Event Log (SQLite)
                         ↓
                 WebSocket Broadcast
                         ↓
        Dashboard Clients (via ws_manager)
```
### Components
1. **Event Source** (`src/swarm/coordinator.py`)
- Already emits events via `log_event()`
- Events are persisted to SQLite
2. **WebSocket Bridge** (`src/ws_manager/handler.py`)
- Already exists for agent status
- Extend to broadcast events
3. **Event Broadcaster** (`src/events/broadcaster.py` - NEW)
```python
class EventBroadcaster:
    """Bridges event_log → WebSocket."""

    async def on_event_logged(self, event: EventLogEntry):
        """Called when new event is logged."""
        await ws_manager.broadcast_event({
            "type": event.event_type.value,
            "source": event.source,
            "task_id": event.task_id,
            "agent_id": event.agent_id,
            "timestamp": event.timestamp,
            "data": event.data,
        })
```
4. **Dashboard UI** (`/swarm/live` - enhanced)
- Already exists at `/swarm/live`
- Add activity feed panel
- Connect to WebSocket
- Show real-time events
5. **Mobile Support**
- Same WebSocket for mobile view
- Simplified activity list
### Event Types to Broadcast
| Event Type | Display As | Icon |
|------------|------------|------|
| `task.created` | "New task: {description}" | 📝 |
| `task.assigned` | "Task assigned to {agent}" | 👤 |
| `task.completed` | "Task completed" | ✓ |
| `agent.joined` | "Agent {name} joined" | 🟢 |
| `agent.left` | "Agent {name} left" | 🔴 |
| `bid.submitted` | "Bid: {amount}sats from {agent}" | 💰 |
| `tool.called` | "Tool: {tool_name}" | 🔧 |
| `system.error` | "Error: {message}" | ⚠️ |
### WebSocket Protocol
```json
// Client connects
{"action": "subscribe", "channel": "events"}

// Server broadcasts
{
    "type": "event",
    "payload": {
        "event_type": "task.assigned",
        "source": "coordinator",
        "task_id": "task-123",
        "agent_id": "agent-456",
        "timestamp": "2024-01-15T10:30:00Z",
        "data": {"bid_sats": 100}
    }
}
```
### UI Design: Activity Feed Panel
```
┌─────────────────────────────────────────┐
│ LIVE ACTIVITY                       [🔴] │
├─────────────────────────────────────────┤
│ 📝 New task: Write Python function       │
│    10:30:01                             │
│ 💰 Bid: 50sats from forge                │
│    10:30:02                             │
│ 👤 Task assigned to forge                │
│    10:30:07                             │
│ ✓ Task completed                        │
│    10:30:15                             │
│ 🟢 Agent Echo joined                     │
│    10:31:00                             │
│                                         │
│            [Show All Events]            │
└─────────────────────────────────────────┘
```
### Integration with Existing Systems
**Existing: Event Log** (`src/swarm/event_log.py`)
- Hook into `log_event()` to trigger broadcasts
- Use SQLite `AFTER INSERT` trigger or Python callback
**Existing: WebSocket Manager** (`src/ws_manager/handler.py`)
- Add `broadcast_event()` method
- Handle client subscriptions
**Existing: Coordinator** (`src/swarm/coordinator.py`)
- Already calls `log_event()` for all lifecycle events
- No changes needed
**Existing: Swarm Live Page** (`/swarm/live`)
- Enhance with activity feed panel
- WebSocket client connection
### Technical Design
#### Option A: Direct Callback (Chosen)
Modify `log_event()` to call broadcaster directly.
**Pros:** Simple, immediate delivery
**Cons:** Tight coupling
```python
# In event_log.py
def log_event(...):
    # ... store in DB ...
    # Broadcast to WebSocket clients
    asyncio.create_task(_broadcast_event(event))
```
#### Option B: SQLite Trigger + Poll
Use SQLite trigger to mark new events, poll from broadcaster.
**Pros:** Decoupled, survives restarts
**Cons:** Latency from polling
#### Option C: Event Bus
Use existing `src/events/bus.py` to publish/subscribe.
**Pros:** Decoupled, flexible
**Cons:** Additional complexity
**Decision:** Option A for simplicity, with Option C as future refactoring.
### Performance Considerations
- **Rate Limiting:** Max 10 events/second to clients
- **Buffering:** If client disconnected, buffer last 100 events
- **Filtering:** Clients can filter by event type
- **Deduplication:** WebSocket manager handles client dedup
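The 100-event catch-up buffer might be as simple as a bounded deque; this is a sketch under the assumption that broadcast events carry ISO-8601 timestamps as shown in the protocol example (class and method names are illustrative):

```python
from collections import deque


class EventBuffer:
    """Keep the last N broadcast events so a reconnecting client can
    catch up (sketch; the 100-event limit comes from the ADR above)."""

    def __init__(self, maxlen: int = 100):
        # deque with maxlen silently drops the oldest entries
        self._events = deque(maxlen=maxlen)

    def append(self, event: dict) -> None:
        self._events.append(event)

    def replay_since(self, last_timestamp: str) -> list:
        # ISO-8601 timestamps compare correctly as plain strings
        return [e for e in self._events if e["timestamp"] > last_timestamp]
```

On reconnect, a client would send its last seen timestamp and receive only the events it missed, up to the buffer limit.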
### Security
- Only authenticated dashboard users receive events
- Sanitize event data (no secrets in logs)
- Rate limit connections per IP
## Consequences
### Positive
- Real-time visibility into swarm activity
- Better UX for monitoring
- Uses existing infrastructure (Event Log, WebSocket)
### Negative
- Increased server load from WebSocket connections
- Event data must be carefully sanitized
- More complex client-side state management
### Mitigations
- Event throttling
- Connection limits
- Graceful degradation to polling
## Implementation Plan
1. **Create EventBroadcaster** - Bridge event_log → ws_manager
2. **Extend ws_manager** - Add `broadcast_event()` method
3. **Modify event_log.py** - Hook in broadcaster
4. **Enhance /swarm/live** - Add activity feed panel with WebSocket
5. **Create EventFeed component** - Reusable HTMX + WebSocket widget
6. **Write tests** - E2E tests for real-time updates
## Dependencies
- Existing `src/swarm/event_log.py`
- Existing `src/ws_manager/handler.py`
- Existing `/swarm/live` page
- HTMX WebSocket extension (already loaded)