This repository has been archived on 2026-03-24.
Timmy-time-dashboard/docs/adr/017-event-logging.md
Alexander Payne d8d976aa60 feat: complete Event Log, Ledger, Memory, Cascade Router, Upgrade Queue, Activity Feed
This commit implements six major features:

1. Event Log System (src/swarm/event_log.py)
   - SQLite-based audit trail for all swarm events
   - Task lifecycle tracking (created, assigned, completed, failed)
   - Agent lifecycle tracking (joined, left, status changes)
   - Integrated with coordinator for automatic logging
   - Dashboard page at /swarm/events

2. Lightning Ledger (src/lightning/ledger.py)
   - Transaction tracking for Lightning Network payments
   - Balance calculations (incoming, outgoing, net, available)
   - Integrated with payment_handler for automatic logging
   - Dashboard page at /lightning/ledger

3. Semantic Memory / Vector Store (src/memory/vector_store.py)
   - Embedding-based similarity search for Echo agent
   - Fallback to keyword matching if sentence-transformers unavailable
   - Personal facts storage and retrieval
   - Dashboard page at /memory

4. Cascade Router Integration (src/timmy/cascade_adapter.py)
   - Automatic LLM failover between providers (Ollama → AirLLM → API)
   - Circuit breaker pattern for failing providers
   - Metrics tracking per provider (latency, error rates)
   - Dashboard status page at /router/status

5. Self-Upgrade Approval Queue (src/upgrades/)
   - State machine for self-modifications: proposed → approved/rejected → applied/failed
   - Human approval required before applying changes
   - Git integration for branch management
   - Dashboard queue at /self-modify/queue

6. Real-Time Activity Feed (src/events/broadcaster.py)
   - WebSocket-based live activity streaming
   - Bridges event_log to dashboard clients
   - Activity panel on /swarm/live

Tests:
- 101 unit tests passing
- 4 new E2E test files for Selenium testing
- Run with: SELENIUM_UI=1 pytest tests/functional/ -v --headed

Documentation:
- 6 ADRs (017-022) documenting architecture decisions
- Implementation summary in docs/IMPLEMENTATION_SUMMARY.md
- Architecture diagram in docs/architecture-v2.md
2026-02-26 08:01:01 -05:00

# ADR 017: Event Logging System
## Status
Accepted
## Context
The swarm system needed a way to audit all agent actions, task lifecycle events, and system events. Without centralized logging, debugging failures and understanding system behavior required grepping through application logs.
## Decision
Implement a centralized event logging system in SQLite (`event_log` table) that captures all significant events with structured data.
## Event Types
| Type | Description |
|------|-------------|
| `task.created` | New task posted |
| `task.bidding` | Task opened for bidding |
| `task.assigned` | Task assigned to agent |
| `task.started` | Agent started working |
| `task.completed` | Task finished successfully |
| `task.failed` | Task failed |
| `agent.joined` | New agent registered |
| `agent.left` | Agent deregistered |
| `bid.submitted` | Agent submitted bid |
| `tool.called` | Tool execution started |
| `tool.completed` | Tool execution finished |
| `system.error` | System error occurred |
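
The usage example in this ADR refers to `EventType.TASK_ASSIGNED`, which suggests the types above are exposed as an enum. The following is a minimal sketch of what such an enum could look like, assuming member names are the upper-cased, underscore-separated form of each type string. The `category` helper is hypothetical and only illustrates the `prefix.action` naming scheme; the actual definition in `src/swarm/event_log.py` may differ.

```python
from enum import Enum


class EventType(str, Enum):
    """Illustrative enum mirroring the event types table (names are assumed)."""

    TASK_CREATED = "task.created"
    TASK_BIDDING = "task.bidding"
    TASK_ASSIGNED = "task.assigned"
    TASK_STARTED = "task.started"
    TASK_COMPLETED = "task.completed"
    TASK_FAILED = "task.failed"
    AGENT_JOINED = "agent.joined"
    AGENT_LEFT = "agent.left"
    BID_SUBMITTED = "bid.submitted"
    TOOL_CALLED = "tool.called"
    TOOL_COMPLETED = "tool.completed"
    SYSTEM_ERROR = "system.error"


def category(event_type: EventType) -> str:
    """Hypothetical helper: return the prefix before the dot, e.g. 'task'."""
    return event_type.value.split(".", 1)[0]


# Look up a member from its stored string value, then read its category.
stored = EventType("task.assigned")
stored_category = category(stored)
```

Mixing in `str` keeps the members directly usable as the `event_type` column value without an explicit conversion step.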
## Schema
```sql
CREATE TABLE event_log (
    id         TEXT PRIMARY KEY,
    event_type TEXT NOT NULL,
    source     TEXT NOT NULL,
    task_id    TEXT,
    agent_id   TEXT,
    data       TEXT,  -- JSON
    timestamp  TEXT NOT NULL
);
```
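
To show how a row fits this schema, here is a rough sketch that writes and reads one event with Python's standard `sqlite3` module against an in-memory database. The ADR does not say how `id` and `timestamp` are generated, so the UUID string and the ISO-8601 UTC timestamp are assumptions, and `insert_event` is a hypothetical helper rather than the project's `log_event`.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS event_log (
    id TEXT PRIMARY KEY,
    event_type TEXT NOT NULL,
    source TEXT NOT NULL,
    task_id TEXT,
    agent_id TEXT,
    data TEXT,
    timestamp TEXT NOT NULL
)
"""


def insert_event(conn, event_type, source, task_id=None, agent_id=None, data=None):
    """Hypothetical helper: insert one row and return its generated id."""
    event_id = str(uuid.uuid4())  # assumption: ids are UUID strings
    conn.execute(
        "INSERT INTO event_log VALUES (?, ?, ?, ?, ?, ?, ?)",
        (
            event_id,
            event_type,
            source,
            task_id,
            agent_id,
            # The data column holds JSON text, or NULL when no payload is given.
            json.dumps(data) if data is not None else None,
            # Assumption: timestamps are stored as ISO-8601 strings in UTC.
            datetime.now(timezone.utc).isoformat(timespec="microseconds"),
        ),
    )
    conn.commit()
    return event_id


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
event_id = insert_event(
    conn, "task.assigned", "coordinator",
    task_id="t1", agent_id="a1", data={"bid_sats": 42},
)
row = conn.execute(
    "SELECT event_type, data FROM event_log WHERE id = ?", (event_id,)
).fetchone()
payload = json.loads(row[1])  # decode the JSON payload back into a dict
```

Storing `data` as JSON text keeps the table schema fixed while letting each event type carry its own fields.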
## Usage
```python
from swarm.event_log import EventType, get_event_summary, get_task_events, log_event

# Log an event (`task` and `winner` are assumed to be in scope)
log_event(
    event_type=EventType.TASK_ASSIGNED,
    source="coordinator",
    task_id=task.id,
    agent_id=winner.agent_id,
    data={"bid_sats": winner.bid_sats},
)

# Query all events for one task, then a summary of the last 60 minutes
events = get_task_events(task.id)
summary = get_event_summary(minutes=60)
```
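
The ADR does not specify what `get_event_summary` returns. One plausible reading is a count of events per type within the time window, which could be sketched as follows. The function name `summarize_events`, its `now` parameter, and the ISO-8601 UTC timestamp format are assumptions made for illustration only.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def summarize_events(conn, minutes, now=None):
    """Hypothetical sketch: count events per type in the last `minutes`."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(minutes=minutes)).isoformat(timespec="microseconds")
    rows = conn.execute(
        "SELECT event_type, COUNT(*) FROM event_log "
        "WHERE timestamp >= ? GROUP BY event_type",
        (cutoff,),
    ).fetchall()
    return dict(rows)  # e.g. {"task.created": 2}


# Demo data: three events at 10, 30, and 120 minutes before a fixed "now".
now = datetime(2026, 2, 26, 12, 0, tzinfo=timezone.utc)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log (id TEXT PRIMARY KEY, event_type TEXT NOT NULL, "
    "source TEXT NOT NULL, task_id TEXT, agent_id TEXT, data TEXT, "
    "timestamp TEXT NOT NULL)"
)
for event_id, event_type, age_minutes in [
    ("e1", "task.created", 10),
    ("e2", "task.created", 30),
    ("e3", "task.failed", 120),
]:
    ts = (now - timedelta(minutes=age_minutes)).isoformat(timespec="microseconds")
    conn.execute(
        "INSERT INTO event_log (id, event_type, source, timestamp) VALUES (?, ?, ?, ?)",
        (event_id, event_type, "coordinator", ts),
    )
conn.commit()

summary = summarize_events(conn, minutes=60, now=now)
```

Comparing timestamps as strings only works when every row uses the same fixed-width format and time zone, which is why the sketch pins both.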
## Integration
The coordinator automatically logs:
- Task creation, assignment, completion, failure
- Agent join/leave events
- System warnings and errors
## Consequences
- **Positive**: Complete audit trail, easy debugging, analytics support
- **Negative**: Additional database writes, storage growth over time
## Mitigations
- `prune_events()` function removes events older than N days
- Indexes on `task_id`, `agent_id`, and `timestamp` for fast queries
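
Both mitigations can be sketched with the standard `sqlite3` module. The index names, the `prune_older_than` function, and the ISO-8601 UTC timestamp format below are assumptions; the signature of the project's own `prune_events()` is not given in this ADR.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumed index names; the ADR only lists the indexed columns.
INDEXES = (
    "CREATE INDEX IF NOT EXISTS idx_event_log_task_id ON event_log (task_id)",
    "CREATE INDEX IF NOT EXISTS idx_event_log_agent_id ON event_log (agent_id)",
    "CREATE INDEX IF NOT EXISTS idx_event_log_timestamp ON event_log (timestamp)",
)


def create_indexes(conn):
    """Create the three lookup indexes if they do not exist yet."""
    for statement in INDEXES:
        conn.execute(statement)
    conn.commit()


def prune_older_than(conn, days, now=None):
    """Hypothetical sketch: delete events older than `days`, return the count."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).isoformat(timespec="microseconds")
    cursor = conn.execute("DELETE FROM event_log WHERE timestamp < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount


# Demo: one event from 40 days ago and one from yesterday.
now = datetime(2026, 2, 26, 12, 0, tzinfo=timezone.utc)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log (id TEXT PRIMARY KEY, event_type TEXT NOT NULL, "
    "source TEXT NOT NULL, task_id TEXT, agent_id TEXT, data TEXT, "
    "timestamp TEXT NOT NULL)"
)
create_indexes(conn)
for event_id, age_days in [("old", 40), ("recent", 1)]:
    ts = (now - timedelta(days=age_days)).isoformat(timespec="microseconds")
    conn.execute(
        "INSERT INTO event_log (id, event_type, source, timestamp) VALUES (?, ?, ?, ?)",
        (event_id, "task.created", "coordinator", ts),
    )
conn.commit()

removed = prune_older_than(conn, days=30, now=now)
remaining = [r[0] for r in conn.execute("SELECT id FROM event_log")]
```

The `timestamp` index serves the pruning delete as well as time-window queries, so the cleanup does not need a full table scan as the log grows.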