forked from Rockachopa/Timmy-time-dashboard
feat: complete Event Log, Ledger, Memory, Cascade Router, Upgrade Queue, Activity Feed
This commit implements six major features:

1. Event Log System (src/swarm/event_log.py)
   - SQLite-based audit trail for all swarm events
   - Task lifecycle tracking (created, assigned, completed, failed)
   - Agent lifecycle tracking (joined, left, status changes)
   - Integrated with coordinator for automatic logging
   - Dashboard page at /swarm/events

2. Lightning Ledger (src/lightning/ledger.py)
   - Transaction tracking for Lightning Network payments
   - Balance calculations (incoming, outgoing, net, available)
   - Integrated with payment_handler for automatic logging
   - Dashboard page at /lightning/ledger

3. Semantic Memory / Vector Store (src/memory/vector_store.py)
   - Embedding-based similarity search for Echo agent
   - Fallback to keyword matching if sentence-transformers unavailable
   - Personal facts storage and retrieval
   - Dashboard page at /memory

4. Cascade Router Integration (src/timmy/cascade_adapter.py)
   - Automatic LLM failover between providers (Ollama → AirLLM → API)
   - Circuit breaker pattern for failing providers
   - Metrics tracking per provider (latency, error rates)
   - Dashboard status page at /router/status

5. Self-Upgrade Approval Queue (src/upgrades/)
   - State machine for self-modifications:
     proposed → approved/rejected → applied/failed
   - Human approval required before applying changes
   - Git integration for branch management
   - Dashboard queue at /self-modify/queue

6. Real-Time Activity Feed (src/events/broadcaster.py)
   - WebSocket-based live activity streaming
   - Bridges event_log to dashboard clients
   - Activity panel on /swarm/live

Tests:
- 101 unit tests passing
- 4 new E2E test files for Selenium testing
- Run with: SELENIUM_UI=1 pytest tests/functional/ -v --headed

Documentation:
- 6 ADRs (017-022) documenting architecture decisions
- Implementation summary in docs/IMPLEMENTATION_SUMMARY.md
- Architecture diagram in docs/architecture-v2.md
73
docs/adr/017-event-logging.md
Normal file
@@ -0,0 +1,73 @@
# ADR 017: Event Logging System

## Status

Accepted

## Context

The swarm system needed a way to audit all agent actions, task lifecycle events, and system events. Without centralized logging, debugging failures and understanding system behavior required grep-ing through application logs.

## Decision

Implement a centralized event logging system in SQLite (`event_log` table) that captures all significant events with structured data.

## Event Types

| Type | Description |
|------|-------------|
| `task.created` | New task posted |
| `task.bidding` | Task opened for bidding |
| `task.assigned` | Task assigned to agent |
| `task.started` | Agent started working |
| `task.completed` | Task finished successfully |
| `task.failed` | Task failed |
| `agent.joined` | New agent registered |
| `agent.left` | Agent deregistered |
| `bid.submitted` | Agent submitted bid |
| `tool.called` | Tool execution started |
| `tool.completed` | Tool execution finished |
| `system.error` | System error occurred |
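
The event types in the table above could be modeled as a string-valued enum. The following is only a sketch: the Usage section confirms that an `EventType.TASK_ASSIGNED` member exists, but every other member name, the `str` mixin, and the `category` helper are assumptions made for illustration, not the contents of `src/swarm/event_log.py`.

```python
from enum import Enum


class EventType(str, Enum):
    """Event types from the ADR table. Member names other than
    TASK_ASSIGNED are assumed, derived from the dotted type strings."""

    TASK_CREATED = "task.created"
    TASK_BIDDING = "task.bidding"
    TASK_ASSIGNED = "task.assigned"
    TASK_STARTED = "task.started"
    TASK_COMPLETED = "task.completed"
    TASK_FAILED = "task.failed"
    AGENT_JOINED = "agent.joined"
    AGENT_LEFT = "agent.left"
    BID_SUBMITTED = "bid.submitted"
    TOOL_CALLED = "tool.called"
    TOOL_COMPLETED = "tool.completed"
    SYSTEM_ERROR = "system.error"


def category(event_type: EventType) -> str:
    """Hypothetical helper: return the prefix before the dot."""
    return event_type.value.split(".", 1)[0]


# Look up a member from the string stored in the event_type column.
assigned = EventType("task.assigned")
tool_category = category(EventType.TOOL_CALLED)
```

Storing the dotted string rather than the member name keeps the `event_type` column readable in ad hoc SQL queries.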

## Schema

```sql
CREATE TABLE event_log (
    id TEXT PRIMARY KEY,
    event_type TEXT NOT NULL,
    source TEXT NOT NULL,
    task_id TEXT,
    agent_id TEXT,
    data TEXT,  -- JSON
    timestamp TEXT NOT NULL
);
```
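
To show how a row might be written against this schema, here is a minimal sketch using Python's standard `sqlite3` module. It is not the code in `src/swarm/event_log.py`: the explicit `conn` parameter, the UUID primary key, and the ISO 8601 UTC timestamp format are all assumptions chosen to keep the example self-contained.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS event_log (
    id TEXT PRIMARY KEY,
    event_type TEXT NOT NULL,
    source TEXT NOT NULL,
    task_id TEXT,
    agent_id TEXT,
    data TEXT,
    timestamp TEXT NOT NULL
);
"""


def log_event(conn, event_type, source, task_id=None, agent_id=None, data=None):
    """Insert one event row and return its generated id.

    Assumption: the real function manages its own connection; `conn`
    is passed in here only so the sketch runs on an in-memory database.
    """
    event_id = str(uuid.uuid4())  # assumed id scheme
    conn.execute(
        "INSERT INTO event_log VALUES (?, ?, ?, ?, ?, ?, ?)",
        (
            event_id,
            event_type,
            source,
            task_id,
            agent_id,
            json.dumps(data) if data is not None else None,  # JSON as TEXT
            datetime.now(timezone.utc).isoformat(timespec="seconds"),
        ),
    )
    conn.commit()
    return event_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
event_id = log_event(
    conn, "task.assigned", "coordinator",
    task_id="t1", agent_id="a1", data={"bid_sats": 42},
)
stored = conn.execute(
    "SELECT data FROM event_log WHERE id = ?", (event_id,)
).fetchone()
payload = json.loads(stored[0])
```

Serializing `data` with `json.dumps` matches the `-- JSON` note on the column; readers decode it with `json.loads` when querying.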

## Usage

```python
# get_event_summary is used below, so it is imported here as well
# (assumed to live in the same module as the other helpers).
from swarm.event_log import (
    EventType,
    get_event_summary,
    get_task_events,
    log_event,
)

# Log an event (`task` and `winner` are assumed to be in scope)
log_event(
    event_type=EventType.TASK_ASSIGNED,
    source="coordinator",
    task_id=task.id,
    agent_id=winner.agent_id,
    data={"bid_sats": winner.bid_sats},
)

# Query events
events = get_task_events(task.id)
summary = get_event_summary(minutes=60)
```
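
The two query helpers used above are not shown in this ADR. One plausible shape for them, as a sketch only: the `conn` parameter, the returned columns, and the per-type count returned by the summary are assumptions, and the time filter relies on timestamps being stored as same-format ISO 8601 UTC strings so that text comparison matches chronological order.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def get_task_events(conn, task_id):
    """Return (event_type, timestamp) rows for one task, oldest first."""
    return conn.execute(
        "SELECT event_type, timestamp FROM event_log "
        "WHERE task_id = ? ORDER BY timestamp",
        (task_id,),
    ).fetchall()


def get_event_summary(conn, minutes=60):
    """Count events per type within the last `minutes` minutes (assumed shape)."""
    cutoff = datetime.now(timezone.utc) - timedelta(minutes=minutes)
    rows = conn.execute(
        "SELECT event_type, COUNT(*) FROM event_log "
        "WHERE timestamp >= ? GROUP BY event_type",
        (cutoff.isoformat(timespec="seconds"),),
    ).fetchall()
    return dict(rows)


# Demo on an in-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log (id TEXT PRIMARY KEY, event_type TEXT NOT NULL, "
    "source TEXT NOT NULL, task_id TEXT, agent_id TEXT, data TEXT, "
    "timestamp TEXT NOT NULL)"
)
now = datetime.now(timezone.utc)
samples = [
    ("e1", "task.created", "coordinator", "t1", None, None, now - timedelta(minutes=5)),
    ("e2", "task.assigned", "coordinator", "t1", "a1", None, now - timedelta(minutes=4)),
    ("e3", "task.created", "coordinator", "t2", None, None, now - timedelta(hours=3)),
]
for *fields, ts in samples:
    conn.execute(
        "INSERT INTO event_log VALUES (?, ?, ?, ?, ?, ?, ?)",
        (*fields, ts.isoformat(timespec="seconds")),
    )

events = get_task_events(conn, "t1")
summary = get_event_summary(conn, minutes=60)
```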

## Integration

The coordinator automatically logs:

- Task creation, assignment, completion, failure
- Agent join/leave events
- System warnings and errors

## Consequences

- **Positive**: Complete audit trail, easy debugging, analytics support
- **Negative**: Additional database writes, storage growth over time

## Mitigations

- `prune_events()` function removes events older than N days
- Indexes on `task_id`, `agent_id`, and `timestamp` for fast queries
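
Both mitigations can be sketched with the standard `sqlite3` module. The index names, the `conn` and `older_than_days` parameters, and the 30-day default are assumptions for illustration; the ADR only states that `prune_events()` removes events older than N days and that the three columns are indexed.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Index names are hypothetical; the ADR names only the indexed columns.
INDEXES = """
CREATE INDEX IF NOT EXISTS idx_event_log_task_id ON event_log (task_id);
CREATE INDEX IF NOT EXISTS idx_event_log_agent_id ON event_log (agent_id);
CREATE INDEX IF NOT EXISTS idx_event_log_timestamp ON event_log (timestamp);
"""


def prune_events(conn, older_than_days=30):
    """Delete events older than the cutoff; return the number removed."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=older_than_days)
    cur = conn.execute(
        "DELETE FROM event_log WHERE timestamp < ?",
        (cutoff.isoformat(timespec="seconds"),),
    )
    conn.commit()
    return cur.rowcount


# Demo: one stale row and one recent row in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log (id TEXT PRIMARY KEY, event_type TEXT NOT NULL, "
    "source TEXT NOT NULL, task_id TEXT, agent_id TEXT, data TEXT, "
    "timestamp TEXT NOT NULL)"
)
conn.executescript(INDEXES)
now = datetime.now(timezone.utc)
for event_id, age in (("old", timedelta(days=90)), ("new", timedelta(hours=1))):
    conn.execute(
        "INSERT INTO event_log VALUES (?, 'task.created', 'coordinator', "
        "NULL, NULL, NULL, ?)",
        (event_id, (now - age).isoformat(timespec="seconds")),
    )

removed = prune_events(conn, older_than_days=30)
remaining = [row[0] for row in conn.execute("SELECT id FROM event_log")]
```

The `timestamp` index serves both the time-window queries and the pruning `DELETE`, which otherwise would scan the full table as it grows.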