forked from Rockachopa/Timmy-time-dashboard
This commit implements six major features:

1. Event Log System (src/swarm/event_log.py)
   - SQLite-based audit trail for all swarm events
   - Task lifecycle tracking (created, assigned, completed, failed)
   - Agent lifecycle tracking (joined, left, status changes)
   - Integrated with coordinator for automatic logging
   - Dashboard page at /swarm/events

2. Lightning Ledger (src/lightning/ledger.py)
   - Transaction tracking for Lightning Network payments
   - Balance calculations (incoming, outgoing, net, available)
   - Integrated with payment_handler for automatic logging
   - Dashboard page at /lightning/ledger

3. Semantic Memory / Vector Store (src/memory/vector_store.py)
   - Embedding-based similarity search for Echo agent
   - Fallback to keyword matching if sentence-transformers unavailable
   - Personal facts storage and retrieval
   - Dashboard page at /memory

4. Cascade Router Integration (src/timmy/cascade_adapter.py)
   - Automatic LLM failover between providers (Ollama → AirLLM → API)
   - Circuit breaker pattern for failing providers
   - Metrics tracking per provider (latency, error rates)
   - Dashboard status page at /router/status

5. Self-Upgrade Approval Queue (src/upgrades/)
   - State machine for self-modifications: proposed → approved/rejected → applied/failed
   - Human approval required before applying changes
   - Git integration for branch management
   - Dashboard queue at /self-modify/queue

6. Real-Time Activity Feed (src/events/broadcaster.py)
   - WebSocket-based live activity streaming
   - Bridges event_log to dashboard clients
   - Activity panel on /swarm/live

Tests:
- 101 unit tests passing
- 4 new E2E test files for Selenium testing
- Run with: SELENIUM_UI=1 pytest tests/functional/ -v --headed

Documentation:
- 6 ADRs (017-022) documenting architecture decisions
- Implementation summary in docs/IMPLEMENTATION_SUMMARY.md
- Architecture diagram in docs/architecture-v2.md
74 lines
2.0 KiB
Markdown
# ADR 017: Event Logging System

## Status
Accepted

## Context
The swarm system needed a way to audit all agent actions, task lifecycle events, and system events. Without centralized logging, debugging failures and understanding system behavior required grep-ing through application logs.

## Decision
Implement a centralized event logging system in SQLite (`event_log` table) that captures all significant events with structured data.

## Event Types

| Type | Description |
|------|-------------|
| `task.created` | New task posted |
| `task.bidding` | Task opened for bidding |
| `task.assigned` | Task assigned to agent |
| `task.started` | Agent started working |
| `task.completed` | Task finished successfully |
| `task.failed` | Task failed |
| `agent.joined` | New agent registered |
| `agent.left` | Agent deregistered |
| `bid.submitted` | Agent submitted bid |
| `tool.called` | Tool execution started |
| `tool.completed` | Tool execution finished |
| `system.error` | System error occurred |
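
The dotted names in the table map naturally onto an enumeration, and the usage example in this ADR refers to `EventType.TASK_ASSIGNED`. The following is a minimal sketch of how such an enum might look, assuming a `str`-valued `Enum`; the member names other than `TASK_ASSIGNED` and the `category` helper are hypothetical, and the real definition in `src/swarm/event_log.py` may differ.

```python
from enum import Enum


class EventType(str, Enum):
    """Hypothetical enum mirroring the event types listed in the table."""

    TASK_CREATED = "task.created"
    TASK_BIDDING = "task.bidding"
    TASK_ASSIGNED = "task.assigned"
    TASK_STARTED = "task.started"
    TASK_COMPLETED = "task.completed"
    TASK_FAILED = "task.failed"
    AGENT_JOINED = "agent.joined"
    AGENT_LEFT = "agent.left"
    BID_SUBMITTED = "bid.submitted"
    TOOL_CALLED = "tool.called"
    TOOL_COMPLETED = "tool.completed"
    SYSTEM_ERROR = "system.error"


def category(event_type: EventType) -> str:
    """Return the prefix before the dot, e.g. 'task' for 'task.assigned'."""
    return event_type.value.split(".", 1)[0]
```

Storing the dotted string as the enum value keeps the database column human-readable, and the prefix makes it straightforward to group events by category (task, agent, bid, tool, system).
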

## Schema
```sql
CREATE TABLE event_log (
    id TEXT PRIMARY KEY,
    event_type TEXT NOT NULL,
    source TEXT NOT NULL,
    task_id TEXT,
    agent_id TEXT,
    data TEXT,  -- JSON
    timestamp TEXT NOT NULL
);
```
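
To illustrate how a row might be written against this schema, here is a minimal sketch using Python's standard `sqlite3` module. It assumes UUID strings for `id`, JSON-serialized text for `data`, and ISO 8601 UTC strings for `timestamp`; none of these choices are confirmed by the ADR. The `write_event` helper is hypothetical and is not the project's `log_event`.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS event_log (
    id TEXT PRIMARY KEY,
    event_type TEXT NOT NULL,
    source TEXT NOT NULL,
    task_id TEXT,
    agent_id TEXT,
    data TEXT,  -- JSON
    timestamp TEXT NOT NULL
);
"""


def write_event(conn, event_type, source, task_id=None, agent_id=None, data=None):
    """Insert one event row and return its generated id (hypothetical helper)."""
    event_id = str(uuid.uuid4())  # assumption: ids are UUID strings
    conn.execute(
        "INSERT INTO event_log "
        "(id, event_type, source, task_id, agent_id, data, timestamp) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (
            event_id,
            event_type,
            source,
            task_id,
            agent_id,
            json.dumps(data) if data is not None else None,
            datetime.now(timezone.utc).isoformat(),  # assumption: ISO 8601 UTC
        ),
    )
    conn.commit()
    return event_id


# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
event_id = write_event(
    conn, "task.assigned", "coordinator",
    task_id="t1", agent_id="a1", data={"bid_sats": 50},
)
row = conn.execute(
    "SELECT event_type, data FROM event_log WHERE id = ?", (event_id,)
).fetchone()
```

Keeping `data` as a JSON string lets each event type carry its own payload shape without schema changes, at the cost of not being able to index individual payload fields directly.
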

## Usage

```python
from swarm.event_log import (
    EventType,
    get_event_summary,
    get_task_events,
    log_event,
)

# Log an event (`task` and `winner` are assumed to be in scope,
# e.g. inside the coordinator once a bid has been selected)
log_event(
    event_type=EventType.TASK_ASSIGNED,
    source="coordinator",
    task_id=task.id,
    agent_id=winner.agent_id,
    data={"bid_sats": winner.bid_sats},
)

# Query events
events = get_task_events(task.id)        # events recorded for one task
summary = get_event_summary(minutes=60)  # summary of the last 60 minutes
```
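
The ADR does not show how the two query helpers are implemented. As a rough sketch, they could be plain SQL queries over the `event_log` table. The version below takes an explicit `conn` argument and an optional `now` parameter to stay self-contained and testable; both are assumptions, as is the ISO 8601 UTC timestamp format, and the real functions in `swarm.event_log` may return richer objects.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def get_task_events(conn, task_id):
    """Return (event_type, timestamp) rows for one task, oldest first."""
    return conn.execute(
        "SELECT event_type, timestamp FROM event_log "
        "WHERE task_id = ? ORDER BY timestamp",
        (task_id,),
    ).fetchall()


def get_event_summary(conn, minutes=60, now=None):
    """Count events per type within the last `minutes` minutes."""
    now = now or datetime.now(timezone.utc)
    # ISO 8601 strings in one timezone sort chronologically as plain text.
    cutoff = (now - timedelta(minutes=minutes)).isoformat()
    rows = conn.execute(
        "SELECT event_type, COUNT(*) FROM event_log "
        "WHERE timestamp >= ? GROUP BY event_type",
        (cutoff,),
    ).fetchall()
    return dict(rows)


# Illustrative sample data only; ids, names, and times are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log (id TEXT PRIMARY KEY, event_type TEXT NOT NULL, "
    "source TEXT NOT NULL, task_id TEXT, agent_id TEXT, data TEXT, "
    "timestamp TEXT NOT NULL)"
)
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sample = [
    ("e1", "task.created", "coordinator", "t1", None, None,
     (now - timedelta(minutes=90)).isoformat()),
    ("e2", "task.assigned", "coordinator", "t1", "a1", None,
     (now - timedelta(minutes=30)).isoformat()),
    ("e3", "task.completed", "coordinator", "t1", "a1", None,
     (now - timedelta(minutes=5)).isoformat()),
]
conn.executemany("INSERT INTO event_log VALUES (?, ?, ?, ?, ?, ?, ?)", sample)

task_events = get_task_events(conn, "t1")
summary = get_event_summary(conn, minutes=60, now=now)
```

With this sample data, the task history contains all three events in chronological order, while the 60-minute summary counts only the assignment and the completion.
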

## Integration
The coordinator automatically logs:
- Task creation, assignment, completion, failure
- Agent join/leave events
- System warnings and errors

## Consequences
- **Positive**: Complete audit trail, easy debugging, analytics support
- **Negative**: Additional database writes, storage growth over time

## Mitigations
- `prune_events()` function removes events older than N days
- Indexes on `task_id`, `agent_id`, and `timestamp` for fast queries
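
Both mitigations can be sketched in a few lines of `sqlite3`. The index names, the `older_than_days` parameter with its default of 30, and the explicit `conn` and `now` arguments are assumptions made for illustration; only the function name `prune_events` and the three indexed columns come from this ADR.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical index names; the ADR only names the indexed columns.
INDEXES = """
CREATE INDEX IF NOT EXISTS idx_event_log_task_id ON event_log (task_id);
CREATE INDEX IF NOT EXISTS idx_event_log_agent_id ON event_log (agent_id);
CREATE INDEX IF NOT EXISTS idx_event_log_timestamp ON event_log (timestamp);
"""


def prune_events(conn, older_than_days=30, now=None):
    """Delete events older than the cutoff and return how many were removed."""
    now = now or datetime.now(timezone.utc)
    # Assumes timestamps are stored as ISO 8601 UTC strings.
    cutoff = (now - timedelta(days=older_than_days)).isoformat()
    cursor = conn.execute("DELETE FROM event_log WHERE timestamp < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount


# Illustrative sample data only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log (id TEXT PRIMARY KEY, event_type TEXT NOT NULL, "
    "source TEXT NOT NULL, task_id TEXT, agent_id TEXT, data TEXT, "
    "timestamp TEXT NOT NULL)"
)
conn.executescript(INDEXES)
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO event_log VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("old", "task.failed", "coordinator", "t1", "a1", None,
         (now - timedelta(days=45)).isoformat()),
        ("new", "task.created", "coordinator", "t2", None, None,
         (now - timedelta(days=1)).isoformat()),
    ],
)
removed = prune_events(conn, older_than_days=30, now=now)
remaining = conn.execute("SELECT COUNT(*) FROM event_log").fetchone()[0]
```

The `timestamp` index serves double duty here: it speeds up time-window queries and lets the pruning `DELETE` locate old rows without a full table scan.
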