feat: MCP tools integration for swarm agents

ToolExecutor:
- Persona-specific toolkit selection (forge gets code tools, echo gets search)
- Tool inference from task keywords (search→web_search, code→python)
- LLM-powered reasoning about tool selection
- Graceful degradation when Agno unavailable

PersonaNode Updates:
- Subscribe to swarm:events for task assignments
- Execute tasks using ToolExecutor when assigned
- Complete tasks via comms.complete_task()
- Track current_task for status monitoring

Tests:
- 19 new tests for tool execution
- All 6 personas covered
- Tool inference verification
- Edge cases (no toolkit, unknown tasks)

Total: 491 tests passing
This commit is contained in:
Alexander Payne
2026-02-22 20:33:26 -05:00
parent c5df954d44
commit 14072f9bb5
4 changed files with 701 additions and 134 deletions

View File

@@ -1,9 +1,9 @@
# Kimi Checkpoint - Updated 2026-02-22 21:37 EST
# Kimi Checkpoint - Updated 2026-02-22 22:45 EST
## Session Info
- **Duration:** ~2 hours
- **Commits:** Ready to commit
- **Assignment:** Architect Sprint (Lightning, Routing, Sovereignty, Embodiment)
- **Duration:** ~2.5 hours
- **Commits:** 1 (c5df954 + this session)
- **Assignment:** Option A - MCP Tools Integration
## Current State
@@ -14,177 +14,165 @@ kimi/sprint-v2-swarm-tools-serve → origin/kimi/sprint-v2-swarm-tools-serve
### Test Status
```
472 passed, 0 warnings
491 passed, 0 warnings
```
## What Was Done
### 1. Lightning Interface Layer ✅
Created pluggable Lightning backend system:
### Option A: MCP Tools Integration ✅ COMPLETE
```
src/lightning/
├── __init__.py # Public API
├── base.py # Abstract LightningBackend interface
├── mock_backend.py # Development/testing backend
├── lnd_backend.py # Real LND gRPC backend (stubbed)
└── factory.py # Backend selection
```
**Problem:** Tools existed (`src/timmy/tools.py`) but weren't wired into the agent execution loop. Agents could bid on tasks but not actually execute them.
- **Mock Backend:** Full implementation with auto-settle for dev
- **LND Backend:** Complete interface, needs gRPC protobuf generation
- **Configuration:** `LIGHTNING_BACKEND=mock|lnd`
- **Docs:** Inline documentation for LND setup steps
**Solution:** Built tool execution layer connecting personas to their specialized tools.
Updated `timmy_serve/payment_handler.py` to use new interface.
### 1. ToolExecutor (`src/swarm/tool_executor.py`)
### 2. Intelligent Swarm Routing ✅
Implemented capability-based task dispatch:
Manages tool execution for persona agents:
```
src/swarm/routing.py # 475 lines
```python
executor = ToolExecutor.for_persona("forge", "forge-001")
result = executor.execute_task("Write a fibonacci function")
# Returns: {success, result, tools_used, persona_id, agent_id}
```
**Features:**
- CapabilityManifest for each agent (keywords, capabilities, rates)
- Task scoring: keyword (0.3) + capability (0.2) + related words (0.1)
- RoutingDecision audit logging to SQLite
- RoutingEngine singleton integrated with coordinator
- Agent stats tracking (wins, consideration rate)
- Persona-specific toolkit selection
- Tool inference from task keywords
- LLM-powered reasoning about tool use
- Graceful degradation when Agno unavailable
**Audit Trail:**
- Every routing decision logged with scores, bids, reason
- Queryable history by task_id or agent_id
- Exportable for analysis
**Tool Mapping:**
| Persona | Tools |
|---------|-------|
| Echo | web_search, read_file, list_files |
| Forge | shell, python, read_file, write_file, list_files |
| Seer | python, read_file, list_files, web_search |
| Quill | read_file, write_file, list_files |
| Mace | shell, web_search, read_file, list_files |
| Helm | shell, read_file, write_file, list_files |
### 3. Sovereignty Audit ✅
Created comprehensive audit report:
### 2. PersonaNode Task Execution
Updated `src/swarm/persona_node.py`:
- Subscribes to `swarm:events` channel
- When `task_assigned` event received → executes task
- Uses `ToolExecutor` to process task with appropriate tools
- Calls `comms.complete_task()` with result
- Tracks `current_task` for status monitoring
**Execution Flow:**
```
docs/SOVEREIGNTY_AUDIT.md
Task Assigned → PersonaNode._handle_task_assignment()
Fetch task description
ToolExecutor.execute_task()
Infer tools from keywords
LLM reasoning (when available)
Return formatted result
Mark task complete
```
**Overall Score:** 9.2/10
### 3. Tests (`tests/test_tool_executor.py`)
**Findings:**
- ✅ AI Models: Local Ollama/AirLLM only
- ✅ Database: SQLite local
- ✅ Voice: Local TTS
- ✅ Web: Self-hosted FastAPI
- ⚠️ Lightning: Configurable (local LND or remote)
- ⚠️ Telegram: Optional external dependency
**Graceful Degradation Verified:**
- Ollama down → Error message
- Redis down → In-memory fallback
- LND unreachable → Health check fails, mock available
### 4. Deeper Test Coverage ✅
Added 36 new tests:
```
tests/test_lightning_interface.py # 36 tests - backend interface
tests/test_swarm_routing.py # 23 tests - routing engine
```
**Coverage:**
- Invoice lifecycle (create, settle, check, list)
- Backend factory selection
- Capability scoring
- Routing recommendations
- Audit log persistence
### 5. Substrate-Agnostic Interface ✅
Created embodiment foundation:
```
src/agent_core/
├── __init__.py # Public exports
├── interface.py # TimAgent abstract base class
└── ollama_adapter.py # Ollama implementation
```
**Interface Contract:**
```python
class TimAgent(ABC):
def perceive(self, perception: Perception) -> Memory
def reason(self, query: str, context: list[Memory]) -> Action
def act(self, action: Action) -> Any
def remember(self, memory: Memory) -> None
def recall(self, query: str, limit: int = 5) -> list[Memory]
def communicate(self, message: Communication) -> bool
```
**PerceptionTypes:** TEXT, IMAGE, AUDIO, SENSOR, MOTION, NETWORK, INTERNAL
**ActionTypes:** TEXT, SPEAK, MOVE, GRIP, CALL, EMIT, SLEEP
This enables future embodiments (robot, VR) without architectural changes.
19 new tests covering:
- ToolExecutor initialization for all personas
- Tool inference from task descriptions
- Task execution with/without tools available
- PersonaNode integration
- Edge cases (unknown tasks, no toolkit, etc.)
## Files Changed
```
src/lightning/* (new, 4 files)
src/agent_core/* (new, 3 files)
src/timmy_serve/payment_handler.py (refactored)
src/swarm/routing.py (new)
src/swarm/coordinator.py (modified)
docs/SOVEREIGNTY_AUDIT.md (new)
tests/test_lightning_interface.py (new)
tests/test_swarm_routing.py (new)
tests/conftest.py (modified)
src/swarm/tool_executor.py (new, 282 lines)
src/swarm/persona_node.py (modified)
tests/test_tool_executor.py (new, 19 tests)
```
## Environment Variables
## How It Works Now
New configuration options:
1. **Task Posted** → Coordinator creates task, opens auction
2. **Bidding** → PersonaNodes bid based on keyword matching
3. **Auction Close** → Winner selected
4. **Assignment** → Coordinator publishes `task_assigned` event
5. **Execution** → Winning PersonaNode:
- Receives assignment via comms
- Fetches task description
- Uses ToolExecutor to process
- Returns result via `complete_task()`
6. **Completion** → Task marked complete, agent returns to idle
## Graceful Degradation
When Agno tools unavailable (tests, missing deps):
- ToolExecutor initializes with `toolkit=None`
- Task execution still works (simulated mode)
- Tool inference works for logging/analysis
- No crashes, clear logging
## Integration with Previous Work
This builds on:
- ✅ Lightning interface (c5df954)
- ✅ Swarm routing with capability manifests
- ✅ Persona definitions with preferred_keywords
- ✅ Auction and bidding system
## Test Results
```bash
# Lightning Backend
LIGHTNING_BACKEND=mock # or 'lnd'
LND_GRPC_HOST=localhost:10009
LND_TLS_CERT_PATH=/path/to/tls.cert
LND_MACAROON_PATH=/path/to/admin.macaroon
LND_VERIFY_SSL=true
$ make test
491 passed in 1.10s
# Mock Settings
MOCK_AUTO_SETTLE=true # Auto-settle invoices in dev
$ pytest tests/test_tool_executor.py -v
19 passed
```
## Integration Notes
## Next Steps
1. **Lightning:** Works with existing L402 middleware. Set `LIGHTNING_BACKEND=lnd` when ready.
2. **Routing:** Automatically logs decisions when personas bid on tasks.
3. **Agent Core:** Not yet wired into main app — future work to migrate existing agent.
From the 7-hour task list, remaining items:
## Next Tasks
**Hour 4** — Scary path tests:
- Concurrent swarm load test (10 simultaneous tasks)
- Memory persistence under restart
- L402 macaroon expiry
- WebSocket reconnection
- Voice NLU edge cases
From assignment:
- [x] Lightning interface layer with LND path
- [x] Swarm routing with capability manifests
- [x] Sovereignty audit report
- [x] Expanded test coverage
- [x] TimAgent abstract interface
**Hour 6** — Mission Control UX:
- Real-time swarm feed via WebSocket
- Heartbeat daemon visible in UI
- Chat history persistence
**Remaining:**
- [ ] Generate LND protobuf stubs for real backend
- [ ] Wire AgentCore into main Timmy flow
- [ ] Add concurrency stress tests
- [ ] Implement degradation circuit breakers
**Hour 7** — Handoff & docs:
- QUALITY_ANALYSIS.md update
- Revelation planning
## Quick Commands
```bash
# Test new modules
pytest tests/test_lightning_interface.py -v
pytest tests/test_swarm_routing.py -v
# Test tool execution
pytest tests/test_tool_executor.py -v
# Check backend status
python -c "from lightning import get_backend; b = get_backend(); print(b.health_check())"
# Check tool mapping for a persona
python -c "from swarm.tool_executor import ToolExecutor; e = ToolExecutor.for_persona('forge', 'test'); print(e.get_capabilities())"
# View routing history
python -c "from swarm.routing import routing_engine; print(routing_engine.get_routing_history(limit=5))"
# Simulate task execution
python -c "
from swarm.tool_executor import ToolExecutor
e = ToolExecutor.for_persona('echo', 'echo-001')
r = e.execute_task('Search for Python tutorials')
print(f'Tools: {r[\"tools_used\"]}')
print(f'Result: {r[\"result\"][:100]}...')
"
```
---
*All 472 tests passing. Ready for commit.*
*491 tests passing. MCP Tools Option A complete.*