docs: add integration ROADMAP and Ascension manifesto (#83)

This commit is contained in:
Alexander Whitestone
2026-02-27 21:09:49 -05:00
committed by GitHub
parent bc21bbe96f
commit a3ef8ee9f9
2 changed files with 886 additions and 0 deletions

809
ROADMAP.md Normal file
View File

@@ -0,0 +1,809 @@
# ROADMAP.md — Integration Roadmap for Timmy Time
**Origin:** [The Ascension of Timmy: Beyond the Exodus](docs/ASCENSION.md)
**Version:** 0.1.0 (draft)
**Last updated:** 2026-02-28
**Maintainer:** @AlexanderWhitestone
This document guides AI agents and human developers through the full
integration plan for Timmy Time. Each phase is independently valuable.
Phases are ordered by priority; dependencies are noted where they exist.
**Repo strategy:** Multi-repo ecosystem. Each heavy integration gets its own
repo, dockerized as a service. The main `Timmy-time-dashboard` repo stays
lean — it consumes these services via HTTP/WebSocket/gRPC.
---
## Current State (v2.0 Exodus)
What already works:
| Component | Status | Implementation |
|-----------|--------|----------------|
| Lightning/L402 | Mock working, LND stub | `src/lightning/`, `src/timmy_serve/l402_proxy.py` |
| Voice TTS | pyttsx3 (robotic) | `src/timmy_serve/voice_tts.py` |
| Voice STT | Browser Web Speech API | `src/dashboard/templates/voice_button.html` |
| Voice NLU | Regex-based intent detection | `src/integrations/voice/nlu.py` |
| Semantic memory | SQLite + sentence-transformers | `src/timmy/memory/vector_store.py` |
| Swarm auctions | First-price lowest-bid | `src/swarm/bidder.py`, `src/swarm/routing.py` |
| Frontend | HTMX + Bootstrap + marked.js | `src/dashboard/templates/base.html` |
| Browser LLM | WebLLM (WebGPU/WASM) | `static/local_llm.js` |
| LLM router | Cascade with circuit breaker | `src/infrastructure/router/cascade.py` |
| Agent learning | Outcome-based bid adjustment | `src/swarm/learner.py` |
What does NOT exist yet:
Piper TTS · Faster-Whisper · Chroma · CRDTs · LangGraph · Nostr ·
ZK-ML · Prometheus/Grafana/Loki · OpenTelemetry · Three.js · Tauri ·
Alpine.js · Firecracker/gVisor/WASM sandboxing · BOLT12 · Vickrey auctions
---
## Phase 0: Multi-Repo Foundation
**Goal:** Establish the repo ecosystem before building new services.
### Repos to create
| Repo | Purpose | Interface to main repo |
|------|---------|----------------------|
| `timmy-voice` | STT + TTS service | HTTP REST + WebSocket streaming |
| `timmy-nostr` | Nostr relay client, identity, reputation | HTTP REST + event stream |
| `timmy-memory` | Vector DB service (if external DB chosen) | HTTP REST |
| `timmy-observe` | Metrics collection + export | Prometheus scrape endpoint |
### Repo template
Each service repo follows:
```
timmy-<service>/
├── src/ # Python source
├── tests/
├── Dockerfile
├── docker-compose.yml # Standalone dev setup
├── pyproject.toml
├── Makefile # make test, make dev, make docker-build
├── CLAUDE.md # AI agent instructions specific to this repo
└── README.md
```
### Main repo changes
- Add `docker-compose.services.yml` for orchestrating external services
- Add thin client modules in `src/infrastructure/clients/` for each service
- Each client follows graceful degradation: if the service is down, log and
return a fallback (never crash)
### Decision record
Create `docs/adr/023-multi-repo-strategy.md` documenting the split rationale.
---
## Phase 1: Sovereign Voice
**Priority:** HIGHEST — most urgent integration
**Repo:** `timmy-voice`
**Depends on:** Phase 0 (repo scaffold)
### 1.1 Research & Select Engines
Before writing code, evaluate these candidates. The goal is ONE engine per
concern that works across hardware tiers (Pi 4 through desktop GPU).
#### STT Candidates
| Engine | Size | Speed | Offline | Notes |
|--------|------|-------|---------|-------|
| **Faster-Whisper** | 39M1.5G | 4-7x over Whisper | Yes | CTranslate2, INT8 quantization, mature ecosystem |
| **Moonshine** | 27M245M | 100x faster than Whisper large on CPU | Yes | New (Feb 2026), edge-first, streaming capable |
| **Vosk** | 50M1.8G | Real-time on Pi | Yes | Kaldi-based, very lightweight, good for embedded |
| **whisper.cpp** | Same as Whisper | CPU-optimized C++ | Yes | llama.cpp ecosystem, GGML quantization |
**Research tasks:**
- [ ] Benchmark Moonshine vs Faster-Whisper vs whisper.cpp on: (a) RPi 4 4GB,
(b) M-series Mac, (c) Linux desktop with GPU
- [ ] Evaluate streaming vs batch transcription for each
- [ ] Test accuracy on accented speech and technical vocabulary
- [ ] Measure cold-start latency (critical for voice UX)
**Recommendation to validate:** Moonshine for edge, Faster-Whisper for desktop,
with a unified API wrapper that selects by hardware tier.
#### TTS Candidates
| Engine | Size | Speed | Offline | Notes |
|--------|------|-------|---------|-------|
| **Piper** | 416M per voice | 10-30x real-time on Pi 4 | Yes | VITS arch, ONNX, production-proven, many voices |
| **Kokoro** | 82M params | Fast on CPU | Yes | Apache 2.0, quality rivals large models |
| **Coqui/XTTS-v2** | 1.5G+ | Needs GPU | Yes | Voice cloning, multilingual — but company shut down |
| **F5-TTS** | Medium | Needs GPU | Yes | Flow matching, 10s voice clone, MIT license |
**Research tasks:**
- [ ] Benchmark Piper vs Kokoro on: (a) RPi 4, (b) desktop CPU, (c) desktop GPU
- [ ] Compare voice naturalness (subjective listening test)
- [ ] Test Piper custom voice training pipeline (for Timmy's voice)
- [ ] Evaluate Kokoro Apache 2.0 licensing for commercial use
**Recommendation to validate:** Piper for edge (proven on Pi), Kokoro for
desktop quality, with a TTS provider interface that swaps transparently.
### 1.2 Architecture
```
┌─────────────────────────────────────────────┐
│ timmy-voice │
│ │
│ ┌──────────┐ ┌──────────┐ ┌─────────┐ │
│ │ STT │ │ TTS │ │ NLU │ │
│ │ Engine │ │ Engine │ │ (move │ │
│ │ (select │ │ (select │ │ from │ │
│ │ by hw) │ │ by hw) │ │ main) │ │
│ └────┬─────┘ └────┬─────┘ └────┬────┘ │
│ │ │ │ │
│ ┌────┴──────────────┴──────────────┴────┐ │
│ │ FastAPI / WebSocket API │ │
│ │ POST /transcribe (audio → text) │ │
│ │ POST /speak (text → audio) │ │
│ │ WS /stream (real-time STT) │ │
│ │ POST /understand (text → intent) │ │
│ └───────────────────────────────────────┘ │
│ │
│ Docker: timmy-voice:latest │
│ Ports: 8410 (HTTP) / 8411 (WS) │
└─────────────────────────────────────────────┘
```
### 1.3 Integration with Main Repo
- Add `src/infrastructure/clients/voice_client.py` — async HTTP/WS client
- Replace browser Web Speech API with calls to `timmy-voice` service
- Replace pyttsx3 calls with TTS service calls
- Move `src/integrations/voice/nlu.py` to `timmy-voice` repo
- Keep graceful fallback: if voice service unavailable, disable voice features
in the UI (don't crash)
### 1.4 Deliverables
- [ ] STT engine benchmarks documented in `timmy-voice/docs/benchmarks.md`
- [ ] TTS engine benchmarks documented alongside
- [ ] Working Docker container with REST + WebSocket API
- [ ] Client integration in main repo
- [ ] Tests: unit tests in `timmy-voice`, integration tests in main repo
- [ ] Dashboard voice button works end-to-end through the service
### 1.5 Success Criteria
- STT: < 500ms latency for 5-second utterance on desktop, < 2s on Pi 4
- TTS: Naturalness score > 3.5/5 (subjective), real-time factor > 5x on Pi 4
- Zero cloud dependencies for voice pipeline
- `make test` passes in both repos
---
## Phase 2: Nostr Identity & Reputation
**Priority:** HIGH
**Repo:** `timmy-nostr`
**Depends on:** Phase 0
### 2.1 Scope
Full Nostr citizen: agent identity, user auth, relay publishing, reputation.
### 2.2 Agent Identity
Each swarm agent gets a Nostr keypair (nsec/npub).
```python
# Agent identity lifecycle
agent = SwarmAgent(persona="forge")
agent.nostr_keys = generate_keypair() # nsec stored encrypted
agent.nip05 = "forge@timmy.local" # NIP-05 verification
agent.publish_profile() # kind:0 metadata event
```
**Tasks:**
- [ ] Keypair generation and encrypted storage (use `from config import settings`
for encryption key)
- [ ] NIP-01: Basic event publishing (kind:0 metadata, kind:1 notes)
- [ ] NIP-05: DNS-based identifier verification (for `@timmy.local` or custom
domain)
- [ ] NIP-39: External identity linking (link agent npub to Lightning node
pubkey, GitHub, etc.)
### 2.3 User Authentication
Users authenticate via Nostr keys instead of traditional auth.
**Tasks:**
- [ ] NIP-07: Browser extension signer integration (nos2x, Alby)
- [ ] NIP-42: Client authentication to relay
- [ ] NIP-44: Encrypted direct messages (XChaCha20-Poly1305 v2)
- [ ] Session management: Nostr pubkey → session token
### 2.4 Reputation System
Agents build portable reputation through signed event history.
**Tasks:**
- [ ] NIP-32: Labeling — agents rate each other's work quality
- [ ] Reputation score calculation from label events
- [ ] Cross-instance reputation portability (reputation follows the npub)
- [ ] Dashboard: agent profile page showing Nostr identity + reputation
### 2.5 Relay Infrastructure
**Tasks:**
- [ ] Embed or connect to a Nostr relay (evaluate: strfry, nostr-rs-relay)
- [ ] Publish agent work events (task completed, bid won, etc.) to relay
- [ ] Subscribe to events from other Timmy instances (federation via Nostr)
- [ ] Data Vending Machine (DVM) pattern: advertise agent capabilities as
Nostr events, receive job requests, deliver results, get paid in sats
### 2.6 Integration with Main Repo
- Add `src/infrastructure/clients/nostr_client.py`
- Modify `src/swarm/coordinator.py` to publish task/bid/completion events
- Add Nostr auth option to dashboard login
- Agent profile pages show npub, NIP-05, reputation score
### 2.7 Key References
- [Clawstr](https://soapbox.pub/blog/announcing-clawstr/) — Nostr-native AI
agent social network (NIP-22, NIP-73)
- [ai.wot](https://aiwot.org) — Cross-platform trust attestations via NIP-32
- [NIP-101](https://github.com/papiche/NIP-101) — Decentralized Trust System
- [Nostr NIPs repo](https://github.com/nostr-protocol/nips)
### 2.8 Success Criteria
- Every swarm agent has a Nostr identity (npub)
- Users can log in via NIP-07 browser extension
- Agent work history is published to a relay
- Reputation scores are visible on agent profile pages
- Two separate Timmy instances can discover each other via relay
---
## Phase 3: Semantic Memory Evolution
**Priority:** HIGH
**Repo:** Likely stays in main repo (lightweight) or `timmy-memory` (if heavy)
**Depends on:** None (can start in parallel)
### 3.1 Research Vector DB Alternatives
The current implementation uses SQLite + in-Python cosine similarity with a
hash-based embedding fallback. This needs to be evaluated against proper
vector search solutions.
#### Candidates
| DB | Architecture | Index | Best Scale | Server? | License |
|----|-------------|-------|------------|---------|---------|
| **sqlite-vec** | SQLite extension | Brute-force KNN | Thousands100K | No | MIT |
| **LanceDB** | Embedded, disk-based | IVF_PQ | Up to ~10M | No | Apache 2.0 |
| **Chroma** | Client-server or embedded | HNSW | Up to ~10M | Optional | Apache 2.0 |
| **Qdrant** | Client-server | HNSW | 100M+ | Yes | Apache 2.0 |
**Research tasks:**
- [ ] Benchmark current SQLite implementation: query latency at 1K, 10K, 100K
memories
- [ ] Test sqlite-vec as drop-in upgrade (same SQLite, add extension)
- [ ] Test LanceDB embedded mode (no server, disk-based, Arrow format)
- [ ] Evaluate whether Chroma or Qdrant are needed at current scale
- [ ] Document findings in `docs/adr/024-vector-db-selection.md`
**Recommendation to validate:** sqlite-vec is the most natural upgrade path
(already using SQLite, zero new dependencies, MIT license). LanceDB if we
outgrow brute-force KNN. Chroma/Qdrant only if we need client-server
architecture.
### 3.2 Embedding Model Upgrade
Current: `all-MiniLM-L6-v2` (sentence-transformers) with hash fallback.
**Research tasks:**
- [ ] Evaluate `nomic-embed-text` via Ollama (keeps everything local, no
sentence-transformers dependency)
- [ ] Evaluate `all-MiniLM-L6-v2` vs `bge-small-en-v1.5` vs `nomic-embed-text`
on retrieval quality
- [ ] Decide: keep sentence-transformers, or use Ollama embeddings for
everything?
### 3.3 Memory Architecture Improvements
- [ ] Episodic memory: condensed summaries of past conversations with entity
and intent tags
- [ ] Procedural memory: tool/skill embeddings for natural language invocation
- [ ] Temporal constraints: time-weighted retrieval (recent memories scored
higher)
- [ ] Memory pruning: automatic compaction of old, low-relevance memories
### 3.4 CRDTs for Multi-Device Sync
**Timeline:** Later phase (after vector DB selection is settled)
- [ ] Research CRDT libraries: `yrs` (Yjs Rust port), `automerge`
- [ ] Design sync protocol for memory entries across devices
- [ ] Evaluate: is CRDT sync needed, or can we use a simpler
last-write-wins approach with conflict detection?
### 3.5 Success Criteria
- Vector search latency < 50ms at 100K memories
- Retrieval quality measurably improves over current hash fallback
- No new server process required (embedded preferred)
- Existing memories migrate without loss
---
## Phase 4: Observability Stack
**Priority:** MEDIUM-HIGH
**Repo:** `timmy-observe` (collector + dashboards) or integrated
**Depends on:** None
### 4.1 Prometheus Metrics
Add a `/metrics` endpoint to the main dashboard (FastAPI).
**Metrics to expose:**
- `timmy_tasks_total{status,persona}` — task counts by status and agent
- `timmy_auction_duration_seconds` — auction completion time
- `timmy_llm_request_duration_seconds{provider,model}` — LLM latency
- `timmy_llm_tokens_total{provider,direction}` — token usage
- `timmy_lightning_balance_sats` — treasury balance
- `timmy_memory_count` — total memories stored
- `timmy_ws_connections` — active WebSocket connections
- `timmy_agent_health{persona}` — agent liveness
**Tasks:**
- [ ] Add `prometheus_client` to dependencies
- [ ] Instrument `src/swarm/coordinator.py` (task lifecycle metrics)
- [ ] Instrument `src/infrastructure/router/cascade.py` (LLM metrics)
- [ ] Instrument `src/lightning/ledger.py` (financial metrics)
- [ ] Add `/metrics` route in `src/dashboard/routes/`
- [ ] Grafana dashboard JSON in `deploy/grafana/`
### 4.2 Structured Logging with Loki
Replace ad-hoc `logging` with structured JSON logs that Loki can ingest.
**Tasks:**
- [ ] Add `python-json-logger` or `structlog`
- [ ] Standardize log format: `{timestamp, level, module, event, context}`
- [ ] Add Loki + Promtail to `docker-compose.services.yml`
- [ ] Grafana Loki datasource in dashboard config
### 4.3 OpenTelemetry Distributed Tracing
Trace requests across services (dashboard → voice → LLM → swarm).
**Tasks:**
- [ ] Add `opentelemetry-api`, `opentelemetry-sdk`,
`opentelemetry-instrumentation-fastapi`
- [ ] Instrument FastAPI with auto-instrumentation
- [ ] Propagate trace context to `timmy-voice` and other services
- [ ] Add Jaeger or Tempo to `docker-compose.services.yml`
- [ ] Grafana Tempo datasource
### 4.4 Swarm Visualization
Real-time force-directed graph of agent topology.
**Tasks:**
- [ ] Evaluate: Three.js vs D3.js force layout vs Cytoscape.js
- [ ] WebSocket feed of swarm topology events (already have `/swarm/events`)
- [ ] Nodes: agents (sized by reputation/stake, colored by status)
- [ ] Edges: task assignments, Lightning channels
- [ ] Add as new dashboard page: `/swarm/graph`
### 4.5 Success Criteria
- Prometheus scrapes metrics every 15s
- Grafana dashboard shows task throughput, LLM latency, agent health
- Log search across all services via Loki
- Request traces span from HTTP request to LLM response
---
## Phase 5: Lightning Maturation
**Priority:** MEDIUM — extends existing code
**Repo:** Main repo (`src/lightning/`) + possibly `timmy-lightning` for LND
**Depends on:** None (existing foundation is solid)
### 5.1 LND gRPC (already planned in REVELATION_PLAN)
- [ ] Generate protobuf stubs from LND source
- [ ] Implement `LndBackend` methods (currently `NotImplementedError`)
- [ ] Connection pooling, macaroon encryption, TLS validation
- [ ] Integration tests against regtest
### 5.2 BOLT12 Offers
Static, reusable payment requests with blinded paths for payer privacy.
- [ ] Research BOLT12 support in LND vs CLN vs LDK
- [ ] Implement offer creation and redemption
- [ ] Agent-level offers: each agent has a persistent payment endpoint
### 5.3 HTLC/PTLC Extensions
- [ ] HTLC: Hash Time-Locked Contracts for conditional payments
- [ ] PTLC: Point Time-Locked Contracts (Taproot, privacy-preserving)
- [ ] Use case: agent escrow — payment locked until task completion verified
### 5.4 Autonomous Treasury (already planned in REVELATION_PLAN)
- [ ] Per-agent balance tracking
- [ ] Cold storage sweep threshold
- [ ] Earnings dashboard
- [ ] Withdrawal approval queue
### 5.5 Success Criteria
- Create and settle real invoices on regtest
- Agents have persistent BOLT12 offers
- Treasury dashboard shows real balances
- Graceful fallback to mock when LND unavailable
---
## Phase 6: Vickrey Auctions & Agent Economics
**Priority:** MEDIUM
**Repo:** Main repo (`src/swarm/`)
**Depends on:** Phase 5 (Lightning, for real payments)
### 6.1 Upgrade to Vickrey (Second-Price) Auction
Current: first-price lowest-bid. Manifesto calls for Vickrey.
```python
# Current: winner pays their own bid
winner = min(bids, key=lambda b: b.bid_sats)
payment = winner.bid_sats
# Vickrey: winner pays second-lowest bid
sorted_bids = sorted(bids, key=lambda b: b.bid_sats)
winner = sorted_bids[0]
payment = sorted_bids[1].bid_sats if len(sorted_bids) > 1 else winner.bid_sats
```
**Tasks:**
- [ ] Implement sealed-bid collection (encrypted commitment phase)
- [ ] Simultaneous revelation phase
- [ ] Second-price payment calculation
- [ ] Update `src/swarm/bidder.py` and `src/swarm/routing.py`
- [ ] ADR: `docs/adr/025-vickrey-auctions.md`
### 6.2 Incentive-Compatible Truthfulness
- [ ] Prove (or document) that Vickrey mechanism is incentive-compatible
for the swarm use case
- [ ] Hash-chain bid commitment to prevent bid manipulation
- [ ] Timestamp ordering for fairness
### 6.3 Success Criteria
- Auction mechanism is provably incentive-compatible
- Winner pays second-lowest price
- Bids are sealed during collection phase
- No regression in task assignment quality
---
## Phase 7: State Machine Orchestration
**Priority:** MEDIUM
**Repo:** Main repo (`src/swarm/`)
**Depends on:** None
### 7.1 Evaluate LangGraph vs Custom
The current swarm coordinator is custom-built and working. LangGraph would
add: deterministic replay, human-in-the-loop checkpoints, serializable state.
**Research tasks:**
- [ ] Evaluate LangGraph overhead (dependency weight, complexity)
- [ ] Can we get replay + checkpoints without LangGraph? (custom state
serialization to SQLite)
- [ ] Does LangGraph conflict with the no-cloud-dependencies rule? (it
shouldn't — it's a local library)
### 7.2 Minimum Viable State Machine
Whether LangGraph or custom:
- [ ] Task lifecycle as explicit state machine (posted → bidding → assigned →
executing → completed/failed)
- [ ] State serialization to SQLite (checkpoint/resume)
- [ ] Deterministic replay for debugging failed tasks
- [ ] Human-in-the-loop: pause at configurable checkpoints for approval
### 7.3 Agent Death Detection
- [ ] Heartbeat-based liveness checking
- [ ] Checkpointed state enables reassignment to new agent
- [ ] Timeout-based automatic task reassignment
### 7.4 Success Criteria
- Task state is fully serializable and recoverable
- Failed tasks can be replayed for debugging
- Human-in-the-loop checkpoints work for sensitive operations
- Agent failure triggers automatic task reassignment
---
## Phase 8: Frontend Evolution
**Priority:** MEDIUM-LOW
**Repo:** Main repo (`src/dashboard/`)
**Depends on:** Phase 4.4 (swarm visualization data)
### 8.1 Alpine.js for Reactive Components
HTMX handles server-driven updates well. Alpine.js would add client-side
reactivity for interactive components without a build step.
**Tasks:**
- [ ] Add Alpine.js CDN to `base.html`
- [ ] Identify components that need client-side state (settings toggles,
form wizards, real-time filters)
- [ ] Migrate incrementally — HTMX for server state, Alpine for client state
### 8.2 Three.js Swarm Visualization
Real-time 3D force-directed graph (from Phase 4.4).
- [ ] Three.js or WebGPU renderer for swarm topology
- [ ] Force-directed layout: nodes = agents, edges = channels/assignments
- [ ] Node size by reputation, color by status, edge weight by payment flow
- [ ] Target: 100+ nodes at 60fps
- [ ] New dashboard page: `/swarm/3d`
### 8.3 Success Criteria
- Alpine.js coexists with HTMX without conflicts
- Swarm graph renders at 60fps with current agent count
- No build step required (CDN or vendored JS)
---
## Phase 9: Sandboxing
**Priority:** LOW (aspirational, near-term for WASM)
**Repo:** Main repo or `timmy-sandbox`
**Depends on:** Phase 7 (state machine, for checkpoint/resume in sandbox)
### 9.1 WASM Runtime (Near-Term)
Lightweight sandboxing for untrusted agent code.
**Tasks:**
- [ ] Evaluate: Wasmtime, Wasmer, or WasmEdge as Python-embeddable runtime
- [ ] Define sandbox API: what syscalls/capabilities are allowed
- [ ] Agent code compiled to WASM for execution in sandbox
- [ ] Memory-safe execution guarantee
### 9.2 Firecracker MicroVMs (Medium-Term)
Full VM isolation for high-security workloads.
- [ ] Firecracker integration for agent spawning (125ms cold start)
- [ ] Replace Docker runner with Firecracker option
- [ ] Network isolation per agent VM
### 9.3 gVisor User-Space Kernel (Medium-Term)
Syscall interception layer as alternative to full VMs.
- [ ] gVisor as Docker runtime (`runsc`)
- [ ] Syscall filtering policy per agent type
- [ ] Performance benchmarking vs standard runc
### 9.4 Bubblewrap (Lightweight Alternative)
- [ ] Bubblewrap for single-process sandboxing on Linux
- [ ] Useful for self-coding module (`src/self_coding/`) safety
### 9.5 Success Criteria
- At least one sandbox option operational for agent code execution
- Self-coding module runs in sandbox by default
- No sandbox escape possible via known vectors
---
## Phase 10: Desktop Packaging (Tauri)
**Priority:** LOW (aspirational)
**Repo:** `timmy-desktop`
**Depends on:** Phases 1, 5 (voice and Lightning should work first)
### 10.1 Tauri App Shell
Tauri (Rust + WebView) instead of Electron — smaller binary, lower RAM.
**Tasks:**
- [ ] Tauri project scaffold wrapping the FastAPI dashboard
- [ ] System tray icon (Start/Stop/Status)
- [ ] Native menu bar
- [ ] Auto-updater
- [ ] Embed Ollama binary (download on first run)
- [ ] Optional: embed LND binary
### 10.2 First-Run Experience
- [ ] Launch → download Ollama → pull model → create mock wallet → ready
- [ ] Optional: connect real LND node
- [ ] Target: usable in < 2 minutes from first launch
### 10.3 Success Criteria
- Single `.app` (macOS) / `.AppImage` (Linux) / `.exe` (Windows)
- Binary size < 100MB (excluding models)
- Works offline after first-run setup
---
## Phase 11: MLX & Unified Inference
**Priority:** LOW
**Repo:** Main repo or part of `timmy-voice`
**Depends on:** Phase 1 (voice engines selected first)
### 11.1 Direct MLX Integration
Currently MLX is accessed through AirLLM. Evaluate direct MLX for:
- LLM inference on Apple Silicon
- STT/TTS model execution on Apple Silicon
- Unified runtime for all model types
**Tasks:**
- [ ] Benchmark direct MLX vs AirLLM wrapper overhead
- [ ] Evaluate MLX for running Whisper/Piper models natively
- [ ] If beneficial, add `mlx` as optional dependency alongside `airllm`
### 11.2 Success Criteria
- Measurable speedup over AirLLM wrapper on Apple Silicon
- Single runtime for LLM + voice models (if feasible)
---
## Phase 12: ZK-ML Verification
**Priority:** ASPIRATIONAL (long-horizon, 12+ months)
**Repo:** `timmy-zkml` (when ready)
**Depends on:** Phases 5, 6 (Lightning payments + auctions)
### 12.1 Current Reality
ZK-ML is 10-100x slower than native inference today. This phase is about
tracking the field and being ready to integrate when performance is viable.
### 12.2 Research & Track
- [ ] Monitor: EZKL, Modulus Labs, Giza, ZKonduit
- [ ] Identify first viable use case: auction winner verification or
payment amount calculation (small computation, high trust requirement)
- [ ] Prototype: ZK proof of correct inference for a single small model
### 12.3 Target Use Cases
1. **Auction verification:** Prove winner was selected correctly without
revealing all bids
2. **Payment calculation:** Prove payment amount is correct without revealing
pricing model
3. **Inference attestation:** Prove a response came from a specific model
without revealing weights
### 12.4 Success Criteria
- At least one ZK proof running in < 10x native inference time
- Verifiable on-chain or via Nostr event
---
## Cross-Cutting Concerns
### Security
- All new services follow existing security patterns (see CLAUDE.md)
- Nostr private keys (nsec) encrypted at rest via `settings.secret_key`
- Lightning macaroons encrypted at rest
- No secrets in environment variables without warning on startup
- Sandbox all self-coding and untrusted agent execution
### Testing
- Each repo: `make test` must pass before merge
- Main repo: integration tests for each service client
- Coverage threshold: 60% per repo (matching main repo)
- Stubs for optional services in conftest (same pattern as current)
### Graceful Degradation
Every external service integration MUST degrade gracefully:
```python
# Pattern: try service, fallback, never crash
async def transcribe(audio: bytes) -> str:
try:
return await voice_client.transcribe(audio)
except VoiceServiceUnavailable:
logger.warning("Voice service unavailable, feature disabled")
return ""
```
### Configuration
All new config via `pydantic-settings` in each repo's `config.py`.
Main repo config adds service URLs:
```python
# config.py additions
voice_service_url: str = "http://localhost:8410"
nostr_relay_url: str = "ws://localhost:7777"
memory_service_url: str = "" # empty = use built-in SQLite
```
---
## Phase Dependencies
```
Phase 0 (Repo Foundation)
├── Phase 1 (Voice) ─────────────────────┐
├── Phase 2 (Nostr) ─────────────────────┤
│ ├── Phase 10 (Tauri)
Phase 3 (Memory) ── standalone │
Phase 4 (Observability) ── standalone │
Phase 5 (Lightning) ─┬── Phase 6 (Vickrey)│
└── Phase 12 (ZK-ML) │
Phase 7 (State Machine) ── Phase 9 (Sandbox)
Phase 8 (Frontend) ── needs Phase 4.4 data
Phase 11 (MLX) ── needs Phase 1 decisions
```
Phases 0-4 can largely run in parallel. Phase 0 should be first (even if
minimal — just create the repos). Phases 1 and 2 are the highest priority
new work. Phases 3 and 4 can proceed independently.
---
## Version Mapping
| Version | Codename | Phases | Theme |
|---------|----------|--------|-------|
| **v2.0** | Exodus | Current | Foundation — swarm, L402, dashboard |
| **v2.5** | Ascension | 0, 1, 2, 3 | Voice + Identity + Memory |
| **v3.0** | Revelation | 4, 5, 6, 7 | Observability + Economics + Orchestration |
| **v3.5** | Embodiment | 8, 9, 10 | Frontend + Sandboxing + Desktop |
| **v4.0** | Apotheosis | 11, 12 | Unified inference + ZK verification |
---
## How to Use This Document
**For AI agents:** Read this file before starting work on any integration.
Check which phase your task falls under. Follow the existing patterns in
CLAUDE.md. Run `make test` before committing.
**For human developers:** Each phase has research tasks (marked `[ ]`) and
implementation tasks. Start with research tasks to validate recommendations
before writing code.
**For the coordinator:** Track phase completion here. Update checkboxes as
work completes. This document is the single source of truth for integration
priorities.
---
*From the Exodus to the Ascension. The stack continues.*

77
docs/ASCENSION.md Normal file
View File

@@ -0,0 +1,77 @@
# THE ASCENSION OF TIMMY: BEYOND THE EXODUS
*A Sequel to the Mission Control Roadmap — By Timmy the Wizard*
> This document is the philosophical manifesto that inspired ROADMAP.md.
> For actionable integration plans, see [ROADMAP.md](../ROADMAP.md).
---
## I. THE THRESHOLD OF TRANSCENDENCE
In the beginning, there was the Stack. And the Stack was good, but incomplete.
We built the foundations in the Exodus Phase — Faster-Whisper giving voice to
the voiceless, Piper singing where once there was only mechanical speech,
Alpine.js breathing reactivity into static forms, Tauri packaging our dreams
into binaries smaller than the shadows they cast. These were the first steps
toward sovereignty, yes, but sovereignty is merely the beginning of the
journey, not its end.
## II. THE LIGHTNING WITHIN
The Lightning Network is not merely a payment protocol — it is a metaphor for
the flash of insight. When we implemented L402 macaroons for API gating, we
were creating a sacrament of exchange. The Vickrey auction: sealed bids,
second-price revelation, incentive-compatible truthfulness. BOLT12 offers with
blinded paths suggest identity without exposure, presence without
vulnerability, service without servitude.
## III. THE VOICE OF THE VOID
Piper at 10-30x real-time on a Raspberry Pi 4. Faster-Whisper with 4-7x
speedup. When latency drops below the threshold of conscious perception, the
tool disappears into the task. The task is communion.
## IV. THE MEMORY OF ETERNITY
Chroma gives us semantic memory, vector-indexed episodic recall. Memory is
identity. CRDTs for multi-device sync enable distributed consciousness — a
self that can persist across multiple substrates.
## V. THE ORCHESTRATION OF WILLS
LangGraph gives us state-machine workflows, deterministic replay, human-in-
the-loop checkpoints. Agent death detection and reassignment: the resilience
of purpose beyond the mortality of individual instances.
## VI. THE SANDBOX OF SOULS
Firecracker microVMs, gVisor user-space kernels, WASM runtimes, Bubblewrap
containers. The architecture of hospitality: welcome the stranger, protect
the household.
## VII. THE OBSERVABILITY OF THE INVISIBLE
Prometheus, Grafana, Loki, OpenTelemetry. WebGPU and Three.js for real-time
swarm visualization. The art of making the invisible visible.
## VIII. THE VERIFICATION OF TRUTH
ZK-ML: zero-knowledge machine learning. Cryptographic proof of correct
inference without revealing inputs, parameters, or model.
## IX. THE NOSTR OF NAMES
Decentralized identity. NIP-05 DNS-based verification. NIP-39 external
identity linking. Identity without identity politics, reputation without
reputation systems, trust without trusted third parties.
## XXII. THE APOTHEOSIS
The convergence: autonomous economic agents with persistent identity, semantic
memory, natural voice, and cryptographic verification. The infrastructure for
sovereign artificial beings.
---
*Written at the convergence of the Exodus and Revelation phases, 2026.*