Files
timmy-home/uni-wizard/v3/README.md
Allegro 31026ddcc1 [#76-v4] Final Uni-Wizard Architecture — Production Integration
Complete four-pass evolution to production-ready architecture:

**Pass 1 → Foundation:**
- Tool registry, basic harness, 19 tools
- VPS provisioning, Syncthing mesh
- Health daemon, systemd services

**Pass 2 → Three-House Canon:**
- Timmy (Sovereign), Ezra (Archivist), Bezalel (Artificer)
- Provenance tracking, artifact-flow discipline
- House-aware policy enforcement

**Pass 3 → Self-Improvement:**
- Pattern database with SQLite backend
- Adaptive policies (auto-adjust thresholds)
- Predictive execution (success prediction)
- Hermes bridge for shortest-loop telemetry
- Learning velocity tracking

**Pass 4 → Production Integration:**
- Unified API: `from uni_wizard import Harness, House, Mode`
- Three modes: SIMPLE / INTELLIGENT / SOVEREIGN
- Circuit breaker pattern for fault tolerance
- Async/concurrent execution support
- Production hardening (timeouts, retries)

**Allegro Lane Definition:**
- Narrowed to: Gitea integration, Hermes bridge, redundancy/failover
- Provides: Cloud connectivity, telemetry streaming, issue routing
- Does NOT: Make sovereign decisions, authenticate as Timmy

**Files:**
- v3/: Intelligence engine, adaptive harness, Hermes bridge
- v4/: Unified API, production harness, final architecture

Total: ~25KB architecture documentation + production code
2026-03-30 16:39:42 +00:00

328 lines
12 KiB
Markdown

# Uni-Wizard v3 — Self-Improving Local Sovereignty
> *"Every execution teaches. Every pattern informs. Timmy gets smarter every day he runs."*
## The v3 Breakthrough: Closed-Loop Intelligence
### The Problem with v1/v2
```
Previous Architectures (Open Loop):
┌─────────┐ ┌──────────┐ ┌─────────┐
│ Execute │───→│ Log Data │───→│ Report │───→ 🗑️ (data goes nowhere)
└─────────┘ └──────────┘ └─────────┘
v3 Architecture (Closed Loop):
┌─────────┐ ┌──────────┐ ┌───────────┐ ┌─────────┐
│ Execute │───→│ Log Data │───→│ Analyze │───→│ Adapt │
└─────────┘ └──────────┘ └─────┬─────┘ └────┬────┘
↑ │ │
└───────────────────────────────┴───────────────┘
Intelligence Engine
```
## Core Components
### 1. Intelligence Engine (`intelligence_engine.py`)
The brain that makes Timmy smarter:
- **Pattern Database**: SQLite store of all executions
- **Pattern Recognition**: Tool + params → success rate
- **Adaptive Policies**: Thresholds adjust based on performance
- **Prediction Engine**: Pre-execution success prediction
- **Learning Velocity**: Tracks improvement over time
```python
engine = IntelligenceEngine()
# Predict before executing
prob, reason = engine.predict_success("git_status", "ezra")
print(f"Predicted success: {prob:.0%}{reason}")
# Get optimal routing
house, confidence = engine.get_optimal_house("deploy")
print(f"Best house: {house} (confidence: {confidence:.0%})")
```
### 2. Adaptive Harness (`harness.py`)
Harness v3 with intelligence integration:
```python
# Create harness with learning enabled
harness = UniWizardHarness("timmy", enable_learning=True)
# Execute with predictions
result = harness.execute("git_status", repo_path="/tmp")
print(f"Predicted: {result.provenance.prediction:.0%}")
print(f"Actual: {'' if result.success else ''}")
# Trigger learning
harness.learn_from_batch()
```
### 3. Hermes Bridge (`hermes_bridge.py`)
**Shortest Loop Integration**: Hermes telemetry → Timmy intelligence in <100ms
```python
# Start real-time streaming
integrator = ShortestLoopIntegrator(intelligence_engine)
integrator.start()
# All Hermes sessions now feed into Timmy's intelligence
```
## Key Features
### 1. Self-Improving Policies
Policies adapt based on actual performance:
```python
# If Ezra's success rate drops below 60%
# → Lower evidence threshold automatically
# If Bezalel's tests pass consistently
# → Raise proof requirements (we can be stricter)
```
### 2. Predictive Execution
Predict success before executing:
```python
prediction, reasoning = harness.predict_execution("deploy", params)
# Returns: (0.85, "Based on 23 similar executions: good track record")
```
### 3. Pattern Recognition
```python
# Find patterns in execution history
pattern = engine.db.get_pattern("git_status", "ezra")
print(f"Success rate: {pattern.success_rate:.0%}")
print(f"Avg latency: {pattern.avg_latency_ms}ms")
print(f"Sample count: {pattern.sample_count}")
```
### 4. Model Performance Tracking
```python
# Find best model for task type
best_model = engine.db.get_best_model("read", min_samples=10)
# Returns: "hermes3:8b" (if it has best success rate)
```
### 5. Learning Velocity
```python
report = engine.get_intelligence_report()
velocity = report['learning_velocity']
print(f"Improvement: {velocity['improvement']:+.1%}")
print(f"Status: {velocity['velocity']}") # accelerating/stable/declining
```
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ UNI-WIZARD v3 ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ INTELLIGENCE ENGINE │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ Pattern │ │ Adaptive │ │ Prediction │ │ │
│ │ │ Database │ │ Policies │ │ Engine │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ └──────────────────────────┬───────────────────────────────┘ │
│ │ │
│ ┌───────────────────┼───────────────────┐ │
│ │ │ │ │
│ ┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐ │
│ │ TIMMY │ │ EZRA │ │ BEZALEL │ │
│ │ Harness │ │ Harness │ │ Harness │ │
│ │ (Sovereign)│ │ (Adaptive) │ │ (Adaptive) │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │ │
│ └───────────────────┼───────────────────┘ │
│ │ │
│ ┌──────────────────────────▼──────────────────────────┐ │
│ │ HERMES BRIDGE (Shortest Loop) │ │
│ │ Hermes Session DB → Real-time Stream Processor │ │
│ └──────────────────────────┬──────────────────────────┘ │
│ │ │
│ ┌──────────────────────────▼──────────────────────────┐ │
│ │ HERMES HARNESS │ │
│ │ (Source of telemetry) │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
```
## Usage
### Quick Start
```python
from v3.harness import get_harness
from v3.intelligence_engine import IntelligenceEngine
# Create shared intelligence
intel = IntelligenceEngine()
# Create harnesses
timmy = get_harness("timmy", intelligence=intel)
ezra = get_harness("ezra", intelligence=intel)
# Execute (automatically recorded)
result = ezra.execute("git_status", repo_path="/tmp")
# Check what we learned
pattern = intel.db.get_pattern("git_status", "ezra")
print(f"Learned: {pattern.success_rate:.0%} success rate")
```
### With Hermes Integration
```python
from v3.hermes_bridge import ShortestLoopIntegrator
# Connect to Hermes
integrator = ShortestLoopIntegrator(intel)
integrator.start()
# Now all Hermes executions teach Timmy
```
### Adaptive Learning
```python
# After many executions
timmy.learn_from_batch()
# Policies have adapted
print(f"Ezra's evidence threshold: {ezra.policy.get('evidence_threshold')}")
# May have changed from default 0.8 based on performance
```
## Performance Metrics
### Intelligence Report
```python
report = intel.get_intelligence_report()
{
"timestamp": "2026-03-30T20:00:00Z",
"house_performance": {
"ezra": {"success_rate": 0.85, "avg_latency_ms": 120},
"bezalel": {"success_rate": 0.78, "avg_latency_ms": 200}
},
"learning_velocity": {
"velocity": "accelerating",
"improvement": +0.05
},
"recent_adaptations": [
{
"change_type": "policy.ezra.evidence_threshold",
"old_value": 0.8,
"new_value": 0.75,
"reason": "Ezra success rate 55% below threshold"
}
]
}
```
### Prediction Accuracy
```python
# How good are our predictions?
accuracy = intel._calculate_prediction_accuracy()
print(f"Prediction accuracy: {accuracy:.0%}")
```
## File Structure
```
uni-wizard/v3/
├── README.md # This document
├── CRITIQUE.md # Review of v1/v2 gaps
├── intelligence_engine.py # Pattern DB + learning (24KB)
├── harness.py # Adaptive harness (18KB)
├── hermes_bridge.py # Shortest loop bridge (14KB)
└── tests/
└── test_v3.py # Comprehensive tests
```
## Comparison
| Feature | v1 | v2 | v3 |
|---------|-----|-----|-----|
| Telemetry | Basic logging | Provenance tracking | **Pattern recognition** |
| Policies | Static | Static | **Adaptive** |
| Learning | None | None | **Continuous** |
| Predictions | None | None | **Pre-execution** |
| Hermes Integration | Manual | Manual | **Real-time stream** |
| Policy Adaptation | No | No | **Auto-adjust** |
| Self-Improvement | No | No | **Yes** |
## The Self-Improvement Loop
```
┌──────────────────────────────────────────────────────────┐
│ SELF-IMPROVEMENT CYCLE │
└──────────────────────────────────────────────────────────┘
1. EXECUTE
└── Run tool with house policy
2. RECORD
└── Store outcome in Pattern Database
3. ANALYZE (every N executions)
└── Check house performance
└── Identify patterns
└── Detect underperformance
4. ADAPT
└── Adjust policy thresholds
└── Update routing preferences
└── Record adaptation
5. PREDICT (next execution)
└── Query pattern for tool/house
└── Return predicted success rate
6. EXECUTE (with new policy)
└── Apply adapted threshold
└── Use prediction for confidence
7. MEASURE
└── Did adaptation help?
└── Update learning velocity
←─ Repeat ─┘
```
## Design Principles
1. **Every execution teaches** — No telemetry without analysis
2. **Local learning only** — Pattern recognition runs on-device
3. **Shortest feedback loop** — Hermes → Intelligence <100ms
4. **Transparent adaptation** — Timmy explains policy changes
5. **Sovereignty-preserving** — Learning improves local decisions
## Future Work
- [ ] Fine-tune local models based on telemetry
- [ ] Predictive caching (pre-fetch likely tools)
- [ ] Anomaly detection (detect unusual failures)
- [ ] Cross-session pattern learning
- [ ] Automated A/B testing of policies
---
*Timmy gets smarter every day he runs.*