Complete four-pass evolution to production-ready architecture:

**Pass 1 → Foundation:**
- Tool registry, basic harness, 19 tools
- VPS provisioning, Syncthing mesh
- Health daemon, systemd services

**Pass 2 → Three-House Canon:**
- Timmy (Sovereign), Ezra (Archivist), Bezalel (Artificer)
- Provenance tracking, artifact-flow discipline
- House-aware policy enforcement

**Pass 3 → Self-Improvement:**
- Pattern database with SQLite backend
- Adaptive policies (auto-adjust thresholds)
- Predictive execution (success prediction)
- Hermes bridge for shortest-loop telemetry
- Learning velocity tracking

**Pass 4 → Production Integration:**
- Unified API: `from uni_wizard import Harness, House, Mode`
- Three modes: SIMPLE / INTELLIGENT / SOVEREIGN
- Circuit breaker pattern for fault tolerance
- Async/concurrent execution support
- Production hardening (timeouts, retries)

**Allegro Lane Definition:**
- Narrowed to: Gitea integration, Hermes bridge, redundancy/failover
- Provides: Cloud connectivity, telemetry streaming, issue routing
- Does NOT: Make sovereign decisions, authenticate as Timmy

**Files:**
- v3/: Intelligence engine, adaptive harness, Hermes bridge
- v4/: Unified API, production harness, final architecture

Total: ~25KB architecture documentation + production code
Uni-Wizard v3 — Self-Improving Local Sovereignty
"Every execution teaches. Every pattern informs. Timmy gets smarter every day he runs."
The v3 Breakthrough: Closed-Loop Intelligence
The Problem with v1/v2
Previous Architectures (Open Loop):
┌─────────┐    ┌──────────┐    ┌─────────┐
│ Execute │───→│ Log Data │───→│ Report  │───→ 🗑️ (data goes nowhere)
└─────────┘    └──────────┘    └─────────┘
v3 Architecture (Closed Loop):
┌─────────┐    ┌──────────┐    ┌───────────┐    ┌─────────┐
│ Execute │───→│ Log Data │───→│  Analyze  │───→│  Adapt  │
└─────────┘    └──────────┘    └─────┬─────┘    └────┬────┘
     ↑                               │               │
     └───────────────────────────────┴───────────────┘
                    Intelligence Engine
Core Components
1. Intelligence Engine (intelligence_engine.py)
The brain that makes Timmy smarter:
- Pattern Database: SQLite store of all executions
- Pattern Recognition: Tool + params → success rate
- Adaptive Policies: Thresholds adjust based on performance
- Prediction Engine: Pre-execution success prediction
- Learning Velocity: Tracks improvement over time
from v3.intelligence_engine import IntelligenceEngine

engine = IntelligenceEngine()

# Predict before executing
prob, reason = engine.predict_success("git_status", "ezra")
print(f"Predicted success: {prob:.0%} — {reason}")

# Get optimal routing
house, confidence = engine.get_optimal_house("deploy")
print(f"Best house: {house} (confidence: {confidence:.0%})")
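To make the pattern-database idea concrete, here is a minimal sketch of a SQLite-backed store that logs one row per execution and derives a success rate, average latency, and sample count per tool and house. The `PatternStore` class, the `executions` table, and its columns are assumptions made for this illustration; the real schema in intelligence_engine.py may differ.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pattern:
    tool: str
    house: str
    success_rate: float
    avg_latency_ms: float
    sample_count: int


class PatternStore:
    """Hypothetical stand-in for the pattern database (illustrative schema)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS executions ("
            "tool TEXT, house TEXT, success INTEGER, latency_ms REAL)"
        )

    def record(self, tool: str, house: str, success: bool, latency_ms: float) -> None:
        # One row per execution; aggregates are computed at query time.
        self.conn.execute(
            "INSERT INTO executions VALUES (?, ?, ?, ?)",
            (tool, house, int(success), latency_ms),
        )
        self.conn.commit()

    def get_pattern(self, tool: str, house: str) -> Optional[Pattern]:
        rate, latency, count = self.conn.execute(
            "SELECT AVG(success), AVG(latency_ms), COUNT(*) "
            "FROM executions WHERE tool = ? AND house = ?",
            (tool, house),
        ).fetchone()
        if count == 0:
            return None  # no history yet for this tool/house pair
        return Pattern(tool, house, rate, latency, count)
```

Computing the aggregates at query time keeps the write path trivial; a store that sees heavy traffic would more likely maintain running counters instead.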
2. Adaptive Harness (harness.py)
Harness v3 with intelligence integration:
from v3.harness import UniWizardHarness  # assumed to be exported next to get_harness

# Create harness with learning enabled
harness = UniWizardHarness("timmy", enable_learning=True)

# Execute with predictions
result = harness.execute("git_status", repo_path="/tmp")
print(f"Predicted: {result.provenance.prediction:.0%}")
print(f"Actual: {'✅' if result.success else '❌'}")

# Trigger learning
harness.learn_from_batch()
3. Hermes Bridge (hermes_bridge.py)
Shortest Loop Integration: Hermes telemetry → Timmy intelligence in <100ms
from v3.hermes_bridge import ShortestLoopIntegrator

# Start real-time streaming (engine is the IntelligenceEngine created earlier)
integrator = ShortestLoopIntegrator(engine)
integrator.start()
# All Hermes sessions now feed into Timmy's intelligence
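One way a bridge like this could pick up new telemetry is to tail a session table and forward only the rows it has not seen yet. The sketch below is purely illustrative: the `SessionTail` class, the `sessions` table, and its columns are hypothetical and say nothing about how hermes_bridge.py or the Hermes session database are actually laid out.

```python
import sqlite3
from typing import Callable


class SessionTail:
    """Hypothetical tailer: forwards unseen session rows to a callback."""

    def __init__(
        self,
        conn: sqlite3.Connection,
        on_event: Callable[[str, bool], None],
    ) -> None:
        self.conn = conn
        self.on_event = on_event
        self.last_id = 0  # highest row id already forwarded

    def poll_once(self) -> int:
        """Forward rows newer than last_id; return how many were forwarded."""
        rows = self.conn.execute(
            "SELECT id, tool, success FROM sessions WHERE id > ? ORDER BY id",
            (self.last_id,),
        ).fetchall()
        for row_id, tool, success in rows:
            self.on_event(tool, bool(success))
            self.last_id = row_id
        return len(rows)
```

Calling `poll_once()` on a timer well under 100 ms would be one way to stay inside the latency target stated above; a push-based channel would avoid polling altogether.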
Key Features
1. Self-Improving Policies
Policies adapt based on actual performance:
# If Ezra's success rate drops below 60%
# → Lower evidence threshold automatically
# If Bezalel's tests pass consistently
# → Raise proof requirements (we can be stricter)
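The two rules above can be sketched as a single threshold-adjustment function. The 60% floor comes from the comment above, and the 0.05 step matches the 0.8 → 0.75 change shown in the example report further down. The 90% trigger for tightening, the clamp bounds, and the function name `adapt_threshold` are assumptions made for this example.

```python
def adapt_threshold(
    current: float,
    success_rate: float,
    low: float = 0.60,    # floor taken from the rule above
    high: float = 0.90,   # assumed trigger for tightening
    step: float = 0.05,   # assumed adjustment step
    bounds: tuple = (0.50, 0.95),  # assumed clamp so the threshold stays sane
) -> tuple:
    """Return (new_threshold, reason) for one adaptation round."""
    floor, ceiling = bounds
    if success_rate < low:
        # Struggling house: relax the requirement one step.
        new = max(floor, current - step)
        reason = f"success rate {success_rate:.0%} below {low:.0%}: relaxing"
    elif success_rate > high:
        # Consistently passing house: we can afford to be stricter.
        new = min(ceiling, current + step)
        reason = f"success rate {success_rate:.0%} above {high:.0%}: tightening"
    else:
        new, reason = current, "success rate within band: no change"
    return round(new, 2), reason
```

Returning the reason alongside the new value keeps each adaptation explainable, in line with the "transparent adaptation" principle listed later in this document.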
2. Predictive Execution
Predict success before executing:
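params = {}  # placeholder: fill in the tool's actual arguments
prediction, reasoning = harness.predict_execution("deploy", params)
# Returns a (probability, reason) tuple, for example:
# (0.85, "Based on 23 similar executions: good track record")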
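As a hedged illustration of how such a prediction could be derived from history, the sketch below estimates the success probability from past counts using add-one (Laplace) smoothing, so that a tool with no history gets a neutral 50% instead of a division by zero. This is a swapped-in, textbook estimator, not necessarily what the engine does; the function name and the "good/mixed/poor" cut-offs are assumptions.

```python
def predict_from_history(successes: int, total: int) -> tuple:
    """Estimate success probability with add-one (Laplace) smoothing."""
    if total < 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total")
    prob = (successes + 1) / (total + 2)
    if total == 0:
        return prob, "No history: falling back to a neutral prior"
    # Assumed cut-offs for the human-readable label.
    if prob >= 0.7:
        label = "good"
    elif prob >= 0.4:
        label = "mixed"
    else:
        label = "poor"
    return prob, f"Based on {total} similar executions: {label} track record"
```

Smoothing also damps overconfidence on small samples: one success out of one run yields about 67%, not 100%.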
3. Pattern Recognition
# Find patterns in execution history
pattern = engine.db.get_pattern("git_status", "ezra")
print(f"Success rate: {pattern.success_rate:.0%}")
print(f"Avg latency: {pattern.avg_latency_ms}ms")
print(f"Sample count: {pattern.sample_count}")
4. Model Performance Tracking
# Find best model for task type
best_model = engine.db.get_best_model("read", min_samples=10)
# Returns the model with the highest success rate among those with
# at least min_samples recorded runs, e.g. "hermes3:8b"
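A `min_samples` cut-off like the one above maps naturally onto a SQL `GROUP BY ... HAVING` query. The following sketch assumes a hypothetical `model_runs` table with `task_type`, `model`, and `success` columns; the real table layout is not documented here, and `create_schema` exists only to make the example self-contained.

```python
import sqlite3
from typing import Optional


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table: one row per model invocation.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS model_runs ("
        "task_type TEXT, model TEXT, success INTEGER)"
    )


def get_best_model(
    conn: sqlite3.Connection, task_type: str, min_samples: int = 10
) -> Optional[str]:
    """Return the model with the highest success rate, or None if none qualify."""
    row = conn.execute(
        "SELECT model FROM model_runs WHERE task_type = ? "
        "GROUP BY model HAVING COUNT(*) >= ? "
        "ORDER BY AVG(success) DESC, COUNT(*) DESC LIMIT 1",
        (task_type, min_samples),
    ).fetchone()
    return row[0] if row else None
```

The `HAVING` clause is what stops a model with three lucky runs from outranking one with a long, slightly imperfect record.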
5. Learning Velocity
report = engine.get_intelligence_report()
velocity = report['learning_velocity']
print(f"Improvement: {velocity['improvement']:+.1%}")
print(f"Status: {velocity['velocity']}") # accelerating/stable/declining
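One plausible way to compute a velocity figure like this is to compare the success rate of the most recent window of executions against the window before it. In the sketch below, the window size, the ±2% tolerance, and the `insufficient_data` label are assumptions; only the accelerating/stable/declining labels and the `improvement` and `velocity` keys come from the snippet above.

```python
from typing import Sequence


def learning_velocity(
    outcomes: Sequence[bool], window: int = 50, tolerance: float = 0.02
) -> dict:
    """Compare success rate in the latest window against the previous one."""
    if len(outcomes) < 2 * window:
        # Not enough history for two full windows (assumed fallback label).
        return {"velocity": "insufficient_data", "improvement": 0.0}
    recent = outcomes[-window:]
    previous = outcomes[-2 * window:-window]
    improvement = sum(recent) / window - sum(previous) / window
    if improvement > tolerance:
        velocity = "accelerating"
    elif improvement < -tolerance:
        velocity = "declining"
    else:
        velocity = "stable"
    return {"velocity": velocity, "improvement": improvement}
```

The tolerance band keeps ordinary noise from flipping the status on every report.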
Architecture
┌─────────────────────────────────────────────────────────────────┐
│                   UNI-WIZARD v3 ARCHITECTURE                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                   INTELLIGENCE ENGINE                    │   │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │   │
│  │  │   Pattern    │  │   Adaptive   │  │  Prediction  │    │   │
│  │  │   Database   │  │   Policies   │  │    Engine    │    │   │
│  │  └──────────────┘  └──────────────┘  └──────────────┘    │   │
│  └──────────────────────────┬───────────────────────────────┘   │
│                             │                                   │
│         ┌───────────────────┼───────────────────┐               │
│         │                   │                   │               │
│  ┌──────▼──────┐     ┌──────▼──────┐     ┌──────▼──────┐        │
│  │    TIMMY    │     │    EZRA     │     │   BEZALEL   │        │
│  │   Harness   │     │   Harness   │     │   Harness   │        │
│  │ (Sovereign) │     │  (Adaptive) │     │  (Adaptive) │        │
│  └──────┬──────┘     └──────┬──────┘     └──────┬──────┘        │
│         │                   │                   │               │
│         └───────────────────┼───────────────────┘               │
│                             │                                   │
│  ┌──────────────────────────▼──────────────────────────┐        │
│  │            HERMES BRIDGE (Shortest Loop)            │        │
│  │   Hermes Session DB → Real-time Stream Processor    │        │
│  └──────────────────────────┬──────────────────────────┘        │
│                             │                                   │
│  ┌──────────────────────────▼──────────────────────────┐        │
│  │                   HERMES HARNESS                    │        │
│  │                (Source of telemetry)                │        │
│  └─────────────────────────────────────────────────────┘        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Usage
Quick Start
from v3.harness import get_harness
from v3.intelligence_engine import IntelligenceEngine
# Create shared intelligence
intel = IntelligenceEngine()
# Create harnesses
timmy = get_harness("timmy", intelligence=intel)
ezra = get_harness("ezra", intelligence=intel)
# Execute (automatically recorded)
result = ezra.execute("git_status", repo_path="/tmp")
# Check what we learned
pattern = intel.db.get_pattern("git_status", "ezra")
print(f"Learned: {pattern.success_rate:.0%} success rate")
With Hermes Integration
from v3.hermes_bridge import ShortestLoopIntegrator
# Connect to Hermes
integrator = ShortestLoopIntegrator(intel)
integrator.start()
# Now all Hermes executions teach Timmy
Adaptive Learning
# After many executions
timmy.learn_from_batch()
# Policies have adapted
print(f"Ezra's evidence threshold: {ezra.policy.get('evidence_threshold')}")
# May have changed from default 0.8 based on performance
Performance Metrics
Intelligence Report
report = intel.get_intelligence_report()
# Example report contents:
{
"timestamp": "2026-03-30T20:00:00Z",
"house_performance": {
"ezra": {"success_rate": 0.85, "avg_latency_ms": 120},
"bezalel": {"success_rate": 0.78, "avg_latency_ms": 200}
},
"learning_velocity": {
"velocity": "accelerating",
"improvement": +0.05
},
"recent_adaptations": [
{
"change_type": "policy.ezra.evidence_threshold",
"old_value": 0.8,
"new_value": 0.75,
"reason": "Ezra success rate 55% below threshold"
}
]
}
Prediction Accuracy
# How good are our predictions?
accuracy = intel._calculate_prediction_accuracy()
print(f"Prediction accuracy: {accuracy:.0%}")
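How the engine scores its own predictions is not spelled out here, so the following is only a sketch of two common measures over (predicted probability, actual outcome) pairs: thresholded accuracy, which counts a prediction of 0.5 or more as "will succeed", and the Brier score, which penalizes confident misses more heavily. Both function names and the 0.5 cut-off are assumptions.

```python
from typing import List, Tuple

Record = Tuple[float, bool]  # (predicted probability, actual success)


def thresholded_accuracy(records: List[Record], cutoff: float = 0.5) -> float:
    """Fraction of records where the thresholded prediction matched reality."""
    if not records:
        raise ValueError("no predictions recorded yet")
    hits = sum((prob >= cutoff) == actual for prob, actual in records)
    return hits / len(records)


def brier_score(records: List[Record]) -> float:
    """Mean squared error between probability and outcome (lower is better)."""
    if not records:
        raise ValueError("no predictions recorded yet")
    return sum((prob - float(actual)) ** 2 for prob, actual in records) / len(records)
```

Accuracy is easier to read in a report, while the Brier score is the better signal for checking whether the predicted probabilities are well calibrated.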
File Structure
uni-wizard/v3/
├── README.md               # This document
├── CRITIQUE.md             # Review of v1/v2 gaps
├── intelligence_engine.py  # Pattern DB + learning (24KB)
├── harness.py              # Adaptive harness (18KB)
├── hermes_bridge.py        # Shortest loop bridge (14KB)
└── tests/
    └── test_v3.py          # Comprehensive tests
Comparison
| Feature | v1 | v2 | v3 |
|---|---|---|---|
| Telemetry | Basic logging | Provenance tracking | Pattern recognition |
| Policies | Static | Static | Adaptive |
| Learning | None | None | Continuous |
| Predictions | None | None | Pre-execution |
| Hermes Integration | Manual | Manual | Real-time stream |
| Policy Adaptation | No | No | Auto-adjust |
| Self-Improvement | No | No | Yes |
The Self-Improvement Loop
┌──────────────────────────────────────────────────────────┐
│ SELF-IMPROVEMENT CYCLE │
└──────────────────────────────────────────────────────────┘
1. EXECUTE
   └── Run tool with house policy
2. RECORD
   └── Store outcome in Pattern Database
3. ANALYZE (every N executions)
   └── Check house performance
   └── Identify patterns
   └── Detect underperformance
4. ADAPT
   └── Adjust policy thresholds
   └── Update routing preferences
   └── Record adaptation
5. PREDICT (next execution)
   └── Query pattern for tool/house
   └── Return predicted success rate
6. EXECUTE (with new policy)
   └── Apply adapted threshold
   └── Use prediction for confidence
7. MEASURE
   └── Did adaptation help?
   └── Update learning velocity
   ←─ Repeat ─┘
Design Principles
- Every execution teaches — No telemetry without analysis
- Local learning only — Pattern recognition runs on-device
- Shortest feedback loop — Hermes → Intelligence <100ms
- Transparent adaptation — Timmy explains policy changes
- Sovereignty-preserving — Learning improves local decisions
Future Work
- Fine-tune local models based on telemetry
- Predictive caching (pre-fetch likely tools)
- Anomaly detection (detect unusual failures)
- Cross-session pattern learning
- Automated A/B testing of policies
Timmy gets smarter every day he runs.