Complete four-pass evolution to production-ready architecture:

**Pass 1 → Foundation:**
- Tool registry, basic harness, 19 tools
- VPS provisioning, Syncthing mesh
- Health daemon, systemd services

**Pass 2 → Three-House Canon:**
- Timmy (Sovereign), Ezra (Archivist), Bezalel (Artificer)
- Provenance tracking, artifact-flow discipline
- House-aware policy enforcement

**Pass 3 → Self-Improvement:**
- Pattern database with SQLite backend
- Adaptive policies (auto-adjust thresholds)
- Predictive execution (success prediction)
- Hermes bridge for shortest-loop telemetry
- Learning velocity tracking

**Pass 4 → Production Integration:**
- Unified API: `from uni_wizard import Harness, House, Mode`
- Three modes: SIMPLE / INTELLIGENT / SOVEREIGN
- Circuit breaker pattern for fault tolerance
- Async/concurrent execution support
- Production hardening (timeouts, retries)

**Allegro Lane Definition:**
- Narrowed to: Gitea integration, Hermes bridge, redundancy/failover
- Provides: Cloud connectivity, telemetry streaming, issue routing
- Does NOT: Make sovereign decisions, authenticate as Timmy

**Files:**
- v3/: Intelligence engine, adaptive harness, Hermes bridge
- v4/: Unified API, production harness, final architecture

Total: ~25KB architecture documentation + production code

# Uni-Wizard v3 — Self-Improving Local Sovereignty

> *"Every execution teaches. Every pattern informs. Timmy gets smarter every day he runs."*

## The v3 Breakthrough: Closed-Loop Intelligence

### The Problem with v1/v2

```
Previous Architectures (Open Loop):
┌─────────┐    ┌──────────┐    ┌─────────┐
│ Execute │───→│ Log Data │───→│ Report  │───→ 🗑️ (data goes nowhere)
└─────────┘    └──────────┘    └─────────┘

v3 Architecture (Closed Loop):
┌─────────┐    ┌──────────┐    ┌───────────┐    ┌─────────┐
│ Execute │───→│ Log Data │───→│  Analyze  │───→│  Adapt  │
└─────────┘    └──────────┘    └─────┬─────┘    └────┬────┘
     ↑                               │               │
     └───────────────────────────────┴───────────────┘
                    Intelligence Engine
```

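The closed loop in the diagram can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: the `ClosedLoop` class, the `run_tool` callable, the five-execution analysis window, and the 0.05 adjustment step are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ClosedLoop:
    """Hypothetical execute -> log -> analyze -> adapt loop."""

    threshold: float = 0.8      # policy value the loop is allowed to adjust
    analyze_every: int = 5      # assumed: analyze after this many executions
    log: List[bool] = field(default_factory=list)

    def execute(self, run_tool: Callable[[], bool]) -> bool:
        ok = run_tool()                     # Execute
        self.log.append(ok)                 # Log Data
        if len(self.log) % self.analyze_every == 0:
            self._adapt(self._analyze())    # Analyze, then Adapt
        return ok

    def _analyze(self) -> float:
        """Success rate over the most recent window."""
        recent = self.log[-self.analyze_every:]
        return sum(recent) / len(recent)

    def _adapt(self, success_rate: float) -> None:
        # Assumed rule: relax the threshold slightly when success is poor.
        if success_rate < 0.6:
            self.threshold = round(max(0.5, self.threshold - 0.05), 2)


loop = ClosedLoop()
for outcome in [True, False, False, True, False]:
    loop.execute(lambda o=outcome: o)
print(loop.threshold)  # the logged outcomes fed back into the policy
```

The point of the sketch is the feedback edge: the logged outcomes are read back and change a value that the next execution will use, which is what the open-loop designs lacked.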
## Core Components

### 1. Intelligence Engine (`intelligence_engine.py`)

The brain that makes Timmy smarter:

- **Pattern Database**: SQLite store of all executions
- **Pattern Recognition**: Tool + params → success rate
- **Adaptive Policies**: Thresholds adjust based on performance
- **Prediction Engine**: Pre-execution success prediction
- **Learning Velocity**: Tracks improvement over time

```python
from v3.intelligence_engine import IntelligenceEngine

engine = IntelligenceEngine()

# Predict before executing
prob, reason = engine.predict_success("git_status", "ezra")
print(f"Predicted success: {prob:.0%} — {reason}")

# Get optimal routing
house, confidence = engine.get_optimal_house("deploy")
print(f"Best house: {house} (confidence: {confidence:.0%})")
```

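The document does not say how `get_optimal_house` arrives at its answer. One plausible reading is sketched below as a standalone function: pick the house with the highest observed success rate and discount the confidence when the sample is small. The `stats` input shape, the `"timmy"` fallback, and the `full_confidence_at` parameter are assumptions made for illustration.

```python
from typing import Dict, Tuple


def get_optimal_house(
    stats: Dict[str, Tuple[int, int]],
    fallback: str = "timmy",          # assumed default when there is no history
    full_confidence_at: int = 20,     # assumed sample size for full confidence
) -> Tuple[str, float]:
    """Pick the house with the best success rate for one tool.

    stats maps house name -> (successes, attempts).
    """
    best_house, best_rate, best_n = fallback, 0.0, 0
    for house, (successes, attempts) in stats.items():
        if attempts == 0:
            continue
        rate = successes / attempts
        if rate > best_rate:
            best_house, best_rate, best_n = house, rate, attempts
    # Assumed confidence: success rate, discounted when the sample is small.
    confidence = best_rate * min(1.0, best_n / full_confidence_at)
    return best_house, confidence


house, confidence = get_optimal_house({"ezra": (18, 20), "bezalel": (7, 10)})
print(f"Best house: {house} (confidence: {confidence:.0%})")
```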
### 2. Adaptive Harness (`harness.py`)

Harness v3 with intelligence integration:

```python
from v3.harness import UniWizardHarness

# Create harness with learning enabled
harness = UniWizardHarness("timmy", enable_learning=True)

# Execute with predictions
result = harness.execute("git_status", repo_path="/tmp")
print(f"Predicted: {result.provenance.prediction:.0%}")
print(f"Actual: {'✅' if result.success else '❌'}")

# Trigger learning
harness.learn_from_batch()
```

### 3. Hermes Bridge (`hermes_bridge.py`)

**Shortest Loop Integration**: Hermes telemetry → Timmy intelligence in <100ms

```python
from v3.hermes_bridge import ShortestLoopIntegrator

# Start real-time streaming
integrator = ShortestLoopIntegrator(intelligence_engine)
integrator.start()

# All Hermes sessions now feed into Timmy's intelligence
```

## Key Features

### 1. Self-Improving Policies

Policies adapt based on actual performance:

```python
# If Ezra's success rate drops below 60%
# → Lower evidence threshold automatically
# If Bezalel's tests pass consistently
# → Raise proof requirements (we can be stricter)
```

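The two rules in the comments above could be expressed as a single adjustment function. This is a sketch under assumptions: the name `adapt_threshold`, the 0.05 step, the 0.90 cutoff for "passing consistently", and the clamping bounds are hypothetical. Only the 60% floor comes from the text.

```python
def adapt_threshold(
    current: float,
    success_rate: float,
    low: float = 0.60,    # from the text: below this, relax the threshold
    high: float = 0.90,   # assumed cutoff for "passing consistently"
    step: float = 0.05,   # assumed adjustment size
) -> float:
    """Return the adjusted threshold, clamped to the range [0.5, 0.95]."""
    if success_rate < low:
        current -= step
    elif success_rate > high:
        current += step
    return round(min(0.95, max(0.5, current)), 2)


print(adapt_threshold(0.80, 0.55))  # struggling house: threshold is lowered
print(adapt_threshold(0.80, 0.97))  # reliable house: threshold is raised
print(adapt_threshold(0.80, 0.75))  # in between: threshold is unchanged
```

Clamping keeps repeated adjustments from driving a threshold to zero or past the point where nothing can pass.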
### 2. Predictive Execution

Predict success before executing:

```python
prediction, reasoning = harness.predict_execution("deploy", params)
# Returns: (0.85, "Based on 23 similar executions: good track record")
```

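A prediction of this shape can be derived from nothing more than the outcomes of similar past executions. The sketch below is illustrative: the function signature, the `min_samples` cutoff, the neutral 0.5 fallback, and the wording bands are assumptions, not the harness's actual logic.

```python
from typing import List, Tuple


def predict_success(outcomes: List[bool], min_samples: int = 5) -> Tuple[float, str]:
    """Estimate success probability from past outcomes of similar executions."""
    n = len(outcomes)
    if n < min_samples:
        # Assumed fallback: a neutral estimate when history is too thin.
        return 0.5, f"Only {n} similar executions: no reliable estimate"
    rate = sum(outcomes) / n
    if rate >= 0.8:
        label = "good track record"
    elif rate >= 0.5:
        label = "mixed track record"
    else:
        label = "poor track record"
    return rate, f"Based on {n} similar executions: {label}"


history = [True] * 17 + [False] * 3   # 17 successes out of 20 runs
prob, reason = predict_success(history)
print(f"{prob:.0%}: {reason}")
```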
### 3. Pattern Recognition

```python
# Find patterns in execution history
pattern = engine.db.get_pattern("git_status", "ezra")
print(f"Success rate: {pattern.success_rate:.0%}")
print(f"Avg latency: {pattern.avg_latency_ms}ms")
print(f"Sample count: {pattern.sample_count}")
```

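For readers who want to see how such a lookup could sit on top of SQLite, here is a minimal sketch. The `PatternDB` class, the `executions` table, and its columns are hypothetical; only the `get_pattern` call shape and the three attribute names are taken from the snippet above.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pattern:
    success_rate: float
    avg_latency_ms: float
    sample_count: int


class PatternDB:
    """Hypothetical SQLite-backed store of execution outcomes."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS executions ("
            "tool TEXT, house TEXT, success INTEGER, latency_ms REAL)"
        )

    def record(self, tool: str, house: str, success: bool, latency_ms: float) -> None:
        self.conn.execute(
            "INSERT INTO executions VALUES (?, ?, ?, ?)",
            (tool, house, int(success), latency_ms),
        )
        self.conn.commit()

    def get_pattern(self, tool: str, house: str) -> Optional[Pattern]:
        row = self.conn.execute(
            "SELECT AVG(success), AVG(latency_ms), COUNT(*) "
            "FROM executions WHERE tool = ? AND house = ?",
            (tool, house),
        ).fetchone()
        if row[2] == 0:
            return None  # no history for this tool/house pair
        return Pattern(success_rate=row[0], avg_latency_ms=row[1], sample_count=row[2])


db = PatternDB()
for ok, latency in [(True, 100.0), (True, 140.0), (False, 120.0), (True, 120.0)]:
    db.record("git_status", "ezra", ok, latency)
pattern = db.get_pattern("git_status", "ezra")
print(f"{pattern.success_rate:.0%} over {pattern.sample_count} runs")
```

Aggregating in SQL keeps the raw executions as the source of truth, so the success rate never drifts from what was actually recorded.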
### 4. Model Performance Tracking

```python
# Find best model for task type
best_model = engine.db.get_best_model("read", min_samples=10)
# Returns: "hermes3:8b" (if it has best success rate)
```

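The `min_samples` argument matters because a model with three lucky runs should not outrank one with a long record. A sketch of that selection as a single SQL query follows. The `model_runs` table and the placeholder models `model-b` and `model-c` are hypothetical; `hermes3:8b` is the name used above.

```python
import sqlite3
from typing import Optional


def get_best_model(conn: sqlite3.Connection, task_type: str, min_samples: int = 10) -> Optional[str]:
    """Return the model with the highest success rate among those with enough runs."""
    row = conn.execute(
        "SELECT model FROM model_runs WHERE task_type = ? "
        "GROUP BY model HAVING COUNT(*) >= ? "
        "ORDER BY AVG(success) DESC LIMIT 1",
        (task_type, min_samples),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model_runs (model TEXT, task_type TEXT, success INTEGER)")
rows = (
    [("hermes3:8b", "read", 1)] * 9 + [("hermes3:8b", "read", 0)]   # 90% over 10 runs
    + [("model-b", "read", 1)] * 6 + [("model-b", "read", 0)] * 6   # 50% over 12 runs
    + [("model-c", "read", 1)] * 3                                  # 100%, but only 3 runs
)
conn.executemany("INSERT INTO model_runs VALUES (?, ?, ?)", rows)
print(get_best_model(conn, "read"))
```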
### 5. Learning Velocity

```python
report = engine.get_intelligence_report()
velocity = report['learning_velocity']
print(f"Improvement: {velocity['improvement']:+.1%}")
print(f"Status: {velocity['velocity']}")  # accelerating/stable/declining
```

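One simple way to produce the `improvement` and `velocity` fields is to compare the success rate of the latest window of executions with the window before it. The sketch below assumes exactly that. The window size, the 0.02 tolerance, and the function name are hypothetical, and the engine may compute velocity differently.

```python
from typing import Dict, List, Union


def learning_velocity(
    outcomes: List[bool],
    window: int = 10,          # assumed number of executions per window
    tolerance: float = 0.02,   # assumed dead band around "no change"
) -> Dict[str, Union[float, str]]:
    """Compare the latest window of outcomes with the window before it."""
    if len(outcomes) < 2 * window:
        return {"improvement": 0.0, "velocity": "stable"}  # not enough history
    previous = outcomes[-2 * window:-window]
    recent = outcomes[-window:]
    improvement = sum(recent) / window - sum(previous) / window
    if improvement > tolerance:
        velocity = "accelerating"
    elif improvement < -tolerance:
        velocity = "declining"
    else:
        velocity = "stable"
    return {"improvement": improvement, "velocity": velocity}


older = [True] * 6 + [False] * 4    # 60% success
newer = [True] * 8 + [False] * 2    # 80% success
result = learning_velocity(older + newer)
print(f"Improvement: {result['improvement']:+.1%} ({result['velocity']})")
```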
## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                   UNI-WIZARD v3 ARCHITECTURE                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                   INTELLIGENCE ENGINE                    │   │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │   │
│  │  │   Pattern    │  │   Adaptive   │  │  Prediction  │    │   │
│  │  │   Database   │  │   Policies   │  │    Engine    │    │   │
│  │  └──────────────┘  └──────────────┘  └──────────────┘    │   │
│  └──────────────────────────┬───────────────────────────────┘   │
│                             │                                   │
│         ┌───────────────────┼───────────────────┐               │
│         │                   │                   │               │
│  ┌──────▼──────┐     ┌──────▼──────┐     ┌──────▼──────┐        │
│  │    TIMMY    │     │    EZRA     │     │   BEZALEL   │        │
│  │   Harness   │     │   Harness   │     │   Harness   │        │
│  │  (Sovereign)│     │  (Adaptive) │     │  (Adaptive) │        │
│  └──────┬──────┘     └──────┬──────┘     └──────┬──────┘        │
│         │                   │                   │               │
│         └───────────────────┼───────────────────┘               │
│                             │                                   │
│  ┌──────────────────────────▼──────────────────────────┐        │
│  │            HERMES BRIDGE (Shortest Loop)            │        │
│  │   Hermes Session DB → Real-time Stream Processor    │        │
│  └──────────────────────────┬──────────────────────────┘        │
│                             │                                   │
│  ┌──────────────────────────▼──────────────────────────┐        │
│  │                   HERMES HARNESS                    │        │
│  │                (Source of telemetry)                │        │
│  └─────────────────────────────────────────────────────┘        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

## Usage

### Quick Start

```python
from v3.harness import get_harness
from v3.intelligence_engine import IntelligenceEngine

# Create shared intelligence
intel = IntelligenceEngine()

# Create harnesses
timmy = get_harness("timmy", intelligence=intel)
ezra = get_harness("ezra", intelligence=intel)

# Execute (automatically recorded)
result = ezra.execute("git_status", repo_path="/tmp")

# Check what we learned
pattern = intel.db.get_pattern("git_status", "ezra")
print(f"Learned: {pattern.success_rate:.0%} success rate")
```

### With Hermes Integration

```python
from v3.hermes_bridge import ShortestLoopIntegrator

# Connect to Hermes
integrator = ShortestLoopIntegrator(intel)
integrator.start()

# Now all Hermes executions teach Timmy
```

### Adaptive Learning

```python
# After many executions
timmy.learn_from_batch()

# Policies have adapted
print(f"Ezra's evidence threshold: {ezra.policy.get('evidence_threshold')}")
# May have changed from default 0.8 based on performance
```

## Performance Metrics

### Intelligence Report

```python
report = intel.get_intelligence_report()

# Example report contents:
{
    "timestamp": "2026-03-30T20:00:00Z",
    "house_performance": {
        "ezra": {"success_rate": 0.85, "avg_latency_ms": 120},
        "bezalel": {"success_rate": 0.78, "avg_latency_ms": 200}
    },
    "learning_velocity": {
        "velocity": "accelerating",
        "improvement": +0.05
    },
    "recent_adaptations": [
        {
            "change_type": "policy.ezra.evidence_threshold",
            "old_value": 0.8,
            "new_value": 0.75,
            "reason": "Ezra success rate 55% below threshold"
        }
    ]
}
```

### Prediction Accuracy

```python
# How good are our predictions?
accuracy = intel._calculate_prediction_accuracy()
print(f"Prediction accuracy: {accuracy:.0%}")
```

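The document does not define how accuracy is scored. A common, simple choice is to treat a predicted probability of at least 0.5 as "will succeed" and count how often that call matched the outcome. The sketch below assumes that definition. The function name and the record format are hypothetical.

```python
from typing import List, Tuple


def prediction_accuracy(records: List[Tuple[float, bool]]) -> float:
    """Fraction of records where the prediction, cut at 0.5, matched the outcome.

    Each record is (predicted_probability, actual_success).
    """
    if not records:
        return 0.0
    hits = sum((prob >= 0.5) == actual for prob, actual in records)
    return hits / len(records)


records = [(0.9, True), (0.8, True), (0.7, False), (0.2, False)]
accuracy = prediction_accuracy(records)
print(f"Prediction accuracy: {accuracy:.0%}")
```

A thresholded score ignores how confident each prediction was. A calibration measure such as the Brier score would penalize the 0.7 miss less than a 0.99 miss, at the cost of being harder to read at a glance.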
## File Structure

```
uni-wizard/v3/
├── README.md                 # This document
├── CRITIQUE.md               # Review of v1/v2 gaps
├── intelligence_engine.py    # Pattern DB + learning (24KB)
├── harness.py                # Adaptive harness (18KB)
├── hermes_bridge.py          # Shortest loop bridge (14KB)
└── tests/
    └── test_v3.py            # Comprehensive tests
```

## Comparison

| Feature | v1 | v2 | v3 |
|---------|-----|-----|-----|
| Telemetry | Basic logging | Provenance tracking | **Pattern recognition** |
| Policies | Static | Static | **Adaptive** |
| Learning | None | None | **Continuous** |
| Predictions | None | None | **Pre-execution** |
| Hermes Integration | Manual | Manual | **Real-time stream** |
| Policy Adaptation | No | No | **Auto-adjust** |
| Self-Improvement | No | No | **Yes** |

## The Self-Improvement Loop

```
┌──────────────────────────────────────────────────────────┐
│                  SELF-IMPROVEMENT CYCLE                  │
└──────────────────────────────────────────────────────────┘

1. EXECUTE
   └── Run tool with house policy

2. RECORD
   └── Store outcome in Pattern Database

3. ANALYZE (every N executions)
   └── Check house performance
   └── Identify patterns
   └── Detect underperformance

4. ADAPT
   └── Adjust policy thresholds
   └── Update routing preferences
   └── Record adaptation

5. PREDICT (next execution)
   └── Query pattern for tool/house
   └── Return predicted success rate

6. EXECUTE (with new policy)
   └── Apply adapted threshold
   └── Use prediction for confidence

7. MEASURE
   └── Did adaptation help?
   └── Update learning velocity

   ←─ Repeat ─┘
```

## Design Principles

1. **Every execution teaches** — No telemetry without analysis
2. **Local learning only** — Pattern recognition runs on-device
3. **Shortest feedback loop** — Hermes → Intelligence <100ms
4. **Transparent adaptation** — Timmy explains policy changes
5. **Sovereignty-preserving** — Learning improves local decisions

## Future Work

- [ ] Fine-tune local models based on telemetry
- [ ] Predictive caching (pre-fetch likely tools)
- [ ] Anomaly detection (detect unusual failures)
- [ ] Cross-session pattern learning
- [ ] Automated A/B testing of policies

---

*Timmy gets smarter every day he runs.*