# Uni-Wizard v3 — Self-Improving Local Sovereignty

> *"Every execution teaches. Every pattern informs. Timmy gets smarter every day he runs."*

## The v3 Breakthrough: Closed-Loop Intelligence

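### The Problem with v1/v2

v1 and v2 logged and reported execution data but never fed it back into decisions. v3 closes the loop: logged data is analyzed and used to adapt future executions.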

```
Previous Architectures (Open Loop):
┌─────────┐    ┌──────────┐    ┌─────────┐
│ Execute │───→│ Log Data │───→│  Report │───→ 🗑️  (data goes nowhere)
└─────────┘    └──────────┘    └─────────┘

v3 Architecture (Closed Loop):
┌─────────┐    ┌──────────┐    ┌───────────┐    ┌─────────┐
│ Execute │───→│ Log Data │───→│  Analyze  │───→│  Adapt  │
└─────────┘    └──────────┘    └─────┬─────┘    └────┬────┘
     ↑                               │               │
     └───────────────────────────────┴───────────────┘
                Intelligence Engine
```

## Core Components

### 1. Intelligence Engine (`intelligence_engine.py`)

The brain that makes Timmy smarter:

- **Pattern Database**: SQLite store of all executions
- **Pattern Recognition**: Tool + params → success rate
- **Adaptive Policies**: Thresholds adjust based on performance
- **Prediction Engine**: Pre-execution success prediction
- **Learning Velocity**: Tracks improvement over time

```python
from v3.intelligence_engine import IntelligenceEngine

engine = IntelligenceEngine()

# Predict before executing
prob, reason = engine.predict_success("git_status", "ezra")
print(f"Predicted success: {prob:.0%} — {reason}")

# Get optimal routing
house, confidence = engine.get_optimal_house("deploy")
print(f"Best house: {house} (confidence: {confidence:.0%})")
```
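
One way routing of this kind could work is to rank houses by their observed success rate for a tool, then discount the winner's rate when little evidence backs it. The sketch below assumes a hypothetical in-memory `stats` mapping, a hypothetical `pick_house` helper, and an assumed `full_confidence_at` sample count. It is not the logic in `intelligence_engine.py`.

```python
def pick_house(stats, full_confidence_at=20):
    """Pick the house with the best success rate for one tool.

    stats maps house name -> (successes, attempts); this shape is hypothetical.
    Returns (house, confidence), or (None, 0.0) when there is no history.
    """
    rated = {
        house: successes / attempts
        for house, (successes, attempts) in stats.items()
        if attempts > 0
    }
    if not rated:
        return None, 0.0
    best = max(rated, key=rated.get)
    attempts = stats[best][1]
    # Discount the rate when few samples back it (assumed heuristic).
    evidence = min(1.0, attempts / full_confidence_at)
    return best, rated[best] * evidence


deploy_stats = {"ezra": (17, 20), "bezalel": (7, 10), "timmy": (0, 0)}
house, confidence = pick_house(deploy_stats)
print(f"Best house: {house} (confidence: {confidence:.0%})")
# prints: Best house: ezra (confidence: 85%)
```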

### 2. Adaptive Harness (`harness.py`)

The v3 harness attaches a success prediction to each execution and records the outcome so the Intelligence Engine can learn from it:

```python
from v3.harness import UniWizardHarness

# Create harness with learning enabled
harness = UniWizardHarness("timmy", enable_learning=True)

# Execute with predictions
result = harness.execute("git_status", repo_path="/tmp")
print(f"Predicted: {result.provenance.prediction:.0%}")
print(f"Actual: {'✅' if result.success else '❌'}")

# Trigger learning
harness.learn_from_batch()
```

### 3. Hermes Bridge (`hermes_bridge.py`)

**Shortest Loop Integration**: Hermes telemetry → Timmy intelligence in <100ms

```python
from v3.hermes_bridge import ShortestLoopIntegrator

# Start real-time streaming (engine is the IntelligenceEngine created above)
integrator = ShortestLoopIntegrator(engine)
integrator.start()

# All Hermes sessions now feed into Timmy's intelligence
```
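
A bridge like this can be approximated by polling the session database for rows newer than the last one seen and forwarding each to a callback. The sketch below is illustrative only: the `tool_calls` table, its columns, and `poll_new_rows` are hypothetical stand-ins, not the Hermes schema or the `ShortestLoopIntegrator` implementation.

```python
import sqlite3


def poll_new_rows(conn, last_id, on_event):
    """Forward rows newer than last_id to on_event; return the new high-water mark."""
    rows = conn.execute(
        "SELECT id, tool, success FROM tool_calls WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    for row_id, tool, success in rows:
        on_event({"tool": tool, "success": bool(success)})
        last_id = row_id
    return last_id


# Stand-in for the Hermes session database (schema is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tool_calls (id INTEGER PRIMARY KEY, tool TEXT, success INTEGER)"
)
conn.executemany(
    "INSERT INTO tool_calls (tool, success) VALUES (?, ?)",
    [("git_status", 1), ("deploy", 0)],
)

received = []
last_seen = poll_new_rows(conn, 0, received.append)
conn.execute("INSERT INTO tool_calls (tool, success) VALUES ('read', 1)")
last_seen = poll_new_rows(conn, last_seen, received.append)  # only the new row arrives
print(last_seen, [event["tool"] for event in received])
# prints: 3 ['git_status', 'deploy', 'read']
```

In practice a function like this would run in a background loop, and the achievable end-to-end latency would depend on the polling interval chosen.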

## Key Features

### 1. Self-Improving Policies

Policies adapt based on actual performance:

```python
# If Ezra's success rate drops below 60%
# → Lower evidence threshold automatically
# If Bezalel's tests pass consistently
# → Raise proof requirements (we can be stricter)
```
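
The rule described above can be sketched as a small pure function. The 60% cutoff comes from the comment above, and the 0.05 step matches the sample report later in this document (0.8 to 0.75). The upper cutoff, floor, and ceiling are assumed values, and `adapt_threshold` is a hypothetical name rather than part of the harness API.

```python
def adapt_threshold(threshold, success_rate, low=0.60, high=0.90,
                    step=0.05, floor=0.5, ceiling=0.95):
    """Return (new_threshold, reason). high, floor and ceiling are assumed values."""
    if success_rate < low:
        # Underperforming house: relax the evidence requirement.
        new = max(floor, threshold - step)
        reason = f"success rate {success_rate:.0%} below {low:.0%}"
    elif success_rate >= high:
        # Consistently passing house: we can afford to be stricter.
        new = min(ceiling, threshold + step)
        reason = f"success rate {success_rate:.0%} at or above {high:.0%}"
    else:
        return threshold, "within normal range"
    return round(new, 2), reason


new_threshold, reason = adapt_threshold(0.8, 0.55)
print(new_threshold, reason)
# prints: 0.75 success rate 55% below 60%
```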

### 2. Predictive Execution

Predict success before executing:

```python
params = {}  # placeholder: the parameters of the planned tool call
prediction, reasoning = harness.predict_execution("deploy", params)
# Example return value: (0.85, "Based on 23 similar executions: good track record")
```
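
A prediction of this shape can be derived from simple counts. The following is a minimal sketch, assuming a hypothetical `estimate_success` helper that falls back to a neutral prior when history is thin. The `min_samples` value, the prior, and the label cutoffs are all assumptions and are not taken from the harness.

```python
def estimate_success(successes, attempts, min_samples=5, prior=0.5):
    """Estimate success probability from past outcomes of similar executions."""
    if attempts < min_samples:
        # Too little history: fall back to a neutral prior (assumed default).
        return prior, f"Only {attempts} similar executions: using prior"
    rate = successes / attempts
    if rate >= 0.8:
        label = "good track record"
    elif rate >= 0.5:
        label = "mixed track record"
    else:
        label = "poor track record"
    return rate, f"Based on {attempts} similar executions: {label}"


prob, why = estimate_success(successes=20, attempts=23)
print(f"{prob:.0%}: {why}")
# prints: 87%: Based on 23 similar executions: good track record
```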

### 3. Pattern Recognition

```python
# Find patterns in execution history
pattern = engine.db.get_pattern("git_status", "ezra")
print(f"Success rate: {pattern.success_rate:.0%}")
print(f"Avg latency: {pattern.avg_latency_ms}ms")
print(f"Sample count: {pattern.sample_count}")
```
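
For readers curious how a pattern lookup like this might be backed by SQLite, here is a minimal sketch. The `executions` table, its columns, and the `record_execution` and `query_pattern` helpers are hypothetical. The real schema in `intelligence_engine.py` may differ.

```python
import sqlite3


def open_pattern_db(path=":memory:"):
    """Open (or create) a hypothetical execution log."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS executions (
               tool TEXT NOT NULL,
               house TEXT NOT NULL,
               success INTEGER NOT NULL,
               latency_ms REAL NOT NULL
           )"""
    )
    return conn


def record_execution(conn, tool, house, success, latency_ms):
    """Append one execution outcome to the log."""
    conn.execute(
        "INSERT INTO executions VALUES (?, ?, ?, ?)",
        (tool, house, int(success), latency_ms),
    )
    conn.commit()


def query_pattern(conn, tool, house):
    """Return (success_rate, avg_latency_ms, sample_count), or None if unseen."""
    row = conn.execute(
        "SELECT AVG(success), AVG(latency_ms), COUNT(*) FROM executions "
        "WHERE tool = ? AND house = ?",
        (tool, house),
    ).fetchone()
    return None if row[2] == 0 else row


db = open_pattern_db()
record_execution(db, "git_status", "ezra", True, 100.0)
record_execution(db, "git_status", "ezra", True, 140.0)
record_execution(db, "git_status", "ezra", False, 300.0)
success_rate, avg_latency_ms, sample_count = query_pattern(db, "git_status", "ezra")
print(f"{success_rate:.0%} over {sample_count} runs, {avg_latency_ms:.0f}ms avg")
# prints: 67% over 3 runs, 180ms avg
```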

### 4. Model Performance Tracking

```python
# Find best model for task type
best_model = engine.db.get_best_model("read", min_samples=10)
# Example return value: "hermes3:8b" (the model with the highest success rate
# among those with at least min_samples recorded executions)
```
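
The role of `min_samples` is easiest to see in a small example: a model with a perfect record over three runs should not beat one with a 90% record over ten. The sketch below assumes a hypothetical list of `(model, task_type, success)` tuples, a hypothetical `best_model` helper, and a made-up model name, `other-model`, used purely for contrast.

```python
def best_model(records, task_type, min_samples=10):
    """Return the model with the highest success rate that has enough samples."""
    totals = {}
    for model, kind, success in records:
        if kind != task_type:
            continue
        wins, runs = totals.get(model, (0, 0))
        totals[model] = (wins + int(success), runs + 1)
    # Models with too few runs are excluded before ranking.
    eligible = {
        model: wins / runs
        for model, (wins, runs) in totals.items()
        if runs >= min_samples
    }
    return max(eligible, key=eligible.get) if eligible else None


history = (
    [("hermes3:8b", "read", True)] * 9
    + [("hermes3:8b", "read", False)]
    + [("other-model", "read", True)] * 3  # perfect record, but too few samples
)
print(best_model(history, "read"))
# prints: hermes3:8b
```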

### 5. Learning Velocity

```python
report = engine.get_intelligence_report()
velocity = report['learning_velocity']
print(f"Improvement: {velocity['improvement']:+.1%}")
print(f"Status: {velocity['velocity']}")  # accelerating/stable/declining
```
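
One plausible way to compute a velocity figure like this is to compare the success rate of the most recent window of executions with the window before it. The sketch below is an assumption about the approach, not the engine's actual formula. The window size, the tolerance, and the extra `insufficient_data` status are all illustrative.

```python
def learning_velocity(outcomes, window=50, tolerance=0.02):
    """Compare the latest window's success rate against the previous window.

    outcomes is a chronological list of booleans (hypothetical input shape).
    """
    if len(outcomes) < 2 * window:
        return {"velocity": "insufficient_data", "improvement": 0.0}
    previous = outcomes[-2 * window:-window]
    recent = outcomes[-window:]
    improvement = sum(recent) / window - sum(previous) / window
    if improvement > tolerance:
        status = "accelerating"
    elif improvement < -tolerance:
        status = "declining"
    else:
        status = "stable"
    return {"velocity": status, "improvement": improvement}


# 80% success in the older window, 90% in the newer one.
outcomes = [True] * 40 + [False] * 10 + [True] * 45 + [False] * 5
sketch = learning_velocity(outcomes)
print(f"Improvement: {sketch['improvement']:+.1%}")
print(f"Status: {sketch['velocity']}")
```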

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                    UNI-WIZARD v3 ARCHITECTURE                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              INTELLIGENCE ENGINE                          │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │  │
│  │  │   Pattern    │  │   Adaptive   │  │  Prediction  │   │  │
│  │  │   Database   │  │   Policies   │  │    Engine    │   │  │
│  │  └──────────────┘  └──────────────┘  └──────────────┘   │  │
│  └──────────────────────────┬───────────────────────────────┘  │
│                             │                                   │
│         ┌───────────────────┼───────────────────┐              │
│         │                   │                   │              │
│  ┌──────▼──────┐    ┌──────▼──────┐    ┌──────▼──────┐       │
│  │    TIMMY    │    │    EZRA     │    │   BEZALEL   │       │
│  │  Harness    │    │  Harness    │    │  Harness    │       │
│  │  (Sovereign)│    │  (Adaptive) │    │  (Adaptive) │       │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘       │
│         │                   │                   │              │
│         └───────────────────┼───────────────────┘              │
│                             │                                   │
│  ┌──────────────────────────▼──────────────────────────┐      │
│  │              HERMES BRIDGE (Shortest Loop)           │      │
│  │   Hermes Session DB → Real-time Stream Processor    │      │
│  └──────────────────────────┬──────────────────────────┘      │
│                             │                                   │
│  ┌──────────────────────────▼──────────────────────────┐      │
│  │                 HERMES HARNESS                       │      │
│  │         (Source of telemetry)                        │      │
│  └──────────────────────────────────────────────────────┘      │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
```

## Usage

### Quick Start

```python
from v3.harness import get_harness
from v3.intelligence_engine import IntelligenceEngine

# Create shared intelligence
intel = IntelligenceEngine()

# Create harnesses
timmy = get_harness("timmy", intelligence=intel)
ezra = get_harness("ezra", intelligence=intel)

# Execute (automatically recorded)
result = ezra.execute("git_status", repo_path="/tmp")

# Check what we learned
pattern = intel.db.get_pattern("git_status", "ezra")
print(f"Learned: {pattern.success_rate:.0%} success rate")
```

### With Hermes Integration

```python
from v3.hermes_bridge import ShortestLoopIntegrator

# Connect to Hermes
integrator = ShortestLoopIntegrator(intel)
integrator.start()

# Now all Hermes executions teach Timmy
```

### Adaptive Learning

```python
# After many executions
timmy.learn_from_batch()

# Policies have adapted
print(f"Ezra's evidence threshold: {ezra.policy.get('evidence_threshold')}")
# May have changed from default 0.8 based on performance
```

## Performance Metrics

### Intelligence Report

```python
report = intel.get_intelligence_report()

# Example contents of `report` (illustrative values):
{
    "timestamp": "2026-03-30T20:00:00Z",
    "house_performance": {
        "ezra": {"success_rate": 0.85, "avg_latency_ms": 120},
        "bezalel": {"success_rate": 0.78, "avg_latency_ms": 200}
    },
    "learning_velocity": {
        "velocity": "accelerating",
        "improvement": +0.05
    },
    "recent_adaptations": [
        {
            "change_type": "policy.ezra.evidence_threshold",
            "old_value": 0.8,
            "new_value": 0.75,
            "reason": "Ezra success rate 55% below threshold"
        }
    ]
}
```

### Prediction Accuracy

```python
# How good are our predictions?
accuracy = intel._calculate_prediction_accuracy()
print(f"Prediction accuracy: {accuracy:.0%}")
```
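
How prediction accuracy is defined matters. Two common choices are the fraction of predictions on the right side of a cutoff and the Brier score, which penalizes over-confident probabilities. The sketch below shows both over a hypothetical list of `(predicted_probability, actual_success)` pairs. Which definition `_calculate_prediction_accuracy()` uses is not stated here, so treat this as illustration only.

```python
def prediction_accuracy(pairs, cutoff=0.5):
    """Fraction of predictions that landed on the correct side of the cutoff."""
    if not pairs:
        return 0.0
    hits = sum((prob >= cutoff) == bool(actual) for prob, actual in pairs)
    return hits / len(pairs)


def brier_score(pairs):
    """Mean squared error of the probabilities (0.0 is perfect, lower is better).

    pairs must be non-empty.
    """
    return sum((prob - float(actual)) ** 2 for prob, actual in pairs) / len(pairs)


pairs = [(0.9, True), (0.8, True), (0.7, False), (0.2, False)]
print(f"Prediction accuracy: {prediction_accuracy(pairs):.0%}")
print(f"Brier score: {brier_score(pairs):.3f}")
```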

## File Structure

```
uni-wizard/v3/
├── README.md                   # This document
├── CRITIQUE.md                 # Review of v1/v2 gaps
├── intelligence_engine.py      # Pattern DB + learning (24KB)
├── harness.py                  # Adaptive harness (18KB)
├── hermes_bridge.py            # Shortest loop bridge (14KB)
└── tests/
    └── test_v3.py             # Comprehensive tests
```

## Comparison

| Feature | v1 | v2 | v3 |
|---------|-----|-----|-----|
| Telemetry | Basic logging | Provenance tracking | **Pattern recognition** |
| Policies | Static | Static | **Adaptive** |
| Learning | None | None | **Continuous** |
| Predictions | None | None | **Pre-execution** |
| Hermes Integration | Manual | Manual | **Real-time stream** |
| Policy Adaptation | No | No | **Auto-adjust** |
| Self-Improvement | No | No | **Yes** |

## The Self-Improvement Loop

```
┌──────────────────────────────────────────────────────────┐
│                  SELF-IMPROVEMENT CYCLE                   │
└──────────────────────────────────────────────────────────┘

1. EXECUTE
   └── Run tool with house policy

2. RECORD
   └── Store outcome in Pattern Database

3. ANALYZE (every N executions)
   └── Check house performance
   └── Identify patterns
   └── Detect underperformance

4. ADAPT
   └── Adjust policy thresholds
   └── Update routing preferences
   └── Record adaptation

5. PREDICT (next execution)
   └── Query pattern for tool/house
   └── Return predicted success rate

6. EXECUTE (with new policy)
   └── Apply adapted threshold
   └── Use prediction for confidence

7. MEASURE
   └── Did adaptation help?
   └── Update learning velocity

←─ Repeat ─┘
```
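
Steps 2 through 4 of the cycle can be sketched as a small class that records outcomes and re-evaluates a single threshold every N executions. `LearningLoop` is a hypothetical illustration, not the harness. The 60% cutoff and 0.05 step follow the policy example earlier in this document, while the floor of 0.5 and the default N of 10 are assumptions.

```python
class LearningLoop:
    """Record outcomes and re-tune one threshold every `analyze_every` runs."""

    def __init__(self, threshold=0.8, analyze_every=10):
        self.threshold = threshold
        self.analyze_every = analyze_every
        self.outcomes = []
        self.adaptations = []  # (old_threshold, new_threshold, batch_success_rate)

    def record(self, success):
        """Step 2: store the outcome; step 3 runs once per full batch."""
        self.outcomes.append(bool(success))
        if len(self.outcomes) % self.analyze_every == 0:
            self._analyze()

    def _analyze(self):
        """Steps 3 and 4: detect underperformance and adapt the threshold."""
        batch = self.outcomes[-self.analyze_every:]
        rate = sum(batch) / len(batch)
        if rate < 0.60:
            old = self.threshold
            self.threshold = round(max(0.5, old - 0.05), 2)
            self.adaptations.append((old, self.threshold, rate))


loop = LearningLoop()
for success in [True, False, False, True, False, False, True, False, True, False]:
    loop.record(success)
print(loop.threshold, loop.adaptations)
# prints: 0.75 [(0.8, 0.75, 0.4)]
```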

## Design Principles

1. **Every execution teaches** — No telemetry without analysis
2. **Local learning only** — Pattern recognition runs on-device
3. **Shortest feedback loop** — Hermes → Intelligence <100ms
4. **Transparent adaptation** — Timmy explains policy changes
5. **Sovereignty-preserving** — Learning improves local decisions

## Future Work

- [ ] Fine-tune local models based on telemetry
- [ ] Predictive caching (pre-fetch likely tools)
- [ ] Anomaly detection (detect unusual failures)
- [ ] Cross-session pattern learning
- [ ] Automated A/B testing of policies

---

*Timmy gets smarter every day he runs.*