
Allegro 31026ddcc1 [#76-v4] Final Uni-Wizard Architecture — Production Integration

Complete four-pass evolution to production-ready architecture:

**Pass 1 → Foundation:**
- Tool registry, basic harness, 19 tools
- VPS provisioning, Syncthing mesh
- Health daemon, systemd services

**Pass 2 → Three-House Canon:**
- Timmy (Sovereign), Ezra (Archivist), Bezalel (Artificer)
- Provenance tracking, artifact-flow discipline
- House-aware policy enforcement

**Pass 3 → Self-Improvement:**
- Pattern database with SQLite backend
- Adaptive policies (auto-adjust thresholds)
- Predictive execution (success prediction)
- Hermes bridge for shortest-loop telemetry
- Learning velocity tracking

**Pass 4 → Production Integration:**
- Unified API: `from uni_wizard import Harness, House, Mode`
- Three modes: SIMPLE / INTELLIGENT / SOVEREIGN
- Circuit breaker pattern for fault tolerance
- Async/concurrent execution support
- Production hardening (timeouts, retries)

**Allegro Lane Definition:**
- Narrowed to: Gitea integration, Hermes bridge, redundancy/failover
- Provides: Cloud connectivity, telemetry streaming, issue routing
- Does NOT: Make sovereign decisions, authenticate as Timmy

**Files:**
- v3/: Intelligence engine, adaptive harness, Hermes bridge
- v4/: Unified API, production harness, final architecture

Total: ~25KB architecture documentation + production code

2026-03-30 16:39:42 +00:00

12 KiB


Uni-Wizard v3 — Self-Improving Local Sovereignty

"Every execution teaches. Every pattern informs. Timmy gets smarter every day he runs."

The v3 Breakthrough: Closed-Loop Intelligence

The Problem with v1/v2

Previous Architectures (Open Loop):
┌─────────┐    ┌──────────┐    ┌─────────┐
│ Execute │───→│ Log Data │───→│  Report │───→ 🗑️  (data goes nowhere)
└─────────┘    └──────────┘    └─────────┘

v3 Architecture (Closed Loop):
┌─────────┐    ┌──────────┐    ┌───────────┐    ┌─────────┐
│ Execute │───→│ Log Data │───→│  Analyze  │───→│  Adapt  │
└─────────┘    └──────────┘    └─────┬─────┘    └────┬────┘
     ↑                               │               │
     └───────────────────────────────┴───────────────┘
                Intelligence Engine

Core Components

1. Intelligence Engine (`intelligence_engine.py`)

The brain that makes Timmy smarter:

- Pattern Database: SQLite store of all executions
- Pattern Recognition: Tool + params → success rate
- Adaptive Policies: Thresholds adjust based on performance
- Prediction Engine: Pre-execution success prediction
- Learning Velocity: Tracks improvement over time

from v3.intelligence_engine import IntelligenceEngine

engine = IntelligenceEngine()

# Predict before executing
prob, reason = engine.predict_success("git_status", "ezra")
print(f"Predicted success: {prob:.0%} — {reason}")

# Get optimal routing
house, confidence = engine.get_optimal_house("deploy")
print(f"Best house: {house} (confidence: {confidence:.0%})")

2. Adaptive Harness (`harness.py`)

Harness v3 with intelligence integration:

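# Assumes UniWizardHarness is importable from v3/harness.py
from v3.harness import UniWizardHarness

# Create harness with learning enabled
harness = UniWizardHarness("timmy", enable_learning=True)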

# Execute with predictions
result = harness.execute("git_status", repo_path="/tmp")
print(f"Predicted: {result.provenance.prediction:.0%}")
print(f"Actual: {'✅' if result.success else '❌'}")

# Trigger learning
harness.learn_from_batch()

3. Hermes Bridge (`hermes_bridge.py`)

Shortest Loop Integration: Hermes telemetry → Timmy intelligence in <100ms

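from v3.hermes_bridge import ShortestLoopIntegrator

# Start real-time streaming (intelligence_engine is an IntelligenceEngine instance)
integrator = ShortestLoopIntegrator(intelligence_engine)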
integrator.start()

# All Hermes sessions now feed into Timmy's intelligence
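
One way to picture the bridge is as an incremental reader over the Hermes session database. The sketch below is only an illustration under stated assumptions: the `session_events` table, its columns, and the `poll_new_events` helper are hypothetical and do not describe the real schema or the internals of `ShortestLoopIntegrator`.

```python
import sqlite3


def poll_new_events(conn, last_id, on_event):
    """Forward rows newer than last_id to on_event; return the new cursor."""
    rows = conn.execute(
        "SELECT id, tool, success FROM session_events "
        "WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    for row_id, tool, success in rows:
        on_event(tool, bool(success))
        last_id = row_id
    return last_id


# Demo: an in-memory database stands in for the (assumed) session DB.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_events ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, tool TEXT, success INTEGER)"
)
received = []


def sink(tool, success):
    """Stand-in for handing an event to the intelligence engine."""
    received.append((tool, success))


conn.executemany(
    "INSERT INTO session_events (tool, success) VALUES (?, ?)",
    [("git_status", 1), ("deploy", 0)],
)
cursor = poll_new_events(conn, 0, sink)        # picks up both rows

conn.execute("INSERT INTO session_events (tool, success) VALUES ('git_status', 1)")
cursor = poll_new_events(conn, cursor, sink)   # picks up only the new row
print(received)
```

A streaming integrator would presumably call something like this from a background loop with a short sleep; the sketch makes no claim about meeting the <100ms figure.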

Key Features

1. Self-Improving Policies

Policies adapt based on actual performance:

# If Ezra's success rate drops below 60%
# → Lower evidence threshold automatically
# If Bezalel's tests pass consistently
# → Raise proof requirements (we can be stricter)
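
The two rules above could be expressed roughly as follows. This is a minimal sketch, not the code in `intelligence_engine.py`: the `HousePolicy` class, the 90% upper trigger, the 0.05 step, and the 0.5/0.95 clamps are assumptions chosen for illustration. Only the 60% lower trigger and the 0.8 default come from this README.

```python
from dataclasses import dataclass, field


@dataclass
class HousePolicy:
    evidence_threshold: float = 0.8          # default mentioned in this README
    history: list = field(default_factory=list)


def adapt_threshold(policy, success_rate, low=0.60, high=0.90, step=0.05):
    """Lower the threshold when a house struggles, raise it when it excels."""
    old = policy.evidence_threshold
    if success_rate < low:
        new = max(0.5, old - step)            # assumed lower clamp
    elif success_rate > high:
        new = min(0.95, old + step)           # assumed upper clamp
    else:
        new = old
    new = round(new, 2)
    if new != old:
        # Record the change so the adaptation can be explained later.
        policy.history.append(
            {"old_value": old, "new_value": new,
             "reason": f"success rate {success_rate:.0%}"}
        )
        policy.evidence_threshold = new
    return new


ezra_policy = HousePolicy()
adapt_threshold(ezra_policy, success_rate=0.55)
print(ezra_policy.evidence_threshold)  # 0.75
```

Keeping a history entry per change is one way to support the "transparent adaptation" principle: every adjustment carries its old value, new value, and reason.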

2. Predictive Execution

Predict success before executing:

prediction, reasoning = harness.predict_execution("deploy", params)
# Returns: (0.85, "Based on 23 similar executions: good track record")
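
For intuition, a prediction of this shape can be derived from nothing more than historical counts. The sketch below uses add-one (Laplace) smoothing so that a handful of samples does not yield an overconfident 0% or 100%. The standalone `predict_success` function, its label cutoffs, and the smoothing choice are assumptions, not the engine's actual method.

```python
def predict_success(successes, total, prior=0.5, prior_weight=2):
    """Return (probability, reason) from historical success counts."""
    if total == 0:
        return prior, "No history: falling back to prior"
    # With prior=0.5 and prior_weight=2 this is (successes + 1) / (total + 2).
    prob = (successes + prior * prior_weight) / (total + prior_weight)
    if prob >= 0.8:
        label = "good track record"
    elif prob >= 0.5:
        label = "mixed track record"
    else:
        label = "poor track record"
    return prob, f"Based on {total} similar executions: {label}"


prob, reason = predict_success(successes=20, total=23)
print(f"{prob:.0%}: {reason}")
```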

3. Pattern Recognition

# Find patterns in execution history
pattern = engine.db.get_pattern("git_status", "ezra")
print(f"Success rate: {pattern.success_rate:.0%}")
print(f"Avg latency: {pattern.avg_latency_ms}ms")
print(f"Sample count: {pattern.sample_count}")
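
To make the pattern lookup concrete, here is a small sketch of how executions might be stored in SQLite and aggregated per tool and house. The `executions` table, its columns, and the `PatternStore` class are hypothetical; the real schema behind `engine.db` is not shown in this document.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pattern:
    tool: str
    house: str
    success_rate: float
    avg_latency_ms: float
    sample_count: int


class PatternStore:
    """Hypothetical stand-in for the pattern database (schema is assumed)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS executions ("
            "tool TEXT, house TEXT, success INTEGER, latency_ms REAL)"
        )

    def record(self, tool: str, house: str, success: bool, latency_ms: float) -> None:
        self.conn.execute(
            "INSERT INTO executions VALUES (?, ?, ?, ?)",
            (tool, house, int(success), latency_ms),
        )
        self.conn.commit()

    def get_pattern(self, tool: str, house: str) -> Optional[Pattern]:
        rate, latency, count = self.conn.execute(
            "SELECT AVG(success), AVG(latency_ms), COUNT(*) "
            "FROM executions WHERE tool = ? AND house = ?",
            (tool, house),
        ).fetchone()
        if count == 0:
            return None  # nothing recorded yet for this tool/house pair
        return Pattern(tool, house, rate, latency, count)


store = PatternStore()
for ok, ms in [(True, 100.0), (True, 140.0), (False, 120.0), (True, 120.0)]:
    store.record("git_status", "ezra", ok, ms)

pattern = store.get_pattern("git_status", "ezra")
print(f"{pattern.success_rate:.0%} over {pattern.sample_count} runs")
```

Computing the aggregates in SQL keeps the Python side trivial; a production store would likely also index on `(tool, house)`.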

4. Model Performance Tracking

# Find best model for task type
best_model = engine.db.get_best_model("read", min_samples=10)
# Returns: "hermes3:8b" (if it has best success rate)
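
The `min_samples` guard matters because a model with three lucky runs should not outrank one with a long, solid record. A rough sketch of that selection logic, assuming records arrive as `(model, task_type, success)` tuples (the tuple layout and the `model-b`/`model-c` names are made up for the example):

```python
from collections import defaultdict


def get_best_model(records, task_type, min_samples=10):
    """Pick the model with the highest success rate among those with enough data."""
    stats = defaultdict(lambda: [0, 0])  # model -> [successes, total]
    for model, task, success in records:
        if task == task_type:
            stats[model][0] += int(success)
            stats[model][1] += 1
    eligible = {m: s / n for m, (s, n) in stats.items() if n >= min_samples}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


records = (
    [("hermes3:8b", "read", True)] * 9
    + [("hermes3:8b", "read", False)]
    + [("model-b", "read", True)] * 3                      # perfect, but only 3 samples
    + [("model-c", "read", i < 6) for i in range(10)]      # 6 of 10 succeed
)
print(get_best_model(records, "read", min_samples=10))
```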

5. Learning Velocity

report = engine.get_intelligence_report()
velocity = report['learning_velocity']
print(f"Improvement: {velocity['improvement']:+.1%}")
print(f"Status: {velocity['velocity']}")  # accelerating/stable/declining
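
One plausible way to compute such a velocity figure is to compare the success rate of the most recent window of executions against the window before it. The window size, the 2% tolerance, and the `insufficient_data` status below are assumptions for illustration only.

```python
def learning_velocity(outcomes, window=20, tolerance=0.02):
    """Compare the latest window of outcomes with the one before it.

    outcomes: chronological list of booleans (True = success).
    """
    if len(outcomes) < 2 * window:
        return {"velocity": "insufficient_data", "improvement": 0.0}
    previous = outcomes[-2 * window:-window]
    recent = outcomes[-window:]
    improvement = sum(recent) / window - sum(previous) / window
    if improvement > tolerance:
        status = "accelerating"
    elif improvement < -tolerance:
        status = "declining"
    else:
        status = "stable"
    return {"velocity": status, "improvement": improvement}


# 14/20 successes in the older window, 15/20 in the newer one.
outcomes = [True] * 14 + [False] * 6 + [True] * 15 + [False] * 5
velocity = learning_velocity(outcomes)
print(f"Improvement: {velocity['improvement']:+.1%} ({velocity['velocity']})")
```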

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    UNI-WIZARD v3 ARCHITECTURE                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              INTELLIGENCE ENGINE                          │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │  │
│  │  │   Pattern    │  │   Adaptive   │  │  Prediction  │   │  │
│  │  │   Database   │  │   Policies   │  │    Engine    │   │  │
│  │  └──────────────┘  └──────────────┘  └──────────────┘   │  │
│  └──────────────────────────┬───────────────────────────────┘  │
│                             │                                   │
│         ┌───────────────────┼───────────────────┐              │
│         │                   │                   │              │
│  ┌──────▼──────┐    ┌──────▼──────┐    ┌──────▼──────┐       │
│  │    TIMMY    │    │    EZRA     │    │   BEZALEL   │       │
│  │  Harness    │    │  Harness    │    │  Harness    │       │
│  │  (Sovereign)│    │  (Adaptive) │    │  (Adaptive) │       │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘       │
│         │                   │                   │              │
│         └───────────────────┼───────────────────┘              │
│                             │                                   │
│  ┌──────────────────────────▼──────────────────────────┐      │
│  │              HERMES BRIDGE (Shortest Loop)           │      │
│  │   Hermes Session DB → Real-time Stream Processor    │      │
│  └──────────────────────────┬──────────────────────────┘      │
│                             │                                   │
│  ┌──────────────────────────▼──────────────────────────┐      │
│  │                 HERMES HARNESS                       │      │
│  │         (Source of telemetry)                        │      │
│  └──────────────────────────────────────────────────────┘      │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Usage

Quick Start

from v3.harness import get_harness
from v3.intelligence_engine import IntelligenceEngine

# Create shared intelligence
intel = IntelligenceEngine()

# Create harnesses
timmy = get_harness("timmy", intelligence=intel)
ezra = get_harness("ezra", intelligence=intel)

# Execute (automatically recorded)
result = ezra.execute("git_status", repo_path="/tmp")

# Check what we learned
pattern = intel.db.get_pattern("git_status", "ezra")
print(f"Learned: {pattern.success_rate:.0%} success rate")

With Hermes Integration

from v3.hermes_bridge import ShortestLoopIntegrator

# Connect to Hermes
integrator = ShortestLoopIntegrator(intel)
integrator.start()

# Now all Hermes executions teach Timmy

Adaptive Learning

# After many executions
timmy.learn_from_batch()

# Policies have adapted
print(f"Ezra's evidence threshold: {ezra.policy.get('evidence_threshold')}")
# May have changed from default 0.8 based on performance

Performance Metrics

Intelligence Report

report = intel.get_intelligence_report()

{
    "timestamp": "2026-03-30T20:00:00Z",
    "house_performance": {
        "ezra": {"success_rate": 0.85, "avg_latency_ms": 120},
        "bezalel": {"success_rate": 0.78, "avg_latency_ms": 200}
    },
    "learning_velocity": {
        "velocity": "accelerating",
        "improvement": +0.05
    },
    "recent_adaptations": [
        {
            "change_type": "policy.ezra.evidence_threshold",
            "old_value": 0.8,
            "new_value": 0.75,
            "reason": "Ezra success rate 55% below threshold"
        }
    ]
}

Prediction Accuracy

# How good are our predictions?
accuracy = intel._calculate_prediction_accuracy()
print(f"Prediction accuracy: {accuracy:.0%}")
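
The README does not say how accuracy is defined, so the following is only a sketch of two common options: thresholded accuracy (did a prediction of at least 50% coincide with success?) and the Brier score (mean squared error of the predicted probability, lower is better). Both function names and the 0.5 cutoff are assumptions.

```python
def prediction_accuracy(pairs, threshold=0.5):
    """Fraction of predictions whose thresholded call matched the outcome.

    pairs: iterable of (predicted_probability, actual_success).
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum((p >= threshold) == bool(actual) for p, actual in pairs)
    return hits / len(pairs)


def brier_score(pairs):
    """Mean squared difference between predicted probability and outcome."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum((p - float(actual)) ** 2 for p, actual in pairs) / len(pairs)


history = [(0.9, True), (0.8, True), (0.7, False), (0.2, False)]
print(f"Accuracy: {prediction_accuracy(history):.0%}")
print(f"Brier score: {brier_score(history):.3f}")
```

Thresholded accuracy is easy to read but ignores confidence; the Brier score penalizes a confident miss (0.7 predicted, failure observed) more than a hesitant one.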

File Structure

uni-wizard/v3/
├── README.md                   # This document
├── CRITIQUE.md                 # Review of v1/v2 gaps
├── intelligence_engine.py      # Pattern DB + learning (24KB)
├── harness.py                  # Adaptive harness (18KB)
├── hermes_bridge.py            # Shortest loop bridge (14KB)
└── tests/
    └── test_v3.py             # Comprehensive tests

Comparison

| Feature | v1 | v2 | v3 |
|---|---|---|---|
| Telemetry | Basic logging | Provenance tracking | Pattern recognition |
| Policies | Static | Static | Adaptive |
| Learning | None | None | Continuous |
| Predictions | None | None | Pre-execution |
| Hermes Integration | Manual | Manual | Real-time stream |
| Policy Adaptation | No | No | Auto-adjust |
| Self-Improvement | No | No | Yes |

The Self-Improvement Loop

┌──────────────────────────────────────────────────────────┐
│                  SELF-IMPROVEMENT CYCLE                   │
└──────────────────────────────────────────────────────────┘

1. EXECUTE
   └── Run tool with house policy

2. RECORD
   └── Store outcome in Pattern Database

3. ANALYZE (every N executions)
   └── Check house performance
   └── Identify patterns
   └── Detect underperformance

4. ADAPT
   └── Adjust policy thresholds
   └── Update routing preferences
   └── Record adaptation

5. PREDICT (next execution)
   └── Query pattern for tool/house
   └── Return predicted success rate

6. EXECUTE (with new policy)
   └── Apply adapted threshold
   └── Use prediction for confidence

7. MEASURE
   └── Did adaptation help?
   └── Update learning velocity

←─ Repeat ─┘
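
The cycle can be condensed into a toy loop to show how the steps connect. Everything in this sketch is an assumption made for illustration (the `MiniLoop` class, analyzing every 5 executions, the 60% trigger, the 0.05 step); it is not the v3 harness.

```python
from collections import deque


class MiniLoop:
    """Toy closed loop: predict, execute, record, periodically analyze and adapt."""

    def __init__(self, analyze_every=5, threshold=0.8):
        self.outcomes = deque(maxlen=50)   # rolling record of recent results
        self.analyze_every = analyze_every
        self.threshold = threshold
        self.adaptations = []
        self.count = 0

    def predict(self):
        """Predicted success rate: the observed rate, or 0.5 with no history."""
        if not self.outcomes:
            return 0.5
        return sum(self.outcomes) / len(self.outcomes)

    def execute(self, tool):
        prediction = self.predict()                  # PREDICT
        success = bool(tool())                       # EXECUTE
        self.outcomes.append(success)                # RECORD
        self.count += 1
        if self.count % self.analyze_every == 0:     # ANALYZE every N executions
            rate = self.predict()
            if rate < 0.6:                           # ADAPT on underperformance
                old = self.threshold
                self.threshold = round(max(0.5, old - 0.05), 2)
                self.adaptations.append((old, self.threshold, rate))
        return prediction, success


# A scripted tool that succeeds twice in five runs.
results = iter([True, False, False, True, False])
loop = MiniLoop()
for _ in range(5):
    loop.execute(lambda: next(results))
print(loop.threshold, loop.adaptations)
```

The MEASURE step would follow by comparing the success rate before and after each entry in `adaptations`.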

Design Principles

- Every execution teaches: no telemetry without analysis
- Local learning only: pattern recognition runs on-device
- Shortest feedback loop: Hermes → Intelligence <100ms
- Transparent adaptation: Timmy explains policy changes
- Sovereignty-preserving: learning improves local decisions

Future Work

- Fine-tune local models based on telemetry
- Predictive caching (pre-fetch likely tools)
- Anomaly detection (detect unusual failures)
- Cross-session pattern learning
- Automated A/B testing of policies

Timmy gets smarter every day he runs.
