# Sovereignty Loop — Integration Guide

How to use the sovereignty subsystem in new code and existing modules.

> "The measure of progress is not features added. It is model calls eliminated."

Refs: #953 (The Sovereignty Loop)

---
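
## Quick Start

Every model call must follow the sovereignty protocol:

**check cache → miss → infer → crystallize → return**

A cache hit returns immediately without a model call. Only a miss proceeds to inference, and the result is then crystallized so that the next matching request can be answered locally.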

### Perception Layer (VLM calls)

```python
from timmy.sovereignty.sovereignty_loop import sovereign_perceive
from timmy.sovereignty.perception_cache import PerceptionCache

cache = PerceptionCache("data/templates.json")

state = await sovereign_perceive(
    screenshot=frame,
    cache=cache,
    vlm=my_vlm_client,
    session_id="session_001",
)
```

### Decision Layer (LLM calls)

```python
from timmy.sovereignty.sovereignty_loop import sovereign_decide

result = await sovereign_decide(
    context={"health": 25, "enemy_count": 3},
    llm=my_llm_client,
    session_id="session_001",
)
# result["action"] could be "heal" from a cached rule or fresh LLM reasoning
```

### Narration Layer

```python
from timmy.sovereignty.sovereignty_loop import sovereign_narrate

text = await sovereign_narrate(
    event={"type": "combat_start", "enemy": "Cliff Racer"},
    llm=my_llm_client,  # optional; pass None for template-only narration
    session_id="session_001",
)
```

### General Purpose (Decorator)

```python
from timmy.sovereignty.sovereignty_loop import sovereignty_enforced

@sovereignty_enforced(
    layer="decision",
    cache_check=lambda a, kw: rule_store.find_matching(kw.get("ctx")),
    crystallize=lambda result, a, kw: rule_store.add(extract_rules(result)),
)
async def my_expensive_function(ctx):
    return await llm.reason(ctx)
```
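
To make the decorator's contract more concrete, here is a rough sketch of how a wrapper with `cache_check` and `crystallize` hooks could behave. It is an illustration built on assumptions, not the source of `sovereignty_enforced`: the name `enforce_sketch`, the convention that a `None` return from `cache_check` means a miss, and the toy `rules` dict are all hypothetical.

```python
import asyncio
import functools
from typing import Any, Callable


def enforce_sketch(cache_check: Callable, crystallize: Callable) -> Callable:
    """Hypothetical decorator: try the cache first, crystallize after a miss."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            cached = cache_check(args, kwargs)
            if cached is not None:          # assumed convention: None means miss
                return cached
            result = await fn(*args, **kwargs)
            crystallize(result, args, kwargs)
            return result
        return wrapper
    return decorator


rules: dict[int, str] = {}   # toy rule store keyed by health value
llm_calls = 0


@enforce_sketch(
    cache_check=lambda a, kw: rules.get(kw["ctx"]["health"]),
    crystallize=lambda result, a, kw: rules.update({kw["ctx"]["health"]: result}),
)
async def decide(ctx: dict) -> str:
    global llm_calls
    llm_calls += 1                          # stands in for an LLM call
    return "heal" if ctx["health"] < 30 else "attack"


async def demo() -> tuple[str, str]:
    first = await decide(ctx={"health": 25})    # miss: runs the function body
    second = await decide(ctx={"health": 25})   # hit: served from `rules`
    return first, second


outcome = asyncio.run(demo())
print(outcome, llm_calls)
```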

---

## Auto-Crystallizer

Automatically extracts rules from LLM reasoning chains:

```python
from timmy.sovereignty.auto_crystallizer import crystallize_reasoning, get_rule_store

# After any LLM call with reasoning output:
rules = crystallize_reasoning(
    llm_response="I chose heal because health was below 30%.",
    context={"game": "morrowind"},
)

store = get_rule_store()
added = store.add_many(rules)
```

### Rule Lifecycle

1. **Extracted**: starts at confidence 0.5 and is not yet considered reliable.
2. **Applied**: confidence is adjusted after each use (+0.05 per success, -0.10 per failure).
3. **Reliable**: confidence ≥ 0.8, at least 3 applications, and a success rate of at least 60%.
4. **Autonomous**: the rule reliably bypasses LLM calls.
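
The lifecycle numbers above translate into a small amount of arithmetic. The sketch below is one guess at how they might fit together, not the actual `auto_crystallizer` code: the `RuleSketch` class and its field names are hypothetical, and clamping confidence to the range 0 to 1 is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class RuleSketch:
    """Hypothetical rule record tracking confidence and outcomes."""
    confidence: float = 0.5     # starting confidence for an extracted rule
    successes: int = 0
    failures: int = 0

    def record(self, success: bool) -> None:
        """Apply the +0.05 / -0.10 adjustment, clamped to [0, 1] (assumed)."""
        if success:
            self.successes += 1
            self.confidence = min(1.0, self.confidence + 0.05)
        else:
            self.failures += 1
            self.confidence = max(0.0, self.confidence - 0.10)

    @property
    def is_reliable(self) -> bool:
        """Confidence >= 0.8, at least 3 applications, success rate >= 60%."""
        applications = self.successes + self.failures
        if applications < 3:
            return False
        # Small tolerance so repeated float additions do not miss the 0.8 bar.
        return (
            self.confidence >= 0.8 - 1e-9
            and self.successes / applications >= 0.6
        )


rule = RuleSketch()
for _ in range(5):
    rule.record(success=True)
after_five = rule.is_reliable     # 0.5 + 5 * 0.05 = 0.75, still below 0.8
rule.record(success=True)
after_six = rule.is_reliable      # 0.5 + 6 * 0.05 = 0.80, threshold reached
print(after_five, after_six)
```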

---

## Three-Strike Detector

Enforces automation for repetitive manual work:

```python
from timmy.sovereignty.three_strike import get_detector, ThreeStrikeError

detector = get_detector()

try:
    detector.record("vlm_prompt_edit", "health_bar_template")
except ThreeStrikeError:
    # Must register an automation before continuing
    detector.register_automation(
        "vlm_prompt_edit",
        "health_bar_template",
        "scripts/auto_health_bar.py",
    )
```
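
The guide does not say exactly when `ThreeStrikeError` is raised. Going only by the name, the sketch below assumes the third recorded repetition of the same action is blocked until an automation is registered. `StrikeDetectorSketch` and `StrikeError` are hypothetical stand-ins, not the classes in `timmy.sovereignty.three_strike`.

```python
from collections import Counter


class StrikeError(Exception):
    """Hypothetical stand-in for ThreeStrikeError."""


class StrikeDetectorSketch:
    """Counts repeated manual actions and blocks at an assumed limit of three."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.counts: Counter = Counter()
        self.automations: dict[tuple[str, str], str] = {}

    def record(self, category: str, key: str) -> int:
        """Record one manual action; raise once the limit is reached."""
        pair = (category, key)
        if pair in self.automations:      # already automated: nothing to enforce
            return 0
        self.counts[pair] += 1
        if self.counts[pair] >= self.limit:
            raise StrikeError(f"{category}/{key} repeated {self.counts[pair]} times")
        return self.counts[pair]

    def register_automation(self, category: str, key: str, script: str) -> None:
        """Register the script that replaces the manual action."""
        self.automations[(category, key)] = script


detector = StrikeDetectorSketch()
blocked = False
for _ in range(3):
    try:
        detector.record("vlm_prompt_edit", "health_bar_template")
    except StrikeError:
        blocked = True
        detector.register_automation(
            "vlm_prompt_edit", "health_bar_template", "scripts/auto_health_bar.py"
        )

# After registration, further records no longer raise.
after = detector.record("vlm_prompt_edit", "health_bar_template")
print(blocked, after)
```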

---

## Falsework Checklist

Before any cloud API call, complete the checklist:

```python
from timmy.sovereignty.three_strike import FalseworkChecklist, falsework_check

checklist = FalseworkChecklist(
    durable_artifact="embedding vectors for UI element foo",
    artifact_storage_path="data/vlm/foo_embeddings.json",
    local_rule_or_cache="vlm_cache",
    will_repeat=False,
    sovereignty_delta="eliminates repeated VLM call",
)
falsework_check(checklist)  # raises ValueError if incomplete
```
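
What counts as an "incomplete" checklist is not spelled out here. As a hedged illustration, the sketch below assumes it means any text field left blank. `ChecklistSketch` and `check_sketch` are hypothetical names that mirror the fields shown above; they are not the real `FalseworkChecklist` or `falsework_check`.

```python
from dataclasses import dataclass, fields


@dataclass
class ChecklistSketch:
    """Hypothetical mirror of the FalseworkChecklist fields shown above."""
    durable_artifact: str = ""
    artifact_storage_path: str = ""
    local_rule_or_cache: str = ""
    will_repeat: bool = False
    sovereignty_delta: str = ""


def check_sketch(checklist: ChecklistSketch) -> None:
    """Raise ValueError if any text field is blank (assumed meaning of incomplete)."""
    missing = []
    for field in fields(checklist):
        value = getattr(checklist, field.name)
        if isinstance(value, str) and not value.strip():
            missing.append(field.name)
    if missing:
        raise ValueError(f"incomplete checklist: {', '.join(missing)}")


complete = ChecklistSketch(
    durable_artifact="embedding vectors for UI element foo",
    artifact_storage_path="data/vlm/foo_embeddings.json",
    local_rule_or_cache="vlm_cache",
    will_repeat=False,
    sovereignty_delta="eliminates repeated VLM call",
)
check_sketch(complete)            # passes silently

try:
    check_sketch(ChecklistSketch(durable_artifact="only one field set"))
    error = ""
except ValueError as exc:
    error = str(exc)
print(error)
```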

---

## Graduation Test

Run the five-condition test to evaluate sovereignty readiness:

```python
from timmy.sovereignty.graduation import run_graduation_test

report = run_graduation_test(
    sats_earned=100.0,
    sats_spent=50.0,
    uptime_hours=24.0,
    human_interventions=0,
)
print(report.to_markdown())
```

API endpoint: `GET /sovereignty/graduation/test`

---

## Metrics

Record sovereignty events throughout the codebase:

```python
from timmy.sovereignty.metrics import emit_sovereignty_event

# Perception events: cache hit vs. VLM call
await emit_sovereignty_event("perception_cache_hit", session_id="s1")
await emit_sovereignty_event("perception_vlm_call", session_id="s1")

# Decision events: rule hit vs. LLM call
await emit_sovereignty_event("decision_rule_hit", session_id="s1")
await emit_sovereignty_event("decision_llm_call", session_id="s1")

# Narration events: template vs. LLM
await emit_sovereignty_event("narration_template", session_id="s1")
await emit_sovereignty_event("narration_llm", session_id="s1")

# Crystallization
await emit_sovereignty_event("skill_crystallized", metadata={"layer": "perception"})
```
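
The module map describes the metrics module as tracking a "sovereignty %", but this guide does not give the formula. One plausible reading, sketched below as an assumption, is the share of recorded events that were handled locally rather than by a model. The grouping of event names into `LOCAL_EVENTS` and `MODEL_EVENTS` and the function `sovereignty_pct` are hypothetical.

```python
from collections import Counter

# Assumed split of the event names above into local hits and model calls.
LOCAL_EVENTS = {"perception_cache_hit", "decision_rule_hit", "narration_template"}
MODEL_EVENTS = {"perception_vlm_call", "decision_llm_call", "narration_llm"}


def sovereignty_pct(events: list[str]) -> float:
    """Percentage of counted events that were handled without a model call."""
    counts = Counter(events)
    local = sum(counts[name] for name in LOCAL_EVENTS)
    model = sum(counts[name] for name in MODEL_EVENTS)
    total = local + model
    return 100.0 * local / total if total else 0.0


session = [
    "perception_cache_hit",
    "perception_cache_hit",
    "perception_vlm_call",
    "decision_rule_hit",
    "skill_crystallized",      # ignored: neither a hit nor a model call
]
pct = sovereignty_pct(session)
print(f"{pct:.1f}%")           # 3 local events out of 4 counted
```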

Dashboard WebSocket: `ws://localhost:8000/ws/sovereignty`

---

## Module Map

| Module | Purpose | Issue |
|--------|---------|-------|
| `timmy.sovereignty.metrics` | SQLite event store + sovereignty % | #954 |
| `timmy.sovereignty.perception_cache` | OpenCV template matching | #955 |
| `timmy.sovereignty.auto_crystallizer` | LLM reasoning → local rules | #961 |
| `timmy.sovereignty.sovereignty_loop` | Core orchestration wrappers | #953 |
| `timmy.sovereignty.graduation` | Five-condition graduation test | #953 |
| `timmy.sovereignty.session_report` | Markdown scorecard + Gitea commit | #957 |
| `timmy.sovereignty.three_strike` | Automation enforcement | #962 |
| `infrastructure.sovereignty_metrics` | Research sovereignty tracking | #981 |
| `dashboard.routes.sovereignty_metrics` | HTMX + API endpoints | #960 |
| `dashboard.routes.sovereignty_ws` | WebSocket real-time stream | #960 |
| `dashboard.routes.graduation` | Graduation test API | #953 |