# The Sovereignty Loop

This document establishes the primary engineering constraint for all Timmy Time development: every task must increase sovereignty as a default deliverable. Not as a future goal. Not as an optimization pass. As a constraint on every commit, every function, every inference call.

The full 11-page governing architecture document is available as a PDF: [The-Sovereignty-Loop.pdf](./The-Sovereignty-Loop.pdf)

> "The measure of progress is not features added. It is model calls eliminated."

## The Core Principle

> **The Sovereignty Loop**: Discover with an expensive model. Compress the discovery into a cheap local rule. Replace the model with the rule. Measure the cost reduction. Repeat.

Every call to an LLM, VLM, or external API passes through three phases:

1. **Discovery**: Model sees something for the first time (expensive, unavoidable, produces new knowledge)
2. **Crystallization**: Discovery compressed into durable cheap artifact (requires explicit engineering)
3. **Replacement**: Crystallized artifact replaces the model call (near-zero cost)

**Code review requirement**: If a function calls a model without a crystallization step, it fails code review. No exceptions. The pattern is always: check cache → miss → infer → crystallize → return.
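
The check cache → miss → infer → crystallize → return pattern can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function name `sovereign_call`, the JSON file used as the cache, and the `infer` callable standing in for a model call are all assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Callable


def sovereign_call(key: str, infer: Callable[[str], dict], store: Path) -> dict:
    """Hypothetical wrapper: check cache -> miss -> infer -> crystallize -> return."""
    # Check cache: load previously crystallized results, if any exist.
    cache = json.loads(store.read_text()) if store.exists() else {}
    if key in cache:
        # Replacement: the crystallized artifact answers, no model call is made.
        return cache[key]

    # Miss -> infer: the expensive discovery step (stand-in for an LLM/VLM call).
    result = infer(key)

    # Crystallize: persist the discovery as a durable local artifact.
    cache[key] = result
    store.write_text(json.dumps(cache, indent=2))
    return result
```

Calling `sovereign_call` twice with the same key and store should invoke `infer` only on the first call; the second call is served from the local file.
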
## The Sovereignty Loop Applied to Every Layer

### Perception: See Once, Template Forever

- First encounter: VLM analyzes screenshot (3-6 sec) → structured JSON
- Crystallized as: OpenCV template + bounding box → `templates.json` (3 ms retrieval)
- `crystallize_perception()` function wraps every VLM response
- **Target**: 90% of perception cycles without VLM by hour 1, 99% by hour 4

### Decision: Reason Once, Rule Forever

- First encounter: LLM reasons through decision (1-5 sec)
- Crystallized as: if/else rules, waypoints, cached preferences → `rules.py`, `nav_graph.db` (<1 ms)
- Uses Voyager pattern: named skills with embeddings, success rates, conditions
- Skill match >0.8 confidence + >0.6 success rate → executes without LLM
- **Target**: 70-80% of decisions without LLM by week 4

### Narration: Script the Predictable, Improvise the Novel

- Predictable moments → template with variable slots, voiced by Kokoro locally
- LLM narrates only genuinely surprising events (quest twist, death, discovery)
- **Target**: 60-70% templatized within a week

### Navigation: Walk Once, Map Forever

- Every path recorded as waypoint sequence with terrain annotations
- First journey = full perception + planning; subsequent = graph traversal
- Builds complete nav graph without external map data

### API Costs: Every Dollar Spent Must Reduce Future Dollars

| Week | Groq Calls/Hr | Local Decisions/Hr | Sovereignty % | Cost/Hr |
|---|---|---|---|---|
| 1 | ~720 | ~80 | 10% | $0.40 |
| 2 | ~400 | ~400 | 50% | $0.22 |
| 4 | ~160 | ~640 | 80% | $0.09 |
| 8 | ~40 | ~760 | 95% | $0.02 |
| Target | <20 | >780 | >97% | <$0.01 |

## The Sovereignty Scorecard (5 Metrics)

Every work session ends with a sovereignty audit. Every PR includes a sovereignty delta. Not optional.

| Metric | What It Measures | Target |
|---|---|---|
| Perception Sovereignty % | Frames understood without VLM | >90% by hour 4 |
| Decision Sovereignty % | Actions chosen without LLM | >80% by week 4 |
| Narration Sovereignty % | Lines from templates vs LLM | >60% by week 2 |
| API Cost Trend | Dollar cost per hour of gameplay | Monotonically decreasing |
| Skill Library Growth | Crystallized skills per session | >5 new skills/session |

Dashboard widget on alexanderwhitestone.com shows these in real-time during streams. HTMX component via WebSocket.

## The Crystallization Protocol

Every model output gets crystallized:

| Model Output | Crystallized As | Storage | Retrieval Cost |
|---|---|---|---|
| VLM: UI element | OpenCV template + bbox | `templates.json` | 3 ms |
| VLM: text | OCR region coords | `regions.json` | 50 ms |
| LLM: nav plan | Waypoint sequence | `nav_graph.db` | <1 ms |
| LLM: combat decision | If/else rule on state | `rules.py` | <1 ms |
| LLM: quest interpretation | Structured entry | `quests.db` | <1 ms |
| LLM: NPC disposition | Name→attitude map | `npcs.db` | <1 ms |
| LLM: narration | Template with slots | `narration.json` | <1 ms |
| API: moderation | Approved phrase cache | `approved.set` | <1 ms |
| Groq: strategic plan | Extracted decision rules | `strategy.json` | <1 ms |

Skill document format: markdown + YAML frontmatter following agentskills.io standard (name, game, type, success_rate, times_used, sovereignty_value).

## The Automation Imperative & Three-Strike Rule

Applies to developer workflow too, not just the agent. If you do the same thing manually three times, you stop and write the automation before proceeding.

**Falsework Checklist** (before any cloud API call):

1. What durable artifact will this call produce?
2. Where will the artifact be stored locally?
3. What local rule or cache will this populate?
4. After this call, will I need to make it again?
5. If yes, what would eliminate the repeat?
6. What is the sovereignty delta of this call?
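
One way to make the checklist above enforceable is to record the six answers as data before a cloud call is allowed. The sketch below is only an illustration under assumptions: the `FalseworkPlan` class, its field names, and the `unanswered` helper are hypothetical and do not come from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FalseworkPlan:
    """Hypothetical record of the six checklist answers for one cloud call."""
    artifact: str              # 1. durable artifact the call will produce
    storage_path: str          # 2. where the artifact is stored locally
    populates: str             # 3. local rule or cache the call will populate
    repeat_expected: bool      # 4. will the same call be needed again?
    repeat_eliminator: str     # 5. if yes, what would eliminate the repeat
    sovereignty_delta: float   # 6. expected change in sovereignty %


def unanswered(plan: FalseworkPlan) -> list[str]:
    """Return the names of checklist items that were left blank."""
    missing = [
        name
        for name in ("artifact", "storage_path", "populates")
        if not getattr(plan, name).strip()
    ]
    # Question 5 only needs an answer when question 4 is "yes".
    if plan.repeat_expected and not plan.repeat_eliminator.strip():
        missing.append("repeat_eliminator")
    return missing
```

A caller could refuse to make the cloud request whenever `unanswered(plan)` returns a non-empty list.
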
## The Graduation Test (Falsework Removal Criteria)

All five conditions met simultaneously in a single 24-hour period:

| Test | Condition | Measurement |
|---|---|---|
| Perception Independence | 1 hour, no VLM calls after minute 15 | VLM calls in last 45 min = 0 |
| Decision Independence | Full session with <5 API calls total | Groq/cloud calls < 5 |
| Narration Independence | All narration from local templates + local LLM | Zero cloud TTS/narration calls |
| Economic Independence | Earns more sats than spends on inference | sats_earned > sats_spent |
| Operational Independence | 24 hours unattended, no human intervention | Uptime > 23.5 hrs |

> "The arch must hold after the falsework is removed."
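
The five graduation conditions in the table above can be checked mechanically once the measurements are collected. The following is a rough sketch, assuming the measurements for one 24-hour period are already available as plain numbers; the `DayMetrics` class, its field names, and both functions are hypothetical names chosen for illustration. Only the thresholds are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DayMetrics:
    """Hypothetical container for one 24-hour period of measurements."""
    vlm_calls_last_45_min: int   # VLM calls after minute 15 of the test hour
    cloud_calls_in_session: int  # Groq/cloud calls over the full session
    cloud_narration_calls: int   # cloud TTS/narration calls
    sats_earned: int
    sats_spent: int
    uptime_hours: float          # unattended uptime within the 24 hours


def graduation_report(m: DayMetrics) -> dict[str, bool]:
    """Evaluate each condition from the graduation table separately."""
    return {
        "perception": m.vlm_calls_last_45_min == 0,
        "decision": m.cloud_calls_in_session < 5,
        "narration": m.cloud_narration_calls == 0,
        "economic": m.sats_earned > m.sats_spent,
        "operational": m.uptime_hours > 23.5,
    }


def graduated(m: DayMetrics) -> bool:
    """All five conditions must hold in the same period."""
    return all(graduation_report(m).values())
```

Returning the per-condition report alongside the overall verdict makes it easier to see which condition is still holding the falsework in place.
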