Google AI Agent 9146bcb4b2 feat: Sovereign Efficiency — Local-First & Cost Optimization (#226)

Co-authored-by: Google AI Agent <gemini@hermes.local>
Co-committed-by: Google AI Agent <gemini@hermes.local>

2026-04-05 21:32:56 +00:00

Sovereign Efficiency: Local-First & Cost Saving Guide

This guide outlines the strategy for eliminating waste and optimizing flow within the Timmy Foundation ecosystem.

1. Smart Model Routing (SMR)

Goal: Use the right tool for the job. Don't use a 14B or 70B model to say "Hello" or "Task complete."

Action: Enable smart_model_routing in config.yaml.
Logic:
- Simple acknowledgments and status updates -> Gemma 2B / Phi-3 Mini (Local).
- Complex reasoning and coding -> Hermes 14B / Llama 3 70B (Local).
- Fortress-grade synthesis -> Claude 3.5 Sonnet / Gemini 1.5 Pro (Cloud - Emergency Only).
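The three tiers above can be read as a small dispatch function. The following is only a sketch of that logic, not the actual smart_model_routing implementation: the task labels ("acknowledgment", "status_update", "synthesis"), the `emergency` flag, and the `route` function are all hypothetical names introduced for illustration, and it assumes that anything not recognized as simple stays on the larger local models.

```python
from enum import Enum


class Tier(Enum):
    SMALL_LOCAL = "small-local"  # e.g. Gemma 2B / Phi-3 Mini
    LARGE_LOCAL = "large-local"  # e.g. Hermes 14B / Llama 3 70B
    CLOUD = "cloud"              # e.g. Claude 3.5 Sonnet / Gemini 1.5 Pro


# Hypothetical task labels; the guide does not say how tasks are classified.
SIMPLE_KINDS = {"acknowledgment", "status_update"}


def route(kind: str, emergency: bool = False) -> Tier:
    """Pick a model tier for a task, preferring local models."""
    # Cloud is reserved for synthesis work that is explicitly flagged as an emergency.
    if kind == "synthesis" and emergency:
        return Tier.CLOUD
    # Cheap, short replies go to the smallest local model.
    if kind in SIMPLE_KINDS:
        return Tier.SMALL_LOCAL
    # Everything else (reasoning, coding, non-emergency synthesis) stays local.
    return Tier.LARGE_LOCAL
```

In this sketch the cloud tier is opt-in rather than a fallback, which matches the "Emergency Only" note: a request never leaves the machine unless the caller asks for it.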

Goal: Keep the KV cache lean. Long sessions shouldn't slow down the "Thought Stream."

Action: Enable compression in config.yaml.
Threshold: Set to 0.5 to trigger summarization when the context is half full.
Protect Last N: Keep the last 20 turns in raw format for immediate coherence.
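To make the two settings concrete, here is a minimal sketch of how a threshold of 0.5 and a protected tail of 20 turns might interact. The function names, the token-count inputs, and the idea of splitting the history into "summarize" and "keep raw" parts are assumptions for illustration; the real compression code may work differently.

```python
def should_compress(used_tokens: int, context_window: int, threshold: float = 0.5) -> bool:
    """Return True once the context is at least `threshold` full."""
    return used_tokens >= threshold * context_window


def split_for_compression(turns: list, protect_last_n: int = 20) -> tuple[list, list]:
    """Split history into (turns to summarize, turns to keep in raw form)."""
    if protect_last_n <= 0:
        return list(turns), []
    # The most recent `protect_last_n` turns are never summarized.
    return list(turns[:-protect_last_n]), list(turns[-protect_last_n:])


# Usage: a 50-turn session in an 8192-token window that is half full.
history = [f"turn {i}" for i in range(50)]
if should_compress(used_tokens=4096, context_window=8192):
    to_summarize, raw_tail = split_for_compression(history)
```

With fewer than 20 turns in the session, the first list is empty and nothing is summarized, even if the threshold has been crossed.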

Goal: Reduce redundant reasoning cycles in The Nexus.

Action: The Nexus now uses Adaptive Reasoning Frequency: when world stability is high (above 0.9), the number of reasoning cycles is halved.
Benefit: Reduces CPU/GPU load on the local harness, leaving more headroom for inference.
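The halving rule can be sketched in a few lines. This is an illustration of the stated rule only: `reasoning_cycles`, `base_cycles`, and the floor of one cycle are assumptions, and how The Nexus actually measures stability is not covered here.

```python
def reasoning_cycles(base_cycles: int, stability: float, threshold: float = 0.9) -> int:
    """Halve the number of reasoning cycles when world stability exceeds the threshold."""
    if stability > threshold:
        # Assumed floor: always run at least one cycle so the loop never stalls.
        return max(1, base_cycles // 2)
    return base_cycles
```

For example, a base of 8 cycles drops to 4 at a stability of 0.95 and stays at 8 at exactly 0.9, since the rule is a strict "greater than".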

Goal: Treat compute as a finite resource.

Action: Use the Sovereign Health HUD in The Nexus to monitor L402 challenges.
Metric: Track "Sats per Thought" to identify which agents are "token-heavy."
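One plausible reading of "Sats per Thought" is the average number of sats an agent spends per reasoning step. The sketch below assumes that definition and a hypothetical log of `(agent, sats_spent)` records, one per thought; neither the record format nor the function name comes from the HUD itself.

```python
from collections import defaultdict


def sats_per_thought(records):
    """Average sats spent per thought for each agent.

    `records` is an iterable of (agent_name, sats_spent) pairs, one per thought.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for agent, sats in records:
        totals[agent] += sats
        counts[agent] += 1
    return {agent: totals[agent] / counts[agent] for agent in totals}


# Usage with made-up numbers: find the most expensive agent per thought.
log = [("scout", 10), ("scout", 30), ("archivist", 5)]
averages = sats_per_thought(log)
heaviest = max(averages, key=averages.get)
```

Ranking agents by this average, rather than by total spend, keeps a busy but efficient agent from being flagged as "token-heavy."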

Goal: Remove stale state.

Action: Run the triage_sprint.ts script weekly to assign or archive stale issues.
Action: Use hermes --flush-memories to clear outdated context that no longer serves the current mission.

Sovereignty is not just about ownership; it is about stewardship of resources.