Co-authored-by: Google AI Agent <gemini@hermes.local> Co-committed-by: Google AI Agent <gemini@hermes.local>
Sovereign Efficiency: Local-First & Cost Saving Guide
This guide outlines the strategy for eliminating waste and optimizing flow within the Timmy Foundation ecosystem.
1. Smart Model Routing (SMR)
Goal: Use the right tool for the job. Don't use a 14B or 70B model to say "Hello" or "Task complete."
- Action: Enable `smart_model_routing` in `config.yaml`.
- Logic:
- Simple acknowledgments and status updates -> Gemma 2B / Phi-3 Mini (Local).
- Complex reasoning and coding -> Hermes 14B / Llama 3 70B (Local).
- Fortress-grade synthesis -> Claude 3.5 Sonnet / Gemini 1.5 Pro (Cloud - Emergency Only).
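The routing logic above can be sketched roughly as follows. This is an illustration only, not the router that `smart_model_routing` actually enables: the tier names, the `ACK_PHRASES` list, the four-word cutoff, and the `pick_tier` function are all assumptions made for the example.

```python
# Hypothetical tier names; the model lists mirror the routing table above.
TIERS = {
    "light": ["Gemma 2B", "Phi-3 Mini"],              # local, small
    "heavy": ["Hermes 14B", "Llama 3 70B"],           # local, large
    "cloud": ["Claude 3.5 Sonnet", "Gemini 1.5 Pro"], # emergency only
}

# Assumed examples of messages that never need a large model.
ACK_PHRASES = ("hello", "task complete", "ok", "done", "thanks")


def pick_tier(prompt: str, emergency: bool = False) -> str:
    """Return a tier name for a prompt using a crude keyword/length heuristic."""
    if emergency:
        return "cloud"
    text = prompt.strip().lower().rstrip(".!")
    # Short acknowledgments and status updates go to the small local models.
    if text in ACK_PHRASES or len(text.split()) <= 4:
        return "light"
    # Everything else is treated as reasoning or coding work.
    return "heavy"
```

For example, `pick_tier("Task complete.")` selects the light tier, while a multi-sentence coding request falls through to the heavy tier. A real router would likely use a better complexity signal than word count.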
2. Context Compression
Goal: Keep the KV cache lean. Long sessions shouldn't slow down the "Thought Stream."
- Action: Enable `compression` in `config.yaml`.
- Threshold: Set to `0.5` to trigger summarization when the context is half full.
- Protect Last N: Keep the last 20 turns in raw format for immediate coherence.
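To make the interaction between the threshold and the protected tail concrete, here is a minimal sketch. It assumes a whitespace word count as a stand-in for real tokenization and a placeholder summarizer; `maybe_compress`, `count_tokens`, and their parameters are hypothetical names, not the actual compression implementation.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: counts whitespace-separated words.
    return len(text.split())


def _placeholder_summary(old_turns: List[str]) -> str:
    # A real system would call a small local model here.
    return f"[summary of {len(old_turns)} earlier turns]"


def maybe_compress(
    turns: List[str],
    context_limit: int,
    threshold: float = 0.5,
    protect_last_n: int = 20,
    summarize: Callable[[List[str]], str] = _placeholder_summary,
) -> List[str]:
    """Summarize older turns once usage reaches threshold * context_limit."""
    used = sum(count_tokens(t) for t in turns)
    if used < threshold * context_limit or len(turns) <= protect_last_n:
        return turns  # below the trigger, or nothing old enough to fold away
    split = len(turns) - protect_last_n
    old, recent = turns[:split], turns[split:]
    # The protected tail stays verbatim; everything older becomes one summary.
    return [summarize(old)] + recent
```

With 30 turns and the defaults, the first 10 turns collapse into a single summary entry and the last 20 are kept unchanged, provided the context is at least half full.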
3. Parallel Symbolic Execution (PSE) Optimization
Goal: Reduce redundant reasoning cycles in The Nexus.
- Action: The Nexus now uses Adaptive Reasoning Frequency: when world stability is high (>0.9), it runs half as many reasoning cycles.
- Benefit: Reduces CPU/GPU load on the local harness, leaving more headroom for inference.
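The halving rule can be expressed in a few lines. This sketch assumes stability is a float in the range 0 to 1 and that at least one cycle should always run; the function name `reasoning_cycles` and the floor of 1 are assumptions, not taken from The Nexus source.

```python
def reasoning_cycles(
    base_cycles: int,
    world_stability: float,
    stability_threshold: float = 0.9,
) -> int:
    """Halve the cycle count when stability is strictly above the threshold."""
    if world_stability > stability_threshold:
        # Integer halving, but never drop below one cycle (assumed floor).
        return max(1, base_cycles // 2)
    return base_cycles
```

Note that the comparison is strict, matching the ">0.9" wording: a stability of exactly 0.9 keeps the full cycle count.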
4. L402 Cost Transparency
Goal: Treat compute as a finite resource.
- Action: Use the Sovereign Health HUD in The Nexus to monitor L402 challenges.
- Metric: Track "Sats per Thought" to identify which agents are "token-heavy."
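One plausible way to compute "Sats per Thought" is total sats paid divided by the number of completed thoughts, per agent. The sketch below assumes each L402 payment can be attributed to one agent and one thought; the event shape and the `sats_per_thought` helper are hypothetical, and the agent names are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def sats_per_thought(events: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """events: (agent_name, sats_paid) pairs, one per completed thought."""
    totals = defaultdict(lambda: [0, 0])  # agent -> [sats, thoughts]
    for agent, sats in events:
        totals[agent][0] += sats
        totals[agent][1] += 1
    return {agent: sats / thoughts for agent, (sats, thoughts) in totals.items()}


# Usage with made-up numbers: rank agents from most to least expensive.
sample = [("agent_a", 10), ("agent_a", 30), ("agent_b", 5)]
ranking = sorted(sats_per_thought(sample).items(), key=lambda kv: kv[1], reverse=True)
```

Agents at the top of such a ranking are the "token-heavy" candidates worth routing to a smaller model or compressing more aggressively.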
5. Waste Elimination (Ghost Triage)
Goal: Remove stale state.
- Action: Run the `triage_sprint.ts` script weekly to assign or archive stale issues.
- Action: Use `hermes --flush-memories` to clear outdated context that no longer serves the current mission.
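As a rough illustration of what "stale" might mean in practice, the sketch below flags issues that have not been updated within a cutoff window. It is not a port of `triage_sprint.ts`: the 30-day window, the issue dictionary shape, and the `find_stale` name are all assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def find_stale(issues: List[Dict], now: datetime, max_age_days: int = 30) -> List[int]:
    """Return ids of issues whose last update is older than the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return [issue["id"] for issue in issues if issue["updated_at"] < cutoff]


# Usage with made-up issues and a fixed clock so the result is reproducible.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
issues = [
    {"id": 1, "updated_at": datetime(2024, 5, 25, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2024, 3, 1, tzinfo=timezone.utc)},
]
stale_ids = find_stale(issues, now)
```

Each flagged issue would then be either assigned to an owner or archived during the weekly pass.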
Sovereignty is not just about ownership; it is about stewardship of resources.