feat: Frontier Local Agenda — Gemma Scout & Local RAG (#227)

Co-authored-by: Google AI Agent <gemini@hermes.local>
Co-committed-by: Google AI Agent <gemini@hermes.local>
This commit was merged in pull request #227.
This commit is contained in:
2026-04-05 21:38:56 +00:00
committed by Timmy Time
parent 2b308f300a
commit 317140efcf
2 changed files with 38 additions and 3 deletions

30
FRONTIER_LOCAL.md Normal file
View File

@@ -0,0 +1,30 @@
# The Frontier Local Agenda: Technical Standards v1.0
This document defines the "Frontier Local" agenda — the technical strategy for achieving sovereign, high-performance intelligence on consumer hardware.
## 1. The Multi-Layered Mind (MLM)
We do not rely on a single "God Model." We use a hierarchy of local intelligence:
- **Reflex Layer (Gemma 2B):** Instantaneous tactical decisions, input classification, and simple acknowledgments. Latency: <100ms.
- **Reasoning Layer (Hermes 14B / Llama 3 8B):** General-purpose problem solving, coding, and tool use. Latency: <1s.
- **Synthesis Layer (Llama 3 70B / Qwen 72B):** Deep architectural planning, creative synthesis, and complex debugging. Latency: <5s.
## 2. Local-First RAG (Retrieval Augmented Generation)
Sovereignty requires that your memories stay on your disk.
- **Embedding:** Use `nomic-embed-text` or `all-minilm` locally via Ollama.
- **Vector Store:** Use a local instance of ChromaDB or LanceDB.
- **Privacy:** Zero data leaves the local network for indexing or retrieval.
## 3. Speculative Decoding
Where supported by the harness (e.g., llama.cpp), use Gemma 2B as a draft model for larger Hermes/Llama models to achieve 2x-3x speedups in token generation.
## 4. The "Gemma Scout" Protocol
Gemma 2B is our "Scout." It pre-processes every user request to:
1. Detect PII (Personally Identifiable Information) for redaction.
2. Determine if the request requires the "Reasoning Layer" or can be handled by the "Reflex Layer."
3. Extract keywords for local memory retrieval.
---
*Intelligence is a utility. Sovereignty is a right. The Frontier is Local.*

View File

@@ -20,7 +20,12 @@ terminal:
modal_image: nikolaik/python-nodejs:python3.11-nodejs20
daytona_image: nikolaik/python-nodejs:python3.11-nodejs20
container_cpu: 1
container_memory: 5120
container_embeddings:
provider: ollama
model: nomic-embed-text
base_url: http://localhost:11434/v1
memory: 5120
container_disk: 51200
container_persistent: true
docker_volumes: []
@@ -43,8 +48,8 @@ compression:
summary_base_url: ''
smart_model_routing:
enabled: true
max_simple_chars: 200
max_simple_words: 35
max_simple_chars: 400
max_simple_words: 75
cheap_model:
provider: 'ollama'
model: 'gemma2:2b'