Compare commits

...

2 Commits

2 changed files with 38 additions and 3 deletions

30
FRONTIER_LOCAL.md Normal file
View File

@@ -0,0 +1,30 @@
# The Frontier Local Agenda: Technical Standards v1.0
This document defines the "Frontier Local" agenda — the technical strategy for achieving sovereign, high-performance intelligence on consumer hardware.
## 1. The Multi-Layered Mind (MLM)
We do not rely on a single "God Model." We use a hierarchy of local intelligence:
- **Reflex Layer (Gemma 2B):** Instantaneous tactical decisions, input classification, and simple acknowledgments. Latency: <100ms.
- **Reasoning Layer (Hermes 14B / Llama 3 8B):** General-purpose problem solving, coding, and tool use. Latency: <1s.
- **Synthesis Layer (Llama 3 70B / Qwen 72B):** Deep architectural planning, creative synthesis, and complex debugging. Latency: <5s.
## 2. Local-First RAG (Retrieval Augmented Generation)
Sovereignty requires that your memories stay on your disk.
- **Embedding:** Use `nomic-embed-text` or `all-minilm` locally via Ollama.
- **Vector Store:** Use a local instance of ChromaDB or LanceDB.
- **Privacy:** Zero data leaves the local network for indexing or retrieval.
## 3. Speculative Decoding
Where supported by the harness (e.g., llama.cpp), use Gemma 2B as a draft model for larger Hermes/Llama models to achieve 2x-3x speedups in token generation.
## 4. The "Gemma Scout" Protocol
Gemma 2B is our "Scout." It pre-processes every user request to:
1. Detect PII (Personally Identifiable Information) for redaction.
2. Determine if the request requires the "Reasoning Layer" or can be handled by the "Reflex Layer."
3. Extract keywords for local memory retrieval.
---
*Intelligence is a utility. Sovereignty is a right. The Frontier is Local.*

View File

@@ -20,7 +20,12 @@ terminal:
modal_image: nikolaik/python-nodejs:python3.11-nodejs20 modal_image: nikolaik/python-nodejs:python3.11-nodejs20
daytona_image: nikolaik/python-nodejs:python3.11-nodejs20 daytona_image: nikolaik/python-nodejs:python3.11-nodejs20
container_cpu: 1 container_cpu: 1
container_memory: 5120 container_embeddings:
provider: ollama
model: nomic-embed-text
base_url: http://localhost:11434/v1
memory: 5120
container_disk: 51200 container_disk: 51200
container_persistent: true container_persistent: true
docker_volumes: [] docker_volumes: []
@@ -43,8 +48,8 @@ compression:
summary_base_url: '' summary_base_url: ''
smart_model_routing: smart_model_routing:
enabled: true enabled: true
max_simple_chars: 200 max_simple_chars: 400
max_simple_words: 35 max_simple_words: 75
cheap_model: cheap_model:
provider: 'ollama' provider: 'ollama'
model: 'gemma2:2b' model: 'gemma2:2b'