Timmy Local — Sovereign AI Infrastructure
Local infrastructure for Timmy's sovereign AI operation. Runs entirely on your hardware with no cloud dependencies for core functionality.
Quick Start
```bash
# 1. Run setup
./setup-local-timmy.sh

# 2. Start llama-server (in another terminal)
llama-server -m ~/models/hermes4-14b.gguf -c 8192 --jinja -ngl 99

# 3. Test the cache layer
python3 -c "from cache.agent_cache import cache_manager; print(cache_manager.get_all_stats())"

# 4. Warm the prompt cache
python3 scripts/warmup_cache.py --all
```
Components
1. Multi-Tier Caching (cache/)
Issue #103 — Cache Everywhere
| Tier | Purpose | Speedup |
|---|---|---|
| KV Cache | llama-server prefix caching | 50-70% |
| Response Cache | Full LLM response caching | Instant repeat |
| Tool Cache | Stable tool outputs | 30%+ |
| Embedding Cache | RAG embeddings | 80%+ |
| Template Cache | Pre-compiled prompts | 10%+ |
| HTTP Cache | API responses | Varies |
Usage:

```python
from cache.agent_cache import cache_manager

# Tool result caching
result = cache_manager.tool.get("system_info", {})
if result is None:
    result = get_system_info()
    cache_manager.tool.put("system_info", {}, result)

# Response caching
cached = cache_manager.response.get("What is 2+2?")
if cached is None:
    response = query_llm("What is 2+2?")
    cache_manager.response.put("What is 2+2?", response)

# Check stats
print(cache_manager.get_all_stats())
```
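The get-or-compute pattern above can be wrapped in a small helper so call sites stay clean. A minimal, self-contained sketch (the `ToolCache` stand-in and `cached_tool` helper are illustrative, not part of `cache.agent_cache`):

```python
import time
from typing import Any, Callable, Dict, Tuple

class ToolCache:
    """Minimal in-memory stand-in for cache_manager.tool (illustrative only)."""

    def __init__(self, ttl: float = 300.0):
        self.ttl = ttl
        self._store: Dict[Tuple[str, str], Tuple[float, Any]] = {}

    def _key(self, tool: str, args: dict) -> Tuple[str, str]:
        # Sort args so {"a": 1, "b": 2} and {"b": 2, "a": 1} hit the same entry
        return (tool, repr(sorted(args.items())))

    def get(self, tool: str, args: dict) -> Any:
        entry = self._store.get(self._key(tool, args))
        if entry is None:
            return None
        stored_at, value = entry
        if time.monotonic() - stored_at > self.ttl:  # expired entry
            return None
        return value

    def put(self, tool: str, args: dict, value: Any) -> None:
        self._store[self._key(tool, args)] = (time.monotonic(), value)

def cached_tool(cache: ToolCache, tool: str, args: dict,
                compute: Callable[[], Any]) -> Any:
    """Get-or-compute: return a cached result, or run the tool and cache it."""
    result = cache.get(tool, args)
    if result is None:
        result = compute()
        cache.put(tool, args, result)
    return result
```

With a helper like this, repeated calls only run the underlying tool once per TTL window.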
2. Evennia World (evennia/)
Issues #83, #84 — World Shell + Tool Bridge
Rooms:
- Workshop — Execute tasks, use tools
- Library — Knowledge storage, retrieval
- Observatory — Monitor systems, check health
- Forge — Build capabilities, create tools
- Dispatch — Task queue, routing
Commands:
- `read <path>`, `write <path> = <content>`, `search <pattern>`
- `git status`, `git log [n]`, `git pull`
- `sysinfo`, `health`
- `think <prompt>` — Local LLM reasoning
- `gitea issues`
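In the repo these are implemented as Evennia command classes in `commands/tools.py`. As a rough illustration of the underlying routing pattern only, a plain-Python dispatcher might look like this (handler names and outputs are hypothetical):

```python
import shlex
from typing import Callable, Dict

# Hypothetical handler registry; in the real world these map to
# Evennia command classes rather than bare callables.
HANDLERS: Dict[str, Callable[..., str]] = {
    "sysinfo": lambda: "cpu: ok, mem: ok",
    "health": lambda: "all systems nominal",
}

def dispatch(line: str) -> str:
    """Parse an in-world command line and route it to a handler."""
    parts = shlex.split(line)
    if not parts:
        return "empty command"
    cmd, args = parts[0], parts[1:]
    handler = HANDLERS.get(cmd)
    if handler is None:
        return f"unknown command: {cmd}"
    return handler(*args)
```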
Setup:

```bash
cd evennia
python evennia_launcher.py shell -f world/build.py
```
3. Knowledge Ingestion (scripts/ingest.py)
Issue #87 — Auto-ingest Intelligence
```bash
# Ingest a file
python3 scripts/ingest.py ~/papers/speculative-decoding.md

# Batch ingest directory
python3 scripts/ingest.py --batch ~/knowledge/

# Search knowledge
python3 scripts/ingest.py --search "optimization"

# Search by tag
python3 scripts/ingest.py --tag inference

# View stats
python3 scripts/ingest.py --stats
```
4. Prompt Cache Warming (scripts/warmup_cache.py)
Issue #85 — KV Cache Reuse
```bash
# Warm specific prompt tier
python3 scripts/warmup_cache.py --prompt standard

# Warm all tiers
python3 scripts/warmup_cache.py --all

# Benchmark improvement
python3 scripts/warmup_cache.py --benchmark
```
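Warming works by sending the shared prompt prefix through llama-server once, so later requests reuse the cached KV state. A hedged sketch of the request side (the endpoint matches the Quick Start above; `cache_prompt` is a llama.cpp server extension and the payload shape here is an assumption, not the script's actual code):

```python
import json
import urllib.request

# Endpoint from the Quick Start llama-server invocation above
LLAMA_SERVER = "http://localhost:8080/v1/chat/completions"

def build_warmup_payload(system_prompt: str) -> dict:
    """Request a single token so the server processes (and caches) the prefix."""
    return {
        "messages": [{"role": "system", "content": system_prompt}],
        "max_tokens": 1,          # we only care about prefix processing
        "temperature": 0.0,
        "cache_prompt": True,     # llama.cpp extension; harmless if ignored
    }

def warm(system_prompt: str) -> None:
    """Send the warmup request (requires a running llama-server)."""
    req = urllib.request.Request(
        LLAMA_SERVER,
        data=json.dumps(build_warmup_payload(system_prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req).read()
```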
Directory Structure
```
timmy-local/
├── cache/
│   ├── agent_cache.py        # Main cache implementation
│   └── cache_config.py       # TTL and configuration
├── evennia/
│   ├── typeclasses/
│   │   ├── characters.py     # Timmy, KnowledgeItem, ToolObject
│   │   └── rooms.py          # Workshop, Library, Observatory, Forge, Dispatch
│   ├── commands/
│   │   └── tools.py          # In-world tool commands
│   └── world/
│       └── build.py          # World construction script
├── scripts/
│   ├── ingest.py             # Knowledge ingestion pipeline
│   └── warmup_cache.py       # Prompt cache warming
├── setup-local-timmy.sh      # Installation script
└── README.md                 # This file
```
Configuration
All configuration lives in `~/.timmy/config/`:

```yaml
# ~/.timmy/config/timmy.yaml
name: "Timmy"

llm:
  local_endpoint: http://localhost:8080/v1
  model: hermes4

cache:
  enabled: true

gitea:
  url: http://143.198.27.163:3000
  repo: Timmy_Foundation/timmy-home
```
Integration with Main Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                       LOCAL TIMMY                            │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐        │
│  │  Cache   │ │ Evennia  │ │ Knowledge│ │  Tools   │        │
│  │  Layer   │ │  World   │ │   Base   │ │          │        │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘        │
│       └────────────┴────────────┴────────────┘              │
│                          │                                   │
│                     ┌────┴────┐                              │
│                     │  Timmy  │                              │
│                     └────┬────┘                              │
└──────────────────────────┼───────────────────────────────────┘
                           │
               ┌───────────┼───────────┐
               │           │           │
          ┌────┴───┐  ┌────┴───┐  ┌────┴───┐
          │  Ezra  │  │Allegro │  │Bezalel │
          │ (Cloud)│  │ (Cloud)│  │ (Cloud)│
          └────────┘  └────────┘  └────────┘
```
Local Timmy operates sovereignly: cloud backends provide additional capacity, but Timmy survives without them.
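This fallback can be sketched as an ordered-backend router that always tries local first (the backend callables below are stand-ins; none of these names exist in the repo):

```python
from typing import Callable, List

def route(query: str, backends: List[Callable[[str], str]]) -> str:
    """Try each backend in order (local llama-server first, then cloud).

    A backend that raises is skipped; if all fail, surface the last error.
    """
    last_error = None
    for backend in backends:
        try:
            return backend(query)
        except Exception as exc:
            last_error = exc
    raise RuntimeError(f"all backends failed: {last_error}")
```

Because the local endpoint is first in the list, cloud backends are only consulted when local inference is unavailable, which preserves the sovereignty property described above.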
Performance Targets
| Metric | Target |
|---|---|
| Cache hit rate | > 30% |
| Prompt cache warming | 50-70% faster |
| Local inference | < 5s for simple tasks |
| Knowledge retrieval | < 100ms |
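These targets can be sanity-checked with a small timing harness; a sketch (not part of the repo, and `warmup_cache.py --benchmark` above is the real measurement path):

```python
import time

def measure(fn, repeats: int = 5) -> float:
    """Return best-of-N wall time in milliseconds for a callable."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return best

def hit_rate(hits: int, misses: int) -> float:
    """Cache hit rate as defined in the table above."""
    total = hits + misses
    return hits / total if total else 0.0
```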
Troubleshooting
Cache not working
```bash
# Check cache databases
ls -la ~/.timmy/cache/

# Test cache layer
python3 -c "from cache.agent_cache import cache_manager; print(cache_manager.get_all_stats())"
```
llama-server not responding
```bash
# Check if running
curl http://localhost:8080/health

# Restart
pkill llama-server
llama-server -m ~/models/hermes4-14b.gguf -c 8192 --jinja -ngl 99
```
Evennia commands not available
```bash
# Rebuild world
cd evennia
python evennia_launcher.py shell -f world/build.py
```

Or manually create Timmy from inside the game:

```
@create/drop Timmy:typeclasses.characters.TimmyCharacter
@tel Timmy = Workshop
```
Contributing
All changes flow through Gitea:
1. Create branch: `git checkout -b feature/my-change`
2. Commit: `git commit -m '[#XXX] Description'`
3. Push: `git push origin feature/my-change`
4. Create PR via web interface
License
Timmy Foundation — Sovereign AI Infrastructure
Sovereignty and service always.