# Timmy Local — Sovereign AI Infrastructure

Local infrastructure for Timmy's sovereign AI operation. Runs entirely on your hardware with no cloud dependencies for core functionality.

## Quick Start

```bash
# 1. Run setup
./setup-local-timmy.sh

# 2. Start llama-server (in another terminal)
llama-server -m ~/models/hermes4-14b.gguf -c 8192 --jinja -ngl 99

# 3. Test the cache layer
python3 -c "from cache.agent_cache import cache_manager; print(cache_manager.get_all_stats())"

# 4. Warm the prompt cache
python3 scripts/warmup_cache.py --all
```
## Components

### 1. Multi-Tier Caching (`cache/`)

Issue #103 — Cache Everywhere

| Tier | Purpose | Speedup |
|------|---------|---------|
| KV Cache | llama-server prefix caching | 50-70% |
| Response Cache | Full LLM response caching | Instant on repeat |
| Tool Cache | Stable tool outputs | 30%+ |
| Embedding Cache | RAG embeddings | 80%+ |
| Template Cache | Pre-compiled prompts | 10%+ |
| HTTP Cache | API responses | Varies |
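The response-cache tier needs a deterministic key so that identical requests land on the same entry. A minimal sketch of one way to build such a key (this is illustrative — `response_cache_key` is a hypothetical helper, not the actual `agent_cache` internals):

```python
import hashlib
import json

def response_cache_key(prompt, params=None):
    """Build a deterministic cache key from the prompt and sampling params.

    Serializing with sort_keys=True keeps semantically identical requests
    (same params in a different order) on the same key.
    """
    payload = json.dumps({"prompt": prompt, "params": params or {}}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Keying on sampling parameters as well as the prompt matters: a cached greedy answer should not be served for a high-temperature request.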
**Usage:**
```python
from cache.agent_cache import cache_manager

# Tool result caching
result = cache_manager.tool.get("system_info", {})
if result is None:
    result = get_system_info()
    cache_manager.tool.put("system_info", {}, result)

# Response caching
cached = cache_manager.response.get("What is 2+2?")
if cached is None:
    response = query_llm("What is 2+2?")
    cache_manager.response.put("What is 2+2?", response)

# Check stats
print(cache_manager.get_all_stats())
```
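The get/put pattern above is classic cache-aside. A self-contained sketch of that pattern with per-entry expiry, assuming a simple TTL policy — `TTLCache` here is illustrative only, not the real `agent_cache` implementation:

```python
import time

class TTLCache:
    """Minimal cache-aside store with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds=300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # lazily evict stale entries on read
            return None
        return value

    def put(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

    def get_or_compute(self, key, compute):
        """Cache-aside in one call: return a hit, or compute, store, and return."""
        value = self.get(key)
        if value is None:
            value = compute()
            self.put(key, value)
        return value
```

Wrapping the pattern in `get_or_compute` avoids repeating the `if result is None` dance at every call site.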
### 2. Evennia World (`evennia/`)

Issues #83, #84 — World Shell + Tool Bridge

**Rooms:**
- **Workshop** — Execute tasks, use tools
- **Library** — Knowledge storage, retrieval
- **Observatory** — Monitor systems, check health
- **Forge** — Build capabilities, create tools
- **Dispatch** — Task queue, routing

**Commands:**
- `read <path>`, `write <path> = <content>`, `search <pattern>`
- `git status`, `git log [n]`, `git pull`
- `sysinfo`, `health`
- `think <prompt>` — Local LLM reasoning
- `gitea issues`

**Setup:**
```bash
cd evennia
python evennia_launcher.py shell -f world/build.py
```
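The in-world commands above follow a simple `verb [args]` grammar, with `write` carrying a `path = content` pair. A hedged sketch of how a dispatcher might tokenize that grammar — this is a hypothetical illustration, not the actual `commands/tools.py`:

```python
def parse_command(line):
    """Split an in-world command line into (verb, parsed-args).

    `write <path> = <content>` is the one special case: path and content
    are split on the first `=` so content may itself contain `=`.
    """
    line = line.strip()
    verb, _, rest = line.partition(" ")
    verb = verb.lower()
    if verb == "write":
        path, _, content = rest.partition("=")
        return verb, {"path": path.strip(), "content": content.strip()}
    return verb, {"args": rest.strip()}
```

In Evennia itself, each verb would instead be a `Command` subclass whose `func()` runs the tool and messages the result back to the caller.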
### 3. Knowledge Ingestion (`scripts/ingest.py`)

Issue #87 — Auto-ingest Intelligence

```bash
# Ingest a file
python3 scripts/ingest.py ~/papers/speculative-decoding.md

# Batch ingest a directory
python3 scripts/ingest.py --batch ~/knowledge/

# Search knowledge
python3 scripts/ingest.py --search "optimization"

# Search by tag
python3 scripts/ingest.py --tag inference

# View stats
python3 scripts/ingest.py --stats
```
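The core of such a pipeline is ingest, tag, search, stats. A toy in-memory sketch of that shape — class and method names are hypothetical, and the real `scripts/ingest.py` presumably persists to disk and uses embeddings rather than substring search:

```python
import re

class KnowledgeBase:
    """Toy in-memory knowledge index (illustrative only)."""

    def __init__(self):
        self.docs = []  # each: {"path": str, "text": str, "tags": set}

    def ingest(self, path, text, tags=()):
        self.docs.append({"path": path, "text": text, "tags": set(tags)})

    def search(self, query):
        """Case-insensitive substring search over document text."""
        pattern = re.compile(re.escape(query), re.IGNORECASE)
        return [d["path"] for d in self.docs if pattern.search(d["text"])]

    def by_tag(self, tag):
        return [d["path"] for d in self.docs if tag in d["tags"]]

    def stats(self):
        all_tags = set()
        for d in self.docs:
            all_tags |= d["tags"]
        return {"documents": len(self.docs), "tags": len(all_tags)}
```

Each CLI flag above maps onto one of these operations (`--search`, `--tag`, `--stats`).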
### 4. Prompt Cache Warming (`scripts/warmup_cache.py`)

Issue #85 — KV Cache Reuse

```bash
# Warm a specific prompt tier
python3 scripts/warmup_cache.py --prompt standard

# Warm all tiers
python3 scripts/warmup_cache.py --all

# Benchmark the improvement
python3 scripts/warmup_cache.py --benchmark
```
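Warming works by sending each tier's prompt through the server once so llama-server's prefix (KV) cache already holds the shared prefix when real requests arrive. A hedged sketch of building those requests — the tier contents and endpoint path are assumptions, not the real script's values:

```python
import json

# Hypothetical prompt tiers; the real script's tiers and texts may differ.
PROMPT_TIERS = {
    "minimal": "You are Timmy.",
    "standard": "You are Timmy, a sovereign local agent with tool access.",
}

def build_warmup_request(tier, endpoint="http://localhost:8080/v1/chat/completions"):
    """Build one OpenAI-style chat request whose prefix primes the KV cache.

    max_tokens=1 keeps generation cheap: we only need the prompt prefix
    evaluated and cached, not a full completion.
    """
    body = {
        "messages": [{"role": "system", "content": PROMPT_TIERS[tier]}],
        "max_tokens": 1,
    }
    return endpoint, json.dumps(body)
```

Posting each payload once at startup is the whole trick; subsequent requests that share the system-prompt prefix skip re-evaluating those tokens.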
## Directory Structure

```
timmy-local/
├── cache/
│   ├── agent_cache.py       # Main cache implementation
│   └── cache_config.py      # TTL and configuration
├── evennia/
│   ├── typeclasses/
│   │   ├── characters.py    # Timmy, KnowledgeItem, ToolObject
│   │   └── rooms.py         # Workshop, Library, Observatory, Forge, Dispatch
│   ├── commands/
│   │   └── tools.py         # In-world tool commands
│   └── world/
│       └── build.py         # World construction script
├── scripts/
│   ├── ingest.py            # Knowledge ingestion pipeline
│   └── warmup_cache.py      # Prompt cache warming
├── setup-local-timmy.sh     # Installation script
└── README.md                # This file
```
## Configuration

All configuration lives in `~/.timmy/config/`:

```yaml
# ~/.timmy/config/timmy.yaml
name: "Timmy"
llm:
  local_endpoint: http://localhost:8080/v1
  model: hermes4
cache:
  enabled: true
gitea:
  url: http://143.198.27.163:3000
  repo: Timmy_Foundation/timmy-home
```
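One way to consume that file is a typed view whose field names mirror the YAML, with defaults matching the values above. This is an illustrative sketch under the assumption the mapping has already been parsed (e.g. with a YAML library); the repo itself may read the config differently:

```python
from dataclasses import dataclass, field

@dataclass
class LLMConfig:
    local_endpoint: str = "http://localhost:8080/v1"
    model: str = "hermes4"

@dataclass
class TimmyConfig:
    name: str = "Timmy"
    llm: LLMConfig = field(default_factory=LLMConfig)
    cache_enabled: bool = True

    @classmethod
    def from_dict(cls, raw):
        """Build a config from a parsed YAML mapping, falling back to defaults."""
        llm_raw = raw.get("llm", {})
        return cls(
            name=raw.get("name", "Timmy"),
            llm=LLMConfig(
                local_endpoint=llm_raw.get("local_endpoint", "http://localhost:8080/v1"),
                model=llm_raw.get("model", "hermes4"),
            ),
            cache_enabled=raw.get("cache", {}).get("enabled", True),
        )
```

Falling back to defaults field-by-field means a partial `timmy.yaml` still yields a fully populated config.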
## Integration with Main Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                        LOCAL TIMMY                          │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐     │
│  │  Cache   │  │ Evennia  │  │ Knowledge│  │  Tools   │     │
│  │  Layer   │  │  World   │  │   Base   │  │          │     │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘     │
│       └─────────────┴─────────────┴─────────────┘           │
│                         │                                   │
│                    ┌────┴────┐                              │
│                    │  Timmy  │                              │
│                    └────┬────┘                              │
└─────────────────────────┼───────────────────────────────────┘
                          │
              ┌───────────┼───────────┐
              │           │           │
         ┌────┴───┐  ┌────┴───┐  ┌────┴───┐
         │  Ezra  │  │Allegro │  │Bezalel │
         │ (Cloud)│  │ (Cloud)│  │ (Cloud)│
         └────────┘  └────────┘  └────────┘
```

Local Timmy operates sovereignly: cloud backends provide additional capacity, but Timmy survives without them.
## Performance Targets

| Metric | Target |
|--------|--------|
| Cache hit rate | > 30% |
| Prompt cache warming | 50-70% faster |
| Local inference | < 5s for simple tasks |
| Knowledge retrieval | < 100ms |
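The hit-rate target is easy to track with running counters. A minimal sketch — the real `cache_manager.get_all_stats()` may report its numbers differently:

```python
class CacheStats:
    """Running hit/miss counters for one cache tier (illustrative only)."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit):
        """Record one lookup: hit=True on a cache hit, False on a miss."""
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def meets_target(self, target=0.30):
        """Check against the > 30% hit-rate target from the table above."""
        return self.hit_rate > target
```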
## Troubleshooting

### Cache not working
```bash
# Check cache databases
ls -la ~/.timmy/cache/

# Test the cache layer
python3 -c "from cache.agent_cache import cache_manager; print(cache_manager.get_all_stats())"
```

### llama-server not responding
```bash
# Check if it is running
curl http://localhost:8080/health

# Restart it
pkill llama-server
llama-server -m ~/models/hermes4-14b.gguf -c 8192 --jinja -ngl 99
```

### Evennia commands not available
```bash
# Rebuild the world
cd evennia
python evennia_launcher.py shell -f world/build.py
```

If rebuilding doesn't help, create Timmy manually from inside Evennia:
```
@create/drop Timmy:typeclasses.characters.TimmyCharacter
@tel Timmy = Workshop
```
## Contributing

All changes flow through Gitea:

1. Create a branch: `git checkout -b feature/my-change`
2. Commit: `git commit -m '[#XXX] Description'`
3. Push: `git push origin feature/my-change`
4. Open a PR via the web interface

## License

Timmy Foundation — Sovereign AI Infrastructure

*Sovereignty and service always.*