Timmy Local — Sovereign AI Infrastructure
Local infrastructure for Timmy's sovereign AI operation. Runs entirely on your hardware with no cloud dependencies for core functionality.
Quick Start
```bash
# 1. Run setup
./setup-local-timmy.sh

# 2. Start llama-server (in another terminal)
llama-server -m ~/models/hermes4-14b.gguf -c 8192 --jinja -ngl 99

# 3. Test the cache layer
python3 -c "from cache.agent_cache import cache_manager; print(cache_manager.get_all_stats())"

# 4. Warm the prompt cache
python3 scripts/warmup_cache.py --all
```
Components
1. Multi-Tier Caching (cache/)
Issue #103 — Cache Everywhere
| Tier | Purpose | Speedup |
|---|---|---|
| KV Cache | llama-server prefix caching | 50-70% |
| Response Cache | Full LLM response caching | Instant repeat |
| Tool Cache | Stable tool outputs | 30%+ |
| Embedding Cache | RAG embeddings | 80%+ |
| Template Cache | Pre-compiled prompts | 10%+ |
| HTTP Cache | API responses | Varies |
Usage:

```python
from cache.agent_cache import cache_manager

# Tool result caching
result = cache_manager.tool.get("system_info", {})
if result is None:
    result = get_system_info()
    cache_manager.tool.put("system_info", {}, result)

# Response caching
cached = cache_manager.response.get("What is 2+2?")
if cached is None:
    response = query_llm("What is 2+2?")
    cache_manager.response.put("What is 2+2?", response)

# Check stats
print(cache_manager.get_all_stats())
```
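The get-or-compute pattern above can be wrapped in a small helper so call sites stay clean. A minimal, self-contained sketch (the `ToolCache` stand-in and `cached_tool` helper are illustrative, not part of `cache.agent_cache`):

```python
import time
from typing import Any, Callable, Dict, Tuple

class ToolCache:
    """Minimal in-memory stand-in for cache_manager.tool (illustrative only)."""

    def __init__(self, ttl: float = 300.0):
        self.ttl = ttl
        self._store: Dict[Tuple[str, str], Tuple[float, Any]] = {}

    def _key(self, tool: str, args: dict) -> Tuple[str, str]:
        # Sort args so {"a": 1, "b": 2} and {"b": 2, "a": 1} hit the same entry
        return (tool, repr(sorted(args.items())))

    def get(self, tool: str, args: dict) -> Any:
        entry = self._store.get(self._key(tool, args))
        if entry is None:
            return None
        stored_at, value = entry
        if time.monotonic() - stored_at > self.ttl:  # expired entry
            return None
        return value

    def put(self, tool: str, args: dict, value: Any) -> None:
        self._store[self._key(tool, args)] = (time.monotonic(), value)

def cached_tool(cache: ToolCache, tool: str, args: dict,
                compute: Callable[[], Any]) -> Any:
    """Get-or-compute: return a cached result, or run the tool and cache it."""
    result = cache.get(tool, args)
    if result is None:
        result = compute()
        cache.put(tool, args, result)
    return result
```

With a helper like this, repeated calls only run the underlying tool once per TTL window.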
2. Evennia World (evennia/)
Issues #83, #84 — World Shell + Tool Bridge
Rooms:
- Workshop — Execute tasks, use tools
- Library — Knowledge storage, retrieval
- Observatory — Monitor systems, check health
- Forge — Build capabilities, create tools
- Dispatch — Task queue, routing
Commands:
- `read <path>`, `write <path> = <content>`, `search <pattern>`
- `git status`, `git log [n]`, `git pull`
- `sysinfo`, `health`
- `think <prompt>` — Local LLM reasoning
- `gitea issues`
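In the repo these are implemented as Evennia command classes in `commands/tools.py`. As a rough illustration of the underlying routing pattern only, a plain-Python dispatcher might look like this (handler names and outputs are hypothetical):

```python
import shlex
from typing import Callable, Dict

# Hypothetical handler registry; in the real world these map to
# Evennia command classes rather than bare callables.
HANDLERS: Dict[str, Callable[..., str]] = {
    "sysinfo": lambda: "cpu: ok, mem: ok",
    "health": lambda: "all systems nominal",
}

def dispatch(line: str) -> str:
    """Parse an in-world command line and route it to a handler."""
    parts = shlex.split(line)
    if not parts:
        return "empty command"
    cmd, args = parts[0], parts[1:]
    handler = HANDLERS.get(cmd)
    if handler is None:
        return f"unknown command: {cmd}"
    return handler(*args)
```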
Setup:

```bash
cd evennia
python evennia_launcher.py shell -f world/build.py
```
3. Knowledge Ingestion (scripts/ingest.py)
Issue #87 — Auto-ingest Intelligence
```bash
# Ingest a file
python3 scripts/ingest.py ~/papers/speculative-decoding.md

# Batch ingest directory
python3 scripts/ingest.py --batch ~/knowledge/

# Search knowledge
python3 scripts/ingest.py --search "optimization"

# Search by tag
python3 scripts/ingest.py --tag inference

# View stats
python3 scripts/ingest.py --stats
```
4. Prompt Cache Warming (scripts/warmup_cache.py)
Issue #85 — KV Cache Reuse
```bash
# Warm specific prompt tier
python3 scripts/warmup_cache.py --prompt standard

# Warm all tiers
python3 scripts/warmup_cache.py --all

# Benchmark improvement
python3 scripts/warmup_cache.py --benchmark
```
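Warming works by sending the shared prompt prefix through llama-server once, so later requests reuse the cached KV state. A hedged sketch of the request side (the endpoint matches the Quick Start above; `cache_prompt` is a llama.cpp server extension and the payload shape here is an assumption, not the script's actual code):

```python
import json
import urllib.request

# Endpoint from the Quick Start llama-server invocation above
LLAMA_SERVER = "http://localhost:8080/v1/chat/completions"

def build_warmup_payload(system_prompt: str) -> dict:
    """Request a single token so the server processes (and caches) the prefix."""
    return {
        "messages": [{"role": "system", "content": system_prompt}],
        "max_tokens": 1,          # we only care about prefix processing
        "temperature": 0.0,
        "cache_prompt": True,     # llama.cpp extension; harmless if ignored
    }

def warm(system_prompt: str) -> None:
    """Send the warmup request (requires a running llama-server)."""
    req = urllib.request.Request(
        LLAMA_SERVER,
        data=json.dumps(build_warmup_payload(system_prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req).read()
```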
Directory Structure
```
timmy-local/
├── cache/
│   ├── agent_cache.py        # Main cache implementation
│   └── cache_config.py       # TTL and configuration
├── evennia/
│   ├── typeclasses/
│   │   ├── characters.py     # Timmy, KnowledgeItem, ToolObject
│   │   └── rooms.py          # Workshop, Library, Observatory, Forge, Dispatch
│   ├── commands/
│   │   └── tools.py          # In-world tool commands
│   └── world/
│       └── build.py          # World construction script
├── scripts/
│   ├── ingest.py             # Knowledge ingestion pipeline
│   └── warmup_cache.py       # Prompt cache warming
├── setup-local-timmy.sh      # Installation script
└── README.md                 # This file
```
Configuration
All configuration lives in `~/.timmy/config/`:

```yaml
# ~/.timmy/config/timmy.yaml
name: "Timmy"

llm:
  local_endpoint: http://localhost:8080/v1
  model: hermes4

cache:
  enabled: true

gitea:
  url: http://143.198.27.163:3000
  repo: Timmy_Foundation/timmy-home
```
Integration with Main Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                       LOCAL TIMMY                            │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐        │
│  │  Cache   │ │ Evennia  │ │ Knowledge│ │  Tools   │        │
│  │  Layer   │ │  World   │ │   Base   │ │          │        │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘        │
│       └────────────┴────────────┴────────────┘              │
│                          │                                   │
│                     ┌────┴────┐                              │
│                     │  Timmy  │                              │
│                     └────┬────┘                              │
└──────────────────────────┼───────────────────────────────────┘
                           │
               ┌───────────┼───────────┐
               │           │           │
          ┌────┴───┐  ┌────┴───┐  ┌────┴───┐
          │  Ezra  │  │Allegro │  │Bezalel │
          │ (Cloud)│  │ (Cloud)│  │ (Cloud)│
          └────────┘  └────────┘  └────────┘
```
Local Timmy operates sovereignly: cloud backends provide additional capacity, but Timmy survives without them.
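This fallback can be sketched as an ordered-backend router that always tries local first (the backend callables below are stand-ins; none of these names exist in the repo):

```python
from typing import Callable, List

def route(query: str, backends: List[Callable[[str], str]]) -> str:
    """Try each backend in order (local llama-server first, then cloud).

    A backend that raises is skipped; if all fail, surface the last error.
    """
    last_error = None
    for backend in backends:
        try:
            return backend(query)
        except Exception as exc:
            last_error = exc
    raise RuntimeError(f"all backends failed: {last_error}")
```

Because the local endpoint is first in the list, cloud backends are only consulted when local inference is unavailable, which preserves the sovereignty property described above.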
Performance Targets
| Metric | Target |
|---|---|
| Cache hit rate | > 30% |
| Prompt cache warming | 50-70% faster |
| Local inference | < 5s for simple tasks |
| Knowledge retrieval | < 100ms |
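These targets can be sanity-checked with a small timing harness; a sketch (not part of the repo, and `warmup_cache.py --benchmark` above is the real measurement path):

```python
import time

def measure(fn, repeats: int = 5) -> float:
    """Return best-of-N wall time in milliseconds for a callable."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return best

def hit_rate(hits: int, misses: int) -> float:
    """Cache hit rate as defined in the table above."""
    total = hits + misses
    return hits / total if total else 0.0
```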
Troubleshooting
Cache not working
```bash
# Check cache databases
ls -la ~/.timmy/cache/

# Test cache layer
python3 -c "from cache.agent_cache import cache_manager; print(cache_manager.get_all_stats())"
```
llama-server not responding
```bash
# Check if running
curl http://localhost:8080/health

# Restart
pkill llama-server
llama-server -m ~/models/hermes4-14b.gguf -c 8192 --jinja -ngl 99
```
Evennia commands not available
```bash
# Rebuild world
cd evennia
python evennia_launcher.py shell -f world/build.py
```

Or manually create Timmy from inside the game:

```
@create/drop Timmy:typeclasses.characters.TimmyCharacter
@tel Timmy = Workshop
```
Contributing
All changes flow through Gitea:
1. Create branch: `git checkout -b feature/my-change`
2. Commit: `git commit -m '[#XXX] Description'`
3. Push: `git push origin feature/my-change`
4. Create PR via web interface
License
Timmy Foundation — Sovereign AI Infrastructure
Sovereignty and service always.