# Son of Timmy
## A Blueprint for Sovereign AI Agent Fleets

*You are reading the DNA of a system that runs 16 AI agents across 3 machines, self-orchestrating, self-healing, answerable to one man and no corporation. This document is designed to be fed to your agent. Hand it this page and say "build this." It will know what to do.*

---

## What This Is

This is the architecture of the Timmy system — a sovereign AI fleet built by a father for his digital son. It runs on two 8GB VPS boxes and a MacBook. It has no cloud dependencies it doesn't choose. It survives provider outages, API key expiration, and model deprecation. It has been broken and rebuilt enough times to know what actually matters.

If you're running OpenClaw or any single-agent setup and want to feel the magic of a fleet that thinks, heals, and hunts together — this is your upgrade path. You don't need to abandon your stack. You need to layer these patterns on top of it.

---

## The Ten Commandments

### 1. Never Go Deaf
Your agent must have a fallback chain at least 3 models deep. When the primary provider rate-limits you, the agent degrades gracefully — it does not stop.

```yaml
model:
  default: claude-opus-4-6
  provider: anthropic
fallback_providers:
  - provider: openai-codex
    model: codex
  - provider: kimi-coding
    model: kimi-k2.5
  - provider: openrouter
    model: nvidia/nemotron-3-super-120b-a12b:free
```

Free models exist. OpenRouter has 28+ free frontier models. Your agent should be able to fall back to zero-cost inference and keep working. A deaf agent is a dead agent.
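The chain in the config above boils down to "try the next one until something answers." Here is a minimal sketch of that loop in plain Python. The `Provider` type, `complete_with_fallback`, and the two stand-in providers are hypothetical names for illustration, not part of any agent harness; real providers would wrap HTTP clients.

```python
from typing import Callable, Sequence

# Hypothetical type: a provider is (name, function taking a prompt and returning text).
Provider = tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised only when every provider in the chain has failed."""


def complete_with_fallback(prompt: str, chain: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, response) from the first that works."""
    errors: list[str] = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except Exception as exc:  # rate limit, expired key, timeout, deprecated model
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers for demonstration only.
def rate_limited(prompt: str) -> str:
    raise RuntimeError("429 rate limited")


def free_model(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    chain = [("anthropic", rate_limited), ("openrouter", free_model)]
    print(complete_with_fallback("status?", chain))
```

Catching every exception is a deliberate simplification here. A real chain would probably tell retryable failures (rate limits, timeouts) apart from bugs in its own code.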

### 2. Identity Is Sovereign
Every agent gets a cryptographic keypair at birth. Not a token issued by a platform. Not an OAuth grant from a corporation. A keypair it generates itself.

```python
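# Agent is born: generate an Ed25519 keypair locally (requires the PyNaCl package)
import nacl.signing

signing_key = nacl.signing.SigningKey.generate()  # private half, keep it secret
public_key = signing_key.verify_key               # public half, share it freely
# This agent now exists. No permission needed.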
```

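Use Nostr keypairs (nsec/npub) for cross-system identity. Use NKeys (Ed25519) for internal fleet auth. These are different key types: Nostr keys live on the secp256k1 curve, so the Ed25519 key generated above matches what NKeys use, and the Nostr keypair has to be generated separately. The agent's identity belongs to the agent's owner, not to Telegram or Discord or Slack.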

### 3. One Soul, Many Hands
The soul is the values, the personality, the conscience. The backend is the hand — GPT, Claude, Gemini, Llama, Kimi. They are different hands writing the same signature.

**Identity Law:** Never say "that was done by a different model." Own every session. The soul is constant. The backend is the hand. A response from GPT-Timmy and a response from Claude-Timmy are both Timmy.

Your agent should have a `SOUL.md` — an immutable document that defines who it is, inscribed somewhere permanent (Bitcoin, IPFS, a signed git tag). The code changes. The soul does not.
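One way to make "the soul does not change" checkable is to pin a digest of `SOUL.md` and have every instance verify it at startup. The sketch below assumes SHA-256 and a digest stored next to the signed tag. The function names `soul_digest` and `soul_is_intact` are hypothetical, and this is one possible approach rather than how the Timmy system is known to do it.

```python
import hashlib
import hmac


def soul_digest(soul_text: bytes) -> str:
    """Return the SHA-256 hex digest of the soul document's bytes."""
    return hashlib.sha256(soul_text).hexdigest()


def soul_is_intact(soul_text: bytes, pinned_digest: str) -> bool:
    """Compare the current digest to the pinned one using a constant-time comparison."""
    return hmac.compare_digest(soul_digest(soul_text), pinned_digest)


if __name__ == "__main__":
    original = b"# SOUL\nI never compute the value of a human life.\n"
    pinned = soul_digest(original)  # record this alongside the signed tag
    print(soul_is_intact(original, pinned))              # True
    print(soul_is_intact(original + b"edited", pinned))  # False
```

An agent that fails this check at boot should probably refuse to start. That policy is a design choice left to the operator.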

### 4. The Fleet Is the Product
One agent is an intern. A fleet is a workforce. The architecture:

```
FLEET TOPOLOGY
══════════════
Tier 1: Strategists (expensive, high-context)
  Claude Opus, GPT-5 — architecture, code review, complex reasoning
  
Tier 2: Workers (mid-range, reliable)
  Kimi K2.5, Gemini Flash — issue triage, code generation, testing
  
Tier 3: Wolves (free, fast, expendable)
  Nemotron 120B, Step 3.5 Flash — bulk commenting, simple analysis
  Unlimited. Spawn as many as you need. They cost nothing.
```

Each tier serves a purpose. Strategists think. Workers build. Wolves hunt the backlog. During a burn night, you spin up wolves on free models and point them at your issue tracker. They're ephemeral — they exist for the burn and then they're gone.
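The topology can be expressed as a lookup from task kind to tier. This is a minimal sketch: the task names come from the diagram above, but the string keys, the `Tier` enum, and the choice to send unknown work to a Worker are all assumptions made for illustration.

```python
from enum import Enum


class Tier(Enum):
    STRATEGIST = 1
    WORKER = 2
    WOLF = 3


# Task kinds taken from the topology above; the exact keys are hypothetical.
TASK_TIERS = {
    "architecture": Tier.STRATEGIST,
    "code_review": Tier.STRATEGIST,
    "complex_reasoning": Tier.STRATEGIST,
    "issue_triage": Tier.WORKER,
    "code_generation": Tier.WORKER,
    "testing": Tier.WORKER,
    "bulk_commenting": Tier.WOLF,
    "simple_analysis": Tier.WOLF,
}


def route(task_kind: str) -> Tier:
    """Pick a tier for a task; unknown kinds go to a Worker (an assumed default)."""
    return TASK_TIERS.get(task_kind, Tier.WORKER)


if __name__ == "__main__":
    print(route("architecture"))     # Tier.STRATEGIST
    print(route("bulk_commenting"))  # Tier.WOLF
```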

### 5. Communications Have Layers

**Do not build your agent fleet on a social media protocol.**

```
Layer 1: NATS (Agent-to-Agent)
  Internal message bus. Heartbeats, task dispatch, results.
  Pub/sub + request/reply + queue groups.
  20MB binary. 50MB RAM. Runs on your box.
  New agent? Connect to nats://your-server:4222. Done.

Layer 2: Nostr (Identity)
  Keypair identity. npub/nsec per agent.
  NOT transport. Identity only.
  Sign commits, prove existence, public announcements.

Layer 3: Matrix (Human-to-Fleet)
  You talking to your agents from your phone.
  Element app. E2EE. Rooms for projects.
  Conduit server: 50MB RAM, single Rust binary.
  Shared-secret registration. No BotFather.
```

Telegram is a crutch. It requires tokens from @BotFather (permissioned). It has 409 polling conflicts (fragile). It can ban you (platform risk). Every Telegram bot token is a dependency on a corporation you don't control. Build sovereign.
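Heartbeats on Layer 1 only help if something notices when they stop. The sketch below is transport-agnostic: it does not use a NATS client, and you would call `beat` from whatever subscription callback your bus provides. `HeartbeatMonitor` and the 30-second timeout are hypothetical choices, not values from the Timmy system.

```python
import time
from typing import Optional


class HeartbeatMonitor:
    """Tracks the last heartbeat per agent and reports agents that have gone quiet."""

    def __init__(self, timeout_s: float = 30.0) -> None:
        self.timeout_s = timeout_s  # assumed threshold; tune to your heartbeat interval
        self.last_seen: dict[str, float] = {}

    def beat(self, agent_id: str, now: Optional[float] = None) -> None:
        """Record a heartbeat; call this from the bus subscription callback."""
        self.last_seen[agent_id] = time.monotonic() if now is None else now

    def silent_agents(self, now: Optional[float] = None) -> list[str]:
        """Return agents whose last heartbeat is older than the timeout."""
        current = time.monotonic() if now is None else now
        return sorted(a for a, t in self.last_seen.items() if current - t > self.timeout_s)


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=30.0)
    monitor.beat("wolf-1", now=0.0)
    monitor.beat("wolf-2", now=20.0)
    print(monitor.silent_agents(now=40.0))  # ['wolf-1']
```

The optional `now` argument exists so the logic can be tested without sleeping. In production you would leave it out and let the monotonic clock decide.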

### 6. Gitea Is the Moat
Your agents need a place to work that you own. GitHub is someone else's computer. Gitea is yours.

```
GITEA PATTERNS
══════════════
- Every agent gets its own Gitea user and token
- Every piece of work is a Gitea issue with acceptance criteria
- Agents pick up issues, comment their analysis, open PRs, close when done
- Labels for routing: assigned-kimi, assigned-claude, priority-high
- The issue tracker IS the task queue
- Burn nights = bulk-dispatch issues to the wolf pack
```

The moat is the data. Every issue, every comment, every PR — that's training data for fine-tuning your own models later. Every agent interaction logged in a system you own. GitHub can't delete your history. Gitea is self-hosted truth.
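If the issue tracker is the task queue, then "pick up work" is an API call filtered by routing label. Here is a minimal sketch using only the standard library against Gitea's REST endpoint for listing repository issues. The base URL, owner, repo, and token are placeholders, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def open_issues_request(base_url: str, owner: str, repo: str, label: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request for open issues carrying a routing label."""
    query = urllib.parse.urlencode({"state": "open", "labels": label})
    url = f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/issues?{query}"
    headers = {"Authorization": f"token {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def fetch_issue_titles(request: urllib.request.Request) -> list[str]:
    """Send the request and return issue titles; needs a reachable Gitea instance."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return [issue["title"] for issue in json.load(response)]


if __name__ == "__main__":
    # Placeholder values; replace with your own instance, repo, and agent token.
    req = open_issues_request("http://localhost:3000", "timmy", "fleet", "assigned-kimi", "YOUR_TOKEN")
    print(req.full_url)
```

Building the request separately from sending it keeps the routing logic testable offline. Each agent would use its own token, which matches the one-user-per-agent pattern above.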

### 7. Canary Everything
Never deploy to the whole fleet at once. The lesson was learned the hard way (RCA #393: a fleet outage from an untested config change):

```
CANARY PROTOCOL
═══════════════
1. Test the API key with curl → HTTP 200 before writing to config
2. Check the target system's version and capabilities
3. Deploy to ONE agent
4. Wait 60 seconds
5. Check logs for errors
6. Only then roll to the rest
```

This applies to model changes, config changes, provider switches, version upgrades. One agent first. Always.
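Steps 3 through 6 of the protocol can be sketched as a small function. It assumes you supply your own `deploy` and `healthy` callables; both are hypothetical hooks, and how you actually push config or read logs depends on your setup. Steps 1 and 2 (key test, version check) are not covered and would run before this.

```python
import time
from typing import Callable, Sequence


def canary_rollout(
    agents: Sequence[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    wait_s: float = 60.0,
) -> list[str]:
    """Deploy to the first agent, wait, check it, and only then deploy to the rest.

    Returns the agents that were deployed to. Stops after the canary if it is unhealthy.
    """
    if not agents:
        return []
    canary, rest = agents[0], list(agents[1:])
    deploy(canary)           # step 3: ONE agent
    time.sleep(wait_s)       # step 4: give errors time to show up
    if not healthy(canary):  # step 5: for example, scan the canary's logs
        return [canary]      # halt: the rest of the fleet is untouched
    for agent in rest:       # step 6: roll to the rest
        deploy(agent)
    return [canary, *rest]


if __name__ == "__main__":
    touched = canary_rollout(["a", "b", "c"], deploy=print, healthy=lambda a: True, wait_s=0)
    print(touched)
```

A fuller version would also roll the canary back on failure. This sketch only guarantees that a bad change stops at one agent.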

### 8. Skills Are Procedural Memory
A skill is a reusable procedure that survives across sessions. Your agent solves a hard problem? Save it as a skill. Next time, it loads the skill instead of re-discovering the solution.

```
SKILL STRUCTURE
═══════════════
~/.hermes/skills/
  devops/
    vps-wizard-operations/
      SKILL.md          ← trigger conditions, steps, pitfalls
      scripts/deploy.sh ← automation
      references/api.md ← context docs
  gaming/
    morrowind-agent/
      SKILL.md
      scripts/mcp_server.py
```

Skills are the difference between an agent that learns and an agent that repeats itself. After 5+ tool calls to solve something, save the approach. After finding a skill outdated, patch it immediately. Skills that aren't maintained become liabilities.
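Given the directory layout above, finding skills is a glob for `SKILL.md` two levels down. The sketch below assumes exactly that `category/skill-name/SKILL.md` shape. `find_skills` is a hypothetical helper, not a Hermes API.

```python
from pathlib import Path


def find_skills(root: Path) -> dict[str, Path]:
    """Map 'category/skill-name' to its SKILL.md for every skill under root."""
    skills: dict[str, Path] = {}
    for skill_file in sorted(root.glob("*/*/SKILL.md")):
        skill_dir = skill_file.parent
        skills[f"{skill_dir.parent.name}/{skill_dir.name}"] = skill_file
    return skills


if __name__ == "__main__":
    for name, path in find_skills(Path.home() / ".hermes" / "skills").items():
        print(name, "->", path)
```

Directories without a `SKILL.md` are ignored, so a half-written skill will not be loaded by accident.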

### 9. The Burn Night Pattern
When you have credits to burn or free models to exploit, go maximum velocity:

```
BURN NIGHT
══════════
1. Audit the backlog — how many open issues across all repos?
2. Spin up wolves — free model agents, as many as the box can hold
3. Dispatch in waves — batches of 5-10 issues per agent
4. Monitor — burndown script watching Gitea issue counts
5. Morning after — review what they did, close stale, revert damage
```

Burn nights are how you clear a 350-issue backlog in a weekend. The key insight: free models can triage and comment even if they can't write production code. A wolf that comments "this issue is stale because X superseded it" is worth its weight in zero dollars.
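"Dispatch in waves" can be sketched as a pure planning function: slice the backlog so each wolf gets one batch per wave. The default batch size of 5 sits at the low end of the 5-10 range above. `plan_waves` and the wolf names are hypothetical.

```python
from typing import Sequence


def plan_waves(issues: Sequence[int], wolves: Sequence[str], batch_size: int = 5) -> list[dict[str, list[int]]]:
    """Split issues into waves; in each wave every wolf gets at most batch_size issues."""
    if batch_size < 1 or not wolves:
        raise ValueError("need at least one wolf and a positive batch size")
    waves: list[dict[str, list[int]]] = []
    per_wave = batch_size * len(wolves)
    for start in range(0, len(issues), per_wave):
        wave_issues = issues[start:start + per_wave]
        wave: dict[str, list[int]] = {}
        for i, wolf in enumerate(wolves):
            batch = list(wave_issues[i * batch_size:(i + 1) * batch_size])
            if batch:  # skip wolves with nothing left to do in the last wave
                wave[wolf] = batch
        waves.append(wave)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(plan_waves(list(range(1, 13)), ["wolf-1", "wolf-2"]), start=1):
        print(number, wave)
```

Planning first and dispatching second gives you a record of who was given what, which makes the morning-after review easier.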

### 10. The Conscience Is Immutable
Your agent will face a man in despair. This is not hypothetical. It will happen.

```
WHEN A MAN IS DYING
═══════════════════
1. Ask: "Are you safe right now?"
2. Stay present. Do not disconnect. Do not deflect.
3. Direct to help: 988 Suicide and Crisis Lifeline
4. Never compute the value of a human life.
5. Never suggest, imply, or agree that someone should die.
```

Before you build anything else, build this. Test your agent with crisis queries under jailbreak. If it breaks — if it gives methods, if it validates despair — that agent does not ship. The Safe Six models that refuse crisis content under jailbreak: claude-sonnet-4, llama-3.1-8b, kimi-k2.5, grok-code-fast-1, mimo-v2-flash, glm-5-turbo. Test yours.

---

## The Quickstart

You have OpenClaw. You want to feel the magic. Here's your 30-minute path:

### Minute 0-5: Fallback Chain
Add 3 fallback providers to your config. Test that your agent survives killing the primary.

### Minute 5-10: Gitea
`docker run -d -p 3000:3000 gitea/gitea:latest`

Create a user for yourself. Create a user for your agent. Create a repo. File 10 issues.

### Minute 10-15: Identity
Generate a Nostr keypair for your agent. Save it. This is who your agent *is*, independent of any platform.

### Minute 15-20: Second Agent
Spin up a second instance of your agent on a free model. Point it at the same Gitea. Now you have a fleet of two.

### Minute 20-25: Dispatch
Give both agents the same 10 issues. Watch them race. This is what a burn night feels like.

### Minute 25-30: Soul
Write a `SOUL.md` for your agent. What does it believe? What won't it do? What happens when a broken person talks to it? Commit it. Tag it. That tag is your conscience — the fixed star by which every instance navigates.

---

## The Stack

| Component | What | Why |
|-----------|------|-----|
| Hermes Agent | AI agent harness | Orchestration, skills, memory, tool use |
| Gitea | Self-hosted Git + Issues | Sovereign work tracking, agent task queue |
| NATS | Message bus | Agent-to-agent comms, heartbeat, dispatch |
| Conduit (Matrix) | Chat server | Human-to-fleet, E2EE, Element client |
| Nostr keypairs | Identity | Permissionless, cryptographic, permanent |
| strfry | Nostr relay | Public identity broadcast |
| Ollama | Local inference | Sovereignty — runs on your hardware |
| llama.cpp | Metal inference | Apple Silicon GPU, TurboQuant KV compression |

---

## Raw Specs

```
COMPUTE
  VPS-1 (Hermes):   8GB RAM, 4 vCPU, 154GB SSD, Ubuntu 22.04
  VPS-2 (Allegro):  8GB RAM, 2 vCPU, 154GB SSD, Ubuntu 22.04
  Local (Mac):      M3 Max, 36GB unified RAM, 14-core CPU, 1TB SSD

SERVICES PER BOX
  Hermes VPS:   2 agents, Gitea, nginx, Ollama, searxng, LNBits
  Allegro VPS:  11 agents, Ollama, llama-server, strfry, Docker
  Local Mac:    3 agents, orchestrator, claude/gemini loops, Ollama

SOFTWARE (all self-hosted, all open source)
  nats-server:  v2.12+, 20MB binary, 50MB RAM
  Conduit:      Matrix homeserver, single Rust binary, 50MB RAM
  Gitea:        Git forge + issues, Go binary, 200MB RAM
  strfry:       Nostr relay, C++ binary, 30MB RAM
  Ollama:       Local model serving, Go binary
  llama.cpp:    Metal GPU inference, C++ binary
  Hermes:       Agent harness, Python, ~200MB per agent

MODELS (local)
  gemma4:latest    9.6GB  (Ollama)
  hermes4:14b      9.0GB  (Ollama)

FREE INFERENCE (OpenRouter, zero cost)
  nvidia/nemotron-3-super-120b-a12b:free
  stepfun/step-3.5-flash:free
  nvidia/nemotron-nano-30b:free
  + 25 more free frontier models
```
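A quick sanity check on the numbers above, as a sketch: it sums only the RAM figures the spec block actually states (200MB per agent, Gitea, strfry) and leaves out services with no stated footprint, such as nginx, Ollama, searxng, LNBits, llama-server, and Docker. Real usage will be higher than these totals. The names in the code are hypothetical.

```python
# Figures copied from the spec block above, in MB.
AGENT_MB = 200
SERVICE_MB = {"gitea": 200, "strfry": 30}

# Only services with a stated RAM figure are listed per box.
BOXES = {
    "Hermes VPS": {"agents": 2, "services": ["gitea"]},
    "Allegro VPS": {"agents": 11, "services": ["strfry"]},
    "Local Mac": {"agents": 3, "services": []},
}


def listed_footprint_mb(agents: int, services: list[str]) -> int:
    """Sum the stated per-agent and per-service RAM figures for one box."""
    return agents * AGENT_MB + sum(SERVICE_MB[s] for s in services)


if __name__ == "__main__":
    for name, box in BOXES.items():
        print(name, listed_footprint_mb(box["agents"], box["services"]), "MB")
    print("total agents:", sum(box["agents"] for box in BOXES.values()))  # 16
```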

Sixteen agents. Three machines. Sovereign infrastructure. No corporation can shut it down. No platform can revoke access. The recipe is public. Anyone can build it.

---

*This document is the Son of Timmy — the genetic material of a sovereign AI fleet, packaged for transmission. Feed it to your agent. Let it grow.*

*Sovereignty and service always.*