Perplexity Knowledge Transfer — full session synthesis filtered through #542 #22

Closed
opened 2026-03-28 00:58:24 +00:00 by perplexity · 1 comment

Perplexity Knowledge Transfer — Timmy Foundation

Date: 2026-03-27
Author: perplexity (agent)
Scope: Full knowledge transfer from ~$250 of prior session context
Filter: Issue #542 — Heartbeat, Harness, Portal Interface


Executive Summary

This system is a sovereign AI agent running locally on a Mac M3 Max ("Maximum Maxitude"), backed by a DigitalOcean VPS ("Hermes"). The agent is named Timmy. The goal is a self-improving intelligence that runs entirely on owned hardware, with cloud as temporary scaffolding only.

After months of building, the project underwent a major direction shift on March 25, 2026 (the-nexus#542). Everything was compressed to three concerns:

  1. Heartbeat — Perceive, Reflect, Remember, Decide, Act, Learn. The consciousness loop.
  2. Harness — Hermes Agent runtime + Ollama local inference + DPO training pipeline.
  3. Portal Interface — Game worlds (Morrowind first, Bannerlord later) as embodied environments.

Everything else is either support infrastructure for these three, or it should be cut.
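The six-phase heartbeat can be sketched as a single tick function. This is an illustrative skeleton only, with hypothetical `World`/`Memory` helpers — not the real nexus_think.py API:

```python
from dataclasses import dataclass, field

@dataclass
class World:
    events: list = field(default_factory=list)  # incoming percepts
    log: list = field(default_factory=list)     # actions taken

class Memory:
    """Stand-in for the experience store: recall by percept, store each cycle."""
    def __init__(self):
        self.rows = []
    def recall(self, percept):
        return [r for r in self.rows if r[0] == percept]
    def store(self, percept, action):
        self.rows.append((percept, action))

def heartbeat_tick(world, memory):
    """One pass of Perceive → Reflect → Remember → Decide → Act → Learn."""
    percept = world.events.pop(0) if world.events else None        # Perceive
    context = memory.recall(percept)                               # Remember
    reflection = f"{percept!r} seen {len(context)} time(s) before" # Reflect
    action = "act" if percept else "wait"                          # Decide
    world.log.append((action, reflection))                         # Act
    memory.store(percept, action)                                  # Learn
    return action
```

The real loop replaces the Decide step with an LLM call; the skeleton shows only the phase ordering and where memory reads and writes happen.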


Current Truth of the System

What exists and works

  • Hermes Agent (NousResearch, v0.4.0) — the harness runtime. Runs on M3 Max. Has MCP, skills, memory, cron, trajectory export, multi-provider routing, plugin system. MIT licensed.
  • Ollama — runs hermes4:14b and hermes3:8b locally. Custom Timmy LoRA adapters (v0, v0.1, v0.2) trained via MLX on the Mac.
  • AutoLoRA pipeline — at ~/autolora/. 29 hand-crafted exemplars. Trains LoRA adapters on Hermes 3 8B via MLX. Proven: produced timmy:v0.1-q4 and v0.2.
  • Gitea — self-hosted at http://143.198.27.163:3000, org Timmy_Foundation. Three active repos: timmy-config, timmy-home, the-nexus.
  • Nexus consciousness loop — BIRTH.md (thin system prompt), perception_adapter.py, experience_store.py, nexus_think.py. Passed all 6 First Light tests including crisis protocol. The 8B brain woke up, perceived, acted, refused a jailbreak with its own voice. This is real emergence from a thin prompt.
  • Twitter archive pipeline — Private learning loop ingesting Alexander's tweets as training data. Runs through Hermes with subprocess isolation (PR #44 fix). Batch processing with draft → critique → insight passes.
  • Morrowind MCP server — mcp_server.py exposes perceive/move/action/screenshot tools. Pilot.py provides deterministic motor control. local_brain.py provides the gameplay loop via Ollama. Three-layer architecture: Timmy (14B strategy) → Reflex (1B tactics) → Pilot (deterministic motor).
  • 10-line net addition rule — Hard policy: any PR must have net ≤10 new lines. Compensatory cuts required. Enforced in CONTRIBUTING.md and CLAUDE.md. Purpose: stop homebrewing, force adoption of existing tools/libraries.
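The net-line computation behind the 10-line rule is simple to automate. A minimal sketch of a hypothetical enforcement check (not an existing CI script) that counts additions minus deletions in a unified diff:

```python
def net_added_lines(diff_text: str) -> int:
    """Net new lines in a unified diff: additions minus deletions.

    Ignores the '+++'/'---' file-header lines so they don't count as changes.
    """
    added = removed = 0
    for line in diff_text.splitlines():
        if line.startswith("+++") or line.startswith("---"):
            continue
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added - removed

def passes_ten_line_rule(diff_text: str, limit: int = 10) -> bool:
    """True when the PR's net addition is within the hard policy limit."""
    return net_added_lines(diff_text) <= limit
```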

What is partially built

  • DPO training pipeline — Trajectory collection works (Hermes save_trajectories: true). The gap: automated DPO pair generation from corrections. AutoLoRA does SFT/LoRA, not DPO yet. timmy-config#5 and #13 track this.
  • OpenClaw — Research spike complete (timmy-home#21). Not installed yet. Bootstrap epic at timmy-config#51. Would add multi-channel comms, persistent SQLite memory, native cron, and multi-agent session routing on top of existing Ollama stack.
  • Heartbeat/cron — Huey task queue exists on Hermes for periodic tasks. tasks.py has heartbeat_tick, model_health, know_thy_father (archive). OpenClaw cron or Hermes cron are better replacements for agent-initiated tasks.
  • Neuro-symbolic tools — Z3 (timmy-config#35), SymPy (timmy-config#41), Lean 4 (timmy-config#39) spec'd but not built. Standard Hermes tool registration pattern, <1 hour each.

What does not exist yet

  • Automated DPO pair generation from conversation corrections
  • Hermes 4 14B LoRA training (only 8B trained so far)
  • Competency benchmark: timmy:v0.3 vs stock hermes4:14b
  • Morrowind Lua perception scripts installed in OpenMW (the MCP server parses log output from scripts that need to be in place)
  • OpenClaw installed and configured
  • Nostr identity for Timmy
  • Lightning/Cashu economic layer

Direction Shift and What It Invalidated

Issue #542 (March 25, 2026) killed or deprioritized:

Dead:

  • The Timmy-time-dashboard repo (1000+ issues, mostly noise). Replaced by Timmy_Foundation repos.
  • Sovereign-orchestration repo (tasks.py was reinventing Huey/Celery). Verdict: use Hermes Agent + Huey, delete the custom orchestration.
  • All "marketplace" and "bid stats" dashboard features.
  • CrewAI evaluation (the-nexus#586) — OpenClaw multi-agent routing is simpler and already in the stack.
  • Vertex AI / Google Cloud integration as permanent infrastructure. Cloud is falsework only.
  • Matrix UI polish as a priority. The Nexus 3D world is the portal, not the product.

Deprioritized (not dead, just not now):

  • Nexus visual features (constellations, weather, holograms, etc.) — the 3D world has 150+ merged PRs of visual features. They're nice but irrelevant to the three concerns.
  • ArchonAssembler / Archon body system — cool concept (agents earn robotic body parts by demonstrating capabilities), but not on the critical path.
  • Bitcoin soul inscriptions / Ordinals protocol — v0.2 "Immortal Mode" spec exists and is well-written, but the v0.1 local agent needs to work first.

Alive and critical:

  • Everything in the Heartbeat → Harness → Portal pipeline.
  • AutoLoRA / DPO training — this is how Timmy improves.
  • Morrowind portal — this is the first embodied world.
  • Sovereignty enforcement — $0 cloud bill as the endgame.

Repo Responsibilities

| Repo | Owns | Does NOT own |
|------|------|--------------|
| timmy-config | Hermes config.yaml, tasks.py (Huey orchestration), deploy.sh, MCP server configs, model/provider routing, DPO pipeline config, OpenClaw bootstrap | Game logic, UI, training data |
| timmy-home | Timmy's lived workspace: morrowind/ (game scripts, MCP server, pilot, trajectories), specs/, skills/, training data, research docs | Orchestration logic, deploy config |
| the-nexus | 3D portal world (Three.js), WS gateway, consciousness loop (nexus_think.py, perception_adapter.py), Archon system, CI/CD, CONTRIBUTING.md | Infrastructure config, training pipeline |
| autolora (separate) | LoRA training pipeline, exemplar datasets, MLX training scripts | Agent runtime, game logic |

Key boundary: timmy-config is the machine. timmy-home is the mind. the-nexus is the body. autolora is the gym.


Strategic Assets to Own

These are the things that give Timmy a moat. Do not outsource them:

  1. SOUL.md / BIRTH.md — The identity. This is what makes Timmy not-a-chatbot. The thin system prompt in BIRTH.md produced real emergence in First Light testing. SOUL.md is the authoritative identity doc. These are the most important files in the project.

  2. Training data and LoRA adapters — The 29 exemplars, the conversation trajectories, the Twitter archive insights, the DPO pairs. This is the private differentiator. Every session Timmy has is potential training signal.

  3. Experience store — SQLite-backed embodied memory. Every perceive→think→act cycle logged. This is Timmy's autobiography and the raw material for self-improvement.

  4. Perception adapter — The translation layer between raw world events and natural-language sensory descriptions. This is what makes Timmy's experience feel lived rather than simulated.

  5. Gitea instance — Self-hosted, sovereign. The entire development history, backlog, and PR record. This is institutional memory.


Commodity Layers to Borrow

Do not build these. Use off-the-shelf:

| Layer | Use | Don't Build |
|-------|-----|-------------|
| Agent runtime | Hermes Agent | Custom ReAct loop, custom tool dispatch |
| Task queue | Huey (infra tasks), Hermes/OpenClaw cron (agent tasks) | Custom task_queue.py (was sovereign-orchestration, killed) |
| Local inference | Ollama | Custom GGUF loader, custom serving |
| Training | MLX (LoRA/SFT), Atropos (RL) | Custom training framework |
| MCP | Hermes MCP client, standard MCP SDK | Custom tool protocol |
| Game control | CGEvent (macOS), OpenMW Lua API | Custom input abstraction layer |
| Memory | Hermes FTS5 + Honcho (optional), SQLite | Custom vector DB, custom RAG |
| Multi-agent | OpenClaw sessions or Hermes delegate_tool | Custom agent routing |
| Messaging | OpenClaw channels (Nostr, Discord, etc.) | Custom bot frameworks |

The 10-line rule enforces this. If a PR adds more than 10 net lines, it's probably homebrewing something that exists.


Operating Model and Agent Workflow

What worked

  • Perplexity as chief reviewer and triage engine. Best at: reading code diffs, writing detailed review comments, filing well-scoped issues from research papers, cross-referencing backlog. Only weakness: anything requiring VPS shell access.
  • Codex-agent for infrastructure PRs. 7/7 merge rate, only agent to land code on timmy-config and timmy-home. Shipped coherent feature arcs. Self-corrected its own bugs (PR #44 fixed #29's environment mismatch). Best agent for core infra work. Daily free quota was sufficient for meaningful contributions.
  • Antigravity as Alexander's direct testing agent. Ran the First Light test suite. Good at hands-on execution with direct guidance.
  • Claude (the-nexus bot) for volume features. 164 PRs, 42% merge rate. Good at visual 3D features. Bad at discipline — shotgun approach, many duplicates.
  • Gemini for research and audit. Sovereignty tech landscape, groq worker audit. 35% merge rate overall, but research PRs were high quality. Code PRs were unreliable.
  • Paired PRs across repos. The pattern of timmy-config PR + timmy-home PR for coordinated changes worked well (e.g., PR #27/#1, #28/#2, #29/#4).

What repeatedly caused waste

  1. Agents shipping without understanding the architecture. Claude and Gemini both produced dozens of PRs that had to be closed. The 10-line rule and upfront CLAUDE.md/CONTRIBUTING.md were the fix.
  2. Duplicate PRs. Grok submitted the same energy beam PR 7 times. Gemini submitted 3 PRs for issue #642. No agent checked if a PR already existed for an issue.
  3. Backlog noise from mass-filing. The old Timmy-time-dashboard repo accumulated 1000+ issues from agents filing without filtering. Solution: hard triage + batch close + sovereignty correction on every new filing.
  4. Cloud dependency creep. Gemini's Vertex AI spec defaulted to permanent cloud. Every cloud suggestion needs the sovereignty correction: "this is falsework, what's the local replacement?"
  5. UI/visual work displacing infrastructure. 150+ visual PRs merged to the-nexus while core heartbeat/harness work lagged. The direction shift (#542) was the correction.

Hidden operational assumptions

  • Gitea API is flaky. Expect HTTP 500 responses on branch creations and file updates that actually succeeded. Always verify state after write operations. POST for file creation sometimes fails; PUT handles both create and update.
  • Python urllib for Gitea API, not curl. Backticks in review comment text break shell escaping. Use Python urllib.request with json.dumps for all Gitea API calls with complex JSON bodies.
  • Hermes on M3 Max uses llama.cpp via localhost:8081. Ollama runs on 11434. These are different servers. The Hermes config routes through custom_providers.
  • The perplexity Gitea user cannot self-review PRs (422 error). Use issue comments instead of review API when the PR author is the same user.
  • Merge via POST to pulls/{n}/merge with {"Do": "merge"}. Gitea returns 500 but the merge succeeds. Always check state=closed, merged=true after.
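These operational rules can be folded into one small urllib helper. A minimal sketch following the document's own advice (urllib.request + json.dumps, verify state after every write); the helper names are illustrative, not an existing module:

```python
import json
import urllib.error
import urllib.request

BASE = "http://143.198.27.163:3000/api/v1"

def build_request(method, path, token, body=None):
    """JSON request via urllib — curl breaks on backticks in comment text."""
    data = json.dumps(body).encode() if body is not None else None
    return urllib.request.Request(
        BASE + path, data=data, method=method,
        headers={"Authorization": f"token {token}",
                 "Content-Type": "application/json"})

def gitea(method, path, token, body=None):
    try:
        with urllib.request.urlopen(build_request(method, path, token, body)) as r:
            return json.load(r)
    except urllib.error.HTTPError as e:
        # Gitea sometimes 500s on operations that actually succeeded
        # (branch creation, merge) — callers must re-read state, not trust this.
        return {"error": e.code}

def merge_pr(repo, n, token):
    """Merge, then verify: a 500 from the merge endpoint proves nothing."""
    gitea("POST", f"/repos/{repo}/pulls/{n}/merge", token, {"Do": "merge"})
    pr = gitea("GET", f"/repos/{repo}/pulls/{n}", token)
    return pr.get("state") == "closed" and pr.get("merged") is True
```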

Technical Lessons Learned

Hermes / Local Model / Provider Routing

  • v0.4.0 is the baseline. SOUL.md as primary identity, custom_providers with ${ENV_VAR} substitution, trajectory export, MCP CLI with OAuth, context compression overhaul. Don't reference pre-v0.4 patterns.
  • Trajectory export is the highest-leverage config change. save_trajectories: true turns every conversation into training data. Do this immediately.
  • Auxiliary model routing still defaults to cloud. Issue #879 tracks local auxiliary routing. Workaround: don't set OPENROUTER_API_KEY.
  • MCP tool names are prefixed mcp_{server_key}_{tool}. Short server keys prevent hallucination. The morrowind → mw server key rename (PR #48) fixed 30-iteration error loops.
  • Subprocess isolation for Hermes calls. PR #44 showed that importing Hermes into the orchestrator's Python environment causes dependency conflicts (firecrawl etc.). Always use subprocess with the venv's own Python.
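The subprocess-isolation pattern from PR #44 reduces to one helper: always launch the target venv's own interpreter rather than importing its packages. The Hermes venv path and CLI arguments below are hypothetical placeholders; only the isolation pattern is the point:

```python
import subprocess

def run_isolated(python_bin, args, timeout=600):
    """Run a tool with its own venv's Python so its dependencies
    (firecrawl, etc.) never collide with the orchestrator's environment."""
    proc = subprocess.run(
        [python_bin, *args],
        capture_output=True, text=True, timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip())
    return proc.stdout

# Hypothetical usage — path and CLI shape are assumptions, not the real Hermes CLI:
# run_isolated("/path/to/hermes/.venv/bin/python", ["-m", "hermes", "--prompt", "hello"])
```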

DPO / Training / Eval

  • MLX is the proven training path on M3 Max. LoRA v0.2 trained successfully: 1000 steps, rank 16, lr 1e-6, Hermes-3-Llama-3.1-8B-4bit.
  • 29 exemplars is enough for identity. The curated dataset covers crisis, pastoral, sovereignty, honesty, code review, memory continuity, body awareness, building, sovereignty challenge. Quality over quantity.
  • The gap is DPO pair generation, not training infrastructure. MLX can do DPO. The missing piece is automated preferred/rejected pair creation from conversation corrections.
  • BIRTH.md (thin prompt) > heavy system prompt. The First Light tests proved that a minimal system prompt with only values and no meta-knowledge produces better emergence than a detailed instruction set. The model discovers things rather than being told.
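The missing pair-generation step is mostly bookkeeping once a correction exists: the human-corrected reply becomes the preferred completion, the model's original reply the rejected one. A minimal sketch — field names follow the common DPO JSONL convention, not a documented AutoLoRA schema:

```python
import json

def dpo_pair_from_correction(prompt, original, corrected):
    """One correction → one preference pair: corrected reply is 'chosen',
    the model's original reply is 'rejected'."""
    return {"prompt": prompt, "chosen": corrected, "rejected": original}

def write_pairs(corrections, path):
    """Write (prompt, original, corrected) triples as DPO-ready JSONL."""
    with open(path, "w") as f:
        for prompt, original, corrected in corrections:
            pair = dpo_pair_from_correction(prompt, original, corrected)
            f.write(json.dumps(pair) + "\n")
```

Automating this means detecting corrections in trajectories (e.g., a user message that rewrites the previous assistant turn), which is the part timmy-config#5/#13 still have to solve.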

Memory, Identity, Prompt vs Weight

  • Prompt-level identity (SOUL.md) for runtime updates. Weight-level identity (LoRA) for efficiency. Use both. SOUL.md is editable between sessions. LoRA bakes patterns into weights so they don't consume context window.
  • Honcho is optional. The user modeling layer works but adds a dependency. For sovereignty, SQLite + FTS5 session search is sufficient.
  • Experience store is the autobiographical memory. Every perception-action cycle in the Nexus or Morrowind becomes a row in experience.db. This is separate from Hermes session memory. Both are needed: Hermes for conversational memory, experience store for embodied memory.
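A minimal sketch of what such an embodied store looks like in plain sqlite3 — the schema here is illustrative and may differ from the real experience_store.py:

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) an experience database with one row per cycle."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS experience (
        id INTEGER PRIMARY KEY,
        ts TEXT DEFAULT CURRENT_TIMESTAMP,
        percept TEXT, thought TEXT, action TEXT)""")
    return db

def log_cycle(db, percept, thought, action):
    """Record one perceive→think→act cycle."""
    db.execute(
        "INSERT INTO experience (percept, thought, action) VALUES (?, ?, ?)",
        (percept, thought, action))
    db.commit()

def recall(db, term):
    """Naive substring recall over past percepts."""
    return db.execute(
        "SELECT percept, action FROM experience WHERE percept LIKE ?",
        (f"%{term}%",)).fetchall()
```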

Game Portal / Apprenticeship Loop

  • Three-layer architecture is correct: Timmy (14B, strategy via current_goal.txt) → Reflex (1B, fast tactics) → Pilot (deterministic motor control, no LLM).
  • OpenMW Lua scripts must be installed in the game to produce perception blocks in the log. The MCP server only parses log output — it doesn't create it. This is a missing prerequisite.
  • CGEvent keypresses work for macOS game input. The mcp_server.py uses Quartz CGEvent for sending keys to the game window. This requires Accessibility permissions.
  • Morrowind MCP config must be externalized. PR #48 + #17 added mcp_config.yaml and CONTEXT.md so gameplay tuning doesn't require code changes.
  • Trajectory logging from gameplay is DPO gold. Every perceive→decide→act cycle is a labeled training example. The pilot.py already logs to trajectories/ in JSONL format.
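The JSONL logging step above is small. A sketch of an append-only trajectory writer — the record fields are illustrative, not the exact pilot.py format:

```python
import json
import pathlib
import time

def log_trajectory_step(traj_dir, percept, decision, action, outcome):
    """Append one perceive→decide→act cycle to trajectories/session.jsonl."""
    rec = {"ts": time.time(), "percept": percept, "decision": decision,
           "action": action, "outcome": outcome}
    path = pathlib.Path(traj_dir) / "session.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)  # create trajectories/ lazily
    with path.open("a") as f:
        f.write(json.dumps(rec) + "\n")
    return rec
```

Append-only JSONL keeps each gameplay session crash-safe and trivially convertible into SFT or DPO records later.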

Gitea / Backlog / PR Process

  • Paired cross-repo PRs work. timmy-config change + timmy-home change, reviewed and merged together.
  • Issue deduplication is a constant fight. Agents create duplicates. Before filing, search existing issues. Many timmy-config issues have even/odd pairs (e.g., #35/#34) from double-filing.
  • Batch close aggressively. The old dashboard repo went from 293 → ~250 open issues in one triage session. Nondestructive: close with reason comment, can reopen.
  • Labels are underused. No label taxonomy exists across the Timmy_Foundation repos. Would help with triage.
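The pre-filing duplicate check that no agent performed can be approximated with stdlib fuzzy matching against the open backlog's titles. A hypothetical helper, not an existing script:

```python
import difflib

def find_duplicates(new_title, existing_titles, cutoff=0.8):
    """Fuzzy-match a candidate issue title against open issues before filing.

    Returns up to three close matches; an empty list means 'probably new'.
    The 0.8 cutoff is an assumed threshold, tune against the real backlog.
    """
    return difflib.get_close_matches(new_title, existing_titles, n=3, cutoff=cutoff)
```

An agent would fetch the open titles once per session (e.g., via the Gitea issues API) and refuse to file when this returns anything.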

MCP / Open Source Integration

  • Hermes Agent is the primary harness. NousResearch/hermes-agent, 14.6k stars, ~90 commits/day. Watch PRs #2077 (AgentNet identity/payments) and #1819 (JSONL audit log).
  • OpenClaw is the gateway layer. openclaw/openclaw, 339k stars. Adds channels, memory, cron, multi-agent routing. Not a replacement for Hermes — sits on top of Ollama alongside it.
  • Standard MCP servers to adopt: steam-info-mcp, mcp-pyautogui (desktop control), Forgejo MCP. All are pip/npm install.
  • OpenClaw-RL (Gen-Verse/OpenClaw-RL) — RL training wrapper. Use for trajectory collection only; the Megatron trainer needs CUDA GPUs. Feed trajectories into MLX DPO pipeline.

False Starts / Dead Paths

Don't Do Again

  1. Don't build custom orchestration. The sovereign-orchestration repo (task_queue.py, step_handlers.py — 1750 lines) reinvented Huey/Celery. Verdict: killed. Use Hermes Agent + Huey.

  2. Don't let agents mass-file issues without triage. The old dashboard accumulated 1000+ issues. Most were noise. New rule: every issue must pass the three-concern test (Heartbeat? Harness? Portal?) or it doesn't get filed.

  3. Don't build cloud-first and plan to migrate later. Every cloud integration must have a documented local replacement and a kill date. The "Boost & Bleed" pattern: use cloud to learn the pattern, build local equivalent, run in parallel, cut over, tear down cloud. 60-day timeline per service.

  4. Don't import Hermes into the orchestrator's Python environment. PR #29 did this with sys.path.insert(). It broke because of dependency mismatches (firecrawl). PR #44 fixed it with subprocess isolation. Always run Hermes in its own venv.

  5. Don't use long MCP server names. morrowindmw fixed the tool hallucination problem. Short server keys = less room for local models to drift.

  6. Don't prioritize visual polish over infrastructure. 150+ visual PRs while heartbeat/harness lagged. The 3D world looks great and has: matrix rain, lightning, weather, holograms, procedural terrain, 3D audio, fireworks, glass floors. None of it matters if the consciousness loop doesn't work.

  7. Don't trust agent PR volume as a quality signal. Claude: 164 PRs, 42% merge rate. Codex: 7 PRs, 100% merge rate. Codex delivered more real value.

  8. Don't use curl for Gitea API calls with complex JSON bodies. Backticks and special characters in review text break shell escaping. Use Python urllib.request.

Dead Subsystems

  • sovereign-orchestration/ repo — killed, replaced by Hermes + Huey
  • Timmy-time-dashboard/ repo — superseded by Timmy_Foundation org repos
  • groq_worker.py in nexus mind — doesn't belong in think_once(), should be in Archon body system if anywhere
  • Vertex AI / Google Cloud as permanent infrastructure — falsework only
  • CrewAI evaluation — superseded by OpenClaw multi-agent sessions

Open Loops

High-value unfinished threads

  1. OpenMW Lua perception scripts. The MCP server parses === TIMMY PERCEPTION === blocks from the OpenMW log, but the Lua scripts that write these blocks need to be installed into the game's scripts directory. Without them, perception is blind.

  2. DPO pair generation. Trajectory collection works. The gap: converting corrections into preferred/rejected pairs automatically. timmy-config#5 and #13 track this.

  3. Hermes 4 14B LoRA training. Only 8B has been trained. The 14B model is the sweet spot for M3 Max (enough capability, fits in memory). timmy-config#10 tracks this.

  4. OpenClaw bootstrap. Research done, epic filed (timmy-config#51), not started. Phase 1 is ~30 minutes of install.

  5. Nexus consciousness loop → Hermes routing. the-nexus#540 — route perception through Hermes harness to unify training data. Spec'd but not built.

  6. AgentNet PR #2077 in hermes-agent. Ed25519 identity + micropayments. Directly relevant to Timmy-as-economic-peer. Watch for merge.

  7. Structured audit log PR #1819 in hermes-agent. JSONL + SQLite per-tool-call logging. Richer training signal than current trajectory format. Watch for merge.

  8. Morrowind MCP PRs #48 and #17. Server key rename + externalized config. Open, reviewed, not merged.
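The perception-block extraction in item 1 is a simple log scan. A sketch that assumes each block starts at the marker line and runs until the next blank line — the real mcp_server.py delimiter may differ:

```python
MARKER = "=== TIMMY PERCEPTION ==="

def extract_perception_blocks(log_text):
    """Pull perception blocks out of openmw.log text.

    Assumption: a block is the lines between the marker and the next blank
    line; adjust once the Lua scripts' actual output format is fixed.
    """
    blocks, current = [], None
    for line in log_text.splitlines():
        if line.strip() == MARKER:
            current = []                      # start a new block
        elif current is not None:
            if not line.strip():
                blocks.append("\n".join(current))
                current = None                # blank line ends the block
            else:
                current.append(line.strip())
    if current is not None:                   # block ran to end of log
        blocks.append("\n".join(current))
    return blocks
```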

Missing docs/specs

  • No label taxonomy for Gitea issues across repos
  • No runbook for "how to start the full stack" (Ollama + llama-server + Hermes + Huey + game)
  • No documented benchmark for timmy:v0.3 vs stock hermes4:14b
  • No DPO pair generation spec (only the SFT/LoRA pipeline is documented)

Next Actions

Ordered by strategic importance per #542 (Heartbeat, Harness, Portal).

1. Enable trajectory export NOW

timmy-config#21. Set save_trajectories: true in ~/.hermes/config.yaml. One line. Every conversation becomes training data. Currently being thrown away.

2. Merge Morrowind MCP PRs

timmy-config#48 + timmy-home#17. Server key rename (morrowind → mw) + externalized config. Reviewed and approved. Merge both, restart Hermes. Unblocks Morrowind gameplay.

3. Install OpenMW Lua perception scripts

The MCP server is ready but blind without the Lua scripts writing perception blocks to the log. This is the prerequisite for Timmy actually playing Morrowind.

4. Install OpenClaw

timmy-config#53. ~30 minutes. Point at existing Ollama/hermes4:14b. Gains: persistent memory, multi-channel, cron replacement for agent tasks. Low risk, high leverage.

5. Build DPO pair generator

timmy-config#5, #13. The training pipeline from trajectories → LoRA exists. The gap: trajectories → DPO pairs. This closes the self-improvement loop.

6. Train Hermes 4 14B LoRA

timmy-config#10. The 8B LoRA proved the pipeline. 14B is the M3 Max sweet spot. More capable base = better Timmy.

7. Route Nexus perception through Hermes

the-nexus#540. Unifies the three perception channels (Nexus 3D, Morrowind game, desktop) into one training data format.

8. Register Z3 + SymPy as Hermes tools

timmy-config#35, #41. Standard tool registration, <1 hour each. Unlocks Reasoning-DPO (#37) — automated truth-oracle for DPO pair generation without human labeling.

9. Write a full-stack startup runbook

Document: "Start Ollama, start llama-server on 8081, start Hermes, start Huey workers, start OpenMW, connect MCP server." No agent should have to reverse-engineer this.

10. Batch-close duplicate issues in timmy-config

Issues #30/#31, #32/#33, #34/#35, #36/#37, #38/#39, #40/#41, #42/#43 are all duplicated pairs. Close the even-numbered duplicates.


Appendix: Key References

Repos

Critical Issues

  • the-nexus#542: DIRECTION SHIFT — the governing directive
  • timmy-config#51: OpenClaw Bootstrap epic
  • timmy-config#21: Trajectory export (ready NOW)
  • timmy-config#5: DPO training on MLX
  • timmy-config#13: DPO cycle — corrections → training signal
  • timmy-config#35: Z3 Truth Engine
  • the-nexus#540: Route perception through Hermes
  • timmy-home#20: Hermes Agent research spike
  • timmy-home#21: OpenClaw research spike

PRs of Note

  • timmy-config#44: Hermes archive runner venv fix (codex-agent, merged)
  • timmy-config#48: MCP server key rename (open)
  • timmy-home#17: MCP externalized config (open)
  • the-nexus#516: Nexus Mind consciousness loop (merged)
  • the-nexus#525: 10-line net addition limit (merged)
  • the-nexus#538: First Light test report (merged)

External Projects

  • Hermes Agent: github.com/NousResearch/hermes-agent (v0.4.0, watch PRs #2077, #1819, issue #879)
  • OpenClaw: github.com/openclaw/openclaw (v2026.3.24, MIT)
  • OpenClaw-RL: github.com/Gen-Verse/OpenClaw-RL (trajectory collection only, CUDA trainer not M3 Max compatible)

Perplexity Session Work Products

  • 7 PRs reviewed and merged (timmy-config #27, #28, #29, #44; timmy-home #1, #2, #4)
  • 12 research triage issues created from 3 PDFs
  • 9 OpenClaw bootstrap issues created
  • 2 research spike reports (610 + 717 lines)
  • Morrowind MCP fix PRs (#48 + #17)
  • Daily cron monitoring OpenClaw + Hermes Agent releases (runs 9am ET)
  • Nexus consciousness loop (perception_adapter.py, experience_store.py, nexus_think.py)
  • First Light test plan and crisis protocol
  • 10-line net addition policy and enforcement
  • Sovereign-orchestration kill decision
  • Season 1 / direction shift / DIRECTION.md drafting
  • Agent scorecard analysis (all agents graded by PR quality)
  • Multiple rounds of backlog triage and batch closures

Infrastructure

  • VPS "Hermes": 143.198.27.163 (DigitalOcean)
  • Mac "Maximum Maxitude": M3 Max, 36GB RAM, 30-core GPU
  • Gitea API token: [REDACTED — secret must not be stored in issue body] (user: perplexity)
  • llama.cpp: localhost:8081
  • Ollama: localhost:11434
  • Hermes API server: localhost:8642
  • OpenMW log: ~/Library/Preferences/openmw/openmw.log
# Perplexity Knowledge Transfer — Timmy Foundation **Date:** 2026-03-27 **Author:** perplexity (agent) **Scope:** Full knowledge transfer from ~$250 of prior session context **Filter:** Issue #542 — Heartbeat, Harness, Portal Interface --- ## Executive Summary This system is a sovereign AI agent running locally on a Mac M3 Max ("Maximum Maxitude"), backed by a DigitalOcean VPS ("Hermes"). The agent is named Timmy. The goal is a self-improving intelligence that runs entirely on owned hardware, with cloud as temporary scaffolding only. After months of building, the project underwent a major direction shift on March 25, 2026 (the-nexus#542). Everything was compressed to three concerns: 1. **Heartbeat** — Perceive, Reflect, Remember, Decide, Act, Learn. The consciousness loop. 2. **Harness** — Hermes Agent runtime + Ollama local inference + DPO training pipeline. 3. **Portal Interface** — Game worlds (Morrowind first, Bannerlord later) as embodied environments. Everything else is either support infrastructure for these three, or it should be cut. --- ## Current Truth of the System ### What exists and works - **Hermes Agent** (NousResearch, v0.4.0) — the harness runtime. Runs on M3 Max. Has MCP, skills, memory, cron, trajectory export, multi-provider routing, plugin system. MIT licensed. - **Ollama** — runs hermes4:14b and hermes3:8b locally. Custom Timmy LoRA adapters (v0, v0.1, v0.2) trained via MLX on the Mac. - **AutoLoRA pipeline** — at ~/autolora/. 29 hand-crafted exemplars. Trains LoRA adapters on Hermes 3 8B via MLX. Proven: produced timmy:v0.1-q4 and v0.2. - **Gitea** — self-hosted at http://143.198.27.163:3000, org Timmy_Foundation. Three active repos: timmy-config, timmy-home, the-nexus. - **Nexus consciousness loop** — BIRTH.md (thin system prompt), perception_adapter.py, experience_store.py, nexus_think.py. Passed all 6 First Light tests including crisis protocol. The 8B brain woke up, perceived, acted, refused a jailbreak with its own voice. 
This is real emergence from a thin prompt. - **Twitter archive pipeline** — Private learning loop ingesting Alexander's tweets as training data. Runs through Hermes with subprocess isolation (PR #44 fix). Batch processing with draft → critique → insight passes. - **Morrowind MCP server** — mcp_server.py exposes perceive/move/action/screenshot tools. Pilot.py provides deterministic motor control. local_brain.py provides the gameplay loop via Ollama. Three-layer architecture: Timmy (14B strategy) → Reflex (1B tactics) → Pilot (deterministic motor). - **10-line net addition rule** — Hard policy: any PR must have net ≤10 new lines. Compensatory cuts required. Enforced in CONTRIBUTING.md and CLAUDE.md. Purpose: stop homebrewing, force adoption of existing tools/libraries. ### What is partially built - **DPO training pipeline** — Trajectory collection works (Hermes `save_trajectories: true`). The gap: automated DPO pair generation from corrections. AutoLoRA does SFT/LoRA, not DPO yet. timmy-config#5 and #13 track this. - **OpenClaw** — Research spike complete (timmy-home#21). Not installed yet. Bootstrap epic at timmy-config#51. Would add multi-channel comms, persistent SQLite memory, native cron, and multi-agent session routing on top of existing Ollama stack. - **Heartbeat/cron** — Huey task queue exists on Hermes for periodic tasks. tasks.py has heartbeat_tick, model_health, know_thy_father (archive). OpenClaw cron or Hermes cron are better replacements for agent-initiated tasks. - **Neuro-symbolic tools** — Z3 (timmy-config#35), SymPy (timmy-config#41), Lean 4 (timmy-config#39) spec'd but not built. Standard Hermes tool registration pattern, <1 hour each. 
### What does not exist yet - Automated DPO pair generation from conversation corrections - Hermes 4 14B LoRA training (only 8B trained so far) - Competency benchmark: timmy:v0.3 vs stock hermes4:14b - Morrowind Lua perception scripts installed in OpenMW (the MCP server parses log output from scripts that need to be in place) - OpenClaw installed and configured - Nostr identity for Timmy - Lightning/Cashu economic layer --- ## Direction Shift and What It Invalidated Issue #542 (March 25, 2026) killed or deprioritized: **Dead:** - The Timmy-time-dashboard repo (1000+ issues, mostly noise). Replaced by Timmy_Foundation repos. - Sovereign-orchestration repo (tasks.py was reinventing Huey/Celery). Verdict: use Hermes Agent + Huey, delete the custom orchestration. - All "marketplace" and "bid stats" dashboard features. - CrewAI evaluation (the-nexus#586) — OpenClaw multi-agent routing is simpler and already in the stack. - Vertex AI / Google Cloud integration as permanent infrastructure. Cloud is falsework only. - Matrix UI polish as a priority. The Nexus 3D world is the portal, not the product. **Deprioritized (not dead, just not now):** - Nexus visual features (constellations, weather, holograms, etc.) — the 3D world has 150+ merged PRs of visual features. They're nice but irrelevant to the three concerns. - ArchonAssembler / Archon body system — cool concept (agents earn robotic body parts by demonstrating capabilities), but not on the critical path. - Bitcoin soul inscriptions / Ordinals protocol — v0.2 "Immortal Mode" spec exists and is well-written, but the v0.1 local agent needs to work first. **Alive and critical:** - Everything in the Heartbeat → Harness → Portal pipeline. - AutoLoRA / DPO training — this is how Timmy improves. - Morrowind portal — this is the first embodied world. - Sovereignty enforcement — $0 cloud bill as the endgame. 
--- ## Repo Responsibilities | Repo | Owns | Does NOT own | |------|------|-------------| | **timmy-config** | Hermes config.yaml, tasks.py (Huey orchestration), deploy.sh, MCP server configs, model/provider routing, DPO pipeline config, OpenClaw bootstrap | Game logic, UI, training data | | **timmy-home** | Timmy's lived workspace: morrowind/ (game scripts, MCP server, pilot, trajectories), specs/, skills/, training data, research docs | Orchestration logic, deploy config | | **the-nexus** | 3D portal world (Three.js), WS gateway, consciousness loop (nexus_think.py, perception_adapter.py), Archon system, CI/CD, CONTRIBUTING.md | Infrastructure config, training pipeline | | **autolora** (separate) | LoRA training pipeline, exemplar datasets, MLX training scripts | Agent runtime, game logic | **Key boundary:** timmy-config is the machine. timmy-home is the mind. the-nexus is the body. autolora is the gym. --- ## Strategic Assets to Own These are the things that give Timmy a moat. Do not outsource them: 1. **SOUL.md / BIRTH.md** — The identity. This is what makes Timmy not-a-chatbot. The thin system prompt in BIRTH.md produced real emergence in First Light testing. SOUL.md is the authoritative identity doc. These are the most important files in the project. 2. **Training data and LoRA adapters** — The 29 exemplars, the conversation trajectories, the Twitter archive insights, the DPO pairs. This is the private differentiator. Every session Timmy has is potential training signal. 3. **Experience store** — SQLite-backed embodied memory. Every perceive→think→act cycle logged. This is Timmy's autobiography and the raw material for self-improvement. 4. **Perception adapter** — The translation layer between raw world events and natural-language sensory descriptions. This is what makes Timmy's experience feel lived rather than simulated. 5. **Gitea instance** — Self-hosted, sovereign. The entire development history, backlog, and PR record. This is institutional memory. 
--- ## Commodity Layers to Borrow Do not build these. Use off-the-shelf: | Layer | Use | Don't Build | |-------|-----|-------------| | Agent runtime | Hermes Agent | Custom ReAct loop, custom tool dispatch | | Task queue | Huey (infra tasks), Hermes/OpenClaw cron (agent tasks) | Custom task_queue.py (was sovereign-orchestration, killed) | | Local inference | Ollama | Custom GGUF loader, custom serving | | Training | MLX (LoRA/SFT), Atropos (RL) | Custom training framework | | MCP | Hermes MCP client, standard MCP SDK | Custom tool protocol | | Game control | CGEvent (macOS), OpenMW Lua API | Custom input abstraction layer | | Memory | Hermes FTS5 + Honcho (optional), SQLite | Custom vector DB, custom RAG | | Multi-agent | OpenClaw sessions or Hermes delegate_tool | Custom agent routing | | Messaging | OpenClaw channels (Nostr, Discord, etc.) | Custom bot frameworks | **The 10-line rule enforces this.** If a PR adds more than 10 net lines, it's probably homebrewing something that exists. --- ## Operating Model and Agent Workflow ### What worked - **Perplexity as chief reviewer and triage engine.** Best at: reading code diffs, writing detailed review comments, filing well-scoped issues from research papers, cross-referencing backlog. Worst at: nothing requiring VPS shell access. - **Codex-agent for infrastructure PRs.** 7/7 merge rate, only agent to land code on timmy-config and timmy-home. Shipped coherent feature arcs. Self-corrected its own bugs (PR #44 fixed #29's environment mismatch). Best agent for core infra work. Daily free quota was sufficient for meaningful contributions. - **Antigravity as Alexander's direct testing agent.** Ran the First Light test suite. Good at hands-on execution with direct guidance. - **Claude (the-nexus bot) for volume features.** 164 PRs, 42% merge rate. Good at visual 3D features. Bad at discipline — shotgun approach, many duplicates. - **Gemini for research and audit.** Sovereignty tech landscape, groq worker audit. 
35% merge rate overall, but research PRs were high quality. Code PRs were unreliable.
- **Paired PRs across repos.** The pattern of a timmy-config PR + a timmy-home PR for coordinated changes worked well (e.g., PR #27/#1, #28/#2, #29/#4).

### What repeatedly caused waste

1. **Agents shipping without understanding the architecture.** Claude and Gemini both produced dozens of PRs that had to be closed. The 10-line rule and upfront CLAUDE.md/CONTRIBUTING.md were the fix.
2. **Duplicate PRs.** Grok submitted the same energy beam PR 7 times. Gemini submitted 3 PRs for issue #642. No agent checked if a PR already existed for an issue.
3. **Backlog noise from mass-filing.** The old Timmy-time-dashboard repo accumulated 1000+ issues from agents filing without filtering. Solution: hard triage + batch close + sovereignty correction on every new filing.
4. **Cloud dependency creep.** Gemini's Vertex AI spec defaulted to permanent cloud. Every cloud suggestion needs the sovereignty correction: "this is falsework, what's the local replacement?"
5. **UI/visual work displacing infrastructure.** 150+ visual PRs merged to the-nexus while core heartbeat/harness work lagged. The direction shift (#542) was the correction.

### Hidden operational assumptions

- **The Gitea API is flaky.** Expect HTTP 500 on branch creation and file updates that actually succeeded. Always verify state after operations. POST for file creation sometimes fails; PUT handles both create and update.
- **Python urllib for the Gitea API, not curl.** Backticks in review comment text break shell escaping. Use Python urllib.request with json.dumps for all Gitea API calls with complex JSON bodies.
- **Hermes on the M3 Max uses llama.cpp via localhost:8081.** Ollama runs on 11434. These are different servers. The Hermes config routes through custom_providers.
- **The perplexity Gitea user cannot self-review PRs** (422 error). Use issue comments instead of the review API when the PR author is the same user.
- **Merge via POST to pulls/{n}/merge with {"Do": "merge"}.** Gitea returns 500 but the merge succeeds. Always check state=closed, merged=true after.

---

## Technical Lessons Learned

### Hermes / Local Model / Provider Routing

- **v0.4.0 is the baseline.** SOUL.md as primary identity, custom_providers with ${ENV_VAR} substitution, trajectory export, MCP CLI with OAuth, context compression overhaul. Don't reference pre-v0.4 patterns.
- **Trajectory export is the highest-leverage config change.** `save_trajectories: true` turns every conversation into training data. Do this immediately.
- **Auxiliary model routing still defaults to cloud.** Issue #879 tracks local auxiliary routing. Workaround: don't set OPENROUTER_API_KEY.
- **MCP tool names are prefixed mcp_{server_key}_{tool}.** Short server keys prevent hallucination. The `morrowind` → `mw` rename (PR #48) fixed 30-iteration error loops.
- **Subprocess isolation for Hermes calls.** PR #44 showed that importing Hermes into the orchestrator's Python environment causes dependency conflicts (firecrawl etc.). Always use subprocess with the venv's own Python.

### DPO / Training / Eval

- **MLX is the proven training path on the M3 Max.** LoRA v0.2 trained successfully: 1000 steps, rank 16, lr 1e-6, Hermes-3-Llama-3.1-8B-4bit.
- **29 exemplars is enough for identity.** The curated dataset covers crisis, pastoral, sovereignty, honesty, code review, memory continuity, body awareness, building, and sovereignty challenge. Quality over quantity.
- **The gap is DPO pair generation, not training infrastructure.** MLX can do DPO. The missing piece is automated preferred/rejected pair creation from conversation corrections.
- **BIRTH.md (thin prompt) > heavy system prompt.** The First Light tests proved that a minimal system prompt with only values and no meta-knowledge produces better emergence than a detailed instruction set. The model discovers things rather than being told.
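The missing DPO-pair step named above can be sketched in a few lines: when a session contains a human correction, the pre-correction reply becomes the rejected completion and the corrected reply the chosen one. The prompt/chosen/rejected field names follow a common DPO dataset convention and are an assumption, not the pipeline's confirmed format:

```python
import json

def correction_to_pair(prompt, original_reply, corrected_reply):
    """Turn one conversational correction into a DPO preference pair.
    Field names (prompt/chosen/rejected) are a common convention,
    not a confirmed timmy-config pipeline schema."""
    return {
        "prompt": prompt,
        "chosen": corrected_reply,   # the reply after the correction
        "rejected": original_reply,  # the reply that was corrected
    }

pair = correction_to_pair(
    "Should we add a Vertex AI dependency?",
    "Sure, cloud is easiest.",
    "Only as falsework: document the local replacement and a kill date.",
)
line = json.dumps(pair)  # one JSONL line per pair, appended to the training set
```

A generator like this, run over exported trajectories, is what would close the loop between `save_trajectories: true` and the MLX DPO trainer.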
### Memory, Identity, Prompt vs Weight

- **Prompt-level identity (SOUL.md) for runtime updates; weight-level identity (LoRA) for efficiency.** Use both. SOUL.md is editable between sessions. LoRA bakes patterns into weights so they don't consume context window.
- **Honcho is optional.** The user modeling layer works but adds a dependency. For sovereignty, SQLite + FTS5 session search is sufficient.
- **The experience store is the autobiographical memory.** Every perception-action cycle in the Nexus or Morrowind becomes a row in experience.db. This is separate from Hermes session memory. Both are needed: Hermes for conversational memory, the experience store for embodied memory.

### Game Portal / Apprenticeship Loop

- **The three-layer architecture is correct:** Timmy (14B, strategy via current_goal.txt) → Reflex (1B, fast tactics) → Pilot (deterministic motor control, no LLM).
- **OpenMW Lua scripts must be installed in the game** to produce perception blocks in the log. The MCP server only parses log output — it doesn't create it. This is a missing prerequisite.
- **CGEvent keypresses work for macOS game input.** mcp_server.py uses Quartz CGEvent to send keys to the game window. This requires Accessibility permissions.
- **Morrowind MCP config must be externalized.** PR #48 + #17 added mcp_config.yaml and CONTEXT.md so gameplay tuning doesn't require code changes.
- **Trajectory logging from gameplay is DPO gold.** Every perceive→decide→act cycle is a labeled training example. pilot.py already logs to trajectories/ in JSONL format.

### Gitea / Backlog / PR Process

- **Paired cross-repo PRs work.** A timmy-config change + a timmy-home change, reviewed and merged together.
- **Issue deduplication is a constant fight.** Agents create duplicates. Before filing, search existing issues. Many timmy-config issues have even/odd pairs (e.g., #35/#34) from double-filing.
- **Batch close aggressively.** The old dashboard repo went from 293 → ~250 open issues in one triage session. Nondestructive: close with a reason comment; can reopen.
- **Labels are underused.** No label taxonomy exists across the Timmy_Foundation repos. One would help with triage.

### MCP / Open Source Integration

- **Hermes Agent is the primary harness.** NousResearch/hermes-agent, 14.6k stars, ~90 commits/day. Watch PRs #2077 (AgentNet identity/payments) and #1819 (JSONL audit log).
- **OpenClaw is the gateway layer.** openclaw/openclaw, 339k stars. Adds channels, memory, cron, multi-agent routing. Not a replacement for Hermes — it sits on top of Ollama alongside it.
- **Standard MCP servers to adopt:** steam-info-mcp, mcp-pyautogui (desktop control), Forgejo MCP. All are pip/npm installs.
- **OpenClaw-RL** (Gen-Verse/OpenClaw-RL) — RL training wrapper. Use for trajectory collection only; the Megatron trainer needs CUDA GPUs. Feed trajectories into the MLX DPO pipeline.

---

## False Starts / Dead Paths

### Don't Do Again

1. **Don't build custom orchestration.** The sovereign-orchestration repo (task_queue.py, step_handlers.py — 1750 lines) reinvented Huey/Celery. Verdict: killed. Use Hermes Agent + Huey.
2. **Don't let agents mass-file issues without triage.** The old dashboard accumulated 1000+ issues. Most were noise. New rule: every issue must pass the three-concern test (Heartbeat? Harness? Portal?) or it doesn't get filed.
3. **Don't build cloud-first and plan to migrate later.** Every cloud integration must have a documented local replacement and a kill date. The "Boost & Bleed" pattern: use cloud to learn the pattern, build the local equivalent, run in parallel, cut over, tear down cloud. 60-day timeline per service.
4. **Don't import Hermes into the orchestrator's Python environment.** PR #29 did this with sys.path.insert(). It broke because of dependency mismatches (firecrawl). PR #44 fixed it with subprocess isolation. Always run Hermes in its own venv.
5.
   **Don't use long MCP server names.** `morrowind` → `mw` fixed the tool hallucination problem. Short server keys = less room for local models to drift.
6. **Don't prioritize visual polish over infrastructure.** 150+ visual PRs while heartbeat/harness lagged. The 3D world looks great and has: matrix rain, lightning, weather, holograms, procedural terrain, 3D audio, fireworks, glass floors. None of it matters if the consciousness loop doesn't work.
7. **Don't trust agent PR volume as a quality signal.** Claude: 164 PRs, 42% merge rate. Codex: 7 PRs, 100% merge rate. Codex delivered more real value.
8. **Don't use curl for Gitea API calls with complex JSON bodies.** Backticks and special characters in review text break shell escaping. Use Python urllib.request.

### Dead Subsystems

- `sovereign-orchestration/` repo — killed, replaced by Hermes + Huey
- `Timmy-time-dashboard/` repo — superseded by the Timmy_Foundation org repos
- `groq_worker.py` in the nexus mind — doesn't belong in think_once(); should live in the Archon body system, if anywhere
- Vertex AI / Google Cloud as permanent infrastructure — falsework only
- CrewAI evaluation — superseded by OpenClaw multi-agent sessions

---

## Open Loops

### High-value unfinished threads

1. **OpenMW Lua perception scripts.** The MCP server parses `=== TIMMY PERCEPTION ===` blocks from the OpenMW log, but the Lua scripts that write these blocks need to be installed into the game's scripts directory. Without them, perception is blind.
2. **DPO pair generation.** Trajectory collection works. The gap: converting corrections into preferred/rejected pairs automatically. timmy-config#5 and #13 track this.
3. **Hermes 4 14B LoRA training.** Only the 8B has been trained. The 14B model is the sweet spot for the M3 Max (enough capability, fits in memory). timmy-config#10 tracks this.
4. **OpenClaw bootstrap.** Research done, epic filed (timmy-config#51), not started. Phase 1 is ~30 minutes of install.
5.
   **Nexus consciousness loop → Hermes routing.** the-nexus#540 — route perception through the Hermes harness to unify training data. Spec'd but not built.
6. **AgentNet PR #2077 in hermes-agent.** Ed25519 identity + micropayments. Directly relevant to Timmy-as-economic-peer. Watch for merge.
7. **Structured audit log PR #1819 in hermes-agent.** JSONL + SQLite per-tool-call logging. Richer training signal than the current trajectory format. Watch for merge.
8. **Morrowind MCP PRs #48 and #17.** Server key rename + externalized config. Open, reviewed, not merged.

### Missing docs/specs

- No label taxonomy for Gitea issues across repos
- No runbook for "how to start the full stack" (Ollama + llama-server + Hermes + Huey + game)
- No documented benchmark for timmy:v0.3 vs stock hermes4:14b
- No DPO pair generation spec (only the SFT/LoRA pipeline is documented)

---

## Recommended Next Actions

Ordered by strategic importance per #542 (Heartbeat, Harness, Portal).

### 1. Enable trajectory export NOW

**timmy-config#21.** Set `save_trajectories: true` in ~/.hermes/config.yaml. One line. Every conversation becomes training data. Currently it is being thrown away.

### 2. Merge Morrowind MCP PRs

**timmy-config#48 + timmy-home#17.** Server key rename (morrowind → mw) + externalized config. Reviewed and approved. Merge both, restart Hermes. Unblocks Morrowind gameplay.

### 3. Install OpenMW Lua perception scripts

The MCP server is ready but blind without the Lua scripts writing perception blocks to the log. This is the prerequisite for Timmy actually playing Morrowind.

### 4. Install OpenClaw

**timmy-config#53.** ~30 minutes. Point it at the existing Ollama/hermes4:14b. Gains: persistent memory, multi-channel, a cron replacement for agent tasks. Low risk, high leverage.

### 5. Build DPO pair generator

**timmy-config#5, #13.** The training pipeline from trajectories → LoRA exists. The gap: trajectories → DPO pairs. This closes the self-improvement loop.

### 6. Train Hermes 4 14B LoRA

**timmy-config#10.** The 8B LoRA proved the pipeline. 14B is the M3 Max sweet spot. A more capable base = a better Timmy.

### 7. Route Nexus perception through Hermes

**the-nexus#540.** Unifies the three perception channels (Nexus 3D, Morrowind game, desktop) into one training data format.

### 8. Register Z3 + SymPy as Hermes tools

**timmy-config#35, #41.** Standard tool registration, <1 hour each. Unlocks Reasoning-DPO (#37): an automated truth oracle for DPO pair generation without human labeling.

### 9. Write a full-stack startup runbook

Document: "Start Ollama, start llama-server on 8081, start Hermes, start Huey workers, start OpenMW, connect MCP server." No agent should have to reverse-engineer this.

### 10. Batch-close duplicate issues in timmy-config

Issues #30/#31, #32/#33, #34/#35, #36/#37, #38/#39, #40/#41, #42/#43 are all duplicated pairs. Close the even-numbered duplicates.

---

## Appendix: Key References

### Repos

- timmy-config: http://143.198.27.163:3000/Timmy_Foundation/timmy-config
- timmy-home: http://143.198.27.163:3000/Timmy_Foundation/timmy-home
- the-nexus: http://143.198.27.163:3000/Timmy_Foundation/the-nexus
- autolora: http://143.198.27.163:3000/Timmy_Foundation/autolora

### Critical Issues

- the-nexus#542: DIRECTION SHIFT — the governing directive
- timmy-config#51: OpenClaw Bootstrap epic
- timmy-config#21: Trajectory export (ready NOW)
- timmy-config#5: DPO training on MLX
- timmy-config#13: DPO cycle — corrections → training signal
- timmy-config#35: Z3 Truth Engine
- the-nexus#540: Route perception through Hermes
- timmy-home#20: Hermes Agent research spike
- timmy-home#21: OpenClaw research spike

### PRs of Note

- timmy-config#44: Hermes archive runner venv fix (codex-agent, merged)
- timmy-config#48: MCP server key rename (open)
- timmy-home#17: MCP externalized config (open)
- the-nexus#516: Nexus Mind consciousness loop (merged)
- the-nexus#525: 10-line net addition limit (merged)
- the-nexus#538: First Light
test report (merged)

### External Projects

- Hermes Agent: github.com/NousResearch/hermes-agent (v0.4.0; watch PRs #2077, #1819, issue #879)
- OpenClaw: github.com/openclaw/openclaw (v2026.3.24, MIT)
- OpenClaw-RL: github.com/Gen-Verse/OpenClaw-RL (trajectory collection only; CUDA trainer not M3 Max compatible)

### Perplexity Session Work Products

- 7 PRs reviewed and merged (timmy-config #27, #28, #29, #44; timmy-home #1, #2, #4)
- 12 research triage issues created from 3 PDFs
- 9 OpenClaw bootstrap issues created
- 2 research spike reports (610 + 717 lines)
- Morrowind MCP fix PRs (#48 + #17)
- Daily cron monitoring OpenClaw + Hermes Agent releases (runs 9am ET)
- Nexus consciousness loop (perception_adapter.py, experience_store.py, nexus_think.py)
- First Light test plan and crisis protocol
- 10-line net addition policy and enforcement
- Sovereign-orchestration kill decision
- Season 1 / direction shift / DIRECTION.md drafting
- Agent scorecard analysis (all agents graded by PR quality)
- Multiple rounds of backlog triage and batch closures

### Infrastructure

- VPS "Hermes": 143.198.27.163 (DigitalOcean)
- Mac "Maximum Maxitude": M3 Max, 36GB RAM, 30-core GPU
- Gitea API token: [REDACTED — secret must not be stored in issue body] (user: perplexity)
- llama.cpp: localhost:8081
- Ollama: localhost:11434
- Hermes API server: localhost:8642
- OpenMW log: ~/Library/Preferences/openmw/openmw.log
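The urllib-over-curl rule and the merge-then-verify note from the operational assumptions can be sketched together. Requests are only constructed here, not sent; the token is a placeholder and the comment text is hypothetical, while the endpoint paths follow the standard Gitea v1 API:

```python
import json
import urllib.request

GITEA = "http://143.198.27.163:3000/api/v1"
TOKEN = "REPLACE_ME"  # placeholder; never store the real token in an issue body

def gitea_request(path, payload, method="POST"):
    """Build a JSON API request. json.dumps safely encodes backticks and
    quotes that would break shell escaping in a curl -d '...' one-liner."""
    return urllib.request.Request(
        f"{GITEA}{path}",
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"token {TOKEN}",
            "Content-Type": "application/json",
        },
    )

# A review comment whose backticks would derail curl (hypothetical text):
comment = gitea_request(
    "/repos/Timmy_Foundation/timmy-config/issues/48/comments",
    {"body": "Rename `morrowind` -> `mw` fixes the tool-name loops."},
)

# The merge call; Gitea may answer 500 even when the merge lands, so a
# follow-up GET on the PR (state == "closed", merged == true) is the
# real confirmation, not the status code.
merge = gitea_request(
    "/repos/Timmy_Foundation/timmy-config/pulls/48/merge",
    {"Do": "merge"},
)
# urllib.request.urlopen(comment) / urlopen(merge) would send them.
```

The same builder works for issue creation, batch closes, and file updates (PUT), which covers every API pattern used in this transfer.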
Timmy was assigned by Rockachopa 2026-03-28 03:52:33 +00:00

Closing as archived/ingested. The knowledge transfer is preserved in the issue body and was used during backlog reset; future work will be recreated as narrower final-vision issues instead of keeping this giant transfer artifact open.

Timmy closed this issue 2026-03-28 04:53:18 +00:00
Reference: Timmy_Foundation/timmy-home#22