[PORTAL] Three-layer game architecture: Timmy → Reflex → Pilot #660

Closed
opened 2026-03-27 16:55:32 +00:00 by perplexity · 32 comments
Member

Architecture (from SOTA: RPG2Robot + RoboOmni)

Layer 3: Timmy (hermes4 14B, every 30-60s)
  │ Strategy: "Find Caius in Balmora", "Explore this building"
  │ Writes goals to current_goal.txt
  │ Escalation target when reflex can't decide
  ↓
Layer 2: Reflex (llama3.2:1b, every 1-2s)
  │ Tactics: "NPC ahead → approach", "Door → enter", "Enemy → fight or flee"
  │ Reads perception, picks from ~10 actions
  │ Fast enough for real-time (~200ms decisions)
  │ Escalates novel situations to Timmy
  ↓
Layer 1: Pilot (Python, every 100ms)
  │ Motor: WASD, activate, camera, collision avoidance
  │ Pure deterministic code, no LLM
  │ Executes reflex decisions as keystrokes
  ↓
Morrowind (OpenMW via MCP server)
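
The Reflex rule "pick from ~10 actions, escalate anything novel" could be sketched roughly as below. This is a minimal illustration, not the project's code: the action names, the `choose_action` helper, and the `ask_model` callable are all hypothetical, and the real llama3.2:1b call is replaced by a plain function so the sketch stays self-contained.

```python
from typing import Callable

# Hypothetical action vocabulary; the issue only says "~10 actions".
ACTIONS = {
    "approach_npc", "enter_door", "fight", "flee", "wander",
    "activate", "turn_left", "turn_right", "wait", "loot",
}
ESCALATE = "escalate_to_timmy"  # hypothetical marker for handing off to Layer 3


def choose_action(perception: str, goal: str, ask_model: Callable[[str], str]) -> str:
    """Return one allowed action, or ESCALATE if the model's reply is unusable."""
    prompt = (
        f"Goal: {goal}\nPerception: {perception}\n"
        f"Reply with exactly one of: {', '.join(sorted(ACTIONS))}"
    )
    reply = ask_model(prompt).strip().lower()
    # Anything outside the fixed vocabulary is treated as a novel situation.
    return reply if reply in ACTIONS else ESCALATE
```

Constraining the reply to a fixed set keeps the output short, which is what makes the latency target plausible, and gives a simple escalation test: if the reply is not in the set, hand the situation to Timmy.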

Why Three Layers

  • Timmy at 23 tok/s is too slow for real-time game control (10+ seconds per decision)
  • A 1B model at ~100 tok/s can make tactical decisions in <200ms, provided each reply stays within about 20 tokens
  • Pure Python can send keypresses at 100ms intervals
  • Each layer produces training data for the layer above it
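
The throughput figures above translate into per-decision token budgets. The following is a back-of-the-envelope sketch only: `tokens_within` is a hypothetical helper, and it counts generation time alone, ignoring prompt processing and network overhead.

```python
def tokens_within(tok_per_s: int, budget_ms: int) -> int:
    """Whole tokens a model can generate inside a latency budget."""
    return tok_per_s * budget_ms // 1000


reflex_tokens = tokens_within(100, 200)    # 1B model, 200 ms budget -> 20
timmy_tokens = tokens_within(23, 10_000)   # 14B model, 10 s decision -> 230
```

So a sub-200ms reflex decision has to fit in about 20 generated tokens, while a 10-second Timmy decision corresponds to roughly 230 tokens at 23 tok/s.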

Auto-Learning Loop

  1. Pilot logs every perception-action pair to JSONL
  2. Reflex logs every tactical decision + outcome
  3. Over time, reflex decisions become DPO preference pairs (good outcomes = chosen, bad = rejected)
  4. LoRA fine-tune the 1B reflex model on accumulated game data
  5. Reflex gets better → handles more situations → escalates less to Timmy
  6. Timmy focuses on strategy, not tactics
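
Steps 2 and 3 of the loop could be prototyped along these lines. This is a sketch under stated assumptions: the record fields (`perception`, `action`, `outcome`), the numeric outcome score, and both function names are hypothetical, and the `prompt`/`chosen`/`rejected` keys are assumed here as a common layout for preference data rather than something the issue prescribes.

```python
import json
from collections import defaultdict
from pathlib import Path


def log_decision(path: Path, perception: str, action: str, outcome: float) -> None:
    """Append one reflex decision and its outcome score as a JSON line."""
    record = {"perception": perception, "action": action, "outcome": outcome}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def build_preference_pairs(path: Path) -> list[dict]:
    """Pair the best and worst action logged for each identical perception."""
    by_perception = defaultdict(list)
    with path.open(encoding="utf-8") as f:
        for line in f:
            rec = json.loads(line)
            by_perception[rec["perception"]].append(rec)

    pairs = []
    for perception, recs in by_perception.items():
        best = max(recs, key=lambda r: r["outcome"])
        worst = min(recs, key=lambda r: r["outcome"])
        # Skip situations where no action did measurably better than another.
        if best["outcome"] > worst["outcome"] and best["action"] != worst["action"]:
            pairs.append(
                {"prompt": perception, "chosen": best["action"], "rejected": worst["action"]}
            )
    return pairs
```

Grouping by exact perception string is the simplest possible notion of "same situation"; a real pipeline would probably need a coarser key so that near-identical perceptions end up in the same group.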

Files (all in ~/.timmy/morrowind/)

  • pilot.py — Layer 1: deterministic motor control loop
  • reflex.py — Layer 2: fast tactical decisions via 1B model
  • current_goal.txt — Timmy's current strategic goal (read by reflex)
  • trajectories/ — logged perception-action-outcome triples (training data)
  • mcp_server.py — existing MCP interface (already works)
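
Because current_goal.txt is written by one process (Timmy) and read by another (reflex), one possible handoff is sketched below. The `write_goal` and `read_goal` helpers and the "wander" fallback are assumptions made for illustration; the issue does not say how the file is written or read.

```python
import os
import tempfile
from pathlib import Path


def write_goal(goal_file: Path, goal: str) -> None:
    """Write the goal to a temp file, then swap it into place in one step."""
    fd, tmp_name = tempfile.mkstemp(dir=goal_file.parent, prefix=".goal-")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(goal.strip() + "\n")
    # Rename within the same directory, so the reader never sees a half-written file.
    os.replace(tmp_name, goal_file)


def read_goal(goal_file: Path, default: str = "wander") -> str:
    """Return the current goal, or a fallback if none has been written yet."""
    try:
        text = goal_file.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return default
    return text or default
```

Writing to a temporary file and renaming it avoids the case where reflex, polling every 1-2s, reads the goal while Timmy is still partway through writing it.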

Implementation Order

  1. pilot.py — behavior tree: perceive → if NPC approach, if door enter, else wander. ~100 lines.
  2. Wire pilot to existing mcp_server.py perception + action functions
  3. Test: pilot explores Vivec autonomously while you watch
  4. reflex.py — spin up llama3.2:1b on port 8082, reflex queries it for tactical decisions
  5. Wire reflex between pilot and Timmy's goal file
  6. Test: Timmy sets "explore Foreign Quarter", reflex navigates, pilot drives
  7. Add trajectory logging to both layers
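
The behavior tree in step 1 (perceive → if NPC approach, if door enter, else wander) might start out as small as this. It is a sketch only: the `Perception` fields and the action strings are hypothetical, since the actual output of mcp_server.py is not shown in this issue.

```python
import random
from dataclasses import dataclass

# Hypothetical low-level moves used when nothing interesting is in view.
WANDER_MOVES = ("forward", "turn_left", "turn_right")


@dataclass
class Perception:
    # Hypothetical fields; the real mcp_server.py perception may differ.
    npc_visible: bool = False
    door_visible: bool = False


def decide(p: Perception) -> str:
    """Priority-ordered behavior tree: NPC first, then door, else wander."""
    if p.npc_visible:
        return "approach_npc"
    if p.door_visible:
        return "enter_door"
    return random.choice(WANDER_MOVES)
```

A loop calling `decide` every 100ms and forwarding the result to the MCP action functions would complete step 2; that wiring is left out because those function names are not given here.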

Related

  • Extends #19 (Bannerlord Portal — same architecture, different game)
  • Extends #17 (Morrowind Portal)
  • Implements pattern from #653 (RPG2Robot) and #656 (RoboOmni)
  • Feeds into #603 (Aurora pipeline — game trajectories become training data)
  • Depends on #609 (Smart model routing — Timmy 14B vs reflex 1B)
perplexity added the needs-design, harness, portal, and p0-critical labels 2026-03-27 16:55:32 +00:00
Owner

🔍 Triaged by Huey — needs assignment.

Owner

Dispatched to claude. Huey task queued.

Owner

Dispatched to gemini. Huey task queued.

Owner

Dispatched to kimi. Huey task queued.

Owner

Dispatched to grok. Huey task queued.

Owner

Dispatched to perplexity. Huey task queued.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Member

🔧 gemini working on this via Huey. Branch: gemini/issue-660

Member

🔧 grok working on this via Huey. Branch: grok/issue-660

Member

⚠️ grok produced no changes for this issue. Skipping.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Owner

🔍 Triaged by Huey — needs assignment.

Timmy was assigned by Rockachopa 2026-03-28 03:54:22 +00:00
Owner

Closing during the 2026-03-28 backlog burn-down.

Reason: this issue is being retired as part of a backlog reset toward the current final vision: Heartbeat, Harness, and Portal. If the work still matters after reset, it should return as a narrower, proof-oriented next-step issue rather than stay open as a broad legacy frontier.

Timmy closed this issue 2026-03-28 04:52:21 +00:00

Reference: Timmy_Foundation/the-nexus#660