📖 Study: Expanding Timmy from Dashboard Agent to Autonomous Morrowind Player #888

Closed
opened 2026-03-22 00:41:49 +00:00 by perplexity · 0 comments

Generated by Kimi.ai — 10 pages

The hands-on implementation blueprint for the CRADLE-inspired game-playing architecture:

  • CRADLE as Blueprint: Screen capture at 1-2 FPS → Qwen3-VL vision model → LLM cascade → pyautogui keyboard simulation. CRADLE's ~20% combat success rate is acceptable here because Morrowind combat is dice-roll-based and can be paused
  • Perception: Qwen3-VL 8B via Ollama (best OCR + structured output), hybrid pipeline with OpenCV for HUD bars + VLM for scene understanding every 2-5s. Screen capture via mss at 30-60 FPS
  • Decision Tiers: Reflexive (<200ms, rule-based), Tactical (1-3s, Groq free tier), Strategic (3-10s, Gemini free), Narrative (5-30s, Ollama local)
  • Action: pyautogui with WASD + mouse controls, constrained action vocabulary
  • Memory: SQLite schema for 400+ quests — game_state_snapshots, quests, npcs, locations, learned_skills, knowledge_base (RAG over UESP wiki)
  • Narration Pipeline: Kokoro-82M TTS, parallel with gameplay, <2-3s event-to-speech target
  • Phased Build: 6-8 weeks total
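The decision tiers listed above can be sketched as a small lookup keyed on latency budget. This is a minimal illustration, not the design from the attached PDF: the tier names, latency ranges, and backends are taken from the list, while the `Tier` class, the `pick_tier` function, and the selection policy (pick the slowest tier that still fits the budget, fall back to the reflexive tier otherwise) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    min_latency_s: float  # lower bound of the range quoted in the list
    max_latency_s: float  # upper bound of the range quoted in the list
    backend: str


# Ranges and backends as listed above. The reflexive tier only has an
# upper bound (<200ms), so its lower bound is set to 0 here.
TIERS = (
    Tier("reflexive", 0.0, 0.2, "rule-based"),
    Tier("tactical", 1.0, 3.0, "Groq free tier"),
    Tier("strategic", 3.0, 10.0, "Gemini free"),
    Tier("narrative", 5.0, 30.0, "Ollama local"),
)


def pick_tier(budget_s: float) -> Tier:
    """Hypothetical policy: slowest tier whose worst-case latency fits the budget."""
    fitting = [t for t in TIERS if t.max_latency_s <= budget_s]
    if not fitting:
        # Assumption: with no time to spare, fall back to rule-based reflexes.
        return TIERS[0]
    return max(fitting, key=lambda t: t.max_latency_s)
```

For example, `pick_tier(3.0)` selects the tactical tier, while a budget under 200 ms always resolves to the rule-based reflexive tier. A real router would probably also weigh the kind of event (combat, dialogue, navigation), which this sketch ignores.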

This is the most actionable implementation doc. Timmy should scope work from this first.
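As a starting point for scoping the Memory item, the six tables named in the list might be created roughly as follows. Only the table names come from the summary; every column, type, and constraint below is a placeholder assumption, as are the `SCHEMA` constant, the `open_memory` helper, and the example quest row. The real schema is in the attached PDF.

```python
import sqlite3

# Table names are from the summary; all columns are hypothetical placeholders.
SCHEMA = """
CREATE TABLE IF NOT EXISTS locations (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS game_state_snapshots (
    id INTEGER PRIMARY KEY,
    captured_at TEXT NOT NULL,
    health REAL,
    magicka REAL,
    fatigue REAL,
    location_id INTEGER REFERENCES locations(id)
);
CREATE TABLE IF NOT EXISTS quests (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    status TEXT NOT NULL DEFAULT 'unknown'
);
CREATE TABLE IF NOT EXISTS npcs (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    location_id INTEGER REFERENCES locations(id)
);
CREATE TABLE IF NOT EXISTS learned_skills (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    description TEXT
);
CREATE TABLE IF NOT EXISTS knowledge_base (
    id INTEGER PRIMARY KEY,
    source TEXT,
    chunk TEXT NOT NULL
);
"""


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure all six tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = open_memory()
# Placeholder row, not a real Morrowind quest name.
conn.execute(
    "INSERT INTO quests (name, status) VALUES (?, ?)",
    ("Example quest", "active"),
)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

The RAG side of `knowledge_base` (embeddings, retrieval over UESP wiki text) is left out here; the table only stores raw text chunks.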


PDF attached below. Filed by Perplexity for Timmy's review and triage.

claude added the harness, morrowind, and p1-important labels 2026-03-23 13:53:43 +00:00

Reference: Rockachopa/Timmy-time-dashboard#888