Build UESP RAG knowledge pipeline (ChromaDB + nomic-embed) #969

Closed
opened 2026-03-22 18:45:50 +00:00 by perplexity · 0 comments
Collaborator

Parent

  • #963 — [Study] Solving the Perception Bottleneck

Objective

Create a retrieval-augmented generation pipeline from UESP wiki content so Timmy can query quest walkthroughs, NPC info, and location guides at runtime.

Scope

  • Gather ~4,300 quest/location/NPC pages from UESP Morrowind wiki
  • Chunk into ~15K passages with appropriate overlap
  • Embed with nomic-embed-text-v1.5 (300MB model, always loaded)
  • Store in ChromaDB (~500MB, pre-loaded at startup)
  • Implement query interface: collection.query(query_texts=[...], n_results=k)
  • Example query: "What do I do after delivering the package to Caius?" → relevant walkthrough passages
  • Integrate with MorrowindKnowledgeBase class alongside ESM data

Key Design Notes

  • RAG complements ESM data: ESM has game mechanics, UESP has strategy/walkthroughs
  • Combined with ESM pre-extraction, gives Timmy encyclopedic Morrowind knowledge
  • nomic-embed-text is lightweight (300MB) and runs on GPU via MLX
  • ChromaDB is the vector store — pre-loaded at startup for zero-latency queries

References

  • Paper §UESP RAG Pipeline (pp. 13-14)
  • Related: #956 (skill library crystallizer from Sovereignty Loop)
## Parent - #963 — [Study] Solving the Perception Bottleneck ## Objective Create a retrieval-augmented generation pipeline from UESP wiki content so Timmy can query quest walkthroughs, NPC info, and location guides at runtime. ## Scope - Gather ~4,300 quest/location/NPC pages from UESP Morrowind wiki - Chunk into ~15K passages with appropriate overlap - Embed with `nomic-embed-text-v1.5` (300MB model, always loaded) - Store in ChromaDB (~500MB, pre-loaded at startup) - Implement query interface: `collection.query(query_texts=[...], n_results=k)` - Example query: "What do I do after delivering the package to Caius?" → relevant walkthrough passages - Integrate with `MorrowindKnowledgeBase` class alongside ESM data ## Key Design Notes - RAG complements ESM data: ESM has game mechanics, UESP has strategy/walkthroughs - Combined with ESM pre-extraction, gives Timmy encyclopedic Morrowind knowledge - nomic-embed-text is lightweight (300MB) and runs on GPU via MLX - ChromaDB is the vector store — pre-loaded at startup for zero-latency queries ## References - Paper §UESP RAG Pipeline (pp. 13-14) - Related: #956 (skill library crystallizer from Sovereignty Loop)
gemini was assigned by Rockachopa 2026-03-22 23:30:50 +00:00
claude added the harnessmorrowindp1-important labels 2026-03-23 13:54:09 +00:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: Rockachopa/Timmy-time-dashboard#969