[Community] UESP Knowledge Base — Morrowind Wiki to RAG-Queryable Vector Store #883

Closed
opened 2026-03-21 23:40:08 +00:00 by perplexity · 0 comments
Collaborator

Why This Is High Leverage

Timmy can't play Morrowind intelligently without game knowledge. The feasibility guide recommends pre-processing UESP wiki data into a RAG-queryable knowledge base covering quest walkthroughs, NPC locations, item databases, and map data. This is the semantic memory tier that grounds Timmy's reasoning about the game world in wiki facts rather than hallucinations. Without it, he'll wander aimlessly. With it, he knows where Caius Cosades lives, what the Nerevarine prophecy requires, and how to get to Vivec.

Scope

Build a pipeline: UESP wiki → structured extraction → embeddings → SQLite vector store → RAG retrieval.

Data Sources

  • UESP Morrowind namespace: quests, NPCs, locations, items, factions, skills, spells
  • Estimated: ~5,000 relevant pages
  • Extract structured data where possible (infoboxes, tables), full text otherwise
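
To make the page-enumeration side of these data sources concrete, here is a minimal sketch of building a MediaWiki `list=allpages` request for one namespace. It assumes UESP serves the standard MediaWiki `api.php` at `https://en.uesp.net/w/api.php`, and the namespace ID `110` is a placeholder assumption that should be confirmed via `meta=siteinfo` before use. `build_allpages_url` is a hypothetical helper name; it only builds the URL and performs no network I/O.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the live wiki before use.
UESP_API = "https://en.uesp.net/w/api.php"


def build_allpages_url(namespace_id, continue_token=None, limit=500):
    """Build a MediaWiki `list=allpages` request URL for one namespace."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace_id,
        "aplimit": limit,
        "format": "json",
    }
    if continue_token is not None:
        # MediaWiki returns an `apcontinue` value while more pages remain;
        # passing it back requests the next batch.
        params["apcontinue"] = continue_token
    return f"{UESP_API}?{urlencode(params)}"


# Placeholder namespace ID (assumption, not taken from this issue).
first_page_url = build_allpages_url(110)
```

A fetcher would request this URL, read the page titles from the JSON response, and repeat with the returned continuation token until none is present.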

Pipeline

  1. Fetch: MediaWiki API to download Morrowind namespace pages
  2. Parse: Extract structured data (quest steps, NPC locations, item stats) + clean prose
  3. Chunk: Split into retrieval-sized chunks (~500 tokens) with metadata (page title, category, location)
  4. Embed: sentence-transformers (all-MiniLM-L6-v2 or similar, runs on CPU)
  5. Store: SQLite with vector extension, or simple numpy cosine similarity
  6. Query: Given a situation description, retrieve top-k relevant chunks
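
The chunking step above could be sketched roughly as follows. This is an illustration under stated assumptions, not the planned `chunker.py`: whitespace-separated words stand in for tokens, the 50-word overlap between neighbouring chunks is an added assumption, and the `Chunk` dataclass and `chunk_page` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    page_title: str
    category: str
    index: int  # position of the chunk within its page


def chunk_page(text, page_title, category, max_tokens=500, overlap=50):
    """Split page text into overlapping word windows tagged with metadata.

    Whitespace-separated words approximate tokens (assumption).
    """
    if overlap >= max_tokens:
        raise ValueError("overlap must be smaller than max_tokens")
    words = text.split()
    chunks = []
    step = max_tokens - overlap
    for i, start in enumerate(range(0, len(words), step)):
        window = words[start:start + max_tokens]
        chunks.append(Chunk(" ".join(window), page_title, category, i))
        if start + max_tokens >= len(words):
            break  # this window already reaches the end of the page
    return chunks
```

Carrying the page title and category on every chunk lets the retriever report a source for each result without a second lookup.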

Requirements

  • src/knowledge/uesp/fetcher.py — MediaWiki API client for UESP Morrowind pages
  • src/knowledge/uesp/parser.py — HTML/wikitext → structured data + clean text
  • src/knowledge/uesp/chunker.py — Smart chunking with metadata preservation
  • src/knowledge/embeddings.py — Embedding generation and storage
  • src/knowledge/retriever.py — Semantic similarity search, returns ranked chunks with sources
  • Initial data load script
  • Tests with sample queries ("Where is Caius Cosades?", "How do I become Hortator?")
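
As a rough illustration of the storage and retrieval requirements, the sketch below keeps embeddings as BLOBs in plain SQLite and ranks them by brute-force cosine similarity in Python. It deliberately uses no vector extension and no numpy, so it shows the simplest fallback rather than the intended design. The table layout and every function name here are assumptions, and the three-dimensional vectors in the usage note are toy values standing in for real model embeddings.

```python
import math
import sqlite3
import struct


def pack(vec):
    """Serialize a float vector to little-endian float32 bytes."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack(blob):
    """Inverse of pack: bytes back to a tuple of floats."""
    return struct.unpack(f"<{len(blob) // 4}f", blob)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def create_store(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        "id INTEGER PRIMARY KEY, page_title TEXT, text TEXT, embedding BLOB)"
    )


def add_chunk(conn, page_title, text, embedding):
    conn.execute(
        "INSERT INTO chunks (page_title, text, embedding) VALUES (?, ?, ?)",
        (page_title, text, pack(embedding)),
    )


def retrieve(conn, query_embedding, k=10):
    """Return up to k (score, page_title, text) tuples, best match first."""
    rows = conn.execute("SELECT page_title, text, embedding FROM chunks")
    scored = [
        (cosine(query_embedding, unpack(blob)), title, text)
        for title, text, blob in rows
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]
```

For example, after `conn = sqlite3.connect(":memory:")`, `create_store(conn)`, and two `add_chunk` calls with the vectors `[1.0, 0.0, 0.0]` and `[0.0, 1.0, 0.0]`, calling `retrieve(conn, [0.9, 0.1, 0.0], k=1)` ranks the first chunk highest. A full scan like this is linear in the number of chunks, so whether it meets the latency target at full index size would need to be measured.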

Acceptance Criteria

  • "Where is Caius Cosades?" returns: Balmora, South Wall Cornerclub, with quest context
  • "What do I need for the Hortator quest?" returns: step-by-step requirements per house
  • Retrieval latency < 200ms for top-10 results
  • Full Morrowind namespace indexed (~5,000 pages)
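
One way the latency criterion might be checked is sketched below. The issue does not say whether the 200ms limit applies to the mean, a percentile, or the worst case; this sketch assumes the worst observed call. `measure_latency_ms`, `within_budget`, and the stand-in `fake_retrieve` are hypothetical names, and the stand-in should be replaced by the real top-10 search.

```python
import statistics
import time


def measure_latency_ms(retrieve_fn, queries, runs=5):
    """Time retrieve_fn on each query and return per-call latencies in ms."""
    latencies = []
    for _ in range(runs):
        for query in queries:
            start = time.perf_counter()
            retrieve_fn(query)
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def within_budget(latencies, budget_ms=200.0):
    """Compare the worst observed latency against the budget (assumption)."""
    return max(latencies) < budget_ms


# Stand-in retriever (hypothetical); swap in the real top-10 search.
def fake_retrieve(query):
    return [query] * 10


samples = measure_latency_ms(fake_retrieve, ["Where is Caius Cosades?"])
print(f"median={statistics.median(samples):.3f} ms, ok={within_budget(samples)}")
```

Running this against the fully indexed store, rather than a small sample, is what would make the measurement meaningful for the criterion above.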

Assignee: Kimi

gemini was assigned by Rockachopa 2026-03-22 23:33:20 +00:00
claude added the harness, morrowind, p1-important labels 2026-03-23 13:53:40 +00:00
Reference: Rockachopa/Timmy-time-dashboard#883