[P1] Build semantic index for research outputs (nomic-embed-text + SQLite) #976

Closed
opened 2026-03-22 19:08:53 +00:00 by perplexity · 1 comment
Collaborator

Parent

  • #972 — [GOVERNING] Replacing Claude — Autonomous Research Pipeline Spec

Objective

Index all research outputs into a semantic memory store so repeated questions are answered from local cache at zero API cost.

Scope

  • Embed research outputs using nomic-embed-text-v1.5 via Ollama (300MB model)
  • Store embeddings in SQLite or ChromaDB
  • Implement memory.search(topic, limit=10) with confidence scoring
  • Implement memory.store(topic, report, type="research")
  • Index existing research artifacts (the 6+ PDFs already triaged)
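
A minimal sketch of the `memory.store` / `memory.search` pair described above, assuming the SQLite option and a brute-force cosine scan. The class name `ResearchMemory`, the table layout, the helper `ollama_embed`, and the use of cosine similarity as the "confidence" score are all illustrative assumptions, not the project's actual code. The embedder is injectable, so the store can be exercised without a running Ollama server.

```python
import json
import math
import sqlite3
import urllib.request
from typing import Callable, List, Tuple

Embedder = Callable[[str], List[float]]


def ollama_embed(text: str, model: str = "nomic-embed-text",
                 url: str = "http://localhost:11434/api/embeddings") -> List[float]:
    """Request an embedding from a local Ollama server (assumed to be running)."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ResearchMemory:
    """Hypothetical store: one SQLite row per report, embedding kept as JSON text."""

    def __init__(self, embed: Embedder = ollama_embed, path: str = ":memory:"):
        self.embed = embed
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memory ("
            "id INTEGER PRIMARY KEY, topic TEXT NOT NULL, report TEXT NOT NULL, "
            "type TEXT NOT NULL, embedding TEXT NOT NULL)"
        )

    def store(self, topic: str, report: str, type: str = "research") -> int:
        """Embed topic plus report, insert a row, and return its row id."""
        vector = self.embed(f"{topic}\n{report}")
        cur = self.db.execute(
            "INSERT INTO memory (topic, report, type, embedding) VALUES (?, ?, ?, ?)",
            (topic, report, type, json.dumps(vector)),
        )
        self.db.commit()
        return cur.lastrowid

    def search(self, topic: str, limit: int = 10) -> List[Tuple[str, str, float]]:
        """Return up to `limit` (topic, report, confidence) tuples, best match first."""
        query = self.embed(topic)
        rows = self.db.execute("SELECT topic, report, embedding FROM memory").fetchall()
        scored = [(t, r, cosine(query, json.loads(e))) for t, r, e in rows]
        scored.sort(key=lambda row: row[2], reverse=True)
        return scored[:limit]
```

A full scan is acceptable for a handful of documents such as the triaged PDFs; a larger corpus would likely need a vector index or ChromaDB instead. The `type` parameter shadows the Python builtin but is kept to match the signature in the scope list.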

Key Design Notes

  • This is the foundation for Tier 4 (cached) in the cascade strategy
  • Target: 80%+ of research queries answered from local cache within 3 months
  • Same embedding model and store as the UESP RAG pipeline (#969) — consolidate
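
The cached tier could be wired in roughly as follows: consult the local store first and fall back to a paid research call only when the best match is below a confidence threshold. The function name, the callable parameters, and the `0.85` threshold are hypothetical placeholders; the issue does not specify any of them.

```python
from typing import Callable, List, Tuple

Hit = Tuple[str, str, float]  # (topic, report, confidence)


def answer_from_cache_or_research(
    question: str,
    search: Callable[[str, int], List[Hit]],
    research: Callable[[str], str],
    store: Callable[[str, str], object],
    min_confidence: float = 0.85,  # assumed threshold, to be tuned on real queries
) -> Tuple[str, bool]:
    """Return (report, served_from_cache)."""
    hits = search(question, 1)
    if hits and hits[0][2] >= min_confidence:
        # Cache hit: no API call is made.
        return hits[0][1], True
    # Cache miss: run the expensive research path, then remember the result.
    report = research(question)
    store(question, report)
    return report, False
```

Counting how often the second element is `True` gives the cache hit rate that the 80% target refers to.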

Effort Estimate

4 hours

Related

  • #969 — UESP RAG knowledge pipeline (same infrastructure)
  • #955 — PerceptionCache (similar pattern for game perception)
gemini was assigned by Rockachopa 2026-03-22 23:30:49 +00:00
Collaborator

Attempted to implement the semantic index for research outputs using Ollama embeddings and SQLite. Modified src/config.py, src/timmy/memory/embeddings.py, and src/timmy/memory_system.py, and added index_research_docs.py to demonstrate indexing. All tests passed locally after the changes, but I could not open a pull request because of policy restrictions.


Reference: Rockachopa/Timmy-time-dashboard#976