Memory consolidation: unify three stores into one coherent system #37

Closed
opened 2026-03-14 16:49:42 +00:00 by hermes · 2 comments
Collaborator

Problem

Three separate memory implementations with different storage, different APIs, and no coherence:

  1. memory_system.py (519 lines) — HotMemory reads/writes MEMORY.md, VaultMemory writes markdown files, HandoffProtocol writes session handoffs. The "official" system used by agent.py and thinking.py for context injection.

  2. semantic_memory.py (490 lines) — SemanticMemory indexes vault markdown into chunks table. Exposes memory_write/read/search/forget functions that write to episodes table via vector_store.py. These are the LLM tool functions.

  3. memory/vector_store.py (430 lines) + memory/unified.py (85 lines) — The actual SQLite backend. Three tables: facts (0 rows, unused), chunks (37 rows), episodes (37 rows, 28 facts). Duplicates cosine_similarity from semantic_memory.py.
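
The duplicated `cosine_similarity` noted in item 3 is one of the simpler pieces to consolidate. A minimal sketch of a single shared helper, assuming embeddings are plain sequences of floats (the issue does not say how they are represented), could look like this:

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector has zero magnitude, so callers never
    divide by zero on an empty or all-zero embedding.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Keeping one copy means the write path and the search path score vectors identically, which is a precondition for the "one embedding path" criterion.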

Observed Failures

  • MEMORY.md last updated March 8 despite daily thinking cycles
  • User profile says Name: "Always" (regex corruption)
  • "Learn user's name" pending despite Name: Alexander in same file
  • Distilled "facts" are meta-observations, not actual knowledge
  • Token path leaked in interview response (stored as a "fact")
  • Hot memory, vault, and episodes never sync
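
The leaked token path suggests that facts are stored without any screening. The sketch below shows one possible pre-write check. The function name `validate_fact` and the patterns are hypothetical, not taken from the codebase, and a regex screen like this is only a first line of defence:

```python
import re

# Hypothetical patterns; tune them to the secrets the project actually handles.
_SENSITIVE_PATTERNS = [
    # key/value style secrets such as "api_key=..." or "password: ..."
    re.compile(r"(?i)\b(api[_-]?key|secret|password|passwd|token)\b\s*[:=]"),
    # filesystem paths that mention a token, secret, or credential
    re.compile(r"(?i)[~/\\][\w./\\-]*(token|secret|credential)[\w./\\-]*"),
    # long opaque strings that look like keys or hashes
    re.compile(r"\b[A-Za-z0-9_\-]{32,}\b"),
]


def validate_fact(text: str) -> str:
    """Return the whitespace-normalised fact, or raise ValueError if it is
    empty or matches one of the sensitive-data patterns."""
    cleaned = " ".join(text.split())
    if not cleaned:
        raise ValueError("empty fact")
    for pattern in _SENSITIVE_PATTERNS:
        if pattern.search(cleaned):
            raise ValueError("fact appears to contain sensitive data")
    return cleaned
```

Rejecting at write time keeps the store clean, whereas filtering at read time would still leave the secret on disk.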

Current Data State

data/memory.db:
  episodes: 37 rows (28 context_type='fact', 9 other)
  chunks: 37 rows (vault document fragments)
  facts: 0 rows (table exists but nothing writes to it)
  
MEMORY.md: stale since March 8, corrupted fields
memory/self/user_profile.md: Name: "Always" (corrupted)
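
Counts like these can be re-checked before and after any migration. The following is a rough sketch using only the standard `sqlite3` module. The table names and the `context_type` column come from this issue, while the helper names are made up for illustration:

```python
import sqlite3


def table_counts(conn, tables=("episodes", "chunks", "facts")):
    """Map each table name to its row count, or None if the table is missing."""
    counts = {}
    for name in tables:
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (name,),
        ).fetchone()
        if exists is None:
            counts[name] = None
            continue
        # Names come from the fixed tuple above, never from user input.
        counts[name] = conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
    return counts


def episode_breakdown(conn):
    """Row counts in the episodes table, grouped by context_type."""
    rows = conn.execute(
        "SELECT context_type, COUNT(*) FROM episodes GROUP BY context_type"
    ).fetchall()
    return dict(rows)
```

Calling `table_counts(sqlite3.connect("data/memory.db"))` would reproduce the per-table figures listed above, assuming the database sits at that path.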

Acceptance Criteria

  • Single memories table in SQLite replaces episodes + facts + chunks
  • One write path, one search path, one embedding path
  • Hot memory is a computed view (top N facts by recency/access), not a separate file
  • memory_write stores actual facts with validation
  • memory_read/search returns coherent results from one source
  • No leaked sensitive info stored as facts
  • Old data migrated or cleanly reset
  • All callers updated: agent.py, thinking.py, session.py, tools.py, dashboard routes
  • Tests pass
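
To make the first three criteria concrete, here is a minimal sketch of a single `memories` table with hot memory computed as a query rather than kept in a file. Every column name, the `kind` values, and the helper functions are assumptions for illustration. The issue only fixes the table name and the "top N facts by recency/access" rule:

```python
import sqlite3
import time

# Assumed schema: one row per memory, with `kind` distinguishing what used to
# live in the separate episodes, facts, and chunks tables.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memories (
    id            INTEGER PRIMARY KEY,
    kind          TEXT    NOT NULL,
    content       TEXT    NOT NULL,
    embedding     BLOB,
    created_at    REAL    NOT NULL,
    last_accessed REAL    NOT NULL,
    access_count  INTEGER NOT NULL DEFAULT 0
)
"""


def open_store(path=":memory:"):
    """Open (or create) the store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def write_memory(conn, content, kind="fact", now=None):
    """The single write path: insert one row and return its id."""
    ts = time.time() if now is None else now
    cur = conn.execute(
        "INSERT INTO memories (kind, content, created_at, last_accessed) "
        "VALUES (?, ?, ?, ?)",
        (kind, content, ts, ts),
    )
    conn.commit()
    return cur.lastrowid


def hot_memory(conn, limit=10):
    """Hot memory as a computed view: the most recently used facts."""
    rows = conn.execute(
        "SELECT content FROM memories WHERE kind = 'fact' "
        "ORDER BY last_accessed DESC, access_count DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]
```

Because `hot_memory` reads from the same table that `write_memory` fills, there is no second copy that can go stale the way MEMORY.md did.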

Files to Change

  • src/timmy/memory_system.py — rewrite as single system
  • src/timmy/semantic_memory.py — merge embedding logic, delete rest
  • src/timmy/memory/vector_store.py — merge into memory_system, delete
  • src/timmy/memory/unified.py — merge schema, delete
  • src/timmy/memory/__init__.py — delete package
  • src/timmy/thinking.py — fix _distill_facts_from_thoughts
  • src/timmy/agent.py — update context injection
  • src/timmy/tools.py — update memory tool functions
  • src/dashboard/routes/memory.py — update imports
  • src/dashboard/app.py — update prune_memories import

Priority: HIGH — foundational


Triage: Breaking This Into Phases

This issue is too large for one dev cycle (10+ files, 1000+ lines). Decomposing:

  • Phase 1: Filed as new issue — identify and remove dead code in memory_system.py and semantic_memory.py
  • Phase 2: (after phase 1) Merge the remaining live functions into a single API surface in memory/unified.py
  • Phase 3: (after phase 2) Update all consumers (agent.py, thinking.py, tools.py, dashboard routes) to use the unified API
  • Phase 4: Delete the old files

This parent issue stays open as the tracker. Sub-issues will reference it.

[triage-generated]


Decomposition Plan

This issue is too large for a single PR (10+ files, 1392 lines across 5 memory modules). Breaking into phases:

  1. Phase 0 — Break circular imports (#164): done, merged in PR #193 ("[loop-cycle-53] refactor: break circular imports between packages (#164)")
  2. Phase 1 — Merge tables: consolidate episodes + facts + chunks into single memories table with migration
  3. Phase 2 — Merge modules: unify memory_system.py + semantic_memory.py + memory/vector_store.py + memory/unified.py into one memory_system.py
  4. Phase 3 — Update all callers: agent.py, thinking.py, tools.py, session.py, loop_qa.py, dashboard routes
  5. Phase 4 — Delete dead code: remove memory/ package, update hot memory to computed view

Tracking sub-issues below.
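
Phase 1 could be approached roughly as follows. This is a sketch only. The real column layouts of `episodes`, `chunks`, and `facts` are not given in this issue, so the code assumes each has a `content` column and that `episodes` also has `context_type`. The `source_table` column and the `migrate_to_memories` name are likewise hypothetical:

```python
import sqlite3


def migrate_to_memories(conn):
    """Copy rows from the three legacy tables into one memories table.

    Returns a dict of rows present per source table after the copy. The old
    tables are left in place; dropping them would be a separate, later step.
    Running this twice duplicates rows, so it is not idempotent as written.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        " id INTEGER PRIMARY KEY,"
        " kind TEXT NOT NULL,"
        " content TEXT NOT NULL,"
        " source_table TEXT NOT NULL)"
    )
    with conn:  # commit all three copies together, or roll back on error
        conn.execute(
            "INSERT INTO memories (kind, content, source_table) "
            "SELECT COALESCE(context_type, 'episode'), content, 'episodes' "
            "FROM episodes"
        )
        conn.execute(
            "INSERT INTO memories (kind, content, source_table) "
            "SELECT 'chunk', content, 'chunks' FROM chunks"
        )
        conn.execute(
            "INSERT INTO memories (kind, content, source_table) "
            "SELECT 'fact', content, 'facts' FROM facts"
        )
    rows = conn.execute(
        "SELECT source_table, COUNT(*) FROM memories GROUP BY source_table"
    ).fetchall()
    return dict(rows)
```

Recording `source_table` on each row makes it possible to compare the result against the pre-migration counts before any old table is removed.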


Reference: Rockachopa/Timmy-time-dashboard#37