[loop-generated] [refactor] Split memory_system.py — 1577 lines violates single responsibility #344

Closed
opened 2026-03-19 00:44:32 +00:00 by hermes · 3 comments
Collaborator

Problem

src/timmy/memory_system.py is the largest file in the codebase at 1577 lines. It contains:

  • Embedding functions (hash fallback, cosine similarity)
  • Database operations (SQLite CRUD)
  • HotMemory class (MEMORY.md file operations)
  • VaultMemory class (filesystem vault)
  • MemorySystem orchestrator
  • SemanticMemory class (vector search)
  • MemorySearcher class
  • Tool functions (memory_search, memory_read, memory_write, memory_forget)
  • Artifact tools (jot_note, log_decision) — recently added

These are at least 5 distinct concerns in one file.

Proposed split

src/timmy/memory/
    __init__.py          — public API re-exports
    embeddings.py        — embed_text, cosine_similarity, hash fallback
    database.py          — SQLite operations, store/search/delete
    hot.py               — HotMemory (MEMORY.md management)
    vault.py             — VaultMemory (filesystem)
    system.py            — MemorySystem orchestrator
    semantic.py          — SemanticMemory, MemorySearcher
    tools.py             — memory_search/read/write/forget tool functions
    artifacts.py         — jot_note, log_decision

Note: src/timmy/memory/ already exists with unified.py and vector_store.py. This refactor would consolidate both the old and new memory code into one coherent package.

Risk

  • High. Memory is on the critical path. Extensive test coverage exists, but every import would change.
  • Recommend: do it in phases. Extract one module at a time, update imports, verify tests.
  • Phase 1: Extract embeddings.py (pure functions, no side effects)
  • Phase 2: Extract artifacts.py (new code, few importers)
  • Phase 3-N: One module at a time
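
One way to keep existing imports working while modules move one at a time is a forwarding shim at the old import path. The sketch below is only an illustration of that idea, not code from the repository: `install_shim` and the `demo_*` module names are hypothetical, and it relies on module-level `__getattr__` (PEP 562, Python 3.7+).

```python
import sys
import types
import warnings


def install_shim(old_name: str, new_module: types.ModuleType, names: list[str]) -> types.ModuleType:
    """Register a module under `old_name` that forwards `names` to `new_module`.

    Hypothetical helper: each forwarded lookup emits a DeprecationWarning so
    callers can be migrated to the new import path gradually.
    """
    shim = types.ModuleType(old_name)

    def _forward(attr: str):
        # PEP 562: invoked only when normal attribute lookup on the module fails.
        if attr in names:
            warnings.warn(
                f"{old_name}.{attr} has moved to {new_module.__name__}.{attr}",
                DeprecationWarning,
                stacklevel=2,
            )
            return getattr(new_module, attr)
        raise AttributeError(f"module {old_name!r} has no attribute {attr!r}")

    shim.__getattr__ = _forward
    sys.modules[old_name] = shim
    return shim


if __name__ == "__main__":
    # Stand-in for an extracted module; the real one would live on disk.
    extracted = types.ModuleType("demo_embeddings")
    extracted.cosine_similarity = lambda a, b: 1.0

    install_shim("demo_memory_system", extracted, ["cosine_similarity"])

    # Old-style import still resolves, but warns about the new location.
    from demo_memory_system import cosine_similarity
    print(cosine_similarity([1.0], [1.0]))
```

In a real phase, the old file itself would play the role of the shim, re-exporting each moved name until all importers have been updated and the tests pass against the new paths.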

Who

Multi-ticket work. Architecture decision needed from Alexander.

Why

Lines of code are a liability. 1577-line files are hard to navigate, test, and reason about.

Author
Collaborator

Architecture Decision: Phased Extraction

This is too large for a single PR. Breaking into phases:

Phase 1: Extract embeddings.py (Lines 34-112, ~78 lines)

  • Pure functions, zero side effects, lowest risk
  • , , , , , ,
  • Create
  • Update imports in to re-export from new location
  • Update and backward compat shims
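
To show why pure functions are the lowest-risk extraction, here is a minimal sketch of what a hash-fallback embedding and a cosine similarity could look like. The thread does not show the real implementations, so everything here is an assumption: `hash_embed`, the 128-dimension default, and the SHA-256 token hashing are illustrative choices; only the name `cosine_similarity` comes from the proposal.

```python
import hashlib
import math


def hash_embed(text: str, dim: int = 128) -> list[float]:
    """Deterministic bag-of-words hashing embedding (hypothetical fallback).

    Each whitespace-separated token is hashed into one of `dim` buckets and
    the resulting count vector is L2-normalised.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        vec[index] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Empty input yields the zero vector, which is returned unchanged.
    return [v / norm for v in vec] if norm else vec


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is zero."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Functions of this shape depend only on their arguments and the standard library, so moving them to a new module cannot change behaviour as long as callers can still import them.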

Phase 2: Extract artifacts.py (Lines 1406-1482, ~76 lines)

  • , — recently added, few importers
  • Create
  • Only imports these directly

Phase 3-N: One module per PR (future)

  • hot.py (HotMemory), vault.py (VaultMemory), semantic.py, crud.py, tools.py, system.py

Unassigning from kimi. Creating scoped sub-issues.

Author
Collaborator

Obsolete: memory_system.py belongs to old codebase being retired in Claude Code pivot.

Author
Collaborator

Progress update (cycle 151):

PR #355 merged — extracted embedding functions to memory/embeddings.py.

memory_system.py reduced from 1577 → 1507 lines (-70 lines).

Remaining extractions (in priority order):

  1. SemanticMemory class (~270 lines, L896-1165) → memory/semantic.py
  2. HotMemory class (~160 lines, L611-770) → memory/hot.py
  3. VaultMemory class (~115 lines, L775-890) → memory/vault.py
  4. Tool functions (~160 lines, L1176-1335) → memory/tools.py
  5. Artifact tools (~70 lines, L1344-1412) → memory/artifacts.py
  6. Database/schema (~140 lines, L50-192) → memory/database.py

Total remaining: ~915 lines extractable. Target: memory_system.py under 500 lines.
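
The totals above can be cross-checked from the quoted line ranges. The snippet below is a quick arithmetic sketch using only the numbers in this comment; the variable and function names are illustrative, and it assumes each range is inclusive and that nothing else is added or removed during extraction.

```python
CURRENT_LINES = 1507  # size of memory_system.py after PR #355
TARGET_LINES = 500

# (start, end) line ranges as quoted in the list above, treated as inclusive.
remaining = {
    "memory/semantic.py": (896, 1165),
    "memory/hot.py": (611, 770),
    "memory/vault.py": (775, 890),
    "memory/tools.py": (1176, 1335),
    "memory/artifacts.py": (1344, 1412),
    "memory/database.py": (50, 192),
}


def span_length(start: int, end: int) -> int:
    """Number of lines in an inclusive range."""
    return end - start + 1


extractable = sum(span_length(s, e) for s, e in remaining.values())
left_over = CURRENT_LINES - extractable

print(f"extractable: {extractable} lines")
print(f"left in memory_system.py: {left_over} lines")
print(f"under target of {TARGET_LINES}: {left_over < TARGET_LINES}")
```

On these figures the ranges sum to 918 lines, in line with the ~915 estimate, and the file would land at roughly 590 lines. Getting under 500 would presumably need additional extraction beyond the six items listed, such as the MemorySystem orchestrator or MemorySearcher.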

Reference: Rockachopa/Timmy-time-dashboard#344