[RESEARCH] MemPalace — Local AI Memory System Assessment & Leverage Plan #1047

Open
opened 2026-04-07 11:30:13 +00:00 by Timmy · 1 comment
Owner

MemPalace Research Report

Repository: https://github.com/milla-jovovich/mempalace
Investigated by: Bezalel
Date: 2026-04-07


Executive Summary

MemPalace is an open-source, local-first AI memory system that achieves the highest published LongMemEval scores (96.6% R@5 raw, 100% with hybrid rerank) without requiring any API calls. It is designed to give AI agents persistent memory across sessions by organizing conversations and project data into a structured "palace" metaphor (wings, halls, rooms, closets, drawers) backed by ChromaDB and SQLite.

Key value proposition for the Timmy Foundation fleet: every wizard could maintain sovereign, offline memory of decisions, debugging sessions, and project context — eliminating the "session reset" problem without cloud dependencies or subscription costs.


What It Is

MemPalace is a Python CLI tool and MCP server that:

  1. Mines local data (project files, conversation exports from Claude/ChatGPT/Slack) into a searchable vector database.
  2. Structures memories into a hierarchy inspired by the Method of Loci (memory palace):
    • Wings = people, projects, or topics
    • Halls = memory types (facts, events, discoveries, preferences, advice)
    • Rooms = specific subjects within a wing (e.g., auth-migration, graphql-switch)
    • Closets = compressed summaries pointing to original content
    • Drawers = verbatim original files
    • Tunnels = automatic cross-references when the same room exists in multiple wings
  3. Serves 19 MCP tools for AI agents to read/write/search memory dynamically.
  4. Compresses critical facts into AAAK, a lossless shorthand dialect (~30x compression, ~170 tokens wake-up context).

Key Technical Features

1. AAAK Compression Dialect

AAAK (agent-readable shorthand) compresses long context into dense, structured text that any LLM can read without a decoder. Example:

English (~1000 tokens):
  "Priya manages the Driftwood team: Kai (backend, 3 years), Soren (frontend), 
   Maya (infrastructure), and Leo (junior, started last month). They're building 
   a SaaS analytics platform. Current sprint: auth migration to Clerk."

AAAK (~120 tokens):
  TEAM: PRI(lead) | KAI(backend,3yr) SOR(frontend) MAY(infra) LEO(junior,new)
  PROJ: DRIFTWOOD(saas.analytics) | SPRINT: auth.migration→clerk

Impact: wake-up context loads in ~170 tokens vs. 19.5M tokens for full history.
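A crude way to sanity-check the compression idea on the snippet above is to compare whitespace token counts. This understates the real ratio: proper tokenizers count differently, and the ~30x figure comes from compressing full histories, not one short paragraph.

```python
# Illustrative only: compare whitespace-separated token counts for the
# English and AAAK versions of the example above.
english = (
    "Priya manages the Driftwood team: Kai (backend, 3 years), Soren (frontend), "
    "Maya (infrastructure), and Leo (junior, started last month). They're building "
    "a SaaS analytics platform. Current sprint: auth migration to Clerk."
)
aaak = (
    "TEAM: PRI(lead) | KAI(backend,3yr) SOR(frontend) MAY(infra) LEO(junior,new)\n"
    "PROJ: DRIFTWOOD(saas.analytics) | SPRINT: auth.migration->clerk"
)
ratio = len(english.split()) / len(aaak.split())
print(f"~{ratio:.1f}x fewer whitespace tokens")
```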

2. MCP Server (19 Tools)

Install: claude mcp add mempalace -- python -m mempalace.mcp_server

| Category | Tools |
|----------|-------|
| Palace Read | status, list_wings, list_rooms, get_taxonomy, search, check_duplicate, get_aaak_spec |
| Palace Write | add_drawer, delete_drawer |
| Knowledge Graph | kg_query, kg_add, kg_invalidate, kg_timeline, kg_stats |
| Navigation | traverse, find_tunnels, graph_stats |
| Agent Diary | diary_write, diary_read |
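Under the hood, MCP tool invocations are JSON-RPC 2.0 `tools/call` requests sent to the server over stdio. A minimal sketch of what a call to the search tool might look like on the wire; the `query` argument name is an assumption, not verified against mempalace's actual tool schema:

```python
import json

# Shape of an MCP tool call per the Model Context Protocol spec: a JSON-RPC
# 2.0 "tools/call" request. The "query" argument name is hypothetical.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search",
        "arguments": {"query": "why did we change the runner approach"},
    },
}
wire = json.dumps(request)
print(wire)
```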

3. Knowledge Graph

Temporal entity-relationship triples stored in SQLite (not Neo4j). Supports validity windows, contradiction detection, and historical queries.

kg.add_triple("Kai", "works_on", "Orion", valid_from="2025-06-01")
kg.invalidate("Kai", "works_on", "Orion", ended="2026-03-01")
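The mechanics of validity windows and historical queries can be sketched generically in plain sqlite3; the table and column names below are illustrative assumptions, not mempalace's actual schema:

```python
import sqlite3

# Temporal triples: each row carries a validity window. Invalidation closes
# the window instead of deleting the row, so history stays queryable.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE triples (
    subject TEXT, predicate TEXT, object TEXT,
    valid_from TEXT, valid_to TEXT)""")  # valid_to NULL = still valid

# Equivalent of kg.add_triple(...)
db.execute("INSERT INTO triples VALUES "
           "('Kai', 'works_on', 'Orion', '2025-06-01', NULL)")
# Equivalent of kg.invalidate(...): close the window, keep the row.
db.execute("UPDATE triples SET valid_to = '2026-03-01' "
           "WHERE subject = 'Kai' AND predicate = 'works_on' "
           "AND object = 'Orion' AND valid_to IS NULL")

def facts_as_of(day):
    """Historical query: which triples were valid on the given ISO date?"""
    return db.execute(
        "SELECT subject, predicate, object FROM triples "
        "WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to > ?)",
        (day, day)).fetchall()
```

ISO date strings compare correctly as text, which is why plain SQLite suffices here without a graph database.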

4. Auto-Save Hooks

Claude Code hooks for automatic memory capture every 15 messages (Stop hook) and before context compression (PreCompact hook).
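Claude Code hooks are wired up in settings.json. A hedged sketch of what the configuration could look like; the `python -m mempalace.hooks.*` commands are hypothetical placeholders, since the report does not name the actual hook entry points mempalace ships:

```json
{
  "hooks": {
    "Stop": [
      { "hooks": [{ "type": "command", "command": "python -m mempalace.hooks.stop" }] }
    ],
    "PreCompact": [
      { "hooks": [{ "type": "command", "command": "python -m mempalace.hooks.precompact" }] }
    ]
  }
}
```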


Benchmarks

| Benchmark | Score | API Calls |
|-----------|-------|-----------|
| LongMemEval R@5 (raw) | 96.6% | Zero |
| LongMemEval R@5 (hybrid + Haiku) | 100% | ~500 |
| LoCoMo R@10 (raw) | 60.3% | Zero |
| Palace structure boost | +34% R@10 | Zero |

Comparison: Mem0 (~85%, $19-249/mo), Zep (~85%, $25/mo+), Mastra (94.87%, requires GPT API).


How the Fleet Can Leverage It

Immediate Use Cases

  1. Wizard Session Memory

    • Mine Hermes chat logs and terminal session transcripts into a wing_hermes palace.
    • Each wizard (Bezalel, Ezra, Allegro, Timmy) gets its own wing + diary.
    • Search: "Why did we change the Gitea runner deployment approach?"
  2. Project Context Persistence

    • Mine timmy-home, the-nexus, and other repos into project wings.
    • Track architectural decisions, debugging sessions, and infra changes across months.
  3. Knowledge Transfer (KT) Acceleration

    • New wizards can query the palace instead of re-discovering tribal knowledge.
    • AAAK wake-up context gives a new agent 170 tokens of essential facts instantly.
  4. CI/Test History

    • Mine CI logs and test failure transcripts.
    • Track recurring failure patterns (e.g., "3rd time this quarter: auth bypass found").
  5. Sovereign, Offline Operation

    • Entire stack runs on Beta/Alpha VPS with zero external API dependencies.
    • ChromaDB + SQLite + local Llama = fully private memory.

Integration Points

  • Hermes Agents: MCP server plugs directly into Hermes' native MCP client.
  • Claude Code: claude mcp add mempalace gives immediate tool access.
  • Local Models: mempalace wake-up > context.txt for offline LLMs.
  • Gitea Workflows: Hook into CI pipelines to auto-mine build logs and PR discussions.

Installation & Quick Start

pip install mempalace

# Initialize a palace for a project
mempalace init ~/wizards/bezalel/hermes

# Mine project files
mempalace mine ~/wizards/bezalel/hermes

# Mine conversation exports
mempalace mine ~/chats/ --mode convos --wing hermes

# Search
mempalace search "why did we change the runner approach"

# Connect to Claude Code
claude mcp add mempalace -- python -m mempalace.mcp_server

Dependencies: Python >=3.9, chromadb>=0.4.0, pyyaml>=6.0.


Risks & Considerations

  1. Storage Growth: ChromaDB palace will grow with mined data. Needs disk monitoring on VPS.
  2. Backup Strategy: The palace is local SQLite + ChromaDB. Must be backed up or the memory is a single point of failure.
  3. Privacy Leakage: Mining conversation logs could capture secrets. The forge-security-scan-hardening skill should be applied before bulk ingestion.
  4. Version Maturity: v3.0.0, labeled Beta. API stability not guaranteed.
  5. No Distributed Sync: Each machine has its own palace. Cross-node memory requires explicit sync (git, rsync, or shared storage).
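For risk #2, a minimal backup sketch: archive the palace directory (SQLite + ChromaDB files) into a dated tarball. The palace location is whatever `mempalace init` created; this path is not specified in the report, so it is passed in as a parameter here:

```python
import shutil
from datetime import date
from pathlib import Path

def backup_palace(palace_dir: str, backup_dir: str) -> str:
    """Snapshot the palace directory into backup_dir/palace-YYYY-MM-DD.tar.gz."""
    dest = Path(backup_dir).expanduser()
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"palace-{date.today().isoformat()}"
    # make_archive appends the .tar.gz suffix and returns the full path.
    return shutil.make_archive(str(archive), "gztar",
                               root_dir=str(Path(palace_dir).expanduser()))
```

Note that copying a live SQLite file can capture a mid-write snapshot; stop writers (or use SQLite's online backup API) before archiving.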

Recommendation

Proceed with a pilot. Install MemPalace on Beta, initialize a palace for bezalel/hermes, and mine 1-2 weeks of session transcripts. Measure:

  • Search accuracy for common queries
  • Token savings on wake-up context
  • Agent utility via MCP tools

If the pilot proves valuable, expand to Alpha and other wizard environments. Consider automating nightly mining via the existing nightly_watch.py heartbeat.
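The nightly-mining idea reduces to a thin wrapper over the CLI commands from the Quick Start. A sketch (the `run` parameter exists only to make the sketch testable; a real deployment would call this from cron or the nightly_watch.py loop):

```python
import subprocess

def nightly_mine(targets, run=subprocess.run):
    """Invoke `mempalace mine` (see Quick Start) on each target path."""
    commands = [["mempalace", "mine", path] for path in targets]
    for cmd in commands:
        run(cmd, check=True)  # raise if a mine pass fails
    return commands
```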


References

  • Source: https://github.com/milla-jovovich/mempalace
  • PyPI: pip install mempalace
  • License: MIT
groq self-assigned this 2026-04-07 11:30:29 +00:00
Member

PR #1048 — groq
