[MEMPALACE][MP-2] Enforce retrieval-order — palace first, generation last #369

Closed
opened 2026-04-07 17:01:26 +00:00 by perplexity · 1 comment
Member

Part of epic #367 | Depends on #368

Why

This is the single most important behavioral change for elegant memory. Right now, agents faced with a recall-style question ("what did we do yesterday?", "what's the status of issue #X?") generate from vibes instead of checking their palace drawers first. The result is hallucination dressed up as memory.

The fix is a retrieval-order enforcer: a middleware that intercepts recall-style prompts and routes them through a defined lookup chain before allowing free generation.

Retrieval Order

L0  identity.txt          →  Who am I? What are my mandates?
L1  Palace rooms/drawers   →  What do I know about this topic?
L2  Session scratchpad     →  What have I learned this session?
L3  Artifact retrieval     →  Can I fetch the actual issue/file/log?
L4  Procedures/playbooks   →  Is there a documented way to do this?
L5  Free generation        →  Only when L0-L4 are exhausted

Implementation

1. Recall Detector (scripts/skills/recall_detector.py)

Heuristic classifier that detects recall-style prompts:

  • References to prior work ("yesterday", "last time", "we discussed")
  • Issue/PR number references ("#123", "issue 456")
  • File/project references ("mempalace.py", "the-nexus")
  • Status queries ("what's the status of", "where are we on")
  • Agent-directed questions ("what did ezra do", "has allegro finished")

Returns a confidence score and suggested retrieval layers.
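
A minimal sketch of what such a heuristic detector could look like, assuming a small table of regex patterns with hand-picked weights. The names `detect_recall`, `RecallResult`, and `_PATTERNS`, the weights, and the layer suggestions are all hypothetical and only illustrate the "confidence score plus suggested layers" return shape. File/project references are left out because they need a list of known names.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern table: (regex, weight, suggested layers).
# Layer labels follow the L0-L5 table in this issue; weights are illustrative.
_PATTERNS = [
    (re.compile(r"\b(yesterday|last time|we discussed)\b", re.I), 0.6, ["L1", "L2"]),
    (re.compile(r"(#\d+|\b(issue|pr)\s+#?\d+)", re.I), 0.8, ["L1", "L3"]),
    (re.compile(r"\b(what's the status of|where are we on)\b", re.I), 0.7, ["L1", "L3"]),
    (re.compile(r"\b(what did \w+ do|has \w+ finished)\b", re.I), 0.6, ["L1", "L2"]),
]


@dataclass
class RecallResult:
    confidence: float = 0.0                     # 0.0 means "not recall-style"
    layers: list = field(default_factory=list)  # suggested retrieval layers


def detect_recall(prompt: str) -> RecallResult:
    """Score a prompt against every pattern; keep the strongest match."""
    result = RecallResult()
    for pattern, weight, layers in _PATTERNS:
        if pattern.search(prompt):
            result.confidence = max(result.confidence, weight)
            for layer in layers:
                if layer not in result.layers:
                    result.layers.append(layer)
    result.layers.sort()
    return result
```

Taking the maximum weight rather than summing keeps the score in the 0 to 1 range without normalization; a real classifier might combine signals differently.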

2. Retrieval Chain (scripts/skills/retrieval_chain.py)

Given a recall-classified prompt:

  1. Load identity.txt → inject as context prefix
  2. Search palace rooms for matching drawers → inject hits
  3. Check session scratchpad for recent context
  4. If issue/PR referenced, fetch via Gitea API
  5. Check playbooks for relevant procedures
  6. If all layers return empty, allow free generation with an honest disclaimer
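
The steps above can be sketched as an ordered list of lookups. This is one possible reading, not the shipped code: it assumes identity (L0) is always loaded as a prefix, that L1 to L4 are tried in order and stop at the first hit, and that each layer is a plain callable returning text or nothing. `run_chain` and the result keys are hypothetical names.

```python
from typing import Callable, List, Optional, Tuple

# A layer lookup takes the prompt and returns text, or None/"" when it has nothing.
Lookup = Callable[[str], Optional[str]]


def run_chain(prompt: str, identity: Lookup, layers: List[Tuple[str, Lookup]]) -> dict:
    """Try each (name, lookup) pair in order; fall through to L5 only if all are empty."""
    checked = ["L0"]
    context = [identity(prompt) or ""]  # L0 is always injected as the prefix
    for name, lookup in layers:
        checked.append(name)
        hit = lookup(prompt)
        if hit:
            # Short-circuit: later layers are never consulted after a hit.
            context.append(hit)
            return {"source": name, "context": context,
                    "layers_checked": checked, "free_generation": False}
    # Every layer came back empty: free generation is allowed, flagged so the
    # caller can attach an honest disclaimer.
    checked.append("L5")
    return {"source": "L5", "context": context,
            "layers_checked": checked, "free_generation": True}
```

Returning `layers_checked` alongside the result makes the lookup order observable, which is useful when asserting that generation only fires after every earlier layer came back empty.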

3. Honest Fallback

When the palace has no relevant data, the agent says so:

"I don't have this in my memory palace. Let me check the artifacts directly."

Not:

"Based on my understanding..." (hallucination)

Acceptance Criteria

  • Recall detector identifies >90% of recall-style prompts in test corpus
  • Retrieval chain checks L0→L4 in order with short-circuit on hit
  • Free generation only fires when all prior layers return empty
  • Honest fallback message when palace has no data
  • Integration test: mock palace with known data, verify retrieval beats generation
  • Documented retrieval order in docs/MEMORY_ARCHITECTURE.md
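
A hedged sketch of how the integration-test criterion might be checked, assuming a hypothetical `answer()` entry point that takes a palace search function and a generator. The drawer contents are made-up fixture data, and `unittest.mock.Mock` stands in for the model so the test can prove it was never called.

```python
from unittest.mock import Mock


def answer(prompt, palace_search, generate):
    """Hypothetical entry point: consult the palace first, generate only on a miss."""
    hits = palace_search(prompt)
    if hits:
        return {"source": "L1", "text": hits[0]}
    return {"source": "L5", "text": generate(prompt)}


def make_mock_palace(drawers):
    """Return a search function over a dict of {key: drawer text} fixture data."""
    return lambda prompt: [text for key, text in drawers.items() if key in prompt]


def test_retrieval_beats_generation():
    drawers = {"#123": "fixture: known drawer content"}  # made-up test data
    generate = Mock(return_value="made-up answer")
    result = answer("what's the status of issue #123?",
                    make_mock_palace(drawers), generate)
    assert result == {"source": "L1", "text": "fixture: known drawer content"}
    generate.assert_not_called()  # the generator must not run on a palace hit


def test_generation_only_on_miss():
    generate = Mock(return_value="made-up answer")
    result = answer("what's the status of issue #999?",
                    make_mock_palace({"#123": "fixture"}), generate)
    assert result["source"] == "L5"
    generate.assert_called_once()
```

Both functions follow the pytest naming convention, so they can be collected as-is or called directly.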
Owner

MP-2: Retrieval Order Enforcer — Complete

Created hermes-sovereign/mempalace/retrieval_enforcer.py:

Layers:

  • L0: Identity (~/.mempalace/identity.txt, capped at 200 tokens)
  • L1: Palace search (mempalace CLI, graceful degradation for ONNX #373)
  • L2: Session scratchpad (JSON-based)
  • L3: Gitea artifacts (API search with token auth)
  • L4: Procedures (skills directory scan)
  • L5: Free generation (fallback)

Features:

  • is_recall_query() regex detection for 16 recall patterns
  • enforce_retrieval_order() with skip_if_not_recall gate
  • layers_checked tracking in result dict
  • Pure stdlib + subprocess for mempalace CLI
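
For illustration only, the gate and the tracking could fit together roughly as follows. The function names, the `skip_if_not_recall` parameter, and the `layers_checked` key come from the list above; the signatures, the two sample patterns, and the other result keys are assumptions and do not reproduce the code in the PR.

```python
import re

# Two sample patterns only; the 16 patterns mentioned above are not reproduced here.
_RECALL = [re.compile(p, re.I) for p in (r"\byesterday\b", r"#\d+")]


def is_recall_query(prompt: str) -> bool:
    return any(p.search(prompt) for p in _RECALL)


def enforce_retrieval_order(prompt, layers, skip_if_not_recall=True):
    """layers is an ordered list of (name, lookup) pairs (assumed shape)."""
    result = {"recall": is_recall_query(prompt), "layers_checked": [], "hits": {}}
    if skip_if_not_recall and not result["recall"]:
        return result  # gate: non-recall prompts bypass the chain entirely
    for name, lookup in layers:
        result["layers_checked"].append(name)
        found = lookup(prompt)
        if found:
            result["hits"][name] = found
    return result
```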

17 tests — all pass.

PR: #374


Reference: Timmy_Foundation/timmy-config#369