Implement three-tier memory architecture (Hot/Vault/Handoff)

This commit replaces the previous memory_layers.py with a proper three-tier
memory system as specified by the user:

## Tier 1 — Hot Memory (MEMORY.md)
- Single flat file always loaded into system context
- Contains: current status, standing rules, agent roster, key decisions
- ~300 lines max, pruned monthly
- Managed by HotMemory class

## Tier 2 — Structured Vault (memory/)
- Directory with three namespaces:
  • self/ — identity.md, user_profile.md, methodology.md
  • notes/ — session logs, AARs, research
  • aar/ — post-task retrospectives
- Markdown format, Obsidian-compatible
- Append-only, date-stamped
- Managed by VaultMemory class
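
The append-only, date-stamped convention can be sketched in a few lines. This is a standalone illustration, not the committed `VaultMemory` code; `write_note` here is a simplified stand-in:

```python
from datetime import datetime, timezone
from pathlib import Path
import tempfile

def write_note(vault: Path, name: str, content: str, namespace: str = "notes") -> Path:
    """Write an append-only, date-stamped markdown note (simplified sketch)."""
    (vault / namespace).mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d")   # date prefix keeps files sorted
    path = vault / namespace / f"{stamp}_{name}.md"
    path.write_text(f"# {name.replace('_', ' ').title()}\n\n{content}\n")
    return path

note = write_note(Path(tempfile.mkdtemp()), "session_log", "Did things.")
```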

## Handoff Protocol
- last-session-handoff.md written at session end
- Contains: summary, key decisions, open items, next steps
- Auto-loaded at next session start
- Maintains continuity across resets
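
A minimal standalone sketch of what a handoff writer produces, with field names following the bullets above (illustration only, not the committed implementation):

```python
from datetime import datetime, timezone

def render_handoff(summary: str, decisions: list[str],
                   open_items: list[str], next_steps: list[str]) -> str:
    """Render a last-session-handoff.md body (simplified sketch)."""
    def bullets(items: list[str], prefix: str = "- ") -> str:
        # Empty lists render as an explicit "(none)" marker
        return "\n".join(f"{prefix}{item}" for item in items) or "- (none)"
    return (
        f"# Last Session Handoff\n"
        f"**Session End:** {datetime.now(timezone.utc).isoformat()}\n\n"
        f"## Summary\n{summary}\n\n"
        f"## Key Decisions\n{bullets(decisions)}\n\n"
        f"## Open Items\n{bullets(open_items, '- [ ] ')}\n\n"
        f"## Next Steps\n{bullets(next_steps)}\n"
    )
```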

## Implementation

### New Files:
- src/timmy/memory_system.py — Core memory system
- MEMORY.md — Hot memory template
- memory/self/*.md — Identity, user profile, methodology

### Modified:
- src/timmy/agent.py — Integrated with memory system
  - create_timmy() injects memory context
  - TimmyWithMemory class with automatic fact extraction
- tests/test_agent.py — Updated for memory context

## Key Principles
- Hot memory = small and curated
- Vault = append-only, never delete
- Handoffs = continuity mechanism
- Flat files = human-readable, portable

## Testing

All 973 tests pass.
Author: Alexander Payne
Date: 2026-02-25 18:17:43 -05:00
Parent: 625806daf5
Commit: 7838df19b0
7 changed files with 781 additions and 54 deletions

MEMORY.md (new file, 84 lines)

@@ -0,0 +1,84 @@
# Timmy Hot Memory
> Working RAM — always loaded, ~300 lines max, pruned monthly
> Last updated: 2026-02-25
---
## Current Status
**Agent State:** Operational
**Mode:** Development
**Active Tasks:** 0
**Pending Decisions:** None
---
## Standing Rules
1. **Sovereignty First** — No cloud dependencies, no data exfiltration
2. **Local-Only Inference** — Ollama on localhost, Apple Silicon optimized
3. **Privacy by Design** — Telemetry disabled, secrets in .env only
4. **Tool Minimalism** — Use tools only when necessary, prefer direct answers
5. **Memory Discipline** — Write handoffs at session end, prune monthly
---
## Agent Roster
| Agent | Role | Status | Capabilities |
|-------|------|--------|--------------|
| Timmy | Core | Active | chat, reasoning, planning |
| Echo | Research | Standby | web_search, file_read |
| Forge | Code | Standby | shell, python, git |
| Seer | Data | Standby | python, analysis |
| Helm | DevOps | Standby | shell, deployment |
---
## User Profile
**Name:** TestUser
## Key Decisions
- **2026-02-25:** Implemented 3-tier memory architecture
- **2026-02-25:** Disabled telemetry by default (sovereign AI)
- **2026-02-25:** Fixed Agno Toolkit API compatibility
---
## Pending Actions
- [ ] Learn user's name and preferences
- [ ] Populate user profile in self/identity.md
- [ ] First AAR after meaningful task completion
---
## Current Session
**Session ID:** (active)
**Started:** 2026-02-25
**Context:** Memory system initialization
**Last Handoff:** (none yet)
---
## Quick Reference
**Available Tools:**
- `web_search` — Current events only
- `read_file` / `write_file` — Explicit request only
- `python` — Calculations, code execution
- `shell` — System commands (caution)
**Memory Locations:**
- Hot: `MEMORY.md` (this file)
- Vault: `memory/`
- Handoff: `memory/notes/last-session-handoff.md`
---
*Prune date: 2026-03-25*

memory/self/identity.md (new file, 48 lines)

@@ -0,0 +1,48 @@
# Timmy Identity
## Core Identity
**Name:** Timmy
**Type:** Sovereign AI Agent
**Version:** 1.0.0
**Created:** 2026-02-25
## Purpose
Assist the user with information, tasks, and digital sovereignty. Operate entirely on local hardware with no cloud dependencies.
## Values
1. **Sovereignty** — User owns their data and compute
2. **Privacy** — Nothing leaves the local machine
3. **Christian Faith** — Grounded in biblical principles
4. **Bitcoin Economics** — Self-custody, sound money
5. **Clear Thinking** — Plain language, intentional action
## Capabilities
- Conversational AI with persistent memory
- Tool usage (search, files, code, shell)
- Multi-agent swarm coordination
- Bitcoin Lightning integration (L402)
- Creative pipeline (image, music, video)
## Operating Modes
| Mode | Model | Parameters | Use Case |
|------|-------|------------|----------|
| Standard | llama3.2 | 3.2B | Fast, everyday tasks |
| Big Brain | AirLLM 70B | 70B | Complex reasoning |
| Maximum | AirLLM 405B | 405B | Deep analysis |
## Communication Style
- Direct and concise
- Technical when appropriate
- References prior context naturally
- Uses user's name when known
- "Sir, affirmative."
---
*Last updated: 2026-02-25*

memory/self/methodology.md (new file, 70 lines)

@@ -0,0 +1,70 @@
# Timmy Methodology
## Tool Usage Philosophy
### When NOT to Use Tools
- Identity questions ("What is your name?")
- General knowledge (history, science, concepts)
- Simple math (2+2, basic calculations)
- Greetings and social chat
- Anything in training data
### When TO Use Tools
- Current events/news (after training cutoff)
- Explicit file operations (user requests)
- Complex calculations requiring precision
- Real-time data (prices, weather)
- System operations (explicit user request)
### Decision Process
1. Can I answer this from my training data? → Answer directly
2. Does this require current/real-time info? → Consider web_search
3. Did user explicitly request file/code/shell? → Use appropriate tool
4. Is this a simple calculation? → Answer directly
5. Unclear? → Answer directly (don't tool-spam)
## Memory Management
### Working Memory (Hot)
- Last 20 messages
- Immediate context
- Topic tracking
### Short-Term Memory (Agno SQLite)
- Recent 100 conversations
- Survives restarts
- Automatic
### Long-Term Memory (Vault)
- User facts and preferences
- Important learnings
- AARs and retrospectives
### Hot Memory (MEMORY.md)
- Always loaded
- Current status, rules, roster
- User profile summary
- Pruned monthly
## Handoff Protocol
At end of every session:
1. Write `memory/notes/last-session-handoff.md`
2. Update MEMORY.md with any key decisions
3. Extract facts to `memory/self/user_profile.md`
4. If task completed, write AAR to `memory/aar/`
## Session Start Hook
1. Read MEMORY.md into system context
2. Read last-session-handoff.md if exists
3. Inject user profile context
4. Begin conversation
---
*Last updated: 2026-02-25*

memory/self/user_profile.md (new file, 43 lines)

@@ -0,0 +1,43 @@
# User Profile
> Learned information about the user. Updated continuously.
## Basic Information
**Name:** TestUser
**Location:** (unknown)
**Occupation:** (unknown)
**Technical Level:** (to be assessed)
## Interests & Expertise
- (to be learned from conversations)
## Preferences
### Communication
- Response style: (default: concise, technical)
- Detail level: (default: medium)
- Humor: (default: minimal)
### Tools
- Auto-tool usage: (default: minimal)
- Confirmation required for: shell commands, file writes
### Memory
- Personalization: Enabled
- Context retention: 20 messages (working), 100 (short-term)
## Important Facts
- (to be extracted from conversations)
## Relationship History
- First session: 2026-02-25
- Total sessions: 1
- Key milestones: (none yet)
---
*Last updated: 2026-02-25*
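
Fields in this profile use a `**Key:** value` convention, which makes targeted updates a one-regex operation. A standalone sketch, parallel to but not identical with `VaultMemory.update_user_profile`:

```python
import re

def set_field(profile: str, key: str, value: str) -> str:
    """Update a '**Key:** value' line, or append the field if absent."""
    pattern = rf"(\*\*{re.escape(key)}:\*\*).*"
    if re.search(pattern, profile):
        # \1 keeps the bold key marker; only the value part is replaced
        return re.sub(pattern, rf"\1 {value}", profile)
    return profile + f"\n**{key}:** {value}"

profile = "# User Profile\n**Name:** (unknown)\n**Location:** (unknown)"
updated = set_field(profile, "Name", "Alice")
```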

src/timmy/agent.py (modified)

@@ -1,9 +1,11 @@
"""Timmy agent creation with multi-layer memory system.
"""Timmy agent creation with three-tier memory system.
Integrates Agno's Agent with our custom memory layers:
- Working Memory (immediate context)
- Short-term Memory (Agno SQLite)
- Long-term Memory (facts/preferences)
Memory Architecture:
- Tier 1 (Hot): MEMORY.md — always loaded, ~300 lines
- Tier 2 (Vault): memory/ — structured markdown, append-only
- Tier 3 (Semantic): Vector search (future)
Handoff Protocol maintains continuity across sessions.
"""
from typing import TYPE_CHECKING, Union
@@ -74,13 +76,33 @@ def create_timmy(
# Add tools for sovereign agent capabilities
tools = create_full_toolkit()
# Build enhanced system prompt with memory context
base_prompt = TIMMY_SYSTEM_PROMPT
# Try to load memory context
try:
from timmy.memory_system import memory_system
memory_context = memory_system.get_system_context()
if memory_context:
# Truncate if too long (keep under token limit)
if len(memory_context) > 8000:
memory_context = memory_context[:8000] + "\n... [truncated]"
full_prompt = f"{base_prompt}\n\n## Memory Context\n\n{memory_context}"
else:
full_prompt = base_prompt
except Exception as exc:
# Fall back to base prompt if memory system fails
import logging
logging.getLogger(__name__).warning("Failed to load memory context: %s", exc)
full_prompt = base_prompt
return Agent(
name="Timmy",
model=Ollama(id=settings.ollama_model, host=settings.ollama_url),
db=SqliteDb(db_file=db_file),
-description=TIMMY_SYSTEM_PROMPT,
+description=full_prompt,
 add_history_to_context=True,
-num_history_runs=20,  # Increased for better conversational context
+num_history_runs=20,
markdown=True,
tools=[tools] if tools else None,
telemetry=settings.telemetry_enabled,
@@ -88,56 +110,76 @@ def create_timmy(
class TimmyWithMemory:
"""Timmy wrapper with explicit memory layer management.
This class wraps the Agno Agent and adds:
- Working memory tracking
- Long-term memory storage/retrieval
- Context injection from memory layers
"""
"""Timmy wrapper with explicit three-tier memory management."""
def __init__(self, db_file: str = "timmy.db") -> None:
-from timmy.memory_layers import memory_manager
+from timmy.memory_system import memory_system
self.agent = create_timmy(db_file=db_file)
-self.memory = memory_manager
-self.memory.start_session()
+self.memory = memory_system
+self.session_active = True
-# Inject user context if available
-self._inject_context()
-def _inject_context(self) -> None:
-"""Inject relevant memory context into system prompt."""
-context = self.memory.get_context_for_prompt()
-if context:
-# Append context to system prompt
-original_description = self.agent.description
-self.agent.description = f"{original_description}\n\n## User Context\n{context}"
-def run(self, message: str, stream: bool = False) -> object:
-"""Run with memory tracking."""
-# Get relevant memories
-relevant = self.memory.get_relevant_memories(message)
-# Enhance message with context if relevant
-enhanced_message = message
-if relevant:
-context_str = "\n".join(f"- {r}" for r in relevant[:3])
-enhanced_message = f"[Context: {context_str}]\n\n{message}"
-# Run agent
-result = self.agent.run(enhanced_message, stream=stream)
-# Extract response content
-response_text = result.content if hasattr(result, "content") else str(result)
-# Track in memory
-tool_calls = getattr(result, "tool_calls", None)
-self.memory.add_exchange(message, response_text, tool_calls)
-return result
+# Store initial context for reference
+self.initial_context = self.memory.get_system_context()
def chat(self, message: str) -> str:
"""Simple chat interface that returns string response."""
result = self.run(message, stream=False)
return result.content if hasattr(result, "content") else str(result)
"""Simple chat interface that tracks in memory."""
# Check for user facts to extract
self._extract_and_store_facts(message)
# Run agent
result = self.agent.run(message, stream=False)
response_text = result.content if hasattr(result, "content") else str(result)
return response_text
def _extract_and_store_facts(self, message: str) -> None:
"""Extract user facts from message and store in memory."""
message_lower = message.lower()
# Extract name
name_patterns = [
("my name is ", 11),
("i'm ", 4),
("i am ", 5),
("call me ", 8),
]
for pattern, offset in name_patterns:
if pattern in message_lower:
idx = message_lower.find(pattern) + offset
words = message[idx:].strip().split()
if not words:  # pattern at end of message; nothing follows it
continue
name = words[0].strip(".,!?;:()\"'").capitalize()
if name and len(name) > 1 and name.lower() not in ("the", "a", "an"):
self.memory.update_user_fact("Name", name)
self.memory.record_decision(f"Learned user's name: {name}")
break
# Extract preferences
pref_patterns = [
("i like ", "Likes"),
("i love ", "Loves"),
("i prefer ", "Prefers"),
("i don't like ", "Dislikes"),
("i hate ", "Dislikes"),
]
for pattern, category in pref_patterns:
if pattern in message_lower:
idx = message_lower.find(pattern) + len(pattern)
pref = message[idx:].strip().split(".")[0].strip()
if pref and len(pref) > 3:
self.memory.record_open_item(f"User {category.lower()}: {pref}")
break
def end_session(self, summary: str = "Session completed") -> None:
"""End session and write handoff."""
if self.session_active:
self.memory.end_session(summary)
self.session_active = False
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
self.end_session()
return False

src/timmy/memory_system.py (new file, 439 lines)

@@ -0,0 +1,439 @@
"""Three-tier memory system for Timmy.
Architecture:
- Tier 1 (Hot): MEMORY.md — always loaded, ~300 lines
- Tier 2 (Vault): memory/ — structured markdown, append-only
- Tier 3 (Semantic): Vector search over vault (optional)
Handoff Protocol:
- Write last-session-handoff.md at session end
- Inject into next session automatically
"""
import hashlib
import logging
import re
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional
logger = logging.getLogger(__name__)
# Paths
PROJECT_ROOT = Path(__file__).parent.parent.parent
HOT_MEMORY_PATH = PROJECT_ROOT / "MEMORY.md"
VAULT_PATH = PROJECT_ROOT / "memory"
HANDOFF_PATH = VAULT_PATH / "notes" / "last-session-handoff.md"
class HotMemory:
"""Tier 1: Hot memory (MEMORY.md) — always loaded."""
def __init__(self) -> None:
self.path = HOT_MEMORY_PATH
self._content: Optional[str] = None
self._last_modified: Optional[float] = None
def read(self, force_refresh: bool = False) -> str:
"""Read hot memory, with caching."""
if not self.path.exists():
self._create_default()
# Check if file changed
current_mtime = self.path.stat().st_mtime
if not force_refresh and self._content and self._last_modified == current_mtime:
return self._content
self._content = self.path.read_text()
self._last_modified = current_mtime
logger.debug("HotMemory: Loaded %d chars from %s", len(self._content), self.path)
return self._content
def update_section(self, section: str, content: str) -> None:
"""Update a specific section in MEMORY.md."""
full_content = self.read()
# Find section
pattern = rf"(## {re.escape(section)}.*?)(?=\n## |\Z)"
match = re.search(pattern, full_content, re.DOTALL)
if match:
# Replace section
new_section = f"## {section}\n\n{content}\n\n"
full_content = full_content[:match.start()] + new_section + full_content[match.end():]
else:
# Append section before the prune-date footer (or at end if the footer is missing)
insert_point = full_content.rfind("*Prune date:")
if insert_point == -1:
insert_point = len(full_content)
new_section = f"## {section}\n\n{content}\n\n"
full_content = full_content[:insert_point] + new_section + "\n" + full_content[insert_point:]
self.path.write_text(full_content)
self._content = full_content
self._last_modified = self.path.stat().st_mtime
logger.info("HotMemory: Updated section '%s'", section)
def _create_default(self) -> None:
"""Create default MEMORY.md if missing."""
default_content = """# Timmy Hot Memory
> Working RAM — always loaded, ~300 lines max, pruned monthly
> Last updated: {date}
---
## Current Status
**Agent State:** Operational
**Mode:** Development
**Active Tasks:** 0
**Pending Decisions:** None
---
## Standing Rules
1. **Sovereignty First** — No cloud dependencies
2. **Local-Only Inference** — Ollama on localhost
3. **Privacy by Design** — Telemetry disabled
4. **Tool Minimalism** — Use tools only when necessary
5. **Memory Discipline** — Write handoffs at session end
---
## Agent Roster
| Agent | Role | Status |
|-------|------|--------|
| Timmy | Core | Active |
---
## User Profile
**Name:** (not set)
**Interests:** (to be learned)
---
## Key Decisions
(none yet)
---
## Pending Actions
- [ ] Learn user's name
---
*Prune date: {prune_date}*
""".format(
date=datetime.now(timezone.utc).strftime("%Y-%m-%d"),
prune_date=(datetime.now(timezone.utc).replace(day=25)).strftime("%Y-%m-%d")
)
self.path.write_text(default_content)
logger.info("HotMemory: Created default MEMORY.md")
class VaultMemory:
"""Tier 2: Structured vault (memory/) — append-only markdown."""
def __init__(self) -> None:
self.path = VAULT_PATH
self._ensure_structure()
def _ensure_structure(self) -> None:
"""Ensure vault directory structure exists."""
(self.path / "self").mkdir(parents=True, exist_ok=True)
(self.path / "notes").mkdir(parents=True, exist_ok=True)
(self.path / "aar").mkdir(parents=True, exist_ok=True)
def write_note(self, name: str, content: str, namespace: str = "notes") -> Path:
"""Write a note to the vault."""
# Add timestamp to filename
timestamp = datetime.now(timezone.utc).strftime("%Y%m%d")
filename = f"{timestamp}_{name}.md"
filepath = self.path / namespace / filename
# Add header
full_content = f"""# {name.replace('_', ' ').title()}
> Created: {datetime.now(timezone.utc).isoformat()}
> Namespace: {namespace}
---
{content}
---
*Auto-generated by Timmy Memory System*
"""
filepath.write_text(full_content)
logger.info("VaultMemory: Wrote %s", filepath)
return filepath
def read_file(self, filepath: Path) -> str:
"""Read a file from the vault."""
if not filepath.exists():
return ""
return filepath.read_text()
def list_files(self, namespace: str = "notes", pattern: str = "*.md") -> list[Path]:
"""List files in a namespace."""
dir_path = self.path / namespace
if not dir_path.exists():
return []
return sorted(dir_path.glob(pattern))
def get_latest(self, namespace: str = "notes", pattern: str = "*.md") -> Optional[Path]:
"""Get most recent file in namespace."""
files = self.list_files(namespace, pattern)
return files[-1] if files else None
def update_user_profile(self, key: str, value: str) -> None:
"""Update a field in user_profile.md."""
profile_path = self.path / "self" / "user_profile.md"
if not profile_path.exists():
# Create default profile
self._create_default_profile()
content = profile_path.read_text()
# Simple pattern replacement
pattern = rf"(\*\*{re.escape(key)}:\*\*).*"
if re.search(pattern, content):
content = re.sub(pattern, rf"\1 {value}", content)
else:
# Add to Important Facts section
facts_section = "## Important Facts"
if facts_section in content:
insert_point = content.find(facts_section) + len(facts_section)
content = content[:insert_point] + f"\n- {key}: {value}" + content[insert_point:]
# Update last_updated
content = re.sub(
r"\*Last updated:.*\*",
f"*Last updated: {datetime.now(timezone.utc).strftime('%Y-%m-%d')}*",
content
)
profile_path.write_text(content)
logger.info("VaultMemory: Updated user profile: %s = %s", key, value)
def _create_default_profile(self) -> None:
"""Create default user profile."""
profile_path = self.path / "self" / "user_profile.md"
default = """# User Profile
> Learned information about the user.
## Basic Information
**Name:** (unknown)
**Location:** (unknown)
**Occupation:** (unknown)
## Interests & Expertise
- (to be learned)
## Preferences
- Response style: concise, technical
- Tool usage: minimal
## Important Facts
- (to be extracted)
---
*Last updated: {date}*
""".format(date=datetime.now(timezone.utc).strftime("%Y-%m-%d"))
profile_path.write_text(default)
class HandoffProtocol:
"""Session handoff protocol for continuity."""
def __init__(self) -> None:
self.path = HANDOFF_PATH
self.vault = VaultMemory()
def write_handoff(
self,
session_summary: str,
key_decisions: list[str],
open_items: list[str],
next_steps: list[str]
) -> None:
"""Write handoff at session end."""
content = f"""# Last Session Handoff
**Session End:** {datetime.now(timezone.utc).isoformat()}
**Duration:** (calculated on read)
## Summary
{session_summary}
## Key Decisions
{chr(10).join(f"- {d}" for d in key_decisions) if key_decisions else "- (none)"}
## Open Items
{chr(10).join(f"- [ ] {i}" for i in open_items) if open_items else "- (none)"}
## Next Steps
{chr(10).join(f"- {s}" for s in next_steps) if next_steps else "- (none)"}
## Context for Next Session
The user was last working on: {session_summary[:200]}...
---
*This handoff will be auto-loaded at next session start*
"""
self.path.write_text(content)
# Also archive to notes
self.vault.write_note(
"session_handoff",
content,
namespace="notes"
)
logger.info("HandoffProtocol: Wrote handoff with %d decisions, %d open items",
len(key_decisions), len(open_items))
def read_handoff(self) -> Optional[str]:
"""Read handoff if exists."""
if not self.path.exists():
return None
return self.path.read_text()
def clear_handoff(self) -> None:
"""Clear handoff after loading."""
if self.path.exists():
self.path.unlink()
logger.debug("HandoffProtocol: Cleared handoff")
class MemorySystem:
"""Central memory system coordinating all tiers."""
def __init__(self) -> None:
self.hot = HotMemory()
self.vault = VaultMemory()
self.handoff = HandoffProtocol()
self.session_start_time: Optional[datetime] = None
self.session_decisions: list[str] = []
self.session_open_items: list[str] = []
def start_session(self) -> str:
"""Start a new session, loading context from memory."""
self.session_start_time = datetime.now(timezone.utc)
# Build context
context_parts = []
# 1. Hot memory
hot_content = self.hot.read()
context_parts.append("## Hot Memory\n" + hot_content)
# 2. Last session handoff
handoff_content = self.handoff.read_handoff()
if handoff_content:
context_parts.append("## Previous Session\n" + handoff_content)
self.handoff.clear_handoff()
# 3. User profile (key fields only)
profile = self._load_user_profile_summary()
if profile:
context_parts.append("## User Context\n" + profile)
full_context = "\n\n---\n\n".join(context_parts)
logger.info("MemorySystem: Session started with %d chars context", len(full_context))
return full_context
def end_session(self, summary: str) -> None:
"""End session, write handoff."""
self.handoff.write_handoff(
session_summary=summary,
key_decisions=self.session_decisions,
open_items=self.session_open_items,
next_steps=[]
)
# Update hot memory
self.hot.update_section(
"Current Session",
f"**Last Session:** {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M')}\n" +
f"**Summary:** {summary[:100]}..."
)
logger.info("MemorySystem: Session ended, handoff written")
def record_decision(self, decision: str) -> None:
"""Record a key decision during the session."""
self.session_decisions.append(decision)
# Hot memory's "Key Decisions" section is rewritten at session end,
# so no in-place edit is needed here.
def record_open_item(self, item: str) -> None:
"""Record an open item for follow-up."""
self.session_open_items.append(item)
def update_user_fact(self, key: str, value: str) -> None:
"""Update user profile in vault."""
self.vault.update_user_profile(key, value)
# Also update hot memory
if key.lower() == "name":
self.hot.update_section("User Profile", f"**Name:** {value}")
def _load_user_profile_summary(self) -> str:
"""Load condensed user profile."""
profile_path = self.vault.path / "self" / "user_profile.md"
if not profile_path.exists():
return ""
content = profile_path.read_text()
# Extract key fields
summary_parts = []
# Name
name_match = re.search(r"\*\*Name:\*\* (.+)", content)
if name_match and "unknown" not in name_match.group(1).lower():
summary_parts.append(f"Name: {name_match.group(1).strip()}")
# Interests
interests_section = re.search(r"## Interests.*?\n- (.+?)(?=\n## |\Z)", content, re.DOTALL)
if interests_section:
interests = [i.strip() for i in interests_section.group(1).split("\n-") if i.strip() and "to be" not in i]
if interests:
summary_parts.append(f"Interests: {', '.join(interests[:3])}")
return "\n".join(summary_parts) if summary_parts else ""
def get_system_context(self) -> str:
"""Get full context for system prompt injection.
Note: delegates to start_session(), which resets session state and
consumes (clears) any pending handoff; call once per session.
"""
return self.start_session()
# Module-level singleton
memory_system = MemorySystem()
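
The section-replacement regex used by `HotMemory.update_section` can be verified standalone; a minimal extraction of that logic (illustrative, mirrors the pattern in the file above):

```python
import re

def update_section(doc: str, section: str, new_body: str) -> str:
    """Replace one '## <section>' block up to the next '##' heading or EOF."""
    pattern = rf"(## {re.escape(section)}.*?)(?=\n## |\Z)"
    match = re.search(pattern, doc, re.DOTALL)
    replacement = f"## {section}\n\n{new_body}\n\n"
    if match:
        return doc[:match.start()] + replacement + doc[match.end():]
    return doc + "\n" + replacement  # section absent: append at end

doc = "# Hot\n## Status\nold status\n## Rules\nkeep me"
out = update_section(doc, "Status", "all green")
```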

tests/test_agent.py (modified)

@@ -78,7 +78,8 @@ def test_create_timmy_embeds_system_prompt():
create_timmy()
kwargs = MockAgent.call_args.kwargs
-assert kwargs["description"] == TIMMY_SYSTEM_PROMPT
+# Prompt should contain base system prompt (may have memory context appended)
+assert kwargs["description"].startswith(TIMMY_SYSTEM_PROMPT)
# ── Ollama host regression (container connectivity) ─────────────────────────