docs/adr/019-semantic-memory.md

# ADR 019: Semantic Memory (Vector Store)

## Status
Accepted

## Context
The Echo agent needed the ability to remember conversations, facts, and context across sessions. Simple keyword search was insufficient for finding relevant historical context.

## Decision
Implement a vector-based semantic memory store using SQLite with optional sentence-transformers embeddings.

## Context Types

| Type | Description |
|------|-------------|
| `conversation` | User/agent dialogue |
| `fact` | Extracted facts about user/system |
| `document` | Uploaded documents |

## Schema
```sql
CREATE TABLE memory_entries (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT NOT NULL,
    context_type TEXT NOT NULL DEFAULT 'conversation',
    agent_id TEXT,
    task_id TEXT,
    session_id TEXT,
    metadata TEXT,  -- JSON
    embedding TEXT,  -- JSON array of floats
    timestamp TEXT NOT NULL
);
```
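Because `metadata` and `embedding` are plain `TEXT` columns, the caller has to serialize them. A minimal sketch of a JSON round trip using only the standard library is shown here; `insert_entry` and `load_embedding` are hypothetical helpers for illustration, not the functions in `memory.vector_store`.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# Same columns as the schema above; IF NOT EXISTS added for repeatable runs.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memory_entries (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT NOT NULL,
    context_type TEXT NOT NULL DEFAULT 'conversation',
    agent_id TEXT,
    task_id TEXT,
    session_id TEXT,
    metadata TEXT,
    embedding TEXT,
    timestamp TEXT NOT NULL
)
"""


def insert_entry(conn, content, source, embedding=None, metadata=None):
    """Insert one row (hypothetical helper); JSON-encode the optional columns."""
    entry_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO memory_entries "
        "(id, content, source, metadata, embedding, timestamp) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (
            entry_id,
            content,
            source,
            json.dumps(metadata) if metadata is not None else None,
            json.dumps(embedding) if embedding is not None else None,
            datetime.now(timezone.utc).isoformat(),
        ),
    )
    conn.commit()
    return entry_id


def load_embedding(conn, entry_id):
    """Return the stored embedding as a list of floats, or None if absent."""
    row = conn.execute(
        "SELECT embedding FROM memory_entries WHERE id = ?", (entry_id,)
    ).fetchone()
    if row is None or row[0] is None:
        return None
    return json.loads(row[0])


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
entry_id = insert_entry(
    conn, "User prefers dark mode", "user", embedding=[0.1, -0.2, 0.3]
)
```

Storing the vector as JSON keeps the table readable with any SQLite client, at the cost of parsing every candidate row during a search.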

## Embedding Strategy

**Primary**: sentence-transformers `all-MiniLM-L6-v2` (384 dimensions)
- High-quality semantic similarity
- Runs locally (no cloud calls)
- ~80 MB model download

**Fallback**: Character n-gram hash embedding
- No external dependencies
- Lower quality, but functional
- Lets the system run without heavy ML dependencies
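One way a character n-gram hash embedding could work is sketched below. This is an illustration under assumptions, not the project's fallback code: the function name, the trigram size, the SHA-256 bucket hash, and the choice of 384 dimensions (picked only to match the primary model) are all hypothetical.

```python
import hashlib
import math


def ngram_hash_embedding(text, dim=384, n=3):
    """Hash character n-grams into a fixed-size, L2-normalized vector."""
    vec = [0.0] * dim
    normalized = text.lower()
    if len(normalized) < n:
        # Short inputs contribute a single gram (or none if empty).
        grams = [normalized] if normalized else []
    else:
        grams = [normalized[i:i + n] for i in range(len(normalized) - n + 1)]
    for gram in grams:
        # A stable hash is used so vectors are reproducible across processes
        # (Python's built-in hash() is salted per run).
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    length = math.sqrt(sum(v * v for v in vec))
    if length == 0.0:
        return vec  # empty text yields the zero vector
    return [v / length for v in vec]
```

Texts that share many character sequences land in overlapping buckets, so their vectors score higher under cosine similarity. This captures surface overlap rather than meaning, which is why it is the lower-quality option.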

## Usage

```python
from memory.vector_store import (
    store_memory,
    search_memories,
    get_memory_context,
)

# Store a memory
store_memory(
    content="User prefers dark mode",
    source="user",
    context_type="fact",
    agent_id="echo",
)

# Search for relevant context
results = search_memories(
    query="user preferences",
    agent_id="echo",
    limit=5,
)

# Get formatted context for LLM
context = get_memory_context(
    query="what does user like?",
    max_tokens=1000,
)
```
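The ADR does not specify how `get_memory_context` enforces `max_tokens`. A rough sketch of one possible approach follows. It assumes search results are dicts with `context_type` and `content` keys and approximates a token as four characters; both are assumptions, and `format_memory_context` is a hypothetical name.

```python
def format_memory_context(results, max_tokens=1000, chars_per_token=4):
    """Join memory snippets into a bulleted block within a rough token budget."""
    budget = max_tokens * chars_per_token  # crude character budget
    lines = []
    used = 0
    for entry in results:
        line = f"- [{entry['context_type']}] {entry['content']}"
        cost = len(line) + 1  # +1 for the joining newline
        if used + cost > budget:
            break  # results are assumed to be ordered by relevance
        lines.append(line)
        used += cost
    return "\n".join(lines)
```

Stopping at the first entry that does not fit keeps the most relevant memories, provided the search returns results in descending score order.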

## Integration Points

### Echo Agent
Echo should store all conversations and retrieve relevant context when answering questions about "what we discussed" or "what we know".

### Task Context
Task handlers can query for similar past tasks:
```python
similar = search_memories(
    query=task.description,
    context_type="conversation",
    limit=3,
)
```

## Similarity Scoring

**Cosine Similarity** (when embeddings available):
```python
score = dot(a, b) / (norm(a) * norm(b))  # -1 to 1
```
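The one-line formula above leaves `dot` and `norm` undefined. A dependency-free sketch is given here, with a guard for zero vectors that the formula does not cover; returning 0.0 in that case is an assumption, not documented behavior.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat a zero vector as "no similarity"
    return dot / (norm_a * norm_b)
```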

**Keyword Overlap** (fallback):
```python
score = len(query_words & content_words) / len(query_words)
```
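The keyword formula divides by `len(query_words)`, so an empty query would raise `ZeroDivisionError`. The sketch below adds that guard and shows one possible tokenization (lowercased `\w+` runs). The tokenizer and the function name are assumptions for illustration.

```python
import re


def keyword_overlap(query, content):
    """Fraction of distinct query words that also appear in the content."""
    query_words = set(re.findall(r"\w+", query.lower()))
    content_words = set(re.findall(r"\w+", content.lower()))
    if not query_words:
        return 0.0  # avoid dividing by zero on an empty query
    return len(query_words & content_words) / len(query_words)
```

For example, the query "user preferences" against "User prefers dark mode" matches only "user", giving 0.5. Exact word matching misses the "preferences"/"prefers" pair, which is the gap the embedding path is meant to close.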

## Consequences
- **Positive**: Semantic search finds related content even when no keywords match
- **Negative**: Embedding computation adds latency (~10-100 ms per query)
- **Mitigation**: Compute embeddings in the background and cache them
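The caching mitigation could be as simple as memoizing the embedding call per input string. The sketch below uses `functools.lru_cache`. `_compute_embedding` is a stand-in for the real (slow) model call, and the cache size is an arbitrary assumption.

```python
from functools import lru_cache


def _compute_embedding(text):
    """Placeholder for the real embedding call; returns a toy vector."""
    return tuple(float(ord(c) % 7) for c in text[:8])


@lru_cache(maxsize=1024)
def cached_embedding(text):
    """Return the embedding for text, computing it at most once per string."""
    # A tuple is returned so callers cannot mutate the cached value.
    return _compute_embedding(text)
```

Repeated queries then skip the embedding step entirely. The cache is per process and lost on restart, so it complements the embeddings persisted in SQLite and does not replace them.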

## Future Work
- sqlite-vss extension for vector similarity index
- Memory compression for long-term storage
- Automatic fact extraction from conversations
