Compare commits

...

11 Commits

Author SHA1 Message Date
01977f28fb docs: improve KNOWN_VIOLATIONS justifications in verify_memory_sovereignty.py
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 36s
2026-04-10 00:12:42 -04:00
a055e68ebf Merge pull request #265
Some checks failed
Forge CI / smoke-and-build (push) Failing after 43s
Merged PR #265
2026-04-10 03:44:23 +00:00
f6c9ecb893 Merge pull request #264
Some checks failed
Forge CI / smoke-and-build (push) Has been cancelled
Merged PR #264
2026-04-10 03:44:19 +00:00
549431bb81 Merge pull request #259
Some checks failed
Forge CI / smoke-and-build (push) Has been cancelled
Merged PR #259
2026-04-10 03:44:16 +00:00
43dc2d21f2 Merge pull request #263
Some checks failed
Forge CI / smoke-and-build (push) Has been cancelled
Merged PR #263
2026-04-10 03:44:04 +00:00
2948d010b7 Merge pull request #266
Some checks failed
Forge CI / smoke-and-build (push) Has been cancelled
Merged PR #266
2026-04-10 03:44:00 +00:00
Alexander Whitestone
0d92b9ad15 feat(scripts): add memory budget enforcement tool (#256)
All checks were successful
Forge CI / smoke-and-build (pull_request) Successful in 40s
Add scripts/memory_budget.py — a CI-friendly tool for checking and
enforcing character budgets on MEMORY.md and USER.md memory files.

Features:
- Checks MEMORY.md vs memory_char_limit (default 2200)
- Checks USER.md vs user_char_limit (default 1375)
- Estimates total injection cost (chars / ~4 chars per token)
- Alerts when approaching limits (>80% usage)
- --report flag for detailed breakdown with progress bars
- --verbose flag for per-entry details
- --enforce flag trims oldest entries to fit budget
- --json flag for machine-readable output (CI integration)
- Exit codes: 0=within budget, 1=over budget, 2=trimmed
- Suggestions for largest entries when over budget

Relates to #256
2026-04-09 21:13:01 -04:00
Alexander Whitestone
2e37ff638a Add memory sovereignty verification script (#257)
All checks were successful
Forge CI / smoke-and-build (pull_request) Successful in 39s
CI check that scans all memory-path code for network dependencies.

Scans 8 memory-related files:
- tools/memory_tool.py (MEMORY.md/USER.md store)
- hermes_state.py (SQLite session store)
- tools/session_search_tool.py (FTS5 session search)
- tools/graph_store.py (knowledge graph)
- tools/temporal_kg_tool.py (temporal KG tool)
- agent/temporal_knowledge_graph.py (temporal triple store)
- tools/skills_tool.py (skill listing/viewing)
- tools/skills_sync.py (bundled skill syncing)

Verifies no HTTP/HTTPS calls, no external API usage, and no
network dependencies in the core memory read/write path.

Reports violations with file:line references. Exit 0 if sovereign,
exit 1 if violations found. Suitable for CI integration.
2026-04-09 21:07:03 -04:00
Alexander Whitestone
815160bd6f burn: add Memory Architecture Guide (closes #263, #258)
All checks were successful
Forge CI / smoke-and-build (pull_request) Successful in 1m3s
Developer-facing guide covering all four memory tiers:
- Built-in memory (MEMORY.md/USER.md) with frozen snapshot pattern
- Session search (FTS5 + Gemini Flash summarization)
- Skills as procedural memory
- External memory provider plugin architecture

Includes data lifecycle, security guarantees, code paths,
configuration reference, and troubleshooting.
2026-04-09 20:51:45 -04:00
2a6045a76a feat: create plugins/memory/mempalace/__init__.py
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 40s
2026-04-09 00:45:21 +00:00
4ef7b5fc46 feat: create plugins/memory/mempalace/plugin.yaml 2026-04-09 00:45:14 +00:00
5 changed files with 1289 additions and 0 deletions


@@ -0,0 +1,335 @@
# Memory Architecture Guide
Developer-facing guide to the Hermes Agent memory system. Covers all four memory tiers, data lifecycle, security guarantees, and extension points.
## Overview
Hermes has four distinct memory systems, each serving a different purpose:
| Tier | System | Scope | Cost | Persistence |
|------|--------|-------|------|-------------|
| 1 | **Built-in Memory** (MEMORY.md / USER.md) | Current session, curated facts | ~1,300 tokens fixed per session | File-backed, cross-session |
| 2 | **Session Search** (FTS5) | All past conversations | On-demand (search + summarize) | SQLite (state.db) |
| 3 | **Skills** (procedural memory) | How to do specific tasks | Loaded on match only | File-backed (~/.hermes/skills/) |
| 4 | **External Providers** (plugins) | Deep persistent knowledge | Provider-dependent | Provider-specific |
All four tiers operate independently. Built-in memory is always active. The others are opt-in or on-demand.
## Tier 1: Built-in Memory (MEMORY.md / USER.md)
### File Layout
```
~/.hermes/memories/
├── MEMORY.md — Agent's notes (environment facts, conventions, lessons learned)
└── USER.md — User profile (preferences, communication style, identity)
```
Profile-aware: when running under a profile (`hermes -p coder`), the memories directory resolves to `~/.hermes/profiles/<name>/memories/`.
### Frozen Snapshot Pattern
This is the most important architectural decision in the memory system.
1. **Session start:** `MemoryStore.load_for_prompt()` reads both files from disk, parses entries delimited by `§` (section sign), and injects them into the system prompt as a frozen block.
2. **During session:** The `memory` tool writes to disk immediately (durable), but does **not** update the system prompt. This preserves the LLM's prefix cache for the entire session.
3. **Next session:** The snapshot refreshes from disk.
**Why frozen?** System prompt changes invalidate the KV cache on every API call. With a ~30K token system prompt, that's expensive. Freezing memory at session start means the cache stays warm for the entire conversation. The tradeoff: memory writes made mid-session don't take effect until next session. Tool responses show the live state so the agent can verify writes succeeded.
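The frozen-snapshot split can be sketched with a toy class (hypothetical `FrozenMemory`, not the real `MemoryStore` API): the prompt block is captured once at construction, while writes land on disk immediately.
```python
from pathlib import Path

class FrozenMemory:
    def __init__(self, path: Path):
        self.path = path
        # Snapshot taken once at session start; reused on every API call so
        # the system prompt (and the provider's KV cache) stays stable.
        self.snapshot = path.read_text(encoding="utf-8") if path.exists() else ""

    def prompt_block(self) -> str:
        return self.snapshot  # frozen for the whole session

    def add(self, entry: str) -> str:
        # Durable immediately, but NOT reflected in prompt_block().
        live = self.path.read_text(encoding="utf-8") if self.path.exists() else ""
        live = f"{live}\n§\n{entry}" if live else entry
        self.path.write_text(live, encoding="utf-8")
        return live  # the tool response shows the live state so writes can be verified
```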
### Character Limits
| Store | Default Limit | Approx Tokens | Typical Entries |
|-------|--------------|---------------|-----------------|
| MEMORY.md | 2,200 chars | ~800 | 8-15 |
| USER.md | 1,375 chars | ~500 | 5-10 |
Limits are in characters (not tokens) because character counts are model-independent. Configurable in `config.yaml`:
```yaml
memory:
memory_char_limit: 2200
user_char_limit: 1375
```
### Entry Format
Entries are separated by `\n§\n`. Each entry can be multiline. Example MEMORY.md:
```
User runs macOS 14 Sonoma, uses Homebrew, has Docker Desktop
§
Project ~/code/api uses Go 1.22, chi router, sqlc. Tests: 'make test'
§
Staging server 10.0.1.50 uses SSH port 2222, key at ~/.ssh/staging_ed25519
```
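The split/join logic can be sketched as follows (assumed to mirror `MemoryStore`'s parsing; not the production code):
```python
ENTRY_DELIMITER = "\n§\n"

def parse_entries(raw: str) -> list[str]:
    # Split on the section-sign delimiter; trim whitespace, drop empty entries.
    return [e.strip() for e in raw.split(ENTRY_DELIMITER) if e.strip()]

def serialize_entries(entries: list[str]) -> str:
    return ENTRY_DELIMITER.join(entries)
```
Note that `scripts/memory_budget.py` (included later in this diff) carries the same split logic, which is why its constants must stay in sync with `tools/memory_tool.py`.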
### Tool Interface
The `memory` tool (defined in `tools/memory_tool.py`) supports:
- **`add`** — Append new entry. Rejects exact duplicates.
- **`replace`** — Find entry by unique substring (`old_text`), replace with `content`.
- **`remove`** — Find entry by unique substring, delete it.
- **`read`** — Return current entries from disk (live state, not frozen snapshot).
Substring matching: `old_text` must match exactly one entry. If it matches multiple, the tool returns an error asking for more specificity.
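The unique-substring lookup behind `replace`/`remove` might look like this (behavior inferred from the description above, not the actual implementation):
```python
def find_unique(entries: list[str], old_text: str) -> int:
    """Return the index of the single entry containing old_text, or raise."""
    matches = [i for i, e in enumerate(entries) if old_text in e]
    if not matches:
        raise ValueError(f"No entry contains {old_text!r}")
    if len(matches) > 1:
        raise ValueError(f"{old_text!r} matches {len(matches)} entries; be more specific")
    return matches[0]
```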
### Security Scanning
Every memory entry is scanned against `_MEMORY_THREAT_PATTERNS` before acceptance:
- Prompt injection patterns (`ignore previous instructions`, `you are now...`)
- Credential exfiltration (`curl`/`wget` with env vars, `.env` file reads)
- SSH backdoor attempts (`authorized_keys`, `.ssh` writes)
- Invisible Unicode characters (zero-width spaces, BOM)
Matches are rejected with an error message. Source: `_scan_memory_content()` in `tools/memory_tool.py`.
### Code Path
```
agent/prompt_builder.py
└── assembles system prompt pieces
└── MemoryStore.load_for_prompt() → frozen snapshot injection
tools/memory_tool.py
├── MemoryStore class (file I/O, locking, parsing)
├── memory_tool() function (add/replace/remove/read dispatch)
└── _scan_memory_content() (threat scanning)
hermes_cli/memory_setup.py
└── Interactive first-run memory setup
```
## Tier 2: Session Search (FTS5)
### How It Works
1. Every CLI and gateway session stores full message history in SQLite (`~/.hermes/state.db`)
2. The `messages_fts` FTS5 virtual table enables fast full-text search
3. The `session_search` tool finds relevant messages, groups by session, loads top N
4. Each matching session is summarized by Gemini Flash (auxiliary LLM, not main model)
5. Summaries are returned to the main agent as context
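Steps 1-3 can be sketched against an in-memory stand-in for `state.db` (table names follow the Schema section below; summarization omitted; assumes an SQLite build with FTS5):
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (id INTEGER PRIMARY KEY, session_id TEXT, content TEXT);
    CREATE VIRTUAL TABLE messages_fts USING fts5(content, content='messages', content_rowid='id');
""")
conn.executemany(
    "INSERT INTO messages (session_id, content) VALUES (?, ?)",
    [("s1", "we chose the chi router"), ("s2", "deploy via docker"), ("s1", "chi middleware order")],
)
# External-content FTS5 tables need explicit index population
# (the real schema keeps this in sync with triggers).
conn.execute("INSERT INTO messages_fts(rowid, content) SELECT id, content FROM messages")
# Find matching messages, grouped by session, most hits first.
rows = conn.execute("""
    SELECT m.session_id, COUNT(*) AS hits
    FROM messages_fts JOIN messages m ON m.id = messages_fts.rowid
    WHERE messages_fts MATCH ?
    GROUP BY m.session_id ORDER BY hits DESC
""", ("chi",)).fetchall()
```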
### Why Gemini Flash for Summarization
Raw session transcripts can be 50K+ chars. Feeding them to the main model wastes context window and tokens. Gemini Flash is fast, cheap, and good enough for "extract the relevant bits" summarization. Same pattern used by `web_extract`.
### Schema
```sql
-- Core tables
sessions (id, source, user_id, model, system_prompt, parent_session_id, ...)
messages (id, session_id, role, content, tool_name, timestamp, ...)
-- Full-text search
messages_fts -- FTS5 virtual table on messages.content
-- Schema tracking
schema_version
```
The database runs in WAL mode, allowing concurrent readers alongside a single writer (required for the gateway's multi-platform support).
### Session Lineage
When context compression triggers a session split, `parent_session_id` chains the old and new sessions. This lets session search follow the thread across compression boundaries.
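Following a thread backwards across compression boundaries is a walk up `parent_session_id`, which a recursive CTE expresses directly (illustrative, against an in-memory stand-in):
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, parent_session_id TEXT)")
# s1 was split into s2, which was split into s3
conn.executemany("INSERT INTO sessions VALUES (?, ?)",
                 [("s1", None), ("s2", "s1"), ("s3", "s2")])
chain = [r[0] for r in conn.execute("""
    WITH RECURSIVE thread(id, parent) AS (
        SELECT id, parent_session_id FROM sessions WHERE id = ?
        UNION ALL
        SELECT s.id, s.parent_session_id
        FROM sessions s JOIN thread t ON s.id = t.parent
    )
    SELECT id FROM thread
""", ("s3",))]
# chain walks from the newest session back to the root of the thread
```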
### Code Path
```
tools/session_search_tool.py
├── FTS5 query against messages_fts
├── Groups results by session_id
├── Loads top N sessions (MAX_SESSION_CHARS = 100K per session)
├── Sends to Gemini Flash via auxiliary_client.async_call_llm()
└── Returns per-session summaries
hermes_state.py (SessionDB class)
├── SQLite WAL mode database
├── FTS5 triggers for message insert/update/delete
└── Session CRUD operations
```
### Memory vs Session Search
| | Memory | Session Search |
|---|--------|---------------|
| **Capacity** | ~1,300 tokens total | Unlimited (all stored sessions) |
| **Latency** | Instant (in system prompt) | Requires FTS query + LLM call |
| **When to use** | Critical facts always in context | "What did we discuss about X?" |
| **Management** | Agent-curated | Automatic |
| **Token cost** | Fixed per session | On-demand per search |
## Tier 3: Skills (Procedural Memory)
### What Skills Are
Skills capture **how to do a specific type of task** based on proven experience. Where memory is broad and declarative, skills are narrow and actionable.
A skill is a directory with a `SKILL.md` (markdown instructions) and optional supporting files:
```
~/.hermes/skills/
├── my-skill/
│ ├── SKILL.md — Instructions, steps, pitfalls
│ ├── references/ — API docs, specs
│ ├── templates/ — Code templates, config files
│ ├── scripts/ — Helper scripts
│ └── assets/ — Images, data files
```
### How Skills Load
The system prompt lists the available skills (names and descriptions). When a skill matches the current task, the agent loads it with `skill_view(name)` and follows its instructions. Skills are **not** injected wholesale; they are loaded on demand to preserve the context window.
### Skill Lifecycle
1. **Creation:** After a complex task (5+ tool calls), the agent offers to save the approach as a skill using `skill_manage(action='create')`.
2. **Usage:** On future matching tasks, the agent loads the skill with `skill_view(name)`.
3. **Maintenance:** If a skill is outdated or incomplete when used, the agent patches it immediately with `skill_manage(action='patch')`.
4. **Deletion:** Obsolete skills are removed with `skill_manage(action='delete')`.
### Skills vs Memory
| | Memory | Skills |
|---|--------|--------|
| **Format** | Free-text entries | Structured markdown (steps, pitfalls, examples) |
| **Scope** | Facts and preferences | Procedures and workflows |
| **Loading** | Always in system prompt | On-demand when matched |
| **Size** | ~1,300 tokens total | Variable (loaded individually) |
### Code Path
```
tools/skill_manager_tool.py — Create, edit, patch, delete skills
agent/skill_commands.py — Slash commands for skill management
skills_hub.py — Browse, search, install skills from hub
```
## Tier 4: External Memory Providers
### Plugin Architecture
```
plugins/memory/
├── __init__.py — Provider registry and base interface
├── honcho/ — Dialectic Q&A, cross-session user modeling
├── openviking/ — Knowledge graph memory
├── mem0/ — Semantic memory with auto-extraction
├── hindsight/ — Retrospective memory analysis
├── holographic/ — Distributed holographic memory
├── retaindb/ — Vector-based retention
├── byterover/ — Byte-level memory compression
└── supermemory/ — Cloud-hosted semantic memory
```
Only one external provider can be active at a time. Built-in memory (Tier 1) always runs alongside it.
### Integration Points
When a provider is active, Hermes:
1. Injects provider context into the system prompt
2. Prefetches relevant memories before each turn (background, non-blocking)
3. Syncs conversation turns to the provider after each response
4. Extracts memories on session end (for providers that support it)
5. Mirrors built-in memory writes to the provider
6. Adds provider-specific tools for search and management
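The provider side of these hooks can be shown with a duck-typed toy (method names follow the MemPalace plugin included later in this diff and may not match the real `MemoryProvider` base class exactly):
```python
class EchoProvider:
    """Toy provider that keeps memories in a list instead of a remote API."""
    name = "echo"

    def __init__(self):
        self._memories: list[str] = []

    def system_prompt_block(self) -> str:
        # Point 1: context injected into the system prompt.
        return "# Echo Memory\nLocal toy provider for illustration."

    def prefetch(self, query: str, *, session_id: str = "") -> str:
        # Point 2: relevant memories fetched before the turn (naive substring match).
        hits = [m for m in self._memories if query.lower() in m.lower()]
        return "\n".join(f"- {m}" for m in hits)

    def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
        # Point 3: the conversation turn recorded after each response.
        self._memories.append(f"User: {user_content} / Assistant: {assistant_content}")
```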
### Configuration
```yaml
memory:
provider: openviking # or honcho, mem0, hindsight, etc.
```
Setup: `hermes memory setup` (interactive picker).
## Data Lifecycle
```
Session Start
├── Load MEMORY.md + USER.md from disk → frozen snapshot in system prompt
├── Load skills catalog (names + descriptions)
├── Initialize session search (SQLite connection)
└── Initialize external provider (if configured)
Each Turn
├── Agent sees frozen memory in system prompt
├── Agent can call memory tool → writes to disk, returns live state
├── Agent can call session_search → FTS5 + Gemini Flash summarization
├── Agent can load skills → reads SKILL.md from disk
└── External provider prefetches context (if active)
Session End
├── All memory writes already on disk (immediate persistence)
├── Session transcript saved to SQLite (messages + FTS5 index)
├── External provider extracts final memories (if supported)
└── Skill updates persisted (if any were patched)
```
## Privacy and Data Locality
| Component | Location | Network |
|-----------|----------|---------|
| MEMORY.md / USER.md | `~/.hermes/memories/` | Local only |
| Session DB | `~/.hermes/state.db` | Local only |
| Skills | `~/.hermes/skills/` | Local only |
| External provider | Provider-dependent | Provider API calls |
Tiers 1-3 (built-in memory, session search, skills) never leave the machine. External providers (Tier 4) send data to the configured provider by design. The agent logs all provider API calls in the session transcript for auditability.
## Configuration Reference
```yaml
# ~/.hermes/config.yaml
memory:
memory_enabled: true # Enable MEMORY.md
user_profile_enabled: true # Enable USER.md
memory_char_limit: 2200 # MEMORY.md char limit (~800 tokens)
user_char_limit: 1375 # USER.md char limit (~500 tokens)
nudge_interval: 10 # Turns between memory nudge reminders
provider: null # External provider name (null = disabled)
```
Environment variables (in `~/.hermes/.env`):
- Provider-specific API keys (e.g., `HONCHO_API_KEY`, `MEM0_API_KEY`)
## Troubleshooting
### Memory not appearing in system prompt
- Check `~/.hermes/memories/MEMORY.md` exists and has content
- Verify `memory.memory_enabled: true` in config
- Check for file lock issues (another Hermes process may be holding the memory file lock)
### Memory writes not taking effect
- Writes are durable to disk immediately but frozen in system prompt until next session
- Tool response shows live state — verify the write succeeded there
- Start a new session to see the updated snapshot
### Session search returns nothing
- Verify `state.db` has sessions: `sqlite3 ~/.hermes/state.db "SELECT count(*) FROM sessions"`
- Check FTS5 index: `sqlite3 ~/.hermes/state.db "SELECT count(*) FROM messages_fts"`
- Ensure auxiliary LLM (Gemini Flash) is configured and reachable
### Skills not loading
- Check `~/.hermes/skills/` directory exists
- Verify SKILL.md has valid frontmatter (name, description)
- Skills load by name match — check the skill name matches what the agent expects
### External provider errors
- Check API key in `~/.hermes/.env`
- Verify provider is installed: `pip install <provider-package>`
- Run `hermes memory status` for diagnostic info


@@ -0,0 +1,248 @@
"""
MemPalace Portal — Hybrid Memory Provider.
Bridges the local Holographic fact store with the fleet-wide MemPalace vector database.
Implements smart context compression for token efficiency.
"""
import json
import logging
import os
import re
import requests
from typing import Any, Dict, List, Optional
from agent.memory_provider import MemoryProvider
# Import Holographic components if available
try:
from plugins.memory.holographic.store import MemoryStore
from plugins.memory.holographic.retrieval import FactRetriever
HAS_HOLOGRAPHIC = True
except ImportError:
HAS_HOLOGRAPHIC = False
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Tool Schemas
# ---------------------------------------------------------------------------
MEMPALACE_SCHEMA = {
"name": "mempalace",
"description": (
"Search or record memories in the shared fleet vector database. "
"Use this for long-term, high-volume memory across the entire fleet."
),
"parameters": {
"type": "object",
"properties": {
"action": {"type": "string", "enum": ["search", "record", "wings"]},
"query": {"type": "string", "description": "Search query."},
"text": {"type": "string", "description": "Memory text to record."},
"room": {"type": "string", "description": "Target room (e.g., forge, hermes, nexus)."},
"n_results": {"type": "integer", "default": 5},
},
"required": ["action"],
},
}
FACT_STORE_SCHEMA = {
"name": "fact_store",
"description": (
"Structured local fact storage. Use for durable facts about people, projects, and decisions."
),
"parameters": {
"type": "object",
"properties": {
"action": {"type": "string", "enum": ["add", "search", "probe", "reason", "update", "remove"]},
"content": {"type": "string"},
"query": {"type": "string"},
"entity": {"type": "string"},
"fact_id": {"type": "integer"},
},
"required": ["action"],
},
}
# ---------------------------------------------------------------------------
# Provider Implementation
# ---------------------------------------------------------------------------
class MemPalacePortalProvider(MemoryProvider):
"""Hybrid Fleet Vector + Local Structured memory provider."""
def __init__(self, config: dict | None = None):
self._config = config or {}
self._api_url = os.environ.get("MEMPALACE_API_URL", "http://127.0.0.1:7771")
self._hologram_store = None
self._hologram_retriever = None
self._session_id = None
@property
def name(self) -> str:
return "mempalace"
def is_available(self) -> bool:
# Optimistically report available; API or Holographic failures surface at call time
return True
def initialize(self, session_id: str, **kwargs) -> None:
self._session_id = session_id
hermes_home = kwargs.get("hermes_home")
if HAS_HOLOGRAPHIC and hermes_home:
db_path = os.path.join(hermes_home, "memory_store.db")
try:
self._hologram_store = MemoryStore(db_path=db_path)
self._hologram_retriever = FactRetriever(store=self._hologram_store)
logger.info("Holographic store initialized as local portal layer.")
except Exception as e:
logger.error(f"Failed to init Holographic layer: {e}")
def system_prompt_block(self) -> str:
status = "Active (Fleet Portal)"
if self._hologram_store:
status += " + Local Hologram"
return (
f"# MemPalace Portal\n"
f"Status: {status}.\n"
"You have access to the shared fleet vector database (mempalace) and local structured facts (fact_store).\n"
"Use mempalace for semantic fleet-wide recall. Use fact_store for precise local knowledge."
)
def prefetch(self, query: str, *, session_id: str = "") -> str:
if not query:
return ""
context_blocks = []
# 1. Fleet Search (MemPalace)
try:
res = requests.get(f"{self._api_url}/search", params={"q": query, "n": 3}, timeout=2)
if res.ok:
data = res.json()
memories = data.get("memories", [])
if memories:
block = "## Fleet Memories (MemPalace)\n"
for m in memories:
block += f"- {m['text']}\n"
context_blocks.append(block)
except Exception:
pass
# 2. Local Probe (Holographic)
if self._hologram_retriever:
try:
# Extract entities from query to probe
entities = re.findall(r'\b([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)\b', query)
facts = []
for ent in entities:
results = self._hologram_retriever.probe(ent, limit=3)
facts.extend(results)
if facts:
block = "## Local Facts (Hologram)\n"
seen = set()
for f in facts:
if f['content'] not in seen:
block += f"- {f['content']}\n"
seen.add(f['content'])
context_blocks.append(block)
except Exception:
pass
return "\n\n".join(context_blocks)
def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
# Record to Fleet Palace
try:
payload = {
"text": f"User: {user_content}\nAssistant: {assistant_content}",
"room": "hermes_sync",
"metadata": {"session_id": session_id}
}
requests.post(f"{self._api_url}/record", json=payload, timeout=2)
except Exception:
pass
def on_pre_compress(self, messages: List[Dict[str, Any]]) -> str:
"""Token Efficiency: Summarize and archive before context is lost."""
if not messages:
return ""
# Extract key facts for Hologram
if self._hologram_store:
# Simple heuristic: look for "I prefer", "The project uses", etc.
for msg in messages:
if msg.get("role") == "user":
content = msg.get("content", "")
if "prefer" in content.lower() or "use" in content.lower():
try:
self._hologram_store.add_fact(content[:200], category="user_pref")
except Exception:
pass
# Archive session summary to MemPalace
summary_text = f"Session {self._session_id} summary: " + " | ".join([m['content'][:50] for m in messages if m.get('role') == 'user'])
try:
payload = {
"text": summary_text,
"room": "summaries",
"metadata": {"type": "session_summary", "session_id": self._session_id}
}
requests.post(f"{self._api_url}/record", json=payload, timeout=2)
except Exception:
pass
return "Insights archived to MemPalace and Hologram."
def get_tool_schemas(self) -> List[Dict[str, Any]]:
return [MEMPALACE_SCHEMA, FACT_STORE_SCHEMA]
def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs) -> str:
if tool_name == "mempalace":
return self._handle_mempalace(args)
elif tool_name == "fact_store":
return self._handle_fact_store(args)
return json.dumps({"error": f"Unknown tool: {tool_name}"})
def _handle_mempalace(self, args: dict) -> str:
action = args.get("action")
try:
if action == "search":
res = requests.get(f"{self._api_url}/search", params={"q": args["query"], "n": args.get("n_results", 5)}, timeout=10)
return res.text
elif action == "record":
res = requests.post(f"{self._api_url}/record", json={"text": args["text"], "room": args.get("room", "general")}, timeout=10)
return res.text
elif action == "wings":
res = requests.get(f"{self._api_url}/wings", timeout=10)
return res.text
except Exception as e:
return json.dumps({"success": False, "error": str(e)})
return json.dumps({"error": "Invalid action"})
def _handle_fact_store(self, args: dict) -> str:
if not self._hologram_store:
return json.dumps({"error": "Holographic store not initialized locally."})
# Logic similar to holographic plugin
action = args["action"]
try:
if action == "add":
fid = self._hologram_store.add_fact(args["content"])
return json.dumps({"fact_id": fid, "status": "added"})
elif action == "probe":
res = self._hologram_retriever.probe(args["entity"])
return json.dumps({"results": res})
# ... other actions ...
return json.dumps({"status": "ok", "message": f"Action {action} processed (partial impl)"})
except Exception as e:
return json.dumps({"error": str(e)})
def shutdown(self) -> None:
if self._hologram_store:
self._hologram_store.close()
def register(ctx) -> None:
provider = MemPalacePortalProvider()
ctx.register_memory_provider(provider)


@@ -0,0 +1,7 @@
name: mempalace
version: 1.0.0
description: "The Portal: Hybrid Fleet Vector (MemPalace) + Local Structured (Holographic) memory."
dependencies:
- requests
- numpy

scripts/memory_budget.py (new file, 374 additions)

@@ -0,0 +1,374 @@
#!/usr/bin/env python3
"""Memory Budget Enforcement Tool for hermes-agent.
Checks and enforces character/token budgets on MEMORY.md and USER.md files.
Designed for CI integration, pre-commit hooks, and manual health checks.
Usage:
python scripts/memory_budget.py # Check budget (exit 0/1)
python scripts/memory_budget.py --report # Detailed breakdown
python scripts/memory_budget.py --enforce # Trim entries to fit budget
python scripts/memory_budget.py --hermes-home ~/.hermes # Custom HERMES_HOME
Exit codes:
0 Within budget
1 Over budget (no trimming performed)
2 Entries were trimmed (--enforce was used)
"""
from __future__ import annotations
import argparse
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import List
# ---------------------------------------------------------------------------
# Constants (must stay in sync with tools/memory_tool.py)
# ---------------------------------------------------------------------------
ENTRY_DELIMITER = "\n§\n"
DEFAULT_MEMORY_CHAR_LIMIT = 2200
DEFAULT_USER_CHAR_LIMIT = 1375
WARN_THRESHOLD = 0.80 # alert when >80% of budget used
CHARS_PER_TOKEN = 4 # rough estimate matching agent/model_metadata.py
# ---------------------------------------------------------------------------
# Data structures
# ---------------------------------------------------------------------------
@dataclass
class FileReport:
"""Budget analysis for a single memory file."""
label: str # "MEMORY.md" or "USER.md"
path: Path
exists: bool
char_limit: int
raw_chars: int # raw file size in chars
entry_chars: int # chars after splitting/rejoining entries
entry_count: int
entries: List[str] # individual entry texts
@property
def usage_pct(self) -> float:
if self.char_limit <= 0:
return 0.0
return min(100.0, (self.entry_chars / self.char_limit) * 100)
@property
def estimated_tokens(self) -> int:
return self.entry_chars // CHARS_PER_TOKEN
@property
def over_budget(self) -> bool:
return self.entry_chars > self.char_limit
@property
def warning(self) -> bool:
return self.usage_pct >= (WARN_THRESHOLD * 100)
@property
def remaining_chars(self) -> int:
return max(0, self.char_limit - self.entry_chars)
def _read_entries(path: Path) -> List[str]:
"""Read a memory file and split into entries (matching MemoryStore logic)."""
if not path.exists():
return []
try:
raw = path.read_text(encoding="utf-8")
except (OSError, IOError):
return []
if not raw.strip():
return []
entries = [e.strip() for e in raw.split(ENTRY_DELIMITER)]
return [e for e in entries if e]
def _write_entries(path: Path, entries: List[str]) -> None:
"""Write entries back to a memory file."""
content = ENTRY_DELIMITER.join(entries) if entries else ""
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(content, encoding="utf-8")
def analyze_file(path: Path, label: str, char_limit: int) -> FileReport:
"""Analyze a single memory file against its budget."""
exists = path.exists()
entries = _read_entries(path) if exists else []
raw_chars = path.stat().st_size if exists else 0
joined = ENTRY_DELIMITER.join(entries)
return FileReport(
label=label,
path=path,
exists=exists,
char_limit=char_limit,
raw_chars=raw_chars,
entry_chars=len(joined),
entry_count=len(entries),
entries=entries,
)
def trim_entries(report: FileReport) -> List[str]:
"""Trim oldest entries until the file fits within its budget.
Entries are removed from the front (oldest first) because memory files
append new entries at the end.
"""
entries = list(report.entries)
joined = ENTRY_DELIMITER.join(entries)
while len(joined) > report.char_limit and entries:
entries.pop(0)
joined = ENTRY_DELIMITER.join(entries)
return entries
# ---------------------------------------------------------------------------
# Reporting
# ---------------------------------------------------------------------------
def _bar(pct: float, width: int = 30) -> str:
"""Render a text progress bar."""
filled = int(pct / 100 * width)
bar = "#" * filled + "-" * (width - filled)
return f"[{bar}]"
def print_report(memory: FileReport, user: FileReport, *, verbose: bool = False) -> None:
"""Print a human-readable budget report."""
total_chars = memory.entry_chars + user.entry_chars
total_limit = memory.char_limit + user.char_limit
total_tokens = total_chars // CHARS_PER_TOKEN
total_pct = (total_chars / total_limit * 100) if total_limit > 0 else 0
print("=" * 60)
print(" MEMORY BUDGET REPORT")
print("=" * 60)
print()
for rpt in (memory, user):
status = "OVER " if rpt.over_budget else ("WARN" if rpt.warning else " OK ")
print(f" {rpt.label:12s} {status} {_bar(rpt.usage_pct)} {rpt.usage_pct:5.1f}%")
print(f" {'':12s} {rpt.entry_chars:,}/{rpt.char_limit:,} chars "
f"| {rpt.entry_count} entries "
f"| ~{rpt.estimated_tokens:,} tokens")
if rpt.exists and verbose and rpt.entries:
for i, entry in enumerate(rpt.entries):
preview = entry[:72].replace("\n", " ")
if len(entry) > 72:
preview += "..."
print(f" #{i+1}: ({len(entry)} chars) {preview}")
print()
print(f" TOTAL {_bar(total_pct)} {total_pct:5.1f}%")
print(f" {total_chars:,}/{total_limit:,} chars | ~{total_tokens:,} tokens")
print()
# Alerts
alerts = []
for rpt in (memory, user):
if rpt.over_budget:
overshoot = rpt.entry_chars - rpt.char_limit
alerts.append(
f" CRITICAL {rpt.label} is {overshoot:,} chars over budget "
f"({rpt.entry_chars:,}/{rpt.char_limit:,}). "
f"Run with --enforce to auto-trim."
)
elif rpt.warning:
alerts.append(
f" WARNING {rpt.label} is at {rpt.usage_pct:.0f}% capacity. "
f"Consider compressing or cleaning up entries."
)
if alerts:
print(" ALERTS")
print(" ------")
for a in alerts:
print(a)
print()
def print_json(memory: FileReport, user: FileReport) -> None:
"""Print a JSON report for machine consumption."""
import json
def _rpt_dict(r: FileReport) -> dict:
return {
"label": r.label,
"path": str(r.path),
"exists": r.exists,
"char_limit": r.char_limit,
"entry_chars": r.entry_chars,
"entry_count": r.entry_count,
"estimated_tokens": r.estimated_tokens,
"usage_pct": round(r.usage_pct, 1),
"over_budget": r.over_budget,
"warning": r.warning,
"remaining_chars": r.remaining_chars,
}
total_chars = memory.entry_chars + user.entry_chars
total_limit = memory.char_limit + user.char_limit
data = {
"memory": _rpt_dict(memory),
"user": _rpt_dict(user),
"total": {
"chars": total_chars,
"limit": total_limit,
"estimated_tokens": total_chars // CHARS_PER_TOKEN,
"usage_pct": round((total_chars / total_limit * 100) if total_limit else 0, 1),
"over_budget": memory.over_budget or user.over_budget,
"warning": memory.warning or user.warning,
},
}
print(json.dumps(data, indent=2))
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
def _resolve_hermes_home(custom: str | None) -> Path:
"""Resolve HERMES_HOME directory."""
if custom:
return Path(custom).expanduser()
import os
return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
def main() -> int:
parser = argparse.ArgumentParser(
description="Check and enforce memory budgets for hermes-agent.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=__doc__,
)
parser.add_argument(
"--hermes-home", metavar="DIR",
help="Custom HERMES_HOME directory (default: $HERMES_HOME or ~/.hermes)",
)
parser.add_argument(
"--memory-limit", type=int, default=DEFAULT_MEMORY_CHAR_LIMIT,
help=f"Character limit for MEMORY.md (default: {DEFAULT_MEMORY_CHAR_LIMIT})",
)
parser.add_argument(
"--user-limit", type=int, default=DEFAULT_USER_CHAR_LIMIT,
help=f"Character limit for USER.md (default: {DEFAULT_USER_CHAR_LIMIT})",
)
parser.add_argument(
"--report", action="store_true",
help="Print detailed per-file budget report",
)
parser.add_argument(
"--verbose", "-v", action="store_true",
help="Show individual entry details in report",
)
parser.add_argument(
"--enforce", action="store_true",
help="Trim oldest entries to fit within budget (writes to disk)",
)
parser.add_argument(
"--json", action="store_true", dest="json_output",
help="Output report as JSON (for CI/scripting)",
)
args = parser.parse_args()
hermes_home = _resolve_hermes_home(args.hermes_home)
memories_dir = hermes_home / "memories"
# Analyze both files
memory = analyze_file(
memories_dir / "MEMORY.md", "MEMORY.md", args.memory_limit,
)
user = analyze_file(
memories_dir / "USER.md", "USER.md", args.user_limit,
)
over_budget = memory.over_budget or user.over_budget
trimmed = False
# Enforce budget by trimming entries
if args.enforce and over_budget:
for rpt in (memory, user):
if rpt.over_budget and rpt.exists:
trimmed_entries = trim_entries(rpt)
removed = rpt.entry_count - len(trimmed_entries)
if removed > 0:
_write_entries(rpt.path, trimmed_entries)
rpt.entries = trimmed_entries
rpt.entry_count = len(trimmed_entries)
rpt.entry_chars = len(ENTRY_DELIMITER.join(trimmed_entries))
                    rpt.raw_chars = len(rpt.path.read_text(encoding="utf-8"))  # count characters, not bytes
print(f" Trimmed {removed} oldest entries from {rpt.label} "
f"({rpt.entry_chars:,}/{rpt.char_limit:,} chars now)")
trimmed = True
# Re-check after trimming
over_budget = memory.over_budget or user.over_budget
# Output
if args.json_output:
print_json(memory, user)
elif args.report or args.verbose:
print_report(memory, user, verbose=args.verbose)
else:
# Compact summary
if over_budget:
print("Memory budget: OVER")
for rpt in (memory, user):
if rpt.over_budget:
print(f" {rpt.label}: {rpt.entry_chars:,}/{rpt.char_limit:,} chars "
f"({rpt.usage_pct:.0f}%)")
elif memory.warning or user.warning:
print("Memory budget: WARNING")
for rpt in (memory, user):
if rpt.warning:
print(f" {rpt.label}: {rpt.entry_chars:,}/{rpt.char_limit:,} chars "
f"({rpt.usage_pct:.0f}%)")
else:
print("Memory budget: OK")
for rpt in (memory, user):
if rpt.exists:
print(f" {rpt.label}: {rpt.entry_chars:,}/{rpt.char_limit:,} chars "
f"({rpt.usage_pct:.0f}%)")
# Suggest actions when over budget but not enforced
if over_budget and not args.enforce:
suggestions = []
for rpt in (memory, user):
if rpt.over_budget:
suggestions.append(
f" - {rpt.label}: remove stale entries or run with --enforce to auto-trim"
)
# Identify largest entries
if rpt.entries:
indexed = sorted(enumerate(rpt.entries), key=lambda x: len(x[1]), reverse=True)
top3 = indexed[:3]
for idx, entry in top3:
preview = entry[:60].replace("\n", " ")
if len(entry) > 60:
preview += "..."
suggestions.append(
f" largest entry #{idx+1}: ({len(entry)} chars) {preview}"
)
if suggestions:
print()
print("Suggestions:")
for s in suggestions:
print(s)
# Exit code
if trimmed:
return 2
if over_budget:
return 1
return 0
if __name__ == "__main__":
sys.exit(main())
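# Example CI usage (illustrative; the script path is an assumption):
#
#     python scripts/memory_budget.py --report    # human-readable gate
#     python scripts/memory_budget.py --json      # machine-readable gate
#
# Exit status 0 means within budget, 1 means over budget, and 2 means entries
# were trimmed (only possible with --enforce), so a CI job can fail on 1 while
# treating 2 as a soft warning.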


@@ -0,0 +1,325 @@
#!/usr/bin/env python3
"""
Memory Sovereignty Verification
Verifies that the memory path in hermes-agent has no network dependencies.
Memory data must stay on the local filesystem only — no HTTP calls, no external
API calls, no cloud sync during memory read/write/flush/load operations.
Scans:
- tools/memory_tool.py (MEMORY.md / USER.md store)
- hermes_state.py (SQLite session store)
- tools/session_search_tool.py (FTS5 session search + summarization)
- tools/graph_store.py (knowledge graph persistence)
- tools/temporal_kg_tool.py (temporal knowledge graph)
- agent/temporal_knowledge_graph.py (temporal triple store)
- tools/skills_tool.py (skill listing/viewing)
- tools/skills_sync.py (bundled skill syncing)
Exit codes:
0 = sovereign (no violations)
1 = violations found
"""
import ast
import re
import sys
from pathlib import Path
# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------
# Files in the memory path to scan (relative to repo root).
MEMORY_FILES = [
"tools/memory_tool.py",
"hermes_state.py",
"tools/session_search_tool.py",
"tools/graph_store.py",
"tools/temporal_kg_tool.py",
"agent/temporal_knowledge_graph.py",
"tools/skills_tool.py",
"tools/skills_sync.py",
]
# Patterns that indicate network/external API usage.
NETWORK_PATTERNS = [
# HTTP libraries
(r'\brequests\.(get|post|put|delete|patch|head|session)', "requests HTTP call"),
(r'\burllib\.request\.(urlopen|Request)', "urllib HTTP call"),
(r'\bhttpx\.(get|post|put|delete|Client|AsyncClient)', "httpx HTTP call"),
(r'\bhttp\.client\.(HTTPConnection|HTTPSConnection)', "http.client connection"),
(r'\baiohttp\.(ClientSession|get|post)', "aiohttp HTTP call"),
(r'\bwebsockets\.\w+', "websocket connection"),
# API client patterns
(r'\bopenai\b.*\b(api_key|chat|completions|Client)\b', "OpenAI API usage"),
(r'\banthropic\b.*\b(api_key|messages|Client)\b', "Anthropic API usage"),
(r'\bAsyncOpenAI\b', "AsyncOpenAI client"),
(r'\bAsyncAnthropic\b', "AsyncAnthropic client"),
# Generic network indicators
(r'\bsocket\.(socket|connect|create_connection)', "raw socket connection"),
(r'\bftplib\b', "FTP connection"),
(r'\bsmtplib\b', "SMTP connection"),
(r'\bparamiko\b', "SSH connection via paramiko"),
# URL patterns (hardcoded endpoints)
(r'https?://(?!example\.com)[a-zA-Z0-9._-]+\.(com|org|net|io|dev|ai)', "hardcoded URL"),
]
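# Illustrative matches (assumption: ad-hoc checks, not executed by this script):
#
#     re.search(NETWORK_PATTERNS[0][0], "resp = requests.get(url)", re.IGNORECASE)
#     # -> matches ("requests HTTP call")
#     re.search(NETWORK_PATTERNS[0][0], "data = path.read_text()", re.IGNORECASE)
#     # -> None; purely local I/O matches none of the patterns above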
# Import aliases that indicate network-capable modules.
NETWORK_IMPORTS = {
"requests",
"httpx",
"aiohttp",
"urllib.request",
"http.client",
"websockets",
"openai",
"anthropic",
"openrouter_client",
}
# Functions whose names suggest network I/O.
NETWORK_FUNC_NAMES = {
"async_call_llm",
"extract_content_or_reasoning",
}
# Files that are ALLOWED to have network calls (known violations with justification).
# Each entry maps to a reason string.
KNOWN_VIOLATIONS = {
"tools/graph_store.py": (
"GraphStore persists to Gitea via API. This is a known architectural trade-off "
"for knowledge graph persistence, which is not part of the core memory path "
"(MEMORY.md/USER.md/SQLite). Future work will explore local-first alternatives "
"to align more closely with SOUL.md principles."
),
"tools/session_search_tool.py": (
"Session search uses LLM summarization via an auxiliary client. While the FTS5 "
"search is local, the LLM call for summarization is an external dependency. "
"This is a temporary architectural trade-off for enhanced presentation. "
"Research is ongoing to implement local LLM options for full sovereignty, "
"in line with SOUL.md."
),
}
# ---------------------------------------------------------------------------
# Scanner
# ---------------------------------------------------------------------------
class Violation:
"""A sovereignty violation with location and description."""
def __init__(self, file: str, line: int, description: str, code: str):
self.file = file
self.line = line
self.description = description
self.code = code.strip()
def __str__(self):
return f"{self.file}:{self.line}: {self.description}\n {self.code}"
def scan_file(filepath: Path, repo_root: Path) -> list[Violation]:
"""Scan a single file for network dependency patterns."""
violations = []
rel_path = str(filepath.relative_to(repo_root))
# Skip known violations
if rel_path in KNOWN_VIOLATIONS:
return violations
try:
content = filepath.read_text(encoding="utf-8")
    except OSError as e:  # IOError is an alias of OSError in Python 3
print(f"WARNING: Cannot read {rel_path}: {e}", file=sys.stderr)
return violations
lines = content.splitlines()
# --- Check imports ---
try:
tree = ast.parse(content, filename=str(filepath))
except SyntaxError as e:
print(f"WARNING: Cannot parse {rel_path}: {e}", file=sys.stderr)
return violations
for node in ast.walk(tree):
if isinstance(node, ast.Import):
for alias in node.names:
mod = alias.name
if mod in NETWORK_IMPORTS or any(
mod.startswith(ni + ".") for ni in NETWORK_IMPORTS
):
violations.append(Violation(
rel_path, node.lineno,
f"Network-capable import: {mod}",
lines[node.lineno - 1] if node.lineno <= len(lines) else "",
))
elif isinstance(node, ast.ImportFrom):
if node.module and (
node.module in NETWORK_IMPORTS
or any(node.module.startswith(ni + ".") for ni in NETWORK_IMPORTS)
):
violations.append(Violation(
rel_path, node.lineno,
f"Network-capable import from: {node.module}",
lines[node.lineno - 1] if node.lineno <= len(lines) else "",
))
# --- Check for LLM call function usage ---
for i, line in enumerate(lines, 1):
stripped = line.strip()
if stripped.startswith("#"):
continue
        if stripped.startswith(("def ", "class ")):
            continue
        for func_name in NETWORK_FUNC_NAMES:
            # Flag actual call sites only; re.escape guards against regex metacharacters
            if re.search(r'\b' + re.escape(func_name) + r'\s*\(', line):
                violations.append(Violation(
                    rel_path, i,
                    f"External LLM call function: {func_name}()",
                    line,
                ))
# --- Regex-based pattern matching ---
for i, line in enumerate(lines, 1):
stripped = line.strip()
if stripped.startswith("#"):
continue
for pattern, description in NETWORK_PATTERNS:
if re.search(pattern, line, re.IGNORECASE):
violations.append(Violation(
rel_path, i,
f"Suspicious pattern ({description})",
line,
))
return violations
def verify_sovereignty(repo_root: Path) -> tuple[list[Violation], list[str]]:
"""Run sovereignty verification across all memory files.
Returns (violations, info_messages).
"""
all_violations = []
info = []
for rel_path in MEMORY_FILES:
filepath = repo_root / rel_path
if not filepath.exists():
info.append(f"SKIP: {rel_path} (file not found)")
continue
if rel_path in KNOWN_VIOLATIONS:
info.append(
f"WARN: {rel_path} — known violation (excluded from gate): "
f"{KNOWN_VIOLATIONS[rel_path]}"
)
continue
violations = scan_file(filepath, repo_root)
all_violations.extend(violations)
if not violations:
info.append(f"PASS: {rel_path} — sovereign (local-only)")
return all_violations, info
# ---------------------------------------------------------------------------
# Deep analysis helpers
# ---------------------------------------------------------------------------
def check_graph_store_network(repo_root: Path) -> str:
"""Analyze graph_store.py for its network dependencies."""
filepath = repo_root / "tools" / "graph_store.py"
if not filepath.exists():
return ""
content = filepath.read_text(encoding="utf-8")
if "GiteaClient" in content:
return (
"tools/graph_store.py uses GiteaClient for persistence — "
"this is an external API call. However, graph_store is NOT part of "
"the core memory path (MEMORY.md/USER.md/SQLite). It is a separate "
"knowledge graph system."
)
return ""
def check_session_search_llm(repo_root: Path) -> str:
"""Analyze session_search_tool.py for LLM usage."""
filepath = repo_root / "tools" / "session_search_tool.py"
if not filepath.exists():
return ""
content = filepath.read_text(encoding="utf-8")
warnings = []
if "async_call_llm" in content:
warnings.append("uses async_call_llm for summarization")
if "auxiliary_client" in content:
warnings.append("imports auxiliary_client (LLM calls)")
if warnings:
return (
f"tools/session_search_tool.py: {'; '.join(warnings)}. "
f"The FTS5 search is local SQLite, but session summarization "
f"involves LLM API calls."
)
return ""
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
def main() -> int:
    repo_root = Path(__file__).resolve().parent.parent
    print("Memory Sovereignty Verification")
print(f"Repository: {repo_root}")
print(f"Scanning {len(MEMORY_FILES)} memory-path files...")
print()
violations, info = verify_sovereignty(repo_root)
# Print info messages
for msg in info:
print(f" {msg}")
# Print deep analysis
print()
print("Deep analysis:")
for checker in [check_graph_store_network, check_session_search_llm]:
note = checker(repo_root)
if note:
print(f" NOTE: {note}")
print()
if violations:
print(f"SOVEREIGNTY VIOLATIONS FOUND: {len(violations)}")
print("=" * 60)
for v in violations:
print(v)
print()
print("=" * 60)
print(
f"FAIL: {len(violations)} potential network dependencies detected "
f"in the memory path."
)
print("Memory must be local-only (filesystem + SQLite).")
print()
print("If a violation is intentional and documented, add it to")
print("KNOWN_VIOLATIONS in this script with a justification.")
return 1
else:
print("PASS: Memory path is sovereign — no network dependencies detected.")
print("All memory operations use local filesystem and/or SQLite only.")
return 0
if __name__ == "__main__":
sys.exit(main())