Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture)
and manual analysis of the entire codebase.
Changes by category:
Unused imports removed (~95 across 55 files):
- Removed genuinely unused imports from all major subsystems
- agent/, hermes_cli/, tools/, gateway/, plugins/, cron/
- Includes imports in try/except blocks that were truly unused
(vs availability checks which were left alone)
Unused variables removed (~25):
- Removed dead variables: connected, inner, channels, last_exc,
source, new_server_names, verify, pconfig, default_terminal,
result, pending_handled, temperature, loop
- Dropped unused argparse subparser assignments in hermes_cli/main.py
(12 instances of add_parser() where result was never used)
Dead code removed:
- run_agent.py: Removed dead ternary (None if False else None) and
surrounding unreachable branch in identity fallback
- run_agent.py: Removed write-only attribute _last_reported_tool
- hermes_cli/providers.py: Removed dead @property decorator on
module-level function (decorator has no effect outside a class)
- gateway/run.py: Removed unused MCP config load before reconnect
- gateway/platforms/slack.py: Removed dead SessionSource construction
Undefined name bugs fixed (would cause NameError at runtime):
- batch_runner.py: Added missing logger = logging.getLogger(__name__)
- tools/environments/daytona.py: Added missing Dict and Path imports
Unnecessary global statements removed (14):
- tools/terminal_tool.py: 5 functions declared global for dicts
they only mutated via .pop()/[key]=value (no rebinding)
- tools/browser_tool.py: cleanup thread loop only reads flag
- tools/rl_training_tool.py: 4 functions only do dict mutations
- tools/mcp_oauth.py: only reads the global
- hermes_time.py: only reads cached values
Inefficient patterns fixed:
- startswith/endswith tuple form: 15 instances of
x.startswith('a') or x.startswith('b') consolidated to
x.startswith(('a', 'b'))
- len(x)==0 / len(x)>0: 13 instances replaced with pythonic
truthiness checks (not x / bool(x))
- in dict.keys(): 5 instances simplified to in dict
- Redefined unused name: removed duplicate _strip_mdv2 import in
send_message_tool.py
Other fixes:
- hermes_cli/doctor.py: Replaced undefined logger.debug() with pass
- hermes_cli/config.py: Consolidated chained .endswith() calls
Test results: 3934 passed, 17 failed (all pre-existing on main),
19 skipped. Zero regressions.
232 lines
9.6 KiB
Python
232 lines
9.6 KiB
Python
"""Abstract base class for pluggable memory providers.
|
|
|
|
Memory providers give the agent persistent recall across sessions. One
|
|
external provider is active at a time alongside the always-on built-in
|
|
memory (MEMORY.md / USER.md). The MemoryManager enforces this limit.
|
|
|
|
Built-in memory is always active as the first provider and cannot be removed.
|
|
External providers (Honcho, Hindsight, Mem0, etc.) are additive — they never
|
|
disable the built-in store. Only one external provider runs at a time to
|
|
prevent tool schema bloat and conflicting memory backends.
|
|
|
|
Registration:
|
|
1. Built-in: BuiltinMemoryProvider — always present, not removable.
|
|
2. Plugins: Ship in plugins/memory/<name>/, activated by memory.provider config.
|
|
|
|
Lifecycle (called by MemoryManager, wired in run_agent.py):
|
|
initialize() — connect, create resources, warm up
|
|
system_prompt_block() — static text for the system prompt
|
|
prefetch(query) — background recall before each turn
|
|
sync_turn(user, asst) — async write after each turn
|
|
get_tool_schemas() — tool schemas to expose to the model
|
|
handle_tool_call() — dispatch a tool call
|
|
shutdown() — clean exit
|
|
|
|
Optional hooks (override to opt in):
|
|
on_turn_start(turn, message, **kwargs) — per-turn tick with runtime context
|
|
on_session_end(messages) — end-of-session extraction
|
|
on_pre_compress(messages) -> str — extract before context compression
|
|
on_memory_write(action, target, content) — mirror built-in memory writes
|
|
on_delegation(task, result, **kwargs) — parent-side observation of subagent work
|
|
"""
|
|
|
|
from __future__ import annotations
|
|
|
|
import logging
|
|
from abc import ABC, abstractmethod
|
|
from typing import Any, Dict, List
|
|
|
|
logger = logging.getLogger(__name__)
|
|
|
|
|
|
class MemoryProvider(ABC):
|
|
"""Abstract base class for memory providers."""
|
|
|
|
@property
|
|
@abstractmethod
|
|
def name(self) -> str:
|
|
"""Short identifier for this provider (e.g. 'builtin', 'honcho', 'hindsight')."""
|
|
|
|
# -- Core lifecycle (implement these) ------------------------------------
|
|
|
|
@abstractmethod
|
|
def is_available(self) -> bool:
|
|
"""Return True if this provider is configured, has credentials, and is ready.
|
|
|
|
Called during agent init to decide whether to activate the provider.
|
|
Should not make network calls — just check config and installed deps.
|
|
"""
|
|
|
|
@abstractmethod
|
|
def initialize(self, session_id: str, **kwargs) -> None:
|
|
"""Initialize for a session.
|
|
|
|
Called once at agent startup. May create resources (banks, tables),
|
|
establish connections, start background threads, etc.
|
|
|
|
kwargs always include:
|
|
- hermes_home (str): The active HERMES_HOME directory path. Use this
|
|
for profile-scoped storage instead of hardcoding ``~/.hermes``.
|
|
- platform (str): "cli", "telegram", "discord", "cron", etc.
|
|
|
|
kwargs may also include:
|
|
- agent_context (str): "primary", "subagent", "cron", or "flush".
|
|
Providers should skip writes for non-primary contexts (cron system
|
|
prompts would corrupt user representations).
|
|
- agent_identity (str): Profile name (e.g. "coder"). Use for
|
|
per-profile provider identity scoping.
|
|
- agent_workspace (str): Shared workspace name (e.g. "hermes").
|
|
- parent_session_id (str): For subagents, the parent's session_id.
|
|
- user_id (str): Platform user identifier (gateway sessions).
|
|
"""
|
|
|
|
def system_prompt_block(self) -> str:
|
|
"""Return text to include in the system prompt.
|
|
|
|
Called during system prompt assembly. Return empty string to skip.
|
|
This is for STATIC provider info (instructions, status). Prefetched
|
|
recall context is injected separately via prefetch().
|
|
"""
|
|
return ""
|
|
|
|
def prefetch(self, query: str, *, session_id: str = "") -> str:
|
|
"""Recall relevant context for the upcoming turn.
|
|
|
|
Called before each API call. Return formatted text to inject as
|
|
context, or empty string if nothing relevant. Implementations
|
|
should be fast — use background threads for the actual recall
|
|
and return cached results here.
|
|
|
|
session_id is provided for providers serving concurrent sessions
|
|
(gateway group chats, cached agents). Providers that don't need
|
|
per-session scoping can ignore it.
|
|
"""
|
|
return ""
|
|
|
|
def queue_prefetch(self, query: str, *, session_id: str = "") -> None:
|
|
"""Queue a background recall for the NEXT turn.
|
|
|
|
Called after each turn completes. The result will be consumed
|
|
by prefetch() on the next turn. Default is no-op — providers
|
|
that do background prefetching should override this.
|
|
"""
|
|
|
|
def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
|
|
"""Persist a completed turn to the backend.
|
|
|
|
Called after each turn. Should be non-blocking — queue for
|
|
background processing if the backend has latency.
|
|
"""
|
|
|
|
@abstractmethod
|
|
def get_tool_schemas(self) -> List[Dict[str, Any]]:
|
|
"""Return tool schemas this provider exposes.
|
|
|
|
Each schema follows the OpenAI function calling format:
|
|
{"name": "...", "description": "...", "parameters": {...}}
|
|
|
|
Return empty list if this provider has no tools (context-only).
|
|
"""
|
|
|
|
def handle_tool_call(self, tool_name: str, args: Dict[str, Any], **kwargs) -> str:
|
|
"""Handle a tool call for one of this provider's tools.
|
|
|
|
Must return a JSON string (the tool result).
|
|
Only called for tool names returned by get_tool_schemas().
|
|
"""
|
|
raise NotImplementedError(f"Provider {self.name} does not handle tool {tool_name}")
|
|
|
|
def shutdown(self) -> None:
|
|
"""Clean shutdown — flush queues, close connections."""
|
|
|
|
# -- Optional hooks (override to opt in) ---------------------------------
|
|
|
|
def on_turn_start(self, turn_number: int, message: str, **kwargs) -> None:
|
|
"""Called at the start of each turn with the user message.
|
|
|
|
Use for turn-counting, scope management, periodic maintenance.
|
|
|
|
kwargs may include: remaining_tokens, model, platform, tool_count.
|
|
Providers use what they need; extras are ignored.
|
|
"""
|
|
|
|
def on_session_end(self, messages: List[Dict[str, Any]]) -> None:
|
|
"""Called when a session ends (explicit exit or timeout).
|
|
|
|
Use for end-of-session fact extraction, summarization, etc.
|
|
messages is the full conversation history.
|
|
|
|
NOT called after every turn — only at actual session boundaries
|
|
(CLI exit, /reset, gateway session expiry).
|
|
"""
|
|
|
|
def on_pre_compress(self, messages: List[Dict[str, Any]]) -> str:
|
|
"""Called before context compression discards old messages.
|
|
|
|
Use to extract insights from messages about to be compressed.
|
|
messages is the list that will be summarized/discarded.
|
|
|
|
Return text to include in the compression summary prompt so the
|
|
compressor preserves provider-extracted insights. Return empty
|
|
string for no contribution (backwards-compatible default).
|
|
"""
|
|
return ""
|
|
|
|
def on_delegation(self, task: str, result: str, *,
|
|
child_session_id: str = "", **kwargs) -> None:
|
|
"""Called on the PARENT agent when a subagent completes.
|
|
|
|
The parent's memory provider gets the task+result pair as an
|
|
observation of what was delegated and what came back. The subagent
|
|
itself has no provider session (skip_memory=True).
|
|
|
|
task: the delegation prompt
|
|
result: the subagent's final response
|
|
child_session_id: the subagent's session_id
|
|
"""
|
|
|
|
def get_config_schema(self) -> List[Dict[str, Any]]:
|
|
"""Return config fields this provider needs for setup.
|
|
|
|
Used by 'hermes memory setup' to walk the user through configuration.
|
|
Each field is a dict with:
|
|
key: config key name (e.g. 'api_key', 'mode')
|
|
description: human-readable description
|
|
secret: True if this should go to .env (default: False)
|
|
required: True if required (default: False)
|
|
default: default value (optional)
|
|
choices: list of valid values (optional)
|
|
url: URL where user can get this credential (optional)
|
|
env_var: explicit env var name for secrets (default: auto-generated)
|
|
|
|
Return empty list if no config needed (e.g. local-only providers).
|
|
"""
|
|
return []
|
|
|
|
def save_config(self, values: Dict[str, Any], hermes_home: str) -> None:
|
|
"""Write non-secret config to the provider's native location.
|
|
|
|
Called by 'hermes memory setup' after collecting user inputs.
|
|
``values`` contains only non-secret fields (secrets go to .env).
|
|
``hermes_home`` is the active HERMES_HOME directory path.
|
|
|
|
Providers with native config files (JSON, YAML) should override
|
|
this to write to their expected location. Providers that use only
|
|
env vars can leave the default (no-op).
|
|
|
|
All new memory provider plugins MUST implement either:
|
|
- save_config() for native config file formats, OR
|
|
- use only env vars (in which case get_config_schema() fields
|
|
should all have ``env_var`` set and this method stays no-op).
|
|
"""
|
|
|
|
def on_memory_write(self, action: str, target: str, content: str) -> None:
|
|
"""Called when the built-in memory tool writes an entry.
|
|
|
|
action: 'add', 'replace', or 'remove'
|
|
target: 'memory' or 'user'
|
|
content: the entry content
|
|
|
|
Use to mirror built-in memory writes to your backend.
|
|
"""
|