- Updated the HermesCLI layout to replace the floating completion menu with an inline CompletionsMenu, ensuring it appears consistently below the input area.
- This change enhances user experience by maintaining visibility of completions even after agent output fills the terminal, improving usability in non-full-screen modes.
- Eliminated the 'read' action from the memory tool and related logging in the agent, streamlining the available actions to 'add', 'replace', and 'remove'.
- Updated error messages and documentation to reflect the removal of the 'read' action, ensuring clarity in the API's usage.
Two-part implementation:
Part A - Curated Bounded Memory:
- New memory tool (tools/memory_tool.py) with MEMORY.md + USER.md stores
- Character-limited (2200/1375 chars), § delimited entries
- Frozen snapshot injected into system prompt at session start
- Model manages pruning via replace/remove with substring matching
- Usage indicator shown in system prompt header
Part B - SQLite Session Store:
- New hermes_state.py with SessionDB class, FTS5 full-text search
- Gateway session.py rewritten to dual-write SQLite + legacy JSONL
- Compression-triggered session splitting with parent_session_id chains
- New session_search tool with Gemini Flash summarization of matched sessions
- CLI session lifecycle (create on launch, close on exit)
Also:
- System prompt now cached per session, only rebuilt on compression
(fixes prefix cache invalidation from date/time changes every turn)
- Config version bumped to 3, hermes doctor checks for new artifacts
- Disabled in batch_runner and RL environments
- Introduced a new function `_resolve_short_name` to convert short skill names to full identifiers, improving user experience during skill installation.
- Updated the `do_install` function to utilize the new resolution method for identifiers without slashes, ensuring accurate skill fetching.
- Enhanced the install confirmation process to include a disclaimer about third-party skills, emphasizing user responsibility and security awareness.
- Introduced a new configuration section in `cli-config.yaml.example` for defining platform-specific toolsets, allowing for greater customization of available tools per platform.
- Updated the CLI to check for user-defined toolsets in the configuration, falling back to the default `hermes-cli` toolset if none are specified.
- Enhanced the `GatewayRunner` class to load platform-specific toolsets from the configuration, ensuring that the correct tools are enabled based on the platform being used.
- Introduced a new `todo_tool.py` for planning and tracking multi-step tasks, enhancing the agent's capabilities.
- Updated CLI to include a floating autocomplete dropdown for commands and improved user instructions for better navigation.
- Revised toolsets to incorporate the new `todo` tool and updated documentation to reflect changes in available tools and commands.
- Enhanced user experience with new keybindings and clearer command descriptions in the CLI.
- Updated the layout in HermesCLI to include a floating completion menu, improving user experience by providing real-time suggestions as users type.
- Refactored the layout structure to utilize FloatContainer, ensuring the input area remains accessible while displaying the completion menu dynamically.
- Revised user instructions to reflect the removal of the Ctrl+Enter key binding for new lines, simplifying the input method.
- Clarified that Alt+Enter is now the sole key for multi-line input, enhancing user experience.
- Removed the Shift+Enter key binding for inserting new lines, simplifying the input method.
- Introduced Ctrl+Enter as the primary key for multi-line input, ensuring better compatibility across terminals.
- Updated user instructions to reflect the new key bindings for a clearer user experience.
- Patched prompt_toolkit to recognize Shift+Enter as a distinct key for inserting new lines, improving the multiline input experience.
- Added Alt+Enter as a fallback for terminals that do not support Shift+Enter, ensuring consistent functionality across different environments.
- Updated user instructions to reflect the new key bindings for multiline input.
- Introduced SlashCommandCompleter for command autocompletion, enhancing user experience by suggesting commands as users type.
- Enabled multiline input with Shift+Enter, allowing users to enter longer messages more conveniently.
- Implemented paste detection to handle large text inputs, saving them to temporary files and replacing them with compact references in the input area.
- Updated input area styling and hint display to improve usability and feedback during agent operation.
- Updated the dynamic prompt to display the Hermes symbol when the agent is active, enhancing user feedback.
- Introduced a spacer line in the layout to prevent spinner output from overlapping the input cursor, improving usability.
- Adjusted the overall layout to maintain a clean interface while accommodating dynamic elements.
- Updated the input area prompt to dynamically reflect agent status, enhancing user feedback during operation.
- Removed the status line from the layout to streamline the interface, focusing solely on the input area.
- Adjusted styling for prompt states to improve visual clarity and user experience.
- Removed ANSI escape codes for color in tool activity messages to simplify output.
- Updated the _get_cute_tool_message method to provide a cleaner, more consistent format for various tool activities.
- Enhanced readability by aligning messages and removing unnecessary complexity, ensuring a more straightforward user experience.
- Introduced ANSI escape codes for color-coded CLI messages to enhance readability.
- Updated the _get_cute_tool_message method to generate clean, aligned activity lines for various tools, replacing kawaii ASCII art with a more structured format.
- Simplified message construction for web tools, terminal commands, and process management, ensuring consistent and scannable output.
- Updated the _build_tool_preview function to include detailed previews for new tools: 'todo', 'send_message', and various 'rl_' tools, improving user feedback during task execution.
- Added emoji representations for tools in GatewayRunner, including 'process', 'todo', and 'send_message', to enhance visual clarity in progress messages.
- Improved handling of task management and messaging outputs, ensuring more informative and user-friendly interactions.
Single `todo` tool that reads (no params) or writes (provide todos array
with merge flag). In-memory TodoStore on AIAgent, no system prompt
mutation, behavioral guidance in tool description only. State re-injected
after context compression events. Gateway sessions hydrate from
conversation history. Added to all platform toolsets.
Also wired into RL agent_loop.py with per-run TodoStore and fixed
browser_snapshot user_task passthrough from first user message.
- Enhanced the _build_tool_preview function to include specific formatting for the 'process' tool, displaying action, session_id, data, and timeout when applicable.
- This update improves the clarity of tool previews, particularly for actions that require session tracking and timeout management.
New process registry and tool for managing long-running background processes
across all terminal backends (local, Docker, Singularity, Modal, SSH).
Process Registry (tools/process_registry.py):
- ProcessSession tracking with rolling 200KB output buffer
- spawn_local() with optional PTY via ptyprocess for interactive CLIs
- spawn_via_env() for non-local backends (runs inside sandbox, never on host)
- Background reader threads per process (Popen stdout or PTY)
- wait() with timeout clamping, interrupt support, and transparent limit reporting
- JSON checkpoint to ~/.hermes/processes.json for gateway crash recovery
- Module-level singleton shared across agent loop, gateway, and RL
Process Tool (model_tools.py):
- 7 actions: list, poll, log, wait, kill, write, submit
- Paired with terminal in all toolsets (CLI, messaging, RL)
- Timeout clamping with transparent notes in response
Terminal Tool Updates (tools/terminal_tool.py):
- Replaced nohup background mode with registry spawn (returns session_id)
- Added workdir parameter for per-command working directory
- Added check_interval parameter for gateway auto-check watchers
- Added pty parameter for interactive CLI tools (Codex, Claude Code)
- Updated TERMINAL_TOOL_DESCRIPTION with full background workflow docs
- Cleanup thread now respects active background processes (won't reap sandbox)
Gateway Integration (gateway/run.py, session.py, config.py):
- Session reset protection: sessions with active processes exempt from reset
- Default idle timeout increased from 2 hours to 24 hours
- from_dict fallback aligned to match (was 120, now 1440)
- session_key env var propagated to process registry for session mapping
- Crash recovery on gateway startup via checkpoint probe
- check_interval watcher: asyncio task polls process, delivers updates to platform
RL Safety (environments/):
- tool_context.py cleanup() kills background processes on episode end
- hermes_base_env.py warns when enabled_toolsets is None (loads all tools)
- Process tool safe in RL via wait() blocking the agent loop
Also:
- Added ptyprocess as optional dependency (in pyproject.toml [pty] extra + [all])
- Fixed pre-existing bug: rl_test_inference missing from TOOL_TO_TOOLSET_MAP
- Updated AGENTS.md with process management docs and project structure
- Updated README.md terminal section with process management overview
- Introduced a new parameter `skip_context_files` in the AIAgent class to control the inclusion of context files (SOUL.md, AGENTS.md, .cursorrules) in the system prompt.
- Updated the _process_single_prompt function to set `skip_context_files` to True, preventing pollution of trajectories during batch processing and data generation.
- cli-config.yaml.example: env_type → backend everywhere, matching the
documented config key that hermes_cli/config.py and README already use
- cli-config.yaml.example: added comments clarifying cwd is a path
INSIDE the target environment for non-local backends
- AGENTS.md: updated terminal.cwd description to explain "." only
resolves to host CWD for the local backend
- .env.example: updated TERMINAL_CWD comment to warn against using
host-local paths with remote backends, lists per-backend defaults
When using Modal, Docker, SSH, or Singularity as the terminal backend
from the CLI, the agent resolved cwd: "." to the host machine's local
path (e.g. /Users/rewbs/code/hermes-agent) and passed it to the remote
sandbox, where it doesn't exist. All commands failed with "No such file
or directory".
Root cause: cli.py unconditionally resolved "." to os.getcwd() and wrote
it to TERMINAL_CWD regardless of backend type. Every tool then used that
host-local path as the working directory inside the remote environment.
Fixes:
- cli.py: only resolve "." to os.getcwd() for the local backend. For all
remote backends (ssh, docker, modal, singularity), leave TERMINAL_CWD
unset so the tool layer uses per-backend defaults (/root, /, ~, etc.)
- terminal_tool.py: added sanity check -- if TERMINAL_CWD contains a
host-local prefix (/Users/, /home/, C:\) for a non-local backend, log
a warning and fall back to the backend's default
- terminal_tool.py: SSH default CWD is now ~ instead of os.getcwd()
- file_operations.py: last-resort CWD fallback changed from os.getcwd()
to "/" so host paths never leak into remote file operations
Two config systems used different key names for the terminal backend:
- hermes_cli/config.py, README, and all docs use "terminal.backend"
- cli.py's env var mapping only recognized "terminal.env_type"
Users following the docs who set `backend: modal` in ~/.hermes/config.yaml
had it silently ignored -- TERMINAL_ENV always defaulted to "local".
Additionally, when no config file existed, cli.py's hardcoded defaults
overwrote any TERMINAL_ENV=modal set in .env, despite the comment saying
"env vars take precedence."
Fixes:
- cli.py now normalizes "backend" -> "env_type" (backend takes precedence)
- Defaults no longer overwrite .env when no config file terminal section exists
- hermes status reads from config as fallback when env var isn't set
Also fixes four related bugs found in the Modal/sandbox lifecycle:
- file_tools cache not cleared on sandbox cleanup (stale ops on dead sandbox)
- Global lock held during slow Modal teardown (blocked all tool calls 10-15s)
- Race condition in file_tools between existence check and access (KeyError)
- Per-task creation locks never cleaned up (memory leak)
- Improved the caching mechanism for ShellFileOperations to ensure stale entries are invalidated when environments are cleaned up.
- Enhanced thread safety by refining the use of locks during environment creation and cleanup processes.
- Streamlined the cleanup of inactive environments to prevent blocking other tool calls, ensuring efficient resource management.
- Added error handling and messaging improvements for better user feedback during environment cleanup.
- Introduced a new cleanup function that ensures terminal and browser sessions are cleaned up only once during application exit.
- Updated atexit registration to use the new cleanup function, enhancing resource management and preventing potential issues from multiple cleanup calls.
- Modified terminal cleanup messaging to only display when environments are cleaned, improving user feedback.
- Removed the check for the browserbase SDK from the optional packages list.
- Added validation for Node.js installation and the presence of the agent-browser package, providing feedback on their status for browser automation tools.
- Updated the doctor script to load environment variables from user-specific and project-specific `.env` files, improving configuration management.
- Added checks for the existence of the `SOUL.md` persona file, providing feedback on its status and creating it with a template if missing.
- Enhanced install scripts to create the `SOUL.md` file if it doesn't exist, ensuring users can easily customize the agent's personality.
- Modified the retrieval of tool definitions to use the agent result's "tools" key, ensuring accurate logging in the transcript.
- Enhanced the response structure to include tools in the final output, improving the clarity of tool usage in session interactions.
- Refactored the agent response processing to return a comprehensive result dictionary, including final responses and full message history.
- Improved transcript logging to capture the complete conversation, including tool calls and intermediate reasoning, facilitating session resumption and debugging.
- Added handling for fresh sessions to include tool definitions in the transcript for clarity.
- Implemented logic to filter and timestamp new messages, ensuring accurate logging of user and assistant interactions.
- Implemented deep copy of DEFAULT_CONFIG to prevent mutations during config loading.
- Enhanced user config merging process to clarify the deep merge of user values over defaults.
- Added newline handling when appending environment variables to ensure proper formatting.
- Updated the set_config_value function to write only user-specific configurations back to the file, avoiding overwriting default values.
- Updated prompts for the OPENAI_BASE_URL to clarify its use for custom endpoints.
- Enhanced the migration function to skip "advanced" environment variables during interactive configuration, streamlining the setup for standard users.
- Improved messaging for missing optional API keys, ensuring clearer guidance for users during configuration.
- Revised descriptions and prompts for the OPENAI_BASE_URL and OPENAI_API_KEY environment variables to enhance user understanding.
- Added a URL reference for the OPENAI_API_KEY to guide users in obtaining their API key.
- Specified the use of the API key for voice transcription and custom endpoints, improving the overall configuration documentation.
- Updated the vision tool to accept both HTTP/HTTPS URLs and local file paths for image analysis.
- Implemented caching of user-uploaded images in local directories to ensure reliable access for the vision tool, addressing issues with ephemeral URLs.
- Enhanced platform adapters (Discord, Telegram, WhatsApp) to download and cache images, allowing for immediate analysis and enriched message context.
- Added a new method to auto-analyze images attached by users, enriching the conversation with detailed descriptions.
- Improved documentation for image handling processes and updated related functions for clarity and efficiency.
- Clarified the requirements for Telegram voice bubbles, specifying the need for ffmpeg when using Edge TTS.
- Enhanced README and messaging documentation to detail audio delivery formats across platforms.
- Improved installation script messages to inform users about the necessity of ffmpeg for proper audio playback on Telegram.
- Added detection of the platform from the environment variable to determine the appropriate audio output format.
- Implemented logic to output Opus (.ogg) files for Telegram when using compatible TTS providers, while defaulting to MP3 for others.
- Updated `pyproject.toml` to include Edge TTS and ElevenLabs as dependencies.
- Enhanced documentation to detail voice message capabilities across platforms and TTS provider options.
- Modified the GatewayRunner to handle MEDIA tags from TTS tool responses, ensuring proper delivery of audio messages.
- Introduced a default agent identity prompt to ensure consistent behavior across platforms.
- Added platform-specific formatting hints for CLI, WhatsApp, Telegram, and Discord to guide the agent's output style.
- Updated the AIAgent initialization to accept a platform parameter, enhancing adaptability to different interfaces.
- Appended the current local date and time to the active system prompt to provide context for the model, addressing potential misinterpretations due to training cutoffs.
- Introduced a new script, `kill_modal.sh`, to facilitate stopping running Modal apps, including the ability to stop all apps or specific swe-rex sandboxes.
- Enhanced user experience with clear usage instructions and feedback during the stopping process.
- Improved error handling to ensure smooth execution even if some apps fail to stop.
- Updated task filter descriptions for clarity and added a new skip task feature to exclude incompatible tasks.
- Introduced a set of modal incompatible tasks to prevent execution errors in cloud environments.
- Implemented streaming JSONL logging for task results, preserving data even on interruptions.
- Refactored task evaluation logic to include skipped task reporting and improved error handling.