load_cli_config() only merged keys present in its hardcoded defaults
dict, silently dropping user-added keys like platform_toolsets (saved
by 'hermes tools'), provider_routing, memory, honcho, etc.
Added a second pass to carry over all file_config keys that aren't in
defaults, so 'hermes tools' changes actually take effect in CLI mode.
The gateway was unaffected (reads YAML directly via yaml.safe_load).
Updated the AIAgent class to print the full content of assistant messages without truncation, enhancing visibility of the messages during runtime. This change improves the clarity of communication from the agent.
- /retry, /undo, /compress were setting a non-existent conversation_history
attribute on SessionEntry (a @dataclass with no such field). The dangling
attribute was silently created but never read — transcript was reloaded
from DB on next interaction, making all three commands no-ops.
- /reset accessed self.session_store._sessions (non-existent) instead of
self.session_store._entries, causing AttributeError caught by a bare
except, silently skipping the pre-reset memory flush.
Fix:
- Add SessionDB.clear_messages() to delete messages and reset counters
- Add SessionStore.rewrite_transcript() to atomically replace transcript
in both SQLite and legacy JSONL storage
- Replace all dangling attr assignments with rewrite_transcript() calls
- Fix _sessions → _entries in /reset handler
Closes#210
Added the tools attribute to the AIAgent class's status output, ensuring that the current tools used by the agent are included in the status information. This enhancement improves the visibility of the agent's capabilities during runtime.
Added the system prompt to the AIAgent class's status output, ensuring that the current system prompt is included in the agent's status information. This enhancement improves visibility into the agent's configuration during runtime.
Enhanced error handling in the _model_flow_nous function to detect session expiration and prompt for re-authentication with the Nous Portal. Added logic to manage re-login attempts and provide user feedback on success or failure, improving the overall user experience during authentication issues.
Enhanced the AIAgent class to capture and normalize summary information for reasoning items. Implemented logic to handle summaries as lists, ensuring proper formatting for API interactions. Updated tests to validate the inclusion of summaries in reasoning items, both for existing and default cases.
Updated the authentication mechanism to store Codex OAuth tokens in the Hermes auth store located at ~/.hermes/auth.json instead of the previous ~/.codex/auth.json. This change includes refactoring related functions for reading and saving tokens, ensuring better management of authentication states and preventing conflicts between different applications. Adjusted tests to reflect the new storage structure and improved error handling for missing or malformed tokens.
Introduced a new `provider_routing` section in the CLI configuration to control how requests are routed across providers when using OpenRouter. This includes options for sorting providers by throughput, latency, or price, as well as allowing or ignoring specific providers, setting the order of provider attempts, and managing data collection policies. Updated relevant classes and documentation to support these features, enhancing flexibility in provider selection.
Updated the CLI header formatting for tool and configuration displays to center titles within their respective widths. Enhanced the display of command descriptions to include an ellipsis for longer texts, ensuring better readability. This refactor improves the overall user interface of the CLI.
Updated the install.sh script to set DEBIAN_FRONTEND and NEEDRESTART_MODE environment variables for non-interactive package installations on Ubuntu and Debian. This change ensures that prompts from needrestart and whiptail do not block the installation process, improving automation for system package installations.
Added support for processing encrypted reasoning content within the AIAgent class. Introduced logic to determine reasoning effort and enable/disable reasoning based on configuration settings. Updated the kwargs to reflect these changes, ensuring proper handling of reasoning parameters during agent execution.
Updated documentation within terminal_tool.py to emphasize the appropriate use of foreground and background processes. Enhanced descriptions for the timeout setting and background execution to guide users towards optimal configurations for scripts, builds, and long-running tasks. Adjusted the default timeout value from 60 to 180 seconds for improved handling of longer operations.
Updated the environment variables for subprocess execution in the ProcessRegistry class to set PYTHONUNBUFFERED to "1". This change ensures that output from Python scripts is unbuffered, allowing for real-time visibility of progress during background execution. Adjusted both the pty and background process spawning methods to use the new environment configuration.
Updated the _process_single_prompt function to accept an optional 'image' field in prompt_data, allowing for per-prompt container image overrides. Implemented checks for Docker image accessibility and added logic to register task environment overrides for Docker, Modal, and Singularity. This improves flexibility in managing containerized environments for prompt execution.
Remove loop_scope="function" parameter from async test decorators in
test_hooks.py. This matches the existing convention in the repo
(test_telegram_documents.py) and avoids requiring pytest-asyncio 0.23+.
All 144 new tests from PR #191 now pass.
Enhanced the README and CLI documentation to include the newly added `/compress` and `/usage` commands for managing conversation context and monitoring token usage. Updated log descriptions to clarify the contents of log files and ensured that sensitive information is automatically redacted. This improves user understanding of available features and log management.
Implemented the /compress command to allow users to manually compress conversation context, ensuring sufficient history is available before execution. The /usage command was also added to display token usage statistics for the current session, including prompt and completion tokens. Updated command documentation to reflect these new features.
Introduced a new command "/usage" in the CLI to show cumulative token usage for the current session. This includes details on prompt tokens, completion tokens, total tokens, API calls, and context state. Updated command documentation to reflect this addition. Enhanced the AIAgent class to track token usage throughout the session.
Introduced a new command "/compress" to the CLI, allowing users to manually trigger context compression on the current conversation. The method checks for sufficient conversation history and active agent status before performing compression, providing feedback on the number of messages and tokens before and after the operation. Updated command documentation accordingly.
Two fixes to the subagent progress display from PR #186:
1. Task index prefix: show 1-indexed prefix ([1], [2], ...) for ALL
tasks in batch mode (task_count > 1). Single tasks get no prefix.
Previously task 0 had no prefix while others did, making batch
output confusing.
2. Completion indicator: use spinner.print_above() instead of raw
print() for per-task completion lines (✓ [1/2] ...). Raw print
collided with the active spinner, mushing the completion text
onto the spinner line. Now prints cleanly above.
Added task_count parameter to _build_child_progress_callback and
_run_single_child. Updated tests accordingly.
print_above() used \033[K (erase-to-end-of-line) to clear the spinner
line before printing text above it. This causes garbled escape codes when
prompt_toolkit's patch_stdout is active in CLI mode.
Switched to the same spaces-based clearing approach used by stop() —
overwrite with blanks, then carriage return back to start of line.
Updated test assertion to match the new clearing method.
When subagents run via delegate_task, the user now sees real-time
progress instead of silence:
CLI: tree-view activity lines print above the delegation spinner
🔀 Delegating: research quantum computing
├─ 💭 "I'll search for papers first..."
├─ 🔍 web_search "quantum computing"
├─ 📖 read_file "paper.pdf"
└─ ⠹ working... (18.2s)
Gateway (Telegram/Discord): batched progress summaries sent every
5 tool calls to avoid message spam. Remaining tools flushed on
subagent completion.
Changes:
- agent/display.py: add KawaiiSpinner.print_above() to print
status lines above an active spinner without disrupting animation.
Uses captured stdout (self._out) so it works inside the child's
redirect_stdout(devnull).
- tools/delegate_tool.py: add _build_child_progress_callback()
that creates a per-child callback relaying tool calls and
thinking events to the parent's spinner (CLI) or progress
queue (gateway). Each child gets its own callback instance,
so parallel subagents don't share state. Includes _flush()
for gateway batch completion.
- run_agent.py: fire tool_progress_callback with '_thinking'
event when the model produces text content. Guarded by
_delegate_depth > 0 so only subagents fire this (prevents
gateway spam from main agent). REASONING_SCRATCHPAD/think/
reasoning XML tags are stripped before display.
Tests: 21 new tests covering print_above, callback builder,
thinking relay, SCRATCHPAD filtering, batching, flush, thread
isolation, delegate_depth guard, and prefix handling.
- Introduce a new test suite in `test_file_tools_live.py` to validate file operations and ensure accurate command execution in a real environment.
- Implement assertions to check for shell noise contamination in outputs, enhancing the reliability of command results.
- Create fixtures for setting up a local environment and populating directories with known file contents for comprehensive testing.
- Refactor shell noise handling in `process_registry.py` and `local.py` to support multiple noise patterns, improving output cleanliness.
- Introduce a separate error log for capturing warnings and errors related to tool execution, ensuring detailed inspection of issues post-failure.
- Enhance error handling in the AIAgent class to log exceptions with stack traces for better debugging.
- Add a similar error logging mechanism in the gateway to streamline debugging processes.
- Implement logic to distinguish between "full" memory errors and actual failures in the `_detect_tool_failure` function.
- Add JSON parsing to identify specific error messages related to memory limits, improving error handling for memory-related tools.
- Introduce a new test suite for the `redact_sensitive_text` function, covering various sensitive data formats including API keys, tokens, and environment variables.
- Ensure that sensitive information is properly masked in logs and outputs while non-sensitive data remains unchanged.
- Add tests for different scenarios including JSON fields, authorization headers, and environment variable assignments.
- Implement a redacting formatter for logging to enhance security during log output.
- Replace `hermes login` with `hermes model` for selecting providers and managing authentication.
- Update documentation and CLI commands to reflect the new provider selection process.
- Introduce a new redaction system for logging sensitive information.
- Enhance Codex model discovery by integrating API fetching and local cache.
- Adjust max turns configuration logic for better clarity and precedence.
- Improve error handling and user feedback during authentication processes.
- Enhanced Codex model discovery by fetching available models from the API, with fallback to local cache and defaults.
- Updated the context compressor's summary target tokens to 2500 for improved performance.
- Added external credential detection for Codex CLI to streamline authentication.
- Refactored various components to ensure consistent handling of authentication and model selection across the application.
Add a new hooks system allowing users to run custom code at key lifecycle points in the agent's operation. This includes support for events such as `gateway:startup`, `session:start`, `agent:step`, and more. Documentation for creating hooks and available events has been added to `README.md` and a new `hooks.md` file. Additionally, integrate step callbacks in the agent to facilitate hook execution during tool-calling iterations.
Refactor the extraction of MEDIA paths to collect them from the history before processing the current turn's messages. This change ensures that MEDIA tags are deduplicated based on previously seen paths, preventing TTS voice messages from being re-attached in subsequent replies. This addresses the issue outlined in #160.
/retry and /undo set session_entry.conversation_history which does not
exist on SessionEntry. The truncated history was never written to disk,
so the next message reload picked up the full unmodified transcript.
Added SessionStore.rewrite_transcript() that persists changes to both
the JSONL file and SQLite database, and updated both commands to use it.
/reset accessed self.session_store._sessions which does not exist on
SessionStore (the correct attribute is _entries). Also replaced the
hand-coded session key with _generate_session_key() to fix WhatsApp DM
sessions using the wrong key format.
Closes#210
Fixes#160
The issue was that MEDIA tags were being extracted from ALL messages
in the conversation history, not just messages from the current turn.
This caused TTS voice messages generated in earlier turns to be
re-attached to every subsequent reply.
The fix:
- Track history_len before calling run_conversation
- Only scan messages AFTER history_len for MEDIA tags
- Add comprehensive tests to prevent regression
This ensures each voice message is sent exactly once, when it's
generated, not on every subsequent message in the session.
Fixes#149
The _strip_think_blocks() method existed but was not applied to the
final_response in the normal completion path. This caused <think>...</think>
XML tags to leak into user-facing responses on all platforms (CLI, Telegram,
Discord, Slack, WhatsApp).
Changes:
- Strip think blocks from final_response before returning in normal path (line ~2600)
- Strip think blocks from fallback content when salvaging from prior tool_calls turn
Notes:
- The raw content with think blocks is preserved in messages[] for trajectory
export - this only affects the user-facing final_response
- The _has_content_after_think_block() check still uses raw content before
stripping, which is correct for detecting think-only responses