The 413 "Request Entity Too Large" error from the LLM API was caught by the
generic 4xx handler which aborts immediately. This is wrong for 413 — it's a
payload-size issue that can be resolved by compressing conversation history.
- Intercept 413 before the generic 4xx block and route to _compress_context
- Exclude 413 from generic is_client_error detection
- Add 'request entity too large' to context-length phrases as safety net
- Add tests for 413 compression behavior
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When running via the gateway (e.g. Telegram), the session_search tool
returned: {"error": "session_search must be handled by the agent loop"}
Root cause:
- gateway/run.py creates AIAgent without passing session_db=
- self._session_db is None in the agent instance
- The dispatch condition "elif function_name == 'session_search' and self._session_db"
skips when _session_db is None, falling through to the generic error
This fix:
1. Initializes self._session_db in GatewayRunner.__init__()
2. Passes session_db to all AIAgent instantiations in gateway/run.py
3. Adds defensive fallback in run_agent.py to return a clear error when
session_db is unavailable, instead of falling through
Fixes#105
- Added _max_tokens_param method in AIAgent to return appropriate max tokens parameter based on the provider (OpenAI vs. others).
- Updated API calls in AIAgent to utilize the new max tokens handling.
- Introduced auxiliary_max_tokens_param function in auxiliary_client for consistent max tokens management across auxiliary clients.
- Refactored multiple tools to use auxiliary_max_tokens_param for improved compatibility with different models and providers.
USER.md stays in system prompt when Honcho is active -- prefetch is
additive context, not a replacement. Memory tool user observations
write to both USER.md (local) and Honcho (cross-session) simultaneously.
When Honcho is active:
- System prompt uses Honcho prefetch instead of USER.md
- memory tool target=user add routes to Honcho
- MEMORY.md untouched in all cases
When disabled, everything works as before.
Also wires up contextTokens config to cap prefetch size.
Opt-in persistent cross-session user modeling via Honcho. Reads
~/.honcho/config.json as single source of truth (shared with
Claude Code, Cursor, and other Honcho-enabled tools). Zero impact
when disabled or unconfigured.
- honcho_integration/ package (client, session manager, peer resolution)
- Host-based config resolution matching claude-honcho/cursor-honcho pattern
- Prefetch user context into system prompt per conversation turn
- Sync user/assistant messages to Honcho after each exchange
- query_user_context tool for mid-conversation dialectic reasoning
- Gated activation: requires ~/.honcho/config.json with enabled=true
The `hermes` CLI entry point (hermes_cli/main.py) and the agent runner
(run_agent.py) only loaded .env from the project installation directory.
After the standard installer, code lives at ~/.hermes/hermes-agent/ but
config lives at ~/.hermes/ — so the .env was never found.
Aligns these entry points with the pattern already used by gateway/run.py
and rl_cli.py: load ~/.hermes/.env first, fall back to project root .env
for dev-mode compatibility.
Also fixes:
- status.py checking .env existence and API keys at PROJECT_ROOT
- doctor.py KeyError on tool availability (missing_vars vs env_vars)
- doctor.py checking logs/ and Skills Hub at PROJECT_ROOT instead of HERMES_HOME
- doctor.py redundant logs/ check (already covered by subdirectory loop)
- mini-swe-agent loading config from platformdirs default instead of ~/.hermes/
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Simplified the logic for determining support for reasoning based on the base URL by introducing clearer variable names.
- Added product attribution for the Nous Portal to the extra body of requests when applicable, enhancing tagging for better tracking.
- Introduced a new static method `_clean_session_content` in the `AIAgent` class to convert REASONING_SCRATCHPAD tags to <think> blocks and clean up whitespace in session logs.
- Updated the `_save_session_log` method to utilize the cleaned content for assistant messages, ensuring consistency in session logs.
- Changed the default output directory for TTS audio files from `~/voice-memos` to `~/.hermes/audio_cache`, reflecting a more appropriate storage location.
- Updated the `clear_interrupt` method to also reset the global tool interrupt signal, improving the clarity of interrupt management within the agent.
- This change ensures that all interrupt states are properly cleared, enhancing the reliability of the agent's operation.
- Introduced a new configuration option for reasoning effort in the CLI, allowing users to specify the level of reasoning the agent should perform before responding.
- Updated the CLI and agent initialization to incorporate the reasoning configuration, enhancing the agent's responsiveness and adaptability.
- Implemented logic to load reasoning effort from environment variables and configuration files, providing flexibility in agent behavior.
- Enhanced the documentation in the example configuration file to clarify the new reasoning effort options available.
- Implemented functionality to load ephemeral prefill messages from a JSON file, enhancing few-shot priming capabilities for the agent.
- Introduced a mechanism to load an ephemeral system prompt from environment variables or configuration files, ensuring dynamic prompt adjustments at API-call time.
- Updated the CLI and agent initialization to utilize the new prefill messages and system prompt, improving the overall interaction experience.
- Enhanced configuration options with new environment variables for prefill messages and system prompts, allowing for greater customization without persistence.
- Removed static methods for converting and checking <REASONING_SCRATCHPAD> tags, simplifying the codebase.
- Replaced calls to the removed methods with direct function calls for better clarity and maintainability.
- Updated trajectory saving logic to utilize a dedicated function for improved organization and readability.
- Introduced a shared interrupt signaling mechanism to allow tools to check for user interrupts during long-running operations.
- Updated the AIAgent to handle interrupts more effectively, ensuring in-progress tool calls are canceled and multiple interrupt messages are combined into one prompt.
- Enhanced the CLI configuration to include container resource limits (CPU, memory, disk) and persistence options for Docker, Singularity, and Modal environments.
- Improved documentation to clarify interrupt behaviors and container resource settings, providing users with better guidance on configuration and usage.
- Introduced a method to strip <think> blocks from content, improving text visibility.
- Implemented counters to reset nudge intervals when memory and skill tools are used, enhancing user guidance.
- Captured content from turns with tool calls to provide fallback responses, ensuring continuity in conversation.
- Updated nudge logic to remind users about saving memories and creating skills based on interaction patterns.
- Added skills configuration options in cli-config.yaml.example, including a nudge interval for skill creation reminders.
- Implemented skills guidance in AIAgent to prompt users to save reusable workflows after complex tasks.
- Enhanced skills indexing in the prompt builder to include descriptions from SKILL.md files for better context.
- Updated the agent's behavior to periodically remind users about potential skills during tool-calling iterations.
- Added configuration options for memory nudge interval and flush minimum turns in cli-config.yaml.example.
- Implemented memory flushing before conversation reset, clearing, and exit in the CLI to ensure memories are saved.
- Introduced a flush_memories method in AIAgent to handle memory persistence before context loss.
- Added periodic nudges to remind the agent to consider saving memories based on user interactions.
- Introduced MEMORY_GUIDANCE and SESSION_SEARCH_GUIDANCE to improve agent's contextual awareness and proactive assistance.
- Updated AIAgent to conditionally include tool-aware guidance in prompts based on available tools.
- Enhanced descriptions in memory and session search schemas for clearer user instructions on when to utilize these features.
- Eliminated the `compression_model` variable from the AIAgent class, as it was not being utilized.
- Cleaned up the context compressor initialization for improved clarity and maintainability.
- Relocated functions related to model metadata, including fetch_model_metadata, get_model_context_length, estimate_tokens_rough, and estimate_messages_tokens_rough, to agent/model_metadata.py for better organization and maintainability.
- Updated imports in run_agent.py to reflect the new location of these functions.
- Added functionality to suppress logging noise from specific modules when in quiet mode, improving user experience in CLI.
- Updated terminal_tool.py to change the log level for fallback directory usage from warning to debug, providing clearer context without cluttering logs.
- Added methods for handling sudo password and dangerous command approval prompts using a callback mechanism in cli.py.
- Integrated these prompts with the prompt_toolkit UI for improved user experience.
- Updated terminal_tool.py to support callback registration for interactive prompts, enhancing the CLI's interactivity.
- Introduced a background thread for API calls in run_agent.py to allow for interrupt handling during long-running operations.
- Enhanced error handling for interrupted API calls, ensuring graceful degradation of user experience.
- Introduced new methods in run_agent.py for building API keyword arguments and normalizing assistant messages from API responses.
- Added functionality for compressing conversation context and managing session state in SQLite.
- Improved tool call execution handling, including enhanced logging and error management.
- Updated path handling in multiple platform files to utilize pathlib for better compatibility and readability.
- Updated various modules including cli.py, run_agent.py, gateway, and tools to replace silent exception handling with structured logging.
- Improved error messages to provide more context, aiding in debugging and monitoring.
- Ensured consistent logging practices throughout the codebase, enhancing traceability and maintainability.
- Introduced logging functionality in cli.py, run_agent.py, scheduler.py, and various tool modules to replace print statements with structured logging.
- Enhanced error handling and informational messages to improve debugging and monitoring capabilities.
- Ensured consistent logging practices across the codebase, facilitating better traceability and maintenance.
- Eliminated the `_log_api_payload` method used for temporary debugging, streamlining the codebase.
- Updated the `_save_session_log` method to save the full raw session, including all messages and metadata, improving the clarity and completeness of session logs.
- Adjusted session log entry to include additional context such as `base_url` and `platform` for better tracking.
- Changed the session logging directory from `~/.hermes-agent/logs/` to `~/.hermes/sessions/` for consistency.
- Updated the `run_agent.py` to reflect the new logging path, ensuring session logs are stored correctly alongside gateway sessions.
- Incremented schema version to 2 and added a new column `finish_reason` to the `messages` table.
- Implemented a method to flush un-logged messages to the session database, ensuring data integrity during conversation interruptions.
- Enhanced error handling to persist messages in various early-return scenarios, preventing data loss.
- Implemented a multi-provider authentication system for the Hermes Agent, supporting OAuth for Nous Portal and traditional API key methods for OpenRouter and custom endpoints.
- Enhanced CLI with commands for logging in and out of providers, allowing users to authenticate and manage their credentials easily.
- Updated configuration options to select inference providers, with detailed documentation on usage and setup.
- Improved status reporting to include authentication status and provider details, enhancing user awareness of their current configuration.
- Added new files for authentication handling and updated existing components to integrate the new provider system.
- Added a spinner to visually indicate task delegation progress in quiet mode, improving user experience during batch processing.
- Implemented a method to update spinner text dynamically based on remaining tasks, providing real-time feedback.
- Enhanced the `delegate_task` function to include per-task completion messages, ensuring clarity on task status during execution.
- Updated the KawaiiSpinner class to allow message updates while running, facilitating better interaction during long-running tasks.
- Introduced the `delegate_task` tool, allowing the main agent to spawn child AIAgent instances with isolated context for complex tasks.
- Supported both single-task and batch processing (up to 3 concurrent tasks) to enhance task management capabilities.
- Updated configuration options for delegation, including maximum iterations and default toolsets for subagents.
- Enhanced documentation to provide clear guidance on using the delegation feature and its configuration.
- Added comprehensive tests to ensure the functionality and reliability of the delegation logic.
- Updated the tool name from "search" to "search_files" across multiple files to better reflect its functionality.
- Adjusted related documentation and descriptions to ensure clarity in usage and expected behavior.
- Enhanced the toolset definitions and mappings to incorporate the new naming convention, improving overall consistency in the codebase.
- Introduced a new `execute_code` tool that allows the agent to run Python scripts that call Hermes tools via RPC, reducing the number of round trips required for tool interactions.
- Added configuration options for timeout and maximum tool calls in the sandbox environment.
- Updated the toolset definitions to include the new code execution capabilities, ensuring integration across platforms.
- Implemented comprehensive tests for the code execution sandbox, covering various scenarios including tool call limits and error handling.
- Enhanced the CLI and documentation to reflect the new functionality, providing users with clear guidance on using the code execution tool.
- Added a new `clarify_tool` to enable the agent to ask structured multiple-choice or open-ended questions to users.
- Implemented callback functionality for user interaction, allowing the platform to handle UI presentation.
- Updated the CLI and agent to support clarify questions, including timeout handling and response management.
- Enhanced toolset definitions and requirements to include the clarify tool, ensuring availability across platforms.
- Added a new `skill_manager_tool` to enable agents to create, update, and delete their own skills, enhancing procedural memory capabilities.
- Updated the skills directory structure to support user-created skills in `~/.hermes/skills/`, allowing for better organization and management.
- Enhanced the CLI and documentation to reflect the new skill management functionalities, including detailed instructions on creating and modifying skills.
- Implemented a manifest-based syncing mechanism for bundled skills to ensure user modifications are preserved during updates.