Commit Graph

24 Commits

Author SHA1 Message Date
0xbyt4
694a3ebdd5 fix(code_execution): handle empty enabled_sandbox_tools in schema description
build_execute_code_schema(set()) produced "from hermes_tools import , ..."
in the code property description — invalid Python syntax shown to the model.

This triggers when a user enables only the code_execution toolset without
any of the sandbox-allowed tools (e.g. `hermes tools code_execution`),
because SANDBOX_ALLOWED_TOOLS & {"execute_code"} = empty set.

Also adds 29 unit tests covering build_execute_code_schema, environment
variable filtering, execute_code edge cases, and interrupt handling.
2026-03-10 06:18:27 -07:00
teknium1
6f3a673aba fix: restore success-path server_sock.close() before rpc_thread.join()
PR #568 moved the close entirely to the finally block, but the success-path
close is needed to break the RPC thread out of accept() immediately. Without
it, rpc_thread.join(3) may block for up to 3 seconds if the child process
never connected. The finally-block close remains as a safety net for the
exception/error path (the actual fd leak fix).
2026-03-09 23:40:20 -07:00
teknium1
ab6a6338c4 Merge PR #568: fix(code-execution): close server socket in finally block to prevent fd leak
Authored by alireza78a. Moves server_sock.close() into the finally block so
the socket fd is always cleaned up, even if an exception occurs between socket
creation and the success-path close.
2026-03-09 23:39:13 -07:00
teknium1
2036c22f88 fix: macOS browser/code-exec socket path exceeds Unix limit (#374)
macOS sets TMPDIR to /var/folders/xx/.../T/ (~51 chars). Combined with
agent-browser session names, socket paths reach 121 chars — exceeding
the 104-byte macOS AF_UNIX limit. This causes 'Screenshot file was not
created' errors and silent browser_vision failures on macOS.

Fix: use /tmp/ on macOS (symlink to /private/tmp, sticky-bit protected).
On Linux, tempfile.gettempdir() already returns /tmp — no behavior change.

Changes in browser_tool.py:
- Add _socket_safe_tmpdir() helper — returns /tmp on macOS, gettempdir()
  elsewhere
- Replace all 3 tempfile.gettempdir() calls for socket dirs
- Set mode=0o700 on socket dirs for privacy (was using default umask)
- Guard vision/text client init with try/except — a broken auxiliary
  config no longer prevents the entire browser_tool module from importing
  (which would disable all 10 browser tools, not just vision)
- Improve screenshot error messages with mode info and diagnostic hints
- Don't delete screenshots when LLM analysis fails — the capture was
  valid, only the vision API call failed. Screenshots are still cleaned
  up by the existing 24-hour _cleanup_old_screenshots mechanism.

Changes in code_execution_tool.py:
- Same /tmp fix for RPC socket path (was 103 chars on macOS — one char
  from the 104-byte limit)
2026-03-08 19:31:23 -07:00
teknium1
3830bbda41 fix: include url in web_extract trimmed results & fix docs
The web_extract_tool was stripping the 'url' key during its output
trimming step, but documentation in 3 places claimed it was present.
This caused KeyError when accessing result['url'] in execute_code
scripts, especially when extracting from multiple URLs.

Changes:
- web_tools.py: Add 'url' back to trimmed_results output
- code_execution_tool.py: Add 'title' to _TOOL_STUBS docstring and
  _TOOL_DOC_LINES so docs match actual {url, title, content, error}
  response format
2026-03-07 18:07:36 -08:00
teknium1
69a36a3361 Merge PR #309: fix(timezone): timezone-aware now() for prompt, cron, and execute_code
Authored by areu01or00. Adds timezone support via hermes_time.now() helper
with IANA timezone resolution (HERMES_TIMEZONE env → config.yaml → server-local).
Updates system prompt timestamp, cron scheduling, and execute_code sandbox TZ
injection. Includes config migration (v4→v5) and comprehensive test coverage.
2026-03-07 00:04:41 -08:00
alireza78a
a857321463 fix(code-execution): close server socket in finally block to prevent fd leak 2026-03-07 05:49:48 +03:30
teknium1
f75b1d21b4 fix: execute_code and delegate_task now respect disabled toolsets
When a user disables the web toolset via 'hermes tools', the execute_code
schema description still hardcoded web_search/web_extract as available,
causing the model to keep trying to use them. Similarly, delegate_task
always defaulted to ['terminal', 'file', 'web'] for subagents regardless
of the parent's config.

Changes:
- execute_code schema is now built dynamically via build_execute_code_schema()
  based on which sandbox tools are actually enabled
- model_tools.py rebuilds the execute_code schema at definition time using
  the intersection of sandbox-allowed and session-enabled tools
- delegate_task now inherits the parent agent's enabled_toolsets instead of
  hardcoding DEFAULT_TOOLSETS when no explicit toolsets are specified
- delegate_task description updated to say 'inherits your enabled toolsets'

Reported by kotyKD on Discord.
2026-03-06 17:36:14 -08:00
teknium1
3982fcf095 fix: sync execute_code sandbox stubs with real tool schemas
The _TOOL_STUBS dict in code_execution_tool.py was out of sync with the
actual tool schemas, causing TypeErrors when the LLM used parameters it
sees in its system prompt but the sandbox stubs didn't accept:

search_files:
  - Added missing params: context, offset, output_mode
  - Fixed target default: 'grep' → 'content' (old value was obsolete)

patch:
  - Added missing params: mode, patch (V4A multi-file patch support)

Also added 4 drift-detection tests (TestStubSchemaDrift) that will
catch future divergence between stubs and real schemas:
  - test_stubs_cover_all_schema_params: every schema param in stub
  - test_stubs_pass_all_params_to_rpc: every stub param sent over RPC
  - test_search_files_target_uses_current_values: no obsolete values
  - test_generated_module_accepts_all_params: generated code compiles

All 28 tests pass.
2026-03-06 03:40:06 -08:00
teknium1
efec4fcaab feat(execute_code): add json_parse, shell_quote, retry helpers to sandbox
The execute_code sandbox generates a hermes_tools.py stub module for LLM
scripts. Three common failure modes keep tripping up scripts:

1. json.loads(strict=True) rejects control chars in terminal() output
   (e.g., GitHub issue bodies with literal tabs/newlines)
2. Shell backtick/quote interpretation when interpolating dynamic content
   into terminal() commands (markdown with backticks gets eaten by bash)
3. No retry logic for transient network failures (API timeouts, rate limits)

Adds three convenience helpers to the generated hermes_tools module:

- json_parse(text) — json.loads with strict=False for tolerant parsing
- shell_quote(s) — shlex.quote() for safe shell interpolation
- retry(fn, max_attempts=3, delay=2) — exponential backoff wrapper

Also updates the EXECUTE_CODE_SCHEMA description to document these helpers
so LLMs know they're available without importing anything extra.

Includes 7 new tests (unit + integration) covering all three helpers.
2026-03-06 01:52:46 -08:00
areu01or00
a1c25046a9 fix(timezone): add timezone-aware clock across agent, cron, and execute_code 2026-03-03 18:23:40 +05:30
Farukest
3f58e47c63 fix: guard POSIX-only process functions for Windows compatibility
os.setsid, os.killpg, and os.getpgid do not exist on Windows and raise
AttributeError on import or first call. This breaks the terminal tool,
code execution sandbox, process registry, and WhatsApp bridge on Windows.

Added _IS_WINDOWS platform guard in all four affected files, following
the pattern documented in CONTRIBUTING.md. On Windows, preexec_fn is
set to None and process termination falls back to proc.terminate() /
proc.kill() instead of process group signals.

Files changed:
- tools/environments/local.py (3 call sites)
- tools/process_registry.py (2 call sites)
- tools/code_execution_tool.py (3 call sites)
- gateway/platforms/whatsapp.py (3 call sites)
2026-03-01 01:54:27 +03:00
teknium1
e5bd25c73f Fix: #41 2026-02-25 21:16:15 -08:00
teknium1
91907789af refactor: remove temporary debug logging in code execution tool
- Eliminated the temporary debug logging in the `execute_code` function that tracked enabled and sandbox tools, streamlining the code and reducing clutter.
2026-02-24 14:25:53 -08:00
teknium1
6845852e82 refactor: update failure message handling in display module and add debug logging in code execution tool
- Modified the `_wrap` function to append a failure suffix without applying red coloring, simplifying the failure message format.
- Introduced temporary debug logging in the `execute_code` function to track enabled and sandbox tools, aiding in troubleshooting.
2026-02-24 14:25:53 -08:00
teknium1
08ff1c1aa8 More major refactor/tech debt removal! 2026-02-21 20:22:33 -08:00
teknium1
748fd3db88 refactor: enhance error handling with structured logging across multiple modules
- Updated various modules including cli.py, run_agent.py, gateway, and tools to replace silent exception handling with structured logging.
- Improved error messages to provide more context, aiding in debugging and monitoring.
- Ensured consistent logging practices throughout the codebase, enhancing traceability and maintainability.
2026-02-21 03:32:11 -08:00
teknium1
b6247b71b5 refactor: update tool descriptions for clarity and conciseness
- Revised descriptions for various tools in model_tools.py, browser_tool.py, code_execution_tool.py, delegate_tool.py, and terminal_tool.py to enhance clarity and reduce verbosity.
- Improved consistency in terminology and formatting across tool descriptions, ensuring users have a clearer understanding of tool functionalities and usage.
2026-02-21 02:41:30 -08:00
teknium1
70dd3a16dc Cleanup time! 2026-02-20 23:23:32 -08:00
teknium1
c0d412a736 refactor: update search tool parameters and documentation for clarity
- Changed the target parameter from "content" and "files" to "grep" and "find" to better represent their functionality.
- Revised descriptions in the tool definitions and execution code schema to enhance understanding of search modes and output formats.
- Ensured consistency in the handling of search operations across the codebase.
2026-02-20 02:46:30 -08:00
teknium1
f9eb5edb96 refactor: rename search tool for clarity and consistency
- Updated the tool name from "search" to "search_files" across multiple files to better reflect its functionality.
- Adjusted related documentation and descriptions to ensure clarity in usage and expected behavior.
- Enhanced the toolset definitions and mappings to incorporate the new naming convention, improving overall consistency in the codebase.
2026-02-20 02:43:57 -08:00
teknium1
3b90fa5c9b fix: increase default timeout for code execution sandbox
- Updated the default timeout for sandbox script execution from 120 seconds to 300 seconds (5 minutes) to allow longer-running scripts.
- Enhanced comments in the code execution tool to clarify the timeout duration.
- Suppressed stdout and stderr output from internal tool handlers during execution to prevent clutter in the CLI interface.
2026-02-20 01:29:53 -08:00
teknium1
273b367f05 fix: update documentation and return types for web tools
- Revised docstrings for `web_search` and `web_extract` functions to clarify return types and structure.
- Updated the execution code schema documentation to reflect changes in the output format for both tools, ensuring consistency and improved understanding for users.
2026-02-19 23:30:01 -08:00
teknium1
783acd712d feat: implement code execution sandbox for programmatic tool calling
- Introduced a new `execute_code` tool that allows the agent to run Python scripts that call Hermes tools via RPC, reducing the number of round trips required for tool interactions.
- Added configuration options for timeout and maximum tool calls in the sandbox environment.
- Updated the toolset definitions to include the new code execution capabilities, ensuring integration across platforms.
- Implemented comprehensive tests for the code execution sandbox, covering various scenarios including tool call limits and error handling.
- Enhanced the CLI and documentation to reflect the new functionality, providing users with clear guidance on using the code execution tool.
2026-02-19 23:23:43 -08:00