hermes-agent

Author	SHA1	Message	Date
teknium1	6f3a673aba	fix: restore success-path server_sock.close() before rpc_thread.join() PR #568 moved the close entirely to the finally block, but the success-path close is needed to break the RPC thread out of accept() immediately. Without it, rpc_thread.join(3) may block for up to 3 seconds if the child process never connected. The finally-block close remains as a safety net for the exception/error path (the actual fd leak fix).	2026-03-09 23:40:20 -07:00
teknium1	ab6a6338c4	Merge PR #568: fix(code-execution): close server socket in finally block to prevent fd leak Authored by alireza78a. Moves server_sock.close() into the finally block so the socket fd is always cleaned up, even if an exception occurs between socket creation and the success-path close.	2026-03-09 23:39:13 -07:00
teknium1	1ec8c1fcaa	Merge PR #564: fix: count actual tool calls instead of tool-related messages Authored by 0xbyt4. Fixes tool_call_count double-counting tool responses and under-counting parallel tool calls.	2026-03-09 23:32:54 -07:00
teknium1	739eb6702e	Merge PR #551: Make skill file writes atomic Authored by aydnOktay. Adds _atomic_write_text() helper using tempfile.mkstemp() + os.replace() to prevent skill file corruption on crash/interrupt. All 7 write_text() calls in skill_manager_tool.py converted, including rollback writes during security scans.	2026-03-09 23:31:43 -07:00
teknium1	1aa7badb3c	fix: add missing Platform.SIGNAL to toolset mappings, update test + config docs Platform.SIGNAL was missing from default_toolset_map and platform_config_key in gateway/run.py, causing Signal to silently fall back to hermes-telegram toolset (same bug as HomeAssistant, fixed in PR #538). Also updates: - tests/test_toolsets.py: include hermes-signal and hermes-homeassistant in the platform core-tools consistency check - cli-config.yaml.example: document signal and homeassistant platform keys	2026-03-09 23:27:19 -07:00
teknium1	ee4008431a	fix: stop terminal border flashing with steady cursor and TUI spinner widget Cherry-picked and improved from PR #470 (fixes #464). Problem: On Ubuntu 24.04 with ghostty + tmux, the prompt input box border lines flash due to cursor blink and raw spinner terminal writes conflicting with prompt_toolkit's rendering. Changes: - cli.py: Add CursorShape.BLOCK to Application() to disable cursor blink - cli.py: Add thinking_callback + spinner_widget in TUI layout so thinking status displays as a proper prompt_toolkit widget instead of raw terminal writes that conflict with the TUI renderer - run_agent.py: Add thinking_callback parameter to AIAgent; when set, uses the callback instead of KawaiiSpinner for thinking display What was NOT changed (preserving existing behavior): - agent/display.py: Untouched. KawaiiSpinner _write() stdout capture, _animate() logic, and 0.12s frame interval all preserved. This protects subagent stdout redirection and keeps smooth animations for non-CLI contexts (gateway, batch runner). - Original emoji spinner types (brain/sparkle/pulse/moon/star) preserved for all non-CLI contexts. Fixes from original PR #470: - CursorShape.STEADY_BLOCK -> CursorShape.BLOCK (STEADY_BLOCK doesn't exist in prompt_toolkit 3.0.52) - Removed duplicate self._spinner_text = '' line - Removed redundant nested if-checks Tested: 2706 tests pass, interactive CLI verified via tmux.	2026-03-09 23:26:43 -07:00
teknium1	88f8bcde38	Merge PR #538: fix cron HERMES_HOME path mismatch, missing HomeAssistant toolset mapping, Daytona timeout drift Authored by Himess. Three independent fixes: - cron/jobs.py: respect HERMES_HOME env var (consistent with scheduler.py) - gateway/run.py: add Platform.HOMEASSISTANT to toolset mappings - tools/environments/daytona.py: use time.monotonic() for timeout deadline	2026-03-09 23:20:52 -07:00
teknium1	2285615010	Merge PR #533: fix: use regex for search output parsing to handle Windows drive-letter paths Authored by Himess. Replaces split(':', 2) with regex that optionally captures Windows drive-letter prefix in rg/grep output parsing. Fixes search_files returning zero results on Windows where paths like C:\path\file.py:42:content were misparsed by naive colon splitting. No behavior change on Unix/Mac.	2026-03-09 23:18:42 -07:00
teknium1	805ce8177b	Merge PR #529: fix: restrict .env file permissions to owner-only Authored by Himess. Adds 0600 chmod on ~/.hermes/.env after writing API keys, matching the existing pattern in auth.py for auth.json.	2026-03-09 23:10:59 -07:00
teknium1	bdce33e239	Merge PR #810: fix(cli): handle unquoted multi-word session names in -c/--continue and -r/--resume	2026-03-09 23:08:45 -07:00
Teknium	9be8d88ccc	Merge pull request #815 from NousResearch/hermes/hermes-5ab2a29e Add hermes-atropos-environments bundled skill	2026-03-09 23:06:19 -07:00
teknium1	6ab3ebf195	Add hermes-atropos-environments skill (bundled) Add comprehensive skill for building, testing, and debugging Hermes Agent RL environments for Atropos training. Includes: - SKILL.md: Full guide covering HermesAgentBaseEnv interface, required methods, config class, CLI modes (serve/process/evaluate), reward function patterns, common pitfalls, and minimum implementation checklist - New 'Inference Setup' section: instructs the agent to always ask the user for their inference provider (OpenRouter + model choice, self-hosted VLLM endpoint, or other OpenAI-compatible API) before running tests - references/agentresult-fields.md: AgentResult dataclass field reference - references/atropos-base-env.md: Atropos BaseEnv API reference - references/usage-patterns.md: Step-by-step patterns for process, evaluate, serve, and smoke test modes Will be auto-synced to ~/.hermes/skills/ via skills_sync.	2026-03-09 23:04:17 -07:00
teknium1	0a628c1aef	fix(cli): handle unquoted multi-word session names in -c/--continue and -r/--resume When a user runs `hermes -w -c Pokemon Agent Dev` without quoting the session name, argparse would fail with: error: argument command: invalid choice: 'Agent' This is because argparse parses `-c Pokemon` (consuming one token via nargs='?'), then sees 'Agent' and tries to match it as a subcommand. Fix: add _coalesce_session_name_args() that pre-processes sys.argv before argparse, joining consecutive non-flag, non-subcommand tokens after -c or -r into a single argument. This makes both quoted and unquoted multi-word session names work transparently. Includes 17 tests covering all edge cases: multi-word names, single-word, bare flags, flag ordering, subcommand boundaries, and passthrough.	2026-03-09 21:36:29 -07:00
teknium1	36328a996f	Merge PR #458: Add explicit UTF-8 encoding to config/data file I/O Authored by shitcoinsherpa. Adds encoding='utf-8' to all text-mode open() calls in gateway/run.py, gateway/config.py, hermes_cli/config.py, hermes_cli/main.py, and hermes_cli/status.py. Prevents encoding errors on Windows where the default locale is not UTF-8. Also fixed 4 additional open() calls in gateway/run.py that were added after the PR branch was created.	2026-03-09 21:19:20 -07:00
shitcoinsherpa	4bc32dc0f1	Fix password reader for Windows using msvcrt.getwch() The existing password prompt uses /dev/tty and termios to read input with echo disabled. Neither exists on Windows. On Windows, msvcrt.getwch() reads a single character from the console without echoing it. This adds a Windows code path that uses getwch() in a loop, collecting characters until Enter is pressed. The Unix path using termios and /dev/tty is unchanged.	2026-03-09 21:15:59 -07:00
teknium1	4de5e017f1	Merge PR #457: Use pywinpty for PTY support on Windows Authored by shitcoinsherpa. Imports winpty.PtyProcess on Windows instead of ptyprocess.PtyProcess, and adds platform markers to the [pty] extra so the correct package is installed automatically.	2026-03-09 21:09:56 -07:00
teknium1	3e352f8a0d	fix: add upstream guard for non-dict function_args + tests for build_tool_preview Complements PR #453 by 0xbyt4. Adds isinstance(dict) guard in run_agent.py to catch cases where json.loads returns non-dict (e.g. null, list, string) before they reach downstream code. Also adds 15 tests for build_tool_preview covering None args, empty dicts, known/unknown tools, fallback keys, truncation, and all special-cased tools (process, todo, memory, session_search).	2026-03-09 21:01:40 -07:00
teknium1	28ae5db9b0	Merge PR #453: fix: handle None args in build_tool_preview Authored by 0xbyt4. Adds defensive guard for None/empty args in build_tool_preview() to prevent crashes when a model returns null tool call arguments.	2026-03-09 20:58:34 -07:00
teknium1	d5811c887a	Merge: fix double judge call + eval buffer pollution in WebResearchEnv	2026-03-09 20:57:54 -07:00
teknium1	975fd86dc4	fix: eliminate double LLM judge call and eval buffer pollution evaluate() was calling _llm_judge twice per item (once via compute_reward, once directly) — double the API cost for no benefit. Now extracts correctness from compute_reward's buffer instead. Also: compute_reward appends to training metric buffers during eval, which would pollute wandb training charts. Now rolls back buffer entries added during eval so training metrics stay clean.	2026-03-09 20:57:46 -07:00
teknium1	0ff7fe3ee2	Merge PR #439: docs: fix spelling of 'publicly' Authored by JackTheGit. Simple typo fix: publically → publicly in axolotl reference docs.	2026-03-09 20:55:37 -07:00
teknium1	b9d55d5719	feat: add pokemon-player skill with battle-tested gameplay tips Comprehensive skill for playing Pokemon Red/Blue via the pokemon-agent package (NousResearch/pokemon-agent). Includes: - Full startup procedure (uv venv, server, localhost.run dashboard tunnel) - Save/load lifecycle and naming conventions - Gameplay loop with emphasis on frequent vision checks - Hard-learned navigation tips: - Use vision every 2-4 steps (RAM state is blind to obstacles) - Wait 2-3 seconds after door/stair warps for map transitions - Sidestep after exiting buildings to avoid re-entering - Hold B to speed Gen 1's slow text scrolling - Ledges are one-way — use vision to find gaps - Battle strategy, type chart, Gen 1 quirks - Memory conventions with PKM: prefix - Progression milestones through all 8 gyms + Elite Four	2026-03-09 20:29:38 -07:00
teknium1	ab7dc22984	Merge: WebResearchEnv evaluate() with full agent loop + tools	2026-03-09 19:53:36 -07:00
teknium1	bf8350ac18	fix: evaluate() uses full agent loop with tools, not single-turn The evaluate method was doing single-turn chat_completion (no tools), which defeats the purpose of an agentic research benchmark. Fixed to run the full HermesAgentLoop with web_search/web_extract tools. Results comparison (Claude Sonnet 4.5, FRAMES benchmark): Without tools (broken): 0.56 mean correctness With agent loop + tools: 1.00 mean correctness, 0.994 reward New eval metrics: mean_correctness, mean_reward, mean_tool_calls, tool_usage_rate — all logged via evaluate_log() in lighteval format.	2026-03-09 19:53:28 -07:00
teknium1	a5c6348d41	Merge: WebResearchEnv compute_reward fix (verified with live test)	2026-03-09 19:29:19 -07:00
teknium1	320f881e0b	fix: WebResearchEnv compute_reward extracts from AgentResult.messages AgentResult has .messages (list of dicts), not .final_response or .tool_calls. Fixed compute_reward to extract the final response and tool names from the message history. Verified with live process mode test: - Agent used 7 tool calls (web_search, web_extract) - Produced a 1106-char researched response about Winter Olympics - Reward: 0.384 (partial correctness via LLM judge) - JSONL output contains valid tokens, masks, scores, messages	2026-03-09 19:29:12 -07:00
teknium1	172a38c344	fix: Docker persistent bind mounts fail with Permission denied cap-drop ALL removes DAC_OVERRIDE, which root needs to write to bind-mounted directories owned by the host user (uid 1000). This broke persistent Docker sandboxes — the container couldn't write to /workspace or /root. Add back the minimum capabilities needed: - DAC_OVERRIDE: root can write to bind-mounted dirs owned by host user - CHOWN: package managers (pip, npm, apt) need to set file ownership - FOWNER: needed for operations on files owned by other users Still drops all other capabilities (NET_RAW, SYS_ADMIN, etc.) and keeps no-new-privileges. Security boundary is the container itself. Verified end-to-end: create files → destroy container → new container with same task_id → files persist on host and are accessible in the new container.	2026-03-09 17:52:33 -07:00
teknium1	8bc0d4f77d	Merge: WebResearchEnv Atropos standards compliance	2026-03-09 17:45:57 -07:00
teknium1	8eabdefa8a	fix: bring WebResearchEnv up to Atropos environment standards The environment was merged missing several standard components. Updated to match the patterns established by 82 Atropos environments and our own HermesAgentBaseEnv contract. Added: - WebResearchEnvConfig: custom Pydantic config with reward weights, efficiency thresholds, eval settings, dataset config (all tunable via CLI/YAML without code changes) - config_init() classmethod: default server config (OpenRouter + Claude) so the env works out of the box - wandb_log() override: logs reward breakdown metrics (correctness, tool_usage, efficiency, diversity, correct_rate, tool_usage_rate) with proper buffer management and super() call - evaluate(): uses server.chat_completion instead of broken stub _run_agent_on_item(). Logs via evaluate_log() for lighteval-compatible output. Fixed: - Removed broken _run_agent_on_item() stub that returned empty results - evaluate() now uses server.chat_completion (same pattern as TerminalTestEnv) for actual model evaluation - compute_reward reads tool calls from AgentResult properly - LLM judge uses self.server.chat_completion instead of ctx Reward config is now tunable without code changes: --env.correctness_weight 0.6 --env.tool_usage_weight 0.2 --env.efficiency_weight 0.2 --env.diversity_bonus 0.1 --env.efficient_max_calls 5	2026-03-09 17:45:50 -07:00
teknium1	f658af45c2	Merge PR #446: fix(cli): use correct visibility filter string in codex API model fetch Authored by PercyDikec. Fixes #445. Changes 'hide' to 'hidden' in _fetch_models_from_api to match _read_cache_models and the actual API response format.	2026-03-09 17:42:39 -07:00
teknium1	5212644861	fix(security): prevent shell injection in tilde-username path expansion Validate that the username portion of ~username paths contains only valid characters (alphanumeric, dot, hyphen, underscore) before passing to shell echo for expansion. Previously, paths like '~; rm -rf /' would be passed unquoted to self._exec(f'echo {path}'), allowing arbitrary command execution. The approach validates the username rather than using shlex.quote(), which would prevent tilde expansion from working at all since echo '~user' outputs the literal string instead of expanding it. Added tests for injection blocking and valid ~username/path expansion. Credit to @alireza78a for reporting (PR #442, issue #442).	2026-03-09 17:33:19 -07:00
teknium1	1151f84351	Merge PR #434: feat: add WebResearchEnv RL environment for multi-step web research Authored by jackx707. Adds web_research_env.py (Atropos RL environment for multi-step web research using FRAMES benchmark) and batch generation config.	2026-03-09 17:24:20 -07:00
teknium1	9abd6bf342	fix: gateway missing docker_volumes config bridge + list serialization bug The gateway's config.yaml → env var bridge was missing docker_volumes, so Docker volume mounts configured in config.yaml were ignored for gateway sessions (Telegram, Discord, etc.) while working in CLI. Also fixes list serialization: str() produces Python repr with single quotes which json.loads() in terminal_tool.py can't parse. Now uses json.dumps() for list values. Based on PR #431 by @manuelschipper (applied manually due to stale branch).	2026-03-09 17:24:00 -07:00
Teknium	d2c7ef6b41	Merge pull request #792 from NousResearch/hermes/hermes-d2f5523a Merge PR #428: Improve type hints and error diagnostics in vision_tools + add 42 tests	2026-03-09 17:21:44 -07:00
teknium1	a34102049b	Merge: vision auto-detection fallback to local endpoints	2026-03-09 15:36:27 -07:00
teknium1	ef5d811aba	fix: vision auto-detection now falls back to custom/local endpoints Vision auto-mode previously only tried OpenRouter, Nous, and Codex for multimodal — deliberately skipping custom endpoints with the assumption they 'may not handle vision input.' This caused silent failures for users running local multimodal models (Qwen-VL, LLaVA, Pixtral, etc.) without any cloud API keys. Now custom endpoints are tried as a last resort in auto mode. If the model doesn't support vision, the API call fails gracefully — but users with local vision models no longer need to manually set auxiliary.vision.provider: main in config.yaml. Reported by @Spadav and @kotyKD.	2026-03-09 15:36:19 -07:00
teknium1	2d44ed1c5b	test: add comprehensive tests for vision_tools (42 tests) Covers PR #428 changes and existing vision_tools functionality: - _validate_image_url: 20 tests for urlparse-based validation - _determine_mime_type: 6 tests for MIME type detection - _image_to_base64_data_url: 3 tests for base64 conversion - _handle_vision_analyze: 5 tests for type hints, prompt building, AUXILIARY_VISION_MODEL env var override - Error logging exc_info: 3 async tests verifying stack traces are logged on download failure, analysis error, and cleanup error - check_vision_requirements & get_debug_session_info: 2 basic tests - Registry integration: 3 tests for tool registration	2026-03-09 15:32:02 -07:00
teknium1	fa2e72ae9c	docs: document docker_volumes config for shared host directories The Docker backend already supports user-configured volume mounts via docker_volumes, but it was undocumented — missing from DEFAULT_CONFIG, cli.py defaults, and configuration docs. Changes: - hermes_cli/config.py: Add docker_volumes to DEFAULT_CONFIG with inline documentation and examples - cli.py: Add docker_volumes to load_cli_config defaults - configuration.md: Full Docker Volume Mounts section with YAML examples, use cases (providing files, receiving outputs, shared workspaces), and env var alternative	2026-03-09 15:29:34 -07:00
teknium1	5bfc4ed53b	Merge PR #428: Improve type hints and error diagnostics in vision_tools Authored by aydnOktay. Improves URL validation with urlparse, adds exc_info to error logs for full stack traces, and tightens type hints. Resolved merge conflict in _handle_vision_analyze: kept PR's string formatting with our AUXILIARY_VISION_MODEL env var logic.	2026-03-09 15:27:54 -07:00
teknium1	520aec20e0	fix: add mcp to dev dependencies for test suite MCP tests import from mcp.types but mcp wasn't in the dev optional dependencies. Fresh 'pip install -e .[dev]' setups failed 3 tests. Based on PR #427 by @teyrebaz33 (applied manually due to stale branch).	2026-03-09 15:12:54 -07:00
teknium1	64bec1d060	fix: Slack gateway setup missing event subscriptions and scopes The 'hermes gateway setup' instructions for Slack were missing: - The 'Subscribe to Events' step entirely (message.im, message.channels, app_mention, message.groups) - Several required scopes (app_mentions:read, groups:history, users:read, files:write) - Warning about bot only working in DMs without message.channels - Step to invite the bot to channels The 'hermes setup' flow (setup.py) and the website docs (slack.md) already had the correct information — only gateway.py was outdated. Reported by JordanB on Slack.	2026-03-09 14:31:19 -07:00
teknium1	ac58309dbd	docs: improve Slack setup guide with channel event subscriptions and scopes The #1 support issue with Slack is 'bot works in DMs but not channels'. This is almost always caused by missing event subscriptions (message.channels, message.groups) or missing OAuth scopes (channels:history, groups:history). Changes: - slack.md: Move channels:history and groups:history from optional to required scopes. Move message.channels and message.groups to required events. Add new 'How the Bot Responds' section explaining DM vs channel behavior. Add Step 8 for inviting bot to channels. Expand troubleshooting table with specific 'works in DMs not channels' entry. Add quick checklist for channel debugging. - setup.py: Expand Slack setup wizard with all required scopes, event subscriptions, and a warning that without message.channels/message.groups the bot only works in DMs. Add link to full docs. Improve Member ID discovery instructions. - config.py: Update SLACK_BOT_TOKEN and SLACK_APP_TOKEN descriptions to list required scopes and event subscriptions inline.	2026-03-09 14:00:11 -07:00
Teknium	a5a5d82a21	Merge pull request #784 from NousResearch/feat/slack-app-mention-and-documents feat(slack): fix app_mention 404 + add document/video support	2026-03-09 13:04:50 -07:00
teknium1	34e8d088c2	feat(slack): fix app_mention 404 + add document/video support - Register no-op app_mention event handler to suppress Bolt 404 errors. The 'message' handler already processes @mentions in channels, so app_mention is acknowledged without duplicate processing. - Add send_document() for native file attachments (PDFs, CSVs, etc.) via files_upload_v2, matching the pattern from Telegram PR #779. - Add send_video() for native video uploads via files_upload_v2. - Handle incoming document attachments from users: download, cache, and inject text content for .txt/.md files (capped at 100KB), following the same pattern as the Telegram adapter. - Add _download_slack_file_bytes() helper for raw byte downloads. - Add 24 new tests covering all new functionality. Fixes the unhandled app_mention events reported in gateway logs.	2026-03-09 13:02:59 -07:00
teknium1	c754135965	fix: banner wraps in narrow terminals (Kitty, small windows) The full HERMES-AGENT ASCII logo needs ~95 columns, and the side-by-side caduceus + tools panel needs ~80. In narrow terminals (Kitty default, resized windows) everything wraps into visual garbage. Fixes: - show_banner() auto-detects terminal width and falls back to compact banner when < 80 columns - build_welcome_banner() skips the ASCII logo when < 95 columns - Compact banner now dynamically sized via _build_compact_banner() instead of a hardcoded 64-char box that also wrapped in narrow terms - Same width checks applied to /clear command's banner refresh The up/down arrow key issue in Kitty terminal for multiline input is a known Kitty keyboard protocol (CSI u) vs prompt_toolkit compatibility gap — arrow keys work correctly in standard terminals and tmux. Users can work around it by running in tmux or setting TERM=xterm-256color.	2026-03-09 05:57:36 -07:00
teknium1	c6b75baad0	feat: find-nearby skill and Telegram location support Adds a 'find-nearby' skill for discovering nearby places using OpenStreetMap (Overpass + Nominatim). No API keys needed. Works with: - Coordinates (from Telegram location pins) - Addresses, cities, zip codes, landmarks (auto-geocoded) - Multiple place types (restaurant, cafe, bar, pharmacy, etc.) Returns names, distances, cuisine, hours, addresses, and Google Maps links (pin + directions). 184-line stdlib-only script. Also adds Telegram location message handling: - New MessageType.LOCATION in gateway base - Telegram adapter handles LOCATION and VENUE messages - Injects lat/lon coordinates into conversation context - Prompts agent to ask what the user wants nearby Inspired by PR #422 (reimplemented with simpler script and broader skill scope — addresses/cities/zips, not just Telegram coordinates).	2026-03-09 05:31:10 -07:00
teknium1	a7ad6f6d28	Merge: custom providers instant activation + model persistence	2026-03-09 05:08:01 -07:00
teknium1	1a2141d04d	fix: custom providers activate immediately, save model name Selecting a saved custom provider now switches instantly without probing /models — the model name is stored in the config entry as a complete profile (name + url + key + model). Changes: - custom_providers entries now include 'model' field - Selecting a saved provider with a model just activates it - Only probes /models if no model is saved (first-time setup) - Menu shows saved model name: 'Local (localhost:8000) — llama-70b' - Dedup on re-entry: still activates the model, just doesn't add a duplicate config entry (updates model name if changed)	2026-03-09 05:07:53 -07:00
teknium1	ff3f3169b2	Merge: auto-save custom endpoints + removal option	2026-03-09 04:58:27 -07:00
teknium1	f4580b6010	feat: auto-save custom endpoints + removal option When a user adds a custom endpoint via 'hermes model' → 'Custom endpoint', it now automatically saves to custom_providers in config.yaml so it persists and appears in the provider menu on subsequent runs. Deduplicates by base_url. Auto-generated names based on URL: http://localhost:8000/v1 → 'Local (localhost:8000)' https://xyz.runpod.ai/v1 → 'RunPod (xyz.runpod.ai)' https://api.example.com/v1 → 'Api.example.com' Also adds 'Remove a saved custom provider' option to the menu (only shown when custom providers exist) with a selection UI to pick which one to remove. Users can also manually edit custom_providers in config.yaml for full control over names and settings.	2026-03-09 04:58:20 -07:00
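
The naming and dedup rules described in commit f4580b6010 could be sketched roughly as below. This is only an illustration under stated assumptions, not the project's code: `suggest_provider_name`, `save_custom_provider`, and the `name`/`base_url` keys are hypothetical, and the only behavior taken from the commit message is the three URL-to-name examples and deduplication by base_url.

```python
from urllib.parse import urlparse


def suggest_provider_name(base_url: str) -> str:
    """Hypothetical helper: derive a display name from an endpoint URL."""
    parsed = urlparse(base_url)
    host = parsed.hostname or ""
    netloc = parsed.netloc or base_url  # host plus optional port

    if host in ("localhost", "127.0.0.1"):
        return f"Local ({netloc})"
    if host.endswith(".runpod.ai"):
        return f"RunPod ({netloc})"
    # Fallback assumed from the commit's third example: capitalize the host.
    return netloc.capitalize()


def save_custom_provider(providers: list, base_url: str) -> list:
    """Hypothetical helper: return a new list with the endpoint added,
    unless an entry with the same base_url already exists."""
    if any(entry.get("base_url") == base_url for entry in providers):
        return providers
    new_entry = {"name": suggest_provider_name(base_url), "base_url": base_url}
    return providers + [new_entry]


if __name__ == "__main__":
    saved = save_custom_provider([], "http://localhost:8000/v1")
    saved = save_custom_provider(saved, "http://localhost:8000/v1")  # duplicate, ignored
    print(saved)
```

Returning a new list rather than mutating the input keeps the sketch easy to test; how the real implementation persists entries to config.yaml is not shown in the log.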

1148 Commits