hermes-agent

Author	SHA1	Message	Date
Siddharth Balyan	b6461903ff	feat: nix flake — uv2nix build, NixOS module, persistent container mode (#20 ) * feat: nix flake, uv2nix build, dev shell and home manager * fixed nix run, updated docs for setup * feat(nix): NixOS module with persistent container mode, managed guards, checks - Replace homeModules.nix with nixosModules.nix (two deployment modes) - Mode A (native): hardened systemd service with ProtectSystem=strict - Mode B (container): persistent Ubuntu container with /nix/store bind-mount, identity-hash-based recreation, GC root protection, symlink-based updates - Add HERMES_MANAGED guards blocking CLI config mutation (config set, setup, gateway install/uninstall) when running under NixOS module - Add nix/checks.nix with build-time verification (binary, CLI, managed guard) - Remove container.nix (no Nix-built OCI image; pulls ubuntu:24.04 at runtime) - Simplify packages.nix (drop fetchFromGitHub submodules, PYTHONPATH wrappers) - Rewrite docs/nixos-setup.md with full options reference, container architecture, secrets management, and troubleshooting guide Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Update config.py * feat(nix): add CI workflow and enhanced build checks - GitHub Actions workflow for nix flake check + build on linux/macOS - Entry point sync check to catch pyproject.toml drift - Expanded managed-guard check to cover config edit - Wrap hermes-acp binary in Nix package - Fix Path type mismatch in is_managed() * Update MCP server package name; bundled skills support * fix reading .env. instead have container user a common mounted .env file * feat(nix): container entrypoint with privilege drop and sudo provisioning Container was running as non-root via --user, which broke apt/pip installs and caused crashes when $HOME didn't exist. Replace --user with a Nix-built entrypoint script that provisions the hermes user, sudo (NOPASSWD), and /home/hermes inside the container on first boot, then drops privileges via setpriv. 
Writable layer persists so setup only runs once. Also expands MCP server options to support HTTP transport and sampling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix group and user creation in container mode * feat(nix): persistent /home/hermes and MESSAGING_CWD in container mode Container mode now bind-mounts ${stateDir}/home to /home/hermes so the agent's home directory survives container recreation. Previously it lived in the writable layer and was lost on image/volume/options changes. Also passes MESSAGING_CWD to the container so the agent finds its workspace and documents, matching native mode behavior. Other changes: - Extract containerDataDir/containerHomeDir bindings (no more magic strings) - Fix entrypoint chown to run unconditionally (volume mounts always exist) - Add schema field to container identity hash for auto-recreation - Add idempotency test (Scenario G) to config-roundtrip check * docs: add Nix & NixOS setup guide to docs site Add comprehensive Nix documentation to the Docusaurus site at website/docs/getting-started/nix-setup.md, covering nix run/profile install, NixOS module (native + container modes), declarative settings, secrets management, MCP servers, managed mode, container architecture, dev shell, flake checks, and full options reference. - Register nix-setup in sidebar after installation page - Add Nix callout tip to installation.md linking to new guide - Add canonical version pointer in docs/nixos-setup.md * docs: remove docs/nixos-setup.md, consolidate into website docs Backfill missing details (restart/restartSec in full example, gateway.pid, 0750 permissions, docker inspect commands) into the canonical website/docs/getting-started/nix-setup.md and delete the old standalone file. * fix(nix): add compression.protect_last_n and target_ratio to config-keys.json New keys were added to DEFAULT_CONFIG on main, causing the config-drift check to fail in CI. 
* fix(nix): skip checks on aarch64-darwin (onnxruntime wheel missing) The full Python venv includes onnxruntime (via faster-whisper/STT) which lacks a compatible uv2nix wheel on aarch64-darwin. Gate all checks behind stdenv.hostPlatform.isLinux. The package and devShell still evaluate on macOS. * fix(nix): skip flake check and build on macOS CI onnxruntime (transitive dep via faster-whisper) lacks a compatible uv2nix wheel on aarch64-darwin. Run full checks and build on Linux only; macOS CI verifies the flake evaluates without building. * fix(nix): preserve container writable layer across nixos-rebuild The container identity hash included the entrypoint's Nix store path, which changes on every nixpkgs update (due to runtimeShell/stdenv input-addressing). This caused false-positive identity mismatches, triggering container recreation and losing the persistent writable layer. - Use stable symlink (current-entrypoint) like current-package already does - Remove entrypoint from identity hash (only image/volumes/options matter) - Add GC root for entrypoint so nix-collect-garbage doesn't break it - Remove global HERMES_HOME env var from addToSystemPackages (conflicted with interactive CLI use, service already sets its own) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 01:08:02 +05:30
Teknium	8f6ef042c1	fix(cli): buffer reasoning preview chunks and fix duplicate display (#3013 ) Three improvements to reasoning/thinking display in the CLI: 1. Buffer tiny reasoning chunks: providers like DeepSeek stream reasoning one word at a time, producing a separate [thinking] line per token. Add a buffer that coalesces chunks and flushes at natural boundaries (newlines, sentence endings, terminal width). 2. Fix duplicate reasoning display: centralize callback selection into _current_reasoning_callback() — one place instead of 4 scattered inline ternaries. Prevents both the streaming box AND the preview callback from firing simultaneously. 3. Fix post-response reasoning box guard: change the check from 'not self._stream_started' to 'not self._reasoning_stream_started' so the final reasoning box is only suppressed when reasoning was actually streamed live, not when any text was streamed. Cherry-picked from PR #2781 by juanfradb.	2026-03-25 12:16:39 -07:00
Teknium	099dfca6db	fix: GLM reasoning-only and max-length handling (#3010 ) - Add 'prompt exceeds max length' to context overflow detection for Z.AI/GLM 400 errors - Extract inline reasoning blocks from assistant content as fallback when no structured reasoning fields are present - Guard inline extraction so structured API reasoning takes priority - Update test for reasoning-only response salvage behavior Cherry-picked from PR #2993 by kshitijk4poor. Added priority guard to fix test_structured_reasoning_takes_priority failure. Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-25 12:05:37 -07:00
Teknium	68ab37e891	fix(delegate): give subagents independent iteration budgets (#3004) Each subagent now gets its own IterationBudget instead of sharing the parent's. The per-subagent cap is controlled by delegation.max_iterations in config.yaml (default 50). Total iterations across parent + subagents can exceed the parent's max_iterations, but the user retains control via the config setting. Previously, subagents shared the parent's budget, so three parallel subagents configured for max_iterations=50 racing against a parent that already used 60 of 90 would each only get ~10 iterations. Inspired by PR #2928 (Bartok9) which identified the issue (#2873).	2026-03-25 11:29:49 -07:00
Teknium	65dace1b1a	fix(discord): stop phantom typing indicator after agent turn completes (#3003 ) Two fixes for a race where Discord's typing indicator lingers after the agent finishes: 1. _keep_typing (root cause): after outer stop_typing() clears the task dict, _keep_typing wakes from its 2s sleep and calls send_typing() again, recreating an orphaned loop. Add a finally block so _keep_typing always calls stop_typing() on exit, cleaning up any loop it recreated. 2. _process_message_background (safety net): add stop_typing() after cancelling the typing task, catching any platform-level persistent typing tasks that slipped through. Combines fixes from PR #2945 by catbusconductor (root cause in _keep_typing) and PR #2832 by subrih (safety net in _process_message_background).	2026-03-25 11:28:28 -07:00
Teknium	650b400c98	fix(cron): mark session as ended after job completes (#2998 ) Cron was the only execution path that never called end_session(), leaving ended_at = NULL permanently. This made cron sessions invisible to hermes prune --older-than and indistinguishable from active sessions. Captures session_id in a local variable before agent construction so it's available in the finally block even if AIAgent() fails, then calls end_session(session_id, 'cron_complete') before close(). Cherry-picked from PR #2979 by ygd58. Fixed bug: original PR called end_session() with zero arguments (TypeError — method requires session_id and end_reason). Fixes #2972. Co-authored-by: ygd58 <ygd58@users.noreply.github.com>	2026-03-25 11:13:21 -07:00
Teknium	61949f0af7	Fix (#2997) Co-authored-by: Jack <jvand@DESKTOP-JACK.localdomain>	2026-03-25 11:12:11 -07:00
Teknium	52c5e491f5	fix(session): surface silent SessionDB failures that cause session data loss (#2999 ) * fix(session): surface silent SessionDB failures that cause session data loss SessionDB initialization and operation failures were logged at debug level or silently swallowed, causing sessions to never be indexed in the FTS5 database. This made session_search unable to find affected conversations. In practice, ~48% of sessions can be lost without any visible indication. The JSON session files are still written (separate code path), but the SQLite/FTS5 index gets nothing — making session_search return empty results for affected sessions. Changes: - cli.py: Log warnings (not debug) when SessionDB init fails at both __init__ and _start_session entry points - run_agent.py: Log warnings on create_session, append_message, and compression split failures - run_agent.py: Set _session_db = None after create_session failure to fail fast instead of silently dropping every message for the session Root cause: When gateway restarts or DB lock contention occurs during SessionDB() init, the exception is caught and swallowed. The agent continues running normally — JSON session logs are written to disk — but no messages reach the FTS5 index. * fix: use module logger instead of root logging for SessionDB warnings Follow-up to cherry-picked PR #2939 — the original used logging.warning() (root logger) instead of logger.warning() (module logger) in the 5 new warning calls. Module logger preserves the logger hierarchy and shows the correct module name in log output. --------- Co-authored-by: LucidPaths <lc77@outlook.de>	2026-03-25 11:10:19 -07:00
Teknium	f665351740	fix(shell): exponential backoff for persistent shell polling (#2996) * fix(shell): replace fixed 10ms poll interval with exponential backoff to reduce WSL2 resource consumption * fix(shell): rename _poll_interval to _poll_interval_start for clarity, update SSH override * fix(shell): correctly rename _poll_interval to _poll_interval_start in ssh.py --------- Co-authored-by: ygd58 <buraysandro9@gmail.com>	2026-03-25 10:56:48 -07:00
Teknium	fba73a60e3	fix(skills): use Git Trees API to prevent silent subdirectory loss during install (#2995 ) * fix(skills): use Git Trees API to prevent silent subdirectory loss during install Refactors _download_directory() to use the Git Trees API (single call for the entire repo tree) as the primary path, falling back to the recursive Contents API when the tree endpoint is unavailable or truncated. Prevents silent subdirectory loss caused by per-directory rate limiting or transient failures. Cherry-picked from PR #2981 by tugrulguner. Fixes #2940. * fix: simplify tree API — use branch name directly as tree-ish Eliminates an extra git/ref/heads API call by passing the branch name directly to git/trees/{branch}?recursive=1, matching the pattern already used by _find_skill_in_repo_tree. --------- Co-authored-by: tugrulguner <tugrulguner@users.noreply.github.com>	2026-03-25 10:48:18 -07:00
Teknium	114e636b7d	fix(display): suppress KawaiiSpinner animation under patch_stdout (#2994 ) When the CLI is active, sys.stdout is prompt_toolkit's StdoutProxy which queues writes and injects newlines around each flush(). This causes every \r spinner frame to land on its own line instead of overwriting the previous one, producing visible flickering where the spinner and status bar repeatedly swap positions. The CLI already renders spinner state via a dedicated TUI widget (_spinner_text / get_spinner_text), so KawaiiSpinner's \r-based loop is redundant under StdoutProxy. Detect the proxy and suppress the animation entirely — the thread still runs to preserve start()/stop() semantics. Also removes the 0.4s flush rate-limit workaround that was papering over the same issue, and cleans up the unused _last_flush_time attribute. Salvaged from PR #2908 by Mibayy (fixed _raw -> raw detection, dropped unrelated bundled changes).	2026-03-25 10:46:54 -07:00
Teknium	20cc1731f4	perf(prompt_builder): avoid redundant file re-read for skill conditions (#2992) build_skills_system_prompt() was calling _read_skill_conditions() which re-read each SKILL.md file to extract conditional activation fields. The frontmatter was already parsed by _parse_skill_file() earlier in the same loop. Extract conditions inline from the existing frontmatter dict instead, saving one file read per skill (~80+ on a typical setup). Salvaged from PR #2827 by InB4DevOps.	2026-03-25 10:39:27 -07:00
Teknium	b2a6b012fe	fix(api_server): streaming breaks when agent makes tool calls (#2985 ) * fix(run_agent): ensure _fire_first_delta() is called for tool generation events Added calls to _fire_first_delta() in the AIAgent class to improve the handling of tool generation events, ensuring timely notifications during the processing of function calls and tool usage. * fix(run_agent): improve timeout handling for chat completions Enhanced the timeout configuration for chat completions in the AIAgent class by introducing customizable connection, read, and write timeouts using environment variables. This ensures more robust handling of API requests during streaming operations. * fix(run_agent): reduce default stream read timeout for chat completions Updated the default stream read timeout from 120 seconds to 60 seconds in the AIAgent class, enhancing the timeout configuration for chat completions. This change aims to improve responsiveness during streaming operations. * fix(run_agent): enhance streaming error handling and retry logic Improved the error handling and retry mechanism for streaming requests in the AIAgent class. Introduced a configurable maximum number of stream retries and refined the handling of transient network errors, allowing for retries with fresh connections. Non-transient errors now trigger a fallback to non-streaming only when appropriate, ensuring better resilience during API interactions. * fix(api_server): streaming breaks when agent makes tool calls The agent fires stream_delta_callback(None) to signal the CLI display to close its response box before tool execution begins. The API server's _on_delta callback was forwarding this None directly into the SSE queue, where the SSE writer treats it as end-of-stream and terminates the HTTP response prematurely. 
After tool calls complete, the agent streams the final answer through the same callback, but the SSE response was already closed — so Open WebUI (and similar frontends) never received the actual answer. Fix: filter out None in _on_delta so the SSE stream stays open. The SSE loop already detects completion via agent_task.done(), which handles stream termination correctly without needing the None sentinel. Reported by Rohit Paul on X.	2026-03-25 09:56:20 -07:00
Teknium	42fec19151	feat: persist reasoning across gateway session turns (schema v6) (#2974) Tested against OpenAI Codex (direct), Anthropic (direct + OAI-compat), and OpenRouter → 6 backends. All reasoning field types (reasoning, reasoning_details, codex_reasoning_items) round-trip through the DB correctly.	2026-03-25 09:47:28 -07:00
Teknium	5dbe2d9d73	fix: skills-sh install fails for deeply nested repo structures (#2980 ) * fix(run_agent): ensure _fire_first_delta() is called for tool generation events Added calls to _fire_first_delta() in the AIAgent class to improve the handling of tool generation events, ensuring timely notifications during the processing of function calls and tool usage. * fix(run_agent): improve timeout handling for chat completions Enhanced the timeout configuration for chat completions in the AIAgent class by introducing customizable connection, read, and write timeouts using environment variables. This ensures more robust handling of API requests during streaming operations. * fix(run_agent): reduce default stream read timeout for chat completions Updated the default stream read timeout from 120 seconds to 60 seconds in the AIAgent class, enhancing the timeout configuration for chat completions. This change aims to improve responsiveness during streaming operations. * fix(run_agent): enhance streaming error handling and retry logic Improved the error handling and retry mechanism for streaming requests in the AIAgent class. Introduced a configurable maximum number of stream retries and refined the handling of transient network errors, allowing for retries with fresh connections. Non-transient errors now trigger a fallback to non-streaming only when appropriate, ensuring better resilience during API interactions. * fix: skills-sh install fails for deeply nested repo structures Skills in repos with deep directory nesting (e.g. cli-tool/components/skills/development/senior-backend/) could not be installed because the candidate path generation and shallow root-dir scan never reached them. Added GitHubSource._find_skill_in_repo_tree() which uses the GitHub Trees API to recursively search the entire repo tree in a single API call. This is used as a final fallback in SkillsShSource._discover_identifier() when the standard candidate paths and shallow scan both fail. 
Fixes installation of skills from repos like davila7/claude-code-templates where skills are nested 4+ levels deep. Reported by user Samuraixheart.	2026-03-25 09:31:05 -07:00
Teknium	c6f4515f73	fix(whatsapp): download documents, audio, and video media from messages (#2978 ) Add downloadMediaMessage() calls for documents, audio/voice notes, and video in bridge.js — previously only images were downloaded, leaving all other file types inaccessible to the agent. Handle local file paths from the bridge for DOCUMENT, VOICE, and VIDEO types in whatsapp.py with proper MIME detection. Inject text content inline for readable files (.txt, .md, .csv, .json, etc.). Follow-up fixes applied during salvage: - Remove unused cache_document_from_bytes import - Add 100KB size cap on text injection (matches Telegram/Discord/Slack) - Align injection format with other platforms Cherry-picked from PR #2818. Also fixes #2856 (bugs 1 & 2). PR #2865 by ayberkesn fixed the same voice note issue. Co-authored-by: noestelar <hola@noeali.com>	2026-03-25 08:37:28 -07:00
Teknium	fd292e676b	fix: skip KawaiiSpinner when TUI handles tool progress (#2973 ) * docs: unify hooks documentation — add plugin hooks to hooks page, add session:end event The hooks page only documented gateway event hooks (HOOK.yaml system). The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't referenced from the hooks page, which was confusing. Changes: - hooks.md: Add overview table showing both hook systems - hooks.md: Add Plugin Hooks section with available hooks, callback signatures, and example - hooks.md: Add missing session:end gateway event (emitted but undocumented) - hooks.md: Mark pre_llm_call, post_llm_call, on_session_start, on_session_end as planned (defined in VALID_HOOKS but not yet invoked) - hooks.md: Update info box to cross-reference plugin hooks - hooks.md: Fix heading hierarchy (gateway content as subsections) - plugins.md: Add cross-reference to hooks page for full details - plugins.md: Mark planned hooks as (planned) * feat(session_search): add recent sessions mode when query is omitted When session_search is called without a query (or with an empty query), it now returns metadata for the most recent sessions instead of erroring. This lets the agent quickly see what was worked on recently without needing specific keywords. Returns for each session: session_id, title, source, started_at, last_active, message_count, preview (first user message). Zero LLM cost — pure DB query. Current session lineage and child delegation sessions are excluded. The agent can then keyword-search specific sessions if it needs deeper context from any of them. 
* docs: clarify two-mode behavior in session_search schema description * fix(compression): restore sane defaults and cap summary at 12K tokens - threshold: 0.80 → 0.50 (compress at 50%, not 80%) - target_ratio: 0.40 → 0.20, now relative to threshold not total context (20% of 50% = 10% of context as tail budget) - summary ceiling: 32K → 12K (Gemini can't output more than ~12K) - Updated DEFAULT_CONFIG, config display, example config, and tests * fix: browser_vision ignores auxiliary.vision.timeout config (#2901) * docs: unify hooks documentation — add plugin hooks to hooks page, add session:end event The hooks page only documented gateway event hooks (HOOK.yaml system). The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't referenced from the hooks page, which was confusing. Changes: - hooks.md: Add overview table showing both hook systems - hooks.md: Add Plugin Hooks section with available hooks, callback signatures, and example - hooks.md: Add missing session:end gateway event (emitted but undocumented) - hooks.md: Mark pre_llm_call, post_llm_call, on_session_start, on_session_end as planned (defined in VALID_HOOKS but not yet invoked) - hooks.md: Update info box to cross-reference plugin hooks - hooks.md: Fix heading hierarchy (gateway content as subsections) - plugins.md: Add cross-reference to hooks page for full details - plugins.md: Mark planned hooks as (planned) * fix: browser_vision ignores auxiliary.vision.timeout config browser_vision called call_llm() without passing a timeout parameter, so it always used the 30-second default in auxiliary_client.py. This made vision analysis with local models (llama.cpp, ollama) impossible since they typically need more than 30s for screenshot analysis. Now browser_vision reads auxiliary.vision.timeout from config.yaml (same config key that vision_analyze already uses) and passes it through to call_llm(). 
Also bumped the default vision timeout from 30s to 120s in both browser_vision and vision_analyze — 30s is too aggressive for local models and the previous default silently failed for anyone running vision locally. Fixes user report from GamerGB1988. * fix(skills): agent-created skills were incorrectly treated as untrusted community content _resolve_trust_level() didn't handle 'agent-created' source, so it fell through to 'community' trust level. Community policy blocks on any caution or dangerous findings, which meant common patterns like curl with env vars, systemctl, crontab, cloudflared references etc. would block skill creation/patching. The agent-created policy row already existed in INSTALL_POLICY with permissive settings (allow caution, ask on dangerous) but was never reached. Now it is. Fixes reports of skill_manage being blocked by security scanner. * fix(cli): enhance real-time reasoning output by forcing flush of long partial lines Updated the reasoning output mechanism to emit complete lines and force-flush long partial lines, ensuring reasoning is visible in real-time even without newlines. This improves user experience during reasoning sessions. * fix: skip KawaiiSpinner when TUI handles tool progress In the interactive CLI, the agent runs with quiet_mode=True and tool_progress_callback set. The quiet_mode condition triggered KawaiiSpinner for every tool call, but the TUI was already handling progress display via the spinner widget. The KawaiiSpinner writes carriage-return animation through StdoutProxy, triggering run_in_terminal() erase/redraw cycles on every flush. These redundant cycles cause the status bar to ghost into terminal scrollback. The thinking spinner already had this guard (checks thinking_callback). This extends the same pattern to the three tool spinner creation sites: concurrent tools, delegate_task, and single tool execution.	2026-03-25 08:33:44 -07:00
Teknium	e5691eed38	feat(gateway): configurable Telegram reply threading mode (#2907) Add reply_to_mode setting (off/first/all) to control whether Telegram replies quote/thread to the user's original message. - 'off': Never thread replies (no quote bubble) - 'first': Only first chunk threads to user's message (default, preserves existing behavior) - 'all': All chunks in multi-part replies thread to user's message Configurable via: - reply_to_mode in platform config (gateway config YAML) - TELEGRAM_REPLY_TO_MODE env var Based on PR #855 by raulvidis.	2026-03-24 19:56:00 -07:00
Teknium	ab4ba8163a	feat(migration): comprehensive OpenClaw migration v2 — 17 new modules, terminal recap (#2906 ) * feat(migration): comprehensive OpenClaw -> Hermes migration v2 Extends the existing migration script from ~15% to ~95% coverage of OpenClaw's configuration surface. Adds 17 new migration modules: Direct migrations (written to config.yaml/.env): - MCP servers: full server definitions with transport, tools, sampling - Agent defaults: reasoning_effort, compression, human_delay, timezone - Session config: reset triggers (daily/idle) -> session_reset - Full model providers: custom_providers with base_url/api_mode - Deep channel config: Matrix, Mattermost, IRC, Discord deep settings - Browser config: timeout settings - Tools config: exec timeout -> terminal.timeout - Approvals: mode mapping (smart/manual/auto -> Hermes equivalents) Archived for manual review (no direct Hermes equivalent): - Plugins config + installed extensions - Cron jobs (with note to use 'hermes cron') - Hooks/webhooks config - Multi-agent list + routing bindings - Gateway config (port, auth, TLS) - Memory backend config (QMD, vector search) - Skills registry per-entry config - UI/identity settings - Logging/diagnostics preferences Also adds: - MIGRATION_NOTES.md generation with PM2 reassurance message - _set_env_var helper for consistent env file management - Updated presets to include all new options - Comprehensive mock test passing (12 migrated, 12 archived) * feat(migration): add terminal recap with visual summary Replaces raw JSON dump with a formatted box showing migrated/archived/ skipped/conflict/error counts, detailed item lists with labels, PM2 reassurance message, and actionable next steps. JSON output available via MIGRATION_JSON_OUTPUT=1 env var. * fix(test): allowlist python_os_environ as known false-positive in skills guard test MIGRATION_JSON_OUTPUT env var is a legitimate CLI feature flag that enables JSON output mode, not an env dump. 
Add it alongside agent_config_mod as an accepted finding in test_skill_installs_cleanly_under_skills_guard. * fix(test): add hermes_config_mod to known false-positives in skills guard test The scanner flags two print statements that tell the user to review ~/.hermes/config.yaml in the post-migration summary. The script never writes to that file — those are informational strings, not config mutations. --------- Co-authored-by: Hermes <hermes@nousresearch.ai>	2026-03-24 19:44:02 -07:00
Teknium	80cc27eb9d	feat(api-server): Idempotency-Key support, body size limit, OpenAI error envelope (#2903) * feat(api-server): add Idempotency-Key support and request size limit; unify OpenAI error envelope * fix(api-server): include provider error message in 500 OpenAI error body --------- Co-authored-by: aydnOktay <xaydinoktay@gmail.com>	2026-03-24 19:31:08 -07:00
Teknium	1b24a226ea	fix(skills): agent-created skills were incorrectly treated as untrusted community content _resolve_trust_level() didn't handle 'agent-created' source, so it fell through to 'community' trust level. Community policy blocks on any caution or dangerous findings, which meant common patterns like curl with env vars, systemctl, crontab, cloudflared references etc. would block skill creation/patching. The agent-created policy row already existed in INSTALL_POLICY with permissive settings (allow caution, ask on dangerous) but was never reached. Now it is. Fixes reports of skill_manage being blocked by security scanner.	2026-03-24 19:15:03 -07:00
Teknium	9b32f846a8	fix: browser_vision ignores auxiliary.vision.timeout config (#2901 ) * docs: unify hooks documentation — add plugin hooks to hooks page, add session:end event The hooks page only documented gateway event hooks (HOOK.yaml system). The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't referenced from the hooks page, which was confusing. Changes: - hooks.md: Add overview table showing both hook systems - hooks.md: Add Plugin Hooks section with available hooks, callback signatures, and example - hooks.md: Add missing session:end gateway event (emitted but undocumented) - hooks.md: Mark pre_llm_call, post_llm_call, on_session_start, on_session_end as planned (defined in VALID_HOOKS but not yet invoked) - hooks.md: Update info box to cross-reference plugin hooks - hooks.md: Fix heading hierarchy (gateway content as subsections) - plugins.md: Add cross-reference to hooks page for full details - plugins.md: Mark planned hooks as (planned) * fix: browser_vision ignores auxiliary.vision.timeout config browser_vision called call_llm() without passing a timeout parameter, so it always used the 30-second default in auxiliary_client.py. This made vision analysis with local models (llama.cpp, ollama) impossible since they typically need more than 30s for screenshot analysis. Now browser_vision reads auxiliary.vision.timeout from config.yaml (same config key that vision_analyze already uses) and passes it through to call_llm(). Also bumped the default vision timeout from 30s to 120s in both browser_vision and vision_analyze — 30s is too aggressive for local models and the previous default silently failed for anyone running vision locally. Fixes user report from GamerGB1988.	2026-03-24 19:10:12 -07:00
Teknium	7ca22ea11b	fix(compression): restore sane defaults and cap summary at 12K tokens - threshold: 0.80 → 0.50 (compress at 50%, not 80%) - target_ratio: 0.40 → 0.20, now relative to threshold not total context (20% of 50% = 10% of context as tail budget) - summary ceiling: 32K → 12K (Gemini can't output more than ~12K) - Updated DEFAULT_CONFIG, config display, example config, and tests	2026-03-24 18:48:47 -07:00
Teknium	ef47531617	docs: unify hooks documentation — add plugin hooks to hooks page, add session:end event The hooks page only documented gateway event hooks (HOOK.yaml system). The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't referenced from the hooks page, which was confusing. Changes: - hooks.md: Add overview table showing both hook systems - hooks.md: Add Plugin Hooks section with available hooks, callback signatures, and example - hooks.md: Add missing session:end gateway event (emitted but undocumented) - hooks.md: Mark pre_llm_call, post_llm_call, on_session_start, on_session_end as planned (defined in VALID_HOOKS but not yet invoked) - hooks.md: Update info box to cross-reference plugin hooks - hooks.md: Fix heading hierarchy (gateway content as subsections) - plugins.md: Add cross-reference to hooks page for full details - plugins.md: Mark planned hooks as (planned)	2026-03-24 18:48:47 -07:00
Teknium	b36fe9282a	feat(session_search): add recent sessions mode when query is omitted (#2533)	2026-03-24 18:41:38 -07:00
Teknium	1e9ff53a74	docs: clarify two-mode behavior in session_search schema description	2026-03-24 18:08:06 -07:00
Teknium	27c023e071	feat(config): expose compression target_ratio, protect_last_n, and threshold in DEFAULT_CONFIG PR #2554 made these configurable via config.yaml but didn't add them to DEFAULT_CONFIG or the config display. Users couldn't discover the new knobs without reading the source. - threshold: 0.80 (compress at 80% context usage) - target_ratio: 0.40 (preserve 40% of context as recent tail) - protect_last_n: 20 (keep last 20 messages uncompressed) - Updated hermes config display to show all three fields	2026-03-24 18:05:43 -07:00
Teknium	9231a335d4	fix(compression): replace dead summary_target_tokens with ratio-based scaling (#2554 ) The summary_target_tokens parameter was accepted in the constructor, stored on the instance, and never used — the summary budget was always computed from hardcoded module constants (_SUMMARY_RATIO=0.20, _MAX_SUMMARY_TOKENS=8000). This caused two compounding problems: 1. The config value was silently ignored, giving users no control over post-compression size. 2. Fixed budgets (20K tail, 8K summary cap) didn't scale with context window size. Switching from a 1M-context model to a 200K model would trigger compression that nuked 350K tokens of conversation history down to ~30K. Changes: - Replace summary_target_tokens with summary_target_ratio (default 0.40) which sets the post-compression target as a fraction of context_length. Tail token budget and summary cap now scale proportionally: MiniMax 200K → ~80K post-compression GPT-5 1M → ~400K post-compression - Change threshold_percent default: 0.50 → 0.80 (don't fire until 80% of context is consumed) - Change protect_last_n default: 4 → 20 (preserve ~10 full turns) - Summary token cap scales to 5% of context (was fixed 8K), capped at 32K ceiling - Read target_ratio and protect_last_n from config.yaml compression section (both are now configurable) - Remove hardcoded summary_target_tokens=500 from run_agent.py - Add 5 new tests for ratio scaling, clamping, and new defaults	2026-03-24 17:45:49 -07:00
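The ratio-based scaling described in commit 9231a335d4 above can be sketched as follows. The function names are hypothetical and this is not the project's implementation; the 0.40 ratio, the 5% summary fraction, and the 32K ceiling are the values stated in that commit message, as of that commit.

```python
MAX_SUMMARY_TOKENS = 32_000   # ceiling named in the commit message
SUMMARY_FRACTION = 0.05       # "5% of context", per the commit message


def post_compression_target(context_length: int,
                            summary_target_ratio: float = 0.40) -> int:
    """Post-compression size as a fraction of the context window (hypothetical helper)."""
    return round(context_length * summary_target_ratio)


def summary_cap(context_length: int) -> int:
    """Summary token cap: scales with context, clamped at the ceiling (hypothetical helper)."""
    return min(round(context_length * SUMMARY_FRACTION), MAX_SUMMARY_TOKENS)


for name, ctx in [("200K model", 200_000), ("1M model", 1_000_000)]:
    print(name, post_compression_target(ctx), summary_cap(ctx))
```

The two targets match the examples in the message (200K gives about 80K, 1M gives about 400K), and the 1M case shows the summary cap clamping at the ceiling.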
Teknium	7efaa5968d	Merge pull request #2891 from NousResearch/hermes/hermes-gateway-context fix(gateway): stop loading hermes repo AGENTS.md into gateway sessions (~10k wasted tokens)	2026-03-24 17:43:41 -07:00
Teknium	8ee4f32819	fix(gateway): use TERMINAL_CWD for context file discovery, not process cwd The gateway process runs from the hermes-agent install directory, so os.getcwd() picks up the repo's AGENTS.md (16k chars) and other dev context files — inflating input tokens by ~10k on every gateway message. Fix: use TERMINAL_CWD (which the gateway sets to MESSAGING_CWD or $HOME) as the cwd for build_context_files_prompt(). In CLI mode, TERMINAL_CWD is the user's actual project directory, so behavior is unchanged. Before: gateway 15-20k input tokens, CLI 6-8k After: gateway ~6-8k input tokens (same as CLI) Reported by keri on Discord.	2026-03-24 17:30:33 -07:00
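The change in commit 8ee4f32819 above amounts to picking the context-discovery directory from TERMINAL_CWD rather than from the process working directory. A minimal sketch, assuming a hypothetical helper name and assuming a fallback to os.getcwd() when the variable is unset (the commit message does not say what happens in that case):

```python
import os


def context_discovery_cwd(environ=None) -> str:
    """Directory to scan for context files (hypothetical helper).

    Prefers TERMINAL_CWD; the os.getcwd() fallback is an assumption.
    """
    env = os.environ if environ is None else environ
    return env.get("TERMINAL_CWD") or os.getcwd()


print(context_discovery_cwd({"TERMINAL_CWD": "/home/user/project"}))
```

Passing the environment as an argument keeps the sketch easy to test without mutating os.environ.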
Teknium	689344430c	chore: gitignore orphaned mini-swe-agent directory	2026-03-24 12:50:34 -07:00
Teknium	618f15dda9	fix: reorder setup wizard providers — OpenRouter first Move OpenRouter to position 1 in the setup wizard's provider list to match hermes model ordering. Update default selection index and fix test expectations for the new ordering. Setup order: OpenRouter → Nous Portal → Codex → Custom → ...	2026-03-24 12:50:24 -07:00
Teknium	481915587e	fix: update context pressure warnings and token estimates after compaction Reset context pressure warnings and update last_prompt_tokens and last_completion_tokens in the context compressor to prevent stale values from causing excessive warnings and re-triggering compression. This change ensures accurate pressure calculations following the compaction process.	2026-03-24 09:25:10 -07:00
Teknium	0b993c1e07	docs: quote pip install extras to fix zsh glob errors (#2815) zsh interprets square brackets as glob patterns, so `pip install hermes-agent[voice]` fails with 'no matches found'. Quote all pip install commands with extras across 5 docs pages (12 instances). Reported by OFumik0OP.	2026-03-24 09:25:01 -07:00
Teknium	9718334962	docs: fix api-server response storage — SQLite, not in-memory (#2819 ) * docs: update all docs for /model command overhaul and custom provider support Documents the full /model command overhaul across 6 files: AGENTS.md: - Add model_switch.py to project structure tree configuration.md: - Rewrite General Setup with 3 config methods (interactive, config.yaml, env vars) - Add new 'Switching Models with /model' section documenting all syntax variants - Add 'Named Custom Providers' section with config.yaml examples and custom:name:model triple syntax slash-commands.md: - Update /model descriptions in both CLI and messaging tables with full syntax examples (provider:model, custom:model, custom:name:model, bare custom auto-detect) cli-commands.md: - Add /model slash command subsection under hermes model with syntax table - Add custom endpoint config to hermes model use cases faq.md: - Add config.yaml example for offline/local model setup - Note that provider: custom is a first-class provider - Document /model custom auto-detect provider-runtime.md: - Add model_switch.py to implementation file list - Update provider families to show Custom as first-class with named variants * docs: fix api-server response storage description — SQLite, not in-memory The ResponseStore class uses SQLite persistence (with in-memory fallback), not pure in-memory storage. Responses survive gateway restarts.	2026-03-24 09:05:15 -07:00
Teknium	ebcb81b649	docs: document 9 previously undocumented features New documentation for features that existed in code but had no docs: New page: - context-references.md: Full docs for @-syntax inline context injection (@file:, @folder:, @diff, @staged, @git:, @url:) with line ranges, CLI autocomplete, size limits, sensitive path blocking, and error handling configuration.md additions: - Environment variable substitution: ${VAR_NAME} syntax in config.yaml with expansion, fallback, and multi-reference support - Gateway streaming: Progressive token delivery on messaging platforms via message editing (StreamingConfig: enabled, transport, edit_interval, buffer_threshold, cursor) with platform support matrix - Web search backends: Three providers (Firecrawl, Parallel, Tavily) with web.backend config key, capability matrix, auto-detection from API keys, self-hosted Firecrawl, and Parallel search modes security.md additions: - SSRF protection: Always-on URL validation blocking private networks, loopback, link-local, CGNAT, cloud metadata hostnames, with fail-closed DNS and redirect chain re-validation - Tirith pre-exec security scanning: Content-level command scanning for homograph URLs, pipe-to-interpreter, terminal injection with auto-install, SHA-256/cosign verification, config options, and fail-open/fail-closed modes sessions.md addition: - Auto-generated session titles: Background LLM-powered title generation after first exchange creating-skills.md additions: - Conditional skill activation: requires_toolsets, requires_tools, fallback_for_toolsets, fallback_for_tools frontmatter fields with matching logic and use cases - Environment variable requirements: required_environment_variables frontmatter for automatic env passthrough to sandboxed execution, plus terminal.env_passthrough user config	2026-03-24 08:56:21 -07:00
Teknium	ac5b8a478a	ci: add supply chain audit workflow for PR scanning (#2816 ) Scans every PR diff for patterns associated with supply chain attacks: CRITICAL (blocks merge): - .pth files (auto-execute on Python startup — litellm attack vector) - base64 decode + exec/eval combo (obfuscated payload execution) - subprocess with encoded/obfuscated commands WARNING (comment only, no block): - base64 encode/decode alone (legitimate uses: images, JWT, etc.) - exec/eval alone - Outbound POST/PUT requests - setup.py/sitecustomize.py/usercustomize.py changes - marshal.loads/pickle.loads/compile() Posts a detailed comment on the PR with matched lines and context. Excludes lockfiles (uv.lock, package-lock.json) from scanning. Motivated by the litellm 1.82.7/1.82.8 credential stealer attack (BerriAI/litellm#24512).	2026-03-24 08:56:04 -07:00
Teknium	624e4a8e7a	chore: regenerate uv.lock with hashes, use lockfile in setup (#2812) - Regenerate uv.lock with sha256 hashes for all 2965 package artifacts - Add python_version marker to yc-bench (requires >=3.12) - Update setup-hermes.sh to prefer 'uv sync --locked' for hash-verified installs, with fallback to 'uv pip install' when lockfile is stale This completes the supply chain hardening: pyproject.toml bounds the version ranges, and uv.lock pins exact versions with cryptographic hashes so tampered packages are rejected at install time.	2026-03-24 08:42:45 -07:00
Teknium	177e43259f	refactor: update mini_swe_runner to use Hermes built-in backends Replace all minisweagent imports with Hermes-Agent's own environment classes (LocalEnvironment, DockerEnvironment, ModalEnvironment). mini_swe_runner.py no longer has any dependency on mini-swe-agent. The runner now uses the same backends as the terminal tool, so Docker and Modal environments work out of the box without extra submodules. Tested: local and Docker backends verified working through the runner.	2026-03-24 08:27:15 -07:00
Teknium	c9b76057d4	chore: pin all dependency version ranges (supply chain hardening) (#2810 ) Adds upper-bound version pins (<next_major) to all dependencies in pyproject.toml — both core and optional. Previously most deps were unpinned or had only floor bounds, meaning fresh installs would pull whatever version was latest on PyPI. This limits blast radius from supply chain attacks like the litellm 1.82.7/1.82.8 credential stealer (BerriAI/litellm#24512). With bounded ranges, a compromised major version bump won't be pulled automatically. Floors are set to current known-good installed versions.	2026-03-24 08:25:17 -07:00
Teknium	745859babb	feat: env var passthrough for skills and user config (#2807 ) * feat: env var passthrough for skills and user config Skills that declare required_environment_variables now have those vars passed through to sandboxed execution environments (execute_code and terminal). Previously, execute_code stripped all vars containing KEY, TOKEN, SECRET, etc. and the terminal blocklist removed Hermes infrastructure vars — both blocked skill-declared env vars. Two passthrough sources: 1. Skill-scoped (automatic): when a skill is loaded via skill_view and declares required_environment_variables, vars that are present in the environment are registered in a session-scoped passthrough set. 2. Config-based (manual): terminal.env_passthrough in config.yaml lets users explicitly allowlist vars for non-skill use cases. Changes: - New module: tools/env_passthrough.py — shared passthrough registry - hermes_cli/config.py: add terminal.env_passthrough to DEFAULT_CONFIG - tools/skills_tool.py: register available skill env vars on load - tools/code_execution_tool.py: check passthrough before filtering - tools/environments/local.py: check passthrough in _sanitize_subprocess_env and _make_run_env - 19 new tests covering all layers * docs: add environment variable passthrough documentation Document the env var passthrough feature across four docs pages: - security.md: new 'Environment Variable Passthrough' section with full explanation, comparison table, and security considerations - code-execution.md: update security section, add passthrough subsection, fix comparison table - creating-skills.md: add tip about automatic sandbox passthrough - skills.md: add note about passthrough after secure setup docs Live-tested: launched interactive CLI, loaded a skill with required_environment_variables, verified TEST_SKILL_SECRET_KEY was accessible inside execute_code sandbox (value: passthrough-test-value-42).	2026-03-24 08:19:34 -07:00
Teknium	ad1bf16f28	chore: remove all remaining mini-swe-agent references Complete cleanup after dropping the mini-swe-agent submodule (PR #2804): - Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py - Remove mini-swe-agent health check from hermes doctor - Remove 'minisweagent' from logger suppression lists - Remove litellm/typer/platformdirs from requirements.txt - Remove mini-swe-agent install steps from install.ps1 (Windows) - Remove mini-swe-agent install steps from website docs - Update all stale comments/docstrings referencing mini-swe-agent in terminal_tool.py, tools/__init__.py, code_execution_tool.py, environments/README.md, environments/agent_loop.py - Remove mini_swe_runner from pyproject.toml py-modules (still exists as standalone script for RL training use) - Shrink test_minisweagent_path.py to empty stub The orphaned mini-swe-agent/ directory on disk needs manual removal: rm -rf mini-swe-agent/	2026-03-24 08:19:23 -07:00
Teknium	e2c81c6e2f	docs: add missing skills, CLI commands, and messaging env vars Complete the documentation gaps identified in the previous audit: Skills catalogs: - skills-catalog.md: Add 7 missing bundled skills — data-science/ jupyter-live-kernel, dogfood/hermes-agent-setup, inference-sh/ inference-sh-cli, mlops/huggingface-hub, productivity/linear, research/parallel-cli, social-media/xitter - optional-skills-catalog.md: Add 8 missing optional skills — blockchain/base, creative/blender-mcp, creative/meme-generation, mcp/fastmcp, productivity/telephony, research/bioinformatics, security/oss-forensics, security/sherlock CLI commands reference: - cli-commands.md: Add full documentation for hermes mcp (add/remove/ list/test/configure) and hermes plugins (install/update/remove/list) Messaging platform docs: - discord.md: Add DISCORD_REQUIRE_MENTION and DISCORD_FREE_RESPONSE_CHANNELS to manual config env vars section - signal.md: Add SIGNAL_ALLOW_ALL_USERS to env var reference table - slack.md: Add SLACK_HOME_CHANNEL_NAME to config section	2026-03-24 08:12:37 -07:00
Teknium	677b11d84c	fix: reject relative cwd paths for container terminal backends When TERMINAL_CWD is set to '.' or any relative path (common when the CLI config defaults to cwd='.'), container backends (docker, modal, singularity, daytona) would pass it directly to the container where it's meaningless. This caused 'docker run -d -w .' to fail. Now relative paths are caught alongside host paths and replaced with the default '/root' for container backends.	2026-03-24 08:03:14 -07:00
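The relative-path rule from commit 677b11d84c above can be sketched like this. The function and constant names are hypothetical; the backend list and the '/root' default are taken from the commit message. The sketch covers only the relative-path case, not the host-path detection the message also mentions.

```python
import posixpath

# Backends and default taken from the commit message; names are hypothetical.
CONTAINER_BACKENDS = {"docker", "modal", "singularity", "daytona"}
DEFAULT_CONTAINER_CWD = "/root"


def resolve_cwd(backend: str, cwd: str) -> str:
    """Replace a relative cwd with the container default for container backends."""
    # A relative path such as '.' has no meaning inside the container.
    if backend in CONTAINER_BACKENDS and not posixpath.isabs(cwd):
        return DEFAULT_CONTAINER_CWD
    return cwd


print(resolve_cwd("docker", "."))           # relative: replaced
print(resolve_cwd("docker", "/workspace"))  # absolute: kept
print(resolve_cwd("local", "."))            # non-container backend: kept
```

posixpath is used rather than os.path so the check follows container (POSIX) path rules regardless of the host operating system.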
Teknium	ee3f3e756d	docs: fix stale and incorrect documentation across 18 files Cross-referenced all 84 docs pages against the actual codebase and corrected every discrepancy found. Reference docs: - faq.md: Fix non-existent commands (/stats→/usage, /context→/usage, hermes models→hermes model, hermes config get→hermes config show, hermes gateway logs→cat gateway.log, async→sync chat() call) - cli-commands.md: Fix --provider choices list (remove providers not in argparse), add undocumented -s/--skills flag - slash-commands.md: Add missing /queue and /resume commands, fix /approve args_hint to show [session\|always] - tools-reference.md: Remove duplicate vision and web toolset sections - environment-variables.md: Fix HERMES_INFERENCE_PROVIDER list (add copilot-acp, remove alibaba to match actual argparse choices) Configuration & user guide: - configuration.md: Fix approval_mode→approvals.mode (manual not ask), checkpoints.enabled default true not false, human_delay defaults (500/2000→800/2500), remove non-existent delegation.max_iterations and delegation.default_toolsets, fix website_blocklist nesting under security:, add .hermes.md and CLAUDE.md to context files table with priority system explanation - security.md: Fix website_blocklist nesting under security: - context-files.md: Add .hermes.md/HERMES.md and CLAUDE.md support, document priority-based first-match-wins loading behavior - cli.md: Fix personalities config nesting (top-level, not under agent:) - delegation.md: Fix model override docs (config-level, not per-call tool parameter) - rl-training.md: Fix log directory (tinker-atropos/logs/→ ~/.hermes/logs/rl_training/) - tts.md: Fix Discord delivery format (voice bubble with fallback, not just file attachment) - git-worktrees.md: Remove outdated v0.2.0 version reference Developer guide: - prompt-assembly.md: Add .hermes.md, CLAUDE.md, document priority system for context files - agent-loop.md: Fix callback list (remove non-existent message_callback, add stream_delta_callback, tool_gen_callback, status_callback) Messaging & guides: - webhooks.md: Fix command (hermes setup gateway→hermes gateway setup) - tips.md: Fix session idle timeout (120min→24h), config file (gateway.json→config.yaml) - build-a-hermes-plugin.md: Fix plugin.yaml provides: format (provides_tools/provides_hooks as lists), note register_command() as not yet implemented	2026-03-24 07:53:07 -07:00
Teknium	02b38b93cb	refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (#2804 ) Drop the mini-swe-agent git submodule. All terminal backends now use hermes-agent's own environment implementations directly. Docker backend: - Inline the `docker run -d` container startup (was 15 lines in minisweagent's DockerEnvironment). Our wrapper already handled execute(), cleanup(), security hardening, volumes, and resource limits. Modal backend: - Import swe-rex's ModalDeployment directly instead of going through minisweagent's 90-line passthrough wrapper. - Bake the _AsyncWorker pattern (from environments/patches.py) directly into ModalEnvironment for Atropos compatibility without monkey-patching. Cleanup: - Remove minisweagent_path.py (submodule path resolution helper) - Remove submodule init/install from install.sh and setup-hermes.sh - Remove mini-swe-agent from .gitmodules - environments/patches.py is now a no-op (kept for backward compat) - terminal_tool.py no longer does sys.path hacking for minisweagent - mini_swe_runner.py guards imports (optional, for RL training only) - Update all affected tests to mock the new direct subprocess calls - Update README.md, CONTRIBUTING.md No functionality change — all Docker, Modal, local, SSH, Singularity, and Daytona backends behave identically. 6093 tests pass.	2026-03-24 07:30:25 -07:00
Teknium	2233f764af	fix(tools): handle 402 insufficient credits error in vision tool (#2802) Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>	2026-03-24 07:23:07 -07:00
Teknium	98b5570961	fix: make browser command timeout configurable via config.yaml (#2801 ) browser_vision and other browser commands had a hardcoded 30-second subprocess timeout that couldn't be overridden. Users with slower machines (local Chromium without GPU) would hit timeouts on screenshot capture even when setting browser.command_timeout in config.yaml, because nothing read that value. Changes: - Add browser.command_timeout to DEFAULT_CONFIG (default: 30s) - Add _get_command_timeout() helper that reads config, falls back to 30s - _run_browser_command() now defaults to config value instead of constant - browser_vision screenshot no longer hardcodes timeout=30 - browser_navigate uses max(config_timeout, 60) as floor for navigation Reported by Gamer1988.	2026-03-24 07:21:50 -07:00
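The timeout lookup described in commit 98b5570961 above can be sketched as follows. This is a simplified illustration, not the project's `_get_command_timeout()`: the config is assumed to be an already-parsed dictionary, the function names are hypothetical, and the handling of invalid values is an assumption. The 30-second default and the 60-second navigation floor come from the commit message.

```python
DEFAULT_COMMAND_TIMEOUT = 30  # seconds; default stated in the commit message


def get_command_timeout(config: dict) -> int:
    """Read browser.command_timeout, falling back to 30s (hypothetical helper)."""
    browser = config.get("browser") or {}
    value = browser.get("command_timeout", DEFAULT_COMMAND_TIMEOUT)
    try:
        timeout = int(value)
    except (TypeError, ValueError):
        # Assumption: unparseable values fall back to the default.
        return DEFAULT_COMMAND_TIMEOUT
    return timeout if timeout > 0 else DEFAULT_COMMAND_TIMEOUT


def navigation_timeout(config: dict) -> int:
    """Navigation uses the configured timeout with a 60-second floor."""
    return max(get_command_timeout(config), 60)


print(get_command_timeout({}))
print(navigation_timeout({"browser": {"command_timeout": 90}}))
```

With an empty config the command timeout stays at 30 seconds while navigation still gets 60; a configured value above 60 applies to both.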
Teknium	773d3bb4df	docs: update all docs for /model command overhaul and custom provider support Documents the full /model command overhaul across 6 files: AGENTS.md: - Add model_switch.py to project structure tree configuration.md: - Rewrite General Setup with 3 config methods (interactive, config.yaml, env vars) - Add new 'Switching Models with /model' section documenting all syntax variants - Add 'Named Custom Providers' section with config.yaml examples and custom:name:model triple syntax slash-commands.md: - Update /model descriptions in both CLI and messaging tables with full syntax examples (provider:model, custom:model, custom:name:model, bare custom auto-detect) cli-commands.md: - Add /model slash command subsection under hermes model with syntax table - Add custom endpoint config to hermes model use cases faq.md: - Add config.yaml example for offline/local model setup - Note that provider: custom is a first-class provider - Document /model custom auto-detect provider-runtime.md: - Add model_switch.py to implementation file list - Update provider families to show Custom as first-class with named variants	2026-03-24 07:19:26 -07:00
Teknium	a312ee7b4c	fix(agent): ensure first delta is fired during reasoning updates - Added calls to `_fire_first_delta()` in the `AIAgent` class to ensure that the first delta is triggered for both reasoning and thinking updates. This change improves the handling of delta events during streaming, enhancing the responsiveness of the agent's reasoning capabilities.	2026-03-24 07:16:20 -07:00


2654 Commits