* fix(security): add SSRF protection to browser_navigate
browser_navigate() only checked the website blocklist policy but did
not call is_safe_url() to block private/internal addresses. This
allowed the agent to navigate to localhost, cloud metadata endpoints
(169.254.169.254), and private network IPs via the browser.
web_tools and vision_tools already had this check. Added the same
is_safe_url() pre-flight validation before the blocklist check in
browser_navigate().
* fix: move SSRF import to module level, fix policy test mock
Move is_safe_url import to module level so it can be monkeypatched
in tests. Update test_browser_navigate_returns_policy_block to mock
_is_safe_url so the SSRF check passes and the policy check is reached.
* fix(security): harden browser SSRF protection
Follow-up to cherry-picked PR #3041:
1. Fail-closed fallback: if url_safety module can't import, block all
URLs instead of allowing all. Security guards should never fail-open.
2. Post-redirect SSRF check: after navigation, verify the final URL
isn't a private/internal address. If a public URL redirected to
169.254.169.254 or localhost, navigate to about:blank and return
an error — prevents the model from reading internal content via
subsequent browser_snapshot calls.
---------
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
When a background task (/bg command) prints its output while the main agent
is processing with the thinking spinner visible, the status bar could render
on the same row as the spinner, causing visual overlap.
This fix adds an explicit app.invalidate() call with a brief pause before
printing background task output, ensuring the TUI layout is in a consistent
state before the output is written.
Changes:
- Add TUI refresh before success output in _handle_background_command
- Add TUI refresh before error output in the exception handler
- Add tests for the refresh behavior
Closes#2718
Co-authored-by: Bartok9 <bartokmagic@proton.me>
After streaming retries are exhausted on transient errors, fall back to
non-streaming instead of propagating the error. Also fall back for any
other pre-delivery stream error (not just 'streaming not supported').
Added user-facing message when streaming is not supported by a model/
provider, directing users to set display.streaming: false in config.yaml
to avoid the fallback delay.
Cherry-picked from PR #3008 by kshitijk4poor. Added UX message for
streaming-not-supported detection.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
* feat: nix flake, uv2nix build, dev shell and home manager
* fixed nix run, updated docs for setup
* feat(nix): NixOS module with persistent container mode, managed guards, checks
- Replace homeModules.nix with nixosModules.nix (two deployment modes)
- Mode A (native): hardened systemd service with ProtectSystem=strict
- Mode B (container): persistent Ubuntu container with /nix/store bind-mount,
identity-hash-based recreation, GC root protection, symlink-based updates
- Add HERMES_MANAGED guards blocking CLI config mutation (config set, setup,
gateway install/uninstall) when running under NixOS module
- Add nix/checks.nix with build-time verification (binary, CLI, managed guard)
- Remove container.nix (no Nix-built OCI image; pulls ubuntu:24.04 at runtime)
- Simplify packages.nix (drop fetchFromGitHub submodules, PYTHONPATH wrappers)
- Rewrite docs/nixos-setup.md with full options reference, container
architecture, secrets management, and troubleshooting guide
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update config.py
* feat(nix): add CI workflow and enhanced build checks
- GitHub Actions workflow for nix flake check + build on linux/macOS
- Entry point sync check to catch pyproject.toml drift
- Expanded managed-guard check to cover config edit
- Wrap hermes-acp binary in Nix package
- Fix Path type mismatch in is_managed()
* Update MCP server package name; bundled skills support
* fix reading .env. instead have container user a common mounted .env file
* feat(nix): container entrypoint with privilege drop and sudo provisioning
Container was running as non-root via --user, which broke apt/pip installs
and caused crashes when $HOME didn't exist. Replace --user with a Nix-built
entrypoint script that provisions the hermes user, sudo (NOPASSWD), and
/home/hermes inside the container on first boot, then drops privileges via
setpriv. Writable layer persists so setup only runs once.
Also expands MCP server options to support HTTP transport and sampling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix group and user creation in container mode
* feat(nix): persistent /home/hermes and MESSAGING_CWD in container mode
Container mode now bind-mounts ${stateDir}/home to /home/hermes so the
agent's home directory survives container recreation. Previously it lived
in the writable layer and was lost on image/volume/options changes.
Also passes MESSAGING_CWD to the container so the agent finds its
workspace and documents, matching native mode behavior.
Other changes:
- Extract containerDataDir/containerHomeDir bindings (no more magic strings)
- Fix entrypoint chown to run unconditionally (volume mounts always exist)
- Add schema field to container identity hash for auto-recreation
- Add idempotency test (Scenario G) to config-roundtrip check
* docs: add Nix & NixOS setup guide to docs site
Add comprehensive Nix documentation to the Docusaurus site at
website/docs/getting-started/nix-setup.md, covering nix run/profile
install, NixOS module (native + container modes), declarative settings,
secrets management, MCP servers, managed mode, container architecture,
dev shell, flake checks, and full options reference.
- Register nix-setup in sidebar after installation page
- Add Nix callout tip to installation.md linking to new guide
- Add canonical version pointer in docs/nixos-setup.md
* docs: remove docs/nixos-setup.md, consolidate into website docs
Backfill missing details (restart/restartSec in full example,
gateway.pid, 0750 permissions, docker inspect commands) into
the canonical website/docs/getting-started/nix-setup.md and
delete the old standalone file.
* fix(nix): add compression.protect_last_n and target_ratio to config-keys.json
New keys were added to DEFAULT_CONFIG on main, causing the
config-drift check to fail in CI.
* fix(nix): skip checks on aarch64-darwin (onnxruntime wheel missing)
The full Python venv includes onnxruntime (via faster-whisper/STT)
which lacks a compatible uv2nix wheel on aarch64-darwin. Gate all
checks behind stdenv.hostPlatform.isLinux. The package and devShell
still evaluate on macOS.
* fix(nix): skip flake check and build on macOS CI
onnxruntime (transitive dep via faster-whisper) lacks a compatible
uv2nix wheel on aarch64-darwin. Run full checks and build on Linux
only; macOS CI verifies the flake evaluates without building.
* fix(nix): preserve container writable layer across nixos-rebuild
The container identity hash included the entrypoint's Nix store path,
which changes on every nixpkgs update (due to runtimeShell/stdenv
input-addressing). This caused false-positive identity mismatches,
triggering container recreation and losing the persistent writable layer.
- Use stable symlink (current-entrypoint) like current-package already does
- Remove entrypoint from identity hash (only image/volumes/options matter)
- Add GC root for entrypoint so nix-collect-garbage doesn't break it
- Remove global HERMES_HOME env var from addToSystemPackages (conflicted
with interactive CLI use, service already sets its own)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three improvements to reasoning/thinking display in the CLI:
1. Buffer tiny reasoning chunks: providers like DeepSeek stream reasoning
one word at a time, producing a separate [thinking] line per token.
Add a buffer that coalesces chunks and flushes at natural boundaries
(newlines, sentence endings, terminal width).
2. Fix duplicate reasoning display: centralize callback selection into
_current_reasoning_callback() — one place instead of 4 scattered
inline ternaries. Prevents both the streaming box AND the preview
callback from firing simultaneously.
3. Fix post-response reasoning box guard: change the check from
'not self._stream_started' to 'not self._reasoning_stream_started'
so the final reasoning box is only suppressed when reasoning was
actually streamed live, not when any text was streamed.
Cherry-picked from PR #2781 by juanfradb.
- Add 'prompt exceeds max length' to context overflow detection for
Z.AI/GLM 400 errors
- Extract inline reasoning blocks from assistant content as fallback
when no structured reasoning fields are present
- Guard inline extraction so structured API reasoning takes priority
- Update test for reasoning-only response salvage behavior
Cherry-picked from PR #2993 by kshitijk4poor. Added priority guard
to fix test_structured_reasoning_takes_priority failure.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Cron was the only execution path that never called end_session(),
leaving ended_at = NULL permanently. This made cron sessions invisible
to hermes prune --older-than and indistinguishable from active sessions.
Captures session_id in a local variable before agent construction so
it's available in the finally block even if AIAgent() fails, then calls
end_session(session_id, 'cron_complete') before close().
Cherry-picked from PR #2979 by ygd58. Fixed bug: original PR called
end_session() with zero arguments (TypeError — method requires
session_id and end_reason).
Fixes#2972.
Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
* fix(skills): use Git Trees API to prevent silent subdirectory loss during install
Refactors _download_directory() to use the Git Trees API (single call
for the entire repo tree) as the primary path, falling back to the
recursive Contents API when the tree endpoint is unavailable or
truncated. Prevents silent subdirectory loss caused by per-directory
rate limiting or transient failures.
Cherry-picked from PR #2981 by tugrulguner.
Fixes#2940.
* fix: simplify tree API — use branch name directly as tree-ish
Eliminates an extra git/ref/heads API call by passing the branch name
directly to git/trees/{branch}?recursive=1, matching the pattern
already used by _find_skill_in_repo_tree.
---------
Co-authored-by: tugrulguner <tugrulguner@users.noreply.github.com>
* fix(run_agent): ensure _fire_first_delta() is called for tool generation events
Added calls to _fire_first_delta() in the AIAgent class to improve the handling of tool generation events, ensuring timely notifications during the processing of function calls and tool usage.
* fix(run_agent): improve timeout handling for chat completions
Enhanced the timeout configuration for chat completions in the AIAgent class by introducing customizable connection, read, and write timeouts using environment variables. This ensures more robust handling of API requests during streaming operations.
* fix(run_agent): reduce default stream read timeout for chat completions
Updated the default stream read timeout from 120 seconds to 60 seconds in the AIAgent class, enhancing the timeout configuration for chat completions. This change aims to improve responsiveness during streaming operations.
* fix(run_agent): enhance streaming error handling and retry logic
Improved the error handling and retry mechanism for streaming requests in the AIAgent class. Introduced a configurable maximum number of stream retries and refined the handling of transient network errors, allowing for retries with fresh connections. Non-transient errors now trigger a fallback to non-streaming only when appropriate, ensuring better resilience during API interactions.
* fix(api_server): streaming breaks when agent makes tool calls
The agent fires stream_delta_callback(None) to signal the CLI display
to close its response box before tool execution begins. The API server's
_on_delta callback was forwarding this None directly into the SSE queue,
where the SSE writer treats it as end-of-stream and terminates the HTTP
response prematurely.
After tool calls complete, the agent streams the final answer through
the same callback, but the SSE response was already closed — so Open
WebUI (and similar frontends) never received the actual answer.
Fix: filter out None in _on_delta so the SSE stream stays open. The SSE
loop already detects completion via agent_task.done(), which handles
stream termination correctly without needing the None sentinel.
Reported by Rohit Paul on X.
feat: persist reasoning across gateway session turns (schema v6)
Tested against OpenAI Codex (direct), Anthropic (direct + OAI-compat), and OpenRouter → 6 backends. All reasoning field types (reasoning, reasoning_details, codex_reasoning_items) round-trip through the DB correctly.
* fix(run_agent): ensure _fire_first_delta() is called for tool generation events
Added calls to _fire_first_delta() in the AIAgent class to improve the handling of tool generation events, ensuring timely notifications during the processing of function calls and tool usage.
* fix(run_agent): improve timeout handling for chat completions
Enhanced the timeout configuration for chat completions in the AIAgent class by introducing customizable connection, read, and write timeouts using environment variables. This ensures more robust handling of API requests during streaming operations.
* fix(run_agent): reduce default stream read timeout for chat completions
Updated the default stream read timeout from 120 seconds to 60 seconds in the AIAgent class, enhancing the timeout configuration for chat completions. This change aims to improve responsiveness during streaming operations.
* fix(run_agent): enhance streaming error handling and retry logic
Improved the error handling and retry mechanism for streaming requests in the AIAgent class. Introduced a configurable maximum number of stream retries and refined the handling of transient network errors, allowing for retries with fresh connections. Non-transient errors now trigger a fallback to non-streaming only when appropriate, ensuring better resilience during API interactions.
* fix: skills-sh install fails for deeply nested repo structures
Skills in repos with deep directory nesting (e.g.
cli-tool/components/skills/development/senior-backend/) could not be
installed because the candidate path generation and shallow root-dir
scan never reached them.
Added GitHubSource._find_skill_in_repo_tree() which uses the GitHub
Trees API to recursively search the entire repo tree in a single API
call. This is used as a final fallback in
SkillsShSource._discover_identifier() when the standard candidate
paths and shallow scan both fail.
Fixes installation of skills from repos like davila7/claude-code-templates
where skills are nested 4+ levels deep.
Reported by user Samuraixheart.
Add reply_to_mode setting (off/first/all) to control whether Telegram
replies quote/thread to the user's original message.
- 'off': Never thread replies (no quote bubble)
- 'first': Only first chunk threads to user's message (default, preserves existing behavior)
- 'all': All chunks in multi-part replies thread to user's message
Configurable via:
- reply_to_mode in platform config (gateway config YAML)
- TELEGRAM_REPLY_TO_MODE env var
Based on PR #855 by raulvidis.
* feat(migration): comprehensive OpenClaw -> Hermes migration v2
Extends the existing migration script from ~15% to ~95% coverage of
OpenClaw's configuration surface. Adds 17 new migration modules:
Direct migrations (written to config.yaml/.env):
- MCP servers: full server definitions with transport, tools, sampling
- Agent defaults: reasoning_effort, compression, human_delay, timezone
- Session config: reset triggers (daily/idle) -> session_reset
- Full model providers: custom_providers with base_url/api_mode
- Deep channel config: Matrix, Mattermost, IRC, Discord deep settings
- Browser config: timeout settings
- Tools config: exec timeout -> terminal.timeout
- Approvals: mode mapping (smart/manual/auto -> Hermes equivalents)
Archived for manual review (no direct Hermes equivalent):
- Plugins config + installed extensions
- Cron jobs (with note to use 'hermes cron')
- Hooks/webhooks config
- Multi-agent list + routing bindings
- Gateway config (port, auth, TLS)
- Memory backend config (QMD, vector search)
- Skills registry per-entry config
- UI/identity settings
- Logging/diagnostics preferences
Also adds:
- MIGRATION_NOTES.md generation with PM2 reassurance message
- _set_env_var helper for consistent env file management
- Updated presets to include all new options
- Comprehensive mock test passing (12 migrated, 12 archived)
* feat(migration): add terminal recap with visual summary
Replaces raw JSON dump with a formatted box showing migrated/archived/
skipped/conflict/error counts, detailed item lists with labels, PM2
reassurance message, and actionable next steps. JSON output available
via MIGRATION_JSON_OUTPUT=1 env var.
* fix(test): allowlist python_os_environ as known false-positive in skills guard test
MIGRATION_JSON_OUTPUT env var is a legitimate CLI feature flag that enables
JSON output mode, not an env dump. Add it alongside agent_config_mod as an
accepted finding in test_skill_installs_cleanly_under_skills_guard.
* fix(test): add hermes_config_mod to known false-positives in skills guard test
The scanner flags two print statements that tell the user to *review*
~/.hermes/config.yaml in the post-migration summary. The script never
writes to that file — those are informational strings, not config mutations.
---------
Co-authored-by: Hermes <hermes@nousresearch.ai>
- threshold: 0.80 → 0.50 (compress at 50%, not 80%)
- target_ratio: 0.40 → 0.20, now relative to threshold not total context
(20% of 50% = 10% of context as tail budget)
- summary ceiling: 32K → 12K (Gemini can't output more than ~12K)
- Updated DEFAULT_CONFIG, config display, example config, and tests
The summary_target_tokens parameter was accepted in the constructor,
stored on the instance, and never used — the summary budget was always
computed from hardcoded module constants (_SUMMARY_RATIO=0.20,
_MAX_SUMMARY_TOKENS=8000). This caused two compounding problems:
1. The config value was silently ignored, giving users no control
over post-compression size.
2. Fixed budgets (20K tail, 8K summary cap) didn't scale with
context window size. Switching from a 1M-context model to a
200K model would trigger compression that nuked 350K tokens
of conversation history down to ~30K.
Changes:
- Replace summary_target_tokens with summary_target_ratio (default 0.40)
which sets the post-compression target as a fraction of context_length.
Tail token budget and summary cap now scale proportionally:
MiniMax 200K → ~80K post-compression
GPT-5 1M → ~400K post-compression
- Change threshold_percent default: 0.50 → 0.80 (don't fire until
80% of context is consumed)
- Change protect_last_n default: 4 → 20 (preserve ~10 full turns)
- Summary token cap scales to 5% of context (was fixed 8K), capped
at 32K ceiling
- Read target_ratio and protect_last_n from config.yaml compression
section (both are now configurable)
- Remove hardcoded summary_target_tokens=500 from run_agent.py
- Add 5 new tests for ratio scaling, clamping, and new defaults
Move OpenRouter to position 1 in the setup wizard's provider list
to match hermes model ordering. Update default selection index and
fix test expectations for the new ordering.
Setup order: OpenRouter → Nous Portal → Codex → Custom → ...
* feat: env var passthrough for skills and user config
Skills that declare required_environment_variables now have those vars
passed through to sandboxed execution environments (execute_code and
terminal). Previously, execute_code stripped all vars containing KEY,
TOKEN, SECRET, etc. and the terminal blocklist removed Hermes
infrastructure vars — both blocked skill-declared env vars.
Two passthrough sources:
1. Skill-scoped (automatic): when a skill is loaded via skill_view and
declares required_environment_variables, vars that are present in
the environment are registered in a session-scoped passthrough set.
2. Config-based (manual): terminal.env_passthrough in config.yaml lets
users explicitly allowlist vars for non-skill use cases.
Changes:
- New module: tools/env_passthrough.py — shared passthrough registry
- hermes_cli/config.py: add terminal.env_passthrough to DEFAULT_CONFIG
- tools/skills_tool.py: register available skill env vars on load
- tools/code_execution_tool.py: check passthrough before filtering
- tools/environments/local.py: check passthrough in _sanitize_subprocess_env
and _make_run_env
- 19 new tests covering all layers
* docs: add environment variable passthrough documentation
Document the env var passthrough feature across four docs pages:
- security.md: new 'Environment Variable Passthrough' section with
full explanation, comparison table, and security considerations
- code-execution.md: update security section, add passthrough subsection,
fix comparison table
- creating-skills.md: add tip about automatic sandbox passthrough
- skills.md: add note about passthrough after secure setup docs
Live-tested: launched interactive CLI, loaded a skill with
required_environment_variables, verified TEST_SKILL_SECRET_KEY was
accessible inside execute_code sandbox (value: passthrough-test-value-42).
Complete cleanup after dropping the mini-swe-agent submodule (PR #2804):
- Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var
settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py
- Remove mini-swe-agent health check from hermes doctor
- Remove 'minisweagent' from logger suppression lists
- Remove litellm/typer/platformdirs from requirements.txt
- Remove mini-swe-agent install steps from install.ps1 (Windows)
- Remove mini-swe-agent install steps from website docs
- Update all stale comments/docstrings referencing mini-swe-agent
in terminal_tool.py, tools/__init__.py, code_execution_tool.py,
environments/README.md, environments/agent_loop.py
- Remove mini_swe_runner from pyproject.toml py-modules
(still exists as standalone script for RL training use)
- Shrink test_minisweagent_path.py to empty stub
The orphaned mini-swe-agent/ directory on disk needs manual removal:
rm -rf mini-swe-agent/
Drop the mini-swe-agent git submodule. All terminal backends now use
hermes-agent's own environment implementations directly.
Docker backend:
- Inline the `docker run -d` container startup (was 15 lines in
minisweagent's DockerEnvironment). Our wrapper already handled
execute(), cleanup(), security hardening, volumes, and resource limits.
Modal backend:
- Import swe-rex's ModalDeployment directly instead of going through
minisweagent's 90-line passthrough wrapper.
- Bake the _AsyncWorker pattern (from environments/patches.py) directly
into ModalEnvironment for Atropos compatibility without monkey-patching.
Cleanup:
- Remove minisweagent_path.py (submodule path resolution helper)
- Remove submodule init/install from install.sh and setup-hermes.sh
- Remove mini-swe-agent from .gitmodules
- environments/patches.py is now a no-op (kept for backward compat)
- terminal_tool.py no longer does sys.path hacking for minisweagent
- mini_swe_runner.py guards imports (optional, for RL training only)
- Update all affected tests to mock the new direct subprocess calls
- Update README.md, CONTRIBUTING.md
No functionality change — all Docker, Modal, local, SSH, Singularity,
and Daytona backends behave identically. 6093 tests pass.
Fixes#2492.
`generate_systemd_unit()` and `get_python_path()` hardcoded `venv`
as the virtualenv directory name. When the virtualenv is `.venv`
(which `setup-hermes.sh` and `.gitignore` both reference), the
generated systemd unit had incorrect VIRTUAL_ENV and PATH variables.
Introduce `_detect_venv_dir()` which:
1. Checks `sys.prefix` vs `sys.base_prefix` to detect the active venv
2. Falls back to probing `.venv` then `venv` under PROJECT_ROOT
Both `get_python_path()` and `generate_systemd_unit()` now use
this detection instead of hardcoded paths.
Co-authored-by: Hermes <hermes@nousresearch.ai>
* feat(model): persist base_url on /model switch, auto-detect for bare /model custom
Phase 2+3 of the /model command overhaul:
Phase 2 — Persist base_url on model switch:
- CLI: save model.base_url when switching to a non-OpenRouter endpoint;
clear it when switching away from custom to prevent stale URLs
leaking into the new provider's resolution
- Gateway: same logic using direct YAML write
Phase 3 — Better feedback and edge cases:
- Bare '/model custom' now auto-detects the model from the endpoint
using _auto_detect_local_model() and saves all three config values
(model, provider, base_url) atomically
- Shows endpoint URL in success messages when switching to/from
custom providers (both CLI and gateway)
- Clear error messages when no custom endpoint is configured
- Updated test assertions for the additional save_config_value call
Fixes#2562 (Phase 2+3)
* feat(model): support custom:name:model triple syntax for named custom providers
Phase 5 of the /model command overhaul.
Extends parse_model_input() to handle the triple syntax:
/model custom:local-server:qwen → provider='custom:local-server', model='qwen'
/model custom:my-model → provider='custom', model='my-model' (unchanged)
The 'custom:local-server' provider string is already supported by
_get_named_custom_provider() in runtime_provider.py, which matches
it against the custom_providers list in config.yaml. This just wires
the parsing so users can do it from the /model slash command.
Added 4 tests covering single, triple, whitespace, and empty model cases.
resolve_provider('custom') was silently returning 'openrouter', causing
users who set provider: custom in config.yaml to unknowingly route
through OpenRouter instead of their local/custom endpoint. The display
showed 'via openrouter' even when the user explicitly chose custom.
Changes:
- auth.py: Split the conditional so 'custom' returns 'custom' as-is
- runtime_provider.py: _resolve_named_custom_runtime now returns
provider='custom' instead of 'openrouter'
- runtime_provider.py: _resolve_openrouter_runtime returns
provider='custom' when that was explicitly requested
- Add 'no-key-required' placeholder for keyless local servers
- Update existing test + add 5 new tests covering the fix
Fixes#2562
On macOS with Homebrew (Apple Silicon), Node.js and agent-browser
binaries live under /opt/homebrew/bin/ which is not included in the
_SANE_PATH fallback used by browser_tool.py and environments/local.py.
When Hermes runs with a filtered PATH (e.g. as a systemd service),
these binaries are invisible, causing 'env: node: No such file or
directory' errors when using browser tools.
Changes:
- Add /opt/homebrew/bin and /opt/homebrew/sbin to _SANE_PATH in both
browser_tool.py and environments/local.py
- Add _discover_homebrew_node_dirs() to find versioned Node installs
(e.g. brew install node@24) that aren't linked into /opt/homebrew/bin
- Extend _find_agent_browser() to search Homebrew and Hermes-managed
dirs when agent-browser isn't on the current PATH
- Include discovered Homebrew node dirs in subprocess PATH when
launching agent-browser
- Add 11 new tests covering all Homebrew path discovery logic
The gateway memory flush agent reviews old conversation history on session
reset/expiry and writes to memory. It had no awareness of memory changes
made after that conversation ended (by the live agent, cron jobs, or other
sessions), causing silent overwrites of newer entries.
Two fixes:
1. Skip memory flush entirely for cron sessions (session IDs starting with
'cron_'). Cron sessions are headless with no meaningful user conversation
to extract memories from.
2. Inject the current live memory state (MEMORY.md + USER.md) directly into
the flush prompt. The flush agent can now see what's already saved and
make informed decisions — only adding genuinely new information rather
than blindly overwriting entries that may have been updated since the
conversation ended.
Addresses the root cause identified in #2670: the flush agent was making
memory decisions blind to the current state of memory, causing stale
context to overwrite newer entries on gateway restarts and session resets.
Co-authored-by: devorun <devorun@users.noreply.github.com>
Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>
* feat(config): support ${ENV_VAR} substitution in config.yaml
* fix: extend env var expansion to CLI and gateway config loaders
The original PR (#2680) only wired _expand_env_vars into load_config(),
which is used by 'hermes tools' and 'hermes setup'. The two primary
config paths were missed:
- load_cli_config() in cli.py (interactive CLI)
- Module-level _cfg in gateway/run.py (gateway — bridges api_keys to env vars)
Also:
- Remove redundant 'import re' (already imported at module level)
- Add missing blank lines between top-level functions (PEP 8)
- Add tests for load_cli_config() expansion
---------
Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>
* fix(security): add SSRF protection to vision_tools and web_tools
Both vision_analyze and web_extract/web_crawl accept arbitrary URLs
without checking if they target private/internal network addresses.
A prompt-injected or malicious skill could use this to access cloud
metadata endpoints (169.254.169.254), localhost services, or private
network hosts.
Adds a shared url_safety.is_safe_url() that resolves hostnames and
blocks private, loopback, link-local, and reserved IP ranges. Also
blocks known internal hostnames (metadata.google.internal).
Integrated at the URL validation layer in vision_tools and before
each website_policy check in web_tools (extract, crawl).
* test(vision): update localhost test to reflect SSRF protection
The existing test_valid_url_with_port asserted localhost URLs pass
validation. With SSRF protection, localhost is now correctly blocked.
Update the test to verify the block, and add a separate test for
valid URLs with ports using a public hostname.
* fix(security): harden SSRF protection — fail-closed, CGNAT, multicast, redirect guard
Follow-up hardening on top of dieutx's SSRF protection (PR #2630):
- Change fail-open to fail-closed: DNS errors and unexpected exceptions
now block the request instead of allowing it (OWASP best practice)
- Block CGNAT range (100.64.0.0/10): Python's ipaddress.is_private
does NOT cover this range (returns False for both is_private and
is_global). Used by Tailscale/WireGuard and carrier infrastructure.
- Add is_multicast and is_unspecified checks: multicast (224.0.0.0/4)
and unspecified (0.0.0.0) addresses were not caught by the original
four-check chain
- Add redirect guard for vision_tools: httpx event hook re-validates
each redirect target against SSRF checks, preventing the classic
redirect-based SSRF bypass (302 to internal IP)
- Move SSRF filtering before backend dispatch in web_extract: now
covers Parallel and Tavily backends, not just Firecrawl
- Extract _is_blocked_ip() helper for cleaner IP range checking
- Add 24 new tests (CGNAT, multicast, IPv4-mapped IPv6, fail-closed
behavior, parametrized blocked/allowed IP lists)
- Fix existing tests to mock DNS resolution for test hostnames
---------
Co-authored-by: dieutx <dangtc94@gmail.com>
Root cause: terminal_tool, execute_code, and process_registry returned raw
subprocess output with ANSI escape sequences intact. The model saw these
in tool results and copied them into file writes.
Previous fix (PR #2532) stripped ANSI at the write point in file_tools.py,
but this was a band-aid — regex on file content risks corrupting legitimate
content, and doesn't prevent ANSI from wasting tokens in the model context.
Source-level fix:
- New tools/ansi_strip.py with comprehensive ECMA-48 regex covering CSI
(incl. private-mode, colon-separated, intermediate bytes), OSC (both
terminators), DCS/SOS/PM/APC strings, Fp/Fe/Fs/nF escapes, 8-bit C1
- terminal_tool.py: strip output before returning to model
- code_execution_tool.py: strip stdout/stderr before returning
- process_registry.py: strip output in poll/read_log/wait
- file_tools.py: remove _strip_ansi band-aid (no longer needed)
Verified: `ls --color=always` output returned as clean text to model,
file written from that output contains zero ESC bytes.
Cherry-picked from PR #2576 by ereid7, plus read-side fix from 173a5c62.
Both fixes were originally landed in 173a5c62 but were inadvertently
reverted by commit 34be3f8b (a squash-merge that bundled unrelated
tools_config.py changes).
Save side (_save_platform_tools): exclude platform default toolset
names (hermes-cli, hermes-telegram) from preserved entries so they
don't silently re-enable everything.
Read side (_get_platform_tools): when the saved list contains explicit
configurable keys, use direct membership instead of subset inference.
The subset approach is broken when composite toolsets like hermes-cli
resolve to ALL tools.
Cherry-picked from PR #2575 by ticketclosed-wontfix.
Filters out Discord system messages (thread renames, pins, member joins,
boosts) that were being treated as regular user messages.
Follow-up fix: also allow MessageType.reply (value 19) — the original
filter only allowed MessageType.default, which would silently drop all
reply-based interactions.
Added pytest.importorskip for discord dependency in tests.
An agent session killed the systemd-managed gateway (PID 1605) and restarted
it with '&disown', taking it outside systemd's Restart= management. When the
orphaned process later received SIGTERM, nothing restarted it.
Add dangerous command patterns to detect:
- 'gateway run' with & (background), disown, nohup, or setsid
- These should use 'systemctl --user restart hermes-gateway' instead
Also applied directly to main repo and fixed the systemd service:
- Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not
a 'failure', so on-failure never triggered)
- RestartSec=10 for reasonable restart delay
Previously 'Activated skills: xxx' was printed above the banner in
show_banner(). Now it prints directly after the 'Welcome to Hermes
Agent!' line in run(), which is a more natural placement.
Path('~/.hermes/image.png').is_file() returns False because Path
doesn't expand tilde. This caused the tool to fall through to URL
validation, which also failed, producing a confusing error:
'Invalid image source. Provide an HTTP/HTTPS URL or a valid local
file path.'
Fix: use os.path.expanduser() before constructing the Path object.
Added two tests for tilde expansion (success and nonexistent file).
When a messaging platform fails to connect at startup (e.g. transient DNS
failure) or disconnects at runtime with a retryable error, the gateway now
queues it for background reconnection instead of giving up permanently.
- New _platform_reconnect_watcher background task runs alongside the
existing session expiry watcher
- Exponential backoff: 30s, 60s, 120s, 240s, 300s cap
- Max 20 retry attempts before giving up on a platform
- Non-retryable errors (bad auth token, etc.) are not retried
- Runtime disconnections via _handle_adapter_fatal_error now queue
retryable failures instead of triggering gateway shutdown
- On successful reconnect, adapter is wired up and channel directory
is rebuilt automatically
Fixes the case where a DNS blip during gateway startup caused Telegram
and Discord to be permanently unavailable until manual restart.
The previous commit capped the 1.4x at 95% of context, but the multiplier
itself is unnecessary and confusing:
85% threshold × 1.4 = 119% of context → never fires
95% warn × 1.4 = 133% of context → never warns
The 85% hygiene threshold already provides ample headroom over the agent's
own 50% compressor. Even if rough estimates overestimate by 50%, hygiene
would fire at ~57% actual usage — safe and harmless.
Remove the multiplier entirely. Both actual and estimated token paths
now use the same 85% / 95% thresholds. Update tests and comments.
Three bugs in gateway session hygiene pre-compression caused 'Session too
large' errors for ~200K context models like GLM-5-turbo on z.ai:
1. Gateway hygiene called get_model_context_length(model) without passing
config_context_length, provider, or base_url — so user overrides like
model.context_length: 180000 were ignored, and provider-aware detection
(models.dev, z.ai endpoint) couldn't fire. The agent's own compressor
correctly passed all three (run_agent.py line 1038).
2. The 1.4x safety factor on rough token estimates pushed the compression
threshold above the model's actual context limit:
200K * 0.85 * 1.4 = 238K > 200K (model limit)
So hygiene never compressed, sessions grew past the limit, and the API
rejected the request.
3. Same issue for the warn threshold: 200K * 0.95 * 1.4 = 266K.
Fix:
- Read model.context_length, provider, and base_url from config.yaml
(same as run_agent.py does) and pass them to get_model_context_length()
- Resolve provider/base_url from runtime when not in config
- Cap the 1.4x-adjusted compress threshold at 95% of context_length
- Cap the 1.4x-adjusted warn threshold at context_length
Affects: z.ai GLM-5/GLM-5-turbo, any ~200K or smaller context model
where the 1.4x factor would push 85% above 100%.
Ref: Discord report from Ddox — glm-5-turbo on z.ai coding plan
* fix(mcp-oauth): port mismatch, path traversal, and shared state in OAuth flow
Three bugs in the new MCP OAuth 2.1 PKCE implementation:
1. CRITICAL: OAuth redirect port mismatch — build_oauth_auth() calls
_find_free_port() to register the redirect_uri, but _wait_for_callback()
calls _find_free_port() again getting a DIFFERENT port. Browser redirects
to port A, server listens on port B — callback never arrives, 120s timeout.
Fix: share the port via module-level _oauth_port variable.
2. MEDIUM: Path traversal via unsanitized server_name — HermesTokenStorage
uses server_name directly in filenames. A name like "../../.ssh/config"
writes token files outside ~/.hermes/mcp-tokens/.
Fix: sanitize server_name with the same regex pattern used elsewhere.
3. MEDIUM: Class-level auth_code/state on _CallbackHandler causes data
races if concurrent OAuth flows run. Second callback overwrites first.
Fix: factory function _make_callback_handler() returns a handler class
with a closure-scoped result dict, isolating each flow.
* test: add tests for MCP OAuth path traversal, handler isolation, and port sharing
7 new tests covering:
- Path traversal blocked (../../.ssh/config stays in mcp-tokens/)
- Dots/slashes sanitized and resolved within base dir
- Normal server names preserved
- Special characters sanitized (@, :, /)
- Concurrent handler result dicts are independent
- Handler writes to its own result dict, not class-level
- build_oauth_auth stores port in module-level _oauth_port
---------
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
When a session expires (daily schedule or idle timeout) and is
automatically reset, send a notification to the user explaining
what happened:
◐ Session automatically reset (inactive for 24h).
Conversation history cleared.
Use /resume to browse and restore a previous session.
Adjust reset timing in config.yaml under session_reset.
Notifications are suppressed when:
- The expired session had no activity (no tokens used)
- The platform is excluded (api_server, webhook by default)
- notify: false in config
Changes:
- session.py: _should_reset() returns reason string ('idle'/'daily')
instead of bool; SessionEntry gains auto_reset_reason and
reset_had_activity fields; old entry's total_tokens checked
- config.py: SessionResetPolicy gains notify (bool, default: true)
and notify_exclude_platforms (default: api_server, webhook)
- run.py: sends notification via adapter.send() before processing
the user's message, with activity + platform checks
- 13 new tests
Config (config.yaml):
session_reset:
notify: true
notify_exclude_platforms: [api_server, webhook]
- Download and cache .pdf, .docx, .xlsx, .pptx attachments locally
instead of passing expiring CDN URLs to the agent
- Inject .txt and .md content (≤100 KB) into event.text so the agent
sees file content without needing to fetch the URL
- Add 20 MB size guard and SUPPORTED_DOCUMENT_TYPES allowlist
- Fix: unsupported types (.zip etc.) no longer get MessageType.DOCUMENT
- Add 9 unit tests in test_discord_document_handling.py
Mirrors the Slack implementation from PR #784. Discord CDN URLs are
publicly accessible so no auth header is needed (unlike Slack).
Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>
- test_plugins.py: remove tests for unimplemented plugin command API
(get_plugin_command_handler, register_command never existed)
- test_redact.py: add autouse fixture to clear HERMES_REDACT_SECRETS
env var leaked by cli.py import in other tests
- test_signal.py: same HERMES_REDACT_SECRETS fix for phone redaction
- test_mattermost.py: add @bot_user_id to test messages after the
mention-only filter was added in #2443
- test_context_token_tracking.py: mock resolve_provider_client for
openai-codex provider that requires real OAuth credentials
Full suite: 5893 passed, 0 failed.
* fix: respect DashScope v1 runtime mode for alibaba
Remove the hardcoded Alibaba branch from resolve_runtime_provider()
that forced api_mode='anthropic_messages' regardless of the base URL.
Alibaba now goes through the generic API-key provider path, which
auto-detects the protocol from the URL:
- /apps/anthropic → anthropic_messages (via endswith check)
- /v1 → chat_completions (default)
This fixes Alibaba setup with OpenAI-compatible DashScope endpoints
(e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because
runtime always forced Anthropic mode even when setup saved a /v1 URL.
Based on PR #2024 by @kshitijk4poor.
* docs(skill): add split, merge, search examples to ocr-and-documents skill
Adds pymupdf examples for PDF splitting, merging, and text search
to the existing ocr-and-documents skill. No new dependencies — pymupdf
already covers all three operations natively.
* fix: replace all production print() calls with logger in rl_training_tool
Replace all bare print() calls in production code paths with proper logger calls.
- Add `import logging` and module-level `logger = logging.getLogger(__name__)`
- Replace print() in _start_training_run() with logger.info()
- Replace print() in _stop_training_run() with logger.info()
- Replace print(Warning/Note) calls with logger.warning() and logger.info()
Using the logging framework allows log level filtering, proper formatting,
and log routing instead of always printing to stdout.
* fix(gateway): process /queue'd messages after agent completion
/queue stored messages in adapter._pending_messages but never consumed
them after normal (non-interrupted) completion. The consumption path
at line 5219 only checked pending messages when result.get('interrupted')
was True — since /queue deliberately doesn't interrupt, queued messages
were silently dropped.
Now checks adapter._pending_messages after both interrupted AND normal
completion. For queued messages (non-interrupt), the first response is
delivered before recursing to process the queued follow-up. Skips the
direct send when streaming already delivered the response.
Reported by GhostMode on Discord.
---------
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>
The /v1/responses endpoint used an in-memory OrderedDict that lost
all conversation state on gateway restart. Replace with SQLite-backed
storage at ~/.hermes/response_store.db.
- Responses and conversation name mappings survive restarts
- Same LRU eviction behavior (configurable max_size)
- WAL mode for concurrent read performance
- Falls back to in-memory SQLite if disk path unavailable
- Conversation name→response_id mapping moved into the store
Reverts the sanitizer addition from PR #2466 (originally #2129).
We already have _empty_content_retries handling for reasoning-only
responses. The trailing strip risks silently eating valid messages
and is redundant with existing empty-content handling.
Add hermes mcp add/remove/list/test/configure CLI for managing MCP
server connections interactively. Discovery-first 'add' flow connects,
discovers tools, and lets users select which to enable via curses checklist.
Add OAuth 2.1 PKCE authentication for MCP HTTP servers (RFC 7636).
Supports browser-based and manual (headless) authorization, token
caching with 0600 permissions, automatic refresh. Zero external deps.
Add ${ENV_VAR} interpolation in MCP server config values, resolved
from os.environ + ~/.hermes/.env at load time.
Core OAuth module from PR #2021 by @imnotdev25. CLI and mcp_tool
wiring rewritten against current main. Closes#497, #690.