* fix(gateway): add media download retry to Mattermost, Slack, and base cache
Media downloads on Mattermost and Slack fail permanently on transient
errors (timeouts, 429 rate limits, 5xx server errors). Telegram and
WhatsApp already have retry logic, but these platforms had single-attempt
downloads with hardcoded 30s timeouts.
Changes:
- base.py cache_image_from_url: add retry with exponential backoff
(covers Signal and any platform using the shared cache helper)
- mattermost.py _send_media_url: retry on 429/5xx/timeout (3 attempts)
- slack.py _download_slack_file: retry on timeout/5xx (3 attempts)
- slack.py _download_slack_file_bytes: same retry pattern
* test: add tests for media download retry
---------
Co-authored-by: dieutx <dangtc94@gmail.com>
When a new session starts in the gateway (via /new, /reset, or
auto-reset), send the user a summary of the detected configuration:
✨ Session reset! Starting fresh.
◆ Model: qwen3.5:27b-q4_K_M
◆ Provider: custom
◆ Context: 8K tokens (config)
◆ Endpoint: http://localhost:11434/v1
This makes misconfigured context length immediately visible — a user
running a local 8K model that falls to the 128K default will see:
◆ Context: 128K tokens (default — set model.context_length in config to override)
Instead of silently getting no compression and degrading responses.
- _format_session_info() resolves model, provider, context length,
and endpoint from config + runtime, matching the hygiene code's
resolution chain
- Local/custom endpoints shown; cloud endpoints hidden (not useful)
- Context source annotated: config, detected, or default with hint
- Appended to /new and /reset responses, and auto-reset notifications
- 9 tests covering all formatting paths and failure resilience
Addresses the user-facing side of #2708 — instead of trying to fix
every edge case in context detection, surface the values so users
can immediately see when something is wrong.
When user messages have empty content (e.g., Discord @mention-only
messages, unrecognized attachments), the Anthropic API rejects the
request with 'user messages must have non-empty content'.
Changes:
- anthropic_adapter.py: Add empty content validation for user messages
(string and list formats), matching the existing pattern for assistant
and tool messages. Empty content gets '(empty message)' placeholder.
- discord.py: Defense-in-depth check at gateway layer to catch empty
messages before they enter session history.
- Add 4 regression tests covering empty string, whitespace-only,
empty list, and empty text block scenarios.
Fixes#3143
Co-authored-by: Bartok9 <bartok9@users.noreply.github.com>
Two remaining CI failures:
1. agent-client-protocol 0.9.0 removed AuthMethod (replaced with
AuthMethodAgent/EnvVar/Terminal). Pin to <0.9 until the new API
is evaluated — our usage doesn't map 1:1 to the new types.
2. test_429_exhausts_all_retries_before_raising expected pytest.raises
but the agent now catches 429s after max retries, tries fallback,
then returns a result dict. Updated to check final_response.
The gateway's update_session() used += for token counts, but the cached
agent's session_prompt_tokens / session_completion_tokens are cumulative
totals that grow across messages. Each update_session call re-added the
running total, inflating usage stats with every message (1.7x after 3
messages, worse over longer conversations).
Fix: change += to = for in-memory entry fields, add set_token_counts()
to SessionDB that uses direct assignment instead of SQL increment, and
switch the gateway to call it.
CLI mode continues using update_token_counts() (increment) since it
tracks per-API-call deltas — that path is unchanged.
Based on analysis from PR #3222 by @zaycruz (closed).
Co-authored-by: zaycruz <zay@users.noreply.github.com>
The cached agent accumulates session_input_tokens across messages, so
run_conversation() returns cumulative totals. But update_session() used
+= (increment), double-counting on every message after the first.
- session.py: change in-memory entry updates from += to = (direct
assignment for cumulative values)
- hermes_state.py: add absolute=True flag to update_token_counts()
that uses SET column = ? instead of SET column = column + ?
- session.py: pass absolute=True to the DB call
CLI path is unchanged — it passes per-API-call deltas directly to
update_token_counts() with the default absolute=False (increment).
Reported by @zaycruz in #3222. Closes#3222.
The startup warning 'No user allowlists configured' only checked
GATEWAY_ALLOW_ALL_USERS and per-platform _ALLOWED_USERS vars. It
missed SIGNAL_GROUP_ALLOWED_USERS and per-platform _ALLOW_ALL_USERS
vars (e.g. TELEGRAM_ALLOW_ALL_USERS), causing a false warning even
when users had these configured. The actual auth check in
_is_user_authorized already recognized these vars.
Cherry-picked from PR #3202 by binhnt92.
Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>
rewrite_transcript (used by /retry, /undo, /compress) was calling
append_message without reasoning, reasoning_details, or
codex_reasoning_items — permanently dropping them from SQLite.
Co-authored-by: alireza78a <alireza78.crypto@gmail.com>
When _try_activate_fallback() switches to the fallback model, it
updates the agent's model/provider/client but never touches
self.context_compressor. The compressor keeps the primary model's
context_length and threshold_tokens, so compression decisions use
wrong limits — a 200K primary → 32K fallback still uses 200K-based
thresholds, causing oversized sessions to overflow the fallback.
Update the compressor's model, credentials, context_length, and
threshold_tokens after fallback activation using get_model_context_length()
for the new model.
Cherry-picked from PR #3202 by binhnt92.
Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>
The API server adapter was creating agents without specifying
enabled_toolsets, causing ALL tools to load — including clarify,
send_message, and text_to_speech which don't work without interactive
callbacks or gateway dispatch.
Changes:
- toolsets.py: Add hermes-api-server toolset (core tools minus clarify,
send_message, text_to_speech)
- api_server.py: Resolve toolsets from config.yaml platform_toolsets
via _get_platform_tools() — same path as all other gateway platforms.
Falls back to hermes-api-server default when no override configured.
- tools_config.py: Add api_server to PLATFORMS dict so users can
customize via 'hermes tools' or platform_toolsets.api_server in
config.yaml
- 12 tests covering toolset definition, config resolution, and
user override
Reported by thatwolfieguy on Discord.
Two changes:
1. Fix /queue command: remove the _agent_running guard that rejected
/queue after the agent finished. The prompt was deferred in
_pending_input until the agent completed, then the handler checked
_agent_running (now False) and rejected it. /queue now always queues
regardless of timing.
2. Add display.busy_input_mode config (CLI-only):
- 'interrupt' (default): Enter while busy interrupts the current run
(preserves existing behavior)
- 'queue': Enter while busy queues the message for the next turn,
with a 'Queued for the next turn: ...' confirmation
Ctrl+C always interrupts regardless of this setting.
Salvaged from PR #3037 by StefanoChiodino. Key differences:
- Default is 'interrupt' (preserves existing behavior) not 'queue'
- No config version bump (unnecessary for new key in existing section)
- Simpler normalization (no alias map)
- /queue fix is simpler: just remove the guard instead of intercepting
commands during busy state
* fix(gateway): silence flush agent terminal output
quiet_mode=True only suppresses AIAgent init messages.
Tool call output still leaks to the terminal through
_safe_print → _print_fn during session reset/expiry.
Since #2670 injected live memory state into the flush prompt,
the flush agent now reliably calls memory tools — making the
output leak noticeable for the first time.
Set _print_fn to a no-op so the background flush is fully silent.
* test(gateway): add test for flush agent terminal silence + fix dotenv mock
- Add TestFlushAgentSilenced: verifies _print_fn is set to a no-op on
the flush agent so tool output never leaks to the terminal
- Fix pre-existing test failures: replace patch('run_agent.AIAgent')
with sys.modules mock to avoid importing run_agent (requires openai)
- Add autouse _mock_dotenv fixture so all tests in this file run
without the dotenv package installed
* fix(display): route KawaiiSpinner output through print_fn to fully silence flush agent
The previous fix set tmp_agent._print_fn = no-op on the flush agent but
spinner output and quiet-mode cute messages bypassed _print_fn entirely:
- KawaiiSpinner captured sys.stdout at __init__ and wrote directly to it
- quiet-mode tool results used builtin print() instead of _safe_print()
Add optional print_fn parameter to KawaiiSpinner.__init__; _write routes
through it when set. Pass self._print_fn to all spinner construction sites
in run_agent.py and change the quiet-mode cute message print to _safe_print.
The existing gateway fix (tmp_agent._print_fn = lambda) now propagates
correctly through both paths.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(gateway): silence hygiene and compression background agents
Two more background AIAgent instances in the gateway were created with
quiet_mode=True but without _print_fn = no-op, causing tool output to
leak to the terminal:
- _hyg_agent (in-turn hygiene memory agent)
- tmp_agent (_compress_context path)
Apply the same _print_fn no-op pattern used for the flush agent.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(display): remove unused _last_flush_time from KawaiiSpinner
Attribute was set but never read; upstream already removed it.
Leftover from conflict resolution during rebase onto upstream/main.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
When send() fails due to a network error (ConnectError, ReadTimeout, etc.),
the failure was silently logged and the user received no feedback — appearing
as a hang. In one reported case, a user waited 1+ hour for a response that
had already been generated but failed to deliver (#2910).
Adds _send_with_retry() to BasePlatformAdapter:
- Transient errors: retry up to 2x with exponential backoff + jitter
- On exhaustion: send delivery-failure notice so user knows to retry
- Permanent errors: fall back to plain-text version (preserves existing behavior)
- SendResult.retryable flag for platform-specific transient errors
All adapters benefit automatically via BasePlatformAdapter inheritance.
Cherry-picked from PR #3108 by Mibayy.
Co-authored-by: Mibayy <mibayy@users.noreply.github.com>
shutil.get_terminal_size() can return stale/fallback values on SSH that
differ from prompt_toolkit's actual terminal width. Fragments built for
the wrong width overflow and wrap onto a second line (wrap_lines=True
default), appearing as progressively degrading duplicates.
- Read width from get_app().output.get_size().columns when inside a
prompt_toolkit TUI, falling back to shutil outside TUI context
- Add wrap_lines=False on the status bar Window as belt-and-suspenders
guard against any future width mismatch
Closes#3130
Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>
Two bugs caused the OpenClaw migration during first-time setup to be
ineffective, forcing users to reconfigure everything manually:
1. The setup wizard created config.yaml with all defaults BEFORE running
the migration, then the migrator ran with overwrite=False. Every config
setting was reported as a 'conflict' against the defaults and skipped.
Fix: use overwrite=True during setup-time migration (safe because only
defaults exist at that point). The hermes claw migrate CLI command
still defaults to overwrite=False for post-setup use.
2. After migration, the full setup wizard ran all 5 sections unconditionally,
forcing the user through model/terminal/agent/messaging/tools configuration
even when those settings were just imported.
Fix: add _get_section_config_summary() and _skip_configured_section()
helpers. After migration, each section checks if it's already configured
(API keys present, non-default values, platform tokens) and offers
'Reconfigure? [y/N]' with default No. Unconfigured sections still run
normally.
Reported by Dev Bredda on social media.
After a Telegram 502, _handle_polling_network_error calls updater.stop()
then start_polling(). If start_polling() also raises, the old code logged
a warning and returned — but the comment 'The next network error will
trigger another attempt' was wrong. The updater loop is dead after stop(),
so no further error callbacks ever fire. The gateway stays alive but
permanently deaf to messages.
Fix: when start_polling() fails in the except branch, schedule a new
_handle_polling_network_error task to continue the exponential backoff
retry chain. The task is tracked in _background_tasks (preventing GC).
Guarded by has_fatal_error to avoid spurious retries during shutdown.
Closes#3173.
Salvaged from PR #3177 by Mibayy.
The delegate_task tool accepts a toolsets parameter directly from the
LLM's function call arguments. When provided, these toolsets are passed
through _strip_blocked_tools but never intersected with the parent
agent's enabled_toolsets. A model can request toolsets the parent does
not have (e.g., web, browser, rl), granting the subagent tools that
were explicitly disabled for the parent.
Intersect LLM-requested toolsets with the parent's enabled set before
applying the blocked-tool filter, so subagents can only receive a
subset of the parent's tools.
Co-authored-by: dieutx <dangtc94@gmail.com>
* feat: config-gated /verbose command for messaging gateway
Add gateway_config_gate field to CommandDef, allowing cli_only commands
to be conditionally available in the gateway based on a config value.
- CommandDef gains gateway_config_gate: str | None — a config dotpath
that, when truthy, overrides cli_only for gateway surfaces
- /verbose uses gateway_config_gate='display.tool_progress_command'
- Default is off (cli_only behavior preserved)
- When enabled, /verbose cycles tool_progress mode (off/new/all/verbose)
in the gateway, saving to config.yaml — same cycle as the CLI
- Gateway helpers (help, telegram menus, slack mapping) dynamically
check config to include/exclude config-gated commands
- GATEWAY_KNOWN_COMMANDS always includes config-gated commands so
the gateway recognizes them and can respond appropriately
- Handles YAML 1.1 bool coercion (bare 'off' parses as False)
- 8 new tests for the config gate mechanism + gateway handler
* docs: document gateway_config_gate and /verbose messaging support
- AGENTS.md: add gateway_config_gate to CommandDef fields
- slash-commands.md: note /verbose can be enabled for messaging, update Notes
- configuration.md: add tool_progress_command to display section + usage note
- cli.md: cross-link to config docs for messaging enablement
- messaging/index.md: show tool_progress_command in config snippet
- plugins.md: add gateway_config_gate to register_command parameter table
When third-party tools (Paperclip orchestrator, etc.) spawn hermes chat
as a subprocess, their sessions pollute user session history and search.
- hermes chat --source <tag> (also HERMES_SESSION_SOURCE env var)
- exclude_sources parameter on list_sessions_rich() and search_messages()
- Sessions with source=tool hidden from sessions list/browse/search
- Third-party adapters pass --source tool to isolate agent sessions
Cherry-picked from PR #3208 by HenkDz.
Co-authored-by: Henkey <noonou7@gmail.com>
except Exception does not catch KeyboardInterrupt (inherits from
BaseException). A second Ctrl+C during exit cleanup aborts pending
writes — Honcho observations dropped, SQLite sessions left unclosed,
cron job sessions never marked ended.
Changed to except (Exception, KeyboardInterrupt) at all five sites:
- cli.py: honcho.shutdown() and end_session() in finally exit block
- run_agent.py: _flush_honcho_on_exit atexit handler
- cron/scheduler.py: end_session() and close() in job finally block
Tests exercise the actual production code paths and confirm
KeyboardInterrupt propagates without the fix.
Co-authored-by: dieutx <dangtc94@gmail.com>
Asyncio tasks created with create_task() but never stored can be
garbage collected mid-execution. Add self._background_tasks set to
hold references, with add_done_callback cleanup. Tracks:
- /background command task
- session-reset memory flush task
- session-resume memory flush task
Cancel all pending tasks in stop().
Update test fixtures that construct GatewayRunner via object.__new__()
to include the new _background_tasks attribute.
Cherry-picked from PR #3167 by memosr. The original PR also deleted
the DM topic auto-skill loading code — that deletion was excluded
from this salvage as it removes a shipped feature (#2598).
Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>
detect_dangerous_command() ran regex patterns against raw command strings
without normalization, allowing bypass via Unicode fullwidth chars,
ANSI escape codes, null bytes, and 8-bit C1 controls.
Adds _normalize_command_for_detection() that:
- Strips ANSI escapes using the full ECMA-48 strip_ansi() from
tools/ansi_strip (CSI, OSC, DCS, 8-bit C1, nF sequences)
- Removes null bytes
- Normalizes Unicode via NFKC (fullwidth Latin → ASCII, etc.)
Includes 12 regression tests covering fullwidth, ANSI, C1, null byte,
and combined obfuscation bypasses.
Salvaged from PR #3089 by thakoreh — improved ANSI stripping to use
existing comprehensive strip_ansi() instead of a weaker hand-rolled
regex, and added test coverage.
Co-authored-by: Hiren <hiren.thakore58@gmail.com>
Nous Portal now passes through OpenRouter model names and routes from
there. Update the static fallback model list and auxiliary client default
to use OpenRouter-format slugs (provider/model) instead of bare names.
- _PROVIDER_MODELS['nous']: full OpenRouter catalog
- _NOUS_MODEL: google/gemini-3-flash-preview (was gemini-3-flash)
- Updated 4 test assertions for the new default model name
* fix(session-db): survive CLI/gateway concurrent write contention
Closes#3139
Three layered fixes for the scenario where CLI and gateway write to
state.db concurrently, causing create_session() to fail with
'database is locked' and permanently disabling session_search on the
gateway side.
1. Increase SQLite connection timeout: 10s -> 30s
hermes_state.py: longer window for the WAL writer to finish a batch
flush before the other process gives up entirely.
2. INSERT OR IGNORE in create_session
hermes_state.py: prevents IntegrityError on duplicate session IDs
(e.g. gateway restarts while CLI session is still alive).
3. Don't null out _session_db on create_session failure (main fix)
run_agent.py: a transient lock at agent startup must not permanently
disable session_search for the lifetime of that agent instance.
_session_db now stays alive so subsequent flushes and searches work
once the lock clears.
4. New ensure_session() helper + call it during flush
hermes_state.py: INSERT OR IGNORE for a minimal session row.
run_agent.py _flush_messages_to_session_db: calls ensure_session()
before appending messages, so the FK constraint is satisfied even
when create_session() failed at startup. No-op when the row exists.
* fix(state): release lock between context queries in search_messages
The context-window queries (one per FTS5 match) were running inside
the same lock acquisition as the primary FTS5 query, holding the lock
for O(N) sequential SQLite round-trips. Move per-match context fetches
outside the outer lock block so each acquires the lock independently,
keeping critical sections short and allowing other threads to interleave.
* fix(session): prefer longer source in load_transcript to prevent legacy truncation
When a long-lived session pre-dates SQLite storage (e.g. sessions
created before the DB layer was introduced, or after a clean
deployment that reset the DB), _flush_messages_to_session_db only
writes the *new* messages from the current turn to SQLite — it skips
messages already present in conversation_history, assuming they are
already persisted.
That assumption fails for legacy JSONL-only sessions:
Turn N (first after DB migration):
load_transcript(id) → SQLite: 0 → falls back to JSONL: 994 ✓
_flush_messages_to_session_db: skip first 994, write 2 new → SQLite: 2
Turn N+1:
load_transcript(id) → SQLite: 2 → returns immediately ✗
Agent sees 2 messages of history instead of 996
The same pattern causes the reported symptom: session JSON truncated
to 4 messages (_save_session_log writes agent.messages which only has
2 history + 2 new = 4).
Fix: always load both sources and return whichever is longer. For a
fully-migrated session SQLite will always be ≥ JSONL, so there is no
regression. For a legacy session that hasn't been bootstrapped yet,
JSONL wins and the full history is restored.
Closes#3212
* test: add load_transcript source preference tests for #3212
Covers: JSONL longer returns JSONL, SQLite longer returns SQLite,
SQLite empty falls back to JSONL, both empty returns empty, equal
length prefers SQLite (richer reasoning fields).
---------
Co-authored-by: Mibayy <mibayy@hermes.ai>
Co-authored-by: kewe63 <kewe.3217@gmail.com>
Co-authored-by: Mibayy <mibayy@users.noreply.github.com>
* fix(skills): reduce skills.sh resolution churn and preserve trust for wrapped identifiers
- Accept common skills.sh prefix typos (skils-sh/, skils.sh/)
- Strip skills-sh/ prefix in _resolve_trust_level() so trusted repos
stay trusted when installed through skills.sh
- Use resolved identifier (from bundle/meta) for scan_skill source
- Prefer tree search before root scan in _discover_identifier()
- Add _resolve_github_meta() consolidation for inspect flow
Cherry-picked from PR #3001 by kshitijk4poor.
* fix: restore candidate loop in SkillsShSource.fetch() for consistency
The cherry-picked PR only tried the first candidate identifier in
fetch() while inspect() (via _resolve_github_meta) tried all four.
This meant skills at repo/skills/path would be found by inspect but
missed by fetch, forcing it through the heavier _discover_identifier
flow. Restore the candidate loop so both paths behave identically.
Updated the test assertion to match.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Gateway sessions had their own inline toolset resolution that only read
platform_toolsets from config, which never includes MCP server names.
MCP tools were discovered and registered but invisible to the model.
- Replace duplicated gateway toolset resolution in _run_agent() and
_run_background_task() with calls to the shared _get_platform_tools()
- Extend _get_platform_tools() to include globally enabled MCP servers
at runtime (include_default_mcp_servers=True), while config-editing
flows use include_default_mcp_servers=False to avoid persisting
implicit MCP defaults into platform_toolsets
- Add homeassistant to PLATFORMS dict (was missing, caused KeyError)
- Fix CLI entry point to use _get_platform_tools() as well, so MCP
tools are visible in CLI mode too
- Remove redundant platform_key reassignment in _run_background_task
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Previously _agent_config_signature() used only the first 8 characters of
the API key, which causes false cache hits for JWT/OAuth tokens that share
a common prefix (e.g. 'eyJhbGci'). This led to cross-account cache
collisions when switching OAuth accounts in multi-user gateway deployments.
Replace the 8-char prefix with a SHA-256 hash of the full key so the
signature is unique per credential while keeping secrets out of the
cache key.
Salvaged from PR #3117 by EmpireOperating.
Co-authored-by: EmpireOperating <EmpireOperating@users.noreply.github.com>
Salvages PR #3005 by web3blind. Cherry-picked onto current main with functional skill binding and docs added.
- DM topic creation via createForumTopic (Bot API 9.4, Feb 2026)
- Config-driven topics with thread_id persistence across restarts
- Session isolation via existing build_session_key thread_id support
- auto_skill field on MessageEvent for topic-skill bindings
- Gateway auto-loads bound skill on new sessions (same as /skill commands)
- Docs: full Private Chat Topics section in Telegram messaging guide
- 20 tests (17 original + 3 for auto_skill)
Closes#2598
Co-authored-by: web3blind <web3blind@users.noreply.github.com>
The default SOUL.md seeded for new users should match
DEFAULT_AGENT_IDENTITY — a short, neutral identity paragraph.
The elaborate voice spec (avoid lists, dialogue examples, symbol
conventions) was never intended as the default for all users.
Users who want a custom persona write their own SOUL.md.
The non-streaming API call path (_interruptible_api_call) had no
wall-clock timeout. When providers keep connections alive with SSE
keep-alive pings but never deliver a response, httpx's inactivity
timeout never fires and the call hangs indefinitely.
Subagents always used the non-streaming path because they have no
stream consumers (quiet_mode=True). This caused delegate_task to
hang for 40+ minutes in production.
The streaming path has two layers of protection:
- httpx read timeout (60s, HERMES_STREAM_READ_TIMEOUT)
- Stale stream detection (90s, HERMES_STREAM_STALE_TIMEOUT)
Both work because streaming sends chunks continuously — a 90-second
gap between chunks genuinely means the connection is broken, even for
reasoning models that take minutes to complete.
Now run_conversation() always prefers the streaming path. The streaming
method falls back to non-streaming automatically if the provider
doesn't support it. Stream delta callbacks are no-ops when no
consumers are registered, so there's no overhead for subagents.
run_conversation raised the raw exception after exhausting retries,
which crashed the background thread in cli.py (unhandled exception
in Thread). Now returns a proper error result dict with failed=True
and persists the session, matching the pattern used by other error
paths (invalid responses, empty content, etc.).
Also wraps cli.py's run_agent thread function in try/except as a
safety net against any future unhandled exceptions from
run_conversation.
Made-with: Cursor
When an agent thread hangs (truly blocked, never checks _interrupt_requested),
/stop now force-cleans _running_agents to unlock the session immediately.
Two changes:
- Early /stop intercept in the running-agent guard: bypasses normal command
dispatch to force-interrupt and unlock the session. Follows the same pattern
as the existing /new intercept.
- Sentinel /stop: force-cleans the sentinel instead of returning 'nothing to
stop yet', so /stop during slow startup actually unlocks the session.
Follow-up improvements over original PR:
- Consolidated duplicate resolve_command imports into single early resolution
- Updated _handle_stop_command to also force-clean for consistency
- Removed 10-minute hard timeout on the executor (would kill legitimate
long-running agent tasks; the /stop force-clean handles recovery)
Cherry-picked from Mibayy's PR #2498.
Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>
The recursive os.walk for AGENTS.md in subdirectories was undesired.
Only load AGENTS.md from the working directory root, matching the
behavior of CLAUDE.md and .cursorrules.
Remove run_hermes_oauth_login(), refresh_hermes_oauth_token(),
read_hermes_oauth_credentials(), _save_hermes_oauth_credentials(),
_generate_pkce(), and associated constants/credential file path.
This code was added in 63e88326 but never wired into any user-facing
flow (setup wizard, hermes model, or any CLI command). Neither
clawdbot/OpenClaw nor opencode implement PKCE for Anthropic — both
use setup-token or API keys. Dead code that was never tested in
production.
Also removes the credential resolution step that checked
~/.hermes/.anthropic_oauth.json (step 3 in resolve_anthropic_token),
renumbering remaining steps.
reset_session_state() was missing two fields added after it was written:
- _user_turn_count: kept accumulating across sessions, affecting
flush_min_turns guard behavior
- context_compressor._previous_summary: old session's compression
summary leaked into new session's iterative compression
Cherry-picked from PR #2640 by dusterbloom. Closes#2635.
sessions delete and prune call input() for confirmation without
catching EOFError. When stdin isn't a TTY (piped input, CI/CD, cron),
input() throws EOFError and the command crashes.
Extract a _confirm_prompt() helper that handles EOFError and
KeyboardInterrupt, defaulting to cancel. Both call sites now use it.
Salvaged from PR #2622 by dieutx (improved from duplicated try/except
to shared helper). Closes#2565.
In gateway mode, async tools (vision_analyze, web_extract, session_search)
deadlock because _run_async() spawns a thread with asyncio.run(), creating
a new event loop, but _get_cached_client() returns an AsyncOpenAI client
bound to a different loop. httpx.AsyncClient cannot work across event loop
boundaries, causing await client.chat.completions.create() to hang forever.
Fix: include the event loop identity in the async client cache key so each
loop gets its own AsyncOpenAI instance. Also fix session_search_tool.py
which had its own broken asyncio.run()-in-thread pattern — now uses the
centralized _run_async() bridge.
The /model command is removed from both the interactive CLI and
messenger gateway (Telegram/Discord/Slack/WhatsApp). Users can
still change models via 'hermes model' CLI subcommand or by
editing config.yaml directly.
Removed:
- CommandDef entry from COMMAND_REGISTRY
- CLI process_command() handler and model autocomplete logic
- Gateway _handle_model_command() and dispatch
- SlashCommandCompleter model_completer_provider parameter
- Two-stage Tab completion and ghost text for /model
- All /model-specific tests
Unaffected:
- /provider command (read-only, shows current model + providers)
- ACP adapter _cmd_model (separate system for VS Code/Zed/JetBrains)
- model_switch.py module (used by ACP)
- 'hermes model' CLI subcommand
Author: Teknium
- Registry now warns when a tool name is overwritten by a different
toolset (silent dict overwrite was the previous behavior)
- MCP tool registration checks for collisions with non-MCP (built-in)
tools before registering. If an MCP tool's prefixed name matches an
existing built-in, the MCP tool is skipped and a warning is logged.
MCP-to-MCP collisions are allowed (last server wins).
- Both regular MCP tools and utility tools (resources/prompts) are
guarded.
- Adds 5 tests covering: registry overwrite warning, same-toolset
re-registration silence, built-in collision skip, normal registration,
and MCP-to-MCP collision pass-through.
Reported by k_sze (KONG) — MiniMax MCP server's web_search tool could
theoretically shadow Hermes's built-in web_search if prefixing failed.
Covers the case where a SKILL.md has `metadata:` (null) or
`metadata.hermes:` (null), which caused an AttributeError
before the fix in d218cf91.
Made-with: Cursor
* fix(security): add SSRF protection to browser_navigate
browser_navigate() only checked the website blocklist policy but did
not call is_safe_url() to block private/internal addresses. This
allowed the agent to navigate to localhost, cloud metadata endpoints
(169.254.169.254), and private network IPs via the browser.
web_tools and vision_tools already had this check. Added the same
is_safe_url() pre-flight validation before the blocklist check in
browser_navigate().
* fix: move SSRF import to module level, fix policy test mock
Move is_safe_url import to module level so it can be monkeypatched
in tests. Update test_browser_navigate_returns_policy_block to mock
_is_safe_url so the SSRF check passes and the policy check is reached.
* fix(security): harden browser SSRF protection
Follow-up to cherry-picked PR #3041:
1. Fail-closed fallback: if url_safety module can't import, block all
URLs instead of allowing all. Security guards should never fail-open.
2. Post-redirect SSRF check: after navigation, verify the final URL
isn't a private/internal address. If a public URL redirected to
169.254.169.254 or localhost, navigate to about:blank and return
an error — prevents the model from reading internal content via
subsequent browser_snapshot calls.
---------
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
When a background task (/bg command) prints its output while the main agent
is processing with the thinking spinner visible, the status bar could render
on the same row as the spinner, causing visual overlap.
This fix adds an explicit app.invalidate() call with a brief pause before
printing background task output, ensuring the TUI layout is in a consistent
state before the output is written.
Changes:
- Add TUI refresh before success output in _handle_background_command
- Add TUI refresh before error output in the exception handler
- Add tests for the refresh behavior
Closes#2718
Co-authored-by: Bartok9 <bartokmagic@proton.me>
After streaming retries are exhausted on transient errors, fall back to
non-streaming instead of propagating the error. Also fall back for any
other pre-delivery stream error (not just 'streaming not supported').
Added user-facing message when streaming is not supported by a model/
provider, directing users to set display.streaming: false in config.yaml
to avoid the fallback delay.
Cherry-picked from PR #3008 by kshitijk4poor. Added UX message for
streaming-not-supported detection.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
* feat: nix flake, uv2nix build, dev shell and home manager
* fixed nix run, updated docs for setup
* feat(nix): NixOS module with persistent container mode, managed guards, checks
- Replace homeModules.nix with nixosModules.nix (two deployment modes)
- Mode A (native): hardened systemd service with ProtectSystem=strict
- Mode B (container): persistent Ubuntu container with /nix/store bind-mount,
identity-hash-based recreation, GC root protection, symlink-based updates
- Add HERMES_MANAGED guards blocking CLI config mutation (config set, setup,
gateway install/uninstall) when running under NixOS module
- Add nix/checks.nix with build-time verification (binary, CLI, managed guard)
- Remove container.nix (no Nix-built OCI image; pulls ubuntu:24.04 at runtime)
- Simplify packages.nix (drop fetchFromGitHub submodules, PYTHONPATH wrappers)
- Rewrite docs/nixos-setup.md with full options reference, container
architecture, secrets management, and troubleshooting guide
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update config.py
* feat(nix): add CI workflow and enhanced build checks
- GitHub Actions workflow for nix flake check + build on linux/macOS
- Entry point sync check to catch pyproject.toml drift
- Expanded managed-guard check to cover config edit
- Wrap hermes-acp binary in Nix package
- Fix Path type mismatch in is_managed()
* Update MCP server package name; bundled skills support
* fix reading .env. instead have container user a common mounted .env file
* feat(nix): container entrypoint with privilege drop and sudo provisioning
Container was running as non-root via --user, which broke apt/pip installs
and caused crashes when $HOME didn't exist. Replace --user with a Nix-built
entrypoint script that provisions the hermes user, sudo (NOPASSWD), and
/home/hermes inside the container on first boot, then drops privileges via
setpriv. Writable layer persists so setup only runs once.
Also expands MCP server options to support HTTP transport and sampling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix group and user creation in container mode
* feat(nix): persistent /home/hermes and MESSAGING_CWD in container mode
Container mode now bind-mounts ${stateDir}/home to /home/hermes so the
agent's home directory survives container recreation. Previously it lived
in the writable layer and was lost on image/volume/options changes.
Also passes MESSAGING_CWD to the container so the agent finds its
workspace and documents, matching native mode behavior.
Other changes:
- Extract containerDataDir/containerHomeDir bindings (no more magic strings)
- Fix entrypoint chown to run unconditionally (volume mounts always exist)
- Add schema field to container identity hash for auto-recreation
- Add idempotency test (Scenario G) to config-roundtrip check
* docs: add Nix & NixOS setup guide to docs site
Add comprehensive Nix documentation to the Docusaurus site at
website/docs/getting-started/nix-setup.md, covering nix run/profile
install, NixOS module (native + container modes), declarative settings,
secrets management, MCP servers, managed mode, container architecture,
dev shell, flake checks, and full options reference.
- Register nix-setup in sidebar after installation page
- Add Nix callout tip to installation.md linking to new guide
- Add canonical version pointer in docs/nixos-setup.md
* docs: remove docs/nixos-setup.md, consolidate into website docs
Backfill missing details (restart/restartSec in full example,
gateway.pid, 0750 permissions, docker inspect commands) into
the canonical website/docs/getting-started/nix-setup.md and
delete the old standalone file.
* fix(nix): add compression.protect_last_n and target_ratio to config-keys.json
New keys were added to DEFAULT_CONFIG on main, causing the
config-drift check to fail in CI.
* fix(nix): skip checks on aarch64-darwin (onnxruntime wheel missing)
The full Python venv includes onnxruntime (via faster-whisper/STT)
which lacks a compatible uv2nix wheel on aarch64-darwin. Gate all
checks behind stdenv.hostPlatform.isLinux. The package and devShell
still evaluate on macOS.
* fix(nix): skip flake check and build on macOS CI
onnxruntime (transitive dep via faster-whisper) lacks a compatible
uv2nix wheel on aarch64-darwin. Run full checks and build on Linux
only; macOS CI verifies the flake evaluates without building.
* fix(nix): preserve container writable layer across nixos-rebuild
The container identity hash included the entrypoint's Nix store path,
which changes on every nixpkgs update (due to runtimeShell/stdenv
input-addressing). This caused false-positive identity mismatches,
triggering container recreation and losing the persistent writable layer.
- Use stable symlink (current-entrypoint) like current-package already does
- Remove entrypoint from identity hash (only image/volumes/options matter)
- Add GC root for entrypoint so nix-collect-garbage doesn't break it
- Remove global HERMES_HOME env var from addToSystemPackages (conflicted
with interactive CLI use, service already sets its own)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three improvements to reasoning/thinking display in the CLI:
1. Buffer tiny reasoning chunks: providers like DeepSeek stream reasoning
one word at a time, producing a separate [thinking] line per token.
Add a buffer that coalesces chunks and flushes at natural boundaries
(newlines, sentence endings, terminal width).
2. Fix duplicate reasoning display: centralize callback selection into
_current_reasoning_callback() — one place instead of 4 scattered
inline ternaries. Prevents both the streaming box AND the preview
callback from firing simultaneously.
3. Fix post-response reasoning box guard: change the check from
'not self._stream_started' to 'not self._reasoning_stream_started'
so the final reasoning box is only suppressed when reasoning was
actually streamed live, not when any text was streamed.
Cherry-picked from PR #2781 by juanfradb.
- Add 'prompt exceeds max length' to context overflow detection for
Z.AI/GLM 400 errors
- Extract inline reasoning blocks from assistant content as fallback
when no structured reasoning fields are present
- Guard inline extraction so structured API reasoning takes priority
- Update test for reasoning-only response salvage behavior
Cherry-picked from PR #2993 by kshitijk4poor. Added priority guard
to fix test_structured_reasoning_takes_priority failure.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Cron was the only execution path that never called end_session(),
leaving ended_at = NULL permanently. This made cron sessions invisible
to hermes prune --older-than and indistinguishable from active sessions.
Captures session_id in a local variable before agent construction so
it's available in the finally block even if AIAgent() fails, then calls
end_session(session_id, 'cron_complete') before close().
Cherry-picked from PR #2979 by ygd58. Fixed bug: original PR called
end_session() with zero arguments (TypeError — method requires
session_id and end_reason).
Fixes#2972.
Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
* fix(skills): use Git Trees API to prevent silent subdirectory loss during install
Refactors _download_directory() to use the Git Trees API (single call
for the entire repo tree) as the primary path, falling back to the
recursive Contents API when the tree endpoint is unavailable or
truncated. Prevents silent subdirectory loss caused by per-directory
rate limiting or transient failures.
Cherry-picked from PR #2981 by tugrulguner.
Fixes#2940.
* fix: simplify tree API — use branch name directly as tree-ish
Eliminates an extra git/ref/heads API call by passing the branch name
directly to git/trees/{branch}?recursive=1, matching the pattern
already used by _find_skill_in_repo_tree.
---------
Co-authored-by: tugrulguner <tugrulguner@users.noreply.github.com>