hermes-agent

Author	SHA1	Message	Date
Teknium	b6b87dedd4	fix: discover plugins before reading plugin toolsets in tools_config (#3457 ) hermes tools and _get_platform_tools() call get_plugin_toolsets() / _get_plugin_toolset_keys() without first ensuring plugins have been discovered. discover_plugins() only runs as a side effect of importing model_tools.py, which hermes tools never does. This means: - hermes tools TUI never shows plugin toolsets (invisible to users) - _get_platform_tools() in standalone processes misses plugin toolsets Fix: call discover_plugins() (idempotent) in both _get_plugin_toolset_keys() and _get_effective_configurable_toolsets() before accessing plugin state. In the gateway/CLI where model_tools.py is already imported, the call is a no-op (discover_and_load checks _discovered flag).	2026-03-27 15:31:17 -07:00
Teknium	8fdfc4b00c	fix(agent): detect thinking-budget exhaustion on truncation, skip useless retries (#3444 ) When finish_reason='length' and the response contains only reasoning (think blocks or empty content), the model exhausted its output token budget on thinking with nothing left for the actual response. Previously, this fell into either: - chat_completions: 3 useless continuation retries (model hits same limit) - anthropic/codex: generic 'Response truncated' error with rollback Now: detect the think-only + length condition early and return immediately with a targeted error message: 'Model used all output tokens on reasoning with none left for the response. Try lowering reasoning effort or increasing max_tokens.' This saves 2 wasted API calls on the chat_completions path and gives users actionable guidance instead of a cryptic error. The existing think-only retry logic (finish_reason='stop') is unchanged — that's a genuine model glitch where retrying can help.	2026-03-27 15:29:30 -07:00
Teknium	658692799d	fix: guard aux LLM calls against None content + reasoning fallback + retry (salvage #3389 ) (#3449 ) Salvage of #3389 by @binhnt92 with reasoning fallback and retry logic added on top. All 7 auxiliary LLM call sites now use extract_content_or_reasoning() which mirrors the main agent loop's behavior: extract content, strip think blocks, fall back to structured reasoning fields, retry on empty. Closes #3389.	2026-03-27 15:28:19 -07:00
Teknium	ab09f6b568	feat: curate HF model picker with OpenRouter analogues (#3440 ) Show only agentic models that map to OpenRouter defaults: Qwen/Qwen3.5-397B-A17B ↔ qwen/qwen3.5-plus Qwen/Qwen3.5-35B-A3B ↔ qwen/qwen3.5-35b-a3b deepseek-ai/DeepSeek-V3.2 ↔ deepseek/deepseek-chat moonshotai/Kimi-K2.5 ↔ moonshotai/kimi-k2.5 MiniMaxAI/MiniMax-M2.5 ↔ minimax/minimax-m2.5 zai-org/GLM-5 ↔ z-ai/glm-5 XiaomiMiMo/MiMo-V2-Flash ↔ xiaomi/mimo-v2-pro moonshotai/Kimi-K2-Thinking ↔ moonshotai/kimi-k2-thinking Users can still pick any HF model via Enter custom model name.	2026-03-27 13:54:46 -07:00
Teknium	e4e04c2005	fix: make tirith block verdicts approvable instead of hard-blocking (#3428 ) Previously, tirith exit code 1 (block) immediately rejected the command with no approval prompt — users saw 'BLOCKED: Command blocked by security scan' and the agent moved on. This prevented gateway/CLI users from approving pipe-to-shell installs like 'curl ... | sh' even when they understood the risk. Changes: - Tirith 'block' and 'warn' now both go through the approval flow. Users see the full tirith findings (severity, title, description, safer alternatives) and can choose to approve or deny. - New _format_tirith_description() builds rich descriptions from tirith findings JSON so the approval prompt is informative. - CLI startup now warns when tirith is enabled but not available, so users know command scanning is degraded to pattern matching only. The default approval choice is still deny, so the security posture is unchanged for unattended/timeout scenarios. Reported via Discord by pistrie — 'curl -fsSL https://mandex.dev/install.sh | sh' was hard-blocked with no way to approve.	2026-03-27 13:22:01 -07:00
Teknium	6f11ff53ad	fix(anthropic): use model-native output limits instead of hardcoded 16K (#3426 ) The Anthropic adapter defaulted to max_tokens=16384 when no explicit value was configured. This severely limits thinking-enabled models where thinking tokens count toward max_tokens: - Claude Opus 4.6 supports 128K output but was capped at 16K - Claude Sonnet 4.6 supports 64K output but was capped at 16K With extended thinking (adaptive or budget-based), the model could exhaust the entire 16K on reasoning, leaving zero tokens for the actual response. This caused two user-visible errors: - 'Response truncated (finish_reason=length)' — thinking consumed most tokens - 'Response only contains think block with no content' — thinking consumed all Fix: add _ANTHROPIC_OUTPUT_LIMITS lookup table (sourced from Anthropic docs and Cline's model catalog) and use the model's actual output limit as the default. Unknown future models default to 128K (the current maximum). Also adds context_length clamping: if the user configured a smaller context window (e.g. custom endpoint), max_tokens is clamped to context_length - 1 to avoid exceeding the window. Closes #2706	2026-03-27 13:02:52 -07:00
Teknium	fb46a90098	fix: increase API timeout default from 900s to 1800s for slow-thinking models (#3431 ) Models like GLM-5/5.1 can think for 15+ minutes. The previous 900s (15 min) default for HERMES_API_TIMEOUT killed legitimate requests. Raised to 1800s (30 min) in both places that read the env var: - _build_api_kwargs() timeout (non-streaming total timeout) - _call_chat_completions() write timeout (streaming connection) The streaming per-chunk read timeout (60s) and stale stream detector (180-300s) are unchanged — those are appropriate for inter-chunk timing.	2026-03-27 13:02:23 -07:00
Teknium	fd8c465e42	feat: add Hugging Face as a first-class inference provider (#3419 ) Salvage of PR #1747 (original PR #1171 by @davanstrien) onto current main. Registers Hugging Face Inference Providers (router.huggingface.co/v1) as a named provider: - hermes chat --provider huggingface (or --provider hf) - 18 curated open models via hermes model picker - HF_TOKEN in ~/.hermes/.env - OpenAI-compatible endpoint with automatic failover (Groq, Together, SambaNova, etc.) Files: auth.py, models.py, main.py, setup.py, config.py, model_metadata.py, .env.example, 5 docs pages, 17 new tests. Co-authored-by: Daniel van Strien <davanstrien@gmail.com>	2026-03-27 12:41:59 -07:00
Teknium	f57ebf52e9	fix(api-server): cancel orphaned agent + true interrupt on SSE disconnect (salvage #3399 ) (#3427 ) Salvage of #3399 by @binhnt92 with true agent interruption added on top. When a streaming /v1/chat/completions client disconnects mid-stream, the agent is now interrupted via agent.interrupt() so it stops making LLM API calls, and the asyncio task wrapper is cancelled. Closes #3399.	2026-03-27 11:33:19 -07:00
Teknium	5127567d5d	perf(ttft): cache skills prompt with shared skill_utils module (salvage #3366 ) (#3421 ) Two-layer caching for build_skills_system_prompt(): 1. In-process LRU (OrderedDict, max 8) — same-process: 546ms → <1ms 2. Disk snapshot (.skills_prompt_snapshot.json) — cold start: 297ms → 103ms Key improvements over original PR #3366: - Extract shared logic into agent/skill_utils.py (parse_frontmatter, skill_matches_platform, get_disabled_skill_names, extract_skill_conditions, extract_skill_description, iter_skill_index_files) - tools/skills_tool.py delegates to shared module — zero code duplication - Proper LRU eviction via OrderedDict.move_to_end + popitem(last=False) - Cache invalidation on all skill mutation paths: - skill_manage tool (in-conversation writes) - hermes skills install (CLI hub) - hermes skills uninstall (CLI hub) - Automatic via mtime/size manifest on cold start prompt_builder.py no longer imports tools.skills_tool (avoids pulling in the entire tool registry chain at prompt build time). 6301 tests pass, 0 failures. Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-27 10:54:02 -07:00
Siddharth Balyan	cc4514076b	feat(nix): add suffix PATHs during nix build for more agent-friendliness (#3274 ) * refactor: suffix runtimeDeps PATH so apt-installed tools take priority Changes makeWrapper from --prefix to --suffix. In container mode, tools installed via apt in /usr/bin now win over read-only nix store copies. Nix store versions become dead-letter fallbacks. Native NixOS mode unaffected — tools in /run/current-system/sw/bin already precede the suffix. * feat(container): first-boot apt provisioning for agent tools Installs nodejs, npm, curl via apt and uv via curl on first container boot. Uses sentinel file so subsequent boots skip. Container recreation triggers fresh install. Combined with --suffix PATH change, agents get mutable tools that support npm i -g and uv without hitting read-only nix store paths. * docs: update nixosModules header for tool provisioning * feat(container): consolidate first-boot provisioning + Python 3.11 venv Merge sudo and tool apt installs into a single apt-get update call. Move uv install outside the sentinel so transient failures retry on next boot. Bootstrap a Python 3.11 venv via uv (--seed for pip) and prepend ~/.venv/bin to PATH so agents get writable python/pip/node out of the box. --------- Co-authored-by: Hermes Agent <hermes@nousresearch.com>	2026-03-27 23:00:56 +05:30
Teknium	8ecd7aed2c	fix: prevent reasoning box from rendering 3x during tool-calling loops (#3405 ) Two independent bugs caused the reasoning box to appear three times when the model produced reasoning + tool_calls: Bug A: _build_assistant_message() re-fired reasoning_callback with the full reasoning text even when streaming had already displayed it. The original guard only checked structured reasoning_content deltas, but reasoning also arrives via content tag extraction (<REASONING_SCRATCHPAD>/<think> tags in delta.content), which went through _fire_stream_delta not _fire_reasoning_delta. Fix: skip the callback entirely when streaming is active — both paths display reasoning during the stream. Any reasoning not shown during streaming is caught by the CLI post-response fallback. Bug B: The post-response reasoning display checked _reasoning_stream_started, but that flag was reset by _reset_stream_state() during intermediate turn boundaries (when stream_delta_callback(None) fires between tool calls). Introduced _reasoning_shown_this_turn flag that persists across the tool loop and is only reset at the start of each user turn. Live-tested in PTY: reasoning now shows exactly once per API call, no duplicates across tool-calling loops.	2026-03-27 09:57:50 -07:00
Teknium	e0dbbdb2c9	fix: eliminate 'Event loop is closed' / 'Press ENTER to continue' during idle (#3398 ) The OpenAI SDK's AsyncHttpxClientWrapper.__del__ schedules aclose() via asyncio.get_running_loop().create_task(). When an AsyncOpenAI client is garbage-collected while prompt_toolkit's event loop is running (the common CLI idle state), the aclose() task runs on prompt_toolkit's loop but the underlying TCP transport is bound to a different (dead) worker loop. The transport's self._loop.call_soon() then raises RuntimeError('Event loop is closed'), which prompt_toolkit surfaces as the disruptive 'Unhandled exception in event loop ... Press ENTER to continue...' error. Three-layer fix: 1. neuter_async_httpx_del(): Monkey-patches __del__ to a no-op at CLI startup before any AsyncOpenAI clients are created. Safe because cached clients are explicitly cleaned via _force_close_async_httpx, and uncached clients' TCP connections are cleaned by the OS on exit. 2. Custom asyncio exception handler: Installed on prompt_toolkit's event loop to silently suppress 'Event loop is closed' RuntimeError. Defense-in-depth for SDK upgrades that might change the class name. 3. cleanup_stale_async_clients(): Called after each agent turn (when the agent thread joins) to proactively evict cache entries whose event loop is closed, preventing stale clients from accumulating.	2026-03-27 09:45:25 -07:00
Teknium	eb2127c1dc	fix(cron): prevent recurring job re-fire on gateway crash/restart loop (#3396 ) When a gateway crashes mid-job execution (before mark_job_run can persist the updated next_run_at), the job would fire again on every restart attempt within the grace window. For a daily 6:15 AM job with a 2-hour grace, rapidly restarting the gateway could trigger dozens of duplicate runs. Fix: call advance_next_run() BEFORE run_job() in tick(). For recurring jobs (cron/interval), this preemptively advances next_run_at to the next future occurrence and persists it to disk. If the process then crashes during execution, the job won't be considered due on restart. One-shot jobs are left unchanged — they still retry on restart since there's no future occurrence to advance to. This changes the scheduler from at-least-once to at-most-once semantics for recurring jobs, which is the correct tradeoff: missing one daily message is far better than sending it dozens of times.	2026-03-27 08:02:58 -07:00
Teknium	5a1e2a307a	perf(ttft): salvage easy-win startup optimizations from #3346 (#3395 ) * perf(ttft): dedupe shared tool availability checks * perf(ttft): short-circuit vision auto-resolution * perf(ttft): make Claude Code version detection lazy * perf(ttft): reuse loaded toolsets for skills prompt --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-27 07:49:44 -07:00
Teknium	41d9d08078	fix(telegram): fall back to no thread_id on 'Message thread not found' (#3390 ) python-telegram-bot's BadRequest inherits from NetworkError, so the send() retry loop was catching 'Message thread not found' as a transient network error and retrying 3 times before silently failing. This killed all tool progress messages, streaming responses, and typing indicators when the incoming message carried an invalid message_thread_id. Now detect BadRequest inside the NetworkError handler: - 'thread not found' + thread_id set → clear thread_id and retry once (message still reaches the chat, just without topic threading) - Other BadRequest errors → raise immediately (permanent, don't retry) - True NetworkError → retry as before (transient) 252 silent failures in gateway.log traced to this on 2026-03-26. 5 new tests for thread fallback, non-thread BadRequest, no-thread sends, network retry, and multi-chunk fallback.	2026-03-27 06:07:28 -07:00
Teknium	b7bcae49c6	fix: SQLite WAL write-lock contention causing 15-20s TUI freeze (#3385 ) Multiple hermes processes (gateway + CLI sessions + worktree agents) sharing one state.db caused WAL write-lock convoy effects. SQLite's built-in busy handler uses deterministic sleep intervals (up to 100ms) that synchronize competing writers, creating 15-20 second freezes during agent init. Root cause: timeout=30.0 with 7+ concurrent connections meant: - WAL never checkpointed (294MB, readers always blocked it) - Bloated WAL slowed all reads and writes - Deterministic backoff caused convoy effects under contention Fix: - Replace 30s SQLite timeout with 1s + app-level retry (15 attempts, random 20-150ms jitter between retries to break convoys) - Use BEGIN IMMEDIATE for explicit write-lock acquisition (fail fast) - Set isolation_level=None for manual transaction control - PASSIVE WAL checkpoint on close() and every 50 writes - All 12 write methods converted to _execute_write() helper Before: 15-20s frozen at create_session during agent init After: <1s to API call, WAL stays at ~4MB Tested: 4355 tests pass, 3 concurrent live sessions with simultaneous writes showed zero contention on every py-spy sample.	2026-03-27 05:22:57 -07:00
Teknium	915df02bbf	fix(streaming): stale stream detector race causing spurious RemoteProtocolError The stale stream detector (90s timeout) was killing healthy connections during the model's thinking phase, producing self-inflicted RemoteProtocolError ("peer closed connection without sending complete message body"). Three issues: 1. last_chunk_time was never reset between inner stream retries, so subsequent attempts inherited the previous attempt's stale budget 2. The non-streaming fallback path didn't reset the timer either 3. 90s base timeout was too aggressive for large-context Opus sessions where thinking time before first token routinely exceeds 90s Fix: reset last_chunk_time at the start of each streaming attempt and before the non-streaming fallback. Increase base timeout to 180s and scale to 300s for >100K token contexts. Made-with: Cursor	2026-03-27 04:05:51 -07:00
Teknium	75fcbc44ce	feat(telegram): auto-discover fallback IPs via DoH when api.telegram.org is unreachable (#3376 ) * feat(telegram): auto-discover fallback IPs via DoH when api.telegram.org is unreachable On some networks (university, corporate), api.telegram.org resolves to a valid Telegram IP that is unreachable due to routing/firewall rules. A different IP in the same Telegram-owned 149.154.160.0/20 block works fine. This adds automatic fallback IP discovery at connect time: 1. Query Google and Cloudflare DNS-over-HTTPS for api.telegram.org A records 2. Exclude the system-DNS IP (the unreachable one), use the rest as fallbacks 3. If DoH is also blocked, fall back to a seed list (149.154.167.220) 4. TelegramFallbackTransport tries primary first, sticks to whichever works No configuration needed — works automatically. TELEGRAM_FALLBACK_IPS env var still available as manual override. Zero impact on healthy networks (primary path succeeds on first attempt, fallback never exercised). No new dependencies (uses httpx already in deps + stdlib socket). * fix: share transport instance and downgrade seed fallback log to info - Use single TelegramFallbackTransport shared between request and get_updates_request so sticky IP is shared across polling and API calls - Keep separate HTTPXRequest instances (different timeout settings) - Downgrade "using seed fallback IPs" from warning to info to avoid noisy logs on healthy networks * fix: add telegram.request mock and discovery fixture to remaining test files The original PR missed test_dm_topics.py and test_telegram_network_reconnect.py — both need the telegram.request mock module. The reconnect test also needs _no_auto_discovery since _handle_polling_network_error calls connect() which now invokes discover_fallback_ips(). --------- Co-authored-by: Mohan Qiao <Gavin-Qiao@users.noreply.github.com>	2026-03-27 04:03:13 -07:00
Teknium	be416cdfa9	fix: guard config.get() against YAML null values to prevent AttributeError (#3377 ) dict.get(key, default) returns None — not the default — when the key IS present but explicitly set to null/~ in YAML. Calling .lower() on that raises AttributeError. Use (config.get(key) or fallback) so both missing keys and explicit nulls coalesce to the intended default. Files fixed: - tools/tts_tool.py — _get_provider() - tools/web_tools.py — _get_backend() - tools/mcp_tool.py — MCPServerTask auth config - trajectory_compressor.py — _detect_provider() and config loading Co-authored-by: dieutx <dangtc94@gmail.com>	2026-03-27 04:03:00 -07:00
Teknium	b8b1f24fd7	fix: handle addition-only hunks in V4A patch parser (#3325 ) V4A patches with only + lines (no context or - lines) were silently dropped because search_lines was empty and the 'if search_lines:' block was the only code path. Addition-only hunks are common when the model generates patches for new functions or blocks. Adds an else branch that inserts at the context_hint position when available, or appends at end of file. Includes 2 regression tests for addition-only hunks with and without context hints. Salvaged from PR #3092 by thakoreh. Co-authored-by: Hiren <hiren.thakore58@gmail.com>	2026-03-26 19:38:04 -07:00
Teknium	a2847ea7f0	fix(gateway): add media download retry to Mattermost, Slack, and base cache (#3323 ) * fix(gateway): add media download retry to Mattermost, Slack, and base cache Media downloads on Mattermost and Slack fail permanently on transient errors (timeouts, 429 rate limits, 5xx server errors). Telegram and WhatsApp already have retry logic, but these platforms had single-attempt downloads with hardcoded 30s timeouts. Changes: - base.py cache_image_from_url: add retry with exponential backoff (covers Signal and any platform using the shared cache helper) - mattermost.py _send_media_url: retry on 429/5xx/timeout (3 attempts) - slack.py _download_slack_file: retry on timeout/5xx (3 attempts) - slack.py _download_slack_file_bytes: same retry pattern * test: add tests for media download retry --------- Co-authored-by: dieutx <dangtc94@gmail.com>	2026-03-26 19:33:18 -07:00
Teknium	58ca875e19	feat(gateway): surface session config on /new, /reset, and auto-reset (#3321 ) When a new session starts in the gateway (via /new, /reset, or auto-reset), send the user a summary of the detected configuration: ✨ Session reset! Starting fresh. ◆ Model: qwen3.5:27b-q4_K_M ◆ Provider: custom ◆ Context: 8K tokens (config) ◆ Endpoint: http://localhost:11434/v1 This makes misconfigured context length immediately visible — a user running a local 8K model that falls to the 128K default will see: ◆ Context: 128K tokens (default — set model.context_length in config to override) Instead of silently getting no compression and degrading responses. - _format_session_info() resolves model, provider, context length, and endpoint from config + runtime, matching the hygiene code's resolution chain - Local/custom endpoints shown; cloud endpoints hidden (not useful) - Context source annotated: config, detected, or default with hint - Appended to /new and /reset responses, and auto-reset notifications - 9 tests covering all formatting paths and failure resilience Addresses the user-facing side of #2708 — instead of trying to fix every edge case in context detection, surface the values so users can immediately see when something is wrong.	2026-03-26 19:27:58 -07:00
Teknium	3f95e741a7	fix: validate empty user messages to prevent Anthropic API 400 errors (#3322 ) When user messages have empty content (e.g., Discord @mention-only messages, unrecognized attachments), the Anthropic API rejects the request with 'user messages must have non-empty content'. Changes: - anthropic_adapter.py: Add empty content validation for user messages (string and list formats), matching the existing pattern for assistant and tool messages. Empty content gets '(empty message)' placeholder. - discord.py: Defense-in-depth check at gateway layer to catch empty messages before they enter session history. - Add 4 regression tests covering empty string, whitespace-only, empty list, and empty text block scenarios. Fixes #3143 Co-authored-by: Bartok9 <bartok9@users.noreply.github.com>	2026-03-26 19:24:03 -07:00
Teknium	03396627a6	fix(ci): pin acp <0.9 and update retry-exhaust test (#3320 ) Two remaining CI failures: 1. agent-client-protocol 0.9.0 removed AuthMethod (replaced with AuthMethodAgent/EnvVar/Terminal). Pin to <0.9 until the new API is evaluated — our usage doesn't map 1:1 to the new types. 2. test_429_exhausts_all_retries_before_raising expected pytest.raises but the agent now catches 429s after max retries, tries fallback, then returns a result dict. Updated to check final_response.	2026-03-26 19:21:34 -07:00
Teknium	22cfad157b	fix: gateway token double-counting — use absolute set instead of increment (#3317 ) The gateway's update_session() used += for token counts, but the cached agent's session_prompt_tokens / session_completion_tokens are cumulative totals that grow across messages. Each update_session call re-added the running total, inflating usage stats with every message (1.7x after 3 messages, worse over longer conversations). Fix: change += to = for in-memory entry fields, add set_token_counts() to SessionDB that uses direct assignment instead of SQL increment, and switch the gateway to call it. CLI mode continues using update_token_counts() (increment) since it tracks per-API-call deltas — that path is unchanged. Based on analysis from PR #3222 by @zaycruz (closed). Co-authored-by: zaycruz <zay@users.noreply.github.com>	2026-03-26 19:13:07 -07:00
Teknium	867eefdd9f	fix(signal): track SSE keepalive comments as connection activity (#3316 ) signal-cli sends SSE comment lines (':') as keepalives every ~15s. The SSE listener only counted 'data:' lines as activity, so the health monitor reported false idle warnings every 2 minutes during quiet periods. Recognize ':' lines as valid activity per the SSE spec. Salvaged from PR #2938 by ticketclosed-wontfix.	2026-03-26 19:10:25 -07:00
Teknium	a8df7f9964	fix: gateway token double-counting with cached agents (#3306 ) The cached agent accumulates session_input_tokens across messages, so run_conversation() returns cumulative totals. But update_session() used += (increment), double-counting on every message after the first. - session.py: change in-memory entry updates from += to = (direct assignment for cumulative values) - hermes_state.py: add absolute=True flag to update_token_counts() that uses SET column = ? instead of SET column = column + ? - session.py: pass absolute=True to the DB call CLI path is unchanged — it passes per-API-call deltas directly to update_token_counts() with the default absolute=False (increment). Reported by @zaycruz in #3222. Closes #3222.	2026-03-26 19:04:53 -07:00
Teknium	1519c4d477	fix(session): add /resume CLI handler, session log truncation guard, reopen_session API (#3315 ) Three improvements salvaged from PR #3225 by Mibayy: 1. Add /resume slash command handler in CLI process_command(). The command was registered in the commands registry but had no handler, so typing /resume produced 'Unknown command'. The handler resolves by title or session ID, ends the current session cleanly, loads conversation history from SQLite, re-opens the target session, and syncs the AIAgent instance. Follows the same pattern as new_session(). 2. Add truncation guard in _save_session_log(). When resuming a session whose messages weren't fully written to SQLite, the agent starts with partial history and the first save would overwrite the full JSON log on disk. The guard reads the existing file and skips the write if it already has more messages than the current batch. 3. Add reopen_session() method to SessionDB. Proper API for clearing ended_at/end_reason instead of reaching into _conn directly. Note: Bug 1 from the original PR (INSERT OR IGNORE + _session_db = None) is already fixed on main — skipped as redundant. Closes #3123.	2026-03-26 19:04:28 -07:00
Teknium	005786c55d	fix(gateway): include per-platform ALLOW_ALL and SIGNAL_GROUP in startup allowlist check (#3313 ) The startup warning 'No user allowlists configured' only checked GATEWAY_ALLOW_ALL_USERS and per-platform _ALLOWED_USERS vars. It missed SIGNAL_GROUP_ALLOWED_USERS and per-platform _ALLOW_ALL_USERS vars (e.g. TELEGRAM_ALLOW_ALL_USERS), causing a false warning even when users had these configured. The actual auth check in _is_user_authorized already recognized these vars. Cherry-picked from PR #3202 by binhnt92. Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>	2026-03-26 18:23:49 -07:00
Teknium	ad764d3513	fix(auxiliary): catch ImportError from build_anthropic_client in vision auto-detection (#3312 ) _try_anthropic() caught ImportError on the module import (line 667-669) but not on the build_anthropic_client() call (line 696). When the anthropic_adapter module imports fine but the anthropic SDK is missing, build_anthropic_client() raises ImportError at call time. This escaped _try_anthropic() entirely, killing get_available_vision_backends() and cascading to 7 test failures: - 4 setup wizard tests hit unexpected 'Configure vision:' prompt - 3 codex-auth-as-vision tests failed check_vision_requirements() The fix wraps the build_anthropic_client call in try/except ImportError, returning (None, None) when the SDK is unavailable — consistent with the existing guard at the top of the function.	2026-03-26 18:21:59 -07:00
Teknium	f008ee1019	fix(session): preserve reasoning fields in rewrite_transcript (#3311 ) rewrite_transcript (used by /retry, /undo, /compress) was calling append_message without reasoning, reasoning_details, or codex_reasoning_items — permanently dropping them from SQLite. Co-authored-by: alireza78a <alireza78.crypto@gmail.com>	2026-03-26 18:18:00 -07:00
Teknium	60fdb58ce4	fix(agent): update context compressor limits after fallback activation (#3305 ) When _try_activate_fallback() switches to the fallback model, it updates the agent's model/provider/client but never touches self.context_compressor. The compressor keeps the primary model's context_length and threshold_tokens, so compression decisions use wrong limits — a 200K primary → 32K fallback still uses 200K-based thresholds, causing oversized sessions to overflow the fallback. Update the compressor's model, credentials, context_length, and threshold_tokens after fallback activation using get_model_context_length() for the new model. Cherry-picked from PR #3202 by binhnt92. Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>	2026-03-26 18:10:50 -07:00
Teknium	18d28c63a7	fix: add explicit hermes-api-server toolset for API server platform (#3304 ) The API server adapter was creating agents without specifying enabled_toolsets, causing ALL tools to load — including clarify, send_message, and text_to_speech which don't work without interactive callbacks or gateway dispatch. Changes: - toolsets.py: Add hermes-api-server toolset (core tools minus clarify, send_message, text_to_speech) - api_server.py: Resolve toolsets from config.yaml platform_toolsets via _get_platform_tools() — same path as all other gateway platforms. Falls back to hermes-api-server default when no override configured. - tools_config.py: Add api_server to PLATFORMS dict so users can customize via 'hermes tools' or platform_toolsets.api_server in config.yaml - 12 tests covering toolset definition, config resolution, and user override Reported by thatwolfieguy on Discord.	2026-03-26 18:02:26 -07:00
Teknium	3c57eaf744	fix: YAML boolean handling for tool_progress config (#3300 ) YAML 1.1 parses bare `off` as boolean False, which is falsy in Python's `or` chain and silently falls through to the 'all' default. Users setting `display.tool_progress: off` in config.yaml saw no effect — tool progress stayed on. Normalise False → 'off' before the or chain in both affected paths: - gateway/run.py _run_agent() tool progress reader - cli.py HermesCLI.__init__() tool_progress_mode Reported by @gibbsoft in #2859. Closes #2859.	2026-03-26 17:58:50 -07:00
Teknium	2d232c9991	feat(cli): configurable busy input mode + fix /queue always working (#3298 ) Two changes: 1. Fix /queue command: remove the _agent_running guard that rejected /queue after the agent finished. The prompt was deferred in _pending_input until the agent completed, then the handler checked _agent_running (now False) and rejected it. /queue now always queues regardless of timing. 2. Add display.busy_input_mode config (CLI-only): - 'interrupt' (default): Enter while busy interrupts the current run (preserves existing behavior) - 'queue': Enter while busy queues the message for the next turn, with a 'Queued for the next turn: ...' confirmation Ctrl+C always interrupts regardless of this setting. Salvaged from PR #3037 by StefanoChiodino. Key differences: - Default is 'interrupt' (preserves existing behavior) not 'queue' - No config version bump (unnecessary for new key in existing section) - Simpler normalization (no alias map) - /queue fix is simpler: just remove the guard instead of intercepting commands during busy state	2026-03-26 17:58:40 -07:00
Teknium	0375b2a0d7	fix(gateway): silence background agent terminal output (#3297 ) * fix(gateway): silence flush agent terminal output quiet_mode=True only suppresses AIAgent init messages. Tool call output still leaks to the terminal through _safe_print → _print_fn during session reset/expiry. Since #2670 injected live memory state into the flush prompt, the flush agent now reliably calls memory tools — making the output leak noticeable for the first time. Set _print_fn to a no-op so the background flush is fully silent. * test(gateway): add test for flush agent terminal silence + fix dotenv mock - Add TestFlushAgentSilenced: verifies _print_fn is set to a no-op on the flush agent so tool output never leaks to the terminal - Fix pre-existing test failures: replace patch('run_agent.AIAgent') with sys.modules mock to avoid importing run_agent (requires openai) - Add autouse _mock_dotenv fixture so all tests in this file run without the dotenv package installed * fix(display): route KawaiiSpinner output through print_fn to fully silence flush agent The previous fix set tmp_agent._print_fn = no-op on the flush agent but spinner output and quiet-mode cute messages bypassed _print_fn entirely: - KawaiiSpinner captured sys.stdout at __init__ and wrote directly to it - quiet-mode tool results used builtin print() instead of _safe_print() Add optional print_fn parameter to KawaiiSpinner.__init__; _write routes through it when set. Pass self._print_fn to all spinner construction sites in run_agent.py and change the quiet-mode cute message print to _safe_print. The existing gateway fix (tmp_agent._print_fn = lambda) now propagates correctly through both paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(gateway): silence hygiene and compression background agents Two more background AIAgent instances in the gateway were created with quiet_mode=True but without _print_fn = no-op, causing tool output to leak to the terminal: - _hyg_agent (in-turn hygiene memory agent) - tmp_agent (_compress_context path) Apply the same _print_fn no-op pattern used for the flush agent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore(display): remove unused _last_flush_time from KawaiiSpinner Attribute was set but never read; upstream already removed it. Leftover from conflict resolution during rebase onto upstream/main. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-26 17:40:31 -07:00
Teknium	08fa326bb0	feat(gateway): deliver background review notifications to user chat (#3293 ) The background memory/skill review (_spawn_background_review) runs after the agent response when turn/iteration counters exceed their thresholds. It saves memories and skills, then prints a summary like '💾 Memory updated · User profile updated'. In CLI mode this goes to the terminal via _safe_print. In gateway mode, _safe_print routes to print() which goes to stdout — invisible to the user. Add a background_review_callback attribute to AIAgent. When set, the background review thread calls it with the summary string after saves complete. The gateway wires this to adapter.send() via the same run_coroutine_threadsafe bridge used by status_callback, delivering the notification to the user's chat.	2026-03-26 17:38:24 -07:00
Teknium	bde45f5a2a	fix(gateway): retry transient send failures and notify user on exhaustion (#3288 ) When send() fails due to a network error (ConnectError, ReadTimeout, etc.), the failure was silently logged and the user received no feedback — appearing as a hang. In one reported case, a user waited 1+ hour for a response that had already been generated but failed to deliver (#2910). Adds _send_with_retry() to BasePlatformAdapter: - Transient errors: retry up to 2x with exponential backoff + jitter - On exhaustion: send delivery-failure notice so user knows to retry - Permanent errors: fall back to plain-text version (preserves existing behavior) - SendResult.retryable flag for platform-specific transient errors All adapters benefit automatically via BasePlatformAdapter inheritance. Cherry-picked from PR #3108 by Mibayy. Co-authored-by: Mibayy <mibayy@users.noreply.github.com>	2026-03-26 17:37:10 -07:00
Teknium	716e616d28	fix(tui): status bar duplicates and degrades during long sessions (#3291 ) shutil.get_terminal_size() can return stale/fallback values on SSH that differ from prompt_toolkit's actual terminal width. Fragments built for the wrong width overflow and wrap onto a second line (wrap_lines=True default), appearing as progressively degrading duplicates. - Read width from get_app().output.get_size().columns when inside a prompt_toolkit TUI, falling back to shutil outside TUI context - Add wrap_lines=False on the status bar Window as belt-and-suspenders guard against any future width mismatch Closes #3130 Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>	2026-03-26 17:33:11 -07:00
Teknium	bdccdd67a1	fix: OpenClaw migration overwrites defaults and setup wizard skips imported sections (#3282 ) Two bugs caused the OpenClaw migration during first-time setup to be ineffective, forcing users to reconfigure everything manually: 1. The setup wizard created config.yaml with all defaults BEFORE running the migration, then the migrator ran with overwrite=False. Every config setting was reported as a 'conflict' against the defaults and skipped. Fix: use overwrite=True during setup-time migration (safe because only defaults exist at that point). The hermes claw migrate CLI command still defaults to overwrite=False for post-setup use. 2. After migration, the full setup wizard ran all 5 sections unconditionally, forcing the user through model/terminal/agent/messaging/tools configuration even when those settings were just imported. Fix: add _get_section_config_summary() and _skip_configured_section() helpers. After migration, each section checks if it's already configured (API keys present, non-default values, platform tokens) and offers 'Reconfigure? [y/N]' with default No. Unconfigured sections still run normally. Reported by Dev Bredda on social media.	2026-03-26 16:29:38 -07:00
Teknium	148f46620f	fix(matrix): add backoff for SyncError in sync loop (#3280 ) When the homeserver returns an error response, matrix-nio parses it as a SyncError return value rather than raising an exception. The sync loop only had backoff in the except handler, so SyncError caused a tight retry loop (~489 req/s) flooding logs and hammering the homeserver. Check the return value and sleep 5s before retry. Cherry-picked from PR #2937 by ticketclosed-wontfix. Co-authored-by: ticketclosed-wontfix <ticketclosed-wontfix@users.noreply.github.com>	2026-03-26 16:19:58 -07:00
Teknium	6610c377ba	fix(telegram): self-reschedule reconnect when start_polling fails (#3268 ) After a Telegram 502, _handle_polling_network_error calls updater.stop() then start_polling(). If start_polling() also raises, the old code logged a warning and returned — but the comment 'The next network error will trigger another attempt' was wrong. The updater loop is dead after stop(), so no further error callbacks ever fire. The gateway stays alive but permanently deaf to messages. Fix: when start_polling() fails in the except branch, schedule a new _handle_polling_network_error task to continue the exponential backoff retry chain. The task is tracked in _background_tasks (preventing GC). Guarded by has_fatal_error to avoid spurious retries during shutdown. Closes #3173. Salvaged from PR #3177 by Mibayy.	2026-03-26 15:34:33 -07:00
Teknium	e5d14445ef	fix(security): restrict subagent toolsets to parent's enabled set (#3269 ) The delegate_task tool accepts a toolsets parameter directly from the LLM's function call arguments. When provided, these toolsets are passed through _strip_blocked_tools but never intersected with the parent agent's enabled_toolsets. A model can request toolsets the parent does not have (e.g., web, browser, rl), granting the subagent tools that were explicitly disabled for the parent. Intersect LLM-requested toolsets with the parent's enabled set before applying the blocked-tool filter, so subagents can only receive a subset of the parent's tools. Co-authored-by: dieutx <dangtc94@gmail.com>	2026-03-26 14:50:26 -07:00
Teknium	72250b5f62	feat: config-gated /verbose command for messaging gateway (#3262 ) * feat: config-gated /verbose command for messaging gateway Add gateway_config_gate field to CommandDef, allowing cli_only commands to be conditionally available in the gateway based on a config value. - CommandDef gains gateway_config_gate: str | None — a config dotpath that, when truthy, overrides cli_only for gateway surfaces - /verbose uses gateway_config_gate='display.tool_progress_command' - Default is off (cli_only behavior preserved) - When enabled, /verbose cycles tool_progress mode (off/new/all/verbose) in the gateway, saving to config.yaml — same cycle as the CLI - Gateway helpers (help, telegram menus, slack mapping) dynamically check config to include/exclude config-gated commands - GATEWAY_KNOWN_COMMANDS always includes config-gated commands so the gateway recognizes them and can respond appropriately - Handles YAML 1.1 bool coercion (bare 'off' parses as False) - 8 new tests for the config gate mechanism + gateway handler * docs: document gateway_config_gate and /verbose messaging support - AGENTS.md: add gateway_config_gate to CommandDef fields - slash-commands.md: note /verbose can be enabled for messaging, update Notes - configuration.md: add tool_progress_command to display section + usage note - cli.md: cross-link to config docs for messaging enablement - messaging/index.md: show tool_progress_command in config snippet - plugins.md: add gateway_config_gate to register_command parameter table	2026-03-26 14:41:04 -07:00
Teknium	243ee67529	fix: store asyncio task references to prevent GC mid-execution (#3267 ) Python's asyncio event loop holds only weak references to tasks. Without a strong reference, the garbage collector can destroy a task while it's awaiting I/O — silently dropping messages. Python 3.12+ made this more aggressive. Audit of all gateway platform adapters found 6 untracked create_task calls across 6 files: Per-message tasks (tracked via _background_tasks set from base class): - gateway/platforms/webhook.py: handle_message task - gateway/platforms/sms.py: handle_message task - gateway/platforms/signal.py: SSE response aclose task Long-running infrastructure tasks (stored in named instance vars): - gateway/platforms/slack.py: Socket Mode handler (_socket_mode_task) - gateway/platforms/discord.py: bot client (_bot_task) - gateway/platforms/whatsapp.py: message poll loop (_poll_task, 2 sites) All other adapters (telegram, mattermost, matrix, email, homeassistant, dingtalk) already tracked their tasks correctly. Salvaged from PR #3160 by memosr — expanded from 1 file to 6.	2026-03-26 14:36:24 -07:00
Teknium	3a86328847	fix(gateway): add request timeouts to HA, Email, Mattermost, SMS adapters (#3258 ) Add timeout=30 to all bare ClientSession, IMAP4_SSL, smtplib.SMTP, and ws_connect calls that previously had no timeout, preventing indefinite hangs when an external server is slow or unresponsive. Adapters hardened: - HomeAssistant: REST + WS session creation, ws_connect handshake - Email: all IMAP4_SSL (x2) and smtplib.SMTP (x3) calls - Mattermost: session creation, _api_get, _api_post, _upload_file (60s) - SMS: session creation in connect() + fallback session in send() Salvaged from PRs #3161, #3168, #3170 (memosr) and #3201 (binhnt92). SMS fallback ClientSession on send() also patched (missed in #3201). Co-authored-by: memosr <memosr@users.noreply.github.com> Co-authored-by: nguyen binh <binhnt92@users.noreply.github.com>	2026-03-26 14:36:07 -07:00
Teknium	db241ae6ce	feat(sessions): add --source flag for third-party session isolation (#3255 ) When third-party tools (Paperclip orchestrator, etc.) spawn hermes chat as a subprocess, their sessions pollute user session history and search. - hermes chat --source <tag> (also HERMES_SESSION_SOURCE env var) - exclude_sources parameter on list_sessions_rich() and search_messages() - Sessions with source=tool hidden from sessions list/browse/search - Third-party adapters pass --source tool to isolate agent sessions Cherry-picked from PR #3208 by HenkDz. Co-authored-by: Henkey <noonou7@gmail.com>	2026-03-26 14:35:31 -07:00
Teknium	41ee207a5e	fix: catch KeyboardInterrupt in exit cleanup handlers (#3257 ) except Exception does not catch KeyboardInterrupt (inherits from BaseException). A second Ctrl+C during exit cleanup aborts pending writes — Honcho observations dropped, SQLite sessions left unclosed, cron job sessions never marked ended. Changed to except (Exception, KeyboardInterrupt) at all five sites: - cli.py: honcho.shutdown() and end_session() in finally exit block - run_agent.py: _flush_honcho_on_exit atexit handler - cron/scheduler.py: end_session() and close() in job finally block Tests exercise the actual production code paths and confirm KeyboardInterrupt propagates without the fix. Co-authored-by: dieutx <dangtc94@gmail.com>	2026-03-26 14:34:31 -07:00
Teknium	e9e7fb0683	fix(gateway): track background task references in GatewayRunner (#3254 ) Asyncio tasks created with create_task() but never stored can be garbage collected mid-execution. Add self._background_tasks set to hold references, with add_done_callback cleanup. Tracks: - /background command task - session-reset memory flush task - session-resume memory flush task Cancel all pending tasks in stop(). Update test fixtures that construct GatewayRunner via object.__new__() to include the new _background_tasks attribute. Cherry-picked from PR #3167 by memosr. The original PR also deleted the DM topic auto-skill loading code — that deletion was excluded from this salvage as it removes a shipped feature (#2598). Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>	2026-03-26 14:33:48 -07:00
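
The task-tracking pattern described in the last entry above (e9e7fb0683, and 243ee67529 before it) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `TaskTracker` class and the `spawn` and `demo` names are hypothetical, and only the `_background_tasks` set, the `add_done_callback` cleanup, and cancellation in `stop()` come from the commit messages.

```python
import asyncio


class TaskTracker:
    """Hypothetical helper that keeps strong references to background tasks."""

    def __init__(self) -> None:
        # The event loop only holds weak references to tasks, so keep our own.
        self._background_tasks: set[asyncio.Task] = set()

    def spawn(self, coro) -> asyncio.Task:
        task = asyncio.create_task(coro)
        self._background_tasks.add(task)
        # Drop the reference once the task finishes so the set does not grow.
        task.add_done_callback(self._background_tasks.discard)
        return task

    async def stop(self) -> None:
        # Cancel whatever is still pending and wait for the cancellations to land.
        pending = [t for t in self._background_tasks if not t.done()]
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)


async def demo() -> tuple[int, bool, int]:
    tracker = TaskTracker()
    quick = tracker.spawn(asyncio.sleep(0, result="done"))
    slow = tracker.spawn(asyncio.sleep(60))
    await quick
    await asyncio.sleep(0)  # give the done callback a chance to run
    tracked_after_quick = len(tracker._background_tasks)
    await tracker.stop()
    await asyncio.sleep(0)
    return tracked_after_quick, slow.cancelled(), len(tracker._background_tasks)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Using `set.discard` as the done callback means a finished task removes itself, so the set only ever holds in-flight work and `stop()` has a short list to cancel.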

2759 Commits