hermes-agent

Author	SHA1	Message	Date
Teknium	5b29ff50f8	fix(logging): extract useful info from HTML error pages, dump debug on max retries Three problems with API error debugging: 1. Terminal showed str(error)[:200] — raw HTML gibberish for Cloudflare 502/503 pages instead of "502 Bad Gateway" 2. errors.log dumped the entire HTML page as unstructured text 3. _dump_api_request_debug was never called when retries exhausted, only for non-retryable 4xx errors Adds _summarize_api_error() that extracts <title> and Cloudflare Ray ID from HTML error pages, and falls back to SDK error body messages. Now the terminal shows clean one-liners like: 📝 Error: HTTP 502 — openrouter.ai \| 502: Bad gateway — Ray 9e226... Also calls _dump_api_request_debug on max_retries_exhausted so the full request context is written to ~/.hermes/sessions/ for post-mortem. Made-with: Cursor	2026-03-25 18:36:04 -07:00
Teknium	7258311710	fix: stop recursive AGENTS.md walk, load top-level only (#3110 ) The recursive os.walk for AGENTS.md in subdirectories was undesired. Only load AGENTS.md from the working directory root, matching the behavior of CLAUDE.md and .cursorrules.	2026-03-25 18:30:45 -07:00
Teknium	910ec7eb38	chore: remove unused Hermes-native PKCE OAuth flow (#3107 ) Remove run_hermes_oauth_login(), refresh_hermes_oauth_token(), read_hermes_oauth_credentials(), _save_hermes_oauth_credentials(), _generate_pkce(), and associated constants/credential file path. This code was added in `63e88326` but never wired into any user-facing flow (setup wizard, hermes model, or any CLI command). Neither clawdbot/OpenClaw nor opencode implement PKCE for Anthropic — both use setup-token or API keys. Dead code that was never tested in production. Also removes the credential resolution step that checked ~/.hermes/.anthropic_oauth.json (step 3 in resolve_anthropic_token), renumbering remaining steps.	2026-03-25 18:29:47 -07:00
Teknium	4b45f65858	fix: update api_key in _try_activate_fallback for subagent auth (#3103 ) When fallback activates (e.g. minimax → OpenRouter), self.provider, self.base_url, self.api_mode, and self._client_kwargs were all updated but self.api_key was not. delegate_tool.py reads parent_agent.api_key to pass credentials to child agents, so subagents inherited the stale pre-fallback key (e.g. a minimax key sent to OpenRouter), causing 401 Missing Authentication errors. Add self.api_key = ... in both the anthropic_messages and chat_completions branches of _try_activate_fallback().	2026-03-25 18:23:03 -07:00
Teknium	b374f52063	fix(session): clear compressor summary and turn counter on /clear and /new (#3102 ) reset_session_state() was missing two fields added after it was written: - _user_turn_count: kept accumulating across sessions, affecting flush_min_turns guard behavior - context_compressor._previous_summary: old session's compression summary leaked into new session's iterative compression Cherry-picked from PR #2640 by dusterbloom. Closes #2635.	2026-03-25 18:22:21 -07:00
Teknium	bd43a43f07	fix(cli): handle EOFError in sessions delete/prune confirmation prompts (#3101 ) sessions delete and prune call input() for confirmation without catching EOFError. When stdin isn't a TTY (piped input, CI/CD, cron), input() throws EOFError and the command crashes. Extract a _confirm_prompt() helper that handles EOFError and KeyboardInterrupt, defaulting to cancel. Both call sites now use it. Salvaged from PR #2622 by dieutx (improved from duplicated try/except to shared helper). Closes #2565.	2026-03-25 18:06:04 -07:00
Teknium	432ba3b709	fix: use sys.executable for pip in update commands to fix PEP 668 (#3099 ) The update commands called bare 'pip' as fallback when uv wasn't found. On modern Debian/Ubuntu enforcing PEP 668, this resolves to system pip which refuses to install in an externally-managed environment. Use sys.executable -m pip to ensure the venv's pip is used. Fixed in both cmd_update and _update_via_zip (the PR only caught one instance). Salvaged from PR #2655 by devorun. Fixes #2648.	2026-03-25 17:52:59 -07:00
Teknium	712cebc40f	fix(logging): show HTTP status code and 400 body in API error output (#3096 ) When an API call fails, the terminal output now includes the HTTP status code in the header line and, for 400 errors, the response body from the provider (truncated to 300 chars). Makes it much easier to diagnose issues like invalid model names or malformed requests that were previously hidden behind generic error messages. Salvaged from PR #2646 by Mibayy. Fixes #2644.	2026-03-25 17:47:55 -07:00
Teknium	45f57c2012	feat(models): add glm-5-turbo to zai provider model list (#3095 ) Cherry-picked from PR #2542 by ReqX. Adds glm-5-turbo to the direct zai provider curated model list so /model zai:glm-5-turbo validates correctly. The model was already in _OPENROUTER_UPSTREAM_MODELS but missing from the direct provider list.	2026-03-25 17:42:25 -07:00
Teknium	41081d718c	fix(cli): prevent update crash in non-TTY environments (#3094 ) cmd_update calls input() unconditionally during config migration. In headless environments (Telegram gateway, systemd), there's no TTY, so input() throws EOFError and the update crashes. Guard with sys.stdin.isatty(), default to skipping the migration prompt when non-interactive. Salvaged from PR #2850 by devorun. Closes #2848.	2026-03-25 17:34:20 -07:00
ctlst	281100e2df	fix(agent): prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode (#2701 ) In gateway mode, async tools (vision_analyze, web_extract, session_search) deadlock because _run_async() spawns a thread with asyncio.run(), creating a new event loop, but _get_cached_client() returns an AsyncOpenAI client bound to a different loop. httpx.AsyncClient cannot work across event loop boundaries, causing await client.chat.completions.create() to hang forever. Fix: include the event loop identity in the async client cache key so each loop gets its own AsyncOpenAI instance. Also fix session_search_tool.py which had its own broken asyncio.run()-in-thread pattern — now uses the centralized _run_async() bridge.	2026-03-25 17:31:56 -07:00
Teknium	0d7f739675	fix(setup): use explicit key mapping for returning-user menu dispatch instead of positional index (#3083 ) Co-authored-by: ygd58 <buraysandro9@gmail.com>	2026-03-25 17:14:43 -07:00
Teknium	9783c9d5c1	refactor: remove /model slash command from CLI and gateway (#3080 ) The /model command is removed from both the interactive CLI and messenger gateway (Telegram/Discord/Slack/WhatsApp). Users can still change models via 'hermes model' CLI subcommand or by editing config.yaml directly. Removed: - CommandDef entry from COMMAND_REGISTRY - CLI process_command() handler and model autocomplete logic - Gateway _handle_model_command() and dispatch - SlashCommandCompleter model_completer_provider parameter - Two-stage Tab completion and ghost text for /model - All /model-specific tests Unaffected: - /provider command (read-only, shows current model + providers) - ACP adapter _cmd_model (separate system for VS Code/Zed/JetBrains) - model_switch.py module (used by ACP) - 'hermes model' CLI subcommand Author: Teknium	2026-03-25 17:03:05 -07:00
Teknium	0cfc1f88a3	fix: add MCP tool name collision protection (#3077 ) - Registry now warns when a tool name is overwritten by a different toolset (silent dict overwrite was the previous behavior) - MCP tool registration checks for collisions with non-MCP (built-in) tools before registering. If an MCP tool's prefixed name matches an existing built-in, the MCP tool is skipped and a warning is logged. MCP-to-MCP collisions are allowed (last server wins). - Both regular MCP tools and utility tools (resources/prompts) are guarded. - Adds 5 tests covering: registry overwrite warning, same-toolset re-registration silence, built-in collision skip, normal registration, and MCP-to-MCP collision pass-through. Reported by k_sze (KONG) — MiniMax MCP server's web_search tool could theoretically shadow Hermes's built-in web_search if prefixing failed.	2026-03-25 16:52:04 -07:00
Teknium	3bc953a666	fix(security): bump dependencies to fix CVEs + regenerate uv.lock (#3073 ) * fix(security): bump dependencies to fix 7 CVEs Python (pyproject.toml): - requests >=2.33.0: CVE-2026-25645 - PyJWT >=2.12.0: CVE-2026-32597 Transitive Python CVEs (require lock file or upstream fix): - cbor2 5.8.0: CVE-2026-26209 (via modal) - pygments 2.19.2: CVE-2026-4539 (via rich) - pynacl 1.5.0: CVE-2025-69277 (via discord.py) NPM (package-lock.json via npm audit fix): - basic-ftp: CRITICAL path traversal (GHSA-5rq4-664w-9x2c) - fast-xml-parser: HIGH stack overflow + entity expansion - undici: HIGH CRLF injection, memory DoS, smuggling - minimatch: HIGH ReDoS Remaining: lodash moderate prototype pollution in @appium/logger (upstream fix needed). * chore: regenerate uv.lock for CVE version bumps uv lock after requests >=2.33.0 and PyJWT >=2.12.0 minimum bumps. Without this, uv sync --locked fails because the old lock pinned requests==2.32.5 and pyjwt==2.11.0 (below new minimums). --------- Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>	2026-03-25 16:43:21 -07:00
Teknium	bd6b138e85	fix: clean up HTML error messages in CLI display (#3069 ) When API calls fail with HTML error pages (e.g., CloudFlare errors), the CLI was dumping raw HTML content to users like: 📝 Error: <!DOCTYPE html><!--[if lt IE 7]> <html class="no-js ie6... This commit adds a _clean_error_message() utility method that: - Detects HTML content and replaces with user-friendly message - Collapses multiline errors to single line - Truncates overly long errors (>150 chars) - Preserves meaningful error text for regular errors Applied to all user-facing error displays: - API call failure messages (line 6314) - Interrupt error responses (line 6324) - Invalid response error messages (line 6000) Before: 📝 Error: <!DOCTYPE html><!--[if lt IE 7]>... After: 📝 Error: Service temporarily unavailable (HTML error page returned)	2026-03-25 16:39:22 -07:00
Teknium	9792bde31a	fix(agent): count compression restarts toward retry limit (#3070 ) When context overflow triggers compression, the outer retry loop restarts via continue without incrementing retry_count. If compression reduces messages but not enough to fit the context window, this creates an infinite loop burning API credits: API call → overflow → compress → retry → overflow → compress → ... Increment retry_count on compression restarts so the loop exits after max_retries total attempts. Cherry-picked from PR #2766 by dieutx.	2026-03-25 16:35:17 -07:00
Teknium	9d1e13019e	fix(cli): prevent TypeError on startup when base_url is None (#3068 ) Description This PR fixes the startup crash introduced in v0.4.0 where `self.base_url` being `None` throws a `TypeError`. Root Cause: At `cli.py:1108`, a membership check (`"openrouter.ai" in self.base_url`) is performed. If a user's config doesn't explicitly set a `base_url` (meaning it's `None`), Python raises a `TypeError: argument of type 'NoneType' is not iterable`, causing the entire CLI to crash on boot. Fix: Added a simple truthiness guard (`if self.base_url and ...`) to ensure the membership check only occurs if `base_url` is a valid string. Closes #2842 Co-authored-by: devorun <130918800+devorun@users.noreply.github.com>	2026-03-25 16:21:00 -07:00
Teknium	37cabc47d3	test(skills): add regression tests for null metadata frontmatter Covers the case where a SKILL.md has `metadata:` (null) or `metadata.hermes:` (null), which caused an AttributeError before the fix in `d218cf91`. Made-with: Cursor	2026-03-25 16:09:27 -07:00
Teknium	f7f30aaab9	fix(streaming): detect and kill stale SSE connections Adds a wall-clock stale stream detector (HERMES_STREAM_STALE_TIMEOUT, default 90s) that force-closes the httpx client when no real chunks arrive, even if SSE keep-alive pings keep the socket alive. Works with the existing streaming retry loop to recover via fresh connection. Made-with: Cursor	2026-03-25 16:07:05 -07:00
Teknium	d218cf9118	fix(skills): handle null metadata in skill frontmatter frontmatter.get("metadata", {}) returns None (not {}) when the key exists with a null value, crashing build_skills_system_prompt with AttributeError: 'NoneType' object has no attribute 'get'. Made-with: Cursor	2026-03-25 16:06:15 -07:00
Teknium	841401f588	feat(cli): preserve user input on multiline paste (#3065 ) When pasting 5+ lines, the CLI previously replaced the entire input buffer with a file reference placeholder. If the user had already typed a question, it was lost. Fix: move paste collapsing into handle_paste (BracketedPaste handler) so only the pasted content is saved to file. The placeholder is inserted at the cursor position, preserving existing buffer text. Also fixes: - Multi-ref expansion on submit (re.sub instead of re.match) so multiple paste blocks and surrounding text are all preserved - Double-collapse prevention via _paste_just_collapsed flag - Consistent Unicode arrow character across all paste paths Salvaged from PR #2607 by crazywriter1 (option B: core fix only, without keybinding overrides for solid-object navigation/deletion).	2026-03-25 16:00:36 -07:00
Teknium	77bcaba2d7	refactor: consolidate get_hermes_home() and parse_reasoning_effort() (#3062 ) Centralizes two widely-duplicated patterns into hermes_constants.py: 1. get_hermes_home() — Path resolution for ~/.hermes (HERMES_HOME env var) - Was copy-pasted inline across 30+ files as: Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")) - Now defined once in hermes_constants.py (zero-dependency module) - hermes_cli/config.py re-exports it for backward compatibility - Removed local wrapper functions in honcho_integration/client.py, tools/website_policy.py, tools/tirith_security.py, hermes_cli/uninstall.py 2. parse_reasoning_effort() — Reasoning effort string validation - Was copy-pasted in cli.py, gateway/run.py, cron/scheduler.py - Same validation logic: check against (xhigh, high, medium, low, minimal, none) - Now defined once in hermes_constants.py, called from all 3 locations - Warning log for unknown values kept at call sites (context-specific) 31 files changed, net +31 lines (125 insertions, 94 deletions) Full test suite: 6179 passed, 0 failed	2026-03-25 15:54:28 -07:00
Teknium	e0cfc089da	fix(gateway/slack): send progress messages to correct thread (#3063 ) Co-authored-by: Jneeee <jneeee@outlook.com>	2026-03-25 15:51:15 -07:00
Siddharth Balyan	7126524e8d	remove config drift check for nix (#3061 )	2026-03-25 15:46:29 -07:00
Teknium	f83c27e26f	feat(skills): add Docker management skill to optional-skills (#3060 ) Docker CLI reference covering containers, images, Compose, volumes, networks, troubleshooting, and Dockerfile optimization. Placed in optional-skills/devops/ since it's a documentation-only skill with no external dependencies beyond Docker CLI. Based on PR #3032 by @sprmn24. Moved from skills/ to optional-skills/ and trimmed the description to be concise. Co-authored-by: sprmn24 <sprmn24@users.noreply.github.com>	2026-03-25 15:32:25 -07:00
Teknium	ab548a9b5e	fix(security): add SSRF protection to browser_navigate (#3058 ) * fix(security): add SSRF protection to browser_navigate browser_navigate() only checked the website blocklist policy but did not call is_safe_url() to block private/internal addresses. This allowed the agent to navigate to localhost, cloud metadata endpoints (169.254.169.254), and private network IPs via the browser. web_tools and vision_tools already had this check. Added the same is_safe_url() pre-flight validation before the blocklist check in browser_navigate(). * fix: move SSRF import to module level, fix policy test mock Move is_safe_url import to module level so it can be monkeypatched in tests. Update test_browser_navigate_returns_policy_block to mock _is_safe_url so the SSRF check passes and the policy check is reached. * fix(security): harden browser SSRF protection Follow-up to cherry-picked PR #3041: 1. Fail-closed fallback: if url_safety module can't import, block all URLs instead of allowing all. Security guards should never fail-open. 2. Post-redirect SSRF check: after navigation, verify the final URL isn't a private/internal address. If a public URL redirected to 169.254.169.254 or localhost, navigate to about:blank and return an error — prevents the model from reading internal content via subsequent browser_snapshot calls. --------- Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>	2026-03-25 15:16:57 -07:00
Teknium	73e66eb3c0	fix(gateway): thread-safe SessionStore — protect _entries with threading.Lock (#3052 ) SessionStore._entries was read and mutated without synchronisation, causing race conditions when multiple platforms (Telegram + Discord) received messages concurrently on the same gateway process. Two threads could simultaneously pass the session_key check and create duplicate sessions for the same user, splitting conversation history. - Added threading.Lock to protect all _entries / _loaded mutations - Split _ensure_loaded() into public wrapper + internal _ensure_loaded_locked() - SQLite I/O is performed outside the lock to avoid blocking during slow disk operations - _save() stays inside the lock since it reads _entries for serialization Cherry-picked from PR #3012 by Kewe63. Removed unrelated changes (delivery.py case-sensitivity, hermes_state.py schema tracking) and stripped the UTC timezone switch to keep the change focused on threading. Co-authored-by: Kewe63 <Kewe63@users.noreply.github.com>	2026-03-25 15:15:37 -07:00
Teknium	14cf2d85ca	fix(display): guard isatty() against closed streams via _is_tty property (#3056 ) In gateway/Telegram mode, the stdout fd can be closed by executor thread cleanup. KawaiiSpinner.stop() called isatty() on the closed fd, raising ValueError and masking the original error. Instead of a point fix, add a _is_tty property that centralizes the closed-stream guard — both _animate() and stop() now use it. Follows the same (ValueError, OSError) pattern already in _write(). Inspired by PR #2632 by bot-deo88.	2026-03-25 15:15:15 -07:00
Teknium	8bb1d15da4	chore: remove ~100 unused imports across 55 files (#3016 ) Automated cleanup via pyflakes + autoflake with manual review. Changes: - Removed unused stdlib imports (os, sys, json, pathlib.Path, etc.) - Removed unused typing imports (List, Dict, Any, Optional, Tuple, Set, etc.) - Removed unused internal imports (hermes_cli.auth, hermes_cli.config, etc.) - Fixed cli.py: removed 8 shadowed banner imports (imported from hermes_cli.banner then immediately redefined locally — only build_welcome_banner is actually used) - Added noqa comments to imports that appear unused but serve a purpose: - Re-exports (gateway/session.py SessionResetPolicy, tools/terminal_tool.py is_interrupted/_interrupt_event) - SDK presence checks in try/except (daytona, fal_client, discord) - Test mock targets (auxiliary_client.py Path, mcp_config.py get_hermes_home) Zero behavioral changes. Full test suite passes (6162/6162, 2 pre-existing streaming test failures unrelated to this change).	2026-03-25 15:02:03 -07:00
Teknium	861624d4e9	fix(cli): refresh TUI before background task output to prevent status bar overlap (#3048 ) When a background task (/bg command) prints its output while the main agent is processing with the thinking spinner visible, the status bar could render on the same row as the spinner, causing visual overlap. This fix adds an explicit app.invalidate() call with a brief pause before printing background task output, ensuring the TUI layout is in a consistent state before the output is written. Changes: - Add TUI refresh before success output in _handle_background_command - Add TUI refresh before error output in the exception handler - Add tests for the refresh behavior Closes #2718 Co-authored-by: Bartok9 <bartokmagic@proton.me>	2026-03-25 15:00:33 -07:00
Teknium	e4033b2baf	fix(cli): catch KeyboardInterrupt during flush_memories on exit (#3025 ) KeyboardInterrupt inherits from BaseException, not Exception, so the except Exception: clauses wrapping flush_memories() on exit paths silently skipped the flush when the user pressed Ctrl+C. This could lose conversation memory. Change both call sites to except (Exception, KeyboardInterrupt): so the memory flush is attempted even during interrupt. Salvaged from PR #2855 by RufusLin (dropped unrelated bundled changes).	2026-03-25 12:47:51 -07:00
Teknium	94e3d9adbf	fix(agent): restore safe non-streaming fallback after stream failures (#3020 ) After streaming retries are exhausted on transient errors, fall back to non-streaming instead of propagating the error. Also fall back for any other pre-delivery stream error (not just 'streaming not supported'). Added user-facing message when streaming is not supported by a model/ provider, directing users to set display.streaming: false in config.yaml to avoid the fallback delay. Cherry-picked from PR #3008 by kshitijk4poor. Added UX message for streaming-not-supported detection. Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-25 12:46:04 -07:00
Teknium	0dcd6ab2f2	fix: status bar shows 26K instead of 260K for token counts with trailing zeros (#3024 ) format_token_count_compact() used unconditional rstrip("0") to clean up decimal trailing zeros (e.g. "1.50" → "1.5"), but this also stripped meaningful trailing zeros from whole numbers ("260" → "26", "100" → "1"). Guard the strip behind a decimal-point check. Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-25 12:45:58 -07:00
Siddharth Balyan	b6461903ff	feat: nix flake — uv2nix build, NixOS module, persistent container mode (#20 ) * feat: nix flake, uv2nix build, dev shell and home manager * fixed nix run, updated docs for setup * feat(nix): NixOS module with persistent container mode, managed guards, checks - Replace homeModules.nix with nixosModules.nix (two deployment modes) - Mode A (native): hardened systemd service with ProtectSystem=strict - Mode B (container): persistent Ubuntu container with /nix/store bind-mount, identity-hash-based recreation, GC root protection, symlink-based updates - Add HERMES_MANAGED guards blocking CLI config mutation (config set, setup, gateway install/uninstall) when running under NixOS module - Add nix/checks.nix with build-time verification (binary, CLI, managed guard) - Remove container.nix (no Nix-built OCI image; pulls ubuntu:24.04 at runtime) - Simplify packages.nix (drop fetchFromGitHub submodules, PYTHONPATH wrappers) - Rewrite docs/nixos-setup.md with full options reference, container architecture, secrets management, and troubleshooting guide Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Update config.py * feat(nix): add CI workflow and enhanced build checks - GitHub Actions workflow for nix flake check + build on linux/macOS - Entry point sync check to catch pyproject.toml drift - Expanded managed-guard check to cover config edit - Wrap hermes-acp binary in Nix package - Fix Path type mismatch in is_managed() * Update MCP server package name; bundled skills support * fix reading .env. instead have container user a common mounted .env file * feat(nix): container entrypoint with privilege drop and sudo provisioning Container was running as non-root via --user, which broke apt/pip installs and caused crashes when $HOME didn't exist. Replace --user with a Nix-built entrypoint script that provisions the hermes user, sudo (NOPASSWD), and /home/hermes inside the container on first boot, then drops privileges via setpriv. Writable layer persists so setup only runs once. Also expands MCP server options to support HTTP transport and sampling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix group and user creation in container mode * feat(nix): persistent /home/hermes and MESSAGING_CWD in container mode Container mode now bind-mounts ${stateDir}/home to /home/hermes so the agent's home directory survives container recreation. Previously it lived in the writable layer and was lost on image/volume/options changes. Also passes MESSAGING_CWD to the container so the agent finds its workspace and documents, matching native mode behavior. Other changes: - Extract containerDataDir/containerHomeDir bindings (no more magic strings) - Fix entrypoint chown to run unconditionally (volume mounts always exist) - Add schema field to container identity hash for auto-recreation - Add idempotency test (Scenario G) to config-roundtrip check * docs: add Nix & NixOS setup guide to docs site Add comprehensive Nix documentation to the Docusaurus site at website/docs/getting-started/nix-setup.md, covering nix run/profile install, NixOS module (native + container modes), declarative settings, secrets management, MCP servers, managed mode, container architecture, dev shell, flake checks, and full options reference. - Register nix-setup in sidebar after installation page - Add Nix callout tip to installation.md linking to new guide - Add canonical version pointer in docs/nixos-setup.md * docs: remove docs/nixos-setup.md, consolidate into website docs Backfill missing details (restart/restartSec in full example, gateway.pid, 0750 permissions, docker inspect commands) into the canonical website/docs/getting-started/nix-setup.md and delete the old standalone file. * fix(nix): add compression.protect_last_n and target_ratio to config-keys.json New keys were added to DEFAULT_CONFIG on main, causing the config-drift check to fail in CI. * fix(nix): skip checks on aarch64-darwin (onnxruntime wheel missing) The full Python venv includes onnxruntime (via faster-whisper/STT) which lacks a compatible uv2nix wheel on aarch64-darwin. Gate all checks behind stdenv.hostPlatform.isLinux. The package and devShell still evaluate on macOS. * fix(nix): skip flake check and build on macOS CI onnxruntime (transitive dep via faster-whisper) lacks a compatible uv2nix wheel on aarch64-darwin. Run full checks and build on Linux only; macOS CI verifies the flake evaluates without building. * fix(nix): preserve container writable layer across nixos-rebuild The container identity hash included the entrypoint's Nix store path, which changes on every nixpkgs update (due to runtimeShell/stdenv input-addressing). This caused false-positive identity mismatches, triggering container recreation and losing the persistent writable layer. - Use stable symlink (current-entrypoint) like current-package already does - Remove entrypoint from identity hash (only image/volumes/options matter) - Add GC root for entrypoint so nix-collect-garbage doesn't break it - Remove global HERMES_HOME env var from addToSystemPackages (conflicted with interactive CLI use, service already sets its own) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 01:08:02 +05:30
Teknium	8f6ef042c1	fix(cli): buffer reasoning preview chunks and fix duplicate display (#3013 ) Three improvements to reasoning/thinking display in the CLI: 1. Buffer tiny reasoning chunks: providers like DeepSeek stream reasoning one word at a time, producing a separate [thinking] line per token. Add a buffer that coalesces chunks and flushes at natural boundaries (newlines, sentence endings, terminal width). 2. Fix duplicate reasoning display: centralize callback selection into _current_reasoning_callback() — one place instead of 4 scattered inline ternaries. Prevents both the streaming box AND the preview callback from firing simultaneously. 3. Fix post-response reasoning box guard: change the check from 'not self._stream_started' to 'not self._reasoning_stream_started' so the final reasoning box is only suppressed when reasoning was actually streamed live, not when any text was streamed. Cherry-picked from PR #2781 by juanfradb.	2026-03-25 12:16:39 -07:00
Teknium	099dfca6db	fix: GLM reasoning-only and max-length handling (#3010 ) - Add 'prompt exceeds max length' to context overflow detection for Z.AI/GLM 400 errors - Extract inline reasoning blocks from assistant content as fallback when no structured reasoning fields are present - Guard inline extraction so structured API reasoning takes priority - Update test for reasoning-only response salvage behavior Cherry-picked from PR #2993 by kshitijk4poor. Added priority guard to fix test_structured_reasoning_takes_priority failure. Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-25 12:05:37 -07:00
Teknium	68ab37e891	fix(delegate): give subagents independent iteration budgets (#3004 ) Each subagent now gets its own IterationBudget instead of sharing the parent's. The per-subagent cap is controlled by delegation.max_iterations in config.yaml (default 50). Total iterations across parent + subagents can exceed the parent's max_iterations, but the user retains control via the config setting. Previously, subagents shared the parent's budget, so three parallel subagents configured for max_iterations=50 racing against a parent that already used 60 of 90 would each only get ~10 iterations. Inspired by PR #2928 (Bartok9) which identified the issue (#2873).	2026-03-25 11:29:49 -07:00
Teknium	65dace1b1a	fix(discord): stop phantom typing indicator after agent turn completes (#3003 ) Two fixes for a race where Discord's typing indicator lingers after the agent finishes: 1. _keep_typing (root cause): after outer stop_typing() clears the task dict, _keep_typing wakes from its 2s sleep and calls send_typing() again, recreating an orphaned loop. Add a finally block so _keep_typing always calls stop_typing() on exit, cleaning up any loop it recreated. 2. _process_message_background (safety net): add stop_typing() after cancelling the typing task, catching any platform-level persistent typing tasks that slipped through. Combines fixes from PR #2945 by catbusconductor (root cause in _keep_typing) and PR #2832 by subrih (safety net in _process_message_background).	2026-03-25 11:28:28 -07:00
Teknium	650b400c98	fix(cron): mark session as ended after job completes (#2998 ) Cron was the only execution path that never called end_session(), leaving ended_at = NULL permanently. This made cron sessions invisible to hermes prune --older-than and indistinguishable from active sessions. Captures session_id in a local variable before agent construction so it's available in the finally block even if AIAgent() fails, then calls end_session(session_id, 'cron_complete') before close(). Cherry-picked from PR #2979 by ygd58. Fixed bug: original PR called end_session() with zero arguments (TypeError — method requires session_id and end_reason). Fixes #2972. Co-authored-by: ygd58 <ygd58@users.noreply.github.com>	2026-03-25 11:13:21 -07:00
Teknium	61949f0af7	Fix (#2997 ) Co-authored-by: Jack <jvand@DESKTOP-JACK.localdomain>	2026-03-25 11:12:11 -07:00
Teknium	52c5e491f5	fix(session): surface silent SessionDB failures that cause session data loss (#2999 ) * fix(session): surface silent SessionDB failures that cause session data loss SessionDB initialization and operation failures were logged at debug level or silently swallowed, causing sessions to never be indexed in the FTS5 database. This made session_search unable to find affected conversations. In practice, ~48% of sessions can be lost without any visible indication. The JSON session files are still written (separate code path), but the SQLite/FTS5 index gets nothing — making session_search return empty results for affected sessions. Changes: - cli.py: Log warnings (not debug) when SessionDB init fails at both __init__ and _start_session entry points - run_agent.py: Log warnings on create_session, append_message, and compression split failures - run_agent.py: Set _session_db = None after create_session failure to fail fast instead of silently dropping every message for the session Root cause: When gateway restarts or DB lock contention occurs during SessionDB() init, the exception is caught and swallowed. The agent continues running normally — JSON session logs are written to disk — but no messages reach the FTS5 index. * fix: use module logger instead of root logging for SessionDB warnings Follow-up to cherry-picked PR #2939 — the original used logging.warning() (root logger) instead of logger.warning() (module logger) in the 5 new warning calls. Module logger preserves the logger hierarchy and shows the correct module name in log output. --------- Co-authored-by: LucidPaths <lc77@outlook.de>	2026-03-25 11:10:19 -07:00
Teknium	f665351740	fix(shell): exponential backoff for persistent shell polling (#2996 ) * fix(shell): replace fixed 10ms poll interval with exponential backoff to reduce WSL2 resource consumption * fix(shell): rename _poll_interval to _poll_interval_start for clarity, update SSH override * fix(shell): correctly rename _poll_interval to _poll_interval_start in ssh.py --------- Co-authored-by: ygd58 <buraysandro9@gmail.com>	2026-03-25 10:56:48 -07:00
Teknium	fba73a60e3	fix(skills): use Git Trees API to prevent silent subdirectory loss during install (#2995 ) * fix(skills): use Git Trees API to prevent silent subdirectory loss during install Refactors _download_directory() to use the Git Trees API (single call for the entire repo tree) as the primary path, falling back to the recursive Contents API when the tree endpoint is unavailable or truncated. Prevents silent subdirectory loss caused by per-directory rate limiting or transient failures. Cherry-picked from PR #2981 by tugrulguner. Fixes #2940. * fix: simplify tree API — use branch name directly as tree-ish Eliminates an extra git/ref/heads API call by passing the branch name directly to git/trees/{branch}?recursive=1, matching the pattern already used by _find_skill_in_repo_tree. --------- Co-authored-by: tugrulguner <tugrulguner@users.noreply.github.com>	2026-03-25 10:48:18 -07:00
Teknium	114e636b7d	fix(display): suppress KawaiiSpinner animation under patch_stdout (#2994 ) When the CLI is active, sys.stdout is prompt_toolkit's StdoutProxy which queues writes and injects newlines around each flush(). This causes every \r spinner frame to land on its own line instead of overwriting the previous one, producing visible flickering where the spinner and status bar repeatedly swap positions. The CLI already renders spinner state via a dedicated TUI widget (_spinner_text / get_spinner_text), so KawaiiSpinner's \r-based loop is redundant under StdoutProxy. Detect the proxy and suppress the animation entirely — the thread still runs to preserve start()/stop() semantics. Also removes the 0.4s flush rate-limit workaround that was papering over the same issue, and cleans up the unused _last_flush_time attribute. Salvaged from PR #2908 by Mibayy (fixed _raw -> raw detection, dropped unrelated bundled changes).	2026-03-25 10:46:54 -07:00
Teknium	20cc1731f4	perf(prompt_builder): avoid redundant file re-read for skill conditions (#2992 ) build_skills_system_prompt() was calling _read_skill_conditions() which re-read each SKILL.md file to extract conditional activation fields. The frontmatter was already parsed by _parse_skill_file() earlier in the same loop. Extract conditions inline from the existing frontmatter dict instead, saving one file read per skill (~80+ on a typical setup). Salvaged from PR #2827 by InB4DevOps.	2026-03-25 10:39:27 -07:00
Teknium	b2a6b012fe	fix(api_server): streaming breaks when agent makes tool calls (#2985 ) * fix(run_agent): ensure _fire_first_delta() is called for tool generation events Added calls to _fire_first_delta() in the AIAgent class to improve the handling of tool generation events, ensuring timely notifications during the processing of function calls and tool usage. * fix(run_agent): improve timeout handling for chat completions Enhanced the timeout configuration for chat completions in the AIAgent class by introducing customizable connection, read, and write timeouts using environment variables. This ensures more robust handling of API requests during streaming operations. * fix(run_agent): reduce default stream read timeout for chat completions Updated the default stream read timeout from 120 seconds to 60 seconds in the AIAgent class, enhancing the timeout configuration for chat completions. This change aims to improve responsiveness during streaming operations. * fix(run_agent): enhance streaming error handling and retry logic Improved the error handling and retry mechanism for streaming requests in the AIAgent class. Introduced a configurable maximum number of stream retries and refined the handling of transient network errors, allowing for retries with fresh connections. Non-transient errors now trigger a fallback to non-streaming only when appropriate, ensuring better resilience during API interactions. * fix(api_server): streaming breaks when agent makes tool calls The agent fires stream_delta_callback(None) to signal the CLI display to close its response box before tool execution begins. The API server's _on_delta callback was forwarding this None directly into the SSE queue, where the SSE writer treats it as end-of-stream and terminates the HTTP response prematurely. After tool calls complete, the agent streams the final answer through the same callback, but the SSE response was already closed — so Open WebUI (and similar frontends) never received the actual answer. Fix: filter out None in _on_delta so the SSE stream stays open. The SSE loop already detects completion via agent_task.done(), which handles stream termination correctly without needing the None sentinel. Reported by Rohit Paul on X.	2026-03-25 09:56:20 -07:00
Teknium	42fec19151	feat: persist reasoning across gateway session turns (schema v6) (#2974 ) feat: persist reasoning across gateway session turns (schema v6) Tested against OpenAI Codex (direct), Anthropic (direct + OAI-compat), and OpenRouter → 6 backends. All reasoning field types (reasoning, reasoning_details, codex_reasoning_items) round-trip through the DB correctly.	2026-03-25 09:47:28 -07:00
Teknium	5dbe2d9d73	fix: skills-sh install fails for deeply nested repo structures (#2980 ) * fix(run_agent): ensure _fire_first_delta() is called for tool generation events Added calls to _fire_first_delta() in the AIAgent class to improve the handling of tool generation events, ensuring timely notifications during the processing of function calls and tool usage. * fix(run_agent): improve timeout handling for chat completions Enhanced the timeout configuration for chat completions in the AIAgent class by introducing customizable connection, read, and write timeouts using environment variables. This ensures more robust handling of API requests during streaming operations. * fix(run_agent): reduce default stream read timeout for chat completions Updated the default stream read timeout from 120 seconds to 60 seconds in the AIAgent class, enhancing the timeout configuration for chat completions. This change aims to improve responsiveness during streaming operations. * fix(run_agent): enhance streaming error handling and retry logic Improved the error handling and retry mechanism for streaming requests in the AIAgent class. Introduced a configurable maximum number of stream retries and refined the handling of transient network errors, allowing for retries with fresh connections. Non-transient errors now trigger a fallback to non-streaming only when appropriate, ensuring better resilience during API interactions. * fix: skills-sh install fails for deeply nested repo structures Skills in repos with deep directory nesting (e.g. cli-tool/components/skills/development/senior-backend/) could not be installed because the candidate path generation and shallow root-dir scan never reached them. Added GitHubSource._find_skill_in_repo_tree() which uses the GitHub Trees API to recursively search the entire repo tree in a single API call. This is used as a final fallback in SkillsShSource._discover_identifier() when the standard candidate paths and shallow scan both fail. Fixes installation of skills from repos like davila7/claude-code-templates where skills are nested 4+ levels deep. Reported by user Samuraixheart.	2026-03-25 09:31:05 -07:00
Teknium	c6f4515f73	fix(whatsapp): download documents, audio, and video media from messages (#2978 ) Add downloadMediaMessage() calls for documents, audio/voice notes, and video in bridge.js — previously only images were downloaded, leaving all other file types inaccessible to the agent. Handle local file paths from the bridge for DOCUMENT, VOICE, and VIDEO types in whatsapp.py with proper MIME detection. Inject text content inline for readable files (.txt, .md, .csv, .json, etc.). Follow-up fixes applied during salvage: - Remove unused cache_document_from_bytes import - Add 100KB size cap on text injection (matches Telegram/Discord/Slack) - Align injection format with other platforms Cherry-picked from PR #2818. Also fixes #2856 (bugs 1 & 2). PR #2865 by ayberkesn fixed the same voice note issue. Co-authored-by: noestelar <hola@noeali.com>	2026-03-25 08:37:28 -07:00

1 2 3 4 5 ...

2688 Commits