hermes-agent

Author	SHA1	Message	Date
christopher-kapic	97108db038	fix(cli): pass conversation_history in quiet mode with --resume hermes chat -q 'msg' --resume SESSION_ID loaded the session history but never passed it to run_conversation(), so the model responded without prior context. The interactive mode already does this correctly. Based on work by christopher-kapic in PR #2081. Fixes #2106.	2026-03-21 12:51:34 -07:00
Teknium	29520df44f	revert: remove Shift+Enter keybindings that crash prompt_toolkit Reverts the s-enter and Kitty CSI keybindings from PR #2345/#2346. The s-enter key notation causes 'Invalid key: s-enter' crash on some prompt_toolkit versions, breaking hermes startup entirely.	2026-03-21 10:41:07 -07:00
Teknium	42cef9c282	fix: resolve merge conflict markers in cli.py breaking hermes startup PR #2346 was merged with unresolved git conflict markers (<<<<<<, =======, >>>>>>>) in cli.py at line 6047, causing SyntaxError on startup. Resolved by keeping both the Shift+Enter keybindings and the tab handler.	2026-03-21 10:34:21 -07:00
ygd58	356122e990	fix(cli): handle Kitty keyboard protocol Shift+Enter for Ghostty/WezTerm Kitty-protocol terminals (Ghostty, WezTerm) encode Shift+Enter as CSI 13;2u instead of plain Enter. Without this binding, raw escape characters appear in the input buffer. Adds s-enter and the Kitty escape sequence as newline-insert bindings. Based on work by ygd58 in PR #1798. Fixes #1795. Registry.py apostrophe sanitization change excluded (unrelated scope).	2026-03-21 10:03:55 -07:00
Teknium	c4e787d47b	feat: enable streaming by default in CLI Streaming provides a better UX — tokens appear as they arrive instead of waiting for the full response. show_reasoning remains false so thinking blocks are not streamed to the user.	2026-03-21 09:49:47 -07:00
Teknium	d70e07fc45	refactor(cli): add protected TUI extension hooks for wrapper CLIs Based on PR #1749 by @erosika (reimplemented on current main). Extracts three protected methods from run() so wrapper CLIs can extend the TUI without overriding the entire method: - _get_extra_tui_widgets(): inject widgets between spacer and status bar - _register_extra_tui_keybindings(kb, input_area): add keybindings - _build_tui_layout_children(**widgets): full control over ordering Default implementations reproduce existing layout exactly. The inline HSplit in run() now delegates to _build_tui_layout_children(). 5 tests covering defaults, widget insertion position, and keybinding registration.	2026-03-21 09:42:07 -07:00
Teknium	fd1d6c03cb	fix(cli): correct truncated AUXILIARY_WEB_EXTRACT_API_KEY env var name Cherry-picked from PR #2295 by @dlkakbs. The web_extract auxiliary client api_key env var was literally stored as 'AUXILI..._KEY' (dots in the source) instead of the full name. Users configuring an auxiliary web_extract model with an API key would have auth failures because the key was written to a non-existent var.	2026-03-21 07:09:28 -07:00
Teknium	eb537b5db4	fix(cli): prevent multiple reasoning boxes from rendering Added a check to suppress further reasoning rendering once the response box is open, preventing potential overlap of reasoning boxes during late thinking blocks. This enhances the user experience by maintaining a clean output in the CLI.	2026-03-21 06:28:47 -07:00
Teknium	3585019831	feat(cli): enhance user input display with consistent formatting - Added a user bar separator for improved visual clarity when displaying pasted text and user input in the HermesCLI. - Ensured consistent formatting for both multi-line and single-line user inputs, enhancing the overall user experience in the command-line interface. These changes contribute to a more organized and visually appealing output during interactions.	2026-03-20 23:36:49 -07:00
Test	d0ac8d9fc7	chore: remove dead top-level toolsets config key The top-level 'toolsets' key in config.yaml was never read at runtime. Tool selection uses platform_toolsets (per-platform) or the --toolsets CLI flag. The key existed in load_cli_config() defaults and the example config as 'toolsets: [all]', misleading users into thinking it controlled tool availability. - Remove from load_cli_config() hardcoded defaults - Remove from hermes config show output - Replace in cli-config.yaml.example with deprecation note pointing to platform_toolsets and hermes tools	2026-03-20 22:27:13 -07:00
Test	f7e2ed20fa	feat(cli): implement true-color ANSI support for response text - Added support for true-color ANSI escape codes in the HermesCLI to enhance the visual appearance of streamed content. - Introduced a fallback mechanism for text color in case of errors while retrieving the color from the active skin. - Updated the output formatting to include the new text color in both line emissions and buffer flushing. These changes improve the user experience by ensuring consistent and visually appealing text output in the command-line interface.	2026-03-20 21:02:36 -07:00
Teknium	2416b2b7af	refactor(cli, banner): update gold ANSI color to true-color format (#2246 ) - Changed the ANSI escape code for gold color in cli.py and banner.py to use true-color format (#FFD700) for better visual consistency. - Enhanced the _on_tool_progress method in HermesCLI to update the TUI spinner with tool execution status, improving user feedback during operations. These changes improve the visual representation and user experience in the command-line interface. Co-authored-by: Test <test@test.com>	2026-03-20 18:17:38 -07:00
Teknium	8e884fb3f1	Merge pull request #2215 from NousResearch/hermes/hermes-31d7db3b fix: infer provider from base URL for models.dev context length lookup	2026-03-20 12:52:07 -07:00
Test	76bc27199f	fix(cli, agent): improve streaming handling and state management - Updated _stream_delta method in HermesCLI to handle None values, flushing the stream and resetting state for clean tool execution. - Enhanced quiet mode handling in AIAgent to ensure proper display closure before tool execution, preventing display issues with intermediate streamed content. These changes improve the robustness of the streaming functionality and ensure a smoother user experience during tool interactions.	2026-03-20 10:02:42 -07:00
Teknium	66a1942524	feat: add /queue command to queue prompts without interrupting (#2191 ) Adds /queue <prompt> (alias /q) that queues a message for the next turn while the agent is busy, without interrupting the current run. - CLI: /queue <prompt> puts it in _pending_input for the next turn - Gateway: /queue <prompt> creates a pending MessageEvent on the adapter, picked up after the current agent run finishes - Enter still interrupts as usual (no behavior change) - /queue with no prompt shows usage - /queue when agent is idle tells user to just type normally Co-authored-by: Test <test@test.com>	2026-03-20 09:44:27 -07:00
Test	1055d4356a	fix: skip model auto-detection for custom/local providers When the user is on a custom provider (provider=custom, localhost, or 127.0.0.1 endpoint), /model <name> no longer tries to auto-detect a provider switch. The model name changes on the current endpoint as-is. To switch away from a custom endpoint, users must use explicit provider:model syntax (e.g. /model openai-codex:gpt-5.2-codex). A helpful tip is printed when changing models on a custom endpoint. This prevents the confusing case where someone on LM Studio types /model gpt-5.2-codex, the auto-detection tries to switch providers, fails or partially succeeds, and requests still go to the old endpoint. Also fixes the missing prompt_toolkit.auto_suggest mock stub in test_cli_init.py (same issue already fixed in test_cli_new_session.py).	2026-03-20 04:35:17 -07:00
Teknium	b19f5133c3	Merge pull request #2118 from NousResearch/hermes/hermes-e83093f0 feat: show reasoning/thinking blocks when show_reasoning is enabled	2026-03-20 04:35:12 -07:00
Test	b1832faaae	feat: show reasoning/thinking blocks when show_reasoning is enabled - Add <thinking> tag to streaming filter's tag list - When show_reasoning is on, route XML reasoning content to the reasoning display box instead of silently discarding it - Expand _strip_think_blocks to handle all tag variants: <think>, <thinking>, <THINKING>, <reasoning>, <REASONING_SCRATCHPAD>	2026-03-19 19:44:31 -07:00
InB4DevOps	fe331ed9bd	fix: Reset token counters on new session for accurate usage display (#2099 )	2026-03-20 01:21:25 +01:00
Teknium	4c0c7f4c6e	fix: /model command — bare provider names, custom endpoint display Two issues with /model preventing proper provider switching: 1. Bare provider names not detected: typing '/model nous' treated 'nous' as a model name instead of triggering a provider switch. Fixed by adding step 0 in detect_provider_for_model() that checks if the input matches a known provider name/alias (excluding 'custom'/'openrouter' which need explicit model names) and returns that provider's default model. 2. Custom endpoint details hidden: /model (no args) showed '[custom]' with just a usage hint but no endpoint URL or model name. Now displays the configured base_url for custom providers in both CLI and gateway. Note: config base_url and OPENAI_BASE_URL are intentionally NOT cleared on provider switch — dedicated provider paths (nous, anthropic, codex) have their own credential resolution that ignores these, and clearing them would destroy the user's custom endpoint config, preventing switching back. Co-authored-by: Test <test@test.com>	2026-03-19 12:06:48 -07:00
StefanIsMe	04b6ecadc4	feat(cli): Tab now accepts auto-suggestions (ghost text) Previously, Tab only handled dropdown completions. Users seeing gray ghost text from history-based suggestions had no way to accept them with Tab - they had to use Right arrow or Ctrl+E. Now Tab follows priority: 1. Completion menu open → accept selected completion 2. Ghost text suggestion available → accept auto-suggestion 3. Otherwise → start completion menu This matches user intuition that Tab should 'complete what I see.'	2026-03-19 10:40:37 -07:00
cmcleay	bb59057d5d	fix: normalize live Chrome CDP endpoints for browser tools	2026-03-19 10:17:03 -07:00
Teknium	d76fa7fc37	fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051 ) * fix: detect context length for custom model endpoints via fuzzy matching + config override Custom model endpoints (non-OpenRouter, non-known-provider) were silently falling back to 2M tokens when the model name didn't exactly match what the endpoint's /v1/models reported. This happened because: 1. Endpoint metadata lookup used exact match only — model name mismatches (e.g. 'qwen3.5:9b' vs 'Qwen3.5-9B-Q4_K_M.gguf') caused a miss 2. Single-model servers (common for local inference) required exact name match even though only one model was loaded 3. No user escape hatch to manually set context length Changes: - Add fuzzy matching for endpoint model metadata: single-model servers use the only available model regardless of name; multi-model servers try substring matching in both directions - Add model.context_length config override (highest priority) so users can explicitly set their model's context length in config.yaml - Log an informative message when falling back to 2M probe, telling users about the config override option - Thread config_context_length through ContextCompressor and AIAgent init Tests: 6 new tests covering fuzzy match, single-model fallback, config override (including zero/None edge cases). * fix: auto-detect local model name and context length for local servers Cherry-picked from PR #2043 by sudoingX. - Auto-detect model name from local server's /v1/models when only one model is loaded (no manual model name config needed) - Add n_ctx_train and n_ctx to context length detection keys for llama.cpp - Query llama.cpp /props endpoint for actual allocated context (not just training context from GGUF metadata) - Strip .gguf suffix from display in banner and status bar - _auto_detect_local_model() in runtime_provider.py for CLI init Co-authored-by: sudo <sudoingx@users.noreply.github.com> * fix: revert accidental summary_target_tokens change + add docs for context_length config - Revert summary_target_tokens from 2500 back to 500 (accidental change during patching) - Add 'Context Length Detection' section to Custom & Self-Hosted docs explaining model.context_length config override --------- Co-authored-by: Test <test@test.com> Co-authored-by: sudo <sudoingx@users.noreply.github.com>	2026-03-19 06:01:16 -07:00
Test	e7844e9c8d	Merge origin/main, resolve conflicts (self._base_url_lower)	2026-03-18 04:09:00 -07:00
Test	c1750bb32d	feat(cli): add /statusbar command to toggle context bar Adds /statusbar (alias /sb) to show/hide the bottom status bar that displays model name, context usage, and session duration. Uses ConditionalContainer so the bar takes zero space when hidden rather than leaving a blank line.	2026-03-18 03:49:49 -07:00
Test	8422196e89	Merge PR #1879 : feat: integrate GitHub Copilot providers	2026-03-18 03:18:33 -07:00
Teknium	24ac577046	fix: respect model.default from config.yaml for openai-codex provider (#1896 ) When config.yaml had a non-default model (e.g. gpt-5.3-codex) and the provider was openai-codex, _normalize_model_for_provider() would replace it with the latest available codex model because _model_is_default only checked the CLI argument, not the config value. Now _model_is_default is False when config.yaml has a model that differs from the global fallback (anthropic/claude-opus-4.6), so the user's explicit config choice is preserved. Fixes #1887 Co-authored-by: Test <test@test.com>	2026-03-18 02:50:31 -07:00
Test	a8132d1252	fix: respect model.default from config.yaml for openai-codex provider When config.yaml had a non-default model (e.g. gpt-5.3-codex) and the provider was openai-codex, _normalize_model_for_provider() would replace it with the latest available codex model because _model_is_default only checked the CLI argument, not the config value. Now _model_is_default is False when config.yaml has a model that differs from the global fallback (anthropic/claude-opus-4.6), so the user's explicit config choice is preserved. Fixes #1887	2026-03-18 02:24:41 -07:00
max	0c392e7a87	feat: integrate GitHub Copilot providers across Hermes Add first-class GitHub Copilot and Copilot ACP provider support across model selection, runtime provider resolution, CLI sessions, delegated subagents, cron jobs, and the Telegram gateway. This also normalizes Copilot model catalogs and API modes, introduces a Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by resolving Homebrew-installed gh binaries under launchd. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-17 23:40:22 -07:00
Test	a71e3f4d98	fix: add /browser to COMMAND_REGISTRY so it shows in help and autocomplete The /browser command handler existed in cli.py but was never added to COMMAND_REGISTRY after the centralized command registry refactor. This meant: - /browser didn't appear in /help - No tab-completion or subcommand suggestions - Dispatch used _base_word fallback instead of canonical resolution Added CommandDef with connect/disconnect/status subcommands and switched dispatch to use canonical instead of _base_word.	2026-03-17 13:29:36 -07:00
Teknium	7ac9088d5c	fix: Telegram streaming — config bridge, not-modified, flood control (#1782 ) * fix: NameError in OpenCode provider setup (prompt_text -> prompt) The OpenCode Zen and OpenCode Go setup sections used prompt_text() which is undefined. All other providers correctly use the local prompt() function defined in setup.py. Fixes crash during 'hermes setup' when selecting either OpenCode provider. * fix: Telegram streaming — config bridge, not-modified, flood control Three fixes for gateway streaming: 1. Bridge streaming config from config.yaml into gateway runtime. load_gateway_config() now reads the 'streaming' key from config.yaml (same pattern as session_reset, stt, etc.), matching the docs. Previously only gateway.json was read. 2. Handle 'Message is not modified' in Telegram edit_message(). This Telegram API error fires when editing with identical content — a no-op, not a real failure. Previously it returned success=False which made the stream consumer disable streaming entirely. 3. Handle RetryAfter / flood control in Telegram edit_message(). Fast providers can hit Telegram rate limits during streaming. Now waits the requested retry_after duration and retries once, instead of treating it as a fatal edit failure. Also fixed double-edit on stream finish: the consumer now tracks last-sent text and skips redundant edits, preventing the not-modified error at the source. * refactor: make config.yaml the primary gateway config source Eliminates the per-key bridge pattern in load_gateway_config(). Previously gateway.json was the primary source and each config.yaml key needed an individual bridge — easy to forget (streaming was missing, causing garl4546's bug). Now config.yaml is read first and its keys are mapped directly into the GatewayConfig.from_dict() schema. gateway.json is kept as a legacy fallback layer (loaded first, then overwritten by config.yaml keys). If gateway.json exists, a log message suggests migrating. Also: - Removed dead save_gateway_config() (never called anywhere) - Updated CLI help text and send_message error to reference config.yaml instead of gateway.json --------- Co-authored-by: Test <test@test.com>	2026-03-17 10:51:54 -07:00
teknium1	c881209b92	Revert "feat(cli): skin-aware light/dark theme mode with terminal auto-detection" This reverts commit `a1c81360a5`.	2026-03-17 10:04:53 -07:00
Teknium	d1d17f4f0a	feat(compression): add summary_base_url + move compression config to YAML-only - Add summary_base_url config option to compression block for custom OpenAI-compatible endpoints (e.g. zai, DeepSeek, Ollama) - Remove compression env var bridges from cli.py and gateway/run.py (CONTEXT_COMPRESSION_* env vars no longer set from config) - Switch run_agent.py to read compression config directly from config.yaml instead of env vars - Fix backwards-compat block in _resolve_task_provider_model to also fire when auxiliary.compression.provider is 'auto' (DEFAULT_CONFIG sets this, which was silently preventing the compression section's summary_* keys from being read) - Add test for summary_base_url config-to-client flow - Update docs to show compression as config.yaml-only Closes #1591 Based on PR #1702 by @uzaylisak	2026-03-17 04:46:15 -07:00
teknium1	0897e4350e	merge: resolve conflicts with origin/main	2026-03-17 04:30:37 -07:00
teknium1	e5fc916814	feat: auto-generate session titles after first exchange After the first user→assistant exchange, Hermes now generates a short descriptive session title via the auxiliary LLM (compression task config). Title generation runs in a background thread so it never delays the user-facing response. Key behaviors: - Fires only on the first 1-2 exchanges (checks user message count) - Skips if a title already exists (user-set titles are never overwritten) - Uses call_llm with compression task config (cheapest/fastest model) - Truncates long messages to keep the title generation request small - Cleans up LLM output: strips quotes, 'Title:' prefixes, enforces 80 char max - Works in both CLI and gateway (Telegram/Discord/etc.) Also updates /title (no args) to show the session ID alongside the title in both CLI and gateway. Implements #1426	2026-03-17 04:14:40 -07:00
Teknium	d417ba2a48	feat: add route-aware pricing estimates (#1695 ) Salvaged from PR #1563 by @kshitijk4poor. Cherry-picked with authorship preserved. - Route-aware pricing architecture replacing static MODEL_PRICING + heuristics - Canonical usage normalization (Anthropic/OpenAI/Codex API shapes) - Cache-aware billing (separate cache_read/cache_write rates) - Cost status tracking (estimated/included/unknown/actual) - OpenRouter live pricing via models API - Schema migration v4→v5 with billing metadata columns - Removed speculative forward-looking entries - Removed cost display from CLI status bar - Threaded OpenRouter metadata pre-warm Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-17 03:44:44 -07:00
Teknium	1d5a39e002	fix: thread safety for concurrent subagent delegation (#1672 ) * fix: thread safety for concurrent subagent delegation Four thread-safety fixes that prevent crashes and data races when running multiple subagents concurrently via delegate_task: 1. Remove redirect_stdout/stderr from delegate_tool — mutating global sys.stdout races with the spinner thread when multiple children start concurrently, causing segfaults. Children already run with quiet_mode=True so the redirect was redundant. 2. Split _run_single_child into _build_child_agent (main thread) + _run_single_child (worker thread). AIAgent construction creates httpx/SSL clients which are not thread-safe to initialize concurrently. 3. Add threading.Lock to SessionDB — subagents share the parent's SessionDB and call create_session/append_message from worker threads with no synchronization. 4. Add _active_children_lock to AIAgent — interrupt() iterates _active_children while worker threads append/remove children. 5. Add _client_cache_lock to auxiliary_client — multiple subagent threads may resolve clients concurrently via call_llm(). Based on PR #1471 by peteromallet. * feat: Honcho base_url override via config.yaml + quick command alias type Two features salvaged from PR #1576: 1. Honcho base_url override: allows pointing Hermes at a remote self-hosted Honcho deployment via config.yaml: honcho: base_url: "http://192.168.x.x:8000" When set, this overrides the Honcho SDK's environment mapping (production/local), enabling LAN/VPN Honcho deployments without requiring the server to live on localhost. Uses config.yaml instead of env var (HONCHO_URL) per project convention. 2. Quick command alias type: adds a new 'alias' quick command type that rewrites to another slash command before normal dispatch: quick_commands: sc: type: alias target: /context Supports both CLI and gateway. Arguments are forwarded to the target command. Based on PR #1576 by redhelix. --------- Co-authored-by: peteromallet <peteromallet@users.noreply.github.com> Co-authored-by: redhelix <redhelix@users.noreply.github.com>	2026-03-17 02:53:33 -07:00
teknium1	a1c81360a5	feat(cli): skin-aware light/dark theme mode with terminal auto-detection Add display.theme_mode setting (auto/light/dark) that makes the CLI readable on light terminal backgrounds. - Auto-detect terminal background via COLORFGBG, OSC 11, and macOS appearance (fallback chain in hermes_cli/colors.py) - Add colors_light overrides to all 7 built-in skins with dark/readable colors for light backgrounds - SkinConfig.get_color() now returns light overrides when theme is light - get_prompt_toolkit_style_overrides() uses light bg colors for completion menus in light mode - init_skin_from_config() reads display.theme_mode from config - 7 new tests covering theme mode resolution, detection fallbacks, and light-mode skin overrides Salvaged from PR #1187 by @peteromallet. Core design preserved; adapted to current main (kept all existing helpers, tool_emojis, convenience functions that were added after the PR branched). Co-authored-by: Peter O'Mallet <peteromallet@users.noreply.github.com>	2026-03-17 02:51:40 -07:00
Teknium	556e0f4b43	fix(docker): add explicit env allowlist for container credentials (#1436 ) Docker terminal sessions are secret-dark by default. This adds terminal.docker_forward_env as an explicit allowlist for env vars that may be forwarded into Docker containers. Values resolve from the current shell first, then fall back to ~/.hermes/.env. Only variables the user explicitly lists are forwarded — nothing is auto-exposed. Cherry-picked from PR #1449 by @teknium1, conflict-resolved onto current main. Fixes #1436 Supersedes #1439	2026-03-17 02:34:35 -07:00
Teknium	8992babaa3	fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624 ) * fix: prevent infinite 400 failure loop on context overflow (#1630) When a gateway session exceeds the model's context window, Anthropic may return a generic 400 invalid_request_error with just 'Error' as the message. This bypassed the phrase-based context-length detection, causing the agent to treat it as a non-retryable client error. Worse, the failed user message was still persisted to the transcript, making the session even larger on each attempt — creating an infinite loop. Three-layer fix: 1. run_agent.py — Fallback heuristic: when a 400 error has a very short generic message AND the session is large (>40% of context or >80 messages), treat it as a probable context overflow and trigger compression instead of aborting. 2. run_agent.py + gateway/run.py — Don't persist failed messages: when the agent returns failed=True before generating any response, skip writing the user's message to the transcript/DB. This prevents the session from growing on each failure. 3. gateway/run.py — Smarter error messages: detect context-overflow failures and suggest /compact or /reset specifically, instead of a generic 'try again' that will fail identically. * fix(skills): detect prompt injection patterns and block cache file reads Adds two security layers to prevent prompt injection via skills hub cache files (#1558): 1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json was the original injection vector — untrusted skill descriptions in the catalog contained adversarial text that the model executed. 2. skill_view: warns when skills are loaded from outside the trusted ~/.hermes/skills/ directory, and detects common injection patterns in skill content ("ignore previous instructions", "<system>", etc.). Cherry-picked from PR #1562 by ygd58. * fix(tools): chunk long messages in send_message_tool before dispatch (#1552) Long messages sent via send_message tool or cron delivery silently failed when exceeding platform limits. Gateway adapters handle this via truncate_message(), but the standalone senders in send_message_tool bypassed that entirely. - Apply truncate_message() chunking in _send_to_platform() before dispatching to individual platform senders - Remove naive message[i:i+2000] character split in _send_discord() in favor of centralized smart splitting - Attach media files to last chunk only for Telegram - Add regression tests for chunking and media placement Cherry-picked from PR #1557 by llbn. * fix(approval): show full command in dangerous command approval (#1553) Previously the command was truncated to 80 chars in CLI (with a [v]iew full option), 500 chars in Discord embeds, and missing entirely in Telegram/Slack approval messages. Now the full command is always displayed everywhere: - CLI: removed 80-char truncation and [v]iew full menu option - Gateway (TG/Slack): approval_required message includes full command in a code block - Discord: embed shows full command up to 4096-char limit - Windows: skip SIGALRM-based test timeout (Unix-only) - Updated tests: replaced view-flow tests with direct approval tests Cherry-picked from PR #1566 by crazywriter1. * fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624) The interrupt polling loop in chat() waited on the queue without invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy buffer only flushed on input events, causing the CLI to appear frozen during tool execution until the user typed a key. Fix: call _invalidate() on each queue timeout (every ~100ms, throttled to 150ms) to force the renderer to flush buffered agent output. --------- Co-authored-by: buray <ygd58@users.noreply.github.com> Co-authored-by: lbn <llbn@users.noreply.github.com> Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>	2026-03-17 02:09:26 -07:00
Teknium	49043b7b7d	feat: add /tools disable/enable/list slash commands with session reset (#1652 ) Add in-session tool management via /tools disable/enable/list, plus hermes tools list/disable/enable CLI subcommands. Supports both built-in toolsets (web, memory) and MCP tools (github:create_issue). To preserve prompt caching, /tools disable/enable in a chat session saves the change to config and resets the session cleanly — the user is asked to confirm before the reset happens. Also improves prefix matching: /qui now dispatches to /quit instead of showing ambiguous when longer skill commands like /quint-pipeline are installed. Based on PR #1520 by @YanSte. Co-authored-by: Yannick Stephan <YanSte@users.noreply.github.com>	2026-03-17 02:05:26 -07:00
Teknium	3744118311	feat(cli): two-stage /model autocomplete with ghost text suggestions (#1641 ) * feat(cli): two-stage /model autocomplete with ghost text suggestions - SlashCommandCompleter: Tab-complete providers first (anthropic:, openrouter:, etc.) then models within the selected provider - SlashCommandAutoSuggest: inline ghost text for slash commands, subcommands, and /model provider:model two-stage suggestions - Custom Tab key binding: accepts provider completion and immediately re-triggers completions to show that provider's models - COMMANDS_BY_CATEGORY: structured format with explicit subcommands for tab completion and ghost text (prompt, reasoning, voice, skills, cron, browser) - SUBCOMMANDS dict auto-extracted from command definitions - Model/provider info cached 60s for responsive completions * fix: repair test regression and restore gold color from PR #1622 - Fix test_unknown_command_still_shows_error: patch _cprint instead of console.print to match the _cprint switch in process_command() - Restore gold color on 'Type /help' hint using _DIM + _GOLD constants instead of bare \033[2m (was losing the #B8860B gold) - Use _GOLD constant for ambiguous command message for consistency - Add clarifying comment on SUBCOMMANDS regex fallback --------- Co-authored-by: Lars van der Zande <lmvanderzande@gmail.com>	2026-03-17 01:47:32 -07:00
Teknium	46176c8029	refactor: centralize slash command registry (#1603 ) * refactor: centralize slash command registry Replace 7+ scattered command definition sites with a single CommandDef registry in hermes_cli/commands.py. All downstream consumers now derive from this registry: - CLI process_command() resolves aliases via resolve_command() - Gateway _known_commands uses GATEWAY_KNOWN_COMMANDS frozenset - Gateway help text generated by gateway_help_lines() - Telegram BotCommands generated by telegram_bot_commands() - Slack subcommand map generated by slack_subcommand_map() Adding a command or alias is now a one-line change to COMMAND_REGISTRY instead of touching 6+ files. Bugfixes included: - Telegram now registers /rollback, /background (were missing) - Slack now has /voice, /update, /reload-mcp (were missing) - Gateway duplicate 'reasoning' dispatch (dead code) removed - Gateway help text can no longer drift from CLI help Backwards-compatible: COMMANDS and COMMANDS_BY_CATEGORY dicts are rebuilt from the registry, so existing imports work unchanged. * docs: update developer docs for centralized command registry Update AGENTS.md with full 'Slash Command Registry' and 'Adding a Slash Command' sections covering CommandDef fields, registry helpers, and the one-line alias workflow. Also update: - CONTRIBUTING.md: commands.py description - website/docs/reference/slash-commands.md: reference central registry - docs/plans/centralize-command-registry.md: mark COMPLETED - plans/checkpoint-rollback.md: reference new pattern - hermes-agent-dev skill: architecture table * chore: remove stale plan docs	2026-03-16 23:21:03 -07:00
Teknium	6794e79bb4	feat: add /bg as alias for /background slash command (#1590 ) * feat: add optional smart model routing Add a conservative cheap-vs-strong routing option that can send very short/simple turns to a cheaper model across providers while keeping the primary model for complex work. Wire it through CLI, gateway, and cron, and document the config.yaml workflow. * fix(gateway): remove recursive ExecStop from systemd units, extend TimeoutStopSec to 60s * fix(gateway): avoid recursive ExecStop in user systemd unit * fix: extend ExecStop removal and TimeoutStopSec=60 to system unit The cherry-picked PR #1448 fix only covered the user systemd unit. The system unit had the same TimeoutStopSec=15 and could benefit from the same 60s timeout for clean shutdown. Also adds a regression test for the system unit. --------- Co-authored-by: Ninja <ninja@local> * feat(skills): add blender-mcp optional skill for 3D modeling Control a running Blender instance from Hermes via socket connection to the blender-mcp addon (port 9876). Supports creating 3D objects, materials, animations, and running arbitrary bpy code. Placed in optional-skills/ since it requires Blender 4.3+ desktop with a third-party addon manually started each session. * feat(acp): support slash commands in ACP adapter (#1532) Adds /help, /model, /tools, /context, /reset, /compact, /version to the ACP adapter (VS Code, Zed, JetBrains). Commands are handled directly in the server without instantiating the TUI — each command queries agent/session state and returns plain text. Unrecognized /commands fall through to the LLM as normal messages. /model uses detect_provider_for_model() for auto-detection when switching models, matching the CLI and gateway behavior. Fixes #1402 * fix(logging): improve error logging in session search tool (#1533) * fix(gateway): restart on retryable startup failures (#1517) * feat(email): add skip_attachments option via config.yaml * feat(email): add skip_attachments option via config.yaml Adds a config.yaml-driven option to skip email attachments in the gateway email adapter. Useful for malware protection and bandwidth savings. Configure in config.yaml: platforms: email: skip_attachments: true Based on PR #1521 by @an420eth, changed from env var to config.yaml (via PlatformConfig.extra) to match the project's config-first pattern. * docs: document skip_attachments option for email adapter * fix(telegram): retry on transient TLS failures during connect and send Add exponential-backoff retry (3 attempts) around initialize() to handle transient TLS resets during gateway startup. Also catches TimedOut and OSError in addition to NetworkError. Add exponential-backoff retry (3 attempts) around send_message() for NetworkError during message delivery, wrapping the existing Markdown fallback logic. Both imports are guarded with try/except ImportError for test environments where telegram is mocked. Based on PR #1527 by cmd8. Closes #1526. * feat: permissive block_anchor thresholds and unicode normalization (#1539) Salvaged from PR #1528 by an420eth. Closes #517. Improves _strategy_block_anchor in fuzzy_match.py: - Add unicode normalization (smart quotes, em/en-dashes, ellipsis, non-breaking spaces → ASCII) so LLM-produced unicode artifacts don't break anchor line matching - Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for multiple candidates — if first/last lines match exactly, the block is almost certainly correct - Use original (non-normalized) content for offset calculation to preserve correct character positions Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space anchors, very-low-similarity unique matches), zero regressions on all 9 existing fuzzy match tests. Co-authored-by: an420eth <an420eth@users.noreply.github.com> * feat(cli): add file path autocomplete in the input prompt (#1545) When typing a path-like token (./ ../ ~/ / or containing /), the CLI now shows filesystem completions in the dropdown menu. Directories show a trailing slash and 'dir' label; files show their size. Completions are case-insensitive and capped at 30 entries. Triggered by tokens like: edit ./src/ma → shows ./src/main.py, ./src/manifest.json, ... check ~/doc → shows ~/docs/, ~/documents/, ... read /etc/hos → shows /etc/hosts, /etc/hostname, ... open tools/reg → shows tools/registry.py Slash command autocomplete (/help, /model, etc.) is unaffected — it still triggers when the input starts with /. Inspired by OpenCode PR #145 (file path completion menu). Implementation: - hermes_cli/commands.py: _extract_path_word() detects path-like tokens, _path_completions() yields filesystem Completions with size labels, get_completions() routes to paths vs slash commands - tests/hermes_cli/test_path_completion.py: 26 tests covering path extraction, prefix filtering, directory markers, home expansion, case-insensitivity, integration with slash commands * feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled Add privacy.redact_pii config option (boolean, default false). When enabled, the gateway redacts personally identifiable information from the system prompt before sending it to the LLM provider: - Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256> - User IDs → hashed to user_<sha256> - Chat IDs → numeric portion hashed, platform prefix preserved - Home channel IDs → hashed - Names/usernames → NOT affected (user-chosen, publicly visible) Hashes are deterministic (same user → same hash) so the model can still distinguish users in group chats. Routing and delivery use the original values internally — redaction only affects LLM context. Inspired by OpenClaw PR #47959. * fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs) Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM needs the real ID to tag users. Redaction now only applies to WhatsApp, Signal, and Telegram where IDs are pure routing metadata. Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack. * feat: smart approvals + /stop command (inspired by OpenAI Codex) * feat: smart approvals — LLM-based risk assessment for dangerous commands Adds a 'smart' approval mode that uses the auxiliary LLM to assess whether a flagged command is genuinely dangerous or a false positive, auto-approving low-risk commands without prompting the user. Inspired by OpenAI Codex's Smart Approvals guardian subagent (openai/codex#13860). Config (config.yaml): approvals: mode: manual # manual (default), smart, off Modes: - manual — current behavior, always prompt the user - smart — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block), or ESCALATE (fall through to manual prompt) - off — skip all approval prompts (equivalent to --yolo) When smart mode auto-approves, the pattern gets session-level approval so subsequent uses of the same pattern don't trigger another LLM call. When it denies, the command is blocked without user prompt. When uncertain, it escalates to the normal manual approval flow. The LLM prompt is carefully scoped: it sees only the command text and the flagged reason, assesses actual risk vs false positive, and returns a single-word verdict. * feat: make smart approval model configurable via config.yaml Adds auxiliary.approval section to config.yaml with the same provider/model/base_url/api_key pattern as other aux tasks (vision, web_extract, compression, etc.). Config: auxiliary: approval: provider: auto model: '' # fast/cheap model recommended base_url: '' api_key: '' Bridged to env vars in both CLI and gateway paths so the aux client picks them up automatically. * feat: add /stop command to kill all background processes Adds a /stop slash command that kills all running background processes at once. Currently users have to process(list) then process(kill) for each one individually. Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current turn) from /stop (cleans up background processes). See openai/codex#14602. Ctrl+C continues to only interrupt the active agent turn — background dev servers, watchers, etc. are preserved. /stop is the explicit way to clean them all up. * feat: first-class plugin architecture + hide status bar cost by default (#1544) The persistent status bar now shows context %, token counts, and duration but NOT $ cost by default. Cost display is opt-in via: display: show_cost: true in config.yaml, or: hermes config set display.show_cost true The /usage command still shows full cost breakdown since the user explicitly asked for it — this only affects the always-visible bar. Status bar without cost: ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m Status bar with show_cost: true: ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m * feat: improve memory prioritization + aggressive skill updates (inspired by OpenAI Codex) * feat: improve memory prioritization — user preferences over procedural knowledge Inspired by OpenAI Codex's memory prompt improvements (openai/codex#14493) which focus memory writes on user preferences and recurring patterns rather than procedural task details. Key insight: 'Optimize for reducing future user steering — the most valuable memory prevents the user from having to repeat themselves.' Changes: - MEMORY_GUIDANCE (prompt_builder.py): added prioritization hierarchy and the core principle about reducing user steering - MEMORY_SCHEMA (memory_tool.py): reordered WHEN TO SAVE list to put corrections first, added explicit PRIORITY guidance - Memory nudge (run_agent.py): now asks specifically about preferences, corrections, and workflow patterns instead of generic 'anything' - Memory flush (run_agent.py): now instructs to prioritize user preferences and corrections over task-specific details * feat: more aggressive skill creation and update prompting Press harder on skill updates — the agent should proactively patch skills when it encounters issues during use, not wait to be asked. Changes: - SKILLS_GUIDANCE: 'consider saving' → 'save'; added explicit instruction to patch skills immediately when found outdated/wrong - Skills header: added instruction to update loaded skills before finishing if they had missing steps or wrong commands - Skill nudge: more assertive ('save the approach' not 'consider saving'), now also prompts for updating existing skills used in the task - Skill nudge interval: lowered default from 15 to 10 iterations - skill_manage schema: added 'patch it immediately' to update triggers * feat: first-class plugin architecture (#1555) Plugin system for extending Hermes with custom tools, hooks, and integrations — no source code changes required. Core system (hermes_cli/plugins.py): - Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and pip entry_points (hermes_agent.plugins group) - PluginContext with register_tool() and register_hook() - 6 lifecycle hooks: pre/post tool_call, pre/post llm_call, on_session_start/end - Namespace package handling for relative imports in plugins - Graceful error isolation — broken plugins never crash the agent Integration (model_tools.py): - Plugin discovery runs after built-in + MCP tools - Plugin tools bypass toolset filter via get_plugin_tool_names() - Pre/post tool call hooks fire in handle_function_call() CLI: - /plugins command shows loaded plugins, tool counts, status - Added to COMMANDS dict for autocomplete Docs: - Getting started guide (build-a-hermes-plugin.md) — full tutorial building a calculator plugin step by step - Reference page (features/plugins.md) — quick overview + tables - Covers: file structure, schemas, handlers, hooks, data files, bundled skills, env var gating, pip distribution, common mistakes Tests: 16 tests covering discovery, loading, hooks, tool visibility. * feat: add /bg as alias for /background slash command Adds /bg alias across CLI, gateway, and Slack platform adapter. Updates help text, autocomplete, known_commands set, and dispatch logic. Includes tests for the new alias. * docs: add plan for centralized slash command registry Scopes a refactor to replace 7+ scattered command definition sites with a single CommandDef registry in hermes_cli/commands.py. Includes derived helper functions for gateway help text, Telegram BotCommands, Slack subcommand maps, and alias resolution. Documents current drift (Telegram missing /rollback + /background, Slack missing /voice + /update, gateway dead code) that the refactor fixes for free. --------- Co-authored-by: Ninja <ninja@local> Co-authored-by: alireza78a <alireza78a@users.noreply.github.com> Co-authored-by: Oktay Aydin <113846926+aydnOktay@users.noreply.github.com> Co-authored-by: JP Lew <polydegen@protonmail.com> Co-authored-by: an420eth <an420eth@users.noreply.github.com>	2026-03-16 17:27:02 -07:00
Teknium	181077b785	fix: hide Honcho session line on CLI load when no API key configured (#1582 ) HonchoClientConfig.from_env() set enabled=True unconditionally, even when HONCHO_API_KEY was not set. When ~/.honcho/config.json didn't exist, from_global_config() fell back to from_env() and returned enabled=True with a null api_key, causing the Honcho session indicator to display on every CLI launch. Fix: from_env() now sets enabled=bool(api_key), matching the auto-enable logic already used in from_global_config(). Also added api_key guard to the CLI display as defense-in-depth.	2026-03-16 17:22:52 -07:00
teknium1	f4d61c168b	merge: resolve conflicts with main (show_cost, turn routing, docker docs)	2026-03-16 14:22:38 -07:00
Teknium	5e5c92663d	fix: hermes update causes dual gateways on macOS (launchd) (#1567 ) * feat: add optional smart model routing Add a conservative cheap-vs-strong routing option that can send very short/simple turns to a cheaper model across providers while keeping the primary model for complex work. Wire it through CLI, gateway, and cron, and document the config.yaml workflow. * fix(gateway): remove recursive ExecStop from systemd units, extend TimeoutStopSec to 60s * fix(gateway): avoid recursive ExecStop in user systemd unit * fix: extend ExecStop removal and TimeoutStopSec=60 to system unit The cherry-picked PR #1448 fix only covered the user systemd unit. The system unit had the same TimeoutStopSec=15 and could benefit from the same 60s timeout for clean shutdown. Also adds a regression test for the system unit. --------- Co-authored-by: Ninja <ninja@local> * feat(skills): add blender-mcp optional skill for 3D modeling Control a running Blender instance from Hermes via socket connection to the blender-mcp addon (port 9876). Supports creating 3D objects, materials, animations, and running arbitrary bpy code. Placed in optional-skills/ since it requires Blender 4.3+ desktop with a third-party addon manually started each session. * feat(acp): support slash commands in ACP adapter (#1532) Adds /help, /model, /tools, /context, /reset, /compact, /version to the ACP adapter (VS Code, Zed, JetBrains). Commands are handled directly in the server without instantiating the TUI — each command queries agent/session state and returns plain text. Unrecognized /commands fall through to the LLM as normal messages. /model uses detect_provider_for_model() for auto-detection when switching models, matching the CLI and gateway behavior. Fixes #1402 * fix(logging): improve error logging in session search tool (#1533) * fix(gateway): restart on retryable startup failures (#1517) * feat(email): add skip_attachments option via config.yaml * feat(email): add skip_attachments option via config.yaml Adds a config.yaml-driven option to skip email attachments in the gateway email adapter. Useful for malware protection and bandwidth savings. Configure in config.yaml: platforms: email: skip_attachments: true Based on PR #1521 by @an420eth, changed from env var to config.yaml (via PlatformConfig.extra) to match the project's config-first pattern. * docs: document skip_attachments option for email adapter * fix(telegram): retry on transient TLS failures during connect and send Add exponential-backoff retry (3 attempts) around initialize() to handle transient TLS resets during gateway startup. Also catches TimedOut and OSError in addition to NetworkError. Add exponential-backoff retry (3 attempts) around send_message() for NetworkError during message delivery, wrapping the existing Markdown fallback logic. Both imports are guarded with try/except ImportError for test environments where telegram is mocked. Based on PR #1527 by cmd8. Closes #1526. * feat: permissive block_anchor thresholds and unicode normalization (#1539) Salvaged from PR #1528 by an420eth. Closes #517. Improves _strategy_block_anchor in fuzzy_match.py: - Add unicode normalization (smart quotes, em/en-dashes, ellipsis, non-breaking spaces → ASCII) so LLM-produced unicode artifacts don't break anchor line matching - Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for multiple candidates — if first/last lines match exactly, the block is almost certainly correct - Use original (non-normalized) content for offset calculation to preserve correct character positions Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space anchors, very-low-similarity unique matches), zero regressions on all 9 existing fuzzy match tests. Co-authored-by: an420eth <an420eth@users.noreply.github.com> * feat(cli): add file path autocomplete in the input prompt (#1545) When typing a path-like token (./ ../ ~/ / or containing /), the CLI now shows filesystem completions in the dropdown menu. Directories show a trailing slash and 'dir' label; files show their size. Completions are case-insensitive and capped at 30 entries. Triggered by tokens like: edit ./src/ma → shows ./src/main.py, ./src/manifest.json, ... check ~/doc → shows ~/docs/, ~/documents/, ... read /etc/hos → shows /etc/hosts, /etc/hostname, ... open tools/reg → shows tools/registry.py Slash command autocomplete (/help, /model, etc.) is unaffected — it still triggers when the input starts with /. Inspired by OpenCode PR #145 (file path completion menu). Implementation: - hermes_cli/commands.py: _extract_path_word() detects path-like tokens, _path_completions() yields filesystem Completions with size labels, get_completions() routes to paths vs slash commands - tests/hermes_cli/test_path_completion.py: 26 tests covering path extraction, prefix filtering, directory markers, home expansion, case-insensitivity, integration with slash commands * feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled Add privacy.redact_pii config option (boolean, default false). When enabled, the gateway redacts personally identifiable information from the system prompt before sending it to the LLM provider: - Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256> - User IDs → hashed to user_<sha256> - Chat IDs → numeric portion hashed, platform prefix preserved - Home channel IDs → hashed - Names/usernames → NOT affected (user-chosen, publicly visible) Hashes are deterministic (same user → same hash) so the model can still distinguish users in group chats. Routing and delivery use the original values internally — redaction only affects LLM context. Inspired by OpenClaw PR #47959. * fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs) Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM needs the real ID to tag users. Redaction now only applies to WhatsApp, Signal, and Telegram where IDs are pure routing metadata. Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack. * feat: smart approvals + /stop command (inspired by OpenAI Codex) * feat: smart approvals — LLM-based risk assessment for dangerous commands Adds a 'smart' approval mode that uses the auxiliary LLM to assess whether a flagged command is genuinely dangerous or a false positive, auto-approving low-risk commands without prompting the user. Inspired by OpenAI Codex's Smart Approvals guardian subagent (openai/codex#13860). Config (config.yaml): approvals: mode: manual # manual (default), smart, off Modes: - manual — current behavior, always prompt the user - smart — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block), or ESCALATE (fall through to manual prompt) - off — skip all approval prompts (equivalent to --yolo) When smart mode auto-approves, the pattern gets session-level approval so subsequent uses of the same pattern don't trigger another LLM call. When it denies, the command is blocked without user prompt. When uncertain, it escalates to the normal manual approval flow. The LLM prompt is carefully scoped: it sees only the command text and the flagged reason, assesses actual risk vs false positive, and returns a single-word verdict. * feat: make smart approval model configurable via config.yaml Adds auxiliary.approval section to config.yaml with the same provider/model/base_url/api_key pattern as other aux tasks (vision, web_extract, compression, etc.). Config: auxiliary: approval: provider: auto model: '' # fast/cheap model recommended base_url: '' api_key: '' Bridged to env vars in both CLI and gateway paths so the aux client picks them up automatically. * feat: add /stop command to kill all background processes Adds a /stop slash command that kills all running background processes at once. Currently users have to process(list) then process(kill) for each one individually. Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current turn) from /stop (cleans up background processes). See openai/codex#14602. Ctrl+C continues to only interrupt the active agent turn — background dev servers, watchers, etc. are preserved. /stop is the explicit way to clean them all up. * feat: first-class plugin architecture + hide status bar cost by default (#1544) The persistent status bar now shows context %, token counts, and duration but NOT $ cost by default. Cost display is opt-in via: display: show_cost: true in config.yaml, or: hermes config set display.show_cost true The /usage command still shows full cost breakdown since the user explicitly asked for it — this only affects the always-visible bar. Status bar without cost: ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m Status bar with show_cost: true: ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m * feat: improve memory prioritization + aggressive skill updates (inspired by OpenAI Codex) * feat: improve memory prioritization — user preferences over procedural knowledge Inspired by OpenAI Codex's memory prompt improvements (openai/codex#14493) which focus memory writes on user preferences and recurring patterns rather than procedural task details. Key insight: 'Optimize for reducing future user steering — the most valuable memory prevents the user from having to repeat themselves.' Changes: - MEMORY_GUIDANCE (prompt_builder.py): added prioritization hierarchy and the core principle about reducing user steering - MEMORY_SCHEMA (memory_tool.py): reordered WHEN TO SAVE list to put corrections first, added explicit PRIORITY guidance - Memory nudge (run_agent.py): now asks specifically about preferences, corrections, and workflow patterns instead of generic 'anything' - Memory flush (run_agent.py): now instructs to prioritize user preferences and corrections over task-specific details * feat: more aggressive skill creation and update prompting Press harder on skill updates — the agent should proactively patch skills when it encounters issues during use, not wait to be asked. Changes: - SKILLS_GUIDANCE: 'consider saving' → 'save'; added explicit instruction to patch skills immediately when found outdated/wrong - Skills header: added instruction to update loaded skills before finishing if they had missing steps or wrong commands - Skill nudge: more assertive ('save the approach' not 'consider saving'), now also prompts for updating existing skills used in the task - Skill nudge interval: lowered default from 15 to 10 iterations - skill_manage schema: added 'patch it immediately' to update triggers * feat: first-class plugin architecture (#1555) Plugin system for extending Hermes with custom tools, hooks, and integrations — no source code changes required. Core system (hermes_cli/plugins.py): - Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and pip entry_points (hermes_agent.plugins group) - PluginContext with register_tool() and register_hook() - 6 lifecycle hooks: pre/post tool_call, pre/post llm_call, on_session_start/end - Namespace package handling for relative imports in plugins - Graceful error isolation — broken plugins never crash the agent Integration (model_tools.py): - Plugin discovery runs after built-in + MCP tools - Plugin tools bypass toolset filter via get_plugin_tool_names() - Pre/post tool call hooks fire in handle_function_call() CLI: - /plugins command shows loaded plugins, tool counts, status - Added to COMMANDS dict for autocomplete Docs: - Getting started guide (build-a-hermes-plugin.md) — full tutorial building a calculator plugin step by step - Reference page (features/plugins.md) — quick overview + tables - Covers: file structure, schemas, handlers, hooks, data files, bundled skills, env var gating, pip distribution, common mistakes Tests: 16 tests covering discovery, loading, hooks, tool visibility. * fix: hermes update causes dual gateways on macOS (launchd) Three bugs worked together to create the dual-gateway problem: 1. cmd_update only checked systemd for gateway restart, completely ignoring launchd on macOS. After killing the PID it would print 'Restart it with: hermes gateway run' even when launchd was about to auto-respawn the process. 2. launchd's KeepAlive.SuccessfulExit=false respawns the gateway after SIGTERM (non-zero exit), so the user's manual restart created a second instance. 3. The launchd plist lacked --replace (systemd had it), so the respawned gateway didn't kill stale instances on startup. Fixes: - Add --replace to launchd ProgramArguments (matches systemd) - Add launchd detection to cmd_update's auto-restart logic - Print 'auto-restart via launchd' instead of manual restart hint * fix: add launchd plist auto-refresh + explicit restart in cmd_update Two integration issues with the initial fix: 1. Existing macOS users with old plist (no --replace) would never get the fix until manual uninstall/reinstall. Added refresh_launchd_plist_if_needed() — mirrors the existing refresh_systemd_unit_if_needed(). Called from launchd_start(), launchd_restart(), and cmd_update. 2. cmd_update relied on KeepAlive respawn after SIGTERM rather than explicit launchctl stop/start. This caused races: launchd would respawn the old process before the PID file was cleaned up. Now does explicit stop+start (matching how systemd gets an explicit systemctl restart), with plist refresh first so the new --replace flag is picked up. --------- Co-authored-by: Ninja <ninja@local> Co-authored-by: alireza78a <alireza78a@users.noreply.github.com> Co-authored-by: Oktay Aydin <113846926+aydnOktay@users.noreply.github.com> Co-authored-by: JP Lew <polydegen@protonmail.com> Co-authored-by: an420eth <an420eth@users.noreply.github.com>	2026-03-16 12:36:29 -07:00
teknium1	942950f5b9	feat(cli): live reasoning token streaming — dim box above response When both display.streaming and display.show_reasoning are enabled, reasoning tokens stream in real-time into a dim bordered box. When content tokens start arriving, the reasoning box closes and the response box opens — smooth visual transition. - _stream_reasoning_delta(): line-buffered rendering in dim text - _close_reasoning_box(): flush + close, called on first content token - Reasoning callback routes to streaming version when both flags set - Skips static post-response reasoning display when streamed live - State reset per turn via _reset_stream_state() Works with reasoning_content deltas (OpenRouter reasoning mode) and thinking_delta events (Anthropic extended thinking).	2026-03-16 10:29:55 -07:00
teknium1	d3687d3e81	docs: document planned live reasoning token display as future enhancement The streaming infrastructure already fires reasoning deltas via _fire_reasoning_delta() during streaming. The remaining work is the CLI display layer: a dim reasoning box that opens on first reasoning token, streams live, then transitions to the response box. Reference: PR #1214 (raulvidis) for gateway reasoning visibility.	2026-03-16 10:22:44 -07:00
teknium1	23b9d88a76	docs: add streaming config to cli-config.yaml.example and defaults Documents the new streaming options in the example config: - display.streaming for CLI (under display section) - streaming.enabled + transport/interval/threshold/cursor for gateway - Added streaming: false to load_cli_config() defaults dict	2026-03-16 07:53:08 -07:00

1 2 3 4 5 ...

322 Commits