hermes-agent

Author	SHA1	Message	Date
Teknium	de9bba8d7c	fix: remove hardcoded OpenRouter/opus defaults No model, base_url, or provider is assumed when the user hasn't configured one. Previously the defaults dict in cli.py, AIAgent constructor args, and several fallback paths all hardcoded anthropic/claude-opus-4.6 + openrouter.ai/api/v1 — silently routing unconfigured users to OpenRouter, which 404s for anyone using a different provider. Now empty defaults force the setup wizard to run, and existing users who already completed setup are unaffected (their config.yaml has the model they chose). Files changed: - cli.py: defaults dict, _DEFAULT_CONFIG_MODEL - run_agent.py: AIAgent.__init__ defaults, main() defaults - hermes_cli/config.py: DEFAULT_CONFIG - hermes_cli/runtime_provider.py: is_fallback sentinel - acp_adapter/session.py: default_model - tests: updated to reflect empty defaults	2026-04-01 15:22:26 -07:00
Teknium	c59ab8b0da	fix: profile model.model promoted to model.default when default not set When a profile config sets model.model but not model.default, the hardcoded default (claude-opus-4.6) survived the config merge and took precedence in HermesCLI.__init__ because it checks model.default first. Profile model configs were silently ignored. Now model.model is promoted to model.default during the merge when the user didn't explicitly set model.default. Fixes #4486.	2026-04-01 13:46:18 -07:00
Teknium	16d9f58445	fix(gateway): persist memory flush state to prevent redundant re-flushes on restart (#4481 ) * fix: force-close TCP sockets on client cleanup, detect and recover dead connections When a provider drops connections mid-stream (e.g. OpenRouter outage), httpx's graceful close leaves sockets in CLOSE-WAIT indefinitely. These zombie connections accumulate and can prevent recovery without restarting. Changes: - _force_close_tcp_sockets: walks the httpx connection pool and issues socket.shutdown(SHUT_RDWR) + close() to force TCP RST on every socket when a client is closed, preventing CLOSE-WAIT accumulation - _cleanup_dead_connections: probes the primary client's pool for dead sockets (recv MSG_PEEK), rebuilds the client if any are found - Pre-turn health check at the start of each run_conversation call that auto-recovers with a user-facing status message - Primary client rebuild after stale stream detection to purge pool - User-facing messages on streaming connection failures: "Connection to provider dropped — Reconnecting (attempt 2/3)" "Connection failed after 3 attempts — try again in a moment" Made-with: Cursor * fix: pool entry missing base_url for openrouter, clean error messages - _resolve_runtime_from_pool_entry: add OPENROUTER_BASE_URL fallback when pool entry has no runtime_base_url (pool entries from auth.json credential_pool often omit base_url) - Replace Rich console.print for auth errors with plain print() to prevent ANSI escape code mangling through prompt_toolkit's stdout patch - Force-close TCP sockets on client cleanup to prevent CLOSE-WAIT accumulation after provider outages - Pre-turn dead connection detection with auto-recovery and user message - Primary client rebuild after stale stream detection - User-facing status messages on streaming connection failures/retries Made-with: Cursor * fix(gateway): persist memory flush state to prevent redundant re-flushes on restart The _session_expiry_watcher tracked flushed sessions in an in-memory set (_pre_flushed_sessions) that was lost on gateway restart. Expired sessions remained in sessions.json and were re-discovered every restart, causing redundant AIAgent runs that burned API credits and blocked the event loop. Fix: Add a memory_flushed boolean field to SessionEntry, persisted in sessions.json. The watcher sets it after a successful flush. On restart, the flag survives and the watcher skips already-flushed sessions. - Add memory_flushed field to SessionEntry with to_dict/from_dict support - Old sessions.json entries without the field default to False (backward compat) - Remove the ephemeral _pre_flushed_sessions set from SessionStore - Update tests: save/load roundtrip, legacy entry compat, auto-reset behavior	2026-04-01 12:05:02 -07:00
kshitijk4poor	935137f0d9	feat: add inline diff previews for write actions Show inline diffs in the CLI transcript when write_file, patch, or skill_manage modifies files. Captures a filesystem snapshot before the tool runs, computes a unified diff after, and renders it with ANSI coloring in the activity feed. Adds tool_start_callback and tool_complete_callback hooks to AIAgent for pre/post tool execution notifications. Also fixes _extract_parallel_scope_path to normalize relative paths to absolute, preventing the parallel overlap detection from missing conflicts when the same file is referenced with different path styles. Gated by display.inline_diffs config option (default: true). Based on PR #3774 by @kshitijk4poor.	2026-04-01 02:13:57 -07:00
Teknium	996250d178	fix(cli): pin entire TUI to bottom of terminal on startup (#4412 ) Replace the per-response padding from PR #4359 (which created a void between short responses and the prompt) with a one-time initial scroll at session start. Prints terminal_height newlines before the banner so the cursor starts at the bottom row — banner, responses, and prompt all appear pinned to the bottom with empty space above, not below. patch_stdout naturally keeps the prompt at the bottom from there, so no per-response padding is needed.	2026-04-01 01:41:09 -07:00
Johannnnn506	9b99ea176e	fix(cli): initialize ctx_len before compact banner path	2026-04-01 01:12:23 -07:00
Teknium	a7f7e87070	fix: preserve credential_pool through smart routing and defer eager fallback on 429 (#4361 ) Three bugs prevented credential pool rotation from working when multiple Codex OAuth tokens were configured: 1. credential_pool was dropped during smart model turn routing. resolve_turn_route() constructed runtime dicts without it, so the AIAgent was created without pool access. Fixed in smart_model_routing.py (no-route and fallback paths), cli.py, and gateway/run.py. 2. Eager fallback fired before pool rotation on 429. The rate-limit handler at line ~7180 switched to a fallback provider immediately, before _recover_with_credential_pool got a chance to rotate to the next credential. Now deferred when the pool still has credentials. 3. (Non-issue) Retry budget was reported as too small, but successful pool rotations already skip retry_count increment — no change needed. Reported by community member Schinsly who identified all three root causes and verified the fix locally with multiple Codex accounts.	2026-04-01 01:02:34 -07:00
Teknium	f8cb54ba04	fix(cli): anchor input prompt near bottom of terminal after responses (#4359 ) After short agent responses, the prompt_toolkit input area sat mid-screen with empty terminal space below it. Now prints padding newlines (half terminal height) after each response to push the prompt toward the bottom. patch_stdout renders the padding above the input area.	2026-03-31 14:56:35 -07:00
Teknium	1b62ad9de7	fix: root-level provider in config.yaml no longer overrides model.provider load_cli_config() had a priority inversion: a stale root-level 'provider' key in config.yaml would OVERRIDE the canonical 'model.provider' set by 'hermes model'. The gateway reads model.provider directly from YAML and worked correctly, but 'hermes chat -q' and the interactive CLI went through the merge logic and picked up the stale root-level key. Fix: root-level provider/base_url are now only used as a fallback when model.provider/model.base_url is not set (never as an override). Also added _normalize_root_model_keys() to config.py load_config() and save_config() — migrates root-level provider/base_url into the model section and removes the root-level keys permanently. Reported by (≧▽≦) in Discord: opencode-go provider persisted as a root-level key and overrode the correct model.provider=openrouter, causing 401 errors.	2026-03-31 12:54:22 -07:00
binhnt92	c94a5fa1b2	fix(cli): use atomic write in save_config_value to prevent config loss on interrupt save_config_value() used bare open(path, 'w') + yaml.dump() which truncates the file to zero bytes on open. If the process is interrupted mid-write, config.yaml is left empty. Replace with atomic_yaml_write() (temp file + fsync + os.replace), matching the gateway config write path. Co-authored-by: Hermes Agent <hermes@nousresearch.com>	2026-03-31 12:21:55 -07:00
Teknium	57625329a2	docs+feat: comprehensive local LLM provider guides and context length warning (#4294 ) * docs: update llama.cpp section with --jinja flag and tool calling guide The llama.cpp docs were missing the --jinja flag which is required for tool calling to work. Without it, models output tool calls as raw JSON text instead of structured API responses, making Hermes unable to execute them. Changes: - Add --jinja and -fa flags to the server startup example - Replace deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with hermes model interactive setup - Add caution block explaining the --jinja requirement and symptoms - List models with native tool calling support - Add /props endpoint verification tip * docs+feat: comprehensive local LLM provider guides and context length warning Docs (providers.md): - Rewrote Ollama section with context length warning (defaults to 4k on <24GB VRAM), three methods to increase it, and verification steps - Rewrote vLLM section with --max-model-len, tool calling flags (--enable-auto-tool-choice, --tool-call-parser), and context guidance - Rewrote SGLang section with --context-length, --tool-call-parser, and warning about 128-token default max output - Added LM Studio section (port 1234, context length defaults to 2048, tool calling since 0.3.6) - Added llama.cpp context length flag (-c) and GPU offload (-ngl) - Added Troubleshooting Local Models section covering: - Tool calls appearing as text (with per-server fix table) - Silent context truncation and diagnosis commands - Low detected context at startup - Truncated responses - Replaced all deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with hermes model interactive setup and config.yaml examples - Added deprecation warning for legacy env vars in General Setup Code (cli.py): - Added context length warning in show_banner() when detected context is <= 8192 tokens, with server-specific fix hints: - Ollama (port 11434): suggests OLLAMA_CONTEXT_LENGTH env var - LM Studio (port 1234): suggests model settings adjustment - Other servers: suggests config.yaml override Tests: - 9 new tests covering warning thresholds, server-specific hints, and no-warning cases	2026-03-31 11:42:48 -07:00
Teknium	8d59881a62	feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647 ) * feat(auth): add same-provider credential pools and rotation UX Add same-provider credential pooling so Hermes can rotate across multiple credentials for a single provider, recover from exhausted credentials without jumping providers immediately, and configure that behavior directly in hermes setup. - agent/credential_pool.py: persisted per-provider credential pools - hermes auth add/list/remove/reset CLI commands - 429/402/401 recovery with pool rotation in run_agent.py - Setup wizard integration for pool strategy configuration - Auto-seeding from env vars and existing OAuth state Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Salvaged from PR #2647 * fix(tests): prevent pool auto-seeding from host env in credential pool tests Tests for non-pool Anthropic paths and auth remove were failing when host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials were present. The pool auto-seeding picked these up, causing unexpected pool entries in tests. - Mock _select_pool_entry in auxiliary_client OAuth flag tests - Clear Anthropic env vars and mock _seed_from_singletons in auth remove test * feat(auth): add thread safety, least_used strategy, and request counting - Add threading.Lock to CredentialPool for gateway thread safety (concurrent requests from multiple gateway sessions could race on pool state mutations without this) - Add 'least_used' rotation strategy that selects the credential with the lowest request_count, distributing load more evenly - Add request_count field to PooledCredential for usage tracking - Add mark_used() method to increment per-credential request counts - Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current() with lock acquisition - Add tests: least_used selection, mark_used counting, concurrent thread safety (4 threads × 20 selects with no corruption) * feat(auth): add interactive mode for bare 'hermes auth' command When 'hermes auth' is called without a subcommand, it now launches an interactive wizard that: 1. Shows full credential pool status across all providers 2. Offers a menu: add, remove, reset cooldowns, set strategy 3. For OAuth-capable providers (anthropic, nous, openai-codex), the add flow explicitly asks 'API key or OAuth login?' — making it clear that both auth types are supported for the same provider 4. Strategy picker shows all 4 options (fill_first, round_robin, least_used, random) with the current selection marked 5. Remove flow shows entries with indices for easy selection The subcommand paths (hermes auth add/list/remove/reset) still work exactly as before for scripted/non-interactive use. * fix(tests): update runtime_provider tests for config.yaml source of truth (#4165) Tests were using OPENAI_BASE_URL env var which is no longer consulted after #4165. Updated to use model config (provider, base_url, api_key) which is the new single source of truth for custom endpoint URLs. * feat(auth): support custom endpoint credential pools keyed by provider name Custom OpenAI-compatible endpoints all share provider='custom', making the provider-keyed pool useless. Now pools for custom endpoints are keyed by 'custom:<normalized_name>' where the name comes from the custom_providers config list (auto-generated from URL hostname). - Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)' - load_pool('custom:name') seeds from custom_providers api_key AND model.api_key when base_url matches - hermes auth add/list now shows custom endpoints alongside registry providers - _resolve_openrouter_runtime and _resolve_named_custom_runtime check pool before falling back to single config key - 6 new tests covering custom pool keying, seeding, and listing * docs: add Excalidraw diagram of full credential pool flow Comprehensive architecture diagram showing: - Credential sources (env vars, auth.json OAuth, config.yaml, CLI) - Pool storage and auto-seeding - Runtime resolution paths (registry, custom, OpenRouter) - Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh) - CLI management commands and strategy configuration Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g * fix(tests): update setup wizard pool tests for unified select_provider_and_model flow The setup wizard now delegates to select_provider_and_model() instead of using its own prompt_choice-based provider picker. Tests needed: - Mock select_provider_and_model as no-op (provider pre-written to config) - Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it) - Pre-write model.provider to config so the pool step is reached * docs: add comprehensive credential pool documentation - New page: website/docs/user-guide/features/credential-pools.md Full guide covering quick start, CLI commands, rotation strategies, error recovery, custom endpoint pools, auto-discovery, thread safety, architecture, and storage format. - Updated fallback-providers.md to reference credential pools as the first layer of resilience (same-provider rotation before cross-provider) - Added hermes auth to CLI commands reference with usage examples - Added credential_pool_strategies to configuration guide * chore: remove excalidraw diagram from repo (external link only) * refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns - _load_config_safe(): replace 4 identical try/except/import blocks - _iter_custom_providers(): shared generator for custom provider iteration - PooledCredential.extra dict: collapse 11 round-trip-only fields (token_type, scope, client_id, portal_base_url, obtained_at, expires_in, agent_key_id, agent_key_expires_in, agent_key_reused, agent_key_obtained_at, tls) into a single extra dict with __getattr__ for backward-compatible access - _available_entries(): shared exhaustion-check between select and peek - Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical) - SimpleNamespace replaces class _Args boilerplate in auth_commands - _try_resolve_from_custom_pool(): shared pool-check in runtime_provider Net -17 lines. All 383 targeted tests pass. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-31 03:10:01 -07:00
Teknium	f890a94c12	refactor: make config.yaml the single source of truth for endpoint URLs (#4165 ) OPENAI_BASE_URL was written to .env AND config.yaml, creating a dual-source confusion. Users (especially Docker) would see the URL in .env and assume that's where all config lives, then wonder why LLM_MODEL in .env didn't work. Changes: - Remove all 27 save_env_value("OPENAI_BASE_URL", ...) calls across main.py, setup.py, and tools_config.py - Remove OPENAI_BASE_URL env var reading from runtime_provider.py, cli.py, models.py, and gateway/run.py - Remove LLM_MODEL/HERMES_MODEL env var reading from gateway/run.py and auxiliary_client.py — config.yaml model.default is authoritative - Vision base URL now saved to config.yaml auxiliary.vision.base_url (both setup wizard and tools_config paths) - Tests updated to set config values instead of env vars Convention enforced: .env is for SECRETS only (API keys). All other configuration (model names, base URLs, provider selection) lives exclusively in config.yaml.	2026-03-30 22:02:53 -07:00
Teknium	1bd206ea5d	feat: add /btw command for ephemeral side questions (#4161 ) Adds /btw <question> — ask a quick follow-up using the current session context without interrupting the main conversation. - Snapshots conversation history, answers with a no-tools agent - Response is not persisted to session history or DB - Runs in a background thread (CLI) / async task (gateway) - Per-session guard prevents concurrent /btw in gateway Implementation: - model_tools.py: enabled_toolsets=[] now correctly means "no tools" (was falsy, fell through to default "all tools") - run_agent.py: persist_session=False gates _persist_session() - cli.py: _handle_btw_command (background thread, Rich panel output) - gateway/run.py: _handle_btw_command + _run_btw_task (async task) - hermes_cli/commands.py: CommandDef for "btw" Inspired by PR #3504 by areu01or00, reimplemented cleanly on current main with the enabled_toolsets=[] fix and without the __btw_no_tools__ hack.	2026-03-30 21:10:05 -07:00
Teknium	c1ef9b2250	fix(cli): ensure on_session_end hook fires on interrupted exits (#4159 ) - Add SIGTERM/SIGHUP signal handlers for graceful shutdown - Add BrokenPipeError to exit exception handling (SSH disconnects) - Fire on_session_end plugin hook in finally block, guarded by _agent_running to avoid double-firing on normal exits (the hook already fires per-turn from run_conversation) Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>	2026-03-30 20:37:17 -07:00
Teknium	13f3e67165	ux: show 'Initializing agent...' on first message (#4086 ) Display a brief status message before the heavy agent initialization (OpenAI client setup, tool loading, memory init, etc.) so users aren't staring at a blank screen for several seconds. Only prints when self.agent is None (first use or after model switch). Closes #4060 Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>	2026-03-30 17:05:40 -07:00
Teknium	f93637b3a1	feat: add /profile slash command to show active profile (#4027 ) Adds /profile to COMMAND_REGISTRY (Info category) with handlers in both CLI and gateway. Shows the active profile name and home directory. Works on all platforms — CLI, Telegram, Discord, Slack, etc. Detects profile by checking if HERMES_HOME is under ~/.hermes/profiles/. Shows 'default' when running without a profile.	2026-03-30 13:20:06 -07:00
Teknium	0976bf6cd0	feat: add /yolo slash command to toggle dangerous command approvals (#3990 ) Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping all dangerous command approval prompts for the current session. Works in both CLI and gateway (Telegram, Discord, etc.). - /yolo -> ON: all commands auto-approved, no confirmation prompts - /yolo -> OFF: normal approval flow restored The --yolo CLI flag already existed for launch-time opt-in. This adds the ability to toggle mid-session without restarting. Session-scoped — resets when the process ends. Uses the existing HERMES_YOLO_MODE env var that check_all_command_guards() already respects.	2026-03-30 11:17:09 -07:00
Teknium	0e592aa5b4	fix(cli): remove input() from /tools disable that freezes the terminal (#3918 ) input() hangs inside prompt_toolkit's TUI event loop — this is a known pitfall (AGENTS.md). The /tools disable and /tools enable commands used input() for a Y/N confirmation prompt, causing the terminal to freeze with no way to type a response. Fix: remove the confirmation prompt. The user typing '/tools disable web' is implicit consent. The change is applied directly with a status message.	2026-03-30 02:53:21 -07:00
Wing Lian	efae525dc5	feat(plugins): add inject_message interface for remote message injection (#3778 )	2026-03-30 02:48:06 -07:00
kshitij	c288bbfb57	fix(cli): prevent status bar wrapping into duplicate rows (#3883 ) - measure status bar display width using prompt_toolkit cell widths - trim rendered status text when fragments would overflow - add a final single-fragment fallback to prevent wrapping - update width assertions to validate display cells instead of len()	2026-03-29 23:59:07 -07:00
Teknium	86ac23c8da	fix(auth): stop silently falling back to OpenRouter when no provider is configured (#3862 ) Previously, when no API keys or provider credentials were found, Hermes silently defaulted to OpenRouter + Claude Opus. This caused confusion when users configured local servers (LM Studio, Ollama, etc.) with a typo or unrecognized provider name — the system would silently route to OpenRouter instead of telling them something was wrong. Changes: - resolve_provider() now raises AuthError when no credentials are found instead of returning 'openrouter' as a silent fallback - Added local server aliases: lmstudio, ollama, vllm, llamacpp → custom - Removed hardcoded 'anthropic/claude-opus-4.6' fallback from gateway and cron scheduler (they read from config.yaml instead) - Updated cli-config.yaml.example with complete provider documentation including all supported providers, aliases, and local server setup	2026-03-29 21:06:35 -07:00
Teknium	e314833c9d	feat(display): configurable tool preview length -- show full paths by default (#3841 ) Tool call previews (paths, commands, queries) were hardcoded to truncate at 35-40 chars across CLI spinners, completion lines, and gateway progress messages. Users could not see full file paths in tool output. New config option: display.tool_preview_length (default 0 = no limit). Set a positive number to truncate at that length. Changes: - display.py: module-level _tool_preview_max_len with getter/setter; build_tool_preview() and get_cute_tool_message() _trunc/_path respect it - cli.py: reads config at startup, spinner widget respects config - gateway/run.py: reads config per-message, progress callback respects config - run_agent.py: removed redundant 30-char quiet-mode spinner truncation - config.py: added display.tool_preview_length to DEFAULT_CONFIG Reported by kriskaminski	2026-03-29 18:02:42 -07:00
Teknium	252fbea005	feat(providers): add ordered fallback provider chain (salvage #1761 ) (#3813 ) Extends the single fallback_model mechanism into an ordered chain. When the primary model fails, Hermes tries each fallback provider in sequence until one succeeds or the chain is exhausted. Config format (new): fallback_providers: - provider: openrouter model: anthropic/claude-sonnet-4 - provider: openai model: gpt-4o Legacy single-dict fallback_model format still works unchanged. Key fix vs original PR: the call sites in the retry loop now use _fallback_index < len(_fallback_chain) instead of the old one-shot _fallback_activated guard, so the chain actually advances through all configured providers. Changes: - run_agent.py: _fallback_chain list + _fallback_index replaces one-shot _fallback_model; _try_activate_fallback() advances through chain; failed provider resolution skips to next entry; call sites updated to allow chain advancement - cli.py: reads fallback_providers with legacy fallback_model compat - gateway/run.py: same - hermes_cli/config.py: fallback_providers: [] in DEFAULT_CONFIG - tests: 12 new chain tests + 6 existing test fixtures updated Co-authored-by: uzaylisak <uzaylisak@users.noreply.github.com>	2026-03-29 16:04:53 -07:00
Teknium	0fd3b59ba1	feat(cli): add Ctrl+Z process suspend support (#3802 ) Adds a Ctrl+Z key binding to suspend the hermes CLI to background using standard Unix job control. Uses prompt_toolkit's run_in_terminal() to properly save/restore terminal state, then sends SIGTSTP to the process group. Prints a branded message with resume instructions. Shows a not-supported notice on Windows. Co-authored-by: CharlieKerfoot <CharlieKerfoot@users.noreply.github.com>	2026-03-29 15:47:55 -07:00
Teknium	f6db1b27ba	feat: add profiles — run multiple isolated Hermes instances (#3681 ) Each profile is a fully independent HERMES_HOME with its own config, API keys, memory, sessions, skills, gateway, cron, and state.db. Core module: hermes_cli/profiles.py (~900 lines) - Profile CRUD: create, delete, list, show, rename - Three clone levels: blank, --clone (config), --clone-all (everything) - Export/import: tar.gz archive for backup and migration - Wrapper alias scripts (~/.local/bin/<name>) - Collision detection for alias names - Sticky default via ~/.hermes/active_profile - Skill seeding via subprocess (handles module-level caching) - Auto-stop gateway on delete with disable-before-stop for services - Tab completion generation for bash and zsh CLI integration (hermes_cli/main.py): - _apply_profile_override(): pre-import -p/--profile flag + sticky default - Full 'hermes profile' subcommand: list, use, create, delete, show, alias, rename, export, import - 'hermes completion bash/zsh' command - Multi-profile skill sync in hermes update Display (cli.py, banner.py, gateway/run.py): - CLI prompt: 'coder ❯' when using a non-default profile - Banner shows profile name - Gateway startup log includes profile name Gateway safety: - Token locks: Discord, Slack, WhatsApp, Signal (extends Telegram pattern) - Port conflict detection: API server, webhook adapter Diagnostics (hermes_cli/doctor.py): - Profile health section: lists profiles, checks config, .env, aliases - Orphan alias detection: warns when wrapper points to deleted profile Tests (tests/hermes_cli/test_profiles.py): - 71 automated tests covering: validation, CRUD, clone levels, rename, export/import, active profile, isolation, alias collision, completion - Full suite: 6760 passed, 0 new failures Documentation: - website/docs/user-guide/profiles.md: full user guide (12 sections) - website/docs/reference/profile-commands.md: command reference (12 commands) - website/docs/reference/faq.md: 6 profile FAQ entries - website/sidebars.ts: navigation updated	2026-03-29 10:41:20 -07:00
Teknium	9f01244137	fix: replace user-facing hardcoded ~/.hermes paths with display_hermes_home() Prep for profiles: user-facing messages now use display_hermes_home() so diagnostic output shows the correct path for each profile. New helper: display_hermes_home() in hermes_constants.py 12 files swept, ~30 user-facing string replacements. Includes dynamic TTS schema description.	2026-03-28 23:47:21 -07:00
Teknium	0bd7e95dfc	fix(honcho): allow self-hosted local instances without API key (#3644 ) Self-hosted Honcho on localhost doesn't require authentication, but both the activation gates and the SDK client required an API key. Combined fix from three contributor PRs: - Relax all 8 activation gates to accept (api_key OR base_url) as valid credentials (#3482 by @cameronbergh) - Use 'local' placeholder for the SDK client when base_url points to localhost/127.0.0.1/::1 (#3570 by @ygd58) Files changed: run_agent.py (2 gates), cli.py (1 gate), gateway/run.py (1 gate), honcho_integration/cli.py (2 gates), hermes_cli/doctor.py (2 gates), honcho_integration/client.py (SDK). Co-authored-by: cameronbergh <cameronbergh@users.noreply.github.com> Co-authored-by: ygd58 <ygd58@users.noreply.github.com> Co-authored-by: devorun <devorun@users.noreply.github.com>	2026-03-28 17:49:56 -07:00
Teknium	bea49e02a3	fix: route /bg spinner through TUI widget to prevent status bar collision (#3643 ) Background agent's KawaiiSpinner wrote \r-based animation and stop() messages through StdoutProxy, colliding with prompt_toolkit's status bar. Two fixes: - display.py: use isinstance(out, StdoutProxy) instead of fragile hasattr+name check for detecting prompt_toolkit's stdout wrapper - cli.py: silence bg agent's raw spinner (_print_fn=no-op) and route thinking updates through the TUI widget only when no foreground agent is active; clear spinner text in finally block with same guard Closes #2718 Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-28 17:29:37 -07:00
Teknium	857a5d7b47	fix: sanitize surrogate characters from clipboard paste to prevent UnicodeEncodeError (#3624 ) Pasting text from rich-text editors (Google Docs, Word, etc.) can inject lone surrogate characters (U+D800..U+DFFF) that are invalid UTF-8. The OpenAI SDK serializes messages with ensure_ascii=False, then encodes to UTF-8 for the HTTP body — surrogates crash this with: UnicodeEncodeError: 'utf-8' codec can't encode character '\udce2' Three-layer fix: 1. Primary: sanitize user_message at the top of run_conversation() 2. CLI: sanitize in chat() before appending to conversation_history 3. Safety net: catch UnicodeEncodeError in the API error handler, sanitize the entire messages list in-place, and retry once. Also exclude UnicodeEncodeError from is_local_validation_error so it doesn't get classified as non-retryable. Includes 14 new tests covering the sanitization helpers and the integration with run_conversation().	2026-03-28 16:53:14 -07:00
Teknium	b029742092	fix(cli): strengthen paste collapse fallback for terminals without bracketed paste (#3625 ) The _on_text_changed fallback only detected pastes when all characters arrived in a single event (chars_added > 1). Some terminals (notably VSCode integrated terminal in certain configs) may deliver paste data differently, causing the fallback to miss. Add a second heuristic: if the newline count jumps by 4+ in a single text-change event, treat it as a paste. Alt+Enter only adds 1 newline per event, so this never false-positives on manual multi-line input. Also fixes: the fallback path was missing _paste_just_collapsed flag set before replacing buffer text, which could cause a re-trigger loop.	2026-03-28 15:40:49 -07:00
Teknium	e4480ff426	fix(config): accept 'model' key as alias for 'default' in model config (#3603 ) Users intuitively write model: { model: my-model } instead of model: { default: my-model } and it silently falls back to the hardcoded default. Now both spellings work across all three config consumers: runtime_provider, CLI, and gateway. Co-authored-by: ygd58 <ygd58@users.noreply.github.com>	2026-03-28 14:55:27 -07:00
Teknium	9a364f2805	fix: cap percentage displays at 100% in stats, gateway, and memory tool (#3599 ) Salvage of PR #3533 (binhnt92). Follow-up to #3480 — applies min(100, ...) to 5 remaining unclamped percentage display sites in context_compressor, cli /stats, gateway /stats, and memory tool. Defensive clamps now that the root cause (estimation heuristic) was already removed in #3480. Co-Authored-By: binhnt92 <binhnt92@users.noreply.github.com>	2026-03-28 14:55:18 -07:00
Teknium	1b2d4f21f3	feat(cli): show resume-by-title command in exit summary (#3607 ) When exiting a session that has a title (auto-generated or manual), the exit summary now also shows: hermes -c "Session Title" alongside the existing hermes --resume <id> command. Also adds the title to the session info block.	2026-03-28 14:54:53 -07:00
Teknium	be39292633	fix(cli): guard .strip() against None values from YAML config (#3552 ) dict.get(key, default) only returns default when key is ABSENT. When YAML has 'key:' with no value, it parses as None — .get() returns None, then .strip() crashes with AttributeError. Use (x or '') pattern to handle both missing and null cases. Salvaged from PR #3217. Co-authored-by: erosika <erosika@users.noreply.github.com>	2026-03-28 11:39:01 -07:00
Teknium	e4e04c2005	fix: make tirith block verdicts approvable instead of hard-blocking (#3428 ) Previously, tirith exit code 1 (block) immediately rejected the command with no approval prompt — users saw 'BLOCKED: Command blocked by security scan' and the agent moved on. This prevented gateway/CLI users from approving pipe-to-shell installs like 'curl ... \| sh' even when they understood the risk. Changes: - Tirith 'block' and 'warn' now both go through the approval flow. Users see the full tirith findings (severity, title, description, safer alternatives) and can choose to approve or deny. - New _format_tirith_description() builds rich descriptions from tirith findings JSON so the approval prompt is informative. - CLI startup now warns when tirith is enabled but not available, so users know command scanning is degraded to pattern matching only. The default approval choice is still deny, so the security posture is unchanged for unattended/timeout scenarios. Reported via Discord by pistrie — 'curl -fsSL https://mandex.dev/install.sh \| sh' was hard-blocked with no way to approve.	2026-03-27 13:22:01 -07:00
Teknium	8ecd7aed2c	fix: prevent reasoning box from rendering 3x during tool-calling loops (#3405 ) Two independent bugs caused the reasoning box to appear three times when the model produced reasoning + tool_calls: Bug A: _build_assistant_message() re-fired reasoning_callback with the full reasoning text even when streaming had already displayed it. The original guard only checked structured reasoning_content deltas, but reasoning also arrives via content tag extraction (<REASONING_SCRATCHPAD>/<think> tags in delta.content), which went through _fire_stream_delta not _fire_reasoning_delta. Fix: skip the callback entirely when streaming is active — both paths display reasoning during the stream. Any reasoning not shown during streaming is caught by the CLI post-response fallback. Bug B: The post-response reasoning display checked _reasoning_stream_started, but that flag was reset by _reset_stream_state() during intermediate turn boundaries (when stream_delta_callback(None) fires between tool calls). Introduced _reasoning_shown_this_turn flag that persists across the tool loop and is only reset at the start of each user turn. Live-tested in PTY: reasoning now shows exactly once per API call, no duplicates across tool-calling loops.	2026-03-27 09:57:50 -07:00
Teknium	e0dbbdb2c9	fix: eliminate 'Event loop is closed' / 'Press ENTER to continue' during idle (#3398 ) The OpenAI SDK's AsyncHttpxClientWrapper.__del__ schedules aclose() via asyncio.get_running_loop().create_task(). When an AsyncOpenAI client is garbage-collected while prompt_toolkit's event loop is running (the common CLI idle state), the aclose() task runs on prompt_toolkit's loop but the underlying TCP transport is bound to a different (dead) worker loop. The transport's self._loop.call_soon() then raises RuntimeError('Event loop is closed'), which prompt_toolkit surfaces as the disruptive 'Unhandled exception in event loop ... Press ENTER to continue...' error. Three-layer fix: 1. neuter_async_httpx_del(): Monkey-patches __del__ to a no-op at CLI startup before any AsyncOpenAI clients are created. Safe because cached clients are explicitly cleaned via _force_close_async_httpx, and uncached clients' TCP connections are cleaned by the OS on exit. 2. Custom asyncio exception handler: Installed on prompt_toolkit's event loop to silently suppress 'Event loop is closed' RuntimeError. Defense-in-depth for SDK upgrades that might change the class name. 3. cleanup_stale_async_clients(): Called after each agent turn (when the agent thread joins) to proactively evict cache entries whose event loop is closed, preventing stale clients from accumulating.	2026-03-27 09:45:25 -07:00
Teknium	1519c4d477	fix(session): add /resume CLI handler, session log truncation guard, reopen_session API (#3315 ) Three improvements salvaged from PR #3225 by Mibayy: 1. Add /resume slash command handler in CLI process_command(). The command was registered in the commands registry but had no handler, so typing /resume produced 'Unknown command'. The handler resolves by title or session ID, ends the current session cleanly, loads conversation history from SQLite, re-opens the target session, and syncs the AIAgent instance. Follows the same pattern as new_session(). 2. Add truncation guard in _save_session_log(). When resuming a session whose messages weren't fully written to SQLite, the agent starts with partial history and the first save would overwrite the full JSON log on disk. The guard reads the existing file and skips the write if it already has more messages than the current batch. 3. Add reopen_session() method to SessionDB. Proper API for clearing ended_at/end_reason instead of reaching into _conn directly. Note: Bug 1 from the original PR (INSERT OR IGNORE + _session_db = None) is already fixed on main — skipped as redundant. Closes #3123.	2026-03-26 19:04:28 -07:00
Teknium	3c57eaf744	fix: YAML boolean handling for tool_progress config (#3300 ) YAML 1.1 parses bare `off` as boolean False, which is falsy in Python's `or` chain and silently falls through to the 'all' default. Users setting `display.tool_progress: off` in config.yaml saw no effect — tool progress stayed on. Normalise False → 'off' before the or chain in both affected paths: - gateway/run.py _run_agent() tool progress reader - cli.py HermesCLI.__init__() tool_progress_mode Reported by @gibbsoft in #2859. Closes #2859.	2026-03-26 17:58:50 -07:00
Teknium	2d232c9991	feat(cli): configurable busy input mode + fix /queue always working (#3298 ) Two changes: 1. Fix /queue command: remove the _agent_running guard that rejected /queue after the agent finished. The prompt was deferred in _pending_input until the agent completed, then the handler checked _agent_running (now False) and rejected it. /queue now always queues regardless of timing. 2. Add display.busy_input_mode config (CLI-only): - 'interrupt' (default): Enter while busy interrupts the current run (preserves existing behavior) - 'queue': Enter while busy queues the message for the next turn, with a 'Queued for the next turn: ...' confirmation Ctrl+C always interrupts regardless of this setting. Salvaged from PR #3037 by StefanoChiodino. Key differences: - Default is 'interrupt' (preserves existing behavior) not 'queue' - No config version bump (unnecessary for new key in existing section) - Simpler normalization (no alias map) - /queue fix is simpler: just remove the guard instead of intercepting commands during busy state	2026-03-26 17:58:40 -07:00
Teknium	716e616d28	fix(tui): status bar duplicates and degrades during long sessions (#3291 ) shutil.get_terminal_size() can return stale/fallback values on SSH that differ from prompt_toolkit's actual terminal width. Fragments built for the wrong width overflow and wrap onto a second line (wrap_lines=True default), appearing as progressively degrading duplicates. - Read width from get_app().output.get_size().columns when inside a prompt_toolkit TUI, falling back to shutil outside TUI context - Add wrap_lines=False on the status bar Window as belt-and-suspenders guard against any future width mismatch Closes #3130 Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>	2026-03-26 17:33:11 -07:00
Teknium	db241ae6ce	feat(sessions): add --source flag for third-party session isolation (#3255 ) When third-party tools (Paperclip orchestrator, etc.) spawn hermes chat as a subprocess, their sessions pollute user session history and search. - hermes chat --source <tag> (also HERMES_SESSION_SOURCE env var) - exclude_sources parameter on list_sessions_rich() and search_messages() - Sessions with source=tool hidden from sessions list/browse/search - Third-party adapters pass --source tool to isolate agent sessions Cherry-picked from PR #3208 by HenkDz. Co-authored-by: Henkey <noonou7@gmail.com>	2026-03-26 14:35:31 -07:00
Teknium	41ee207a5e	fix: catch KeyboardInterrupt in exit cleanup handlers (#3257 ) except Exception does not catch KeyboardInterrupt (inherits from BaseException). A second Ctrl+C during exit cleanup aborts pending writes — Honcho observations dropped, SQLite sessions left unclosed, cron job sessions never marked ended. Changed to except (Exception, KeyboardInterrupt) at all five sites: - cli.py: honcho.shutdown() and end_session() in finally exit block - run_agent.py: _flush_honcho_on_exit atexit handler - cron/scheduler.py: end_session() and close() in job finally block Tests exercise the actual production code paths and confirm KeyboardInterrupt propagates without the fix. Co-authored-by: dieutx <dangtc94@gmail.com>	2026-03-26 14:34:31 -07:00
Teknium	62f8aa9b03	fix: MCP toolset resolution for runtime and config (#3252 ) Gateway sessions had their own inline toolset resolution that only read platform_toolsets from config, which never includes MCP server names. MCP tools were discovered and registered but invisible to the model. - Replace duplicated gateway toolset resolution in _run_agent() and _run_background_task() with calls to the shared _get_platform_tools() - Extend _get_platform_tools() to include globally enabled MCP servers at runtime (include_default_mcp_servers=True), while config-editing flows use include_default_mcp_servers=False to avoid persisting implicit MCP defaults into platform_toolsets - Add homeassistant to PLATFORMS dict (was missing, caused KeyError) - Fix CLI entry point to use _get_platform_tools() as well, so MCP tools are visible in CLI mode too - Remove redundant platform_key reassignment in _run_background_task Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-26 13:39:41 -07:00
Teknium	cbf195e806	chore: fix 154 f-strings, simplify getattr/URL patterns, remove dead code (#3119 ) Three categories of cleanup, all zero-behavioral-change: 1. F-strings without placeholders (154 fixes across 29 files) - Converted f'...' to '...' where no {expression} was present - Heaviest files: run_agent.py (24), cli.py (20), honcho_integration/cli.py (34) 2. Simplify defensive patterns in run_agent.py - Added explicit self._is_anthropic_oauth = False in __init__ (before the api_mode branch that conditionally sets it) - Replaced 7x getattr(self, '_is_anthropic_oauth', False) with direct self._is_anthropic_oauth (attribute always initialized now) - Added _is_openrouter_url() and _is_anthropic_url() helper methods - Replaced 3 inline 'openrouter' in self._base_url_lower checks 3. Remove dead code in small files - hermes_cli/claw.py: removed unused 'total' computation - tools/fuzzy_match.py: removed unused strip_indent() function and pattern_stripped variable Full test suite: 6184 passed, 0 failures E2E PTY: banner clean, tool calls work, zero garbled ANSI	2026-03-25 19:47:58 -07:00
Teknium	08d3be0412	fix: graceful return on max retries instead of crashing thread run_conversation raised the raw exception after exhausting retries, which crashed the background thread in cli.py (unhandled exception in Thread). Now returns a proper error result dict with failed=True and persists the session, matching the pattern used by other error paths (invalid responses, empty content, etc.). Also wraps cli.py's run_agent thread function in try/except as a safety net against any future unhandled exceptions from run_conversation. Made-with: Cursor	2026-03-25 19:00:39 -07:00
Teknium	f46542b6c6	fix(cli): read root-level provider and base_url from config.yaml into model config (#3112 ) When users write root-level provider and base_url in config.yaml (instead of nesting under model:), these keys were never merged into defaults['model']. The CLI reads them from CLI_CONFIG['model']['provider'] so root-level keys were silently ignored, causing fallback to OpenRouter. Merge root-level provider and base_url into defaults['model'] after handling the model key, so custom/local provider configs work regardless of nesting. Cherry-picked from PR #2283 by ygd58. Fixes #2281.	2026-03-25 18:38:32 -07:00
Teknium	9783c9d5c1	refactor: remove /model slash command from CLI and gateway (#3080 ) The /model command is removed from both the interactive CLI and messenger gateway (Telegram/Discord/Slack/WhatsApp). Users can still change models via 'hermes model' CLI subcommand or by editing config.yaml directly. Removed: - CommandDef entry from COMMAND_REGISTRY - CLI process_command() handler and model autocomplete logic - Gateway _handle_model_command() and dispatch - SlashCommandCompleter model_completer_provider parameter - Two-stage Tab completion and ghost text for /model - All /model-specific tests Unaffected: - /provider command (read-only, shows current model + providers) - ACP adapter _cmd_model (separate system for VS Code/Zed/JetBrains) - model_switch.py module (used by ACP) - 'hermes model' CLI subcommand Author: Teknium	2026-03-25 17:03:05 -07:00
Teknium	9d1e13019e	fix(cli): prevent TypeError on startup when base_url is None (#3068 ) Description This PR fixes the startup crash introduced in v0.4.0 where `self.base_url` being `None` throws a `TypeError`. Root Cause: At `cli.py:1108`, a membership check (`"openrouter.ai" in self.base_url`) is performed. If a user's config doesn't explicitly set a `base_url` (meaning it's `None`), Python raises a `TypeError: argument of type 'NoneType' is not iterable`, causing the entire CLI to crash on boot. Fix: Added a simple truthiness guard (`if self.base_url and ...`) to ensure the membership check only occurs if `base_url` is a valid string. Closes #2842 Co-authored-by: devorun <130918800+devorun@users.noreply.github.com>	2026-03-25 16:21:00 -07:00

1 2 3 4 5 ...

398 Commits