hermes-agent

Author	SHA1	Message	Date
Teknium	924bc67eee	feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623) * feat(memory): add pluggable memory provider interface with profile isolation Introduces a pluggable MemoryProvider ABC so external memory backends can integrate with Hermes without modifying core files. Each backend becomes a plugin implementing a standard interface, orchestrated by MemoryManager. Key architecture: - agent/memory_provider.py — ABC with core + optional lifecycle hooks - agent/memory_manager.py — single integration point in the agent loop - agent/builtin_memory_provider.py — wraps existing MEMORY.md/USER.md Profile isolation fixes applied to all 6 shipped plugins: - Cognitive Memory: use get_hermes_home() instead of raw env var - Hindsight Memory: check $HERMES_HOME/hindsight/config.json first, fall back to legacy ~/.hindsight/ for backward compat - Hermes Memory Store: replace hardcoded ~/.hermes paths with get_hermes_home() for config loading and DB path defaults - Mem0 Memory: use get_hermes_home() instead of raw env var - RetainDB Memory: auto-derive profile-scoped project name from hermes_home path (hermes-<profile>), explicit env var overrides - OpenViking Memory: read-only, no local state, isolation via .env MemoryManager.initialize_all() now injects hermes_home into kwargs so every provider can resolve profile-scoped storage without importing get_hermes_home() themselves. Plugin system: adds register_memory_provider() to PluginContext and get_plugin_memory_providers() accessor. Based on PR #3825. 46 tests (37 unit + 5 E2E + 4 plugin registration). * refactor(memory): drop cognitive plugin, rewrite OpenViking as full provider Remove cognitive-memory plugin (#727) — core mechanics are broken: decay runs 24x too fast (hourly not daily), prefetch uses row ID as timestamp, search limited by importance not similarity.
Rewrite openviking-memory plugin from a read-only search wrapper into a full bidirectional memory provider using the complete OpenViking session lifecycle API: - sync_turn: records user/assistant messages to OpenViking session (threaded, non-blocking) - on_session_end: commits session to trigger automatic memory extraction into 6 categories (profile, preferences, entities, events, cases, patterns) - prefetch: background semantic search via find() endpoint - on_memory_write: mirrors built-in memory writes to the session - is_available: checks env var only, no network calls (ABC compliance) Tools expanded from 3 to 5: - viking_search: semantic search with mode/scope/limit - viking_read: tiered content (abstract ~100tok / overview ~2k / full) - viking_browse: filesystem-style navigation (list/tree/stat) - viking_remember: explicit memory storage via session - viking_add_resource: ingest URLs/docs into knowledge base Uses direct HTTP via httpx (no openviking SDK dependency needed). Response truncation on viking_read to prevent context flooding. * fix(memory): harden Mem0 plugin — thread safety, non-blocking sync, circuit breaker - Remove redundant mem0_context tool (identical to mem0_search with rerank=true, top_k=5 — wastes a tool slot and confuses the model) - Thread sync_turn so it's non-blocking — Mem0's server-side LLM extraction can take 5-10s, was stalling the agent after every turn - Add threading.Lock around _get_client() for thread-safe lazy init (prefetch and sync threads could race on first client creation) - Add circuit breaker: after 5 consecutive API failures, pause calls for 120s instead of hammering a down server every turn. Auto-resets after cooldown. Logs a warning when tripped. 
- Track success/failure in prefetch, sync_turn, and all tool calls - Wait for previous sync to finish before starting a new one (prevents unbounded thread accumulation on rapid turns) - Clean up shutdown to join both prefetch and sync threads * fix(memory): enforce single external memory provider limit MemoryManager now rejects a second non-builtin provider with a warning. Built-in memory (MEMORY.md/USER.md) is always accepted. Only ONE external plugin provider is allowed at a time. This prevents tool schema bloat (some providers add 3-5 tools each) and conflicting memory backends. The warning message directs users to configure memory.provider in config.yaml to select which provider to activate. Updated all 47 tests to use builtin + one external pattern instead of multiple externals. Added test_second_external_rejected to verify the enforcement. * feat(memory): add ByteRover memory provider plugin Implements the ByteRover integration (from PR #3499 by hieuntg81) as a MemoryProvider plugin instead of direct run_agent.py modifications. ByteRover provides persistent memory via the brv CLI — a hierarchical knowledge tree with tiered retrieval (fuzzy text then LLM-driven search). Local-first with optional cloud sync. Plugin capabilities: - prefetch: background brv query for relevant context - sync_turn: curate conversation turns (threaded, non-blocking) - on_memory_write: mirror built-in memory writes to brv - on_pre_compress: extract insights before context compression Tools (3): - brv_query: search the knowledge tree - brv_curate: store facts/decisions/patterns - brv_status: check CLI version and context tree state Profile isolation: working directory at $HERMES_HOME/byterover/ (scoped per profile). Binary resolution cached with thread-safe double-checked locking. All write operations threaded to avoid blocking the agent (curate can take 120s with LLM processing). 
* fix(memory): thread remaining sync_turns, fix holographic, add config key Plugin fixes: - Hindsight: thread sync_turn (was blocking up to 30s via _run_in_thread) - RetainDB: thread sync_turn (was blocking on HTTP POST) - Both: shutdown now joins sync threads alongside prefetch threads Holographic retrieval fixes: - reason(): removed dead intersection_key computation (bundled but never used in scoring). Now reuses pre-computed entity_residuals directly, moved role_content encoding outside the inner loop. - contradict(): added _MAX_CONTRADICT_FACTS=500 scaling guard. Above 500 facts, only checks the most recently updated ones to avoid O(n^2) explosion (~125K comparisons at 500 is acceptable). Config: - Added memory.provider key to DEFAULT_CONFIG ("" = builtin only). No version bump needed (deep_merge handles new keys automatically). * feat(memory): extract Honcho as a MemoryProvider plugin Creates plugins/honcho-memory/ as a thin adapter over the existing honcho_integration/ package. All 4 Honcho tools (profile, search, context, conclude) move from the normal tool registry to the MemoryProvider interface. The plugin delegates all work to HonchoSessionManager — no Honcho logic is reimplemented. It uses the existing config chain: $HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars. Lifecycle hooks: - initialize: creates HonchoSessionManager via existing client factory - prefetch: background dialectic query - sync_turn: records messages + flushes to API (threaded) - on_memory_write: mirrors user profile writes as conclusions - on_session_end: flushes all pending messages This is a prerequisite for the MemoryManager wiring in run_agent.py. Once wired, Honcho goes through the same provider interface as all other memory plugins, and the scattered Honcho code in run_agent.py can be consolidated into the single MemoryManager integration point. 
* feat(memory): wire MemoryManager into run_agent.py Adds 8 integration points for the external memory provider plugin, all purely additive (zero existing code modified): 1. Init (~L1130): Create MemoryManager, find matching plugin provider from memory.provider config, initialize with session context 2. Tool injection (~L1160): Append provider tool schemas to self.tools and self.valid_tool_names after memory_manager init 3. System prompt (~L2705): Add external provider's system_prompt_block alongside existing MEMORY.md/USER.md blocks 4. Tool routing (~L5362): Route provider tool calls through memory_manager.handle_tool_call() before the catchall handler 5. Memory write bridge (~L5353): Notify external provider via on_memory_write() when the built-in memory tool writes 6. Pre-compress (~L5233): Call on_pre_compress() before context compression discards messages 7. Prefetch (~L6421): Inject provider prefetch results into the current-turn user message (same pattern as Honcho turn context) 8. Turn sync + session end (~L8161, ~L8172): sync_all() after each completed turn, queue_prefetch_all() for next turn, on_session_end() + shutdown_all() at conversation end All hooks are wrapped in try/except — a failing provider never breaks the agent. The existing memory system, Honcho integration, and all other code paths are completely untouched. Full suite: 7222 passed, 4 pre-existing failures. * refactor(memory): remove legacy Honcho integration from core Extracts all Honcho-specific code from run_agent.py, model_tools.py, toolsets.py, and gateway/run.py. Honcho is now exclusively available as a memory provider plugin (plugins/honcho-memory/). 
Removed from run_agent.py (-457 lines): - Honcho init block (session manager creation, activation, config) - 8 Honcho methods: _honcho_should_activate, _strip_honcho_tools, _activate_honcho, _register_honcho_exit_hook, _queue_honcho_prefetch, _honcho_prefetch, _honcho_save_user_observation, _honcho_sync - _inject_honcho_turn_context module-level function - Honcho system prompt block (tool descriptions, CLI commands) - Honcho context injection in api_messages building - Honcho params from __init__ (honcho_session_key, honcho_manager, honcho_config) - HONCHO_TOOL_NAMES constant - All honcho-specific tool dispatch forwarding Removed from other files: - model_tools.py: honcho_tools import, honcho params from handle_function_call - toolsets.py: honcho toolset definition, honcho tools from core tools list - gateway/run.py: honcho params from AIAgent constructor calls Removed tests (-339 lines): - 9 Honcho-specific test methods from test_run_agent.py - TestHonchoAtexitFlush class from test_exit_cleanup_interrupt.py Restored two regex constants (_SURROGATE_RE, _BUDGET_WARNING_RE) that were accidentally removed during the honcho function extraction. The honcho_integration/ package is kept intact — the plugin delegates to it. tools/honcho_tools.py registry entries are now dead code (import commented out in model_tools.py) but the file is preserved for reference. Full suite: 7207 passed, 4 pre-existing failures. Zero regressions. 
* refactor(memory): restructure plugins, add CLI, clean gateway, migration notice Plugin restructure: - Move all memory plugins from plugins/<name>-memory/ to plugins/memory/<name>/ (byterover, hindsight, holographic, honcho, mem0, openviking, retaindb) - New plugins/memory/__init__.py discovery module that scans the directory directly, loading providers by name without the general plugin system - run_agent.py uses load_memory_provider() instead of get_plugin_memory_providers() CLI wiring: - hermes memory setup — interactive curses picker + config wizard - hermes memory status — show active provider, config, availability - hermes memory off — disable external provider (built-in only) - hermes honcho — now shows migration notice pointing to hermes memory setup Gateway cleanup: - Remove _get_or_create_gateway_honcho (already removed in prev commit) - Remove _shutdown_gateway_honcho and _shutdown_all_gateway_honcho methods - Remove all calls to shutdown methods (4 call sites) - Remove _honcho_managers/_honcho_configs dict references Dead code removal: - Delete tools/honcho_tools.py (279 lines, import was already commented out) - Delete tests/gateway/test_honcho_lifecycle.py (131 lines, tested removed methods) - Remove if False placeholder from run_agent.py Migration: - Honcho migration notice on startup: detects existing honcho.json or ~/.honcho/config.json, prints guidance to run hermes memory setup. Only fires when memory.provider is not set and not in quiet mode. Full suite: 7203 passed, 4 pre-existing failures. Zero regressions. 
* feat(memory): standardize plugin config + add per-plugin documentation Config architecture: - Add save_config(values, hermes_home) to MemoryProvider ABC - Honcho: writes to $HERMES_HOME/honcho.json (SDK native) - Mem0: writes to $HERMES_HOME/mem0.json - Hindsight: writes to $HERMES_HOME/hindsight/config.json - Holographic: writes to config.yaml under plugins.hermes-memory-store - OpenViking/RetainDB/ByteRover: env-var only (default no-op) Setup wizard (hermes memory setup): - Now calls provider.save_config() for non-secret config - Secrets still go to .env via env vars - Only memory.provider activation key goes to config.yaml Documentation: - README.md for each of the 7 providers in plugins/memory/<name>/ - Requirements, setup (wizard + manual), config reference, tools table - Consistent format across all providers The contract for new memory plugins: - get_config_schema() declares all fields (REQUIRED) - save_config() writes native config (REQUIRED if not env-var-only) - Secrets use env_var field in schema, written to .env by wizard - README.md in the plugin directory * docs: add memory providers user guide + developer guide New pages: - user-guide/features/memory-providers.md — comprehensive guide covering all 7 shipped providers (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover). Each with setup, config, tools, cost, and unique features. Includes comparison table and profile isolation notes. - developer-guide/memory-provider-plugin.md — how to build a new memory provider plugin. Covers ABC, required methods, config schema, save_config, threading contract, profile isolation, testing. 
Updated pages: - user-guide/features/memory.md — replaced Honcho section with link to new Memory Providers page - user-guide/features/honcho.md — replaced with migration redirect to the new Memory Providers page - sidebars.ts — added both new pages to navigation * fix(memory): auto-migrate Honcho users to memory provider plugin When honcho.json or ~/.honcho/config.json exists but memory.provider is not set, automatically set memory.provider: honcho in config.yaml and activate the plugin. The plugin reads the same config files, so all data and credentials are preserved. Zero user action needed. Persists the migration to config.yaml so it only fires once. Prints a one-line confirmation in non-quiet mode. * fix(memory): only auto-migrate Honcho when enabled + credentialed Check HonchoClientConfig.enabled AND (api_key OR base_url) before auto-migrating — not just file existence. Prevents false activation for users who disabled Honcho, stopped using it (config lingers), or have ~/.honcho/ from a different tool. * feat(memory): auto-install pip dependencies during hermes memory setup Reads pip_dependencies from plugin.yaml, checks which are missing, installs them via pip before config walkthrough. Also shows install guidance for external_dependencies (e.g. brv CLI for ByteRover). Updated all 7 plugin.yaml files with pip_dependencies: - honcho: honcho-ai - mem0: mem0ai - openviking: httpx - hindsight: hindsight-client - holographic: (none) - retaindb: requests - byterover: (external_dependencies for brv CLI) * fix: remove remaining Honcho crash risks from cli.py and gateway cli.py: removed Honcho session re-mapping block (would crash importing deleted tools/honcho_tools.py), Honcho flush on compress, Honcho session display on startup, Honcho shutdown on exit, honcho_session_key AIAgent param. gateway/run.py: removed honcho_session_key params from helper methods, sync_honcho param, _honcho.shutdown() block. 
tests: fixed test_cron_session_with_honcho_key_skipped (was passing removed honcho_key param to _flush_memories_for_session). * fix: include plugins/ in pyproject.toml package list Without this, plugins/memory/ wouldn't be included in non-editable installs. Hermes always runs from the repo checkout so this is belt-and-suspenders, but prevents breakage if the install method changes. * fix(memory): correct pip-to-import name mapping for dep checks The heuristic dep.replace('-', '_') fails for packages where the pip name differs from the import name: honcho-ai→honcho, mem0ai→mem0, hindsight-client→hindsight_client. Added explicit mapping table so hermes memory setup doesn't try to reinstall already-installed packages. * chore: remove dead code from old plugin memory registration path - hermes_cli/plugins.py: removed register_memory_provider(), _memory_providers list, get_plugin_memory_providers() — memory providers now use plugins/memory/ discovery, not the general plugin system - hermes_cli/main.py: stripped 74 lines of dead honcho argparse subparsers (setup, status, sessions, map, peer, mode, tokens, identity, migrate) — kept only the migration redirect - agent/memory_provider.py: updated docstring to reflect new registration path - tests: replaced TestPluginMemoryProviderRegistration with TestPluginMemoryDiscovery that tests the actual plugins/memory/ discovery system. Added 3 new tests (discover, load, nonexistent). * chore: delete dead honcho_integration/cli.py and its tests cli.py (794 lines) was the old 'hermes honcho' command handler — nobody calls it since cmd_honcho was replaced with a migration redirect.
Deleted tests that imported from removed code: - tests/honcho_integration/test_cli.py (tested _resolve_api_key) - tests/honcho_integration/test_config_isolation.py (tested CLI config paths) - tests/tools/test_honcho_tools.py (tested the deleted tools/honcho_tools.py) Remaining honcho_integration/ files (actively used by the plugin): - client.py (445 lines) — config loading, SDK client creation - session.py (991 lines) — session management, queries, flush * refactor: move honcho_integration/ into the honcho plugin Moves client.py (445 lines) and session.py (991 lines) from the top-level honcho_integration/ package into plugins/memory/honcho/. No Honcho code remains in the main codebase. - plugins/memory/honcho/client.py — config loading, SDK client creation - plugins/memory/honcho/session.py — session management, queries, flush - Updated all imports: run_agent.py (auto-migration), hermes_cli/doctor.py, plugin __init__.py, session.py cross-import, all tests - Removed honcho_integration/ package and pyproject.toml entry - Renamed tests/honcho_integration/ → tests/honcho_plugin/ * docs: update architecture + gateway-internals for memory provider system - architecture.md: replaced honcho_integration/ with plugins/memory/ - gateway-internals.md: replaced Honcho-specific session routing and flush lifecycle docs with generic memory provider interface docs * fix: update stale mock path for resolve_active_host after honcho plugin migration * fix(memory): address review feedback — P0 lifecycle, ABC contract, honcho CLI restore Review feedback from Honcho devs (erosika): P0 — Provider lifecycle: - Remove on_session_end() + shutdown_all() from run_conversation() tail (was killing providers after every turn in multi-turn sessions) - Add shutdown_memory_provider() method on AIAgent for callers - Wire shutdown into CLI atexit, reset_conversation, gateway stop/expiry Bug fixes: - Remove sync_honcho=False kwarg from /btw callsites (TypeError crash) - Fix doctor.py references to dead 
'hermes honcho setup' command - Cache prefetch_all() before tool loop (was re-calling every iteration) ABC contract hardening (all backwards-compatible): - Add session_id kwarg to prefetch/sync_turn/queue_prefetch - Make on_pre_compress() return str (provider insights in compression) - Add *kwargs to on_turn_start() for runtime context - Add on_delegation() hook for parent-side subagent observation - Document agent_context/agent_identity/agent_workspace kwargs on initialize() (prevents cron corruption, enables profile scoping) - Fix docstring: single external provider, not multiple Honcho CLI restoration: - Add plugins/memory/honcho/cli.py (from main's honcho_integration/cli.py with imports adapted to plugin path) - Restore full hermes honcho command with all subcommands (status, peer, mode, tokens, identity, enable/disable, sync, peers, --target-profile) - Restore auto-clone on profile creation + sync on hermes update - hermes honcho setup now redirects to hermes memory setup fix(memory): wire on_delegation, skip_memory for cron/flush, fix ByteRover return type - Wire on_delegation() in delegate_tool.py — parent's memory provider is notified with task+result after each subagent completes - Add skip_memory=True to cron scheduler (prevents cron system prompts from corrupting user representations — closes #4052) - Add skip_memory=True to gateway flush agent (throwaway agent shouldn't activate memory provider) - Fix ByteRover on_pre_compress() return type: None -> str * fix(honcho): port profile isolation fixes from PR #4632 Ports 5 bug fixes found during profile testing (erosika's PR #4632): 1. 3-tier config resolution — resolve_config_path() now checks $HERMES_HOME/honcho.json → ~/.hermes/honcho.json → ~/.honcho/config.json (non-default profiles couldn't find shared host blocks) 2. Thread host=_host_key() through from_global_config() in cmd_setup, cmd_status, cmd_identity (--target-profile was being ignored) 3. 
Use bare profile name as aiPeer (not host key with dots) — Honcho's peer ID pattern is ^[a-zA-Z0-9_-]+$, dots are invalid 4. Wrap add_peers() in try/except — was fatal on new AI peers, killed all message uploads for the session 5. Gate Honcho clone behind --clone/--clone-all on profile create (bare create should be blank-slate) Also: sanitize assistant_peer_id via _sanitize_id() * fix(tests): add module cleanup fixture to test_cli_provider_resolution test_cli_provider_resolution._import_cli() wipes tools.*, cli, and run_agent from sys.modules to force fresh imports, but had no cleanup. This poisoned all subsequent tests on the same xdist worker — mocks targeting tools.file_tools, tools.send_message_tool, etc. patched the NEW module object while already-imported functions still referenced the OLD one. Caused ~25 cascade failures: send_message KeyError, process_registry FileNotFoundError, file_read_guards timeouts, read_loop_detection file-not-found, mcp_oauth None port, and provider_parity/codex_execution stale tool lists. Fix: autouse fixture saves all affected modules before each test and restores them after, matching the pattern in test_managed_browserbase_and_modal.py.	2026-04-02 15:33:51 -07:00
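The Mem0 circuit breaker described in commit 924bc67eee (pause calls for 120s after 5 consecutive API failures, then auto-reset) might look roughly like the sketch below. The class name, method names, and injectable clock are hypothetical and chosen for illustration; the plugin's actual code may differ, and locking is omitted for brevity.

```python
import time


class CircuitBreakerSketch:
    """Hypothetical helper: pause calls after repeated consecutive failures."""

    def __init__(self, max_failures=5, cooldown=120.0, clock=time.monotonic):
        self.max_failures = max_failures  # 5 consecutive failures, per the commit message
        self.cooldown = cooldown          # 120 second pause, per the commit message
        self._clock = clock               # injectable so tests need not sleep
        self._failures = 0
        self._opened_at = None            # time the breaker tripped, or None

    def allow(self):
        """Return True if a call may proceed right now."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.cooldown:
            # Cooldown elapsed: reset automatically and let calls through again.
            self._failures = 0
            self._opened_at = None
            return True
        return False

    def record_success(self):
        """Any success clears the consecutive-failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        """Count a failure and trip the breaker once the threshold is reached."""
        self._failures += 1
        if self._failures >= self.max_failures and self._opened_at is None:
            self._opened_at = self._clock()
```

A caller would check `allow()` before each API request and report the outcome with `record_success()` or `record_failure()`.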
Ben Barclay	a2e56d044b	Merge branch 'main' into rewbs/tool-use-charge-to-subscription	2026-04-02 11:00:35 +11:00
Teknium	de9bba8d7c	fix: remove hardcoded OpenRouter/opus defaults No model, base_url, or provider is assumed when the user hasn't configured one. Previously the defaults dict in cli.py, AIAgent constructor args, and several fallback paths all hardcoded anthropic/claude-opus-4.6 + openrouter.ai/api/v1 — silently routing unconfigured users to OpenRouter, which 404s for anyone using a different provider. Now empty defaults force the setup wizard to run, and existing users who already completed setup are unaffected (their config.yaml has the model they chose). Files changed: - cli.py: defaults dict, _DEFAULT_CONFIG_MODEL - run_agent.py: AIAgent.__init__ defaults, main() defaults - hermes_cli/config.py: DEFAULT_CONFIG - hermes_cli/runtime_provider.py: is_fallback sentinel - acp_adapter/session.py: default_model - tests: updated to reflect empty defaults	2026-04-01 15:22:26 -07:00
Teknium	e0abf2416d	fix: restore _config_version to 11 (reverted by stale-branch merge in #4419) (#4440) PR #4419 was based on pre-credential-pools main where _config_version was 10. The squash merge downgraded it from 11 (set by #2647) back to 10. Also fixes the test assertion.	2026-04-01 04:34:04 -07:00
Teknium	70744add15	feat(browser): add persistent Camofox sessions and VNC URL discovery (salvage #4400) (#4419) Adds two Camofox features: 1. Persistent browser sessions: new `browser.camofox.managed_persistence` config option. When enabled, Hermes sends a deterministic profile-scoped userId to Camofox so the server maps it to a persistent browser profile directory. Cookies, logins, and browser state survive across restarts. Default remains ephemeral (random userId per session). 2. VNC URL discovery: Camofox /health endpoint returns vncPort when running in headed mode. Hermes constructs the VNC URL and includes it in navigate responses so the agent can share it with users. Also fixes camofox_vision bug where call_llm response object was passed directly to json.dumps instead of extracting .choices[0].message.content. Changes from original PR: - Removed browser_evaluate tool (separate feature, needs own PR) - Removed snapshot truncation limit change (unrelated) - Config.yaml only for managed_persistence (no env var, no version bump) - Rewrote tests to use config mock instead of env var - Reverted package-lock.json churn Co-authored-by: analista <psikonetik@gmail.com.com>	2026-04-01 04:18:50 -07:00
kshitijk4poor	935137f0d9	feat: add inline diff previews for write actions Show inline diffs in the CLI transcript when write_file, patch, or skill_manage modifies files. Captures a filesystem snapshot before the tool runs, computes a unified diff after, and renders it with ANSI coloring in the activity feed. Adds tool_start_callback and tool_complete_callback hooks to AIAgent for pre/post tool execution notifications. Also fixes _extract_parallel_scope_path to normalize relative paths to absolute, preventing the parallel overlap detection from missing conflicts when the same file is referenced with different path styles. Gated by display.inline_diffs config option (default: true). Based on PR #3774 by @kshitijk4poor.	2026-04-01 02:13:57 -07:00
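The inline diff preview in commit 935137f0d9 (snapshot before the tool runs, unified diff after, ANSI coloring) could be approximated with the standard library's `difflib`. This is a minimal sketch under assumptions: the function name, the `a/` and `b/` path prefixes, and the color choices are illustrative, not taken from the commit.

```python
import difflib

GREEN, RED, RESET = "\x1b[32m", "\x1b[31m", "\x1b[0m"


def render_inline_diff(before: str, after: str, path: str) -> str:
    """Return an ANSI-colored unified diff, or '' when nothing changed."""
    diff_lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    rendered = []
    for line in diff_lines:
        text = line.rstrip("\n")
        if text.startswith("+") and not text.startswith("+++"):
            rendered.append(GREEN + text + RESET)   # added line
        elif text.startswith("-") and not text.startswith("---"):
            rendered.append(RED + text + RESET)     # removed line
        else:
            rendered.append(text)                   # header, hunk marker, or context
    return "\n".join(rendered)
```

The `before` string would come from the pre-tool snapshot and `after` from re-reading the file once the write completes.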
Teknium	1b62ad9de7	fix: root-level provider in config.yaml no longer overrides model.provider load_cli_config() had a priority inversion: a stale root-level 'provider' key in config.yaml would OVERRIDE the canonical 'model.provider' set by 'hermes model'. The gateway reads model.provider directly from YAML and worked correctly, but 'hermes chat -q' and the interactive CLI went through the merge logic and picked up the stale root-level key. Fix: root-level provider/base_url are now only used as a fallback when model.provider/model.base_url is not set (never as an override). Also added _normalize_root_model_keys() to config.py load_config() and save_config() — migrates root-level provider/base_url into the model section and removes the root-level keys permanently. Reported by (≧▽≦) in Discord: opencode-go provider persisted as a root-level key and overrode the correct model.provider=openrouter, causing 401 errors.	2026-03-31 12:54:22 -07:00
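The fallback-only rule from commit 1b62ad9de7 can be sketched as follows. The function below is a hypothetical stand-in for `_normalize_root_model_keys()`, whose real implementation is not shown in the log; it assumes the `model` section is a dict.

```python
import copy


def normalize_root_keys_sketch(config: dict) -> dict:
    """Move root-level provider/base_url into config['model'] as fallbacks only."""
    cfg = copy.deepcopy(config)          # leave the caller's dict untouched
    model = cfg.setdefault("model", {})  # assumption: 'model' is a mapping
    for key in ("provider", "base_url"):
        if key in cfg:
            root_value = cfg.pop(key)    # the root-level key is always removed
            if not model.get(key):
                model[key] = root_value  # used only when model.<key> is unset
    return cfg
```

With the Discord report's values, a stale root-level `provider: opencode-go` is dropped and `model.provider: openrouter` wins.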
Teknium	e3f8347be3	feat(file_tools): harden read_file with size guard, dedup, and device blocking (#4315) * feat(file_tools): harden read_file with size guard, dedup, and device blocking Three improvements to read_file_tool to reduce wasted context tokens and prevent process hangs: 1. Character-count guard: reads that produce more than 100K characters (≈25-35K tokens across tokenisers) are rejected with an error that tells the model to use offset+limit for a smaller range. The effective cap is min(file_size, 100K) so small files that happen to have long lines aren't over-penalised. Large truncated files also get a hint nudging toward targeted reads. 2. File-read deduplication: when the same (path, offset, limit) is read a second time and the file hasn't been modified (mtime unchanged), return a lightweight stub instead of re-sending the full content. Writes and patches naturally change mtime, so post-edit reads always return fresh content. The dedup cache is cleared on context compression — after compression the original read content is summarised away, so the model needs the full content again. 3. Device path blocking: paths like /dev/zero, /dev/random, /dev/stdin etc. are rejected before any I/O to prevent process hangs from infinite-output or blocking-input devices. Tests: 17 new tests covering all three features plus the dedup-reset-on-compression integration. All 52 file-read tests pass (35 existing + 17 new). Full tool suite (2124 tests) passes with 0 failures. * feat: make file_read_max_chars configurable, add docs Add file_read_max_chars to DEFAULT_CONFIG (default 100K). read_file_tool reads this on first call and caches for the process lifetime. Users on large-context models can raise it; users on small local models can lower it. Also adds a 'File Read Safety' section to the configuration docs explaining the char limit, dedup behavior, and example values.	2026-03-31 12:53:19 -07:00
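The three read_file guards from commit e3f8347be3 might combine along these lines. This is a simplified sketch, not the tool's code: the class name, the dict return shape, and the treatment of `offset`/`limit` as line indices are assumptions, and the blocked set lists only the three device paths the commit names.

```python
import os

BLOCKED_DEVICES = {"/dev/zero", "/dev/random", "/dev/stdin"}


class ReadGuardSketch:
    """Hypothetical wrapper combining device blocking, dedup, and a size cap."""

    def __init__(self, max_chars=100_000):
        self.max_chars = max_chars
        self._seen = {}  # (path, offset, limit) -> mtime at the last full read

    def read(self, path, offset=0, limit=None):
        full_path = os.path.abspath(path)
        if full_path in BLOCKED_DEVICES:
            # Rejected before any I/O so the process cannot hang.
            return {"error": "device path blocked"}
        mtime = os.stat(full_path).st_mtime_ns
        key = (full_path, offset, limit)
        if self._seen.get(key) == mtime:
            # Same range, file unmodified: return a lightweight stub.
            return {"unchanged": True}
        with open(full_path, encoding="utf-8", errors="replace") as fh:
            lines = fh.readlines()
        end = None if limit is None else offset + limit
        text = "".join(lines[offset:end])
        if len(text) > self.max_chars:
            return {"error": "read too large; retry with offset+limit"}
        self._seen[key] = mtime
        return {"content": text}

    def reset(self):
        """Clear the dedup cache, e.g. after context compression."""
        self._seen.clear()
```

Calling `reset()` corresponds to the dedup-reset-on-compression behavior: once earlier reads are summarised away, the next read returns full content again.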
Dakota Secula-Rosell	c1606aed69	fix(cli): allow empty strings and falsy values in config set `hermes config set KEY ""` and `hermes config set KEY 0` were rejected because the guard used `not value`, which evaluates to True for empty strings, zero, and False. Changed to `value is None` so only truly missing arguments are rejected. Closes #4277 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 11:41:12 -07:00
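The difference between the two guards in commit c1606aed69 is easy to show in isolation. The function names below are hypothetical; only the `not value` and `value is None` checks come from the commit message.

```python
def old_guard_rejects(value) -> bool:
    """Pre-fix behavior: any falsy value is treated as missing."""
    return not value


def new_guard_rejects(value) -> bool:
    """Post-fix behavior: only a truly absent argument is rejected."""
    return value is None


# Falsy but legitimate config values that the old guard wrongly rejected.
for candidate in ("", 0, False):
    assert old_guard_rejects(candidate)
    assert not new_guard_rejects(candidate)

# A genuinely missing argument is rejected by both guards.
assert old_guard_rejects(None) and new_guard_rejects(None)
```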
Teknium	8d59881a62	feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647) * feat(auth): add same-provider credential pools and rotation UX Add same-provider credential pooling so Hermes can rotate across multiple credentials for a single provider, recover from exhausted credentials without jumping providers immediately, and configure that behavior directly in hermes setup. - agent/credential_pool.py: persisted per-provider credential pools - hermes auth add/list/remove/reset CLI commands - 429/402/401 recovery with pool rotation in run_agent.py - Setup wizard integration for pool strategy configuration - Auto-seeding from env vars and existing OAuth state Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Salvaged from PR #2647 * fix(tests): prevent pool auto-seeding from host env in credential pool tests Tests for non-pool Anthropic paths and auth remove were failing when host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials were present. The pool auto-seeding picked these up, causing unexpected pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests - Clear Anthropic env vars and mock _seed_from_singletons in auth remove test * feat(auth): add thread safety, least_used strategy, and request counting - Add threading.Lock to CredentialPool for gateway thread safety (concurrent requests from multiple gateway sessions could race on pool state mutations without this) - Add 'least_used' rotation strategy that selects the credential with the lowest request_count, distributing load more evenly - Add request_count field to PooledCredential for usage tracking - Add mark_used() method to increment per-credential request counts - Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current() with lock acquisition - Add tests: least_used selection, mark_used counting, concurrent thread safety (4 threads × 20 selects with no corruption) * feat(auth): add interactive mode for bare 'hermes auth' command When 'hermes auth' is called without a subcommand, it now launches an interactive wizard that: 1. Shows full credential pool status across all providers 2. Offers a menu: add, remove, reset cooldowns, set strategy 3. For OAuth-capable providers (anthropic, nous, openai-codex), the add flow explicitly asks 'API key or OAuth login?' — making it clear that both auth types are supported for the same provider 4. Strategy picker shows all 4 options (fill_first, round_robin, least_used, random) with the current selection marked 5. Remove flow shows entries with indices for easy selection The subcommand paths (hermes auth add/list/remove/reset) still work exactly as before for scripted/non-interactive use. * fix(tests): update runtime_provider tests for config.yaml source of truth (#4165) Tests were using OPENAI_BASE_URL env var which is no longer consulted after #4165. Updated to use model config (provider, base_url, api_key) which is the new single source of truth for custom endpoint URLs. 
* feat(auth): support custom endpoint credential pools keyed by provider name Custom OpenAI-compatible endpoints all share provider='custom', making the provider-keyed pool useless. Now pools for custom endpoints are keyed by 'custom:<normalized_name>' where the name comes from the custom_providers config list (auto-generated from URL hostname). - Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)' - load_pool('custom:name') seeds from custom_providers api_key AND model.api_key when base_url matches - hermes auth add/list now shows custom endpoints alongside registry providers - _resolve_openrouter_runtime and _resolve_named_custom_runtime check pool before falling back to single config key - 6 new tests covering custom pool keying, seeding, and listing * docs: add Excalidraw diagram of full credential pool flow Comprehensive architecture diagram showing: - Credential sources (env vars, auth.json OAuth, config.yaml, CLI) - Pool storage and auto-seeding - Runtime resolution paths (registry, custom, OpenRouter) - Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh) - CLI management commands and strategy configuration Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g * fix(tests): update setup wizard pool tests for unified select_provider_and_model flow The setup wizard now delegates to select_provider_and_model() instead of using its own prompt_choice-based provider picker. Tests needed: - Mock select_provider_and_model as no-op (provider pre-written to config) - Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it) - Pre-write model.provider to config so the pool step is reached * docs: add comprehensive credential pool documentation - New page: website/docs/user-guide/features/credential-pools.md Full guide covering quick start, CLI commands, rotation strategies, error recovery, custom endpoint pools, auto-discovery, thread safety, architecture, and storage format. 
- Updated fallback-providers.md to reference credential pools as the first layer of resilience (same-provider rotation before cross-provider) - Added hermes auth to CLI commands reference with usage examples - Added credential_pool_strategies to configuration guide * chore: remove excalidraw diagram from repo (external link only) * refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns - _load_config_safe(): replace 4 identical try/except/import blocks - _iter_custom_providers(): shared generator for custom provider iteration - PooledCredential.extra dict: collapse 11 round-trip-only fields (token_type, scope, client_id, portal_base_url, obtained_at, expires_in, agent_key_id, agent_key_expires_in, agent_key_reused, agent_key_obtained_at, tls) into a single extra dict with __getattr__ for backward-compatible access - _available_entries(): shared exhaustion-check between select and peek - Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical) - SimpleNamespace replaces class _Args boilerplate in auth_commands - _try_resolve_from_custom_pool(): shared pool-check in runtime_provider Net -17 lines. All 383 targeted tests pass. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-31 03:10:01 -07:00
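A minimal sketch of the lock-guarded least_used rotation described in the commit above. It is illustrative only: the class layout, the `exhausted` flag, and the combined `select_and_mark()` helper are assumptions, not the actual contents of agent/credential_pool.py, where select() and mark_used() are separate methods.

```python
import threading
from dataclasses import dataclass


@dataclass
class PooledCredential:
    label: str
    request_count: int = 0   # usage counter, as named in the commit message
    exhausted: bool = False  # hypothetical flag standing in for cooldown state


class CredentialPool:
    def __init__(self, entries):
        self._entries = list(entries)
        self._lock = threading.Lock()  # guards all pool state mutations

    def select_and_mark(self):
        """Pick the available credential with the lowest request_count
        and count the request, all under one lock (hypothetical helper)."""
        with self._lock:
            available = [e for e in self._entries if not e.exhausted]
            if not available:
                return None
            chosen = min(available, key=lambda e: e.request_count)
            chosen.request_count += 1
            return chosen
```

Because selection and counting happen under one lock, concurrent callers cannot both read the same stale count, which is the kind of race the commit's 4 threads × 20 selects test is meant to catch.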
Nils	50302ed70a	fix(tools): make browser SSRF check configurable via browser.allow_private_urls (#4198 ) * fix(tools): skip SSRF check in local browser mode The SSRF protection added in #3041 blocks all private/internal addresses unconditionally in browser_navigate(). This prevents legitimate local development use cases (localhost testing, LAN device access) when using the local Chromium backend. The SSRF check is only meaningful for cloud browsers (Browserbase, BrowserUse) where the agent could reach internal resources on a remote machine. In local mode, the user already has full terminal and network access, so the check adds no security value. This change makes the SSRF check conditional on _get_cloud_provider(), keeping full protection in cloud mode while allowing private addresses in local mode. * fix(tools): make SSRF check configurable via browser.allow_private_urls Replace unconditional SSRF check with a configurable setting. Default (False) keeps existing security behavior. Setting to True allows navigating to private/internal IPs for local dev and LAN use cases. --------- Co-authored-by: Nils (Norya) <nils@begou.dev>	2026-03-31 02:11:55 -07:00
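A rough sketch of how a configurable private-address gate like browser.allow_private_urls could work. The function name `navigation_allowed` is hypothetical, and the sketch only handles literal IP hosts and `localhost`; a real check would also need to resolve hostnames, which is not shown here.

```python
import ipaddress
from urllib.parse import urlsplit


def navigation_allowed(url: str, allow_private_urls: bool = False) -> bool:
    """Return False for private/internal targets unless explicitly allowed."""
    if allow_private_urls:
        return True
    host = urlsplit(url).hostname or ""
    if host == "localhost":
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP; this sketch does no DNS resolution (assumption).
        return True
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)
```

With the default of False the existing protection is kept; passing True mirrors the opt-in for local development and LAN use cases.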
Teknium	ff78ad4c81	feat: add discord.reactions config option to disable message reactions (#4199 ) Adds a 'reactions' key under the discord config section (default: true). When set to false, the bot no longer adds 👀/✅/❌ reactions to messages during processing. The config maps to DISCORD_REACTIONS env var following the same pattern as require_mention and auto_thread. Files changed: - hermes_cli/config.py: Add reactions default to DEFAULT_CONFIG - gateway/config.py: Map discord.reactions to DISCORD_REACTIONS env var - gateway/platforms/discord.py: Gate on_processing_start/complete hooks - tests/gateway/test_discord_reactions.py: 3 new tests for config gate	2026-03-31 01:24:48 -07:00
Teknium	e64b047663	chore: prepare Hermes for Homebrew packaging (#4099 ) Co-authored-by: Yabuku-xD <78594762+Yabuku-xD@users.noreply.github.com>	2026-03-30 17:34:43 -07:00
Robin Fernandes	1126284c97	Merge branch 'main' into rewbs/tool-use-charge-to-subscription	2026-03-31 09:29:43 +09:00
Robin Fernandes	6e4598ce1e	Merge branch 'main' into rewbs/tool-use-charge-to-subscription	2026-03-31 08:48:54 +09:00
Teknium	950f69475f	feat(browser): add Camofox local anti-detection browser backend (#4008 ) Camofox-browser is a self-hosted Node.js server wrapping Camoufox (Firefox fork with C++ fingerprint spoofing). When CAMOFOX_URL is set, all 11 browser tools route through the Camofox REST API instead of the agent-browser CLI. Maps 1:1 to the existing browser tool interface: - Navigate, snapshot, click, type, scroll, back, press, close - Get images, vision (screenshot + LLM analysis) - Console (returns empty with note — camofox limitation) Setup: npm start in camofox-browser dir, or docker run -p 9377:9377 Then: CAMOFOX_URL=http://localhost:9377 in ~/.hermes/.env Advantages over Browserbase (cloud): - Free (no per-session API costs) - Local (zero network latency for browser ops) - Anti-detection at C++ level (bypasses Cloudflare/Google bot detection) - Works offline, Docker-ready Files: - tools/browser_camofox.py: Full REST backend (~400 lines) - tools/browser_tool.py: Routing at each tool function - hermes_cli/config.py: CAMOFOX_URL env var entry - tests/tools/test_browser_camofox.py: 20 tests	2026-03-30 13:18:42 -07:00
Teknium	b4496b33b5	fix: background task media delivery + vision download timeout (#3919 ) * feat(telegram): add webhook mode as alternative to polling When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook server (via python-telegram-bot's start_webhook()) instead of long polling. This enables cloud platforms like Fly.io and Railway to auto-wake suspended machines on inbound HTTP traffic. Polling remains the default — no behavior change unless the env var is set. Env vars: TELEGRAM_WEBHOOK_URL Public HTTPS URL for Telegram to push to TELEGRAM_WEBHOOK_PORT Local listen port (default 8443) TELEGRAM_WEBHOOK_SECRET Secret token for update verification Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all current main enhancements (network error recovery, polling conflict detection, DM topics setup). Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com> * fix: send_document call in background task delivery + vision download timeout Two fixes salvaged from PR #2269 by amethystani: 1. gateway/run.py: adapter.send_file() → adapter.send_document() send_file() doesn't exist on BasePlatformAdapter. Background task media files were silently never delivered (AttributeError swallowed by except Exception: pass). 2. tools/vision_tools.py: configurable image download timeout via HERMES_VISION_DOWNLOAD_TIMEOUT env var (default 30s), plus guard against raise None when max_retries=0. The third fix in #2269 (opencode-go auth config) was already resolved on main. Co-authored-by: amethystani <amethystani@users.noreply.github.com> --------- Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com> Co-authored-by: amethystani <amethystani@users.noreply.github.com>	2026-03-30 02:59:39 -07:00
Teknium	947faed3bc	feat(approvals): make dangerous command approval timeout configurable (#3886 ) * feat(approvals): make dangerous command approval timeout configurable Read `approvals.timeout` from config.yaml (default 60s) instead of hardcoding 60 seconds in both the fallback CLI prompt and the TUI prompt_toolkit callback. Follows the same pattern as `clarify.timeout` which is already configurable via CLI_CONFIG. Closes #3765 * fix: add timeout default to approvals section in DEFAULT_CONFIG --------- Co-authored-by: acsezen <asezen@icloud.com>	2026-03-30 00:02:02 -07:00
Teknium	ce2841f3c9	feat(gateway): add WeCom (Enterprise WeChat) platform support (#3847 ) Adds WeCom as a gateway platform adapter using the AI Bot WebSocket gateway for real-time bidirectional communication. No public endpoint or new pip dependencies needed (uses existing aiohttp + httpx). Features: - WebSocket persistent connection with auto-reconnect (exponential backoff) - DM and group messaging with configurable access policies - Media upload/download with AES decryption for encrypted attachments - Markdown rendering, quote context preservation - Proactive + passive reply message modes - Chunked media upload pipeline (512KB chunks) Cherry-picked from PR #1898 by EvilRan with: - Moved to current main (PR was 300 commits behind) - Skipped base.py regressions (reply_to additions are good but belong in a separate PR since they affect all platforms) - Fixed test assertions to match current base class send() signature (reply_to=None kwarg now explicit) - All 16 integration points added surgically to current main - No new pip dependencies (aiohttp + httpx already installed) Fixes #1898 Co-authored-by: EvilRan <EvilRan@users.noreply.github.com>	2026-03-29 21:29:13 -07:00
Robin Fernandes	1cbb1b99cc	Gate tool-gateway behind an env var, so it's not in users' faces until we're ready. Even if users enable it, it'll be blocked server-side for now, until we unlock for non-admin users on tool-gateway.	2026-03-30 13:28:10 +09:00
Teknium	ca4907dfbc	feat(gateway): add Feishu/Lark platform support (#3817 ) Adds Feishu (ByteDance's enterprise messaging platform) as a gateway platform adapter with full feature parity: WebSocket + webhook transports, message batching, dedup, rate limiting, rich post/card content parsing, media handling (images/audio/files/video), group @mention gating, reaction routing, and interactive card button support. Cherry-picked from PR #1793 by penwyp with: - Moved to current main (PR was 458 commits behind) - Fixed _send_with_retry shadowing BasePlatformAdapter method (renamed to _feishu_send_with_retry to avoid signature mismatch crash) - Fixed import structure: aiohttp/websockets imported independently of lark_oapi so they remain available when SDK is missing - Fixed get_hermes_home import (hermes_constants, not hermes_cli.config) - Added skip decorators for tests requiring lark_oapi SDK - All 16 integration points added surgically to current main New dependency: lark-oapi>=1.5.3,<2 (optional, pip install hermes-agent[feishu]) Fixes #1788 Co-authored-by: penwyp <penwyp@users.noreply.github.com>	2026-03-29 18:17:42 -07:00
Teknium	e314833c9d	feat(display): configurable tool preview length -- show full paths by default (#3841 ) Tool call previews (paths, commands, queries) were hardcoded to truncate at 35-40 chars across CLI spinners, completion lines, and gateway progress messages. Users could not see full file paths in tool output. New config option: display.tool_preview_length (default 0 = no limit). Set a positive number to truncate at that length. Changes: - display.py: module-level _tool_preview_max_len with getter/setter; build_tool_preview() and get_cute_tool_message() _trunc/_path respect it - cli.py: reads config at startup, spinner widget respects config - gateway/run.py: reads config per-message, progress callback respects config - run_agent.py: removed redundant 30-char quiet-mode spinner truncation - config.py: added display.tool_preview_length to DEFAULT_CONFIG Reported by kriskaminski	2026-03-29 18:02:42 -07:00
Teknium	df806bdbaf	feat(cron): add cron.wrap_response config to disable delivery wrapping (#3807 ) Adds a config option to suppress the header/footer text that wraps cron job responses when delivered to messaging platforms. Set cron.wrap_response: false in config.yaml for clean output without the 'Cronjob Response: <name>' header and 'The agent cannot see this message' footer. Default is true (preserves current behavior).	2026-03-29 16:31:01 -07:00
Teknium	252fbea005	feat(providers): add ordered fallback provider chain (salvage #1761 ) (#3813 ) Extends the single fallback_model mechanism into an ordered chain. When the primary model fails, Hermes tries each fallback provider in sequence until one succeeds or the chain is exhausted. Config format (new): fallback_providers: - provider: openrouter model: anthropic/claude-sonnet-4 - provider: openai model: gpt-4o Legacy single-dict fallback_model format still works unchanged. Key fix vs original PR: the call sites in the retry loop now use _fallback_index < len(_fallback_chain) instead of the old one-shot _fallback_activated guard, so the chain actually advances through all configured providers. Changes: - run_agent.py: _fallback_chain list + _fallback_index replaces one-shot _fallback_model; _try_activate_fallback() advances through chain; failed provider resolution skips to next entry; call sites updated to allow chain advancement - cli.py: reads fallback_providers with legacy fallback_model compat - gateway/run.py: same - hermes_cli/config.py: fallback_providers: [] in DEFAULT_CONFIG - tests: 12 new chain tests + 6 existing test fixtures updated Co-authored-by: uzaylisak <uzaylisak@users.noreply.github.com>	2026-03-29 16:04:53 -07:00
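The chain-advancement idea in the commit above can be sketched as follows. This is a simplified illustration under assumptions: `normalize_fallbacks` and `call_with_fallbacks` are hypothetical names, and the real logic lives in the retry loop of run_agent.py rather than in a standalone helper.

```python
def normalize_fallbacks(config: dict) -> list:
    """Prefer the new fallback_providers list; accept the legacy
    single-dict fallback_model as a one-entry chain."""
    chain = config.get("fallback_providers") or []
    legacy = config.get("fallback_model")
    if not chain and isinstance(legacy, dict):
        chain = [legacy]
    return list(chain)


def call_with_fallbacks(primary, chain, call):
    """Try the primary target, then each fallback in order.

    The index check replaces a one-shot 'already activated' flag, so the
    loop keeps advancing until the chain is exhausted.
    """
    target = primary
    index = 0
    while True:
        try:
            return call(target)
        except Exception:
            if index >= len(chain):
                raise  # chain exhausted: surface the last error
            target = chain[index]
            index += 1
```

A boolean guard would stop after the first fallback; comparing an index against the chain length is what lets the second and later entries be tried.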
Teknium	fcd1645223	feat(skills): support external skill directories via config (#3678 ) Add skills.external_dirs config option — a list of additional directories to scan for skills alongside ~/.hermes/skills/. External dirs are read-only: skill creation/editing always writes to the local dir. Local skills take precedence when names collide. This lets users share skills across tools/agents without copying them into Hermes's own directory (e.g. ~/.agents/skills, /shared/team-skills). Changes: - agent/skill_utils.py: add get_external_skills_dirs() and get_all_skills_dirs() - agent/prompt_builder.py: scan external dirs in build_skills_system_prompt() - tools/skills_tool.py: _find_all_skills() and skill_view() search external dirs; security check recognizes configured external dirs as trusted - agent/skill_commands.py: /skill slash commands discover external skills - hermes_cli/config.py: add skills.external_dirs to DEFAULT_CONFIG - cli-config.yaml.example: document the option - tests/agent/test_external_skills.py: 11 tests covering discovery, precedence, deduplication, and skill_view for external skills Requested by community member primco.	2026-03-29 00:33:30 -07:00
Teknium	91b881f931	feat(mattermost): configurable mention behavior — respond without @mention (#3664 ) Adds MATTERMOST_REQUIRE_MENTION and MATTERMOST_FREE_RESPONSE_CHANNELS env vars, matching Discord's existing mention gating pattern. - MATTERMOST_REQUIRE_MENTION=false: respond to all channel messages - MATTERMOST_FREE_RESPONSE_CHANNELS=id1,id2: specific channels where bot responds without @mention even when require_mention is true - DMs always respond regardless of mention settings - @mention is now stripped from message text (clean agent input) 7 new tests for mention gating, free-response channels, DM bypass, and mention stripping. Updated existing test for mention stripping. Docs: updated mattermost.md with Mention Behavior section, environment-variables.md with new vars, config.py with metadata.	2026-03-28 22:17:43 -07:00
Teknium	d35567c6e0	feat(web): add Exa as a web search and extract backend (#3648 ) Adds Exa (https://exa.ai) as a fourth web backend alongside Parallel, Firecrawl, and Tavily. Follows the exact same integration pattern: - Backend selection: config web.backend=exa or auto-detect from EXA_API_KEY - Search: _exa_search() with highlights for result descriptions - Extract: _exa_extract() with full text content extraction - Lazy singleton client with x-exa-integration header - Wired into web_search_tool and web_extract_tool dispatchers - check_web_api_key() and requires_env updated - CLI: hermes setup summary, hermes tools config, hermes config show - config.py: EXA_API_KEY in OPTIONAL_ENV_VARS with metadata - pyproject.toml: exa-py>=2.9.0,<3 in dependencies Salvaged from PR #1850. Co-authored-by: louiswalsh <louiswalsh@users.noreply.github.com>	2026-03-28 17:35:53 -07:00
Teknium	839d9d7471	feat(agent): configurable timeouts for auxiliary LLM calls via config.yaml (#3597 ) Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml instead of hardcoded values. Users with slow local models (Ollama, llama.cpp) can now increase timeouts for compression, vision, session search, etc. Defaults: - auxiliary.compression.timeout: 120s (was hardcoded 45s) - auxiliary.vision.timeout: 30s (unchanged) - all other aux tasks: 30s (was hardcoded 30s) - title_generator: 30s (was hardcoded 15s) call_llm/async_call_llm now auto-resolve timeout from config when not explicitly passed. Callers can still override with an explicit timeout arg. Based on PR #3406 by alanfwilliams. Converted from env vars to config.yaml per project conventions. Co-authored-by: alanfwilliams <alanfwilliams@users.noreply.github.com>	2026-03-28 14:35:28 -07:00
Teknium	901494d728	feat: make tool-use enforcement configurable via agent.tool_use_enforcement (#3551 ) The TOOL_USE_ENFORCEMENT_GUIDANCE injection (added in #3528) was hardcoded to only match gpt/codex model names. This makes it a config option so users can turn it on for any model family. New config key: agent.tool_use_enforcement - "auto" (default): matches gpt/codex (existing behavior) - true: inject for all models - false: never inject - list of strings: custom model-name substrings to match e.g. ["gpt", "codex", "deepseek", "qwen"] No version bump needed — deep merge provides the default automatically for existing installs. 12 new tests covering all config modes.	2026-03-28 12:31:22 -07:00
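The four accepted values of agent.tool_use_enforcement can be illustrated with a small resolver. This is a sketch, not the project's code: the function name `should_enforce` is hypothetical and case-insensitive substring matching is an assumption.

```python
AUTO_PATTERNS = ("gpt", "codex")  # families matched by "auto", per the commit


def should_enforce(setting, model_name: str) -> bool:
    """Decide whether to inject tool-use enforcement guidance."""
    if setting is True:
        return True           # inject for all models
    if setting is False:
        return False          # never inject
    if isinstance(setting, (list, tuple)):
        patterns = setting    # custom model-name substrings
    else:
        patterns = AUTO_PATTERNS  # "auto" (default)
    name = model_name.lower()
    return any(str(p).lower() in name for p in patterns)
```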
Teknium	09796b183b	fix: alibaba provider default endpoint and model list (#3484 ) - Change default inference_base_url from dashscope-intl Anthropic-compat endpoint to coding-intl OpenAI-compat /v1 endpoint. The old Anthropic endpoint 404'd when used with the OpenAI SDK (which appends /chat/completions to a /apps/anthropic base URL). - Update curated model list: remove models unavailable on coding-intl (qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max), add third-party models available on the platform (glm-5, glm-4.7, kimi-k2.5, MiniMax-M2.5). - URL-based api_mode auto-detection still works: overriding DASHSCOPE_BASE_URL to an /apps/anthropic endpoint automatically switches to anthropic_messages mode. - Update provider description and env var descriptions to reflect the coding-intl multi-provider platform. - Update tests to match new default URL and test the anthropic override path instead.	2026-03-27 22:10:10 -07:00
Teknium	fd8c465e42	feat: add Hugging Face as a first-class inference provider (#3419 ) Salvage of PR #1747 (original PR #1171 by @davanstrien) onto current main. Registers Hugging Face Inference Providers (router.huggingface.co/v1) as a named provider: - hermes chat --provider huggingface (or --provider hf) - 18 curated open models via hermes model picker - HF_TOKEN in ~/.hermes/.env - OpenAI-compatible endpoint with automatic failover (Groq, Together, SambaNova, etc.) Files: auth.py, models.py, main.py, setup.py, config.py, model_metadata.py, .env.example, 5 docs pages, 17 new tests. Co-authored-by: Daniel van Strien <davanstrien@gmail.com>	2026-03-27 12:41:59 -07:00
Teknium	2d232c9991	feat(cli): configurable busy input mode + fix /queue always working (#3298 ) Two changes: 1. Fix /queue command: remove the _agent_running guard that rejected /queue after the agent finished. The prompt was deferred in _pending_input until the agent completed, then the handler checked _agent_running (now False) and rejected it. /queue now always queues regardless of timing. 2. Add display.busy_input_mode config (CLI-only): - 'interrupt' (default): Enter while busy interrupts the current run (preserves existing behavior) - 'queue': Enter while busy queues the message for the next turn, with a 'Queued for the next turn: ...' confirmation Ctrl+C always interrupts regardless of this setting. Salvaged from PR #3037 by StefanoChiodino. Key differences: - Default is 'interrupt' (preserves existing behavior) not 'queue' - No config version bump (unnecessary for new key in existing section) - Simpler normalization (no alias map) - /queue fix is simpler: just remove the guard instead of intercepting commands during busy state	2026-03-26 17:58:40 -07:00
Robin Fernandes	e95965d76a	Merge branch 'main' into rewbs/tool-use-charge-to-subscription	2026-03-26 16:18:28 -07:00
Robin Fernandes	95dc9aaa75	feat: add managed tool gateway and Nous subscription support - add managed modal and gateway-backed tool integrations - improve CLI setup, auth, and configuration for subscriber flows - expand tests and docs for managed tool support	2026-03-26 16:17:58 -07:00
Teknium	72250b5f62	feat: config-gated /verbose command for messaging gateway (#3262 ) * feat: config-gated /verbose command for messaging gateway Add gateway_config_gate field to CommandDef, allowing cli_only commands to be conditionally available in the gateway based on a config value. - CommandDef gains gateway_config_gate: str | None — a config dotpath that, when truthy, overrides cli_only for gateway surfaces - /verbose uses gateway_config_gate='display.tool_progress_command' - Default is off (cli_only behavior preserved) - When enabled, /verbose cycles tool_progress mode (off/new/all/verbose) in the gateway, saving to config.yaml — same cycle as the CLI - Gateway helpers (help, telegram menus, slack mapping) dynamically check config to include/exclude config-gated commands - GATEWAY_KNOWN_COMMANDS always includes config-gated commands so the gateway recognizes them and can respond appropriately - Handles YAML 1.1 bool coercion (bare 'off' parses as False) - 8 new tests for the config gate mechanism + gateway handler * docs: document gateway_config_gate and /verbose messaging support - AGENTS.md: add gateway_config_gate to CommandDef fields - slash-commands.md: note /verbose can be enabled for messaging, update Notes - configuration.md: add tool_progress_command to display section + usage note - cli.md: cross-link to config docs for messaging enablement - messaging/index.md: show tool_progress_command in config snippet - plugins.md: add gateway_config_gate to register_command parameter table	2026-03-26 14:41:04 -07:00
Teknium	77bcaba2d7	refactor: consolidate get_hermes_home() and parse_reasoning_effort() (#3062 ) Centralizes two widely-duplicated patterns into hermes_constants.py: 1. get_hermes_home() — Path resolution for ~/.hermes (HERMES_HOME env var) - Was copy-pasted inline across 30+ files as: Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")) - Now defined once in hermes_constants.py (zero-dependency module) - hermes_cli/config.py re-exports it for backward compatibility - Removed local wrapper functions in honcho_integration/client.py, tools/website_policy.py, tools/tirith_security.py, hermes_cli/uninstall.py 2. parse_reasoning_effort() — Reasoning effort string validation - Was copy-pasted in cli.py, gateway/run.py, cron/scheduler.py - Same validation logic: check against (xhigh, high, medium, low, minimal, none) - Now defined once in hermes_constants.py, called from all 3 locations - Warning log for unknown values kept at call sites (context-specific) 31 files changed, net +31 lines (125 insertions, 94 deletions) Full test suite: 6179 passed, 0 failed	2026-03-25 15:54:28 -07:00
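For reference, the two consolidated helpers could look roughly like this. The body of `get_hermes_home()` follows the inline expression quoted in the commit message; the return convention of `parse_reasoning_effort()` (normalized string, or None for unknown values) is an assumption, since the message only lists the accepted values.

```python
import os
from pathlib import Path

VALID_REASONING_EFFORTS = ("xhigh", "high", "medium", "low", "minimal", "none")


def get_hermes_home() -> Path:
    """Resolve the Hermes home directory, honoring the HERMES_HOME env var."""
    return Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))


def parse_reasoning_effort(value):
    """Return the normalized effort string, or None if it is not recognized.

    Callers are expected to log their own warning for unknown values.
    """
    if not isinstance(value, str):
        return None
    normalized = value.strip().lower()
    return normalized if normalized in VALID_REASONING_EFFORTS else None
```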
Siddharth Balyan	b6461903ff	feat: nix flake — uv2nix build, NixOS module, persistent container mode (#20 ) * feat: nix flake, uv2nix build, dev shell and home manager * fixed nix run, updated docs for setup * feat(nix): NixOS module with persistent container mode, managed guards, checks - Replace homeModules.nix with nixosModules.nix (two deployment modes) - Mode A (native): hardened systemd service with ProtectSystem=strict - Mode B (container): persistent Ubuntu container with /nix/store bind-mount, identity-hash-based recreation, GC root protection, symlink-based updates - Add HERMES_MANAGED guards blocking CLI config mutation (config set, setup, gateway install/uninstall) when running under NixOS module - Add nix/checks.nix with build-time verification (binary, CLI, managed guard) - Remove container.nix (no Nix-built OCI image; pulls ubuntu:24.04 at runtime) - Simplify packages.nix (drop fetchFromGitHub submodules, PYTHONPATH wrappers) - Rewrite docs/nixos-setup.md with full options reference, container architecture, secrets management, and troubleshooting guide Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Update config.py * feat(nix): add CI workflow and enhanced build checks - GitHub Actions workflow for nix flake check + build on linux/macOS - Entry point sync check to catch pyproject.toml drift - Expanded managed-guard check to cover config edit - Wrap hermes-acp binary in Nix package - Fix Path type mismatch in is_managed() * Update MCP server package name; bundled skills support * fix reading .env. instead have container user a common mounted .env file * feat(nix): container entrypoint with privilege drop and sudo provisioning Container was running as non-root via --user, which broke apt/pip installs and caused crashes when $HOME didn't exist. Replace --user with a Nix-built entrypoint script that provisions the hermes user, sudo (NOPASSWD), and /home/hermes inside the container on first boot, then drops privileges via setpriv. 
Writable layer persists so setup only runs once. Also expands MCP server options to support HTTP transport and sampling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix group and user creation in container mode * feat(nix): persistent /home/hermes and MESSAGING_CWD in container mode Container mode now bind-mounts ${stateDir}/home to /home/hermes so the agent's home directory survives container recreation. Previously it lived in the writable layer and was lost on image/volume/options changes. Also passes MESSAGING_CWD to the container so the agent finds its workspace and documents, matching native mode behavior. Other changes: - Extract containerDataDir/containerHomeDir bindings (no more magic strings) - Fix entrypoint chown to run unconditionally (volume mounts always exist) - Add schema field to container identity hash for auto-recreation - Add idempotency test (Scenario G) to config-roundtrip check * docs: add Nix & NixOS setup guide to docs site Add comprehensive Nix documentation to the Docusaurus site at website/docs/getting-started/nix-setup.md, covering nix run/profile install, NixOS module (native + container modes), declarative settings, secrets management, MCP servers, managed mode, container architecture, dev shell, flake checks, and full options reference. - Register nix-setup in sidebar after installation page - Add Nix callout tip to installation.md linking to new guide - Add canonical version pointer in docs/nixos-setup.md * docs: remove docs/nixos-setup.md, consolidate into website docs Backfill missing details (restart/restartSec in full example, gateway.pid, 0750 permissions, docker inspect commands) into the canonical website/docs/getting-started/nix-setup.md and delete the old standalone file. * fix(nix): add compression.protect_last_n and target_ratio to config-keys.json New keys were added to DEFAULT_CONFIG on main, causing the config-drift check to fail in CI. 
* fix(nix): skip checks on aarch64-darwin (onnxruntime wheel missing) The full Python venv includes onnxruntime (via faster-whisper/STT) which lacks a compatible uv2nix wheel on aarch64-darwin. Gate all checks behind stdenv.hostPlatform.isLinux. The package and devShell still evaluate on macOS. * fix(nix): skip flake check and build on macOS CI onnxruntime (transitive dep via faster-whisper) lacks a compatible uv2nix wheel on aarch64-darwin. Run full checks and build on Linux only; macOS CI verifies the flake evaluates without building. * fix(nix): preserve container writable layer across nixos-rebuild The container identity hash included the entrypoint's Nix store path, which changes on every nixpkgs update (due to runtimeShell/stdenv input-addressing). This caused false-positive identity mismatches, triggering container recreation and losing the persistent writable layer. - Use stable symlink (current-entrypoint) like current-package already does - Remove entrypoint from identity hash (only image/volumes/options matter) - Add GC root for entrypoint so nix-collect-garbage doesn't break it - Remove global HERMES_HOME env var from addToSystemPackages (conflicted with interactive CLI use, service already sets its own) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 01:08:02 +05:30
Teknium	68ab37e891	fix(delegate): give subagents independent iteration budgets (#3004 ) Each subagent now gets its own IterationBudget instead of sharing the parent's. The per-subagent cap is controlled by delegation.max_iterations in config.yaml (default 50). Total iterations across parent + subagents can exceed the parent's max_iterations, but the user retains control via the config setting. Previously, subagents shared the parent's budget, so three parallel subagents configured for max_iterations=50 racing against a parent that already used 60 of 90 would each only get ~10 iterations. Inspired by PR #2928 (Bartok9) which identified the issue (#2873).	2026-03-25 11:29:49 -07:00
Teknium	7ca22ea11b	fix(compression): restore sane defaults and cap summary at 12K tokens - threshold: 0.80 → 0.50 (compress at 50%, not 80%) - target_ratio: 0.40 → 0.20, now relative to threshold not total context (20% of 50% = 10% of context as tail budget) - summary ceiling: 32K → 12K (Gemini can't output more than ~12K) - Updated DEFAULT_CONFIG, config display, example config, and tests	2026-03-24 18:48:47 -07:00
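The "20% of 50% = 10% of context" arithmetic above can be checked with a few lines. The 200,000-token context window and the function name `compression_budgets` are illustrative assumptions, not values or names from the commit.

```python
def compression_budgets(context_tokens: int, threshold: float = 0.50,
                        target_ratio: float = 0.20) -> dict:
    """Compute the compression trigger point and the tail budget.

    target_ratio is applied to the threshold, not to the total context.
    """
    trigger = round(context_tokens * threshold)
    tail = round(trigger * target_ratio)
    return {"trigger_tokens": trigger, "tail_tokens": tail}


budgets = compression_budgets(200_000)  # hypothetical context size
```

Under the previous defaults (threshold 0.80, target_ratio 0.40 of total context) the same window would compress at 160,000 tokens and keep an 80,000-token tail, which is the behavior this commit walks back.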
Teknium	27c023e071	feat(config): expose compression target_ratio, protect_last_n, and threshold in DEFAULT_CONFIG PR #2554 made these configurable via config.yaml but didn't add them to DEFAULT_CONFIG or the config display. Users couldn't discover the new knobs without reading the source. - threshold: 0.80 (compress at 80% context usage) - target_ratio: 0.40 (preserve 40% of context as recent tail) - protect_last_n: 20 (keep last 20 messages uncompressed) - Updated hermes config display to show all three fields	2026-03-24 18:05:43 -07:00
Teknium	745859babb	feat: env var passthrough for skills and user config (#2807 ) * feat: env var passthrough for skills and user config Skills that declare required_environment_variables now have those vars passed through to sandboxed execution environments (execute_code and terminal). Previously, execute_code stripped all vars containing KEY, TOKEN, SECRET, etc. and the terminal blocklist removed Hermes infrastructure vars — both blocked skill-declared env vars. Two passthrough sources: 1. Skill-scoped (automatic): when a skill is loaded via skill_view and declares required_environment_variables, vars that are present in the environment are registered in a session-scoped passthrough set. 2. Config-based (manual): terminal.env_passthrough in config.yaml lets users explicitly allowlist vars for non-skill use cases. Changes: - New module: tools/env_passthrough.py — shared passthrough registry - hermes_cli/config.py: add terminal.env_passthrough to DEFAULT_CONFIG - tools/skills_tool.py: register available skill env vars on load - tools/code_execution_tool.py: check passthrough before filtering - tools/environments/local.py: check passthrough in _sanitize_subprocess_env and _make_run_env - 19 new tests covering all layers * docs: add environment variable passthrough documentation Document the env var passthrough feature across four docs pages: - security.md: new 'Environment Variable Passthrough' section with full explanation, comparison table, and security considerations - code-execution.md: update security section, add passthrough subsection, fix comparison table - creating-skills.md: add tip about automatic sandbox passthrough - skills.md: add note about passthrough after secure setup docs Live-tested: launched interactive CLI, loaded a skill with required_environment_variables, verified TEST_SKILL_SECRET_KEY was accessible inside execute_code sandbox (value: passthrough-test-value-42).	2026-03-24 08:19:34 -07:00
Teknium	98b5570961	fix: make browser command timeout configurable via config.yaml (#2801) browser_vision and other browser commands had a hardcoded 30-second subprocess timeout that couldn't be overridden. Users with slower machines (local Chromium without GPU) would hit timeouts on screenshot capture even when setting browser.command_timeout in config.yaml, because nothing read that value. Changes: - Add browser.command_timeout to DEFAULT_CONFIG (default: 30s) - Add _get_command_timeout() helper that reads config, falls back to 30s - _run_browser_command() now defaults to config value instead of constant - browser_vision screenshot no longer hardcodes timeout=30 - browser_navigate uses max(config_timeout, 60) as floor for navigation Reported by Gamer1988.	2026-03-24 07:21:50 -07:00
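A minimal sketch of the timeout lookup described in commit 98b5570961, assuming the config is already loaded into a nested dict. The function names and the handling of invalid values are hypothetical; only the 30s default, the browser.command_timeout key, and the max(config_timeout, 60) floor come from the commit message.

```python
from typing import Any, Mapping

DEFAULT_COMMAND_TIMEOUT = 30  # seconds; default stated in the commit
NAVIGATION_FLOOR = 60         # seconds; floor stated for browser_navigate


def get_command_timeout(config: Mapping[str, Any]) -> int:
    """Read browser.command_timeout, falling back to 30s when missing or invalid."""
    browser = config.get("browser")
    if not isinstance(browser, Mapping):
        return DEFAULT_COMMAND_TIMEOUT
    try:
        timeout = int(browser.get("command_timeout", DEFAULT_COMMAND_TIMEOUT))
    except (TypeError, ValueError):
        # Assumption: unparsable values fall back rather than raise.
        return DEFAULT_COMMAND_TIMEOUT
    return timeout if timeout > 0 else DEFAULT_COMMAND_TIMEOUT


def get_navigation_timeout(config: Mapping[str, Any]) -> int:
    """Navigation never waits less than 60s, but honors a larger configured value."""
    return max(get_command_timeout(config), NAVIGATION_FLOOR)
```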
Teknium	4ff73fb32c	feat(config): support ${ENV_VAR} substitution in config.yaml (#2684) * feat(config): support ${ENV_VAR} substitution in config.yaml * fix: extend env var expansion to CLI and gateway config loaders The original PR (#2680) only wired _expand_env_vars into load_config(), which is used by 'hermes tools' and 'hermes setup'. The two primary config paths were missed: - load_cli_config() in cli.py (interactive CLI) - Module-level _cfg in gateway/run.py (gateway — bridges api_keys to env vars) Also: - Remove redundant 'import re' (already imported at module level) - Add missing blank lines between top-level functions (PEP 8) - Add tests for load_cli_config() expansion --------- Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>	2026-03-23 16:02:06 -07:00
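The ${ENV_VAR} substitution from commit 4ff73fb32c could look something like the sketch below. It is not the project's _expand_env_vars: the name `expand_env_vars`, the recursive walk over dicts and lists, and the choice to leave unset variables untouched are all assumptions made for illustration.

```python
import os
import re
from typing import Any, Mapping, Optional

# Matches ${NAME}, where NAME is a conventional environment variable identifier.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_vars(value: Any, env: Optional[Mapping[str, str]] = None) -> Any:
    """Recursively replace ${NAME} in strings found anywhere in a parsed config."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # Assumption: an unset variable is left as the literal ${NAME}.
        return _ENV_PATTERN.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {key: expand_env_vars(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env_vars(item, env) for item in value]
    return value  # numbers, booleans, None pass through unchanged
```

Running one such function on the parsed YAML in every loader is the point of the follow-up fix: the commit notes that the CLI and gateway loaders originally skipped the expansion step.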
Teknium	6435d69a6d	fix: make vision_analyze timeout configurable via config.yaml (#2480) Reads auxiliary.vision.timeout from config.yaml (default: 30s) and passes it to async_call_llm. Useful for slow local vision models that need more than 30 seconds. Setting is in config.yaml (not .env) since it's not a secret: auxiliary: vision: timeout: 120 Based on PR #2306. Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-22 05:28:24 -07:00
Teknium	34be3f8be6	revert: remove trailing empty assistant message stripping Reverts the sanitizer addition from PR #2466 (originally #2129). We already have _empty_content_retries handling for reasoning-only responses. The trailing strip risks silently eating valid messages and is redundant with existing empty-content handling.	2026-03-22 04:55:34 -07:00
Mibayy	0698ddb496	fix(compression): remove hardcoded gemini-3-flash-preview as default summary model Closes #2453 The DEFAULT_CONFIG was hardcoding google/gemini-3-flash-preview as the summary_model for context compression. This caused unexpected OpenRouter charges for users who configured a different provider/model, because the compression task would silently fall back to gemini via OpenRouter even when the user's main model was on a different provider. Fix: change summary_model default to empty string. When empty, call_llm() resolves the model through the standard auto-detection chain (auxiliary.compression config -> env vars -> main provider), which correctly uses the user's configured provider and model. Users who want a dedicated cheap model for compression can still explicitly set compression.summary_model in their config.yaml.	2026-03-22 04:36:36 -07:00
Test	d0ac8d9fc7	chore: remove dead top-level toolsets config key The top-level 'toolsets' key in config.yaml was never read at runtime. Tool selection uses platform_toolsets (per-platform) or the --toolsets CLI flag. The key existed in load_cli_config() defaults and the example config as 'toolsets: [all]', misleading users into thinking it controlled tool availability. - Remove from load_cli_config() hardcoded defaults - Remove from hermes config show output - Replace in cli-config.yaml.example with deprecation note pointing to platform_toolsets and hermes tools	2026-03-20 22:27:13 -07:00
Test	e140c02d51	feat(gateway): add webhook platform adapter for external event triggers Add a generic webhook platform adapter that receives HTTP POSTs from external services (GitHub, GitLab, JIRA, Stripe, etc.), validates HMAC signatures, transforms payloads into agent prompts, and routes responses back to the source or to another platform. Features: - Configurable routes with per-route HMAC secrets, event filters, prompt templates with dot-notation payload access, skill loading, and pluggable delivery (github_comment, telegram, discord, log) - HMAC signature validation (GitHub SHA-256, GitLab token, generic) - Rate limiting (30 req/min per route, configurable) - Idempotency cache (1hr TTL, prevents duplicate runs on retries) - Body size limits (1MB default, checked before reading payload) - Setup wizard integration with security warnings and docs links - 33 tests (29 unit + 4 integration), all passing Security: - HMAC secret required per route (startup validation) - Setup wizard warns about internet exposure for webhook/SMS platforms - Sandboxing (Docker/VM) recommended in docs for public-facing deployments Files changed: - gateway/config.py — Platform.WEBHOOK enum + env var overrides - gateway/platforms/webhook.py — WebhookAdapter (~420 lines) - gateway/run.py — factory wiring + auth bypass for webhook events - hermes_cli/config.py — WEBHOOK_* env var definitions - hermes_cli/setup.py — webhook section in setup_gateway() - tests/gateway/test_webhook_adapter.py — 29 unit tests - tests/gateway/test_webhook_integration.py — 4 integration tests - website/docs/user-guide/messaging/webhooks.md — full user docs - website/docs/reference/environment-variables.md — WEBHOOK_* vars - website/sidebars.ts — nav entry	2026-03-20 06:33:36 -07:00
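For the GitHub SHA-256 case mentioned in commit e140c02d51, signature validation typically follows the pattern below. This is a generic sketch, not the WebhookAdapter code: the function name is hypothetical, and it relies on GitHub's documented convention of sending `sha256=<hex digest>` in the X-Hub-Signature-256 header.

```python
import hashlib
import hmac


def verify_github_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Check an X-Hub-Signature-256 style header against the raw request body."""
    prefix = "sha256="
    if not header_value or not header_value.startswith(prefix):
        return False
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    provided = header_value[len(prefix):]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii"), provided.encode("utf-8"))
```

The digest must be computed over the raw bytes as received, before any JSON parsing, since re-serializing the payload can change the byte sequence and invalidate the signature.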
Test	4ad0083118	fix(honcho): read HONCHO_BASE_URL for local/self-hosted instances Cherry-picked from PR #2120 by @unclebumpy. - from_env() now reads HONCHO_BASE_URL and enables Honcho when base_url is set, even without an API key - from_global_config() reads baseUrl from config root with HONCHO_BASE_URL env var as fallback - get_honcho_client() guard relaxed to allow base_url without api_key for no-auth local instances - Added HONCHO_BASE_URL to OPTIONAL_ENV_VARS registry Result: Setting HONCHO_BASE_URL=http://localhost:8000 in ~/.hermes/.env now correctly routes the Honcho client to a local instance.	2026-03-20 04:36:06 -07:00
Teknium	dd60bcbfb7	feat: OpenAI-compatible API server + WhatsApp configurable reply prefix (#1756) * feat: OpenAI-compatible API server platform adapter Salvaged from PR #956, updated for current main. Adds an HTTP API server as a gateway platform adapter that exposes hermes-agent via the OpenAI Chat Completions and Responses APIs. Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat, AnythingLLM, NextChat, ChatBox, etc.) can connect by pointing at http://localhost:8642/v1. Endpoints: - POST /v1/chat/completions — stateless Chat Completions API - POST /v1/responses — stateful Responses API with chaining - GET /v1/responses/{id} — retrieve stored response - DELETE /v1/responses/{id} — delete stored response - GET /v1/models — list hermes-agent as available model - GET /health — health check Features: - Real SSE streaming via stream_delta_callback (uses main's streaming) - In-memory LRU response store for Responses API conversation chaining - Named conversations via 'conversation' parameter - Bearer token auth (optional, via API_SERVER_KEY) - CORS support for browser-based frontends - System prompt layering (frontend system messages on top of core) - Real token usage tracking in responses Integration points: - Platform.API_SERVER in gateway/config.py - _create_adapter() branch in gateway/run.py - API_SERVER_* env vars in hermes_cli/config.py - Env var overrides in gateway/config.py _apply_env_overrides() Changes vs original PR #956: - Removed streaming infrastructure (already on main via stream_consumer.py) - Removed Telegram reply_to_mode (separate feature, not included) - Updated _resolve_model() -> _resolve_gateway_model() - Updated stream_callback -> stream_delta_callback - Updated connect()/disconnect() to use _mark_connected()/_mark_disconnected() - Adapted to current Platform enum (includes MATTERMOST, MATRIX, DINGTALK) Tests: 72 new tests, all passing Docs: API server guide, Open WebUI integration guide, env var reference * feat(whatsapp): make reply prefix configurable via config.yaml Reworked from PR #1764 (ifrederico) to use config.yaml instead of .env. The WhatsApp bridge prepends a header to every outgoing message. This was hardcoded to '⚕ Hermes Agent'. Users can now customize or disable it via config.yaml: whatsapp: reply_prefix: '' # disable header reply_prefix: '🤖 My Bot\n───\n' # custom prefix How it works: - load_gateway_config() reads whatsapp.reply_prefix from config.yaml and stores it in PlatformConfig.extra['reply_prefix'] - WhatsAppAdapter reads it from config.extra at init - When spawning bridge.js, the adapter passes it as WHATSAPP_REPLY_PREFIX in the subprocess environment - bridge.js handles undefined (default), empty (no header), or custom values with \\n escape support - Self-chat echo suppression uses the configured prefix Also fixes _config_version: was 9 but ENV_VARS_BY_VERSION had a key 10 (TAVILY_API_KEY), so existing users at v9 would never be prompted for Tavily. Bumped to 10 to close the gap. Added a regression test to prevent this from happening again. Credit: ifrederico (PR #1764) for the bridge.js implementation and the config version gap discovery. --------- Co-authored-by: Test <test@test.com>	2026-03-17 10:44:37 -07:00


190 Commits