hermes-agent

Author	SHA1	Message	Date
ryanautomated	0f9aa57069	fix: silent memory flush failure on /new and /resume commands The _async_flush_memories() helper accepts (session_id) but both the /new and /resume handlers passed two arguments (session_id, session_key). The TypeError was silently swallowed at DEBUG level, so memory extraction never ran when users typed /new or /resume. One call site (the session expiry watcher) was already fixed in `9c96f669`, but /new and /resume were missed. - gateway/run.py:3247 — remove stray session_key from /new handler - gateway/run.py:4989 — remove stray session_key from /resume handler - tests/gateway/test_resume_command.py:222 — update test assertion	2026-04-06 16:49:42 -07:00
Mikita Lisavets	29b5ec2555	fix: clear session-scoped model after session reset	2026-04-06 13:20:01 -07:00
Mikita Lisavets	9afb9a6cb2	fix: clear session-scoped model overrides during session reset	2026-04-06 13:20:01 -07:00
donrhmexe	2c814d7b5d	fix: /model --global writes model.name instead of model.default The canonical config key for model name is model.default (used by setup, auth, runtime_provider, profile list, and CLI startup). But /model --global wrote to model.name in both gateway and CLI paths. This caused: - hermes profile list showing the old model (reads model.default) - Gateway restart reverting to the old model (_resolve_gateway_model reads model.default) - CLI startup using the old model (main.py reads model.default) The only reason it appeared to work in Telegram was the cached agent staying alive with the in-place switch. Fix: change all 3 write/read sites to use model.default.	2026-04-06 13:20:01 -07:00
Dusk1e	7d0953d6ff	security(gateway): isolate env/credential registries using ContextVars	2026-04-06 12:42:16 -07:00
Teknium	9c96f669a1	feat: centralized logging, instrumentation, hermes logs CLI, gateway noise fix (#5430 ) Adds comprehensive logging infrastructure to Hermes Agent across 4 phases: Phase 1 — Centralized logging - New hermes_logging.py with idempotent setup_logging() used by CLI, gateway, and cron - agent.log (INFO+) and errors.log (WARNING+) with RotatingFileHandler + RedactingFormatter - config.yaml logging: section (level, max_size_mb, backup_count) - All entry points wired (cli.py, main.py, gateway/run.py, run_agent.py) - Fixed debug_helpers.py writing to ./logs/ instead of ~/.hermes/logs/ Phase 2 — Event instrumentation - API calls: model, provider, tokens, latency, cache hit % - Tool execution: name, duration, result size (both sequential + concurrent) - Session lifecycle: turn start (session/model/provider/platform), compression (before/after) - Credential pool: rotation events, exhaustion tracking Phase 3 — hermes logs CLI command - hermes logs / hermes logs -f / hermes logs errors / hermes logs gateway - --level, --session, --since filters - hermes logs list (file sizes + ages) Phase 4 — Gateway bug fix + noise reduction - fix: _async_flush_memories() called with wrong arg count — sessions never flushed - Batched session expiry logs: 6 lines/cycle → 2 summary lines - Added inbound message + response time logging 75 new tests, zero regressions on the full suite.	2026-04-06 00:08:20 -07:00
Teknium	dce5f51c7c	feat: config structure validation — detect malformed YAML at startup (#5426 ) Add validate_config_structure() that catches common config.yaml mistakes: - custom_providers as dict instead of list (missing '-' in YAML) - fallback_model accidentally nested inside another section - custom_providers entries missing required fields (name, base_url) - Missing model section when custom_providers is configured - Root-level keys that look like misplaced custom_providers fields Surface these diagnostics at three levels: 1. Startup: print_config_warnings() runs at CLI and gateway module load, so users see issues before hitting cryptic errors 2. Error time: 'Unknown provider' errors in auth.py and model_switch.py now include config diagnostics with fix suggestions 3. Doctor: 'hermes doctor' shows a Config Structure section with all issues and fix hints Also adds a warning log in runtime_provider.py when custom_providers is a dict (previously returned None silently). Motivated by a Discord user who had malformed custom_providers YAML and got only 'Unknown Provider' with no guidance on what was wrong. 17 new tests covering all validation paths.	2026-04-05 23:31:20 -07:00
Teknium	89c812d1d2	feat: shared thread sessions by default — multi-user thread support (#5391 ) Threads (Telegram forum topics, Discord threads, Slack threads) now default to shared sessions where all participants see the same conversation. This is the expected UX for threaded conversations where multiple users @mention the bot and interact collaboratively. Changes: - build_session_key(): when thread_id is present, user_id is no longer appended to the session key (threads are shared by default) - New config: thread_sessions_per_user (default: false) — opt-in to restore per-user isolation in threads if needed - Sender attribution: messages in shared threads are prefixed with [sender name] so the agent can tell participants apart - System prompt: shared threads show 'Multi-user thread' note instead of a per-turn User line (avoids busting prompt cache) - Wired through all callers: gateway/run.py, base.py, telegram.py, feishu.py - Regular group messages (no thread) remain per-user isolated (unchanged) - DM threads are unaffected (they have their own keying logic) Closes community request from demontut_ re: thread-based shared sessions.	2026-04-05 19:46:58 -07:00
Teknium	fec58ad99e	fix(gateway): replace wall-clock agent timeout with inactivity-based timeout (#5389) The gateway previously used a hard wall-clock asyncio.wait_for timeout that killed agents after a fixed duration regardless of activity. This punished legitimate long-running tasks (subagent delegation, reasoning models, multi-step research). The gateway now uses an inactivity-based polling loop that checks the agent's built-in activity tracker (get_activity_summary) every 5 seconds. The agent can run indefinitely as long as it's actively calling tools or receiving API responses. The timeout fires only when the agent has been completely idle for the configured duration. Changes: - Replace asyncio.wait_for with asyncio.wait poll loop checking agent idle time via get_activity_summary() - Add agent.gateway_timeout config.yaml key (default 1800s, 0=unlimited) - Update stale session eviction to use agent idle time instead of pure wall-clock (prevents evicting active long-running tasks) - Preserve all existing diagnostic logging and user-facing context Inspired by PR #4864 (Mibayy) and issue #4815 (BongSuCHOI). Reimplemented on current main using existing _touch_activity() infrastructure rather than a parallel tracker.	2026-04-05 19:38:21 -07:00
Teknium	2563493466	fix: improve timeout debug logging and user-facing diagnostics (#5370 ) Agent activity tracking: - Add _last_activity_ts, _last_activity_desc, _current_tool to AIAgent - Touch activity on: API call start/complete, tool start/complete, first stream chunk, streaming request start - Public get_activity_summary() method for external consumers Gateway timeout diagnostics: - Timeout message now includes what the agent was doing when killed: actively working vs stuck on a tool vs waiting on API response - Includes iteration count, last activity description, seconds since last activity — users can distinguish legitimate long tasks from genuine hangs - 'Still working' notifications now show iteration count and current tool instead of just elapsed time - Stale lock eviction logs include agent activity state for debugging Stream stale timeout: - _emit_status when stale stream is detected (was log-only) — gateway users now see 'No response from provider for Ns' with model and context size - Improved logger.warning with model name and estimated context size Error path notifications (gateway-visible via _emit_status): - Context compression attempts now use _emit_status (was _vprint only) - Non-retryable client errors emit summary before aborting - Max retry exhaustion emits error summary (was _vprint only) - Rate limit exhaustion emits specific rate-limit message These were all CLI-visible but silent to gateway users, which is why people on Telegram/Discord saw generic 'request failed' messages without explanation.	2026-04-05 18:33:33 -07:00
Mibayy	cc2b56b26a	feat(api): structured run events via /v1/runs SSE endpoint Add POST /v1/runs to start async agent runs and GET /v1/runs/{run_id}/events for SSE streaming of typed lifecycle events (tool.started, tool.completed, message.delta, reasoning.available, run.completed, run.failed). Changes the internal tool_progress_callback signature from positional (tool_name, preview, args) to event-type-first (event_type, tool_name, preview, args, **kwargs). Existing consumers filter on event_type and remain backward-compatible. Adds concurrency limit (_MAX_CONCURRENT_RUNS=10) and orphaned run sweep. Fixes logic inversion in cli.py _on_tool_progress where the original PR would have displayed internal tools instead of non-internal ones. Co-authored-by: Mibayy <mibayy@users.noreply.github.com>	2026-04-05 12:05:13 -07:00
Teknium	0c95e91059	fix: follow-up fixes for salvaged PRs - Fix GatewayApp → GatewayRunner import in api_server.py (PR #4976) - Update launchd test assertions for new bootstrap/bootout/kickstart commands (PR #4892) - Add nonlocal message declaration in run_sync() to fix UnboundLocalError (pre-existing scoping bug)	2026-04-05 11:59:28 -07:00
analista	6a6ae9a5c3	fix(gateway): correct misleading log text for unknown /commands The warning said 'forwarding as plain text' but the code returns a user-facing error reply instead of forwarding. Describe what actually happens.	2026-04-05 11:59:28 -07:00
analista	e8053e8b93	fix(gateway): surface unknown /commands instead of leaking them to the LLM Previously, typing a /command that isn't a built-in, plugin, or skill would silently fall through to the LLM as plain text. The model often interprets it as a loose instruction and invents unrelated tool calls — e.g. a stray /claude_code slipped through and the model fabricated a delegate_task invocation that got stuck in an OAuth loop. Now we check GATEWAY_KNOWN_COMMANDS after the skill / plugin / unavailable-skill lookups and return an actionable message pointing the user at /commands. The user gets feedback, and the agent doesn't waste a round-trip guessing what /foo-bar was supposed to mean.	2026-04-05 11:59:28 -07:00
analista	4a75aec433	fix(gateway): resolve Telegram's underscored /commands to skill/plugin keys Telegram's Bot API disallows hyphens in command names, so _build_telegram_menu registers /claude-code as /claude_code. When the user taps it from autocomplete, the gateway dispatch did a direct lookup against skill_cmds (keyed on the hyphenated form) and missed, silently falling through to the LLM as plain text. The model would then typically call delegate_task, spawning a Hermes subagent instead of invoking the intended skill. Normalize underscores to hyphens in skill and plugin command lookup, matching the existing pattern in _check_unavailable_skill.	2026-04-05 11:59:28 -07:00
nibzard	4df2fca2f0	fix(gateway): cap memory flush retries at 3 to prevent infinite loop The _session_expiry_watcher retried failed memory flushes forever because exceptions were caught at debug level without setting memory_flushed=True. Expired sessions with transient failures (rate limits, network errors) would retry every 5 minutes indefinitely, burning API quota and blocking gateway message processing via 429 rate limit cascades. Observed case: a March 19 session retried 28+ times over ~17 days, causing repeated 429 errors that made Telegram unresponsive. Add a per-session failure counter (_flush_failures) that gives up after 3 consecutive attempts and marks the session as flushed to break the loop.	2026-04-05 11:59:28 -07:00
Teknium	c100ad874c	fix(matrix): E2EE cron delivery via live adapter + HTML formatting + origin fallback Salvaged from PRs #3767 (chalkers), #5236 (ygd58), #2641 (buntingszn). Three improvements to Matrix cron delivery: 1. Live adapter path: when the gateway is running, cron delivery now uses the connected MatrixAdapter via run_coroutine_threadsafe instead of the standalone HTTP PUT. This enables delivery to E2EE rooms where the raw HTTP path cannot encrypt. Falls back to standalone on failure. Threads adapters + event loop from gateway -> cron ticker -> tick() -> _deliver_result(). (from #3767) 2. HTML formatted_body: _send_matrix() now converts markdown to HTML using the optional markdown library, with h1-h6 to bold conversion for Element X compatibility. Falls back to plain text if markdown is not installed. Also adds random bytes to txn_id to prevent collisions. (from #5236) 3. Origin fallback: when deliver="origin" but origin is null (jobs created via API/scripts), falls back to HOME_CHANNEL env vars in order: matrix -> telegram -> discord -> slack. (from #2641)	2026-04-05 11:07:47 -07:00
dlkakbs	36e046e843	fix(gateway): MIME type fallback for Matrix document uploads Cherry-picked run.py portion from PR #3495 by dlkakbs. When Matrix sends non-image files (text, YAML, JSON, etc.), the MIME type may be empty or application/octet-stream. Falls back to extension-based detection so text files are properly injected into agent context.	2026-04-05 11:07:47 -07:00
LucidPaths	70f798043b	fix: Ollama Cloud auth, /model switch persistence, and alias tab completion - Add OLLAMA_API_KEY to credential resolution chain for ollama.com endpoints - Update requested_provider/_explicit_api_key/_explicit_base_url after /model switch so _ensure_runtime_credentials() doesn't revert the switch - Pass base_url/api_key from fallback config to resolve_provider_client() - Add DirectAlias system: user-configurable model_aliases in config.yaml checked before catalog resolution, with reverse lookup by model ID - Add /model tab completion showing aliases with provider metadata Co-authored-by: LucidPaths <LucidPaths@users.noreply.github.com>	2026-04-05 11:06:06 -07:00
Teknium	e899d6a05d	fix: increase default HERMES_AGENT_TIMEOUT from 10min to 30min Users hitting the 10-minute default during complex tool chains. Bumps both the execution cap and stale-lock eviction timeout. Still overridable via HERMES_AGENT_TIMEOUT env var (0 = unlimited).	2026-04-05 10:32:59 -07:00
Teknium	4976a8b066	feat: /model command — models.dev primary database + --provider flag (#5181 ) Full overhaul of the model/provider system. ## What changed - models.dev (109 providers, 4000+ models) as primary database for provider identity AND model metadata - --provider flag replaces colon syntax for explicit provider switching - Full ModelInfo/ProviderInfo dataclasses with context, cost, capabilities, modalities - HermesOverlay system merges models.dev + Hermes-specific transport/auth/aggregator flags - User-defined endpoints via config.yaml providers: section - /model (no args) lists authenticated providers with curated model catalog - Rich metadata display: context window, max output, cost/M tokens, capabilities - Config migration: custom_providers list → providers dict (v11→v12) - AIAgent.switch_model() for in-place model swap preserving conversation ## Files agent/models_dev.py, hermes_cli/providers.py, hermes_cli/model_switch.py, hermes_cli/model_normalize.py, cli.py, gateway/run.py, run_agent.py, hermes_cli/config.py, hermes_cli/commands.py	2026-04-05 01:04:44 -07:00
Teknium	0c54da8aaf	feat(gateway): live-stream /update output + interactive prompt buttons (#5180 ) * feat(gateway): live-stream /update output + forward interactive prompts Adds real-time output streaming and interactive prompt forwarding for the gateway /update command, so users on Telegram/Discord/etc see the full update progress and can respond to prompts (stash restore, config migration) without needing terminal access. Changes: hermes_cli/main.py: - Add --gateway flag to 'hermes update' argparse - Add _gateway_prompt() file-based IPC function that writes .update_prompt.json and polls for .update_response - Modify _restore_stashed_changes() to accept optional input_fn parameter for gateway mode prompt forwarding - cmd_update() uses _gateway_prompt when --gateway is set, enabling interactive stash restore and config migration prompts gateway/run.py: - _handle_update_command: spawn with --gateway flag and PYTHONUNBUFFERED=1 for real-time output flushing - Store session_key in .update_pending.json for cross-restart session matching - Add _update_prompt_pending dict to track sessions awaiting update prompt responses - Replace _watch_for_update_completion with _watch_update_progress: streams output chunks every ~4s, detects .update_prompt.json and forwards prompts to the user, handles completion/failure/timeout - Add update prompt interception in _handle_message: when a prompt is pending, the user's next message is written to .update_response instead of being processed normally - Preserve _send_update_notification as legacy fallback for post-restart cases where adapter isn't available yet File-based IPC protocol: - .update_prompt.json: written by update process with prompt text, default value, and unique ID - .update_response: written by gateway with user's answer - .update_output.txt: existing, now streamed in real-time - .update_exit_code: existing completion marker Tests: 16 new tests covering _gateway_prompt IPC, output streaming, prompt detection/forwarding, 
message interception, and cleanup. * feat: interactive buttons for update prompts (Telegram + Discord) Telegram: Inline keyboard with ✓ Yes / ✗ No buttons. Clicking a button answers the callback query, edits the message to show the choice, and writes .update_response directly. CallbackQueryHandler registered on the update_prompt: prefix. Discord: UpdatePromptView (discord.ui.View) with green Yes / red No buttons. Follows the ExecApprovalView pattern — auth check, embed color update, disabled-after-click. Writes .update_response on click. All platforms: /approve and /deny (and /yes, /no) now work as shorthand for yes/no when an update prompt is pending. The text fallback message instructs users to use these commands. Raw message interception still works as a fallback for non-command responses. Gateway watcher checks adapter for send_update_prompt method (class-level check to avoid MagicMock false positives) and falls back to text prompt with /approve instructions when unavailable. * fix: block /update on non-messaging platforms (API, webhooks, ACP) Add _UPDATE_ALLOWED_PLATFORMS frozenset that explicitly lists messaging platforms where /update is permitted. API server, webhook, and ACP platforms get a clear error directing them to run hermes update from the terminal instead. ACP and API server already don't reach _handle_message (separate codepaths), and webhooks have distinct session keys that can't collide with messaging sessions. This guard is belt-and-suspenders.	2026-04-05 00:28:58 -07:00
Teknium	b93fa234df	fix: clear ghost status-bar lines on terminal resize (#4960 ) * feat: add /branch (/fork) command for session branching Inspired by Claude Code's /branch command. Creates a copy of the current session's conversation history in a new session, allowing the user to explore a different approach without losing the original. Works like 'git checkout -b' for conversations: - /branch — auto-generates a title from the parent session - /branch my-idea — uses a custom title - /fork — alias for /branch Implementation: - CLI: _handle_branch_command() in cli.py - Gateway: _handle_branch_command() in gateway/run.py - CommandDef with 'fork' alias in commands.py - Uses existing parent_session_id field in session DB - Uses get_next_title_in_lineage() for auto-numbered branches - 14 tests covering session creation, history copy, parent links, title generation, edge cases, and agent sync * fix: clear ghost status-bar lines on terminal resize When the terminal shrinks (e.g. un-maximize), the emulator reflows previously full-width rows (status bar, input rules) into multiple narrower rows. prompt_toolkit's _on_resize only cursor_up()s by the stored layout height, missing the extra rows from reflow — leaving ghost duplicates of the status bar visible. Fix: monkey-patch Application._on_resize to detect width shrinks, calculate the extra rows created by reflow, and inflate the renderer's cursor_pos.y so the erase moves up far enough to clear ghosts.	2026-04-03 22:43:45 -07:00
Teknium	1c0c5d957f	fix(gateway): support infinite timeout + periodic notifications + actionable error (#4959) - HERMES_AGENT_TIMEOUT=0 now means no limit (infinite execution) - Periodic 'still working' notifications every 10 minutes for long tasks - Timeout error message now tells users how to increase the limit - Stale-lock eviction handles infinite timeout correctly (float inf TTL)	2026-04-03 22:37:38 -07:00
Teknium	ad4feeaf0d	feat: wire skills.external_dirs into all remaining discovery paths The config key skills.external_dirs and core resolution (get_all_skills_dirs, get_external_skills_dirs in agent/skill_utils.py) already existed but several code paths still only scanned SKILLS_DIR. Now external dirs are respected everywhere: - skills_categories(): scan all dirs for category discovery - _get_category_from_path(): resolve categories against any skills root - skill_manager_tool._find_skill(): search all dirs for edit/patch/delete - credential_files.get_skills_directory_mount(): mount all dirs into Docker/Singularity containers (external dirs at external_skills/<idx>) - credential_files.iter_skills_files(): list files from all dirs for Modal/Daytona upload - tools/environments/ssh.py: rsync all skill dirs to remote hosts - gateway _check_unavailable_skill(): check disabled skills across all dirs Usage in config.yaml: skills: external_dirs: - ~/repos/agent-skills/hermes - /shared/team-skills	2026-04-03 21:14:42 -07:00
Teknium	f1c0847145	fix(gateway): restore short preview truncation for all/new tool progress modes (#4935) The tool_preview_length: 0 (unlimited) config change from `e314833c` removed truncation from gateway progress messages in all/new modes. This caused full terminal commands, code blocks, and file paths to appear as permanent messages in Telegram -- the old 40-char truncation was the correct behavior for messaging platforms. Now: - all/new modes: always truncate previews to 40 chars (old behavior) - verbose mode: respects tool_preview_length config for JSON args cap Reported by Paulclgro and socialsurfer on Discord.	2026-04-03 20:32:01 -07:00
memosr	287ac15efd	fix(gateway): write update-pending state atomically to prevent corruption	2026-04-03 18:57:38 -07:00
Tranquil-Flow	3bfb39a25f	fix(gateway): isolate approval session key per turn	2026-04-03 17:50:01 -07:00
Teknium	8a384628a5	fix(memory): profile-scoped memory isolation and clone support (#4845) Three fixes for memory+profile isolation bugs: 1. memory_tool.py: Replace module-level MEMORY_DIR constant with get_memory_dir() function that calls get_hermes_home() dynamically. The old constant was cached at import time and could go stale if HERMES_HOME changed after import. Internal MemoryStore methods now call get_memory_dir() directly. MEMORY_DIR kept as backward-compat alias. 2. profiles.py: profile create --clone now copies MEMORY.md and USER.md from the source profile. These curated memory files are part of the agent's identity (same as SOUL.md) and should carry over on clone. 3. holographic plugin: initialize() now expands $HERMES_HOME and ${HERMES_HOME} in the db_path config value, so users can write 'db_path: $HERMES_HOME/memory_store.db' and it resolves to the active profile directory, not the default home. Tests updated to mock get_memory_dir() alongside the legacy MEMORY_DIR.	2026-04-03 13:10:11 -07:00
Teknium	aecbf7fa4a	fix(discord): register /approve and /deny slash commands, wire up button-based approval UI (#4800) Two fixes for Discord exec approval: 1. Register /approve and /deny as native Discord slash commands so they appear in Discord's command picker (autocomplete). Previously they were only handled as text commands, so users saw 'no commands found' when typing /approve. 2. Wire up the existing ExecApprovalView button UI (was dead code): - ExecApprovalView now calls resolve_gateway_approval() to actually unblock the waiting agent thread when a button is clicked - Gateway's _approval_notify_sync() detects adapters with send_exec_approval() and routes through the button UI - Added 'Allow Session' button for parity with /approve session - send_exec_approval() now accepts session_key and metadata for thread support - Graceful fallback to text-based /approve prompt if button send fails Also updates test mocks to include grey/secondary ButtonStyle and purple Color (used by new button styles).	2026-04-03 10:24:07 -07:00
Teknium	5db630aae4	fix: respect per-platform disabled skills in Telegram menu and gateway dispatch (#4799 ) Three interconnected bugs caused `hermes skills config` per-platform settings to be silently ignored: 1. telegram_menu_commands() never filtered disabled skills — all skills consumed menu slots regardless of platform config, hitting Telegram's 100 command cap. Now loads disabled skills for 'telegram' and excludes them from the menu. 2. Gateway skill dispatch executed disabled skills because get_skill_commands() (process-global cache) only filters by the global disabled list at scan time. Added per-platform check before execution, returning an actionable 'skill is disabled' message. 3. get_disabled_skill_names() only checked HERMES_PLATFORM env var, but the gateway sets HERMES_SESSION_PLATFORM instead. Added HERMES_SESSION_PLATFORM as fallback, plus an explicit platform= parameter for callers that know their platform (menu builder, gateway dispatch). Also added platform to prompt_builder's skills cache key so multi-platform gateways get correct per-platform skill prompts. Reported by SteveSkedasticity (CLAW community).	2026-04-03 10:10:53 -07:00
Teknium	b6f9b70afd	fix(gateway): route /approve and /deny through running-agent guard (#4798) When the agent is blocked on a dangerous command approval (threading.Event wait inside tools/approval.py), incoming /approve and /deny commands were falling through to the generic interrupt path instead of being dispatched to their command handlers. The interrupt sets _interrupt_requested on the agent, but the agent thread is blocked on event.wait() and never checks the flag. Result: the approval timed out after 300s (5 minutes) instead of executing. Fix: intercept /approve and /deny in the running-agent early-intercept block (alongside /stop, /new, /queue) and route directly to _handle_approve_command / _handle_deny_command.	2026-04-03 09:59:52 -07:00
Teknium	f374ae4c61	fix: prevent compression death spiral from API disconnects (#2153 ) (#4750 ) Three fixes for long-running gateway sessions that enter a death spiral when API disconnects prevent token data collection, which prevents compression, which causes more disconnects: Layer 1 — Stale token counter fallback (run_agent.py in-loop): When last_prompt_tokens is 0 (stale after API disconnect or provider returned no usage data), fall back to estimate_messages_tokens_rough() instead of passing 0 to should_compress(), which would never fire. Layer 2 — Server disconnect heuristic (run_agent.py error handler): When ReadError/RemoteProtocolError hits a large session (>60% context or >200 messages), treat it as a context-length error and trigger compression rather than burning through retries that all fail the same way. Layer 3 — Hard message count limit (gateway/run.py hygiene): Force compression when a session exceeds 400 messages, regardless of token estimates. This catches runaway growth even when all token-based checks fail due to missing API data. Based on the analysis from PR #2157 by ygd58 — the gateway threshold direction fix (1.4x multiplier) was already resolved on main.	2026-04-03 02:16:46 -07:00
Teknium	a933079564	fix: move class-level attribute after docstring, clarify throttle comment Follow-up nits for salvaged PR #4577: - Move _running_agents_ts class attribute below the docstring so GatewayRunner.__doc__ is preserved. - Add clarifying comment explaining the throttle continue behavior (batches queued messages during the throttle interval).	2026-04-03 00:50:17 -07:00
kshitijk4poor	0ed28ab80c	refactor: simplify and harden PR fixes after review - Fix cron ThreadPoolExecutor blocking on timeout: use shutdown(wait=False, cancel_futures=True) instead of context manager that waits indefinitely - Extract _dequeue_pending_text() to deduplicate media-placeholder logic in interrupt and normal-completion dequeue paths - Remove hasattr guards for _running_agents_ts: add class-level default so partial test construction works without scattered defensive checks - Move `import concurrent.futures` to top of cron/scheduler.py - Progress throttle: sleep remaining interval instead of busy-looping 0.1s (~15 wakeups per 1.5s window → 1 wakeup) - Deduplicate _load_stt_config() in transcription_tools.py: _has_openai_audio_backend() now delegates to _resolve_openai_audio_client_config()	2026-04-03 00:50:17 -07:00
kshitijk4poor	970042deab	fix(gateway): prevent stuck sessions with agent timeout and staleness eviction Three changes to prevent sessions from getting permanently locked: 1. Agent execution timeout (HERMES_AGENT_TIMEOUT, default 10min): Wraps run_in_executor with asyncio.wait_for so a hung API call or runaway tool can't lock a session indefinitely. On timeout, the agent is interrupted and the user gets an actionable error message. 2. Staleness eviction for _running_agents: Tracks start timestamps for each session entry. When a new message arrives and the entry is older than timeout + 1min grace, it's evicted as a leaked lock. Safety net for any cleanup path that fails to remove the entry. 3. Cron job timeout (HERMES_CRON_TIMEOUT, default 10min): Wraps run_conversation in a ThreadPoolExecutor with timeout so a hung cron job doesn't block the ticker thread (and all subsequent cron jobs) indefinitely. Follows grammY runner's per-update timeout pattern and aiogram's asyncio.wait_for approach for handler deadlines.	2026-04-03 00:50:17 -07:00
kshitijk4poor	69f85a4dce	fix(gateway): race condition, photo media loss, and flood control in Telegram Three bugs causing intermittent silent drops, partial responses, and flood control delays on the Telegram platform: 1. Race condition in handle_message() — _active_sessions was set inside the background task, not before create_task(). Two rapid messages could both pass the guard and spawn duplicate processing tasks. Fix: set _active_sessions synchronously before spawning the task (grammY sequentialize / aiogram EventIsolation pattern). 2. Photo media loss on dequeue — when a photo (no caption) was queued during active processing and later dequeued, only .text was extracted. Empty text → message silently dropped. Fix: _build_media_placeholder() creates text context for media-only events so they survive the dequeue path. 3. Progress message edits triggered Telegram flood control — rapid tool calls edited the progress message every 0.3s, hitting Telegram's rate limit (23s+ waits). This blocked progress updates and could cause stream consumer timeouts. Fix: throttle edits to 1.5s minimum interval, detect flood control errors and gracefully degrade to new messages. edit_message() now returns failure for flood waits >5s instead of blocking.	2026-04-03 00:50:17 -07:00
Teknium	21c2d32471	fix(gateway): normalize step_callback prev_tools for backward compat The PR changed prev_tools from list[str] to list[dict] with name/result keys. The gateway's _step_callback_sync passed this directly to hooks as 'tool_names', breaking user-authored hooks that call ', '.join(tool_names). Now: - 'tool_names' always contains strings (backward-compatible) - 'tools' carries the enriched dicts for hooks that want results Also adds summary logging to register_mcp_servers() and comprehensive tests for all three PR changes: - sanitize_mcp_name_component edge cases - register_mcp_servers public API - _register_session_mcp_servers ACP integration - step_callback result forwarding - gateway normalization backward compat	2026-04-02 20:54:27 -07:00
Teknium	924bc67eee	feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623 ) * feat(memory): add pluggable memory provider interface with profile isolation Introduces a pluggable MemoryProvider ABC so external memory backends can integrate with Hermes without modifying core files. Each backend becomes a plugin implementing a standard interface, orchestrated by MemoryManager. Key architecture: - agent/memory_provider.py — ABC with core + optional lifecycle hooks - agent/memory_manager.py — single integration point in the agent loop - agent/builtin_memory_provider.py — wraps existing MEMORY.md/USER.md Profile isolation fixes applied to all 6 shipped plugins: - Cognitive Memory: use get_hermes_home() instead of raw env var - Hindsight Memory: check $HERMES_HOME/hindsight/config.json first, fall back to legacy ~/.hindsight/ for backward compat - Hermes Memory Store: replace hardcoded ~/.hermes paths with get_hermes_home() for config loading and DB path defaults - Mem0 Memory: use get_hermes_home() instead of raw env var - RetainDB Memory: auto-derive profile-scoped project name from hermes_home path (hermes-<profile>), explicit env var overrides - OpenViking Memory: read-only, no local state, isolation via .env MemoryManager.initialize_all() now injects hermes_home into kwargs so every provider can resolve profile-scoped storage without importing get_hermes_home() themselves. Plugin system: adds register_memory_provider() to PluginContext and get_plugin_memory_providers() accessor. Based on PR #3825. 46 tests (37 unit + 5 E2E + 4 plugin registration). * refactor(memory): drop cognitive plugin, rewrite OpenViking as full provider Remove cognitive-memory plugin (#727) — core mechanics are broken: decay runs 24x too fast (hourly not daily), prefetch uses row ID as timestamp, search limited by importance not similarity. 
Rewrite openviking-memory plugin from a read-only search wrapper into a full bidirectional memory provider using the complete OpenViking session lifecycle API: - sync_turn: records user/assistant messages to OpenViking session (threaded, non-blocking) - on_session_end: commits session to trigger automatic memory extraction into 6 categories (profile, preferences, entities, events, cases, patterns) - prefetch: background semantic search via find() endpoint - on_memory_write: mirrors built-in memory writes to the session - is_available: checks env var only, no network calls (ABC compliance) Tools expanded from 3 to 5: - viking_search: semantic search with mode/scope/limit - viking_read: tiered content (abstract ~100tok / overview ~2k / full) - viking_browse: filesystem-style navigation (list/tree/stat) - viking_remember: explicit memory storage via session - viking_add_resource: ingest URLs/docs into knowledge base Uses direct HTTP via httpx (no openviking SDK dependency needed). Response truncation on viking_read to prevent context flooding. * fix(memory): harden Mem0 plugin — thread safety, non-blocking sync, circuit breaker - Remove redundant mem0_context tool (identical to mem0_search with rerank=true, top_k=5 — wastes a tool slot and confuses the model) - Thread sync_turn so it's non-blocking — Mem0's server-side LLM extraction can take 5-10s, was stalling the agent after every turn - Add threading.Lock around _get_client() for thread-safe lazy init (prefetch and sync threads could race on first client creation) - Add circuit breaker: after 5 consecutive API failures, pause calls for 120s instead of hammering a down server every turn. Auto-resets after cooldown. Logs a warning when tripped. 
- Track success/failure in prefetch, sync_turn, and all tool calls - Wait for previous sync to finish before starting a new one (prevents unbounded thread accumulation on rapid turns) - Clean up shutdown to join both prefetch and sync threads * fix(memory): enforce single external memory provider limit MemoryManager now rejects a second non-builtin provider with a warning. Built-in memory (MEMORY.md/USER.md) is always accepted. Only ONE external plugin provider is allowed at a time. This prevents tool schema bloat (some providers add 3-5 tools each) and conflicting memory backends. The warning message directs users to configure memory.provider in config.yaml to select which provider to activate. Updated all 47 tests to use builtin + one external pattern instead of multiple externals. Added test_second_external_rejected to verify the enforcement. * feat(memory): add ByteRover memory provider plugin Implements the ByteRover integration (from PR #3499 by hieuntg81) as a MemoryProvider plugin instead of direct run_agent.py modifications. ByteRover provides persistent memory via the brv CLI — a hierarchical knowledge tree with tiered retrieval (fuzzy text then LLM-driven search). Local-first with optional cloud sync. Plugin capabilities: - prefetch: background brv query for relevant context - sync_turn: curate conversation turns (threaded, non-blocking) - on_memory_write: mirror built-in memory writes to brv - on_pre_compress: extract insights before context compression Tools (3): - brv_query: search the knowledge tree - brv_curate: store facts/decisions/patterns - brv_status: check CLI version and context tree state Profile isolation: working directory at $HERMES_HOME/byterover/ (scoped per profile). Binary resolution cached with thread-safe double-checked locking. All write operations threaded to avoid blocking the agent (curate can take 120s with LLM processing). 
* fix(memory): thread remaining sync_turns, fix holographic, add config key Plugin fixes: - Hindsight: thread sync_turn (was blocking up to 30s via _run_in_thread) - RetainDB: thread sync_turn (was blocking on HTTP POST) - Both: shutdown now joins sync threads alongside prefetch threads Holographic retrieval fixes: - reason(): removed dead intersection_key computation (bundled but never used in scoring). Now reuses pre-computed entity_residuals directly, moved role_content encoding outside the inner loop. - contradict(): added _MAX_CONTRADICT_FACTS=500 scaling guard. Above 500 facts, only checks the most recently updated ones to avoid O(n^2) explosion (~125K comparisons at 500 is acceptable). Config: - Added memory.provider key to DEFAULT_CONFIG ("" = builtin only). No version bump needed (deep_merge handles new keys automatically). * feat(memory): extract Honcho as a MemoryProvider plugin Creates plugins/honcho-memory/ as a thin adapter over the existing honcho_integration/ package. All 4 Honcho tools (profile, search, context, conclude) move from the normal tool registry to the MemoryProvider interface. The plugin delegates all work to HonchoSessionManager — no Honcho logic is reimplemented. It uses the existing config chain: $HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars. Lifecycle hooks: - initialize: creates HonchoSessionManager via existing client factory - prefetch: background dialectic query - sync_turn: records messages + flushes to API (threaded) - on_memory_write: mirrors user profile writes as conclusions - on_session_end: flushes all pending messages This is a prerequisite for the MemoryManager wiring in run_agent.py. Once wired, Honcho goes through the same provider interface as all other memory plugins, and the scattered Honcho code in run_agent.py can be consolidated into the single MemoryManager integration point. 
* feat(memory): wire MemoryManager into run_agent.py Adds 8 integration points for the external memory provider plugin, all purely additive (zero existing code modified): 1. Init (~L1130): Create MemoryManager, find matching plugin provider from memory.provider config, initialize with session context 2. Tool injection (~L1160): Append provider tool schemas to self.tools and self.valid_tool_names after memory_manager init 3. System prompt (~L2705): Add external provider's system_prompt_block alongside existing MEMORY.md/USER.md blocks 4. Tool routing (~L5362): Route provider tool calls through memory_manager.handle_tool_call() before the catchall handler 5. Memory write bridge (~L5353): Notify external provider via on_memory_write() when the built-in memory tool writes 6. Pre-compress (~L5233): Call on_pre_compress() before context compression discards messages 7. Prefetch (~L6421): Inject provider prefetch results into the current-turn user message (same pattern as Honcho turn context) 8. Turn sync + session end (~L8161, ~L8172): sync_all() after each completed turn, queue_prefetch_all() for next turn, on_session_end() + shutdown_all() at conversation end All hooks are wrapped in try/except — a failing provider never breaks the agent. The existing memory system, Honcho integration, and all other code paths are completely untouched. Full suite: 7222 passed, 4 pre-existing failures. * refactor(memory): remove legacy Honcho integration from core Extracts all Honcho-specific code from run_agent.py, model_tools.py, toolsets.py, and gateway/run.py. Honcho is now exclusively available as a memory provider plugin (plugins/honcho-memory/). 
Removed from run_agent.py (-457 lines): - Honcho init block (session manager creation, activation, config) - 8 Honcho methods: _honcho_should_activate, _strip_honcho_tools, _activate_honcho, _register_honcho_exit_hook, _queue_honcho_prefetch, _honcho_prefetch, _honcho_save_user_observation, _honcho_sync - _inject_honcho_turn_context module-level function - Honcho system prompt block (tool descriptions, CLI commands) - Honcho context injection in api_messages building - Honcho params from __init__ (honcho_session_key, honcho_manager, honcho_config) - HONCHO_TOOL_NAMES constant - All honcho-specific tool dispatch forwarding Removed from other files: - model_tools.py: honcho_tools import, honcho params from handle_function_call - toolsets.py: honcho toolset definition, honcho tools from core tools list - gateway/run.py: honcho params from AIAgent constructor calls Removed tests (-339 lines): - 9 Honcho-specific test methods from test_run_agent.py - TestHonchoAtexitFlush class from test_exit_cleanup_interrupt.py Restored two regex constants (_SURROGATE_RE, _BUDGET_WARNING_RE) that were accidentally removed during the honcho function extraction. The honcho_integration/ package is kept intact — the plugin delegates to it. tools/honcho_tools.py registry entries are now dead code (import commented out in model_tools.py) but the file is preserved for reference. Full suite: 7207 passed, 4 pre-existing failures. Zero regressions. 
* refactor(memory): restructure plugins, add CLI, clean gateway, migration notice Plugin restructure: - Move all memory plugins from plugins/<name>-memory/ to plugins/memory/<name>/ (byterover, hindsight, holographic, honcho, mem0, openviking, retaindb) - New plugins/memory/__init__.py discovery module that scans the directory directly, loading providers by name without the general plugin system - run_agent.py uses load_memory_provider() instead of get_plugin_memory_providers() CLI wiring: - hermes memory setup — interactive curses picker + config wizard - hermes memory status — show active provider, config, availability - hermes memory off — disable external provider (built-in only) - hermes honcho — now shows migration notice pointing to hermes memory setup Gateway cleanup: - Remove _get_or_create_gateway_honcho (already removed in prev commit) - Remove _shutdown_gateway_honcho and _shutdown_all_gateway_honcho methods - Remove all calls to shutdown methods (4 call sites) - Remove _honcho_managers/_honcho_configs dict references Dead code removal: - Delete tools/honcho_tools.py (279 lines, import was already commented out) - Delete tests/gateway/test_honcho_lifecycle.py (131 lines, tested removed methods) - Remove if False placeholder from run_agent.py Migration: - Honcho migration notice on startup: detects existing honcho.json or ~/.honcho/config.json, prints guidance to run hermes memory setup. Only fires when memory.provider is not set and not in quiet mode. Full suite: 7203 passed, 4 pre-existing failures. Zero regressions. 
* feat(memory): standardize plugin config + add per-plugin documentation Config architecture: - Add save_config(values, hermes_home) to MemoryProvider ABC - Honcho: writes to $HERMES_HOME/honcho.json (SDK native) - Mem0: writes to $HERMES_HOME/mem0.json - Hindsight: writes to $HERMES_HOME/hindsight/config.json - Holographic: writes to config.yaml under plugins.hermes-memory-store - OpenViking/RetainDB/ByteRover: env-var only (default no-op) Setup wizard (hermes memory setup): - Now calls provider.save_config() for non-secret config - Secrets still go to .env via env vars - Only memory.provider activation key goes to config.yaml Documentation: - README.md for each of the 7 providers in plugins/memory/<name>/ - Requirements, setup (wizard + manual), config reference, tools table - Consistent format across all providers The contract for new memory plugins: - get_config_schema() declares all fields (REQUIRED) - save_config() writes native config (REQUIRED if not env-var-only) - Secrets use env_var field in schema, written to .env by wizard - README.md in the plugin directory * docs: add memory providers user guide + developer guide New pages: - user-guide/features/memory-providers.md — comprehensive guide covering all 7 shipped providers (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover). Each with setup, config, tools, cost, and unique features. Includes comparison table and profile isolation notes. - developer-guide/memory-provider-plugin.md — how to build a new memory provider plugin. Covers ABC, required methods, config schema, save_config, threading contract, profile isolation, testing. 
Updated pages: - user-guide/features/memory.md — replaced Honcho section with link to new Memory Providers page - user-guide/features/honcho.md — replaced with migration redirect to the new Memory Providers page - sidebars.ts — added both new pages to navigation * fix(memory): auto-migrate Honcho users to memory provider plugin When honcho.json or ~/.honcho/config.json exists but memory.provider is not set, automatically set memory.provider: honcho in config.yaml and activate the plugin. The plugin reads the same config files, so all data and credentials are preserved. Zero user action needed. Persists the migration to config.yaml so it only fires once. Prints a one-line confirmation in non-quiet mode. * fix(memory): only auto-migrate Honcho when enabled + credentialed Check HonchoClientConfig.enabled AND (api_key OR base_url) before auto-migrating — not just file existence. Prevents false activation for users who disabled Honcho, stopped using it (config lingers), or have ~/.honcho/ from a different tool. * feat(memory): auto-install pip dependencies during hermes memory setup Reads pip_dependencies from plugin.yaml, checks which are missing, installs them via pip before config walkthrough. Also shows install guidance for external_dependencies (e.g. brv CLI for ByteRover). Updated all 7 plugin.yaml files with pip_dependencies: - honcho: honcho-ai - mem0: mem0ai - openviking: httpx - hindsight: hindsight-client - holographic: (none) - retaindb: requests - byterover: (external_dependencies for brv CLI) * fix: remove remaining Honcho crash risks from cli.py and gateway cli.py: removed Honcho session re-mapping block (would crash importing deleted tools/honcho_tools.py), Honcho flush on compress, Honcho session display on startup, Honcho shutdown on exit, honcho_session_key AIAgent param. gateway/run.py: removed honcho_session_key params from helper methods, sync_honcho param, _honcho.shutdown() block. 
tests: fixed test_cron_session_with_honcho_key_skipped (was passing removed honcho_key param to _flush_memories_for_session). * fix: include plugins/ in pyproject.toml package list Without this, plugins/memory/ wouldn't be included in non-editable installs. Hermes always runs from the repo checkout so this is belt-and-suspenders, but prevents breakage if the install method changes. * fix(memory): correct pip-to-import name mapping for dep checks The heuristic dep.replace('-', '_') fails for packages where the pip name differs from the import name: honcho-ai→honcho, mem0ai→mem0, hindsight-client→hindsight_client. Added explicit mapping table so hermes memory setup doesn't try to reinstall already-installed packages. * chore: remove dead code from old plugin memory registration path - hermes_cli/plugins.py: removed register_memory_provider(), _memory_providers list, get_plugin_memory_providers() — memory providers now use plugins/memory/ discovery, not the general plugin system - hermes_cli/main.py: stripped 74 lines of dead honcho argparse subparsers (setup, status, sessions, map, peer, mode, tokens, identity, migrate) — kept only the migration redirect - agent/memory_provider.py: updated docstring to reflect new registration path - tests: replaced TestPluginMemoryProviderRegistration with TestPluginMemoryDiscovery that tests the actual plugins/memory/ discovery system. Added 3 new tests (discover, load, nonexistent). * chore: delete dead honcho_integration/cli.py and its tests cli.py (794 lines) was the old 'hermes honcho' command handler — nobody calls it since cmd_honcho was replaced with a migration redirect. 
Deleted tests that imported from removed code: - tests/honcho_integration/test_cli.py (tested _resolve_api_key) - tests/honcho_integration/test_config_isolation.py (tested CLI config paths) - tests/tools/test_honcho_tools.py (tested the deleted tools/honcho_tools.py) Remaining honcho_integration/ files (actively used by the plugin): - client.py (445 lines) — config loading, SDK client creation - session.py (991 lines) — session management, queries, flush * refactor: move honcho_integration/ into the honcho plugin Moves client.py (445 lines) and session.py (991 lines) from the top-level honcho_integration/ package into plugins/memory/honcho/. No Honcho code remains in the main codebase. - plugins/memory/honcho/client.py — config loading, SDK client creation - plugins/memory/honcho/session.py — session management, queries, flush - Updated all imports: run_agent.py (auto-migration), hermes_cli/doctor.py, plugin __init__.py, session.py cross-import, all tests - Removed honcho_integration/ package and pyproject.toml entry - Renamed tests/honcho_integration/ → tests/honcho_plugin/ * docs: update architecture + gateway-internals for memory provider system - architecture.md: replaced honcho_integration/ with plugins/memory/ - gateway-internals.md: replaced Honcho-specific session routing and flush lifecycle docs with generic memory provider interface docs * fix: update stale mock path for resolve_active_host after honcho plugin migration * fix(memory): address review feedback — P0 lifecycle, ABC contract, honcho CLI restore Review feedback from Honcho devs (erosika): P0 — Provider lifecycle: - Remove on_session_end() + shutdown_all() from run_conversation() tail (was killing providers after every turn in multi-turn sessions) - Add shutdown_memory_provider() method on AIAgent for callers - Wire shutdown into CLI atexit, reset_conversation, gateway stop/expiry Bug fixes: - Remove sync_honcho=False kwarg from /btw callsites (TypeError crash) - Fix doctor.py references to dead 
'hermes honcho setup' command - Cache prefetch_all() before tool loop (was re-calling every iteration) ABC contract hardening (all backwards-compatible): - Add session_id kwarg to prefetch/sync_turn/queue_prefetch - Make on_pre_compress() return str (provider insights in compression) - Add *kwargs to on_turn_start() for runtime context - Add on_delegation() hook for parent-side subagent observation - Document agent_context/agent_identity/agent_workspace kwargs on initialize() (prevents cron corruption, enables profile scoping) - Fix docstring: single external provider, not multiple Honcho CLI restoration: - Add plugins/memory/honcho/cli.py (from main's honcho_integration/cli.py with imports adapted to plugin path) - Restore full hermes honcho command with all subcommands (status, peer, mode, tokens, identity, enable/disable, sync, peers, --target-profile) - Restore auto-clone on profile creation + sync on hermes update - hermes honcho setup now redirects to hermes memory setup fix(memory): wire on_delegation, skip_memory for cron/flush, fix ByteRover return type - Wire on_delegation() in delegate_tool.py — parent's memory provider is notified with task+result after each subagent completes - Add skip_memory=True to cron scheduler (prevents cron system prompts from corrupting user representations — closes #4052) - Add skip_memory=True to gateway flush agent (throwaway agent shouldn't activate memory provider) - Fix ByteRover on_pre_compress() return type: None -> str * fix(honcho): port profile isolation fixes from PR #4632 Ports 5 bug fixes found during profile testing (erosika's PR #4632): 1. 3-tier config resolution — resolve_config_path() now checks $HERMES_HOME/honcho.json → ~/.hermes/honcho.json → ~/.honcho/config.json (non-default profiles couldn't find shared host blocks) 2. Thread host=_host_key() through from_global_config() in cmd_setup, cmd_status, cmd_identity (--target-profile was being ignored) 3. 
Use bare profile name as aiPeer (not host key with dots) — Honcho's peer ID pattern is ^[a-zA-Z0-9_-]+$, dots are invalid 4. Wrap add_peers() in try/except — was fatal on new AI peers, killed all message uploads for the session 5. Gate Honcho clone behind --clone/--clone-all on profile create (bare create should be blank-slate) Also: sanitize assistant_peer_id via _sanitize_id() * fix(tests): add module cleanup fixture to test_cli_provider_resolution test_cli_provider_resolution._import_cli() wipes tools.*, cli, and run_agent from sys.modules to force fresh imports, but had no cleanup. This poisoned all subsequent tests on the same xdist worker — mocks targeting tools.file_tools, tools.send_message_tool, etc. patched the NEW module object while already-imported functions still referenced the OLD one. Caused ~25 cascade failures: send_message KeyError, process_registry FileNotFoundError, file_read_guards timeouts, read_loop_detection file-not-found, mcp_oauth None port, and provider_parity/codex_execution stale tool lists. Fix: autouse fixture saves all affected modules before each test and restores them after, matching the pattern in test_managed_browserbase_and_modal.py.	2026-04-02 15:33:51 -07:00
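The Mem0 hardening in the commit above mentions a circuit breaker that pauses calls for 120s after 5 consecutive API failures and resets after the cooldown. A minimal sketch of that idea, assuming a hypothetical `CircuitBreaker` class with an injectable clock (the plugin's actual structure is not shown in the log):

```python
import time


class CircuitBreaker:
    """Pause calls after N consecutive failures; auto-reset after a cooldown."""

    def __init__(self, threshold=5, cooldown=120.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self._clock = clock  # injectable so the behaviour can be tested
        self._failures = 0
        self._opened_at = None

    def allow(self):
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.cooldown:
            # Cooldown elapsed: close the breaker and start counting afresh.
            self._failures = 0
            self._opened_at = None
            return True
        return False

    def record_success(self):
        self._failures = 0

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.threshold and self._opened_at is None:
            self._opened_at = self._clock()
```

A caller would check `allow()` before each API request and report the outcome with `record_success()` or `record_failure()`.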
Teknium	e0b2bdb089	fix: webhook platform support — skip home channel prompt, disable tool progress (salvage #4363) (#4660) Cherry-picked from PR #4363 by @bennyhodl with follow-up fixes: - Skip 'No home channel' prompt for webhook platform (webhooks deliver to configured targets, not a home channel) - Disable tool progress for webhooks (no message editing support) - Add webhook to PLATFORMS in tools_config.py and skills_config.py - Add hermes-webhook toolset to toolsets.py + hermes-gateway includes - Removed overly aggressive <50 char content filter that blocked legitimate short responses (tool progress already handled at source) Co-authored-by: bennyhodl <bennyhodl@users.noreply.github.com>	2026-04-02 14:00:22 -07:00
kshitijk4poor	20441cf2c8	fix(insights): persist token usage for non-CLI sessions	2026-04-02 10:47:13 -07:00
Teknium	624ad582a5	fix: make gateway approval block agent thread like CLI does (#4557) The gateway's dangerous command approval system was fundamentally broken: the agent loop continued running after a command was flagged, and the approval request only reached the user after the agent finished its entire conversation loop. By then the context was lost. This change makes the gateway approval mirror the CLI's synchronous behavior. When a dangerous command is detected: 1. The agent thread blocks on a threading.Event 2. The approval request is sent to the user immediately 3. The user responds with /approve or /deny 4. The event is signaled and the agent resumes with the real result The agent never sees 'approval_required' as a tool result. It either gets the command output (approved) or a definitive BLOCKED message (denied/timed out) — same as CLI mode. Queue-based design supports multiple concurrent approvals (parallel subagents via delegate_task, execute_code RPC handlers). Each approval gets its own _ApprovalEntry with its own threading.Event. /approve resolves the oldest (FIFO); /approve all resolves all at once. Changes: - tools/approval.py: Queue-based per-session blocking gateway approval (register/unregister callbacks, resolve with FIFO or all-at-once) - gateway/run.py: Register approval callback in run_sync(), remove post-loop pop_pending hack, /approve and /deny support 'all' flag - tests: 21 tests including parallel subagent E2E scenarios	2026-04-02 01:47:19 -07:00
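The blocking approval flow in the commit above (one `threading.Event` per pending approval, resolved FIFO or all at once) can be approximated as below. This is a sketch, not the contents of tools/approval.py: `ApprovalEntry`, `ApprovalQueue`, `request`, `resolve`, and `pending_count` are hypothetical names chosen for illustration.

```python
import threading
from collections import deque


class ApprovalEntry:
    """One pending approval with its own event, as the commit describes."""

    def __init__(self, command):
        self.command = command
        self.event = threading.Event()
        self.approved = False


class ApprovalQueue:
    def __init__(self):
        self._lock = threading.Lock()
        self._pending = deque()

    def pending_count(self):
        with self._lock:
            return len(self._pending)

    def request(self, command, timeout=None):
        """Called on the agent thread; blocks until resolved or timed out."""
        entry = ApprovalEntry(command)
        with self._lock:
            self._pending.append(entry)
        signaled = entry.event.wait(timeout)
        if not signaled:
            with self._lock:
                if entry in self._pending:
                    self._pending.remove(entry)
        # A timeout is treated like a denial.
        return signaled and entry.approved

    def resolve(self, approved, resolve_all=False):
        """Called when the user answers; oldest entry first unless resolve_all."""
        with self._lock:
            if resolve_all:
                entries = list(self._pending)
                self._pending.clear()
            elif self._pending:
                entries = [self._pending.popleft()]
            else:
                entries = []
        for entry in entries:
            entry.approved = approved
            entry.event.set()
        return len(entries)
```

Because the agent thread is parked inside `request()`, it only ever sees the final outcome, which matches the commit's point that the agent never receives an intermediate 'approval_required' result.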
Teknium	16d9f58445	fix(gateway): persist memory flush state to prevent redundant re-flushes on restart (#4481 ) * fix: force-close TCP sockets on client cleanup, detect and recover dead connections When a provider drops connections mid-stream (e.g. OpenRouter outage), httpx's graceful close leaves sockets in CLOSE-WAIT indefinitely. These zombie connections accumulate and can prevent recovery without restarting. Changes: - _force_close_tcp_sockets: walks the httpx connection pool and issues socket.shutdown(SHUT_RDWR) + close() to force TCP RST on every socket when a client is closed, preventing CLOSE-WAIT accumulation - _cleanup_dead_connections: probes the primary client's pool for dead sockets (recv MSG_PEEK), rebuilds the client if any are found - Pre-turn health check at the start of each run_conversation call that auto-recovers with a user-facing status message - Primary client rebuild after stale stream detection to purge pool - User-facing messages on streaming connection failures: "Connection to provider dropped — Reconnecting (attempt 2/3)" "Connection failed after 3 attempts — try again in a moment" Made-with: Cursor * fix: pool entry missing base_url for openrouter, clean error messages - _resolve_runtime_from_pool_entry: add OPENROUTER_BASE_URL fallback when pool entry has no runtime_base_url (pool entries from auth.json credential_pool often omit base_url) - Replace Rich console.print for auth errors with plain print() to prevent ANSI escape code mangling through prompt_toolkit's stdout patch - Force-close TCP sockets on client cleanup to prevent CLOSE-WAIT accumulation after provider outages - Pre-turn dead connection detection with auto-recovery and user message - Primary client rebuild after stale stream detection - User-facing status messages on streaming connection failures/retries Made-with: Cursor * fix(gateway): persist memory flush state to prevent redundant re-flushes on restart The _session_expiry_watcher tracked flushed sessions in an 
in-memory set (_pre_flushed_sessions) that was lost on gateway restart. Expired sessions remained in sessions.json and were re-discovered every restart, causing redundant AIAgent runs that burned API credits and blocked the event loop. Fix: Add a memory_flushed boolean field to SessionEntry, persisted in sessions.json. The watcher sets it after a successful flush. On restart, the flag survives and the watcher skips already-flushed sessions. - Add memory_flushed field to SessionEntry with to_dict/from_dict support - Old sessions.json entries without the field default to False (backward compat) - Remove the ephemeral _pre_flushed_sessions set from SessionStore - Update tests: save/load roundtrip, legacy entry compat, auto-reset behavior	2026-04-01 12:05:02 -07:00
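The persisted flag described in the commit above can be illustrated with a reduced `SessionEntry`. Only the `memory_flushed` field, the `to_dict`/`from_dict` pair, and the default of False for legacy entries come from the commit message; the single `session_id` field and the `sessions_needing_flush` helper are assumptions made to keep the sketch short.

```python
import json
from dataclasses import dataclass


@dataclass
class SessionEntry:
    session_id: str
    memory_flushed: bool = False

    def to_dict(self):
        return {"session_id": self.session_id, "memory_flushed": self.memory_flushed}

    @classmethod
    def from_dict(cls, data):
        # Entries written before the field existed default to False.
        return cls(
            session_id=data["session_id"],
            memory_flushed=bool(data.get("memory_flushed", False)),
        )


def sessions_needing_flush(entries):
    """Hypothetical watcher helper: skip sessions that were already flushed."""
    return [e.session_id for e in entries if not e.memory_flushed]


# Simulate a restart: serialize, reload, and check what still needs flushing.
saved = json.dumps([SessionEntry("a", memory_flushed=True).to_dict(), {"session_id": "legacy"}])
reloaded = [SessionEntry.from_dict(d) for d in json.loads(saved)]
```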
Teknium	bacc86d031	fix: use RedactingFormatter on stderr handler, update types and test mock - stderr handler now uses RedactingFormatter to match file handlers - restart path uses verbose=0 (int) instead of verbose=False (bool) - test mock updated with new run_gateway(verbose, quiet, replace) signature	2026-04-01 11:05:07 -07:00
Alan Justino	5bd01b838c	fix(gateway): wire -v/-q flags to stderr logging By default 'hermes gateway run' now prints WARNING+ to stderr so connection errors and startup failures are visible in the terminal without having to tail ~/.hermes/logs/gateway.log. - gateway/run.py: start_gateway() accepts verbosity: Optional[int]=0. When not None, attaches a StreamHandler to stderr with level mapped from the count (0=WARNING, 1=INFO, 2+=DEBUG). Root logger level is also lowered when DEBUG is requested so records are not swallowed. - hermes_cli/gateway.py: run_gateway() gains verbose: int and quiet: bool params. -q translates to verbosity=None (no stderr handler). Wired through gateway_command(). - hermes_cli/main.py: -v changed from store_true to action=count so -v/-vv/-vvv each increment the level. -q/--quiet added as a new flag. Behaviour summary: hermes gateway run -> WARNING+ on stderr (default) hermes gateway run -q -> silent hermes gateway run -v -> INFO+ hermes gateway run -vv -> DEBUG	2026-04-01 11:05:07 -07:00
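The flag-to-level mapping in the behaviour summary above can be sketched with `argparse` and `logging` from the standard library. The function names `stderr_level` and `parse_verbosity` and the parser's `prog` string are hypothetical; the mapping itself (0=WARNING, 1=INFO, 2+=DEBUG, -q meaning no stderr handler) is taken from the commit message.

```python
import argparse
import logging


def stderr_level(verbosity):
    """Map a -v count to a logging level; None means 'attach no stderr handler'."""
    if verbosity is None:
        return None
    if verbosity <= 0:
        return logging.WARNING
    if verbosity == 1:
        return logging.INFO
    return logging.DEBUG


def parse_verbosity(argv):
    parser = argparse.ArgumentParser(prog="gateway-run-sketch")
    # action="count" lets -v, -vv, -vvv each increment the value.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    parser.add_argument("-q", "--quiet", action="store_true")
    args = parser.parse_args(argv)
    return None if args.quiet else args.verbose
```

Using `None` rather than a negative number for quiet mode keeps "no handler" distinct from "handler at a very high level".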
Nick	9a581bba50	fix(gateway): resume agent after /approve executes blocked command When a dangerous command was blocked and the user approved it via /approve, the command was executed but the agent loop had already exited — the agent never received the command output and the task died silently. Now _handle_approve_command sends immediate feedback to the user, then creates a synthetic continuation message with the command output and feeds it through _handle_message so the agent picks up where it left off. - Send command result to chat immediately via adapter.send() - Create synthetic MessageEvent with command + output as context - Spawn asyncio task to re-invoke agent via _handle_message - Return None (feedback already sent directly) - Add test for agent re-invocation after approval - Update existing approval tests for new return behavior	2026-04-01 01:38:55 -07:00
Teknium	a7f7e87070	fix: preserve credential_pool through smart routing and defer eager fallback on 429 (#4361) Three bugs prevented credential pool rotation from working when multiple Codex OAuth tokens were configured: 1. credential_pool was dropped during smart model turn routing. resolve_turn_route() constructed runtime dicts without it, so the AIAgent was created without pool access. Fixed in smart_model_routing.py (no-route and fallback paths), cli.py, and gateway/run.py. 2. Eager fallback fired before pool rotation on 429. The rate-limit handler at line ~7180 switched to a fallback provider immediately, before _recover_with_credential_pool got a chance to rotate to the next credential. Now deferred when the pool still has credentials. 3. (Non-issue) Retry budget was reported as too small, but successful pool rotations already skip retry_count increment — no change needed. Reported by community member Schinsly who identified all three root causes and verified the fix locally with multiple Codex accounts.	2026-04-01 01:02:34 -07:00
Teknium	84a541b619	feat: support * wildcard in platform allowlists and improve WhatsApp docs * docs: clarify WhatsApp allowlist behavior and document WHATSAPP_ALLOW_ALL_USERS - Add WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to env vars reference - Warn that * is not a wildcard and silently blocks all messages - Show WHATSAPP_ALLOWED_USERS as optional, not required - Update troubleshooting with the * trap and debug mode tip - Fix Security section to mention the allow-all alternative Prompted by a user report in Discord where WHATSAPP_ALLOWED_USERS=* caused all incoming messages to be silently dropped at the bridge level. * feat: support * wildcard in platform allowlists Follow the precedent set by SIGNAL_GROUP_ALLOWED_USERS which already supports * as an allow-all wildcard. Bridge (allowlist.js): matchesAllowedUser() now checks for * in the allowedUsers set before iterating sender aliases. Gateway (run.py): _is_authorized() checks for * in allowed_ids after parsing the allowlist. This is generic — works for all platforms, not just WhatsApp. Updated docs to document * as a supported value instead of warning against it. Added WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to the env vars reference. Tests: JS allowlist test + 2 Python gateway tests (WhatsApp + Telegram to verify cross-platform behavior).	2026-03-31 10:42:03 -07:00
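The gateway-side wildcard check from the commit above could look roughly like this. The comma-separated allowlist format and the helper names `parse_allowlist` and `is_authorized` are assumptions for illustration; the commit only states that `_is_authorized()` checks for `*` in `allowed_ids` after parsing the allowlist.

```python
def parse_allowlist(raw):
    """Split an allowlist string into a set of IDs (comma-separated is assumed)."""
    return {part.strip() for part in (raw or "").split(",") if part.strip()}


def is_authorized(user_id, raw_allowlist):
    allowed_ids = parse_allowlist(raw_allowlist)
    # '*' anywhere in the list means allow everyone, on any platform.
    if "*" in allowed_ids:
        return True
    return str(user_id) in allowed_ids
```

Checking for the wildcard after parsing means values such as `"*"` and `"12345, *"` behave the same way.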
Teknium	8d59881a62	feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647) * feat(auth): add same-provider credential pools and rotation UX Add same-provider credential pooling so Hermes can rotate across multiple credentials for a single provider, recover from exhausted credentials without jumping providers immediately, and configure that behavior directly in hermes setup. - agent/credential_pool.py: persisted per-provider credential pools - hermes auth add/list/remove/reset CLI commands - 429/402/401 recovery with pool rotation in run_agent.py - Setup wizard integration for pool strategy configuration - Auto-seeding from env vars and existing OAuth state Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Salvaged from PR #2647 * fix(tests): prevent pool auto-seeding from host env in credential pool tests Tests for non-pool Anthropic paths and auth remove were failing when host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials were present. The pool auto-seeding picked these up, causing unexpected pool entries in tests. 
- Mock _select_pool_entry in auxiliary_client OAuth flag tests - Clear Anthropic env vars and mock _seed_from_singletons in auth remove test * feat(auth): add thread safety, least_used strategy, and request counting - Add threading.Lock to CredentialPool for gateway thread safety (concurrent requests from multiple gateway sessions could race on pool state mutations without this) - Add 'least_used' rotation strategy that selects the credential with the lowest request_count, distributing load more evenly - Add request_count field to PooledCredential for usage tracking - Add mark_used() method to increment per-credential request counts - Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current() with lock acquisition - Add tests: least_used selection, mark_used counting, concurrent thread safety (4 threads × 20 selects with no corruption) * feat(auth): add interactive mode for bare 'hermes auth' command When 'hermes auth' is called without a subcommand, it now launches an interactive wizard that: 1. Shows full credential pool status across all providers 2. Offers a menu: add, remove, reset cooldowns, set strategy 3. For OAuth-capable providers (anthropic, nous, openai-codex), the add flow explicitly asks 'API key or OAuth login?' — making it clear that both auth types are supported for the same provider 4. Strategy picker shows all 4 options (fill_first, round_robin, least_used, random) with the current selection marked 5. Remove flow shows entries with indices for easy selection The subcommand paths (hermes auth add/list/remove/reset) still work exactly as before for scripted/non-interactive use. * fix(tests): update runtime_provider tests for config.yaml source of truth (#4165) Tests were using OPENAI_BASE_URL env var which is no longer consulted after #4165. Updated to use model config (provider, base_url, api_key) which is the new single source of truth for custom endpoint URLs. 
* feat(auth): support custom endpoint credential pools keyed by provider name Custom OpenAI-compatible endpoints all share provider='custom', making the provider-keyed pool useless. Now pools for custom endpoints are keyed by 'custom:<normalized_name>' where the name comes from the custom_providers config list (auto-generated from URL hostname). - Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)' - load_pool('custom:name') seeds from custom_providers api_key AND model.api_key when base_url matches - hermes auth add/list now shows custom endpoints alongside registry providers - _resolve_openrouter_runtime and _resolve_named_custom_runtime check pool before falling back to single config key - 6 new tests covering custom pool keying, seeding, and listing * docs: add Excalidraw diagram of full credential pool flow Comprehensive architecture diagram showing: - Credential sources (env vars, auth.json OAuth, config.yaml, CLI) - Pool storage and auto-seeding - Runtime resolution paths (registry, custom, OpenRouter) - Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh) - CLI management commands and strategy configuration Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g * fix(tests): update setup wizard pool tests for unified select_provider_and_model flow The setup wizard now delegates to select_provider_and_model() instead of using its own prompt_choice-based provider picker. Tests needed: - Mock select_provider_and_model as no-op (provider pre-written to config) - Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it) - Pre-write model.provider to config so the pool step is reached * docs: add comprehensive credential pool documentation - New page: website/docs/user-guide/features/credential-pools.md Full guide covering quick start, CLI commands, rotation strategies, error recovery, custom endpoint pools, auto-discovery, thread safety, architecture, and storage format. 
- Updated fallback-providers.md to reference credential pools as the first layer of resilience (same-provider rotation before cross-provider) - Added hermes auth to CLI commands reference with usage examples - Added credential_pool_strategies to configuration guide * chore: remove excalidraw diagram from repo (external link only) * refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns - _load_config_safe(): replace 4 identical try/except/import blocks - _iter_custom_providers(): shared generator for custom provider iteration - PooledCredential.extra dict: collapse 11 round-trip-only fields (token_type, scope, client_id, portal_base_url, obtained_at, expires_in, agent_key_id, agent_key_expires_in, agent_key_reused, agent_key_obtained_at, tls) into a single extra dict with __getattr__ for backward-compatible access - _available_entries(): shared exhaustion-check between select and peek - Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical) - SimpleNamespace replaces class _Args boilerplate in auth_commands - _try_resolve_from_custom_pool(): shared pool-check in runtime_provider Net -17 lines. All 383 targeted tests pass. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-31 03:10:01 -07:00
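The thread-safety and `least_used` pieces of the commit above can be sketched in a few lines. This is a simplified illustration, not the contents of agent/credential_pool.py: the class and method names follow the commit message, but the `label` and `exhausted` fields, the constructor signature, and the `_pick` helper are assumptions, and persistence, cooldowns, and the other three strategies are omitted.

```python
import threading
from dataclasses import dataclass


@dataclass
class PooledCredential:
    label: str                # assumed identifier, for illustration only
    request_count: int = 0    # usage counter consulted by least_used
    exhausted: bool = False   # assumed flag; the real pool may track cooldowns


class CredentialPool:
    def __init__(self, entries):
        self._entries = list(entries)
        # One lock guards every read and write of pool state.
        self._lock = threading.Lock()

    def _pick(self):
        # Caller must hold the lock. least_used: lowest request_count wins.
        available = [e for e in self._entries if not e.exhausted]
        if not available:
            return None
        return min(available, key=lambda e: e.request_count)

    def select(self):
        with self._lock:
            return self._pick()

    def mark_used(self, entry):
        with self._lock:
            entry.request_count += 1

    def mark_exhausted_and_rotate(self, entry):
        # Flag the failing credential and hand back the next usable one.
        with self._lock:
            entry.exhausted = True
            return self._pick()
```

A plain `threading.Lock` suffices here because the public methods never call each other while holding it; the shared selection logic lives in the unlocked `_pick` helper.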
Teknium	f890a94c12	refactor: make config.yaml the single source of truth for endpoint URLs (#4165) OPENAI_BASE_URL was written to .env AND config.yaml, creating a dual-source confusion. Users (especially Docker) would see the URL in .env and assume that's where all config lives, then wonder why LLM_MODEL in .env didn't work. Changes: - Remove all 27 save_env_value("OPENAI_BASE_URL", ...) calls across main.py, setup.py, and tools_config.py - Remove OPENAI_BASE_URL env var reading from runtime_provider.py, cli.py, models.py, and gateway/run.py - Remove LLM_MODEL/HERMES_MODEL env var reading from gateway/run.py and auxiliary_client.py — config.yaml model.default is authoritative - Vision base URL now saved to config.yaml auxiliary.vision.base_url (both setup wizard and tools_config paths) - Tests updated to set config values instead of env vars Convention enforced: .env is for SECRETS only (API keys). All other configuration (model names, base URLs, provider selection) lives exclusively in config.yaml.	2026-03-30 22:02:53 -07:00

408 Commits