hermes-agent

Author	SHA1	Message	Date
Teknium	16d9f58445	fix(gateway): persist memory flush state to prevent redundant re-flushes on restart (#4481 ) * fix: force-close TCP sockets on client cleanup, detect and recover dead connections When a provider drops connections mid-stream (e.g. OpenRouter outage), httpx's graceful close leaves sockets in CLOSE-WAIT indefinitely. These zombie connections accumulate and can prevent recovery without restarting. Changes: - _force_close_tcp_sockets: walks the httpx connection pool and issues socket.shutdown(SHUT_RDWR) + close() to force TCP RST on every socket when a client is closed, preventing CLOSE-WAIT accumulation - _cleanup_dead_connections: probes the primary client's pool for dead sockets (recv MSG_PEEK), rebuilds the client if any are found - Pre-turn health check at the start of each run_conversation call that auto-recovers with a user-facing status message - Primary client rebuild after stale stream detection to purge pool - User-facing messages on streaming connection failures: "Connection to provider dropped — Reconnecting (attempt 2/3)" "Connection failed after 3 attempts — try again in a moment" Made-with: Cursor * fix: pool entry missing base_url for openrouter, clean error messages - _resolve_runtime_from_pool_entry: add OPENROUTER_BASE_URL fallback when pool entry has no runtime_base_url (pool entries from auth.json credential_pool often omit base_url) - Replace Rich console.print for auth errors with plain print() to prevent ANSI escape code mangling through prompt_toolkit's stdout patch - Force-close TCP sockets on client cleanup to prevent CLOSE-WAIT accumulation after provider outages - Pre-turn dead connection detection with auto-recovery and user message - Primary client rebuild after stale stream detection - User-facing status messages on streaming connection failures/retries Made-with: Cursor * fix(gateway): persist memory flush state to prevent redundant re-flushes on restart The _session_expiry_watcher tracked flushed sessions in an 
in-memory set (_pre_flushed_sessions) that was lost on gateway restart. Expired sessions remained in sessions.json and were re-discovered every restart, causing redundant AIAgent runs that burned API credits and blocked the event loop. Fix: Add a memory_flushed boolean field to SessionEntry, persisted in sessions.json. The watcher sets it after a successful flush. On restart, the flag survives and the watcher skips already-flushed sessions. - Add memory_flushed field to SessionEntry with to_dict/from_dict support - Old sessions.json entries without the field default to False (backward compat) - Remove the ephemeral _pre_flushed_sessions set from SessionStore - Update tests: save/load roundtrip, legacy entry compat, auto-reset behavior	2026-04-01 12:05:02 -07:00
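The persisted-flag approach described in this commit can be sketched as follows. This is a minimal illustration, not the project's code: the `SessionEntry` here is a hypothetical, trimmed-down stand-in with only the fields needed to show the round trip and the legacy default.

```python
from dataclasses import dataclass, asdict


@dataclass
class SessionEntry:
    # Hypothetical, reduced stand-in for the real SessionEntry.
    session_id: str
    memory_flushed: bool = False  # set by the watcher after a successful flush

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "SessionEntry":
        # Legacy sessions.json entries lack the key, so default to False.
        return cls(
            session_id=data["session_id"],
            memory_flushed=bool(data.get("memory_flushed", False)),
        )
```

Because the flag lives in the serialized entry rather than in an in-memory set, a restarted watcher can skip any entry whose `memory_flushed` is already true.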
Teknium	7e91009018	fix: lazy-init SessionDB on adapter instance instead of per-request Reuse a single SessionDB across requests by caching on self._session_db with lazy initialization. Avoids creating a new SQLite connection per request when X-Hermes-Session-Id is used. Updated tests to set adapter._session_db directly instead of patching the constructor.	2026-04-01 11:41:32 -07:00
txchen	bf19623a53	feat(api-server): support X-Hermes-Session-Id header for session continuity Allow callers to pass X-Hermes-Session-Id in request headers to continue an existing conversation. When provided, history is loaded from SessionDB instead of the request body, and the session_id is echoed in the response header. Without the header, existing behavior is preserved (new uuid per request). This enables web UI clients to maintain thread continuity without modifying any session state themselves — the same mechanism the gateway uses for IM platforms (Telegram, Discord, etc.).	2026-04-01 11:41:32 -07:00
Nick	9a581bba50	fix(gateway): resume agent after /approve executes blocked command When a dangerous command was blocked and the user approved it via /approve, the command was executed but the agent loop had already exited — the agent never received the command output and the task died silently. Now _handle_approve_command sends immediate feedback to the user, then creates a synthetic continuation message with the command output and feeds it through _handle_message so the agent picks up where it left off. - Send command result to chat immediately via adapter.send() - Create synthetic MessageEvent with command + output as context - Spawn asyncio task to re-invoke agent via _handle_message - Return None (feedback already sent directly) - Add test for agent re-invocation after approval - Update existing approval tests for new return behavior	2026-04-01 01:38:55 -07:00
Teknium	84a541b619	feat: support * wildcard in platform allowlists and improve WhatsApp docs * docs: clarify WhatsApp allowlist behavior and document WHATSAPP_ALLOW_ALL_USERS - Add WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to env vars reference - Warn that * is not a wildcard and silently blocks all messages - Show WHATSAPP_ALLOWED_USERS as optional, not required - Update troubleshooting with the * trap and debug mode tip - Fix Security section to mention the allow-all alternative Prompted by a user report in Discord where WHATSAPP_ALLOWED_USERS=* caused all incoming messages to be silently dropped at the bridge level. * feat: support * wildcard in platform allowlists Follow the precedent set by SIGNAL_GROUP_ALLOWED_USERS which already supports * as an allow-all wildcard. Bridge (allowlist.js): matchesAllowedUser() now checks for * in the allowedUsers set before iterating sender aliases. Gateway (run.py): _is_authorized() checks for * in allowed_ids after parsing the allowlist. This is generic — works for all platforms, not just WhatsApp. Updated docs to document * as a supported value instead of warning against it. Added WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to the env vars reference. Tests: JS allowlist test + 2 Python gateway tests (WhatsApp + Telegram to verify cross-platform behavior).	2026-03-31 10:42:03 -07:00
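A rough sketch of the wildcard check described above, assuming the allowlist arrives as a comma-separated string. The function name and signature are illustrative and do not mirror the actual `_is_authorized()` in run.py.

```python
def is_authorized(user_id: str, allowlist: str) -> bool:
    """Return True if user_id is allowed by a comma-separated allowlist."""
    allowed_ids = {part.strip() for part in allowlist.split(",") if part.strip()}
    # "*" acts as an allow-all wildcard, checked before any per-user matching.
    if "*" in allowed_ids:
        return True
    return user_id in allowed_ids
```

Checking for `*` after parsing keeps the behavior identical across platforms, since the parsed set is the same shape regardless of which env var supplied it.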
Teknium	ff78ad4c81	feat: add discord.reactions config option to disable message reactions (#4199) Adds a 'reactions' key under the discord config section (default: true). When set to false, the bot no longer adds 👀/✅/❌ reactions to messages during processing. The config maps to DISCORD_REACTIONS env var following the same pattern as require_mention and auto_thread. Files changed: - hermes_cli/config.py: Add reactions default to DEFAULT_CONFIG - gateway/config.py: Map discord.reactions to DISCORD_REACTIONS env var - gateway/platforms/discord.py: Gate on_processing_start/complete hooks - tests/gateway/test_discord_reactions.py: 3 new tests for config gate	2026-03-31 01:24:48 -07:00
Teknium	83e5249be6	fix(gateway): use setsid instead of systemd-run --user for /update (salvage #4024) (#4104) Salvaged from PR #4024 by @Sertug17. Fixes #4017. - Replace systemd-run --user --scope with setsid for portable session detach - Add system-level service detection to cmd_update gateway restart - Falls back to start_new_session=True on systems without setsid (macOS, minimal containers)	2026-03-30 20:22:09 -07:00
Teknium	cc63b2d1cd	fix(gateway): remove user-facing compression warnings (#4139) Auto-compression still runs silently in the background with server-side logging, but no longer sends messages to the user's chat about it. Removed: - 'Session is large... Auto-compressing' pre-compression notification - 'Compressed: N → M messages' post-compression notification - 'Session is still very large after compression' warning - 'Auto-compression failed' warning - Rate-limit tracking (only existed for these warnings)	2026-03-30 19:17:07 -07:00
Teknium	1e59d4813c	feat(api_server): stream tool progress to Open WebUI (#4092 ) Wire the existing tool_progress_callback through the API server's streaming handler so Open WebUI users see what tool is running. Uses the existing 3-arg callback signature (name, preview, args) that fires at tool start — no changes to run_agent.py needed. Progress appears as inline markdown in the SSE content stream. Inspired by PR #4032 by sroecker, reimplemented to avoid breaking the callback signature used by CLI and gateway consumers.	2026-03-30 18:50:27 -07:00
Teknium	e64b047663	chore: prepare Hermes for Homebrew packaging (#4099) Co-authored-by: Yabuku-xD <78594762+Yabuku-xD@users.noreply.github.com>	2026-03-30 17:34:43 -07:00
Teknium	07746dca0c	fix(matrix): E2EE decryption — request keys, auto-trust devices, retry buffered events (#4083 ) When the Matrix adapter receives encrypted events it can't decrypt (MegolmEvent), it now: 1. Requests the missing room key from other devices via client.request_room_key(event) instead of silently dropping the message 2. Buffers undecrypted events (bounded to 100, 5 min TTL) and retries decryption after each E2EE maintenance cycle when new keys arrive 3. Auto-trusts/verifies all devices after key queries so other clients share session keys with the bot proactively 4. Exports Megolm keys on disconnect and imports them on connect, so session keys survive gateway restarts This addresses the 'could not decrypt event' warnings that caused the bot to miss messages in encrypted rooms.	2026-03-30 17:16:09 -07:00
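The bounded retry buffer in point 2 above (at most 100 events, 5 minute TTL) could look roughly like this. The class and method names are invented for illustration, and the injectable clock exists only to make the behavior easy to check.

```python
import time
from collections import deque


class UndecryptedBuffer:
    """Hypothetical bounded buffer for events awaiting room keys."""

    def __init__(self, max_events=100, ttl_seconds=300.0, clock=time.monotonic):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._items = deque(maxlen=max_events)
        self._ttl = ttl_seconds
        self._clock = clock

    def add(self, event):
        self._items.append((self._clock(), event))

    def take_retryable(self):
        """Empty the buffer and return the events still within the TTL."""
        now = self._clock()
        fresh = [event for ts, event in self._items if now - ts <= self._ttl]
        self._items.clear()
        return fresh
```

A caller would invoke `take_retryable()` after each maintenance cycle and re-add any event that still fails to decrypt.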
Teknium	f007284d05	fix: rate-limit pairing rejection messages to prevent spam (#4081 ) * fix: rate-limit pairing rejection messages to prevent spam When generate_code() returns None (rate limited or max pending), the "Too many pairing requests" message was sent on every subsequent DM with no cooldown. A user sending 30 messages would get 30 rejection replies — reported as potential hack on WhatsApp. Now check _is_rate_limited() before any pairing response, and record rate limit after sending a rejection. Subsequent messages from the same user are silently ignored until the rate limit window expires. * test: add coverage for pairing response rate limiting Follow-up to cherry-picked PR #4042 — adds tests verifying: - Rate-limited users get silently ignored (no response sent) - Rejection messages record rate limit for subsequent suppression --------- Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>	2026-03-30 16:48:00 -07:00
Teknium	7dac75f2ae	fix: prevent context pressure warning spam after compression (#4012 ) * feat: add /yolo slash command to toggle dangerous command approvals Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping all dangerous command approval prompts for the current session. Works in both CLI and gateway (Telegram, Discord, etc.). - /yolo -> ON: all commands auto-approved, no confirmation prompts - /yolo -> OFF: normal approval flow restored The --yolo CLI flag already existed for launch-time opt-in. This adds the ability to toggle mid-session without restarting. Session-scoped — resets when the process ends. Uses the existing HERMES_YOLO_MODE env var that check_all_command_guards() already respects. * fix: prevent context pressure warning spam (agent loop + gateway rate-limit) Two complementary fixes for repeated context pressure warnings spamming gateway users (Telegram, Discord, etc.): 1. Agent-level loop fix (run_agent.py): After compression, only reset _context_pressure_warned if the post-compression estimate is actually below the 85% warning level. Previously the flag was unconditionally reset, causing the warning to re-fire every loop iteration when compression couldn't reduce below 85% of the threshold (e.g. very low threshold like 15%, or system prompt alone exceeds the warning level). 2. Gateway-level rate-limit (gateway/run.py, salvaged from PR #3786): Per-chat_id cooldown of 1 hour on compression warning messages. Both warning paths ('still large after compression' and 'compression failed') are gated. Defense-in-depth — even if the agent-level fix has edge cases, users won't see more than one warning per hour. Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com> --------- Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>	2026-03-30 13:18:21 -07:00
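The agent-level fix amounts to making the flag reset conditional. A minimal sketch, assuming token counts are plain integers and the warning level is 85% of the compression threshold; the function name is hypothetical.

```python
def should_reset_warning(post_compression_tokens: int,
                         threshold_tokens: int,
                         warn_fraction: float = 0.85) -> bool:
    """Reset the 'already warned' flag only if compression got us below the warning level."""
    return post_compression_tokens < threshold_tokens * warn_fraction
```

With an unconditional reset, a session that compression cannot shrink below the warning level would warn again on every loop iteration, which is the spam the commit describes.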
Teknium	1e896b0251	fix: resolve 7 failing CI tests (#3936 ) 1. matrix voice: _on_room_message_media unconditionally overwrote media_urls with the image cache path (always None for non-images), wiping the locally-cached voice path. Now only overrides when cached_path is truthy. 2. cli_tools_command: /tools disable no longer prompts for confirmation (input() removed in earlier commit to fix TUI hang), but tests still expected the old Y/N prompt flow. Updated tests to match current behavior (direct apply + session reset). 3. slack app_mention: connect() was refactored for multi-workspace (creates AsyncWebClient per token), but test only mocked the old self._app.client path. Added AsyncWebClient and acquire_scoped_lock mocks. 4. website_policy: module-level _cached_policy from earlier tests caused fast-path return of None. Added invalidate_cache() before assertion. 5. codex 401 refresh: already passing on current main (fixed by intervening commit).	2026-03-30 08:10:14 -07:00
Teknium	ee61485cac	feat(matrix): support native voice messages via MSC3245 (#3877) * feat(matrix): support native voice messages * fix: skip matrix voice tests when matrix-nio not installed --------- Co-authored-by: Carlos Alberto Pereira Gomes <carlosapgomes@users.noreply.github.com>	2026-03-30 00:02:51 -07:00
Teknium	227601c200	feat(discord): add message processing reactions (salvage #1980 ) (#3871 ) Adds lifecycle hooks to the base platform adapter so Discord (and future platforms) can react to message processing events: 👀 when processing starts ✅ on successful completion (delivery confirmed) ❌ on failure, error, or cancellation Implementation: - base.py: on_processing_start/on_processing_complete hooks with _run_processing_hook error isolation wrapper; delivery tracking via _record_delivery closure for accurate success detection - discord.py: _add_reaction/_remove_reaction helpers + hook overrides - Tests for base hook lifecycle and Discord-specific reactions Co-authored-by: alanwilhelm <alanwilhelm@users.noreply.github.com>	2026-03-29 21:55:23 -07:00
Teknium	839f798b74	feat(telegram): add group mention gating and regex triggers (#3870 ) Adds Discord-style mention gating for Telegram groups: - telegram.require_mention: gate group messages (default: false) - telegram.mention_patterns: regex wake-word triggers - telegram.free_response_chats: bypass gating for specific chats When require_mention is enabled, group messages are accepted only for: - slash commands - replies to the bot - @botusername mentions - regex wake-word pattern matches DMs remain unrestricted. @mention text is stripped before passing to the agent. Invalid regex patterns are ignored with a warning. Config bridges follow the existing Discord pattern (yaml → env vars). Cherry-picked and adapted from PR #1977 by mcleay. Fixed ChatType comparison to work without python-telegram-bot installed (uses string matching instead of enum, consistent with other entity_type checks). Co-authored-by: mcleay <mcleay@users.noreply.github.com>	2026-03-29 21:53:59 -07:00
Teknium	ce2841f3c9	feat(gateway): add WeCom (Enterprise WeChat) platform support (#3847 ) Adds WeCom as a gateway platform adapter using the AI Bot WebSocket gateway for real-time bidirectional communication. No public endpoint or new pip dependencies needed (uses existing aiohttp + httpx). Features: - WebSocket persistent connection with auto-reconnect (exponential backoff) - DM and group messaging with configurable access policies - Media upload/download with AES decryption for encrypted attachments - Markdown rendering, quote context preservation - Proactive + passive reply message modes - Chunked media upload pipeline (512KB chunks) Cherry-picked from PR #1898 by EvilRan with: - Moved to current main (PR was 300 commits behind) - Skipped base.py regressions (reply_to additions are good but belong in a separate PR since they affect all platforms) - Fixed test assertions to match current base class send() signature (reply_to=None kwarg now explicit) - All 16 integration points added surgically to current main - No new pip dependencies (aiohttp + httpx already installed) Fixes #1898 Co-authored-by: EvilRan <EvilRan@users.noreply.github.com>	2026-03-29 21:29:13 -07:00
Teknium	2d264a4562	fix(tests): resolve 10 CI failures across hooks, tiktoken, plugins (#3848 ) test_hooks.py (7 failures): Built-in boot-md hook was always loaded by _register_builtin_hooks(), adding +1 to every expected hook count. Mock out built-in registration in TestDiscoverAndLoad so tests isolate user-hook discovery logic. test_tool_token_estimation.py (2 failures): tiktoken is not in core/[all] dependencies. The estimation function gracefully returns {} when tiktoken is missing, but tests expected non-empty results. Added skipif markers for tests that need tiktoken. test_plugins_cmd.py (1 failure): bare 'hermes plugins' now dispatches to cmd_toggle() (interactive curses UI) instead of cmd_list(). Updated test to match the new behavior.	2026-03-29 20:05:59 -07:00
Teknium	3e2c8c529b	fix(whatsapp): resolve LID↔phone aliases in allowlist matching (#3830) WhatsApp DMs can arrive with LID sender IDs even when WHATSAPP_ALLOWED_USERS is configured with phone numbers. The allowlist check now reads bridge session mapping files (lid-mapping-*.json) to resolve phone↔LID aliases, matching users regardless of which identifier format the message uses. Both the Python gateway (_is_user_authorized) and the Node bridge (allowlist.js) now share the same mapping-file-based resolution logic. Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>	2026-03-29 18:21:50 -07:00
Teknium	ca4907dfbc	feat(gateway): add Feishu/Lark platform support (#3817 ) Adds Feishu (ByteDance's enterprise messaging platform) as a gateway platform adapter with full feature parity: WebSocket + webhook transports, message batching, dedup, rate limiting, rich post/card content parsing, media handling (images/audio/files/video), group @mention gating, reaction routing, and interactive card button support. Cherry-picked from PR #1793 by penwyp with: - Moved to current main (PR was 458 commits behind) - Fixed _send_with_retry shadowing BasePlatformAdapter method (renamed to _feishu_send_with_retry to avoid signature mismatch crash) - Fixed import structure: aiohttp/websockets imported independently of lark_oapi so they remain available when SDK is missing - Fixed get_hermes_home import (hermes_constants, not hermes_cli.config) - Added skip decorators for tests requiring lark_oapi SDK - All 16 integration points added surgically to current main New dependency: lark-oapi>=1.5.3,<2 (optional, pip install hermes-agent[feishu]) Fixes #1788 Co-authored-by: penwyp <penwyp@users.noreply.github.com>	2026-03-29 18:17:42 -07:00
Teknium	0ef80c5f32	fix(whatsapp): reuse persistent aiohttp session across requests (#3818 ) Replace per-request aiohttp.ClientSession() in every WhatsApp adapter method with a single persistent self._http_session, matching the pattern used by Mattermost, HomeAssistant, and SMS adapters. Changes: - Create self._http_session in connect(), close in disconnect() - All bridge HTTP calls (send, edit, send-media, typing, get_chat_info, poll_messages) now use the shared session - Explicitly cancel _poll_task on disconnect() instead of relying solely on self._running = False - Health-check sessions in connect() remain ephemeral (persistent session not yet created at that point) - Remove per-method ImportError guards for aiohttp (always available when gateway runs via [messaging] extras) Salvaged from PR #1851 by Himess. The _poll_task storage was already on main from PR #3267; this adds the disconnect cancellation and the persistent session. Tests: 4 new tests for session close, already-closed skip, poll task cancellation, and done-task skip.	2026-03-29 16:25:20 -07:00
Teknium	38d694f559	fix(gateway): apply home channel env overrides consistently (#3808) Home channel env vars (SLACK_HOME_CHANNEL, SIGNAL_HOME_CHANNEL, etc.) for Slack, Signal, Mattermost, Matrix, Email, and SMS were nested inside the credential-env blocks, so they were ignored when the platform was already configured via config.yaml. Moved the home channel handling outside the credential blocks with a Platform.X in config.platforms guard, matching the existing pattern for Telegram and Discord. Co-authored-by: cutepawss <cutepawss@users.noreply.github.com>	2026-03-29 15:48:51 -07:00
Teknium	8eb70a6885	fix(email): close SMTP and IMAP connections on failure (#3804) SMTP connections in _send_email() and _send_email_with_attachment() leak when login() or send_message() raises before quit() is reached. Both now wrapped in try/finally with a close() fallback if quit() also fails. IMAP connection in _fetch_new_messages() leaks when UID processing raises, since logout() sits after the loop. Restructured with try/finally so logout() runs unconditionally. Co-authored-by: Himess <Himess@users.noreply.github.com>	2026-03-29 15:38:32 -07:00
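The try/finally shape described for the SMTP path can be sketched like this. The helper name is hypothetical; `login`, `send_message`, `quit`, and `close` are the standard `smtplib.SMTP` methods, and the object is passed in so the cleanup logic can be exercised without a server.

```python
import smtplib
from email.message import EmailMessage


def send_and_close(smtp: smtplib.SMTP, user: str, password: str, msg: EmailMessage) -> None:
    """Send one message and always release the connection, even on failure."""
    try:
        smtp.login(user, password)
        smtp.send_message(msg)
    finally:
        try:
            smtp.quit()  # polite shutdown: sends QUIT, then closes
        except Exception:
            smtp.close()  # fallback: just drop the socket if QUIT fails
```

Any exception from `login()` or `send_message()` still propagates to the caller; the `finally` block only guarantees the socket is not leaked.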
kshitij	4c532c153b	fix: URL-encode Signal phone numbers and correct attachment RPC parameter (#3670) Fixes two Signal bugs: 1. SSE connection: URL-encode phone numbers so + isn't interpreted as space (400 Bad Request) 2. Attachment fetch: use 'id' parameter instead of 'attachmentId' (NullPointerException in signal-cli) Also refactors Signal tests with shared helpers.	2026-03-28 23:45:28 -07:00
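The first fix comes down to percent-encoding the leading `+` before it goes into a query string. A small sketch using the standard library; the URL path and the `account` parameter name are assumptions for illustration and are not taken from the commit.

```python
from urllib.parse import quote


def events_url(base_url: str, phone: str) -> str:
    # safe="" forces "+" to be encoded as %2B instead of being read as a space.
    return f"{base_url}/api/v1/events?account={quote(phone, safe='')}"
```

For example, `events_url("http://localhost:8080", "+15551234567")` yields a URL ending in `account=%2B15551234567`.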
Teknium	91b881f931	feat(mattermost): configurable mention behavior — respond without @mention (#3664 ) Adds MATTERMOST_REQUIRE_MENTION and MATTERMOST_FREE_RESPONSE_CHANNELS env vars, matching Discord's existing mention gating pattern. - MATTERMOST_REQUIRE_MENTION=false: respond to all channel messages - MATTERMOST_FREE_RESPONSE_CHANNELS=id1,id2: specific channels where bot responds without @mention even when require_mention is true - DMs always respond regardless of mention settings - @mention is now stripped from message text (clean agent input) 7 new tests for mention gating, free-response channels, DM bypass, and mention stripping. Updated existing test for mention stripping. Docs: updated mattermost.md with Mention Behavior section, environment-variables.md with new vars, config.py with metadata.	2026-03-28 22:17:43 -07:00
nguyen binh	c6e2e486bf	fix: add download retry to cache_audio_from_url matching cache_image_from_url (#3401 ) PR #3323 added retry with exponential backoff to cache_image_from_url but missed the sibling function cache_audio_from_url 18 lines below in the same file. A single transient 429/5xx/timeout loses voice messages while image downloads now survive them. Apply the same retry pattern: 3 attempts with 1.5s exponential backoff, immediate raise on non-retryable 4xx.	2026-03-28 17:28:38 -07:00
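The retry pattern described here (3 attempts, 1.5s exponential backoff, immediate raise on non-retryable 4xx) might be sketched as below. The exception class, the doubling schedule, and the injectable `fetch` and `sleep` callables are all assumptions made for the example, not the actual code of `cache_audio_from_url`.

```python
import time


class HTTPStatusError(Exception):
    """Hypothetical error carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(exc: Exception) -> bool:
    if isinstance(exc, TimeoutError):
        return True
    if isinstance(exc, HTTPStatusError):
        return exc.status == 429 or exc.status >= 500
    return False


def fetch_with_retry(fetch, attempts=3, base_delay=1.5, sleep=time.sleep):
    """Call fetch() up to `attempts` times, backing off 1.5s, 3.0s, ..."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as exc:
            # Permanent errors and the final attempt propagate immediately.
            if not is_retryable(exc) or attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Sharing one helper between the image and audio download paths would avoid the drift this commit fixes, though whether the project does so is not stated.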
Teknium	dabe3c34cc	feat(webhook): hermes webhook CLI + skill for event-driven subscriptions (#3578 ) Adds 'hermes webhook' CLI subcommand and a skill — zero new model tools. CLI commands (require webhook platform to be enabled): hermes webhook subscribe <name> [--events, --prompt, --deliver, ...] hermes webhook list hermes webhook remove <name> hermes webhook test <name> All commands gate on webhook platform being enabled in config. If not configured, prints setup instructions (gateway setup wizard, manual config.yaml, or env vars). The agent uses these via terminal tool, guided by the webhook-subscriptions skill which documents setup, common patterns (GitHub, Stripe, CI/CD, monitoring), prompt template syntax, security, and troubleshooting. Adapter enhancement: webhook.py hot-reloads dynamic subscriptions from ~/.hermes/webhook_subscriptions.json on each incoming request (mtime-gated). Static config.yaml routes always take precedence. Docs: updated webhooks.md with Dynamic Subscriptions section, added hermes webhook to cli-commands.md reference. No new model tools. No toolset changes. 24 new tests for CLI CRUD, persistence, enabled-gate, and adapter dynamic route loading.	2026-03-28 14:33:35 -07:00
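The mtime-gated hot reload mentioned for the adapter can be illustrated as follows. This is a sketch under assumptions: the class, the flat JSON-object file layout, and the merge helper are invented for the example, and only the ideas (reload when mtime changes, static routes win) come from the commit message.

```python
import json
import os


class DynamicRoutes:
    """Hypothetical cache that re-reads a JSON file only when its mtime changes."""

    def __init__(self, path: str):
        self.path = path
        self._mtime_ns = None
        self._routes = {}

    def load(self) -> dict:
        try:
            mtime_ns = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            self._mtime_ns, self._routes = None, {}
            return self._routes
        if mtime_ns != self._mtime_ns:  # file changed since the last request
            with open(self.path, encoding="utf-8") as fh:
                self._routes = json.load(fh)
            self._mtime_ns = mtime_ns
        return self._routes


def merge_routes(static: dict, dynamic: dict) -> dict:
    # Static config.yaml routes take precedence over dynamic subscriptions.
    return {**dynamic, **static}
```

Calling `load()` on each incoming request costs one `stat()` in the common case where the file has not changed.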
Teknium	708f187549	fix(gateway): exit with failure when all platforms fail with retryable errors (#3592) When all messaging platforms exhaust retries and get queued for background reconnection, exit with code 1 so systemd Restart=on-failure can restart the process. Previously the gateway stayed alive as a zombie with no connected platforms and exit code 0. Salvaged from PR #3567 by kelsia14. Test updates added. Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>	2026-03-28 14:25:12 -07:00
Teknium	d7c41f3cef	fix(telegram): honor proxy env vars in fallback transport (salvage #3411) (#3591) * fix: keep gateway running through telegram proxy failures - continue gateway startup in degraded mode when Telegram cannot connect yet - ensure Telegram fallback transport also honors proxy env vars - support reconnect retries without taking down the whole gateway * test(telegram): cover proxy env handling in fallback transport --------- Co-authored-by: kufufu9 <pi@local>	2026-03-28 14:23:27 -07:00
Teknium	d6b4fa2e9f	fix: strip @botname from commands so /new@TigerNanoBot resolves correctly (#3581) Commands sent directly to the bot in groups include @botname suffix (e.g. /compress@TigerNanoBot). get_command() now strips the @anything part before lookup, matching how Telegram bot menu generates commands. Fixes all slash commands silently doing nothing when sent with @mention. Co-authored-by: MacroAnarchy <MacroAnarchy@users.noreply.github.com>	2026-03-28 14:01:01 -07:00
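A minimal sketch of the suffix stripping described above. The real `get_command()` is a method whose signature is not shown in the log, so this free function and its lowercasing are assumptions for illustration only.

```python
from typing import Optional


def get_command(text: str) -> Optional[str]:
    """Extract the command name from a message, ignoring any @botname suffix."""
    if not text.startswith("/"):
        return None
    first_token = text.split(maxsplit=1)[0]       # e.g. "/compress@TigerNanoBot"
    name = first_token[1:].split("@", 1)[0]       # drop "/" and everything after "@"
    return name.lower() or None
```

With this, `/new@TigerNanoBot` and `/new` resolve to the same lookup key.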
Teknium	df1bf0a209	feat(api-server): add basic security headers (#3576) Add X-Content-Type-Options: nosniff and Referrer-Policy: no-referrer to all API server responses via a new security_headers_middleware. Co-authored-by: Oktay Aydin <aydnOktay@users.noreply.github.com>	2026-03-28 14:00:52 -07:00
Teknium	49a49983e4	feat(api-server): add Access-Control-Max-Age to CORS preflight responses (#3580) Adds Access-Control-Max-Age: 600 to CORS preflight responses, telling browsers to cache the preflight for 10 minutes. Reduces redundant OPTIONS requests and improves perceived latency for browser-based API clients. Salvaged from PR #3514 by aydnOktay. Co-authored-by: aydnOktay <xaydinoktay@gmail.com>	2026-03-28 14:00:03 -07:00
Teknium	e97c0cb578	fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support * feat: GPT tool-use steering + strip budget warnings from history Two changes to improve tool reliability, especially for OpenAI GPT models: 1. GPT tool-use enforcement prompt: Adds GPT_TOOL_USE_GUIDANCE to the system prompt when the model name contains 'gpt' and tools are loaded. This addresses a known behavioral pattern where GPT models describe intended actions ('I will run the tests') instead of actually making tool calls. Inspired by similar steering in OpenCode (beast.txt) and Cline (GPT-5.1 variant). 2. Budget warning history stripping: Budget pressure warnings injected by _get_budget_warning() into tool results are now stripped when conversation history is replayed via run_conversation(). Previously, these turn-scoped signals persisted across turns, causing models to avoid tool calls in all subsequent messages after any turn that hit the 70-90% iteration threshold. * fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support Prep for the upcoming profiles feature — each profile is a separate HERMES_HOME directory, so all paths must respect the env var. Fixes: - gateway/platforms/matrix.py: Matrix E2EE store was hardcoded to ~/.hermes/matrix/store, ignoring HERMES_HOME. Now uses get_hermes_home() so each profile gets its own Matrix state. - gateway/platforms/telegram.py: Two locations reading config.yaml via Path.home()/.hermes instead of get_hermes_home(). DM topic thread_id persistence and hot-reload would read the wrong config in a profile. - tools/file_tools.py: Security path for hub index blocking was hardcoded to ~/.hermes, would miss the actual profile's hub cache. - hermes_cli/gateway.py: Service naming now uses the profile name (hermes-gateway-coder) instead of a cryptic hash suffix. Extracted _profile_suffix() helper shared by systemd and launchd. 
- hermes_cli/gateway.py: Launchd plist path and Label now scoped per profile (ai.hermes.gateway-coder.plist). Previously all profiles would collide on the same plist file on macOS. - hermes_cli/gateway.py: Launchd plist now includes HERMES_HOME in EnvironmentVariables — was missing entirely, making custom HERMES_HOME broken on macOS launchd (pre-existing bug). - All launchctl commands in gateway.py, main.py, status.py updated to use get_launchd_label() instead of hardcoded string. Test fixes: DM topic tests now set HERMES_HOME env var alongside Path.home() mock. Launchd test uses get_launchd_label() for expected commands.	2026-03-28 13:51:08 -07:00
Teknium	09ebf8b252	feat(api-server): add /v1/health alias for OpenAI compatibility (#3572) Add GET /v1/health as an alias to the existing /health endpoint so OpenAI-compatible health checks work out of the box. Co-authored-by: Oktay Aydin <aydnOktay@users.noreply.github.com>	2026-03-28 13:32:39 -07:00
Teknium	393929831e	fix(gateway): preserve transcript on /compress and hygiene compression (salvage #3516 ) (#3556 ) * fix(gateway): preserve full transcript on /compress instead of overwriting The /compress command calls _compress_context() which correctly ends the old session (preserving its full transcript in SQLite) and creates a new session_id for the continuation. However, it then immediately called rewrite_transcript() on the OLD session_id, overwriting the preserved transcript with the compressed version — destroying searchable history. Auto-compression (triggered by context pressure) does not have this bug because the gateway already handles the session_id swap via the agent.session_id != session_id check after _run_agent_sync. Fix: after _compress_context creates the new session, write the compressed messages into the NEW session_id and update the session store pointer. The old session's full transcript stays intact and searchable via session_search. Before: /compress destroys original messages, session_search can't find details from compressed portions. After: /compress behaves like /new for history — full transcript preserved, compressed context for the live session. * fix(gateway): preserve transcript on /compress and hygiene compression Apply session_id swap after _compress_context in both /compress handler and hygiene pre-compression. _compress_context creates a new session (ending the old one), but both paths were calling rewrite_transcript on the OLD session_id — overwriting the preserved transcript and destroying searchable history. Now follows the same pattern as the auto-compression handler (lines 5415-5423): detect the new session_id, update the session store entry, and write compressed messages to the new session. Also fix FakeCompressAgent test mock to include session_id attribute and simulate the session_id change that real _compress_context performs. 
Co-authored-by: MacroAnarchy <MacroAnarchy@users.noreply.github.com> --------- Co-authored-by: MacroAnarchy <MacroAnarchy@users.noreply.github.com>	2026-03-28 12:23:43 -07:00
Teknium	be322efdf2	fix(matrix): harden e2ee access-token handling (#3562) * fix(matrix): harden e2ee access-token handling * fix: patch nio mock in e2ee maintenance sync loop test The sync_loop now imports nio for SyncError checking (from PR #3280), so the test needs to inject a fake nio module via sys.modules. --------- Co-authored-by: Cortana <andrew+cortana@chalkley.org>	2026-03-28 12:13:35 -07:00
Teknium	411e3c1539	fix(api-server): allow Idempotency-Key in CORS headers (#3530) Browser clients using the Idempotency-Key header for request deduplication were blocked by CORS preflight because the header was not listed in Access-Control-Allow-Headers. Add Idempotency-Key to _CORS_HEADERS and add tests for both the new header allowance and the existing Vary: Origin behavior. Co-authored-by: aydnOktay <aydnOktay@users.noreply.github.com> Co-authored-by: Hermes Agent <hermes@nousresearch.com>	2026-03-28 08:16:41 -07:00
Teknium	290c71a707	fix(gateway): scope progress thread fallback to Slack only (salvage #3414 ) (#3488 ) * test(gateway): map fixture adapter by platform in progress threading tests * fix(gateway): scope progress thread fallback to Slack only --------- Co-authored-by: EmpireOperating <258363005+EmpireOperating@users.noreply.github.com>	2026-03-27 22:37:53 -07:00
Teknium	f57ebf52e9	fix(api-server): cancel orphaned agent + true interrupt on SSE disconnect (salvage #3399 ) (#3427 ) Salvage of #3399 by @binhnt92 with true agent interruption added on top. When a streaming /v1/chat/completions client disconnects mid-stream, the agent is now interrupted via agent.interrupt() so it stops making LLM API calls, and the asyncio task wrapper is cancelled. Closes #3399.	2026-03-27 11:33:19 -07:00
Teknium	41d9d08078	fix(telegram): fall back to no thread_id on 'Message thread not found' (#3390 ) python-telegram-bot's BadRequest inherits from NetworkError, so the send() retry loop was catching 'Message thread not found' as a transient network error and retrying 3 times before silently failing. This killed all tool progress messages, streaming responses, and typing indicators when the incoming message carried an invalid message_thread_id. Now detect BadRequest inside the NetworkError handler: - 'thread not found' + thread_id set → clear thread_id and retry once (message still reaches the chat, just without topic threading) - Other BadRequest errors → raise immediately (permanent, don't retry) - True NetworkError → retry as before (transient) 252 silent failures in gateway.log traced to this on 2026-03-26. 5 new tests for thread fallback, non-thread BadRequest, no-thread sends, network retry, and multi-chunk fallback.	2026-03-27 06:07:28 -07:00
Teknium	75fcbc44ce	feat(telegram): auto-discover fallback IPs via DoH when api.telegram.org is unreachable (#3376 ) * feat(telegram): auto-discover fallback IPs via DoH when api.telegram.org is unreachable On some networks (university, corporate), api.telegram.org resolves to a valid Telegram IP that is unreachable due to routing/firewall rules. A different IP in the same Telegram-owned 149.154.160.0/20 block works fine. This adds automatic fallback IP discovery at connect time: 1. Query Google and Cloudflare DNS-over-HTTPS for api.telegram.org A records 2. Exclude the system-DNS IP (the unreachable one), use the rest as fallbacks 3. If DoH is also blocked, fall back to a seed list (149.154.167.220) 4. TelegramFallbackTransport tries primary first, sticks to whichever works No configuration needed — works automatically. TELEGRAM_FALLBACK_IPS env var still available as manual override. Zero impact on healthy networks (primary path succeeds on first attempt, fallback never exercised). No new dependencies (uses httpx already in deps + stdlib socket). * fix: share transport instance and downgrade seed fallback log to info - Use single TelegramFallbackTransport shared between request and get_updates_request so sticky IP is shared across polling and API calls - Keep separate HTTPXRequest instances (different timeout settings) - Downgrade "using seed fallback IPs" from warning to info to avoid noisy logs on healthy networks * fix: add telegram.request mock and discovery fixture to remaining test files The original PR missed test_dm_topics.py and test_telegram_network_reconnect.py — both need the telegram.request mock module. The reconnect test also needs _no_auto_discovery since _handle_polling_network_error calls connect() which now invokes discover_fallback_ips(). --------- Co-authored-by: Mohan Qiao <Gavin-Qiao@users.noreply.github.com>	2026-03-27 04:03:13 -07:00
Teknium	a2847ea7f0	fix(gateway): add media download retry to Mattermost, Slack, and base cache (#3323 ) * fix(gateway): add media download retry to Mattermost, Slack, and base cache Media downloads on Mattermost and Slack fail permanently on transient errors (timeouts, 429 rate limits, 5xx server errors). Telegram and WhatsApp already have retry logic, but these platforms had single-attempt downloads with hardcoded 30s timeouts. Changes: - base.py cache_image_from_url: add retry with exponential backoff (covers Signal and any platform using the shared cache helper) - mattermost.py _send_media_url: retry on 429/5xx/timeout (3 attempts) - slack.py _download_slack_file: retry on timeout/5xx (3 attempts) - slack.py _download_slack_file_bytes: same retry pattern * test: add tests for media download retry --------- Co-authored-by: dieutx <dangtc94@gmail.com>	2026-03-26 19:33:18 -07:00
Teknium	58ca875e19	feat(gateway): surface session config on /new, /reset, and auto-reset (#3321 ) When a new session starts in the gateway (via /new, /reset, or auto-reset), send the user a summary of the detected configuration: ✨ Session reset! Starting fresh. ◆ Model: qwen3.5:27b-q4_K_M ◆ Provider: custom ◆ Context: 8K tokens (config) ◆ Endpoint: http://localhost:11434/v1 This makes misconfigured context length immediately visible — a user running a local 8K model that falls to the 128K default will see: ◆ Context: 128K tokens (default — set model.context_length in config to override) Instead of silently getting no compression and degrading responses. - _format_session_info() resolves model, provider, context length, and endpoint from config + runtime, matching the hygiene code's resolution chain - Local/custom endpoints shown; cloud endpoints hidden (not useful) - Context source annotated: config, detected, or default with hint - Appended to /new and /reset responses, and auto-reset notifications - 9 tests covering all formatting paths and failure resilience Addresses the user-facing side of #2708 — instead of trying to fix every edge case in context detection, surface the values so users can immediately see when something is wrong.	2026-03-26 19:27:58 -07:00
Teknium	22cfad157b	fix: gateway token double-counting — use absolute set instead of increment (#3317 ) The gateway's update_session() used += for token counts, but the cached agent's session_prompt_tokens / session_completion_tokens are cumulative totals that grow across messages. Each update_session call re-added the running total, inflating usage stats with every message (1.7x after 3 messages, worse over longer conversations). Fix: change += to = for in-memory entry fields, add set_token_counts() to SessionDB that uses direct assignment instead of SQL increment, and switch the gateway to call it. CLI mode continues using update_token_counts() (increment) since it tracks per-API-call deltas — that path is unchanged. Based on analysis from PR #3222 by @zaycruz (closed). Co-authored-by: zaycruz <zay@users.noreply.github.com>	2026-03-26 19:13:07 -07:00
Teknium	a8df7f9964	fix: gateway token double-counting with cached agents (#3306 ) The cached agent accumulates session_input_tokens across messages, so run_conversation() returns cumulative totals. But update_session() used += (increment), double-counting on every message after the first. - session.py: change in-memory entry updates from += to = (direct assignment for cumulative values) - hermes_state.py: add absolute=True flag to update_token_counts() that uses SET column = ? instead of SET column = column + ? - session.py: pass absolute=True to the DB call CLI path is unchanged — it passes per-API-call deltas directly to update_token_counts() with the default absolute=False (increment). Reported by @zaycruz in #3222. Closes #3222.	2026-03-26 19:04:53 -07:00
Teknium	005786c55d	fix(gateway): include per-platform ALLOW_ALL and SIGNAL_GROUP in startup allowlist check (#3313 ) The startup warning 'No user allowlists configured' only checked GATEWAY_ALLOW_ALL_USERS and per-platform _ALLOWED_USERS vars. It missed SIGNAL_GROUP_ALLOWED_USERS and per-platform _ALLOW_ALL_USERS vars (e.g. TELEGRAM_ALLOW_ALL_USERS), causing a false warning even when users had these configured. The actual auth check in _is_user_authorized already recognized these vars. Cherry-picked from PR #3202 by binhnt92. Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>	2026-03-26 18:23:49 -07:00
Teknium	f008ee1019	fix(session): preserve reasoning fields in rewrite_transcript (#3311 ) rewrite_transcript (used by /retry, /undo, /compress) was calling append_message without reasoning, reasoning_details, or codex_reasoning_items — permanently dropping them from SQLite. Co-authored-by: alireza78a <alireza78.crypto@gmail.com>	2026-03-26 18:18:00 -07:00
Teknium	18d28c63a7	fix: add explicit hermes-api-server toolset for API server platform (#3304 ) The API server adapter was creating agents without specifying enabled_toolsets, causing ALL tools to load — including clarify, send_message, and text_to_speech which don't work without interactive callbacks or gateway dispatch. Changes: - toolsets.py: Add hermes-api-server toolset (core tools minus clarify, send_message, text_to_speech) - api_server.py: Resolve toolsets from config.yaml platform_toolsets via _get_platform_tools() — same path as all other gateway platforms. Falls back to hermes-api-server default when no override configured. - tools_config.py: Add api_server to PLATFORMS dict so users can customize via 'hermes tools' or platform_toolsets.api_server in config.yaml - 12 tests covering toolset definition, config resolution, and user override Reported by thatwolfieguy on Discord.	2026-03-26 18:02:26 -07:00
Teknium	0375b2a0d7	fix(gateway): silence background agent terminal output (#3297 ) * fix(gateway): silence flush agent terminal output quiet_mode=True only suppresses AIAgent init messages. Tool call output still leaks to the terminal through _safe_print → _print_fn during session reset/expiry. Since #2670 injected live memory state into the flush prompt, the flush agent now reliably calls memory tools — making the output leak noticeable for the first time. Set _print_fn to a no-op so the background flush is fully silent. * test(gateway): add test for flush agent terminal silence + fix dotenv mock - Add TestFlushAgentSilenced: verifies _print_fn is set to a no-op on the flush agent so tool output never leaks to the terminal - Fix pre-existing test failures: replace patch('run_agent.AIAgent') with sys.modules mock to avoid importing run_agent (requires openai) - Add autouse _mock_dotenv fixture so all tests in this file run without the dotenv package installed * fix(display): route KawaiiSpinner output through print_fn to fully silence flush agent The previous fix set tmp_agent._print_fn = no-op on the flush agent but spinner output and quiet-mode cute messages bypassed _print_fn entirely: - KawaiiSpinner captured sys.stdout at __init__ and wrote directly to it - quiet-mode tool results used builtin print() instead of _safe_print() Add optional print_fn parameter to KawaiiSpinner.__init__; _write routes through it when set. Pass self._print_fn to all spinner construction sites in run_agent.py and change the quiet-mode cute message print to _safe_print. The existing gateway fix (tmp_agent._print_fn = lambda) now propagates correctly through both paths. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(gateway): silence hygiene and compression background agents Two more background AIAgent instances in the gateway were created with quiet_mode=True but without _print_fn = no-op, causing tool output to leak to the terminal: - _hyg_agent (in-turn hygiene memory agent) - tmp_agent (_compress_context path) Apply the same _print_fn no-op pattern used for the flush agent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore(display): remove unused _last_flush_time from KawaiiSpinner Attribute was set but never read; upstream already removed it. Leftover from conflict resolution during rebase onto upstream/main. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-26 17:40:31 -07:00

263 Commits
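
The token double-counting fix described in commits 22cfad157b and a8df7f9964 above can be illustrated with a minimal sketch. It assumes, as those messages state, that the cached agent reports cumulative totals rather than per-call deltas. The `SessionEntry` class and both update helpers are hypothetical names for illustration only; they are not the project's actual `update_session()` or `SessionDB` code, and the token numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class SessionEntry:
    """Hypothetical in-memory session record (illustration only)."""
    prompt_tokens: int = 0


def update_incremental(entry: SessionEntry, reported: int) -> None:
    # Pre-fix behaviour: treats the reported value as a delta and adds it.
    entry.prompt_tokens += reported


def update_absolute(entry: SessionEntry, reported: int) -> None:
    # Post-fix behaviour: the reported value is already a running total,
    # so it replaces the stored count instead of being added to it.
    entry.prompt_tokens = reported


# Made-up cumulative totals reported after each of three messages.
cumulative_totals = [100, 250, 400]

buggy = SessionEntry()
fixed = SessionEntry()
for total in cumulative_totals:
    update_incremental(buggy, total)
    update_absolute(fixed, total)

print(buggy.prompt_tokens)  # 750: each running total was re-added
print(fixed.prompt_tokens)  # 400: matches the last reported total
```

Incrementing remains correct for a caller that passes per-call deltas, which is why the commit messages leave the CLI path unchanged.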