hermes-agent

Author	SHA1	Message	Date
Teknium	52b3a3ca3a	fix: default Telegram reactions to off, remove dead _remove_reaction Telegram's set_message_reaction replaces all reactions in one call, so _remove_reaction was never called (unlike Discord's additive model). Default reactions to disabled — users opt in via telegram.reactions: true.	2026-04-07 17:55:55 -07:00
Alvaro Linares	74b0072f8f	feat(telegram): add message reactions on processing start/complete Mirror the Discord reaction pattern for Telegram: - 👀 (eyes) when message processing begins - ✅ (check) on successful completion - ❌ (cross) on failure Controlled via TELEGRAM_REACTIONS env var or telegram.reactions in config.yaml (enabled by default, like Discord). Uses python-telegram-bot's Bot.set_message_reaction() API. Failures are caught and logged at debug level so they never break message processing.	2026-04-07 17:55:55 -07:00
Teknium	469cd16fe0	fix(security): consolidated security hardening — SSRF, timing attack, tar traversal, credential leakage (#5944 ) Salvaged from PRs #5800 (memosr), #5806 (memosr), #5915 (Ruzzgar), #5928 (Awsh1). Changes: - Use hmac.compare_digest for API key comparison (timing attack prevention) - Apply provider env var blocklist to Docker containers (credential leakage) - Replace tar.extractall() with safe extraction in TerminalBench2 (CVE-2007-4559) - Add SSRF protection via is_safe_url to ALL platform adapters: base.py (cache_image_from_url, cache_audio_from_url), discord, slack, telegram, matrix, mattermost, feishu, wecom (Signal and WhatsApp protected via base.py helpers) - Update tests: mock is_safe_url in Mattermost download tests - Add security tests for tar extraction (traversal, symlinks, safe files)	2026-04-07 17:28:37 -07:00
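A minimal sketch of the constant-time API key comparison described in the commit above. The function name `api_key_matches` is hypothetical; only `hmac.compare_digest` comes from the commit message.

```python
import hmac


def api_key_matches(provided: str, expected: str) -> bool:
    """Compare two API keys without leaking how many characters matched."""
    # Encode to bytes first: compare_digest only accepts str inputs that are
    # pure ASCII, while bytes work for any key material.
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))
```

A plain `provided == expected` can return as soon as the first differing character is found, which is the timing side channel the commit closes.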
Teknium	125e5ef089	fix: extend caption substring fix to all platforms Move _merge_caption helper from TelegramAdapter to BasePlatformAdapter so all adapters inherit it. Fix the same substring-containment bug in: - gateway/platforms/base.py (photo burst merging) - gateway/run.py (priority photo follow-up merging) - gateway/platforms/feishu.py (media batch merging) The original fix only covered telegram.py. The same bug existed in base.py and run.py (pure substring check) and feishu.py (list membership without whitespace normalization).	2026-04-07 14:08:59 -07:00
Dilee	4a630c2071	fix(telegram): replace substring caption check with exact line-by-line match Captions in photo bursts and media group albums were silently dropped when a shorter caption happened to be a substring of an existing one (e.g. "Meeting" lost inside "Meeting agenda"). Extract a shared _merge_caption static helper that splits on "\n\n" and uses exact match with whitespace normalisation, then use it in both _enqueue_photo_event and _queue_media_group_event. Adds 13 unit tests covering the fixed bug scenarios. Cherry-picked from PR #2671 by Dilee.	2026-04-07 14:08:59 -07:00
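The exact-match merge described above could look roughly like the sketch below. It is an illustration under assumptions, not the repository's `_merge_caption`: the signature and the handling of empty inputs are guesses, while the `"\n\n"` split and the whitespace-normalised exact match come from the commit message.

```python
from typing import Optional


def merge_caption(existing: Optional[str], new: Optional[str]) -> Optional[str]:
    """Append `new` to `existing` unless an identical caption is already present."""
    new_norm = (new or "").strip()
    if not new_norm:
        return existing  # nothing to add
    if not existing or not existing.strip():
        return new_norm  # first caption in the burst
    # Exact match per caption block, not substring containment.
    parts = [part.strip() for part in existing.split("\n\n")]
    if new_norm in parts:
        return existing
    return existing + "\n\n" + new_norm
```

With a substring check, `"Meeting"` would be treated as already present in `"Meeting agenda"` and dropped; here it is kept as a separate block.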
Teknium	1a2a03ca69	feat(gateway): approval buttons for Slack & Telegram + Slack thread context (#5890 ) Slack: - Add Block Kit interactive buttons for command approval (Allow Once, Allow Session, Always Allow, Deny) via send_exec_approval() - Register @app.action handlers for each approval button - Add _fetch_thread_context() — fetches thread history via conversations.replies when bot is first @mentioned mid-thread - Fix _has_active_session_for_thread() to use build_session_key() instead of manual key construction (fixes session key mismatch bug where thread_sessions_per_user flag was ignored, ref PR #5833) Telegram: - Add InlineKeyboard approval buttons via send_exec_approval() - Add ea:* callback handling in _handle_callback_query() - Uses monotonic counter + _approval_state dict to map button clicks back to session keys (avoids 64-byte callback_data limit) Both platforms now auto-detected by the gateway runner's _approval_notify_sync() — any adapter with send_exec_approval() on its class gets button-based approval instead of text fallback. Inspired by community PRs #3898 (LevSky22), #2953 (ygd58), #5833 (heathley). Implemented fresh on current main. Tests: 24 new tests covering button rendering, action handling, thread context fetching, session key fix, double-click prevention.	2026-04-07 11:03:14 -07:00
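One way to picture the counter-based mapping that keeps Telegram `callback_data` under 64 bytes is sketched below. The class name, the `ea:<choice>:<id>` layout, and the choice labels are assumptions for illustration; the commit only states that a monotonic counter and an `_approval_state` dict map button clicks back to session keys.

```python
import itertools
from typing import Dict, Optional, Tuple


class ApprovalRegistry:
    """Maps short numeric IDs to (possibly long) session keys."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)       # monotonic counter
        self._pending: Dict[int, str] = {}   # approval id -> session key

    def callback_data(self, session_key: str) -> Dict[str, str]:
        """Return one short callback_data string per button choice."""
        approval_id = next(self._ids)
        self._pending[approval_id] = session_key
        choices = ("once", "session", "always", "deny")  # hypothetical labels
        return {choice: f"ea:{choice}:{approval_id}" for choice in choices}

    def resolve(self, data: str) -> Optional[Tuple[str, str]]:
        """Return (choice, session_key), or None if unknown or already used."""
        parts = data.split(":")
        if len(parts) != 3 or parts[0] != "ea" or not parts[2].isdecimal():
            return None
        # pop() makes a second click on the same prompt a no-op.
        session_key = self._pending.pop(int(parts[2]), None)
        if session_key is None:
            return None
        return parts[1], session_key
```

Because only the counter value travels in the button payload, the session key can be arbitrarily long without hitting the limit.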
Teknium	d0ffb111c2	refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821 ) Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture) and manual analysis of the entire codebase. Changes by category: Unused imports removed (~95 across 55 files): - Removed genuinely unused imports from all major subsystems - agent/, hermes_cli/, tools/, gateway/, plugins/, cron/ - Includes imports in try/except blocks that were truly unused (vs availability checks which were left alone) Unused variables removed (~25): - Removed dead variables: connected, inner, channels, last_exc, source, new_server_names, verify, pconfig, default_terminal, result, pending_handled, temperature, loop - Dropped unused argparse subparser assignments in hermes_cli/main.py (12 instances of add_parser() where result was never used) Dead code removed: - run_agent.py: Removed dead ternary (None if False else None) and surrounding unreachable branch in identity fallback - run_agent.py: Removed write-only attribute _last_reported_tool - hermes_cli/providers.py: Removed dead @property decorator on module-level function (decorator has no effect outside a class) - gateway/run.py: Removed unused MCP config load before reconnect - gateway/platforms/slack.py: Removed dead SessionSource construction Undefined name bugs fixed (would cause NameError at runtime): - batch_runner.py: Added missing logger = logging.getLogger(__name__) - tools/environments/daytona.py: Added missing Dict and Path imports Unnecessary global statements removed (14): - tools/terminal_tool.py: 5 functions declared global for dicts they only mutated via .pop()/[key]=value (no rebinding) - tools/browser_tool.py: cleanup thread loop only reads flag - tools/rl_training_tool.py: 4 functions only do dict mutations - tools/mcp_oauth.py: only reads the global - hermes_time.py: only reads cached values Inefficient patterns fixed: - startswith/endswith tuple form: 15 instances of 
x.startswith('a') or x.startswith('b') consolidated to x.startswith(('a', 'b')) - len(x)==0 / len(x)>0: 13 instances replaced with pythonic truthiness checks (not x / bool(x)) - in dict.keys(): 5 instances simplified to in dict - Redefined unused name: removed duplicate _strip_mdv2 import in send_message_tool.py Other fixes: - hermes_cli/doctor.py: Replaced undefined logger.debug() with pass - hermes_cli/config.py: Consolidated chained .endswith() calls Test results: 3934 passed, 17 failed (all pre-existing on main), 19 skipped. Zero regressions.	2026-04-07 10:25:31 -07:00
Teknium	3bc2fe802e	feat(telegram): paginated model picker with Next/Prev navigation - Raise max_models from 8 to 50 so all curated models come through - Add _build_model_keyboard() helper with 8-per-page pagination - Next ▶ / ◀ Prev buttons with page counter (e.g. 2/4) - mg:<page> callback data for page navigation - Catch-all query.answer() for noop buttons	2026-04-06 23:10:40 -07:00
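The 8-per-page navigation above can be sketched without any Telegram dependency. The helper name, the zero-based page index in `mg:<page>`, and the `noop` payload for the counter button are assumptions; the page size, button labels, and `mg:` prefix come from the commit message.

```python
import math
from typing import List, Tuple

PAGE_SIZE = 8


def paginate_models(
    models: List[str], page: int, page_size: int = PAGE_SIZE
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return the models on `page` plus (label, callback_data) nav buttons."""
    total_pages = max(1, math.ceil(len(models) / page_size))
    page = min(max(page, 0), total_pages - 1)  # clamp out-of-range pages
    start = page * page_size
    items = models[start:start + page_size]

    nav: List[Tuple[str, str]] = []
    if page > 0:
        nav.append(("◀ Prev", f"mg:{page - 1}"))
    nav.append((f"{page + 1}/{total_pages}", "noop"))  # page counter button
    if page < total_pages - 1:
        nav.append(("Next ▶", f"mg:{page + 1}"))
    return items, nav
```

The counter button carries a payload that does nothing, which is why a catch-all `query.answer()` is needed to dismiss the client's loading spinner.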
Teknium	5a2cf280a3	feat: interactive model picker for Telegram and Discord (#5742 ) /model with no args now shows an interactive UI on Telegram and Discord instead of a text list: Telegram: Inline keyboard buttons — two-step drill-down. Step 1: Provider buttons with model counts (e.g. 'OpenRouter (15)') Step 2: Model buttons within the selected provider Edits the same message in-place as the user navigates. Back/Cancel buttons for navigation. Discord: Embed + Select dropdown menus via discord.ui.View. Step 1: Provider dropdown with model counts Step 2: Model dropdown within the selected provider Back/Cancel buttons. Auth-gated to allowed users. Platforms without picker support (Slack, WhatsApp, Signal, etc.) fall back to the existing text list. /model <name> continues to work as a direct text switch on all platforms — the interactive picker is only for bare /model. Implementation: - TelegramAdapter.send_model_picker() + _handle_model_picker_callback() with compact callback_data (mp:/mm:/mb/mx, all within 64-byte limit) - DiscordAdapter.send_model_picker() + ModelPickerView (discord.ui.View) with Select menus (up to 25 options per dropdown) - GatewayRunner._handle_model_command() detects adapter capability via getattr(type(adapter), 'send_model_picker', None) (safe with mocks) and sends picker with async callback closure for the switch logic - Callback performs full switch: switch_model(), cached agent update, session override, pending model note — same as /model <name>	2026-04-06 23:00:04 -07:00
KangYu	77610961be	Lower Telegram fallback activation log to info	2026-04-06 16:49:30 -07:00
Teknium	89c812d1d2	feat: shared thread sessions by default — multi-user thread support (#5391 ) Threads (Telegram forum topics, Discord threads, Slack threads) now default to shared sessions where all participants see the same conversation. This is the expected UX for threaded conversations where multiple users @mention the bot and interact collaboratively. Changes: - build_session_key(): when thread_id is present, user_id is no longer appended to the session key (threads are shared by default) - New config: thread_sessions_per_user (default: false) — opt-in to restore per-user isolation in threads if needed - Sender attribution: messages in shared threads are prefixed with [sender name] so the agent can tell participants apart - System prompt: shared threads show 'Multi-user thread' note instead of a per-turn User line (avoids busting prompt cache) - Wired through all callers: gateway/run.py, base.py, telegram.py, feishu.py - Regular group messages (no thread) remain per-user isolated (unchanged) - DM threads are unaffected (they have their own keying logic) Closes community request from demontut_ re: thread-based shared sessions.	2026-04-05 19:46:58 -07:00
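A rough sketch of the keying rule described above, assuming a simple colon-joined key. The key layout and parameter list are hypothetical and the real `build_session_key()` may differ; DM keying, which the commit says has its own logic, is left out.

```python
from typing import Optional


def build_session_key(
    platform: str,
    chat_id: str,
    user_id: Optional[str] = None,
    thread_id: Optional[str] = None,
    thread_sessions_per_user: bool = False,
) -> str:
    """Build a session key; threads are shared unless per-user mode is on."""
    parts = [platform, str(chat_id)]
    if thread_id is not None:
        parts.append(f"thread:{thread_id}")
        # Shared by default: user_id is appended only when the opt-in is set.
        if thread_sessions_per_user and user_id is not None:
            parts.append(f"user:{user_id}")
    elif user_id is not None:
        # Regular group messages (no thread) stay isolated per user.
        parts.append(f"user:{user_id}")
    return ":".join(parts)
```

Two users writing in the same thread therefore resolve to the same key, while the same two users in an unthreaded group chat do not.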
kshitijk4poor	1d2e34c7eb	Prevent Telegram polling handoffs and flood-control send failures Telegram polling can inherit a stale webhook registration when a deployment switches transport modes, which leaves getUpdates idle even though the gateway starts cleanly. Outbound send also treats Telegram retry_after responses as terminal errors, so brief flood control can drop tool progress and replies. Constraint: Keep the PR narrowly scoped to upstream/main Telegram adapter behavior Rejected: Port OpenClaw's broader polling supervisor and offset persistence \| too broad for an isolated fix PR Confidence: high Scope-risk: narrow Reversibility: clean Directive: Polling mode should clear webhook state before starting getUpdates, and send-path retry logic must distinguish flood control from timeouts Tested: uv run --extra dev pytest tests/gateway/test_telegram_* -q Not-tested: Live Telegram webhook-to-polling migration and real Bot API 429 behavior	2026-04-05 11:59:28 -07:00
Teknium	0c54da8aaf	feat(gateway): live-stream /update output + interactive prompt buttons (#5180 ) * feat(gateway): live-stream /update output + forward interactive prompts Adds real-time output streaming and interactive prompt forwarding for the gateway /update command, so users on Telegram/Discord/etc see the full update progress and can respond to prompts (stash restore, config migration) without needing terminal access. Changes: hermes_cli/main.py: - Add --gateway flag to 'hermes update' argparse - Add _gateway_prompt() file-based IPC function that writes .update_prompt.json and polls for .update_response - Modify _restore_stashed_changes() to accept optional input_fn parameter for gateway mode prompt forwarding - cmd_update() uses _gateway_prompt when --gateway is set, enabling interactive stash restore and config migration prompts gateway/run.py: - _handle_update_command: spawn with --gateway flag and PYTHONUNBUFFERED=1 for real-time output flushing - Store session_key in .update_pending.json for cross-restart session matching - Add _update_prompt_pending dict to track sessions awaiting update prompt responses - Replace _watch_for_update_completion with _watch_update_progress: streams output chunks every ~4s, detects .update_prompt.json and forwards prompts to the user, handles completion/failure/timeout - Add update prompt interception in _handle_message: when a prompt is pending, the user's next message is written to .update_response instead of being processed normally - Preserve _send_update_notification as legacy fallback for post-restart cases where adapter isn't available yet File-based IPC protocol: - .update_prompt.json: written by update process with prompt text, default value, and unique ID - .update_response: written by gateway with user's answer - .update_output.txt: existing, now streamed in real-time - .update_exit_code: existing completion marker Tests: 16 new tests covering _gateway_prompt IPC, output streaming, prompt detection/forwarding, 
message interception, and cleanup. * feat: interactive buttons for update prompts (Telegram + Discord) Telegram: Inline keyboard with ✓ Yes / ✗ No buttons. Clicking a button answers the callback query, edits the message to show the choice, and writes .update_response directly. CallbackQueryHandler registered on the update_prompt: prefix. Discord: UpdatePromptView (discord.ui.View) with green Yes / red No buttons. Follows the ExecApprovalView pattern — auth check, embed color update, disabled-after-click. Writes .update_response on click. All platforms: /approve and /deny (and /yes, /no) now work as shorthand for yes/no when an update prompt is pending. The text fallback message instructs users to use these commands. Raw message interception still works as a fallback for non-command responses. Gateway watcher checks adapter for send_update_prompt method (class-level check to avoid MagicMock false positives) and falls back to text prompt with /approve instructions when unavailable. * fix: block /update on non-messaging platforms (API, webhooks, ACP) Add _UPDATE_ALLOWED_PLATFORMS frozenset that explicitly lists messaging platforms where /update is permitted. API server, webhook, and ACP platforms get a clear error directing them to run hermes update from the terminal instead. ACP and API server already don't reach _handle_message (separate codepaths), and webhooks have distinct session keys that can't collide with messaging sessions. This guard is belt-and-suspenders.	2026-04-05 00:28:58 -07:00
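The file-based prompt exchange described above might be sketched as follows. The file names come from the commit message; the function signature, JSON field names, timeout, and polling interval are assumptions.

```python
import json
import time
import uuid
from pathlib import Path


def gateway_prompt(
    state_dir: Path,
    prompt: str,
    default: str = "",
    timeout: float = 300.0,
    poll_interval: float = 0.5,
) -> str:
    """Write a prompt file, then poll for the gateway's response file."""
    prompt_file = state_dir / ".update_prompt.json"
    response_file = state_dir / ".update_response"
    payload = {"prompt": prompt, "default": default, "id": uuid.uuid4().hex}
    prompt_file.write_text(json.dumps(payload), encoding="utf-8")

    deadline = time.monotonic() + timeout
    try:
        while time.monotonic() < deadline:
            if response_file.exists():
                answer = response_file.read_text(encoding="utf-8").strip()
                response_file.unlink()
                return answer or default
            time.sleep(poll_interval)
        return default  # nobody answered in time
    finally:
        # Remove the prompt so the watcher does not forward it again.
        prompt_file.unlink(missing_ok=True)
```

The gateway side is the mirror image: it watches for `.update_prompt.json`, forwards the text to the user, and writes the reply to `.update_response`.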
Teknium	85cefc7a5a	fix(telegram): prevent duplicate message delivery on send timeout (#5153 ) TimedOut is a subclass of NetworkError in python-telegram-bot. The inner retry loop in send() and the outer _send_with_retry() in base.py both treated it as a transient connection error and retried — but send_message is not idempotent. When the request reaches Telegram but the HTTP response times out, the message is already delivered. Retrying sends duplicates. Worst case: up to 9 copies (inner 3x × outer 3x). Inner loop (telegram.py): - Import TimedOut separately, isinstance-check before generic NetworkError retry (same pattern as BadRequest carve-out from #3390) - Re-raise immediately — no retry - Mark as retryable=False in outer exception handler Outer loop (base.py): - Remove 'timeout', 'timed out', 'readtimeout', 'writetimeout' from _RETRYABLE_ERROR_PATTERNS (read/write timeouts are delivery-ambiguous) - Add 'connecttimeout' (safe — connection never established) - Keep 'network' (other platforms still need it) - Add _is_timeout_error() + early return to prevent plain-text fallback on timeout errors (would also cause duplicate delivery) Connection errors (ConnectionReset, ConnectError, etc.) are still retried — these fail before the request reaches the server. Credit: tmdgusya (PR #3899), barun1997 (PR #3904) for identifying the bug and proposing fixes. Closes #3899, closes #3904.	2026-04-04 19:05:34 -07:00
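The outer-loop classification described above can be illustrated with plain string matching. The pattern tuples below are an approximation assembled from the commit message, not the actual contents of `_RETRYABLE_ERROR_PATTERNS`, and the function names are hypothetical.

```python
# Errors that fail before the request reaches the server: safe to retry.
RETRYABLE_PATTERNS = ("connecttimeout", "connectionreset", "connecterror", "network")
# Errors where the request may already have been delivered: never retry.
AMBIGUOUS_TIMEOUT_PATTERNS = ("timed out", "readtimeout", "writetimeout", "timeout")


def is_timeout_error(error_text: str) -> bool:
    """True for read/write timeouts, where delivery status is unknown."""
    lowered = error_text.lower()
    if "connecttimeout" in lowered:
        return False  # connection was never established
    return any(pattern in lowered for pattern in AMBIGUOUS_TIMEOUT_PATTERNS)


def should_retry(error_text: str) -> bool:
    """Retry only when the send cannot have reached the server."""
    if is_timeout_error(error_text):
        return False
    lowered = error_text.lower()
    return any(pattern in lowered for pattern in RETRYABLE_PATTERNS)
```

The asymmetry follows from `send_message` not being idempotent: a retry after a read timeout risks a duplicate, while a retry after a failed connect does not.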
Dolf	1cae9ac628	feat(telegram): add group_topics skill binding for supergroup forum topics Reads config.extra['group_topics'] to bind skills to specific thread_ids in supergroup/forum chats. Mirrors the dm_topics skill injection pattern but for group chat_type. Enables per-topic skill auto-loading in Falcon HQ. Config format: platforms.telegram.extra.group_topics: - chat_id: -1003853746818 topics: - name: FalconConnect thread_id: 5 skill: falconconnect-architecture	2026-04-03 18:20:50 -07:00
kshitijk4poor	28380e7aed	fix(gateway): STT config resolution, stream consumer flood control fallback Three targeted fixes from user-reported issues: 1. STT config resolution (transcription_tools.py): _has_openai_audio_backend() and _resolve_openai_audio_client_config() now check stt.openai.api_key/base_url in config.yaml FIRST, before falling back to env vars. Fixes voice transcription breaking when using a custom OpenAI-compatible endpoint via config.yaml. 2. Stream consumer flood control fallback (stream_consumer.py): When an edit fails mid-stream (e.g., Telegram flood control returns failure for waits >5s), reset _already_sent to False so the normal final send path delivers the complete response. Previously, a truncated partial was left as the final message. 3. Telegram edit_message comment alignment (telegram.py): Clarify that long flood waits return failure so streaming can fall back to a normal final send.	2026-04-03 00:50:17 -07:00
kshitijk4poor	69f85a4dce	fix(gateway): race condition, photo media loss, and flood control in Telegram Three bugs causing intermittent silent drops, partial responses, and flood control delays on the Telegram platform: 1. Race condition in handle_message() — _active_sessions was set inside the background task, not before create_task(). Two rapid messages could both pass the guard and spawn duplicate processing tasks. Fix: set _active_sessions synchronously before spawning the task (grammY sequentialize / aiogram EventIsolation pattern). 2. Photo media loss on dequeue — when a photo (no caption) was queued during active processing and later dequeued, only .text was extracted. Empty text → message silently dropped. Fix: _build_media_placeholder() creates text context for media-only events so they survive the dequeue path. 3. Progress message edits triggered Telegram flood control — rapid tool calls edited the progress message every 0.3s, hitting Telegram's rate limit (23s+ waits). This blocked progress updates and could cause stream consumer timeouts. Fix: throttle edits to 1.5s minimum interval, detect flood control errors and gracefully degrade to new messages. edit_message() now returns failure for flood waits >5s instead of blocking.	2026-04-03 00:50:17 -07:00
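The first fix above, claiming the session before spawning the task, can be shown with a small asyncio sketch. The `Dispatcher` class and its attribute names are hypothetical stand-ins for the adapter's `handle_message()` and `_active_sessions`.

```python
import asyncio
from typing import Dict, List, Optional, Set


class Dispatcher:
    def __init__(self) -> None:
        self.active_sessions: Set[str] = set()
        self.pending: Dict[str, List[str]] = {}
        self.started: List[str] = []

    async def handle_message(self, session_key: str, text: str) -> Optional[asyncio.Task]:
        if session_key in self.active_sessions:
            # A task already owns this session: queue instead of spawning.
            self.pending.setdefault(session_key, []).append(text)
            return None
        # Claim the session synchronously, before create_task() or any await,
        # so a second message arriving immediately sees the guard.
        self.active_sessions.add(session_key)
        return asyncio.create_task(self._process(session_key, text))

    async def _process(self, session_key: str, text: str) -> None:
        try:
            self.started.append(text)
            await asyncio.sleep(0)  # stand-in for the real agent work
        finally:
            self.active_sessions.discard(session_key)
```

If the `add()` call lived inside `_process`, both messages would pass the guard, because a newly created task does not run until the event loop gets control back.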
SHL0MS	83dec2b3ec	fix: skip empty/whitespace text in Telegram send to prevent 400 errors Telegram API returns HTTP 400 when sent whitespace-only or empty text. Add a guard at the top of send() to silently succeed on blank content instead of crashing. Equivalent to OpenClaw #56620.	2026-03-31 19:10:26 -07:00
Teknium	60ecde8ac7	fix: fit all 100 commands in Telegram menu with 40-char descriptions (#4010) * fix: truncate skill descriptions to 100 chars in Telegram menu * fix: 40-char desc cap + 100 command limit for Telegram menu setMyCommands has an undocumented total payload size limit. 50 commands with 256-char descriptions failed, 50 with 100-char descriptions worked, and 100 with 40-char descriptions also worked (~5300 total chars). Truncate skill descriptions to 40 chars in the menu picker and set the cap back to 100. Full descriptions are available via /commands.	2026-03-30 11:21:13 -07:00
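A sketch of the truncation and cap described above. The function name and the `...` suffix are assumptions; the limits of 100 commands and 40 characters come from the commit message.

```python
from typing import List, Tuple

MAX_MENU_COMMANDS = 100
MAX_DESCRIPTION_CHARS = 40


def build_menu(commands: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Cap the menu at 100 entries and shorten each description to 40 chars."""
    menu = []
    for name, description in commands[:MAX_MENU_COMMANDS]:
        if len(description) > MAX_DESCRIPTION_CHARS:
            # Reserve three characters for the truncation marker.
            description = description[:MAX_DESCRIPTION_CHARS - 3].rstrip() + "..."
        menu.append((name, description))
    return menu
```

Commands beyond the cap are simply left out of the picker; per the commit, they remain reachable through `/commands`.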
Teknium	da3e22bcfa	fix: cap Telegram menu at 50 commands — API rejects above ~60 (#4006 ) * fix: use SKILLS_DIR not repo path for Telegram menu skill filter Skills are synced to ~/.hermes/skills/ (SKILLS_DIR), not the repo's skills/ directory. The previous filter compared against the repo path so no skills matched. Now checks SKILLS_DIR and excludes .hub/ subdirectory (user-installed hub skills). * fix: cap Telegram menu at 50 commands — API rejects above ~60 Telegram's setMyCommands returns BOT_COMMANDS_TOO_MUCH when registering close to 100 commands despite docs claiming 100 is the limit. Metadata overhead causes rejection above ~60. Cap at 50 for reliability — remaining commands accessible via /commands.	2026-03-30 11:05:20 -07:00
Teknium	5ceed021dc	feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap (#3934 ) * feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap Map active skills to Telegram's slash command menu so users can discover and invoke skills directly. Three changes: 1. Telegram menu now includes active skill commands alongside built-in commands, capped at 100 entries (Telegram Bot API limit). Overflow commands remain callable but hidden from the picker. Logged at startup when cap is hit. 2. New /commands [page] gateway command for paginated browsing of all commands + skills. /help now shows first 10 skill commands and points to /commands for the full list. 3. When a user types a slash command that matches a disabled or uninstalled skill, they get actionable guidance: - Disabled: 'Enable it with: hermes skills config' - Optional (not installed): 'Install with: hermes skills install official/<path>' Built on ideas from PR #3921 by @kshitijk4poor. * chore: move 21 niche skills to optional-skills Move specialized/niche skills from built-in (skills/) to optional (optional-skills/) to reduce the default skill count. Users can install them with: hermes skills install official/<category>/<name> Moved skills (21): - mlops: accelerate, chroma, faiss, flash-attention, hermes-atropos-environments, huggingface-tokenizers, instructor, lambda-labs, llava, nemo-curator, pinecone, pytorch-lightning, qdrant, saelens, simpo, slime, tensorrt-llm, torchtitan - research: domain-intel, duckduckgo-search - devops: inference-sh cli Built-in skills: 96 → 75 Optional skills: 22 → 43 * fix: only include repo built-in skills in Telegram menu, not user-installed User-installed skills (from hub or manually added) stay accessible via /skills and by typing the command directly, but don't get registered in the Telegram slash command picker. Only skills whose SKILL.md is under the repo's skills/ directory are included in the menu. 
This keeps the Telegram menu focused on the curated built-in set while user-installed skills remain discoverable through /skills and /commands.	2026-03-30 10:57:30 -07:00
Teknium	649d149438	feat(telegram): add webhook mode as alternative to polling (#3880) When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook server (via python-telegram-bot's start_webhook()) instead of long polling. This enables cloud platforms like Fly.io and Railway to auto-wake suspended machines on inbound HTTP traffic. Polling remains the default; behavior is unchanged unless the env var is set. Env vars: TELEGRAM_WEBHOOK_URL (public HTTPS URL for Telegram to push to), TELEGRAM_WEBHOOK_PORT (local listen port, default 8443), TELEGRAM_WEBHOOK_SECRET (secret token for update verification). Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all current main enhancements (network error recovery, polling conflict detection, DM topics setup). Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>	2026-03-29 22:36:07 -07:00
Teknium	839f798b74	feat(telegram): add group mention gating and regex triggers (#3870 ) Adds Discord-style mention gating for Telegram groups: - telegram.require_mention: gate group messages (default: false) - telegram.mention_patterns: regex wake-word triggers - telegram.free_response_chats: bypass gating for specific chats When require_mention is enabled, group messages are accepted only for: - slash commands - replies to the bot - @botusername mentions - regex wake-word pattern matches DMs remain unrestricted. @mention text is stripped before passing to the agent. Invalid regex patterns are ignored with a warning. Config bridges follow the existing Discord pattern (yaml → env vars). Cherry-picked and adapted from PR #1977 by mcleay. Fixed ChatType comparison to work without python-telegram-bot installed (uses string matching instead of enum, consistent with other entity_type checks). Co-authored-by: mcleay <mcleay@users.noreply.github.com>	2026-03-29 21:53:59 -07:00
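The gating rules listed above can be sketched as a pure function. The function names, parameter list, and case-insensitive matching are assumptions; the accepted message kinds and the string-based chat type check come from the commit message.

```python
import re
from typing import Iterable, List, Pattern


def compile_mention_patterns(patterns: Iterable[str]) -> List[Pattern[str]]:
    """Compile wake-word regexes, skipping invalid ones."""
    compiled = []
    for raw in patterns:
        try:
            compiled.append(re.compile(raw, re.IGNORECASE))
        except re.error:
            continue  # the commit logs a warning here
    return compiled


def should_process(
    text: str,
    chat_type: str,
    *,
    require_mention: bool,
    bot_username: str,
    is_reply_to_bot: bool = False,
    patterns: Iterable[Pattern[str]] = (),
    chat_id: str = "",
    free_response_chats: frozenset = frozenset(),
) -> bool:
    """Decide whether a message should reach the agent."""
    if chat_type == "private" or not require_mention:
        return True  # DMs are never gated
    if chat_id in free_response_chats:
        return True
    if text.startswith("/") or is_reply_to_bot:
        return True
    if f"@{bot_username}".lower() in text.lower():
        return True
    return any(pattern.search(text) for pattern in patterns)
```

Comparing `chat_type` as a string mirrors the commit's note about working without python-telegram-bot installed.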
Teknium	b60cfd6ce6	fix(telegram): gracefully handle deleted reply targets (#3858 ) * fix: add gpt-5.4-mini to Codex fallback catalog * fix(telegram): gracefully handle deleted reply targets When a user deletes their message while Hermes is processing, Telegram returns BadRequest 'Message to be replied not found'. Previously this was an unhandled permanent error causing silent delivery failure. Now clears reply_to_id and retries so the response is still delivered, matching the existing 'thread not found' recovery pattern. Inspired by PR #3231 by @heathley. Fixes #3229. --------- Co-authored-by: Clippy <clippy@grads.flow> Co-authored-by: Nigel Gibbs <heathley@users.noreply.github.com>	2026-03-29 20:47:07 -07:00
Teknium	e97c0cb578	fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support * feat: GPT tool-use steering + strip budget warnings from history Two changes to improve tool reliability, especially for OpenAI GPT models: 1. GPT tool-use enforcement prompt: Adds GPT_TOOL_USE_GUIDANCE to the system prompt when the model name contains 'gpt' and tools are loaded. This addresses a known behavioral pattern where GPT models describe intended actions ('I will run the tests') instead of actually making tool calls. Inspired by similar steering in OpenCode (beast.txt) and Cline (GPT-5.1 variant). 2. Budget warning history stripping: Budget pressure warnings injected by _get_budget_warning() into tool results are now stripped when conversation history is replayed via run_conversation(). Previously, these turn-scoped signals persisted across turns, causing models to avoid tool calls in all subsequent messages after any turn that hit the 70-90% iteration threshold. * fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support Prep for the upcoming profiles feature — each profile is a separate HERMES_HOME directory, so all paths must respect the env var. Fixes: - gateway/platforms/matrix.py: Matrix E2EE store was hardcoded to ~/.hermes/matrix/store, ignoring HERMES_HOME. Now uses get_hermes_home() so each profile gets its own Matrix state. - gateway/platforms/telegram.py: Two locations reading config.yaml via Path.home()/.hermes instead of get_hermes_home(). DM topic thread_id persistence and hot-reload would read the wrong config in a profile. - tools/file_tools.py: Security path for hub index blocking was hardcoded to ~/.hermes, would miss the actual profile's hub cache. - hermes_cli/gateway.py: Service naming now uses the profile name (hermes-gateway-coder) instead of a cryptic hash suffix. Extracted _profile_suffix() helper shared by systemd and launchd. 
- hermes_cli/gateway.py: Launchd plist path and Label now scoped per profile (ai.hermes.gateway-coder.plist). Previously all profiles would collide on the same plist file on macOS. - hermes_cli/gateway.py: Launchd plist now includes HERMES_HOME in EnvironmentVariables — was missing entirely, making custom HERMES_HOME broken on macOS launchd (pre-existing bug). - All launchctl commands in gateway.py, main.py, status.py updated to use get_launchd_label() instead of hardcoded string. Test fixes: DM topic tests now set HERMES_HOME env var alongside Path.home() mock. Launchd test uses get_launchd_label() for expected commands.	2026-03-28 13:51:08 -07:00
Teknium	41d9d08078	fix(telegram): fall back to no thread_id on 'Message thread not found' (#3390 ) python-telegram-bot's BadRequest inherits from NetworkError, so the send() retry loop was catching 'Message thread not found' as a transient network error and retrying 3 times before silently failing. This killed all tool progress messages, streaming responses, and typing indicators when the incoming message carried an invalid message_thread_id. Now detect BadRequest inside the NetworkError handler: - 'thread not found' + thread_id set → clear thread_id and retry once (message still reaches the chat, just without topic threading) - Other BadRequest errors → raise immediately (permanent, don't retry) - True NetworkError → retry as before (transient) 252 silent failures in gateway.log traced to this on 2026-03-26. 5 new tests for thread fallback, non-thread BadRequest, no-thread sends, network retry, and multi-chunk fallback.	2026-03-27 06:07:28 -07:00
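The handler ordering described above is sketched below with stand-in exception classes, so the example runs without python-telegram-bot installed. The `send_with_thread_fallback` helper and the `send` callable are hypothetical; only the inheritance relationship and the three branches come from the commit message.

```python
from typing import Callable, Optional


class NetworkError(Exception):
    """Stand-in for telegram.error.NetworkError."""


class BadRequest(NetworkError):
    """Stand-in for telegram.error.BadRequest, which subclasses NetworkError."""


def send_with_thread_fallback(
    send: Callable[[str, Optional[int]], str],
    text: str,
    thread_id: Optional[int] = None,
    max_attempts: int = 3,
) -> str:
    attempts = 0
    while True:
        try:
            return send(text, thread_id)
        except NetworkError as exc:
            # Check the subclass first: BadRequest is permanent, not transient.
            if isinstance(exc, BadRequest):
                if "thread not found" in str(exc).lower() and thread_id is not None:
                    thread_id = None  # retry once without topic threading
                    continue
                raise
            attempts += 1
            if attempts >= max_attempts:
                raise  # transient error persisted; give up
```

Once `thread_id` has been cleared, a second "thread not found" error falls through to `raise`, so the fallback cannot loop.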
Teknium	75fcbc44ce	feat(telegram): auto-discover fallback IPs via DoH when api.telegram.org is unreachable (#3376 ) * feat(telegram): auto-discover fallback IPs via DoH when api.telegram.org is unreachable On some networks (university, corporate), api.telegram.org resolves to a valid Telegram IP that is unreachable due to routing/firewall rules. A different IP in the same Telegram-owned 149.154.160.0/20 block works fine. This adds automatic fallback IP discovery at connect time: 1. Query Google and Cloudflare DNS-over-HTTPS for api.telegram.org A records 2. Exclude the system-DNS IP (the unreachable one), use the rest as fallbacks 3. If DoH is also blocked, fall back to a seed list (149.154.167.220) 4. TelegramFallbackTransport tries primary first, sticks to whichever works No configuration needed — works automatically. TELEGRAM_FALLBACK_IPS env var still available as manual override. Zero impact on healthy networks (primary path succeeds on first attempt, fallback never exercised). No new dependencies (uses httpx already in deps + stdlib socket). * fix: share transport instance and downgrade seed fallback log to info - Use single TelegramFallbackTransport shared between request and get_updates_request so sticky IP is shared across polling and API calls - Keep separate HTTPXRequest instances (different timeout settings) - Downgrade "using seed fallback IPs" from warning to info to avoid noisy logs on healthy networks * fix: add telegram.request mock and discovery fixture to remaining test files The original PR missed test_dm_topics.py and test_telegram_network_reconnect.py — both need the telegram.request mock module. The reconnect test also needs _no_auto_discovery since _handle_polling_network_error calls connect() which now invokes discover_fallback_ips(). --------- Co-authored-by: Mohan Qiao <Gavin-Qiao@users.noreply.github.com>	2026-03-27 04:03:13 -07:00
Teknium	6610c377ba	fix(telegram): self-reschedule reconnect when start_polling fails (#3268) After a Telegram 502, _handle_polling_network_error calls updater.stop() then start_polling(). If start_polling() also raises, the old code logged a warning and returned, but its comment 'The next network error will trigger another attempt' was wrong. The updater loop is dead after stop(), so no further error callbacks ever fire. The gateway stays alive but permanently deaf to messages. Fix: when start_polling() fails in the except branch, schedule a new _handle_polling_network_error task to continue the exponential backoff retry chain. The task is tracked in _background_tasks (preventing GC). Guarded by has_fatal_error to avoid spurious retries during shutdown. Closes #3173. Salvaged from PR #3177 by Mibayy.	2026-03-26 15:34:33 -07:00
Teknium	36af1f3baf	feat(telegram): Private Chat Topics with functional skill binding (#2598) Salvages PR #3005 by web3blind. Cherry-picked onto current main with functional skill binding and docs added. - DM topic creation via createForumTopic (Bot API 9.4, Feb 2026) - Config-driven topics with thread_id persistence across restarts - Session isolation via existing build_session_key thread_id support - auto_skill field on MessageEvent for topic-skill bindings - Gateway auto-loads bound skill on new sessions (same as /skill commands) - Docs: full Private Chat Topics section in Telegram messaging guide - 20 tests (17 original + 3 for auto_skill) Closes #2598 Co-authored-by: web3blind <web3blind@users.noreply.github.com>	2026-03-26 02:04:11 -07:00
Teknium	8bb1d15da4	chore: remove ~100 unused imports across 55 files (#3016 ) Automated cleanup via pyflakes + autoflake with manual review. Changes: - Removed unused stdlib imports (os, sys, json, pathlib.Path, etc.) - Removed unused typing imports (List, Dict, Any, Optional, Tuple, Set, etc.) - Removed unused internal imports (hermes_cli.auth, hermes_cli.config, etc.) - Fixed cli.py: removed 8 shadowed banner imports (imported from hermes_cli.banner then immediately redefined locally — only build_welcome_banner is actually used) - Added noqa comments to imports that appear unused but serve a purpose: - Re-exports (gateway/session.py SessionResetPolicy, tools/terminal_tool.py is_interrupted/_interrupt_event) - SDK presence checks in try/except (daytona, fal_client, discord) - Test mock targets (auxiliary_client.py Path, mcp_config.py get_hermes_home) Zero behavioral changes. Full test suite passes (6162/6162, 2 pre-existing streaming test failures unrelated to this change).	2026-03-25 15:02:03 -07:00
Teknium	e5691eed38	feat(gateway): configurable Telegram reply threading mode (#2907) Add reply_to_mode setting (off/first/all) to control whether Telegram replies quote/thread to the user's original message. - 'off': Never thread replies (no quote bubble) - 'first': Only first chunk threads to user's message (default, preserves existing behavior) - 'all': All chunks in multi-part replies thread to user's message Configurable via: - reply_to_mode in platform config (gateway config YAML) - TELEGRAM_REPLY_TO_MODE env var Based on PR #855 by raulvidis.	2026-03-24 19:56:00 -07:00
Teknium	2bd8e5cb23	fix(telegram): auto-reconnect polling after network interruption Closes #2476 The polling error callback previously only handled Conflict errors (409 from multiple getUpdates callers). All other errors, including NetworkError and TimedOut that python-telegram-bot raises when the host loses connectivity (Mac sleep, WiFi switch, VPN reconnect), were logged and silently discarded. The bot would stop responding until manually restarted. Fix: - Add _looks_like_network_error() to classify transient connectivity errors (NetworkError, TimedOut, OSError, ConnectionError). - Add _handle_polling_network_error() with exponential back-off reconnect: retries up to 10 times with delays 5s, 10s, 20s, 40s, 60s (capped). On exhaustion, marks the adapter retryable-fatal so launchd/systemd can restart the gateway process. - Refactor _polling_error_callback() to route network errors to the new handler before falling through to a generic error log. - Track _polling_network_error_count (reset on successful reconnect) independently from _polling_conflict_count.	2026-03-22 09:18:58 -07:00
Teknium	febfe1c268	fix(telegram): escape bare parentheses/braces in MarkdownV2 output The MarkdownV2 format_message conversion left unescaped ( ) { } in edge cases where placeholder processing didn't cover them (e.g. partial link matches, URLs with parens). This caused Telegram to reject the message with 'character ( is reserved and must be escaped' and fall back to plain text — losing all formatting. Added a safety-net pass (step 12) after placeholder restoration that escapes any remaining bare ( ) { } outside code blocks and valid MarkdownV2 link syntax.	2026-03-21 16:13:13 -07:00
unmodeled-tyler	fb48b8f0c5	fix(gateway): pass message_thread_id in send_image_file, send_document, send_video Fixes #1803. send_image_file, send_document, and send_video were missing message_thread_id forwarding, causing them to fail in Telegram forum/supergroups where thread_id is required. send_voice already handled this correctly. Adds metadata parameter + message_thread_id to all three methods, and adds tests covering the thread_id forwarding path.	2026-03-21 09:49:33 -07:00
Teknium	488a30e879	fix(gateway): retry Telegram 409 polling conflicts before giving up A single Telegram 409 Conflict from getUpdates permanently killed Telegram polling with no recovery possible (retryable=False on first occurrence). This is too aggressive for production use with process supervisors. Transient 409s are expected during: - --replace handoffs where the old long-poll session lingers on Telegram servers for a few seconds after SIGTERM - systemd Restart=on-failure respawns that overlap with the dying instance cleanup Now _handle_polling_conflict() retries up to 3 times with a 10-second delay between attempts. The 30-second total retry window lets stale server-side sessions expire. If all retries fail, the error is still marked as permanently fatal — preserving the original protection against genuine dual-instance conflicts. Tests updated: split the single conflict test into two — one verifying retry on transient conflict, one verifying fatal after exhausted retries. Closes #2296	2026-03-21 07:11:06 -07:00
Teknium	f853e50589	Merge pull request #2199 from llbn/fix/telegram-markdownv2-features Clean PR, well-tested. Adds MarkdownV2 strikethrough, spoiler, and blockquote support to Telegram adapter.	2026-03-20 12:45:47 -07:00
llbn	43b3a0ac66	fix(telegram): escape backslashes and backticks inside code entities for MarkdownV2 - Escape \ → \\ inside inline code and fenced code blocks - Escape ` → \` inside fenced code block bodies (not delimiters) - Add regression tests for code entity backslash handling	2026-03-20 18:32:45 +01:00
llbn	02f639e561	fix(telegram): add MarkdownV2 support for strikethrough, spoiler, and blockquotes - Convert ~~text~~ to ~text~ (MarkdownV2 strikethrough) - Protect \|\|text\|\| from pipe escaping (MarkdownV2 spoiler) - Preserve > at line start as blockquote instead of escaping it - Update _strip_mdv2() to strip ~strikethrough~ and \|\|spoiler\|\| markers - Add tests covering new formatting paths and edge cases	2026-03-20 18:21:24 +01:00
Teknium	2fa33dde81	fix: handle message length overflow in streaming mode (#1783 ) Stream consumer now splits messages that exceed the platform's MAX_MESSAGE_LENGTH. When accumulated text grows past the safe limit, the current message is finalized and a new message is started for the overflow — same as how normal sends chunk long responses. Split point prefers line boundaries (rfind newline) for clean breaks. Works for all platforms (Telegram 4096, Discord 2000, etc.) by reading the adapter's MAX_MESSAGE_LENGTH at runtime. Also added a safety net in the Telegram adapter: if edit_message_text still hits MESSAGE_TOO_LONG (e.g. markdown formatting expansion), it truncates and returns success so the stream consumer doesn't die. Co-authored-by: Test <test@test.com>	2026-03-17 11:00:52 -07:00
Teknium	7ac9088d5c	fix: Telegram streaming — config bridge, not-modified, flood control (#1782 ) * fix: NameError in OpenCode provider setup (prompt_text -> prompt) The OpenCode Zen and OpenCode Go setup sections used prompt_text() which is undefined. All other providers correctly use the local prompt() function defined in setup.py. Fixes crash during 'hermes setup' when selecting either OpenCode provider. * fix: Telegram streaming — config bridge, not-modified, flood control Three fixes for gateway streaming: 1. Bridge streaming config from config.yaml into gateway runtime. load_gateway_config() now reads the 'streaming' key from config.yaml (same pattern as session_reset, stt, etc.), matching the docs. Previously only gateway.json was read. 2. Handle 'Message is not modified' in Telegram edit_message(). This Telegram API error fires when editing with identical content — a no-op, not a real failure. Previously it returned success=False which made the stream consumer disable streaming entirely. 3. Handle RetryAfter / flood control in Telegram edit_message(). Fast providers can hit Telegram rate limits during streaming. Now waits the requested retry_after duration and retries once, instead of treating it as a fatal edit failure. Also fixed double-edit on stream finish: the consumer now tracks last-sent text and skips redundant edits, preventing the not-modified error at the source. * refactor: make config.yaml the primary gateway config source Eliminates the per-key bridge pattern in load_gateway_config(). Previously gateway.json was the primary source and each config.yaml key needed an individual bridge — easy to forget (streaming was missing, causing garl4546's bug). Now config.yaml is read first and its keys are mapped directly into the GatewayConfig.from_dict() schema. gateway.json is kept as a legacy fallback layer (loaded first, then overwritten by config.yaml keys). If gateway.json exists, a log message suggests migrating. 
Also: - Removed dead save_gateway_config() (never called anywhere) - Updated CLI help text and send_message error to reference config.yaml instead of gateway.json --------- Co-authored-by: Test <test@test.com>	2026-03-17 10:51:54 -07:00
Teknium	d156942419	fix(telegram): aggregate split text messages before dispatching (#1674 ) When a user sends a long message, Telegram clients split it into multiple updates that arrive within milliseconds of each other. Previously each chunk was dispatched independently — the first would start the agent, and subsequent chunks would interrupt or queue as separate turns, causing the agent to only see part of the message. Add text message batching to TelegramAdapter following the same pattern as the existing photo burst batching: - _enqueue_text_event() buffers text by session key, concatenating chunks that arrive in rapid succession - _flush_text_batch() dispatches the combined message after a 0.6s quiet period (configurable via HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS) - Timer resets on each new chunk, so all parts of a split arrive before the batch is dispatched Reported by NulledVector on Discord.	2026-03-17 02:49:57 -07:00
Teknium	9ece1ce2de	feat(gateway): inject reply-to message context for out-of-session replies (#1594 ) * fix: prevent infinite 400 failure loop on context overflow (#1630) When a gateway session exceeds the model's context window, Anthropic may return a generic 400 invalid_request_error with just 'Error' as the message. This bypassed the phrase-based context-length detection, causing the agent to treat it as a non-retryable client error. Worse, the failed user message was still persisted to the transcript, making the session even larger on each attempt — creating an infinite loop. Three-layer fix: 1. run_agent.py — Fallback heuristic: when a 400 error has a very short generic message AND the session is large (>40% of context or >80 messages), treat it as a probable context overflow and trigger compression instead of aborting. 2. run_agent.py + gateway/run.py — Don't persist failed messages: when the agent returns failed=True before generating any response, skip writing the user's message to the transcript/DB. This prevents the session from growing on each failure. 3. gateway/run.py — Smarter error messages: detect context-overflow failures and suggest /compact or /reset specifically, instead of a generic 'try again' that will fail identically. * fix(skills): detect prompt injection patterns and block cache file reads Adds two security layers to prevent prompt injection via skills hub cache files (#1558): 1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json was the original injection vector — untrusted skill descriptions in the catalog contained adversarial text that the model executed. 2. skill_view: warns when skills are loaded from outside the trusted ~/.hermes/skills/ directory, and detects common injection patterns in skill content ("ignore previous instructions", "<system>", etc.). Cherry-picked from PR #1562 by ygd58. 
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552) Long messages sent via send_message tool or cron delivery silently failed when exceeding platform limits. Gateway adapters handle this via truncate_message(), but the standalone senders in send_message_tool bypassed that entirely. - Apply truncate_message() chunking in _send_to_platform() before dispatching to individual platform senders - Remove naive message[i:i+2000] character split in _send_discord() in favor of centralized smart splitting - Attach media files to last chunk only for Telegram - Add regression tests for chunking and media placement Cherry-picked from PR #1557 by llbn. * fix(approval): show full command in dangerous command approval (#1553) Previously the command was truncated to 80 chars in CLI (with a [v]iew full option), 500 chars in Discord embeds, and missing entirely in Telegram/Slack approval messages. Now the full command is always displayed everywhere: - CLI: removed 80-char truncation and [v]iew full menu option - Gateway (TG/Slack): approval_required message includes full command in a code block - Discord: embed shows full command up to 4096-char limit - Windows: skip SIGALRM-based test timeout (Unix-only) - Updated tests: replaced view-flow tests with direct approval tests Cherry-picked from PR #1566 by crazywriter1. * fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624) The interrupt polling loop in chat() waited on the queue without invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy buffer only flushed on input events, causing the CLI to appear frozen during tool execution until the user typed a key. Fix: call _invalidate() on each queue timeout (every ~100ms, throttled to 150ms) to force the renderer to flush buffered agent output. 
* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580) When --migrate-secrets is not passed (the default), API keys like OPENROUTER_API_KEY are silently skipped with no warning. Users don't realize their keys weren't migrated until the agent fails to connect. Add a post-migration warning with actionable instructions: either re-run with --migrate-secrets or add the key manually via hermes config set. Cherry-picked from PR #1593 by ygd58. * fix(security): block sandbox backend creds from subprocess env (#1264) Add Modal and Daytona sandbox credentials to the subprocess env blocklist so they're not leaked to agent terminal sessions via printenv/env. Cherry-picked from PR #1571 by ygd58. * fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816) When a user sends multiple messages while the agent keeps failing, _run_agent() calls itself recursively with no depth limit. This can exhaust stack/memory if the agent is in a failure loop. Add _MAX_INTERRUPT_DEPTH = 3. When exceeded, the pending message is logged and the current result is returned instead of recursing deeper. The log handler duplication bug described in #816 was already fixed separately (AIAgent.__init__ deduplicates handlers). * fix(gateway): /model shows active fallback model instead of config default (#1615) When the agent falls back to a different model (e.g. due to rate limiting), /model still showed the config default. Now tracks the effective model/provider after each agent run and displays it. Cleared when the primary model succeeds again or the user explicitly switches via /model. Cherry-picked from PR #1616 by MaxKerkula. Added hasattr guard for test compatibility. * feat(gateway): inject reply-to message context for out-of-session replies (#1594) When a user replies to a Telegram message, check if the quoted text exists in the current session transcript. 
If missing (from cron jobs, background tasks, or old sessions), prepend [Replying to: "..."] to the message so the agent has context about what's being referenced. - Add reply_to_text field to MessageEvent (base.py) - Populate from Telegram's reply_to_message (text or caption) - Inject context in _handle_message when not found in history Based on PR #1596 by anpicasso (cherry-picked reply-to feature only, excluded unrelated /server command and background delegation changes). --------- Co-authored-by: buray <ygd58@users.noreply.github.com> Co-authored-by: lbn <llbn@users.noreply.github.com> Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com> Co-authored-by: Max K <MaxKerkula@users.noreply.github.com> Co-authored-by: Angello Picasso <angello.picasso@devsu.com>	2026-03-17 02:31:27 -07:00
Teknium	46176c8029	refactor: centralize slash command registry (#1603 ) * refactor: centralize slash command registry Replace 7+ scattered command definition sites with a single CommandDef registry in hermes_cli/commands.py. All downstream consumers now derive from this registry: - CLI process_command() resolves aliases via resolve_command() - Gateway _known_commands uses GATEWAY_KNOWN_COMMANDS frozenset - Gateway help text generated by gateway_help_lines() - Telegram BotCommands generated by telegram_bot_commands() - Slack subcommand map generated by slack_subcommand_map() Adding a command or alias is now a one-line change to COMMAND_REGISTRY instead of touching 6+ files. Bugfixes included: - Telegram now registers /rollback, /background (were missing) - Slack now has /voice, /update, /reload-mcp (were missing) - Gateway duplicate 'reasoning' dispatch (dead code) removed - Gateway help text can no longer drift from CLI help Backwards-compatible: COMMANDS and COMMANDS_BY_CATEGORY dicts are rebuilt from the registry, so existing imports work unchanged. * docs: update developer docs for centralized command registry Update AGENTS.md with full 'Slash Command Registry' and 'Adding a Slash Command' sections covering CommandDef fields, registry helpers, and the one-line alias workflow. Also update: - CONTRIBUTING.md: commands.py description - website/docs/reference/slash-commands.md: reference central registry - docs/plans/centralize-command-registry.md: mark COMPLETED - plans/checkpoint-rollback.md: reference new pattern - hermes-agent-dev skill: architecture table * chore: remove stale plan docs	2026-03-16 23:21:03 -07:00
Teknium	b411b979cb	fix(telegram): retry on transient TLS failures during connect and send (#1535)	2026-03-16 05:28:11 -07:00
JP Lew	17e87478d2	fix(gateway): restart on retryable startup failures (#1517)	2026-03-16 05:26:31 -07:00
teknium1	25b0ae7979	fix(telegram): retry on transient TLS failures during connect and send Add exponential-backoff retry (3 attempts) around initialize() to handle transient TLS resets during gateway startup. Also catches TimedOut and OSError in addition to NetworkError. Add exponential-backoff retry (3 attempts) around send_message() for NetworkError during message delivery, wrapping the existing Markdown fallback logic. Both imports are guarded with try/except ImportError for test environments where telegram is mocked. Based on PR #1527 by cmd8. Closes #1526.	2026-03-16 05:23:32 -07:00
teknium1	38b4fd3737	fix(gateway): make group session isolation configurable default group and channel sessions to per-user isolation, allow opting back into shared room sessions via config.yaml, and document Discord gateway routing and session behavior.	2026-03-16 00:22:23 -07:00
Teknium	a56937735e	fix(telegram): escape chunk indicators in MarkdownV2 (#1478)	2026-03-15 19:27:15 -07:00
CoinDegen	4ae1334287	fix(gateway): prevent telegram photo burst interrupts	2026-03-15 03:49:01 -07:00
Vimal	0c182211a1	fix(telegram): check updater/app state before disconnect The disconnect() method was unconditionally calling updater.stop() and app.stop(), causing errors when: - The updater was not running (RuntimeError: This Updater is not running!) - The app was None (AttributeError: 'NoneType' object has no attribute) Changes: - Check if updater exists and is running before stopping - Check if app is running before stopping - Only log warnings for actual errors, not expected shutdown states Fixes spurious warnings during gateway shutdown.	2026-03-14 21:51:30 -07:00
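
Commit 2bd8e5cb23 above describes a reconnect schedule of 5s, 10s, 20s, 40s, 60s (capped), with up to 10 retries. The following is a minimal sketch of how such a schedule could be computed. It is not the adapter's actual code: the names `reconnect_delay`, `MAX_RETRIES`, `base`, and `cap` are hypothetical and chosen only for illustration.

```python
# Hypothetical helper: names and defaults are assumptions, not taken from the repo.
MAX_RETRIES = 10  # the commit message states "retries up to 10 times"


def reconnect_delay(attempt: int, base: float = 5.0, cap: float = 60.0) -> float:
    """Return the wait in seconds before retry number `attempt` (0-based).

    The delay doubles on each attempt and is clamped to `cap`.
    """
    return min(base * (2 ** attempt), cap)


# Full schedule for one retry chain.
schedule = [reconnect_delay(n) for n in range(MAX_RETRIES)]
print(schedule)
```

Under these assumptions the first five delays match the ones listed in the commit message, and every later attempt waits the capped 60 seconds.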


90 Commits
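
Commit 2fa33dde81 in the log above says the stream consumer splits text that exceeds the platform's MAX_MESSAGE_LENGTH and prefers line boundaries found with rfind. A rough sketch of that idea follows. The function name `split_overflow` and its exact behavior (for example, dropping the newline at the split point) are assumptions made for illustration, not the repository's implementation.

```python
from typing import Tuple


def split_overflow(text: str, limit: int) -> Tuple[str, str]:
    """Split `text` into (head, tail) so that len(head) <= limit.

    Hypothetical sketch: prefers the last newline before `limit`,
    and falls back to a hard cut when no usable newline exists.
    """
    if len(text) <= limit:
        return text, ""
    cut = text.rfind("\n", 0, limit)
    if cut <= 0:
        # No newline in range (or only at position 0): cut at the limit.
        cut = limit
    head = text[:cut]
    # Drop the newline(s) at the split point so the next message starts cleanly.
    tail = text[cut:].lstrip("\n")
    return head, tail


head, tail = split_overflow("aaa\nbbb\nccc", 8)
hard_head, hard_tail = split_overflow("abcdefghij", 4)
```

In a streaming loop, the head would be finalized as the current message and the tail would seed the next one, with `limit` read from the adapter (the commit mentions 4096 for Telegram and 2000 for Discord).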