hermes-agent

Author	SHA1	Message	Date
bg-l2norm	abf1be564b	fix(deps): include telegram webhook extra in messaging installs (#4915)	2026-04-05 11:59:28 -07:00
teyrebaz33	6df0f07ff3	fix: /status command bypasses active-session guard during agent run (#5046) When an agent was actively processing a message, /status sent via Telegram (or any gateway) was queued as a pending interrupt instead of being dispatched immediately. The base platform adapter's handle_message() only had special-case bypass logic for /approve and /deny, so /status fell through to the default interrupt path and was never processed as a system command. Apply the same bypass pattern used by /approve and /deny: detect cmd == 'status' inside the active-session guard, dispatch directly to the message handler, and send the response without touching session lifecycle or interrupt state. Adds a regression test that verifies /status is dispatched and responded to immediately even when _active_sessions contains an entry for the session.	2026-04-05 11:59:28 -07:00
nibzard	4df2fca2f0	fix(gateway): cap memory flush retries at 3 to prevent infinite loop The _session_expiry_watcher retried failed memory flushes forever because exceptions were caught at debug level without setting memory_flushed=True. Expired sessions with transient failures (rate limits, network errors) would retry every 5 minutes indefinitely, burning API quota and blocking gateway message processing via 429 rate limit cascades. Observed case: a March 19 session retried 28+ times over ~17 days, causing repeated 429 errors that made Telegram unresponsive. Add a per-session failure counter (_flush_failures) that gives up after 3 consecutive attempts and marks the session as flushed to break the loop.	2026-04-05 11:59:28 -07:00
Saurabh	507b63f86b	fix(api-server): pass fallback_model to AIAgent (#4954) The API server platform never passed fallback_model to AIAgent(), so the fallback provider chain was always empty for requests through the OpenAI-compatible endpoint. Load it via GatewayApp._load_fallback_model() to match the behavior of Telegram/Discord/Slack platforms.	2026-04-05 11:59:28 -07:00
memosr	7f853ba7b6	fix: use logger.exception to preserve traceback in logs and drop unused import	2026-04-05 11:59:28 -07:00
memosr	5ff514ec79	fix(security): remove full traceback from cron error output to prevent info leakage	2026-04-05 11:59:28 -07:00
Teknium	daa4a5acdd	feat: add docs links to setup wizard sections (#5283) Each setup step now shows a link to the relevant docs page: - Model & Provider → integrations/providers - Terminal Backend → developer-guide/environments - Agent Settings → user-guide/configuration - Messaging Platforms → user-guide/messaging (overview) - Telegram, Discord, Matrix, Mattermost, WhatsApp → per-platform guides - Tools → user-guide/features/tools Existing Slack and Webhook URLs migrated to shared _DOCS_BASE constant.	2026-04-05 11:46:13 -07:00
Teknium	54cb311f40	fix: suppress false 'Unknown toolsets' warning for MCP server names (#5279) MCP server names (e.g. annas, libgen) are added to enabled_toolsets by _get_platform_tools() but aren't registered in TOOLSETS until later when _sync_mcp_toolsets() runs during tool discovery. The validation in HermesCLI.__init__() fires before that, producing a false warning. Fix: exclude configured MCP server names from the validation check. CLI_CONFIG is already available at the call site, so no new imports needed. Closes #5267 (alternative fix)	2026-04-05 11:44:40 -07:00
Teknium	a0a1b86c2e	fix: accept reasoning-only responses without retries — set content to "(empty)" (#5278 ) * feat: coerce tool call arguments to match JSON Schema types LLMs frequently return numbers as strings ("42" instead of 42) and booleans as strings ("true" instead of true). This causes silent failures with MCP tools and any tool with strictly-typed parameters. Added coerce_tool_args() in model_tools.py that runs before every tool dispatch. For each argument, it checks the tool registry schema and attempts safe coercion: - "42" → 42 when schema says "type": "integer" - "3.14" → 3.14 when schema says "type": "number" - "true"/"false" → True/False when schema says "type": "boolean" - Union types tried in order - Original values preserved when coercion fails or is not applicable Inspired by Block/goose tool argument coercion system. * fix: accept reasoning-only responses without retries — set content to "(empty)" Previously, when a model returned reasoning/thinking but no visible content, we entered a 120-line retry/classify/compress/salvage cascade that wasted 3+ API calls trying to "fix" the response. The model was done thinking — retrying with the same input just burned money. Now reasoning-only responses are accepted immediately: - Reasoning stays in the `reasoning` field (semantically correct) - Content set to "(empty)" — valid non-empty string every provider accepts - No retries, no compression triggers, no salvage logic - Session history contains "(empty)" not "" — prevents #2128 session poisoning where empty assistant content caused prefill rejections Removes ~120 lines, adds ~15. Saves 2-3 API calls per reasoning-only response. Fixes #2128.	2026-04-05 11:30:52 -07:00
nepenth	534511bebb	feat(matrix): Tier 1 enhancement — reactions, read receipts, rich formatting, room management Cherry-picked from PR #4338 by nepenth, resolved against current main. Adds: - Processing lifecycle reactions (eyes/checkmark/cross) via MATRIX_REACTIONS env - Reaction send/receive with ReactionEvent + UnknownEvent fallback for older nio - Fire-and-forget read receipts on text and media messages - Message redaction, room history fetch, room creation, user invite - Presence status control (online/offline/unavailable) - Emote (/me) and notice message types with HTML rendering - XSS-hardened markdown-to-HTML converter (strips raw HTML preprocessor, sanitizes link URLs against javascript:/data:/vbscript: schemes) - Comprehensive regex fallback with full block/inline markdown support - Markdown>=3.6 added to [matrix] extras in pyproject.toml - 46 new tests covering all features and security hardening	2026-04-05 11:19:54 -07:00
Teknium	20b4060dbf	fix: web_extract fast-fail on scrape timeout + summarizer resilience - Firecrawl scrape: 60s timeout via asyncio.wait_for + to_thread (previously could hang indefinitely) - Summarizer retries: 6 → 2 (one retry), reads timeout from auxiliary.web_extract.timeout config (default 360s / 6min) - Summarizer failure: falls back to truncated raw content (~5000 chars) instead of useless error message, with guidance about config/model - Config default: auxiliary.web_extract.timeout bumped 30 → 360s for local model compatibility Addresses Discord reports of agent hanging during web_extract.	2026-04-05 11:16:45 -07:00
Teknium	c100ad874c	fix(matrix): E2EE cron delivery via live adapter + HTML formatting + origin fallback Salvaged from PRs #3767 (chalkers), #5236 (ygd58), #2641 (buntingszn). Three improvements to Matrix cron delivery: 1. Live adapter path: when the gateway is running, cron delivery now uses the connected MatrixAdapter via run_coroutine_threadsafe instead of the standalone HTTP PUT. This enables delivery to E2EE rooms where the raw HTTP path cannot encrypt. Falls back to standalone on failure. Threads adapters + event loop from gateway -> cron ticker -> tick() -> _deliver_result(). (from #3767) 2. HTML formatted_body: _send_matrix() now converts markdown to HTML using the optional markdown library, with h1-h6 to bold conversion for Element X compatibility. Falls back to plain text if markdown is not installed. Also adds random bytes to txn_id to prevent collisions. (from #5236) 3. Origin fallback: when deliver="origin" but origin is null (jobs created via API/scripts), falls back to HOME_CHANNEL env vars in order: matrix -> telegram -> discord -> slack. (from #2641)	2026-04-05 11:07:47 -07:00
dlkakbs	36e046e843	fix(gateway): MIME type fallback for Matrix document uploads Cherry-picked run.py portion from PR #3495 by dlkakbs. When Matrix sends non-image files (text, YAML, JSON, etc.), the MIME type may be empty or application/octet-stream. Falls back to extension-based detection so text files are properly injected into agent context.	2026-04-05 11:07:47 -07:00
chalkers	bec02f3731	fix(matrix): handle encrypted media events and cache decrypted attachments Cherry-picked from PR #3140 by chalkers, resolved against current main. Registers RoomEncryptedImage/Audio/Video/File callbacks, decrypts attachments via nio.crypto, caches all media types (images, audio, documents), prevents ciphertext URL fallback for encrypted media. Unifies the separate voice-message download into the main cache block. Preserves main's MATRIX_REQUIRE_MENTION, auto-thread, and mention stripping features. Includes 355 lines of encrypted media tests.	2026-04-05 11:07:47 -07:00
binhnt92	b65e67545a	fix(gateway): stop Matrix/Mattermost reconnect on permanent auth failures Cherry-picked from PR #3695 by binhnt92. Matrix _sync_loop() and Mattermost _ws_loop() were retrying all errors forever, including permanent auth failures (expired tokens, revoked access). Now detects M_UNKNOWN_TOKEN, M_FORBIDDEN, 401/403 and stops instead of spinning. Includes 216 lines of tests.	2026-04-05 11:07:47 -07:00
pjay-io	9d7c288d86	fix(matrix): add filesize to nio.upload() for Synapse compatibility Cherry-picked from PR #4343 by pjay-io. Synapse rejects chunked uploads without Content-Length. Adding filesize=len(data) ensures the upload includes proper sizing.	2026-04-05 11:07:47 -07:00
thakoreh	914f7461dc	fix: add missing shutil import for Matrix E2EE setup Cherry-picked from PR #5136 by thakoreh. setup_gateway() uses shutil.which('uv') at line 2126 but shutil was never imported at module level, causing NameError during Matrix E2EE auto-install. Adds top-level import and regression test.	2026-04-05 11:07:47 -07:00
LucidPaths	70f798043b	fix: Ollama Cloud auth, /model switch persistence, and alias tab completion - Add OLLAMA_API_KEY to credential resolution chain for ollama.com endpoints - Update requested_provider/_explicit_api_key/_explicit_base_url after /model switch so _ensure_runtime_credentials() doesn't revert the switch - Pass base_url/api_key from fallback config to resolve_provider_client() - Add DirectAlias system: user-configurable model_aliases in config.yaml checked before catalog resolution, with reverse lookup by model ID - Add /model tab completion showing aliases with provider metadata Co-authored-by: LucidPaths <LucidPaths@users.noreply.github.com>	2026-04-05 11:06:06 -07:00
Teknium	35d280d0bd	feat: coerce tool call arguments to match JSON Schema types (#5265 ) LLMs frequently return numbers as strings ("42" instead of 42) and booleans as strings ("true" instead of true). This causes silent failures with MCP tools and any tool with strictly-typed parameters. Added coerce_tool_args() in model_tools.py that runs before every tool dispatch. For each argument, it checks the tool registry schema and attempts safe coercion: - "42" → 42 when schema says "type": "integer" - "3.14" → 3.14 when schema says "type": "number" - "true"/"false" → True/False when schema says "type": "boolean" - Union types tried in order - Original values preserved when coercion fails or is not applicable Inspired by Block/goose tool argument coercion system.	2026-04-05 10:57:34 -07:00
Teknium	e899d6a05d	fix: increase default HERMES_AGENT_TIMEOUT from 10min to 30min Users hitting the 10-minute default during complex tool chains. Bumps both the execution cap and stale-lock eviction timeout. Still overridable via HERMES_AGENT_TIMEOUT env var (0 = unlimited).	2026-04-05 10:32:59 -07:00
Teknium	51ed7dc2f3	feat: save oversized tool results to file instead of destructive truncation (#5210 ) Previously, tool results exceeding 100K characters were silently chopped with only a '[Truncated]' notice — the rest of the content was lost permanently. The model had no way to access the truncated portion. Now, oversized results are written to HERMES_HOME/cache/tool_responses/ and the model receives: - A 1,500-char head preview for immediate context - The file path so it can use read_file/search_files on the full output This preserves the context window protection (inline content stays small) while making the full data recoverable. Falls back to the old destructive truncation if the file write fails. Inspired by Block/goose's large response handler pattern.	2026-04-05 10:29:57 -07:00
Teknium	d932980c1a	Add gitnexus-explorer optional skill (#5208) Index codebases with GitNexus and serve an interactive knowledge graph web UI via Cloudflare tunnel. No sudo required. Includes: - Full setup/build/serve/tunnel pipeline - Zero-dependency Node.js reverse proxy script - Pitfalls section covering cloudflared config conflicts, Vite allowedHosts, Claude Code artifact cleanup, and browser memory limits for large repos	2026-04-05 03:00:19 -07:00
Teknium	4976a8b066	feat: /model command — models.dev primary database + --provider flag (#5181 ) Full overhaul of the model/provider system. ## What changed - models.dev (109 providers, 4000+ models) as primary database for provider identity AND model metadata - --provider flag replaces colon syntax for explicit provider switching - Full ModelInfo/ProviderInfo dataclasses with context, cost, capabilities, modalities - HermesOverlay system merges models.dev + Hermes-specific transport/auth/aggregator flags - User-defined endpoints via config.yaml providers: section - /model (no args) lists authenticated providers with curated model catalog - Rich metadata display: context window, max output, cost/M tokens, capabilities - Config migration: custom_providers list → providers dict (v11→v12) - AIAgent.switch_model() for in-place model swap preserving conversation ## Files agent/models_dev.py, hermes_cli/providers.py, hermes_cli/model_switch.py, hermes_cli/model_normalize.py, cli.py, gateway/run.py, run_agent.py, hermes_cli/config.py, hermes_cli/commands.py	2026-04-05 01:04:44 -07:00
Teknium	cb63b5f381	feat(skills): add popular-web-designs skill with 54 website design systems (#5194 ) Curated collection of production-quality design system specifications extracted from real websites (sourced from VoltAgent/awesome-design-md). Each template captures a site's complete visual language: colors, typography, components, layout, shadows, responsive behavior, and agent-ready CSS values. Hermes-specific adaptations in every template: - Google Fonts CDN link tags for proprietary font substitutes - CSS font-family stacks with proper fallbacks - Integration notes for write_file + generative-widgets workflow - browser_vision verification reminders SKILL.md includes categorized catalog, font substitution reference table, HTML generation pattern, and design-to-use-case matching guide. Sites: Airbnb, Airtable, Apple, BMW, Cal.com, Claude, Clay, ClickHouse, Cohere, Coinbase, Composio, Cursor, ElevenLabs, Expo, Figma, Framer, HashiCorp, IBM, Intercom, Kraken, Linear, Lovable, Minimax, Mintlify, Miro, Mistral AI, MongoDB, Notion, NVIDIA, Ollama, OpenCode, Pinterest, PostHog, Raycast, Replicate, Resend, Revolut, RunwayML, Sanity, Sentry, SpaceX, Spotify, Stripe, Supabase, Superhuman, Together AI, Uber, Vercel, VoltAgent, Warp, Webflow, Wise, xAI, Zapier	2026-04-05 00:42:55 -07:00
Teknium	0c54da8aaf	feat(gateway): live-stream /update output + interactive prompt buttons (#5180) * feat(gateway): live-stream /update output + forward interactive prompts Adds real-time output streaming and interactive prompt forwarding for the gateway /update command, so users on Telegram/Discord/etc see the full update progress and can respond to prompts (stash restore, config migration) without needing terminal access. Changes: hermes_cli/main.py: - Add --gateway flag to 'hermes update' argparse - Add _gateway_prompt() file-based IPC function that writes .update_prompt.json and polls for .update_response - Modify _restore_stashed_changes() to accept optional input_fn parameter for gateway mode prompt forwarding - cmd_update() uses _gateway_prompt when --gateway is set, enabling interactive stash restore and config migration prompts gateway/run.py: - _handle_update_command: spawn with --gateway flag and PYTHONUNBUFFERED=1 for real-time output flushing - Store session_key in .update_pending.json for cross-restart session matching - Add _update_prompt_pending dict to track sessions awaiting update prompt responses - Replace _watch_for_update_completion with _watch_update_progress: streams output chunks every ~4s, detects .update_prompt.json and forwards prompts to the user, handles completion/failure/timeout - Add update prompt interception in _handle_message: when a prompt is pending, the user's next message is written to .update_response instead of being processed normally - Preserve _send_update_notification as legacy fallback for post-restart cases where adapter isn't available yet File-based IPC protocol: - .update_prompt.json: written by update process with prompt text, default value, and unique ID - .update_response: written by gateway with user's answer - .update_output.txt: existing, now streamed in real-time - .update_exit_code: existing completion marker Tests: 16 new tests covering _gateway_prompt IPC, output streaming, prompt detection/forwarding, message interception, and cleanup. * feat: interactive buttons for update prompts (Telegram + Discord) Telegram: Inline keyboard with ✓ Yes / ✗ No buttons. Clicking a button answers the callback query, edits the message to show the choice, and writes .update_response directly. CallbackQueryHandler registered on the update_prompt: prefix. Discord: UpdatePromptView (discord.ui.View) with green Yes / red No buttons. Follows the ExecApprovalView pattern: auth check, embed color update, disabled-after-click. Writes .update_response on click. All platforms: /approve and /deny (and /yes, /no) now work as shorthand for yes/no when an update prompt is pending. The text fallback message instructs users to use these commands. Raw message interception still works as a fallback for non-command responses. Gateway watcher checks adapter for send_update_prompt method (class-level check to avoid MagicMock false positives) and falls back to text prompt with /approve instructions when unavailable. * fix: block /update on non-messaging platforms (API, webhooks, ACP) Add _UPDATE_ALLOWED_PLATFORMS frozenset that explicitly lists messaging platforms where /update is permitted. API server, webhook, and ACP platforms get a clear error directing them to run hermes update from the terminal instead. ACP and API server already don't reach _handle_message (separate codepaths), and webhooks have distinct session keys that can't collide with messaging sessions. This guard is belt-and-suspenders.	2026-04-05 00:28:58 -07:00
Teknium	441ec48802	style: use module-level re import instead of local import re as _re	2026-04-05 00:20:53 -07:00
kshitijk4poor	4437354198	Preserve numeric credential labels in auth removal Resolve exact label matches before treating digit-only input as a positional index so destructive auth removal does not mis-target credentials named with numeric labels. Constraint: The CLI remove path must keep supporting existing index-based usage while adding safer label targeting Rejected: Ban numeric labels | labels are free-form and existing users may already rely on them Confidence: high Scope-risk: narrow Reversibility: clean Directive: When a destructive command accepts multiple identifier forms, prefer exact identity matches before fallback parsing heuristics Tested: Focused pytest slice for auth commands, credential pool recovery, and routing (273 passed); py_compile on changed Python files Not-tested: Full repository pytest suite	2026-04-05 00:20:53 -07:00
kshitijk4poor	65952ac00c	Honor provider reset windows in pooled credential failover Persist structured exhaustion metadata from provider errors, use explicit reset timestamps when available, and expose label-based credential targeting in the auth CLI. This keeps long-lived Codex cooldowns from being misreported as one-hour waits and avoids forcing operators to manage entries by list position alone. Constraint: Existing credential pool JSON needs to remain backward compatible with stored entries that only record status code and timestamp Constraint: Runtime recovery must keep the existing retry-then-rotate semantics for 429s while enriching pool state with provider metadata Rejected: Add a separate credential scheduler subsystem | too large for the Hermes pool architecture and unnecessary for this fix Rejected: Only change CLI formatting | would leave runtime rotation blind to resets_at and preserve the serial-failure behavior Confidence: high Scope-risk: moderate Reversibility: clean Directive: Preserve structured rate-limit metadata when new providers expose reset hints; do not collapse back to status-code-only exhaustion tracking Tested: Focused pytest slice for auth commands, credential pool recovery, and routing (272 passed); py_compile on changed Python files; hermes -w auth list/remove smoke test with temporary HERMES_HOME Not-tested: Full repository pytest suite, broader gateway/integration flows outside the touched auth and pool paths	2026-04-05 00:20:53 -07:00
Lume	ed4a605696	docs: update docstring to mention Fireworks strict validation Updates _sanitize_tool_calls_for_strict_api docstring to explicitly mention Fireworks alongside Mistral as strict APIs requiring sanitization. Also documents the specific fields that are stripped (call_id, response_item_id).	2026-04-05 00:13:25 -07:00
Lume	8545343cba	test: add strict API validation tests for Fireworks compatibility Adds comprehensive tests verifying: - Fireworks-compatible messages after sanitization - Codex mode preserves fields for Responses API replay - Fireworks provider triggers sanitization correctly - Codex responses mode correctly skips sanitization Prevents regression of 400 validation errors on strict APIs.	2026-04-05 00:13:25 -07:00
Lume	9be2b18064	test: add test for _should_sanitize_tool_calls() Adds test verifying that: - Codex mode returns False (no sanitization needed) - Chat completions mode returns True (sanitization needed) - Anthropic mode returns True (sanitization needed) This ensures strict APIs like Fireworks receive properly sanitized tool_calls.	2026-04-05 00:13:25 -07:00
Lume	d90035835b	refactor: use _should_sanitize_tool_calls in run_conversation() Replaces hardcoded Mistral check with the new _should_sanitize_tool_calls() method. Updates comment to mention Fireworks alongside Mistral as strict APIs requiring tool_call field sanitization.	2026-04-05 00:13:25 -07:00
Lume	234c01f690	refactor: use _should_sanitize_tool_calls in _handle_max_iterations() Replaces hardcoded Mistral check with the new _should_sanitize_tool_calls() method. Ensures summary generation works correctly with Fireworks and other strict APIs that reject unknown tool_call fields.	2026-04-05 00:13:25 -07:00
Lume	7f6e509199	refactor: use _should_sanitize_tool_calls in flush_memories() Replaces hardcoded Mistral check with the new _should_sanitize_tool_calls() method. This ensures tool_calls are sanitized for all strict APIs, not just Mistral. Prevents 400 errors from Fireworks and other providers.	2026-04-05 00:13:25 -07:00
Lume	560c6ae143	feat: add _should_sanitize_tool_calls() method Adds a centralized method to determine when tool_calls need sanitization for strict APIs. Returns True for all APIs except codex_responses mode. This prevents 400 errors from providers like Fireworks that reject unknown fields (call_id, response_item_id) in tool_calls.	2026-04-05 00:13:25 -07:00
Teknium	5b003ca4a0	test(redact): add regression tests for lowercase variable redaction (#4367) (#5185) Add 5 regression tests from PR #4476 (gnanam1990) to prevent re-introducing the IGNORECASE bug that caused lowercase Python/TypeScript variable assignments to be incorrectly redacted as secrets. The core fix landed in `6367e1c4`. Tests cover: - Lowercase Python variable with 'token' in name - Lowercase Python variable with 'api_key' in name - TypeScript 'await' not treated as secret value - TypeScript 'secret' variable assignment - 'export' prefix preserved for uppercase env vars Co-authored-by: gnanam1990 <gnanam1990@users.noreply.github.com>	2026-04-05 00:10:16 -07:00
Teknium	0fd3de2674	docs(skill): claude-code v2.2 — add cheat sheet commands, env vars, rules, advanced features (#5158 ) Expands the claude-code skill with content from official docs and community cheat sheets that was missing from v2.0: Slash commands: /cost, /btw, /plan, /loop, /batch, /security-review, /resume, /effort (with auto level), /mcp, /release-notes, /voice details Keyboard shortcuts: Alt+P (model), Alt+T (thinking), Alt+O (fast mode), Ctrl+V (paste image), Ctrl+O (transcript), Ctrl+G (external editor) Ultrathink keyword for max reasoning on a specific turn Rules directory: .claude/rules/.md and ~/.claude/rules/.md Auto-memory: ~/.claude/projects/<proj>/memory/ (25KB/200 lines limit) Environment variables: CLAUDE_CODE_EFFORT_LEVEL, MAX_THINKING_TOKENS, CLAUDE_CODE_NO_FLICKER, CLAUDE_CODE_SUBPROCESS_ENV_SCRUB MCP limits: 2KB tool desc cap, maxResultSizeChars 500K, transport types Reorganized slash commands into Session/Development/Configuration groups Reorganized keyboard shortcuts into Controls/Toggles/Multiline groups	2026-04-04 19:15:57 -07:00
Teknium	85cefc7a5a	fix(telegram): prevent duplicate message delivery on send timeout (#5153 ) TimedOut is a subclass of NetworkError in python-telegram-bot. The inner retry loop in send() and the outer _send_with_retry() in base.py both treated it as a transient connection error and retried — but send_message is not idempotent. When the request reaches Telegram but the HTTP response times out, the message is already delivered. Retrying sends duplicates. Worst case: up to 9 copies (inner 3x × outer 3x). Inner loop (telegram.py): - Import TimedOut separately, isinstance-check before generic NetworkError retry (same pattern as BadRequest carve-out from #3390) - Re-raise immediately — no retry - Mark as retryable=False in outer exception handler Outer loop (base.py): - Remove 'timeout', 'timed out', 'readtimeout', 'writetimeout' from _RETRYABLE_ERROR_PATTERNS (read/write timeouts are delivery-ambiguous) - Add 'connecttimeout' (safe — connection never established) - Keep 'network' (other platforms still need it) - Add _is_timeout_error() + early return to prevent plain-text fallback on timeout errors (would also cause duplicate delivery) Connection errors (ConnectionReset, ConnectError, etc.) are still retried — these fail before the request reaches the server. Credit: tmdgusya (PR #3899), barun1997 (PR #3904) for identifying the bug and proposing fixes. Closes #3899, closes #3904.	2026-04-04 19:05:34 -07:00
Teknium	c8220e69a1	fix: strip MEDIA: directives from streamed gateway messages (#5152 ) When streaming is enabled, the GatewayStreamConsumer sends raw text chunks directly to the platform without post-processing. This causes MEDIA:/path/to/file tags and [[audio_as_voice]] directives to appear as visible text in the user's chat instead of being stripped. The non-streaming path already handles this correctly via extract_media() in base.py, but the streaming path was missing equivalent cleanup. Add _clean_for_display() to GatewayStreamConsumer that strips MEDIA: tags and internal markers before any text reaches the platform. The actual media file delivery is unaffected — _deliver_media_from_response() in gateway/run.py still extracts files from the agent's final_response (separate from the stream consumer's display text). Reported by Ao [FotM] on Discord.	2026-04-04 19:05:27 -07:00
Teknium	ff544526cd	docs(skill): comprehensive claude-code skill rewrite v2.0 (#5155 ) Major rewrite of the claude-code orchestration skill from 94 to 460 lines. Based on official docs research, community guides, and live experimentation. Key additions: - Two orchestration modes: Print mode (-p) vs Interactive PTY via tmux - Detailed PTY dialog handling (trust + permissions bypass patterns) - Print mode deep dive: JSON output, piped input, session resumption, --json-schema, --bare mode for CI - Complete flag reference (20+ flags organized by category) - Interactive session patterns with tmux send-keys/capture-pane - Claude's slash commands and keyboard shortcuts reference - CLAUDE.md, hooks, custom subagents, MCP, custom commands docs - Cost/performance tips (effort levels, budget caps, context mgmt) - 10 specific pitfalls discovered through live testing - 10 rules for Hermes agents orchestrating Claude Code	2026-04-04 19:00:50 -07:00
memosr	931624feda	fix(security): guard cron script against path traversal and redact output Relative script paths resolved against HERMES_HOME/scripts/ were not validated to stay within that directory. Paths like '../../etc/passwd' could escape and be executed as Python. Fix: resolve the path and verify it stays within scripts_dir using Path.relative_to(). Also apply redact_sensitive_text() to script stdout before LLM injection — same pattern as execute_code sandbox output. Cherry-picked from PR #5093 by memosr (fixes 1 and 3; absolute path restriction dropped as too restrictive for the feature's design intent).	2026-04-04 17:01:11 -07:00
Teknium	aa475aef31	feat: add exit code context for common CLI tools in terminal results (#5144 ) When commands like grep, diff, test, or find return non-zero exit codes that aren't actual errors (grep 1 = no matches, diff 1 = files differ), the model wastes turns investigating non-problems. This adds an exit_code_meaning field to the terminal JSON result that explains informational exit codes, so the agent can move on instead of debugging. Covers grep/rg/ag/ack (no matches), diff (files differ), find (partial access), test/[ (condition false), curl (timeouts, DNS, HTTP errors), and git (context-dependent). Correctly extracts the last command from pipelines and chains, strips full paths and env var assignments. The exit_code field itself is unchanged — this is purely additive context.	2026-04-04 16:57:24 -07:00
Teknium	5879b3ef82	fix: move pre_llm_call plugin context to user message, preserve prompt cache (#5146 ) Plugin context from pre_llm_call hooks was injected into the system prompt, breaking the prompt cache prefix every turn when content changed (typical for memory plugins). Now all plugin context goes into the current turn's user message — the system prompt stays identical across turns, preserving cached tokens. The system prompt is reserved for Hermes internals. Plugins contribute context alongside the user's input. Also adds comprehensive documentation for all 6 plugin hooks: pre_tool_call, post_tool_call, pre_llm_call, post_llm_call, on_session_start, on_session_end — each with full callback signatures, parameter tables, firing conditions, and examples. Supersedes #5138 which identified the same cache-busting bug and proposed an uncached system suffix approach. This fix goes further by removing system prompt injection entirely. Co-identified-by: OutThisLife (PR #5138)	2026-04-04 16:55:44 -07:00
Teknium	96e96a79ad	fix: --yolo and other flags silently dropped when placed before 'chat' subcommand (#5145 ) When --yolo, -w, -s, -r, -c, and --pass-session-id exist on both the parent parser and the 'chat' subparser with explicit defaults (default=False or default=None), argparse's subparser initialization overwrites the parent's parsed value. So 'hermes --yolo chat' silently drops --yolo, making it appear broken. Fix: use default=argparse.SUPPRESS on all duplicated arguments in the chat subparser. SUPPRESS means 'don't set this attribute if the user didn't explicitly provide it', so the parent parser's value survives through. Affected flags: --yolo, --worktree/-w, --skills/-s, --pass-session-id, --resume/-r, --continue/-c. Adds 15 regression tests covering flag-before-subcommand, flag-after-subcommand, no-subcommand, and env var propagation scenarios.	2026-04-04 16:55:13 -07:00
Teknium	55bbf8caba	fix: include approval metadata in terminal tool results (#5141) When a dangerous command is approved (gateway, CLI, or smart approval), the terminal tool now includes an 'approval' field in the result JSON so the model knows approval was requested and granted. Previously the model only saw normal command output with no indication that approval happened, causing it to hallucinate that the approval system didn't fire. Changes: - approval.py: Return user_approved/description in all 3 approval paths (gateway blocking, CLI interactive, smart approval) - terminal_tool.py: Capture approval metadata and inject into both foreground and background command results	2026-04-04 16:33:20 -07:00
Fran Fitzpatrick	2556cfdab1	fix(gateway): match Discord mention-stripping behavior in Matrix adapter Move mention stripping outside the `if not is_dm` guard so mentions are stripped in DMs too. Remove the bare-mention early return so a message containing only a mention passes through as empty string, matching Discord's behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 13:09:27 -07:00
Fran Fitzpatrick	d86be33161	feat(gateway): add MATRIX_REQUIRE_MENTION and MATRIX_AUTO_THREAD support Bring Matrix feature parity with Discord by adding mention gating and auto-threading. Both default to true, matching Discord behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 13:09:27 -07:00
Teknium	569e9f9670	feat: execute_code runs on remote terminal backends (#5088) * feat: execute_code runs on remote terminal backends (Docker/SSH/Modal/Daytona/Singularity) When TERMINAL_ENV is not 'local', execute_code now ships the script to the remote environment and runs it there via the terminal backend -- the same container/sandbox/SSH session used by terminal() and file tools. Architecture: - Local backend: unchanged (UDS RPC, subprocess.Popen) - Remote backends: file-based RPC via execute_oneshot() polling - Script writes request files, parent polls and dispatches tool calls - Responses written atomically (tmp + rename) via base64/stdin - execute_oneshot() bypasses persistent shell lock for concurrency Changes: - tools/environments/base.py: add execute_oneshot() (delegates to execute()) - tools/environments/persistent_shell.py: override execute_oneshot() to bypass _shell_lock via _execute_oneshot(), enabling concurrent polling - tools/code_execution_tool.py: add file-based transport to generate_hermes_tools_module(), _execute_remote() with full env get-or-create, file shipping, RPC poll loop, output post-processing * fix: use _get_env_config() instead of raw TERMINAL_ENV env var Read terminal backend type through the canonical config resolution path (terminal_tool._get_env_config) instead of os.getenv directly. * fix: use echo piping instead of stdin_data for base64 writes Modal doesn't reliably deliver stdin_data to chained commands (base64 -d > file && mv), producing 0-byte files. Switch to echo 'base64' | base64 -d which works on all backends. Verified E2E on both Docker and Modal.	2026-04-04 12:57:49 -07:00
Chris Bartholomew	28e1e210ee	fix(hindsight): overhaul hindsight memory plugin and memory setup wizard - Dedicated asyncio event loop for Hindsight async calls (fixes aiohttp session leaks) - Client caching (reuse instead of creating per-call) - Local mode daemon management with config change detection and auto-restart - Memory mode support (hybrid/context/tools) and prefetch method (recall/reflect) - Proper shutdown with event loop and client cleanup - Disable HindsightEmbedded.__del__ to avoid GC loop errors - Update API URLs (app -> ui.hindsight.vectorize.io, api_url -> base_url) - Setup wizard: conditional fields (when clause), dynamic defaults (default_from) - Switch dependency install from pip to uv (correct for uv-based venvs) - Add hindsight-all to plugin.yaml and import mapping - 12 new tests for dispatch routing and setup field filtering Original PR #5044 by cdbartholomew.	2026-04-04 12:18:46 -07:00
Teknium	93aa01c71c	fix: use main provider model for auxiliary tasks on non-aggregator providers (#5091) Users on direct API-key providers (Alibaba, DeepSeek, ZAI, etc.) without an OpenRouter or Nous key would get broken auxiliary tasks (compression, vision, etc.) because _resolve_auto() only tried aggregator providers first, then fell back to iterating PROVIDER_REGISTRY with wrong default model names. Now _resolve_auto() checks the user's main provider first. If it's not an aggregator (OpenRouter/Nous), it uses their main model directly for all auxiliary tasks. Aggregator users still get the cheap gemini-flash model as before. Adds _read_main_provider() to read model.provider from config.yaml, mirroring the existing _read_main_model(). Reported by SkyLinx, an Alibaba Coding Plan user getting 400 errors from google/gemini-3-flash-preview being sent to DashScope.	2026-04-04 12:07:43 -07:00
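
The argparse behavior described in commit 96e96a79ad can be reproduced in isolation. The following is a minimal sketch, not the repository's actual parser: the `build_parser` helper and its `suppress_in_subparser` switch are hypothetical, and only the `--yolo` flag and the `chat` subcommand are taken from the commit message.

```python
import argparse


def build_parser(suppress_in_subparser: bool) -> argparse.ArgumentParser:
    """Hypothetical stand-in for the CLI: --yolo is declared on both parsers."""
    parser = argparse.ArgumentParser(prog="hermes")
    parser.add_argument("--yolo", action="store_true", default=False)

    subparsers = parser.add_subparsers(dest="command")
    chat = subparsers.add_parser("chat")

    # With an explicit default, the subparser writes that default back onto
    # the namespace; with SUPPRESS it leaves the attribute alone unless the
    # flag was actually passed after the subcommand.
    chat_default = argparse.SUPPRESS if suppress_in_subparser else False
    chat.add_argument("--yolo", action="store_true", default=chat_default)
    return parser


# Flag placed before the subcommand, duplicated argument uses SUPPRESS.
flag_before = build_parser(True).parse_args(["--yolo", "chat"])

# Flag placed after the subcommand still works with SUPPRESS.
flag_after = build_parser(True).parse_args(["chat", "--yolo"])

# No flag at all: the parent parser's default is what remains.
no_flag = build_parser(True).parse_args(["chat"])

# Same invocation as flag_before but with default=False on the subparser;
# per the commit message this is the case where --yolo is silently dropped.
explicit_default = build_parser(False).parse_args(["--yolo", "chat"])

print(flag_before.yolo, flag_after.yolo, no_flag.yolo, explicit_default.yolo)
```

Keeping the real default only on the parent parser means there is a single source of truth for the unset value, which is why the sketch leaves `default=False` there and uses `SUPPRESS` only on the duplicate.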

3252 Commits