[EPIC-999/Phase I] The Mirror — formal spec extraction artifacts #107

Open

ezra wants to merge 278 commits from epic-999-phase-i into main

Author	SHA1	Message	Date
Ezra	c266661bff	[EPIC-999] Phase I — add call graph, test stubs, and AIAgent decomposition plan Some checks are pending Docker Build and Publish / build-and-push (pull_request) Waiting to run Details Docs Site Checks / docs-site-checks (pull_request) Waiting to run Details Nix / nix (macos-latest) (pull_request) Waiting to run Details Nix / nix (ubuntu-latest) (pull_request) Waiting to run Details Supply Chain Audit / Scan PR for supply chain risks (pull_request) Waiting to run Details Tests / test (pull_request) Waiting to run Details Tests / e2e (pull_request) Waiting to run Details - call_graph.json: 128 calls inside AIAgent.run_conversation identified - test_invariants_stubs.py: property-based contract tests for loop/registry/state/compressor - AIAgent_DECOMPOSITION.md: 5-class refactor plan for Phase II competing rewrites Authored-by: Ezra	2026-04-05 23:30:28 +00:00
Ezra	5f1cdfc9e4	[EPIC-999] Phase I — The Mirror: formal spec extraction artifacts Some checks are pending Docker Build and Publish / build-and-push (pull_request) Waiting to run Details Docs Site Checks / docs-site-checks (pull_request) Waiting to run Details Nix / nix (macos-latest) (pull_request) Waiting to run Details Nix / nix (ubuntu-latest) (pull_request) Waiting to run Details Supply Chain Audit / Scan PR for supply chain risks (pull_request) Waiting to run Details Tests / test (pull_request) Waiting to run Details Tests / e2e (pull_request) Waiting to run Details - module_inventory.json: 679 Python files, 298k lines, 232k SLOC - core_analysis.json: deep AST parse of 9 core modules - SPEC.md: high-level architecture, module specs, coupling risks, Phase II prep Authored-by: Ezra <ezra@hermes.vps>	2026-04-05 23:27:29 +00:00
Teknium	77a2aad771	docs: fix stale references across 8 doc pages Audit found 24+ discrepancies between docs and code. Fixed: HIGH severity: - Remove honcho toolset from tools-reference, toolsets-reference, and tools.md (converted to memory provider plugin, not a built-in toolset) - Add note that Honcho is available via plugin MEDIUM severity: - Add hermes memory command family to cli-commands.md (setup/status/off) - Add --clone-all, --clone-from to profile create in cli-commands.md - Add --max-turns option to hermes chat in cli-commands.md - Add /btw slash command to slash-commands.md - Fix profile show example output (remove nonexistent disk usage, add .env and SOUL.md status lines) - Add missing hermes-webhook toolset to toolsets-reference.md - Add 5 missing providers to fallback-providers.md table - Add 7 missing providers to providers.md fallback list - Fix outdated model examples: glm-4-plus→glm-5, moonshot-v1-auto→kimi-for-coding	2026-04-03 23:30:29 -07:00
Teknium	43d3efd5c8	feat: add docker_env config for explicit container environment variables (#4738 ) Add docker_env option to terminal config — a dict of key-value pairs that get set inside Docker containers via -e flags at both container creation (docker run) and per-command execution (docker exec) time. This complements docker_forward_env (which reads values dynamically from the host process environment). docker_env is useful when Hermes runs as a systemd service without access to the user's shell environment — e.g. setting SSH_AUTH_SOCK or GNUPGHOME to known stable paths for SSH/GPG agent socket forwarding. Precedence: docker_env provides baseline values; docker_forward_env overrides for the same key. Config example: terminal: docker_env: SSH_AUTH_SOCK: /run/user/1000/ssh-agent.sock GNUPGHOME: /root/.gnupg docker_volumes: - /run/user/1000/ssh-agent.sock:/run/user/1000/ssh-agent.sock - /run/user/1000/gnupg/S.gpg-agent:/root/.gnupg/S.gpg-agent	2026-04-03 23:30:12 -07:00
Stefan Vandermeulen	78ec8b017f	style: add debug log for write-back failure in retry path Address review feedback: replace bare `except: pass` with a debug log when the post-retry write-back to ~/.claude/.credentials.json fails. The write-back is best-effort (token is already resolved), but logging helps troubleshooting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 23:26:08 -07:00
Stefan Vandermeulen	a70ee1b898	fix: sync OAuth tokens between credential pool and credentials file OAuth refresh tokens are single-use. When multiple consumers share the same Anthropic OAuth session (credential pool entries, Claude Code CLI, multiple Hermes profiles), whichever refreshes first invalidates the refresh token for all others. This causes a cascade: 1. Pool entry tries to refresh with a consumed refresh token → 400 2. Pool marks the credential as "exhausted" with a 24-hour cooldown 3. All subsequent heartbeats skip the credential entirely 4. The fallback to resolve_anthropic_token() only works while the access token in ~/.claude/.credentials.json hasn't expired 5. Once it expires, nothing can auto-recover without manual re-login Fix: - Add _sync_anthropic_entry_from_credentials_file() to detect when ~/.claude/.credentials.json has a newer refresh token and sync it into the pool entry, clearing exhaustion status - After a successful pool refresh, write the new tokens back to ~/.claude/.credentials.json so other consumers stay in sync - On refresh failure, check if the credentials file has a different (newer) refresh token and retry once before marking exhausted - In _available_entries(), sync exhausted claude_code entries from the credentials file before applying the 24-hour cooldown, so a manual re-login or external refresh immediately unblocks agents Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 23:26:08 -07:00
Teknium	b93fa234df	fix: clear ghost status-bar lines on terminal resize (#4960 ) * feat: add /branch (/fork) command for session branching Inspired by Claude Code's /branch command. Creates a copy of the current session's conversation history in a new session, allowing the user to explore a different approach without losing the original. Works like 'git checkout -b' for conversations: - /branch — auto-generates a title from the parent session - /branch my-idea — uses a custom title - /fork — alias for /branch Implementation: - CLI: _handle_branch_command() in cli.py - Gateway: _handle_branch_command() in gateway/run.py - CommandDef with 'fork' alias in commands.py - Uses existing parent_session_id field in session DB - Uses get_next_title_in_lineage() for auto-numbered branches - 14 tests covering session creation, history copy, parent links, title generation, edge cases, and agent sync * fix: clear ghost status-bar lines on terminal resize When the terminal shrinks (e.g. un-maximize), the emulator reflows previously full-width rows (status bar, input rules) into multiple narrower rows. prompt_toolkit's _on_resize only cursor_up()s by the stored layout height, missing the extra rows from reflow — leaving ghost duplicates of the status bar visible. Fix: monkey-patch Application._on_resize to detect width shrinks, calculate the extra rows created by reflow, and inflate the renderer's cursor_pos.y so the erase moves up far enough to clear ghosts.	2026-04-03 22:43:45 -07:00
Octopus	f5c212f69b	feat: add MiniMax TTS provider support (speech-2.8) Add MiniMax as a fifth TTS provider alongside Edge TTS, ElevenLabs, OpenAI, and NeuTTS. Supports speech-2.8-hd (recommended default) and speech-2.8-turbo models via the MiniMax T2A HTTP API. Changes: - Add _generate_minimax_tts() with hex-encoded audio decoding - Add MiniMax to provider dispatch, requirements check, and Telegram Opus compatibility handling - Add MiniMax to interactive setup wizard with API key prompt - Update TTS documentation and config example Configuration: tts: provider: "minimax" minimax: model: "speech-2.8-hd" voice_id: "English_Graceful_Lady" Requires MINIMAX_API_KEY environment variable. API reference: https://platform.minimax.io/docs/api-reference/speech-t2a-http	2026-04-03 22:42:14 -07:00
acsezen	831067c5d3	perf: fix O(n²) catastrophic backtracking in redact regex + reorder file read guard Two pre-existing issues causing test_file_read_guards timeouts on CI: 1. agent/redact.py: _ENV_ASSIGN_RE used unbounded [A-Z_]* with IGNORECASE, matching any letter/underscore to end-of-string at each position → O(n²) backtracking on 100K+ char inputs. Bounded to {0,50} since env var names are never that long. 2. tools/file_tools.py: redact_sensitive_text() ran BEFORE the character-count guard, so oversized content (that would be rejected anyway) went through the expensive regex first. Reordered to check size limit before redaction.	2026-04-03 22:40:37 -07:00
Teknium	1c0c5d957f	fix(gateway): support infinite timeout + periodic notifications + actionable error (#4959 ) - HERMES_AGENT_TIMEOUT=0 now means no limit (infinite execution) - Periodic 'still working' notifications every 10 minutes for long tasks - Timeout error message now tells users how to increase the limit - Stale-lock eviction handles infinite timeout correctly (float inf TTL)	2026-04-03 22:37:38 -07:00
Teknium	34308e4de9	docs: improve youtube-content skill structure and workflow Clearer workflow with validation/chunking steps, expanded description with trigger terms for better agent matching, tightened error handling. Fixed stray pipe character in original PR diff. Based on PR #4778 by fernandezbaptiste. Co-authored-by: fernandezbaptiste <fernandezbaptiste@users.noreply.github.com>	2026-04-03 22:18:00 -07:00
Teknium	ad4feeaf0d	feat: wire skills.external_dirs into all remaining discovery paths The config key skills.external_dirs and core resolution (get_all_skills_dirs, get_external_skills_dirs in agent/skill_utils.py) already existed but several code paths still only scanned SKILLS_DIR. Now external dirs are respected everywhere: - skills_categories(): scan all dirs for category discovery - _get_category_from_path(): resolve categories against any skills root - skill_manager_tool._find_skill(): search all dirs for edit/patch/delete - credential_files.get_skills_directory_mount(): mount all dirs into Docker/Singularity containers (external dirs at external_skills/<idx>) - credential_files.iter_skills_files(): list files from all dirs for Modal/Daytona upload - tools/environments/ssh.py: rsync all skill dirs to remote hosts - gateway _check_unavailable_skill(): check disabled skills across all dirs Usage in config.yaml: skills: external_dirs: - ~/repos/agent-skills/hermes - /shared/team-skills	2026-04-03 21:14:42 -07:00
Teknium	5a98ce5973	fix: use clean user message for all memory provider operations (#4940 ) When a skill is active, user_message contains the full SKILL.md content injected by the skill system. This bloated string was being passed to memory provider sync_all(), queue_prefetch_all(), and prefetch_all(), causing providers with query size limits (e.g. Honcho's 10K char limit) to fail. Both call sites now use original_user_message (the clean user input, already defined at line 6516) instead of the skill-inflated user_message: - Pre-turn prefetch (line ~6695): prefetch_all() query - Post-turn sync (line ~8672): sync_all() + queue_prefetch_all() Fixes #4889	2026-04-03 20:43:01 -07:00
Teknium	585a3b40ad	fix: use 'is not None and != ""' instead of truthiness for mem0.json merge The original filter (if v) silently drops False and 0, so 'rerank: false' in mem0.json would be ignored. Use explicit None/empty-string check to preserve intentional falsy values.	2026-04-03 20:42:48 -07:00
Livia Ellen	5e3303b3d8	fix(mem0): merge env vars with mem0.json instead of either/or When mem0.json exists but is missing the api_key (e.g. after running `hermes memory setup`), the plugin reports "not available" even though MEM0_API_KEY is set in .env. This happens because _load_config() returns the JSON file contents verbatim, never falling back to env vars. Use env vars as the base config and let mem0.json override individual keys on top, so both config sources work together. Fixes: mem0 plugin shows "not available" despite valid MEM0_API_KEY in .env	2026-04-03 20:42:48 -07:00
Mibayy	14e87325df	fix(openviking): send tenant-scoping headers on every request (#4825 ) OpenViking is multi-tenant and requires X-OpenViking-Account and X-OpenViking-User headers. Without them, API calls like POST /api/v1/search/find fail on authenticated servers. Add both headers to _VikingClient._headers(), read from env vars OPENVIKING_ACCOUNT (default: root) and OPENVIKING_USER (default: default). All instantiation sites inherit the fix automatically. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 20:32:55 -07:00
Teknium	f1c0847145	fix(gateway): restore short preview truncation for all/new tool progress modes (#4935 ) The tool_preview_length: 0 (unlimited) config change from `e314833c` removed truncation from gateway progress messages in all/new modes. This caused full terminal commands, code blocks, and file paths to appear as permanent messages in Telegram -- the old 40-char truncation was the correct behavior for messaging platforms. Now: - all/new modes: always truncate previews to 40 chars (old behavior) - verbose mode: respects tool_preview_length config for JSON args cap Reported by Paulclgro and socialsurfer on Discord.	2026-04-03 20:32:01 -07:00
Teknium	8af6a08695	fix: don't treat bare file paths as slash commands Input like /Users/ironin/file.md:45-46 was routed to process_command() because it starts with /. Added _looks_like_slash_command() which checks whether the first word contains additional / characters — commands never do (/help, /model), paths always do (/Users/foo/bar.md). Applied to both process_loop routing and handle_enter interrupt bypass. Preserves prefix matching (/h → /help) since short prefixes still pass the check. Based on PR #4782 by iRonin. Co-authored-by: iRonin <iRonin@users.noreply.github.com>	2026-04-03 20:16:04 -07:00
Teknium	fb68c22340	fix(gateway): bypass active-session guard for /approve and /deny commands (#4926 ) The base adapter's active-session guard queues all messages when an agent is running. This creates a deadlock for /approve and /deny: the agent thread is blocked on threading.Event.wait() in tools/approval.py waiting for resolve_gateway_approval(), but the /approve command is queued waiting for the agent to finish. Dispatch /approve and /deny directly to the message handler (which routes to gateway/run.py's _handle_approve_command) without going through _process_message_background — avoids spawning a competing background task that would mess with session lifecycle/guards. Fixes #4898 Co-authored-by: mechovation (original diagnosis in PR #4904)	2026-04-03 20:08:37 -07:00
memosr	287ac15efd	fix(gateway): write update-pending state atomically to prevent corruption	2026-04-03 18:57:38 -07:00
Teknium	cee761ee4a	fix: prevent duplicate messages — gateway dedup + partial stream guard (#4878 ) * fix(gateway): add message deduplication to Discord and Slack adapters (#4777) Discord RESUME replays events after reconnects (~7/day observed), and Slack Socket Mode can redeliver events if the ack was lost. Neither adapter tracked which messages were already processed, causing duplicate bot responses. Add _seen_messages dedup cache (message ID → timestamp) with 5-min TTL and 2000-entry cap to both adapters, matching the pattern already used by Mattermost, Matrix, WeCom, Feishu, DingTalk, and Email. The check goes at the very top of the message handler, before any other logic, so replayed events are silently dropped. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: prevent duplicate messages on partial stream delivery When streaming fails after tokens are already delivered to the platform, _interruptible_streaming_api_call re-raised the error into the outer retry loop, which would make a new API call — creating a duplicate message. Now checks deltas_were_sent before re-raising: if partial content was already streamed, returns a stub response instead. The outer loop treats the turn as complete (no retry, no fallback, no duplicate). Inspired by PR #4871 (@trevorgordon981) which identified the bug. This implementation avoids monkey-patching exception objects and keeps the fix within the streaming call boundary. --------- Co-authored-by: Mibayy <mibayy@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 18:53:52 -07:00
Teknium	36aace34aa	fix(opencode-go): strip trailing /v1 from base URL for Anthropic models (#4918 ) The Anthropic SDK appends /v1/messages to the base_url, so OpenCode's base URL https://opencode.ai/zen/go/v1 produced a double /v1 path (https://opencode.ai/zen/go/v1/v1/messages), causing 404s for MiniMax models. Strip trailing /v1 when api_mode is anthropic_messages. Also adds MiMo-V2-Pro, MiMo-V2-Omni, and MiniMax-M2.5 to the OpenCode Go model lists per their updated docs. Fixes #4890	2026-04-03 18:47:51 -07:00
Teknium	d4bf517b19	test+docs: add group_topics tests and documentation - 7 new tests covering skill binding, fallthrough, coercion - Docs section in telegram.md with config format, field reference, comparison table, and thread_id discovery tip	2026-04-03 18:20:50 -07:00
Dolf	1cae9ac628	feat(telegram): add group_topics skill binding for supergroup forum topics Reads config.extra['group_topics'] to bind skills to specific thread_ids in supergroup/forum chats. Mirrors the dm_topics skill injection pattern but for group chat_type. Enables per-topic skill auto-loading in Falcon HQ. Config format: platforms.telegram.extra.group_topics: - chat_id: -1003853746818 topics: - name: FalconConnect thread_id: 5 skill: falconconnect-architecture	2026-04-03 18:20:50 -07:00
Teknium	fb654c15d8	fix: add type hints to session key helpers, extend context-local key to terminal_tool - Add contextvars.Token[str] type hints to set/reset_current_session_key - Use get_current_session_key(default='') in terminal_tool.py for background process session tracking, fixing the same env var race for concurrent gateway sessions spawning background processes	2026-04-03 17:50:01 -07:00
Tranquil-Flow	3bfb39a25f	fix(gateway): isolate approval session key per turn	2026-04-03 17:50:01 -07:00
kshitijk4poor	5359921199	refactor: simplify scope validation helpers in google workspace scripts Fix double file read bug in google_api.py _missing_scopes(), consolidate redundant _normalize_scope_values into callers, merge duplicate except blocks.	2026-04-03 17:49:18 -07:00
kshitijk4poor	37e2ef6c3f	fix: protect profile-scoped google workspace oauth tokens	2026-04-03 17:49:18 -07:00
Teknium	92dcdbff66	fix: clarify interrupt re-queue label, document busy_input_mode behaviour The '📨 Queued:' label was misleading — it looked like the message was silently deferred when it was actually being sent immediately after the interrupt. Changed to '⚡ Sending after interrupt:' with multi-message count when the user typed several messages during agent execution. Added comment documenting that this code path only applies when busy_input_mode == 'interrupt' (the default). Based on PR #4821 by iRonin. Co-authored-by: iRonin <iRonin@users.noreply.github.com>	2026-04-03 15:00:05 -07:00
Teknium	3f2180037c	fix: also filter session_meta in /session switch restore path The original PR missed the third CLI restore path — the /session switch command that loads history via get_messages_as_conversation() without stripping session_meta entries.	2026-04-03 14:57:33 -07:00
kagura-agent	6bf5946bbe	fix: filter transcript-only roles from chat-completions payload (#4715 ) Add a provider-agnostic role allowlist guard to _sanitize_api_messages() that drops messages with roles not accepted by the chat-completions API (e.g. session_meta). This prevents CLI resume/session restore from leaking transcript-only metadata into the outgoing messages payload. Two layers of defense: 1. API-boundary guard: _sanitize_api_messages() now filters messages by role allowlist (system/user/assistant/tool/function/developer) before the existing orphaned tool-call repair logic. This protects all current and future call paths. 2. CLI restore defense-in-depth: Both session restore paths in cli.py now strip session_meta entries before loading history into conversation_history, matching the existing gateway behavior. Closes #4715	2026-04-03 14:57:33 -07:00
Hermes Agent	bef895b371	fix(memory): preserve holographic prompt and trust score rendering	2026-04-03 14:22:22 -07:00
Teknium	84a875ca02	fix: scope gateway stop/restart to current profile, --all for global kill gateway stop and restart previously called kill_gateway_processes() which scans ps aux and kills ALL gateway processes across all profiles. Starting a profile gateway would nuke the main one (and vice versa). Now: - hermes gateway stop → only kills the current profile's gateway (PID file) - hermes -p work gateway stop → only kills the 'work' profile's gateway - hermes gateway stop --all → kills every gateway process (old behavior) - hermes gateway restart → profile-scoped for manual fallback path - hermes update → discovers and restarts ALL profile gateways (systemctl list-units hermes-gateway*) since the code update is shared Added stop_profile_gateway() which uses the HERMES_HOME-scoped PID file instead of global process scanning.	2026-04-03 14:21:44 -07:00
Teknium	52ddd6bc64	refactor(skills): consolidate code verification skills into one (#4854 ) * chore: release v0.7.0 (2026.4.3) 168 merged PRs, 223 commits, 46 resolved issues, 40+ contributors. Highlights: pluggable memory providers, credential pools, Camofox browser, inline diff previews, API server session continuity, ACP MCP registration, gateway hardening, secret exfiltration blocking. * refactor(skills): consolidate code-review + verify-code-changes into requesting-code-review Merge the passive code-review checklist and the automated verification pipeline (from PR #4459 by @MorAlekss) into a single requesting-code-review skill. This eliminates model confusion between three overlapping skills. Now includes: - Static security scan (grep on diff lines) - Baseline-aware quality gates (only flag NEW failures) - Multi-language tool detection (Python, Node, Rust, Go) - Independent reviewer subagent with fail-closed JSON verdict - Auto-fix loop with separate fixer agent (max 2 attempts) - Git checkpoint and [verified] commit convention Deletes: skills/software-development/code-review/ (absorbed) Closes: #406 (independent code verification)	2026-04-03 14:13:27 -07:00
Teknium	7def061fee	feat: add arcee-ai/trinity-large-thinking to recommended models Added to OPENROUTER_MODELS and _PROVIDER_MODELS['nous'] lists. Also added 'trinity' family entry to DEFAULT_CONTEXT_LENGTHS (262K).	2026-04-03 13:45:29 -07:00
CK iRonin.IT	de5aacddd2	fix: normalise \r\n and \r line endings in pasted text Windows (CRLF) and old Mac (CR) line endings are normalised to LF before the 5-line collapse threshold is checked in handle_paste. Without this, markdown copied from Windows sources contains \r\n but the line counter (pasted_text.count('\n')) still works — however buf.insert_text() leaves bare \r characters in the buffer which some terminals render by moving the cursor to the start of the line, making multi-line pastes appear as a single overwritten line.	2026-04-03 13:20:50 -07:00
Teknium	b1756084a3	feat: add .zip document support and auto-mount cache dirs into remote backends (#4846 ) - Add .zip to SUPPORTED_DOCUMENT_TYPES so gateway platforms (Telegram, Slack, Discord) cache uploaded zip files instead of rejecting them. - Add get_cache_directory_mounts() and iter_cache_files() to credential_files.py for host-side cache directory passthrough (documents, images, audio, screenshots). - Docker: bind-mount cache dirs read-only alongside credentials/skills. Changes are live (bind mount semantics). - Modal: mount cache files at sandbox creation + resync before each command via _sync_files() with mtime+size change detection. - Handles backward-compat with legacy dir names (document_cache, image_cache, audio_cache, browser_screenshots) via get_hermes_dir(). - Container paths always use the new cache/<subdir> layout regardless of host layout. This replaces the need for a dedicated extract_archive tool (PR #4819) — the agent can now use standard terminal commands (unzip, tar) on uploaded files inside remote containers. Closes: related to PR #4819 by kshitijk4poor	2026-04-03 13:16:26 -07:00
Teknium	8a384628a5	fix(memory): profile-scoped memory isolation and clone support (#4845 ) Three fixes for memory+profile isolation bugs: 1. memory_tool.py: Replace module-level MEMORY_DIR constant with get_memory_dir() function that calls get_hermes_home() dynamically. The old constant was cached at import time and could go stale if HERMES_HOME changed after import. Internal MemoryStore methods now call get_memory_dir() directly. MEMORY_DIR kept as backward-compat alias. 2. profiles.py: profile create --clone now copies MEMORY.md and USER.md from the source profile. These curated memory files are part of the agent's identity (same as SOUL.md) and should carry over on clone. 3. holographic plugin: initialize() now expands $HERMES_HOME and ${HERMES_HOME} in the db_path config value, so users can write 'db_path: $HERMES_HOME/memory_store.db' and it resolves to the active profile directory, not the default home. Tests updated to mock get_memory_dir() alongside the legacy MEMORY_DIR.	2026-04-03 13:10:11 -07:00
Teknium	4979d77a4a	fix: complete browser_tool profile isolation — replace remaining 3 hardcoded HERMES_HOME instances The original PR fixed 4 of 7 instances. This fixes the remaining 3: - _launch_local_browser() PATH setup (line 908) - _start_recording() config read (line 1545) - _cleanup_old_recordings() path (line 1834)	2026-04-03 13:09:54 -07:00
Dusk1e	a09fa690f0	fix: resolve critical stability issues in core, web, and browser tools	2026-04-03 13:09:54 -07:00
Teknium	6d357bb185	fix: regenerate uv.lock to sync with pyproject.toml v0.7.0 (#4842 ) uv.lock was stale at v0.5.0 and missing exa-py (core dep), causing ModuleNotFoundError for Nix flake builds. Also syncs faster-whisper placement (core → voice extra), adds feishu/debugpy/lark-oapi extras. Fixes #4648 Credit to @lvnilesh for identifying the issue in PR #4649.	2026-04-03 12:53:45 -07:00
Dat Pham	b3319b1252	fix(memory): Fix ByteRover plugin - run brv query synchronously before LLM call The pipeline prefetch design was firing \`brv query\` in a background thread after each response, meaning the context injected at turn N was from turn N-1's message — and the first turn got no BRV context at all. Replace the async prefetch pipeline with a synchronous query in \`prefetch()\` so recall runs before the first API call on every turn. Make \`queue_prefetch()\` a no-op and remove the now-unused pipeline state.	2026-04-03 12:11:29 -07:00
Teknium	abf1e98f62	chore: release v0.7.0 (2026.4.3) (#4812 ) 168 merged PRs, 223 commits, 46 resolved issues, 40+ contributors. Highlights: pluggable memory providers, credential pools, Camofox browser, inline diff previews, API server session continuity, ACP MCP registration, gateway hardening, secret exfiltration blocking.	2026-04-03 11:14:55 -07:00
Teknium	e492420df4	fix: route memory provider tools in sequential execution path (#4803 ) Memory provider tools (hindsight_retain, honcho_search, etc.) were advertised to the model via tool schemas but failed with 'Unknown tool' at execution time. The concurrent path (_invoke_tool) correctly checks self._memory_manager.has_tool() before falling through to the registry, but the sequential path (_execute_tool_calls_sequential) was never updated with this check. Since sequential is the default for single tool calls, memory provider tools always hit the registry dispatcher which returns 'Unknown tool' because they're not registered there. Add the memory_manager dispatch check between the delegate_task handler and the quiet_mode fallthrough in the sequential path, with proper spinner/display handling to match the existing pattern. Reported by KiBenderOP — all memory providers affected (Honcho, Hindsight, Holographic, etc.).	2026-04-03 10:31:53 -07:00
Teknium	67e3620c5c	fix: persist API server sessions to shared SessionDB (state.db) (#4802 ) The API server adapter created AIAgent instances without passing session_db, so conversations via Open WebUI and other OpenAI-compatible frontends were never persisted to state.db. This meant 'hermes sessions list' showed no API server sessions — they were effectively stateless. Changes: - Add _ensure_session_db() helper for lazy SessionDB initialization - Pass session_db=self._ensure_session_db() in _create_agent() - Refactor existing X-Hermes-Session-Id handler to use the shared helper Sessions now persist with source='api_server' and are visible alongside CLI and gateway sessions in hermes sessions list/search.	2026-04-03 10:31:11 -07:00
Teknium	aecbf7fa4a	fix(discord): register /approve and /deny slash commands, wire up button-based approval UI (#4800 ) Two fixes for Discord exec approval: 1. Register /approve and /deny as native Discord slash commands so they appear in Discord's command picker (autocomplete). Previously they were only handled as text commands, so users saw 'no commands found' when typing /approve. 2. Wire up the existing ExecApprovalView button UI (was dead code): - ExecApprovalView now calls resolve_gateway_approval() to actually unblock the waiting agent thread when a button is clicked - Gateway's _approval_notify_sync() detects adapters with send_exec_approval() and routes through the button UI - Added 'Allow Session' button for parity with /approve session - send_exec_approval() now accepts session_key and metadata for thread support - Graceful fallback to text-based /approve prompt if button send fails Also updates test mocks to include grey/secondary ButtonStyle and purple Color (used by new button styles).	2026-04-03 10:24:07 -07:00
Teknium	5db630aae4	fix: respect per-platform disabled skills in Telegram menu and gateway dispatch (#4799 ) Three interconnected bugs caused `hermes skills config` per-platform settings to be silently ignored: 1. telegram_menu_commands() never filtered disabled skills — all skills consumed menu slots regardless of platform config, hitting Telegram's 100 command cap. Now loads disabled skills for 'telegram' and excludes them from the menu. 2. Gateway skill dispatch executed disabled skills because get_skill_commands() (process-global cache) only filters by the global disabled list at scan time. Added per-platform check before execution, returning an actionable 'skill is disabled' message. 3. get_disabled_skill_names() only checked HERMES_PLATFORM env var, but the gateway sets HERMES_SESSION_PLATFORM instead. Added HERMES_SESSION_PLATFORM as fallback, plus an explicit platform= parameter for callers that know their platform (menu builder, gateway dispatch). Also added platform to prompt_builder's skills cache key so multi-platform gateways get correct per-platform skill prompts. Reported by SteveSkedasticity (CLAW community).	2026-04-03 10:10:53 -07:00
Teknium	b6f9b70afd	fix(gateway): route /approve and /deny through running-agent guard (#4798 ) When the agent is blocked on a dangerous command approval (threading.Event wait inside tools/approval.py), incoming /approve and /deny commands were falling through to the generic interrupt path instead of being dispatched to their command handlers. The interrupt sets _interrupt_requested on the agent, but the agent thread is blocked on event.wait() — not checking the flag. Result: approval times out after 300s (5 minutes) before executing. Fix: intercept /approve and /deny in the running-agent early-intercept block (alongside /stop, /new, /queue) and route directly to _handle_approve_command / _handle_deny_command.	2026-04-03 09:59:52 -07:00
Teknium	93334b2b92	docs: add community FAQ entries — multi-model workflows, WhatsApp binding, verbose control, skills config, thread sessions, migration, install troubleshooting (#4797 ) Addresses common questions from the Nous Research community Discord: - Multi-model workflows via delegation config - WhatsApp per-chat binding limitations and workarounds - Controlling tool progress display on Telegram - Per-platform skills config and Telegram 100-command limit - Shared thread sessions across multiple users - Exporting/migrating Hermes to a new machine - Permission denied on shell reload after install - HTTP 400 on first agent run	2026-04-03 09:58:22 -07:00
Teknium	d50e5be500	fix: handle None mcp_servers in _get_platform_tools() When config.yaml has 'mcp_servers:' with no value, YAML parses it as None. dict.get('mcp_servers', {}) only returns the default when the key is absent, not when it's explicitly None. Use 'or {}' pattern to handle both cases, matching the other two assignment sites in the same file.	2026-04-03 09:08:20 -07:00
Teknium	cc54818d26	fix(mcp): stability fix pack — reload timeout, shutdown cleanup, event loop handler, OAuth non-blocking (#4757 ) Four fixes for MCP server stability issues reported by community member (terminal lockup, zombie processes, escape sequence pollution, startup hang): 1. MCP reload timeout guard (cli.py): _check_config_mcp_changes now runs _reload_mcp in a separate daemon thread with a 30s hard timeout. Previously, a hung MCP server could block the process_loop thread indefinitely, freezing the entire TUI (user can type but nothing happens, only Ctrl+D/Ctrl+\ work). 2. MCP stdio subprocess PID tracking (mcp_tool.py): Tracks child PIDs spawned by stdio_client via before/after snapshots of /proc children. On shutdown, _stop_mcp_loop force-kills any tracked PIDs that survived the SDK's graceful SIGTERM→SIGKILL cleanup. Prevents zombie MCP server processes from accumulating across sessions. 3. MCP event loop exception handler (mcp_tool.py): Installs _mcp_loop_exception_handler on the MCP background event loop — same pattern as the existing _suppress_closed_loop_errors on prompt_toolkit's loop. Suppresses benign 'Event loop is closed' RuntimeError from httpx transport __del__ during MCP shutdown. Salvaged from PR #2538 (acsezen). 4. MCP OAuth non-blocking (mcp_oauth.py): Replaces blocking input() call in _wait_for_callback with OAuthNonInteractiveError raise. Adds _is_interactive() TTY detection. In non-interactive environments, build_oauth_auth() still returns a provider (cached tokens + refresh work), but the callback handler raises immediately instead of blocking the MCP event loop for 120s. Re-raises OAuth setup failures in _run_http so failed servers are reported cleanly without blocking others. Salvaged from PRs #4521 (voidborne-d) and #4465 (heathley). Closes #2537, closes #4462 Related: #4128, #3436	2026-04-03 02:29:20 -07:00
Teknium	f374ae4c61	fix: prevent compression death spiral from API disconnects (#2153 ) (#4750 ) Three fixes for long-running gateway sessions that enter a death spiral when API disconnects prevent token data collection, which prevents compression, which causes more disconnects: Layer 1 — Stale token counter fallback (run_agent.py in-loop): When last_prompt_tokens is 0 (stale after API disconnect or provider returned no usage data), fall back to estimate_messages_tokens_rough() instead of passing 0 to should_compress(), which would never fire. Layer 2 — Server disconnect heuristic (run_agent.py error handler): When ReadError/RemoteProtocolError hits a large session (>60% context or >200 messages), treat it as a context-length error and trigger compression rather than burning through retries that all fail the same way. Layer 3 — Hard message count limit (gateway/run.py hygiene): Force compression when a session exceeds 400 messages, regardless of token estimates. This catches runaway growth even when all token-based checks fail due to missing API data. Based on the analysis from PR #2157 by ygd58 — the gateway threshold direction fix (1.4x multiplier) was already resolved on main.	2026-04-03 02:16:46 -07:00
Teknium	8fd9fafc84	fix: handle Anthropic Sonnet long-context tier 429 by reducing to 200k (#4747 ) Anthropic returns HTTP 429 'Extra usage is required for long context requests' when a Claude Max subscription doesn't include the 1M context tier. This is NOT a transient rate limit — retrying won't help. Only applies to Sonnet models (Opus 1M is general access). Detects this specific error before the generic rate-limit handler and: 1. Reduces context_length from 1M to 200k (the standard tier) 2. Triggers context compression to fit 3. Retries with the reduced context The reduction is session-scoped (not persisted) so it auto-recovers if the user later enables extra usage on their subscription. Fixes: Sonnet 4.6 instant rate limits on Claude Max without extra usage	2026-04-03 02:05:02 -07:00
Teknium	26d6083624	fix: correct qwen3.6-plus model slug Renamed qwen/qwen3.6-plus-preview:free to qwen/qwen3.6-plus:free in both OPENROUTER_MODELS and _PROVIDER_MODELS['nous'] lists.	2026-04-03 01:56:43 -07:00
Teknium	470c3ea51a	fix: handle Anthropic long-context tier 429 by reducing to 200k Anthropic returns HTTP 429 'Extra usage is required for long context requests' when a Claude Max subscription doesn't include the 1M context tier. This is NOT a transient rate limit — retrying won't help. Detect this specific error before the generic rate-limit handler and: 1. Reduce context_length from 1M to 200k (the standard tier) 2. Trigger context compression to fit 3. Retry with the reduced context The reduction is session-scoped (not persisted) so it auto-recovers if the user later enables extra usage on their subscription. Fixes: Sonnet 4.6 instant rate limits on Claude Max without extra usage	2026-04-03 01:56:43 -07:00
NexVeridian	388241f798	docs(acp): fix zed config	2026-04-03 01:46:45 -07:00
Teknium	67ae7a79df	fix: use get_hermes_home(), consolidate git_cmd, update tests Follow-up for salvaged PR #2352: - Replace hardcoded Path(os.getenv('HERMES_HOME', ...)) with get_hermes_home() from hermes_constants (2 places) - Consolidate redundant git_cmd_base into the existing git_cmd variable, constructed once before fork detection - Update autostash tests for the unmerged index check added in the previous commit	2026-04-03 01:46:42 -07:00
Franci Penov	6b0022bb7b	Add fork detection and upstream sync to hermes update - Detect if origin points to a fork (not NousResearch/hermes-agent) - Show warning when updating from a fork: origin URL - After pulling from origin/main on a fork: - Prompt to add upstream remote if not present - Respect ~/.hermes/.skip_upstream_prompt to avoid repeated prompts - Compare origin/main with upstream/main - If origin has commits not on upstream, skip (don't trample user's work) - If upstream is ahead, pull from upstream and try to sync fork - Use --force-with-lease for safe fork syncing Non-main branches are unaffected - they just pull from origin/{branch}. Co-authored-by: Avery <avery@hermes-agent.ai>	2026-04-03 01:46:42 -07:00
Teknium	0109547fa2	fix(update): handle conflicted git index during hermes update (#4735 ) * fix(gateway): race condition, photo media loss, and flood control in Telegram Three bugs causing intermittent silent drops, partial responses, and flood control delays on the Telegram platform: 1. Race condition in handle_message() — _active_sessions was set inside the background task, not before create_task(). Two rapid messages could both pass the guard and spawn duplicate processing tasks. Fix: set _active_sessions synchronously before spawning the task (grammY sequentialize / aiogram EventIsolation pattern). 2. Photo media loss on dequeue — when a photo (no caption) was queued during active processing and later dequeued, only .text was extracted. Empty text → message silently dropped. Fix: _build_media_placeholder() creates text context for media-only events so they survive the dequeue path. 3. Progress message edits triggered Telegram flood control — rapid tool calls edited the progress message every 0.3s, hitting Telegram's rate limit (23s+ waits). This blocked progress updates and could cause stream consumer timeouts. Fix: throttle edits to 1.5s minimum interval, detect flood control errors and gracefully degrade to new messages. edit_message() now returns failure for flood waits >5s instead of blocking. * fix(gateway): downgrade empty/None response log from WARNING to DEBUG This warning fires on every successful streamed response (streaming delivers the text, handler returns None via already_sent=True) and on every queued message during active processing. Both are expected behavior, not error conditions. Downgrade to DEBUG to reduce log noise. * fix(gateway): prevent stuck sessions with agent timeout and staleness eviction Three changes to prevent sessions from getting permanently locked: 1. Agent execution timeout (HERMES_AGENT_TIMEOUT, default 10min): Wraps run_in_executor with asyncio.wait_for so a hung API call or runaway tool can't lock a session indefinitely. On timeout, the agent is interrupted and the user gets an actionable error message. 2. Staleness eviction for _running_agents: Tracks start timestamps for each session entry. When a new message arrives and the entry is older than timeout + 1min grace, it's evicted as a leaked lock. Safety net for any cleanup path that fails to remove the entry. 3. Cron job timeout (HERMES_CRON_TIMEOUT, default 10min): Wraps run_conversation in a ThreadPoolExecutor with timeout so a hung cron job doesn't block the ticker thread (and all subsequent cron jobs) indefinitely. Follows grammY runner's per-update timeout pattern and aiogram's asyncio.wait_for approach for handler deadlines. * fix(gateway): STT config resolution, stream consumer flood control fallback Three targeted fixes from user-reported issues: 1. STT config resolution (transcription_tools.py): _has_openai_audio_backend() and _resolve_openai_audio_client_config() now check stt.openai.api_key/base_url in config.yaml FIRST, before falling back to env vars. Fixes voice transcription breaking when using a custom OpenAI-compatible endpoint via config.yaml. 2. Stream consumer flood control fallback (stream_consumer.py): When an edit fails mid-stream (e.g., Telegram flood control returns failure for waits >5s), reset _already_sent to False so the normal final send path delivers the complete response. Previously, a truncated partial was left as the final message. 3. Telegram edit_message comment alignment (telegram.py): Clarify that long flood waits return failure so streaming can fall back to a normal final send. * refactor: simplify and harden PR fixes after review - Fix cron ThreadPoolExecutor blocking on timeout: use shutdown(wait=False, cancel_futures=True) instead of context manager that waits indefinitely - Extract _dequeue_pending_text() to deduplicate media-placeholder logic in interrupt and normal-completion dequeue paths - Remove hasattr guards for _running_agents_ts: add class-level default so partial test construction works without scattered defensive checks - Move `import concurrent.futures` to top of cron/scheduler.py - Progress throttle: sleep remaining interval instead of busy-looping 0.1s (~15 wakeups per 1.5s window → 1 wakeup) - Deduplicate _load_stt_config() in transcription_tools.py: _has_openai_audio_backend() now delegates to _resolve_openai_audio_client_config() * fix: move class-level attribute after docstring, clarify throttle comment Follow-up nits for salvaged PR #4577: - Move _running_agents_ts class attribute below the docstring so GatewayRunner.__doc__ is preserved. - Add clarifying comment explaining the throttle continue behavior (batches queued messages during the throttle interval). * fix(update): handle conflicted git index during hermes update When the git index has unmerged entries (e.g. from an interrupted merge or rebase), git stash fails with 'needs merge / could not write index'. Detect this with git ls-files --unmerged and clear the conflict state with git reset before attempting the stash. Working-tree changes are preserved. Reported by @LLMJunky — package-lock.json conflict from a prior merge left the index dirty, blocking hermes update entirely. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-04-03 01:17:12 -07:00
Teknium	c66c688727	fix: remove redundant restart message from update launchd path launchd_restart() already prints stop/start confirmation via its internal helpers — the extra 'Gateway restarted via launchd' line was redundant. Update test assertion to match.	2026-04-03 01:16:42 -07:00
Dave Tist	988ecc7420	fix(update): avoid launchd restart race on macOS	2026-04-03 01:16:42 -07:00
kshitijk4poor	7165eff901	fix(whatsapp): add free_response_chats, mention stripping, and interactive message unwrapping Address feature gaps vs Telegram/Discord/Mattermost adapters: - free_response_chats whitelist to bypass mention gating per-group - strip bot @phone mentions from body before forwarding to agent - unwrap templateMessage/buttonsMessage/listMessage in bridge - info-level log on successful mention pattern compilation - use module-level json import instead of inline import in config - eliminate double _normalize_whatsapp_id call via walrus operator - hoist botIds computation outside per-message loop in bridge	2026-04-03 01:16:39 -07:00
kshitijk4poor	714e4941b8	fix(whatsapp): enforce require_mention in group chats	2026-04-03 01:16:39 -07:00
Teknium	23addf48d3	fix: allow running gateway service as root for LXC/container environments (#4732 ) Previously, `hermes gateway install --system` hard-refused to create a service running as root, even when explicitly requested via `--run-as-user root`. This forced LXC/container users (where root is the only user) to either create throwaway users or comment out the check in source. Changes: - Auto-detected root (no explicit --run-as-user) still raises, but with a message explaining how to override - Explicit `--run-as-user root` now allowed with a warning about security implications - Interactive setup wizard prompt accepts 'root' as a valid username (warning comes from _system_service_identity downstream) - Added tests for all three paths: auto-detected root rejection, explicit root allowance, and normal non-root passthrough	2026-04-03 01:14:21 -07:00
kshitijk4poor	4d99305345	fix(cli): surface recent sessions inside /history and /resume When /history is used in an empty chat or /resume with no argument, show an inline table of recent resumable sessions with title, preview, relative timestamp, and session ID instead of a dead-end message. Table formatting matches the existing hermes sessions list style (column headers + thin separators, no box drawing). Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-04-03 00:50:49 -07:00
Teknium	a933079564	fix: move class-level attribute after docstring, clarify throttle comment Follow-up nits for salvaged PR #4577: - Move _running_agents_ts class attribute below the docstring so GatewayRunner.__doc__ is preserved. - Add clarifying comment explaining the throttle continue behavior (batches queued messages during the throttle interval).	2026-04-03 00:50:17 -07:00
kshitijk4poor	0ed28ab80c	refactor: simplify and harden PR fixes after review - Fix cron ThreadPoolExecutor blocking on timeout: use shutdown(wait=False, cancel_futures=True) instead of context manager that waits indefinitely - Extract _dequeue_pending_text() to deduplicate media-placeholder logic in interrupt and normal-completion dequeue paths - Remove hasattr guards for _running_agents_ts: add class-level default so partial test construction works without scattered defensive checks - Move `import concurrent.futures` to top of cron/scheduler.py - Progress throttle: sleep remaining interval instead of busy-looping 0.1s (~15 wakeups per 1.5s window → 1 wakeup) - Deduplicate _load_stt_config() in transcription_tools.py: _has_openai_audio_backend() now delegates to _resolve_openai_audio_client_config()	2026-04-03 00:50:17 -07:00
kshitijk4poor	28380e7aed	fix(gateway): STT config resolution, stream consumer flood control fallback Three targeted fixes from user-reported issues: 1. STT config resolution (transcription_tools.py): _has_openai_audio_backend() and _resolve_openai_audio_client_config() now check stt.openai.api_key/base_url in config.yaml FIRST, before falling back to env vars. Fixes voice transcription breaking when using a custom OpenAI-compatible endpoint via config.yaml. 2. Stream consumer flood control fallback (stream_consumer.py): When an edit fails mid-stream (e.g., Telegram flood control returns failure for waits >5s), reset _already_sent to False so the normal final send path delivers the complete response. Previously, a truncated partial was left as the final message. 3. Telegram edit_message comment alignment (telegram.py): Clarify that long flood waits return failure so streaming can fall back to a normal final send.	2026-04-03 00:50:17 -07:00
kshitijk4poor	970042deab	fix(gateway): prevent stuck sessions with agent timeout and staleness eviction Three changes to prevent sessions from getting permanently locked: 1. Agent execution timeout (HERMES_AGENT_TIMEOUT, default 10min): Wraps run_in_executor with asyncio.wait_for so a hung API call or runaway tool can't lock a session indefinitely. On timeout, the agent is interrupted and the user gets an actionable error message. 2. Staleness eviction for _running_agents: Tracks start timestamps for each session entry. When a new message arrives and the entry is older than timeout + 1min grace, it's evicted as a leaked lock. Safety net for any cleanup path that fails to remove the entry. 3. Cron job timeout (HERMES_CRON_TIMEOUT, default 10min): Wraps run_conversation in a ThreadPoolExecutor with timeout so a hung cron job doesn't block the ticker thread (and all subsequent cron jobs) indefinitely. Follows grammY runner's per-update timeout pattern and aiogram's asyncio.wait_for approach for handler deadlines.	2026-04-03 00:50:17 -07:00
kshitijk4poor	9bb83d1298	fix(gateway): downgrade empty/None response log from WARNING to DEBUG This warning fires on every successful streamed response (streaming delivers the text, handler returns None via already_sent=True) and on every queued message during active processing. Both are expected behavior, not error conditions. Downgrade to DEBUG to reduce log noise.	2026-04-03 00:50:17 -07:00
kshitijk4poor	69f85a4dce	fix(gateway): race condition, photo media loss, and flood control in Telegram Three bugs causing intermittent silent drops, partial responses, and flood control delays on the Telegram platform: 1. Race condition in handle_message() — _active_sessions was set inside the background task, not before create_task(). Two rapid messages could both pass the guard and spawn duplicate processing tasks. Fix: set _active_sessions synchronously before spawning the task (grammY sequentialize / aiogram EventIsolation pattern). 2. Photo media loss on dequeue — when a photo (no caption) was queued during active processing and later dequeued, only .text was extracted. Empty text → message silently dropped. Fix: _build_media_placeholder() creates text context for media-only events so they survive the dequeue path. 3. Progress message edits triggered Telegram flood control — rapid tool calls edited the progress message every 0.3s, hitting Telegram's rate limit (23s+ waits). This blocked progress updates and could cause stream consumer timeouts. Fix: throttle edits to 1.5s minimum interval, detect flood control errors and gracefully degrade to new messages. edit_message() now returns failure for flood waits >5s instead of blocking.	2026-04-03 00:50:17 -07:00
Teknium	3659e1f0c2	test(acp): add E2E tests for MCP registration and tool-result reporting Tests the full ACP flow: - new_session with mcpServers → config conversion → register_mcp_servers - prompt → tool_progress_callback → ToolCallStart events - step_callback with results → ToolCallUpdate with rawOutput - toolCallId pairing between start and completion events - server names with slashes/dots sanitized correctly - all session lifecycle methods (load/resume/fork) register MCP	2026-04-02 20:54:27 -07:00
Teknium	21c2d32471	fix(gateway): normalize step_callback prev_tools for backward compat The PR changed prev_tools from list[str] to list[dict] with name/result keys. The gateway's _step_callback_sync passed this directly to hooks as 'tool_names', breaking user-authored hooks that call ', '.join(tool_names). Now: - 'tool_names' always contains strings (backward-compatible) - 'tools' carries the enriched dicts for hooks that want results Also adds summary logging to register_mcp_servers() and comprehensive tests for all three PR changes: - sanitize_mcp_name_component edge cases - register_mcp_servers public API - _register_session_mcp_servers ACP integration - step_callback result forwarding - gateway normalization backward compat	2026-04-02 20:54:27 -07:00
Jack	f66b3fe76b	fix(acp): include tool results in step_callback for ACP tool_call_update events The step_callback previously only forwarded tool names as strings, so build_tool_complete received result=None and ACP tool_call_update events had empty content/rawOutput. Now prev_tools carries dicts with both name and result by pairing each tool_call with its matching tool-role message via tool_call_id.	2026-04-02 20:54:27 -07:00
Jack	9aa82d4807	fix(acp): use raw server name as registry key, only sanitize for tool name prefixes	2026-04-02 20:54:27 -07:00
Jack	9b2fb1cc2e	feat(acp): register client-provided MCP servers as agent tools ACP clients pass MCP server definitions in session/new, load_session, resume_session, and fork_session. Previously these were accepted but silently ignored — the agent never connected to them. This wires the mcp_servers parameter into the existing MCP registration pipeline (tools/mcp_tool.py) so client-provided servers are connected, their tools discovered, and the agent's tool surface refreshed before the first prompt. Changes: tools/mcp_tool.py: - Extract sanitize_mcp_name_component() to replace all non-[A-Za-z0-9_] characters (fixes crash when server names contain / or other chars that violate provider tool-name validation rules) - Use it in _convert_mcp_schema, _sync_mcp_toolsets, _build_utility_schemas - Extract register_mcp_servers(servers: dict) as a public API that takes an explicit {name: config} map. discover_mcp_tools() becomes a thin wrapper that loads config.yaml and calls register_mcp_servers() acp_adapter/server.py: - Add _register_session_mcp_servers() which converts ACP McpServerStdio / McpServerHttp / McpServerSse objects to Hermes MCP config dicts, registers them via asyncio.to_thread (avoids blocking the ACP event loop), then rebuilds agent.tools, valid_tool_names, and invalidates the cached system prompt - Call it from new_session, load_session, resume_session, fork_session Tested with Eden (theproxycompany.com) as ACP client — 5 MCP servers (HTTP + stdio) registered successfully, 110 tools available to the agent.	2026-04-02 20:54:27 -07:00
Erosika	29c98e8f83	feat(honcho): add configurable observation mode (unified/directional) Adds observationMode config field to HonchoClientConfig: - 'unified' (default): user peer self-observations, all agents share one pool - 'directional': AI peer observes user, each agent keeps its own view Changes: - client.py: observation_mode field, _normalize_observation_mode(), config resolution - session.py: add_peers respects mode (peer observation flags), dialectic_query routes through correct peer, create_conclusion uses correct observer	2026-04-02 20:38:36 -07:00
Erosika	9e0fc62650	feat(honcho): restore full integration parity in memory provider plugin Implements all features from the post-merge Honcho plugin spec: B1: recall_mode support (context/tools/hybrid) B2: peer_memory_mode gating (stub for ABC suppression mechanism) B3: resolve_session_name() session key resolution B4: first-turn context baking in system_prompt_block() B5: cost-awareness (cadence, injection frequency, reasoning cap) B6: memory file migration in initialize() B7: pre-warming context at init Ports from open PRs: - #3265: token budget enforcement in prefetch() - #4053: cron guard (skip activation for cron/flush sessions) - #2645: baseUrl-only flow verified in is_available() - #1969: aiPeer sync from SOUL.md - #1957: lazy session init in tools mode Single file change: plugins/memory/honcho/__init__.py No modifications to client.py, session.py, or any files outside the plugin.	2026-04-02 20:38:36 -07:00
Teknium	924bc67eee	feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623 ) * feat(memory): add pluggable memory provider interface with profile isolation Introduces a pluggable MemoryProvider ABC so external memory backends can integrate with Hermes without modifying core files. Each backend becomes a plugin implementing a standard interface, orchestrated by MemoryManager. Key architecture: - agent/memory_provider.py — ABC with core + optional lifecycle hooks - agent/memory_manager.py — single integration point in the agent loop - agent/builtin_memory_provider.py — wraps existing MEMORY.md/USER.md Profile isolation fixes applied to all 6 shipped plugins: - Cognitive Memory: use get_hermes_home() instead of raw env var - Hindsight Memory: check $HERMES_HOME/hindsight/config.json first, fall back to legacy ~/.hindsight/ for backward compat - Hermes Memory Store: replace hardcoded ~/.hermes paths with get_hermes_home() for config loading and DB path defaults - Mem0 Memory: use get_hermes_home() instead of raw env var - RetainDB Memory: auto-derive profile-scoped project name from hermes_home path (hermes-<profile>), explicit env var overrides - OpenViking Memory: read-only, no local state, isolation via .env MemoryManager.initialize_all() now injects hermes_home into kwargs so every provider can resolve profile-scoped storage without importing get_hermes_home() themselves. Plugin system: adds register_memory_provider() to PluginContext and get_plugin_memory_providers() accessor. Based on PR #3825. 46 tests (37 unit + 5 E2E + 4 plugin registration). * refactor(memory): drop cognitive plugin, rewrite OpenViking as full provider Remove cognitive-memory plugin (#727) — core mechanics are broken: decay runs 24x too fast (hourly not daily), prefetch uses row ID as timestamp, search limited by importance not similarity. Rewrite openviking-memory plugin from a read-only search wrapper into a full bidirectional memory provider using the complete OpenViking session lifecycle API: - sync_turn: records user/assistant messages to OpenViking session (threaded, non-blocking) - on_session_end: commits session to trigger automatic memory extraction into 6 categories (profile, preferences, entities, events, cases, patterns) - prefetch: background semantic search via find() endpoint - on_memory_write: mirrors built-in memory writes to the session - is_available: checks env var only, no network calls (ABC compliance) Tools expanded from 3 to 5: - viking_search: semantic search with mode/scope/limit - viking_read: tiered content (abstract ~100tok / overview ~2k / full) - viking_browse: filesystem-style navigation (list/tree/stat) - viking_remember: explicit memory storage via session - viking_add_resource: ingest URLs/docs into knowledge base Uses direct HTTP via httpx (no openviking SDK dependency needed). Response truncation on viking_read to prevent context flooding. * fix(memory): harden Mem0 plugin — thread safety, non-blocking sync, circuit breaker - Remove redundant mem0_context tool (identical to mem0_search with rerank=true, top_k=5 — wastes a tool slot and confuses the model) - Thread sync_turn so it's non-blocking — Mem0's server-side LLM extraction can take 5-10s, was stalling the agent after every turn - Add threading.Lock around _get_client() for thread-safe lazy init (prefetch and sync threads could race on first client creation) - Add circuit breaker: after 5 consecutive API failures, pause calls for 120s instead of hammering a down server every turn. Auto-resets after cooldown. Logs a warning when tripped. - Track success/failure in prefetch, sync_turn, and all tool calls - Wait for previous sync to finish before starting a new one (prevents unbounded thread accumulation on rapid turns) - Clean up shutdown to join both prefetch and sync threads * fix(memory): enforce single external memory provider limit MemoryManager now rejects a second non-builtin provider with a warning. Built-in memory (MEMORY.md/USER.md) is always accepted. Only ONE external plugin provider is allowed at a time. This prevents tool schema bloat (some providers add 3-5 tools each) and conflicting memory backends. The warning message directs users to configure memory.provider in config.yaml to select which provider to activate. Updated all 47 tests to use builtin + one external pattern instead of multiple externals. Added test_second_external_rejected to verify the enforcement. * feat(memory): add ByteRover memory provider plugin Implements the ByteRover integration (from PR #3499 by hieuntg81) as a MemoryProvider plugin instead of direct run_agent.py modifications. ByteRover provides persistent memory via the brv CLI — a hierarchical knowledge tree with tiered retrieval (fuzzy text then LLM-driven search). Local-first with optional cloud sync. Plugin capabilities: - prefetch: background brv query for relevant context - sync_turn: curate conversation turns (threaded, non-blocking) - on_memory_write: mirror built-in memory writes to brv - on_pre_compress: extract insights before context compression Tools (3): - brv_query: search the knowledge tree - brv_curate: store facts/decisions/patterns - brv_status: check CLI version and context tree state Profile isolation: working directory at $HERMES_HOME/byterover/ (scoped per profile). Binary resolution cached with thread-safe double-checked locking. All write operations threaded to avoid blocking the agent (curate can take 120s with LLM processing). * fix(memory): thread remaining sync_turns, fix holographic, add config key Plugin fixes: - Hindsight: thread sync_turn (was blocking up to 30s via _run_in_thread) - RetainDB: thread sync_turn (was blocking on HTTP POST) - Both: shutdown now joins sync threads alongside prefetch threads Holographic retrieval fixes: - reason(): removed dead intersection_key computation (bundled but never used in scoring). Now reuses pre-computed entity_residuals directly, moved role_content encoding outside the inner loop. - contradict(): added _MAX_CONTRADICT_FACTS=500 scaling guard. Above 500 facts, only checks the most recently updated ones to avoid O(n^2) explosion (~125K comparisons at 500 is acceptable). Config: - Added memory.provider key to DEFAULT_CONFIG ("" = builtin only). No version bump needed (deep_merge handles new keys automatically). * feat(memory): extract Honcho as a MemoryProvider plugin Creates plugins/honcho-memory/ as a thin adapter over the existing honcho_integration/ package. All 4 Honcho tools (profile, search, context, conclude) move from the normal tool registry to the MemoryProvider interface. The plugin delegates all work to HonchoSessionManager — no Honcho logic is reimplemented. It uses the existing config chain: $HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars. Lifecycle hooks: - initialize: creates HonchoSessionManager via existing client factory - prefetch: background dialectic query - sync_turn: records messages + flushes to API (threaded) - on_memory_write: mirrors user profile writes as conclusions - on_session_end: flushes all pending messages This is a prerequisite for the MemoryManager wiring in run_agent.py. Once wired, Honcho goes through the same provider interface as all other memory plugins, and the scattered Honcho code in run_agent.py can be consolidated into the single MemoryManager integration point. * feat(memory): wire MemoryManager into run_agent.py Adds 8 integration points for the external memory provider plugin, all purely additive (zero existing code modified): 1. Init (~L1130): Create MemoryManager, find matching plugin provider from memory.provider config, initialize with session context 2. Tool injection (~L1160): Append provider tool schemas to self.tools and self.valid_tool_names after memory_manager init 3. System prompt (~L2705): Add external provider's system_prompt_block alongside existing MEMORY.md/USER.md blocks 4. Tool routing (~L5362): Route provider tool calls through memory_manager.handle_tool_call() before the catchall handler 5. Memory write bridge (~L5353): Notify external provider via on_memory_write() when the built-in memory tool writes 6. Pre-compress (~L5233): Call on_pre_compress() before context compression discards messages 7. Prefetch (~L6421): Inject provider prefetch results into the current-turn user message (same pattern as Honcho turn context) 8. Turn sync + session end (~L8161, ~L8172): sync_all() after each completed turn, queue_prefetch_all() for next turn, on_session_end() + shutdown_all() at conversation end All hooks are wrapped in try/except — a failing provider never breaks the agent. The existing memory system, Honcho integration, and all other code paths are completely untouched. Full suite: 7222 passed, 4 pre-existing failures. * refactor(memory): remove legacy Honcho integration from core Extracts all Honcho-specific code from run_agent.py, model_tools.py, toolsets.py, and gateway/run.py. Honcho is now exclusively available as a memory provider plugin (plugins/honcho-memory/). Removed from run_agent.py (-457 lines): - Honcho init block (session manager creation, activation, config) - 8 Honcho methods: _honcho_should_activate, _strip_honcho_tools, _activate_honcho, _register_honcho_exit_hook, _queue_honcho_prefetch, _honcho_prefetch, _honcho_save_user_observation, _honcho_sync - _inject_honcho_turn_context module-level function - Honcho system prompt block (tool descriptions, CLI commands) - Honcho context injection in api_messages building - Honcho params from __init__ (honcho_session_key, honcho_manager, honcho_config) - HONCHO_TOOL_NAMES constant - All honcho-specific tool dispatch forwarding Removed from other files: - model_tools.py: honcho_tools import, honcho params from handle_function_call - toolsets.py: honcho toolset definition, honcho tools from core tools list - gateway/run.py: honcho params from AIAgent constructor calls Removed tests (-339 lines): - 9 Honcho-specific test methods from test_run_agent.py - TestHonchoAtexitFlush class from test_exit_cleanup_interrupt.py Restored two regex constants (_SURROGATE_RE, _BUDGET_WARNING_RE) that were accidentally removed during the honcho function extraction. The honcho_integration/ package is kept intact — the plugin delegates to it. tools/honcho_tools.py registry entries are now dead code (import commented out in model_tools.py) but the file is preserved for reference. Full suite: 7207 passed, 4 pre-existing failures. Zero regressions. * refactor(memory): restructure plugins, add CLI, clean gateway, migration notice Plugin restructure: - Move all memory plugins from plugins/<name>-memory/ to plugins/memory/<name>/ (byterover, hindsight, holographic, honcho, mem0, openviking, retaindb) - New plugins/memory/__init__.py discovery module that scans the directory directly, loading providers by name without the general plugin system - run_agent.py uses load_memory_provider() instead of get_plugin_memory_providers() CLI wiring: - hermes memory setup — interactive curses picker + config wizard - hermes memory status — show active provider, config, availability - hermes memory off — disable external provider (built-in only) - hermes honcho — now shows migration notice pointing to hermes memory setup Gateway cleanup: - Remove _get_or_create_gateway_honcho (already removed in prev commit) - Remove _shutdown_gateway_honcho and _shutdown_all_gateway_honcho methods - Remove all calls to shutdown methods (4 call sites) - Remove _honcho_managers/_honcho_configs dict references Dead code removal: - Delete tools/honcho_tools.py (279 lines, import was already commented out) - Delete tests/gateway/test_honcho_lifecycle.py (131 lines, tested removed methods) - Remove if False placeholder from run_agent.py Migration: - Honcho migration notice on startup: detects existing honcho.json or ~/.honcho/config.json, prints guidance to run hermes memory setup. Only fires when memory.provider is not set and not in quiet mode. Full suite: 7203 passed, 4 pre-existing failures. Zero regressions. * feat(memory): standardize plugin config + add per-plugin documentation Config architecture: - Add save_config(values, hermes_home) to MemoryProvider ABC - Honcho: writes to $HERMES_HOME/honcho.json (SDK native) - Mem0: writes to $HERMES_HOME/mem0.json - Hindsight: writes to $HERMES_HOME/hindsight/config.json - Holographic: writes to config.yaml under plugins.hermes-memory-store - OpenViking/RetainDB/ByteRover: env-var only (default no-op) Setup wizard (hermes memory setup): - Now calls provider.save_config() for non-secret config - Secrets still go to .env via env vars - Only memory.provider activation key goes to config.yaml Documentation: - README.md for each of the 7 providers in plugins/memory/<name>/ - Requirements, setup (wizard + manual), config reference, tools table - Consistent format across all providers The contract for new memory plugins: - get_config_schema() declares all fields (REQUIRED) - save_config() writes native config (REQUIRED if not env-var-only) - Secrets use env_var field in schema, written to .env by wizard - README.md in the plugin directory * docs: add memory providers user guide + developer guide New pages: - user-guide/features/memory-providers.md — comprehensive guide covering all 7 shipped providers (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover). Each with setup, config, tools, cost, and unique features. Includes comparison table and profile isolation notes. - developer-guide/memory-provider-plugin.md — how to build a new memory provider plugin. Covers ABC, required methods, config schema, save_config, threading contract, profile isolation, testing. Updated pages: - user-guide/features/memory.md — replaced Honcho section with link to new Memory Providers page - user-guide/features/honcho.md — replaced with migration redirect to the new Memory Providers page - sidebars.ts — added both new pages to navigation * fix(memory): auto-migrate Honcho users to memory provider plugin When honcho.json or ~/.honcho/config.json exists but memory.provider is not set, automatically set memory.provider: honcho in config.yaml and activate the plugin. The plugin reads the same config files, so all data and credentials are preserved. Zero user action needed. Persists the migration to config.yaml so it only fires once. Prints a one-line confirmation in non-quiet mode. * fix(memory): only auto-migrate Honcho when enabled + credentialed Check HonchoClientConfig.enabled AND (api_key OR base_url) before auto-migrating — not just file existence. Prevents false activation for users who disabled Honcho, stopped using it (config lingers), or have ~/.honcho/ from a different tool. * feat(memory): auto-install pip dependencies during hermes memory setup Reads pip_dependencies from plugin.yaml, checks which are missing, installs them via pip before config walkthrough. Also shows install guidance for external_dependencies (e.g. brv CLI for ByteRover). Updated all 7 plugin.yaml files with pip_dependencies: - honcho: honcho-ai - mem0: mem0ai - openviking: httpx - hindsight: hindsight-client - holographic: (none) - retaindb: requests - byterover: (external_dependencies for brv CLI) * fix: remove remaining Honcho crash risks from cli.py and gateway cli.py: removed Honcho session re-mapping block (would crash importing deleted tools/honcho_tools.py), Honcho flush on compress, Honcho session display on startup, Honcho shutdown on exit, honcho_session_key AIAgent param. gateway/run.py: removed honcho_session_key params from helper methods, sync_honcho param, _honcho.shutdown() block. tests: fixed test_cron_session_with_honcho_key_skipped (was passing removed honcho_key param to _flush_memories_for_session). * fix: include plugins/ in pyproject.toml package list Without this, plugins/memory/ wouldn't be included in non-editable installs. Hermes always runs from the repo checkout so this is belt- and-suspenders, but prevents breakage if the install method changes. * fix(memory): correct pip-to-import name mapping for dep checks The heuristic dep.replace('-', '_') fails for packages where the pip name differs from the import name: honcho-ai→honcho, mem0ai→mem0, hindsight-client→hindsight_client. Added explicit mapping table so hermes memory setup doesn't try to reinstall already-installed packages. * chore: remove dead code from old plugin memory registration path - hermes_cli/plugins.py: removed register_memory_provider(), _memory_providers list, get_plugin_memory_providers() — memory providers now use plugins/memory/ discovery, not the general plugin system - hermes_cli/main.py: stripped 74 lines of dead honcho argparse subparsers (setup, status, sessions, map, peer, mode, tokens, identity, migrate) — kept only the migration redirect - agent/memory_provider.py: updated docstring to reflect new registration path - tests: replaced TestPluginMemoryProviderRegistration with TestPluginMemoryDiscovery that tests the actual plugins/memory/ discovery system. Added 3 new tests (discover, load, nonexistent). * chore: delete dead honcho_integration/cli.py and its tests cli.py (794 lines) was the old 'hermes honcho' command handler — nobody calls it since cmd_honcho was replaced with a migration redirect. Deleted tests that imported from removed code: - tests/honcho_integration/test_cli.py (tested _resolve_api_key) - tests/honcho_integration/test_config_isolation.py (tested CLI config paths) - tests/tools/test_honcho_tools.py (tested the deleted tools/honcho_tools.py) Remaining honcho_integration/ files (actively used by the plugin): - client.py (445 lines) — config loading, SDK client creation - session.py (991 lines) — session management, queries, flush * refactor: move honcho_integration/ into the honcho plugin Moves client.py (445 lines) and session.py (991 lines) from the top-level honcho_integration/ package into plugins/memory/honcho/. No Honcho code remains in the main codebase. - plugins/memory/honcho/client.py — config loading, SDK client creation - plugins/memory/honcho/session.py — session management, queries, flush - Updated all imports: run_agent.py (auto-migration), hermes_cli/doctor.py, plugin __init__.py, session.py cross-import, all tests - Removed honcho_integration/ package and pyproject.toml entry - Renamed tests/honcho_integration/ → tests/honcho_plugin/ * docs: update architecture + gateway-internals for memory provider system - architecture.md: replaced honcho_integration/ with plugins/memory/ - gateway-internals.md: replaced Honcho-specific session routing and flush lifecycle docs with generic memory provider interface docs * fix: update stale mock path for resolve_active_host after honcho plugin migration * fix(memory): address review feedback — P0 lifecycle, ABC contract, honcho CLI restore Review feedback from Honcho devs (erosika): P0 — Provider lifecycle: - Remove on_session_end() + shutdown_all() from run_conversation() tail (was killing providers after every turn in multi-turn sessions) - Add shutdown_memory_provider() method on AIAgent for callers - Wire shutdown into CLI atexit, reset_conversation, gateway stop/expiry Bug fixes: - Remove sync_honcho=False kwarg from /btw callsites (TypeError crash) - Fix doctor.py references to dead 'hermes honcho setup' command - Cache prefetch_all() before tool loop (was re-calling every iteration) ABC contract hardening (all backwards-compatible): - Add session_id kwarg to prefetch/sync_turn/queue_prefetch - Make on_pre_compress() return str (provider insights in compression) - Add *kwargs to on_turn_start() for runtime context - Add on_delegation() hook for parent-side subagent observation - Document agent_context/agent_identity/agent_workspace kwargs on initialize() (prevents cron corruption, enables profile scoping) - Fix docstring: single external provider, not multiple Honcho CLI restoration: - Add plugins/memory/honcho/cli.py (from main's honcho_integration/cli.py with imports adapted to plugin path) - Restore full hermes honcho command with all subcommands (status, peer, mode, tokens, identity, enable/disable, sync, peers, --target-profile) - Restore auto-clone on profile creation + sync on hermes update - hermes honcho setup now redirects to hermes memory setup fix(memory): wire on_delegation, skip_memory for cron/flush, fix ByteRover return type - Wire on_delegation() in delegate_tool.py — parent's memory provider is notified with task+result after each subagent completes - Add skip_memory=True to cron scheduler (prevents cron system prompts from corrupting user representations — closes #4052) - Add skip_memory=True to gateway flush agent (throwaway agent shouldn't activate memory provider) - Fix ByteRover on_pre_compress() return type: None -> str * fix(honcho): port profile isolation fixes from PR #4632 Ports 5 bug fixes found during profile testing (erosika's PR #4632): 1. 3-tier config resolution — resolve_config_path() now checks $HERMES_HOME/honcho.json → ~/.hermes/honcho.json → ~/.honcho/config.json (non-default profiles couldn't find shared host blocks) 2. Thread host=_host_key() through from_global_config() in cmd_setup, cmd_status, cmd_identity (--target-profile was being ignored) 3. Use bare profile name as aiPeer (not host key with dots) — Honcho's peer ID pattern is ^[a-zA-Z0-9_-]+$, dots are invalid 4. Wrap add_peers() in try/except — was fatal on new AI peers, killed all message uploads for the session 5. Gate Honcho clone behind --clone/--clone-all on profile create (bare create should be blank-slate) Also: sanitize assistant_peer_id via _sanitize_id() * fix(tests): add module cleanup fixture to test_cli_provider_resolution test_cli_provider_resolution._import_cli() wipes tools.*, cli, and run_agent from sys.modules to force fresh imports, but had no cleanup. This poisoned all subsequent tests on the same xdist worker — mocks targeting tools.file_tools, tools.send_message_tool, etc. patched the NEW module object while already-imported functions still referenced the OLD one. Caused ~25 cascade failures: send_message KeyError, process_registry FileNotFoundError, file_read_guards timeouts, read_loop_detection file-not-found, mcp_oauth None port, and provider_parity/codex_execution stale tool lists. Fix: autouse fixture saves all affected modules before each test and restores them after, matching the pattern in test_managed_browserbase_and_modal.py.	2026-04-02 15:33:51 -07:00
Teknium	e0b2bdb089	fix: webhook platform support — skip home channel prompt, disable tool progress (salvage #4363 ) (#4660 ) Cherry-picked from PR #4363 by @bennyhodl with follow-up fixes: - Skip 'No home channel' prompt for webhook platform (webhooks deliver to configured targets, not a home channel) - Disable tool progress for webhooks (no message editing support) - Add webhook to PLATFORMS in tools_config.py and skills_config.py - Add hermes-webhook toolset to toolsets.py + hermes-gateway includes - Removed overly aggressive <50 char content filter that blocked legitimate short responses (tool progress already handled at source) Co-authored-by: bennyhodl <bennyhodl@users.noreply.github.com>	2026-04-02 14:00:22 -07:00
SHL0MS	6d68fbf756	Merge pull request #4654 from SHL0MS/skill/research-paper-writing Replace ml-paper-writing with research-paper-writing: full end-to-end research pipeline	2026-04-02 13:24:12 -07:00
SHL0MS	b86647c295	Replace ml-paper-writing with research-paper-writing: full research pipeline skill Replaces the writing-focused ml-paper-writing skill (940 lines) with a complete end-to-end research paper pipeline (1,599 lines SKILL.md + 3,184 lines across 7 reference files). New content: - Full 8-phase pipeline: project setup, literature review, experiment design, execution/monitoring, analysis, paper drafting, review/revision, submission preparation - Iterative refinement strategy guide from autoreason research (when to use autoreason vs critique-and-revise vs single-pass, model selection) - Hermes agent integration: delegate_task parallel drafting, cronjob monitoring, memory/todo state management, skill composition - Professional LaTeX tooling: microtype, siunitx, TikZ diagram patterns, algorithm2e, subcaption, latexdiff, SciencePlots - Human evaluation design: annotation protocols, inter-annotator agreement, crowdsourcing platforms - Title, Figure 1, conclusion, appendix strategy, page budget management - Anonymization checklist, rebuttal writing, camera-ready preparation - AAAI and COLM venue coverage (checklists, reviewer guidelines) Preserved from ml-paper-writing: - All writing philosophy (Nanda, Farquhar, Gopen & Swan, Lipton, Perez) - Citation verification workflow (5-step mandatory process) - All 6 conference templates (NeurIPS, ICML, ICLR, ACL, AAAI, COLM) - Conference requirements, format conversion workflow - Proactivity/collaboration guidance Bug fixes in inherited reference files: - BibLaTeX recommendation now correctly says natbib for conferences - Bare except clauses fixed to except Exception - Jinja2 template tags removed from citation-workflow.md - Stale date caveats added to reviewer-guidelines.md	2026-04-02 16:13:26 -04:00
Teknium	798a7b99e4	docs: add Configuration Options section to Slack docs (#4644 ) * docs: add Configuration Options section to Slack docs Documents all config.yaml options for the Slack bot: - Thread & reply behavior (reply_to_mode, reply_broadcast) - Session isolation (group_sessions_per_user) - Mention & trigger behavior (require_mention, mention_patterns, reply_prefix) - Unauthorized user handling (unauthorized_dm_behavior) - Voice transcription (stt_enabled) - Full example config showing all options together Includes a note about Slack's hardcoded @mention requirement in channels (no free_response_channels equivalent like Discord/Telegram). * docs: consolidate reply_in_thread into Configuration Options section Folds the standalone Reply Threading subsection from PR #4643 into the Thread & Reply Behavior subsection, keeping all config options in one place. Adds reply_in_thread to the table and full example.	2026-04-02 12:38:13 -07:00
kshitijk4poor	d2b08406a4	fix(agent): classify think-only empty responses before retrying	2026-04-02 12:29:18 -07:00
Teknium	241cbeeccd	docs: add reply_in_thread config to Slack docs	2026-04-02 12:18:40 -07:00
Animesh Mishra	b9a968c1de	feat(slack): add reply_in_thread config option By default, Hermes always threads replies to channel messages. Teams that prefer direct channel replies had no way to opt out without patching the source. Add a reply_in_thread option (default: true) to the Slack platform extra config: platforms: slack: extra: reply_in_thread: false When false, _resolve_thread_ts() returns None for top-level channel messages, so replies go directly to the channel. Messages already inside an existing thread are still replied in-thread to preserve conversation context. Default is true for full backward compatibility.	2026-04-02 12:18:40 -07:00
Teknium	d89cc7fec1	feat(prompt): add Google model operational guidance for Gemini and Gemma (#4641 ) Adapted from OpenCode's gemini.txt. Gemini and Gemma models now get structured operational directives alongside tool-use enforcement: absolute paths, verify-before-edit, dependency checks, conciseness, parallel tool calls, non-interactive flags, autonomous execution. Based on PR #4026, extended to cover Gemma models.	2026-04-02 11:52:34 -07:00
Teknium	3186668799	feat: per-turn primary runtime restoration and transport recovery (#4624 ) Makes provider fallback turn-scoped in long-lived CLI sessions. Previously, a single transient failure pinned the session to the fallback provider for every subsequent turn. - _primary_runtime dict snapshot at __init__ (model, provider, base_url, api_mode, client_kwargs, compressor state) - _restore_primary_runtime() at top of run_conversation() — restores all state, resets fallback chain index - _try_recover_primary_transport() — one extra recovery cycle (client rebuild + cooldown) for transient transport errors on direct endpoints before fallback - Skipped for aggregator providers (OpenRouter, Nous) - 25 tests Inspired by #4612 (@betamod). Closes #4612.	2026-04-02 10:52:01 -07:00
Teknium	918d593544	chore: gitignore generated skills.json Follow-up to #4500 — the extraction script generates this file at build time, so it should not be committed.	2026-04-02 10:48:15 -07:00
Nacho Avecilla	b8dd059c40	feat(website): add skills browse and search page to docs (#4500 ) Adds a Skills Hub page to the documentation site with browsable/searchable catalog of all skills (built-in, optional, and community from cached hub indexes). - Python extraction script (website/scripts/extract-skills.py) parses SKILL.md frontmatter and hub index caches into skills.json - React page (website/src/pages/skills/) with search, category filtering, source filtering, and expandable skill cards - CI workflow updated to run extraction before Docusaurus build - Deploy trigger expanded to include skills/ and optional-skills/ changes Authored by @IAvecilla	2026-04-02 10:47:38 -07:00
kshitijk4poor	20441cf2c8	fix(insights): persist token usage for non-CLI sessions	2026-04-02 10:47:13 -07:00
Teknium	585855d2ca	fix: preserve Anthropic thinking block signatures across tool-use turns Anthropic extended thinking blocks include an opaque 'signature' field required for thinking chain continuity across multi-turn tool-use conversations. Previously, normalize_anthropic_response() extracted only the thinking text and set reasoning_details=None, discarding the signature. On subsequent turns the API could not verify the chain. Changes: - _to_plain_data(): new recursive SDK-to-dict converter with depth cap (20 levels) and path-based cycle detection for safety - _extract_preserved_thinking_blocks(): rehydrates preserved thinking blocks (including signature) from reasoning_details on assistant messages, placing them before tool_use blocks as Anthropic requires - normalize_anthropic_response(): stores full thinking blocks in reasoning_details via _to_plain_data() - _extract_reasoning(): adds 'thinking' key to the detail lookup chain so Anthropic-format details are found alongside OpenRouter format Salvaged from PR #4503 by @priveperfumes — focused on the thinking block continuity fix only (cache strategy and other changes excluded).	2026-04-02 10:30:32 -07:00
Teknium	28a073edc6	fix: repair OpenCode model routing and selection (#4508 ) OpenCode Zen and Go are mixed-API-surface providers — different models behind them use different API surfaces (GPT on Zen uses codex_responses, Claude on Zen uses anthropic_messages, MiniMax on Go uses anthropic_messages, GLM/Kimi on Go use chat_completions). Changes: - Add normalize_opencode_model_id() and opencode_model_api_mode() to models.py for model ID normalization and API surface routing - Add _provider_supports_explicit_api_mode() to runtime_provider.py to prevent stale api_mode from leaking across provider switches - Wire opencode routing into all three api_mode resolution paths: pool entry, api_key provider, and explicit runtime - Add api_mode field to ModelSwitchResult for propagation through the switch pipeline - Consolidate _PROVIDER_MODELS from main.py into models.py (single source of truth, eliminates duplicate dict) - Add opencode normalization to setup wizard and model picker flows - Add opencode block to _normalize_model_for_provider in CLI - Add opencode-zen/go fallback model lists to setup.py Tests: 160 targeted tests pass (26 new tests covering normalization, api_mode routing per provider/model, persistence, and setup wizard normalization). Based on PR #3017 by SaM13997. Co-authored-by: SaM13997 <139419381+SaM13997@users.noreply.github.com>	2026-04-02 09:36:24 -07:00
Devorun	f4f64c413f	fix(cli): ensure zero exit code on successful quiet mode queries (#4601 )	2026-04-02 09:33:31 -07:00
Teknium	8dc5b11e95	fix(honcho): remove redundant local HOST import in _all_profile_host_configs HOST is already imported at module level from honcho_integration.client. The local import inside _all_profile_host_configs() was unnecessary.	2026-04-02 09:25:16 -07:00
Erosika	37d73d94bb	fix: patch _local_config_path in tests for write isolation	2026-04-02 09:25:16 -07:00
Erosika	a0eae33248	fix(honcho): address PR review findings - Remove duplicate cmd_sync definition (kept version with error output) - Fix from_env workspace to stay shared (hermes) not profile-derived - Add docstring clarifying get_or_create is idempotent in status - Remove unused import importlib in test - Fix test assertion for shared workspace in from_env path - Add 3 tests for sync_honcho_profiles_quiet	2026-04-02 09:25:16 -07:00
Erosika	c146631e3b	feat(honcho): sync command + auto-sync on hermes update - hermes honcho sync: scan all profiles, create missing host blocks - hermes update: automatically syncs Honcho config to all profiles after skill sync (existing users get profile mapping on next update) - sync_honcho_profiles_quiet() for silent use from update path	2026-04-02 09:25:16 -07:00
Erosika	89eab74c67	feat(honcho): --target-profile flag + peer card display in status - hermes honcho --target-profile <name> <command>: target another profile's Honcho config without switching profiles. Works with all subcommands (status, peer, mode, tokens, enable, disable, etc.) - hermes honcho status now shows user peer card and AI peer representation when connected (fetched live from Honcho API)	2026-04-02 09:25:16 -07:00
Erosika	5f6bf2a473	fix(honcho): share workspace across profiles by default Profiles inherit the default workspace instead of deriving a separate one. All profiles see the same user context, sessions, and project history. Each profile is a different AI peer in a shared space. Workspace can still be overridden per-profile via config if isolation is needed.	2026-04-02 09:25:16 -07:00
Erosika	f27da5fe8e	fix(honcho): remove linkedHosts from peers table	2026-04-02 09:25:16 -07:00
Erosika	0e90df1216	feat(honcho): eager peer creation + enable/disable per profile - Eagerly create AI and user peers in Honcho when a profile is created (not deferred to first message). Uses idempotent peer() SDK call. - hermes honcho enable: turn on Honcho for active profile, clone settings from default if first time, create peer immediately - hermes honcho disable: turn off Honcho for active profile - _ensure_peer_exists() helper for idempotent peer creation	2026-04-02 09:25:16 -07:00
Erosika	37458e72a2	feat(honcho): auto-clone config to new profiles on creation When a profile is created and Honcho is already configured on the default host, automatically creates a host block for the new profile with inherited settings (memory mode, recall mode, write frequency, peer name, etc.) and auto-derived workspace/aiPeer. Zero-friction path: hermes profile create coder -> Honcho config cloned as hermes.coder with all settings inherited.	2026-04-02 09:25:16 -07:00
Erosika	d1189f2be9	feat(honcho): add cross-profile observability for Honcho integration - hermes honcho status: shows active profile name + host key - hermes honcho status --all: compact table of all profiles with mode, recall, write frequency per host block - hermes honcho peers: cross-profile peer identity table (user peer, AI peer, linked hosts) - All write commands (peer, mode, tokens) print [host_key] label when operating on a non-default profile	2026-04-02 09:25:16 -07:00
Erosika	18c156af8e	feat(honcho): scope host and peer resolution to active Hermes profile Derives the Honcho host key from the active Hermes profile so that each profile gets its own Honcho host block, workspace, and AI peer identity. Profile "coder" resolves to host "hermes.coder", reads from hosts["hermes.coder"] in honcho.json, and defaults workspace + aiPeer to the derived host name. Resolution order: HERMES_HONCHO_HOST env var > active profile name > "hermes" (default). Complements #3681 (profiles) with the Honcho identity layer that was part of #2845 (named instances), adapted to the merged profiles system.	2026-04-02 09:25:16 -07:00
Teknium	661a1b0ba2	fix: exclude matrix from [all] extras — python-olm is upstream-broken (#4615 ) python-olm (required by matrix-nio[e2e]) fails to build on modern macOS: - CMake 4 rejects vendored libolm's cmake_minimum_required(VERSION 3.4) - Apple Clang 21+ rejects a C++ type error in include/olm/list.hh - Upstream libolm repo is archived, no fix forthcoming Including matrix in [all] causes the entire extras install to fail during `hermes update`, silently dropping all other extras (telegram, discord, slack, cron, etc.) when the fallback kicks in. The [matrix] extra is preserved for opt-in install: pip install 'hermes-agent[matrix]' Closes #4178	2026-04-02 09:21:37 -07:00
Teknium	acea9ee20b	fix(tests): fix 11 real test failures + major cascade poisoner (#4570 ) Three root causes addressed: 1. AIAgent no longer defaults base_url to OpenRouter (9 tests) Tests that assert OpenRouter-specific behavior (prompt caching, reasoning extra_body, provider preferences) need explicit base_url and model set on the agent. Updated test_run_agent.py and test_provider_parity.py. 2. Credential pool auto-seeding from host env (2 tests) test_auxiliary_client.py tests for Anthropic OAuth and custom endpoint fallback were not mocking _select_pool_entry, so the host's credential pool interfered. Added pool + codex mocks. 3. sys.modules corruption cascade (major - ~250 tests) test_managed_modal_environment.py replaced sys.modules entries (tools, hermes_cli, agent packages) with SimpleNamespace stubs but had NO cleanup fixture. Every subsequent test in the process saw corrupted imports: 'cannot import get_config_path from <unknown module name>' and 'module tools has no attribute environments'. Added _restore_tool_and_agent_modules autouse fixture matching the pattern in test_managed_browserbase_and_modal.py. This was also the root cause of CI failures (104 failed on main).	2026-04-02 08:43:06 -07:00
Teknium	624ad582a5	fix: make gateway approval block agent thread like CLI does (#4557 ) The gateway's dangerous command approval system was fundamentally broken: the agent loop continued running after a command was flagged, and the approval request only reached the user after the agent finished its entire conversation loop. By then the context was lost. This change makes the gateway approval mirror the CLI's synchronous behavior. When a dangerous command is detected: 1. The agent thread blocks on a threading.Event 2. The approval request is sent to the user immediately 3. The user responds with /approve or /deny 4. The event is signaled and the agent resumes with the real result The agent never sees 'approval_required' as a tool result. It either gets the command output (approved) or a definitive BLOCKED message (denied/timed out) — same as CLI mode. Queue-based design supports multiple concurrent approvals (parallel subagents via delegate_task, execute_code RPC handlers). Each approval gets its own _ApprovalEntry with its own threading.Event. /approve resolves the oldest (FIFO); /approve all resolves all at once. Changes: - tools/approval.py: Queue-based per-session blocking gateway approval (register/unregister callbacks, resolve with FIFO or all-at-once) - gateway/run.py: Register approval callback in run_sync(), remove post-loop pop_pending hack, /approve and /deny support 'all' flag - tests: 21 tests including parallel subagent E2E scenarios	2026-04-02 01:47:19 -07:00
Teknium	64584a931f	cleanup: use _generate_session_key for parent key, fix trailing whitespace	2026-04-02 01:33:53 -07:00
Gary Chiu	8cb3596939	fix(gateway): seed DM thread sessions with parent transcript to preserve context	2026-04-02 01:33:53 -07:00
kshitijk4poor	e94b4b2b40	fix: preserve allowed_users during setup reconfigure and quiet unconfigured provider warnings Setup wizard now shows existing allowed_users when reconfiguring a platform and preserves them if the user presses Enter. Previously the wizard would display a misleading "No allowlist set" warning even when the .env still held the original IDs. Also downgrades the "provider X has no API key configured" log from WARNING to DEBUG in resolve_provider_client — callers already handle the None return with their own contextual messages. This eliminates noisy startup warnings for providers in the fallback chain that the user never configured (e.g. minimax).	2026-04-02 01:00:29 -07:00
Teknium	835defe074	fix: invalidate update cache for all profiles, not just current hermes update only cleared .update_check for the active HERMES_HOME, leaving other profiles showing stale 'N commits behind' in their banner. Now _invalidate_update_cache() iterates over ~/.hermes/ (default) plus every directory under ~/.hermes/profiles/ to clear all caches. The git repo is shared across profiles so a single update brings them all current. Reported by SteveSkedasticity on Discord.	2026-04-02 00:49:17 -07:00
Teknium	e4db72ef39	fix: merge dotted+hyphenated FTS5 quoting into single pass The original PR applied dotted and hyphenated regex quoting in two sequential steps. For terms with both dots and hyphens (e.g. my-app.config.ts), step 2 would re-match inside already-quoted output, producing malformed double-quoted FTS5 syntax. Merged into a single regex pass: \w+(?:[.-]\w+)+ — handles dots, hyphens, and mixed terms in one shot. Added test coverage for the mixed case.	2026-04-02 00:49:11 -07:00
Lume	9825cd7b1e	fix(state): quote dotted terms in FTS5 queries FTS5 queries containing dots (e.g. P2.2, simulate.p2.test.ts) can trigger query parse edge cases that yield OperationalError or empty results unless quoted. Extend _sanitize_fts5_query to wrap dotted tokens in double quotes (similar to hyphenated terms) and add regression tests.	2026-04-02 00:49:11 -07:00
Roland Parnaso	c4e626b1fa	refactor: extract _detect_file_drop() + add 28 tests Extract the inline file-drop detection logic into a standalone _detect_file_drop() function at module level for testability. The main loop now calls this function instead of inlining the logic. Tests cover: - Slash commands still route correctly (/help, /quit, /xyz) - Image paths auto-detected (.png, .jpg, .gif, etc.) - Non-image files detected (.py, .txt, Makefile, etc.) - Backslash-escaped spaces from macOS drag-and-drop - Trailing user text preserved as remainder - Edge cases: directories, symlinks, no-extension files - Non-string input, empty strings, nonexistent paths	2026-04-02 00:40:27 -07:00
Roland Parnaso	1841886898	fix(cli): detect dragged file paths instead of treating them as slash commands When a user drags a file into the terminal, macOS pastes the absolute path (e.g. /Users/roland/Desktop/Screenshot.png) which starts with '/' and was incorrectly routed to process_command(), producing an 'Unknown command' error. This change adds file-path detection before the slash-command check: - Parses the first token, handling backslash-escaped spaces from macOS - Checks if the path exists as a real file via Path.exists() - Image files (.png, .jpg, etc.) are auto-attached to the message - Non-image files are reformatted as [User attached file: ...] context - Falls through to normal slash-command handling if not a real file path	2026-04-02 00:40:27 -07:00
Teknium	f4bc6aa856	fix: scope extras retry to [all] group only _load_installable_optional_extras() was returning ALL extras from pyproject.toml except 'all', which included 'rl' and 'yc-bench' — extras not referenced by [all] that install heavy research deps (atroposlib, tinker, wandb) from git repos. Changed to parse the [all] group's references and only retry those 18 extras. Also moved tomllib import to function-level since it only runs during the rare fallback path.	2026-04-02 00:40:07 -07:00
kshitijk4poor	c91f4ef4ed	fix(update): preserve optional extras during fallback install	2026-04-02 00:40:07 -07:00
Ben Barclay	5101f853ba	Merge pull request #3287 from NousResearch/rewbs/tool-use-charge-to-subscription	2026-04-01 18:42:47 -07:00
Hermes Agent	a0f5fc2570	fix(tools): add debug logging for token refresh and tighten domain check - Add logger + debug log to read_nous_access_token() catch-all so token refresh failures are observable instead of silently swallowed - Tighten _is_nous_auxiliary_client() domain check to use proper URL hostname parsing instead of substring match, preventing false-positives on domains like not-nousresearch.com or nousresearch.com.evil.com	2026-04-02 12:40:03 +11:00
Ben	647f99d4dd	fix: resolve post-merge issues in auxiliary_client and model flow - Add missing `from agent.credential_pool import load_pool` import to auxiliary_client.py (introduced by the credential pool feature in main) - Thread `args` through `select_provider_and_model(args=None)` so TLS options from `cmd_model` reach `_model_flow_nous` - Mock `_require_tty` in test_cmd_model_forwards_nous_login_tls_options so it can run in non-interactive test environments Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-02 00:50:40 +00:00
Ben Barclay	a2e56d044b	Merge branch 'main' into rewbs/tool-use-charge-to-subscription	2026-04-02 11:00:35 +11:00
pefontana	bd9e0b605f	test(e2e): remove section separator comments	2026-04-01 15:23:52 -07:00
pefontana	99e6f44204	test(e2e): remove unused imports and duplicate fixtures	2026-04-01 15:23:52 -07:00
pefontana	1f1297f56c	ci: merge e2e into tests workflow as separate job Move e2e tests into tests.yml as a parallel job instead of a separate workflow. Unit tests now also ignore tests/e2e/ to avoid running them twice. Both jobs appear as independent checks in the PR.	2026-04-01 15:23:52 -07:00
pefontana	04e60cfacd	test(e2e): add authorization, session lifecycle, and resilience tests New test classes: - TestSessionLifecycle: /new then /status sequence, idempotent resets - TestAuthorization: unauthorized users get pairing code, not commands - TestSendFailureResilience: pipeline survives send() failures Additional command coverage: /provider, /verbose, /personality, /yolo. Note: /provider test is xfail - found a real bug where model_cfg is referenced unbound when config.yaml is absent (run.py:3247).	2026-04-01 15:23:52 -07:00
pefontana	ecd9bf2ca0	test(e2e): revert intentional failure after CI verification CI correctly detected the broken assertion — e2e workflow works.	2026-04-01 15:23:52 -07:00
pefontana	b209dc0f43	test(e2e): add intentional failure to verify CI detection Temporary commit — will be reverted after confirming CI catches it.	2026-04-01 15:23:52 -07:00
pefontana	67e1170b01	ci: add e2e test workflow Separate workflow for gateway e2e tests, runs on push/PR to main. Same Python 3.11 + uv setup as existing tests.yml but targets only tests/e2e/ with verbose output.	2026-04-01 15:23:52 -07:00
pefontana	bff34b1df9	test(e2e): add telegram slash command e2e tests Tests /help, /status, /new, /stop, /commands through the full adapter background-task pipeline. Validates command dispatch, session lifecycle, and response delivery without any LLM involvement.	2026-04-01 15:23:52 -07:00
pefontana	ba48cfe84a	test(e2e): add telegram gateway e2e test infrastructure Fixtures and helpers for driving messages through the full async pipeline: adapter.handle_message → background task → GatewayRunner command dispatch → adapter.send (mocked). Uses the established _make_runner pattern (object.__new__) to skip filesystem side effects while exercising real command dispatch logic.	2026-04-01 15:23:52 -07:00
Teknium	de9bba8d7c	fix: remove hardcoded OpenRouter/opus defaults No model, base_url, or provider is assumed when the user hasn't configured one. Previously the defaults dict in cli.py, AIAgent constructor args, and several fallback paths all hardcoded anthropic/claude-opus-4.6 + openrouter.ai/api/v1 — silently routing unconfigured users to OpenRouter, which 404s for anyone using a different provider. Now empty defaults force the setup wizard to run, and existing users who already completed setup are unaffected (their config.yaml has the model they chose). Files changed: - cli.py: defaults dict, _DEFAULT_CONFIG_MODEL - run_agent.py: AIAgent.__init__ defaults, main() defaults - hermes_cli/config.py: DEFAULT_CONFIG - hermes_cli/runtime_provider.py: is_fallback sentinel - acp_adapter/session.py: default_model - tests: updated to reflect empty defaults	2026-04-01 15:22:26 -07:00
Teknium	3628ccc8c4	feat: use 'developer' role for GPT-5 and Codex models (#4498 ) OpenAI's newer models (GPT-5, Codex) give stronger instruction-following weight to the 'developer' role vs 'system'. Swap the role at the API boundary in _build_api_kwargs() for the chat_completions path so internal message representation stays consistent ('system' everywhere). Applies regardless of provider — OpenRouter, Nous portal, direct, etc. The codex_responses path (direct OpenAI) uses 'instructions' instead of message roles, so it's unaffected. DEVELOPER_ROLE_MODELS constant in prompt_builder.py defines the matching model name substrings: ('gpt-5', 'codex').	2026-04-01 14:49:32 -07:00
Teknium	c59ab8b0da	fix: profile model.model promoted to model.default when default not set When a profile config sets model.model but not model.default, the hardcoded default (claude-opus-4.6) survived the config merge and took precedence in HermesCLI.__init__ because it checks model.default first. Profile model configs were silently ignored. Now model.model is promoted to model.default during the merge when the user didn't explicitly set model.default. Fixes #4486.	2026-04-01 13:46:18 -07:00
Teknium	16d9f58445	fix(gateway): persist memory flush state to prevent redundant re-flushes on restart (#4481 ) * fix: force-close TCP sockets on client cleanup, detect and recover dead connections When a provider drops connections mid-stream (e.g. OpenRouter outage), httpx's graceful close leaves sockets in CLOSE-WAIT indefinitely. These zombie connections accumulate and can prevent recovery without restarting. Changes: - _force_close_tcp_sockets: walks the httpx connection pool and issues socket.shutdown(SHUT_RDWR) + close() to force TCP RST on every socket when a client is closed, preventing CLOSE-WAIT accumulation - _cleanup_dead_connections: probes the primary client's pool for dead sockets (recv MSG_PEEK), rebuilds the client if any are found - Pre-turn health check at the start of each run_conversation call that auto-recovers with a user-facing status message - Primary client rebuild after stale stream detection to purge pool - User-facing messages on streaming connection failures: "Connection to provider dropped — Reconnecting (attempt 2/3)" "Connection failed after 3 attempts — try again in a moment" Made-with: Cursor * fix: pool entry missing base_url for openrouter, clean error messages - _resolve_runtime_from_pool_entry: add OPENROUTER_BASE_URL fallback when pool entry has no runtime_base_url (pool entries from auth.json credential_pool often omit base_url) - Replace Rich console.print for auth errors with plain print() to prevent ANSI escape code mangling through prompt_toolkit's stdout patch - Force-close TCP sockets on client cleanup to prevent CLOSE-WAIT accumulation after provider outages - Pre-turn dead connection detection with auto-recovery and user message - Primary client rebuild after stale stream detection - User-facing status messages on streaming connection failures/retries Made-with: Cursor * fix(gateway): persist memory flush state to prevent redundant re-flushes on restart The _session_expiry_watcher tracked flushed sessions in an in-memory set (_pre_flushed_sessions) that was lost on gateway restart. Expired sessions remained in sessions.json and were re-discovered every restart, causing redundant AIAgent runs that burned API credits and blocked the event loop. Fix: Add a memory_flushed boolean field to SessionEntry, persisted in sessions.json. The watcher sets it after a successful flush. On restart, the flag survives and the watcher skips already-flushed sessions. - Add memory_flushed field to SessionEntry with to_dict/from_dict support - Old sessions.json entries without the field default to False (backward compat) - Remove the ephemeral _pre_flushed_sessions set from SessionStore - Update tests: save/load roundtrip, legacy entry compat, auto-reset behavior	2026-04-01 12:05:02 -07:00
Teknium	1515e8c8f2	fix: rewrite test mock secrets and add redaction fixture The original test file had mock secrets corrupted by secret-redaction tooling before commit — the test values (sk-ant...l012) didn't actually trigger the PREFIX_RE regex, so 4 of 10 tests were asserting against values that never appeared in the input. - Replace truncated mock values with proper fake keys built via string concatenation (avoids tool redaction during file writes) - Add _ensure_redaction_enabled autouse fixture to patch the module-level _REDACT_ENABLED constant, matching the pattern from test_redact.py	2026-04-01 12:03:56 -07:00
0xbyt4	127a4e512b	security: redact secrets from auxiliary and vision LLM responses LLM responses from browser snapshot extraction and vision analysis could echo back secrets that appeared on screen or in page content. Input redaction alone is insufficient — the LLM may reproduce secrets it read from screenshots (which cannot be text-redacted). Now redact outputs from: - _extract_relevant_content (auxiliary LLM response) - browser_vision (vision LLM response) - camofox_vision (vision LLM response)	2026-04-01 12:03:56 -07:00
0xbyt4	712aa44325	security: block secret exfiltration via browser URLs and auxiliary LLM calls Three exfiltration vectors closed: 1. Browser URL exfil — agent could embed secrets in URL params and navigate to attacker-controlled server. Now scans URLs for known API key patterns before navigating (browser_navigate, web_extract). 2. Browser snapshot leak — page displaying env vars or API keys would send secrets to auxiliary LLM via _extract_relevant_content before run_agent.py's redaction layer sees the result. Now redacts snapshot text before the auxiliary call. 3. Camofox annotation leak — accessibility tree text sent to vision LLM could contain secrets visible on screen. Now redacts annotation context before the vision call. 10 new tests covering URL blocking, snapshot redaction, and annotation redaction for both browser and camofox backends.	2026-04-01 12:03:56 -07:00
Teknium	7e91009018	fix: lazy-init SessionDB on adapter instance instead of per-request Reuse a single SessionDB across requests by caching on self._session_db with lazy initialization. Avoids creating a new SQLite connection per request when X-Hermes-Session-Id is used. Updated tests to set adapter._session_db directly instead of patching the constructor.	2026-04-01 11:41:32 -07:00
txchen	bf19623a53	feat(api-server): support X-Hermes-Session-Id header for session continuity Allow callers to pass X-Hermes-Session-Id in request headers to continue an existing conversation. When provided, history is loaded from SessionDB instead of the request body, and the session_id is echoed in the response header. Without the header, existing behavior is preserved (new uuid per request). This enables web UI clients to maintain thread continuity without modifying any session state themselves — the same mechanism the gateway uses for IM platforms (Telegram, Discord, etc.).	2026-04-01 11:41:32 -07:00
Leegenux	3ff9e0101d	fix(skill_utils): add type check for metadata field in extract_skill_conditions When PyYAML is unavailable or YAML frontmatter is malformed, the fallback parser may return metadata as a string instead of a dict. This causes AttributeError when calling .get("hermes") on the string. Added explicit type checks to handle cases where metadata or hermes fields are not dicts, preventing the crash. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>	2026-04-01 11:34:56 -07:00
Teknium	b267516851	fix: also exclude .env from default profile exports The original PR excluded auth.json from _DEFAULT_EXPORT_EXCLUDE_ROOT and filtered both auth.json and .env from named profile exports, but missed adding .env to the default profile exclusion set. Default exports would still leak .env containing API keys. Added .env to _DEFAULT_EXPORT_EXCLUDE_ROOT, added test coverage, and updated the existing test that incorrectly asserted .env presence.	2026-04-01 11:20:33 -07:00
dieutx	d435acc2c0	fix(security): exclude auth.json and .env from profile exports	2026-04-01 11:20:33 -07:00
Teknium	bacc86d031	fix: use RedactingFormatter on stderr handler, update types and test mock - stderr handler now uses RedactingFormatter to match file handlers - restart path uses verbose=0 (int) instead of verbose=False (bool) - test mock updated with new run_gateway(verbose, quiet, replace) signature	2026-04-01 11:05:07 -07:00
Alan Justino	5bd01b838c	fix(gateway): wire -v/-q flags to stderr logging By default 'hermes gateway run' now prints WARNING+ to stderr so connection errors and startup failures are visible in the terminal without having to tail ~/.hermes/logs/gateway.log. - gateway/run.py: start_gateway() accepts verbosity: Optional[int]=0. When not None, attaches a StreamHandler to stderr with level mapped from the count (0=WARNING, 1=INFO, 2+=DEBUG). Root logger level is also lowered when DEBUG is requested so records are not swallowed. - hermes_cli/gateway.py: run_gateway() gains verbose: int and quiet: bool params. -q translates to verbosity=None (no stderr handler). Wired through gateway_command(). - hermes_cli/main.py: -v changed from store_true to action=count so -v/-vv/-vvv each increment the level. -q/--quiet added as a new flag. Behaviour summary: hermes gateway run -> WARNING+ on stderr (default) hermes gateway run -q -> silent hermes gateway run -v -> INFO+ hermes gateway run -vv -> DEBUG	2026-04-01 11:05:07 -07:00
analista	3400098481	fix: update fetch_transcript.py for youtube-transcript-api v1.x The library removed the static get_transcript() method in v1.0. Migrate to the new instance-based fetch() API and normalize FetchedTranscriptSnippet objects back to dicts for compatibility with the rest of the script.	2026-04-01 10:49:24 -07:00
Dean Kerr	e905768ffd	fix(gateway): remap HERMES_HOME to target user in system service unit When `sudo hermes gateway install --system --run-as-user <user>` generates the systemd unit, get_hermes_home() resolves to /root/.hermes because Path.home() returns root's home under sudo. The unit correctly sets HOME= and User= via _system_service_identity(), but HERMES_HOME was computed independently and pointed to root's config directory. Add _hermes_home_for_target_user() which remaps the current HERMES_HOME to the equivalent path under the target user's home. This handles: - Default ~/.hermes → target user's ~/.hermes - Profiles (e.g. ~/.hermes/profiles/coder) → preserves relative structure - Custom paths (e.g. /opt/hermes) → kept as-is Supersedes #3861 which only handled the default case and left profiles broken (also flagged by Copilot review). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 06:09:33 -07:00
Teknium	e0abf2416d	fix: restore _config_version to 11 (reverted by stale-branch merge in #4419 ) (#4440 ) PR #4419 was based on pre-credential-pools main where _config_version was 10. The squash merge downgraded it from 11 (set by #2647) back to 10. Also fixes the test assertion.	2026-04-01 04:34:04 -07:00
Teknium	f6ada27d1c	feat(skills): size limits for agent writes + fuzzy matching for patch (#4414 ) * feat(skills): add content size limits for agent-created skills Agent writes via skill_manage (create/edit/patch/write_file) are now constrained to prevent unbounded growth: - SKILL.md and supporting files: 100,000 character limit - Supporting files: additional 1 MiB byte limit - Patches on oversized hand-placed skills that reduce the size are allowed (shrink path), but patches that grow beyond the limit are rejected Hand-placed skills and hub-installed skills have NO hard limit — they load and function normally regardless of size. Hub installs get a warning in the log if SKILL.md exceeds 100k chars. This mirrors the memory system's char_limit pattern. Without this, the agent auto-grows skills indefinitely through iterative patches (hermes-agent-dev reached 197k chars / 72k tokens — 40x larger than the largest skill in the entire skills.sh ecosystem). Constants: MAX_SKILL_CONTENT_CHARS (100k), MAX_SKILL_FILE_BYTES (1MiB) Tests: 14 new tests covering all write paths and edge cases * feat(skills): add fuzzy matching to skill patch _patch_skill now uses the same 8-strategy fuzzy matching engine (tools/fuzzy_match.py) as the file patch tool. Handles whitespace normalization, indentation differences, escape sequences, and block-anchor matching. Eliminates exact-match failures when agents patch skills with minor formatting mismatches.	2026-04-01 04:19:19 -07:00
Teknium	70744add15	feat(browser): add persistent Camofox sessions and VNC URL discovery (salvage #4400 ) (#4419 ) Adds two Camofox features: 1. Persistent browser sessions: new `browser.camofox.managed_persistence` config option. When enabled, Hermes sends a deterministic profile-scoped userId to Camofox so the server maps it to a persistent browser profile directory. Cookies, logins, and browser state survive across restarts. Default remains ephemeral (random userId per session). 2. VNC URL discovery: Camofox /health endpoint returns vncPort when running in headed mode. Hermes constructs the VNC URL and includes it in navigate responses so the agent can share it with users. Also fixes camofox_vision bug where call_llm response object was passed directly to json.dumps instead of extracting .choices[0].message.content. Changes from original PR: - Removed browser_evaluate tool (separate feature, needs own PR) - Removed snapshot truncation limit change (unrelated) - Config.yaml only for managed_persistence (no env var, no version bump) - Rewrote tests to use config mock instead of env var - Reverted package-lock.json churn Co-authored-by: analista <psikonetik@gmail.com.com>	2026-04-01 04:18:50 -07:00
Teknium	85e96a4638	fix(skills): move unified hermes-agent skill into autonomous-ai-agents category (#4435 ) The unified skill from PR #4332 was placed at a top-level skills/hermes-agent/ directory, creating a redundant standalone category. Move it to skills/autonomous-ai-agents/hermes-agent/ alongside claude-code, codex, and opencode where it belongs.	2026-04-01 03:39:25 -07:00
Teknium	c9dc6c4749	fix(insights): show cache tokens in overview so total adds up (#4428 ) The total_tokens field includes cache_read + cache_write tokens, but the display only showed input + output — making the math look wrong (e.g. 765K + 134K displayed but total said 9.2M). Now shows a cache line when cache tokens are present so all visible numbers sum to the displayed total. Affects both terminal (hermes insights) and gateway (/insights) formats.	2026-04-01 03:06:47 -07:00
kshitijk4poor	935137f0d9	feat: add inline diff previews for write actions Show inline diffs in the CLI transcript when write_file, patch, or skill_manage modifies files. Captures a filesystem snapshot before the tool runs, computes a unified diff after, and renders it with ANSI coloring in the activity feed. Adds tool_start_callback and tool_complete_callback hooks to AIAgent for pre/post tool execution notifications. Also fixes _extract_parallel_scope_path to normalize relative paths to absolute, preventing the parallel overlap detection from missing conflicts when the same file is referenced with different path styles. Gated by display.inline_diffs config option (default: true). Based on PR #3774 by @kshitijk4poor.	2026-04-01 02:13:57 -07:00
Teknium	68fc4aec21	fix: comprehensive default profile export exclusions and import guard - Add _DEFAULT_EXPORT_EXCLUDE_ROOT constant with 25+ entries to exclude from default profile exports: repo checkout (hermes-agent), worktrees, databases (state.db), caches, runtime state, logs, binaries - Add _default_export_ignore() with root-level and universal exclusions (__pycache__, .sock, .tmp at any depth) - Remove redundant shutil/tempfile imports from contributor's if-block - Block import_profile() from accepting 'default' as target name with clear guidance to use --name - Add 7 tests covering: archive creation, inclusion of profile data, exclusion of infrastructure, nested __pycache__ exclusion, import rejection without --name, import rejection with --name default, full export-import roundtrip with a different name Addresses review feedback on PR #4370.	2026-04-01 01:43:51 -07:00
Devorun	f04977f45a	fix(cli): support exporting the default root profile (#4366 )	2026-04-01 01:43:51 -07:00
Teknium	996250d178	fix(cli): pin entire TUI to bottom of terminal on startup (#4412 ) Replace the per-response padding from PR #4359 (which created a void between short responses and the prompt) with a one-time initial scroll at session start. Prints terminal_height newlines before the banner so the cursor starts at the bottom row — banner, responses, and prompt all appear pinned to the bottom with empty space above, not below. patch_stdout naturally keeps the prompt at the bottom from there, so no per-response padding is needed.	2026-04-01 01:41:09 -07:00
Bartok9	afa75a6185	fix(client): handle is_closed as method in OpenAI SDK The openai SDK's SyncAPIClient.is_closed is a method, not a property. getattr(client, 'is_closed', False) returned the bound method object, which is always truthy — causing _is_openai_client_closed() to report all clients as closed and triggering unnecessary client recreation (~100-200ms TCP+TLS overhead per API call). Fix: check if is_closed is callable and call it, otherwise treat as bool. Fixes #4377 Co-authored-by: Bartok9 <Bartok9@users.noreply.github.com>	2026-04-01 01:40:43 -07:00
Nick	9a581bba50	fix(gateway): resume agent after /approve executes blocked command When a dangerous command was blocked and the user approved it via /approve, the command was executed but the agent loop had already exited — the agent never received the command output and the task died silently. Now _handle_approve_command sends immediate feedback to the user, then creates a synthetic continuation message with the command output and feeds it through _handle_message so the agent picks up where it left off. - Send command result to chat immediately via adapter.send() - Create synthetic MessageEvent with command + output as context - Spawn asyncio task to re-invoke agent via _handle_message - Return None (feedback already sent directly) - Add test for agent re-invocation after approval - Update existing approval tests for new return behavior	2026-04-01 01:38:55 -07:00
Smyile	8327f7cc61	fix(docs): use compound selector instead of media query Target the exact state that breaks: when .navbar-sidebar--show is active on the same <nav> element. This preserves the blur on mobile when the sidebar is closed, and only removes it when the sidebar is open.	2026-04-01 01:14:39 -07:00
Smyile	7baee0b023	fix(docs): restrict backdrop-filter to desktop to fix mobile sidebar backdrop-filter on .navbar creates a new CSS stacking context that hides .navbar-sidebar menu content on mobile (only the close button is visible). Scope the blur effect to min-width: 997px so it only applies on desktop where the sidebar is not rendered inside the navbar. Ref: facebook/docusaurus#6996, facebook/docusaurus#6853	2026-04-01 01:14:39 -07:00
Teknium	efa327a998	fix: add missing provider attrs to cli_obj test fixture _show_status() now references self.provider and self._provider_source, added after the original PR was submitted.	2026-04-01 01:12:23 -07:00
Johannnnn506	9b99ea176e	fix(cli): initialize ctx_len before compact banner path	2026-04-01 01:12:23 -07:00
Teknium	a7f7e87070	fix: preserve credential_pool through smart routing and defer eager fallback on 429 (#4361 ) Three bugs prevented credential pool rotation from working when multiple Codex OAuth tokens were configured: 1. credential_pool was dropped during smart model turn routing. resolve_turn_route() constructed runtime dicts without it, so the AIAgent was created without pool access. Fixed in smart_model_routing.py (no-route and fallback paths), cli.py, and gateway/run.py. 2. Eager fallback fired before pool rotation on 429. The rate-limit handler at line ~7180 switched to a fallback provider immediately, before _recover_with_credential_pool got a chance to rotate to the next credential. Now deferred when the pool still has credentials. 3. (Non-issue) Retry budget was reported as too small, but successful pool rotations already skip retry_count increment — no change needed. Reported by community member Schinsly who identified all three root causes and verified the fix locally with multiple Codex accounts.	2026-04-01 01:02:34 -07:00
Teknium	ef2ae3e48f	fix(file_tools): refresh staleness timestamp after writes (#4390 ) After a successful write_file or patch, update the stored read timestamp to match the file's new modification time. Without this, consecutive edits by the same task (read → write → write) would false-warn on the second write because the stored timestamp still reflected the original read, not the first write. Also renames the internal tracker key from 'file_mtimes' to 'read_timestamps' for clarity.	2026-04-01 00:50:08 -07:00
SHL0MS	83dec2b3ec	fix: skip empty/whitespace text in Telegram send to prevent 400 errors Telegram API returns HTTP 400 when sent whitespace-only or empty text. Add a guard at the top of send() to silently succeed on blank content instead of crashing. Equivalent to OpenClaw #56620.	2026-03-31 19:10:26 -07:00
Laura Batalha	f4d44c777b	feat(discord): only create threads and reactions for authorized users	2026-03-31 19:06:46 -07:00
Teknium	0a6d366327	fix(security): redact secrets from execute_code sandbox output * fix: root-level provider in config.yaml no longer overrides model.provider load_cli_config() had a priority inversion: a stale root-level 'provider' key in config.yaml would OVERRIDE the canonical 'model.provider' set by 'hermes model'. The gateway reads model.provider directly from YAML and worked correctly, but 'hermes chat -q' and the interactive CLI went through the merge logic and picked up the stale root-level key. Fix: root-level provider/base_url are now only used as a fallback when model.provider/model.base_url is not set (never as an override). Also added _normalize_root_model_keys() to config.py load_config() and save_config() — migrates root-level provider/base_url into the model section and removes the root-level keys permanently. Reported by (≧▽≦) in Discord: opencode-go provider persisted as a root-level key and overrode the correct model.provider=openrouter, causing 401 errors. * fix(security): redact secrets from execute_code sandbox output The execute_code sandbox stripped env vars with secret-like names from the child process (preventing os.environ access), but scripts could still read secrets from disk (e.g. open('~/.hermes/.env')) and print them to stdout. The raw values entered the model context unredacted. terminal_tool and file_tools already applied redact_sensitive_text() to their output — execute_code was the only tool that skipped this step. Now the same redaction runs on both stdout and stderr after ANSI stripping. Reported via Discord (not filed on GitHub to avoid public disclosure of the reproduction steps).	2026-03-31 18:52:11 -07:00
Teknium	3604665e44	feat: add qwen/qwen3.6-plus-preview:free to OpenRouter and Nous model lists (#4376 )	2026-03-31 18:05:40 -07:00
Ben Barclay	c36aa5fe98	Merge pull request #4034 from bcross/docker-optimization fix(docker): optimize docker contanier image creation	2026-03-31 15:27:06 -07:00
Teknium	f8cb54ba04	fix(cli): anchor input prompt near bottom of terminal after responses (#4359 ) After short agent responses, the prompt_toolkit input area sat mid-screen with empty terminal space below it. Now prints padding newlines (half terminal height) after each response to push the prompt toward the bottom. patch_stdout renders the padding above the input area.	2026-03-31 14:56:35 -07:00
Teknium	b118f607b2	feat(skills): unify hermes-agent and hermes-agent-setup into single skill (#4332 ) Merges the hermes-agent-spawning skill (autonomous-ai-agents/) and hermes-agent-setup skill (dogfood/) into a single comprehensive skills/hermes-agent/ skill. The unified skill covers: - What Hermes Agent is and how it compares to Claude Code/Codex/OpenClaw - Complete CLI reference (all subcommands and flags) - Slash command reference - Configuration guide (providers, toolsets, config sections) - Voice/STT/TTS setup - Spawning additional agent instances (one-shot and interactive PTY) - Multi-agent coordination patterns - Troubleshooting guide - Where-to-find-things lookup table with docs links - Concise contributor quick reference Removes: - skills/autonomous-ai-agents/hermes-agent/ (hermes-agent-spawning) - skills/dogfood/hermes-agent-setup/	2026-03-31 14:49:20 -07:00
Teknium	f04986029c	feat(file_tools): detect stale files on write and patch (#4345 ) Track file mtime when read_file is called. When write_file or patch subsequently targets the same file, compare the current mtime against the recorded one. If they differ (external edit, concurrent agent, user change), include a _warning in the result advising the agent to re-read. The write still proceeds — this is a soft signal, not a hard block. Key design points: - Per-task isolation: task A's reads don't affect task B's writes. - Files never read produce no warning (not enforcing read-before-write). - mtime naturally updates after the agent's own writes, so the warning only fires on external changes, not the agent's own edits. - V4A multi-file patches check all target paths. Tests: 10 new tests covering write staleness, patch staleness, never-read files, cross-task isolation, and the helper function.	2026-03-31 14:49:00 -07:00
Teknium	f5cc597afc	fix: add CAMOFOX_PORT=9377 to Docker commands for camofox-browser (#4340 ) The camofox-browser image defaults to port 3000 internally, not 9377. Without -e CAMOFOX_PORT=9377, the -p 9377:9377 mapping silently fails because nothing listens on 9377 inside the container. E2E verified: -p 9377:9377 alone → connection reset, -p 9377:9377 -e CAMOFOX_PORT=9377 → healthy and functional.	2026-03-31 13:38:22 -07:00
Teknium	1b62ad9de7	fix: root-level provider in config.yaml no longer overrides model.provider load_cli_config() had a priority inversion: a stale root-level 'provider' key in config.yaml would OVERRIDE the canonical 'model.provider' set by 'hermes model'. The gateway reads model.provider directly from YAML and worked correctly, but 'hermes chat -q' and the interactive CLI went through the merge logic and picked up the stale root-level key. Fix: root-level provider/base_url are now only used as a fallback when model.provider/model.base_url is not set (never as an override). Also added _normalize_root_model_keys() to config.py load_config() and save_config() — migrates root-level provider/base_url into the model section and removes the root-level keys permanently. Reported by (≧▽≦) in Discord: opencode-go provider persisted as a root-level key and overrode the correct model.provider=openrouter, causing 401 errors.	2026-03-31 12:54:22 -07:00
Teknium	e3f8347be3	feat(file_tools): harden read_file with size guard, dedup, and device blocking (#4315 ) * feat(file_tools): harden read_file with size guard, dedup, and device blocking Three improvements to read_file_tool to reduce wasted context tokens and prevent process hangs: 1. Character-count guard: reads that produce more than 100K characters (≈25-35K tokens across tokenisers) are rejected with an error that tells the model to use offset+limit for a smaller range. The effective cap is min(file_size, 100K) so small files that happen to have long lines aren't over-penalised. Large truncated files also get a hint nudging toward targeted reads. 2. File-read deduplication: when the same (path, offset, limit) is read a second time and the file hasn't been modified (mtime unchanged), return a lightweight stub instead of re-sending the full content. Writes and patches naturally change mtime, so post-edit reads always return fresh content. The dedup cache is cleared on context compression — after compression the original read content is summarised away, so the model needs the full content again. 3. Device path blocking: paths like /dev/zero, /dev/random, /dev/stdin etc. are rejected before any I/O to prevent process hangs from infinite-output or blocking-input devices. Tests: 17 new tests covering all three features plus the dedup-reset- on-compression integration. All 52 file-read tests pass (35 existing + 17 new). Full tool suite (2124 tests) passes with 0 failures. * feat: make file_read_max_chars configurable, add docs Add file_read_max_chars to DEFAULT_CONFIG (default 100K). read_file_tool reads this on first call and caches for the process lifetime. Users on large-context models can raise it; users on small local models can lower it. Also adds a 'File Read Safety' section to the configuration docs explaining the char limit, dedup behavior, and example values.	2026-03-31 12:53:19 -07:00
Teknium	d3f1987a05	fix(security): add .config/gh to read protection for @file references (#4327 ) Follow-up to PR #4305 — .config/gh was added to the write-deny list but missed from _SENSITIVE_HOME_DIRS, leaving GitHub CLI OAuth tokens exposed via @file:~/.config/gh/hosts.yml context injection.	2026-03-31 12:48:30 -07:00
maymuneth	655eea2db8	fix(security): protect .docker, .azure, and .config/gh from read and write	2026-03-31 12:47:10 -07:00
binhnt92	c94a5fa1b2	fix(cli): use atomic write in save_config_value to prevent config loss on interrupt save_config_value() used bare open(path, 'w') + yaml.dump() which truncates the file to zero bytes on open. If the process is interrupted mid-write, config.yaml is left empty. Replace with atomic_yaml_write() (temp file + fsync + os.replace), matching the gateway config write path. Co-authored-by: Hermes Agent <hermes@nousresearch.com>	2026-03-31 12:21:55 -07:00
Teknium	7f78deebe7	fix: apply same path traversal checks to config-based credential files _load_config_files() had the same hermes_home / item pattern without containment checks. While config.yaml is user-controlled (lower threat than skill frontmatter), defense in depth prevents exploitation via config injection or copy-paste mistakes.	2026-03-31 12:16:37 -07:00
maymuneth	a97641b9f2	fix(security): reject path traversal in credential file registration	2026-03-31 12:16:37 -07:00
Gutslabs	0f2ea2062b	fix(profiles): validate tar archive member paths on import Fixes a zip-slip path traversal vulnerability in hermes profile import. shutil.unpack_archive() on untrusted tar members allows entries like ../../escape.txt to write files outside ~/.hermes/profiles/. - Add _normalize_profile_archive_parts() to reject absolute paths (POSIX and Windows), traversal (..), empty paths, backslash tricks - Add _safe_extract_profile_archive() for manual per-member extraction that only allows regular files and directories (rejects symlinks) - Replace shutil.unpack_archive() with the safe extraction path - Add regression tests for traversal and absolute-path attacks Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>	2026-03-31 12:14:27 -07:00
0xbyt4	08171c1c31	fix: allow voice mode in WSL when PulseAudio bridge is configured WSL detection was treated as a hard fail, blocking voice mode even when audio worked via PulseAudio bridge. Now PULSE_SERVER env var presence makes WSL a soft notice instead of a blocking warning. Device query failures in WSL with PULSE_SERVER are also treated as non-blocking.	2026-03-31 12:13:33 -07:00
Teknium	7f670a06cf	feat: add --max-turns CLI flag to hermes chat Exposes the existing max_turns parameter (cli.py main()) as a CLI flag so programmatic callers (Paperclip adapter, scripts) can control the agent's tool-calling iteration limit without editing config.yaml. Priority chain unchanged: CLI flag > config agent.max_turns > env HERMES_MAX_ITERATIONS > default 90.	2026-03-31 12:10:12 -07:00
curtitoo	cac9d20c4f	test: add codex transport drop regression	2026-03-31 12:05:06 -07:00
curtitoo	e75964d46d	fix: harden codex responses transport handling	2026-03-31 12:05:06 -07:00
Teknium	161acb0086	fix: credential pool 401 recovery rotates to next credential after failed refresh (#4300 ) When an OAuth token refresh fails on a 401 error, the pool recovery would return 'not recovered' without trying the next credential in the pool. This meant users who added a second valid credential via 'hermes auth add' would never see it used when the primary credential was dead. Now: try refresh first (handles expired tokens quickly), and if that fails, rotate to the next available credential — same as 429/402 already did. Adds three tests covering 401 refresh success, refresh-fail-then-rotate, and refresh-fail-with-no-remaining-credentials.	2026-03-31 12:02:29 -07:00
Teknium	143b74ec00	fix: first-run guard stuck in loop when provider configured via config.yaml (#4298 ) The _has_any_provider_configured() guard only checked env vars, .env file, and auth.json — missing config.yaml model.provider/base_url/api_key entirely. Users who configured a provider through setup (saving to config.yaml) but had empty API key placeholders in .env from the install template were permanently blocked by the 'not configured' message. Changes: - _has_any_provider_configured() now checks config.yaml model section for explicit provider, base_url, or api_key — covers custom endpoints and providers that store credentials in config rather than env vars - .env.example: comment out all empty API key placeholders so they don't pollute the environment when copied to .env by the installer - .env.example: mark LLM_MODEL as deprecated (config.yaml is source of truth) - 4 new tests for the config.yaml detection path Reported by OkadoOP on Discord.	2026-03-31 11:42:52 -07:00
Teknium	57625329a2	docs+feat: comprehensive local LLM provider guides and context length warning (#4294 ) * docs: update llama.cpp section with --jinja flag and tool calling guide The llama.cpp docs were missing the --jinja flag which is required for tool calling to work. Without it, models output tool calls as raw JSON text instead of structured API responses, making Hermes unable to execute them. Changes: - Add --jinja and -fa flags to the server startup example - Replace deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with hermes model interactive setup - Add caution block explaining the --jinja requirement and symptoms - List models with native tool calling support - Add /props endpoint verification tip * docs+feat: comprehensive local LLM provider guides and context length warning Docs (providers.md): - Rewrote Ollama section with context length warning (defaults to 4k on <24GB VRAM), three methods to increase it, and verification steps - Rewrote vLLM section with --max-model-len, tool calling flags (--enable-auto-tool-choice, --tool-call-parser), and context guidance - Rewrote SGLang section with --context-length, --tool-call-parser, and warning about 128-token default max output - Added LM Studio section (port 1234, context length defaults to 2048, tool calling since 0.3.6) - Added llama.cpp context length flag (-c) and GPU offload (-ngl) - Added Troubleshooting Local Models section covering: - Tool calls appearing as text (with per-server fix table) - Silent context truncation and diagnosis commands - Low detected context at startup - Truncated responses - Replaced all deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with hermes model interactive setup and config.yaml examples - Added deprecation warning for legacy env vars in General Setup Code (cli.py): - Added context length warning in show_banner() when detected context is <= 8192 tokens, with server-specific fix hints: - Ollama (port 11434): suggests OLLAMA_CONTEXT_LENGTH env var - LM Studio (port 1234): suggests model settings adjustment - Other servers: suggests config.yaml override Tests: - 9 new tests covering warning thresholds, server-specific hints, and no-warning cases	2026-03-31 11:42:48 -07:00
arasovic	0240baa357	fix: strip orphaned think/reasoning tags from user-facing responses Some models (e.g. Kimi K2.5 on Alibaba OpenAI-compatible endpoint) emit reasoning text followed by a closing </think> without a matching opening <think> tag. The existing paired-tag regexes in _strip_think_blocks() cannot match these orphaned tags, so </think> leaks into user-facing responses on all platforms. Add a catch-all regex that strips any remaining opening or closing think/thinking/reasoning/REASONING_SCRATCHPAD tags after the existing paired-block removal pass. Closes #4285	2026-03-31 11:42:44 -07:00
Dakota Secula-Rosell	c1606aed69	fix(cli): allow empty strings and falsy values in config set `hermes config set KEY ""` and `hermes config set KEY 0` were rejected because the guard used `not value` which is truthy for empty strings, zero, and False. Changed to `value is None` so only truly missing arguments are rejected. Closes #4277 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 11:41:12 -07:00
MacroAnarchy	49d7210fed	fix(gateway): parse thread_id from delivery target format The delivery target parser uses split(':', 1) which only splits on the first colon. For the documented format platform:chat_id:thread_id (e.g. 'telegram:-1001234567890:17585'), thread_id gets munged into chat_id and is never extracted. Fix: split(':', 2) to correctly extract all three parts. Also fix to_string() to include thread_id for proper round-tripping. The downstream plumbing in _deliver_to_platform() already handles thread_id correctly (line 292-293) — it just never received a value.	2026-03-31 10:45:27 -07:00
Teknium	84a541b619	feat: support * wildcard in platform allowlists and improve WhatsApp docs * docs: clarify WhatsApp allowlist behavior and document WHATSAPP_ALLOW_ALL_USERS - Add WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to env vars reference - Warn that * is not a wildcard and silently blocks all messages - Show WHATSAPP_ALLOWED_USERS as optional, not required - Update troubleshooting with the * trap and debug mode tip - Fix Security section to mention the allow-all alternative Prompted by a user report in Discord where WHATSAPP_ALLOWED_USERS=* caused all incoming messages to be silently dropped at the bridge level. * feat: support * wildcard in platform allowlists Follow the precedent set by SIGNAL_GROUP_ALLOWED_USERS which already supports * as an allow-all wildcard. Bridge (allowlist.js): matchesAllowedUser() now checks for * in the allowedUsers set before iterating sender aliases. Gateway (run.py): _is_authorized() checks for * in allowed_ids after parsing the allowlist. This is generic — works for all platforms, not just WhatsApp. Updated docs to document * as a supported value instead of warning against it. Added WHATSAPP_ALLOW_ALL_USERS and WHATSAPP_DEBUG to the env vars reference. Tests: JS allowlist test + 2 Python gateway tests (WhatsApp + Telegram to verify cross-platform behavior).	2026-03-31 10:42:03 -07:00
Teknium	cca0996a28	fix(browser): skip SSRF check for local backends (Camofox, headless Chromium) (#4292 ) The SSRF protection added in #3041 blocks all private/internal addresses unconditionally in browser_navigate(). This prevents legitimate local use cases (localhost apps, LAN devices) when using Camofox or the built-in headless Chromium without a cloud provider. The check is only meaningful for cloud backends (Browserbase, BrowserUse) where the agent could reach internal resources on a remote machine. Local backends give the user full terminal and network access already — the SSRF check adds zero security value. Add _is_local_backend() helper that returns True when Camofox is active or no cloud provider is configured. Both the pre-navigation and post-redirect SSRF checks now skip when running locally. The browser.allow_private_urls config option remains available as an explicit opt-out for cloud mode.	2026-03-31 10:40:13 -07:00
Teknium	fad3f338d1	fix: patch _REDACT_ENABLED in test fixture for module-level snapshot The _REDACT_ENABLED constant is snapshotted at import time, so monkeypatch.delenv() alone doesn't re-enable redaction during tests when HERMES_REDACT_SECRETS=false is set in the host environment.	2026-03-31 10:30:48 -07:00
Dilee	6dcc3330b3	fix(security): add missing GitHub OAuth token patterns and snapshot redact flag - Add gho_, ghu_, ghs_, ghr_ prefix patterns (OAuth, user-to-server, server-to-server, and refresh tokens) — all four types used by GitHub Apps and Copilot auth flows were absent from _PREFIX_PATTERNS - Snapshot HERMES_REDACT_SECRETS at module import time instead of re-reading os.getenv() on every call, preventing runtime env mutations (e.g. LLM-generated export commands) from disabling redaction	2026-03-31 10:30:48 -07:00
Bryan Cross	289df5dd1c	Merge branch 'NousResearch:main' into docker-optimization	2026-03-31 07:08:44 -05:00
Teknium	344239c2db	feat: auto-detect models from server probe in custom endpoint setup (#4218 ) Custom endpoint setup (_model_flow_custom) now probes the server first and presents detected models instead of asking users to type blind: - Single model: auto-confirms with Y/n prompt - Multiple models: numbered list picker, or type a name - No models / probe failed: falls back to manual input Context length prompt also moved after model selection so the user sees the verified endpoint before being asked for details. All recent fixes preserved: config dict sync (#4172), api_key persistence (#4182), no save_env_value for URLs (#4165). Inspired by PR #4194 by sudoingX — re-implemented against current main. Co-authored-by: Xpress AI (Dip KD) <200180104+sudoingX@users.noreply.github.com>	2026-03-31 03:29:00 -07:00
Teknium	79b2694b9a	fix: _allow_private_urls name collision + stale OPENAI_BASE_URL test (#4217 ) 1. browser_tool.py: _allow_private_urls() used 'global _allow_private_urls' then assigned a bool to it, replacing the function in the module namespace. After first call, subsequent calls hit TypeError: 'bool' object is not callable. Renamed cache variable to _cached_allow_private_urls. 2. test_provider_parity.py: test_custom_endpoint_when_no_nous relied on OPENAI_BASE_URL env var (removed in config refactor). Mock _resolve_custom_runtime directly instead.	2026-03-31 03:16:40 -07:00
Teknium	8d59881a62	feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647 ) * feat(auth): add same-provider credential pools and rotation UX Add same-provider credential pooling so Hermes can rotate across multiple credentials for a single provider, recover from exhausted credentials without jumping providers immediately, and configure that behavior directly in hermes setup. - agent/credential_pool.py: persisted per-provider credential pools - hermes auth add/list/remove/reset CLI commands - 429/402/401 recovery with pool rotation in run_agent.py - Setup wizard integration for pool strategy configuration - Auto-seeding from env vars and existing OAuth state Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Salvaged from PR #2647 * fix(tests): prevent pool auto-seeding from host env in credential pool tests Tests for non-pool Anthropic paths and auth remove were failing when host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials were present. The pool auto-seeding picked these up, causing unexpected pool entries in tests. - Mock _select_pool_entry in auxiliary_client OAuth flag tests - Clear Anthropic env vars and mock _seed_from_singletons in auth remove test * feat(auth): add thread safety, least_used strategy, and request counting - Add threading.Lock to CredentialPool for gateway thread safety (concurrent requests from multiple gateway sessions could race on pool state mutations without this) - Add 'least_used' rotation strategy that selects the credential with the lowest request_count, distributing load more evenly - Add request_count field to PooledCredential for usage tracking - Add mark_used() method to increment per-credential request counts - Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current() with lock acquisition - Add tests: least_used selection, mark_used counting, concurrent thread safety (4 threads × 20 selects with no corruption) * feat(auth): add interactive mode for bare 'hermes auth' command When 'hermes auth' is called without a subcommand, it now launches an interactive wizard that: 1. Shows full credential pool status across all providers 2. Offers a menu: add, remove, reset cooldowns, set strategy 3. For OAuth-capable providers (anthropic, nous, openai-codex), the add flow explicitly asks 'API key or OAuth login?' — making it clear that both auth types are supported for the same provider 4. Strategy picker shows all 4 options (fill_first, round_robin, least_used, random) with the current selection marked 5. Remove flow shows entries with indices for easy selection The subcommand paths (hermes auth add/list/remove/reset) still work exactly as before for scripted/non-interactive use. * fix(tests): update runtime_provider tests for config.yaml source of truth (#4165) Tests were using OPENAI_BASE_URL env var which is no longer consulted after #4165. Updated to use model config (provider, base_url, api_key) which is the new single source of truth for custom endpoint URLs. * feat(auth): support custom endpoint credential pools keyed by provider name Custom OpenAI-compatible endpoints all share provider='custom', making the provider-keyed pool useless. Now pools for custom endpoints are keyed by 'custom:<normalized_name>' where the name comes from the custom_providers config list (auto-generated from URL hostname). - Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)' - load_pool('custom:name') seeds from custom_providers api_key AND model.api_key when base_url matches - hermes auth add/list now shows custom endpoints alongside registry providers - _resolve_openrouter_runtime and _resolve_named_custom_runtime check pool before falling back to single config key - 6 new tests covering custom pool keying, seeding, and listing * docs: add Excalidraw diagram of full credential pool flow Comprehensive architecture diagram showing: - Credential sources (env vars, auth.json OAuth, config.yaml, CLI) - Pool storage and auto-seeding - Runtime resolution paths (registry, custom, OpenRouter) - Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh) - CLI management commands and strategy configuration Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g * fix(tests): update setup wizard pool tests for unified select_provider_and_model flow The setup wizard now delegates to select_provider_and_model() instead of using its own prompt_choice-based provider picker. Tests needed: - Mock select_provider_and_model as no-op (provider pre-written to config) - Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it) - Pre-write model.provider to config so the pool step is reached * docs: add comprehensive credential pool documentation - New page: website/docs/user-guide/features/credential-pools.md Full guide covering quick start, CLI commands, rotation strategies, error recovery, custom endpoint pools, auto-discovery, thread safety, architecture, and storage format. - Updated fallback-providers.md to reference credential pools as the first layer of resilience (same-provider rotation before cross-provider) - Added hermes auth to CLI commands reference with usage examples - Added credential_pool_strategies to configuration guide * chore: remove excalidraw diagram from repo (external link only) * refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns - _load_config_safe(): replace 4 identical try/except/import blocks - _iter_custom_providers(): shared generator for custom provider iteration - PooledCredential.extra dict: collapse 11 round-trip-only fields (token_type, scope, client_id, portal_base_url, obtained_at, expires_in, agent_key_id, agent_key_expires_in, agent_key_reused, agent_key_obtained_at, tls) into a single extra dict with __getattr__ for backward-compatible access - _available_entries(): shared exhaustion-check between select and peek - Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical) - SimpleNamespace replaces class _Args boilerplate in auth_commands - _try_resolve_from_custom_pool(): shared pool-check in runtime_provider Net -17 lines. All 383 targeted tests pass. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-31 03:10:01 -07:00
Teknium	2ae50bdddd	fix(telegram): enforce 32-char limit on command names with collision avoidance (#4211 ) Telegram Bot API requires command names to be 1-32 characters. Plugin and skill names that exceed this limit now get truncated. If truncation creates a collision (with core commands, other plugins, or other skills), the name is shortened to 31 chars and a digit 0-9 is appended. Adds _clamp_telegram_names() helper used for both plugin and skill entries in telegram_menu_commands(). Core CommandDef commands are tracked as reserved names so truncated plugin/skill names never shadow them. Addresses the fix from PR #4191 (sroecker) with collision-safe truncation. Tests: 9 new tests covering truncation, digit suffixes, exhaustion, dedup.	2026-03-31 02:41:50 -07:00
Nils	50302ed70a	fix(tools): make browser SSRF check configurable via browser.allow_private_urls (#4198 ) * fix(tools): skip SSRF check in local browser mode The SSRF protection added in #3041 blocks all private/internal addresses unconditionally in browser_navigate(). This prevents legitimate local development use cases (localhost testing, LAN device access) when using the local Chromium backend. The SSRF check is only meaningful for cloud browsers (Browserbase, BrowserUse) where the agent could reach internal resources on a remote machine. In local mode, the user already has full terminal and network access, so the check adds no security value. This change makes the SSRF check conditional on _get_cloud_provider(), keeping full protection in cloud mode while allowing private addresses in local mode. * fix(tools): make SSRF check configurable via browser.allow_private_urls Replace unconditional SSRF check with a configurable setting. Default (False) keeps existing security behavior. Setting to True allows navigating to private/internal IPs for local dev and LAN use cases. --------- Co-authored-by: Nils (Norya) <nils@begou.dev>	2026-03-31 02:11:55 -07:00
Teknium	086ec5590d	fix: gate Claude Code credentials behind explicit Hermes config in wizard trigger (#4210 ) If a user has Claude Code installed but never configured Hermes, the first-run guard found those external credentials and skipped the setup wizard. Users got silently routed to someone else's inference without being asked. Now _has_any_provider_configured() checks whether Hermes itself has been explicitly configured (model in config differs from hardcoded default) before counting Claude Code credentials. Fresh installs trigger the wizard regardless of what external tools are on the machine. Salvaged from PR #4194 by sudoingX — wizard trigger fix only. Model auto-detect change under separate review. Co-authored-by: Xpress AI (Dip KD) <200180104+sudoingX@users.noreply.github.com>	2026-03-31 02:01:15 -07:00
Teknium	c53a296df1	feat: add MiniMax M2.7 to hermes model picker and opencode-go (#4208 ) Add MiniMax-M2.7 and M2.7-highspeed to _PROVIDER_MODELS for minimax and minimax-cn providers in main.py so hermes model shows them. Update opencode-go bare ID from m2.5 to m2.7 in models.py. Salvaged from PR #4197 by octo-patch.	2026-03-31 01:54:13 -07:00
Teknium	1bca6f3930	fix: save API key to model config for custom endpoints (#4182 ) Custom cloud endpoints (Together.ai, RunPod, Groq, etc.) lost their API key after #4165 removed OPENAI_API_KEY .env saves. The key was only saved to the custom_providers list which is unreachable at runtime for plain 'custom' provider resolution. Save model.api_key to config.yaml alongside model.provider and model.base_url in all three custom endpoint code paths: - _model_flow_custom (new endpoint with model name) - _model_flow_custom (new endpoint without model name) - _model_flow_named_custom (switching to a saved endpoint) The runtime resolver already reads model.api_key (runtime_provider.py line 224-228), so the key is picked up automatically. Each custom endpoint carries its own key in config — no shared OPENAI_API_KEY env var needed.	2026-03-31 01:36:15 -07:00
Teknium	a994cf5e5a	docs: update adding-providers guide for unified setup flow setup_model_provider() now delegates to select_provider_and_model() from main.py, so new providers only need to be wired in main.py. Removed setup.py from file checklists, replaced the setup.py section with a tip explaining the automatic inheritance.	2026-03-31 01:29:43 -07:00
Teknium	ff78ad4c81	feat: add discord.reactions config option to disable message reactions (#4199 ) Adds a 'reactions' key under the discord config section (default: true). When set to false, the bot no longer adds 👀/✅/❌ reactions to messages during processing. The config maps to DISCORD_REACTIONS env var following the same pattern as require_mention and auto_thread. Files changed: - hermes_cli/config.py: Add reactions default to DEFAULT_CONFIG - gateway/config.py: Map discord.reactions to DISCORD_REACTIONS env var - gateway/platforms/discord.py: Gate on_processing_start/complete hooks - tests/gateway/test_discord_reactions.py: 3 new tests for config gate	2026-03-31 01:24:48 -07:00
Teknium	491e79bca9	refactor: unify setup wizard provider selection with hermes model setup_model_provider() had 800+ lines of duplicated provider handling that reimplemented the same credential prompting, OAuth flows, and model selection that hermes model already provides via the _model_flow_* functions. Every new provider had to be added in both places, and the two implementations diverged in config persistence (setup.py did raw YAML writes, _set_model_provider, and _update_config_for_provider depending on the provider — main.py used its own load/save cycle). This caused the #4172 bug: _model_flow_custom saved config to disk but the wizard's final save_config(config) overwrote it with stale values. Fix: extract the core of cmd_model() into select_provider_and_model() and have setup_model_provider() call it. After the call, re-sync the wizard's config dict from disk. Deletes ~800 lines of duplicated provider handling from setup.py. Also fixes cmd_model() double-AuthError crash on fresh installs with no API keys configured.	2026-03-31 01:04:07 -07:00
Teknium	89d8127772	fix: setup wizard overwrites custom endpoint config (#4172 ) _model_flow_custom() saved model.provider and model.base_url to disk via its own load_config/save_config cycle, but never updated the setup wizard's in-memory config dict. The wizard's final save_config(config) then overwrote the custom settings with the stale default string model value. Fix: after saving to disk, also mutate the caller's config dict so the wizard's final save preserves model.provider='custom' and the base_url. Both the model_name and no-model_name branches are covered. Added regression tests that simulate the full wizard flow including the final save_config(config) call — the step that was previously untested.	2026-03-30 23:17:26 -07:00
Teknium	f890a94c12	refactor: make config.yaml the single source of truth for endpoint URLs (#4165 ) OPENAI_BASE_URL was written to .env AND config.yaml, creating a dual-source confusion. Users (especially Docker) would see the URL in .env and assume that's where all config lives, then wonder why LLM_MODEL in .env didn't work. Changes: - Remove all 27 save_env_value("OPENAI_BASE_URL", ...) calls across main.py, setup.py, and tools_config.py - Remove OPENAI_BASE_URL env var reading from runtime_provider.py, cli.py, models.py, and gateway/run.py - Remove LLM_MODEL/HERMES_MODEL env var reading from gateway/run.py and auxiliary_client.py — config.yaml model.default is authoritative - Vision base URL now saved to config.yaml auxiliary.vision.base_url (both setup wizard and tools_config paths) - Tests updated to set config values instead of env vars Convention enforced: .env is for SECRETS only (API keys). All other configuration (model names, base URLs, provider selection) lives exclusively in config.yaml.	2026-03-30 22:02:53 -07:00
Teknium	4d7e3c7157	fix(tests): provide model name in Codex 401 refresh tests for CI (#4166 ) CI has no config.yaml, so cron/gateway resolve an empty model name. The Codex Responses validator rejects empty models before the mock API call is reached. Provide explicit model in job dict and env var.	2026-03-30 21:17:09 -07:00
Teknium	1bd206ea5d	feat: add /btw command for ephemeral side questions (#4161 ) Adds /btw <question> — ask a quick follow-up using the current session context without interrupting the main conversation. - Snapshots conversation history, answers with a no-tools agent - Response is not persisted to session history or DB - Runs in a background thread (CLI) / async task (gateway) - Per-session guard prevents concurrent /btw in gateway Implementation: - model_tools.py: enabled_toolsets=[] now correctly means "no tools" (was falsy, fell through to default "all tools") - run_agent.py: persist_session=False gates _persist_session() - cli.py: _handle_btw_command (background thread, Rich panel output) - gateway/run.py: _handle_btw_command + _run_btw_task (async task) - hermes_cli/commands.py: CommandDef for "btw" Inspired by PR #3504 by areu01or00, reimplemented cleanly on current main with the enabled_toolsets=[] fix and without the __btw_no_tools__ hack.	2026-03-30 21:10:05 -07:00
Teknium	f8e1ee10aa	Fix profile list model display (#4160 ) Co-authored-by: txhno <roshwarrier@gmail.com>	2026-03-30 20:40:13 -07:00
Teknium	c1ef9b2250	fix(cli): ensure on_session_end hook fires on interrupted exits (#4159 ) - Add SIGTERM/SIGHUP signal handlers for graceful shutdown - Add BrokenPipeError to exit exception handling (SSH disconnects) - Fire on_session_end plugin hook in finally block, guarded by _agent_running to avoid double-firing on normal exits (the hook already fires per-turn from run_conversation) Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>	2026-03-30 20:37:17 -07:00
Teknium	3a68ec3172	feat: add Fireworks context length detection support (#4158 ) - Add api.fireworks.ai to _URL_TO_PROVIDER for automatic provider detection - Add fireworks to PROVIDER_TO_MODELS_DEV mapped to 'fireworks-ai' (the correct models.dev provider key — original PR used 'fireworks' which would silently fail the lookup) Cherry-picked from PR #3989 with models.dev key fix. Co-authored-by: sroecker <sroecker@users.noreply.github.com>	2026-03-30 20:37:08 -07:00
Teknium	d30ea65c9b	fix: URL-based auth for third-party Anthropic endpoints + CI test fixes (#4148 ) * fix(tests): mock sys.stdin.isatty for cmd_model TTY guard * fix(tests): update camofox snapshot format + trajectory compressor mock path - test_browser_camofox: mock response now uses snapshot format (accessibility tree) - test_trajectory_compressor: mock _get_async_client instead of setting async_client directly * fix: URL-based auth detection for third-party Anthropic endpoints + test fixes Reverts the key-prefix approach from #4093 which broke JWT and managed key OAuth detection. Instead, detects third-party endpoints by URL: if base_url is set and isn't anthropic.com, it's a proxy (Azure AI Foundry, AWS Bedrock, etc.) that uses x-api-key regardless of key format. Auth decision chain is now: 1. _requires_bearer_auth(url) → MiniMax → Bearer 2. _is_third_party_anthropic_endpoint(url) → Azure/Bedrock → x-api-key 3. _is_oauth_token(key) → OAuth on direct Anthropic → Bearer 4. else → x-api-key Also includes test fixes from PR #4051 by @erosika: - Mock sys.stdin.isatty for cmd_model TTY guard - Update camofox snapshot format mock - Fix trajectory compressor async client mock path --------- Co-authored-by: Erosika <eri@plasticlabs.ai>	2026-03-30 20:36:56 -07:00
Teknium	fb4b87f4af	chore: add claude-sonnet-4.6 to OpenRouter and Nous model lists (#4157 )	2026-03-30 20:33:21 -07:00
Teknium	5b0243e6ad	docs: deep quality pass — expand 10 thin pages, fix specific issues (#4134 ) Developer guide stubs expanded to full documentation: - trajectory-format.md: 56→233 lines (JSONL format, ShareGPT example, normalization rules, reasoning markup, replay code) - session-storage.md: 66→388 lines (SQLite schema, migration table, FTS5 search syntax, lineage queries, Python API examples) - context-compression-and-caching.md: 72→321 lines (dual compression system, config defaults, 4-phase algorithm, before/after example, prompt caching mechanics, cache-aware patterns) - tools-runtime.md: 65→246 lines (registry API, dispatch flow, availability checking, error wrapping, approval flow) - prompt-assembly.md: 89→246 lines (concrete assembled prompt example, SOUL.md injection, context file discovery table) User-facing pages expanded: - docker.md: 62→224 lines (volumes, env forwarding, docker-compose, resource limits, troubleshooting) - updating.md: 79→167 lines (update behavior, version checking, rollback instructions, Nix users) - skins.md: 80→206 lines (all color/spinner/branding keys, built-in skin descriptions, full custom skin YAML template) Hub pages improved: - integrations/index.md: 25→82 lines (web search backends table, TTS/browser providers, quick config example) - features/overview.md: added Integrations section with 6 missing links Specific fixes: - configuration.md: removed duplicate Gateway Streaming section - mcp.md: removed internal "PR work" language - plugins.md: added inline minimal plugin example (self-contained) 13 files changed, ~1700 lines added. Docusaurus build verified clean.	2026-03-30 20:30:11 -07:00
Teknium	54b876a5c9	fix: add actionable guidance to context-exceeded error messages (#4155 ) When context compression fails, users now see hints suggesting /new or /compress instead of a dead-end error. Covers all 4 error paths: payload-too-large, max compression attempts (2 paths), and context length exceeded. Closes #4061 Salvaged from PR #4076 by SHL0MS. Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>	2026-03-30 20:23:28 -07:00
Teknium	83e5249be6	fix(gateway): use setsid instead of systemd-run --user for /update (salvage #4024 ) (#4104 ) Salvaged from PR #4024 by @Sertug17. Fixes #4017. - Replace systemd-run --user --scope with setsid for portable session detach - Add system-level service detection to cmd_update gateway restart - Falls back to start_new_session=True on systems without setsid (macOS, minimal containers)	2026-03-30 20:22:09 -07:00
Teknium	fb2af3bd1d	docs: document tool progress streaming in API server and Open WebUI (#4138 ) Update docs to reflect that tool progress now streams inline during SSE responses. Previously docs said tool calls were invisible. - api-server.md: add 'Tool progress in streams' note to streaming docs - open-webui.md: update 'How It Works' steps, add Tool Progress tip	2026-03-30 19:40:39 -07:00
Teknium	cc63b2d1cd	fix(gateway): remove user-facing compression warnings (#4139 ) Auto-compression still runs silently in the background with server-side logging, but no longer sends messages to the user's chat about it. Removed: - 'Session is large... Auto-compressing' pre-compression notification - 'Compressed: N → M messages' post-compression notification - 'Session is still very large after compression' warning - 'Auto-compression failed' warning - Rate-limit tracking (only existed for these warnings)	2026-03-30 19:17:07 -07:00
Teknium	45396aaa92	fix(alibaba): use standard DashScope international endpoint (#4133 ) * fix(alibaba): use standard DashScope international endpoint The Alibaba Cloud provider was hardcoded to the coding-intl endpoint (https://coding-intl.dashscope.aliyuncs.com/v1) which only accepts Alibaba Coding Plan API keys. Standard DashScope API keys fail with invalid_api_key error against this endpoint. Changed to the international compatible-mode endpoint (https://dashscope-intl.aliyuncs.com/compatible-mode/v1) which works with standard DashScope keys. Users with Coding Plan keys or China-region keys can still override via DASHSCOPE_BASE_URL or config.yaml base_url. Fixes #3912 * fix: update test to match new DashScope default endpoint --------- Co-authored-by: kagura-agent <kagura.chen28@gmail.com>	2026-03-30 19:06:30 -07:00
Teknium	04367e2fac	fix(cron): stop truncating job IDs in list view (#4132 ) Remove [:8] truncation from hermes cron list output. Job IDs are 12 hex chars — truncating to 8 makes them unusable for cron run/pause/remove which require the full ID. Co-authored-by: vitobotta <vitobotta@users.noreply.github.com>	2026-03-30 19:05:34 -07:00
Teknium	cdb64a869a	fix(security): reject private and loopback IPs in Telegram DoH fallback (#4129 ) Co-authored-by: Maymun <139681654+maymuneth@users.noreply.github.com>	2026-03-30 18:53:24 -07:00
Teknium	1e59d4813c	feat(api_server): stream tool progress to Open WebUI (#4092 ) Wire the existing tool_progress_callback through the API server's streaming handler so Open WebUI users see what tool is running. Uses the existing 3-arg callback signature (name, preview, args) that fires at tool start — no changes to run_agent.py needed. Progress appears as inline markdown in the SSE content stream. Inspired by PR #4032 by sroecker, reimplemented to avoid breaking the callback signature used by CLI and gateway consumers.	2026-03-30 18:50:27 -07:00
Teknium	f776191650	fix: persist compressed context to gateway session after mid-run compression When context compression fires during run_conversation() in the gateway, the compressed messages were silently lost on the next turn. Two bugs: 1. Agent-side: _flush_messages_to_session_db() calculated flush_from = max(len(conversation_history), _last_flushed_db_idx). After compression, _last_flushed_db_idx was correctly reset to 0, but conversation_history still had its original pre-compression length (e.g. 200). Since compressed messages are shorter (~30), messages[200:] was empty — nothing written to the new session's SQLite. Fix: Set conversation_history = None after each _compress_context() call so start_idx = 0 and all compressed messages are flushed. 2. Gateway-side: history_offset was always len(agent_history) — the original pre-compression length. After compression shortened the message list, agent_messages[200:] was empty, causing the gateway to fall back to writing only a user/assistant pair, losing the compressed summary and tail context. Fix: Detect session splits (agent.session_id != original) and set history_offset = 0 so all compressed messages are written to JSONL.	2026-03-30 18:49:14 -07:00
Teknium	44d02f35d2	docs: restructure site navigation — promote features and platforms to top-level (#4116 ) Major reorganization of the documentation site for better discoverability and navigation. 94 pages across 8 top-level sections (was 5). Structural changes: - Promote Features from 3-level-deep subcategory to top-level section with new Overview hub page categorizing all 26 feature pages - Promote Messaging Platforms from User Guide subcategory to top-level section, add platform comparison matrix (13 platforms x 7 features) - Create new Integrations section with hub page, grouping MCP, ACP, API Server, Honcho, Provider Routing, Fallback Providers - Extract AI provider content (626 lines) from configuration.md into dedicated integrations/providers.md — configuration.md drops from 1803 to 1178 lines - Subcategorize Developer Guide into Architecture, Extending, Internals - Rename "User Guide" to "Using Hermes" for top-level items Orphan fixes (7 pages now reachable via sidebar): - build-a-hermes-plugin.md added to Guides - sms.md added to Messaging Platforms - context-references.md added to Features > Core - plugins.md added to Features > Core - git-worktrees.md added to Using Hermes - checkpoints-and-rollback.md added to Using Hermes - checkpoints.md (30-line stub) deleted, superseded by checkpoints-and-rollback.md (203 lines) New files: - integrations/index.md — Integrations hub page - integrations/providers.md — AI provider setup (extracted) - user-guide/features/overview.md — Features hub page Broken link fixes: - quickstart.md, faq.md: update context-length-detection anchors - configuration.md: update checkpoints link - overview.md: fix checkpoint link path Docusaurus build verified clean (zero broken links/anchors).	2026-03-30 18:39:51 -07:00
Teknium	b2e1a095f8	fix(anthropic): write scopes field to Claude Code credentials on token refresh (#4126 ) Claude Code >=2.1.81 checks for a 'scopes' array containing 'user:inference' in ~/.claude/.credentials.json before accepting stored OAuth tokens as valid. When Hermes refreshes the token, it writes only accessToken, refreshToken, and expiresAt — omitting the scopes field. This causes Claude Code to report 'loggedIn: false' and refuse to start, even though the token is valid. This commit: - Parses the 'scope' field from the OAuth refresh response - Passes it to _write_claude_code_credentials() as a keyword argument - Persists the scopes array in the claudeAiOauth credential store - Preserves existing scopes when the refresh response omits the field Tested against Claude Code v2.1.87 on Linux — auth status correctly reports loggedIn: true and claude --print works after this fix. Co-authored-by: Nick <git@flybynight.io>	2026-03-30 18:35:16 -07:00
Teknium	ffd5d37f9b	fix: treat non-sk-ant- keys as regular API keys, not OAuth tokens (#4093 ) * fix: treat non-sk-ant- prefixed keys (Azure AI Foundry) as regular API keys, not OAuth tokens * fix: treat non-sk-ant- keys as regular API keys, not OAuth tokens _is_oauth_token() returned True for any key not starting with sk-ant-api, misclassifying Azure AI Foundry keys as OAuth tokens and sending Bearer auth instead of x-api-key → 401 rejection. Real Anthropic OAuth tokens all start with sk-ant-oat (confirmed from live .credentials.json). Non-sk-ant- keys are third-party provider keys that should use x-api-key. Test fixtures updated to use realistic sk-ant-oat01- prefixed tokens instead of fake strings. Salvaged from PR #4075 by @HangGlidersRule. --------- Co-authored-by: Clawdbot <clawdbot@openclaw.ai>	2026-03-30 17:41:13 -07:00
Teknium	720507efac	feat: add post-migration cleanup for OpenClaw directories (#4100 ) After migrating from OpenClaw, leftover workspace directories contain state files (todo.json, sessions, logs) that confuse the agent — it discovers them and reads/writes to stale locations instead of the Hermes state directory, causing issues like cron jobs reading a different todo list than interactive sessions. Changes: - hermes claw migrate now offers to archive the source directory after successful migration (rename to .pre-migration, not delete) - New `hermes claw cleanup` subcommand for users who already migrated and need to archive leftover OpenClaw directories - Migration notes updated with explicit cleanup guidance - 42 tests covering all new functionality Reported by SteveSkedasticity — multiple todo.json files across ~/.hermes/, ~/.openclaw/workspace/, and ~/.openclaw/workspace-assistant/ caused cron jobs to read from wrong locations.	2026-03-30 17:39:08 -07:00
Teknium	8a794d029d	fix(ci): add repo conditionals to prevent fork workflow failures (#4107 ) Add github.repository checks to docker-publish and deploy-site workflows so they skip on forks where upstream-specific resources (Docker Hub org, custom domain) are unavailable. Co-authored-by: StreamOfRon <StreamOfRon@users.noreply.github.com>	2026-03-30 17:38:32 -07:00
Teknium	e64b047663	chore: prepare Hermes for Homebrew packaging (#4099 ) Co-authored-by: Yabuku-xD <78594762+Yabuku-xD@users.noreply.github.com>	2026-03-30 17:34:43 -07:00
Robin Fernandes	1b7473e702	Fixes and refactors enabled by recent updates to main.	2026-03-31 09:29:59 +09:00
Robin Fernandes	1126284c97	Merge branch 'main' into rewbs/tool-use-charge-to-subscription	2026-03-31 09:29:43 +09:00
Teknium	11aa44d34d	docs(telegram): add webhook mode documentation (#4089 ) Documents the Telegram webhook mode from #3880: - New 'Webhook Mode' section in telegram.md with polling vs webhook comparison, config table, Fly.io deployment example, troubleshooting - Add TELEGRAM_WEBHOOK_URL/PORT/SECRET to environment-variables.md - Add Telegram section to .env.example (existing + webhook vars) Co-authored-by: raulbcs <raulbcs@users.noreply.github.com>	2026-03-30 17:21:59 -07:00
Teknium	07746dca0c	fix(matrix): E2EE decryption — request keys, auto-trust devices, retry buffered events (#4083 ) When the Matrix adapter receives encrypted events it can't decrypt (MegolmEvent), it now: 1. Requests the missing room key from other devices via client.request_room_key(event) instead of silently dropping the message 2. Buffers undecrypted events (bounded to 100, 5 min TTL) and retries decryption after each E2EE maintenance cycle when new keys arrive 3. Auto-trusts/verifies all devices after key queries so other clients share session keys with the bot proactively 4. Exports Megolm keys on disconnect and imports them on connect, so session keys survive gateway restarts This addresses the 'could not decrypt event' warnings that caused the bot to miss messages in encrypted rooms.	2026-03-30 17:16:09 -07:00
Teknium	7e0c2c3ce3	docs: comprehensive documentation audit — fix 9 HIGH, 20+ MEDIUM gaps (#4087 ) Reference docs fixes: - cli-commands.md: remove non-existent --provider alibaba, add hermes profile/completion/plugins/mcp to top-level table, add --profile/-p global flag, add --source chat option - slash-commands.md: add /yolo and /commands, fix /q alias conflict (resolves to /queue not /quit), add missing aliases (/bg, /set-home, /reload_mcp, /gateway) - toolsets-reference.md: fix hermes-api-server (not same as hermes-cli, omits clarify/send_message/text_to_speech) - profile-commands.md: fix show name required not optional, --clone-from not --from, add --remove/--name to alias, fix alias path, fix export/ import arg types, remove non-existent fish completion - tools-reference.md: add EXA_API_KEY to web tools requires_env - mcp-config-reference.md: add auth key for OAuth, tool name sanitization - environment-variables.md: add EXA_API_KEY, update provider values - plugins.md: remove non-existent ctx.register_command(), add ctx.inject_message() Feature docs additions: - security.md: add /yolo mode, approval modes (manual/smart/off), configurable timeout, expanded dangerous patterns table - cron.md: add wrap_response config, [SILENT] suppression - mcp.md: add dynamic tool discovery, MCP sampling support - cli.md: add Ctrl+Z suspend, busy_input_mode, tool_preview_length - docker.md: add skills/credential file mounting Messaging platform docs: - telegram.md: add webhook mode, DoH fallback IPs - slack.md: add multi-workspace OAuth support - discord.md: add DISCORD_IGNORE_NO_MENTION - matrix.md: add MSC3245 native voice messages - feishu.md: expand from 129 to 365 lines (encrypt key, verification token, group policy, card actions, media, rate limiting, markdown, troubleshooting) - wecom.md: expand from 86 to 264 lines (per-group allowlists, media, AES decryption, stream replies, reconnection, troubleshooting) Configuration docs: - quickstart.md: add DeepSeek, Copilot, Copilot ACP providers - configuration.md: add DeepSeek provider, Exa web backend, terminal env_passthrough/images, browser.command_timeout, compression params, discord config, security/tirith config, timezone, auxiliary models 21 files changed, ~1000 lines added	2026-03-30 17:15:21 -07:00
SHL0MS	3c8f910973	feat: respect NO_COLOR env var and TERM=dumb (#4079 ) Add should_use_color() function to hermes_cli/colors.py that checks NO_COLOR (https://no-color.org/) and TERM=dumb before emitting ANSI escapes. The existing color() helper now uses this function instead of a bare isatty() check. This is the foundation — cli.py and banner.py still have inline ANSI constants that bypass this module (tracked in #4071). Closes #4066 Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>	2026-03-30 17:07:21 -07:00
Teknium	13f3e67165	ux: show 'Initializing agent...' on first message (#4086 ) Display a brief status message before the heavy agent initialization (OpenAI client setup, tool loading, memory init, etc.) so users aren't staring at a blank screen for several seconds. Only prints when self.agent is None (first use or after model switch). Closes #4060 Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>	2026-03-30 17:05:40 -07:00
Teknium	4a7c17fca5	fix(gateway): read custom_providers context_length in hygiene compression (#4085 ) Gateway hygiene pre-compression only checked model.context_length from the top-level config, missing per-model context_length defined in custom_providers entries. This caused premature compression for custom provider users (e.g. 128K default instead of 200K configured). The AIAgent's own compressor already reads custom_providers correctly (run_agent.py lines 1171-1189). This adds the same fallback to the gateway hygiene path, running after runtime provider resolution so the base_url is available for matching.	2026-03-30 17:04:31 -07:00
Robin Fernandes	6e4598ce1e	Merge branch 'main' into rewbs/tool-use-charge-to-subscription	2026-03-31 08:48:54 +09:00
Teknium	f007284d05	fix: rate-limit pairing rejection messages to prevent spam (#4081 ) * fix: rate-limit pairing rejection messages to prevent spam When generate_code() returns None (rate limited or max pending), the "Too many pairing requests" message was sent on every subsequent DM with no cooldown. A user sending 30 messages would get 30 rejection replies — reported as potential hack on WhatsApp. Now check _is_rate_limited() before any pairing response, and record rate limit after sending a rejection. Subsequent messages from the same user are silently ignored until the rate limit window expires. * test: add coverage for pairing response rate limiting Follow-up to cherry-picked PR #4042 — adds tests verifying: - Rate-limited users get silently ignored (no response sent) - Rejection messages record rate limit for subsequent suppression --------- Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>	2026-03-30 16:48:00 -07:00
Teknium	3d47af01c3	fix(honcho): write config to instance-local path for profile isolation (#4037 ) Multiple agents/profiles running 'hermes honcho setup' all wrote to the shared global ~/.honcho/config.json, overwriting each other's configuration. Root cause: _write_config() defaulted to resolve_config_path() which returns the global path when no instance-local file exists yet (i.e. on first setup). Fix: _write_config() now defaults to _local_config_path() which always returns $HERMES_HOME/honcho.json. Each profile gets its own config file. Reading still falls back to global for cross-app interop and seeding. Also updates cmd_setup and cmd_status messaging to show the actual write path. Includes 10 new tests verifying profile isolation, global fallback reads, and multi-profile independence.	2026-03-30 16:41:19 -07:00
SHL0MS	275fcc6673	Merge pull request #4054 from NousResearch/ascii-video/text-readability-and-layout-oracle ascii-video skill: text readability techniques and external layout oracle	2026-03-30 15:52:14 -07:00
SHL0MS	ab62614a89	ascii-video: add text readability techniques and external layout oracle pattern - composition.md: add text backdrop (gaussian dark mask behind glyphs) and external layout oracle pattern (browser-based text layout → JSON → Python renderer pipeline for obstacle-aware text reflow) - shaders.md: add reverse vignette shader (center-darkening for text readability) - troubleshooting.md: add diagnostic entries for text-over-busy-background readability and kaleidoscope-destroys-text pitfall	2026-03-30 18:48:22 -04:00
Bryan Cross	0287597d02	Optimize Playwright install	2026-03-30 17:38:07 -05:00
Teknium	de368cac54	fix(tools): show browser and TTS in reconfigure menu (#4041 ) * fix(gateway): honor default for invalid bool-like config values * refactor: simplify web backend priority detection Replace cascading boolean conditions with a priority-ordered loop. Same behavior (verified against all 16 env var combinations), half the lines, trivially extensible for new backends. * fix(tools): show browser and TTS in reconfigure menu _toolset_has_keys() returned False for toolsets with no-key providers (Local Browser, Edge TTS) because it only checked providers with env_vars. Users couldn't find these tools in the reconfigure list and had no obvious way to switch browser/TTS backends. Now treats providers with empty env_vars as always-configured, so toolsets with free/local options always appear in the reconfigure menu. --------- Co-authored-by: aydnOktay <xaydinoktay@gmail.com>	2026-03-30 14:11:39 -07:00
Bryan Cross	3a1e489dd6	Add build-essential to Dockerfile dependencies	2026-03-30 15:57:22 -05:00
Teknium	0d1003559d	refactor: simplify web backend priority detection (#4036 ) * fix(gateway): honor default for invalid bool-like config values * refactor: simplify web backend priority detection Replace cascading boolean conditions with a priority-ordered loop. Same behavior (verified against all 16 env var combinations), half the lines, trivially extensible for new backends. --------- Co-authored-by: aydnOktay <xaydinoktay@gmail.com>	2026-03-30 13:37:25 -07:00
Bryan Cross	4f4d7c4eeb	Merge branch 'NousResearch:main' into docker-optimization	2026-03-30 15:29:27 -05:00
Bryan Cross	5de312c9e3	Simplify dockerignore	2026-03-30 15:29:06 -05:00
Bryan Cross	48942c89b5	Further npm optimizations	2026-03-30 15:27:11 -05:00
Teknium	eba8d52d54	fix: show correct shell config path for macOS/zsh in install script (#4025 ) - print_success() hardcoded 'source ~/.bashrc' regardless of user's shell - On macOS (default zsh), ~/.bashrc doesn't exist, leaving users unable to find the hermes command after install - Now detects $SHELL and shows the correct file (zshrc/bashrc) - Also captures .[all] install failure output instead of silencing with 2>/dev/null, so users can diagnose why full extras failed	2026-03-30 13:25:11 -07:00
Teknium	72104eb06f	fix(gateway): honor default for invalid bool-like config values (#4029 ) Co-authored-by: aydnOktay <xaydinoktay@gmail.com>	2026-03-30 13:24:48 -07:00
Bryan Cross	fdef0456a7	Merge branch 'NousResearch:main' into docker-optimization	2026-03-30 15:21:45 -05:00
Teknium	4b35836ba4	fix(auth): use bearer auth for MiniMax Anthropic endpoints (#4028 ) MiniMax's /anthropic endpoints implement Anthropic's Messages API but require Authorization: Bearer instead of x-api-key. Without this fix, MiniMax users get 401 errors in gateway sessions. Adds _requires_bearer_auth() to detect MiniMax endpoints and route through auth_token in the Anthropic SDK. Check runs before OAuth token detection so MiniMax keys aren't misclassified as setup tokens. Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-30 13:21:39 -07:00
Teknium	bd376fe976	fix(docs): improve mobile sidebar navigation The sidebar had all categories expanded by default (collapsed: false), which on mobile created a 60+ item flat list when opening the sidebar. Reported by danny on Discord. Changes: - Set all top-level categories to collapsed: true (tap to expand) - Enable autoCollapseCategories: true (accordion — opening one section closes others, prevents the overwhelming flat list) - Enable hideable sidebar (swipe-to-dismiss on mobile) - Add mobile CSS: larger touch targets (0.75rem padding), bolder category headers, visible subcategory indentation with left border, wider sidebar (85vw / 360px max), darker backdrop overlay	2026-03-30 13:20:55 -07:00
Teknium	f93637b3a1	feat: add /profile slash command to show active profile (#4027 ) Adds /profile to COMMAND_REGISTRY (Info category) with handlers in both CLI and gateway. Shows the active profile name and home directory. Works on all platforms — CLI, Telegram, Discord, Slack, etc. Detects profile by checking if HERMES_HOME is under ~/.hermes/profiles/. Shows 'default' when running without a profile.	2026-03-30 13:20:06 -07:00
Bryan Cross	8210e7aba6	Optimize Dockerfile: combine RUN commands, clear caches, add .dockerignore - Combine apt-get update and install into single RUN with cache clearing - Remove APT lists after installation - Add --no-cache-dir to pip install - Add --prefer-offline --no-audit to npm install - Create .dockerignore to exclude unnecessary files from build context - Update docker-publish.yml workflow to tag images with release names - Ensure buildx caching is used (type=gha)	2026-03-30 15:19:52 -05:00
Teknium	7b4fe0528f	fix(auth): use bearer auth for MiniMax Anthropic endpoints (#4028 ) MiniMax's /anthropic endpoints implement Anthropic's Messages API but require Authorization: Bearer instead of x-api-key. Without this fix, MiniMax users get 401 errors in gateway sessions. Adds _requires_bearer_auth() to detect MiniMax endpoints and route through auth_token in the Anthropic SDK. Check runs before OAuth token detection so MiniMax keys aren't misclassified as setup tokens. Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-30 13:19:44 -07:00
Teknium	950f69475f	feat(browser): add Camofox local anti-detection browser backend (#4008 ) Camofox-browser is a self-hosted Node.js server wrapping Camoufox (Firefox fork with C++ fingerprint spoofing). When CAMOFOX_URL is set, all 11 browser tools route through the Camofox REST API instead of the agent-browser CLI. Maps 1:1 to the existing browser tool interface: - Navigate, snapshot, click, type, scroll, back, press, close - Get images, vision (screenshot + LLM analysis) - Console (returns empty with note — camofox limitation) Setup: npm start in camofox-browser dir, or docker run -p 9377:9377 Then: CAMOFOX_URL=http://localhost:9377 in ~/.hermes/.env Advantages over Browserbase (cloud): - Free (no per-session API costs) - Local (zero network latency for browser ops) - Anti-detection at C++ level (bypasses Cloudflare/Google bot detection) - Works offline, Docker-ready Files: - tools/browser_camofox.py: Full REST backend (~400 lines) - tools/browser_tool.py: Routing at each tool function - hermes_cli/config.py: CAMOFOX_URL env var entry - tests/tools/test_browser_camofox.py: 20 tests	2026-03-30 13:18:42 -07:00
Teknium	7dac75f2ae	fix: prevent context pressure warning spam after compression (#4012 ) * feat: add /yolo slash command to toggle dangerous command approvals Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping all dangerous command approval prompts for the current session. Works in both CLI and gateway (Telegram, Discord, etc.). - /yolo -> ON: all commands auto-approved, no confirmation prompts - /yolo -> OFF: normal approval flow restored The --yolo CLI flag already existed for launch-time opt-in. This adds the ability to toggle mid-session without restarting. Session-scoped — resets when the process ends. Uses the existing HERMES_YOLO_MODE env var that check_all_command_guards() already respects. * fix: prevent context pressure warning spam (agent loop + gateway rate-limit) Two complementary fixes for repeated context pressure warnings spamming gateway users (Telegram, Discord, etc.): 1. Agent-level loop fix (run_agent.py): After compression, only reset _context_pressure_warned if the post-compression estimate is actually below the 85% warning level. Previously the flag was unconditionally reset, causing the warning to re-fire every loop iteration when compression couldn't reduce below 85% of the threshold (e.g. very low threshold like 15%, or system prompt alone exceeds the warning level). 2. Gateway-level rate-limit (gateway/run.py, salvaged from PR #3786): Per-chat_id cooldown of 1 hour on compression warning messages. Both warning paths ('still large after compression' and 'compression failed') are gated. Defense-in-depth — even if the agent-level fix has edge cases, users won't see more than one warning per hour. Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com> --------- Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>	2026-03-30 13:18:21 -07:00
Teknium	ed9af6e589	fix: create AsyncOpenAI lazily in trajectory_compressor to avoid closed event loop (#4013 ) The AsyncOpenAI client was created once at __init__ and stored as an instance attribute. process_directory() calls asyncio.run() which creates and closes a fresh event loop. On a second call, the client's httpx transport is still bound to the closed loop, raising RuntimeError: "Event loop is closed" — the same pattern fixed by PR #3398 for the main agent loop. Create the client lazily in _get_async_client() so each asyncio.run() gets a client bound to the current loop. Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>	2026-03-30 13:16:16 -07:00
Teknium	158f49f19a	fix: enforce priority order in Telegram menu — core > plugins > skills (#4023 ) The menu now has explicit priority tiers: 1. Core CommandDef commands (always included, never bumped) 2. Plugin slash commands (take precedence over skills) 3. Built-in skill commands (fill remaining slots alphabetically) Only skills get trimmed when the 100-command cap is hit. Adding new core commands or plugin commands automatically pushes skills out, not the other way around.	2026-03-30 13:04:06 -07:00
Teknium	86250a3e45	docs: expand terminal backends section + fix docs build (#4016 ) * feat(telegram): add webhook mode as alternative to polling When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook server (via python-telegram-bot's start_webhook()) instead of long polling. This enables cloud platforms like Fly.io and Railway to auto-wake suspended machines on inbound HTTP traffic. Polling remains the default — no behavior change unless the env var is set. Env vars: TELEGRAM_WEBHOOK_URL Public HTTPS URL for Telegram to push to TELEGRAM_WEBHOOK_PORT Local listen port (default 8443) TELEGRAM_WEBHOOK_SECRET Secret token for update verification Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all current main enhancements (network error recovery, polling conflict detection, DM topics setup). Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com> * fix: send_document call in background task delivery + vision download timeout Two fixes salvaged from PR #2269 by amethystani: 1. gateway/run.py: adapter.send_file() → adapter.send_document() send_file() doesn't exist on BasePlatformAdapter. Background task media files were silently never delivered (AttributeError swallowed by except Exception: pass). 2. tools/vision_tools.py: configurable image download timeout via HERMES_VISION_DOWNLOAD_TIMEOUT env var (default 30s), plus guard against raise None when max_retries=0. The third fix in #2269 (opencode-go auth config) was already resolved on main. Co-authored-by: amethystani <amethystani@users.noreply.github.com> * docs: expand terminal backends section + fix feishu MDX build error --------- Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com> Co-authored-by: amethystani <amethystani@users.noreply.github.com>	2026-03-30 12:59:58 -07:00
Teknium	ea342f2382	Fix banner alignment in installer script (#4011 ) Co-authored-by: Ahmed Khaled <wakeupwithme000@gmail.com>	2026-03-30 11:24:10 -07:00
Teknium	60ecde8ac7	fix: fit all 100 commands in Telegram menu with 40-char descriptions (#4010 ) * fix: truncate skill descriptions to 100 chars in Telegram menu * fix: 40-char desc cap + 100 command limit for Telegram menu setMyCommands has an undocumented total payload size limit. 50 commands with 256-char descriptions failed, 50 with 100-char worked, and 100 with 40-char descriptions also works (~5300 total chars). Truncate skill descriptions to 40 chars in the menu picker and set cap back to 100. Full descriptions available via /commands.	2026-03-30 11:21:13 -07:00
Teknium	f3069c649c	fix(cli): add missing subprocess.run() timeouts in doctor and status (#4009 ) Add timeout parameters to 4 subprocess.run() calls that could hang indefinitely if the child process blocks (e.g., unresponsive docker daemon, systemctl waiting for D-Bus): - doctor.py: docker info (timeout=10), ssh check (timeout=15) - status.py: systemctl is-active (timeout=5), launchctl list (timeout=5) Each call site now catches subprocess.TimeoutExpired and treats it as a failure, consistent with how non-zero return codes are already handled. Add AST-based regression test that verifies every subprocess.run() call in CLI modules specifies a timeout keyword argument. Co-authored-by: dieutx <dangtc94@gmail.com>	2026-03-30 11:17:15 -07:00
Teknium	0976bf6cd0	feat: add /yolo slash command to toggle dangerous command approvals (#3990 ) Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping all dangerous command approval prompts for the current session. Works in both CLI and gateway (Telegram, Discord, etc.). - /yolo -> ON: all commands auto-approved, no confirmation prompts - /yolo -> OFF: normal approval flow restored The --yolo CLI flag already existed for launch-time opt-in. This adds the ability to toggle mid-session without restarting. Session-scoped — resets when the process ends. Uses the existing HERMES_YOLO_MODE env var that check_all_command_guards() already respects.	2026-03-30 11:17:09 -07:00
Teknium	da3e22bcfa	fix: cap Telegram menu at 50 commands — API rejects above ~60 (#4006 ) * fix: use SKILLS_DIR not repo path for Telegram menu skill filter Skills are synced to ~/.hermes/skills/ (SKILLS_DIR), not the repo's skills/ directory. The previous filter compared against the repo path so no skills matched. Now checks SKILLS_DIR and excludes .hub/ subdirectory (user-installed hub skills). * fix: cap Telegram menu at 50 commands — API rejects above ~60 Telegram's setMyCommands returns BOT_COMMANDS_TOO_MUCH when registering close to 100 commands despite docs claiming 100 is the limit. Metadata overhead causes rejection above ~60. Cap at 50 for reliability — remaining commands accessible via /commands.	2026-03-30 11:05:20 -07:00
Teknium	9fd78c7a8e	fix: use SKILLS_DIR not repo path for Telegram menu skill filter (#4005 ) Skills are synced to ~/.hermes/skills/ (SKILLS_DIR), not the repo's skills/ directory. The previous filter compared against the repo path so no skills matched. Now checks SKILLS_DIR and excludes .hub/ subdirectory (user-installed hub skills).	2026-03-30 11:01:13 -07:00
Teknium	5ceed021dc	feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap (#3934 ) * feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap Map active skills to Telegram's slash command menu so users can discover and invoke skills directly. Three changes: 1. Telegram menu now includes active skill commands alongside built-in commands, capped at 100 entries (Telegram Bot API limit). Overflow commands remain callable but hidden from the picker. Logged at startup when cap is hit. 2. New /commands [page] gateway command for paginated browsing of all commands + skills. /help now shows first 10 skill commands and points to /commands for the full list. 3. When a user types a slash command that matches a disabled or uninstalled skill, they get actionable guidance: - Disabled: 'Enable it with: hermes skills config' - Optional (not installed): 'Install with: hermes skills install official/<path>' Built on ideas from PR #3921 by @kshitijk4poor. * chore: move 21 niche skills to optional-skills Move specialized/niche skills from built-in (skills/) to optional (optional-skills/) to reduce the default skill count. Users can install them with: hermes skills install official/<category>/<name> Moved skills (21): - mlops: accelerate, chroma, faiss, flash-attention, hermes-atropos-environments, huggingface-tokenizers, instructor, lambda-labs, llava, nemo-curator, pinecone, pytorch-lightning, qdrant, saelens, simpo, slime, tensorrt-llm, torchtitan - research: domain-intel, duckduckgo-search - devops: inference-sh cli Built-in skills: 96 → 75 Optional skills: 22 → 43 * fix: only include repo built-in skills in Telegram menu, not user-installed User-installed skills (from hub or manually added) stay accessible via /skills and by typing the command directly, but don't get registered in the Telegram slash command picker. Only skills whose SKILL.md is under the repo's skills/ directory are included in the menu. This keeps the Telegram menu focused on the curated built-in set while user-installed skills remain discoverable through /skills and /commands.	2026-03-30 10:57:30 -07:00
Teknium	97d6813f51	fix(cache): use deterministic call_id fallbacks instead of random UUIDs (#3991 ) When the API doesn't provide a call_id for tool calls, the fallback generated a random uuid4 hex. This made every API call's input unique when replayed, preventing OpenAI's prompt cache from matching the prefix across turns. Replaced all four uuid4 fallback sites with a deterministic hash of (function_name, arguments, position_index). The same tool call now always produces the same fallback call_id, preserving cache-friendly input stability. Affected code paths: - _chat_messages_to_responses_input() — Codex input reconstruction - _normalize_codex_response() — function_call and custom_tool_call - _build_assistant_message() — assistant message construction	2026-03-30 09:43:56 -07:00
Teknium	37825189dd	fix(skills): validate hub bundle paths before install (#3986 ) Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>	2026-03-30 08:37:19 -07:00
Teknium	e08778fa1e	chore: release v0.6.0 (2026.3.30) (#3985 )	2026-03-30 08:29:38 -07:00
Robin Fernandes	1cbb1b99cc	Gate tool-gateway behind an env var, so it's not in users' faces until we're ready. Even if users enable it, it'll be blocked server-side for now, until we unlock for non-admin users on tool-gateway.	2026-03-30 13:28:10 +09:00
Robin Fernandes	e95965d76a	Merge branch 'main' into rewbs/tool-use-charge-to-subscription	2026-03-26 16:18:28 -07:00
Robin Fernandes	95dc9aaa75	feat: add managed tool gateway and Nous subscription support - add managed modal and gateway-backed tool integrations\n- improve CLI setup, auth, and configuration for subscriber flows\n- expand tests and docs for managed tool support	2026-03-26 16:17:58 -07:00

[EPIC-999/Phase I] The Mirror — formal spec extraction artifacts #107

278 Commits