hermes-agent

Author	SHA1	Message	Date
teknium1	8e07f9ca56	fix: audit fixes — 5 bugs found and resolved Thorough code review found 5 issues across run_agent.py, cli.py, and gateway/: 1. CRITICAL — Gateway stream consumer task never started: stream_consumer_holder was checked BEFORE run_sync populated it. Fixed with async polling pattern (same as track_agent). 2. MEDIUM-HIGH — Streaming fallback after partial delivery caused double-response: if streaming failed after some tokens were delivered, the fallback would re-deliver the full response. Now tracks deltas_were_sent and only falls back when no tokens reached consumers yet. 3. MEDIUM — Codex mode lost on_first_delta spinner callback: _run_codex_stream now accepts on_first_delta parameter, fires it on first text delta. Passed through from _interruptible_streaming_api_call via _codex_on_first_delta instance attribute. 4. MEDIUM — CLI close-tag after-text bypassed tag filtering: text after a reasoning close tag was sent directly to _emit_stream_text, skipping open-tag detection. Now routes through _stream_delta for full filtering. 5. LOW — Removed 140 lines of dead code: old _streaming_api_call method (superseded by _interruptible_streaming_api_call). Updated 13 tests in test_run_agent.py and test_openai_client_lifecycle.py to use the new method name and signature. 4573 tests passing.	2026-03-16 06:35:46 -07:00
Teknium	dd7921d514	fix(honcho): isolate session routing for multi-user gateway (#1500) Salvaged from PR #1470 by adavyas. Core fix: Honcho tool calls in a multi-session gateway could route to the wrong session because honcho_tools.py relied on process-global state. Now threads session context through the call chain: AIAgent._invoke_tool() → handle_function_call() → registry.dispatch() → handler **kw → _resolve_session_context() Changes: - Add _resolve_session_context() to prefer per-call context over globals - Plumb honcho_manager + honcho_session_key through handle_function_call - Add sync_honcho=False to run_conversation() for synthetic flush turns - Pass honcho_session_key through gateway memory flush lifecycle - Harden gateway PID detection when /proc cmdline is unreadable - Make interrupt test scripts import-safe for pytest-xdist - Wrap BibTeX examples in Jekyll raw blocks for docs build - Fix thread-order-dependent assertion in client lifecycle test - Expand Honcho docs: session isolation, lifecycle, routing internals Dropped from original PR: - Indentation change in _create_request_openai_client that would move client creation inside the lock (causes unnecessary contention) Co-authored-by: adavyas <adavyas@users.noreply.github.com>	2026-03-16 00:23:47 -07:00
Teknium	3f0f4a04a9	fix(agent): skip reasoning extra_body for unsupported OpenRouter models (#1485) * fix(agent): skip reasoning extra_body for models that don't support it Sending reasoning config to models like MiniMax or Nvidia via OpenRouter causes a 400 BadRequestError. Previously, reasoning extra_body was sent to all OpenRouter and Nous models unconditionally. Fix: only send reasoning extra_body when the model slug starts with a known reasoning-capable prefix (deepseek/, anthropic/, openai/, x-ai/, google/gemini-2, qwen/qwen3) or when using Nous Portal directly. Applies to both the main API call path (_build_api_kwargs) and the conversation summary path. Fixes #1083 * test(agent): cover reasoning extra_body gating --------- Co-authored-by: ygd58 <buraysandro9@gmail.com>	2026-03-15 20:42:07 -07:00
teknium1	62abb453d3	Merge origin/main into hermes/hermes-daa73839	2026-03-14 23:44:47 -07:00
teknium1	735a6e7651	fix: convert anthropic image content blocks	2026-03-14 23:41:20 -07:00
0xbyt4	6f85283553	fix: use json.dumps instead of str() for Codex Responses API arguments When the Responses API returns tool call arguments as a dict, str(dict) produces Python repr with single quotes (e.g. {'key': 'val'}) which is invalid JSON. Downstream json.loads() fails silently and the tool gets called with empty arguments, losing all parameters. Affects both function_call and custom_tool_call item types in _normalize_codex_response().	2026-03-14 22:03:53 -07:00
teknium1	70ea13eb40	fix: preflight Anthropic auth and prefer Claude store	2026-03-14 19:38:55 -07:00
teknium1	e052c74727	fix: refresh Anthropic OAuth before stale env tokens	2026-03-14 19:22:31 -07:00
teknium1	7b10881b9e	fix: persist clean voice transcripts and /voice off state - keep CLI voice prefixes API-local while storing the original user text - persist explicit gateway off state and restore adapter auto-TTS suppression on restart - add regression coverage for both behaviors	2026-03-14 06:14:22 -07:00
0xbyt4	eb34c0b09a	fix: voice pipeline hardening — 7 bug fixes with tests 1. Anthropic + ElevenLabs TTS silence: forward full response to TTS callback for non-streaming providers (choices first, then native content blocks fallback). 2. Subprocess timeout kill: play_audio_file now kills the process on TimeoutExpired instead of leaving zombie processes. 3. Discord disconnect cleanup: leave all voice channels before closing the client to prevent leaked state. 4. Audio stream leak: close InputStream if stream.start() fails. 5. Race condition: read/write _on_silence_stop under lock in audio callback thread. 6. _vprint force=True: show API error, retry, and truncation messages even during streaming TTS. 7. _refresh_level lock: read _voice_recording under _voice_lock.	2026-03-14 14:27:21 +03:00
0xbyt4	c797314fcf	test: add security and hardening tests for voice mode fixes - Path traversal sanitization (Path.name strips ../) - Media endpoint authentication (401 without token, 404 on traversal) - hmac.compare_digest usage verification (no == for tokens) - DOMPurify XSS prevention in HTML template - Default bind 127.0.0.1 (adapter and config) - /remote-control token hiding in group chats - Opus find_library instead of hardcoded paths - Opus decode error logging (no silent swallow) - Interrupt _vprint force=True on all 6 calls - Anthropic interrupt handler in both API call paths - Update test_web_defaults for new 127.0.0.1 default	2026-03-14 14:27:21 +03:00
0xbyt4	ecc3dd7c63	test: add comprehensive voice mode test coverage (86 tests) - Add TestStreamingApiCall (11 tests) for _streaming_api_call in test_run_agent.py - Add regression tests for all 7 bug fixes (edge_tts lazy import, output_stream cleanup, ctrl+c continuous reset, disable stops TTS, config key, chat cleanup, browser_tool signal handler removal) - Add real behavior tests for CLI voice methods via _make_voice_cli() fixture: TestHandleVoiceCommandReal (7), TestEnableVoiceModeReal (7), TestDisableVoiceModeReal (6), TestVoiceSpeakResponseReal (7), TestVoiceStopAndTranscribeReal (12)	2026-03-14 14:27:20 +03:00
teknium1	cbbba87099	fix: reuse shared atomic session log helper	2026-03-14 02:56:13 -07:00
Teknium	1117a21065	Merge pull request #1271 from NousResearch/hermes/hermes-de3d4e49 fix: guard init-time stdio writes	2026-03-14 02:21:39 -07:00
teknium1	936040d8f7	fix: guard init-time stdio writes	2026-03-14 02:19:46 -07:00
teknium1	806b79b589	test: cover errors.log handler reuse	2026-03-13 23:56:51 -07:00
Teknium	938e887b4c	fix: keep honcho recall out of cached system prefix (#1201) Attach later-turn Honcho recall to the current-turn user message at API call time instead of appending it to the system prompt. This preserves the stable system-prefix cache while keeping Honcho continuity context available for the turn. Also adds regression coverage for the injection helper and for continuing sessions so Honcho recall stays out of the system prompt.	2026-03-13 21:07:00 -07:00
Teknium	0157253145	Merge pull request #1152 from NousResearch/hermes/hermes-f47f71c0 feat: concurrent tool execution with ThreadPoolExecutor	2026-03-13 03:20:38 -07:00
kshitijk4poor	ccfbf42844	feat: secure skill env setup on load (core #688) When a skill declares required_environment_variables in its YAML frontmatter, missing env vars trigger a secure TUI prompt (identical to the sudo password widget) when the skill is loaded. Secrets flow directly to ~/.hermes/.env, never entering LLM context. Key changes: - New required_environment_variables frontmatter field for skills - Secure TUI widget (masked input, 120s timeout) - Gateway safety: messaging platforms show local setup guidance - Legacy prerequisites.env_vars normalized into new format - Remote backend handling: conservative setup_needed=True - Env var name validation, file permissions hardened to 0o600 - Redact patterns extended for secret-related JSON fields - 12 existing skills updated with prerequisites declarations - ~48 new tests covering skip, timeout, gateway, remote backends - Dynamic panel widget sizing (fixes hardcoded width from original PR) Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main with conflict resolution. Fixes #688 Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-13 03:14:04 -07:00
teknium1	5d0d5b191c	feat: concurrent tool execution with ThreadPoolExecutor When the model returns multiple tool calls in a single response, they are now executed concurrently using a thread pool instead of sequentially. This significantly reduces wall-clock time when multiple independent tools are batched (e.g. parallel web_search, read_file, terminal calls). Architecture: - _execute_tool_calls() dispatches to sequential or concurrent path - Single tool calls and batches containing 'clarify' use sequential path - Multiple non-interactive tools use ThreadPoolExecutor (max 8 workers) - Results are collected and appended to messages in original order - _invoke_tool() extracted as shared tool invocation helper Safety: - Pre-flight interrupt check skips all tools if interrupted - Per-tool exception handling: one failure doesn't crash the batch - Result truncation (100k char limit) applied per tool - Budget pressure injection after all tools complete - Checkpoints taken before file-mutating tools - CLI spinner shows batch progress, then per-tool completion messages Tests: 10 new tests covering dispatch logic, ordering, error handling, interrupt behavior, truncation, and _invoke_tool routing.	2026-03-13 02:51:51 -07:00
Teknium	a282322845	Merge pull request #1121 from 0xbyt4/fix/anthropic-adapter-issues fix: anthropic adapter — max_tokens, fallback crash, proxy base_url	2026-03-12 19:07:06 -07:00
Teknium	475dd58a8e	Merge PR #736: feat(honcho): async writes, memory modes, session title integration, setup CLI Authored by erosika. Builds on #38 and #243. Adds async write support, configurable memory modes, context prefetch pipeline, 4 new Honcho tools (honcho_context, honcho_profile, honcho_search, honcho_conclude), full 'hermes honcho' CLI, session strategies, AI peer identity, recallMode A/B, gateway lifecycle management, and comprehensive docs. Cherry-picks fixes from PRs #831/#832 (adavyas). Co-authored-by: erosika <erosika@users.noreply.github.com> Co-authored-by: adavyas <adavyas@users.noreply.github.com>	2026-03-12 19:05:11 -07:00
0xbyt4	22479b053c	fix: anthropic adapter — max_tokens ignored, fallback crash, proxy base_url filtered - Pass self.max_tokens to build_anthropic_kwargs instead of hardcoded None - Add anthropic case to _try_activate_fallback (was only handling openai-codex) - Remove 'anthropic in base_url' filter that blocked custom proxy URLs	2026-03-13 04:22:16 +03:00
teknium1	5e12442b4b	feat: native Anthropic provider with Claude Code credential auto-discovery Add Anthropic as a first-class inference provider, bypassing OpenRouter for direct API access. Uses the native Anthropic SDK with a full format adapter (same pattern as the codex_responses api_mode). ## Auth (three methods, priority order) 1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-) 2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-) 3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription) - Reads Claude Code's OAuth credentials - Checks token expiry with 60s buffer - Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header - Regular API keys use standard x-api-key header ## Changes by file ### New files - agent/anthropic_adapter.py — Client builder, message/tool/response format conversion, Claude Code credential reader, token resolver. Handles system prompt extraction, tool_use/tool_result blocks, thinking/reasoning, orphaned tool_use cleanup, cache_control. - tests/test_anthropic_adapter.py — 36 tests covering all adapter logic ### Modified files - pyproject.toml — Add anthropic>=0.39.0 dependency - hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with three env vars, plus 'claude'/'claude-code' aliases - hermes_cli/models.py — Add model catalog, labels, aliases, provider order - hermes_cli/main.py — Add 'anthropic' to --provider CLI choices - hermes_cli/runtime_provider.py — Add Anthropic branch returning api_mode='anthropic_messages' (before generic api_key fallthrough) - hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code credential auto-discovery, model selection, OpenRouter tools prompt - agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model - agent/model_metadata.py — Add bare Claude model context lengths - run_agent.py — Add anthropic_messages api_mode: * Client init (Anthropic SDK instead of OpenAI) * API call dispatch (_anthropic_client.messages.create) * Response validation (content blocks) * finish_reason mapping (stop_reason -> finish_reason) * Token usage (input_tokens/output_tokens) * Response normalization (normalize_anthropic_response) * Client interrupt/rebuild * Prompt caching auto-enabled for native Anthropic - tests/test_run_agent.py — Update test_anthropic_base_url_accepted to expect native routing, add test_prompt_caching_native_anthropic	2026-03-12 15:47:45 -07:00
Erosika	fefc709b2c	merge: resolve conflict with main in subagent interrupt test	2026-03-12 16:28:57 -04:00
Erosika	ae2a5e5743	refactor(honcho): remove local memory mode The "local" memoryMode was redundant with enabled: false. Simplifies the mode system to hybrid and honcho only.	2026-03-12 16:23:34 -04:00
teknium1	29ef69c703	fix: update all test mocks for call_llm migration Update 14 test files to use the new call_llm/async_call_llm mock patterns instead of the old get_text_auxiliary_client/ get_vision_auxiliary_client tuple returns. - vision_tools tests: mock async_call_llm instead of _aux_async_client - browser tests: mock call_llm instead of _aux_vision_client - flush_memories tests: mock call_llm instead of get_text_auxiliary_client - session_search tests: mock async_call_llm with RuntimeError - mcp_tool tests: fix whitelist model config, use side_effect for multi-response tests - auxiliary_config_bridge: update for model=None (resolved in router) 3251 passed, 2 pre-existing unrelated failures.	2026-03-11 21:06:54 -07:00
Erosika	2d35016b94	fix(honcho): harden tool gating and migration peer routing Prevent stale Honcho tool exposure in context/local modes, restore reliable async write retry behavior, and ensure SOUL.md migration uploads target the AI peer instead of the user peer. Also align Honcho CLI key checks with host-scoped apiKey resolution and lock the fixes with regression tests. Made-with: Cursor	2026-03-11 18:21:27 -04:00
Erosika	a0b0dbe6b2	Merge remote-tracking branch 'origin/main' into feat/honcho-async-memory Made-with: Cursor # Conflicts: # cli.py # tests/test_run_agent.py	2026-03-11 12:22:56 -04:00
teknium1	a8409a161f	fix: guard all print() calls against OSError with _SafeWriter When hermes-agent runs as a systemd service, Docker container, or headless daemon, the stdout pipe can become unavailable (idle timeout, buffer exhaustion, socket reset). Any print() call then raises OSError: [Errno 5] Input/output error, crashing run_conversation() and causing cron jobs to fail. Rather than wrapping individual print() calls (68 in run_conversation alone), this adds a transparent _SafeWriter wrapper installed once at the start of run_conversation(). It delegates all writes to the real stdout and silently catches OSError. Zero overhead on the happy path, comprehensive coverage of all print calls including future ones. Fixes #845 Co-authored-by: J0hnLawMississippi <J0hnLawMississippi@users.noreply.github.com>	2026-03-11 09:19:10 -07:00
Erosika	047b118299	fix(honcho): resolve review blockers for merge Address merge-blocking review feedback by removing unsafe signal handler overrides, wiring next-turn Honcho prefetch, restoring per-directory session defaults, and exposing all Honcho tools to the model surface. Also harden prefetch cache access with public thread-safe accessors and remove duplicate browser cleanup code. Made-with: Cursor	2026-03-11 11:46:37 -04:00
teknium1	21ff0d39ad	feat: iteration budget pressure via tool result injection Two-tier warning system that nudges the LLM as it approaches max_iterations, injected into the last tool result JSON rather than as a separate system message: - Caution (70%): {"_budget_warning": "[BUDGET: 42/60...]"} - Warning (90%): {"_budget_warning": "[BUDGET WARNING: 54/60...]"} For JSON tool results, adds a _budget_warning field to the existing dict. For plain text results, appends the warning as text. Key properties: - No system messages injected mid-conversation - No changes to message structure - Prompt cache stays valid - Configurable thresholds (0.7 / 0.9) - Can be disabled: _budget_pressure_enabled = False Inspired by PR #421 (@Bartok9) and issue #414. 8 tests covering thresholds, edge cases, JSON and text injection.	2026-03-11 00:37:24 -07:00
adavyas	87cc5287a8	fix(honcho): enforce local mode and cache-safe warmup	2026-03-10 16:21:42 -04:00
teknium1	771969f747	fix: wire up enabled_tools in agent loop + simplify sandbox tool selection Completes the fix started in `8318a51` — handle_function_call() accepted enabled_tools but run_agent.py never passed it. Now both call sites in _execute_tool_calls() pass self.valid_tool_names, so each agent session uses its own tool list instead of the process-global _last_resolved_tool_names (which subagents can overwrite). Also simplifies the redundant ternary in code_execution_tool.py: sandbox_tools is already computed correctly (intersection with session tools, or full SANDBOX_ALLOWED_TOOLS as fallback), so the conditional was dead logic. Inspired by PR #663 (JasonOA888). Closes #662. Tests: 2857 passed.	2026-03-10 06:35:28 -07:00
vincent	b0a5fe8974	fix: continue after output-length truncation	2026-03-10 04:30:19 -07:00
teknium1	aedb773f0d	fix: stabilize system prompt across gateway turns for cache hits Two changes to prevent unnecessary Anthropic prompt cache misses in the gateway, where a fresh AIAgent is created per user message: 1. Reuse stored system prompt for continuing sessions: When conversation_history is non-empty, load the system prompt from the session DB instead of rebuilding from disk. The model already has updated memory in its conversation history (it wrote it!), so re-reading memory from disk produces a different system prompt that breaks the cache prefix. 2. Stabilize Honcho context per session: - Only prefetch Honcho context on the first turn (empty history) - Bake Honcho context into the cached system prompt and store to DB - Remove the per-turn Honcho injection from the API call loop This ensures the system message is identical across all turns in a session. Previously, re-fetching Honcho could return different context on each turn, changing the system message and invalidating the cache. Both changes preserve the existing behavior for compression (which invalidates the prompt and rebuilds from scratch) and for the CLI (where the same AIAgent persists and the cached prompt is already stable across turns). Tests: 2556 passed (6 new)	2026-03-09 01:50:58 -07:00
teknium1	19b6f81ee7	fix: allow Anthropic API URLs as custom OpenAI-compatible endpoints Removed the hard block on base_url containing 'api.anthropic.com'. Anthropic now offers an OpenAI-compatible /chat/completions endpoint, so blocking their URL prevents legitimate use. If the endpoint isn't compatible, the API call will fail with a proper error anyway. Removed from: run_agent.py, mini_swe_runner.py Updated test to verify Anthropic URLs are accepted.	2026-03-07 23:36:35 -08:00
teknium1	b84f9e410c	feat: default reasoning effort from xhigh to medium Reduces token usage and latency for most tasks by defaulting to medium reasoning effort instead of xhigh. Users can still override via config or CLI flag. Updates code, tests, example config, and docs.	2026-03-07 10:14:19 -08:00
Robin Fernandes	bc091eb7ef	fix: implement Nous credential refresh on 401 error for retry logic	2026-03-07 13:34:23 +11:00
teknium1	3e93db16bd	Merge PR #436: fix: use _max_tokens_param in max-iterations retry path Authored by Farukest. Fixes #435. The retry summary in _handle_max_iterations() hardcoded max_tokens instead of using _max_tokens_param(), which returns max_completion_tokens for direct OpenAI API (required by gpt-4o, o-series). The first attempt already used _max_tokens_param correctly — only the retry path was wrong. Includes 4 tests for _max_tokens_param provider detection.	2026-03-06 04:46:24 -08:00
teknium1	5c867fd79f	test: strengthen assertions across 3 more test files (batch 2) test_run_agent.py (2 weak → 0, +13 assertions): - Session ID validated against actual YYYYMMDD_HHMMSS_hex format - API failure verifies error message propagation - Invalid JSON args verifies empty dict fallback + message structure - Context compression verifies final_response + completed flag - Invalid tool name retry verifies api_calls count - Invalid response verifies completed/failed/error structure test_model_tools.py (3 weak → 0): - Unknown tool error includes tool name in message - Exception returns dict with 'error' key + non-empty message - get_all_tool_names verifies both web_search AND terminal present test_approval.py (1 weak → 0, assert ratio 1.1 → 2.2): - Dangerous commands verify description content (delete, shell, drop, etc.) - Safe commands explicitly assert key AND desc are None - Pre/post condition checks for state management	2026-03-05 18:46:30 -08:00
Teknium	21d61bdd71	Merge pull request #307 from batuhankocyigit/patch-1 fix: correct typo 'Grup' -> 'Group' in test section headers	2026-03-05 08:54:05 -08:00
Farukest	e25ad79d5d	fix: use _max_tokens_param in max-iterations retry path The retry summary in _handle_max_iterations hardcodes max_tokens instead of calling _max_tokens_param(). For direct OpenAI API users (gpt-4o, o-series), the correct parameter name is max_completion_tokens. The first attempt at line 2697 already uses _max_tokens_param correctly but the retry path at line 2743 was missed.	2026-03-05 17:49:37 +03:00
0xbyt4	aefc330b8f	merge: resolve conflict with main (add mcp + homeassistant extras)	2026-03-03 14:52:22 +03:00
BathreeNode	f08ad94d4d	fix: correct typo 'Grup' -> 'Group' in test section headers Three section header comments in tests/test_run_agent.py used 'Grup' instead of 'Group': - Line 124: # Grup 1: Pure Functions - Line 276: # Grup 2: State / Structure Methods - Line 572: # Grup 3: Conversation Loop Pieces (OpenAI mock)	2026-03-03 09:10:35 +03:00
teknium1	56b53bff6e	Merge PR #229: fix(agent): copy conversation_history to avoid mutating caller's list Authored by Farukest. Fixes #228. # Conflicts: # tests/test_run_agent.py	2026-03-02 04:21:39 -08:00
teknium1	c4ea996612	fix: repair flush sentinel test — mock auxiliary client and add guard The TestFlushSentinelNotLeaked test from PR #227 had two issues: 1. flush_memories() uses get_text_auxiliary_client() which could bypass agent.client entirely — mock it to return (None, None) 2. No assertion that the API was actually called — added guard assert Without these fixes the test passed vacuously (API never called).	2026-03-02 03:21:08 -08:00
teknium1	234b67f5fd	fix: mock time in retry exhaustion tests to prevent backoff sleep The TestRetryExhaustion tests from PR #223 didn't mock time.sleep/time.time, causing the retry backoff loops (275s+ total) to run in real time. Tests would time out instead of running quickly. Added _make_fast_time_mock() helper that creates a mock time module where time.time() advances 500s per call (so sleep_end is always in the past) and time.sleep() is a no-op. Both tests now complete in <1s.	2026-03-02 02:59:41 -08:00
Farukest	e87859e82c	fix(agent): copy conversation_history to avoid mutating caller's list	2026-03-01 03:06:13 +03:00
Farukest	de101a8202	fix(agent): strip _flush_sentinel from API messages	2026-03-01 02:51:31 +03:00
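
Commit 6f85283553 above describes why `str()` on a dict of tool-call arguments breaks downstream `json.loads()`. The following is a minimal sketch of that failure mode, not the project's code: the `arguments` dict and the `parse_or_empty` helper are hypothetical stand-ins for the Responses API payload and the lenient downstream parser.

```python
import json

# Hypothetical tool-call arguments, already parsed into a dict.
arguments = {"path": "notes.txt", "overwrite": True}

as_repr = str(arguments)         # Python repr: single quotes, capitalized True
as_json = json.dumps(arguments)  # valid JSON: double quotes, lowercase true


def parse_or_empty(raw: str) -> dict:
    """Hypothetical lenient parser: fall back to {} when raw is not valid JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {}


repr_result = parse_or_empty(as_repr)  # parameters are silently lost
json_result = parse_or_empty(as_json)  # parameters survive the round trip
```

Serializing with `json.dumps` keeps the round trip lossless, whereas the repr string parses to nothing and the tool would be called with empty arguments.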

57 Commits
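
Commit 21ff0d39ad in the log describes a two-tier budget warning injected into the last tool result at 70% and 90% of max_iterations. A rough sketch of that idea follows, under stated assumptions: the function names are hypothetical, and the warning text is shortened to the prefix and counts because the commit message elides the rest of the wording.

```python
import json
from typing import Optional

CAUTION_THRESHOLD = 0.7  # thresholds as stated in the commit message
WARNING_THRESHOLD = 0.9


def budget_warning(used: int, limit: int) -> Optional[str]:
    """Return a warning string once usage crosses a threshold, else None."""
    ratio = used / limit
    if ratio >= WARNING_THRESHOLD:
        return f"[BUDGET WARNING: {used}/{limit}]"  # wording abbreviated (assumption)
    if ratio >= CAUTION_THRESHOLD:
        return f"[BUDGET: {used}/{limit}]"  # wording abbreviated (assumption)
    return None


def inject_budget_warning(tool_result: str, used: int, limit: int) -> str:
    """Attach the warning to a tool result without adding a new message."""
    warning = budget_warning(used, limit)
    if warning is None:
        return tool_result
    try:
        parsed = json.loads(tool_result)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, dict):
        # JSON object result: add a _budget_warning field to the existing dict.
        parsed["_budget_warning"] = warning
        return json.dumps(parsed)
    # Plain text (or non-object JSON): append the warning as text.
    return f"{tool_result}\n{warning}"
```

Because the warning rides inside an existing tool result, the message list keeps its shape, which is the property the commit credits with keeping the prompt cache valid.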