- Keep InputStream alive across recordings to avoid CoreAudio hang on
repeated open/close cycles on macOS. New _ensure_stream() creates the
stream once; start()/stop()/cancel() only toggle frame collection.
- Add _close_stream_with_timeout() with daemon thread to prevent
stream.stop()/close() from blocking indefinitely.
- Add generation counter to detect stale stream-open completions after
cancel or restart.
- Run recorder.cancel() in background thread from Ctrl+C handler to
keep the event loop responsive.
- Add shutdown() method called on /voice off to release audio resources.
- Fix silence timer reset during active speech: use dip tolerance for
_resume_start tracker so natural speech pauses (< 0.3s) don't prevent
the silence timer from being reset.
- Update tests to match persistent stream behavior.
- process_loop's continuous mode restart called _voice_start_recording()
directly, blocking the loop if play_beep/sd.wait hangs — queued user
input would stall silently. Dispatch to daemon thread like Ctrl+B handler.
- Replace print() with _cprint() in _handle_voice_command for consistency
with the rest of the voice mode code.
The handle_voice_record key binding runs in prompt_toolkit's event-loop
thread. When silence auto-stopped recording, _voice_recording was False
but recorder.stop() still held AudioRecorder._lock. A concurrent Ctrl+B
press entered the START path and blocked on that lock, freezing all
keyboard input.
Three changes:
- Set _voice_processing atomically with _voice_recording=False in
_voice_stop_and_transcribe to close the race window
- Add _voice_processing guard in the START path to prevent starting
while stop/transcribe is still running
- Dispatch _voice_start_recording to a daemon thread so play_beep
(sd.wait) and AudioRecorder.start (lock acquire) never block the
event loop
browser_tool.py registered SIGINT/SIGTERM handlers that called sys.exit()
at module import time. When a signal arrived during a lock acquisition
(e.g. AudioRecorder._lock in voice mode), SystemExit was raised inside
prompt_toolkit's async event loop, corrupting coroutine state and making
the process unkillable (required SIGKILL).
atexit handler already ensures browser sessions are cleaned up on any
normal exit path, so the signal handlers were redundant and harmful.
- edge_tts NameError: _generate_edge_tts now calls _import_edge_tts()
instead of referencing bare module name (tts_tool.py)
- TTS thread leak: chat() finally block sends sentinel to text_queue,
sets stop_event, and joins tts_thread on exception paths (cli.py)
- output_stream leak: moved close() into finally block so audio device
is released even on exception (tts_tool.py)
- Ctrl+C continuous mode: cancel handler now resets _voice_continuous
to prevent auto-restart after user cancels recording (cli.py)
- _disable_voice_mode: now calls stop_playback() and sets
_voice_tts_done so TTS stops when voice mode is turned off (cli.py)
- _show_voice_status: reads record key from config instead of
hardcoding Ctrl+B (cli.py)
Bug A: Replace stale _HAS_ELEVENLABS/_HAS_AUDIO boolean imports with
lazy import function calls (_import_elevenlabs, _import_sounddevice).
The old constants no longer exist in tts_tool -- the try/except
silently swallowed the ImportError, leaving streaming TTS dead.
Bug B: Use user message prefix instead of modifying system prompt for
voice mode instruction. Changing ephemeral_system_prompt mid-session
invalidates the prompt cache. Now the concise-response hint is
prepended to the user_message passed to run_conversation while
conversation_history keeps the original text.
Minor: Add force parameter to _vprint so critical error messages
(max retries, non-retryable errors, API failures) are always shown
even during streaming TTS playback.
Tests: 15 new tests in test_voice_cli_integration.py covering all
three fixes -- lazy import activation, message prefix behavior,
history cleanliness, system prompt stability, and AST verification
that all critical _vprint calls use force=True.
1. Fully lazy imports: sounddevice, numpy, elevenlabs, edge_tts, and
openai are never imported at module level. Each is imported only when
the feature is explicitly activated, preventing crashes in headless
environments (SSH, Docker, WSL, no PortAudio).
2. No core agent loop changes: streaming TTS path extracted from
_interruptible_api_call() into separate _streaming_api_call() method.
The original method is restored to its upstream form.
3. Configurable key binding: push-to-talk key changed from Ctrl+R
(conflicts with readline reverse-search) to Ctrl+B by default.
Configurable via voice.push_to_talk_key in config.yaml.
4. Environment detection: new detect_audio_environment() function checks
for SSH, Docker, WSL, and missing audio devices before enabling voice
mode. Auto-disables with clear warnings in incompatible environments.
5. Graceful degradation: every audio touchpoint (sd.play, sd.InputStream,
sd.OutputStream) wrapped in try/except with ImportError/OSError
handling. Failures produce warnings, not crashes.
- Audio cues: beep on record start (880Hz), double beep on stop (660Hz)
- Silence detection: auto-stop recording after 3s of silence (RMS-based)
- Continuous mode: auto-restart recording after agent responds
- Ctrl+R starts continuous mode, Ctrl+R during recording exits it
- Waits for TTS to finish before restarting to avoid recording speaker
- Tests: 7 new tests for beep generation and silence detection
The test was failing because GROQ_API_KEY leaked from the environment.
Now both VOICE_TOOLS_OPENAI_KEY and GROQ_API_KEY are removed to
properly test the "no STT key" scenario.
* feat: improve context compaction handoff summaries
Adapt PR #916 onto current main by replacing the old context summary marker
with a clearer handoff wrapper, updating the summarization prompt for
resume-oriented summaries, and preserving the current call_llm-based
compression path.
* fix: clearer error when docker backend is unavailable
* fix: preserve docker discovery in backend preflight
Follow up on salvaged PR #940 by reusing find_docker() during the new
availability check so non-PATH Docker Desktop installs still work. Add
a regression test covering the resolved executable path.
---------
Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
Restore gateway/run.py to current main behavior while keeping tirith startup
and pattern_keys replay, preserve yolo and non-interactive bypass semantics in
the combined guard, and add regression tests for yolo and view-full flows.
Integrate tirith as a pre-execution security scanner that detects
homograph URLs, pipe-to-interpreter patterns, terminal injection,
zero-width Unicode, and environment variable manipulation — threats
the existing 50-pattern dangerous command detector doesn't cover.
Architecture: gather-then-decide — both tirith and the dangerous
command detector run before any approval prompt, preventing gateway
force=True replay from bypassing one check when only the other was
shown to the user.
New files:
- tools/tirith_security.py: subprocess wrapper with auto-installer,
mandatory cosign provenance verification, non-blocking background
download, disk-persistent failure markers with retryable-cause
tracking (cosign_missing auto-clears when cosign appears on PATH)
- tests/tools/test_tirith_security.py: 62 tests covering exit code
mapping, fail_open, cosign verification, background install,
HERMES_HOME isolation, and failure recovery
- tests/tools/test_command_guards.py: 21 integration tests for the
combined guard orchestration
Modified files:
- tools/approval.py: add check_all_command_guards() orchestrator,
add allow_permanent parameter to prompt_dangerous_approval()
- tools/terminal_tool.py: replace _check_dangerous_command with
consolidated check_all_command_guards
- cli.py: update _approval_callback for allow_permanent kwarg,
call ensure_installed() at startup
- gateway/run.py: iterate pattern_keys list on replay approval,
call ensure_installed() at startup
- hermes_cli/config.py: add security config defaults, split
commented sections for independent fallback
- cli-config.yaml.example: document tirith security config
Salvage of PR #1246 by ChatGPT (teknium1 session), resolved against
current main which already includes #1239.
Changes:
- Add minisweagent_path.py: worktree-aware helper that finds
mini-swe-agent/src from either the current checkout or the main
checkout behind a git worktree
- Use the helper in tools/terminal_tool.py and mini_swe_runner.py
instead of naive path-relative lookup that fails in worktrees
- Clean up check_terminal_requirements():
- local: return True (no minisweagent dep, per #1239)
- singularity/ssh: remove unnecessary minisweagent imports
- docker/modal: use importlib.util.find_spec with clear error
- Add regression tests for worktree path discovery and tool resolution
* fix: Home Assistant event filtering now closed by default
Previously, when no watch_domains or watch_entities were configured,
ALL state_changed events passed through to the agent, causing users
to be flooded with notifications for every HA entity change.
Now events are dropped by default unless the user explicitly configures:
- watch_domains: list of domains to monitor (e.g. climate, light)
- watch_entities: list of specific entity IDs to monitor
- watch_all: true (new option — opt-in to receive all events)
A warning is logged at connect time if no filters are configured,
guiding users to set up their HA platform config.
All 49 gateway HA tests + 52 HA tool tests pass.
* docs: update Home Assistant integration documentation
- homeassistant.md: Fix event filtering docs to reflect closed-by-default
behavior. Add watch_all option. Replace Python dict config example with
YAML. Fix defaults table (was incorrectly showing 'all'). Add required
configuration warning admonition.
- environment-variables.md: Add HASS_TOKEN and HASS_URL to Messaging section.
- messaging/index.md: Add Home Assistant to description, architecture
diagram, platform toolsets table, and Next Steps links.
* fix(terminal): strip provider env vars from background and PTY subprocesses
Extends the env var blocklist from #1157 to also cover the two remaining
leaky paths in process_registry.py:
- spawn_local() PTY path (line 156)
- spawn_local() background Popen path (line 197)
Both were still using raw os.environ, leaking provider vars to background
processes and interactive PTY sessions. Now uses the same dynamic
_HERMES_PROVIDER_ENV_BLOCKLIST from local.py.
Explicit env_vars passed to spawn_local() still override the blocklist,
matching the existing behavior for callers that intentionally need these.
Gap identified by PR #1004 (@PeterFile).
* feat(delegate): add observability metadata to subagent results
Enrich delegate_task results with metadata from the child AIAgent:
- model: which model the child used
- exit_reason: completed | interrupted | max_iterations
- tokens.input / tokens.output: token counts
- tool_trace: per-tool-call trace with byte sizes and ok/error status
Tool trace uses tool_call_id matching to correctly pair parallel tool
calls with their results, with a fallback for messages without IDs.
Cherry-picked from PR #872 by @omerkaz, with fixes:
- Fixed parallel tool call trace pairing (was always updating last entry)
- Removed redundant 'iterations' field (identical to existing 'api_calls')
- Added test for parallel tool call trace correctness
Co-authored-by: omerkaz <omerkaz@users.noreply.github.com>
* feat(stt): add free local whisper transcription via faster-whisper
Replace OpenAI-only STT with a dual-provider system mirroring the TTS
architecture (Edge TTS free / ElevenLabs paid):
STT: faster-whisper local (free, default) / OpenAI Whisper API (paid)
Changes:
- tools/transcription_tools.py: Full rewrite with provider dispatch,
config loading, local faster-whisper backend, and OpenAI API backend.
Auto-downloads model (~150MB for 'base') on first voice message.
Singleton model instance reused across calls.
- pyproject.toml: Add faster-whisper>=1.0.0 as core dependency
- hermes_cli/config.py: Expand stt config to match TTS pattern with
provider selection and per-provider model settings
- agent/context_compressor.py: Fix .strip() crash when LLM returns
non-string content (dict from llama.cpp, None). Fixes#1100 partially.
- tests/: 23 new tests for STT providers + 2 for compressor fix
- docs/: Updated Voice & TTS page with STT provider table, model sizes,
config examples, and fallback behavior
Fallback behavior:
- Local not installed → OpenAI API (if key set)
- OpenAI key not set → local whisper (if installed)
- Neither → graceful error message to user
Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>
---------
Co-authored-by: omerkaz <omerkaz@users.noreply.github.com>
Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>
* fix: Home Assistant event filtering now closed by default
Previously, when no watch_domains or watch_entities were configured,
ALL state_changed events passed through to the agent, causing users
to be flooded with notifications for every HA entity change.
Now events are dropped by default unless the user explicitly configures:
- watch_domains: list of domains to monitor (e.g. climate, light)
- watch_entities: list of specific entity IDs to monitor
- watch_all: true (new option — opt-in to receive all events)
A warning is logged at connect time if no filters are configured,
guiding users to set up their HA platform config.
All 49 gateway HA tests + 52 HA tool tests pass.
* docs: update Home Assistant integration documentation
- homeassistant.md: Fix event filtering docs to reflect closed-by-default
behavior. Add watch_all option. Replace Python dict config example with
YAML. Fix defaults table (was incorrectly showing 'all'). Add required
configuration warning admonition.
- environment-variables.md: Add HASS_TOKEN and HASS_URL to Messaging section.
- messaging/index.md: Add Home Assistant to description, architecture
diagram, platform toolsets table, and Next Steps links.
* fix(terminal): strip provider env vars from background and PTY subprocesses
Extends the env var blocklist from #1157 to also cover the two remaining
leaky paths in process_registry.py:
- spawn_local() PTY path (line 156)
- spawn_local() background Popen path (line 197)
Both were still using raw os.environ, leaking provider vars to background
processes and interactive PTY sessions. Now uses the same dynamic
_HERMES_PROVIDER_ENV_BLOCKLIST from local.py.
Explicit env_vars passed to spawn_local() still override the blocklist,
matching the existing behavior for callers that intentionally need these.
Gap identified by PR #1004 (@PeterFile).
* feat(delegate): add observability metadata to subagent results
Enrich delegate_task results with metadata from the child AIAgent:
- model: which model the child used
- exit_reason: completed | interrupted | max_iterations
- tokens.input / tokens.output: token counts
- tool_trace: per-tool-call trace with byte sizes and ok/error status
Tool trace uses tool_call_id matching to correctly pair parallel tool
calls with their results, with a fallback for messages without IDs.
Cherry-picked from PR #872 by @omerkaz, with fixes:
- Fixed parallel tool call trace pairing (was always updating last entry)
- Removed redundant 'iterations' field (identical to existing 'api_calls')
- Added test for parallel tool call trace correctness
Co-authored-by: omerkaz <omerkaz@users.noreply.github.com>
---------
Co-authored-by: omerkaz <omerkaz@users.noreply.github.com>
Terminal subprocesses inherit OPENAI_BASE_URL and other provider env
vars loaded from ~/.hermes/.env, silently misrouting external CLIs
like codex. Build a blocklist dynamically from the provider registry
so new providers are automatically covered. Callers that truly need
a blocked var can opt in via the _HERMES_FORCE_ prefix.
Closes#1002
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The old message referenced 'hermes setup' which doesn't handle
skill-specific env vars. Updated to direct users to load the skill
in the local CLI (which triggers the secure prompt) or add the key
to ~/.hermes/.env manually.
When a skill declares required_environment_variables in its YAML
frontmatter, missing env vars trigger a secure TUI prompt (identical
to the sudo password widget) when the skill is loaded. Secrets flow
directly to ~/.hermes/.env, never entering LLM context.
Key changes:
- New required_environment_variables frontmatter field for skills
- Secure TUI widget (masked input, 120s timeout)
- Gateway safety: messaging platforms show local setup guidance
- Legacy prerequisites.env_vars normalized into new format
- Remote backend handling: conservative setup_needed=True
- Env var name validation, file permissions hardened to 0o600
- Redact patterns extended for secret-related JSON fields
- 12 existing skills updated with prerequisites declarations
- ~48 new tests covering skip, timeout, gateway, remote backends
- Dynamic panel widget sizing (fixes hardcoded width from original PR)
Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main
with conflict resolution.
Fixes#688
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
When the model returns multiple tool calls in a single response, they are
now executed concurrently using a thread pool instead of sequentially.
This significantly reduces wall-clock time when multiple independent tools
are batched (e.g. parallel web_search, read_file, terminal calls).
Architecture:
- _execute_tool_calls() dispatches to sequential or concurrent path
- Single tool calls and batches containing 'clarify' use sequential path
- Multiple non-interactive tools use ThreadPoolExecutor (max 8 workers)
- Results are collected and appended to messages in original order
- _invoke_tool() extracted as shared tool invocation helper
Safety:
- Pre-flight interrupt check skips all tools if interrupted
- Per-tool exception handling: one failure doesn't crash the batch
- Result truncation (100k char limit) applied per tool
- Budget pressure injection after all tools complete
- Checkpoints taken before file-mutating tools
- CLI spinner shows batch progress, then per-tool completion messages
Tests: 10 new tests covering dispatch logic, ordering, error handling,
interrupt behavior, truncation, and _invoke_tool routing.
When a dangerous command is detected and the user is prompted for
approval, long commands are truncated (80 chars in fallback, 70 chars
in the TUI). Users had no way to see the full command before deciding.
This adds a 'View full command' option across all approval interfaces:
- CLI fallback (tools/approval.py): [v]iew option in the prompt menu.
Shows the full command and re-prompts for approval decision.
- CLI TUI (cli.py): 'Show full command' choice in the arrow-key
selection panel. Expands the command display in-place and removes
the view option after use.
- CLI callbacks (callbacks.py): 'view' choice added to the list when
the command exceeds 70 characters.
- Gateway (gateway/run.py): 'full', 'show', 'view' responses reveal
the complete command while keeping the approval pending.
Includes 7 new tests covering view-then-approve, view-then-deny,
short command fallthrough, and double-view behavior.
Closes community feedback about the 80-char cap on dangerous commands.
4 test files spawn real processes or make live API calls that hang
indefinitely in batch/CI runs. Skip them with pytestmark:
- tests/tools/test_code_execution.py (subprocess spawns)
- tests/tools/test_file_tools_live.py (live LocalEnvironment)
- tests/test_413_compression.py (blocks on process)
- tests/test_agent_loop_tool_calling.py (live OpenRouter API calls)
Also added global 30s signal.alarm timeout in conftest.py as a safety
net, and removed stale nous-api test that hung on OAuth browser login.
Suite now runs in ~55s with no hangs.
- gateway/run.py: Take main's _resolve_gateway_model() helper
- hermes_cli/setup.py: Re-apply nous-api removal after merge brought
it back. Fix provider_idx offset (Custom is now index 3, not 4).
- tests/hermes_cli/test_setup.py: Fix custom setup test index (3→4)
Update 14 test files to use the new call_llm/async_call_llm mock
patterns instead of the old get_text_auxiliary_client/
get_vision_auxiliary_client tuple returns.
- vision_tools tests: mock async_call_llm instead of _aux_async_client
- browser tests: mock call_llm instead of _aux_vision_client
- flush_memories tests: mock call_llm instead of get_text_auxiliary_client
- session_search tests: mock async_call_llm with RuntimeError
- mcp_tool tests: fix whitelist model config, use side_effect for
multi-response tests
- auxiliary_config_bridge: update for model=None (resolved in router)
3251 passed, 2 pre-existing unrelated failures.
Add centralized call_llm() and async_call_llm() functions that own the
full LLM request lifecycle:
1. Resolve provider + model from task config or explicit args
2. Get or create a cached client for that provider
3. Format request args (max_tokens handling, provider extra_body)
4. Make the API call with max_tokens/max_completion_tokens retry
5. Return the response
Config: expanded auxiliary section with provider:model slots for all
tasks (compression, vision, web_extract, session_search, skills_hub,
mcp, flush_memories). Config version bumped to 7.
Migrated all auxiliary consumers:
- context_compressor.py: uses call_llm(task='compression')
- vision_tools.py: uses async_call_llm(task='vision')
- web_tools.py: uses async_call_llm(task='web_extract')
- session_search_tool.py: uses async_call_llm(task='session_search')
- browser_tool.py: uses call_llm(task='vision'/'web_extract')
- mcp_tool.py: uses call_llm(task='mcp')
- skills_guard.py: uses call_llm(provider='openrouter')
- run_agent.py flush_memories: uses call_llm(task='flush_memories')
Tests updated for context_compressor and MCP tool. Some test mocks
still need updating (15 remaining failures from mock pattern changes,
2 pre-existing).
_discover_one() caught all exceptions and returned [], making
asyncio.gather(return_exceptions=True) redundant. The
isinstance(result, Exception) branch in _discover_all() was dead
code, so failed_count was always 0. This caused:
- No summary printed when all servers fail (silent failure)
- ok_servers always equaling total_servers (misleading count)
- Unused variables transport_desc and transport_type
Fix: let exceptions propagate to gather() so failed_count increments
correctly. Move per-server failure logging to _discover_all(). Remove
dead variables.
- test_agent_loop_tool_calling.py: import atroposlib at module level
to trigger skip (environments.agent_loop is now importable without
atroposlib due to __init__.py graceful fallback)
- test_modal_sandbox_fixes.py: skip TestToolResolution tests when
minisweagent not installed
Guard all test files that import from environments/ or atroposlib
with try/except + pytest.skip(allow_module_level=True) so they
gracefully skip instead of crashing when deps aren't available.
The weaker assertion (r.exc_info is not None) passes even when
exc_info is (None, None, None). Check r.exc_info[0] is not None
to verify actual exception info is present.
The _aux_async_client mock was already applied on main.
Co-authored-by: OutThisLife <nickolasgustafsson@gmail.com>
Adds delegation.model and delegation.provider config fields so subagents
can run on a completely different provider:model pair than the parent agent.
When delegation.provider is set, the system resolves the full credential
bundle (base_url, api_key, api_mode) via resolve_runtime_provider() —
the same path used by CLI/gateway startup. This means all configured
providers work out of the box: openrouter, nous, zai, kimi-coding,
minimax, minimax-cn.
Key design decisions:
- Provider resolution uses hermes_cli.runtime_provider (single source of
truth for credential resolution across CLI, gateway, cron, and now
delegation)
- When only delegation.model is set (no provider), the model name changes
but parent credentials are inherited (for switching models within the
same provider like OpenRouter)
- When delegation.provider is set, full credentials are resolved
independently — enabling cross-provider delegation (e.g. parent on
Nous Portal, subagents on OpenRouter)
- Clear error messages if provider resolution fails (missing API key,
unknown provider name)
- _load_config() now falls back to hermes_cli.config.load_config() for
gateway/cron contexts where CLI_CONFIG is unavailable
Based on PR #791 by 0xbyt4 (closes#609), reworked to use proper
provider credential resolution instead of passing provider as metadata.
Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>
Adds defensive guard against empty/None/missing choices in SamplingHandler.__call__
before accessing response.choices[0]. Returns proper ErrorData instead of crashing
with IndexError/TypeError on content filtering, provider errors, or rate limits.
Authored by 0xbyt4.
Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>
Isolate Telegram forum topic sessions — each topic gets its own independent session key, history, and interrupt tracking. Progress, hygiene, and cron messages all route to the correct topic.
Replaces head-only stdout capture with a two-buffer approach (40% head,
60% tail rolling window) so scripts that print() their final results
at the end never lose them. Adds truncation notice between sections.
Cherry-picked from PR #755, conflict resolved (test file additions).
3 new tests for short output, head+tail preservation, and notice format.
Authored by dmahan93. Adds HERMES_YOLO_MODE env var and --yolo CLI flag
to auto-approve all dangerous command prompts.
Post-merge: renamed --fuck-it-ship-it to --yolo for brevity,
resolved conflict with --checkpoints flag.
On macOS, Docker Desktop installs the CLI to /usr/local/bin/docker, but
when Hermes runs as a gateway service (launchd) or in other non-login
contexts, /usr/local/bin is often not in PATH. This causes the Docker
requirements check to fail with 'No such file or directory: docker' even
though docker works fine from the user's terminal.
Add find_docker() helper that uses shutil.which() first, then probes
common Docker Desktop install paths on macOS (/usr/local/bin,
/opt/homebrew/bin, Docker.app bundle). The resolved path is cached and
passed to mini-swe-agent via its 'executable' parameter.
- tools/environments/docker.py: add find_docker(), use it in
_storage_opt_supported() and pass to _Docker(executable=...)
- tools/terminal_tool.py: use find_docker() in requirements check
- tests/tools/test_docker_find.py: 4 tests (PATH, fallback, not found, cache)
2877 tests pass.
The 4 early-return paths in _spawn_training_run (API exit, trainer
exit, env not found, env exit) were doing manual process.terminate()
or returning without cleanup, leaking open log file handles. Now all
paths call _stop_training_run() which handles both process termination
and file handle closure.
Also adds 12 tests for _stop_training_run covering file handle
cleanup, process termination, status transitions, and edge cases.
Inspired by PR #715 (0xbyt4) which identified the early-return issue.
Core file handle fix was already on main via e28dc13 (memosr.eth).