hermes-agent

Author	SHA1	Message	Date
0xbyt4	f1b4d0b280	fix(voice): make play_tts play in VC instead of no-op play_tts was returning success without playing anything when bot was in a voice channel. Now it calls play_in_voice_channel directly. Simplified skip_double dedup: base adapter handles voice input TTS via play_tts (which now works for VC), runner skips to avoid double.	2026-03-15 05:20:17 -07:00
teknium1	21c20aeaa5	fix(gateway): cancel active runs during shutdown Track adapter background message-processing tasks, cancel them during gateway shutdown, and interrupt running agents before disconnecting adapters. This prevents old gateway instances from continuing in-flight work after stop/replace, which was contributing to the restart-time task continuation/flicker behavior reported in #1414. Adds regression coverage for adapter task cancellation and shutdown interrupts.	2026-03-15 04:21:50 -07:00
CoinDegen	4ae1334287	fix(gateway): prevent telegram photo burst interrupts	2026-03-15 03:49:01 -07:00
Teknium	84d99f7754	Merge pull request #1394 from NousResearch/hermes/hermes-eca4a640 fix: honor stt.enabled false across gateway transcription	2026-03-14 22:11:47 -07:00
teyrebaz33	c36136084a	fix(gateway): honor stt.enabled false for voice transcription - bridge stt.enabled from config.yaml into gateway runtime config - preserve the flag in GatewayConfig serialization - skip gateway voice transcription when STT is disabled - add regression tests for config loading and disabled transcription flow	2026-03-14 22:09:53 -07:00
Teknium	b14a07315b	fix: save /plan output in workspace (#1381)	2026-03-14 21:28:51 -07:00
Teknium	ff3473a37c	feat: add /plan command (#1372) * feat: add /plan command * refactor: back /plan with bundled skill * docs: document /plan skill	2026-03-14 21:18:17 -07:00
teknium1	9f6bccd76a	feat: add direct endpoint overrides for auxiliary and delegation Add base_url/api_key overrides for auxiliary tasks and delegation so users can route those flows straight to a custom OpenAI-compatible endpoint without having to rely on provider=main or named custom providers. Also clear gateway session env vars in test isolation so the full suite stays deterministic when run from a messaging-backed agent session.	2026-03-14 21:11:37 -07:00
teknium1	3229e434b8	Merge origin/main into hermes/hermes-5d160594	2026-03-14 19:34:05 -07:00
teknium1	ed0c7194ed	fix: preserve current gateway update and startup behavior Follow up on salvaged PR #1052. Restore current-main gateway lifecycle handling after conflict resolution and adapt the update fallback to use shell-quoted argv parts safely.	2026-03-14 18:03:50 -07:00
teknium1	df5c61b37c	feat: compress cron management into one tool	2026-03-14 12:21:50 -07:00
teyrebaz33	f3a38c90fc	fix(gateway): fall back to sys.executable -m hermes_cli.main when hermes not on PATH When shutil.which('hermes') returns None, _resolve_hermes_bin() now tries sys.executable -m hermes_cli.main as a fallback. This handles setups where Hermes is launched via a venv or module invocation and the hermes symlink is not on PATH for the gateway process. Fixes #1049	2026-03-14 12:15:51 -07:00
Teknium	917adcbaf4	Merge pull request #1306 from NousResearch/hermes/hermes-2ba57c8a fix: backfill model on gateway sessions after agent runs	2026-03-14 06:48:32 -07:00
teknium1	19f4f8970a	fix: tolerate test doubles without model attr Use getattr() when returning model metadata from GatewayRunner._run_agent so fake agents and minimal stubs without a model attribute do not break unrelated gateway flows while preserving the session-model backfill behavior.	2026-03-14 06:47:39 -07:00
ac (sourcetree)	2046a4c08c	fix: backfill model on gateway sessions after agent runs Gateway sessions end up with model=NULL because the session row is created before AIAgent is constructed. After the agent responds, update_session() writes token counts but never fills in the model. Thread agent.model through _run_agent()'s return dict into update_session() → update_token_counts(). The SQL uses COALESCE(model, ?) so it only fills NULL rows — never overwrites a model already set at creation time (e.g. CLI sessions). If the agent falls back to a different provider, agent.model is updated in-place by _try_activate_fallback(), so the recorded value reflects whichever model actually produced the response. Fixes #987	2026-03-14 06:42:57 -07:00
teknium1	7b10881b9e	fix: persist clean voice transcripts and /voice off state - keep CLI voice prefixes API-local while storing the original user text - persist explicit gateway off state and restore adapter auto-TTS suppression on restart - add regression coverage for both behaviors	2026-03-14 06:14:22 -07:00
teknium1	523a1b6faf	merge: salvage PR #327 voice mode branch Merge contributor branch feature/voice-mode onto current main for follow-up fixes.	2026-03-14 06:03:07 -07:00
0xbyt4	cc0a453476	fix: address PR review round 5 — streaming guard, VC auth, history prefix, auto-TTS control 1. Gate _streaming_api_call to chat_completions mode only — Anthropic and Codex fall back to _interruptible_api_call. Preserve Anthropic base_url across all client rebuild paths (interrupt, fallback, 401 refresh). 2. Discord VC synthetic events now use chat_type="channel" instead of defaulting to "dm" — prevents session bleed into DM context. Authorization runs before echoing transcript. Sanitize @everyone/@here in voice transcripts. 3. CLI voice prefix ("[Voice input...]") is now API-call-local only — stripped from returned history so it never persists to session DB or resumed sessions. 4. /voice off now disables base adapter auto-TTS via _auto_tts_disabled_chats set — voice input no longer triggers TTS when voice mode is off.	2026-03-14 14:27:21 +03:00
0xbyt4	35748a2fb0	fix: address PR review round 4 — remove web UI, fix audio/import/interface issues Remove web UI gateway (web.py, tests, docs, toolset, env vars, Platform.WEB enum) per maintainer request — Nous is building their own official chat UI. Fix 1: Replace sd.wait() with polling pattern in play_audio_file() to prevent indefinite hang when audio device stalls (consistent with play_beep()). Fix 2: Use importlib.util.find_spec() for faster_whisper/openai availability checks instead of module-level imports that trigger heavy native library loading (CUDA/cuDNN) at import time. Fix 3: Remove inspect.signature() hack in _send_voice_reply() — add **kwargs to Telegram send_voice() so all adapters accept metadata uniformly. Fix 4: Make session loading resilient to removed platform enum values — skip entries with unknown platforms instead of crashing the entire gateway.	2026-03-14 14:27:21 +03:00
0xbyt4	1ad5e0ed15	feat: add voice channel awareness — inject participant and speaking state into agent context	2026-03-14 14:27:21 +03:00
0xbyt4	e3126aeb40	fix: STT consistency — web.py model param, error matching, local provider key - web.py: pass stt_model from config like discord.py and run.py do - run.py: match new error messages (No STT provider / not set) - _transcribe_local: add missing "provider": "local" to return dict	2026-03-14 14:27:21 +03:00
0xbyt4	44abe852fb	fix: add macOS Homebrew Opus fallback and fix shutdown dict iteration - Add Homebrew library path fallback when ctypes.util.find_library fails on macOS (Apple Silicon + Intel paths, guarded by platform check) - Fix RuntimeError in gateway stop() by iterating over dict copy - Update Opus tests to verify find_library-first + conditional fallback	2026-03-14 14:27:21 +03:00
0xbyt4	0ff1b4ade2	fix: harden web gateway security and fix error swallowing - Use hmac.compare_digest for timing-safe token comparison (3 endpoints) - Default bind to 127.0.0.1 instead of 0.0.0.0 - Sanitize upload filenames with Path.name to prevent path traversal - Add DOMPurify to sanitize marked.parse() output against XSS - Replace add_static with authenticated media handler - Hide token in group chats for /remote-control command - Use ctypes.util.find_library for Opus instead of hardcoded paths - Add force=True to 5 interrupt _vprint calls for visibility - Log Opus decode errors and voice restart failures instead of swallowing	2026-03-14 14:27:21 +03:00
0xbyt4	2c84979d77	refactor: extract get_stt_model_from_config helper to eliminate DRY violation Duplicated YAML config parsing for stt.model existed in gateway/run.py and gateway/platforms/discord.py. Moved to a single helper in transcription_tools.py and added 5 tests covering all edge cases.	2026-03-14 14:27:21 +03:00
0xbyt4	238a431545	fix: make STT config env-overridable and fix doc issues Code fixes: - STT model, Groq base URL, and OpenAI STT base URL are now configurable via env vars (STT_GROQ_MODEL, STT_OPENAI_MODEL, GROQ_BASE_URL, STT_OPENAI_BASE_URL) instead of hardcoded - Gateway and Discord VC now read stt.model from config.yaml (previously only CLI did this — gateway always used defaults) Doc fixes: - voice-mode.md: move Web UI troubleshooting to web.md (was duplicated) - voice-mode.md: simplify "How It Works" for end users (remove NaCl, DAVE, RTP internals) - voice-mode.md: clarify STT priority (OpenAI used first if both keys set, Groq recommended for free tier) - voice-mode.md: document new STT env overrides in config reference - web.md: remove duplicate Quick Start / Step 1-3 sections - web.md: add mobile HTTPS mic workarounds (moved from voice-mode.md) - web.md: clarify STT fallback order	2026-03-14 14:27:20 +03:00
0xbyt4	9722bd8be0	fix: 8 voice pipeline bugs with tests proving each fix 1. VoiceReceiver.stop() now acquires _lock before clearing shared state to prevent race with _on_packet on the socket reader thread 2. _packet_debug_count moved from class-level to instance-level to avoid cross-instance race condition in multi-guild setups 3. play_in_voice_channel uses asyncio.get_running_loop() instead of deprecated asyncio.get_event_loop() 4. _send_voice_reply uses uuid for filenames instead of time-based names that can collide when two replies happen in the same second 5. Voice timeout now notifies runner via _on_voice_disconnect callback so runner cleans up _voice_mode state (prevents orphaned TTS replies) 6. play_in_voice_channel adds PLAYBACK_TIMEOUT (120s) to prevent infinite blocking when FFmpeg callback is never called 7. _send_voice_reply moves temp file cleanup to finally block so files are always cleaned up even when send_voice/play raises 8. Base adapter auto-TTS wraps play_tts in try/finally with os.remove to clean up generated audio files after playback 18 new tests (120 total voice tests)	2026-03-14 14:27:20 +03:00
0xbyt4	c925d2ee76	fix: voice pipeline thread safety and error handling bugs - Add lock protection around VoiceReceiver buffer writes in _on_packet to prevent race condition with check_silence on different threads - Wire _voice_input_callback BEFORE join_voice_channel to avoid losing voice input during the join window - Add try/except around leave_voice_channel to ensure state cleanup (voice_mode, callback) even if leave raises an exception - Guard against empty text after markdown stripping in base.py auto-TTS - Add 11 tests proving each bug and verifying the fix	2026-03-14 14:27:20 +03:00
0xbyt4	86ddaaee9c	fix: extract voice reply logic and add comprehensive tests - Fix tempfile.mktemp() TOCTOU race in Discord voice input (use NamedTemporaryFile) - Extract voice reply decision from _handle_message into _should_send_voice_reply() - Rewrite TestAutoVoiceReply to call real method instead of testing a copy - Add 59 new tests: VoiceReceiver, VC commands, adapter methods, streaming TTS	2026-03-14 14:27:20 +03:00
0xbyt4	fbf47e9ff6	fix: allow voice reply in Discord VC despite skip_double guard When bot is in a Discord voice channel, both base auto-TTS and Discord play_tts override skip audio. The skip_double guard was also blocking the runner's _send_voice_reply, resulting in zero audio output in VC. Now skip_double is overridden when the bot is actively connected to a voice channel, allowing play_in_voice_channel to handle TTS. Add comprehensive test matrix covering all platform x input x mode combinations with full decision table documentation.	2026-03-14 14:27:20 +03:00
0xbyt4	095815d520	fix: skip gateway voice reply for all platforms on voice input Base adapter auto-TTS already generates and sends audio for voice messages in _process_message_background. The gateway runner's _send_voice_reply was causing double audio on all platforms (not just Web). Now skip_double applies to any voice input regardless of platform.	2026-03-14 14:27:20 +03:00
0xbyt4	815e83952e	fix: prevent double TTS on Web UI voice messages When voice mode is enabled and user sends a voice message on Web UI, both the base adapter auto-TTS (play_audio) and the gateway voice reply (send_voice) would fire, causing duplicate audio playback. Skip the gateway voice reply for Web platform voice input since base adapter already handles it.	2026-03-14 14:27:20 +03:00
0xbyt4	d3e09df01a	feat: add voice conversation support and futuristic UI redesign - Auto-TTS: voice messages get spoken response (audio first, then text) - STT: Groq Whisper fallback when VOICE_TOOLS_OPENAI_KEY not set - Futuristic UI: glassmorphism, centered container, purple theme, glow effects - Voice bubble: custom waveform player with seek and progress - Invisible TTS playback via play_tts() method (no audio file in chat) - Add hermes-web toolset with full tool access - Register Platform.WEB in toolset/config maps - Update docs for voice conversation feature	2026-03-14 14:27:20 +03:00
0xbyt4	ddfbc22b7c	feat: add /remote-control command to start web UI on demand Type /remote-control from any platform (Telegram, Discord, etc.) to instantly start the web UI without restarting the gateway. - Auto-generates access token if not provided - Shows URL + token in response - Optional: /remote-control [port] [token] - Reports status if already running - Added to /help command list	2026-03-14 14:27:20 +03:00
0xbyt4	a3905ef289	feat: add web gateway — browser-based chat UI over WebSocket New platform adapter that serves a full-featured chat interface via HTTP. Enables access from any device on the network (phone, tablet, desktop). Features: - aiohttp server with WebSocket real-time messaging - Token-based authentication - Markdown rendering (marked.js) + code highlighting (highlight.js) - Voice recording via MediaRecorder API + STT transcription - Image, voice, and document display - Typing indicator + message editing (streaming support) - Mobile responsive dark theme - Auto-reconnect on disconnect - Media file cleanup (24h TTL) Config: WEB_UI_ENABLED=true, WEB_UI_PORT=8765, WEB_UI_TOKEN=<token> No new dependencies — uses aiohttp already in [messaging] extra.	2026-03-14 14:27:20 +03:00
0xbyt4	c0c358d051	feat: add Discord voice channel listening — STT transcription and agent response pipeline Phase 2 of voice channel support: bot listens to users speaking in VC, transcribes speech via Groq Whisper, and processes through the agent pipeline. - Add VoiceReceiver class for RTP packet capture, NaCl/DAVE decryption, Opus decode - Add silence detection and per-user PCM buffering - Wire voice input callback from adapter to GatewayRunner - Fix adapter dict key: use Platform.DISCORD enum instead of string - Fix guild_id extraction for synthetic voice events via SimpleNamespace raw_message - Pause/resume receiver during TTS playback to prevent echo	2026-03-14 14:27:20 +03:00
0xbyt4	cc974904f8	feat: Discord voice channel support — bot joins VC and speaks replies - /voice channel: bot joins user's voice channel, speaks TTS replies - /voice leave: disconnect from voice channel - Auto-disconnect after 5 min inactivity - _get_guild_id() helper extracts guild from raw_message - Load opus codec for voice playback - discord.py[voice] in pyproject.toml (pulls PyNaCl + davey)	2026-03-14 14:27:20 +03:00
0xbyt4	cbe4c23efa	fix: Discord voice bubble + edge-tts mp3/ogg format mismatch - Send Discord voice messages with flags=8192 and waveform metadata so they render as native voice bubbles instead of file attachments - Use .mp3 output path for TTS so edge-tts opus conversion works correctly (edge always outputs mp3, convert was skipped for .ogg) - Use actual file_path from TTS result after potential opus conversion	2026-03-14 14:27:20 +03:00
0xbyt4	f6cf4ca826	feat: add /voice slash command to Discord + fix cross-platform send_voice - Register /voice as Discord slash command with mode choices - Fix _send_voice_reply to handle adapters that don't accept metadata parameter (Discord) by inspecting the method signature at runtime	2026-03-14 14:27:20 +03:00
0xbyt4	d80da5ddd8	feat: add /voice command for auto voice reply in Telegram gateway - /voice on: reply with voice when user sends voice messages - /voice tts: reply with voice to all messages - /voice off: disable, text-only replies - /voice status: show current mode - Per-chat state persisted to gateway_voice_mode.json - Dedup: skips auto-reply if agent already called text_to_speech tool - drop_pending_updates=True to ignore stale Telegram messages on restart - 25 tests covering command handler, reply logic, and edge cases	2026-03-14 14:27:20 +03:00
Teknium	02752c83b4	Merge pull request #1287 from NousResearch/hermes/hermes-cc060dd9 fix(gateway): avoid slash-command crash with GatewayConfig	2026-03-14 04:13:56 -07:00
clabbe-bot	3126c60885	fix: notify gateway users when updates finish or fail	2026-03-14 03:59:05 -07:00
teknium1	7e52e8eb54	fix(gateway): bridge quick commands into GatewayConfig runtime Follow-up on salvaged PR #975. Bridge quick_commands from config.yaml into load_gateway_config(), normalize non-dict quick command config at runtime, and add coverage for GatewayConfig round-trips plus config.yaml bridging. This makes the GatewayConfig quick-command fix complete for the real user-facing config path implicated by issue #973.	2026-03-14 03:57:25 -07:00
stablegenius49	ce56b45514	fix(gateway): support quick commands from GatewayConfig	2026-03-14 03:51:28 -07:00
Verne	52ba940c9b	feat(gateway): add reasoning hot reload Add a /reasoning command across gateway adapters so users can inspect or change reasoning effort without editing config by hand. Reload reasoning settings from config.yaml before each agent run, including background tasks, so the next message picks up the new value consistently.	2026-03-14 02:42:47 -07:00
teknium1	6f1889b0fa	fix: preserve current approval semantics for tirith guard Restore gateway/run.py to current main behavior while keeping tirith startup and pattern_keys replay, preserve yolo and non-interactive bypass semantics in the combined guard, and add regression tests for yolo and view-full flows.	2026-03-14 00:17:04 -07:00
sheeki003	375ce8a881	feat(security): add tirith pre-exec command scanning Integrate tirith as a pre-execution security scanner that detects homograph URLs, pipe-to-interpreter patterns, terminal injection, zero-width Unicode, and environment variable manipulation — threats the existing 50-pattern dangerous command detector doesn't cover. Architecture: gather-then-decide — both tirith and the dangerous command detector run before any approval prompt, preventing gateway force=True replay from bypassing one check when only the other was shown to the user. New files: - tools/tirith_security.py: subprocess wrapper with auto-installer, mandatory cosign provenance verification, non-blocking background download, disk-persistent failure markers with retryable-cause tracking (cosign_missing auto-clears when cosign appears on PATH) - tests/tools/test_tirith_security.py: 62 tests covering exit code mapping, fail_open, cosign verification, background install, HERMES_HOME isolation, and failure recovery - tests/tools/test_command_guards.py: 21 integration tests for the combined guard orchestration Modified files: - tools/approval.py: add check_all_command_guards() orchestrator, add allow_permanent parameter to prompt_dangerous_approval() - tools/terminal_tool.py: replace _check_dangerous_command with consolidated check_all_command_guards - cli.py: update _approval_callback for allow_permanent kwarg, call ensure_installed() at startup - gateway/run.py: iterate pattern_keys list on replay approval, call ensure_installed() at startup - hermes_cli/config.py: add security config defaults, split commented sections for independent fallback - cli-config.yaml.example: document tirith security config	2026-03-14 00:11:27 -07:00
Teknium	6235fdde75	fix: raise session hygiene threshold from 50% to 85% Session hygiene was firing at the same threshold (50%) as the agent's own context compressor, causing premature compression on every turn in long gateway sessions (especially Telegram). Hygiene is a safety net for pathologically large sessions that would cause API failures — it should NOT be doing normal compression work. The agent's own compressor handles that during its tool loop with accurate real token counts from the API. Changes: - Default hygiene threshold: 0.50 → 0.85 (fires only when truly large) - Hygiene threshold is now independent of compression.threshold config (that setting controls the agent's compressor, not the pre-agent safety net) - Removed env var override for hygiene threshold (CONTEXT_COMPRESSION_THRESHOLD still controls the agent's own compressor)	2026-03-13 04:17:45 -07:00
Teknium	8f8dd83443	fix: sync session_id after mid-run context compression Critical bug: when the agent's context compressor fires during a tool loop (_compress_context), it creates a new session_id and writes the compressed messages there. But the gateway's session_entry still pointed to the old session_id. On the next message, load_transcript() loaded the stale pre-compression transcript, causing: - Context bloat returning every turn - Repeated compression cycles - Loss of carefully compressed context Fix: after run_conversation() returns, check if the agent's session_id changed (compression split) and sync it back to the session store entry. Also pass the effective session_id in the result dict so _handle_message writes transcript entries to the correct session. This affects ALL gateway adapters, not just webhook.	2026-03-13 04:14:35 -07:00
kshitijk4poor	ccfbf42844	feat: secure skill env setup on load (core #688) When a skill declares required_environment_variables in its YAML frontmatter, missing env vars trigger a secure TUI prompt (identical to the sudo password widget) when the skill is loaded. Secrets flow directly to ~/.hermes/.env, never entering LLM context. Key changes: - New required_environment_variables frontmatter field for skills - Secure TUI widget (masked input, 120s timeout) - Gateway safety: messaging platforms show local setup guidance - Legacy prerequisites.env_vars normalized into new format - Remote backend handling: conservative setup_needed=True - Env var name validation, file permissions hardened to 0o600 - Redact patterns extended for secret-related JSON fields - 12 existing skills updated with prerequisites declarations - ~48 new tests covering skip, timeout, gateway, remote backends - Dynamic panel widget sizing (fixes hardcoded width from original PR) Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main with conflict resolution. Fixes #688 Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-13 03:14:04 -07:00
Teknium	475dd58a8e	Merge PR #736: feat(honcho): async writes, memory modes, session title integration, setup CLI Authored by erosika. Builds on #38 and #243. Adds async write support, configurable memory modes, context prefetch pipeline, 4 new Honcho tools (honcho_context, honcho_profile, honcho_search, honcho_conclude), full 'hermes honcho' CLI, session strategies, AI peer identity, recallMode A/B, gateway lifecycle management, and comprehensive docs. Cherry-picks fixes from PRs #831/#832 (adavyas). Co-authored-by: erosika <erosika@users.noreply.github.com> Co-authored-by: adavyas <adavyas@users.noreply.github.com>	2026-03-12 19:05:11 -07:00
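
Note on 2046a4c08c: the commit message says the SQL uses COALESCE(model, ?) so that only NULL rows are filled and a model set at creation time is never overwritten. The following is a minimal sketch of that pattern, not the project's actual code. The sessions table layout, the id column, and the backfill_model helper are hypothetical names chosen for illustration; only the COALESCE(model, ?) expression comes from the commit message.

```python
import sqlite3


def backfill_model(conn, session_id, model):
    """Fill in model for a session row only if it is currently NULL.

    Hypothetical helper: the table and column names are assumptions.
    """
    # COALESCE returns its first non-NULL argument, so an existing
    # model value is kept and only NULL rows receive the new value.
    conn.execute(
        "UPDATE sessions SET model = COALESCE(model, ?) WHERE id = ?",
        (model, session_id),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, model TEXT)")
# A gateway-style row created before the agent exists (model is NULL).
conn.execute("INSERT INTO sessions VALUES ('gateway-1', NULL)")
# A CLI-style row whose model was already set at creation time.
conn.execute("INSERT INTO sessions VALUES ('cli-1', 'model-a')")

backfill_model(conn, "gateway-1", "model-b")
backfill_model(conn, "cli-1", "model-b")

rows = dict(conn.execute("SELECT id, model FROM sessions"))
```

In this sketch the gateway-style row picks up the backfilled value while the CLI-style row keeps the model it was created with, which matches the behavior the commit message describes.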


222 Commits