Commit Graph

141 Commits

Author SHA1 Message Date
teknium1
2d57946ee9 test(voice): clarify install guidance and local skips
Add an explicit messaging-extra install hint to the missing PyNaCl/davey error path, cover it with a voice-channel join regression test, and skip the low-level NaCl packet tests when PyNaCl is not installed locally.
2026-03-15 05:24:34 -07:00
0xbyt4
63f0ec96ec test(voice): add comprehensive flow tests for voice channel fixes
Tests cover the actual code paths changed in voice fixes:

_on_packet DAVE passthrough (8 tests):
- Known SSRC + DAVE decrypt success → buffered
- Unknown SSRC + DAVE → skip DAVE, passthrough to Opus
- DAVE "Unencrypted" error → passthrough, not dropped
- DAVE other error → packet dropped
- No DAVE session → direct decode
- Bot's own SSRC → ignored (echo prevention)
- Multiple SSRCs → separate buffers

SSRC auto-mapping (6 tests):
- Single allowed user → auto-mapped
- Multiple allowed users → no auto-map
- No allowlist → sole non-bot member inferred
- Unallowed user → rejected
- Only bot in channel → no map
- Auto-map persists across checks

Buffer lifecycle (4 tests):
- Known SSRC completed utterance
- Short buffer ignored
- Recent audio waits
- Stale unknown buffer discarded

TTS playback (10 tests):
- play_tts calls play_in_voice_channel in VC
- play_tts falls through when not in VC
- play_tts wrong channel no match
- Voice input dedup (runner skips)
- Text + voice_mode combinations
- Error/empty response skipped
- Agent TTS tool dedup

UDP keepalive (2 tests):
- Interval within bounds
- Silence frame actually sent via send_packet
2026-03-15 05:20:17 -07:00
0xbyt4
f1b4d0b280 fix(voice): make play_tts play in VC instead of no-op
play_tts was returning success without playing anything when bot was
in a voice channel. Now it calls play_in_voice_channel directly.

Simplified skip_double dedup: base adapter handles voice input TTS
via play_tts (which now works for VC), runner skips to avoid double.
2026-03-15 05:20:17 -07:00
teknium1
21c20aeaa5 fix(gateway): cancel active runs during shutdown
Track adapter background message-processing tasks, cancel them during gateway shutdown, and interrupt running agents before disconnecting adapters. This prevents old gateway instances from continuing in-flight work after stop/replace, which was contributing to the restart-time task continuation/flicker behavior reported in #1414. Adds regression coverage for adapter task cancellation and shutdown interrupts.
2026-03-15 04:21:50 -07:00
teknium1
fef710aca8 test(gateway): cover photo burst interrupt regressions
Add regression coverage for non-album Telegram photo burst batching, photo follow-ups that should queue without interrupting active runs, and the gateway priority-interrupt path for photo events.
2026-03-15 03:50:45 -07:00
CoinDegen
4ae1334287 fix(gateway): prevent telegram photo burst interrupts
2026-03-15 03:49:01 -07:00
teknium1
232ba441d7 test: cover DM session key isolation
Update interrupt-key expectations for namespaced DM session keys and add a regression test that different DM chat IDs produce distinct gateway sessions.
2026-03-15 02:38:48 -07:00
heyyyimmax
34e120bcbb fix(gateway): enforce chat_id isolation for all DM sessions
2026-03-15 02:37:53 -07:00
Teknium
84d99f7754 Merge pull request #1394 from NousResearch/hermes/hermes-eca4a640
fix: honor stt.enabled false across gateway transcription
2026-03-14 22:11:47 -07:00
teyrebaz33
c36136084a fix(gateway): honor stt.enabled false for voice transcription
- bridge stt.enabled from config.yaml into gateway runtime config
- preserve the flag in GatewayConfig serialization
- skip gateway voice transcription when STT is disabled
- add regression tests for config loading and disabled transcription flow
2026-03-14 22:09:53 -07:00
halfprice06
9a177d6f4b fix(discord): preserve native document and video attachment support
Salvaged from PR #1115 onto current main by reusing the shared
Discord file-attachment helper for local video and document sends,
including file_name support for documents and regression coverage.
2026-03-14 22:01:02 -07:00
teknium1
9938d27e27 test(telegram): cover disconnect with inactive updater
2026-03-14 21:53:28 -07:00
teknium1
a05a4afa53 fix: align salvaged Discord send test mock with current slash-command API
2026-03-14 21:44:50 -07:00
insecurejezza
8ce66a01ee fix(discord): retry without reply reference for system messages
2026-03-14 21:44:38 -07:00
teknium1
9c322f7f59 Merge origin/main into hermes/hermes-7ef7cb6a
2026-03-14 21:39:01 -07:00
Teknium
b14a07315b fix: save /plan output in workspace (#1381)
2026-03-14 21:28:51 -07:00
teknium1
4f4e2671ac test: lock retry replacement semantics
Add regression coverage for gateway and CLI /retry behavior so retried messages replace the original user turn instead of accumulating duplicate user entries in history.
2026-03-14 21:19:22 -07:00
Teknium
ff3473a37c feat: add /plan command (#1372)
* feat: add /plan command

* refactor: back /plan with bundled skill

* docs: document /plan skill
2026-03-14 21:18:17 -07:00
Teknium
fa89b65230 Merge pull request #1355 from NousResearch/hermes/hermes-ec1096a3
Salvaged PR #1052 onto current main with the contributor commit preserved plus a small follow-up for current-main conflict resolution and safe command quoting.
2026-03-14 18:05:28 -07:00
teknium1
79c81b2244 Merge origin/main into hermes/hermes-2f2b4807
2026-03-14 18:02:08 -07:00
teknium1
3fab72f1e1 fix(gateway): clean up pending Telegram media groups on disconnect
Cancel any queued media-group flush tasks during Telegram adapter disconnect
and clear the buffered events map so shutdown can't leave a pending album
flush behind. Add a regression test covering disconnect before the debounce
window expires.
2026-03-14 12:18:24 -07:00
teyrebaz33
f3a38c90fc fix(gateway): fall back to sys.executable -m hermes_cli.main when hermes not on PATH
When shutil.which('hermes') returns None, _resolve_hermes_bin() now tries
sys.executable -m hermes_cli.main as a fallback. This handles setups where
Hermes is launched via a venv or module invocation and the hermes symlink is
not on PATH for the gateway process.

Fixes #1049
2026-03-14 12:15:51 -07:00
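The PATH fallback described in f3a38c90fc can be sketched roughly as follows. This is a minimal illustration, not the body of `_resolve_hermes_bin()`: the function name `resolve_hermes_command`, its list return type, and the injectable `which` parameter are assumptions made for the example.

```python
import shutil
import sys


def resolve_hermes_command(which=shutil.which):
    """Return an argv prefix for launching Hermes (hypothetical helper).

    `which` is injectable only so the fallback branch can be exercised in a test.
    """
    path = which("hermes")
    if path:
        return [path]
    # Fallback named in the commit: run the CLI module with the current interpreter.
    return [sys.executable, "-m", "hermes_cli.main"]
```

Returning an argv list rather than a single string keeps the two cases interchangeable for callers that pass the result to `subprocess`.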
capybaraonchain
8fb618234f fix(gateway): buffer Telegram media groups to prevent self-interruption
Telegram albums arrive as multiple updates with a shared media_group_id.
Previously each image triggered a separate MessageEvent, causing the agent
to interrupt itself when describing the first image.

- Add 0.8s debounce window for media group items
- Merge attachments into single MessageEvent
- Add regression test for photo album buffering
2026-03-14 12:14:45 -07:00
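The debounce-and-merge behavior described in 8fb618234f might look something like the sketch below. `MessageEvent` here is a stand-in dataclass and `MediaGroupBuffer` is a hypothetical name; only the shared `media_group_id`, the 0.8s window, and the single merged event come from the commit message.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class MessageEvent:  # stand-in for illustration, not the gateway's real class
    chat_id: int
    attachments: list = field(default_factory=list)


class MediaGroupBuffer:
    def __init__(self, dispatch, debounce_seconds=0.8):
        self._dispatch = dispatch      # async callable that receives one merged event
        self._debounce = debounce_seconds
        self._events = {}              # media_group_id -> merged MessageEvent
        self._tasks = {}               # media_group_id -> pending flush task

    def add(self, media_group_id, event):
        """Buffer one album item; must be called from a running event loop."""
        merged = self._events.get(media_group_id)
        if merged is None:
            # Copy so the caller's event object is not mutated later.
            self._events[media_group_id] = MessageEvent(
                event.chat_id, list(event.attachments)
            )
        else:
            merged.attachments.extend(event.attachments)
        # Restart the debounce window whenever another item arrives.
        pending = self._tasks.get(media_group_id)
        if pending is not None:
            pending.cancel()
        self._tasks[media_group_id] = asyncio.create_task(
            self._flush_later(media_group_id)
        )

    async def _flush_later(self, media_group_id):
        await asyncio.sleep(self._debounce)
        self._tasks.pop(media_group_id, None)
        await self._dispatch(self._events.pop(media_group_id))
```

Because the timer restarts on each item, the agent sees one event per album once updates stop arriving for the length of the window.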
teknium1
5a2fcaab39 fix(gateway): harden Telegram polling conflict handling
- detect Telegram getUpdates conflicts and stop polling cleanly instead of retry-spamming forever
- add a machine-local token-scoped lock so different HERMES_HOME profiles on the same host can't poll the same bot token at once
- persist gateway runtime health/fatal adapter state and surface it in the gateway status output
- cleanly exit non-retryable startup conflicts without triggering service restart loops

Tests:
- gateway status runtime-state helpers
- Telegram token-lock and polling-conflict behavior
- GatewayRunner clean exit on non-retryable startup conflict
- CLI runtime health summary
2026-03-14 12:11:23 -07:00
Himess
e5dc569daa fix: salvage gateway dedup and executor cleanup from PR #993
Salvages the two still-relevant fixes from PR #993 onto current main:
- use a 3-tuple LOCAL delivery key so explicit/local-origin targets are not duplicated
- shut down the previous agent-loop ThreadPoolExecutor when resizing the global pool

Adds regression tests for both behaviors.
2026-03-14 11:03:20 -07:00
teknium1
8f3d7dfcc0 fix: defer discord adapter annotations
Prevent gateway.platforms.discord from crashing at import time when discord.py is unavailable. Python 3.11 eagerly evaluates annotations, so using discord.Interaction and similar annotations caused an AttributeError after the optional import fallback set discord=None. Add postponed annotation evaluation and a regression test covering import without discord installed.
2026-03-14 09:32:05 -07:00
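The import-time failure and fix described in 8f3d7dfcc0 can be reproduced in miniature. The sketch below executes two small module sources in which the optional import has already fallen back to `discord = None`; the handler name `on_interaction` and the helper `defines_cleanly` are made up for the example. Whether the first source fails depends on the interpreter evaluating annotations eagerly, as the commit reports for Python 3.11, so no outcome is asserted for it here.

```python
# Module source where the optional import has already fallen back to None.
EAGER_SOURCE = (
    "discord = None\n"
    "def on_interaction(interaction: discord.Interaction) -> None:\n"
    "    pass\n"
)
# Same module with postponed evaluation of annotations (PEP 563).
POSTPONED_SOURCE = "from __future__ import annotations\n" + EAGER_SOURCE


def defines_cleanly(source):
    """Return True if executing the module source raises no AttributeError."""
    try:
        exec(compile(source, "<sketch>", "exec", dont_inherit=True), {})
    except AttributeError:
        return False
    return True
```

With the `__future__` import, the annotation is stored as a string and `discord.Interaction` is never looked up while the function is being defined.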
teknium1
eb8316ea69 fix: harden gateway restart recovery
- store gateway PID metadata and validate the live process before trusting gateway.pid
- auto-refresh outdated systemd user units before start/restart so installs pick up --replace fixes
- sweep stray manual gateway processes after service stops
- add regression tests for PID validation and service drift recovery
2026-03-14 07:42:31 -07:00
Teknium
917adcbaf4 Merge pull request #1306 from NousResearch/hermes/hermes-2ba57c8a
fix: backfill model on gateway sessions after agent runs
2026-03-14 06:48:32 -07:00
Teknium
95c0bee7f8 Merge pull request #1299 from NousResearch/hermes/hermes-f5fb1d3b
fix: salvage PR #327 voice mode onto current main
2026-03-14 06:45:20 -07:00
teknium1
8602e61fca test: cover gateway session model backfill
Add regression coverage for backfilling NULL gateway session models in SQLite, preserving existing models, and forwarding the resolved agent model through SessionStore updates.
2026-03-14 06:44:14 -07:00
teknium1
71cffbfa4f fix: verify SMTP TLS in send_message_tool
Add regression coverage for the standalone email send path and pass an explicit default SSL context to STARTTLS for certificate verification, matching the gateway email adapter hardening salvaged from PR #994.
2026-03-14 06:31:52 -07:00
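Passing an explicit default SSL context to STARTTLS, as 71cffbfa4f describes, generally takes the shape below. The function name and parameters are illustrative assumptions, not the signature used in send_message_tool.

```python
import smtplib
import ssl


def send_with_verified_tls(host, port, username, password, message):
    """Send an email.message.EmailMessage over STARTTLS with certificate checks."""
    # The default context requires a valid certificate and checks the hostname.
    context = ssl.create_default_context()
    with smtplib.SMTP(host, port, timeout=30) as server:
        server.starttls(context=context)
        server.login(username, password)
        server.send_message(message)
```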
Himess
344adc72a1 fix: update email test mocks to use imap.uid() instead of imap.search/fetch
Tests were still mocking imap.search() and imap.fetch() but the
implementation was changed to use imap.uid("search", ...) and
imap.uid("fetch", ...) for proper UID-based IMAP operations.
2026-03-14 06:29:00 -07:00
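For reference, UID-based search and fetch with the standard library's `imaplib` look roughly like this. The helper name `fetch_unseen` and the `UNSEEN` search criterion are assumptions for the example; the commit only states that `imap.uid("search", ...)` and `imap.uid("fetch", ...)` replaced `imap.search()` and `imap.fetch()`.

```python
def fetch_unseen(imap):
    """Return {uid: raw_message_bytes} for unseen messages, using UID commands.

    `imap` is expected to behave like a logged-in imaplib.IMAP4 connection
    with a mailbox already selected.
    """
    status, data = imap.uid("search", None, "UNSEEN")
    if status != "OK" or not data or not data[0]:
        return {}
    messages = {}
    for uid in data[0].split():
        status, parts = imap.uid("fetch", uid, "(RFC822)")
        # A successful fetch yields a (header, body) tuple as the first part.
        if status == "OK" and parts and isinstance(parts[0], tuple):
            messages[uid] = parts[0][1]
    return messages
```

A test double for this code path only needs a `uid(command, *args)` method, which is why mocks written against `search()` and `fetch()` stopped matching.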
teknium1
7b10881b9e fix: persist clean voice transcripts and /voice off state
- keep CLI voice prefixes API-local while storing the original user text
- persist explicit gateway off state and restore adapter auto-TTS suppression on restart
- add regression coverage for both behaviors
2026-03-14 06:14:22 -07:00
teknium1
523a1b6faf merge: salvage PR #327 voice mode branch
Merge contributor branch feature/voice-mode onto current main for follow-up fixes.
2026-03-14 06:03:07 -07:00
0xbyt4
eb34c0b09a fix: voice pipeline hardening — 7 bug fixes with tests
1. Anthropic + ElevenLabs TTS silence: forward full response to TTS
   callback for non-streaming providers (choices first, then native
   content blocks fallback).

2. Subprocess timeout kill: play_audio_file now kills the process on
   TimeoutExpired instead of leaving zombie processes.

3. Discord disconnect cleanup: leave all voice channels before closing
   the client to prevent leaked state.

4. Audio stream leak: close InputStream if stream.start() fails.

5. Race condition: read/write _on_silence_stop under lock in audio
   callback thread.

6. _vprint force=True: show API error, retry, and truncation messages
   even during streaming TTS.

7. _refresh_level lock: read _voice_recording under _voice_lock.
2026-03-14 14:27:21 +03:00
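Item 2 of eb34c0b09a (kill the player on `TimeoutExpired`) follows a common `subprocess` pattern, sketched here under assumptions: `run_with_timeout` is a hypothetical name and does not reproduce `play_audio_file`.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run a command; kill and reap it if it exceeds the timeout.

    Returns the exit code, or None if the process had to be killed.
    """
    proc = subprocess.Popen(argv)
    try:
        return proc.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # stop the child
        proc.wait()  # reap it so no zombie entry remains
        return None
```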
0xbyt4
7a24168080 fix: add missing choices/Choice to discord mock in test_discord_free_response
The mock's app_commands SimpleNamespace lacked choices and Choice attrs,
causing xdist test ordering failures when this mock loaded before
test_discord_slash_commands.
2026-03-14 14:27:21 +03:00
0xbyt4
cc0a453476 fix: address PR review round 5 — streaming guard, VC auth, history prefix, auto-TTS control
1. Gate _streaming_api_call to chat_completions mode only — Anthropic and
   Codex fall back to _interruptible_api_call. Preserve Anthropic base_url
   across all client rebuild paths (interrupt, fallback, 401 refresh).

2. Discord VC synthetic events now use chat_type="channel" instead of
   defaulting to "dm" — prevents session bleed into DM context.
   Authorization runs before echoing transcript. Sanitize @everyone/@here
   in voice transcripts.

3. CLI voice prefix ("[Voice input...]") is now API-call-local only —
   stripped from returned history so it never persists to session DB or
   resumed sessions.

4. /voice off now disables base adapter auto-TTS via _auto_tts_disabled_chats
   set — voice input no longer triggers TTS when voice mode is off.
2026-03-14 14:27:21 +03:00
0xbyt4
35748a2fb0 fix: address PR review round 4 — remove web UI, fix audio/import/interface issues
Remove web UI gateway (web.py, tests, docs, toolset, env vars, Platform.WEB
enum) per maintainer request — Nous is building their own official chat UI.

Fix 1: Replace sd.wait() with polling pattern in play_audio_file() to prevent
indefinite hang when audio device stalls (consistent with play_beep()).

Fix 2: Use importlib.util.find_spec() for faster_whisper/openai availability
checks instead of module-level imports that trigger heavy native library
loading (CUDA/cuDNN) at import time.

Fix 3: Remove inspect.signature() hack in _send_voice_reply() — add **kwargs
to Telegram send_voice() so all adapters accept metadata uniformly.

Fix 4: Make session loading resilient to removed platform enum values — skip
entries with unknown platforms instead of crashing the entire gateway.
2026-03-14 14:27:21 +03:00
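Fix 2 of 35748a2fb0 relies on `importlib.util.find_spec()`, which locates a module without executing it. A minimal sketch, with the helper name `is_installed` and the two flag names chosen for illustration:

```python
import importlib.util


def is_installed(module_name):
    """Check whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Availability flags computed without loading native libraries.
HAS_FASTER_WHISPER = is_installed("faster_whisper")
HAS_OPENAI = is_installed("openai")
```

The actual import can then be deferred to the first call that needs the library.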
0xbyt4
1ad5e0ed15 feat: add voice channel awareness — inject participant and speaking state into agent context
2026-03-14 14:27:21 +03:00
0xbyt4
49f3f0fc62 fix: add choices/Choice to discord mock for /voice slash command test
2026-03-14 14:27:21 +03:00
0xbyt4
b8f8d3ef9e feat: integrate faster-whisper local STT with three-provider fallback
Merge main's faster-whisper (local, free) with our Groq support into a
unified three-provider STT pipeline: local > groq > openai.

Provider priority ensures free options are tried first. Each provider
has its own transcriber function with model auto-correction, env-
overridable endpoints, and proper error handling.

74 tests cover the full provider matrix, fallback chains, model
correction, config loading, validation edge cases, and dispatch.
2026-03-14 14:27:21 +03:00
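The local > groq > openai priority in b8f8d3ef9e amounts to an ordered fallback chain. The sketch below shows the general idea only; `transcribe_with_fallback`, the `(name, transcriber)` pairs, and the returned dict shape are assumptions, and the per-provider model correction and endpoint overrides mentioned in the commit are left out.

```python
def transcribe_with_fallback(audio_path, providers):
    """Try each (name, transcriber) pair in order and return the first success.

    `providers` is an ordered list such as
    [("local", fn), ("groq", fn), ("openai", fn)], cheapest first.
    """
    errors = []
    for name, transcribe in providers:
        try:
            text = transcribe(audio_path)
        except Exception as exc:  # a failing provider must not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        return {"provider": name, "text": text}
    raise RuntimeError("all STT providers failed: " + "; ".join(errors))
```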
0xbyt4
fa2c825e2f fix: isolate WEB_UI_HOST env var in test and handle empty string
- Patch WEB_UI_HOST in test_web_defaults to avoid env leak
- Handle empty WEB_UI_HOST string in config (fall back to 127.0.0.1)
2026-03-14 14:27:21 +03:00
0xbyt4
5b47b87c42 fix: show only reachable URLs in Web UI startup message
When bound to 127.0.0.1, only show localhost URL instead of listing
unreachable network interfaces. Add hint about WEB_UI_HOST=0.0.0.0
for phone/tablet access. Add VPN/multi-interface and token exposure
tests (11 new tests).
2026-03-14 14:27:21 +03:00
0xbyt4
44abe852fb fix: add macOS Homebrew Opus fallback and fix shutdown dict iteration
- Add Homebrew library path fallback when ctypes.util.find_library fails
  on macOS (Apple Silicon + Intel paths, guarded by platform check)
- Fix RuntimeError in gateway stop() by iterating over dict copy
- Update Opus tests to verify find_library-first + conditional fallback
2026-03-14 14:27:21 +03:00
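The `RuntimeError` fixed in 44abe852fb is the one Python raises when a dict changes size while it is being iterated. Iterating over a snapshot avoids it, as in this sketch; `stop_all` and the adapter objects are placeholders, not the gateway's `stop()` method.

```python
def stop_all(adapters):
    """Stop every adapter, tolerating removals from `adapters` during the loop."""
    stopped = []
    for name, adapter in list(adapters.items()):  # snapshot, so mutation is safe
        adapter.stop()
        adapters.pop(name, None)
        stopped.append(name)
    return stopped
```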
0xbyt4
c797314fcf test: add security and hardening tests for voice mode fixes
- Path traversal sanitization (Path.name strips ../)
- Media endpoint authentication (401 without token, 404 on traversal)
- hmac.compare_digest usage verification (no == for tokens)
- DOMPurify XSS prevention in HTML template
- Default bind 127.0.0.1 (adapter and config)
- /remote-control token hiding in group chats
- Opus find_library instead of hardcoded paths
- Opus decode error logging (no silent swallow)
- Interrupt _vprint force=True on all 6 calls
- Anthropic interrupt handler in both API call paths
- Update test_web_defaults for new 127.0.0.1 default
2026-03-14 14:27:21 +03:00
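Two of the checks listed in c797314fcf, `Path.name` sanitization and `hmac.compare_digest`, can be illustrated in a few lines. Both helper names below are hypothetical, and the sketch says nothing about how the media endpoint itself is wired.

```python
import hmac
from pathlib import Path


def safe_media_path(media_dir, requested_name):
    """Resolve a requested file name inside media_dir, discarding directory parts."""
    name = Path(requested_name).name  # "../../etc/passwd" -> "passwd"
    if name in ("", ".", ".."):
        raise ValueError("invalid file name")
    return Path(media_dir) / name


def token_matches(provided, expected):
    """Compare secrets in constant time instead of using ==."""
    return hmac.compare_digest(provided.encode(), expected.encode())
```

A traversal attempt then resolves to a name inside the media directory that most likely does not exist, which is consistent with the 404 the tests expect.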
0xbyt4
9722bd8be0 fix: 8 voice pipeline bugs with tests proving each fix
1. VoiceReceiver.stop() now acquires _lock before clearing shared state
   to prevent race with _on_packet on the socket reader thread
2. _packet_debug_count moved from class-level to instance-level to avoid
   cross-instance race condition in multi-guild setups
3. play_in_voice_channel uses asyncio.get_running_loop() instead of
   deprecated asyncio.get_event_loop()
4. _send_voice_reply uses uuid for filenames instead of time-based names
   that can collide when two replies happen in the same second
5. Voice timeout now notifies runner via _on_voice_disconnect callback
   so runner cleans up _voice_mode state (prevents orphaned TTS replies)
6. play_in_voice_channel adds PLAYBACK_TIMEOUT (120s) to prevent
   infinite blocking when FFmpeg callback is never called
7. _send_voice_reply moves temp file cleanup to finally block so files
   are always cleaned up even when send_voice/play raises
8. Base adapter auto-TTS wraps play_tts in try/finally with os.remove
   to clean up generated audio files after playback

18 new tests (120 total voice tests)
2026-03-14 14:27:20 +03:00
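Items 4 and 7 of 9722bd8be0 (uuid-based file names and cleanup in a `finally` block) combine naturally. The following is a sketch under assumptions: `send_voice_reply` takes a plain `send` callable here, and the file prefix and `.ogg` suffix are invented for the example.

```python
import os
import tempfile
import uuid


def send_voice_reply(audio_bytes, send):
    """Write audio to a uniquely named temp file, send it, and always clean up."""
    # A random UUID avoids collisions between replies created in the same second.
    path = os.path.join(tempfile.gettempdir(), f"voice_reply_{uuid.uuid4().hex}.ogg")
    try:
        with open(path, "wb") as handle:
            handle.write(audio_bytes)
        send(path)
    finally:
        # Runs on success and when send() raises.
        if os.path.exists(path):
            os.remove(path)
```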
0xbyt4
c925d2ee76 fix: voice pipeline thread safety and error handling bugs
- Add lock protection around VoiceReceiver buffer writes in _on_packet
  to prevent race condition with check_silence on different threads
- Wire _voice_input_callback BEFORE join_voice_channel to avoid
  losing voice input during the join window
- Add try/except around leave_voice_channel to ensure state cleanup
  (voice_mode, callback) even if leave raises an exception
- Guard against empty text after markdown stripping in base.py auto-TTS
- Add 11 tests proving each bug and verifying the fix
2026-03-14 14:27:20 +03:00
0xbyt4
86ddaaee9c fix: extract voice reply logic and add comprehensive tests
- Fix tempfile.mktemp() TOCTOU race in Discord voice input (use NamedTemporaryFile)
- Extract voice reply decision from _handle_message into _should_send_voice_reply()
- Rewrite TestAutoVoiceReply to call real method instead of testing a copy
- Add 59 new tests: VoiceReceiver, VC commands, adapter methods, streaming TTS
2026-03-14 14:27:20 +03:00
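The `tempfile.mktemp()` race mentioned in 86ddaaee9c exists because `mktemp()` only returns a name, leaving a gap before the file is created. `NamedTemporaryFile` creates and opens the file in one step. A minimal sketch, with `write_temp_audio` and the `.wav` suffix as assumptions:

```python
import os
import tempfile


def write_temp_audio(pcm_bytes, suffix=".wav"):
    """Create and open the temp file in one step, then return its path.

    The caller is responsible for deleting the file.
    """
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as handle:
        handle.write(pcm_bytes)
        return handle.name
```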
0xbyt4
fbf47e9ff6 fix: allow voice reply in Discord VC despite skip_double guard
When bot is in a Discord voice channel, both base auto-TTS and Discord
play_tts override skip audio. The skip_double guard was also blocking
the runner's _send_voice_reply, resulting in zero audio output in VC.

Now skip_double is overridden when the bot is actively connected to a
voice channel, allowing play_in_voice_channel to handle TTS.

Add comprehensive test matrix covering all platform x input x mode
combinations with full decision table documentation.
2026-03-14 14:27:20 +03:00
0xbyt4
dcb84a8d30 test: add double TTS prevention tests for voice reply logic
- Update TestAutoVoiceReply to include skip_double logic: voice input
  is handled by base adapter auto-TTS, gateway runner skips to prevent
  duplicate audio
- Add TestDiscordPlayTtsSkip: verifies Discord adapter skips play_tts
  when bot is in a voice channel (VC playback handled by runner)
- Add TestWebPlayTts: verifies Web adapter sends invisible play_audio
  instead of voice bubble
2026-03-14 14:27:20 +03:00