Salvaged PR #1052 onto current main with the contributor commit preserved plus a small follow-up for current-main conflict resolution and safe command quoting.
Cancel any queued media-group flush tasks during Telegram adapter disconnect
and clear the buffered events map so shutdown can't leave a pending album
flush behind. Add a regression test covering disconnect before the debounce
window expires.
When shutil.which('hermes') returns None, _resolve_hermes_bin() now tries
sys.executable -m hermes_cli.main as a fallback. This handles setups where
Hermes is launched via a venv or module invocation and the hermes symlink is
not on PATH for the gateway process.
Fixes#1049
Telegram albums arrive as multiple updates with a shared media_group_id.
Previously each image triggered a separate MessageEvent, causing the agent
to interrupt itself when describing the first image.
- Add 0.8s debounce window for media group items
- Merge attachments into single MessageEvent
- Add regression test for photo album buffering
Salvages the two still-relevant fixes from PR #993 onto current main:
- use a 3-tuple LOCAL delivery key so explicit/local-origin targets are not duplicated
- shut down the previous agent-loop ThreadPoolExecutor when resizing the global pool
Adds regression tests for both behaviors.
Prevent gateway.platforms.discord from crashing at import time when discord.py is unavailable. Python 3.11 eagerly evaluates annotations, so using discord.Interaction and similar annotations caused an AttributeError after the optional import fallback set discord=None. Add postponed annotation evaluation and a regression test covering import without discord installed.
- store gateway PID metadata and validate the live process before trusting gateway.pid
- auto-refresh outdated systemd user units before start/restart so installs pick up --replace fixes
- sweep stray manual gateway processes after service stops
- add regression tests for PID validation and service drift recovery
Add regression coverage for backfilling NULL gateway session models in SQLite, preserving existing models, and forwarding the resolved agent model through SessionStore updates.
Add regression coverage for the standalone email send path and pass an explicit default SSL context to STARTTLS for certificate verification, matching the gateway email adapter hardening salvaged from PR #994.
Tests were still mocking imap.search() and imap.fetch() but the
implementation was changed to use imap.uid("search", ...) and
imap.uid("fetch", ...) for proper UID-based IMAP operations.
- keep CLI voice prefixes API-local while storing the original user text
- persist explicit gateway off state and restore adapter auto-TTS suppression on restart
- add regression coverage for both behaviors
1. Anthropic + ElevenLabs TTS silence: forward full response to TTS
callback for non-streaming providers (choices first, then native
content blocks fallback).
2. Subprocess timeout kill: play_audio_file now kills the process on
TimeoutExpired instead of leaving zombie processes.
3. Discord disconnect cleanup: leave all voice channels before closing
the client to prevent leaked state.
4. Audio stream leak: close InputStream if stream.start() fails.
5. Race condition: read/write _on_silence_stop under lock in audio
callback thread.
6. _vprint force=True: show API error, retry, and truncation messages
even during streaming TTS.
7. _refresh_level lock: read _voice_recording under _voice_lock.
The mock's app_commands SimpleNamespace lacked choices and Choice attrs,
causing xdist test ordering failures when this mock loaded before
test_discord_slash_commands.
1. Gate _streaming_api_call to chat_completions mode only — Anthropic and
Codex fall back to _interruptible_api_call. Preserve Anthropic base_url
across all client rebuild paths (interrupt, fallback, 401 refresh).
2. Discord VC synthetic events now use chat_type="channel" instead of
defaulting to "dm" — prevents session bleed into DM context.
Authorization runs before echoing transcript. Sanitize @everyone/@here
in voice transcripts.
3. CLI voice prefix ("[Voice input...]") is now API-call-local only —
stripped from returned history so it never persists to session DB or
resumed sessions.
4. /voice off now disables base adapter auto-TTS via _auto_tts_disabled_chats
set — voice input no longer triggers TTS when voice mode is off.
Remove web UI gateway (web.py, tests, docs, toolset, env vars, Platform.WEB
enum) per maintainer request — Nous is building their own official chat UI.
Fix 1: Replace sd.wait() with polling pattern in play_audio_file() to prevent
indefinite hang when audio device stalls (consistent with play_beep()).
Fix 2: Use importlib.util.find_spec() for faster_whisper/openai availability
checks instead of module-level imports that trigger heavy native library
loading (CUDA/cuDNN) at import time.
Fix 3: Remove inspect.signature() hack in _send_voice_reply() — add **kwargs
to Telegram send_voice() so all adapters accept metadata uniformly.
Fix 4: Make session loading resilient to removed platform enum values — skip
entries with unknown platforms instead of crashing the entire gateway.
Merge main's faster-whisper (local, free) with our Groq support into a
unified three-provider STT pipeline: local > groq > openai.
Provider priority ensures free options are tried first. Each provider
has its own transcriber function with model auto-correction, env-
overridable endpoints, and proper error handling.
74 tests cover the full provider matrix, fallback chains, model
correction, config loading, validation edge cases, and dispatch.
When bound to 127.0.0.1, only show localhost URL instead of listing
unreachable network interfaces. Add hint about WEB_UI_HOST=0.0.0.0
for phone/tablet access. Add VPN/multi-interface and token exposure
tests (11 new tests).
- Path traversal sanitization (Path.name strips ../)
- Media endpoint authentication (401 without token, 404 on traversal)
- hmac.compare_digest usage verification (no == for tokens)
- DOMPurify XSS prevention in HTML template
- Default bind 127.0.0.1 (adapter and config)
- /remote-control token hiding in group chats
- Opus find_library instead of hardcoded paths
- Opus decode error logging (no silent swallow)
- Interrupt _vprint force=True on all 6 calls
- Anthropic interrupt handler in both API call paths
- Update test_web_defaults for new 127.0.0.1 default
1. VoiceReceiver.stop() now acquires _lock before clearing shared state
to prevent race with _on_packet on the socket reader thread
2. _packet_debug_count moved from class-level to instance-level to avoid
cross-instance race condition in multi-guild setups
3. play_in_voice_channel uses asyncio.get_running_loop() instead of
deprecated asyncio.get_event_loop()
4. _send_voice_reply uses uuid for filenames instead of time-based names
that can collide when two replies happen in the same second
5. Voice timeout now notifies runner via _on_voice_disconnect callback
so runner cleans up _voice_mode state (prevents orphaned TTS replies)
6. play_in_voice_channel adds PLAYBACK_TIMEOUT (120s) to prevent
infinite blocking when FFmpeg callback is never called
7. _send_voice_reply moves temp file cleanup to finally block so files
are always cleaned up even when send_voice/play raises
8. Base adapter auto-TTS wraps play_tts in try/finally with os.remove
to clean up generated audio files after playback
18 new tests (120 total voice tests)
- Add lock protection around VoiceReceiver buffer writes in _on_packet
to prevent race condition with check_silence on different threads
- Wire _voice_input_callback BEFORE join_voice_channel to avoid
losing voice input during the join window
- Add try/except around leave_voice_channel to ensure state cleanup
(voice_mode, callback) even if leave raises an exception
- Guard against empty text after markdown stripping in base.py auto-TTS
- Add 11 tests proving each bug and verifying the fix
When bot is in a Discord voice channel, both base auto-TTS and Discord
play_tts override skip audio. The skip_double guard was also blocking
the runner's _send_voice_reply, resulting in zero audio output in VC.
Now skip_double is overridden when the bot is actively connected to a
voice channel, allowing play_in_voice_channel to handle TTS.
Add comprehensive test matrix covering all platform x input x mode
combinations with full decision table documentation.
- Update TestAutoVoiceReply to include skip_double logic: voice input
is handled by base adapter auto-TTS, gateway runner skips to prevent
duplicate audio
- Add TestDiscordPlayTtsSkip: verifies Discord adapter skips play_tts
when bot is in a voice channel (VC playback handled by runner)
- Add TestWebPlayTts: verifies Web adapter sends invisible play_audio
instead of voice bubble
play_tts base class forwards metadata via **kwargs to send_voice,
but Discord and Slack adapters did not accept extra keyword arguments,
causing TypeError and silent message handling failure.
Also fix test_web_defaults to patch correct env var (WEB_UI_TOKEN).
- Register /voice as Discord slash command with mode choices
- Fix _send_voice_reply to handle adapters that don't accept metadata
parameter (Discord) by inspecting the method signature at runtime
- /voice on: reply with voice when user sends voice messages
- /voice tts: reply with voice to all messages
- /voice off: disable, text-only replies
- /voice status: show current mode
- Per-chat state persisted to gateway_voice_mode.json
- Dedup: skips auto-reply if agent already called text_to_speech tool
- drop_pending_updates=True to ignore stale Telegram messages on restart
- 25 tests covering command handler, reply logic, and edge cases
- prevent raw MEDIA tag leakage outside the gateway pipeline
- make extract_media handle quoted/backticked paths and optional whitespace
- send Telegram media natively with explicit error/warning handling
- add regression tests for Telegram media dispatch and MEDIA parsing
Follow-up on salvaged PR #975.
Bridge quick_commands from config.yaml into load_gateway_config(),
normalize non-dict quick command config at runtime, and add coverage
for GatewayConfig round-trips plus config.yaml bridging. This makes the
GatewayConfig quick-command fix complete for the real user-facing config
path implicated by issue #973.
* feat: improve context compaction handoff summaries
Adapt PR #916 onto current main by replacing the old context summary marker
with a clearer handoff wrapper, updating the summarization prompt for
resume-oriented summaries, and preserving the current call_llm-based
compression path.
* fix: clearer error when docker backend is unavailable
* fix: preserve docker discovery in backend preflight
Follow up on salvaged PR #940 by reusing find_docker() during the new
availability check so non-PATH Docker Desktop installs still work. Add
a regression test covering the resolved executable path.
* test: make gateway async tests xdist-safe
Replace sync test usage of asyncio.get_event_loop().run_until_complete()
with asyncio.run() so tests do not depend on an ambient current event loop.
Also create the email disconnect poll task inside a running loop. This fixes
xdist/CI failures where workers have no current loop in MainThread.
---------
Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
Use asyncio.run in sync tests that were relying on an implicit current event loop. This makes the gateway send-image and Slack connect tests pass reliably under Python 3.11+ and xdist workers.
Add a /reasoning command across gateway adapters so users can
inspect or change reasoning effort without editing config by hand.
Reload reasoning settings from config.yaml before each agent run,
including background tasks, so the next message picks up the new
value consistently.
- Introduced _approval_lock to ensure that approval prompts are handled sequentially, preventing state clobbering from parallel delegation subtasks.
- Updated approval_callback and HermesCLI methods to utilize the lock for managing approval state and deadlines.
- Added tests for the config bridging logic to ensure correct environment variable mapping from config.yaml.
- update managed-server compatibility tests to match the current
ServerManager.tool_parser wiring used by hermes_base_env
- make quick-command CLI assertions accept Rich Text objects, which is how
ANSI-safe output is rendered now
- set HERMES_HOME explicitly in the Discord auto-thread config bridge test
so it loads the intended temporary config file
Validated with the targeted test set and the full pytest suite.