Prevent gateway.platforms.discord from crashing at import time when discord.py is unavailable. Python 3.11 eagerly evaluates annotations, so using discord.Interaction and similar annotations caused an AttributeError after the optional import fallback set discord=None. Add postponed annotation evaluation and a regression test covering import without discord installed.
- store gateway PID metadata and validate the live process before trusting gateway.pid
- auto-refresh outdated systemd user units before start/restart so installs pick up --replace fixes
- sweep stray manual gateway processes after service stops
- add regression tests for PID validation and service drift recovery
Add regression coverage for backfilling NULL gateway session models in SQLite, preserving existing models, and forwarding the resolved agent model through SessionStore updates.
Add regression coverage for the standalone email send path and pass an explicit default SSL context to STARTTLS for certificate verification, matching the gateway email adapter hardening salvaged from PR #994.
Tests were still mocking imap.search() and imap.fetch() but the
implementation was changed to use imap.uid("search", ...) and
imap.uid("fetch", ...) for proper UID-based IMAP operations.
- keep CLI voice prefixes API-local while storing the original user text
- persist explicit gateway off state and restore adapter auto-TTS suppression on restart
- add regression coverage for both behaviors
1. Anthropic + ElevenLabs TTS silence: forward full response to TTS
callback for non-streaming providers (choices first, then native
content blocks fallback).
2. Subprocess timeout kill: play_audio_file now kills the process on
TimeoutExpired instead of leaving zombie processes.
3. Discord disconnect cleanup: leave all voice channels before closing
the client to prevent leaked state.
4. Audio stream leak: close InputStream if stream.start() fails.
5. Race condition: read/write _on_silence_stop under lock in audio
callback thread.
6. _vprint force=True: show API error, retry, and truncation messages
even during streaming TTS.
7. _refresh_level lock: read _voice_recording under _voice_lock.
The mock's app_commands SimpleNamespace lacked choices and Choice attrs,
causing xdist test ordering failures when this mock loaded before
test_discord_slash_commands.
1. Gate _streaming_api_call to chat_completions mode only — Anthropic and
Codex fall back to _interruptible_api_call. Preserve Anthropic base_url
across all client rebuild paths (interrupt, fallback, 401 refresh).
2. Discord VC synthetic events now use chat_type="channel" instead of
defaulting to "dm" — prevents session bleed into DM context.
Authorization runs before echoing transcript. Sanitize @everyone/@here
in voice transcripts.
3. CLI voice prefix ("[Voice input...]") is now API-call-local only —
stripped from returned history so it never persists to session DB or
resumed sessions.
4. /voice off now disables base adapter auto-TTS via _auto_tts_disabled_chats
set — voice input no longer triggers TTS when voice mode is off.
Remove web UI gateway (web.py, tests, docs, toolset, env vars, Platform.WEB
enum) per maintainer request — Nous is building their own official chat UI.
Fix 1: Replace sd.wait() with polling pattern in play_audio_file() to prevent
indefinite hang when audio device stalls (consistent with play_beep()).
Fix 2: Use importlib.util.find_spec() for faster_whisper/openai availability
checks instead of module-level imports that trigger heavy native library
loading (CUDA/cuDNN) at import time.
Fix 3: Remove inspect.signature() hack in _send_voice_reply() — add **kwargs
to Telegram send_voice() so all adapters accept metadata uniformly.
Fix 4: Make session loading resilient to removed platform enum values — skip
entries with unknown platforms instead of crashing the entire gateway.
Merge main's faster-whisper (local, free) with our Groq support into a
unified three-provider STT pipeline: local > groq > openai.
Provider priority ensures free options are tried first. Each provider
has its own transcriber function with model auto-correction, env-
overridable endpoints, and proper error handling.
74 tests cover the full provider matrix, fallback chains, model
correction, config loading, validation edge cases, and dispatch.
When bound to 127.0.0.1, only show localhost URL instead of listing
unreachable network interfaces. Add hint about WEB_UI_HOST=0.0.0.0
for phone/tablet access. Add VPN/multi-interface and token exposure
tests (11 new tests).
- Path traversal sanitization (Path.name strips ../)
- Media endpoint authentication (401 without token, 404 on traversal)
- hmac.compare_digest usage verification (no == for tokens)
- DOMPurify XSS prevention in HTML template
- Default bind 127.0.0.1 (adapter and config)
- /remote-control token hiding in group chats
- Opus find_library instead of hardcoded paths
- Opus decode error logging (no silent swallow)
- Interrupt _vprint force=True on all 6 calls
- Anthropic interrupt handler in both API call paths
- Update test_web_defaults for new 127.0.0.1 default
1. VoiceReceiver.stop() now acquires _lock before clearing shared state
to prevent race with _on_packet on the socket reader thread
2. _packet_debug_count moved from class-level to instance-level to avoid
cross-instance race condition in multi-guild setups
3. play_in_voice_channel uses asyncio.get_running_loop() instead of
deprecated asyncio.get_event_loop()
4. _send_voice_reply uses uuid for filenames instead of time-based names
that can collide when two replies happen in the same second
5. Voice timeout now notifies runner via _on_voice_disconnect callback
so runner cleans up _voice_mode state (prevents orphaned TTS replies)
6. play_in_voice_channel adds PLAYBACK_TIMEOUT (120s) to prevent
infinite blocking when FFmpeg callback is never called
7. _send_voice_reply moves temp file cleanup to finally block so files
are always cleaned up even when send_voice/play raises
8. Base adapter auto-TTS wraps play_tts in try/finally with os.remove
to clean up generated audio files after playback
18 new tests (120 total voice tests)
- Add lock protection around VoiceReceiver buffer writes in _on_packet
to prevent race condition with check_silence on different threads
- Wire _voice_input_callback BEFORE join_voice_channel to avoid
losing voice input during the join window
- Add try/except around leave_voice_channel to ensure state cleanup
(voice_mode, callback) even if leave raises an exception
- Guard against empty text after markdown stripping in base.py auto-TTS
- Add 11 tests proving each bug and verifying the fix
When bot is in a Discord voice channel, both base auto-TTS and Discord
play_tts override skip audio. The skip_double guard was also blocking
the runner's _send_voice_reply, resulting in zero audio output in VC.
Now skip_double is overridden when the bot is actively connected to a
voice channel, allowing play_in_voice_channel to handle TTS.
Add comprehensive test matrix covering all platform x input x mode
combinations with full decision table documentation.
- Update TestAutoVoiceReply to include skip_double logic: voice input
is handled by base adapter auto-TTS, gateway runner skips to prevent
duplicate audio
- Add TestDiscordPlayTtsSkip: verifies Discord adapter skips play_tts
when bot is in a voice channel (VC playback handled by runner)
- Add TestWebPlayTts: verifies Web adapter sends invisible play_audio
instead of voice bubble
play_tts base class forwards metadata via **kwargs to send_voice,
but Discord and Slack adapters did not accept extra keyword arguments,
causing TypeError and silent message handling failure.
Also fix test_web_defaults to patch correct env var (WEB_UI_TOKEN).
- Register /voice as Discord slash command with mode choices
- Fix _send_voice_reply to handle adapters that don't accept metadata
parameter (Discord) by inspecting the method signature at runtime
- /voice on: reply with voice when user sends voice messages
- /voice tts: reply with voice to all messages
- /voice off: disable, text-only replies
- /voice status: show current mode
- Per-chat state persisted to gateway_voice_mode.json
- Dedup: skips auto-reply if agent already called text_to_speech tool
- drop_pending_updates=True to ignore stale Telegram messages on restart
- 25 tests covering command handler, reply logic, and edge cases
- prevent raw MEDIA tag leakage outside the gateway pipeline
- make extract_media handle quoted/backticked paths and optional whitespace
- send Telegram media natively with explicit error/warning handling
- add regression tests for Telegram media dispatch and MEDIA parsing
Follow-up on salvaged PR #975.
Bridge quick_commands from config.yaml into load_gateway_config(),
normalize non-dict quick command config at runtime, and add coverage
for GatewayConfig round-trips plus config.yaml bridging. This makes the
GatewayConfig quick-command fix complete for the real user-facing config
path implicated by issue #973.
* feat: improve context compaction handoff summaries
Adapt PR #916 onto current main by replacing the old context summary marker
with a clearer handoff wrapper, updating the summarization prompt for
resume-oriented summaries, and preserving the current call_llm-based
compression path.
* fix: clearer error when docker backend is unavailable
* fix: preserve docker discovery in backend preflight
Follow up on salvaged PR #940 by reusing find_docker() during the new
availability check so non-PATH Docker Desktop installs still work. Add
a regression test covering the resolved executable path.
* test: make gateway async tests xdist-safe
Replace sync test usage of asyncio.get_event_loop().run_until_complete()
with asyncio.run() so tests do not depend on an ambient current event loop.
Also create the email disconnect poll task inside a running loop. This fixes
xdist/CI failures where workers have no current loop in MainThread.
---------
Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
Use asyncio.run in sync tests that were relying on an implicit current event loop. This makes the gateway send-image and Slack connect tests pass reliably under Python 3.11+ and xdist workers.
Add a /reasoning command across gateway adapters so users can
inspect or change reasoning effort without editing config by hand.
Reload reasoning settings from config.yaml before each agent run,
including background tasks, so the next message picks up the new
value consistently.
- Introduced _approval_lock to ensure that approval prompts are handled sequentially, preventing state clobbering from parallel delegation subtasks.
- Updated approval_callback and HermesCLI methods to utilize the lock for managing approval state and deadlines.
- Added tests for the config bridging logic to ensure correct environment variable mapping from config.yaml.
- update managed-server compatibility tests to match the current
ServerManager.tool_parser wiring used by hermes_base_env
- make quick-command CLI assertions accept Rich Text objects, which is how
ANSI-safe output is rendered now
- set HERMES_HOME explicitly in the Discord auto-thread config bridge test
so it loads the intended temporary config file
Validated with the targeted test set and the full pytest suite.
Tell the agent what it CANNOT do on Slack and Discord — no searching
channel history, no pinning messages, no managing channels/roles.
Prevents the agent from hallucinating capabilities it doesn't have
and promising actions it can't deliver.
Addresses user feedback: agent says 'I'll search your Slack history'
then goes silent because no Slack-specific tools exist.
- Add /thread slash command that creates a Discord thread and starts a
new Hermes session in it. The starter message (if provided) becomes
the first user input in the new session.
- Add discord.auto_thread config option (DISCORD_AUTO_THREAD env var):
when enabled, every message in a text channel automatically creates
a thread, allowing parallel isolated sessions.
- Fix Discord media method signatures to accept metadata kwarg
(send_voice, send_image_file, send_image) — prevents TypeError
when the base adapter passes platform metadata.
- Fix test mock isolation: add app_commands and ForumChannel to
discord mocks so tests pass in full-suite runs.
Based on PRs #866 and #1109 by insecurejezza, modified per review:
removed /channel command (unsafe), added auto_thread feature,
made /thread dispatch new sessions.
Co-authored-by: insecurejezza <insecurejezza@users.noreply.github.com>
Previously, when no watch_domains or watch_entities were configured,
ALL state_changed events passed through to the agent, causing users
to be flooded with notifications for every HA entity change.
Now events are dropped by default unless the user explicitly configures:
- watch_domains: list of domains to monitor (e.g. climate, light)
- watch_entities: list of specific entity IDs to monitor
- watch_all: true (new option — opt-in to receive all events)
A warning is logged at connect time if no filters are configured,
guiding users to set up their HA platform config.
All 49 gateway HA tests + 52 HA tool tests pass.
The old message referenced 'hermes setup' which doesn't handle
skill-specific env vars. Updated to direct users to load the skill
in the local CLI (which triggers the secure prompt) or add the key
to ~/.hermes/.env manually.
When a skill declares required_environment_variables in its YAML
frontmatter, missing env vars trigger a secure TUI prompt (identical
to the sudo password widget) when the skill is loaded. Secrets flow
directly to ~/.hermes/.env, never entering LLM context.
Key changes:
- New required_environment_variables frontmatter field for skills
- Secure TUI widget (masked input, 120s timeout)
- Gateway safety: messaging platforms show local setup guidance
- Legacy prerequisites.env_vars normalized into new format
- Remote backend handling: conservative setup_needed=True
- Env var name validation, file permissions hardened to 0o600
- Redact patterns extended for secret-related JSON fields
- 12 existing skills updated with prerequisites declarations
- ~48 new tests covering skip, timeout, gateway, remote backends
- Dynamic panel widget sizing (fixes hardcoded width from original PR)
Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main
with conflict resolution.
Fixes#688
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>