Commit Graph

16 Commits

Author SHA1 Message Date
Teknium
46176c8029 refactor: centralize slash command registry (#1603)
* refactor: centralize slash command registry

Replace 7+ scattered command definition sites with a single
CommandDef registry in hermes_cli/commands.py. All downstream
consumers now derive from this registry:

- CLI process_command() resolves aliases via resolve_command()
- Gateway _known_commands uses GATEWAY_KNOWN_COMMANDS frozenset
- Gateway help text generated by gateway_help_lines()
- Telegram BotCommands generated by telegram_bot_commands()
- Slack subcommand map generated by slack_subcommand_map()

Adding a command or alias is now a one-line change to
COMMAND_REGISTRY instead of touching 6+ files.

Bugfixes included:
- Telegram now registers /rollback, /background (were missing)
- Slack now has /voice, /update, /reload-mcp (were missing)
- Gateway duplicate 'reasoning' dispatch (dead code) removed
- Gateway help text can no longer drift from CLI help

Backwards-compatible: COMMANDS and COMMANDS_BY_CATEGORY dicts are
rebuilt from the registry, so existing imports work unchanged.

* docs: update developer docs for centralized command registry

Update AGENTS.md with full 'Slash Command Registry' and 'Adding a
Slash Command' sections covering CommandDef fields, registry helpers,
and the one-line alias workflow.

Also update:
- CONTRIBUTING.md: commands.py description
- website/docs/reference/slash-commands.md: reference central registry
- docs/plans/centralize-command-registry.md: mark COMPLETED
- plans/checkpoint-rollback.md: reference new pattern
- hermes-agent-dev skill: architecture table

* chore: remove stale plan docs
2026-03-16 23:21:03 -07:00
teknium1
2d57946ee9 test(voice): clarify install guidance and local skips
Add an explicit messaging-extra install hint to the missing PyNaCl/davey error path, cover it with a voice-channel join regression test, and skip the low-level NaCl packet tests when PyNaCl is not installed locally.
2026-03-15 05:24:34 -07:00
0xbyt4
63f0ec96ec test(voice): add comprehensive flow tests for voice channel fixes
Tests cover the actual code paths changed in voice fixes:

_on_packet DAVE passthrough (8 tests):
- Known SSRC + DAVE decrypt success → buffered
- Unknown SSRC + DAVE → skip DAVE, passthrough to Opus
- DAVE "Unencrypted" error → passthrough, not dropped
- DAVE other error → packet dropped
- No DAVE session → direct decode
- Bot's own SSRC → ignored (echo prevention)
- Multiple SSRCs → separate buffers

SSRC auto-mapping (6 tests):
- Single allowed user → auto-mapped
- Multiple allowed users → no auto-map
- No allowlist → sole non-bot member inferred
- Unallowed user → rejected
- Only bot in channel → no map
- Auto-map persists across checks

Buffer lifecycle (4 tests):
- Known SSRC completed utterance
- Short buffer ignored
- Recent audio waits
- Stale unknown buffer discarded

TTS playback (10 tests):
- play_tts calls play_in_voice_channel in VC
- play_tts falls through when not in VC
- play_tts wrong channel no match
- Voice input dedup (runner skips)
- Text + voice_mode combinations
- Error/empty response skipped
- Agent TTS tool dedup

UDP keepalive (2 tests):
- Interval within bounds
- Silence frame actually sent via send_packet
2026-03-15 05:20:17 -07:00
0xbyt4
f1b4d0b280 fix(voice): make play_tts play in VC instead of no-op
play_tts was returning success without playing anything when bot was
in a voice channel. Now it calls play_in_voice_channel directly.

Simplified skip_double dedup: base adapter handles voice input TTS
via play_tts (which now works for VC), runner skips to avoid double.
2026-03-15 05:20:17 -07:00
teknium1
7b10881b9e fix: persist clean voice transcripts and /voice off state
- keep CLI voice prefixes API-local while storing the original user text
- persist explicit gateway off state and restore adapter auto-TTS suppression on restart
- add regression coverage for both behaviors
2026-03-14 06:14:22 -07:00
0xbyt4
eb34c0b09a fix: voice pipeline hardening — 7 bug fixes with tests
1. Anthropic + ElevenLabs TTS silence: forward full response to TTS
   callback for non-streaming providers (choices first, then native
   content blocks fallback).

2. Subprocess timeout kill: play_audio_file now kills the process on
   TimeoutExpired instead of leaving zombie processes.

3. Discord disconnect cleanup: leave all voice channels before closing
   the client to prevent leaked state.

4. Audio stream leak: close InputStream if stream.start() fails.

5. Race condition: read/write _on_silence_stop under lock in audio
   callback thread.

6. _vprint force=True: show API error, retry, and truncation messages
   even during streaming TTS.

7. _refresh_level lock: read _voice_recording under _voice_lock.
2026-03-14 14:27:21 +03:00
0xbyt4
cc0a453476 fix: address PR review round 5 — streaming guard, VC auth, history prefix, auto-TTS control
1. Gate _streaming_api_call to chat_completions mode only — Anthropic and
   Codex fall back to _interruptible_api_call. Preserve Anthropic base_url
   across all client rebuild paths (interrupt, fallback, 401 refresh).

2. Discord VC synthetic events now use chat_type="channel" instead of
   defaulting to "dm" — prevents session bleed into DM context.
   Authorization runs before echoing transcript. Sanitize @everyone/@here
   in voice transcripts.

3. CLI voice prefix ("[Voice input...]") is now API-call-local only —
   stripped from returned history so it never persists to session DB or
   resumed sessions.

4. /voice off now disables base adapter auto-TTS via _auto_tts_disabled_chats
   set — voice input no longer triggers TTS when voice mode is off.
2026-03-14 14:27:21 +03:00
0xbyt4
35748a2fb0 fix: address PR review round 4 — remove web UI, fix audio/import/interface issues
Remove web UI gateway (web.py, tests, docs, toolset, env vars, Platform.WEB
enum) per maintainer request — Nous is building their own official chat UI.

Fix 1: Replace sd.wait() with polling pattern in play_audio_file() to prevent
indefinite hang when audio device stalls (consistent with play_beep()).

Fix 2: Use importlib.util.find_spec() for faster_whisper/openai availability
checks instead of module-level imports that trigger heavy native library
loading (CUDA/cuDNN) at import time.

Fix 3: Remove inspect.signature() hack in _send_voice_reply() — add **kwargs
to Telegram send_voice() so all adapters accept metadata uniformly.

Fix 4: Make session loading resilient to removed platform enum values — skip
entries with unknown platforms instead of crashing the entire gateway.
2026-03-14 14:27:21 +03:00
0xbyt4
1ad5e0ed15 feat: add voice channel awareness — inject participant and speaking state into agent context 2026-03-14 14:27:21 +03:00
0xbyt4
9722bd8be0 fix: 8 voice pipeline bugs with tests proving each fix
1. VoiceReceiver.stop() now acquires _lock before clearing shared state
   to prevent race with _on_packet on the socket reader thread
2. _packet_debug_count moved from class-level to instance-level to avoid
   cross-instance race condition in multi-guild setups
3. play_in_voice_channel uses asyncio.get_running_loop() instead of
   deprecated asyncio.get_event_loop()
4. _send_voice_reply uses uuid for filenames instead of time-based names
   that can collide when two replies happen in the same second
5. Voice timeout now notifies runner via _on_voice_disconnect callback
   so runner cleans up _voice_mode state (prevents orphaned TTS replies)
6. play_in_voice_channel adds PLAYBACK_TIMEOUT (120s) to prevent
   infinite blocking when FFmpeg callback is never called
7. _send_voice_reply moves temp file cleanup to finally block so files
   are always cleaned up even when send_voice/play raises
8. Base adapter auto-TTS wraps play_tts in try/finally with os.remove
   to clean up generated audio files after playback

18 new tests (120 total voice tests)
2026-03-14 14:27:20 +03:00
0xbyt4
c925d2ee76 fix: voice pipeline thread safety and error handling bugs
- Add lock protection around VoiceReceiver buffer writes in _on_packet
  to prevent race condition with check_silence on different threads
- Wire _voice_input_callback BEFORE join_voice_channel to avoid
  losing voice input during the join window
- Add try/except around leave_voice_channel to ensure state cleanup
  (voice_mode, callback) even if leave raises an exception
- Guard against empty text after markdown stripping in base.py auto-TTS
- Add 11 tests proving each bug and verifying the fix
2026-03-14 14:27:20 +03:00
0xbyt4
86ddaaee9c fix: extract voice reply logic and add comprehensive tests
- Fix tempfile.mktemp() TOCTOU race in Discord voice input (use NamedTemporaryFile)
- Extract voice reply decision from _handle_message into _should_send_voice_reply()
- Rewrite TestAutoVoiceReply to call real method instead of testing a copy
- Add 59 new tests: VoiceReceiver, VC commands, adapter methods, streaming TTS
2026-03-14 14:27:20 +03:00
0xbyt4
fbf47e9ff6 fix: allow voice reply in Discord VC despite skip_double guard
When bot is in a Discord voice channel, both base auto-TTS and Discord
play_tts override skip audio. The skip_double guard was also blocking
the runner's _send_voice_reply, resulting in zero audio output in VC.

Now skip_double is overridden when the bot is actively connected to a
voice channel, allowing play_in_voice_channel to handle TTS.

Add comprehensive test matrix covering all platform x input x mode
combinations with full decision table documentation.
2026-03-14 14:27:20 +03:00
0xbyt4
dcb84a8d30 test: add double TTS prevention tests for voice reply logic
- Update TestAutoVoiceReply to include skip_double logic: voice input
  is handled by base adapter auto-TTS, gateway runner skips to prevent
  duplicate audio
- Add TestDiscordPlayTtsSkip: verifies Discord adapter skips play_tts
  when bot is in a voice channel (VC playback handled by runner)
- Add TestWebPlayTts: verifies Web adapter sends invisible play_audio
  instead of voice bubble
2026-03-14 14:27:20 +03:00
0xbyt4
f6cf4ca826 feat: add /voice slash command to Discord + fix cross-platform send_voice
- Register /voice as Discord slash command with mode choices
- Fix _send_voice_reply to handle adapters that don't accept metadata
  parameter (Discord) by inspecting the method signature at runtime
2026-03-14 14:27:20 +03:00
0xbyt4
d80da5ddd8 feat: add /voice command for auto voice reply in Telegram gateway
- /voice on: reply with voice when user sends voice messages
- /voice tts: reply with voice to all messages
- /voice off: disable, text-only replies
- /voice status: show current mode
- Per-chat state persisted to gateway_voice_mode.json
- Dedup: skips auto-reply if agent already called text_to_speech tool
- drop_pending_updates=True to ignore stale Telegram messages on restart
- 25 tests covering command handler, reply logic, and edge cases
2026-03-14 14:27:20 +03:00