feat: add voice conversation support and futuristic UI redesign

- Auto-TTS: voice messages get spoken response (audio first, then text)
- STT: Groq Whisper fallback when VOICE_TOOLS_OPENAI_KEY not set
- Futuristic UI: glassmorphism, centered container, purple theme, glow effects
- Voice bubble: custom waveform player with seek and progress
- Invisible TTS playback via play_tts() method (no audio file in chat)
- Add hermes-web toolset with full tool access
- Register Platform.WEB in toolset/config maps
- Update docs for voice conversation feature
2026-03-11 20:16:57 +03:00
parent db51cfa60e
commit d3e09df01a
5 changed files with 369 additions and 78 deletions
--- a/website/docs/user-guide/messaging/web.md
+++ b/website/docs/user-guide/messaging/web.md
@@ -107,9 +107,11 @@ You'll see output like:

 Bot responses render full GitHub-flavored Markdown with syntax-highlighted code blocks powered by highlight.js.

-### Voice Messages
+### Voice Conversation

-Click the microphone button to record a voice message. The audio is transcribed via Whisper STT and sent to the agent. If voice mode is enabled (`/voice tts`), the bot replies with audio playback in the browser.
+Click the microphone button to record a voice message. The audio is transcribed via Whisper STT (OpenAI, with Groq as the fallback) and sent to the agent. The bot automatically replies with audio playback: the voice plays first, then the text response appears. No extra configuration is needed.
+
+STT priority: `VOICE_TOOLS_OPENAI_KEY` (OpenAI Whisper) > `GROQ_API_KEY` (Groq Whisper). TTS uses Edge TTS (free, no key) by default, or ElevenLabs/OpenAI if configured in `~/.hermes/config.yaml`.

 ### Images & Files