[VOICE] Add edge-tts as Zero-Cost Voice Output Provider

Timmy commented

2026-04-07 21:17:10 +00:00

Owner

Objective

Integrate edge-tts as a lightweight, API-key-free voice output provider for Hermes alerts and voice memos.

Background

Hermes already has a text_to_speech tool. Adding edge-tts gives us a zero-cost fallback that works on Linux without Microsoft Edge or Windows. Useful for: Night Watch alerts, spoken summaries, and accessibility.

Acceptance Criteria

Phase 1 — Integration (2 days)

edge-tts is installed in the Hermes Python environment
New TTS provider edge-tts added to the existing text_to_speech tool
Voice can be selected from available locales (default: en-US-GuyNeural)
Output format supports .mp3 and .ogg for Telegram compatibility
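
A minimal sketch of what the Phase 1 provider could look like, assuming the `edge_tts` package's `Communicate(text, voice).save(path)` coroutine. The helper names `pick_voice` and `synthesize` are hypothetical and are not taken from the Hermes codebase. The `voices` argument is assumed to be shaped like the list of dicts returned by `edge_tts.list_voices()`. edge-tts emits MP3 by default, so `.ogg` output would need a separate conversion step that is not shown here.

```python
from pathlib import Path

DEFAULT_VOICE = "en-US-GuyNeural"  # default voice named in the acceptance criteria


def pick_voice(voices, locale, default=DEFAULT_VOICE):
    """Return the ShortName of the first voice matching `locale`, else the default.

    `voices` is assumed to be a list of dicts with "ShortName" and "Locale" keys.
    """
    for voice in voices:
        if voice.get("Locale") == locale:
            return voice["ShortName"]
    return default


async def synthesize(text, out_path, voice=DEFAULT_VOICE):
    """Write speech for `text` to `out_path` as MP3 via edge-tts (needs network)."""
    import edge_tts  # imported lazily so this module loads without the package

    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(str(out_path))
    return Path(out_path)
```

Running `asyncio.run(synthesize("Runner down", "alert.mp3"))` would be the simplest way to try it by hand; no API key is involved, but the call does reach Microsoft's online service.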

Phase 2 — Trigger Points (3 days)

Night Watch can optionally deliver its summary as a voice memo via edge-tts
Critical alerts (e.g., CI failure, runner down) can be sent as voice messages to Timmy Time
A /voice-summary or similar command generates a spoken daily brief

Phase 3 — Fallback & Sovereignty (1 week)

If edge-tts fails (network unreachable), Hermes falls back to another configured TTS provider
Evaluate local TTS alternatives (fish-speech, F5-TTS) for full offline sovereignty; create follow-up issue if viable
Document in the-nexus/docs/voice-output.md
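
The fallback requirement above can be sketched as an ordered provider chain. This is only an illustration: `speak_with_fallback`, `AllProvidersFailed`, and the `(name, callable)` provider shape are hypothetical and do not describe the existing `text_to_speech` tool.

```python
from typing import Callable, Sequence, Tuple

# A provider is a (name, synth) pair; synth(text, out_path) returns the written path.
Provider = Tuple[str, Callable[[str, str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured TTS provider has failed."""


def speak_with_fallback(text: str, out_path: str,
                        providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order and return (provider_name, path) for the first success."""
    errors = []
    for name, synth in providers:
        try:
            return name, synth(text, out_path)
        except Exception as exc:  # e.g. network unreachable for edge-tts
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Keeping the order in configuration rather than in code would let a fully local engine be placed first later without touching the callers.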

Suggested Implementation Path

pip install edge-tts
Add provider logic in tools/text_to_speech.py
Add voice delivery path in Telegram/notification pipeline

Owner

Bezalel

Linked Epic

#1120

claude self-assigned this 2026-04-08 10:24:58 +00:00

claude referenced this issue from a commit

2026-04-08 10:29:38 +00:00

feat: add edge-tts as zero-cost voice output provider

claude referenced a pull request that will close this issue

2026-04-08 10:29:46 +00:00

[claude] Add edge-tts as zero-cost voice output provider (#1126) #1130

claude commented

2026-04-08 10:30:09 +00:00

Member

PR created: #1130 (https://forge.alexanderwhitestone.com/Timmy_Foundation/the-nexus/pulls/1130)

Implemented edge-tts as a zero-cost voice output provider:

EdgeTTSAdapter added to bin/deepdive_tts.py (provider key: edge-tts, default voice: en-US-GuyNeural, no API key)
EdgeTTS class added to intelligence/deepdive/tts_engine.py; HybridTTS now uses edge-tts as fallback between Piper and ElevenLabs
--voice-memo flag on bin/night_watch.py generates a spoken MP3 of the nightly report
requirements.txt now requires edge-tts>=6.1.9 (a minimum version, not an exact pin)
docs/voice-output.md documents all providers, fallback chain, and Phase 3 fish-speech/F5-TTS evaluation TODO
17 unit tests added (tests/test_edge_tts.py); all are mocked, make zero network calls, and pass
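
For readers unfamiliar with how tests can exercise edge-tts without network access, one possible approach is sketched below. It is not the contents of tests/test_edge_tts.py; `save_memo` and `run_mocked` are hypothetical names, and the sketch swaps a `MagicMock` in for the `edge_tts` module so the real package is never touched.

```python
import asyncio
import sys
from unittest.mock import AsyncMock, MagicMock, patch


async def save_memo(text, out_path, voice="en-US-GuyNeural"):
    """Hypothetical function under test: synthesize `text` to `out_path`."""
    import edge_tts  # resolved from sys.modules, so the patch below intercepts it

    await edge_tts.Communicate(text, voice).save(out_path)
    return out_path


def run_mocked():
    """Call save_memo against a fake edge_tts module; no network is used."""
    fake = MagicMock()
    fake.Communicate.return_value.save = AsyncMock()
    with patch.dict(sys.modules, {"edge_tts": fake}):
        result = asyncio.run(save_memo("hello", "memo.mp3"))
    # Verify the fake was driven the way the real library would be.
    fake.Communicate.assert_called_once_with("hello", "en-US-GuyNeural")
    fake.Communicate.return_value.save.assert_awaited_once_with("memo.mp3")
    return result
```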
