Commit Graph

31 Commits

Author SHA1 Message Date
Teknium
c62cadb73a fix: make display_hermes_home imports lazy to prevent ImportError during hermes update (#3776)
When a user runs 'hermes update', the Python process caches old modules
in sys.modules.  After git pull updates files on disk, lazy imports of
newly-updated modules fail because they try to import display_hermes_home
from the cached (old) hermes_constants which doesn't have the function.

This specifically broke the gateway auto-restart in cmd_update — importing
hermes_cli/gateway.py triggered the top-level 'from hermes_constants
import display_hermes_home' against the cached old module.  The ImportError
was silently caught, so the gateway was never restarted after update.

Users with a running gateway then hit the ImportError on their next
Telegram/Discord message when the stale gateway process lazily loaded
run_agent.py (new version) which also had the top-level import.

Fixes:
- hermes_cli/gateway.py: lazy import at call site (line 940)
- run_agent.py: lazy import at call site (line 6927)
- tools/terminal_tool.py: lazy imports at 3 call sites
- tools/tts_tool.py: static schema string (no module-level call)
- hermes_cli/auth.py: lazy import at call site (line 2024)
- hermes_cli/main.py: reload hermes_constants after git pull in cmd_update

Also fixes 4 pre-existing test failures in test_parse_env_var caused by
NameError on display_hermes_home in terminal_tool.py.
2026-03-29 15:15:17 -07:00
Teknium
9f01244137 fix: replace user-facing hardcoded ~/.hermes paths with display_hermes_home()
Prep for profiles: user-facing messages now use display_hermes_home() so
diagnostic output shows the correct path for each profile.

New helper: display_hermes_home() in hermes_constants.py
12 files swept, ~30 user-facing string replacements.
Includes dynamic TTS schema description.
2026-03-28 23:47:21 -07:00
Teknium
1e924e99b9 refactor: consolidate ~/.hermes directory layout with backward compat (#3610)
New installs get a cleaner structure:
  cache/images/      (was image_cache/)
  cache/audio/       (was audio_cache/)
  cache/documents/   (was document_cache/)
  cache/screenshots/ (was browser_screenshots/)
  platforms/whatsapp/session/ (was whatsapp/session/)
  platforms/matrix/store/    (was matrix/store/)
  platforms/pairing/         (was pairing/)

Existing installs are unaffected -- get_hermes_dir() checks for the
old path first and uses it if present. No migration needed.

Adds get_hermes_dir(new_subpath, old_name) helper to hermes_constants.py
for reuse by any future subsystem.
2026-03-28 15:22:19 -07:00
Teknium
be416cdfa9 fix: guard config.get() against YAML null values to prevent AttributeError (#3377)
dict.get(key, default) returns None — not the default — when the key IS
present but explicitly set to null/~ in YAML.  Calling .lower() on that
raises AttributeError.

Use (config.get(key) or fallback) so both missing keys and explicit nulls
coalesce to the intended default.

Files fixed:
- tools/tts_tool.py — _get_provider()
- tools/web_tools.py — _get_backend()
- tools/mcp_tool.py — MCPServerTask auth config
- trajectory_compressor.py — _detect_provider() and config loading

Co-authored-by: dieutx <dangtc94@gmail.com>
2026-03-27 04:03:00 -07:00
Teknium
cbf195e806 chore: fix 154 f-strings, simplify getattr/URL patterns, remove dead code (#3119)
Three categories of cleanup, all zero-behavioral-change:

1. F-strings without placeholders (154 fixes across 29 files)
   - Converted f'...' to '...' where no {expression} was present
   - Heaviest files: run_agent.py (24), cli.py (20), honcho_integration/cli.py (34)

2. Simplify defensive patterns in run_agent.py
   - Added explicit self._is_anthropic_oauth = False in __init__ (before
     the api_mode branch that conditionally sets it)
   - Replaced 7x getattr(self, '_is_anthropic_oauth', False) with direct
     self._is_anthropic_oauth (attribute always initialized now)
   - Added _is_openrouter_url() and _is_anthropic_url() helper methods
   - Replaced 3 inline 'openrouter' in self._base_url_lower checks

3. Remove dead code in small files
   - hermes_cli/claw.py: removed unused 'total' computation
   - tools/fuzzy_match.py: removed unused strip_indent() function and
     pattern_stripped variable

Full test suite: 6184 passed, 0 failures
E2E PTY: banner clean, tool calls work, zero garbled ANSI
2026-03-25 19:47:58 -07:00
Teknium
77bcaba2d7 refactor: consolidate get_hermes_home() and parse_reasoning_effort() (#3062)
Centralizes two widely-duplicated patterns into hermes_constants.py:

1. get_hermes_home() — Path resolution for ~/.hermes (HERMES_HOME env var)
   - Was copy-pasted inline across 30+ files as:
     Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
   - Now defined once in hermes_constants.py (zero-dependency module)
   - hermes_cli/config.py re-exports it for backward compatibility
   - Removed local wrapper functions in honcho_integration/client.py,
     tools/website_policy.py, tools/tirith_security.py, hermes_cli/uninstall.py

2. parse_reasoning_effort() — Reasoning effort string validation
   - Was copy-pasted in cli.py, gateway/run.py, cron/scheduler.py
   - Same validation logic: check against (xhigh, high, medium, low, minimal, none)
   - Now defined once in hermes_constants.py, called from all 3 locations
   - Warning log for unknown values kept at call sites (context-specific)

31 files changed, net +31 lines (125 insertions, 94 deletions)
Full test suite: 6179 passed, 0 failed
2026-03-25 15:54:28 -07:00
Teknium
8bb1d15da4 chore: remove ~100 unused imports across 55 files (#3016)
Automated cleanup via pyflakes + autoflake with manual review.

Changes:
- Removed unused stdlib imports (os, sys, json, pathlib.Path, etc.)
- Removed unused typing imports (List, Dict, Any, Optional, Tuple, Set, etc.)
- Removed unused internal imports (hermes_cli.auth, hermes_cli.config, etc.)
- Fixed cli.py: removed 8 shadowed banner imports (imported from hermes_cli.banner
  then immediately redefined locally — only build_welcome_banner is actually used)
- Added noqa comments to imports that appear unused but serve a purpose:
  - Re-exports (gateway/session.py SessionResetPolicy, tools/terminal_tool.py
    is_interrupted/_interrupt_event)
  - SDK presence checks in try/except (daytona, fal_client, discord)
  - Test mock targets (auxiliary_client.py Path, mcp_config.py get_hermes_home)

Zero behavioral changes. Full test suite passes (6162/6162, 2 pre-existing
streaming test failures unrelated to this change).
2026-03-25 15:02:03 -07:00
Han
116984feb7 feat(tools): add base_url support to OpenAI TTS provider
Allow users to configure a custom base_url for the OpenAI TTS provider
in ~/.hermes/config.yaml under tts.openai.base_url. Defaults to the
official OpenAI endpoint. Enables use of self-hosted or OpenAI-compatible
TTS services (e.g. http://localhost:8000/v1).

Also adds a TTS configuration example block to cli-config.yaml.example.
2026-03-19 23:55:13 +08:00
Teknium
11f029c311 fix(tts): document NeuTTS provider and align install guidance (#1903)
Co-authored-by: charles-édouard <59705750+ccbbccbb@users.noreply.github.com>
2026-03-18 02:55:30 -07:00
Teknium
d50e0711c2 refactor(tts): replace NeuTTS optional skill with built-in provider + setup flow
Remove the optional skill (redundant now that NeuTTS is a built-in TTS
provider). Replace neutts_cli dependency with a standalone synthesis
helper (tools/neutts_synth.py) that calls the neutts Python API directly
in a subprocess.

Add TTS provider selection to hermes setup:
- 'hermes setup' now prompts for TTS provider after model selection
- 'hermes setup tts' available as standalone section
- Selecting NeuTTS checks for deps and offers to install:
  espeak-ng (system) + neutts[all] (pip)
- ElevenLabs/OpenAI selections prompt for API keys
- Tool status display shows NeuTTS install state

Changes:
- Remove optional-skills/mlops/models/neutts/ (skill + CLI scaffold)
- Add tools/neutts_synth.py (standalone synthesis subprocess helper)
- Move jo.wav/jo.txt to tools/neutts_samples/ (bundled default voice)
- Refactor _generate_neutts() — uses neutts API via subprocess, no
  neutts_cli dependency, config-driven ref_audio/ref_text/model/device
- Add TTS setup to hermes_cli/setup.py (SETUP_SECTIONS, tool status)
- Update config.py defaults (ref_audio, ref_text, model, device)
2026-03-17 02:33:12 -07:00
Teknium
cb0deb5f9d feat: add NeuTTS optional skill + local TTS provider backend
* feat(skills): add bundled neutts optional skill

Add NeuTTS optional skill with CLI scaffold, bootstrap helper, and
sample voice profile. Also fixes skills_hub.py to handle binary
assets (WAV files) during skill installation.

Changes:
- optional-skills/mlops/models/neutts/ — skill + CLI scaffold
- tools/skills_hub.py — binary asset support (read_bytes, write_bytes)
- tests/tools/test_skills_hub.py — regression tests for binary assets

* feat(tts): add NeuTTS as local TTS provider backend

Add NeuTTS as a fourth TTS provider option alongside Edge, ElevenLabs,
and OpenAI. NeuTTS runs fully on-device via neutts_cli — no API key
needed.

Provider behavior:
- Explicit: set tts.provider to 'neutts' in config.yaml
- Fallback: when Edge TTS is unavailable and neutts_cli is installed,
  automatically falls back to NeuTTS instead of failing
- check_tts_requirements() now includes NeuTTS in availability checks

NeuTTS outputs WAV natively. For Telegram voice bubbles, ffmpeg
converts to Opus (same pattern as Edge TTS).

Changes:
- tools/tts_tool.py — _generate_neutts(), _check_neutts_available(),
  provider dispatch, fallback logic, Opus conversion
- hermes_cli/config.py — tts.neutts config defaults

---------

Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>
2026-03-17 02:13:34 -07:00
teknium1
210d5ade1e feat(tools): centralize tool emoji metadata in registry + skin integration
- Add 'emoji' field to ToolEntry and 'get_emoji()' to ToolRegistry
- Add emoji= to all 50+ registry.register() calls across tool files
- Add get_tool_emoji() helper in agent/display.py with 3-tier resolution:
  skin override → registry default → hardcoded fallback
- Replace hardcoded emoji maps in run_agent.py, delegate_tool.py, and
  gateway/run.py with centralized get_tool_emoji() calls
- Add 'tool_emojis' field to SkinConfig so skins can override per-tool
  emojis (e.g. ares skin could use swords instead of wrenches)
- Add 11 tests (5 registry emoji, 6 display/skin integration)
- Update AGENTS.md skin docs table

Based on the approach from PR #1061 by ForgingAlex (emoji centralization
in registry). This salvage fixes several issues from the original:
- Does NOT split the cronjob tool (which would crash on missing schemas)
- Does NOT change image_generate toolset/requires_env/is_async
- Does NOT delete existing tests
- Completes the centralization (gateway/run.py was missed)
- Hooks into the skin system for full customizability
2026-03-15 20:21:21 -07:00
0xbyt4
ddfd6e0c59 fix: resolve 6 voice mode bugs found during audit
- edge_tts NameError: _generate_edge_tts now calls _import_edge_tts()
  instead of referencing bare module name (tts_tool.py)
- TTS thread leak: chat() finally block sends sentinel to text_queue,
  sets stop_event, and joins tts_thread on exception paths (cli.py)
- output_stream leak: moved close() into finally block so audio device
  is released even on exception (tts_tool.py)
- Ctrl+C continuous mode: cancel handler now resets _voice_continuous
  to prevent auto-restart after user cancels recording (cli.py)
- _disable_voice_mode: now calls stop_playback() and sets
  _voice_tts_done so TTS stops when voice mode is turned off (cli.py)
- _show_voice_status: reads record key from config instead of
  hardcoding Ctrl+B (cli.py)
2026-03-14 14:27:20 +03:00
0xbyt4
b859dfab16 fix: address voice mode review feedback
1. Fully lazy imports: sounddevice, numpy, elevenlabs, edge_tts, and
   openai are never imported at module level. Each is imported only when
   the feature is explicitly activated, preventing crashes in headless
   environments (SSH, Docker, WSL, no PortAudio).

2. No core agent loop changes: streaming TTS path extracted from
   _interruptible_api_call() into separate _streaming_api_call() method.
   The original method is restored to its upstream form.

3. Configurable key binding: push-to-talk key changed from Ctrl+R
   (conflicts with readline reverse-search) to Ctrl+B by default.
   Configurable via voice.push_to_talk_key in config.yaml.

4. Environment detection: new detect_audio_environment() function checks
   for SSH, Docker, WSL, and missing audio devices before enabling voice
   mode. Auto-disables with clear warnings in incompatible environments.

5. Graceful degradation: every audio touchpoint (sd.play, sd.InputStream,
   sd.OutputStream) wrapped in try/except with ImportError/OSError
   handling. Failures produce warnings, not crashes.
2026-03-14 14:27:20 +03:00
0xbyt4
46db7aeffd fix: streaming tool call parsing, error handling, and fake HA state mutation
- Fix Gemini streaming tool call merge bug: multiple tool calls with same
  index but different IDs are now parsed as separate calls instead of
  concatenating names (e.g. ha_call_serviceha_call_service)
- Handle partial results in voice mode: show error and stop continuous
  mode when agent returns partial/failed results with empty response
- Fix error display during streaming TTS: error messages are shown in
  full response box even when streaming box was already opened
- Add duplicate sentence filter in TTS: skip near-duplicate sentences
  from LLM repetition
- Fix fake HA server state mutation: turn_on/turn_off/set_temperature
  correctly update entity states; temperature sensor simulates change
  when thermostat is adjusted
2026-03-14 14:27:20 +03:00
0xbyt4
3a1b35ed92 fix: voice mode race conditions, temp file leak, think tag parsing
- Atomic check-and-set for _voice_recording flag with _voice_lock
- Guard _voice_stop_and_transcribe against concurrent invocation
- Remove premature flag clearing from Ctrl+R handler
- Clean up temp WAV files in finally block (_play_via_tempfile)
- Use buffer-level regex for <think> block filtering (handles chunked tags)
- Prevent /voice on prompt accumulation on repeated calls
- Include Groq in STT key error message
2026-03-14 14:26:55 +03:00
0xbyt4
7d4b4e95f1 feat: sync text display with TTS audio playback
Move screen output from stream_callback to display_callback called by
TTS consumer thread. Text now appears sentence-by-sentence in sync with
audio instead of streaming ahead at LLM speed. Removes quiet_mode hack.
2026-03-14 14:26:55 +03:00
0xbyt4
fd4f229eab fix: catch OSError on sounddevice import for CI without PortAudio
sounddevice raises OSError (not ImportError) when the PortAudio C
library is missing. This broke test collection on CI runners that
have the Python package installed but lack the native library.
2026-03-14 14:26:30 +03:00
0xbyt4
179d9e1a22 feat: add streaming sentence-by-sentence TTS via ElevenLabs
Stream audio to speaker as the agent generates tokens instead of
waiting for the full response. First sentence plays within ~1-2s
of agent starting to respond.

- run_agent: add stream_callback to run_conversation/chat, streaming
  path in _interruptible_api_call accumulates chunks into mock
  ChatCompletion while forwarding content deltas to callback
- tts_tool: add stream_tts_to_speaker() with sentence buffering,
  think block filtering, markdown stripping, ElevenLabs pcm_24000
  streaming to sounddevice OutputStream
- cli: wire up streaming TTS pipeline in chat(), detect elevenlabs
  provider + sounddevice availability, skip batch TTS when streaming
  is active, signal stop on interrupt

Falls back to batch TTS for Edge/OpenAI providers or when
elevenlabs/sounddevice are not available. Zero impact on non-voice
mode (callback defaults to None).
2026-03-14 14:26:30 +03:00
0xIbra
437ec17125 fix(cli): respect HERMES_HOME in all remaining hardcoded ~/.hermes paths
Several files resolved paths via Path.home() / ".hermes" or
os.path.expanduser("~/.hermes/..."), bypassing the HERMES_HOME
environment variable. This broke isolation when running multiple
Hermes instances with distinct HERMES_HOME directories.

Replace all hardcoded paths with calls to get_hermes_home() from
hermes_cli.config, consistent with the rest of the codebase.

Files fixed:
- tools/process_registry.py (processes.json)
- gateway/pairing.py (pairing/)
- gateway/sticker_cache.py (sticker_cache.json)
- gateway/channel_directory.py (channel_directory.json, sessions.json)
- gateway/config.py (gateway.json, config.yaml, sessions_dir)
- gateway/mirror.py (sessions/)
- gateway/hooks.py (hooks/)
- gateway/platforms/base.py (image_cache/, audio_cache/, document_cache/)
- gateway/platforms/whatsapp.py (whatsapp/session)
- gateway/delivery.py (cron/output)
- agent/auxiliary_client.py (auth.json)
- agent/prompt_builder.py (SOUL.md)
- cli.py (config.yaml, images/, pastes/, history)
- run_agent.py (logs/)
- tools/environments/base.py (sandboxes/)
- tools/environments/modal.py (modal_snapshots.json)
- tools/environments/singularity.py (singularity_snapshots.json)
- tools/tts_tool.py (audio_cache)
- hermes_cli/status.py (cron/jobs.json, sessions.json)
- hermes_cli/gateway.py (logs/, whatsapp session)
- hermes_cli/main.py (whatsapp/session)

Tests updated to use HERMES_HOME env var instead of patching Path.home().

Closes #892

(cherry picked from commit 78ac1bba43b8b74a934c6172f2c29bb4d03164b9)
2026-03-13 21:32:53 -07:00
aydnOktay
86caa8539c Improve TTS error handling and logging 2026-03-07 16:53:30 +03:00
teknium1
a5ea272936 refactor: streamline API key retrieval in transcription and TTS tools
- Removed fallback to OPENAI_API_KEY in favor of exclusively using VOICE_TOOLS_OPENAI_KEY for improved clarity and consistency.
- Updated environment variable checks to ensure only VOICE_TOOLS_OPENAI_KEY is considered, enhancing error handling and messaging.
2026-02-26 19:56:42 -08:00
teknium1
f64a87209d refactor: enhance session content handling in AIAgent and update TTS output path
- Introduced a new static method `_clean_session_content` in the `AIAgent` class to convert REASONING_SCRATCHPAD tags to <think> blocks and clean up whitespace in session logs.
- Updated the `_save_session_log` method to utilize the cleaned content for assistant messages, ensuring consistency in session logs.
- Changed the default output directory for TTS audio files from `~/voice-memos` to `~/.hermes/audio_cache`, reflecting a more appropriate storage location.
2026-02-25 04:22:03 -08:00
teknium1
54dd1b3038 feat: enhance README and update API client initialization
- Updated the README to include new badges, a detailed description of the Hermes Agent, and a table summarizing its features, improving clarity and presentation for users.
- Modified the API client initialization in `transcription_tools.py` and `tts_tool.py` to include a base URL, ensuring compatibility with the OpenAI API.
2026-02-23 20:59:39 -08:00
Teknium
0858ee2f27 refactor: rename HERMES_OPENAI_API_KEY to VOICE_TOOLS_OPENAI_KEY
- Updated the environment variable name from HERMES_OPENAI_API_KEY to VOICE_TOOLS_OPENAI_KEY across multiple files to avoid interference with OpenRouter.
- Adjusted related error messages and configuration prompts to reflect the new variable name, ensuring consistency throughout the codebase.
2026-02-23 23:21:33 +00:00
teknium1
08ff1c1aa8 More major refactor/tech debt removal! 2026-02-21 20:22:33 -08:00
teknium1
748fd3db88 refactor: enhance error handling with structured logging across multiple modules
- Updated various modules including cli.py, run_agent.py, gateway, and tools to replace silent exception handling with structured logging.
- Improved error messages to provide more context, aiding in debugging and monitoring.
- Ensured consistent logging practices throughout the codebase, enhancing traceability and maintainability.
2026-02-21 03:32:11 -08:00
teknium1
a885d2f240 refactor: implement structured logging across multiple modules
- Introduced logging functionality in cli.py, run_agent.py, scheduler.py, and various tool modules to replace print statements with structured logging.
- Enhanced error handling and informational messages to improve debugging and monitoring capabilities.
- Ensured consistent logging practices across the codebase, facilitating better traceability and maintenance.
2026-02-21 03:11:11 -08:00
teknium1
bdac541d1e Rename OPENAI_API_KEY to HERMES_OPENAI_API_KEY in configuration and codebase for clarity and to avoid conflicts. Update related documentation and error messages to reflect the new key name, ensuring backward compatibility with existing setups. 2026-02-17 03:11:17 -08:00
teknium1
ff9ea6c4b1 Enhance TTS tool to support platform-specific audio formats
- Added detection of the platform from the environment variable to determine the appropriate audio output format.
- Implemented logic to output Opus (.ogg) files for Telegram when using compatible TTS providers, while defaulting to MP3 for others.
2026-02-14 16:13:26 -08:00
teknium1
f5be6177b2 Add Text-to-Speech (TTS) functionality with multiple providers
Add tool previews

Add AGENTS and SOUL.md support

Add Exec Approval
2026-02-12 10:05:08 -08:00