Authored by satelerd. Adds native WhatsApp media sending for images, videos,
and documents via MEDIA: tags. Also resolves conflicts with the edit_message
feature, fixes the Telegram hint (only advertise supported media types), and
cleans up imports.
Builds on PR #288's edit_message() abstraction with per-platform implementations:
- Telegram: edit_message_text() with MarkdownV2 + plain text fallback
- Discord: channel.fetch_message() + msg.edit() with length capping
- Slack: chat_update() via slack_bolt client
Also fixes the fallback regression in send_progress_messages() where
platforms that don't support editing would receive duplicated accumulated
tool lines. Now uses a can_edit flag — after the first failed edit, falls
back to sending individual lines (matching pre-PR behavior).
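The can_edit fallback described above can be sketched roughly as follows. This
is a simplified, synchronous illustration and not the code in run.py: the
ProgressReporter class and the send/edit callables are hypothetical stand-ins
for the adapter methods, and edit is assumed to return False when the platform
cannot edit.

```python
from typing import Callable, List, Optional


class ProgressReporter:
    """Hypothetical helper: one growing progress message, with a fallback."""

    def __init__(self,
                 send: Callable[[str], Optional[str]],
                 edit: Callable[[str, str], bool]) -> None:
        self.send = send              # returns a message id, or None
        self.edit = edit              # returns True if the edit succeeded
        self.lines: List[str] = []
        self.message_id: Optional[str] = None
        self.can_edit = True

    def add_tool_line(self, line: str) -> None:
        self.lines.append(line)
        if self.message_id is None or not self.can_edit:
            # First tool line, or editing is unavailable: send only the new
            # line so accumulated lines are never duplicated.
            msg_id = self.send(line)
            if self.message_id is None:
                self.message_id = msg_id
                if msg_id is None:
                    self.can_edit = False
            return
        if not self.edit(self.message_id, "\n".join(self.lines)):
            # After the first failed edit, stop trying and send line by line.
            self.can_edit = False
            self.send(line)
```

On a platform without editing support this yields one message per tool line,
which is the pre-PR behavior the fix restores.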
Instead of sending a separate WhatsApp message for each tool call during
agent execution (N+1 messages), the first tool sends a new message and
subsequent tools edit it to append their line. Result: 1 growing progress
message + 1 final response = 2 messages instead of N+1.
Changes:
- bridge.js: Add POST /edit endpoint using Baileys message editing
- base.py: Add optional edit_message() to BasePlatformAdapter (no-op
default, so platforms without editing support work unchanged)
- whatsapp.py: Implement edit_message() calling bridge /edit
- run.py: Rewrite send_progress_messages() to accumulate tool lines and
edit the progress message. Falls back to sending a new message if
edit fails (graceful degradation).
Before (5 tools = 6 messages):
⚕ Hermes Agent ─── 🔍 web_search... "query"
⚕ Hermes Agent ─── 📄 web_extract... "url"
⚕ Hermes Agent ─── 💻 terminal... "pip install"
⚕ Hermes Agent ─── ✍️ write_file... "app.py"
⚕ Hermes Agent ─── 💻 terminal... "python app.py"
⚕ Hermes Agent ─── Done! The server is running...
After (5 tools = 2 messages):
⚕ Hermes Agent ───
🔍 web_search... "query"
📄 web_extract... "url"
💻 terminal... "pip install"
✍️ write_file... "app.py"
💻 terminal... "python app.py"
⚕ Hermes Agent ─── Done! The server is running...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a /update command to Telegram, Discord, and other gateway platforms
that runs `hermes update` to pull the latest code, update dependencies,
sync skills, and restart the gateway.
Implementation:
- Spawns `hermes update` in a separate systemd scope (systemd-run --user
--scope) so the process survives the gateway restart that hermes update
triggers at the end. Falls back to nohup if systemd-run is unavailable.
- Writes a marker file (.update_pending.json) with the originating
platform and chat_id before spawning the update.
- On gateway startup, _send_update_notification() checks for the marker,
reads the captured update output, sends the results back to the user,
and cleans up.
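A minimal sketch of the marker-file handoff, assuming the marker is a small
JSON object holding the platform and chat_id. The function names and the home
directory argument are illustrative, not the gateway's actual helpers.

```python
import json
from pathlib import Path
from typing import Optional

MARKER_NAME = ".update_pending.json"


def write_update_marker(home: Path, platform: str, chat_id: str) -> Path:
    """Record who asked for the update before the gateway restarts."""
    marker = home / MARKER_NAME
    marker.write_text(json.dumps({"platform": platform, "chat_id": chat_id}))
    return marker


def consume_update_marker(home: Path) -> Optional[dict]:
    """On startup: return the pending request (if any) and remove the marker."""
    marker = home / MARKER_NAME
    if not marker.exists():
        return None
    try:
        return json.loads(marker.read_text())
    except (OSError, json.JSONDecodeError):
        return None  # unreadable marker: skip the notification
    finally:
        marker.unlink(missing_ok=True)  # always clean up, even on bad JSON
```

Removing the marker in the finally clause keeps a corrupt file from triggering
a notification attempt on every subsequent startup.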
Also:
- Registers /update as a Discord slash command
- Updates README.md, docs/messaging.md, docs/slash-commands.md
- Adds 18 tests covering handler, notification, and edge cases
Authored by FarukEst. Fixes #392.
1. Initialize data={} before health-check loop to prevent NameError when
resp.json() raises after http_ready is set to True.
2. Extract _close_bridge_log() helper and call on all return False paths
to prevent file descriptor leaks on failed connection attempts.
Refactors disconnect() to reuse the same helper.
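The first fix can be illustrated with a reduced polling loop. This is a sketch
under assumptions: fetch stands in for the HTTP request to the bridge, and
poll_bridge_health is a hypothetical name, not the adapter's method.

```python
from typing import Any, Callable, Tuple


def poll_bridge_health(fetch: Callable[[], Any],
                       attempts: int = 3) -> Tuple[bool, str]:
    # Initialize before the loop: if resp.json() raises after http_ready is
    # set, the code below the loop can still read `data` safely.
    data: dict = {}
    http_ready = False
    for _ in range(attempts):
        try:
            resp = fetch()
            http_ready = True
            data = resp.json()   # may raise on a non-JSON body
            break
        except Exception:
            continue
    return http_ready, data.get("status", "unknown")
```

Without the `data = {}` line, a response that answers HTTP but fails to parse
would leave `data` unbound and the final lookup would raise NameError.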
Authored by 0xbyt4.
The italic regex [^*]+ matched across newlines, corrupting bullet lists
using * markers (e.g. '* Item one\n* Item two' became italic garbage).
Fixed by adding \n to the negated character class: [^*\n]+.
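The difference between the two character classes is easy to reproduce in
isolation. The replacement template below (underscores) is only for
illustration; the adapter's real substitution may differ.

```python
import re

OLD_ITALIC = re.compile(r"\*([^*]+)\*")     # [^*] also matches "\n"
NEW_ITALIC = re.compile(r"\*([^*\n]+)\*")   # a match can no longer span lines

bullets = "* Item one\n* Item two"

# Old pattern pairs the two bullet markers as if they were italic delimiters.
old_result = OLD_ITALIC.sub(r"_\1_", bullets)

# New pattern leaves the list alone but still converts single-line italics.
new_result = NEW_ITALIC.sub(r"_\1_", bullets)
inline_result = NEW_ITALIC.sub(r"_\1_", "an *emphasized* word")
```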
Authored by 0xbyt4.
Two fixes:
- extract_images(): only remove extracted image tags, not all markdown image
tags. Previously, tags pointing at non-image files (e.g. PDFs) were silently
dropped when real images were also present.
- truncate_message(): walk chunk_body not full_chunk when tracking code block
state, so the reopened fence prefix doesn't toggle in_code off and leave
continuation chunks with unclosed code blocks.
The gateway health check broke out of the polling loop as soon as
the bridge HTTP server returned 200, regardless of the actual
WhatsApp connection status. This meant 'Bridge ready (status:
disconnected)' was printed and the gateway moved on, even when
WhatsApp never connected.
Additionally, bridge stdout/stderr were piped to DEVNULL, so if the
session had expired and the bridge needed a QR re-scan, the user had
no way to see that. The 'Scan QR code if prompted (check bridge
output)' message was misleading since there was no output to check.
Changes:
- Health check now has two phases: wait for HTTP (15s), then wait
for status:connected (15s more). Total 30s budget.
- Bridge output routes to ~/.hermes/whatsapp/bridge.log instead of
DEVNULL, so QR codes, errors, and reconnection messages are preserved.
- Clear warnings with actionable steps if connection fails after 30s
(check bridge.log, re-pair with hermes whatsapp).
- Removed misleading 'Scan QR code' message.
- Log file handle properly cleaned up on disconnect.
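The two-phase check can be sketched as below. It assumes a probe callable that
returns the parsed health payload, or None while the HTTP server is not yet
answering; wait_for_bridge and probe are hypothetical names, and the 15s
defaults simply mirror the budget described above.

```python
import time
from typing import Callable, Optional


def wait_for_bridge(probe: Callable[[], Optional[dict]],
                    http_timeout: float = 15.0,
                    connect_timeout: float = 15.0,
                    poll_interval: float = 1.0) -> bool:
    """Phase 1: wait for HTTP. Phase 2: wait for status == 'connected'."""
    data: dict = {}
    http_ready = False
    deadline = time.monotonic() + http_timeout
    while time.monotonic() < deadline:
        result = probe()
        if result is not None:
            http_ready, data = True, result
            break
        time.sleep(poll_interval)
    if not http_ready:
        return False
    if data.get("status") == "connected":
        return True

    # HTTP is up but WhatsApp is not connected yet: keep polling.
    deadline = time.monotonic() + connect_timeout
    while time.monotonic() < deadline:
        result = probe()
        if result is not None and result.get("status") == "connected":
            return True
        time.sleep(poll_interval)
    return False
```

A False return is where the caller would print the actionable warning (check
bridge.log, re-pair) instead of reporting the bridge as ready.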
Fixes #365
Improvements to the HA integration merged from PR #184:
- Add ha_list_services tool: discovers available services (actions) per
domain with descriptions and parameter fields. Tells the model what
it can do with each device type (e.g. light.turn_on accepts brightness,
color_name, transition). Closes the gap where the model had to guess
available actions.
- Add HA to hermes tools config: users can enable/disable the homeassistant
toolset and configure HASS_TOKEN + HASS_URL through 'hermes tools' setup
flow instead of manually editing .env.
- Fix should-fix items from code review:
  - Remove sys.path.insert hack from gateway adapter
  - Replace all print() calls with proper logger (info/warning/error)
  - Move env var reads from import-time to handler-time via _get_config()
  - Reuse a dedicated REST session in gateway send()
- Update ha_call_service description to reference ha_list_services for
action discovery.
- Update tests for new ha_list_services tool in toolset resolution.
The ImportError fallback set ContextTypes = Any, but then
ContextTypes.DEFAULT_TYPE was used as a type annotation at class
definition time — Any doesn't have .DEFAULT_TYPE, causing AttributeError.
Fix: create a _MockContextTypes class with DEFAULT_TYPE = Any.
Also stub CommandHandler, TelegramMessageHandler, filters, ParseMode,
and ChatType to prevent potential NameErrors.
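The failure mode and the fix can be shown with a reduced version of the
fallback. Only the ContextTypes stub is sketched here; the Handler class is a
hypothetical stand-in for the adapter.

```python
from typing import Any

try:
    from telegram.ext import ContextTypes
except ImportError:
    # `ContextTypes = Any` would break below: annotations are evaluated when
    # the class body runs, and Any has no DEFAULT_TYPE attribute.
    class _MockContextTypes:
        DEFAULT_TYPE = Any

    ContextTypes = _MockContextTypes


class Handler:
    # This annotation is evaluated at class definition time, so it must
    # resolve whether or not python-telegram-bot is installed.
    async def on_message(self, update: Any,
                         context: ContextTypes.DEFAULT_TYPE) -> None:
        return None
```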
Fixes #304.
Updated the README and messaging documentation to clarify the two modes for WhatsApp integration: 'bot' mode (recommended) and 'self-chat' mode. Improved setup instructions to guide users through the configuration process, including allowlist management and dependency installation. Adjusted CLI commands to reflect these changes and ensure a smoother user experience. Additionally, modified the WhatsApp bridge to support the new mode functionality.
Add a /send-media endpoint to the WhatsApp bridge and corresponding
adapter methods so the agent can send files as native WhatsApp
attachments instead of plain-text URLs/paths.
- bridge.js: new POST /send-media endpoint using Baileys' native
image/video/document/audio message types with MIME detection
- base.py: add send_video(), send_document(), send_image_file()
with text fallbacks; route MEDIA: tags by file extension instead
of always treating them as voice messages
- whatsapp.py: implement all media methods via a shared
_send_media_to_bridge() helper; override send_image() to download
URLs to local cache and send as native photos
- prompt_builder.py: update WhatsApp and Telegram platform hints so
the agent knows it can use MEDIA:/path tags to send native media
Fixes #163
- Add chat_topic field to SessionSource dataclass
- Update to_dict/from_dict for serialization support
- Add chat_topic parameter to build_source helper
- Extract channel.topic in Discord adapter for messages and slash commands
- Display Channel Topic in system prompt when available
- Normalize empty topics to None
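A reduced sketch of the chat_topic plumbing. The real SessionSource has more
fields than shown; platform and chat_id are included here only so the example
is self-contained, and normalizing in __post_init__ is an assumption about
where the empty-to-None step lives.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SessionSource:
    platform: str
    chat_id: str
    chat_topic: Optional[str] = None

    def __post_init__(self) -> None:
        # Normalize empty or whitespace-only topics to None.
        if self.chat_topic is not None and not self.chat_topic.strip():
            self.chat_topic = None

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "SessionSource":
        # .get() keeps older serialized sessions (no chat_topic key) loadable.
        return cls(platform=data["platform"],
                   chat_id=data["chat_id"],
                   chat_topic=data.get("chat_topic"))
```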
os.setsid, os.killpg, and os.getpgid do not exist on Windows and raise
AttributeError on import or first call. This breaks the terminal tool,
code execution sandbox, process registry, and WhatsApp bridge on Windows.
Added _IS_WINDOWS platform guard in all four affected files, following
the pattern documented in CONTRIBUTING.md. On Windows, preexec_fn is
set to None and process termination falls back to proc.terminate() /
proc.kill() instead of process group signals.
Files changed:
- tools/environments/local.py (3 call sites)
- tools/process_registry.py (2 call sites)
- tools/code_execution_tool.py (3 call sites)
- gateway/platforms/whatsapp.py (3 call sites)
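The guard pattern looks roughly like this. It is a sketch, not a copy of any
of the four files: spawn and terminate are hypothetical helper names, and the
SIGTERM-then-kill sequence is one reasonable choice among several.

```python
import os
import platform
import signal
import subprocess
from typing import List

_IS_WINDOWS = platform.system() == "Windows"


def spawn(cmd: List[str]) -> subprocess.Popen:
    # os.setsid is only referenced on POSIX, so Windows never touches it.
    return subprocess.Popen(cmd, preexec_fn=None if _IS_WINDOWS else os.setsid)


def terminate(proc: subprocess.Popen, timeout: float = 5.0) -> None:
    if proc.poll() is not None:
        return  # already exited
    if _IS_WINDOWS:
        proc.terminate()
    else:
        try:
            # Signal the whole process group created by os.setsid.
            os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
        except ProcessLookupError:
            pass  # the process exited between poll() and killpg()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
```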
The italic regex \*([^*]+)\* used [^*] which matches newlines, causing
bullet lists with * markers to be incorrectly converted to italic text.
Changed to [^*\n]+ to prevent cross-line matching.
Adds 43 tests for _escape_mdv2 and format_message covering code blocks,
bold/italic, headers, links, mixed formatting, and the regression case.
- extract_images: only remove extracted image tags from content, preserve
non-image markdown links (e.g. PDFs) that were previously silently lost
- truncate_message: walk only chunk_body (not prepended prefix) so the
reopened code fence does not toggle in_code off, leaving continuation
chunks with unclosed code blocks
- Add 49 unit tests covering MessageEvent command parsing, extract_images,
extract_media, truncate_message code block handling, and _get_human_delay
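The extract_images fix amounts to deciding per tag whether to strip it. A
minimal sketch, assuming image-ness is judged by file extension and covering
only markdown tags; the real function's detection rules and return shape may
differ.

```python
import re
from typing import List, Tuple

_IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif", ".webp")
_MD_IMAGE_TAG = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def extract_images(content: str) -> Tuple[List[Tuple[str, str]], str]:
    """Return ([(url, alt), ...], content with only those tags removed)."""
    images: List[Tuple[str, str]] = []

    def _replace(match: "re.Match[str]") -> str:
        alt, url = match.group(1), match.group(2)
        if url.lower().split("?")[0].endswith(_IMAGE_EXTS):
            images.append((url, alt))
            return ""            # extracted: remove the tag from the text
        return match.group(0)    # not an image: leave the tag untouched

    cleaned = _MD_IMAGE_TAG.sub(_replace, content)
    return images, cleaned.strip()
```

Returning the original match for non-image targets is what keeps a PDF tag in
the text when a real image appears in the same message.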
- Auto-authorize HA events in gateway (system-generated, not user messages)
- Guard _read_events against None/closed WebSocket after failed reconnect
- Use UUID for send() message_id instead of polluting WS sequence counter
- entity_id parameter now takes precedence over data["entity_id"]
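The last two items are small enough to sketch together. Both function names
are hypothetical; the point is the precedence rule and that message IDs no
longer come from the WebSocket sequence counter.

```python
import uuid
from typing import Any, Dict, Optional


def build_service_payload(entity_id: Optional[str] = None,
                          data: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    payload = dict(data or {})          # copy so the caller's dict is untouched
    if entity_id:
        # The explicit parameter wins over any data["entity_id"].
        payload["entity_id"] = entity_id
    return payload


def new_message_id() -> str:
    # Independent of the WS sequence counter, so send() cannot disturb it.
    return uuid.uuid4().hex
```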
- Add ha_list_entities, ha_get_state, ha_call_service tools via REST API
- Add WebSocket gateway adapter for real-time state_changed event monitoring
- Support domain/entity filtering, cooldown, and auto-reconnect with backoff
- Use REST API for outbound notifications to avoid WS race condition
- Gate tool availability on HASS_TOKEN env var
- Add 82 unit tests covering real logic (filtering, payload building, event pipeline)
- Sanitize filenames in cache_document_from_bytes to prevent path traversal (strip directory components, null bytes, resolve check)
- Reject documents with None file_size instead of silently allowing download
- Cap text file injection at 100 KB to prevent oversized prompt payloads
- Sanitize display_name in run.py context notes to block prompt injection via filenames
- Add 35 unit tests covering document cache utilities and Telegram document handling
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
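The filename sanitization in the hardening commit above (strip directory
components, strip null bytes, resolve check) can be sketched as follows.
safe_cache_path and the "document" fallback name are assumptions made for the
example, not the signature of cache_document_from_bytes.

```python
from pathlib import Path


def safe_cache_path(cache_dir: Path, filename: str) -> Path:
    # Drop null bytes and treat backslashes as separators so Windows-style
    # traversal ("..\\..\\x") is handled the same way as "../../x".
    cleaned = filename.replace("\x00", "").replace("\\", "/")
    name = Path(cleaned).name           # strips all directory components
    if not name or name in (".", ".."):
        name = "document"               # hypothetical fallback name

    cache_root = cache_dir.resolve()
    candidate = (cache_root / name).resolve()
    try:
        # Resolve check: the final path must still live under the cache dir.
        candidate.relative_to(cache_root)
    except ValueError:
        raise ValueError(f"unsafe filename: {filename!r}")
    return candidate
```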
Download, cache, and enrich document files sent via Telegram. Supports
.pdf, .md, .txt, .docx, .xlsx, .pptx with size validation, unsupported
type rejection, text content injection for .md/.txt, and hourly cache
cleanup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Updated the command name from `/set-home` to `/sethome` in the GatewayRunner class for consistency.
- Added a new slash command `/sethome` in the Discord adapter to set the home channel.
- Registered the `/sethome` command in the Telegram adapter to align with the updated naming convention.
- Added functionality to load values from config.yaml into the environment, allowing os.getenv() to access them.
- Ensured that existing environment variables take precedence over config values.
- Updated DiscordAdapter to resolve usernames in DISCORD_ALLOWED_USERS to numeric IDs, improving user authorization checks.
- Enhanced event handling to provide clearer logging and ensure proper synchronization of slash commands.
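The config-to-environment precedence rule in this commit can be sketched with
setdefault. This assumes an already-parsed, flat mapping of config values; the
function name is hypothetical and YAML loading is left out.

```python
import os
from typing import Mapping, MutableMapping


def load_config_into_env(config: Mapping[str, object],
                         env: MutableMapping[str, str] = os.environ) -> None:
    for key, value in config.items():
        if value is None:
            continue
        # setdefault only writes when the key is absent, so variables already
        # set in the environment take precedence over config values.
        env.setdefault(str(key), str(value))
```

After this runs, code that calls os.getenv() sees config values without
needing to know where they came from.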
- Introduced new methods in run_agent.py for building API keyword arguments and normalizing assistant messages from API responses.
- Added functionality for compressing conversation context and managing session state in SQLite.
- Improved tool call execution handling, including enhanced logging and error management.
- Updated path handling in multiple platform files to utilize pathlib for better compatibility and readability.
- Updated various modules including cli.py, run_agent.py, gateway, and tools to replace silent exception handling with structured logging.
- Improved error messages to provide more context, aiding in debugging and monitoring.
- Ensured consistent logging practices throughout the codebase, enhancing traceability and maintainability.
- Introduced the `/new` command to start a new conversation, resetting the history.
- Updated command handling in the CLI and various platform adapters (Discord, Slack, Telegram) to support the new command.
- Added help command functionality to list available commands, improving user guidance.
- Enhanced command mapping for better integration across platforms, ensuring consistent command behavior.
- Updated the vision tool to accept both HTTP/HTTPS URLs and local file paths for image analysis.
- Implemented caching of user-uploaded images in local directories to ensure reliable access for the vision tool, addressing issues with ephemeral URLs.
- Enhanced platform adapters (Discord, Telegram, WhatsApp) to download and cache images, allowing for immediate analysis and enriched message context.
- Added a new method to auto-analyze images attached by users, enriching the conversation with detailed descriptions.
- Improved documentation for image handling processes and updated related functions for clarity and efficiency.
- Updated the image generation function description to clarify usage with markdown.
- Added `send_image` method to `BasePlatformAdapter` for native image sending across platforms.
- Implemented `send_image` in `DiscordAdapter` and `TelegramAdapter` to handle image attachments directly.
- Introduced `extract_images` method to extract image URLs from markdown and HTML, improving content processing.
- Enhanced message handling to support sending images as attachments while maintaining text content.
- Added logic to clear the adapter's interrupt event to prevent infinite loops during message processing.
- Updated the get_pending_message method to pop messages from the pending queue, ensuring proper message handling.
- Introduced a monitoring mechanism in GatewayRunner to detect incoming messages while an agent is active, allowing for graceful interruption and processing of new messages.
- Enhanced BasePlatformAdapter to manage active sessions and pending messages, ensuring that new messages can interrupt ongoing tasks effectively.
- Improved the handling of pending messages by checking for interrupts and processing them in the correct order, enhancing user experience during message interactions.
- Updated the cleanup process for active tasks to ensure proper resource management after interruptions.
- Updated the AIAgent class to extract the first user message for trajectory formatting, improving the accuracy of user queries in the trajectory format.
- Enhanced the GatewayRunner to convert transcript history into the agent format, ensuring proper handling of message roles and content.
- Adjusted the typing indicator refresh rate to every 2 seconds for better responsiveness.
- Improved error handling in the message sending process for the Telegram adapter, implementing a fallback mechanism for Markdown parsing failures, and logging send failures for better debugging.
- Adjusted the `_keep_typing` method to refresh the typing indicator every 2 seconds instead of 4, improving responsiveness after progress messages.
- Updated the `GatewayRunner` to restore the typing indicator after sending progress messages, enhancing user experience during message processing.
- Added a new private method `_keep_typing` to send a typing indicator continuously while processing messages, refreshing every 4 seconds to comply with Telegram/Discord limitations.
- Updated the `handle_message` method to initiate the typing indicator at the start of message processing and ensure it stops once processing is complete, improving user experience during message handling.
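The typing-indicator loop from these commits can be sketched with asyncio. It
is a simplified illustration: send_typing and process are hypothetical
callables standing in for the adapter call and the agent run, and the interval
is a parameter because the commits above use both 4 and 2 seconds.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def keep_typing(send_typing: Callable[[], Awaitable[None]],
                      interval: float = 2.0) -> None:
    # Platforms expire the indicator after a few seconds, so re-send it
    # until the surrounding task is cancelled.
    while True:
        await send_typing()
        await asyncio.sleep(interval)


async def handle_message(process: Callable[[], Awaitable[Any]],
                         send_typing: Callable[[], Awaitable[None]],
                         interval: float = 2.0) -> Any:
    typing_task = asyncio.create_task(keep_typing(send_typing, interval))
    try:
        return await process()
    finally:
        # Stop the indicator whether processing succeeded or raised.
        typing_task.cancel()
        try:
            await typing_task
        except asyncio.CancelledError:
            pass
```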
- Updated CLI to load configuration from user-specific and project-specific YAML files, prioritizing user settings.
- Introduced a new command `/platforms` to display the status of connected messaging platforms (Telegram, Discord, WhatsApp).
- Implemented a gateway system for handling messaging interactions, including session management and delivery routing for cron job outputs.
- Added support for environment variable configuration and a dedicated gateway configuration file for advanced settings.
- Enhanced documentation in README.md and added a new messaging.md file to guide users on platform integrations and setup.
- Updated toolsets to include platform-specific capabilities for Telegram, Discord, and WhatsApp, ensuring secure and tailored interactions.