* fix: Anthropic OAuth compatibility — Claude Code identity fingerprinting
Anthropic routes OAuth/subscription requests based on Claude Code's
identity markers. Without them, requests get intermittent 500 errors
(~25% failure rate observed). This matches what pi-ai (clawdbot) and
OpenCode both implement for OAuth compatibility.
Changes (OAuth tokens only — API key users unaffected):
1. Headers: user-agent 'claude-cli/2.1.2 (external, cli)' + x-app 'cli'
2. System prompt: prepend 'You are Claude Code, Anthropic's official CLI'
3. System prompt sanitization: replace Hermes/Nous references
4. Tool names: prefix with 'mcp_' (Claude Code convention for non-native tools)
5. Tool name stripping: remove 'mcp_' prefix from response tool calls
Before: 9/12 OK, 1 hard fail, 4 needed retries (~25% error rate)
After: 16/16 OK, 0 failures, 0 retries (0% error rate)
* fix: three gateway issues from user error logs
1. send_animation missing metadata kwarg (base.py)
- Base class send_animation lacked the metadata parameter that the
call site in base.py line 917 passes. Telegram's override accepted
it, but any platform without an override (Discord, Slack, etc.)
hit TypeError. Added metadata to base class signature.
2. MarkdownV2 split-inside-inline-code (base.py truncate_message)
- truncate_message could split at a space inside an inline code span
(e.g. `function(arg1, arg2)`), leaving an unpaired backtick and
unescaped parentheses in the chunk. Telegram rejects with
'character ( is reserved'. Added inline code awareness to the
split-point finder — detects odd backtick counts and moves the
split before the code span.
3. tirith auto-install without cosign (tirith_security.py)
- Previously required cosign on PATH for auto-install, blocking
install entirely with a warning if missing. Now proceeds with
SHA-256 checksum verification only when cosign is unavailable.
Cosign is still used for full supply chain verification when
present. If cosign IS present but verification explicitly fails,
install is still aborted (tampered release).
default group and channel sessions to per-user isolation, allow opting back into shared room sessions via config.yaml, and document Discord gateway routing and session behavior.
Track adapter background message-processing tasks, cancel them during gateway shutdown, and interrupt running agents before disconnecting adapters. This prevents old gateway instances from continuing in-flight work after stop/replace, which was contributing to the restart-time task continuation/flicker behavior reported in #1414. Adds regression coverage for adapter task cancellation and shutdown interrupts.
1. Gate _streaming_api_call to chat_completions mode only — Anthropic and
Codex fall back to _interruptible_api_call. Preserve Anthropic base_url
across all client rebuild paths (interrupt, fallback, 401 refresh).
2. Discord VC synthetic events now use chat_type="channel" instead of
defaulting to "dm" — prevents session bleed into DM context.
Authorization runs before echoing transcript. Sanitize @everyone/@here
in voice transcripts.
3. CLI voice prefix ("[Voice input...]") is now API-call-local only —
stripped from returned history so it never persists to session DB or
resumed sessions.
4. /voice off now disables base adapter auto-TTS via _auto_tts_disabled_chats
set — voice input no longer triggers TTS when voice mode is off.
1. VoiceReceiver.stop() now acquires _lock before clearing shared state
to prevent race with _on_packet on the socket reader thread
2. _packet_debug_count moved from class-level to instance-level to avoid
cross-instance race condition in multi-guild setups
3. play_in_voice_channel uses asyncio.get_running_loop() instead of
deprecated asyncio.get_event_loop()
4. _send_voice_reply uses uuid for filenames instead of time-based names
that can collide when two replies happen in the same second
5. Voice timeout now notifies runner via _on_voice_disconnect callback
so runner cleans up _voice_mode state (prevents orphaned TTS replies)
6. play_in_voice_channel adds PLAYBACK_TIMEOUT (120s) to prevent
infinite blocking when FFmpeg callback is never called
7. _send_voice_reply moves temp file cleanup to finally block so files
are always cleaned up even when send_voice/play raises
8. Base adapter auto-TTS wraps play_tts in try/finally with os.remove
to clean up generated audio files after playback
18 new tests (120 total voice tests)
- Add lock protection around VoiceReceiver buffer writes in _on_packet
to prevent race condition with check_silence on different threads
- Wire _voice_input_callback BEFORE join_voice_channel to avoid
losing voice input during the join window
- Add try/except around leave_voice_channel to ensure state cleanup
(voice_mode, callback) even if leave raises an exception
- Guard against empty text after markdown stripping in base.py auto-TTS
- Add 11 tests proving each bug and verifying the fix
- Auto-TTS: voice messages get spoken response (audio first, then text)
- STT: Groq Whisper fallback when VOICE_TOOLS_OPENAI_KEY not set
- Futuristic UI: glassmorphism, centered container, purple theme, glow effects
- Voice bubble: custom waveform player with seek and progress
- Invisible TTS playback via play_tts() method (no audio file in chat)
- Add hermes-web toolset with full tool access
- Register Platform.WEB in toolset/config maps
- Update docs for voice conversation feature
- prevent raw MEDIA tag leakage outside the gateway pipeline
- make extract_media handle quoted/backticked paths and optional whitespace
- send Telegram media natively with explicit error/warning handling
- add regression tests for Telegram media dispatch and MEDIA parsing
The send_message tool's _send_telegram() sent MEDIA:<path> tags as
literal text instead of delivering actual files. This fixes it by
extracting MEDIA tags via BasePlatformAdapter.extract_media() and
routing files to the appropriate Telegram Bot API method by extension.
Changes:
- send_message_tool: extract MEDIA tags and send files natively as
photo/video/voice/audio/document based on file extension
- send_message_tool: add per-file error handling and missing-file logging
- send_message_tool: use cleaned text in fallback to avoid leaking tags
- base.py extract_media: handle optional space after MEDIA: colon
- base.py extract_media: strip surrounding backticks/quotes from paths
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The old message referenced 'hermes setup' which doesn't handle
skill-specific env vars. Updated to direct users to load the skill
in the local CLI (which triggers the secure prompt) or add the key
to ~/.hermes/.env manually.
When a skill declares required_environment_variables in its YAML
frontmatter, missing env vars trigger a secure TUI prompt (identical
to the sudo password widget) when the skill is loaded. Secrets flow
directly to ~/.hermes/.env, never entering LLM context.
Key changes:
- New required_environment_variables frontmatter field for skills
- Secure TUI widget (masked input, 120s timeout)
- Gateway safety: messaging platforms show local setup guidance
- Legacy prerequisites.env_vars normalized into new format
- Remote backend handling: conservative setup_needed=True
- Env var name validation, file permissions hardened to 0o600
- Redact patterns extended for secret-related JSON fields
- 12 existing skills updated with prerequisites declarations
- ~48 new tests covering skip, timeout, gateway, remote backends
- Dynamic panel widget sizing (fixes hardcoded width from original PR)
Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main
with conflict resolution.
Fixes#688
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
The MEDIA routing in _process_message_background passes
metadata=_thread_metadata to send_video, send_document, and
send_image_file — but none accepted it, causing TypeError silently
caught by the except handler. Files just failed to send.
Fix: add **kwargs to all four base class media methods and their
Telegram overrides.
_keep_typing() was called with metadata= for thread-aware typing
indicators, but neither it nor the base send_typing() accepted
that parameter. Most adapter overrides (Slack, Discord, Telegram,
WhatsApp, HA) already accept metadata=None, but the base class
and Signal adapter did not.
- Add metadata=None to BasePlatformAdapter.send_typing()
- Add metadata=None to BasePlatformAdapter._keep_typing(), pass through
- Add metadata=None to SignalAdapter.send_typing()
Fixes TypeError in _process_message_background for Signal.
Replies in Telegram forum topics (supergroups with topics) now land in
the correct topic thread instead of 'General'.
- base.py: build thread_id metadata from event.source, pass to all
send/media calls; add metadata param to send_typing, send_image,
send_animation, send_voice, send_video, send_document, send_image_file,
_keep_typing
- telegram.py: extract thread_id from metadata and pass as
message_thread_id to all Bot API calls (send_photo, send_voice,
send_audio, send_animation, send_chat_action)
- run.py: pass thread_id metadata to progress/streaming send calls
- discord/slack/whatsapp/homeassistant: update send_typing signature
Based on the fix proposed by @Bitstreamono in PR #656.
Adds a 'find-nearby' skill for discovering nearby places using
OpenStreetMap (Overpass + Nominatim). No API keys needed. Works with:
- Coordinates (from Telegram location pins)
- Addresses, cities, zip codes, landmarks (auto-geocoded)
- Multiple place types (restaurant, cafe, bar, pharmacy, etc.)
Returns names, distances, cuisine, hours, addresses, and Google Maps
links (pin + directions). 184-line stdlib-only script.
Also adds Telegram location message handling:
- New MessageType.LOCATION in gateway base
- Telegram adapter handles LOCATION and VENUE messages
- Injects lat/lon coordinates into conversation context
- Prompts agent to ask what the user wants nearby
Inspired by PR #422 (reimplemented with simpler script and broader
skill scope — addresses/cities/zips, not just Telegram coordinates).
Complete Signal adapter using signal-cli daemon HTTP API.
Based on PR #268 by ibhagwan, rebuilt on current main with bug fixes.
Architecture:
- SSE streaming for inbound messages with exponential backoff (2s→60s)
- JSON-RPC 2.0 for outbound (send, typing, attachments, contacts)
- Health monitor detects stale SSE connections (120s threshold)
- Phone number redaction in all logs and global redact.py
Features:
- DM and group message support with separate access policies
- DM policies: pairing (default), allowlist, open
- Group policies: disabled (default), allowlist, open
- Attachment download with magic-byte type detection
- Typing indicators (8s refresh interval)
- 100MB attachment size limit, 8000 char message limit
- E.164 phone + UUID allowlist support
Integration:
- Platform.SIGNAL enum in gateway/config.py
- Signal in _is_user_authorized() allowlist maps (gateway/run.py)
- Adapter factory in _create_adapter() (gateway/run.py)
- user_id_alt/chat_id_alt fields in SessionSource for UUIDs
- send_message tool support via httpx JSON-RPC (not aiohttp)
- Interactive setup wizard in 'hermes gateway setup'
- Connectivity testing during setup (pings /api/v1/check)
- signal-cli detection and install guidance
Bug fixes from PR #268:
- Timestamp reads from envelope_data (not outer wrapper)
- Uses httpx consistently (not aiohttp in send_message tool)
- SIGNAL_DEBUG scoped to signal logger (not root)
- extract_images regex NOT modified (preserves group numbering)
- pairing.py NOT modified (no cross-platform side effects)
- No dual authorization (adapter defers to run.py for user auth)
- Wildcard uses set membership ('*' in set, not list equality)
- .zip default for PK magic bytes (not .docx)
No new Python dependencies — uses httpx (already core).
External requirement: signal-cli daemon (user-installed).
Tests: 30 new tests covering config, init, helpers, session source,
phone redaction, authorization, and send_message integration.
Co-authored-by: ibhagwan <ibhagwan@users.noreply.github.com>
Authored by satelerd. Adds native WhatsApp media sending for images, videos,
and documents via MEDIA: tags. Also includes conflict resolution with edit_message
feature, Telegram hint fix (only advertise supported media types), and import cleanup.
Instead of sending a separate WhatsApp message for each tool call during
agent execution (N+1 messages), the first tool sends a new message and
subsequent tools edit it to append their line. Result: 1 growing progress
message + 1 final response = 2 messages instead of N+1.
Changes:
- bridge.js: Add POST /edit endpoint using Baileys message editing
- base.py: Add optional edit_message() to BasePlatformAdapter (no-op
default, so platforms without editing support work unchanged)
- whatsapp.py: Implement edit_message() calling bridge /edit
- run.py: Rewrite send_progress_messages() to accumulate tool lines and
edit the progress message. Falls back to sending a new message if
edit fails (graceful degradation).
Before (5 tools = 6 messages):
⚕ Hermes Agent ─── 🔍 web_search... "query"
⚕ Hermes Agent ─── 📄 web_extract... "url"
⚕ Hermes Agent ─── 💻 terminal... "pip install"
⚕ Hermes Agent ─── ✍️ write_file... "app.py"
⚕ Hermes Agent ─── 💻 terminal... "python app.py"
⚕ Hermes Agent ─── Done! The server is running...
After (5 tools = 2 messages):
⚕ Hermes Agent ───
🔍 web_search... "query"
📄 web_extract... "url"
💻 terminal... "pip install"
✍️ write_file... "app.py"
💻 terminal... "python app.py"
⚕ Hermes Agent ─── Done! The server is running...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Authored by 0xbyt4.
Two fixes:
- extract_images(): only remove extracted image tags, not all markdown image
tags. Previously  was silently dropped when real images
were also present.
- truncate_message(): walk chunk_body not full_chunk when tracking code block
state, so the reopened fence prefix doesn't toggle in_code off and leave
continuation chunks with unclosed code blocks.
Add a /send-media endpoint to the WhatsApp bridge and corresponding
adapter methods so the agent can send files as native WhatsApp
attachments instead of plain-text URLs/paths.
- bridge.js: new POST /send-media endpoint using Baileys' native
image/video/document/audio message types with MIME detection
- base.py: add send_video(), send_document(), send_image_file()
with text fallbacks; route MEDIA: tags by file extension instead
of always treating them as voice messages
- whatsapp.py: implement all media methods via a shared
_send_media_to_bridge() helper; override send_image() to download
URLs to local cache and send as native photos
- prompt_builder.py: update WhatsApp and Telegram platform hints so
the agent knows it can use MEDIA:/path tags to send native media
Fixes#163
- Add chat_topic field to SessionSource dataclass
- Update to_dict/from_dict for serialization support
- Add chat_topic parameter to build_source helper
- Extract channel.topic in Discord adapter for messages and slash commands
- Display Channel Topic in system prompt when available
- Normalize empty topics to None
- extract_images: only remove extracted image tags from content, preserve
non-image markdown links (e.g. PDFs) that were previously silently lost
- truncate_message: walk only chunk_body (not prepended prefix) so the
reopened code fence does not toggle in_code off, leaving continuation
chunks with unclosed code blocks
- Add 49 unit tests covering MessageEvent command parsing, extract_images,
extract_media, truncate_message code block handling, and _get_human_delay
- Sanitize filenames in cache_document_from_bytes to prevent path traversal (strip directory components, null bytes, resolve check)
- Reject documents with None file_size instead of silently allowing download
- Cap text file injection at 100 KB to prevent oversized prompt payloads
- Sanitize display_name in run.py context notes to block prompt injection via filenames
- Add 35 unit tests covering document cache utilities and Telegram document handling
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Download, cache, and enrich document files sent via Telegram. Supports
.pdf, .md, .txt, .docx, .xlsx, .pptx with size validation, unsupported
type rejection, text content injection for .md/.txt, and hourly cache
cleanup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Introduced new methods in run_agent.py for building API keyword arguments and normalizing assistant messages from API responses.
- Added functionality for compressing conversation context and managing session state in SQLite.
- Improved tool call execution handling, including enhanced logging and error management.
- Updated path handling in multiple platform files to utilize pathlib for better compatibility and readability.
- Updated the vision tool to accept both HTTP/HTTPS URLs and local file paths for image analysis.
- Implemented caching of user-uploaded images in local directories to ensure reliable access for the vision tool, addressing issues with ephemeral URLs.
- Enhanced platform adapters (Discord, Telegram, WhatsApp) to download and cache images, allowing for immediate analysis and enriched message context.
- Added a new method to auto-analyze images attached by users, enriching the conversation with detailed descriptions.
- Improved documentation for image handling processes and updated related functions for clarity and efficiency.
- Updated the image generation function description to clarify usage with markdown.
- Added `send_image` method to `BasePlatformAdapter` for native image sending across platforms.
- Implemented `send_image` in `DiscordAdapter` and `TelegramAdapter` to handle image attachments directly.
- Introduced `extract_images` method to extract image URLs from markdown and HTML, improving content processing.
- Enhanced message handling to support sending images as attachments while maintaining text content.
- Added logic to clear the adapter's interrupt event to prevent infinite loops during message processing.
- Updated the get_pending_message method to pop messages from the pending queue, ensuring proper message handling.
- Introduced a monitoring mechanism in GatewayRunner to detect incoming messages while an agent is active, allowing for graceful interruption and processing of new messages.
- Enhanced BasePlatformAdapter to manage active sessions and pending messages, ensuring that new messages can interrupt ongoing tasks effectively.
- Improved the handling of pending messages by checking for interrupts and processing them in the correct order, enhancing user experience during message interactions.
- Updated the cleanup process for active tasks to ensure proper resource management after interruptions.
- Updated the AIAgent class to extract the first user message for trajectory formatting, improving the accuracy of user queries in the trajectory format.
- Enhanced the GatewayRunner to convert transcript history into the agent format, ensuring proper handling of message roles and content.
- Adjusted the typing indicator refresh rate to every 2 seconds for better responsiveness.
- Improved error handling in the message sending process for the Telegram adapter, implementing a fallback mechanism for Markdown parsing failures, and logging send failures for better debugging.
- Adjusted the `_keep_typing` method to refresh the typing indicator every 2 seconds instead of 4, improving responsiveness after progress messages.
- Updated the `GatewayRunner` to restore the typing indicator after sending progress messages, enhancing user experience during message processing.
- Added a new private method `_keep_typing` to send a typing indicator continuously while processing messages, refreshing every 4 seconds to comply with Telegram/Discord limitations.
- Updated the `handle_message` method to initiate the typing indicator at the start of message processing and ensure it stops once processing is complete, improving user experience during message handling.
- Updated CLI to load configuration from user-specific and project-specific YAML files, prioritizing user settings.
- Introduced a new command `/platforms` to display the status of connected messaging platforms (Telegram, Discord, WhatsApp).
- Implemented a gateway system for handling messaging interactions, including session management and delivery routing for cron job outputs.
- Added support for environment variable configuration and a dedicated gateway configuration file for advanced settings.
- Enhanced documentation in README.md and added a new messaging.md file to guide users on platform integrations and setup.
- Updated toolsets to include platform-specific capabilities for Telegram, Discord, and WhatsApp, ensuring secure and tailored interactions.