Three fixes for the duplicate reply bug affecting all gateway platforms:
1. base.py: Suppress stale response when the session was interrupted by a
new message that hasn't been consumed yet. Checks both interrupt_event
and _pending_messages to avoid false positives. (#8221, #2483)
2. run.py (return path): Remove response_previewed guard from already_sent
check. Stream consumer's already_sent alone is authoritative — if
content was delivered via streaming, the duplicate send must be
suppressed regardless of the agent's response_previewed flag. (#8375)
3. run.py (queued-message path): Same fix — already_sent without
response_previewed now correctly marks the first response as already
streamed, preventing re-send before processing the queued message.
The response_previewed field is still produced by the agent (run_agent.py)
but is no longer required as a gate for duplicate suppression. The stream
consumer's already_sent flag is the delivery-level truth about what the
user actually saw.
Concepts from PR #8380 (konsisumer). Closes#8375, #8221, #2483.
When a user sends a message while the agent is executing a task on the
gateway, the agent is now interrupted immediately — not silently queued.
Previously, messages were stored in _pending_messages with zero feedback
to the user, potentially leaving them waiting 1+ hours.
Root cause: Level 1 guard (base.py) intercepted all messages for active
sessions and returned with no response. Level 2 (gateway/run.py) which
calls agent.interrupt() was never reached.
Fix: Expand _handle_active_session_busy_message to handle the normal
(non-draining) case:
1. Call running_agent.interrupt(text) to abort in-flight tool calls
and exit the agent loop at the next check point
2. Store the message as pending so it becomes the next turn once the
interrupted run returns
3. Send a brief ack: 'Interrupting current task (10 min elapsed,
iteration 21/60, running: terminal). I'll respond shortly.'
4. Debounce acks to once per 30s to avoid spam on rapid messages
Reported by @Lonely__MH.
The /v1/responses endpoint generated a new UUID session_id for every
request, even when previous_response_id was provided. This caused each
turn of a multi-turn conversation to appear as a separate session on the
web dashboard, despite the conversation history being correctly chained.
Fix: store session_id alongside the response in the ResponseStore, and
reuse it when a subsequent request chains via previous_response_id.
Applies to both the non-streaming /v1/responses path and the streaming
SSE path. The /v1/runs endpoint also gains session continuity from
stored responses (explicit body.session_id still takes priority).
Adds test verifying session_id is preserved across chained requests.
The streaming path emits output as content-part arrays for Open WebUI
compatibility, but the batch (non-streaming) Responses API path must
return output as a plain string per the OpenAI Responses API spec.
Reverts the _extract_output_items change from the cherry-picked commits
while preserving the streaming path's array format.
When a session gets stuck (hung terminal, runaway tool loop) and the
user restarts the gateway, the same session history loads and puts the
agent right back in the stuck state. The user is trapped in a loop:
restart → stuck → restart → stuck.
Fix: track restart-failure counts per session using a simple JSON file
(.restart_failure_counts). On each shutdown with active agents, the
counter increments for those sessions. On startup, if any session has
been active across 3+ consecutive restarts, it's auto-suspended —
giving the user a clean slate on their next message.
The counter resets to 0 when a session completes a turn successfully
(response delivered), so normal sessions that happen to be active
during planned restarts (/restart, hermes update) won't accumulate
false counts.
Implementation:
- _increment_restart_failure_counts(): called during stop() when
agents are active. Writes {session_key: count} to JSON file.
Sessions NOT active are dropped (loop broken).
- _suspend_stuck_loop_sessions(): called on startup. Reads the file,
suspends sessions at threshold (3), clears the file.
- _clear_restart_failure_count(): called after successful response
delivery. Removes the session from the counter file.
No SessionEntry schema changes. No database migration. Pure file-based
tracking that naturally cleans up.
Test plan:
- 9 new stuck-loop tests (increment, accumulate, threshold, clear,
suspend, file cleanup, edge cases)
- All 28 gateway lifecycle tests pass (restart drain + auto-continue
+ stuck loop)
When the gateway restarts mid-agent-work, the session transcript ends
on a tool result the agent never processed. Previously, the user had
to type 'continue' or use /retry (which replays from scratch, losing
all prior work).
Now, when the next user message arrives and the loaded history ends
with role='tool', a system note is prepended:
[System note: Your previous turn was interrupted before you could
process the last tool result(s). Please finish processing those
results and summarize what was accomplished, then address the
user's new message below.]
This is injected in _run_agent()'s run_sync closure, right before
calling agent.run_conversation(). The agent sees the full history
(including the pending tool results) and the system note, so it can
summarize what was accomplished and then handle the user's new input.
Design decisions:
- No new session flags or schema changes — purely detects trailing
tool messages in the loaded history
- Works for any restart scenario (clean, crash, SIGTERM, drain timeout)
as long as the session wasn't suspended (suspended = fresh start)
- The user's actual message is preserved after the note
- If the session WAS suspended (unclean shutdown), the old history is
abandoned and the user starts fresh — no false auto-continue
Also updates the shutdown notification message from 'Use /retry after
restart to continue' to 'Send any message after restart to resume
where it left off' — which is now accurate.
Test plan:
- 6 new auto-continue tests (trailing tool detection, no false
positives for assistant/user/empty history, multi-tool, message
preservation)
- All 13 restart drain tests pass (updated /retry assertion)
Instead of consuming one top-level slash command slot per skill (hitting the
100-command limit with ~26 built-ins + 74 skills), skills are now organized
under a single /skill group command with category-based subcommand groups:
/skill creative ascii-art [args]
/skill media gif-search [args]
/skill mlops axolotl [args]
Discord supports 25 subcommand groups × 25 subcommands = 625 max skills,
well beyond the previous 74-slot ceiling.
Categories are derived from the skill directory structure:
- skills/creative/ascii-art/ → category 'creative'
- skills/mlops/training/axolotl/ → category 'mlops' (top-level parent)
- skills/dogfood/ → uncategorized (direct subcommand)
Changes:
- hermes_cli/commands.py: add discord_skill_commands_by_category() with
category grouping, hub/disabled filtering, Discord limit enforcement
- gateway/platforms/discord.py: replace top-level skill registration with
_register_skill_group() using app_commands.Group hierarchy
- tests: 7 new tests covering group creation, category grouping,
uncategorized skills, hub exclusion, deep nesting, empty skills,
and handler dispatch
Inspired by Discord community suggestion from bottium.
- TestHealthDetailedEndpoint: 3 tests for the new API server endpoint
(returns runtime data, handles missing status, no auth required)
- TestProbeGatewayHealth: 5 tests for _probe_gateway_health()
(URL normalization, successful/failed probes, fallback chain)
- TestStatusRemoteGateway: 4 tests for /api/status remote fallback
(remote probe triggers, skipped when local PID found, null PID handling)
Feishu approval clicks need the resolved card to come back from the
synchronous callback path itself. Leaving approval resolution to the
generic asynchronous card-action flow made button feedback depend on
later loop work instead of the callback response the client is waiting
for.
Change-Id: I574997cbbcaa097fdba759b47367e28d1b56b040
Constraint: Feishu card-action callbacks must acknowledge quickly and reflect final approval state from the callback response path
Rejected: Keep approval handling on the generic async card-action route | leaves card state synchronization vulnerable to callback timing and follow-up update ordering
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep approval callback response construction separate from async queue unblocking unless Feishu callback semantics change
Tested: pytest tests/gateway/test_feishu.py tests/gateway/test_feishu_approval_buttons.py tests/gateway/test_approve_deny_commands.py tests/gateway/test_slack_approval_buttons.py tests/gateway/test_telegram_approval_buttons.py -q
Not-tested: Live Feishu workspace end-to-end callback rendering
Three fixes for gateway lifecycle stability:
1. Notify active sessions before shutdown (#new)
When the gateway receives SIGTERM or /restart, it now sends a
notification to every chat with an active agent BEFORE starting
the drain. Users see:
- Shutdown: 'Gateway shutting down — your task will be interrupted.'
- Restart: 'Gateway restarting — use /retry after restart to continue.'
Deduplicates per-chat so group sessions with multiple users get
one notification. Best-effort: send failures are logged and swallowed.
2. Skip .clean_shutdown marker when drain timed out
Previously, a graceful SIGTERM always wrote .clean_shutdown, even if
agents were force-interrupted when the drain timed out. This meant
the next startup skipped session suspension, leaving interrupted
sessions in a broken state (trailing tool response, no final message).
Now the marker is only written if the drain completed without timeout,
so interrupted sessions get properly suspended on next startup.
3. Post-restart health check for hermes update (#6631)
cmd_update() now verifies the gateway actually survived after
systemctl restart (sleep 3s + is-active check). If the service
crashed immediately, it retries once. If still dead, prints
actionable diagnostics (journalctl command, manual restart hint).
Also closes#8104 — already fixed on main (the /restart handler
correctly detects systemd via INVOCATION_ID and uses via_service=True).
Test plan:
- 6 new tests for shutdown notifications (dedup, restart vs shutdown
messaging, sentinel filtering, send failure resilience)
- Existing restart drain + update tests pass (47 total)
Follow-up for cherry-picked PR #9746 — three pre-existing tests used
adapter._webhook_url (bare URL) in mock data, but _register_webhook
and _unregister_webhook now compare against _webhook_register_url
(password-bearing URL). Updated to match.
When BlueBubbles posts webhook events to the adapter, it uses the exact
URL registered via /api/v1/webhook — and BB's registration API does not
support custom headers. The adapter currently registers the bare URL
(no credentials), but then requires password auth on inbound POSTs,
rejecting every webhook with HTTP 401.
This is masked on fresh BB installs by a race condition: the webhook
might register once with a prior (possibly patched) URL and keep working
until the first restart. On v0.9.0, _unregister_webhook runs on clean
shutdown, so the next startup re-registers with the bare URL and the
401s begin. Users see the bot go silent with no obvious cause.
Root cause: there's no way to pass auth credentials from BB to the
webhook handler except via the URL itself. BB accepts query params and
preserves them on outbound POSTs.
## Fix
Introduce `_webhook_register_url` — the URL handed to BB's registration
API, with the configured password appended as a `?password=<value>`
query param. The existing webhook auth handler already accepts this
form (it reads `request.query.get("password")`), so no change to the
receive side is needed.
The bare `_webhook_url` is still used for logging and for binding the
local listener, so credentials don't leak into log output. Only the
registration/find/unregister paths use the password-bearing form.
## Notes
- Password is URL-encoded via urllib.parse.quote, handling special
characters (&, *, @, etc.) that would otherwise break parsing.
- Storing the password in BB's webhook table is not a new disclosure:
anyone with access to that table already has the BB admin password
(same credential used for every other API call).
- If `self.password` is empty (no auth configured), the register URL
is the bare URL — preserves current behavior for unauthenticated
local-only setups.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BlueBubbles v1.9+ webhook payloads for new-message events do not always
include a top-level chatGuid field on the message data object. Instead,
the chat GUID is nested under data.chats[0].guid.
The adapter currently checks five top-level fallback locations (record and
payload, snake_case and camelCase, plus payload.guid) but never looks
inside the chats array. When none of those top-level fields contain the
GUID, the adapter falls through to using the sender's phone/email as the
session chat ID.
This causes two observable bugs when a user is a participant in both a DM
and a group chat with the bot:
1. DM and group sessions merge. Every message from that user ends up with
the same session_chat_id (their own address), so the bot cannot
distinguish which thread the message came from.
2. Outbound routing becomes ambiguous. _resolve_chat_guid() iterates all
chats and returns the first one where the address appears as a
participant; group chats typically sort ahead of DMs by activity, so
replies and cron messages intended for the DM can land in a group.
This was observed in production: a user's morning brief cron delivered to
a group chat with his spouse instead of his DM thread.
The fix adds a single fallback that extracts chat_guid from
record["chats"][0]["guid"] when the top-level fields are empty. The chats
array is included in every new-message webhook payload in BB v1.9.9
(verified against a live server). It is backwards compatible: if a future
BB version starts including chatGuid at the top level, that still wins.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When GATEWAY_PROXY_URL (or gateway.proxy_url in config.yaml) is set,
the gateway becomes a thin relay: it handles platform I/O (encryption,
threading, media) and delegates all agent work to a remote Hermes API
server via POST /v1/chat/completions with SSE streaming.
This enables the primary use case of running a Matrix E2EE gateway in
Docker on Linux while the actual agent runs on the host (e.g. macOS)
with full access to local files, memory, skills, and a unified session
store. Works for any platform adapter, not just Matrix.
Configuration:
- GATEWAY_PROXY_URL env var (Docker-friendly)
- gateway.proxy_url in config.yaml
- GATEWAY_PROXY_KEY env var for API auth (matches API_SERVER_KEY)
- X-Hermes-Session-Id header for session continuity
Architecture:
- _get_proxy_url() checks env var first, then config.yaml
- _run_agent_via_proxy() handles HTTP forwarding with SSE streaming
- _run_agent() delegates to proxy path when URL is configured
- Platform streaming (GatewayStreamConsumer) works through proxy
- Returns compatible result dict for session store recording
Files changed:
- gateway/run.py: proxy mode implementation (~250 lines)
- hermes_cli/config.py: GATEWAY_PROXY_URL + GATEWAY_PROXY_KEY env vars
- tests/gateway/test_proxy_mode.py: 17 tests covering config
resolution, dispatch, HTTP forwarding, error handling, message
filtering, and result shape validation
Closes discussion from Cars29 re: Matrix gateway mixed-mode issue.
During rapid tool-calling, the model often emits 1-2 tokens before
switching to tool calls. The stream consumer would create a new message
with 'X ▉' (short text + cursor), and if the follow-up edit to strip
the cursor was rate-limited by the platform, the cursor remained as
a permanent standalone message — reported on Telegram as 'white box'
artifacts.
Add a minimum-content guard in _send_or_edit: when creating a new
standalone message (no existing message_id), require at least 4
visible characters alongside the cursor before sending. Shorter text
accumulates into the next streaming segment instead.
This prevents cursor-only 'tofu' messages across all platforms without
affecting normal streaming (edits to existing messages, final sends
without cursor, and messages with substantial text are all unaffected).
Reported by @michalkomar on X.
Production fixes:
- Add clear_session_context() to hermes_logging.py (fixes 48 teardown errors)
- Add clear_session() to tools/approval.py (fixes 9 setup errors)
- Add SyncError M_UNKNOWN_TOKEN check to Matrix _sync_loop (bug fix)
- Fall back to inline api_key in named custom providers when key_env
is absent (runtime_provider.py)
Test fixes:
- test_memory_user_id: use builtin+external provider pair, fix honcho
peer_name override test to match production behavior
- test_display_config: remove TestHelpers for non-existent functions
- test_auxiliary_client: fix OAuth tokens to match _is_oauth_token
patterns, replace get_vision_auxiliary_client with resolve_vision_provider_client
- test_cli_interrupt_subagent: add missing _execution_thread_id attr
- test_compress_focus: add model/provider/api_key/base_url/api_mode
to mock compressor
- test_auth_provider_gate: add autouse fixture to clean Anthropic env
vars that leak from CI secrets
- test_opencode_go_in_model_list: accept both 'built-in' and 'hermes'
source (models.dev API unavailable in CI)
- test_email: verify email Platform enum membership instead of source
inspection (build_channel_directory now uses dynamic enum loop)
- test_feishu: add bot_added/bot_deleted handler mocks to _Builder
- test_ws_auth_retry: add AsyncMock for sync_store.get_next_batch,
add _pending_megolm and _joined_rooms to Matrix adapter mocks
- test_restart_drain: monkeypatch-delete INVOCATION_ID (systemd sets
this in CI, changing the restart call signature)
- test_session_hygiene: add user_id to SessionSource
- test_session_env: use relative baseline for contextvar clear check
(pytest-xdist workers share context)
- Add Platform.QQBOT to _UPDATE_ALLOWED_PLATFORMS (enables /update command)
- Add 'qqbot' to webhook cross-platform delivery routing
- Add 'qqbot' to hermes dump platform detection
- Fix test_name_property casing: 'QQBot' not 'QQBOT'
- Add _parse_qq_timestamp() for ISO 8601 + integer ms compatibility
(QQ API changed timestamp format — from PR #2411 finding)
- Wire timestamp parsing into all 4 message handlers
- Rename platform from 'qq' to 'qqbot' across all integration points
(Platform enum, toolset, config keys, import paths, file rename qq.py → qqbot.py)
- Add PLATFORM_HINTS for QQBot in prompt_builder (QQ supports markdown)
- Set SUPPORTS_MESSAGE_EDITING = False to skip streaming on QQ
(prevents duplicate messages from non-editable partial + final sends)
- Add _send_qqbot() standalone send function for cron/send_message tool
- Add interactive _setup_qq() wizard in hermes_cli/setup.py
- Restore missing _setup_signal/email/sms/dingtalk/feishu/wecom/wecom_callback
functions that were lost during the original merge
Models like MiniMax emit inline <think>...</think> reasoning blocks in
their content field. The CLI already suppresses these via a state machine
in _stream_delta, but the gateway's GatewayStreamConsumer had no
equivalent filtering — raw think blocks were streamed directly to
Discord/Telegram/Slack.
The fix adds a _filter_and_accumulate() method that mirrors the CLI's
approach: a state machine tracks whether we're inside a reasoning block
and silently discards the content. Includes the same block-boundary
check (tag must appear at line start or after whitespace-only prefix)
to avoid false positives when models mention <think> in prose.
Handles all tag variants: <think>, <thinking>, <THINKING>, <thought>,
<reasoning>, <REASONING_SCRATCHPAD>.
Also handles edge cases:
- Tags split across streaming deltas (partial tag buffering)
- Unclosed blocks (content suppressed until stream ends)
- Multiple consecutive blocks
- _flush_think_buffer on stream end for held-back partial tags
Adds 22 unit tests + 1 integration test covering all scenarios.
When the 5-second stream_task timeout in gateway/run.py expires (due to
slow Telegram API calls from rate limiting after several messages), the
stream consumer is cancelled via asyncio.CancelledError. The
CancelledError handler did a best-effort final edit but never set
final_response_sent, so the gateway fell through to the normal send path
and delivered the full response again as a reply — causing a duplicate.
The fix: in the CancelledError handler, set final_response_sent = True
when already_sent is True (i.e., the stream consumer had already
delivered content to the user). This tells the gateway's already_sent
check that the response was delivered, preventing the duplicate send.
Adds two tests verifying the cancellation behavior:
- Cancelled with already_sent=True → final_response_sent=True (no dup)
- Cancelled with already_sent=False → final_response_sent=False (normal
send path proceeds)
Reported by community user hume on Discord.
/stop was calling suspend_session() which marked the session for auto-reset
on the next message. This meant users lost their conversation history every
time they stopped a running agent — especially painful for untitled sessions
that can't be resumed by name.
Now /stop just interrupts the agent and cleans the session lock. The session
stays intact so users can continue the conversation.
The suspend behavior was introduced in #7536 to break stuck session resume
loops on gateway restart. That case is already handled by
suspend_recently_active() which runs at gateway startup, so removing it from
/stop doesn't regress the original fix.
Updated the acquire_scoped_lock function to treat empty or corrupt lock files as stale. This change ensures that if a lock file exists but is invalid, it will be removed to prevent issues with stale locks. Added tests to verify recovery from both empty and corrupt lock files.
- Store source metadata on /voice channel join so voice input shares the
same session as the linked text channel conversation
- Treat voice-linked text channels as free-response (skip @mention and
auto-thread) while voice is active
- Scope the voice-linked exemption to the exact bound channel, not
sibling threads
- Guard signal handler registration in start_gateway() for non-main
threads (prevents RuntimeError when gateway runs in a daemon thread)
- Clean up _voice_sources on leave_voice_channel
Salvaged from PR #3475 by twilwa (Modal runtime portions excluded).
The detached bash subprocess spawned by /restart gets killed by
systemd's KillMode=mixed cgroup cleanup, leaving the gateway dead.
Under systemd (detected via INVOCATION_ID env var), /restart now uses
via_service=True which exits with code 75 — RestartForceExitStatus=75
in the unit file makes systemd auto-restart the service. The detached
subprocess approach is preserved as fallback for non-systemd
environments (Docker, tmux, foreground mode).
When a user sends /restart, the gateway now persists their routing info
(platform, chat_id, thread_id) to .restart_notify.json. After the new
gateway process starts and adapters connect, it reads the file, sends a
'Gateway restarted successfully' message to that specific chat, and
cleans up the file.
This follows the same pattern as _send_update_notification (used by
/update). Thread IDs are preserved so the notification lands in the
correct Telegram topic or Discord thread.
Previously, after /restart the user had no feedback that the gateway was
back — they had to send a message to find out. Now they get a proactive
notification and know their session continues.
When tool_preview_length is 0 (default for platforms without a tier
default, like Session), verbose mode was truncating args JSON to 200
characters. Since the user explicitly opted into verbose mode, they
expect full tool call detail — the 200-char cap defeated the purpose.
Now: tool_preview_length=0 means no truncation in verbose mode.
Positive values still cap as before. Platform message-length limits
handle overflow naturally.
Three changes that address the poor WhatsApp experience reported by users:
1. Reclassify WhatsApp from TIER_LOW to TIER_MEDIUM in display_config.py
— enables streaming and tool progress via the existing Baileys /edit
bridge endpoint. Users now see progressive responses instead of
minutes of silence followed by a wall of text.
2. Lower MAX_MESSAGE_LENGTH from 65536 to 4096 and add proper chunking
— send() now calls format_message() and truncate_message() before
sending, then loops through chunks with a small delay between them.
The base class truncate_message() already handles code block boundary
detection (closes/reopens fences at chunk boundaries). reply_to is
only set on the first chunk.
3. Override format_message() with WhatsApp-specific markdown conversion
— converts **bold** to *bold*, ~~strike~~ to ~strike~, headers to
bold text, and [links](url) to text (url). Code blocks and inline
code are protected from conversion via placeholder substitution.
Together these fix the two user complaints:
- 'sends the whole code all the time' → now chunked at 4K with proper
formatting
- 'terminal gets interrupted and gets cooked' → streaming + tool progress
give visual feedback so users don't accidentally interrupt with
follow-up messages
Port from nearai/ironclaw#2304: Telegram's 4096 character limit is
measured in UTF-16 code units, not Unicode codepoints. Characters
outside the Basic Multilingual Plane (emoji like 😀, CJK Extension B,
musical symbols) are surrogate pairs: 1 Python char but 2 UTF-16 units.
Previously, truncate_message() used Python's len() which counts
codepoints. This could produce chunks exceeding Telegram's actual limit
when messages contain many astral-plane characters.
Changes:
- Add utf16_len() helper and _prefix_within_utf16_limit() for
UTF-16-aware string measurement and truncation
- Add _custom_unit_to_cp() binary-search helper that maps a custom-unit
budget to the largest safe codepoint slice position
- Update truncate_message() to accept optional len_fn parameter
- Telegram adapter now passes len_fn=utf16_len when splitting messages
- Fix fallback truncation in Telegram error handler to use
_prefix_within_utf16_limit instead of codepoint slicing
- Update send_message_tool.py to use utf16_len for Telegram platform
- Add comprehensive tests: utf16_len, _prefix_within_utf16_limit,
truncate_message with len_fn (emoji splitting, content preservation,
code block handling)
- Update mock lambdas in reply_mode tests to accept **kw for len_fn
Port from openclaw/openclaw#64586: users who copy .env.example without
changing placeholder values now get a clear error at startup instead of
a confusing auth failure from the platform API. Also rejects placeholder
API_SERVER_KEY when binding to a network-accessible address.
Cherry-picked from PR #8677.
Port from openclaw/openclaw#64796: Per MSC3952 / Matrix v1.7, the
m.mentions.user_ids field is the authoritative mention signal. Clients
that populate m.mentions but don't duplicate @bot in the body text
were being silently dropped when MATRIX_REQUIRE_MENTION=true.
Cherry-picked from PR #8673.
Some OpenAI-compatible clients (Open WebUI, LobeChat, etc.) send
message content as an array of typed parts instead of a plain string:
[{"type": "text", "text": "hello"}]
The agent pipeline expects strings, so these array payloads caused
silent failures or empty messages.
Add _normalize_chat_content() with defensive limits (recursion depth,
list size, output length) and apply it to both the Chat Completions
and Responses API endpoints. The Responses path had inline
normalization that only handled input_text/output_text — the shared
function also handles the standard 'text' type.
Salvaged from PR #7980 (ikelvingo) — only the content normalization;
the SSE and Weixin changes in that PR were regressions and are not
included.
Co-authored-by: ikelvingo <ikelvingo@users.noreply.github.com>
Four fixes for the Weixin/WeChat adapter, synthesized from the best
aspects of community PRs #8407, #8521, #8360, #7695, #8308, #8525,
#7531, #8144, #8251.
1. Streaming cursor (▉) stuck permanently — WeChat doesn't support
message editing, so the cursor appended during streaming can never
be removed. Add SUPPORTS_MESSAGE_EDITING = False to WeixinAdapter
and check it in gateway/run.py to use an empty cursor for non-edit
platforms. (Fixes#8307, #8326)
2. Media upload failures — two bugs in _send_file():
a) upload_full_url path used PUT (404 on WeChat CDN); now uses POST.
b) aes_key was base64(raw_bytes) but the iLink API expects
base64(hex_string); images showed as grey boxes. (Fixes#8352, #7529)
Also: unified both upload paths into _upload_ciphertext(), preferring
upload_full_url. Added send_video/send_voice methods and voice_item
media builder for audio/.silk files. Added video_md5 field.
3. Markdown links stripped — WeChat can't render [text](url), so
format_message() now converts them to 'text (url)' plaintext.
Code blocks are preserved. (Fixes#7617)
4. Blank message prevention — three guards:
a) _split_text_for_weixin_delivery('') returns [] not ['']
b) send() filters empty/whitespace chunks before _send_text_chunk
c) _send_message() raises ValueError for empty text as safety net
Community credit: joei4cm (#8407), lyonDan (#8521), SKFDJKLDG (#8360),
tomqiaozc (#7695), joshleeeeee (#8308), luoxiao6645(#8525),
longsizhuo (#7531), Astral-Yang (#8144), QingWei-Li (#8251).
- Remove duplicate _setup_feishu() definition (old 3-line version left
behind by cherry-pick — Python picked the new one but dead code
remained)
- Remove misleading 'Disable direct messages' DM option — the Feishu
adapter has no DM policy mechanism, so 'disable' produced identical
env vars to 'pairing'. Users who chose 'disable' would still see
pairing prompts. Reduced to 3 options: pairing, allow-all, allowlist.
- Fix test_probe_returns_bot_info_on_success and
test_probe_returns_none_on_failure: patch FEISHU_AVAILABLE=True so
probe_bot() takes the SDK path when lark_oapi is not installed
The _watch_update_progress() poll loop never deleted .update_prompt.json
after forwarding the prompt to the user, causing the same prompt to be
re-sent every poll cycle (2s). Two fixes:
1. Delete .update_prompt.json after forwarding — the update process only
polls for .update_response, it doesn't need the prompt file to persist.
2. Guard re-sends with _update_prompt_pending check — belt-and-suspenders
to prevent duplicates even under race conditions.
Add regression test asserting the prompt is sent exactly once.