Add has_usable_secret() to reject empty, short (<4 char), and common
placeholder API key values (changeme, your_api_key, placeholder, etc.)
throughout the auth/runtime resolution chain.
Update list_available_providers() to use provider-specific auth status
via get_auth_status() instead of resolve_runtime_provider(), preventing
cross-provider key fallback from making providers appear available when
they aren't actually configured.
Preserve keyless custom endpoint support by checking via base URL.
Cherry-picked from PR #2121 by aashizpoudel.
Streaming provides a better UX — tokens appear as they arrive instead
of waiting for the full response. show_reasoning remains false so
thinking blocks are not streamed to the user.
Fixes#1803. send_image_file, send_document, and send_video were missing
message_thread_id forwarding, causing them to fail in Telegram forum/supergroups
where thread_id is required. send_voice already handled this correctly. Adds
metadata parameter + message_thread_id to all three methods, and adds tests
covering the thread_id forwarding path.
PR #2314 checked for provider names 'alibaba-coding-plan' and
'alibaba-coding-plan-anthropic' which don't exist in the provider
registry. The provider is always 'alibaba' — the condition was dead
code. Fixed to check self.provider == 'alibaba'.
Based on PR #1749 by @erosika (reimplemented on current main).
Extracts three protected methods from run() so wrapper CLIs can extend
the TUI without overriding the entire method:
- _get_extra_tui_widgets(): inject widgets between spacer and status bar
- _register_extra_tui_keybindings(kb, input_area): add keybindings
- _build_tui_layout_children(**widgets): full control over ordering
Default implementations reproduce existing layout exactly. The inline
HSplit in run() now delegates to _build_tui_layout_children().
5 tests covering defaults, widget insertion position, and keybinding
registration.
When using Alibaba (DashScope) with an anthropic-compatible endpoint,
model names like qwen3.5-plus were being normalized to qwen3-5-plus.
Alibaba's API expects the dot. Added preserve_dots parameter to
normalize_model_name() and build_anthropic_kwargs().
Also fixed 401 auth: when provider is alibaba or base_url contains
dashscope/aliyuncs, use only the resolved API key (DASHSCOPE_API_KEY).
Never fall back to resolve_anthropic_token(), and skip Anthropic
credential refresh for DashScope endpoints.
Cherry-picked from PR #1748 by crazywriter1. Fixes#1739.
Users reported that the bot fails to resolve usernames without the
Server Members privileged intent enabled. Updated the setup docs
to mark it as Required instead of Optional.
Feedback from Blangs [MADD].
Bare strings like "image", "audio", "document" were appended to
media_types, but downstream run.py checks mtype.startswith("image/")
and mtype.startswith("audio/"), which never matched. This caused all
Mattermost file attachments to be silently dropped from vision/STT
processing. Use the actual MIME type from file_info instead.
Cherry-picked from PR #2319 by @itenev.
When the gateway fails to connect (e.g. PrivilegedIntentsRequired,
missing token), systemd's default RestartSec=10 with no start rate
limit causes rapid reconnect storms flooding logs and triggering
platform-side rate limits.
- StartLimitIntervalSec=600 + StartLimitBurst=5 in [Unit] (max 5
restarts per 10 min)
- RestartSec: 10 → 30
- Applied to both templates in gateway.py and scripts/hermes-gateway
The gateway config loader read config.yaml but never merged its
`platforms` key into the runtime config dict. This meant that
platform-specific settings defined under `platforms.<name>.extra`
(e.g. webhook routes) were silently ignored unless the user also
duplicated them in the legacy gateway.json file.
Merge `yaml_cfg["platforms"]` into `gw_data["platforms"]` with a
shallow deep-merge of the `extra` dict so that gateway.json defaults
are preserved while config.yaml values take precedence.
Closes#2305
Six improvements to reduce information loss during context compression,
informed by analysis of Cline, OpenCode, Pi-mono, Codex, and ClawdBot:
1. Structured summary template — sections for Goal, Progress (Done/
In Progress/Blocked), Key Decisions, Relevant Files, Next Steps,
and Critical Context. Forces the summarizer to preserve each
category instead of writing a vague paragraph.
2. Iterative summary updates — on re-compression, the prompt says
'PRESERVE existing info, ADD new progress, UPDATE done/in-progress
status.' Previous summary is stored and fed back to the summarizer
so accumulated context survives across multiple compactions.
3. Token-budget tail protection — instead of fixed protect_last_n=4,
walks backward keeping ~20K tokens of recent context. Adapts to
message density: sessions with big tool results protect fewer
messages, short exchanges protect more. Falls back to protect_last_n
for small conversations.
4. Tool output pruning (pre-pass) — before the expensive LLM summary,
replaces old tool result contents with a placeholder. This is free
(no LLM call) and can save 30%+ of context by itself.
5. Scaled summary budget — instead of fixed 2500 tokens, allocates 20%
of compressed content tokens (clamped to 2000-8000). A 50-turn
conversation gets more summary space than a 10-turn one.
6. Richer summarizer input — tool calls now include arguments (up to
500 chars) and tool results keep up to 3000 chars (was 1500).
The summarizer sees 'terminal(git status) → M src/config.py'
instead of just '[Tool calls: terminal]'.
When streaming is enabled, the base adapter receives None from
_handle_message (already_sent=True) and cannot run auto-TTS for
voice input. The runner was unconditionally skipping voice input
TTS assuming the base adapter would handle it.
Now the runner takes over TTS responsibility when streaming has
already delivered the text response, so voice channel playback
works with both streaming on and off.
Streaming off behavior is unchanged (default already_sent=False
preserves the original code path exactly).
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
On macOS, zsh users may not have ~/.zshrc if they haven't customized
their shell yet. The installer would silently fail to add ~/.local/bin
to PATH, causing 'hermes: command not found' after installation.
- Check ~/.zprofile as fallback for zsh users (macOS login shell config)
- Create ~/.zshrc if neither config file exists
Cherry-picked from PR #2315 by erhnysr.
Co-authored-by: erhnysr <erhnysr@users.noreply.github.com>
Cherry-picked from PR #2290 by @Mibayy. Closes#2138.
When asyncio.run() raises RuntimeError (running loop exists), the
coroutine was created but never awaited, producing a RuntimeWarning
on GC. Extract coro before try, call coro.close() in the except
branch before falling back to ThreadPoolExecutor.
Cron deliveries were mirrored into the target gateway session as
assistant-role messages, causing consecutive assistant messages that
violate message alternation (issue #2221).
Instead of fixing the role, remove the mirror injection entirely.
Cron outputs already live in their own cron session and don't belong
in the interactive conversation history.
Delivered messages are now wrapped with a header (task name) and a
footer noting the agent cannot see or respond to the message, so
users have clear context about what they're reading.
Closes#2221
Cherry-picked from PR #2292 by @Mibayy. Closes#2134.
resolve_toolset() called visited.copy() per sibling include, breaking
dedup for diamond dependencies (D resolved twice via B and C paths)
and causing duplicate cycle warnings.
Fix: pass visited directly so siblings share the same set. The .copy()
for the all/* alias at the top level is kept so each top-level toolset
gets an independent pass. Removes the print() cycle warning since
hitting a visited name now usually means diamond (not a bug).
A single Telegram 409 Conflict from getUpdates permanently killed
Telegram polling with no recovery possible (retryable=False on
first occurrence). This is too aggressive for production use with
process supervisors.
Transient 409s are expected during:
- --replace handoffs where the old long-poll session lingers on
Telegram servers for a few seconds after SIGTERM
- systemd Restart=on-failure respawns that overlap with the dying
instance cleanup
Now _handle_polling_conflict() retries up to 3 times with a
10-second delay between attempts. The 30-second total retry window
lets stale server-side sessions expire. If all retries fail, the
error is still marked as permanently fatal — preserving the original
protection against genuine dual-instance conflicts.
Tests updated: split the single conflict test into two — one verifying
retry on transient conflict, one verifying fatal after exhausted
retries.
Closes#2296
Cherry-picked from PR #2295 by @dlkakbs.
The web_extract auxiliary client api_key env var was literally stored as
'AUXILI..._KEY' (dots in the source) instead of the full name. Users
configuring an auxiliary web_extract model with an API key would have
auth failures because the key was written to a non-existent var.
Two changes to the error handler in the agent loop:
1. Remove the 'if not pending_handled' block that injected fake
[System error during processing: ...] messages into conversation
history. These polluted history, burned tokens on retries, and
could violate role alternation by injecting as role=user.
The tool_calls error-result path (role=tool) is preserved.
2. Append the error final_response as an assistant message when
hitting the iteration limit, so session resume doesn't produce
consecutive user messages.
Enhanced the review agent to scan and summarize successful tool actions, providing users with a compact overview of updates made during the review process. This includes actions related to memory and user profiles, improving user feedback and interaction clarity.
Added a check to suppress further reasoning rendering once the response box is open, preventing potential overlap of reasoning boxes during late thinking blocks. This enhances the user experience by maintaining a clean output in the CLI.
Previously, all project context files (AGENTS.md, .cursorrules, .hermes.md)
were loaded and concatenated into the system prompt. This bloated the prompt
with potentially redundant or conflicting instructions.
Now only ONE project context type is loaded, using priority order:
1. .hermes.md / HERMES.md (walk to git root)
2. AGENTS.md / agents.md (recursive directory walk)
3. CLAUDE.md / claude.md (cwd only, NEW)
4. .cursorrules / .cursor/rules/*.mdc (cwd only)
SOUL.md from HERMES_HOME remains independent and always loads.
Also adds CLAUDE.md as a recognized context file format, matching the
convention popularized by Claude Code.
Refactored the monolithic function into four focused helpers:
_load_hermes_md, _load_agents_md, _load_claude_md, _load_cursorrules.
Tests: replaced 1 coexistence test with 10 new tests covering priority
ordering, CLAUDE.md loading, case sensitivity, injection blocking.
- Introduced a mechanism to mute output after the main response is delivered, ensuring that subsequent tool calls run without cluttering the CLI.
- Redirected stdout to devnull during the review agent's execution to prevent any print statements from interfering with the main CLI display.
- Added a new attribute `_mute_post_response` to manage output suppression effectively.
- Added a user bar separator for improved visual clarity when displaying pasted text and user input in the HermesCLI.
- Ensured consistent formatting for both multi-line and single-line user inputs, enhancing the overall user experience in the command-line interface.
These changes contribute to a more organized and visually appealing output during interactions.
Same bug as opencode-zen/go — alibaba fell through to the OpenRouter
model list instead of using _setup_provider_model_selection() which
probes the provider's own /models endpoint.
All user-selectable providers now have correct model selection routing.
After selecting OpenCode Zen or Go as provider in hermes setup, the
model selection page showed OpenRouter models because these providers
weren't in the list that routes to _setup_provider_model_selection().
They fell through to the else branch which shows the OpenRouter catalog.
Users ended up with an OpenCode API key but an OpenRouter model name,
causing 'Provider resolver returned an empty API key' on first use.
Fix: add opencode-zen and opencode-go to the provider list that uses
_setup_provider_model_selection() for live /models detection.