setup_model_provider() had 800+ lines of duplicated provider handling
that reimplemented the same credential prompting, OAuth flows, and model
selection that hermes model already provides via the _model_flow_*
functions. Every new provider had to be added in both places, and the
two implementations diverged in config persistence (setup.py did raw
YAML writes, _set_model_provider, and _update_config_for_provider
depending on the provider — main.py used its own load/save cycle).
This caused the #4172 bug: _model_flow_custom saved config to disk but
the wizard's final save_config(config) overwrote it with stale values.
Fix: extract the core of cmd_model() into select_provider_and_model()
and have setup_model_provider() call it. After the call, re-sync the
wizard's config dict from disk. Deletes ~800 lines of duplicated
provider handling from setup.py.
Also fixes cmd_model() double-AuthError crash on fresh installs with
no API keys configured.
_model_flow_custom() saved model.provider and model.base_url to disk
via its own load_config/save_config cycle, but never updated the
setup wizard's in-memory config dict. The wizard's final
save_config(config) then overwrote the custom settings with the
stale default string model value.
Fix: after saving to disk, also mutate the caller's config dict so
the wizard's final save preserves model.provider='custom' and the
base_url. Both the model_name and no-model_name branches are
covered.
Added regression tests that simulate the full wizard flow including
the final save_config(config) call — the step that was previously
untested.
OPENAI_BASE_URL was written to .env AND config.yaml, creating a dual-source
confusion. Users (especially Docker) would see the URL in .env and assume
that's where all config lives, then wonder why LLM_MODEL in .env didn't work.
Changes:
- Remove all 27 save_env_value("OPENAI_BASE_URL", ...) calls across main.py,
setup.py, and tools_config.py
- Remove OPENAI_BASE_URL env var reading from runtime_provider.py, cli.py,
models.py, and gateway/run.py
- Remove LLM_MODEL/HERMES_MODEL env var reading from gateway/run.py and
auxiliary_client.py — config.yaml model.default is authoritative
- Vision base URL now saved to config.yaml auxiliary.vision.base_url
(both setup wizard and tools_config paths)
- Tests updated to set config values instead of env vars
Convention enforced: .env is for SECRETS only (API keys). All other
configuration (model names, base URLs, provider selection) lives
exclusively in config.yaml.
CI has no config.yaml, so cron/gateway resolve an empty model name.
The Codex Responses validator rejects empty models before the mock
API call is reached. Provide explicit model in job dict and env var.
Salvaged from PR #4024 by @Sertug17. Fixes#4017.
- Replace systemd-run --user --scope with setsid for portable session detach
- Add system-level service detection to cmd_update gateway restart
- Falls back to start_new_session=True on systems without setsid (macOS, minimal containers)
Auto-compression still runs silently in the background with server-side
logging, but no longer sends messages to the user's chat about it.
Removed:
- 'Session is large... Auto-compressing' pre-compression notification
- 'Compressed: N → M messages' post-compression notification
- 'Session is still very large after compression' warning
- 'Auto-compression failed' warning
- Rate-limit tracking (only existed for these warnings)
* fix(alibaba): use standard DashScope international endpoint
The Alibaba Cloud provider was hardcoded to the coding-intl endpoint
(https://coding-intl.dashscope.aliyuncs.com/v1) which only accepts
Alibaba Coding Plan API keys.
Standard DashScope API keys fail with invalid_api_key error against
this endpoint. Changed to the international compatible-mode endpoint
(https://dashscope-intl.aliyuncs.com/compatible-mode/v1) which works
with standard DashScope keys.
Users with Coding Plan keys or China-region keys can still override
via DASHSCOPE_BASE_URL or config.yaml base_url.
Fixes#3912
* fix: update test to match new DashScope default endpoint
---------
Co-authored-by: kagura-agent <kagura.chen28@gmail.com>
Wire the existing tool_progress_callback through the API server's
streaming handler so Open WebUI users see what tool is running.
Uses the existing 3-arg callback signature (name, preview, args)
that fires at tool start — no changes to run_agent.py needed.
Progress appears as inline markdown in the SSE content stream.
Inspired by PR #4032 by sroecker, reimplemented to avoid breaking
the callback signature used by CLI and gateway consumers.
When context compression fires during run_conversation() in the gateway,
the compressed messages were silently lost on the next turn. Two bugs:
1. Agent-side: _flush_messages_to_session_db() calculated
flush_from = max(len(conversation_history), _last_flushed_db_idx).
After compression, _last_flushed_db_idx was correctly reset to 0,
but conversation_history still had its original pre-compression
length (e.g. 200). Since compressed messages are shorter (~30),
messages[200:] was empty — nothing written to the new session's
SQLite.
Fix: Set conversation_history = None after each _compress_context()
call so start_idx = 0 and all compressed messages are flushed.
2. Gateway-side: history_offset was always len(agent_history) — the
original pre-compression length. After compression shortened the
message list, agent_messages[200:] was empty, causing the gateway
to fall back to writing only a user/assistant pair, losing the
compressed summary and tail context.
Fix: Detect session splits (agent.session_id != original) and set
history_offset = 0 so all compressed messages are written to JSONL.
* fix: treat non-sk-ant- prefixed keys (Azure AI Foundry) as regular API keys, not OAuth tokens
* fix: treat non-sk-ant- keys as regular API keys, not OAuth tokens
_is_oauth_token() returned True for any key not starting with
sk-ant-api, misclassifying Azure AI Foundry keys as OAuth tokens
and sending Bearer auth instead of x-api-key → 401 rejection.
Real Anthropic OAuth tokens all start with sk-ant-oat (confirmed
from live .credentials.json). Non-sk-ant- keys are third-party
provider keys that should use x-api-key.
Test fixtures updated to use realistic sk-ant-oat01- prefixed
tokens instead of fake strings.
Salvaged from PR #4075 by @HangGlidersRule.
---------
Co-authored-by: Clawdbot <clawdbot@openclaw.ai>
After migrating from OpenClaw, leftover workspace directories contain
state files (todo.json, sessions, logs) that confuse the agent — it
discovers them and reads/writes to stale locations instead of the
Hermes state directory, causing issues like cron jobs reading a
different todo list than interactive sessions.
Changes:
- hermes claw migrate now offers to archive the source directory after
successful migration (rename to .pre-migration, not delete)
- New `hermes claw cleanup` subcommand for users who already migrated
and need to archive leftover OpenClaw directories
- Migration notes updated with explicit cleanup guidance
- 42 tests covering all new functionality
Reported by SteveSkedasticity — multiple todo.json files across
~/.hermes/, ~/.openclaw/workspace/, and ~/.openclaw/workspace-assistant/
caused cron jobs to read from wrong locations.
When the Matrix adapter receives encrypted events it can't decrypt
(MegolmEvent), it now:
1. Requests the missing room key from other devices via
client.request_room_key(event) instead of silently dropping the message
2. Buffers undecrypted events (bounded to 100, 5 min TTL) and retries
decryption after each E2EE maintenance cycle when new keys arrive
3. Auto-trusts/verifies all devices after key queries so other clients
share session keys with the bot proactively
4. Exports Megolm keys on disconnect and imports them on connect, so
session keys survive gateway restarts
This addresses the 'could not decrypt event' warnings that caused the
bot to miss messages in encrypted rooms.
* fix: rate-limit pairing rejection messages to prevent spam
When generate_code() returns None (rate limited or max pending), the
"Too many pairing requests" message was sent on every subsequent DM
with no cooldown. A user sending 30 messages would get 30 rejection
replies — reported as potential hack on WhatsApp.
Now check _is_rate_limited() before any pairing response, and record
rate limit after sending a rejection. Subsequent messages from the
same user are silently ignored until the rate limit window expires.
* test: add coverage for pairing response rate limiting
Follow-up to cherry-picked PR #4042 — adds tests verifying:
- Rate-limited users get silently ignored (no response sent)
- Rejection messages record rate limit for subsequent suppression
---------
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
Multiple agents/profiles running 'hermes honcho setup' all wrote to
the shared global ~/.honcho/config.json, overwriting each other's
configuration.
Root cause: _write_config() defaulted to resolve_config_path() which
returns the global path when no instance-local file exists yet (i.e.
on first setup).
Fix: _write_config() now defaults to _local_config_path() which always
returns $HERMES_HOME/honcho.json. Each profile gets its own config file.
Reading still falls back to global for cross-app interop and seeding.
Also updates cmd_setup and cmd_status messaging to show the actual
write path.
Includes 10 new tests verifying profile isolation, global fallback
reads, and multi-profile independence.
MiniMax's /anthropic endpoints implement Anthropic's Messages API but
require Authorization: Bearer instead of x-api-key. Without this fix,
MiniMax users get 401 errors in gateway sessions.
Adds _requires_bearer_auth() to detect MiniMax endpoints and route
through auth_token in the Anthropic SDK. Check runs before OAuth
token detection so MiniMax keys aren't misclassified as setup tokens.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Camofox-browser is a self-hosted Node.js server wrapping Camoufox
(Firefox fork with C++ fingerprint spoofing). When CAMOFOX_URL is set,
all 11 browser tools route through the Camofox REST API instead of
the agent-browser CLI.
Maps 1:1 to the existing browser tool interface:
- Navigate, snapshot, click, type, scroll, back, press, close
- Get images, vision (screenshot + LLM analysis)
- Console (returns empty with note — camofox limitation)
Setup: npm start in camofox-browser dir, or docker run -p 9377:9377
Then: CAMOFOX_URL=http://localhost:9377 in ~/.hermes/.env
Advantages over Browserbase (cloud):
- Free (no per-session API costs)
- Local (zero network latency for browser ops)
- Anti-detection at C++ level (bypasses Cloudflare/Google bot detection)
- Works offline, Docker-ready
Files:
- tools/browser_camofox.py: Full REST backend (~400 lines)
- tools/browser_tool.py: Routing at each tool function
- hermes_cli/config.py: CAMOFOX_URL env var entry
- tests/tools/test_browser_camofox.py: 20 tests
* feat: add /yolo slash command to toggle dangerous command approvals
Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping
all dangerous command approval prompts for the current session. Works in
both CLI and gateway (Telegram, Discord, etc.).
- /yolo -> ON: all commands auto-approved, no confirmation prompts
- /yolo -> OFF: normal approval flow restored
The --yolo CLI flag already existed for launch-time opt-in. This adds
the ability to toggle mid-session without restarting.
Session-scoped — resets when the process ends. Uses the existing
HERMES_YOLO_MODE env var that check_all_command_guards() already
respects.
* fix: prevent context pressure warning spam (agent loop + gateway rate-limit)
Two complementary fixes for repeated context pressure warnings spamming
gateway users (Telegram, Discord, etc.):
1. Agent-level loop fix (run_agent.py):
After compression, only reset _context_pressure_warned if the
post-compression estimate is actually below the 85% warning level.
Previously the flag was unconditionally reset, causing the warning
to re-fire every loop iteration when compression couldn't reduce
below 85% of the threshold (e.g. very low threshold like 15%,
or system prompt alone exceeds the warning level).
2. Gateway-level rate-limit (gateway/run.py, salvaged from PR #3786):
Per-chat_id cooldown of 1 hour on compression warning messages.
Both warning paths ('still large after compression' and 'compression
failed') are gated. Defense-in-depth — even if the agent-level fix
has edge cases, users won't see more than one warning per hour.
Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>
---------
Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>
The AsyncOpenAI client was created once at __init__ and stored as an
instance attribute. process_directory() calls asyncio.run() which creates
and closes a fresh event loop. On a second call, the client's httpx
transport is still bound to the closed loop, raising RuntimeError:
"Event loop is closed" — the same pattern fixed by PR #3398 for the
main agent loop.
Create the client lazily in _get_async_client() so each asyncio.run()
gets a client bound to the current loop.
Co-authored-by: binhnt92 <binhnt.ht.92@gmail.com>
Add timeout parameters to 4 subprocess.run() calls that could hang
indefinitely if the child process blocks (e.g., unresponsive docker
daemon, systemctl waiting for D-Bus):
- doctor.py: docker info (timeout=10), ssh check (timeout=15)
- status.py: systemctl is-active (timeout=5), launchctl list (timeout=5)
Each call site now catches subprocess.TimeoutExpired and treats it as
a failure, consistent with how non-zero return codes are already handled.
Add AST-based regression test that verifies every subprocess.run() call
in CLI modules specifies a timeout keyword argument.
Co-authored-by: dieutx <dangtc94@gmail.com>
ElevenLabs (sk_), Tavily (tvly-), and Exa (exa_) keys were not covered
by _PREFIX_PATTERNS, leaking in plain text via printenv or log output.
Salvaged from PR #3790 by @memosr. Tests rewritten with correct
assertions (original tests had vacuously true checks).
Co-authored-by: memosr <memosr@users.noreply.github.com>
1. matrix voice: _on_room_message_media unconditionally overwrote
media_urls with the image cache path (always None for non-images),
wiping the locally-cached voice path. Now only overrides when
cached_path is truthy.
2. cli_tools_command: /tools disable no longer prompts for confirmation
(input() removed in earlier commit to fix TUI hang), but tests still
expected the old Y/N prompt flow. Updated tests to match current
behavior (direct apply + session reset).
3. slack app_mention: connect() was refactored for multi-workspace
(creates AsyncWebClient per token), but test only mocked the old
self._app.client path. Added AsyncWebClient and acquire_scoped_lock
mocks.
4. website_policy: module-level _cached_policy from earlier tests caused
fast-path return of None. Added invalidate_cache() before assertion.
5. codex 401 refresh: already passing on current main (fixed by
intervening commit).
Skills with scripts/, templates/, and references/ subdirectories need
those files available inside sandboxed execution environments. Previously
the skills directory was missing entirely from remote backends.
Live sync — files stay current as credentials refresh and skills update:
- Docker/Singularity: bind mounts are inherently live (host changes
visible immediately)
- Modal: _sync_files() runs before each command with mtime+size caching,
pushing only changed credential and skill files (~13μs no-op overhead)
- SSH: rsync --safe-links before each command (naturally incremental)
- Daytona: _upload_if_changed() with mtime+size caching before each command
Security — symlink filtering:
- Docker/Singularity: sanitized temp copy when symlinks detected
- Modal/Daytona: iter_skills_files() skips symlinks
- SSH: rsync --safe-links skips symlinks pointing outside source tree
- Temp dir cleanup via atexit + reuse across calls
Non-root user support:
- SSH: detects remote home via echo $HOME, syncs to $HOME/.hermes/
- Daytona: detects sandbox home before sync, uploads to $HOME/.hermes/
- Docker/Modal/Singularity: run as root, /root/.hermes/ is correct
Also:
- credential_files.py: fix name/path key fallback in required_credential_files
- Singularity, SSH, Daytona: gained credential file support
- 14 tests covering symlink filtering, name/path fallback, iter_skills_files
* feat(matrix): support native voice messages
* fix: skip matrix voice tests when matrix-nio not installed
---------
Co-authored-by: Carlos Alberto Pereira Gomes <carlosapgomes@users.noreply.github.com>
- measure status bar display width using prompt_toolkit cell widths
- trim rendered status text when fragments would overflow
- add a final single-fragment fallback to prevent wrapping
- update width assertions to validate display cells instead of len()
Adds lifecycle hooks to the base platform adapter so Discord (and future
platforms) can react to message processing events:
👀 when processing starts
✅ on successful completion (delivery confirmed)
❌ on failure, error, or cancellation
Implementation:
- base.py: on_processing_start/on_processing_complete hooks with
_run_processing_hook error isolation wrapper; delivery tracking
via _record_delivery closure for accurate success detection
- discord.py: _add_reaction/_remove_reaction helpers + hook overrides
- Tests for base hook lifecycle and Discord-specific reactions
Co-authored-by: alanwilhelm <alanwilhelm@users.noreply.github.com>
The ./hermes convenience script still used the legacy Fire-based
cli.main wrapper, which doesn't support subcommands (gateway, cron,
doctor, etc.). The installed 'hermes' command already uses
hermes_cli.main:main (argparse) — this aligns the launcher.
Salvaged from PR #2009 by gito369.
Adds Discord-style mention gating for Telegram groups:
- telegram.require_mention: gate group messages (default: false)
- telegram.mention_patterns: regex wake-word triggers
- telegram.free_response_chats: bypass gating for specific chats
When require_mention is enabled, group messages are accepted only for:
- slash commands
- replies to the bot
- @botusername mentions
- regex wake-word pattern matches
DMs remain unrestricted. @mention text is stripped before passing to
the agent. Invalid regex patterns are ignored with a warning.
Config bridges follow the existing Discord pattern (yaml → env vars).
Cherry-picked and adapted from PR #1977 by mcleay. Fixed ChatType
comparison to work without python-telegram-bot installed (uses string
matching instead of enum, consistent with other entity_type checks).
Co-authored-by: mcleay <mcleay@users.noreply.github.com>
Setup previously only printed a manual install hint for matrix-nio,
causing the gateway to crash with 'matrix-nio not installed' after
configuring Matrix. Now auto-installs matrix-nio (or matrix-nio[e2e]
when E2EE is enabled) using the same uv-first/pip-fallback pattern
as Daytona and Modal backends.
Also adds hermes-agent[matrix] to the [all] extra in pyproject.toml
and a regression test to keep it there.
Co-authored-by: Gutslabs <Gutslabs@users.noreply.github.com>
Co-authored-by: cutepawss <cutepawss@users.noreply.github.com>
When a command timed out, all captured output was discarded — the agent
only saw 'Command timed out after Xs' with zero context. Now returns
the buffered output followed by a timeout marker, matching the existing
interrupt path behavior.
Salvaged from PR #3286 by @binhnt92.
Co-authored-by: nguyen binh <binhnt92@users.noreply.github.com>
Adds WeCom as a gateway platform adapter using the AI Bot WebSocket
gateway for real-time bidirectional communication. No public endpoint
or new pip dependencies needed (uses existing aiohttp + httpx).
Features:
- WebSocket persistent connection with auto-reconnect (exponential backoff)
- DM and group messaging with configurable access policies
- Media upload/download with AES decryption for encrypted attachments
- Markdown rendering, quote context preservation
- Proactive + passive reply message modes
- Chunked media upload pipeline (512KB chunks)
Cherry-picked from PR #1898 by EvilRan with:
- Moved to current main (PR was 300 commits behind)
- Skipped base.py regressions (reply_to additions are good but belong
in a separate PR since they affect all platforms)
- Fixed test assertions to match current base class send() signature
(reply_to=None kwarg now explicit)
- All 16 integration points added surgically to current main
- No new pip dependencies (aiohttp + httpx already installed)
Fixes#1898
Co-authored-by: EvilRan <EvilRan@users.noreply.github.com>
Cron jobs configured with deliver labels from send_message(action='list')
like 'whatsapp:Alice (dm)' passed the label as a literal chat_id.
WhatsApp bridge failed with jidDecode error since 'Alice (dm)' isn't
a valid JID.
Now _resolve_delivery_target() strips display suffixes like ' (dm)' and
resolves human-friendly names via the channel directory before using
them. Raw IDs pass through unchanged when the directory has no match.
Fixes#1945.
Previously, when no API keys or provider credentials were found, Hermes
silently defaulted to OpenRouter + Claude Opus. This caused confusion
when users configured local servers (LM Studio, Ollama, etc.) with a
typo or unrecognized provider name — the system would silently route to
OpenRouter instead of telling them something was wrong.
Changes:
- resolve_provider() now raises AuthError when no credentials are found
instead of returning 'openrouter' as a silent fallback
- Added local server aliases: lmstudio, ollama, vllm, llamacpp → custom
- Removed hardcoded 'anthropic/claude-opus-4.6' fallback from gateway
and cron scheduler (they read from config.yaml instead)
- Updated cli-config.yaml.example with complete provider documentation
including all supported providers, aliases, and local server setup
Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely. This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.
The main CLI already had this fix (PR #2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.
Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
for explicit_base_url without explicit_api_key
Updates 2 tests that expected the old (broken) behavior.
Three safety gaps in vision_analyze_tool:
1. Local files accepted without checking if they're actually images —
a renamed text file would get base64-encoded and sent to the model.
Now validates magic bytes (PNG, JPEG, GIF, BMP, WebP, SVG).
2. No website policy enforcement on image URLs — blocked domains could
be fetched via the vision tool. Now checks before download.
3. No redirect check — if an allowed URL redirected to a blocked domain,
the download would proceed. Now re-checks the final URL.
Fixed one test that needed _validate_image_url mocked to bypass DNS
resolution on the fake blocked.test domain (is_safe_url does DNS
checks that were added after the original PR).
Co-authored-by: GutSlabs <GutSlabs@users.noreply.github.com>
Validate category names in _create_skill() before using them as
filesystem path segments. Previously, categories like '../escape' or
'/tmp/pwned' could write skill files outside ~/.hermes/skills/.
Adds _validate_category() that rejects slashes, backslashes, absolute
paths, and non-alphanumeric characters (reuses existing VALID_NAME_RE).
Tests: 5 new tests for traversal, absolute paths, and valid categories.
Salvaged from PR #1939 by Gutslabs.
test_hooks.py (7 failures): Built-in boot-md hook was always loaded
by _register_builtin_hooks(), adding +1 to every expected hook count.
Mock out built-in registration in TestDiscoverAndLoad so tests isolate
user-hook discovery logic.
test_tool_token_estimation.py (2 failures): tiktoken is not in
core/[all] dependencies. The estimation function gracefully returns {}
when tiktoken is missing, but tests expected non-empty results. Added
skipif markers for tests that need tiktoken.
test_plugins_cmd.py (1 failure): bare 'hermes plugins' now dispatches
to cmd_toggle() (interactive curses UI) instead of cmd_list(). Updated
test to match the new behavior.
WhatsApp DMs can arrive with LID sender IDs even when
WHATSAPP_ALLOWED_USERS is configured with phone numbers. The allowlist
check now reads bridge session mapping files (lid-mapping-*.json) to
resolve phone↔LID aliases, matching users regardless of which
identifier format the message uses.
Both the Python gateway (_is_user_authorized) and the Node bridge
(allowlist.js) now share the same mapping-file-based resolution logic.
Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>
Adds Feishu (ByteDance's enterprise messaging platform) as a gateway
platform adapter with full feature parity: WebSocket + webhook transports,
message batching, dedup, rate limiting, rich post/card content parsing,
media handling (images/audio/files/video), group @mention gating,
reaction routing, and interactive card button support.
Cherry-picked from PR #1793 by penwyp with:
- Moved to current main (PR was 458 commits behind)
- Fixed _send_with_retry shadowing BasePlatformAdapter method (renamed to
_feishu_send_with_retry to avoid signature mismatch crash)
- Fixed import structure: aiohttp/websockets imported independently of
lark_oapi so they remain available when SDK is missing
- Fixed get_hermes_home import (hermes_constants, not hermes_cli.config)
- Added skip decorators for tests requiring lark_oapi SDK
- All 16 integration points added surgically to current main
New dependency: lark-oapi>=1.5.3,<2 (optional, pip install hermes-agent[feishu])
Fixes#1788
Co-authored-by: penwyp <penwyp@users.noreply.github.com>
* feat(skills): add memento-flashcards skill
* docs(skills): clarify memento-flashcards interaction model
* fix: use HERMES_HOME env var for profile-safe data path
---------
Co-authored-by: Magnus Ahmad <magnus.ahmad@gmail.com>
Adds a config option to suppress the header/footer text that wraps
cron job responses when delivered to messaging platforms.
Set cron.wrap_response: false in config.yaml for clean output without
the 'Cronjob Response: <name>' header and 'The agent cannot see this
message' footer. Default is true (preserves current behavior).
Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.
Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
when gateway runs via [messaging] extras)
Salvaged from PR #1851 by Himess. The _poll_task storage was already
on main from PR #3267; this adds the disconnect cancellation and the
persistent session.
Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
Extends the single fallback_model mechanism into an ordered chain.
When the primary model fails, Hermes tries each fallback provider in
sequence until one succeeds or the chain is exhausted.
Config format (new):
fallback_providers:
- provider: openrouter
model: anthropic/claude-sonnet-4
- provider: openai
model: gpt-4o
Legacy single-dict fallback_model format still works unchanged.
Key fix vs original PR: the call sites in the retry loop now use
_fallback_index < len(_fallback_chain) instead of the old one-shot
_fallback_activated guard, so the chain actually advances through
all configured providers.
Changes:
- run_agent.py: _fallback_chain list + _fallback_index replaces
one-shot _fallback_model; _try_activate_fallback() advances
through chain; failed provider resolution skips to next entry;
call sites updated to allow chain advancement
- cli.py: reads fallback_providers with legacy fallback_model compat
- gateway/run.py: same
- hermes_cli/config.py: fallback_providers: [] in DEFAULT_CONFIG
- tests: 12 new chain tests + 6 existing test fixtures updated
Co-authored-by: uzaylisak <uzaylisak@users.noreply.github.com>
The honcho check_fn only checked runtime session state, which isn't
set until the agent initializes. At banner time, honcho tools showed
as red/disabled even when properly configured.
Now checks configuration (enabled + api_key/base_url) as a fallback
when the session context isn't active yet. Fast path (session active)
unchanged; slow path (config check) only runs at banner time.
Adds 4 tests covering: session active, configured but no session,
not configured, and import failure graceful fallback.
Closes#1843.
When a connected MCP server sends a ToolListChangedNotification (per the
MCP spec), Hermes now automatically re-fetches the tool list, deregisters
removed tools, and registers new ones — without requiring a restart.
This enables MCP servers with dynamic toolsets (e.g. GitHub MCP with
GITHUB_DYNAMIC_TOOLSETS=1) to add/remove tools at runtime.
Changes:
- registry.py: add ToolRegistry.deregister() for nuke-and-repave refresh
- mcp_tool.py: extract _register_server_tools() from
_discover_and_register_server() as a shared helper for both initial
discovery and dynamic refresh
- mcp_tool.py: add _make_message_handler() and _refresh_tools() on
MCPServerTask, wired into all 3 ClientSession sites (stdio, new HTTP,
deprecated HTTP)
- Graceful degradation: silently falls back to static discovery when the
MCP SDK lacks notification types or message_handler support
- 8 new tests covering registration, refresh, handler dispatch, and
deregister
Salvaged from PR #1794 by shivvor2.
Home channel env vars (SLACK_HOME_CHANNEL, SIGNAL_HOME_CHANNEL, etc.)
for Slack, Signal, Mattermost, Matrix, Email, and SMS were nested
inside the credential-env blocks, so they were ignored when the
platform was already configured via config.yaml.
Moved the home channel handling outside the credential blocks with a
Platform.X in config.platforms guard, matching the existing pattern
for Telegram and Discord.
Co-authored-by: cutepawss <cutepawss@users.noreply.github.com>