The auxiliary client's auto-detection chain was a black box — when
compression, summarization, or memory flush failed, the only clue was
a generic 'Request timed out' with no indication of which provider was
tried or why it was skipped.
Now logs at INFO level:
- 'Auxiliary auto-detect: using local/custom (qwen3.5-9b) — skipped:
openrouter, nous' when auto-detection picks a provider
- 'Auxiliary compression: using auto (qwen3.5-9b) at http://localhost:11434/v1'
before each auxiliary call
- 'Auxiliary compression: provider custom unavailable, falling back to
openrouter' on fallback
- Clear warning with actionable guidance when NO provider is available:
'Set OPENROUTER_API_KEY or configure a local model in config.yaml'
The existing docs were two lines. The migration script handles 35
categories of data across persona, memory, skills, messaging platforms,
model providers, MCP servers, agent config, and more.
New docs cover:
- All CLI options (--dry-run, --preset, --overwrite, --migrate-secrets,
--source, --workspace-target, --skill-conflict, --yes)
- 27 directly-imported categories with source → destination mapping
- 7 archived categories with manual recreation guidance
- Security notes on API key allowlisting
- Usage examples for common migration scenarios
Cron jobs configured with deliver labels from send_message(action='list')
like 'whatsapp:Alice (dm)' passed the label as a literal chat_id.
WhatsApp bridge failed with jidDecode error since 'Alice (dm)' isn't
a valid JID.
Now _resolve_delivery_target() strips display suffixes like ' (dm)' and
resolves human-friendly names via the channel directory before using
them. Raw IDs pass through unchanged when the directory has no match.
Fixes#1945.
Previously, when no API keys or provider credentials were found, Hermes
silently defaulted to OpenRouter + Claude Opus. This caused confusion
when users configured local servers (LM Studio, Ollama, etc.) with a
typo or unrecognized provider name — the system would silently route to
OpenRouter instead of telling them something was wrong.
Changes:
- resolve_provider() now raises AuthError when no credentials are found
instead of returning 'openrouter' as a silent fallback
- Added local server aliases: lmstudio, ollama, vllm, llamacpp → custom
- Removed hardcoded 'anthropic/claude-opus-4.6' fallback from gateway
and cron scheduler (they read from config.yaml instead)
- Updated cli-config.yaml.example with complete provider documentation
including all supported providers, aliases, and local server setup
Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely. This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.
The main CLI already had this fix (PR #2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.
Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
for explicit_base_url without explicit_api_key
Updates 2 tests that expected the old (broken) behavior.
The model picker for API-key providers (MiniMax, z.ai, etc.) probes
the live /models endpoint when the curated list has fewer than 8
models. When the live endpoint returns fewer models than the curated
list (e.g. MiniMax's Anthropic-compatible endpoint doesn't list M2.7),
the incomplete live list was used instead.
Now falls back to the curated list when live returns fewer models,
ensuring new models like MiniMax-M2.7 always appear in the picker.
Three safety gaps in vision_analyze_tool:
1. Local files accepted without checking if they're actually images —
a renamed text file would get base64-encoded and sent to the model.
Now validates magic bytes (PNG, JPEG, GIF, BMP, WebP, SVG).
2. No website policy enforcement on image URLs — blocked domains could
be fetched via the vision tool. Now checks before download.
3. No redirect check — if an allowed URL redirected to a blocked domain,
the download would proceed. Now re-checks the final URL.
Fixed one test that needed _validate_image_url mocked to bypass DNS
resolution on the fake blocked.test domain (is_safe_url does DNS
checks that were added after the original PR).
Co-authored-by: GutSlabs <GutSlabs@users.noreply.github.com>
* fix: add gpt-5.4-mini to Codex fallback catalog
* fix(telegram): gracefully handle deleted reply targets
When a user deletes their message while Hermes is processing, Telegram
returns BadRequest 'Message to be replied not found'. Previously this
was an unhandled permanent error causing silent delivery failure.
Now clears reply_to_id and retries so the response is still delivered,
matching the existing 'thread not found' recovery pattern.
Inspired by PR #3231 by @heathley. Fixes#3229.
---------
Co-authored-by: Clippy <clippy@grads.flow>
Co-authored-by: Nigel Gibbs <heathley@users.noreply.github.com>
PR #3726 fixed stale codex_responses persisting when switching providers
by hardcoding api_mode=chat_completions in 5 model flows. This broke
MiniMax, MiniMax-CN, and Alibaba which use /anthropic endpoints that
need anthropic_messages — the hardcoded value overrides the URL-based
auto-detection in runtime_provider.py.
Fix: pop api_mode from config in the 3 URL-dependent flows (custom
endpoint, Kimi, api_key_provider) instead of hardcoding. The runtime
resolver already correctly auto-detects api_mode from the base_url
suffix (/anthropic -> anthropic_messages, else chat_completions).
OpenRouter and Copilot ACP flows keep the explicit value since their
api_mode is always known.
Reported by stefan171.
Validate category names in _create_skill() before using them as
filesystem path segments. Previously, categories like '../escape' or
'/tmp/pwned' could write skill files outside ~/.hermes/skills/.
Adds _validate_category() that rejects slashes, backslashes, absolute
paths, and non-alphanumeric characters (reuses existing VALID_NAME_RE).
Tests: 5 new tests for traversal, absolute paths, and valid categories.
Salvaged from PR #1939 by Gutslabs.
test_hooks.py (7 failures): Built-in boot-md hook was always loaded
by _register_builtin_hooks(), adding +1 to every expected hook count.
Mock out built-in registration in TestDiscoverAndLoad so tests isolate
user-hook discovery logic.
test_tool_token_estimation.py (2 failures): tiktoken is not in
core/[all] dependencies. The estimation function gracefully returns {}
when tiktoken is missing, but tests expected non-empty results. Added
skipif markers for tests that need tiktoken.
test_plugins_cmd.py (1 failure): bare 'hermes plugins' now dispatches
to cmd_toggle() (interactive curses UI) instead of cmd_list(). Updated
test to match the new behavior.
WhatsApp DMs can arrive with LID sender IDs even when
WHATSAPP_ALLOWED_USERS is configured with phone numbers. The allowlist
check now reads bridge session mapping files (lid-mapping-*.json) to
resolve phone↔LID aliases, matching users regardless of which
identifier format the message uses.
Both the Python gateway (_is_user_authorized) and the Node bridge
(allowlist.js) now share the same mapping-file-based resolution logic.
Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>
When a subagent hit max_iterations, status was always 'failed' even
if it produced a usable summary via _handle_max_iterations(). This
happened because the status check required both completed=True AND
a summary, but completed is False whenever max_iterations is reached
(run_agent.py line 7969).
Now gates status on whether a summary was produced — if the subagent
returned a final_response, the parent has usable output regardless of
iteration budget. The exit_reason field already distinguishes
'completed' vs 'max_iterations' for anything that needs to know how
the task ended.
Closes#1899.
When stdout is closed (piped to a dead process, broken terminal),
Python raises ValueError('I/O operation on closed file'), not OSError.
_safe_print and the API error printer only caught OSError, letting the
ValueError propagate and crash the agent.
Salvaged from PR #3760 by @apexscaleai. Fixes#3534.
Co-authored-by: apexscaleai <apexscaleai@users.noreply.github.com>
Adds Feishu (ByteDance's enterprise messaging platform) as a gateway
platform adapter with full feature parity: WebSocket + webhook transports,
message batching, dedup, rate limiting, rich post/card content parsing,
media handling (images/audio/files/video), group @mention gating,
reaction routing, and interactive card button support.
Cherry-picked from PR #1793 by penwyp with:
- Moved to current main (PR was 458 commits behind)
- Fixed _send_with_retry shadowing BasePlatformAdapter method (renamed to
_feishu_send_with_retry to avoid signature mismatch crash)
- Fixed import structure: aiohttp/websockets imported independently of
lark_oapi so they remain available when SDK is missing
- Fixed get_hermes_home import (hermes_constants, not hermes_cli.config)
- Added skip decorators for tests requiring lark_oapi SDK
- All 16 integration points added surgically to current main
New dependency: lark-oapi>=1.5.3,<2 (optional, pip install hermes-agent[feishu])
Fixes#1788
Co-authored-by: penwyp <penwyp@users.noreply.github.com>
Tool call previews (paths, commands, queries) were hardcoded to truncate
at 35-40 chars across CLI spinners, completion lines, and gateway progress
messages. Users could not see full file paths in tool output.
New config option: display.tool_preview_length (default 0 = no limit).
Set a positive number to truncate at that length.
Changes:
- display.py: module-level _tool_preview_max_len with getter/setter;
build_tool_preview() and get_cute_tool_message() _trunc/_path respect it
- cli.py: reads config at startup, spinner widget respects config
- gateway/run.py: reads config per-message, progress callback respects config
- run_agent.py: removed redundant 30-char quiet-mode spinner truncation
- config.py: added display.tool_preview_length to DEFAULT_CONFIG
Reported by kriskaminski
The write-deny list in file_operations.py hardcoded ~/.hermes/.env,
which misses the actual .env in custom HERMES_HOME or profile setups.
Use get_hermes_home() for profile-safe path resolution.
Salvaged from PR #3232 by @erhnysr.
Co-authored-by: Erhnysr <erhnysr@users.noreply.github.com>
When compression creates a child session with a new session_id,
session_log_file was still pointing to the old session's JSON file.
This caused _save_session_log() to write new data to the wrong file.
Closes#3731.
Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>
Adds a songwriting craft and AI music prompt engineering skill covering
song structure, rhyme/meter, emotional arcs, Suno metatag reference,
phonetic tricks for AI singers, parody adaptation, and production workflow.
Complements existing music skills (heartmula, audiocraft, songsee) which
cover model setup/usage — this one covers the creative process itself.
Also removes the empty skills/music-creation/ category (only had a
DESCRIPTION.md, no actual skills).
Co-authored-by: 123mikeyd <123mikeyd@users.noreply.github.com>
Tools from check_fn-gated toolsets (honcho, homeassistant) showed as
red (disabled) in the startup banner even when properly configured.
This happened because check_fn runs lazily after session context is
set, but the banner renders before agent init.
Now distinguishes three states:
- red: truly unavailable (missing env var, no API key)
- yellow: lazy-initialized (check_fn pending, will activate on use)
- normal: available and ready
Only the banner fix was salvaged from the original PR; unrelated
bundled changes (context_compressor, STT config, auth default_model,
SessionResetPolicy) were discarded.
Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>
* feat(skills): add memento-flashcards skill
* docs(skills): clarify memento-flashcards interaction model
* fix: use HERMES_HOME env var for profile-safe data path
---------
Co-authored-by: Magnus Ahmad <magnus.ahmad@gmail.com>
Adds a config option to suppress the header/footer text that wraps
cron job responses when delivered to messaging platforms.
Set cron.wrap_response: false in config.yaml for clean output without
the 'Cronjob Response: <name>' header and 'The agent cannot see this
message' footer. Default is true (preserves current behavior).
Replace per-request aiohttp.ClientSession() in every WhatsApp adapter
method with a single persistent self._http_session, matching the pattern
used by Mattermost, HomeAssistant, and SMS adapters.
Changes:
- Create self._http_session in connect(), close in disconnect()
- All bridge HTTP calls (send, edit, send-media, typing, get_chat_info,
poll_messages) now use the shared session
- Explicitly cancel _poll_task on disconnect() instead of relying
solely on self._running = False
- Health-check sessions in connect() remain ephemeral (persistent
session not yet created at that point)
- Remove per-method ImportError guards for aiohttp (always available
when gateway runs via [messaging] extras)
Salvaged from PR #1851 by Himess. The _poll_task storage was already
on main from PR #3267; this adds the disconnect cancellation and the
persistent session.
Tests: 4 new tests for session close, already-closed skip, poll task
cancellation, and done-task skip.
Third report of gateway crashing with:
ImportError: cannot import name 'get_hermes_home' from 'hermes_constants'
Root cause: stale .pyc bytecode files survive code updates. When Python
loads a cached .pyc that references names from the old source, the import
fails and the gateway won't start.
Two bugs fixed:
1. Git update path: no cache clearing at all after git pull
2. ZIP update path: __pycache__ was explicitly in the preserve set
Added _clear_bytecode_cache() helper that removes all __pycache__ dirs
under PROJECT_ROOT (skipping venv/node_modules/.git/.worktrees). Called
in both git and ZIP update paths, before pip install.
Some providers (Fireworks AI) reject tools=null, and others (Anthropic)
reject tools=[]. The safest approach is to not include the key at all
when there are no tools — the OpenAI SDK treats a missing parameter as
NOT_GIVEN and omits it from the request entirely.
Inspired by PR #3736 (@kelsia14).
Extends the single fallback_model mechanism into an ordered chain.
When the primary model fails, Hermes tries each fallback provider in
sequence until one succeeds or the chain is exhausted.
Config format (new):
fallback_providers:
- provider: openrouter
model: anthropic/claude-sonnet-4
- provider: openai
model: gpt-4o
Legacy single-dict fallback_model format still works unchanged.
Key fix vs original PR: the call sites in the retry loop now use
_fallback_index < len(_fallback_chain) instead of the old one-shot
_fallback_activated guard, so the chain actually advances through
all configured providers.
Changes:
- run_agent.py: _fallback_chain list + _fallback_index replaces
one-shot _fallback_model; _try_activate_fallback() advances
through chain; failed provider resolution skips to next entry;
call sites updated to allow chain advancement
- cli.py: reads fallback_providers with legacy fallback_model compat
- gateway/run.py: same
- hermes_cli/config.py: fallback_providers: [] in DEFAULT_CONFIG
- tests: 12 new chain tests + 6 existing test fixtures updated
Co-authored-by: uzaylisak <uzaylisak@users.noreply.github.com>
The honcho check_fn only checked runtime session state, which isn't
set until the agent initializes. At banner time, honcho tools showed
as red/disabled even when properly configured.
Now checks configuration (enabled + api_key/base_url) as a fallback
when the session context isn't active yet. Fast path (session active)
unchanged; slow path (config check) only runs at banner time.
Adds 4 tests covering: session active, configured but no session,
not configured, and import failure graceful fallback.
Closes#1843.
When a connected MCP server sends a ToolListChangedNotification (per the
MCP spec), Hermes now automatically re-fetches the tool list, deregisters
removed tools, and registers new ones — without requiring a restart.
This enables MCP servers with dynamic toolsets (e.g. GitHub MCP with
GITHUB_DYNAMIC_TOOLSETS=1) to add/remove tools at runtime.
Changes:
- registry.py: add ToolRegistry.deregister() for nuke-and-repave refresh
- mcp_tool.py: extract _register_server_tools() from
_discover_and_register_server() as a shared helper for both initial
discovery and dynamic refresh
- mcp_tool.py: add _make_message_handler() and _refresh_tools() on
MCPServerTask, wired into all 3 ClientSession sites (stdio, new HTTP,
deprecated HTTP)
- Graceful degradation: silently falls back to static discovery when the
MCP SDK lacks notification types or message_handler support
- 8 new tests covering registration, refresh, handler dispatch, and
deregister
Salvaged from PR #1794 by shivvor2.
When a tool plugin registers a schema without an explicit 'name' key,
get_definitions() crashes with KeyError:
available_tool_names = {t["function"]["name"] for t in filtered_tools}
Fix: always merge entry.name into schema so 'name' is never missing.
Refs: #3729
Co-authored-by: ekkoitac <ekko.itac@gmail.com>
Home channel env vars (SLACK_HOME_CHANNEL, SIGNAL_HOME_CHANNEL, etc.)
for Slack, Signal, Mattermost, Matrix, Email, and SMS were nested
inside the credential-env blocks, so they were ignored when the
platform was already configured via config.yaml.
Moved the home channel handling outside the credential blocks with a
Platform.X in config.platforms guard, matching the existing pattern
for Telegram and Discord.
Co-authored-by: cutepawss <cutepawss@users.noreply.github.com>
When hitting rate limits (429), the agent now:
- Extracts the Retry-After header from the provider response and uses it
as the wait time instead of blind exponential backoff (capped at 120s)
- Shows rate-limit-specific messaging: 'Rate limit reached. Waiting Xs
before retry (attempt N/M)...'
- Shows a distinct exhaustion message: 'Rate limit persisted after N
retries. Please try again later.'
Non-429 errors keep the existing exponential backoff and generic messaging.
Co-authored-by: ygd58 <ygd58@users.noreply.github.com>
Adds a Ctrl+Z key binding to suspend the hermes CLI to background
using standard Unix job control. Uses prompt_toolkit's run_in_terminal()
to properly save/restore terminal state, then sends SIGTSTP to the
process group. Prints a branded message with resume instructions.
Shows a not-supported notice on Windows.
Co-authored-by: CharlieKerfoot <CharlieKerfoot@users.noreply.github.com>
hermes mcp serve starts a stdio MCP server that lets any MCP client
(Claude Code, Cursor, Codex, etc.) interact with Hermes conversations.
Matches OpenClaw's 9-tool channel bridge surface:
Tools exposed:
- conversations_list: list active sessions across all platforms
- conversation_get: details on one conversation
- messages_read: read message history
- attachments_fetch: extract non-text content from messages
- events_poll: poll for new events since a cursor
- events_wait: long-poll / block until next event (near-real-time)
- messages_send: send to any platform via send_message_tool
- channels_list: browse available messaging targets
- permissions_list_open: list pending approval requests
- permissions_respond: allow/deny approvals
Architecture:
- EventBridge: background thread polls SessionDB for new messages,
maintains in-memory event queue with waiter support
- Reads sessions.json + SessionDB directly (no gateway dep for reads)
- Reuses send_message_tool for sending (same platform adapters)
- FastMCP server with stdio transport
- Zero new dependencies (uses existing mcp>=1.2.0 optional dep)
Files:
- mcp_serve.py: MCP server + EventBridge (~600 lines)
- hermes_cli/main.py: added serve sub-parser to hermes mcp
- hermes_cli/mcp_config.py: route serve action to run_mcp_server
- tests/test_mcp_serve.py: 53 tests
- docs: updated MCP page + CLI commands reference
* Add new Gemini 3.1 model entries to models.py
* fix: also add Gemini 3.1 models to nous provider list
---------
Co-authored-by: Andrei Ignat <andrei@ignat.se>
SMTP connections in _send_email() and _send_email_with_attachment() leak
when login() or send_message() raises before quit() is reached. Both now
wrapped in try/finally with a close() fallback if quit() also fails.
IMAP connection in _fetch_new_messages() leaks when UID processing raises,
since logout() sits after the loop. Restructured with try/finally so
logout() runs unconditionally.
Co-authored-by: Himess <Himess@users.noreply.github.com>
* feat: show estimated tool token context in hermes tools checklist
Adds a live token estimate indicator to the bottom of the interactive
tool configuration checklist (hermes tools / hermes setup). As users
toggle toolsets on/off, the total estimated context cost updates in
real time.
Implementation:
- tools/registry.py: Add get_schema() for check_fn-free schema access
- hermes_cli/curses_ui.py: Add optional status_fn callback to
curses_checklist — renders at bottom-right of terminal, stays fixed
while items scroll
- hermes_cli/tools_config.py: Add _estimate_tool_tokens() using
tiktoken (cl100k_base, already installed) to count tokens in the
JSON-serialised OpenAI-format tool schemas. Results are cached
per-process. The status function deduplicates overlapping tools
(e.g. browser includes web_search) for accurate totals.
- 12 new tests covering estimation, caching, graceful degradation
when tiktoken is unavailable, status_fn wiring, deduplication,
and the numbered fallback display
* fix: use effective toolsets (includes plugins) for token estimation index mapping
The status_fn closure built ts_keys from CONFIGURABLE_TOOLSETS but the
checklist uses _get_effective_configurable_toolsets() which appends plugin
toolsets. With plugins present, the indices would mismatch, causing
IndexError when selecting a plugin toolset.
Commit ed27b826 introduced patch-tool redaction corruption that:
- Replaced max_token_length=16000 with max_token_length=***
- Truncated api_key=os.getenv(...) to api_key=os.get...EY
- Truncated tokenizer_name to NousRe...1-8B
- Deleted 409 lines including _run_tests(), _eval_with_timeout(),
evaluate(), wandb_log(), and the __main__ entry point
Restores the file from pre-corruption state (ed27b826^) and re-applies
the two legitimate changes from subsequent commits:
- eval_concurrency config field (from ed27b826)
- docker_image registration in register_task_env_overrides (from ed27b826)
- ManagedServer branching for vLLM/SGLang backends (from 13f54596)
Closes#1737, #1740.
Replace all 5 plain open(config_path, 'w') calls in gateway command
handlers with atomic_yaml_write() from utils.py. This uses the
established tempfile + fsync + os.replace pattern to ensure config.yaml
is never left half-written if the process is killed mid-write.
Affected handlers: /personality (clear + set), /sethome, /reasoning
(_save_config_key helper), /verbose (tool_progress cycling).
Also fixes missing encoding='utf-8' on the /personality clear write.
Salvaged from PR #1211 by albatrosjj.
Replace all 5 plain open(config_path, 'w') calls in gateway command
handlers with atomic_yaml_write() from utils.py. This uses the
established tempfile + fsync + os.replace pattern to ensure config.yaml
is never left half-written if the process is killed mid-write.
Affected handlers: /personality (clear + set), /sethome, /reasoning
(_save_config_key helper), /verbose (tool_progress cycling).
Also fixes missing encoding='utf-8' on the /personality clear write.
Salvaged from PR #1211 by albatrosjj.
Adds a Canvas LMS integration skill under optional-skills/productivity/canvas/
with a Python CLI wrapper (canvas_api.py) for listing courses and assignments
via personal access token auth.
Cherry-picked from PR #1250 by Alicorn-Max-S with:
- Moved from skills/ to optional-skills/ (niche educational integration)
- Fixed hardcoded ~/.hermes/ path to use $HERMES_HOME
- Removed Canvas env vars from .env.example (optional skill)
- Cleaned stale 'mini-swe-agent backend' reference from .env.example header
Co-authored-by: Alicorn-Max-S <Alicorn-Max-S@users.noreply.github.com>
Adds a structured 1-3-1 decision-making framework as an optional skill.
Produces: one problem statement, three options with trade-offs, one
recommendation with definition of done and implementation plan.
Moved to optional-skills/ (niche communication framework, not broadly
needed by default). Improved description with clearer trigger conditions
and replaced implementation-specific example with a generic one.
Based on PR #1262 by Willardgmoore.
Co-authored-by: Willard Moore <willardgmoore@users.noreply.github.com>
* fix(tools): implement send_message routing for Matrix, Mattermost, HomeAssistant, DingTalk
Matrix, Mattermost, HomeAssistant, and DingTalk were present in
platform_map but fell through to the "not yet implemented" else branch,
causing send_message tool calls to silently fail on these platforms.
Add four async sender functions:
- _send_mattermost: POST /api/v4/posts via Mattermost REST API
- _send_matrix: PUT /_matrix/client/v3/rooms/.../send via Matrix CS API
- _send_homeassistant: POST /api/services/notify/notify via HA REST API
- _send_dingtalk: POST to session webhook URL
Add routing in _send_to_platform() and 17 unit tests covering success,
HTTP errors, missing config, env var fallback, and Matrix txn_id uniqueness.
* fix: pass platform tokens explicitly to Mattermost/Matrix/HA senders
The original PR passed pconfig.extra to sender functions, but tokens
live at pconfig.token (not in extra). This caused the senders to always
fall through to env var lookup instead of using the gateway-resolved
token.
Changes:
- Mattermost/Matrix/HA: accept token as first arg, matching the
Telegram/Discord/Slack sender pattern
- DingTalk: add DINGTALK_WEBHOOK_URL env var fallback + docstring
explaining the session-webhook vs robot-webhook difference
- Tests updated for new signatures + new DingTalk env var test
---------
Co-authored-by: sprmn24 <oncuevtv@gmail.com>
When a user runs 'hermes update', the Python process caches old modules
in sys.modules. After git pull updates files on disk, lazy imports of
newly-updated modules fail because they try to import display_hermes_home
from the cached (old) hermes_constants which doesn't have the function.
This specifically broke the gateway auto-restart in cmd_update — importing
hermes_cli/gateway.py triggered the top-level 'from hermes_constants
import display_hermes_home' against the cached old module. The ImportError
was silently caught, so the gateway was never restarted after update.
Users with a running gateway then hit the ImportError on their next
Telegram/Discord message when the stale gateway process lazily loaded
run_agent.py (new version) which also had the top-level import.
Fixes:
- hermes_cli/gateway.py: lazy import at call site (line 940)
- run_agent.py: lazy import at call site (line 6927)
- tools/terminal_tool.py: lazy imports at 3 call sites
- tools/tts_tool.py: static schema string (no module-level call)
- hermes_cli/auth.py: lazy import at call site (line 2024)
- hermes_cli/main.py: reload hermes_constants after git pull in cmd_update
Also fixes 4 pre-existing test failures in test_parse_env_var caused by
NameError on display_hermes_home in terminal_tool.py.
Community review (devoruncommented) correctly identified that the Slack
adapter re-read SLACK_APP_TOKEN from os.getenv() during disconnect,
which could differ from the value used during connect if the environment
changed. Discord had the same pattern with self.config.token (less risky
but still not bulletproof).
Both now follow the Telegram pattern: store the token identity on self
at acquire time, use the stored value for release, clear after release.
Also fixes docs: alias naming was hermes-<name> in docs but actual
implementation creates <name> directly (e.g. ~/.local/bin/coder not
~/.local/bin/hermes-coder).
The docs incorrectly showed aliases as 'hermes-work' when the actual
implementation creates 'work' (profile name directly, no prefix).
Rewrote the user guide to lead with the alias pattern:
hermes profile create coder → coder chat, coder setup, etc.
Also clarified that the banner shows 'Profile: coder' and the prompt
shows 'coder ❯' when a non-default profile is active.
Fixed alias paths in command reference (hermes-work → work).