- Default enabled: false (zero overhead when not configured)
- Fast path: cached disabled state skips all work immediately
- TTL cache (30s) for parsed policy — avoids re-reading config.yaml
on every URL check
- Missing shared files warn + skip instead of crashing all web tools
- Lazy yaml import — missing PyYAML doesn't break browser toolset
- Guarded browser_tool import — fail-open lambda fallback
- check_website_access never raises for default path (fail-open with
warning log); only raises with explicit config_path (test mode)
- Simplified enforcement code in web_tools/browser_tool — no more
try/except wrappers since errors are handled internally
Add DingTalk as a messaging platform using the dingtalk-stream SDK
for real-time message reception via Stream Mode (no webhook needed).
Replies are sent via session webhook using markdown format.
Features:
- Stream Mode connection (long-lived WebSocket, no public URL needed)
- Text and rich text message support
- DM and group chat support
- Message deduplication with 5-minute window
- Auto-reconnection with exponential backoff
- Session webhook caching for reply routing
Configuration:
export DINGTALK_CLIENT_ID=your-app-key
export DINGTALK_CLIENT_SECRET=your-app-secret
# or in config.yaml:
platforms:
dingtalk:
enabled: true
extra:
client_id: your-app-key
client_secret: your-app-secret
Files:
- gateway/platforms/dingtalk.py (340 lines) — adapter implementation
- gateway/config.py — add DINGTALK to Platform enum
- gateway/run.py — add DingTalk to _create_adapter
- hermes_cli/config.py — add env vars to _EXTRA_ENV_KEYS
- hermes_cli/tools_config.py — add dingtalk to PLATFORMS
- tests/gateway/test_dingtalk.py — 21 tests
Add inference.sh CLI (infsh) as a tool integration, giving agents
access to 150+ AI apps through a single CLI — image gen (FLUX, Reve,
Seedream), video (Veo, Wan, Seedance), LLMs, search (Tavily, Exa),
3D, avatar/lipsync, and more. One API key manages all services.
Tools:
- infsh: run any infsh CLI command (app list, app run, etc.)
- infsh_install: install the CLI if not present
Registered as an 'inference' toolset (opt-in, not in core tools).
Includes comprehensive skill docs with examples for all app categories.
Changes from original PR:
- NOT added to _HERMES_CORE_TOOLS (available via --toolsets inference)
- Added 12 tests covering tool registration, command execution,
error handling, timeout, JSON parsing, and install flow
Inspired by PR #1021 by @okaris.
Co-authored-by: okaris <okaris@users.noreply.github.com>
* fix: thread safety for concurrent subagent delegation
Four thread-safety fixes that prevent crashes and data races when
running multiple subagents concurrently via delegate_task:
1. Remove redirect_stdout/stderr from delegate_tool — mutating global
sys.stdout races with the spinner thread when multiple children start
concurrently, causing segfaults. Children already run with
quiet_mode=True so the redirect was redundant.
2. Split _run_single_child into _build_child_agent (main thread) +
_run_single_child (worker thread). AIAgent construction creates
httpx/SSL clients which are not thread-safe to initialize
concurrently.
3. Add threading.Lock to SessionDB — subagents share the parent's
SessionDB and call create_session/append_message from worker threads
with no synchronization.
4. Add _active_children_lock to AIAgent — interrupt() iterates
_active_children while worker threads append/remove children.
5. Add _client_cache_lock to auxiliary_client — multiple subagent
threads may resolve clients concurrently via call_llm().
Based on PR #1471 by peteromallet.
* feat: Honcho base_url override via config.yaml + quick command alias type
Two features salvaged from PR #1576:
1. Honcho base_url override: allows pointing Hermes at a remote
self-hosted Honcho deployment via config.yaml:
honcho:
base_url: "http://192.168.x.x:8000"
When set, this overrides the Honcho SDK's environment mapping
(production/local), enabling LAN/VPN Honcho deployments without
requiring the server to live on localhost. Uses config.yaml instead
of env var (HONCHO_URL) per project convention.
2. Quick command alias type: adds a new 'alias' quick command type
that rewrites to another slash command before normal dispatch:
quick_commands:
sc:
type: alias
target: /context
Supports both CLI and gateway. Arguments are forwarded to the
target command.
Based on PR #1576 by redhelix.
---------
Co-authored-by: peteromallet <peteromallet@users.noreply.github.com>
Co-authored-by: redhelix <redhelix@users.noreply.github.com>
Add display.theme_mode setting (auto/light/dark) that makes the CLI
readable on light terminal backgrounds.
- Auto-detect terminal background via COLORFGBG, OSC 11, and macOS
appearance (fallback chain in hermes_cli/colors.py)
- Add colors_light overrides to all 7 built-in skins with dark/readable
colors for light backgrounds
- SkinConfig.get_color() now returns light overrides when theme is light
- get_prompt_toolkit_style_overrides() uses light bg colors for
completion menus in light mode
- init_skin_from_config() reads display.theme_mode from config
- 7 new tests covering theme mode resolution, detection fallbacks,
and light-mode skin overrides
Salvaged from PR #1187 by @peteromallet. Core design preserved;
adapted to current main (kept all existing helpers, tool_emojis,
convenience functions that were added after the PR branched).
Co-authored-by: Peter O'Mallet <peteromallet@users.noreply.github.com>
When a user sends a long message, Telegram clients split it into
multiple updates that arrive within milliseconds of each other.
Previously each chunk was dispatched independently — the first would
start the agent, and subsequent chunks would interrupt or queue as
separate turns, causing the agent to only see part of the message.
Add text message batching to TelegramAdapter following the same pattern
as the existing photo burst batching:
- _enqueue_text_event() buffers text by session key, concatenating
chunks that arrive in rapid succession
- _flush_text_batch() dispatches the combined message after a 0.6s
quiet period (configurable via HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS)
- Timer resets on each new chunk, so all parts of a split arrive
before the batch is dispatched
Reported by NulledVector on Discord.
Add Kilo Gateway (kilo.ai) as an API-key provider with OpenAI-compatible
endpoint at https://api.kilo.ai/api/gateway. Supports 500+ models from
Anthropic, OpenAI, Google, xAI, Mistral, MiniMax via a single API key.
- Register kilocode in PROVIDER_REGISTRY with aliases (kilo, kilo-code,
kilo-gateway) and KILOCODE_API_KEY / KILOCODE_BASE_URL env vars
- Add to model catalog, CLI provider menu, setup wizard, doctor checks
- Add google/gemini-3-flash-preview as default aux model
- 12 new tests covering registration, aliases, credential resolution,
runtime config
- Documentation updates (env vars, config, fallback providers)
- Fix setup test index shift from provider insertion
Inspired by PR #1473 by @amanning3390.
Co-authored-by: amanning3390 <amanning3390@users.noreply.github.com>
Docker terminal sessions are secret-dark by default. This adds
terminal.docker_forward_env as an explicit allowlist for env vars
that may be forwarded into Docker containers.
Values resolve from the current shell first, then fall back to
~/.hermes/.env. Only variables the user explicitly lists are
forwarded — nothing is auto-exposed.
Cherry-picked from PR #1449 by @teknium1, conflict-resolved onto
current main.
Fixes#1436
Supersedes #1439
_bot_participated_threads was an in-memory set — lost on every restart.
After restart, the bot forgot which threads it was active in, requiring
fresh @mentions and potentially creating duplicate threads instead of
continuing existing conversations.
Changes:
- Persist thread IDs to ~/.hermes/discord_threads.json
- Load on adapter init, save on every new thread participation
- _track_thread() replaces direct .add() calls for atomic persist
- Cap at 500 tracked threads to prevent unbounded growth
- /thread slash command also tracks participation
- 7 new tests covering persistence, restart survival, corruption
recovery, cap enforcement
* fix(security): harden terminal safety and sandbox file writes
Two security improvements:
1. Dangerous command detection: expand shell -c pattern to catch
combined flags (bash -lc, bash -ic, ksh -c) that were previously
undetected. Pattern changed from matching only 'bash -c' to
matching any shell invocation with -c anywhere in the flags.
2. File write sandboxing: add HERMES_WRITE_SAFE_ROOT env var that
constrains all write_file/patch operations to a configured directory
tree. Opt-in — when unset, behavior is unchanged. Useful for
gateway/messaging deployments that should only touch a workspace.
Based on PR #1085 by ismoilh.
* fix: correct "POSIDEON" typo to "POSEIDON" in banner ASCII art
The poseidon skin's banner_logo had the E and I letters swapped,
spelling "POSIDEON-AGENT" instead of "POSEIDON-AGENT".
---------
Co-authored-by: ismoilh <ismoilh@users.noreply.github.com>
Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>
* fix: prevent infinite 400 failure loop on context overflow (#1630)
When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message. This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error. Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.
Three-layer fix:
1. run_agent.py — Fallback heuristic: when a 400 error has a very short
generic message AND the session is large (>40% of context or >80
messages), treat it as a probable context overflow and trigger
compression instead of aborting.
2. run_agent.py + gateway/run.py — Don't persist failed messages:
when the agent returns failed=True before generating any response,
skip writing the user's message to the transcript/DB. This prevents
the session from growing on each failure.
3. gateway/run.py — Smarter error messages: detect context-overflow
failures and suggest /compact or /reset specifically, instead of a
generic 'try again' that will fail identically.
* fix(skills): detect prompt injection patterns and block cache file reads
Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):
1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
(index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
was the original injection vector — untrusted skill descriptions
in the catalog contained adversarial text that the model executed.
2. skill_view: warns when skills are loaded from outside the trusted
~/.hermes/skills/ directory, and detects common injection patterns
in skill content ("ignore previous instructions", "<system>", etc.).
Cherry-picked from PR #1562 by ygd58.
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)
Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.
- Apply truncate_message() chunking in _send_to_platform() before
dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement
Cherry-picked from PR #1557 by llbn.
* fix(approval): show full command in dangerous command approval (#1553)
Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:
- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests
Cherry-picked from PR #1566 by crazywriter1.
* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)
The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.
Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.
* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)
When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.
Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.
Cherry-picked from PR #1593 by ygd58.
* fix(security): block sandbox backend creds from subprocess env (#1264)
Add Modal and Daytona sandbox credentials to the subprocess env
blocklist so they're not leaked to agent terminal sessions via
printenv/env.
Cherry-picked from PR #1571 by ygd58.
---------
Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
* feat(skills): add bundled neutts optional skill
Add NeuTTS optional skill with CLI scaffold, bootstrap helper, and
sample voice profile. Also fixes skills_hub.py to handle binary
assets (WAV files) during skill installation.
Changes:
- optional-skills/mlops/models/neutts/ — skill + CLI scaffold
- tools/skills_hub.py — binary asset support (read_bytes, write_bytes)
- tests/tools/test_skills_hub.py — regression tests for binary assets
* feat(tts): add NeuTTS as local TTS provider backend
Add NeuTTS as a fourth TTS provider option alongside Edge, ElevenLabs,
and OpenAI. NeuTTS runs fully on-device via neutts_cli — no API key
needed.
Provider behavior:
- Explicit: set tts.provider to 'neutts' in config.yaml
- Fallback: when Edge TTS is unavailable and neutts_cli is installed,
automatically falls back to NeuTTS instead of failing
- check_tts_requirements() now includes NeuTTS in availability checks
NeuTTS outputs WAV natively. For Telegram voice bubbles, ffmpeg
converts to Opus (same pattern as Edge TTS).
Changes:
- tools/tts_tool.py — _generate_neutts(), _check_neutts_available(),
provider dispatch, fallback logic, Opus conversion
- hermes_cli/config.py — tts.neutts config defaults
---------
Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>
Remove HERMES_API_MODE env var. api_mode is now configured where the
endpoint is defined:
- model.api_mode in config.yaml (for the active model config)
- custom_providers[].api_mode (for named custom providers)
Replace _get_configured_api_mode() with _parse_api_mode() which just
validates a value against the whitelist without reading env vars.
Both paths (model config and named custom providers) now read api_mode
from their respective config entries rather than a global override.
Add in-session tool management via /tools disable/enable/list, plus
hermes tools list/disable/enable CLI subcommands. Supports both
built-in toolsets (web, memory) and MCP tools (github:create_issue).
To preserve prompt caching, /tools disable/enable in a chat session
saves the change to config and resets the session cleanly — the user
is asked to confirm before the reset happens.
Also improves prefix matching: /qui now dispatches to /quit instead
of showing ambiguous when longer skill commands like /quint-pipeline
are installed.
Based on PR #1520 by @YanSte.
Co-authored-by: Yannick Stephan <YanSte@users.noreply.github.com>
Add HERMES_API_MODE env var and model.api_mode config field to let
custom OpenAI-compatible endpoints opt into codex_responses mode
without requiring the OpenAI Codex OAuth provider path.
- _get_configured_api_mode() reads HERMES_API_MODE env (precedence)
then model.api_mode from config.yaml; validates against whitelist
- Applied in both _resolve_openrouter_runtime() and
_resolve_named_custom_runtime() (original PR only covered openrouter)
- Fix _dump_api_request_debug() to show /responses URL when in
codex_responses mode instead of always showing /chat/completions
- Tests for config override, env override, invalid values, named
custom providers, and debug dump URL for both API modes
Inspired by PR #1041 by @mxyhi.
Co-authored-by: mxyhi <mxyhi@users.noreply.github.com>
browser_console was registered in the tool registry but missing from
all toolset definitions (TOOLSETS, _HERMES_CORE_TOOLS, _LEGACY_TOOLSET_MAP),
so the agent could never discover or use it.
Added to all 4 locations + 4 wiring tests.
Cherry-picked from PR #1084 by @0xbyt4 (authorship preserved in tests).
The primary injection vector in #1558 was search_files discovering
catalog cache files in .hub/index-cache/ via find or grep, which
don't skip hidden directories like ripgrep does by default.
Three-layer fix:
1. _search_files (find): add -not -path '*/.*' to exclude hidden
directories, matching ripgrep's default behavior.
2. _search_with_grep: add --exclude-dir='.*' to skip hidden
directories in the grep fallback path.
3. _write_index_cache: write a .ignore file to .hub/ so ripgrep
also skips it even when invoked with --hidden (belt-and-suspenders).
This makes all three search backends (rg, grep, find) consistently
exclude hidden directories, preventing the agent from discovering
and reading unvetted community content in hub cache files.
* fix: prevent infinite 400 failure loop on context overflow (#1630)
When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message. This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error. Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.
Three-layer fix:
1. run_agent.py — Fallback heuristic: when a 400 error has a very short
generic message AND the session is large (>40% of context or >80
messages), treat it as a probable context overflow and trigger
compression instead of aborting.
2. run_agent.py + gateway/run.py — Don't persist failed messages:
when the agent returns failed=True before generating any response,
skip writing the user's message to the transcript/DB. This prevents
the session from growing on each failure.
3. gateway/run.py — Smarter error messages: detect context-overflow
failures and suggest /compact or /reset specifically, instead of a
generic 'try again' that will fail identically.
* fix(skills): detect prompt injection patterns and block cache file reads
Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):
1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
(index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
was the original injection vector — untrusted skill descriptions
in the catalog contained adversarial text that the model executed.
2. skill_view: warns when skills are loaded from outside the trusted
~/.hermes/skills/ directory, and detects common injection patterns
in skill content ("ignore previous instructions", "<system>", etc.).
Cherry-picked from PR #1562 by ygd58.
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)
Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.
- Apply truncate_message() chunking in _send_to_platform() before
dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement
Cherry-picked from PR #1557 by llbn.
* fix(approval): show full command in dangerous command approval (#1553)
Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:
- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests
Cherry-picked from PR #1566 by crazywriter1.
---------
Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
Fixes hanging when using /skills install or /skills uninstall from the
TUI — bare input() calls hang inside prompt_toolkit's event loop.
Changes:
- Add skip_confirm parameter to do_install() and do_uninstall()
- Separate --yes/-y (confirmation bypass) from --force (scan override)
in both argparse and slash command handlers
- Update usage hint for /skills uninstall to show [--yes]
The original PR (#1595) accidentally deleted the install_from_quarantine()
call, which would have broken all installs. That bug is not present here.
Based on PR #1595 by 333Alden333.
Co-authored-by: 333Alden333 <333Alden333@users.noreply.github.com>
* fix: prevent infinite 400 failure loop on context overflow (#1630)
When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message. This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error. Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.
Three-layer fix:
1. run_agent.py — Fallback heuristic: when a 400 error has a very short
generic message AND the session is large (>40% of context or >80
messages), treat it as a probable context overflow and trigger
compression instead of aborting.
2. run_agent.py + gateway/run.py — Don't persist failed messages:
when the agent returns failed=True before generating any response,
skip writing the user's message to the transcript/DB. This prevents
the session from growing on each failure.
3. gateway/run.py — Smarter error messages: detect context-overflow
failures and suggest /compact or /reset specifically, instead of a
generic 'try again' that will fail identically.
* fix(skills): detect prompt injection patterns and block cache file reads
Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):
1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
(index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
was the original injection vector — untrusted skill descriptions
in the catalog contained adversarial text that the model executed.
2. skill_view: warns when skills are loaded from outside the trusted
~/.hermes/skills/ directory, and detects common injection patterns
in skill content ("ignore previous instructions", "<system>", etc.).
Cherry-picked from PR #1562 by ygd58.
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)
Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.
- Apply truncate_message() chunking in _send_to_platform() before
dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement
Cherry-picked from PR #1557 by llbn.
---------
Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
* fix: prevent infinite 400 failure loop on context overflow (#1630)
When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message. This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error. Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.
Three-layer fix:
1. run_agent.py — Fallback heuristic: when a 400 error has a very short
generic message AND the session is large (>40% of context or >80
messages), treat it as a probable context overflow and trigger
compression instead of aborting.
2. run_agent.py + gateway/run.py — Don't persist failed messages:
when the agent returns failed=True before generating any response,
skip writing the user's message to the transcript/DB. This prevents
the session from growing on each failure.
3. gateway/run.py — Smarter error messages: detect context-overflow
failures and suggest /compact or /reset specifically, instead of a
generic 'try again' that will fail identically.
* fix(skills): detect prompt injection patterns and block cache file reads
Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):
1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
(index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
was the original injection vector — untrusted skill descriptions
in the catalog contained adversarial text that the model executed.
2. skill_view: warns when skills are loaded from outside the trusted
~/.hermes/skills/ directory, and detects common injection patterns
in skill content ("ignore previous instructions", "<system>", etc.).
Cherry-picked from PR #1562 by ygd58.
---------
Co-authored-by: buray <ygd58@users.noreply.github.com>
Small models (7B-14B) can't reliably use MEDIA: or IMAGE: syntax. This
adds extract_local_files() to BasePlatformAdapter that regex-detects
bare local file paths ending in image/video extensions, validates them
with os.path.isfile(), and delivers them as native platform attachments.
Hardened over the original PR:
- Code-block exclusion: paths inside fenced blocks and inline code are
skipped so code samples are never mutilated
- URL rejection: negative lookbehind prevents matching path segments
inside HTTP URLs
- Relative path rejection: ./foo.png no longer matches
- Tilde path cleanup: raw ~/... form is removed from response text
- Deduplication by expanded path
- Added .webm to _VIDEO_EXTS
- Fallback to send_document for unrecognized media extensions
Based on PR #1636 by sudoingX.
Co-authored-by: sudoingX <sudoingX@users.noreply.github.com>
* feat(cli): two-stage /model autocomplete with ghost text suggestions
- SlashCommandCompleter: Tab-complete providers first (anthropic:, openrouter:, etc.)
then models within the selected provider
- SlashCommandAutoSuggest: inline ghost text for slash commands, subcommands,
and /model provider:model two-stage suggestions
- Custom Tab key binding: accepts provider completion and immediately
re-triggers completions to show that provider's models
- COMMANDS_BY_CATEGORY: structured format with explicit subcommands for
tab completion and ghost text (prompt, reasoning, voice, skills, cron, browser)
- SUBCOMMANDS dict auto-extracted from command definitions
- Model/provider info cached 60s for responsive completions
* fix: repair test regression and restore gold color from PR #1622
- Fix test_unknown_command_still_shows_error: patch _cprint instead of
console.print to match the _cprint switch in process_command()
- Restore gold color on 'Type /help' hint using _DIM + _GOLD constants
instead of bare \033[2m (was losing the #B8860B gold)
- Use _GOLD constant for ambiguous command message for consistency
- Add clarifying comment on SUBCOMMANDS regex fallback
---------
Co-authored-by: Lars van der Zande <lmvanderzande@gmail.com>
The MarkdownV2 formatting change imports telegram.constants.ParseMode,
which the test mock didn't provide. Add ParseMode to the mock so
existing tests continue working.
* fix(tools): remove unnecessary crontab requirement from cronjob tool
The hermes cron system is internal — it uses a JSON-based scheduler
ticked by the gateway (cron/scheduler.py), not system crontab.
The check for shutil.which('crontab') was preventing the cronjob tool
from being available in environments without crontab installed (e.g.
minimal Ubuntu containers).
Changes:
- Remove shutil.which('crontab') check from check_cronjob_requirements()
- Remove unused shutil import
- Update docstring to clarify internal scheduler is used
- Update tests to reflect new behavior and add coverage for all
session modes (interactive, gateway, exec_ask)
Fixes#1589
* test: add HERMES_EXEC_ASK coverage for cronjob requirements
Adds missing test for the exec_ask session mode, complementing
the cherry-picked fix from PR #1633.
---------
Co-authored-by: Bartok9 <bartokmagic@proton.me>
Verifies that write_runtime_status() overwrites pid and start_time
from a previous process rather than preserving them via setdefault().
Covers the fix from PR #1632.
- Bump _config_version 8 → 9
- Move stale ANTHROPIC_TOKEN clearing into 'if current_ver < 9' block
so it only runs once during the upgrade, not on every migrate_config()
- ANTHROPIC_TOKEN is still a valid auth path (OAuth flow), so we don't
want to clear it repeatedly — only during the one-time migration from
old setups that left it stale
- Add test_skips_on_version_9_or_later to verify one-time behavior
- All tests set config version 8 to trigger migration
- Remove *** placeholder detection from _sanitize_env_lines (was based on
confusing terminal redaction with literal file content)
- Add migrate_config() logic to clear stale ANTHROPIC_TOKEN when better
credentials exist (ANTHROPIC_API_KEY or Claude Code auto-discovery)
- Old ANTHROPIC_TOKEN values shadow Claude Code credential fallthrough,
breaking auth for users who updated without re-running setup
- Preserves ANTHROPIC_TOKEN when it's the only auth method available
- 3 new migration tests, updated existing tests
Fixes two corruption patterns that break API keys during updates:
1. Concatenated KEY=VALUE pairs on a single line due to missing newlines
(e.g. ANTHROPIC_API_KEY=sk-...OPENAI_BASE_URL=https://...). Uses a
known-keys set to safely detect and split concatenated entries without
false-splitting values that contain uppercase text.
2. Stale KEY=*** placeholder entries left by incomplete setup runs that
never get updated and shadow real credentials.
Changes:
- Add _sanitize_env_lines() that splits concatenated known keys and drops
*** placeholders
- Add sanitize_env_file() public API for explicit repair
- Call sanitization in save_env_value() on every read (self-healing)
- Call sanitize_env_file() at the start of migrate_config() so existing
corrupted files are repaired on update
- 12 new tests covering splits, placeholders, edge cases, and integration
* feat: add Vercel AI Gateway as a first-class provider
Adds AI Gateway (ai-gateway.vercel.sh) as a new inference provider
with AI_GATEWAY_API_KEY authentication, live model discovery, and
reasoning support via extra_body.reasoning.
Based on PR #1492 by jerilynzheng.
* feat: add AI Gateway to setup wizard, doctor, and fallback providers
* test: add AI Gateway to api_key_providers test suite
* feat: add AI Gateway to hermes model CLI and model metadata
Wire AI Gateway into the interactive model selection menu and add
context lengths for AI Gateway model IDs in model_metadata.py.
* feat: use claude-haiku-4.5 as AI Gateway auxiliary model
* revert: use gemini-3-flash as AI Gateway auxiliary model
* fix: move AI Gateway below established providers in selection order
---------
Co-authored-by: jerilynzheng <jerilynzheng@users.noreply.github.com>
Co-authored-by: jerilynzheng <zheng.jerilyn@gmail.com>
When the gateway restarts after being down past a scheduled run time,
recurring jobs (cron/interval) were firing immediately because their
next_run_at was in the past. Now jobs more than 2 minutes late are
fast-forwarded to the next future occurrence instead.
- get_due_jobs() checks staleness for cron/interval jobs
- Stale jobs get next_run_at recomputed and saved
- Jobs within 2 minutes of their schedule still fire normally
- One-shot (once) jobs are unaffected — they fire if missed
Fixes the 'cron jobs run on every gateway restart' issue.
* fix: Anthropic OAuth compatibility — Claude Code identity fingerprinting
Anthropic routes OAuth/subscription requests based on Claude Code's
identity markers. Without them, requests get intermittent 500 errors
(~25% failure rate observed). This matches what pi-ai (clawdbot) and
OpenCode both implement for OAuth compatibility.
Changes (OAuth tokens only — API key users unaffected):
1. Headers: user-agent 'claude-cli/2.1.2 (external, cli)' + x-app 'cli'
2. System prompt: prepend 'You are Claude Code, Anthropic's official CLI'
3. System prompt sanitization: replace Hermes/Nous references
4. Tool names: prefix with 'mcp_' (Claude Code convention for non-native tools)
5. Tool name stripping: remove 'mcp_' prefix from response tool calls
Before: 9/12 OK, 1 hard fail, 4 needed retries (~25% error rate)
After: 16/16 OK, 0 failures, 0 retries (0% error rate)
* fix: three gateway issues from user error logs
1. send_animation missing metadata kwarg (base.py)
- Base class send_animation lacked the metadata parameter that the
call site in base.py line 917 passes. Telegram's override accepted
it, but any platform without an override (Discord, Slack, etc.)
hit TypeError. Added metadata to base class signature.
2. MarkdownV2 split-inside-inline-code (base.py truncate_message)
- truncate_message could split at a space inside an inline code span
(e.g. `function(arg1, arg2)`), leaving an unpaired backtick and
unescaped parentheses in the chunk. Telegram rejects with
'character ( is reserved'. Added inline code awareness to the
split-point finder — detects odd backtick counts and moves the
split before the code span.
3. tirith auto-install without cosign (tirith_security.py)
- Previously required cosign on PATH for auto-install, blocking
install entirely with a warning if missing. Now proceeds with
SHA-256 checksum verification only when cosign is unavailable.
Cosign is still used for full supply chain verification when
present. If cosign IS present but verification explicitly fails,
install is still aborted (tampered release).
* refactor: centralize slash command registry
Replace 7+ scattered command definition sites with a single
CommandDef registry in hermes_cli/commands.py. All downstream
consumers now derive from this registry:
- CLI process_command() resolves aliases via resolve_command()
- Gateway _known_commands uses GATEWAY_KNOWN_COMMANDS frozenset
- Gateway help text generated by gateway_help_lines()
- Telegram BotCommands generated by telegram_bot_commands()
- Slack subcommand map generated by slack_subcommand_map()
Adding a command or alias is now a one-line change to
COMMAND_REGISTRY instead of touching 6+ files.
Bugfixes included:
- Telegram now registers /rollback, /background (were missing)
- Slack now has /voice, /update, /reload-mcp (were missing)
- Gateway duplicate 'reasoning' dispatch (dead code) removed
- Gateway help text can no longer drift from CLI help
Backwards-compatible: COMMANDS and COMMANDS_BY_CATEGORY dicts are
rebuilt from the registry, so existing imports work unchanged.
* docs: update developer docs for centralized command registry
Update AGENTS.md with full 'Slash Command Registry' and 'Adding a
Slash Command' sections covering CommandDef fields, registry helpers,
and the one-line alias workflow.
Also update:
- CONTRIBUTING.md: commands.py description
- website/docs/reference/slash-commands.md: reference central registry
- docs/plans/centralize-command-registry.md: mark COMPLETED
- plans/checkpoint-rollback.md: reference new pattern
- hermes-agent-dev skill: architecture table
* chore: remove stale plan docs
Two tests lacked filesystem isolation causing them to pick up real
~/.claude/.credentials.json tokens on machines with Claude Code installed.
- test_prefers_oauth_token_over_api_key: add tmp_path, mock Path.home,
clear CLAUDE_CODE_OAUTH_TOKEN env
- test_falls_back_to_token: same isolation
Also commit run_agent.py generic-400 retry fix.
Repair stale launchd/systemd definitions during install and
teach launchd start to reload unloaded jobs before retrying.
Stop masking service restart failures by falling back to a
foreground gateway when a configured service manager is still
broken.
Refs: #1613
Anthropic prompt caching splits input into cache_read_input_tokens,
cache_creation_input_tokens, and non-cached input_tokens. The context
counter only read input_tokens (non-cached portion), showing ~3 tokens
instead of the real ~18K total. Now includes cached portions for
Anthropic native provider only — other providers (OpenAI, OpenRouter,
Codex) already include cached tokens in their prompt_tokens field.
Before: 3/200K | 0%
After: 17.7K/200K | 9%