Cherry-picked from PR #2575 by ticketclosed-wontfix.
Filters out Discord system messages (thread renames, pins, member joins,
boosts) that were being treated as regular user messages.
Follow-up fix: also allow MessageType.reply (value 19) — the original
filter only allowed MessageType.default, which would silently drop all
reply-based interactions.
Added pytest.importorskip for discord dependency in tests.
An agent session killed the systemd-managed gateway (PID 1605) and restarted
it with '&disown', taking it outside systemd's Restart= management. When the
orphaned process later received SIGTERM, nothing restarted it.
Add dangerous command patterns to detect:
- 'gateway run' with & (background), disown, nohup, or setsid
- These should use 'systemctl --user restart hermes-gateway' instead
Also applied directly to main repo and fixed the systemd service:
- Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not
a 'failure', so on-failure never triggered)
- RestartSec=10 for reasonable restart delay
Previously 'Activated skills: xxx' was printed above the banner in
show_banner(). Now it prints directly after the 'Welcome to Hermes
Agent!' line in run(), which is a more natural placement.
Path('~/.hermes/image.png').is_file() returns False because Path
doesn't expand tilde. This caused the tool to fall through to URL
validation, which also failed, producing a confusing error:
'Invalid image source. Provide an HTTP/HTTPS URL or a valid local
file path.'
Fix: use os.path.expanduser() before constructing the Path object.
Added two tests for tilde expansion (success and nonexistent file).
When a messaging platform fails to connect at startup (e.g. transient DNS
failure) or disconnects at runtime with a retryable error, the gateway now
queues it for background reconnection instead of giving up permanently.
- New _platform_reconnect_watcher background task runs alongside the
existing session expiry watcher
- Exponential backoff: 30s, 60s, 120s, 240s, 300s cap
- Max 20 retry attempts before giving up on a platform
- Non-retryable errors (bad auth token, etc.) are not retried
- Runtime disconnections via _handle_adapter_fatal_error now queue
retryable failures instead of triggering gateway shutdown
- On successful reconnect, adapter is wired up and channel directory
is rebuilt automatically
Fixes the case where a DNS blip during gateway startup caused Telegram
and Discord to be permanently unavailable until manual restart.
The previous commit capped the 1.4x at 95% of context, but the multiplier
itself is unnecessary and confusing:
85% threshold × 1.4 = 119% of context → never fires
95% warn × 1.4 = 133% of context → never warns
The 85% hygiene threshold already provides ample headroom over the agent's
own 50% compressor. Even if rough estimates overestimate by 50%, hygiene
would fire at ~57% actual usage — safe and harmless.
Remove the multiplier entirely. Both actual and estimated token paths
now use the same 85% / 95% thresholds. Update tests and comments.
Three bugs in gateway session hygiene pre-compression caused 'Session too
large' errors for ~200K context models like GLM-5-turbo on z.ai:
1. Gateway hygiene called get_model_context_length(model) without passing
config_context_length, provider, or base_url — so user overrides like
model.context_length: 180000 were ignored, and provider-aware detection
(models.dev, z.ai endpoint) couldn't fire. The agent's own compressor
correctly passed all three (run_agent.py line 1038).
2. The 1.4x safety factor on rough token estimates pushed the compression
threshold above the model's actual context limit:
200K * 0.85 * 1.4 = 238K > 200K (model limit)
So hygiene never compressed, sessions grew past the limit, and the API
rejected the request.
3. Same issue for the warn threshold: 200K * 0.95 * 1.4 = 266K.
Fix:
- Read model.context_length, provider, and base_url from config.yaml
(same as run_agent.py does) and pass them to get_model_context_length()
- Resolve provider/base_url from runtime when not in config
- Cap the 1.4x-adjusted compress threshold at 95% of context_length
- Cap the 1.4x-adjusted warn threshold at context_length
Affects: z.ai GLM-5/GLM-5-turbo, any ~200K or smaller context model
where the 1.4x factor would push 85% above 100%.
Ref: Discord report from Ddox — glm-5-turbo on z.ai coding plan
* fix(mcp-oauth): port mismatch, path traversal, and shared state in OAuth flow
Three bugs in the new MCP OAuth 2.1 PKCE implementation:
1. CRITICAL: OAuth redirect port mismatch — build_oauth_auth() calls
_find_free_port() to register the redirect_uri, but _wait_for_callback()
calls _find_free_port() again getting a DIFFERENT port. Browser redirects
to port A, server listens on port B — callback never arrives, 120s timeout.
Fix: share the port via module-level _oauth_port variable.
2. MEDIUM: Path traversal via unsanitized server_name — HermesTokenStorage
uses server_name directly in filenames. A name like "../../.ssh/config"
writes token files outside ~/.hermes/mcp-tokens/.
Fix: sanitize server_name with the same regex pattern used elsewhere.
3. MEDIUM: Class-level auth_code/state on _CallbackHandler causes data
races if concurrent OAuth flows run. Second callback overwrites first.
Fix: factory function _make_callback_handler() returns a handler class
with a closure-scoped result dict, isolating each flow.
* test: add tests for MCP OAuth path traversal, handler isolation, and port sharing
7 new tests covering:
- Path traversal blocked (../../.ssh/config stays in mcp-tokens/)
- Dots/slashes sanitized and resolved within base dir
- Normal server names preserved
- Special characters sanitized (@, :, /)
- Concurrent handler result dicts are independent
- Handler writes to its own result dict, not class-level
- build_oauth_auth stores port in module-level _oauth_port
---------
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
When a session expires (daily schedule or idle timeout) and is
automatically reset, send a notification to the user explaining
what happened:
◐ Session automatically reset (inactive for 24h).
Conversation history cleared.
Use /resume to browse and restore a previous session.
Adjust reset timing in config.yaml under session_reset.
Notifications are suppressed when:
- The expired session had no activity (no tokens used)
- The platform is excluded (api_server, webhook by default)
- notify: false in config
Changes:
- session.py: _should_reset() returns reason string ('idle'/'daily')
instead of bool; SessionEntry gains auto_reset_reason and
reset_had_activity fields; old entry's total_tokens checked
- config.py: SessionResetPolicy gains notify (bool, default: true)
and notify_exclude_platforms (default: api_server, webhook)
- run.py: sends notification via adapter.send() before processing
the user's message, with activity + platform checks
- 13 new tests
Config (config.yaml):
session_reset:
notify: true
notify_exclude_platforms: [api_server, webhook]
- Download and cache .pdf, .docx, .xlsx, .pptx attachments locally
instead of passing expiring CDN URLs to the agent
- Inject .txt and .md content (≤100 KB) into event.text so the agent
sees file content without needing to fetch the URL
- Add 20 MB size guard and SUPPORTED_DOCUMENT_TYPES allowlist
- Fix: unsupported types (.zip etc.) no longer get MessageType.DOCUMENT
- Add 9 unit tests in test_discord_document_handling.py
Mirrors the Slack implementation from PR #784. Discord CDN URLs are
publicly accessible so no auth header is needed (unlike Slack).
Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>
- test_plugins.py: remove tests for unimplemented plugin command API
(get_plugin_command_handler, register_command never existed)
- test_redact.py: add autouse fixture to clear HERMES_REDACT_SECRETS
env var leaked by cli.py import in other tests
- test_signal.py: same HERMES_REDACT_SECRETS fix for phone redaction
- test_mattermost.py: add @bot_user_id to test messages after the
mention-only filter was added in #2443
- test_context_token_tracking.py: mock resolve_provider_client for
openai-codex provider that requires real OAuth credentials
Full suite: 5893 passed, 0 failed.
* fix: respect DashScope v1 runtime mode for alibaba
Remove the hardcoded Alibaba branch from resolve_runtime_provider()
that forced api_mode='anthropic_messages' regardless of the base URL.
Alibaba now goes through the generic API-key provider path, which
auto-detects the protocol from the URL:
- /apps/anthropic → anthropic_messages (via endswith check)
- /v1 → chat_completions (default)
This fixes Alibaba setup with OpenAI-compatible DashScope endpoints
(e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because
runtime always forced Anthropic mode even when setup saved a /v1 URL.
Based on PR #2024 by @kshitijk4poor.
* docs(skill): add split, merge, search examples to ocr-and-documents skill
Adds pymupdf examples for PDF splitting, merging, and text search
to the existing ocr-and-documents skill. No new dependencies — pymupdf
already covers all three operations natively.
* fix: replace all production print() calls with logger in rl_training_tool
Replace all bare print() calls in production code paths with proper logger calls.
- Add `import logging` and module-level `logger = logging.getLogger(__name__)`
- Replace print() in _start_training_run() with logger.info()
- Replace print() in _stop_training_run() with logger.info()
- Replace print(Warning/Note) calls with logger.warning() and logger.info()
Using the logging framework allows log level filtering, proper formatting,
and log routing instead of always printing to stdout.
* fix(gateway): process /queue'd messages after agent completion
/queue stored messages in adapter._pending_messages but never consumed
them after normal (non-interrupted) completion. The consumption path
at line 5219 only checked pending messages when result.get('interrupted')
was True — since /queue deliberately doesn't interrupt, queued messages
were silently dropped.
Now checks adapter._pending_messages after both interrupted AND normal
completion. For queued messages (non-interrupt), the first response is
delivered before recursing to process the queued follow-up. Skips the
direct send when streaming already delivered the response.
Reported by GhostMode on Discord.
---------
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>
The /v1/responses endpoint used an in-memory OrderedDict that lost
all conversation state on gateway restart. Replace with SQLite-backed
storage at ~/.hermes/response_store.db.
- Responses and conversation name mappings survive restarts
- Same LRU eviction behavior (configurable max_size)
- WAL mode for concurrent read performance
- Falls back to in-memory SQLite if disk path unavailable
- Conversation name→response_id mapping moved into the store
Reverts the sanitizer addition from PR #2466 (originally #2129).
We already have _empty_content_retries handling for reasoning-only
responses. The trailing strip risks silently eating valid messages
and is redundant with existing empty-content handling.
Add hermes mcp add/remove/list/test/configure CLI for managing MCP
server connections interactively. Discovery-first 'add' flow connects,
discovers tools, and lets users select which to enable via curses checklist.
Add OAuth 2.1 PKCE authentication for MCP HTTP servers (RFC 7636).
Supports browser-based and manual (headless) authorization, token
caching with 0600 permissions, automatic refresh. Zero external deps.
Add ${ENV_VAR} interpolation in MCP server config values, resolved
from os.environ + ~/.hermes/.env at load time.
Core OAuth module from PR #2021 by @imnotdev25. CLI and mcp_tool
wiring rewritten against current main. Closes#497, #690.
Cherry-picked from PR #2017 by @simpolism. Fixes#2011.
Discord slash commands in threads were missing thread_id in the
SessionSource, causing them to route to the parent channel session.
Commands like /usage and /reset returned wrong data or affected the
wrong session.
Detects discord.Thread channels in _build_slash_event and sets
chat_type='thread' with thread_id. Two tests added.
Remove the hardcoded Alibaba branch from resolve_runtime_provider()
that forced api_mode='anthropic_messages' regardless of the base URL.
Alibaba now goes through the generic API-key provider path, which
auto-detects the protocol from the URL:
- /apps/anthropic → anthropic_messages (via endswith check)
- /v1 → chat_completions (default)
This fixes Alibaba setup with OpenAI-compatible DashScope endpoints
(e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because
runtime always forced Anthropic mode even when setup saved a /v1 URL.
Based on PR #2024 by @kshitijk4poor.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Parse thread_id from explicit deliver target (e.g. telegram:-1003724596514:17)
and forward it to _send_to_platform and mirror_to_session.
Previously _resolve_delivery_target() always set thread_id=None when
parsing the platform:chat_id format, breaking cron job delivery to
specific Telegram topics.
Added tests:
- test_explicit_telegram_topic_target_with_thread_id
- test_explicit_telegram_chat_id_without_thread_id
Also updated CRONJOB_SCHEMA deliver description to document the
platform:chat_id:thread_id format.
Co-authored-by: Alex Ferrari <alex@thealexferrari.com>
Five improvements to the /api/jobs endpoints:
1. Startup availability check — cron module imported once at class load,
endpoints return 501 if unavailable (not 500 per-request import error)
2. Input limits — name ≤ 200 chars, prompt ≤ 5000 chars, repeat must be
positive int
3. Update field whitelist — only name/schedule/prompt/deliver/skills/
repeat/enabled pass through to cron.jobs.update_job, preventing
arbitrary key injection
4. Deduplicated validation — _check_job_id and _check_jobs_available
helpers replace repeated boilerplate
5. 32 new tests covering all endpoints, validation, auth, and
cron-unavailable cases
Replace hardcoded 120-second grace period with a dynamic window that
scales with the job's scheduling frequency (half the period, clamped
to [120s, 2h]). Daily jobs now catch up if missed by up to 2 hours
instead of being silently skipped after just 2 minutes.
Changes the policy for agent-created skills with critical security
findings from 'block' (silently rejected) to 'ask' (allowed with
warning logged). The agent created the skill, so blocking it entirely
is too aggressive — let it through but log the findings.
- Policy: agent-created dangerous changed from block to ask
- should_allow_install returns None for 'ask' (vs True/False)
- format_scan_report shows 'NEEDS CONFIRMATION' for ask
- skill_manager_tool.py caller handles None (allows with warning)
- force=True still overrides as before
Based on PR #2271 by redhelix (closed — 3200 lines of unrelated
Mission Control code excluded).
Python 3.12 changed PosixPath.__new__ to ignore the redirected path
argument, breaking the FakePath subclass pattern. Use monkeypatch on
Path.exists instead.
Based on PR #2261 by @dieutx, fixed NameError (bare Path not imported).
Fixes#2234
The placeholder '(No response generated)' was overwriting the actual
final_response, causing it to be delivered to Discord even when the
agent completed work silently via tools.
Changes:
- Separate logged_response for output template display
- Keep final_response clean (empty when agent has no text)
- Delivery logic now correctly skips when final_response is empty
Test added to verify empty response stays empty for delivery.
Co-authored-by: Bartok9 <bartokmagic@proton.me>
Two bugs in the auxiliary provider auto-detection chain:
1. Expired Codex JWT blocks the auto chain: _read_codex_access_token()
returned any stored token without checking expiry, preventing fallback
to working providers. Now decodes JWT exp claim and returns None for
expired tokens.
2. Auxiliary Anthropic client missing OAuth identity transforms:
_AnthropicCompletionsAdapter always called build_anthropic_kwargs with
is_oauth=False, causing 400 errors for OAuth tokens. Now detects OAuth
tokens via _is_oauth_token() and propagates the flag through the
adapter chain.
Cherry-picked from PR #2378 by 0xbyt4. Fixed test_api_key_no_oauth_flag
to mock resolve_anthropic_token directly (env var alone was insufficient).
redact_sensitive_text() now returns early for None and coerces other
non-string values to str before applying regex-based redaction,
preventing TypeErrors in logging/tool-output paths.
Cherry-picked from PR #2369 by aydnOktay.
On the native Anthropic Messages API path, convert_messages_to_anthropic()
moves top-level cache_control on role:tool messages inside the tool_result
block. On OpenRouter (chat_completions), no such conversion happens — the
unexpected top-level field causes a silent hang on the second tool call.
Add native_anthropic parameter to _apply_cache_marker() and
apply_anthropic_cache_control(). When False (OpenRouter), role:tool messages
are skipped entirely. When True (native Anthropic), existing behaviour is
preserved.
Fixes#2362
When 'hermes update' stashes local changes and the restore hits
conflicts, the previous behavior silently ran 'git reset --hard HEAD'
to clean up. This could surprise users who didn't realize their
working tree was being nuked.
Now the conflict handler:
- Lists the specific conflicted files
- Reassures the user their stash is preserved
- Asks before resetting (interactive mode)
- Auto-resets in non-interactive mode (prompt_user=False)
- If declined, leaves the working tree as-is with guidance
* fix: prevent Anthropic token fallback leaking to third-party anthropic_messages providers
When provider is minimax/alibaba/etc and MINIMAX_API_KEY is not set,
the code fell back to resolve_anthropic_token() sending Anthropic OAuth
credentials to third-party endpoints, causing 401 errors.
Now only provider=="anthropic" triggers the fallback. Generalizes the
Alibaba-specific guard from #1739 to all non-Anthropic providers.
* fix: set provider='anthropic' in credential refresh tests
Follow-up for cherry-picked PR #2383 — existing tests didn't set
agent.provider, which the new guard requires to allow Anthropic
token refresh.
---------
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
The gateway created a fresh AIAgent per message, rebuilding the system
prompt (including memory, skills, context files) every turn. This broke
prompt prefix caching — providers like Anthropic charge ~10x more for
uncached prefixes.
Now caches AIAgent instances per session_key with a config signature.
The cached agent is reused across messages in the same session,
preserving the frozen system prompt and tool schemas. Cache is
invalidated when:
- Config changes (model, provider, toolsets, reasoning, ephemeral
prompt) — detected via signature mismatch
- /new, /reset, /clear — explicit session reset
- /model — global model change clears all cached agents
- /reasoning — global reasoning change clears all cached agents
Per-message state (callbacks, stream consumers, progress queues) is
set on the agent instance before each run_conversation() call.
This matches CLI behavior where a single AIAgent lives across all turns
in a session, with _cached_system_prompt built once and reused.
When `hermes update` stashes local changes and the subsequent
`git stash apply` fails or leaves unmerged files, the conflict markers
(<<<<<<< etc.) were left in the working tree, making Hermes unrunnable
until manually cleaned up.
Now the update command runs `git reset --hard HEAD` to restore a clean
working tree before exiting, and also detects unmerged files even when
git stash apply reports success.
Closes#2348
Add @file:path, @folder:dir, @diff, @staged, @git:N, and @url:
references that expand inline before the message reaches the LLM.
Supports line ranges (@file:main.py:10-50), token budget enforcement
(soft warn at 25%, hard block at 50%), and path sandboxing for gateway.
Core module from PR #2090 by @kshitijk4poor. CLI and gateway wiring
rewritten against current main. Fixed asyncio.run() crash when called
from inside a running event loop (gateway).
Closes#682.
Fixes#1803. send_image_file, send_document, and send_video were missing
message_thread_id forwarding, causing them to fail in Telegram forum/supergroups
where thread_id is required. send_voice already handled this correctly. Adds
metadata parameter + message_thread_id to all three methods, and adds tests
covering the thread_id forwarding path.