Cherry-picked from PR #2017 by @simpolism. Fixes #2011.
Discord slash commands in threads were missing thread_id in the
SessionSource, causing them to route to the parent channel session.
Commands like /usage and /reset returned wrong data or affected the
wrong session.
Detects discord.Thread channels in _build_slash_event and sets
chat_type='thread' with thread_id. Two tests added.
Remove the hardcoded Alibaba branch from resolve_runtime_provider()
that forced api_mode='anthropic_messages' regardless of the base URL.
Alibaba now goes through the generic API-key provider path, which
auto-detects the protocol from the URL:
- /apps/anthropic → anthropic_messages (via endswith check)
- /v1 → chat_completions (default)
This fixes Alibaba setup with OpenAI-compatible DashScope endpoints
(e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because
runtime always forced Anthropic mode even when setup saved a /v1 URL.
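A minimal sketch of the URL-based detection described above. The function name is hypothetical and the trailing-slash handling is an assumption; only the endswith check and the chat_completions default come from this change.

```python
def detect_api_mode(base_url: str) -> str:
    """Pick the wire protocol from the endpoint URL (illustrative only)."""
    # Assumption: trailing slashes are ignored so ".../apps/anthropic/" matches too.
    if base_url.rstrip("/").endswith("/apps/anthropic"):
        return "anthropic_messages"
    # Everything else, including /v1 endpoints, falls back to the default.
    return "chat_completions"
```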
Based on PR #2024 by @kshitijk4poor.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Parse thread_id from explicit deliver target (e.g. telegram:-1003724596514:17)
and forward it to _send_to_platform and mirror_to_session.
Previously _resolve_delivery_target() always set thread_id=None when
parsing the platform:chat_id format, breaking cron job delivery to
specific Telegram topics.
Added tests:
- test_explicit_telegram_topic_target_with_thread_id
- test_explicit_telegram_chat_id_without_thread_id
Also updated CRONJOB_SCHEMA deliver description to document the
platform:chat_id:thread_id format.
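A rough sketch of parsing the platform:chat_id:thread_id format. DeliveryTarget and parse_deliver_target are hypothetical names, not the helpers used in the codebase, and the sketch assumes chat IDs never contain a colon.

```python
from typing import NamedTuple, Optional


class DeliveryTarget(NamedTuple):
    platform: str
    chat_id: str
    thread_id: Optional[str]


def parse_deliver_target(target: str) -> DeliveryTarget:
    # Split into at most three fields; negative chat IDs keep their sign.
    parts = target.split(":", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"expected platform:chat_id[:thread_id], got {target!r}")
    # A missing or empty third field means "no thread".
    thread_id = parts[2] if len(parts) == 3 and parts[2] else None
    return DeliveryTarget(parts[0], parts[1], thread_id)
```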
Co-authored-by: Alex Ferrari <alex@thealexferrari.com>
Five improvements to the /api/jobs endpoints:
1. Startup availability check: cron module imported once at class load,
endpoints return 501 if unavailable (instead of a 500 from a per-request
import error)
2. Input limits — name ≤ 200 chars, prompt ≤ 5000 chars, repeat must be
positive int
3. Update field whitelist — only name/schedule/prompt/deliver/skills/
repeat/enabled pass through to cron.jobs.update_job, preventing
arbitrary key injection
4. Deduplicated validation — _check_job_id and _check_jobs_available
helpers replace repeated boilerplate
5. 32 new tests covering all endpoints, validation, auth, and
cron-unavailable cases
Replace hardcoded 120-second grace period with a dynamic window that
scales with the job's scheduling frequency (half the period, clamped
to [120s, 2h]). Daily jobs now catch up if missed by up to 2 hours
instead of being silently skipped after just 2 minutes.
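The window arithmetic can be sketched as follows. The constant and function names are hypothetical; the half-period rule and the [120s, 2h] clamp are taken from the description above.

```python
MIN_GRACE_SECONDS = 120          # previous hardcoded value, now the floor
MAX_GRACE_SECONDS = 2 * 60 * 60  # two-hour ceiling


def grace_window(period_seconds: float) -> float:
    """Half the scheduling period, clamped to [120s, 2h]."""
    return max(MIN_GRACE_SECONDS, min(period_seconds / 2, MAX_GRACE_SECONDS))
```

Under this rule a daily job (86400s period) gets the 2-hour ceiling, an hourly job gets 30 minutes, and a once-a-minute job stays at the 120-second floor.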
Changes the policy for agent-created skills with critical security
findings from 'block' (silently rejected) to 'ask' (allowed with
warning logged). The agent created the skill, so blocking it entirely
is too aggressive — let it through but log the findings.
- Policy: agent-created dangerous changed from block to ask
- should_allow_install returns None for 'ask' (vs True/False)
- format_scan_report shows 'NEEDS CONFIRMATION' for ask
- skill_manager_tool.py caller handles None (allows with warning)
- force=True still overrides as before
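A simplified sketch of the tri-state handling. decide and install_agent_skill are hypothetical stand-ins for should_allow_install and its caller, and the 'allow' policy string is an assumption. The point is that the caller must test for None explicitly, since None is falsy and would otherwise be treated as a block.

```python
import logging
from typing import Optional

logger = logging.getLogger("skill_policy_sketch")


def decide(policy: str) -> Optional[bool]:
    """True = allow, False = block, None = 'ask'."""
    if policy == "ask":
        return None
    return policy == "allow"


def install_agent_skill(policy: str, force: bool = False) -> bool:
    if force:
        return True  # force=True overrides the scan verdict
    decision = decide(policy)
    if decision is None:
        # 'ask' for agent-created skills: let it through, but log the findings.
        logger.warning("skill has security findings; allowing with warning")
        return True
    return decision
```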
Based on PR #2271 by redhelix (closed — 3200 lines of unrelated
Mission Control code excluded).
Python 3.12 changed PosixPath.__new__ to ignore the redirected path
argument, breaking the FakePath subclass pattern. Use monkeypatch on
Path.exists instead.
Based on PR #2261 by @dieutx; also fixes a NameError (bare Path was not imported).
Fixes #2234
The placeholder '(No response generated)' was overwriting the actual
final_response, causing it to be delivered to Discord even when the
agent completed work silently via tools.
Changes:
- Separate logged_response for output template display
- Keep final_response clean (empty when agent has no text)
- Delivery logic now correctly skips when final_response is empty
Test added to verify empty response stays empty for delivery.
Co-authored-by: Bartok9 <bartokmagic@proton.me>
Two bugs in the auxiliary provider auto-detection chain:
1. Expired Codex JWT blocks the auto chain: _read_codex_access_token()
returned any stored token without checking expiry, preventing fallback
to working providers. Now decodes JWT exp claim and returns None for
expired tokens.
2. Auxiliary Anthropic client missing OAuth identity transforms:
_AnthropicCompletionsAdapter always called build_anthropic_kwargs with
is_oauth=False, causing 400 errors for OAuth tokens. Now detects OAuth
tokens via _is_oauth_token() and propagates the flag through the
adapter chain.
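For the first bug, reading the exp claim could look roughly like this. jwt_is_expired is a hypothetical helper, not the code in _read_codex_access_token(). The sketch does not verify the signature, and treating undecodable tokens as "not expired" is an assumption.

```python
import base64
import json
import time
from typing import Optional


def jwt_is_expired(token: str, now: Optional[float] = None) -> bool:
    """Return True when the token's exp claim is in the past."""
    try:
        payload_b64 = token.split(".")[1]
        # base64url payloads omit padding; restore it before decoding.
        payload_b64 += "=" * (-len(payload_b64) % 4)
        claims = json.loads(base64.urlsafe_b64decode(payload_b64))
        exp = float(claims["exp"])
    except (IndexError, KeyError, TypeError, ValueError):
        # Assumption: tokens we cannot inspect are left for the server to reject.
        return False
    current = time.time() if now is None else now
    return exp <= current
```

A caller in the auto-detection chain would then return None for an expired token so the next provider gets a chance.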
Cherry-picked from PR #2378 by 0xbyt4. Fixed test_api_key_no_oauth_flag
to mock resolve_anthropic_token directly (env var alone was insufficient).
redact_sensitive_text() now returns early for None and coerces other
non-string values to str before applying regex-based redaction,
preventing TypeErrors in logging/tool-output paths.
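A minimal sketch of the guard. The pattern below is a made-up placeholder rather than the real redaction rules, and returning None unchanged for a None input is an assumption about what "returns early" means.

```python
import re
from typing import Any, Optional

# Hypothetical pattern for illustration only.
_SECRET_RE = re.compile(r"sk-[A-Za-z0-9]{8,}")


def redact(value: Any) -> Optional[str]:
    if value is None:
        return None  # nothing to redact
    # Coerce ints, dicts, exceptions, etc. so re.sub never sees a non-string.
    text = value if isinstance(value, str) else str(value)
    return _SECRET_RE.sub("[REDACTED]", text)
```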
Cherry-picked from PR #2369 by aydnOktay.
On the native Anthropic Messages API path, convert_messages_to_anthropic()
moves top-level cache_control on role:tool messages inside the tool_result
block. On OpenRouter (chat_completions), no such conversion happens — the
unexpected top-level field causes a silent hang on the second tool call.
Add native_anthropic parameter to _apply_cache_marker() and
apply_anthropic_cache_control(). When False (OpenRouter), role:tool messages
are skipped entirely. When True (native Anthropic), existing behaviour is
preserved.
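The guard can be pictured with this simplified sketch. mark_for_caching is a hypothetical stand-in for _apply_cache_marker(), which is not reproduced here and may handle content blocks differently.

```python
from typing import Any, Dict


def mark_for_caching(message: Dict[str, Any], native_anthropic: bool) -> bool:
    """Attach a top-level cache_control marker; return True if one was added."""
    if message.get("role") == "tool" and not native_anthropic:
        # Nothing downstream relocates the field into a tool_result block
        # on this path, so tool messages are left untouched.
        return False
    message["cache_control"] = {"type": "ephemeral"}
    return True
```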
Fixes #2362
When 'hermes update' stashes local changes and the restore hits
conflicts, the previous behavior silently ran 'git reset --hard HEAD'
to clean up. This could surprise users who didn't realize their
working tree was being nuked.
Now the conflict handler:
- Lists the specific conflicted files
- Reassures the user their stash is preserved
- Asks before resetting (interactive mode)
- Auto-resets in non-interactive mode (prompt_user=False)
- If declined, leaves the working tree as-is with guidance
* fix: prevent Anthropic token fallback leaking to third-party anthropic_messages providers
When provider is minimax/alibaba/etc and MINIMAX_API_KEY is not set,
the code fell back to resolve_anthropic_token() sending Anthropic OAuth
credentials to third-party endpoints, causing 401 errors.
Now only provider=="anthropic" triggers the fallback. Generalizes the
Alibaba-specific guard from #1739 to all non-Anthropic providers.
* fix: set provider='anthropic' in credential refresh tests
Follow-up for cherry-picked PR #2383 — existing tests didn't set
agent.provider, which the new guard requires to allow Anthropic
token refresh.
---------
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
The gateway created a fresh AIAgent per message, rebuilding the system
prompt (including memory, skills, context files) every turn. This broke
prompt prefix caching — providers like Anthropic charge ~10x more for
uncached prefixes.
Now caches AIAgent instances per session_key with a config signature.
The cached agent is reused across messages in the same session,
preserving the frozen system prompt and tool schemas. Cache is
invalidated when:
- Config changes (model, provider, toolsets, reasoning, ephemeral
prompt) — detected via signature mismatch
- /new, /reset, /clear — explicit session reset
- /model — global model change clears all cached agents
- /reasoning — global reasoning change clears all cached agents
Per-message state (callbacks, stream consumers, progress queues) is
set on the agent instance before each run_conversation() call.
This matches CLI behavior where a single AIAgent lives across all turns
in a session, with _cached_system_prompt built once and reused.
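A sketch of the caching scheme under stated assumptions: AgentCache, config_signature, and the factory callable are hypothetical names, and hashing a JSON dump is just one way to build a signature.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Tuple


def config_signature(config: Dict[str, Any]) -> str:
    # Stable digest of the settings that should invalidate a cached agent.
    blob = json.dumps(config, sort_keys=True, default=str)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


class AgentCache:
    def __init__(self, factory: Callable[[Dict[str, Any]], Any]) -> None:
        self._factory = factory
        self._entries: Dict[str, Tuple[str, Any]] = {}

    def get(self, session_key: str, config: Dict[str, Any]) -> Any:
        signature = config_signature(config)
        entry = self._entries.get(session_key)
        if entry is not None and entry[0] == signature:
            return entry[1]  # reuse: the frozen system prompt stays cacheable
        agent = self._factory(config)  # signature mismatch or first message
        self._entries[session_key] = (signature, agent)
        return agent

    def reset(self, session_key: str) -> None:
        # Explicit session reset (/new, /reset, /clear).
        self._entries.pop(session_key, None)

    def clear(self) -> None:
        # Global change (/model, /reasoning) drops every cached agent.
        self._entries.clear()
```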
When `hermes update` stashes local changes and the subsequent
`git stash apply` fails or leaves unmerged files, the conflict markers
(<<<<<<< etc.) were left in the working tree, making Hermes unrunnable
until manually cleaned up.
Now the update command runs `git reset --hard HEAD` to restore a clean
working tree before exiting, and also detects unmerged files even when
git stash apply reports success.
Closes #2348
Add @file:path, @folder:dir, @diff, @staged, @git:N, and @url:
references that expand inline before the message reaches the LLM.
Supports line ranges (@file:main.py:10-50), token budget enforcement
(soft warn at 25%, hard block at 50%), and path sandboxing for gateway.
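One way the @file reference syntax with optional line ranges could be parsed. FileRef, find_file_refs, and select_lines are hypothetical; the regex does not handle paths containing colons, and treating the range as 1-based and inclusive is an assumption.

```python
import re
from typing import List, NamedTuple, Optional


class FileRef(NamedTuple):
    path: str
    start: Optional[int]
    end: Optional[int]


_FILE_REF = re.compile(r"@file:(?P<path>[^\s:]+)(?::(?P<start>\d+)-(?P<end>\d+))?")


def find_file_refs(message: str) -> List[FileRef]:
    refs = []
    for m in _FILE_REF.finditer(message):
        start = int(m["start"]) if m["start"] else None
        end = int(m["end"]) if m["end"] else None
        refs.append(FileRef(m["path"], start, end))
    return refs


def select_lines(text: str, start: Optional[int], end: Optional[int]) -> str:
    # No range means the whole file is expanded.
    if start is None or end is None:
        return text
    return "\n".join(text.splitlines()[max(start, 1) - 1:end])
```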
Core module from PR #2090 by @kshitijk4poor. CLI and gateway wiring
rewritten against current main. Fixed asyncio.run() crash when called
from inside a running event loop (gateway).
Closes #682.
Fixes #1803. send_image_file, send_document, and send_video were missing
message_thread_id forwarding, causing them to fail in Telegram forum/supergroups
where thread_id is required. send_voice already handled this correctly. Adds
metadata parameter + message_thread_id to all three methods, and adds tests
covering the thread_id forwarding path.
Based on PR #1749 by @erosika (reimplemented on current main).
Extracts three protected methods from run() so wrapper CLIs can extend
the TUI without overriding the entire method:
- _get_extra_tui_widgets(): inject widgets between spacer and status bar
- _register_extra_tui_keybindings(kb, input_area): add keybindings
- _build_tui_layout_children(**widgets): full control over ordering
Default implementations reproduce existing layout exactly. The inline
HSplit in run() now delegates to _build_tui_layout_children().
5 tests covering defaults, widget insertion position, and keybinding
registration.
When using Alibaba (DashScope) with an anthropic-compatible endpoint,
model names like qwen3.5-plus were being normalized to qwen3-5-plus.
Alibaba's API expects the dot. Added preserve_dots parameter to
normalize_model_name() and build_anthropic_kwargs().
Also fixed 401 auth: when provider is alibaba or base_url contains
dashscope/aliyuncs, use only the resolved API key (DASHSCOPE_API_KEY).
Never fall back to resolve_anthropic_token(), and skip Anthropic
credential refresh for DashScope endpoints.
Cherry-picked from PR #1748 by crazywriter1. Fixes #1739.
- Add resolve_config_path(): checks $HERMES_HOME/honcho.json first,
falls back to ~/.honcho/config.json. Enables isolated Hermes instances
with independent Honcho credentials and settings.
- Update CLI and doctor to use resolved path instead of hardcoded global.
- Change default session_strategy from per-session to per-directory.
Part 1 of #1962 by @erosika.
Bare strings like "image", "audio", "document" were appended to
media_types, but downstream run.py checks mtype.startswith("image/")
and mtype.startswith("audio/"), which never matched. This caused all
Mattermost file attachments to be silently dropped from vision/STT
processing. Use the actual MIME type from file_info instead.
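The mismatch is easy to reproduce. route_media below is a hypothetical reduction of the downstream prefix checks, not the code in run.py.

```python
from typing import Optional


def route_media(mtype: str) -> Optional[str]:
    """Mimic the downstream prefix checks on a media type string."""
    if mtype.startswith("image/"):
        return "vision"
    if mtype.startswith("audio/"):
        return "stt"
    return None  # silently dropped from vision/STT processing
```

With this check, a bare "image" falls through to None while "image/png" is routed to vision, which is why the adapter needs to pass the real MIME type.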
When streaming is enabled, the base adapter receives None from
_handle_message (already_sent=True) and cannot run auto-TTS for
voice input. The runner was unconditionally skipping voice input
TTS, assuming the base adapter would handle it.
Now the runner takes over TTS responsibility when streaming has
already delivered the text response, so voice channel playback
works with both streaming on and off.
Streaming off behavior is unchanged (default already_sent=False
preserves the original code path exactly).
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
Cron deliveries were mirrored into the target gateway session as
assistant-role messages, causing consecutive assistant messages that
violate message alternation (issue #2221).
Instead of fixing the role, remove the mirror injection entirely.
Cron outputs already live in their own cron session and don't belong
in the interactive conversation history.
Delivered messages are now wrapped with a header (task name) and a
footer noting the agent cannot see or respond to the message, so
users have clear context about what they're reading.
Closes #2221
A single Telegram 409 Conflict from getUpdates permanently killed
Telegram polling with no recovery possible (retryable=False on
first occurrence). This is too aggressive for production use with
process supervisors.
Transient 409s are expected during:
- --replace handoffs where the old long-poll session lingers on
Telegram servers for a few seconds after SIGTERM
- systemd Restart=on-failure respawns that overlap with the dying
instance cleanup
Now _handle_polling_conflict() retries up to 3 times with a
10-second delay between attempts. The 30-second total retry window
lets stale server-side sessions expire. If all retries fail, the
error is still marked as permanently fatal — preserving the original
protection against genuine dual-instance conflicts.
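The retry policy can be sketched like this. PollingConflict and start_with_conflict_retry are hypothetical names and the real _handle_polling_conflict() is asynchronous, so this only shows the counting logic.

```python
import time
from typing import Callable

MAX_CONFLICT_RETRIES = 3
RETRY_DELAY_SECONDS = 10


class PollingConflict(Exception):
    """Stand-in for a 409 Conflict raised by getUpdates."""


def start_with_conflict_retry(
    start: Callable[[], None],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once polling starts, False when the conflict is fatal."""
    retries = 0
    while True:
        try:
            start()
            return True
        except PollingConflict:
            if retries >= MAX_CONFLICT_RETRIES:
                return False  # still fatal after the retry window
            retries += 1
            sleep(RETRY_DELAY_SECONDS)  # let the stale session expire
```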
Tests updated: split the single conflict test into two — one verifying
retry on transient conflict, one verifying fatal after exhausted
retries.
Closes #2296
Previously, all project context files (AGENTS.md, .cursorrules, .hermes.md)
were loaded and concatenated into the system prompt. This bloated the prompt
with potentially redundant or conflicting instructions.
Now only ONE project context type is loaded, using priority order:
1. .hermes.md / HERMES.md (walk to git root)
2. AGENTS.md / agents.md (recursive directory walk)
3. CLAUDE.md / claude.md (cwd only, NEW)
4. .cursorrules / .cursor/rules/*.mdc (cwd only)
SOUL.md from HERMES_HOME remains independent and always loads.
Also adds CLAUDE.md as a recognized context file format, matching the
convention popularized by Claude Code.
Refactored the monolithic function into four focused helpers:
_load_hermes_md, _load_agents_md, _load_claude_md, _load_cursorrules.
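The "first match wins" selection can be sketched as below. load_project_context and the lambda loaders are hypothetical; the real helpers take paths and do their own directory walking.

```python
from typing import Callable, Optional, Sequence, Tuple

Loader = Callable[[], Optional[str]]


def load_project_context(
    loaders: Sequence[Tuple[str, Loader]],
) -> Optional[Tuple[str, str]]:
    """Try loaders in priority order and return the first non-empty result."""
    for name, loader in loaders:
        content = loader()
        if content:
            return name, content  # lower-priority types are never consulted
    return None
```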
Tests: replaced 1 coexistence test with 10 new tests covering priority
ordering, CLAUDE.md loading, case sensitivity, and injection blocking.
Cherry-picked from PR #2201 by @Gutslabs.
session_search resolved hits to parent/root sessions but only excluded
the exact current_session_id. If the active session was a child
continuation (compression/delegation), its parent could still appear
as a 'past' conversation result.
Fix: resolve current_session_id to its lineage root before filtering,
so the entire active lineage (parent and children) is excluded.
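A sketch of the lineage filter, assuming parent links are available as a plain mapping. lineage_root and exclude_active_lineage are hypothetical names, and hits are assumed to be already resolved to their roots as described above.

```python
from typing import Dict, List, Optional


def lineage_root(session_id: str, parent_of: Dict[str, Optional[str]]) -> str:
    """Follow parent links until a session with no parent is reached."""
    seen = set()
    current = session_id
    while parent_of.get(current) is not None and current not in seen:
        seen.add(current)  # guards against accidental cycles
        current = parent_of[current]
    return current


def exclude_active_lineage(
    hit_roots: List[str],
    current_session_id: str,
    parent_of: Dict[str, Optional[str]],
) -> List[str]:
    active_root = lineage_root(current_session_id, parent_of)
    return [root for root in hit_roots if root != active_root]
```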
Remove the [Files already read — do NOT re-read these] user message
that was injected into the conversation after context compression.
This message used role='user' for system-generated content, creating
a fake user turn that confused models about conversation state and
could contribute to task-redo behavior.
The file_tools.py read tracker (warn on 3rd consecutive read, block
on 4th+) already handles re-read prevention inline without injecting
synthetic messages.
Closes #2224.
Co-authored-by: Test <test@test.com>
Replace asyncio.run() with thread-local persistent event loops for
worker threads (e.g., delegate_task's ThreadPoolExecutor). asyncio.run()
creates and closes a fresh loop on every call, leaving cached
httpx/AsyncOpenAI clients bound to a dead loop — causing 'Event loop is
closed' errors during GC when parallel subagents clean up connections.
The fix mirrors the main thread's _get_tool_loop() pattern but uses
threading.local() so each worker thread gets its own long-lived loop,
avoiding both cross-thread contention and the create-destroy lifecycle.
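The pattern can be sketched as follows. get_worker_loop and run_in_worker are hypothetical names rather than the functions in the codebase, and loop shutdown is left out for brevity.

```python
import asyncio
import threading

_worker_state = threading.local()


def get_worker_loop() -> asyncio.AbstractEventLoop:
    """Return this thread's long-lived loop, creating it on first use."""
    loop = getattr(_worker_state, "loop", None)
    if loop is None or loop.is_closed():
        loop = asyncio.new_event_loop()
        _worker_state.loop = loop
    return loop


def run_in_worker(coro):
    # Unlike asyncio.run(), the loop stays open afterwards, so clients
    # cached during this call remain bound to a live loop.
    return get_worker_loop().run_until_complete(coro)
```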
Added 4 regression tests covering worker loop persistence, reuse,
per-thread isolation, and separation from the main thread's loop.