Commit Graph

3493 Commits

Author SHA1 Message Date
Teknium
ab21fbfd89 fix: add gateway coverage for session boundary hooks, move test to tests/cli/
- Fire on_session_finalize and on_session_reset in gateway _handle_reset_command()
- Fire on_session_finalize during gateway stop() for each active agent
- Move CLI test from tests/ root to tests/cli/ (matches recent restructure)
- Add 5 gateway tests covering reset hooks, ordering, shutdown, and error handling
- Place on_session_reset after new session is guaranteed to exist (covers
  the get_or_create_session fallback path)
2026-04-08 04:27:34 -07:00
Felipe de Leon
bdc72ec355 feat(cli): add on_session_finalize and on_session_reset plugin hooks
Plugins can now subscribe to session boundary events via
ctx.register_hook('on_session_finalize', ...) and
ctx.register_hook('on_session_reset', ...).

on_session_finalize — fires during CLI exit (/quit, Ctrl-C) and
before /new or /reset, giving plugins a chance to flush or clean up.

on_session_reset — fires after a new session is created via
/new or /reset, so plugins can initialize per-session state.

Closes #5592
2026-04-08 04:27:34 -07:00
Teknium
c8a5e36be8 feat(prompting): self-optimized GPT/Codex tool-use guidance via automated behavioral benchmarking (#6120)
Hermes Agent identified and patched its own prompting blind spots through
automated self-evaluation — running 64+ tool-use benchmarks across GPT-5.4
and Codex-5.3, diagnosing 5 failure modes, writing targeted prompt patches,
and verifying the fix in a closed loop.

Failure modes discovered and fixed:
- Mental arithmetic (wrong answers: 39,152,053 vs correct 39,151,253)
- User profile hallucination ('Windows 11' when running on Linux)
- Time guessing without verification
- Clarification-seeking instead of acting ('open where?' for port checks)
- Hash computation from memory (SHA-256, encodings)
- Confusing system RAM with agent's own persistent memory store

Two new XML sections added to OPENAI_MODEL_EXECUTION_GUIDANCE:
- <mandatory_tool_use>: explicit categories that must always use tools
- <act_dont_ask>: default to action on obvious interpretations

Results:
  gpt-5.4:       68.8% → 100% tool compliance (+31.2pp)
  gpt-5.3-codex: 62.5% → 100% tool compliance (+37.5pp)
  Regression:    0/8 conversational prompts over-tooled
2026-04-08 04:06:42 -07:00
Teknium
1368caf66f fix(anthropic): smart thinking block signature management (#6112)
Anthropic signs thinking blocks against the full turn content. Any
upstream mutation (context compression, session truncation, orphan
stripping, message merging) invalidates the signature, causing HTTP 400
'Invalid signature in thinking block' — especially in long-lived
gateway sessions.

Strategy (following clawdbot/OpenClaw pattern):

1. Strip thinking/redacted_thinking from all assistant messages EXCEPT
   the last one — preserves reasoning continuity on the current
   tool-use chain while avoiding stale signature errors on older turns.

2. Downgrade unsigned thinking blocks to plain text — Anthropic can't
   validate them, but the reasoning content is preserved.

3. Strip cache_control from thinking/redacted_thinking blocks to
   prevent cache markers from interfering with signature validation.

4. Drop thinking blocks from the second message when merging
   consecutive assistant messages (role alternation enforcement).

5. Error recovery: on HTTP 400 mentioning 'signature' and 'thinking',
   strip all reasoning_details from the conversation and retry once.
   This is the safety net for edge cases the proactive stripping
   misses.

Addresses the issue reported in PR #6086 by @mingginwan while
preserving reasoning continuity (their PR stripped ALL thinking
blocks unconditionally).

Files changed:
- agent/anthropic_adapter.py: thinking block management in
  convert_messages_to_anthropic (strip old turns, downgrade unsigned,
  strip cache_control, merge-time strip)
- run_agent.py: one-shot signature error recovery in retry loop
- tests/test_anthropic_adapter.py: 10 new tests covering all cases
2026-04-08 03:38:08 -07:00
Teknium
30ea423ce8 fix: unify reasoning_effort to config.yaml only, remove HERMES_REASONING_EFFORT env var
Gateway and cron had inconsistent reasoning_effort resolution:
- CLI: config.yaml only (correct)
- Gateway: config.yaml first, env var fallback
- Cron: env var first, config.yaml fallback

All three now read exclusively from agent.reasoning_effort in config.yaml.
Removed HERMES_REASONING_EFFORT env var support entirely — .env is for
secrets only, not behavioral config.
2026-04-08 03:36:44 -07:00
mrshu
19b0ddce40 fix(process): correct detached crash recovery state
Previously crash recovery recreated detached sessions as if they were
fully managed, so polls and kills could lie about liveness and the
checkpoint could forget recovered jobs after the next restart.
This commit refreshes recovered host-backed sessions from real PID
state, keeps checkpoint data durable, and preserves notify watcher
metadata while treating sandbox-only PIDs as non-recoverable.

- Persist `pid_scope` in `tools/process_registry.py` and skip
  recovering sandbox-backed entries without a host-visible PID handle
- Refresh detached sessions on access so `get`/`poll`/`wait` and active
  session queries observe exited processes instead of hanging forever
- Allow recovered host PIDs to be terminated honestly and requeue
  `notify_on_complete` watchers during checkpoint recovery
- Add regression tests for durable checkpoints, detached exit/kill
  behavior, sandbox skip logic, and recovered notify watchers
2026-04-08 03:35:43 -07:00
landy
383db35925 fix: improve streaming fallback after edit failures 2026-04-08 03:33:43 -07:00
史官
55ac056920 fix(hindsight): add missing get_hermes_home import
Import hermes_constants.get_hermes_home at module level so it is
available in _start_daemon() when local mode starts the embedded
daemon. Previously the import was only inside _load_config(), causing
NameError when _start_daemon() referenced get_hermes_home().

Fixes #5993

Co-Authored-By: 史官 <historian@slock.team>
2026-04-08 03:18:04 -07:00
Vasanthdev2004
085c1c6875 fix(browser): preserve agent-browser paths with spaces 2026-04-08 02:35:48 -07:00
Teknium
a18e5b95ad docs: add Hermes Mod visual skin editor section to skins page (#6095)
Add documentation for cocktailpeanut's hermes-mod community tool —
a web UI for creating and managing Hermes skins visually. Covers
installation (Pinokio, npx, manual), usage walkthrough, and feature
overview including ASCII art generation from images.

Ref: https://github.com/cocktailpeanut/hermes-mod
2026-04-08 02:28:40 -07:00
Teknium
3696c74bfb fix: preserve existing thresholds, remove pre-read byte guard
- DEFAULT_RESULT_SIZE_CHARS: 50K -> 100K (match current _LARGE_RESULT_CHARS)
- DEFAULT_PREVIEW_SIZE_CHARS: 2K -> 1.5K (match current _LARGE_RESULT_PREVIEW_CHARS)
- Per-tool overrides all set to 100K (terminal, execute_code, search_files)
- Remove pre-read byte guard (no behavioral regression vs current main)
- Revert limit signature change to int=500 (match current default)
- Restore original read_file schema description
- Update test assertions to match 100K thresholds
2026-04-08 02:24:32 -07:00
alt-glitch
bbcff8dcd0 fix(tools): address PR review — remove _extract_raw_output, BudgetConfig everywhere, read_file hardening
- Remove _extract_raw_output: persist content verbatim (fixes size mismatch bug)
- Drop import aliases: import from budget_config directly, one canonical name
- BudgetConfig param on maybe_persist_tool_result and enforce_turn_budget
- read_file: limit=None signature, pre-read guard fires only when limit omitted (256KB)
- Unify binary extensions: file_operations.py imports from binary_extensions.py
- Exclude .pdf and .svg from binary set (text-based, agents may inspect)
- Remove redundant outer try/except in eval path (internal fallback handles it)
- Fix broken tests: update assertion strings for new persistence format
- Module-level constants: _PRE_READ_MAX_BYTES, _DEFAULT_READ_LIMIT
- Remove redundant pathlib import (Path already at module level)
- Update spec.md with IMPLEMENTED annotations and design decisions
2026-04-08 02:24:32 -07:00
alt-glitch
77c5bc9da9 feat(budget): make tool result persistence thresholds configurable
Add BudgetConfig dataclass to centralize and make overridable the
hardcoded constants (50K per-result, 200K per-turn, 2K preview) that
control when tool outputs get persisted to sandbox. Configurable at
the RL environment level via HermesAgentEnvConfig fields, threaded
through HermesAgentLoop to the storage layer.

Resolution: pinned (read_file=inf) > env config overrides > registry
per-tool > default. CLI override: --env.turn_budget_chars 80000
2026-04-08 02:24:32 -07:00
alt-glitch
65e24c942e wip: tool result fixes -- persistence 2026-04-08 02:24:32 -07:00
kshitij
22d1bda185 fix(minimax): correct context lengths, model catalog, thinking guard, aux model, and config base_url
Cherry-picked from PR #6046 by kshitijk4poor with dead code stripped.

- Context lengths: 204800 → 1M (M1) / 1048576 (M2.5/M2.7) per official docs
- Model catalog: add M1 family, remove deprecated M2.1 and highspeed variants
- Thinking guard: skip extended thinking for MiniMax (Anthropic-compat endpoint)
- Aux model: MiniMax-M2.7-highspeed → MiniMax-M2.7 (same model, half price)
- Config base_url: honour model.base_url for API-key providers (fixes China users)
- Stripped unused get_minimax_max_output() / _MINIMAX_MAX_OUTPUT (no consumer)

Fixes #5777, #4082, #6039. Closes #3895.
2026-04-08 02:20:46 -07:00
Mibayy
ab271ebe10 fix(vision): simplify vision auto-detection to openrouter → nous → active provider
Simplify the vision auto-detection chain from 5 backends (openrouter,
nous, codex, anthropic, custom) down to 3:

  1. OpenRouter  (known vision-capable default model)
  2. Nous Portal (known vision-capable default model)
  3. Active provider + model (whatever the user is running)
  4. Stop

This is simpler and more predictable. The active provider step uses
resolve_provider_client() which handles all provider types including
named custom providers (from #5978).

Removed the complex preferred-provider promotion logic and API-level
fallback — the chain is short enough that it doesn't need them.

Based on PR #5376 by Mibay. Closes #5366.
2026-04-08 01:21:54 -07:00
zocomputer
e1befe5077 feat(agent): add jittered retry backoff
Adds agent/retry_utils.py with jittered_backoff() — exponential backoff
with additive jitter to prevent thundering-herd retry spikes when
multiple gateway sessions hit the same rate-limited provider.

Replaces fixed exponential backoff at 4 call sites:
- run_agent.py: None-choices retry path (5s base, 120s cap)
- run_agent.py: API error retry path (2s base, 60s cap)
- trajectory_compressor.py: sync + async summarization retries

Thread-safe jitter counter with overflow guards ensures unique seeds
across concurrent retries.

Trimmed from original PR to keep only wired-in functionality.

Co-authored-by: martinp09 <martinp09@users.noreply.github.com>
2026-04-08 00:41:36 -07:00
Teknium
fff237e111 feat(cron): track delivery failures in job status (#6042)
_deliver_result() now returns Optional[str] — None on success, error
message on failure. All failure paths (unknown platform, platform
disabled, config load error, send failure, unresolvable target)
return descriptive error strings.

mark_job_run() gains delivery_error param, tracked as
last_delivery_error on the job — separate from agent execution errors.
A job where the agent succeeded but delivery failed shows
last_status='ok' + last_delivery_error='...'.

The cronjob list tool now surfaces last_delivery_error so agents and
users can see when cron outputs aren't arriving.

Inspired by PR #5863 (oxngon) — reimplemented with proper wiring.

Tests: 3 new mark_job_run tests + 6 new _deliver_result return tests.
2026-04-07 22:49:01 -07:00
Teknium
598c25d43e feat(feishu): add interactive card approval buttons (#6043)
Add button-based exec approval to the Feishu adapter, matching the
existing Discord, Telegram, and Slack implementations.

When the agent encounters a dangerous command, Feishu users now see
an interactive card with four buttons instead of text instructions:
- Allow Once (primary)
- Allow Session
- Always Allow
- Deny (danger)

Implementation:
- send_exec_approval() sends an interactive card via the Feishu
  message API with buttons carrying hermes_action in their value dict
- _handle_card_action_event() intercepts approval button clicks
  before routing them as synthetic commands, directly calling
  resolve_gateway_approval() to unblock the agent thread
- _update_approval_card() replaces the orange approval card with a
  green (approved) or red (denied) status card showing who acted
- _approval_state dict tracks pending approval_id → session_key
  mappings; cleaned up on resolution

The gateway's existing routing in _approval_notify_sync already checks
getattr(type(adapter), 'send_exec_approval', None) and will
automatically use the button-based flow for Feishu.

Tests: 16 new tests covering send, callback resolution, state
management, card updates, and non-interference with existing card
actions.
2026-04-07 22:45:14 -07:00
Teknium
5c03f2e7cc fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983)
Salvaged fixes from community PRs:

- fix(model_switch): _read_auth_store → _load_auth_store + fix auth store
  key lookup (was checking top-level dict instead of store['providers']).
  OAuth providers now correctly detected in /model picker.
  Cherry-picked from PR #5911 by Xule Lin (linxule).

- fix(ollama): pass num_ctx to override 2048 default context window.
  Ollama defaults to 2048 context regardless of model capabilities. Now
  auto-detects from /api/show metadata and injects num_ctx into every
  request. Config override via model.ollama_num_ctx. Fixes #2708.
  Cherry-picked from PR #5929 by kshitij (kshitijk4poor).

- fix(aux): normalize provider aliases for vision/auxiliary routing.
  Adds _normalize_aux_provider() with 17 aliases (google→gemini,
  claude→anthropic, glm→zai, etc). Fixes vision routing failure when
  provider is set to 'google' instead of 'gemini'.
  Cherry-picked from PR #5793 by e11i (Elizabeth1979).

- fix(aux): rewrite MiniMax /anthropic base URLs to /v1 for OpenAI SDK.
  MiniMax's inference_base_url ends in /anthropic (Anthropic Messages API),
  but auxiliary client uses OpenAI SDK which appends /chat/completions →
  404 at /anthropic/chat/completions. Generic _to_openai_base_url() helper
  rewrites terminal /anthropic to /v1 for OpenAI-compatible endpoint.
  Inspired by PR #5786 by Lempkey.

Added debug logging to silent exception blocks across all fixes.

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-04-07 22:23:28 -07:00
Teknium
8d7a98d2ff feat: use mimo-v2-pro for non-vision auxiliary tasks on Nous free tier (#6018)
Free-tier Nous Portal users were getting mimo-v2-omni (a multimodal
model) for all auxiliary tasks including compression, session search,
and web extraction. Now routes non-vision tasks to mimo-v2-pro (a
text model) which is better suited for those workloads.

- Added _NOUS_FREE_TIER_AUX_MODEL constant for text auxiliary tasks
- _try_nous() accepts vision=False param to select the right model
- Vision path (_resolve_strict_vision_backend) passes vision=True
- All other callers default to vision=False → mimo-v2-pro
2026-04-07 21:41:05 -07:00
Jonathan Barket
7fe6782a25 feat(tools): add "no_mcp" sentinel to exclude MCP servers per platform
Currently, MCP servers are included on all platforms by default. If a
platform's toolset list does not explicitly name any MCP servers, every
globally enabled MCP server is injected. There is no way to opt a
platform out of MCP servers entirely.

This matters for the API server platform when used as an execution
backend — each spawned agent session gets the full MCP tool schema
injected into its system prompt, dramatically inflating token usage
(e.g. 57K tokens vs 9K without MCP tools) and slowing response times.

Add a "no_mcp" sentinel value for platform_toolsets. When present in a
platform's toolset list, all MCP servers are excluded for that platform.
Other platforms are unaffected.

Usage in config.yaml:

    platform_toolsets:
      api_server:
        - terminal
        - file
        - web
        - no_mcp    # exclude all MCP servers

The sentinel is filtered out of the final toolset — it does not appear
as an actual toolset name.
2026-04-07 18:00:01 -07:00
Teknium
b9a5e6e247 fix: use camelCase structuredContent attr, prefer structured over text
- The MCP SDK Pydantic model uses camelCase (structuredContent), not
  snake_case (structured_content). The original getattr was a silent no-op.
- When structuredContent is present, return it AS the result instead of
  alongside text — the structured payload is the machine-readable data.
- Move test file to tests/tools/ and fix fake class to use camelCase.
- Patch _run_on_mcp_loop in tests so the handler actually executes.
2026-04-07 18:00:01 -07:00
r266-tech
363c5bc3c3 test(mcp): add structured_content preservation tests 2026-04-07 18:00:01 -07:00
r266-tech
2ad7694874 fix(mcp): preserve structured_content in tool call results
MCP CallToolResult may include structured_content (a JSON object) alongside
content blocks. The tool handler previously only forwarded concatenated text
from content blocks, silently dropping the structured payload.

This breaks MCP tools that return a minimal human text in content while
putting the actual machine-usable payload in structured_content.

Now, when structured_content is present, it is included in the returned
JSON under the 'structuredContent' key.

Fixes NousResearch/hermes-agent#5874
2026-04-07 18:00:01 -07:00
Teknium
cbf1f15cfe fix(auxiliary): resolve named custom providers and 'main' alias in auxiliary routing (#5978)
* fix(telegram): replace substring caption check with exact line-by-line match

Captions in photo bursts and media group albums were silently dropped when
a shorter caption happened to be a substring of an existing one (e.g.
"Meeting" lost inside "Meeting agenda"). Extract a shared _merge_caption
static helper that splits on "\n\n" and uses exact match with whitespace
normalisation, then use it in both _enqueue_photo_event and
_queue_media_group_event.

Adds 13 unit tests covering the fixed bug scenarios.

Cherry-picked from PR #2671 by Dilee.

* fix: extend caption substring fix to all platforms

Move _merge_caption helper from TelegramAdapter to BasePlatformAdapter
so all adapters inherit it. Fix the same substring-containment bug in:
- gateway/platforms/base.py (photo burst merging)
- gateway/run.py (priority photo follow-up merging)
- gateway/platforms/feishu.py (media batch merging)

The original fix only covered telegram.py. The same bug existed in base.py
and run.py (pure substring check) and feishu.py (list membership without
whitespace normalization).

* fix(auxiliary): resolve named custom providers and 'main' alias in auxiliary routing

Two bugs caused auxiliary tasks (vision, compression, etc.) to fail when
using named custom providers defined in config.yaml:

1. 'provider: main' was hardcoded to 'custom', which only checks legacy
   OPENAI_BASE_URL env vars. Now reads _read_main_provider() to resolve
   to the actual provider (e.g., 'custom:beans', 'openrouter', 'deepseek').

2. Named custom provider names (e.g., 'beans') fell through to
   PROVIDER_REGISTRY which doesn't know about config.yaml entries.
   Now checks _get_named_custom_provider() before the registry fallback.

Fixes both resolve_provider_client() and _normalize_vision_provider()
so the fix covers all auxiliary tasks (vision, compression, web_extract,
session_search, etc.).

Adds 13 unit tests. Reported by Laura via Discord.

---------

Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>
2026-04-07 17:59:47 -07:00
Teknium
9692b3c28a fix: CLI/UX batch — ChatConsole errors, curses scroll, skin-aware banner, git state banner (#5974)
* fix(cli): route error messages through ChatConsole inside patch_stdout

Cherry-pick of PR #5798 by @icn5381.

Replace self.console.print() with ChatConsole().print() for 11 error/status
messages reachable during the interactive session. Inside patch_stdout,
self.console (plain Rich Console) writes raw ANSI escapes that StdoutProxy
mangles into garbled text. ChatConsole uses prompt_toolkit's native
print_formatted_text which renders correctly.

Same class of bug as #2262 — that fix covered agent output but missed
these error paths in _ensure_runtime_credentials, _init_agent, quick
commands, skill loading, and plan mode.

* fix(model-picker): add scrolling viewport to curses provider menu

Cherry-pick of PR #5790 by @Lempkey. Fixes #5755.

_curses_prompt_choice rendered items starting unconditionally from index 0
with no scroll offset. The 'More providers' submenu has 13 entries. On
terminals shorter than ~16 rows, items past the fold were never drawn.
When UP-arrow wrapped cursor from 0 to the last item (Cancel, index 12),
the highlight rendered off-screen — appearing as if only Cancel existed.

Adds scroll_offset tracking that adjusts each frame to keep the cursor
inside the visible window.

* feat(cli): skin-aware compact banner + git state in startup banner

Combined salvage of PR #5922 by @ASRagab and PR #5877 by @xinbenlv.

Compact banner changes (from #5922):
- Read active skin colors and branding instead of hardcoding gold/NOUS HERMES
- Default skin preserves backward-compatible legacy branding
- Non-default skins use their own agent_name and colors

Git state in banner (from #5877):
- New format_banner_version_label() shows upstream/local git hashes
- Full banner title now includes git state (upstream hash, carried commits)
- Compact banner line2 shows the version label with git state
- Widen compact banner max width from 64 to 88 to fit version info

Both the full Rich banner and compact fallback are now skin-aware
and show git state.
2026-04-07 17:59:42 -07:00
Teknium
f3c59321af fix: add _profile_arg tests + move STT language to config.yaml
- Add 7 unit tests for _profile_arg: default home, named profile,
  hash path, nested path, invalid name, systemd integration, launchd integration
- Add stt.local.language to config.yaml (empty = auto-detect)
- Both STT code paths now read config.yaml first, env var fallback,
  then default (auto-detect for faster-whisper, 'en' for CLI command)
- HERMES_LOCAL_STT_LANGUAGE env var still works as backward-compat fallback
2026-04-07 17:59:16 -07:00
Marc Bickel
6e02fa73c2 fix(discord): discard empty placeholder on voice transcription + force STT language
- gateway/run.py: Strip "(The user sent a message with no text content)"
  placeholder when voice transcription succeeds — it was being appended
  alongside the transcript, creating duplicate user turns.
- tools/transcription_tools.py: Wire HERMES_LOCAL_STT_LANGUAGE env var
  into the faster-whisper backend. It was only used by the CLI fallback
  path (_transcribe_local_command), not the primary faster-whisper path.
2026-04-07 17:59:16 -07:00
Marc Bickel
25080986a0 fix(gateway): discard empty placeholder when voice transcription succeeds
When a Discord voice message arrives, the adapter sets event.text to
"(The user sent a message with no text content)" since voice messages
have no text content. The transcription enrichment in
_enrich_message_with_transcription() then prepends the transcript but
leaves the placeholder intact, causing the agent to receive both:

    [The user sent a voice message~ Here's what they said: "..."]

    (The user sent a message with no text content)

The agent sees this as two separate user turns — one transcribed
and one empty — creating confusing duplicate messages.

Fix: when the transcription succeeds and user_text is only the empty
placeholder, return just the transcript without the redundant placeholder.
2026-04-07 17:59:16 -07:00
Jarvis AI
c3158d38b2 fix(gateway): include --profile in launchd/systemd argv for named profiles
generate_launchd_plist() and generate_systemd_unit() were missing the
--profile <name> argument in ProgramArguments/ExecStart, causing
hermes gateway start to regenerate plists that fell back to
~/.hermes/active_profile instead of the intended profile.

Fix:
- Add _profile_arg(hermes_home?) helper returning '--profile <name>'
  only for ~/.hermes/profiles/<name> paths, empty string otherwise.
- Update generate_launchd_plist() to build ProgramArguments array
  dynamically with --profile when applicable.
- Update generate_systemd_unit() both user and system service
  branches with {profile_arg} in ExecStart.

This ensures hermes --profile <name> gateway start produces a
service definition that correctly scopes to the named profile.
2026-04-07 17:59:16 -07:00
Teknium
50d1518df6 fix(tests): update tool_progress_callback test calls to new 4-arg signature
Follow-up to sroecker's PR #5918 — test mocks were using the old 3-arg
callback signature (name, preview, args) instead of the new
(event_type, name, preview, args, **kwargs).
2026-04-07 17:56:01 -07:00
pradeep7127
1d5a69a445 fix(api_server): preserve conversation history when /v1/runs input is a message array
When /v1/runs receives an OpenAI-style array of messages as input, all
messages except the last user turn are now extracted as conversation_history.
Previously only the last message was kept, silently discarding earlier
context in multi-turn conversations.

Handles multi-part content blocks by flattening text portions. Only fires
when no explicit conversation_history was provided.

Based on PR #5837 by pradeep7127.
2026-04-07 17:56:01 -07:00
VanBladee
786038443e feat(api): accept conversation_history in request body
Allow clients to pass explicit conversation_history in /v1/responses and
/v1/runs request bodies instead of relying on server-side response chaining
via previous_response_id. Solves problems with stateless deployments where
the in-memory ResponseStore is lost on restart.

Adds input validation (must be array of {role, content} objects) and clear
precedence: explicit conversation_history > previous_response_id.

Based on PR #5805 by VanBladee, with added input validation.
2026-04-07 17:56:01 -07:00
Steffen Röcker
7ec838507a fix(api_server): update tool_progress_callback signature for Open WebUI streaming
Commit cc2b56b2 changed the tool_progress_callback signature from
(name, preview, args) to (event_type, name, preview, args, **kwargs)
but the API server's chat completion streaming callback was not updated.

This caused tool calls to not display in Open WebUI because the
callback received arguments in wrong positions.

- Update _on_tool_progress to use new 4-arg signature
- Add event_type filter to only show tool.started events
- Add **kwargs for optional duration/is_error parameters
2026-04-07 17:56:01 -07:00
Teknium
efbe8d674a docs: add Discord channel controls and Telegram reactions documentation
- Discord: ignored_channels, no_thread_channels config reference + examples
- Telegram: message reactions section with config, behavior notes
- Environment variables reference updated for all new vars
2026-04-07 17:55:55 -07:00
Teknium
a6547f399f test: add tests for Discord channel controls and Telegram reactions
- 14 tests for ignored_channels, no_thread_channels, and config bridging
- 17 tests for reaction enable/disable, API calls, error handling, and config
2026-04-07 17:55:55 -07:00
Teknium
52b3a3ca3a fix: default Telegram reactions to off, remove dead _remove_reaction
Telegram's set_message_reaction replaces all reactions in one call,
so _remove_reaction was never called (unlike Discord's additive model).
Default reactions to disabled — users opt in via telegram.reactions: true.
2026-04-07 17:55:55 -07:00
Alvaro Linares
74b0072f8f feat(telegram): add message reactions on processing start/complete
Mirror the Discord reaction pattern for Telegram:
- 👀 (eyes) when message processing begins
-  (check) on successful completion
-  (cross) on failure

Controlled via TELEGRAM_REACTIONS env var or telegram.reactions
in config.yaml (enabled by default, like Discord).

Uses python-telegram-bot's Bot.set_message_reaction() API.
Failures are caught and logged at debug level so they never
break message processing.
2026-04-07 17:55:55 -07:00
Angello Picasso
f6d4b6a319 feat(discord): add ignored_channels and no_thread_channels config
- ignored_channels: channels where bot never responds (even when mentioned)
- no_thread_channels: channels where bot responds directly without thread

Both support config.yaml and env vars (DISCORD_IGNORED_CHANNELS,
DISCORD_NO_THREAD_CHANNELS), following existing pattern for
free_response_channels.

Fixes #5881
2026-04-07 17:55:55 -07:00
lesterli
37bf19a29d fix(codex): align validation with normalization for empty stream output
The response validation stage unconditionally marked Codex Responses API
replies as invalid when response.output was empty, triggering unnecessary
retries and fallback chains. However, _normalize_codex_response can
recover from this state by synthesizing output from response.output_text.

Now the validation stage checks for output_text before marking the
response invalid, matching the normalization logic. Also fixes
logging.warning → logger.warning for consistency with the rest of the
file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:29:41 -07:00
Teknium
469cd16fe0 fix(security): consolidated security hardening — SSRF, timing attack, tar traversal, credential leakage (#5944)
Salvaged from PRs #5800 (memosr), #5806 (memosr), #5915 (Ruzzgar), #5928 (Awsh1).

Changes:
- Use hmac.compare_digest for API key comparison (timing attack prevention)
- Apply provider env var blocklist to Docker containers (credential leakage)
- Replace tar.extractall() with safe extraction in TerminalBench2 (CVE-2007-4559)
- Add SSRF protection via is_safe_url to ALL platform adapters:
  base.py (cache_image_from_url, cache_audio_from_url),
  discord, slack, telegram, matrix, mattermost, feishu, wecom
  (Signal and WhatsApp protected via base.py helpers)
- Update tests: mock is_safe_url in Mattermost download tests
- Add security tests for tar extraction (traversal, symlinks, safe files)
2026-04-07 17:28:37 -07:00
Teknium
b1a66d55b4 refactor: migrate 10 config.yaml inline loaders to read_raw_config()
Replace 10 callsites across 6 files that manually opened config.yaml,
called yaml.safe_load(), and handled missing-file/parse-error fallbacks
with the new read_raw_config() helper from hermes_cli/config.py.

Each migrated site previously had 5-8 lines of boilerplate:
    config_path = get_hermes_home() / 'config.yaml'
    if config_path.exists():
        import yaml
        with open(config_path) as f:
            cfg = yaml.safe_load(f) or {}

Now reduced to:
    from hermes_cli.config import read_raw_config
    cfg = read_raw_config()

Migrated files:
- tools/browser_tool.py (4 sites): command_timeout, cloud_provider,
  allow_private_urls, record_sessions
- tools/env_passthrough.py: terminal.env_passthrough
- tools/credential_files.py: terminal.credential_files
- tools/transcription_tools.py: stt.model
- hermes_cli/commands.py: config-gated command resolution
- hermes_cli/auth.py (2 sites): model config read + provider reset

Skipped (intentionally):
- gateway/run.py: 10+ sites with local aliases, critical path
- hermes_cli/profiles.py: profile-specific config path
- hermes_cli/doctor.py: reads raw then writes fixes back
- agent/model_metadata.py: different file (context_length_cache.yaml)
- tools/website_policy.py: custom config_path param + error types
2026-04-07 17:28:23 -07:00
Zainan Victor Zhou
0d41fb0827 fix(gateway): show full session id and title in /status 2026-04-07 17:27:09 -07:00
Jeff Escalante
4aef055805 fix(gateway/webhook): don't pop delivery_info on send
The webhook adapter stored per-request `deliver`/`deliver_extra` config in
`_delivery_info[chat_id]` during POST handling and consumed it via `.pop()`
inside `send()`. That worked for routes whose agent run produced exactly
one outbound message — the final response — but it broke whenever the
agent emitted any interim status message before the final response.

Status messages flow through the same `send(chat_id, ...)` path as the
final response (see `gateway/run.py::_status_callback_sync` →
`adapter.send(...)`). Common triggers include:

  - "🔄 Primary model failed — switching to fallback: ..."
    (run_agent.py::_emit_status when `fallback_providers` activates)
  - context-pressure / compression notices
  - any other lifecycle event routed through `status_callback`

When any of those fired, the first `send()` call popped the entry, so the
subsequent final-response `send()` saw an empty dict and silently
downgraded `deliver_type` from `"telegram"` (or `discord`/`slack`/etc.) to
the default `"log"`. The agent's response was logged to the gateway log
instead of being delivered to the configured cross-platform target — no
warning, no error, just a missing message.

This was easy to hit in practice. Any user with `fallback_providers`
configured saw it the first time their primary provider hiccuped on a
webhook-triggered run. Routes that worked perfectly in dev (where the
primary stays healthy) silently dropped responses in prod.

Fix: read `_delivery_info` with `.get()` so multiple `send()` calls for
the same `chat_id` all see the same delivery config. To keep the dict
bounded without relying on per-send cleanup, add a parallel
`_delivery_info_created` timestamp dict and a `_prune_delivery_info()`
helper that drops entries older than `_idempotency_ttl` (1h, same window
already used by `_seen_deliveries`). Pruning runs on each POST, mirroring
the existing `_seen_deliveries` cleanup pattern.

Worst-case memory footprint is now `rate_limit * TTL = 30/min * 60min =
1800` entries, each ~1KB → under 2 MB. In practice it'll be far smaller
because most webhooks complete in seconds, not the full hour.

Test changes:
  - `test_delivery_info_cleaned_after_send` is replaced with
    `test_delivery_info_survives_multiple_sends`, which is now the
    regression test for this bug — it asserts that two consecutive
    `send()` calls both see the delivery config.
  - A new `test_delivery_info_pruned_via_ttl` covers the TTL cleanup
    behavior.
  - The two integration tests that asserted `chat_id not in
    adapter._delivery_info` after `send()` now assert the opposite, with
    a comment explaining why.

All 40 tests in `tests/gateway/test_webhook_adapter.py` and
`tests/gateway/test_webhook_integration.py` pass. Verified end-to-end
locally against a dynamic `hermes webhook subscribe` route configured
with `--deliver telegram --deliver-chat-id <user>`: with `gpt-5.4` as
the primary (currently flaky) and `claude-opus-4.6` as the fallback,
the fallback notification fires, the agent finishes, and the final
response is delivered to Telegram as expected.
2026-04-07 17:27:09 -07:00
Siddharth Balyan
f3006ebef9 refactor(tests): re-architect tests + fix CI failures (#5946)
* refactor: re-architect tests to mirror the codebase

* Update tests.yml

* fix: add missing tool_error imports after registry refactor

* fix(tests): replace patch.dict with monkeypatch to prevent env var leaks under xdist

patch.dict(os.environ) can leak TERMINAL_ENV across xdist workers,
causing test_code_execution tests to hit the Modal remote path.

* fix(tests): fix update_check and telegram xdist failures

- test_update_check: replace patch("hermes_cli.banner.os.getenv") with
  monkeypatch.setenv("HERMES_HOME") — banner.py no longer imports os
  directly, it uses get_hermes_home() from hermes_constants.

- test_telegram_conflict/approval_buttons: provide real exception classes
  for telegram.error mock (NetworkError, TimedOut, BadRequest) so the
  except clause in connect() doesn't fail with "catching classes that do
  not inherit from BaseException" when xdist pollutes sys.modules.

* fix(tests): accept unavailable_models kwarg in _prompt_model_selection mock
2026-04-07 17:19:07 -07:00
Teknium
99ff375f7a fix(gateway): respect tool_preview_length in all/new progress modes (#5937)
Previously, all/new tool progress modes always hard-truncated previews
to 40 chars, ignoring the display.tool_preview_length config. This made
it impossible for gateway users to see meaningful command/path info
without switching to verbose mode (which shows too much detail).

Now all/new modes read tool_preview_length from config:
- tool_preview_length: 0 (default/unset) → 40 chars (no regression)
- tool_preview_length: 120 → 120-char previews in all/new mode
- verbose mode: unchanged (already respected the config)

Users who want longer previews can set:
  display:
    tool_preview_length: 120

Reported by demontut_ on Discord.
2026-04-07 14:10:56 -07:00
Teknium
125e5ef089 fix: extend caption substring fix to all platforms
Move _merge_caption helper from TelegramAdapter to BasePlatformAdapter
so all adapters inherit it. Fix the same substring-containment bug in:
- gateway/platforms/base.py (photo burst merging)
- gateway/run.py (priority photo follow-up merging)
- gateway/platforms/feishu.py (media batch merging)

The original fix only covered telegram.py. The same bug existed in base.py
and run.py (pure substring check) and feishu.py (list membership without
whitespace normalization).
2026-04-07 14:08:59 -07:00
Dilee
4a630c2071 fix(telegram): replace substring caption check with exact line-by-line match
Captions in photo bursts and media group albums were silently dropped when
a shorter caption happened to be a substring of an existing one (e.g.
"Meeting" lost inside "Meeting agenda"). Extract a shared _merge_caption
static helper that splits on "\n\n" and uses exact match with whitespace
normalisation, then use it in both _enqueue_photo_event and
_queue_media_group_event.

Adds 13 unit tests covering the fixed bug scenarios.

Cherry-picked from PR #2671 by Dilee.
2026-04-07 14:08:59 -07:00
Teknium
7b18eeee9b feat(supermemory): add multi-container, search_mode, identity template, and env var override (#5933)
Based on PR #5413 spec by MaheshtheDev (Mahesh Sanikommu).

Changes:
- Add search_mode config (hybrid/memories/documents) passed to SDK
- Add {identity} template support in container_tag for profile-scoped containers
- Add SUPERMEMORY_CONTAINER_TAG env var override (priority over config)
- Add multi-container mode: enable_custom_container_tags, custom_containers,
  custom_container_instructions in supermemory.json
- Dynamic tool schemas when multi-container enabled (optional container_tag param)
- Whitelist validation for custom container tags in tool calls
- Simplify get_config_schema() to only prompt for API key during setup
- Defer container_tag sanitization to initialize() (after template resolution)
- Add custom_id support to documents.add calls
- Update README with multi-container docs, search_mode, identity template,
  support links (Discord, email)
- Update memory-providers.md with new features and multi-container example
- Update memory-provider-plugin.md with minimal vs full schema guidance
- Add 12 new tests covering identity template, search_mode, multi-container,
  config schema, and env var override
2026-04-07 14:03:46 -07:00