Commit Graph

3511 Commits

Author SHA1 Message Date
alt-glitch
d684d7ee7e feat(environments): unified spawn-per-call execution layer
Replace dual execution model (PersistentShellMixin + per-backend oneshot)
with spawn-per-call + session snapshot for all backends except ManagedModal.

Core changes:
- Every command spawns a fresh bash process; session snapshot (env vars,
  functions, aliases) captured at init and re-sourced before each command
- CWD persists via file-based read (local) or in-band stdout markers (remote)
- ProcessHandle protocol + _ThreadedProcessHandle adapter for SDK backends
- cancel_fn wired for Modal (sandbox.terminate) and Daytona (sandbox.stop)
- Shared utilities extracted: _pipe_stdin, _popen_bash, _load_json_store,
  _save_json_store, _file_mtime_key, _SYNC_INTERVAL_SECONDS
- Rate-limited file sync unified in base _before_execute() with _sync_files() hook
- execute_oneshot() removed; all 11 call sites in code_execution_tool.py
  migrated to execute()
- Daytona timeout wrapper replaced with SDK-native timeout parameter
- persistent_shell.py deleted (291 lines)

Backend-specific:
- Local: process-group kill via os.killpg, file-based CWD read
- Docker: -e env flags only on init_session, not per-command
- SSH: shlex.quote transport, ControlMaster connection reuse
- Singularity: apptainer exec with instance://, no forced --pwd
- Modal: _AsyncWorker + _ThreadedProcessHandle, cancel_fn -> sandbox.terminate
- Daytona: SDK-level timeout (not shell wrapper), cancel_fn -> sandbox.stop
- ManagedModal: unchanged (gateway owns execution); docstring added explaining why
2026-04-08 17:23:15 -07:00
Teknium
7d26feb9a3 feat(discord): add DISCORD_REPLY_TO_MODE setting (#6333)
Add configurable reply-reference behavior for Discord, matching the
existing Telegram (TELEGRAM_REPLY_TO_MODE) and Mattermost
(MATTERMOST_REPLY_MODE) implementations.

Modes:
- 'off': never reply-reference the original message
- 'first': reply-reference on first chunk only (default, current behavior)
- 'all': reply-reference on every chunk

Set DISCORD_REPLY_TO_MODE=off in .env to disable reply-to messages.

Changes:
- gateway/config.py: parse DISCORD_REPLY_TO_MODE env var
- gateway/platforms/discord.py: read reply_to_mode from config, respect
  it in send() — skip fetch_message entirely when 'off'
- hermes_cli/config.py: add to OPTIONAL_ENV_VARS for hermes setup
- 23 tests covering config, send behavior, env var override
- docs: discord.md env var table + environment-variables.md reference

Closes community request from Stuart on Discord.
2026-04-08 17:08:40 -07:00
kshitijk4poor
875a72e4c8 fix: normalize httpx.URL base_url + strip thinking signatures for third-party endpoints
Two linked fixes for MiniMax Anthropic-compatible fallback:

1. Normalize httpx.URL to str before calling .rstrip() in auth/provider
   detection helpers. Some client objects expose base_url as httpx.URL,
   not str — crashed with AttributeError in _requires_bearer_auth() and
   _is_third_party_anthropic_endpoint(). Also fixes _try_activate_fallback()
   to use the already-stringified fb_base_url instead of raw httpx.URL.

2. Strip Anthropic-proprietary thinking block signatures when targeting
   third-party Anthropic-compatible endpoints (MiniMax, Azure AI Foundry,
   self-hosted proxies). These endpoints cannot validate Anthropic's
   signatures and reject them with HTTP 400 'Invalid signature in
   thinking block'. Now threads base_url through convert_messages_to_anthropic()
   → build_anthropic_kwargs() so signature management is endpoint-aware.

Based on PR #4945 by kshitijk4poor (rstrip fix).
Fixes #4944.
2026-04-08 16:39:29 -07:00
Teknium
20a5e589c6 docs: clarify that provider "main" is for auxiliary tasks only (#6291)
Users were setting model.provider to "main" after reading the auxiliary
provider docs, causing "Unknown provider" errors. The "main" alias is
only valid inside auxiliary:, compression:, and fallback_model: configs
where it means "use the same provider as my main agent chat."

Added warning admonitions and inline clarifications to:
- configuration.md: Auxiliary Models provider list and Provider Options table
- fallback-providers.md: Provider Options for Auxiliary Tasks table

Reported by community member cn on Discord.
2026-04-08 16:39:17 -07:00
Teknium
7156f8d866 fix: CI test failures — metadata key, cli console, docker env, vision order (#6294)
Fixes 9 test failures on current main, incorporating ideas from PR stack
#6219-#6222 by xinbenlv with corrections:

- model_metadata: sync HF context length key casing
  (minimaxai/minimax-m2.5 → MiniMaxAI/MiniMax-M2.5)

- cli.py: route quick command error output through self.console
  instead of creating a new ChatConsole() instance

- docker.py: explicit docker_forward_env entries now bypass the
  Hermes secret blocklist (intentional opt-in wins over generic filter)

- auxiliary_client: revert _read_main_provider() to simple
  provider.strip().lower() — the _normalize_aux_provider() call
  introduced in 5c03f2e7 stripped the custom: prefix, breaking
  named custom provider resolution

- auxiliary_client: flip vision auto-detection order to
  active provider → OpenRouter → Nous → stop (was OR → Nous → active)

- test: update vision priority test to match new order

Based on PR #6219-#6222 by xinbenlv.
2026-04-08 16:37:05 -07:00
Siddharth Balyan
8de91ce9d2 fix(nix): make addToSystemPackages fully functional for interactive CLI (#6317)
* fix(nix): export HERMES_HOME system-wide when addToSystemPackages is true

The `addToSystemPackages` option's documentation (and the `:::tip` block in
`website/docs/getting-started/nix-setup.md`) promises that enabling it both
puts the `hermes` CLI on PATH and sets `HERMES_HOME` system-wide so interactive
shells share state with the gateway service. The module only did the former,
so running `hermes` in a user shell silently created a separate `~/.hermes/`
directory instead of the managed `${stateDir}/.hermes`.

Implement the documented behavior by also setting
`environment.variables.HERMES_HOME = "${cfg.stateDir}/.hermes"` in the same
mkIf block, and update the option description to match.

Fixes #6044

* fix(nix): preserve group-readable permissions in managed mode

The NixOS module sets HERMES_HOME directories to 0750 and files to 0640
so interactive users in the hermes group can share state with the gateway
service. Two issues prevented this from working:

1. hermes_cli/config.py: _secure_dir() unconditionally chmod'd HERMES_HOME
   to 0700 on every startup, overwriting the NixOS module's 0750. Similarly,
   _secure_file() forced 0600 on config files. Both now skip in managed mode
   (detected via .managed marker or HERMES_MANAGED env var).

2. nix/nixosModules.nix: the .env file was created with 0600 (owner-only),
   while config.yaml was already 0640 (group-readable). Changed to 0640 for
   consistency — users granted hermes group membership should be able to read
   the managed .env.

Verified with a NixOS VM integration test: a normal user in the hermes group
can now run `hermes version` and `hermes config` against the managed
HERMES_HOME without PermissionError.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: zerone0x <zerone0x@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 04:09:53 +05:30
yyovil
8385f54e98 fix(nix): preserve voice deps on aarch64-darwin via nixpkgs (#5079)
* Fixes the nix profile installation for hermes agent

(cherry picked from commit c822a082a8c0ce33f3d406e6b2ae1b2833071df0)

* Update nix/python.nix

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Applied gating for aarch64-darwin platform

Entire-Checkpoint: 1ab2074bd4f1

---------

Co-authored-by: yyovil <tanishq231003@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-09 03:39:39 +05:30
Teknium
105caa001b chore: regenerate uv.lock against current main 2026-04-08 13:47:08 -07:00
jjovalle99
d46db0a1b4 fix(tools): use correct import path for mistralai SDK
mistralai v2.x is a namespace package — `Mistral` class lives at
`mistralai.client`, not at the top-level `mistralai` module. The
previous `from mistralai import Mistral` raises ImportError at runtime.

Update both production code and test fixture to use the correct path.
2026-04-08 13:47:08 -07:00
jjovalle99
5f4b93c20f feat(tools): add Voxtral Transcribe STT provider (Mistral AI) 2026-04-08 13:47:08 -07:00
Teknium
5d2fc6d928 fix: cleanup Qwen OAuth provider gaps
- Add HERMES_QWEN_BASE_URL to OPTIONAL_ENV_VARS in config.py (was missing
  despite being referenced in code)
- Remove redundant qwen-oauth entry from _API_KEY_PROVIDER_AUX_MODELS
  (non-aggregator providers use their main model for aux tasks automatically)
2026-04-08 13:46:30 -07:00
kshitijk4poor
3377017eb4 feat(qwen): add Qwen OAuth provider with portal request support
Based on #6079 by @tunamitom with critical fixes and comprehensive tests.

Changes from #6079:
- Fix: sanitization overwrite bug — Qwen message prep now runs AFTER codex
  field sanitization, not before (was silently discarding Qwen transforms)
- Fix: missing try/except AuthError in runtime_provider.py — stale Qwen
  credentials now fall through to next provider on auto-detect
- Fix: 'qwen' alias conflict — bare 'qwen' stays mapped to 'alibaba'
  (DashScope); use 'qwen-portal' or 'qwen-cli' for the OAuth provider
- Fix: hardcoded ['coder-model'] replaced with live API fetch + curated
  fallback list (qwen3-coder-plus, qwen3-coder)
- Fix: extract _is_qwen_portal() helper + _qwen_portal_headers() to replace
  5 inline 'portal.qwen.ai' string checks and share headers between init
  and credential swap
- Fix: add Qwen branch to _apply_client_headers_for_base_url for mid-session
  credential swaps
- Fix: remove suspicious TypeError catch blocks around _prompt_provider_choice
- Fix: handle bare string items in content lists (were silently dropped)
- Fix: remove redundant dict() copies after deepcopy in message prep
- Revert: unrelated ai-gateway test mock removal and model_switch.py comment deletion

New tests (30 test functions):
- _qwen_cli_auth_path, _read_qwen_cli_tokens (success + 3 error paths)
- _save_qwen_cli_tokens (roundtrip, parent creation, permissions)
- _qwen_access_token_is_expiring (5 edge cases: fresh, expired, within skew,
  None, non-numeric)
- _refresh_qwen_cli_tokens (success, preserve old refresh, 4 error paths,
  default expires_in, disk persistence)
- resolve_qwen_runtime_credentials (fresh, auto-refresh, force-refresh,
  missing token, env override)
- get_qwen_auth_status (logged in, not logged in)
- Runtime provider resolution (direct, pool entry, alias)
- _build_api_kwargs (metadata, vl_high_resolution_images, message formatting,
  max_tokens suppression)
2026-04-08 13:46:30 -07:00
Teknium
a1213d06bd fix(hindsight): correct config key mismatch and add base URL support (#6282)
Fixes #6259. Three bugs fixed:

1. Config key mismatch: _get_client() and _start_daemon() read
   'llmApiKey' (camelCase) but save_config() stores 'llm_api_key'
   (snake_case). The config value was never read — only the env var
   fallback worked.

2. Missing base URL support: users on OpenRouter or custom endpoints
   had no way to configure HINDSIGHT_API_LLM_BASE_URL through setup.
   Added llm_base_url to config schema with empty default, passed
   conditionally to HindsightEmbedded constructor.

3. Daemon config change detection: config_changed now also checks
   HINDSIGHT_API_LLM_BASE_URL, and the daemon profile .env includes
   the base URL when set.

Keeps HINDSIGHT_API_LLM_API_KEY (with double API) in the daemon
profile .env — this matches the upstream hindsight .env.example
convention.
2026-04-08 13:46:14 -07:00
Teknium
1631895d5a docs(telegram): add proxy support section
Documents the proxy env var support added in PR #3591 (salvage of #3411
by @kufufu9). Covers HTTPS_PROXY/HTTP_PROXY/ALL_PROXY precedence,
configuration methods, and scope.
2026-04-08 13:45:14 -07:00
Teknium
4f467700d4 fix(doctor): only check the active memory provider, not all providers unconditionally (#6285)
* fix(tools): skip camofox auto-cleanup when managed persistence is enabled

When managed_persistence is enabled, cleanup_browser() was calling
camofox_close() which destroys the server-side browser context via
DELETE /sessions/{userId}, killing login sessions across cron runs.

Add camofox_soft_cleanup() — a public wrapper that drops only the
in-memory session entry when managed persistence is on, returning True.
When persistence is off it returns False so the caller falls back to
the full camofox_close().  The inactivity reaper still handles idle
resource cleanup.

Also surface a logger.warning() when _managed_persistence_enabled()
fails to load config, replacing a silent except-and-return-False.

Salvaged from #6182 by el-analista (Eduardo Perea Fernandez).
Added public API wrapper to avoid cross-module private imports,
and test coverage for both persistence paths.

Co-authored-by: Eduardo Perea Fernandez <el-analista@users.noreply.github.com>

* fix(doctor): only check the active memory provider, not all providers unconditionally

hermes doctor had hardcoded Honcho Memory and Mem0 Memory sections that
always ran regardless of the user's memory.provider config setting. After
the swappable memory provider update (#4623), users with leftover Honcho
config but no active provider saw false 'broken' errors.

Replaced both sections with a single Memory Provider section that reads
memory.provider from config.yaml and only checks the configured provider.
Users with no external provider see a green 'Built-in memory active' check.

Reported by community user michaelruiz001, confirmed by Eri (Honcho).

---------

Co-authored-by: Eduardo Perea Fernandez <el-analista@users.noreply.github.com>
2026-04-08 13:44:58 -07:00
Teknium
ff6a86cb52 docs: update v0.8.0 highlights — notify_on_complete, MiMo v2 Pro, reorder 2026-04-08 04:59:45 -07:00
Teknium
86960cdbb0 chore: release v0.8.0 (2026.4.8) (#6135) 2026-04-08 04:56:20 -07:00
Teknium
8b0afa0e57 fix: aggressive worktree and branch cleanup to prevent accumulation (#6134)
Problem: hermes -w sessions accumulated 37+ worktrees and 1200+ orphaned
branches because:
- _cleanup_worktree bailed on any dirty working tree, but agent sessions
  almost always leave untracked files/artifacts behind
- _prune_stale_worktrees had the same dirty-check, so stale worktrees
  survived indefinitely
- pr-* and hermes/* branches from PR review had zero cleanup mechanism

Changes:
- _cleanup_worktree: check for unpushed commits instead of dirty state.
  Agent work lives in pushed commits/PRs — dirty working tree without
  unpushed commits is just artifacts, safe to remove.
- _prune_stale_worktrees: three-tier age system:
  - Under 24h: skip (session may be active)
  - 24h-72h: remove if no unpushed commits
  - Over 72h: force remove regardless
- New _prune_orphaned_branches: on each -w startup, deletes local
  hermes/hermes-* and pr-* branches with no corresponding worktree.
  Protects main, checked-out branch, and active worktree branches.

Tests: 42 pass (6 new covering unpushed-commit logic, force-prune
tier, and orphaned branch cleanup).
2026-04-08 04:44:49 -07:00
Teknium
ab21fbfd89 fix: add gateway coverage for session boundary hooks, move test to tests/cli/
- Fire on_session_finalize and on_session_reset in gateway _handle_reset_command()
- Fire on_session_finalize during gateway stop() for each active agent
- Move CLI test from tests/ root to tests/cli/ (matches recent restructure)
- Add 5 gateway tests covering reset hooks, ordering, shutdown, and error handling
- Place on_session_reset after new session is guaranteed to exist (covers
  the get_or_create_session fallback path)
2026-04-08 04:27:34 -07:00
Felipe de Leon
bdc72ec355 feat(cli): add on_session_finalize and on_session_reset plugin hooks
Plugins can now subscribe to session boundary events via
ctx.register_hook('on_session_finalize', ...) and
ctx.register_hook('on_session_reset', ...).

on_session_finalize — fires during CLI exit (/quit, Ctrl-C) and
before /new or /reset, giving plugins a chance to flush or clean up.

on_session_reset — fires after a new session is created via
/new or /reset, so plugins can initialize per-session state.

Closes #5592
2026-04-08 04:27:34 -07:00
Teknium
c8a5e36be8 feat(prompting): self-optimized GPT/Codex tool-use guidance via automated behavioral benchmarking (#6120)
Hermes Agent identified and patched its own prompting blind spots through
automated self-evaluation — running 64+ tool-use benchmarks across GPT-5.4
and Codex-5.3, diagnosing 5 failure modes, writing targeted prompt patches,
and verifying the fix in a closed loop.

Failure modes discovered and fixed:
- Mental arithmetic (wrong answers: 39,152,053 vs correct 39,151,253)
- User profile hallucination ('Windows 11' when running on Linux)
- Time guessing without verification
- Clarification-seeking instead of acting ('open where?' for port checks)
- Hash computation from memory (SHA-256, encodings)
- Confusing system RAM with agent's own persistent memory store

Two new XML sections added to OPENAI_MODEL_EXECUTION_GUIDANCE:
- <mandatory_tool_use>: explicit categories that must always use tools
- <act_dont_ask>: default to action on obvious interpretations

Results:
  gpt-5.4:       68.8% → 100% tool compliance (+31.2pp)
  gpt-5.3-codex: 62.5% → 100% tool compliance (+37.5pp)
  Regression:    0/8 conversational prompts over-tooled
2026-04-08 04:06:42 -07:00
Teknium
1368caf66f fix(anthropic): smart thinking block signature management (#6112)
Anthropic signs thinking blocks against the full turn content. Any
upstream mutation (context compression, session truncation, orphan
stripping, message merging) invalidates the signature, causing HTTP 400
'Invalid signature in thinking block' — especially in long-lived
gateway sessions.

Strategy (following clawdbot/OpenClaw pattern):

1. Strip thinking/redacted_thinking from all assistant messages EXCEPT
   the last one — preserves reasoning continuity on the current
   tool-use chain while avoiding stale signature errors on older turns.

2. Downgrade unsigned thinking blocks to plain text — Anthropic can't
   validate them, but the reasoning content is preserved.

3. Strip cache_control from thinking/redacted_thinking blocks to
   prevent cache markers from interfering with signature validation.

4. Drop thinking blocks from the second message when merging
   consecutive assistant messages (role alternation enforcement).

5. Error recovery: on HTTP 400 mentioning 'signature' and 'thinking',
   strip all reasoning_details from the conversation and retry once.
   This is the safety net for edge cases the proactive stripping
   misses.

Addresses the issue reported in PR #6086 by @mingginwan while
preserving reasoning continuity (their PR stripped ALL thinking
blocks unconditionally).

Files changed:
- agent/anthropic_adapter.py: thinking block management in
  convert_messages_to_anthropic (strip old turns, downgrade unsigned,
  strip cache_control, merge-time strip)
- run_agent.py: one-shot signature error recovery in retry loop
- tests/test_anthropic_adapter.py: 10 new tests covering all cases
2026-04-08 03:38:08 -07:00
Teknium
30ea423ce8 fix: unify reasoning_effort to config.yaml only, remove HERMES_REASONING_EFFORT env var
Gateway and cron had inconsistent reasoning_effort resolution:
- CLI: config.yaml only (correct)
- Gateway: config.yaml first, env var fallback
- Cron: env var first, config.yaml fallback

All three now read exclusively from agent.reasoning_effort in config.yaml.
Removed HERMES_REASONING_EFFORT env var support entirely — .env is for
secrets only, not behavioral config.
2026-04-08 03:36:44 -07:00
mrshu
19b0ddce40 fix(process): correct detached crash recovery state
Previously crash recovery recreated detached sessions as if they were
fully managed, so polls and kills could lie about liveness and the
checkpoint could forget recovered jobs after the next restart.
This commit refreshes recovered host-backed sessions from real PID
state, keeps checkpoint data durable, and preserves notify watcher
metadata while treating sandbox-only PIDs as non-recoverable.

- Persist `pid_scope` in `tools/process_registry.py` and skip
  recovering sandbox-backed entries without a host-visible PID handle
- Refresh detached sessions on access so `get`/`poll`/`wait` and active
  session queries observe exited processes instead of hanging forever
- Allow recovered host PIDs to be terminated honestly and requeue
  `notify_on_complete` watchers during checkpoint recovery
- Add regression tests for durable checkpoints, detached exit/kill
  behavior, sandbox skip logic, and recovered notify watchers
2026-04-08 03:35:43 -07:00
landy
383db35925 fix: improve streaming fallback after edit failures 2026-04-08 03:33:43 -07:00
史官
55ac056920 fix(hindsight): add missing get_hermes_home import
Import hermes_constants.get_hermes_home at module level so it is
available in _start_daemon() when local mode starts the embedded
daemon. Previously the import was only inside _load_config(), causing
NameError when _start_daemon() referenced get_hermes_home().

Fixes #5993

Co-Authored-By: 史官 <historian@slock.team>
2026-04-08 03:18:04 -07:00
Vasanthdev2004
085c1c6875 fix(browser): preserve agent-browser paths with spaces 2026-04-08 02:35:48 -07:00
Teknium
a18e5b95ad docs: add Hermes Mod visual skin editor section to skins page (#6095)
Add documentation for cocktailpeanut's hermes-mod community tool —
a web UI for creating and managing Hermes skins visually. Covers
installation (Pinokio, npx, manual), usage walkthrough, and feature
overview including ASCII art generation from images.

Ref: https://github.com/cocktailpeanut/hermes-mod
2026-04-08 02:28:40 -07:00
Teknium
3696c74bfb fix: preserve existing thresholds, remove pre-read byte guard
- DEFAULT_RESULT_SIZE_CHARS: 50K -> 100K (match current _LARGE_RESULT_CHARS)
- DEFAULT_PREVIEW_SIZE_CHARS: 2K -> 1.5K (match current _LARGE_RESULT_PREVIEW_CHARS)
- Per-tool overrides all set to 100K (terminal, execute_code, search_files)
- Remove pre-read byte guard (no behavioral regression vs current main)
- Revert limit signature change to int=500 (match current default)
- Restore original read_file schema description
- Update test assertions to match 100K thresholds
2026-04-08 02:24:32 -07:00
alt-glitch
bbcff8dcd0 fix(tools): address PR review — remove _extract_raw_output, BudgetConfig everywhere, read_file hardening
- Remove _extract_raw_output: persist content verbatim (fixes size mismatch bug)
- Drop import aliases: import from budget_config directly, one canonical name
- BudgetConfig param on maybe_persist_tool_result and enforce_turn_budget
- read_file: limit=None signature, pre-read guard fires only when limit omitted (256KB)
- Unify binary extensions: file_operations.py imports from binary_extensions.py
- Exclude .pdf and .svg from binary set (text-based, agents may inspect)
- Remove redundant outer try/except in eval path (internal fallback handles it)
- Fix broken tests: update assertion strings for new persistence format
- Module-level constants: _PRE_READ_MAX_BYTES, _DEFAULT_READ_LIMIT
- Remove redundant pathlib import (Path already at module level)
- Update spec.md with IMPLEMENTED annotations and design decisions
2026-04-08 02:24:32 -07:00
alt-glitch
77c5bc9da9 feat(budget): make tool result persistence thresholds configurable
Add BudgetConfig dataclass to centralize and make overridable the
hardcoded constants (50K per-result, 200K per-turn, 2K preview) that
control when tool outputs get persisted to sandbox. Configurable at
the RL environment level via HermesAgentEnvConfig fields, threaded
through HermesAgentLoop to the storage layer.

Resolution: pinned (read_file=inf) > env config overrides > registry
per-tool > default. CLI override: --env.turn_budget_chars 80000
2026-04-08 02:24:32 -07:00
alt-glitch
65e24c942e wip: tool result fixes -- persistence 2026-04-08 02:24:32 -07:00
kshitij
22d1bda185 fix(minimax): correct context lengths, model catalog, thinking guard, aux model, and config base_url
Cherry-picked from PR #6046 by kshitijk4poor with dead code stripped.

- Context lengths: 204800 → 1M (M1) / 1048576 (M2.5/M2.7) per official docs
- Model catalog: add M1 family, remove deprecated M2.1 and highspeed variants
- Thinking guard: skip extended thinking for MiniMax (Anthropic-compat endpoint)
- Aux model: MiniMax-M2.7-highspeed → MiniMax-M2.7 (same model, half price)
- Config base_url: honour model.base_url for API-key providers (fixes China users)
- Stripped unused get_minimax_max_output() / _MINIMAX_MAX_OUTPUT (no consumer)

Fixes #5777, #4082, #6039. Closes #3895.
2026-04-08 02:20:46 -07:00
Mibayy
ab271ebe10 fix(vision): simplify vision auto-detection to openrouter → nous → active provider
Simplify the vision auto-detection chain from 5 backends (openrouter,
nous, codex, anthropic, custom) down to 3:

  1. OpenRouter  (known vision-capable default model)
  2. Nous Portal (known vision-capable default model)
  3. Active provider + model (whatever the user is running)
  4. Stop

This is simpler and more predictable. The active provider step uses
resolve_provider_client() which handles all provider types including
named custom providers (from #5978).

Removed the complex preferred-provider promotion logic and API-level
fallback — the chain is short enough that it doesn't need them.

Based on PR #5376 by Mibay. Closes #5366.
2026-04-08 01:21:54 -07:00
zocomputer
e1befe5077 feat(agent): add jittered retry backoff
Adds agent/retry_utils.py with jittered_backoff() — exponential backoff
with additive jitter to prevent thundering-herd retry spikes when
multiple gateway sessions hit the same rate-limited provider.

Replaces fixed exponential backoff at 4 call sites:
- run_agent.py: None-choices retry path (5s base, 120s cap)
- run_agent.py: API error retry path (2s base, 60s cap)
- trajectory_compressor.py: sync + async summarization retries

Thread-safe jitter counter with overflow guards ensures unique seeds
across concurrent retries.

Trimmed from original PR to keep only wired-in functionality.

Co-authored-by: martinp09 <martinp09@users.noreply.github.com>
2026-04-08 00:41:36 -07:00
Teknium
fff237e111 feat(cron): track delivery failures in job status (#6042)
_deliver_result() now returns Optional[str] — None on success, error
message on failure. All failure paths (unknown platform, platform
disabled, config load error, send failure, unresolvable target)
return descriptive error strings.

mark_job_run() gains delivery_error param, tracked as
last_delivery_error on the job — separate from agent execution errors.
A job where the agent succeeded but delivery failed shows
last_status='ok' + last_delivery_error='...'.

The cronjob list tool now surfaces last_delivery_error so agents and
users can see when cron outputs aren't arriving.

Inspired by PR #5863 (oxngon) — reimplemented with proper wiring.

Tests: 3 new mark_job_run tests + 6 new _deliver_result return tests.
2026-04-07 22:49:01 -07:00
Teknium
598c25d43e feat(feishu): add interactive card approval buttons (#6043)
Add button-based exec approval to the Feishu adapter, matching the
existing Discord, Telegram, and Slack implementations.

When the agent encounters a dangerous command, Feishu users now see
an interactive card with four buttons instead of text instructions:
- Allow Once (primary)
- Allow Session
- Always Allow
- Deny (danger)

Implementation:
- send_exec_approval() sends an interactive card via the Feishu
  message API with buttons carrying hermes_action in their value dict
- _handle_card_action_event() intercepts approval button clicks
  before routing them as synthetic commands, directly calling
  resolve_gateway_approval() to unblock the agent thread
- _update_approval_card() replaces the orange approval card with a
  green (approved) or red (denied) status card showing who acted
- _approval_state dict tracks pending approval_id → session_key
  mappings; cleaned up on resolution

The gateway's existing routing in _approval_notify_sync already checks
getattr(type(adapter), 'send_exec_approval', None) and will
automatically use the button-based flow for Feishu.

Tests: 16 new tests covering send, callback resolution, state
management, card updates, and non-interference with existing card
actions.
2026-04-07 22:45:14 -07:00
Teknium
5c03f2e7cc fix: provider/model resolution — salvage 4 PRs + MiniMax aux URL fix (#5983)
Salvaged fixes from community PRs:

- fix(model_switch): _read_auth_store → _load_auth_store + fix auth store
  key lookup (was checking top-level dict instead of store['providers']).
  OAuth providers now correctly detected in /model picker.
  Cherry-picked from PR #5911 by Xule Lin (linxule).

- fix(ollama): pass num_ctx to override 2048 default context window.
  Ollama defaults to 2048 context regardless of model capabilities. Now
  auto-detects from /api/show metadata and injects num_ctx into every
  request. Config override via model.ollama_num_ctx. Fixes #2708.
  Cherry-picked from PR #5929 by kshitij (kshitijk4poor).

- fix(aux): normalize provider aliases for vision/auxiliary routing.
  Adds _normalize_aux_provider() with 17 aliases (google→gemini,
  claude→anthropic, glm→zai, etc). Fixes vision routing failure when
  provider is set to 'google' instead of 'gemini'.
  Cherry-picked from PR #5793 by e11i (Elizabeth1979).

- fix(aux): rewrite MiniMax /anthropic base URLs to /v1 for OpenAI SDK.
  MiniMax's inference_base_url ends in /anthropic (Anthropic Messages API),
  but auxiliary client uses OpenAI SDK which appends /chat/completions →
  404 at /anthropic/chat/completions. Generic _to_openai_base_url() helper
  rewrites terminal /anthropic to /v1 for OpenAI-compatible endpoint.
  Inspired by PR #5786 by Lempkey.

Added debug logging to silent exception blocks across all fixes.

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-04-07 22:23:28 -07:00
Teknium
8d7a98d2ff feat: use mimo-v2-pro for non-vision auxiliary tasks on Nous free tier (#6018)
Free-tier Nous Portal users were getting mimo-v2-omni (a multimodal
model) for all auxiliary tasks including compression, session search,
and web extraction. Now routes non-vision tasks to mimo-v2-pro (a
text model) which is better suited for those workloads.

- Added _NOUS_FREE_TIER_AUX_MODEL constant for text auxiliary tasks
- _try_nous() accepts vision=False param to select the right model
- Vision path (_resolve_strict_vision_backend) passes vision=True
- All other callers default to vision=False → mimo-v2-pro
2026-04-07 21:41:05 -07:00
Jonathan Barket
7fe6782a25 feat(tools): add "no_mcp" sentinel to exclude MCP servers per platform
Currently, MCP servers are included on all platforms by default. If a
platform's toolset list does not explicitly name any MCP servers, every
globally enabled MCP server is injected. There is no way to opt a
platform out of MCP servers entirely.

This matters for the API server platform when used as an execution
backend — each spawned agent session gets the full MCP tool schema
injected into its system prompt, dramatically inflating token usage
(e.g. 57K tokens vs 9K without MCP tools) and slowing response times.

Add a "no_mcp" sentinel value for platform_toolsets. When present in a
platform's toolset list, all MCP servers are excluded for that platform.
Other platforms are unaffected.

Usage in config.yaml:

    platform_toolsets:
      api_server:
        - terminal
        - file
        - web
        - no_mcp    # exclude all MCP servers

The sentinel is filtered out of the final toolset — it does not appear
as an actual toolset name.
2026-04-07 18:00:01 -07:00
Teknium
b9a5e6e247 fix: use camelCase structuredContent attr, prefer structured over text
- The MCP SDK Pydantic model uses camelCase (structuredContent), not
  snake_case (structured_content). The original getattr was a silent no-op.
- When structuredContent is present, return it AS the result instead of
  alongside text — the structured payload is the machine-readable data.
- Move test file to tests/tools/ and fix fake class to use camelCase.
- Patch _run_on_mcp_loop in tests so the handler actually executes.
2026-04-07 18:00:01 -07:00
r266-tech
363c5bc3c3 test(mcp): add structured_content preservation tests 2026-04-07 18:00:01 -07:00
r266-tech
2ad7694874 fix(mcp): preserve structured_content in tool call results
MCP CallToolResult may include structured_content (a JSON object) alongside
content blocks. The tool handler previously only forwarded concatenated text
from content blocks, silently dropping the structured payload.

This breaks MCP tools that return a minimal human text in content while
putting the actual machine-usable payload in structured_content.

Now, when structured_content is present, it is included in the returned
JSON under the 'structuredContent' key.

Fixes NousResearch/hermes-agent#5874
2026-04-07 18:00:01 -07:00
Teknium
cbf1f15cfe fix(auxiliary): resolve named custom providers and 'main' alias in auxiliary routing (#5978)
* fix(telegram): replace substring caption check with exact line-by-line match

Captions in photo bursts and media group albums were silently dropped when
a shorter caption happened to be a substring of an existing one (e.g.
"Meeting" lost inside "Meeting agenda"). Extract a shared _merge_caption
static helper that splits on "\n\n" and uses exact match with whitespace
normalisation, then use it in both _enqueue_photo_event and
_queue_media_group_event.

Adds 13 unit tests covering the fixed bug scenarios.

Cherry-picked from PR #2671 by Dilee.

* fix: extend caption substring fix to all platforms

Move _merge_caption helper from TelegramAdapter to BasePlatformAdapter
so all adapters inherit it. Fix the same substring-containment bug in:
- gateway/platforms/base.py (photo burst merging)
- gateway/run.py (priority photo follow-up merging)
- gateway/platforms/feishu.py (media batch merging)

The original fix only covered telegram.py. The same bug existed in base.py
and run.py (pure substring check) and feishu.py (list membership without
whitespace normalization).

* fix(auxiliary): resolve named custom providers and 'main' alias in auxiliary routing

Two bugs caused auxiliary tasks (vision, compression, etc.) to fail when
using named custom providers defined in config.yaml:

1. 'provider: main' was hardcoded to 'custom', which only checks legacy
   OPENAI_BASE_URL env vars. Now reads _read_main_provider() to resolve
   to the actual provider (e.g., 'custom:beans', 'openrouter', 'deepseek').

2. Named custom provider names (e.g., 'beans') fell through to
   PROVIDER_REGISTRY which doesn't know about config.yaml entries.
   Now checks _get_named_custom_provider() before the registry fallback.

Fixes both resolve_provider_client() and _normalize_vision_provider()
so the fix covers all auxiliary tasks (vision, compression, web_extract,
session_search, etc.).

Adds 13 unit tests. Reported by Laura via Discord.

---------

Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>
2026-04-07 17:59:47 -07:00
Teknium
9692b3c28a fix: CLI/UX batch — ChatConsole errors, curses scroll, skin-aware banner, git state banner (#5974)
* fix(cli): route error messages through ChatConsole inside patch_stdout

Cherry-pick of PR #5798 by @icn5381.

Replace self.console.print() with ChatConsole().print() for 11 error/status
messages reachable during the interactive session. Inside patch_stdout,
self.console (plain Rich Console) writes raw ANSI escapes that StdoutProxy
mangles into garbled text. ChatConsole uses prompt_toolkit's native
print_formatted_text which renders correctly.

Same class of bug as #2262 — that fix covered agent output but missed
these error paths in _ensure_runtime_credentials, _init_agent, quick
commands, skill loading, and plan mode.

* fix(model-picker): add scrolling viewport to curses provider menu

Cherry-pick of PR #5790 by @Lempkey. Fixes #5755.

_curses_prompt_choice rendered items starting unconditionally from index 0
with no scroll offset. The 'More providers' submenu has 13 entries. On
terminals shorter than ~16 rows, items past the fold were never drawn.
When UP-arrow wrapped cursor from 0 to the last item (Cancel, index 12),
the highlight rendered off-screen — appearing as if only Cancel existed.

Adds scroll_offset tracking that adjusts each frame to keep the cursor
inside the visible window.

* feat(cli): skin-aware compact banner + git state in startup banner

Combined salvage of PR #5922 by @ASRagab and PR #5877 by @xinbenlv.

Compact banner changes (from #5922):
- Read active skin colors and branding instead of hardcoding gold/NOUS HERMES
- Default skin preserves backward-compatible legacy branding
- Non-default skins use their own agent_name and colors

Git state in banner (from #5877):
- New format_banner_version_label() shows upstream/local git hashes
- Full banner title now includes git state (upstream hash, carried commits)
- Compact banner line2 shows the version label with git state
- Widen compact banner max width from 64 to 88 to fit version info

Both the full Rich banner and compact fallback are now skin-aware
and show git state.
2026-04-07 17:59:42 -07:00
Teknium
f3c59321af fix: add _profile_arg tests + move STT language to config.yaml
- Add 7 unit tests for _profile_arg: default home, named profile,
  hash path, nested path, invalid name, systemd integration, launchd integration
- Add stt.local.language to config.yaml (empty = auto-detect)
- Both STT code paths now read config.yaml first, env var fallback,
  then default (auto-detect for faster-whisper, 'en' for CLI command)
- HERMES_LOCAL_STT_LANGUAGE env var still works as backward-compat fallback
2026-04-07 17:59:16 -07:00
Marc Bickel
6e02fa73c2 fix(discord): discard empty placeholder on voice transcription + force STT language
- gateway/run.py: Strip "(The user sent a message with no text content)"
  placeholder when voice transcription succeeds — it was being appended
  alongside the transcript, creating duplicate user turns.
- tools/transcription_tools.py: Wire HERMES_LOCAL_STT_LANGUAGE env var
  into the faster-whisper backend. It was only used by the CLI fallback
  path (_transcribe_local_command), not the primary faster-whisper path.
2026-04-07 17:59:16 -07:00
Marc Bickel
25080986a0 fix(gateway): discard empty placeholder when voice transcription succeeds
When a Discord voice message arrives, the adapter sets event.text to
"(The user sent a message with no text content)" since voice messages
have no text content. The transcription enrichment in
_enrich_message_with_transcription() then prepends the transcript but
leaves the placeholder intact, causing the agent to receive both:

    [The user sent a voice message~ Here's what they said: "..."]

    (The user sent a message with no text content)

The agent sees this as two separate user turns — one transcribed
and one empty — creating confusing duplicate messages.

Fix: when the transcription succeeds and user_text is only the empty
placeholder, return just the transcript without the redundant placeholder.
2026-04-07 17:59:16 -07:00
Jarvis AI
c3158d38b2 fix(gateway): include --profile in launchd/systemd argv for named profiles
generate_launchd_plist() and generate_systemd_unit() were missing the
--profile <name> argument in ProgramArguments/ExecStart, causing
hermes gateway start to regenerate plists that fell back to
~/.hermes/active_profile instead of the intended profile.

Fix:
- Add _profile_arg(hermes_home?) helper returning '--profile <name>'
  only for ~/.hermes/profiles/<name> paths, empty string otherwise.
- Update generate_launchd_plist() to build ProgramArguments array
  dynamically with --profile when applicable.
- Update generate_systemd_unit() both user and system service
  branches with {profile_arg} in ExecStart.

This ensures hermes --profile <name> gateway start produces a
service definition that correctly scopes to the named profile.
2026-04-07 17:59:16 -07:00
Teknium
50d1518df6 fix(tests): update tool_progress_callback test calls to new 4-arg signature
Follow-up to sroecker's PR #5918 — test mocks were using the old 3-arg
callback signature (name, preview, args) instead of the new
(event_type, name, preview, args, **kwargs).
2026-04-07 17:56:01 -07:00