Commit Graph

447 Commits

Author SHA1 Message Date
teknium1
936040d8f7 fix: guard init-time stdio writes 2026-03-14 02:19:46 -07:00
teknium1
163fa4a9d1 refactor(cli): implement approval locking mechanism to serialize concurrent requests
- Introduced _approval_lock to ensure that approval prompts are handled sequentially, preventing state clobbering from parallel delegation subtasks.
- Updated approval_callback and HermesCLI methods to utilize the lock for managing approval state and deadlines.
- Added tests for the config bridging logic to ensure correct environment variable mapping from config.yaml.
2026-03-13 23:59:18 -07:00
Teknium
a20d373945 fix: worktree-aware minisweagent path discovery + clean up requirements check (#1248)
Salvage of PR #1246 by ChatGPT (teknium1 session), resolved against
current main which already includes #1239.

Changes:
- Add minisweagent_path.py: worktree-aware helper that finds
  mini-swe-agent/src from either the current checkout or the main
  checkout behind a git worktree
- Use the helper in tools/terminal_tool.py and mini_swe_runner.py
  instead of naive path-relative lookup that fails in worktrees
- Clean up check_terminal_requirements():
  - local: return True (no minisweagent dep, per #1239)
  - singularity/ssh: remove unnecessary minisweagent imports
  - docker/modal: use importlib.util.find_spec with clear error
- Add regression tests for worktree path discovery and tool resolution
2026-03-13 23:39:51 -07:00
Teknium
21422dba44 Merge pull request #1239 from NousResearch/hermes/hermes-07d947aa
fix: stop local terminal warning without minisweagent
2026-03-13 22:14:44 -07:00
teknium1
329f83ff2d fix: stop local terminal warning without minisweagent 2026-03-13 22:00:36 -07:00
Teknium
af8791a49d test: fix stale CI assumptions in parser and quick-command coverage (#1236)
- update managed-server compatibility tests to match the current
  ServerManager.tool_parser wiring used by hermes_base_env
- make quick-command CLI assertions accept Rich Text objects, which is how
  ANSI-safe output is rendered now
- set HERMES_HOME explicitly in the Discord auto-thread config bridge test
  so it loads the intended temporary config file

Validated with the targeted test set and the full pytest suite.
2026-03-13 21:56:12 -07:00
Teknium
7c3cb9bb31 Merge pull request #1227 from NousResearch/hermes/hermes-07d947aa
fix: surface gpt-5.4 in codex setup
2026-03-13 21:55:51 -07:00
teknium1
253d54a9e1 fix(cli): make /new, /reset, and /clear start real fresh sessions
Create a new session DB row when starting fresh from the CLI, reset the
agent DB flush cursor and todo state, and update session timing/session ID
bookkeeping so follow-up logging stays correct.

Also update slash-command descriptions and add regression tests for /new,
/reset, and /clear.

Supersedes PR #899.
Closes #641.
2026-03-13 21:53:54 -07:00
teknium1
206e56cc5e fix: finish HERMES_HOME path cleanup
- route CLI interrupt debug logging through HERMES_HOME
- update the remaining channel_directory test to patch HERMES_HOME
  instead of Path.home()
2026-03-13 21:35:07 -07:00
teknium1
607689095e fix: add codex forward-compat model listing 2026-03-13 21:34:01 -07:00
0xIbra
437ec17125 fix(cli): respect HERMES_HOME in all remaining hardcoded ~/.hermes paths
Several files resolved paths via Path.home() / ".hermes" or
os.path.expanduser("~/.hermes/..."), bypassing the HERMES_HOME
environment variable. This broke isolation when running multiple
Hermes instances with distinct HERMES_HOME directories.

Replace all hardcoded paths with calls to get_hermes_home() from
hermes_cli.config, consistent with the rest of the codebase.

Files fixed:
- tools/process_registry.py (processes.json)
- gateway/pairing.py (pairing/)
- gateway/sticker_cache.py (sticker_cache.json)
- gateway/channel_directory.py (channel_directory.json, sessions.json)
- gateway/config.py (gateway.json, config.yaml, sessions_dir)
- gateway/mirror.py (sessions/)
- gateway/hooks.py (hooks/)
- gateway/platforms/base.py (image_cache/, audio_cache/, document_cache/)
- gateway/platforms/whatsapp.py (whatsapp/session)
- gateway/delivery.py (cron/output)
- agent/auxiliary_client.py (auth.json)
- agent/prompt_builder.py (SOUL.md)
- cli.py (config.yaml, images/, pastes/, history)
- run_agent.py (logs/)
- tools/environments/base.py (sandboxes/)
- tools/environments/modal.py (modal_snapshots.json)
- tools/environments/singularity.py (singularity_snapshots.json)
- tools/tts_tool.py (audio_cache)
- hermes_cli/status.py (cron/jobs.json, sessions.json)
- hermes_cli/gateway.py (logs/, whatsapp session)
- hermes_cli/main.py (whatsapp/session)

Tests updated to use HERMES_HOME env var instead of patching Path.home().

Closes #892

(cherry picked from commit 78ac1bba43b8b74a934c6172f2c29bb4d03164b9)
2026-03-13 21:32:53 -07:00
teknium1
899cb52e7a refactor: drop codex oauth model warning 2026-03-13 21:18:29 -07:00
teknium1
529729831c fix: explain codex oauth gpt-5.4 limits 2026-03-13 21:12:55 -07:00
Teknium
938e887b4c fix: keep honcho recall out of cached system prefix (#1201)
Attach later-turn Honcho recall to the current-turn user message at API
call time instead of appending it to the system prompt. This preserves the
stable system-prefix cache while keeping Honcho continuity context
available for the turn.

Also adds regression coverage for the injection helper and for continuing
sessions so Honcho recall stays out of the system prompt.
2026-03-13 21:07:00 -07:00
teknium1
57e98fe6c9 fix: surface gpt-5.4 in codex setup 2026-03-13 21:06:06 -07:00
Teknium
07d70a0345 test: cover empty cached Anthropic tool-call turns (#1222)
Add an integration-style regression test that runs prompt caching output
through the Anthropic adapter for an assistant tool-call turn with empty
content. This locks in the empty-text-block hotfix merged in PR #1216.
2026-03-13 20:44:25 -07:00
brandtcormorant
76efb0153a fix(cache_control) treat empty text like None to avoid anthropic api cache_control error 2026-03-13 18:08:46 -07:00
Teknium
bfb82b5cee fix: preserve Anthropic cache markers through adapter (#1205)
Keep assistant cache-control blocks intact when converting OpenAI-format
messages to Anthropic format, and propagate tool-message cache markers onto
generated tool_result blocks.

Adds regression tests covering assistant and tool cache marker preservation
through convert_messages_to_anthropic().
2026-03-13 13:27:03 -07:00
Teknium
c8bfb1db8f fix(gateway): add platform-specific notes to session context prompt (#1184)
Tell the agent what it CANNOT do on Slack and Discord — no searching
channel history, no pinning messages, no managing channels/roles.
Prevents the agent from hallucinating capabilities it doesn't have
and promising actions it can't deliver.

Addresses user feedback: agent says 'I'll search your Slack history'
then goes silent because no Slack-specific tools exist.
2026-03-13 12:34:11 -07:00
Teknium
07927f6bf2 feat(stt): add free local whisper transcription via faster-whisper (#1185)
* fix: Home Assistant event filtering now closed by default

Previously, when no watch_domains or watch_entities were configured,
ALL state_changed events passed through to the agent, causing users
to be flooded with notifications for every HA entity change.

Now events are dropped by default unless the user explicitly configures:
- watch_domains: list of domains to monitor (e.g. climate, light)
- watch_entities: list of specific entity IDs to monitor
- watch_all: true (new option — opt-in to receive all events)

A warning is logged at connect time if no filters are configured,
guiding users to set up their HA platform config.

All 49 gateway HA tests + 52 HA tool tests pass.

* docs: update Home Assistant integration documentation

- homeassistant.md: Fix event filtering docs to reflect closed-by-default
  behavior. Add watch_all option. Replace Python dict config example with
  YAML. Fix defaults table (was incorrectly showing 'all'). Add required
  configuration warning admonition.
- environment-variables.md: Add HASS_TOKEN and HASS_URL to Messaging section.
- messaging/index.md: Add Home Assistant to description, architecture
  diagram, platform toolsets table, and Next Steps links.

* fix(terminal): strip provider env vars from background and PTY subprocesses

Extends the env var blocklist from #1157 to also cover the two remaining
leaky paths in process_registry.py:

- spawn_local() PTY path (line 156)
- spawn_local() background Popen path (line 197)

Both were still using raw os.environ, leaking provider vars to background
processes and interactive PTY sessions. Now uses the same dynamic
_HERMES_PROVIDER_ENV_BLOCKLIST from local.py.

Explicit env_vars passed to spawn_local() still override the blocklist,
matching the existing behavior for callers that intentionally need these.

Gap identified by PR #1004 (@PeterFile).

* feat(delegate): add observability metadata to subagent results

Enrich delegate_task results with metadata from the child AIAgent:

- model: which model the child used
- exit_reason: completed | interrupted | max_iterations
- tokens.input / tokens.output: token counts
- tool_trace: per-tool-call trace with byte sizes and ok/error status

Tool trace uses tool_call_id matching to correctly pair parallel tool
calls with their results, with a fallback for messages without IDs.

Cherry-picked from PR #872 by @omerkaz, with fixes:
- Fixed parallel tool call trace pairing (was always updating last entry)
- Removed redundant 'iterations' field (identical to existing 'api_calls')
- Added test for parallel tool call trace correctness

Co-authored-by: omerkaz <omerkaz@users.noreply.github.com>

* feat(stt): add free local whisper transcription via faster-whisper

Replace OpenAI-only STT with a dual-provider system mirroring the TTS
architecture (Edge TTS free / ElevenLabs paid):

  STT: faster-whisper local (free, default) / OpenAI Whisper API (paid)

Changes:
- tools/transcription_tools.py: Full rewrite with provider dispatch,
  config loading, local faster-whisper backend, and OpenAI API backend.
  Auto-downloads model (~150MB for 'base') on first voice message.
  Singleton model instance reused across calls.
- pyproject.toml: Add faster-whisper>=1.0.0 as core dependency
- hermes_cli/config.py: Expand stt config to match TTS pattern with
  provider selection and per-provider model settings
- agent/context_compressor.py: Fix .strip() crash when LLM returns
  non-string content (dict from llama.cpp, None). Fixes #1100 partially.
- tests/: 23 new tests for STT providers + 2 for compressor fix
- docs/: Updated Voice & TTS page with STT provider table, model sizes,
  config examples, and fallback behavior

Fallback behavior:
- Local not installed → OpenAI API (if key set)
- OpenAI key not set → local whisper (if installed)
- Neither → graceful error message to user

Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>

---------

Co-authored-by: omerkaz <omerkaz@users.noreply.github.com>
Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>
2026-03-13 11:11:05 -07:00
Teknium
11b577671b fix: auxiliary client uses main model for custom/local endpoints instead of gpt-4o-mini (#1189)
* fix: prevent model/provider mismatch when switching providers during active gateway

When _update_config_for_provider() writes the new provider and base_url
to config.yaml, the gateway (which re-reads config per-message) can pick
up the change before model selection completes. This causes the old model
name (e.g. 'anthropic/claude-opus-4.6') to be sent to the new provider's
API (e.g. MiniMax), which fails.

Changes:
- _update_config_for_provider() now accepts an optional default_model
  parameter. When provided and the current model.default is empty or
  uses OpenRouter format (contains '/'), it sets a safe default model
  for the new provider.
- All setup.py callers for direct-API providers (zai, kimi, minimax,
  minimax-cn, anthropic) now pass a provider-appropriate default model.
- _setup_provider_model_selection() now validates the 'Keep current'
  choice: if the current model uses OpenRouter format and wouldn't work
  with the new provider, it warns and switches to the provider's first
  default model instead of silently keeping the incompatible name.

Reported by a user on Home Assistant whose gateway started sending
'anthropic/claude-opus-4.6' to MiniMax's API after running hermes setup.

* fix: auxiliary client uses main model for custom/local endpoints instead of gpt-4o-mini

When a user runs a local server (e.g. Qwen3.5-9B via OPENAI_BASE_URL),
the auxiliary client (context compression, vision, session search) would
send requests for 'gpt-4o-mini' or 'google/gemini-3-flash-preview' to
the local server, which only serves one model — causing 404 errors
mid-task.

Changes:
- _try_custom_endpoint() now reads the user's configured main model via
  _read_main_model() (checks OPENAI_MODEL → HERMES_MODEL → LLM_MODEL →
  config.yaml model.default) instead of hardcoding 'gpt-4o-mini'.
- resolve_provider_client() auto mode now detects when an OpenRouter-
  formatted model override (containing '/') would be sent to a non-
  OpenRouter provider (like a local server) and drops it in favor of
  the provider's default model.
- Test isolation fixes: properly clear env vars in 'nothing available'
  tests to prevent host environment leakage.
2026-03-13 10:02:16 -07:00
Teknium
b8b45bfb77 feat(discord): add /thread command, auto_thread config, and media metadata fix (#1178)
- Add /thread slash command that creates a Discord thread and starts a
  new Hermes session in it. The starter message (if provided) becomes
  the first user input in the new session.
- Add discord.auto_thread config option (DISCORD_AUTO_THREAD env var):
  when enabled, every message in a text channel automatically creates
  a thread, allowing parallel isolated sessions.
- Fix Discord media method signatures to accept metadata kwarg
  (send_voice, send_image_file, send_image) — prevents TypeError
  when the base adapter passes platform metadata.
- Fix test mock isolation: add app_commands and ForumChannel to
  discord mocks so tests pass in full-suite runs.

Based on PRs #866 and #1109 by insecurejezza, modified per review:
removed /channel command (unsafe), added auto_thread feature,
made /thread dispatch new sessions.

Co-authored-by: insecurejezza <insecurejezza@users.noreply.github.com>
2026-03-13 08:52:54 -07:00
Teknium
d425901bae fix: report cronjob tool as available in hermes doctor
Set HERMES_INTERACTIVE=1 via setdefault in run_doctor() so CLI-gated
tool checks (like cronjob) see the same context as the interactive CLI.

Cherry-picked from PR #895 by @stablegenius49.

Fixes #878

Co-authored-by: stablegenius49 <stablegenius49@users.noreply.github.com>
2026-03-13 08:51:45 -07:00
Teknium
02a819b16e feat(delegate): add observability metadata to subagent results (#1175)
* fix: Home Assistant event filtering now closed by default

Previously, when no watch_domains or watch_entities were configured,
ALL state_changed events passed through to the agent, causing users
to be flooded with notifications for every HA entity change.

Now events are dropped by default unless the user explicitly configures:
- watch_domains: list of domains to monitor (e.g. climate, light)
- watch_entities: list of specific entity IDs to monitor
- watch_all: true (new option — opt-in to receive all events)

A warning is logged at connect time if no filters are configured,
guiding users to set up their HA platform config.

All 49 gateway HA tests + 52 HA tool tests pass.

* docs: update Home Assistant integration documentation

- homeassistant.md: Fix event filtering docs to reflect closed-by-default
  behavior. Add watch_all option. Replace Python dict config example with
  YAML. Fix defaults table (was incorrectly showing 'all'). Add required
  configuration warning admonition.
- environment-variables.md: Add HASS_TOKEN and HASS_URL to Messaging section.
- messaging/index.md: Add Home Assistant to description, architecture
  diagram, platform toolsets table, and Next Steps links.

* fix(terminal): strip provider env vars from background and PTY subprocesses

Extends the env var blocklist from #1157 to also cover the two remaining
leaky paths in process_registry.py:

- spawn_local() PTY path (line 156)
- spawn_local() background Popen path (line 197)

Both were still using raw os.environ, leaking provider vars to background
processes and interactive PTY sessions. Now uses the same dynamic
_HERMES_PROVIDER_ENV_BLOCKLIST from local.py.

Explicit env_vars passed to spawn_local() still override the blocklist,
matching the existing behavior for callers that intentionally need these.

Gap identified by PR #1004 (@PeterFile).

* feat(delegate): add observability metadata to subagent results

Enrich delegate_task results with metadata from the child AIAgent:

- model: which model the child used
- exit_reason: completed | interrupted | max_iterations
- tokens.input / tokens.output: token counts
- tool_trace: per-tool-call trace with byte sizes and ok/error status

Tool trace uses tool_call_id matching to correctly pair parallel tool
calls with their results, with a fallback for messages without IDs.

Cherry-picked from PR #872 by @omerkaz, with fixes:
- Fixed parallel tool call trace pairing (was always updating last entry)
- Removed redundant 'iterations' field (identical to existing 'api_calls')
- Added test for parallel tool call trace correctness

Co-authored-by: omerkaz <omerkaz@users.noreply.github.com>

---------

Co-authored-by: omerkaz <omerkaz@users.noreply.github.com>
2026-03-13 08:07:12 -07:00
Muhammet Eren Karakuş
c92507e53d fix(terminal): strip Hermes provider env vars from subprocess environment (#1157)
Terminal subprocesses inherit OPENAI_BASE_URL and other provider env
vars loaded from ~/.hermes/.env, silently misrouting external CLIs
like codex.  Build a blocklist dynamically from the provider registry
so new providers are automatically covered.  Callers that truly need
a blocked var can opt in via the _HERMES_FORCE_ prefix.

Closes #1002

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 07:52:03 -07:00
Teknium
61531396a0 fix: Home Assistant event filtering now closed by default (#1169)
Previously, when no watch_domains or watch_entities were configured,
ALL state_changed events passed through to the agent, causing users
to be flooded with notifications for every HA entity change.

Now events are dropped by default unless the user explicitly configures:
- watch_domains: list of domains to monitor (e.g. climate, light)
- watch_entities: list of specific entity IDs to monitor
- watch_all: true (new option — opt-in to receive all events)

A warning is logged at connect time if no filters are configured,
guiding users to set up their HA platform config.

All 49 gateway HA tests + 52 HA tool tests pass.
2026-03-13 07:40:38 -07:00
teknium1
06a5cc484c fix: improve gateway secret capture guidance message
The old message referenced 'hermes setup' which doesn't handle
skill-specific env vars. Updated to direct users to load the skill
in the local CLI (which triggers the secure prompt) or add the key
to ~/.hermes/.env manually.
2026-03-13 04:10:22 -07:00
Teknium
0157253145 Merge pull request #1152 from NousResearch/hermes/hermes-f47f71c0
feat: concurrent tool execution with ThreadPoolExecutor
2026-03-13 03:20:38 -07:00
kshitijk4poor
ccfbf42844 feat: secure skill env setup on load (core #688)
When a skill declares required_environment_variables in its YAML
frontmatter, missing env vars trigger a secure TUI prompt (identical
to the sudo password widget) when the skill is loaded. Secrets flow
directly to ~/.hermes/.env, never entering LLM context.

Key changes:
- New required_environment_variables frontmatter field for skills
- Secure TUI widget (masked input, 120s timeout)
- Gateway safety: messaging platforms show local setup guidance
- Legacy prerequisites.env_vars normalized into new format
- Remote backend handling: conservative setup_needed=True
- Env var name validation, file permissions hardened to 0o600
- Redact patterns extended for secret-related JSON fields
- 12 existing skills updated with prerequisites declarations
- ~48 new tests covering skip, timeout, gateway, remote backends
- Dynamic panel widget sizing (fixes hardcoded width from original PR)

Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main
with conflict resolution.

Fixes #688

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-13 03:14:04 -07:00
Teknium
c097e56142 Merge pull request #1149 from NousResearch/hermes/hermes-d28bf447
feat: Agentic On-Policy Distillation (OPD) environment
2026-03-13 03:09:43 -07:00
teknium1
ef3f3f9c08 fix: normalize dot-versioned model names for Anthropic API
anthropic/claude-opus-4.6 (OpenRouter format) was being sent as
claude-opus-4.6 to the Anthropic API, which expects claude-opus-4-6
(hyphens, not dots).

normalize_model_name() now converts dots to hyphens after stripping
the provider prefix, matching Anthropic's naming convention.

Fixes 404: 'model: claude-opus-4.6 was not found'
2026-03-13 03:08:14 -07:00
teknium1
5d0d5b191c feat: concurrent tool execution with ThreadPoolExecutor
When the model returns multiple tool calls in a single response, they are
now executed concurrently using a thread pool instead of sequentially.
This significantly reduces wall-clock time when multiple independent tools
are batched (e.g. parallel web_search, read_file, terminal calls).

Architecture:
- _execute_tool_calls() dispatches to sequential or concurrent path
- Single tool calls and batches containing 'clarify' use sequential path
- Multiple non-interactive tools use ThreadPoolExecutor (max 8 workers)
- Results are collected and appended to messages in original order
- _invoke_tool() extracted as shared tool invocation helper

Safety:
- Pre-flight interrupt check skips all tools if interrupted
- Per-tool exception handling: one failure doesn't crash the batch
- Result truncation (100k char limit) applied per tool
- Budget pressure injection after all tools complete
- Checkpoints taken before file-mutating tools
- CLI spinner shows batch progress, then per-tool completion messages

Tests: 10 new tests covering dispatch logic, ordering, error handling,
interrupt behavior, truncation, and _invoke_tool routing.
2026-03-13 02:51:51 -07:00
kshitijk4poor
bb3f5ed32a fix: separate Anthropic OAuth tokens from API keys
Persist OAuth/setup tokens in ANTHROPIC_TOKEN instead of ANTHROPIC_API_KEY.
Reserve ANTHROPIC_API_KEY for regular Console API keys.

Changes:
- anthropic_adapter: reorder resolve_anthropic_token() priority —
  ANTHROPIC_TOKEN first, ANTHROPIC_API_KEY as legacy fallback
- config: add save_anthropic_oauth_token() / save_anthropic_api_key() helpers
  that clear the opposing slot to prevent priority conflicts
- config: show_config() prefers ANTHROPIC_TOKEN for display
- setup: OAuth login and pasted setup-tokens write to ANTHROPIC_TOKEN
- setup: API key entry writes to ANTHROPIC_API_KEY and clears ANTHROPIC_TOKEN
- main: same fixes in _run_anthropic_oauth_flow() and _model_flow_anthropic()
- main: _has_any_provider_configured() checks ANTHROPIC_TOKEN
- doctor: use _is_oauth_token() for correct auth method validation
- runtime_provider: updated error message
- run_agent: simplified client init to use resolve_anthropic_token()
- run_agent: updated 401 troubleshooting messages
- status: prefer ANTHROPIC_TOKEN in status display
- tests: updated priority test, added persistence helper tests

Cherry-picked from PR #1141 by kshitijk4poor, rebased onto current main
with unrelated changes (web_policy config, blocklist CLI) removed.

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-03-13 02:09:52 -07:00
Teknium
d24bcad90b fix: Anthropic OAuth — beta header, token refresh, config contamination, reauthentication (#1132)
Fixes Anthropic OAuth/subscription authentication end-to-end:

Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
  clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
  it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
  _COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.

Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
  ~/.claude/.credentials.json are expired but have a refresh token,
  automatically POST to console.anthropic.com/v1/oauth/token to get
  a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
  to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
  returning None.

Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
  Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
  stale base_url was contaminating other providers when users switched
  without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.

Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
  subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py

Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
  refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
2026-03-12 20:45:50 -07:00
Teknium
8de14c5624 fix(doctor): treat configured honcho as available (#962)
fix(doctor): treat configured honcho as available
2026-03-12 19:34:37 -07:00
PeterFile
2a1f92ef4a fix(doctor): treat configured honcho as available
Doctor-only override so honcho shows as available when configured,
even outside a live agent session. Runtime tool gate unchanged.

Cherry-picked from PR #962 by PeterFile, rebased onto current main
(post-#736 merge) with conflict resolution.

Fixes #961

Co-authored-by: PeterFile <PeterFile@users.noreply.github.com>
2026-03-12 19:34:19 -07:00
Ahmad Ragab
3dc148ab6f fix: use adaptive thinking without budget_tokens for Claude 4.6 models
For Claude 4.6 models (Opus and Sonnet), the Anthropic API rejects
budget_tokens when thinking.type is 'adaptive'. This was causing a
400 error: 'thinking.adaptive.budget_tokens: Extra inputs are not
permitted'.

Changes:
- Send thinking: {type: 'adaptive'} without budget_tokens for 4.6
- Move effort control to output_config: {effort: ...} per Anthropic docs
- Map Hermes effort levels to Anthropic effort levels (xhigh->max, etc.)
- Narrow adaptive detection to 4.6 models only (4.5 still uses manual)
- Add tests for adaptive thinking on 4.6 and manual thinking on pre-4.6

Fixes #1126
2026-03-13 03:21:13 +01:00
Teknium
a282322845 Merge pull request #1121 from 0xbyt4/fix/anthropic-adapter-issues
fix: anthropic adapter — max_tokens, fallback crash, proxy base_url
2026-03-12 19:07:06 -07:00
Teknium
475dd58a8e Merge PR #736: feat(honcho): async writes, memory modes, session title integration, setup CLI
Authored by erosika. Builds on #38 and #243.

Adds async write support, configurable memory modes, context prefetch pipeline,
4 new Honcho tools (honcho_context, honcho_profile, honcho_search, honcho_conclude),
full 'hermes honcho' CLI, session strategies, AI peer identity, recallMode A/B,
gateway lifecycle management, and comprehensive docs.

Cherry-picks fixes from PRs #831/#832 (adavyas).

Co-authored-by: erosika <erosika@users.noreply.github.com>
Co-authored-by: adavyas <adavyas@users.noreply.github.com>
2026-03-12 19:05:11 -07:00
Teknium
28ffa8e693 fix: slack file upload fallback loses thread context (#1122)
fix: slack file upload fallback loses thread context
2026-03-12 18:56:27 -07:00
0xbyt4
93c3a1a9c9 fix(setup): remove dead code causing is_coding_plan NameError crash
Remove 50 lines of unreachable duplicate model selection logic in
setup_model_provider() for zai/kimi-coding/minimax/minimax-cn providers.
The code referenced undefined `is_coding_plan` variable, crashing setup.
_setup_provider_model_selection() already handles these providers correctly
via _DEFAULT_PROVIDER_MODELS dict.
2026-03-13 04:42:26 +03:00
0xbyt4
064c66df8c fix: slack file upload fallback loses thread context
Fallback paths in send_image_file, send_video, and send_document called
super() without metadata, causing replies to appear outside the thread
when file upload fails. Use self.send() with metadata instead to preserve
thread_ts context.
2026-03-13 04:26:27 +03:00
0xbyt4
22479b053c fix: anthropic adapter — max_tokens ignored, fallback crash, proxy base_url filtered
- Pass self.max_tokens to build_anthropic_kwargs instead of hardcoded None
- Add anthropic case to _try_activate_fallback (was only handling openai-codex)
- Remove 'anthropic in base_url' filter that blocked custom proxy URLs
2026-03-13 04:22:16 +03:00
Teknium
3bc933586a fix: Slack MAX_MESSAGE_LENGTH + typing indicator via assistant.threads.setStatus (#1117)
fix: Slack MAX_MESSAGE_LENGTH 3900 → 39000
2026-03-12 17:53:49 -07:00
teknium1
e976879cf2 merge: resolve conflicts with main (URL update to hermes-agent.nousresearch.com) 2026-03-12 17:49:26 -07:00
teknium1
319e6615c3 fix: Slack MAX_MESSAGE_LENGTH + typing indicator via assistant.threads.setStatus
- Increase MAX_MESSAGE_LENGTH from 3,900 to 39,000 (Slack API allows 40k)
- Implement real typing indicator using assistant.threads.setStatus API
  - Shows 'BotName is thinking...' next to the bot name in threads
  - Auto-clears when the bot sends a reply
  - Requires assistant:write or chat:write scope
  - Falls back silently if scope unavailable (reactions still work)
- 4 new tests for typing indicator
2026-03-12 17:46:53 -07:00
teknium1
4068f20ce9 fix(anthropic): deep scan fixes — auth, retries, edge cases
Fixes from comprehensive code review and cross-referencing with
clawdbot/OpenCode implementations:

CRITICAL:
- Add one-shot guard (anthropic_auth_retry_attempted) to prevent
  infinite 401 retry loops when credentials keep changing
- Fix _is_oauth_token(): managed keys from ~/.claude.json are NOT
  regular API keys (don't start with sk-ant-api). Inverted the logic:
  only sk-ant-api* is treated as API key auth, everything else uses
  Bearer auth + oauth beta headers

HIGH:
- Wrap json.loads(args) in try/except in message conversion — malformed
  tool_call arguments no longer crash the entire conversation
- Raise AuthError in runtime_provider when no Anthropic token found
  (was silently passing empty string, causing confusing API errors)
- Remove broken _try_anthropic() from auxiliary vision chain — the
  centralized router creates an OpenAI client for api_key providers
  which doesn't work with Anthropic's Messages API

MEDIUM:
- Handle empty assistant message content — Anthropic rejects empty
  content blocks, now inserts '(empty)' placeholder
- Fix setup.py existing_key logic — set to 'KEEP' sentinel instead
  of None to prevent falling through to the auth choice prompt
- Add debug logging to _fetch_anthropic_models on failure

Tests: 43 adapter tests (2 new for token detection), 3197 total passed
2026-03-12 17:14:22 -07:00
teknium1
978e1356c0 feat: Slack adapter improvements — formatting, reactions, user resolution, commands
1. Markdown → mrkdwn conversion (format_message override):
   - **bold** → *bold*, *italic* → _italic_
   - ## Headers → *Headers* (bold)
   - [link](url) → <url|link>
   - ~~strike~~ → ~strike~
   - Code blocks and inline code preserved unchanged
   - Placeholder-based approach (same pattern as Telegram)

2. Message length splitting:
   - send() now calls format_message() + truncate_message()
   - Long responses split at natural boundaries (newlines, spaces)
   - Code blocks properly closed/reopened across chunks
   - Chunk indicators (1/N) appended for multi-part messages

3. Reaction-based acknowledgment:
   - 👀 (eyes) reaction added on message receipt
   - Replaced with  (white_check_mark) when response is complete
   - Graceful error handling (missing scopes, already-reacted)
   - Serves as visual feedback since Slack has no bot typing API

4. User identity resolution:
   - Resolves Slack user IDs to display names via users.info API
   - LRU-style in-memory cache (one API call per user)
   - Fallback chain: display_name → real_name → user_id
   - user_name now included in MessageEvent source

5. Expanded slash commands (/hermes <subcommand>):
   - Added: compact, compress, resume, background, usage,
     insights, title, reasoning, provider, rollback
   - Arguments preserved (e.g. /hermes resume my session)

6. reply_broadcast config option:
   - When gateway.slack.reply_broadcast is true, first response
     in a thread also appears in the main channel
   - Disabled by default — thread = session stays clean

30 new tests covering all features.
2026-03-12 16:22:39 -07:00
teknium1
7086fde37e fix(anthropic): revert inline vision, add hermes model flow, wire vision aux
Feedback fixes:

1. Revert _convert_vision_content — vision is handled by the vision_analyze
   tool, not by converting image blocks inline in conversation messages.
   Removed the function and its tests.

2. Add Anthropic to 'hermes model' (cmd_model in main.py):
   - Added to provider_labels dict
   - Added to providers selection list
   - Added _model_flow_anthropic() with Claude Code credential auto-detection,
     API key prompting, and model selection from catalog.

3. Wire up Anthropic as a vision-capable auxiliary provider:
   - Added _try_anthropic() to auxiliary_client.py using claude-sonnet-4
     as the vision model (Claude natively supports multimodal)
   - Added to the get_vision_auxiliary_client() auto-detection chain
     (after OpenRouter/Nous, before Codex/custom)

Cache tracking note: the Anthropic cache metrics branch in run_agent.py
(cache_read_input_tokens / cache_creation_input_tokens) is in the correct
place — it's response-level parsing, same location as the existing
OpenRouter cache tracking. auxiliary_client.py has no cache tracking.
2026-03-12 16:09:04 -07:00
Teknium
4a8cd6f856 fix: stop rejecting unlisted models, accept with warning instead
* fix: use session_key instead of chat_id for adapter interrupt lookups

monitor_for_interrupt() in _run_agent was using source.chat_id to query
the adapter's has_pending_interrupt() and get_pending_message() methods.
But the adapter stores interrupt events under build_session_key(source),
which produces a different string (e.g. 'agent:main:telegram:dm' vs '123456').

This key mismatch meant the interrupt was never detected through the
adapter path, which is the only active interrupt path for all adapter-based
platforms (Telegram, Discord, Slack, etc.). The gateway-level interrupt
path (in dispatch_message) is unreachable because the adapter intercepts
the 2nd message in handle_message() before it reaches dispatch_message().

Result: sending a new message while subagents were running had no effect —
the interrupt was silently lost.

Fix: replace all source.chat_id references in the interrupt-related code
within _run_agent() with the session_key parameter, which matches the
adapter's storage keys.

Also adds regression tests verifying session_key vs chat_id consistency.

* debug: add file-based logging to CLI interrupt path

Temporary instrumentation to diagnose why message-based interrupts
don't seem to work during subagent execution. Logs to
~/.hermes/interrupt_debug.log (immune to redirect_stdout).

Two log points:
1. When Enter handler puts message into _interrupt_queue
2. When chat() reads it and calls agent.interrupt()

This will reveal whether the message reaches the queue and
whether the interrupt is actually fired.

* fix: accept unlisted models with warning instead of rejecting

validate_requested_model() previously hard-rejected any model not found
in the provider's API listing. This was too aggressive — users on higher
plan tiers (e.g. Z.AI Pro/Max) may have access to models not shown in
the public listing (like glm-5 on coding endpoints).

Changes:
- validate_requested_model: accept unlisted models with a warning note
  instead of blocking. The model is saved to config and used immediately.
- Z.AI setup: always offer glm-5 in the model list regardless of whether
  a coding endpoint was detected. Pro/Max plans support it.
- Z.AI setup detection message: softened from 'GLM-5 is not available'
  to 'GLM-5 may still be available depending on your plan tier'
2026-03-12 16:02:35 -07:00