Commit Graph

1476 Commits

Author SHA1 Message Date
teknium1
638136e353 fix(anthropic): skip thinking params for Haiku models
Haiku models don't support extended thinking at all. Without this
guard, claude-haiku-4-5-20251001 would receive type=enabled +
budget_tokens and return a 400 error.

Incorporates the fix from PR #1127 (by frizynn) on top of #1128's
adaptive thinking refactor.

Verified live with Claude Code OAuth:
  claude-opus-4-6       → adaptive thinking ✓
  claude-haiku-4-5      → no thinking params ✓
  claude-sonnet-4       → enabled thinking ✓
2026-03-12 19:34:55 -07:00
Teknium
15911d70c0 Merge pull request #1128 from ASRagab/fix/adaptive-thinking-budget-tokens
fix: use adaptive thinking without budget_tokens for Claude 4.6 models
2026-03-12 19:32:46 -07:00
Ahmad Ragab
3dc148ab6f fix: use adaptive thinking without budget_tokens for Claude 4.6 models
For Claude 4.6 models (Opus and Sonnet), the Anthropic API rejects
budget_tokens when thinking.type is 'adaptive'. This was causing a
400 error: 'thinking.adaptive.budget_tokens: Extra inputs are not
permitted'.

Changes:
- Send thinking: {type: 'adaptive'} without budget_tokens for 4.6
- Move effort control to output_config: {effort: ...} per Anthropic docs
- Map Hermes effort levels to Anthropic effort levels (xhigh->max, etc.)
- Narrow adaptive detection to 4.6 models only (4.5 still uses manual)
- Add tests for adaptive thinking on 4.6 and manual thinking on pre-4.6

Fixes #1126
2026-03-13 03:21:13 +01:00
Teknium
9dfa81ab4b Merge pull request #1125 from NousResearch/hermes/hermes-c877bdeb
fix(anthropic): add diagnostic output on 401 auth failures
2026-03-12 19:15:21 -07:00
teknium1
e5b8e06037 fix(anthropic): add diagnostic output on 401 auth failures
When Anthropic returns 401 and credential refresh doesn't help,
now prints actionable troubleshooting info:
- Which auth method was used (Bearer vs x-api-key)
- Token prefix for debugging
- Common fixes (stale ANTHROPIC_API_KEY, verify key, refresh login)
- How to clear stale keys
2026-03-12 19:09:06 -07:00
Teknium
a282322845 Merge pull request #1121 from 0xbyt4/fix/anthropic-adapter-issues
fix: anthropic adapter — max_tokens, fallback crash, proxy base_url
2026-03-12 19:07:06 -07:00
Teknium
475dd58a8e Merge PR #736: feat(honcho): async writes, memory modes, session title integration, setup CLI
Authored by erosika. Builds on #38 and #243.

Adds async write support, configurable memory modes, context prefetch pipeline,
4 new Honcho tools (honcho_context, honcho_profile, honcho_search, honcho_conclude),
full 'hermes honcho' CLI, session strategies, AI peer identity, recallMode A/B,
gateway lifecycle management, and comprehensive docs.

Cherry-picks fixes from PRs #831/#832 (adavyas).

Co-authored-by: erosika <erosika@users.noreply.github.com>
Co-authored-by: adavyas <adavyas@users.noreply.github.com>
2026-03-12 19:05:11 -07:00
Teknium
28ffa8e693 fix: slack file upload fallback loses thread context (#1122)
fix: slack file upload fallback loses thread context
2026-03-12 18:56:27 -07:00
Teknium
e53dfd88bb Merge pull request #1123 from 0xbyt4/fix/setup-is-coding-plan-nameError
Clean fix — removes dead code that crashed with NameError on is_coding_plan. The generic _setup_provider_model_selection() already handles all affected providers.
2026-03-12 18:55:59 -07:00
0xbyt4
93c3a1a9c9 fix(setup): remove dead code causing is_coding_plan NameError crash
Remove 50 lines of unreachable duplicate model selection logic in
setup_model_provider() for zai/kimi-coding/minimax/minimax-cn providers.
The code referenced undefined `is_coding_plan` variable, crashing setup.
_setup_provider_model_selection() already handles these providers correctly
via _DEFAULT_PROVIDER_MODELS dict.
2026-03-13 04:42:26 +03:00
0xbyt4
064c66df8c fix: slack file upload fallback loses thread context
Fallback paths in send_image_file, send_video, and send_document called
super() without metadata, causing replies to appear outside the thread
when file upload fails. Use self.send() with metadata instead to preserve
thread_ts context.
2026-03-13 04:26:27 +03:00
0xbyt4
22479b053c fix: anthropic adapter — max_tokens ignored, fallback crash, proxy base_url filtered
- Pass self.max_tokens to build_anthropic_kwargs instead of hardcoded None
- Add anthropic case to _try_activate_fallback (was only handling openai-codex)
- Remove 'anthropic in base_url' filter that blocked custom proxy URLs
2026-03-13 04:22:16 +03:00
Teknium
a1c4431479 Merge pull request #1062 from NousResearch/feat/optional-rl-training
feat: make tinker-atropos RL training fully optional
2026-03-12 18:02:44 -07:00
Teknium
3bc933586a fix: Slack MAX_MESSAGE_LENGTH + typing indicator via assistant.threads.setStatus (#1117)
fix: Slack MAX_MESSAGE_LENGTH 3900 → 39000
2026-03-12 17:53:49 -07:00
Teknium
0219abfeed Merge pull request #1097 from NousResearch/hermes/hermes-c877bdeb
feat: native Anthropic provider with Claude Code credential auto-discovery
2026-03-12 17:49:39 -07:00
teknium1
e976879cf2 merge: resolve conflicts with main (URL update to hermes-agent.nousresearch.com) 2026-03-12 17:49:26 -07:00
teknium1
319e6615c3 fix: Slack MAX_MESSAGE_LENGTH + typing indicator via assistant.threads.setStatus
- Increase MAX_MESSAGE_LENGTH from 3,900 to 39,000 (Slack API allows 40k)
- Implement real typing indicator using assistant.threads.setStatus API
  - Shows 'BotName is thinking...' next to the bot name in threads
  - Auto-clears when the bot sends a reply
  - Requires assistant:write or chat:write scope
  - Falls back silently if scope unavailable (reactions still work)
- 4 new tests for typing indicator
2026-03-12 17:46:53 -07:00
teknium1
7f7282c78d fix(anthropic): guard memory flush tool_calls extraction for Anthropic response format
The memory flush path extracted tool_calls from the response assuming
OpenAI format (response.choices[0].message.tool_calls). When using
the Anthropic client directly (aux unavailable), the response is an
Anthropic Message object which has no .choices attribute. Now uses
normalize_anthropic_response() to extract tool_calls correctly.
2026-03-12 17:35:01 -07:00
teknium1
809abd60bf docs: add Anthropic provider to all documentation pages
- quickstart.md: Add Anthropic to the provider comparison table
- configuration.md: Add Anthropic to provider list table, add full
  'Anthropic (Native)' section with three auth methods (API key,
  setup-token, Claude Code auto-detect), config.yaml example,
  and provider alias tip
- environment-variables.md: Add ANTHROPIC_API_KEY, ANTHROPIC_TOKEN,
  CLAUDE_CODE_OAUTH_TOKEN to LLM Providers table; add 'anthropic'
  to HERMES_INFERENCE_PROVIDER values list
2026-03-12 17:28:36 -07:00
teknium1
aaaba78126 fix(anthropic): final polish — tool ID sanitization, crash guards, temp=1
Remaining issues from deep scan:

Adapter (agent/anthropic_adapter.py):
- Add _sanitize_tool_id() — Anthropic requires IDs matching [a-zA-Z0-9_-],
  now strips invalid chars and ensures non-empty (both tool_use and tool_result)
- Empty tool result content → '(no output)' placeholder (Anthropic rejects empty)
- Set temperature=1 when thinking type='enabled' on older models (required)
- normalize_model_name now case-insensitive for 'Anthropic/' prefix
- Fix stale docstrings referencing only ~/.claude/.credentials.json

Agent loop (run_agent.py):
- Guard memory flush path (line ~2684) — was calling self.client.chat.completions
  which is None in anthropic_messages mode. Now routes through Anthropic client.
- Guard summary generation path (line ~3171) — same crash when reaching
  iteration limit. Now builds proper Anthropic kwargs and normalizes response.
- Guard retry summary path (line ~3200) — same fix for the summary retry loop.

All three self.client.chat.completions.create() calls outside the main
loop now have anthropic_messages branches to prevent NoneType crashes.
2026-03-12 17:23:09 -07:00
teknium1
4068f20ce9 fix(anthropic): deep scan fixes — auth, retries, edge cases
Fixes from comprehensive code review and cross-referencing with
clawdbot/OpenCode implementations:

CRITICAL:
- Add one-shot guard (anthropic_auth_retry_attempted) to prevent
  infinite 401 retry loops when credentials keep changing
- Fix _is_oauth_token(): managed keys from ~/.claude.json are NOT
  regular API keys (don't start with sk-ant-api). Inverted the logic:
  only sk-ant-api* is treated as API key auth, everything else uses
  Bearer auth + oauth beta headers

HIGH:
- Wrap json.loads(args) in try/except in message conversion — malformed
  tool_call arguments no longer crash the entire conversation
- Raise AuthError in runtime_provider when no Anthropic token found
  (was silently passing empty string, causing confusing API errors)
- Remove broken _try_anthropic() from auxiliary vision chain — the
  centralized router creates an OpenAI client for api_key providers
  which doesn't work with Anthropic's Messages API

MEDIUM:
- Handle empty assistant message content — Anthropic rejects empty
  content blocks, now inserts '(empty)' placeholder
- Fix setup.py existing_key logic — set to 'KEEP' sentinel instead
  of None to prevent falling through to the auth choice prompt
- Add debug logging to _fetch_anthropic_models on failure

Tests: 43 adapter tests (2 new for token detection), 3197 total passed
2026-03-12 17:14:22 -07:00
teknium1
cd4e995d54 fix(anthropic): live model fetching + adaptive thinking for 4.5+ models
- Add _fetch_anthropic_models() to hermes_cli/models.py — hits the
  Anthropic /v1/models endpoint to get the live model catalog. Handles
  both API key and OAuth token auth headers.

- Wire it into provider_model_ids() so both 'hermes model' and
  'hermes setup model' show the live list instead of a stale static one.

- Update static _PROVIDER_MODELS fallback with full current catalog:
  opus-4-6, sonnet-4-6, opus-4-5, sonnet-4-5, opus-4, sonnet-4, haiku-4-5

- Update model_metadata.py with context lengths for all current models.

- Fix thinking parameter for 4.5+ models: use type='adaptive' instead
  of type='enabled' (Anthropic deprecated 'enabled' for newer models,
  warns at runtime). Detects model version from the model name string.

Verified live:
  hermes model → Anthropic → auto-detected creds → shows 7 live models
  hermes chat --provider anthropic --model claude-opus-4-6 → works
2026-03-12 17:04:31 -07:00
teknium1
d51243b6d3 fix(anthropic): read credentials from ~/.claude.json (native binary v2.x)
The critical bug: read_claude_code_credentials() only looked at
~/.claude/.credentials.json, but Claude Code's native binary (v2.x,
Bun-compiled) stores credentials in ~/.claude.json at the top level
as 'primaryApiKey'. The .credentials.json file is only written by
older npm-based installs.

Now checks both locations in priority order:
  1. ~/.claude.json → primaryApiKey (native binary, v2.x)
  2. ~/.claude/.credentials.json → claudeAiOauth.accessToken (legacy)

Verified live: hermes model → Anthropic → auto-detected credentials →
claude-sonnet-4-20250514 → 'Hello there, how are you?' (5 words)
2026-03-12 16:43:31 -07:00
Teknium
df07baedfe feat: Slack adapter improvements — formatting, reactions, user resolution, commands (#1106)
feat: Slack adapter improvements — formatting, reactions, user resolution, commands
2026-03-12 16:35:44 -07:00
teknium1
38aa47ad6c fix(anthropic): improve auth UX with clear setup-token vs API key choice
Both 'hermes model' and 'hermes setup model' now present a clear
two-option auth flow when no credentials are found:

  1. Claude Pro/Max subscription (setup-token)
     - Step-by-step instructions to run 'claude setup-token'
     - User pastes the resulting sk-ant-oat01-... token

  2. Anthropic API key (pay-per-token)
     - Link to console.anthropic.com/settings/keys
     - User pastes sk-ant-api03-... key

Also handles:
  - Auto-detection of existing Claude Code creds (~/.claude/.credentials.json)
  - Existing credentials shown with option to update
  - Consistent UX between 'hermes model' and 'hermes setup model'
2026-03-12 16:28:00 -07:00
teknium1
978e1356c0 feat: Slack adapter improvements — formatting, reactions, user resolution, commands
1. Markdown → mrkdwn conversion (format_message override):
   - **bold** → *bold*, *italic* → _italic_
   - ## Headers → *Headers* (bold)
   - [link](url) → <url|link>
   - ~~strike~~ → ~strike~
   - Code blocks and inline code preserved unchanged
   - Placeholder-based approach (same pattern as Telegram)

2. Message length splitting:
   - send() now calls format_message() + truncate_message()
   - Long responses split at natural boundaries (newlines, spaces)
   - Code blocks properly closed/reopened across chunks
   - Chunk indicators (1/N) appended for multi-part messages

3. Reaction-based acknowledgment:
   - 👀 (eyes) reaction added on message receipt
   - Replaced with  (white_check_mark) when response is complete
   - Graceful error handling (missing scopes, already-reacted)
   - Serves as visual feedback since Slack has no bot typing API

4. User identity resolution:
   - Resolves Slack user IDs to display names via users.info API
   - LRU-style in-memory cache (one API call per user)
   - Fallback chain: display_name → real_name → user_id
   - user_name now included in MessageEvent source

5. Expanded slash commands (/hermes <subcommand>):
   - Added: compact, compress, resume, background, usage,
     insights, title, reasoning, provider, rollback
   - Arguments preserved (e.g. /hermes resume my session)

6. reply_broadcast config option:
   - When gateway.slack.reply_broadcast is true, first response
     in a thread also appears in the main channel
   - Disabled by default — thread = session stays clean

30 new tests covering all features.
2026-03-12 16:22:39 -07:00
Teknium
39f3c0aeb0 fix: use hermes-agent.nousresearch.com as OpenRouter HTTP-Referer
* fix: stop rejecting unlisted models + auto-detect from /models endpoint

validate_requested_model() now accepts models not in the provider's API
listing with a warning instead of blocking. Removes hardcoded catalog
fallback for validation — if API is unreachable, accepts with a warning.

Model selection flows (setup + /model command) now probe the provider's
/models endpoint to get the real available models. Falls back to
hardcoded defaults with a clear warning when auto-detection fails:
'Could not auto-detect models — use Custom model if yours isn't listed.'

Z.AI setup no longer excludes GLM-5 on coding plans.

* fix: use hermes-agent.nousresearch.com as HTTP-Referer for OpenRouter

OpenRouter scrapes the favicon/logo from the HTTP-Referer URL for app
rankings. We were sending the GitHub repo URL, which gives us a generic
GitHub logo. Changed to the proper website URL so our actual branding
shows up in rankings.

Changed in run_agent.py (main agent client) and auxiliary_client.py
(vision/summarization clients).
2026-03-12 16:20:22 -07:00
teknium1
7086fde37e fix(anthropic): revert inline vision, add hermes model flow, wire vision aux
Feedback fixes:

1. Revert _convert_vision_content — vision is handled by the vision_analyze
   tool, not by converting image blocks inline in conversation messages.
   Removed the function and its tests.

2. Add Anthropic to 'hermes model' (cmd_model in main.py):
   - Added to provider_labels dict
   - Added to providers selection list
   - Added _model_flow_anthropic() with Claude Code credential auto-detection,
     API key prompting, and model selection from catalog.

3. Wire up Anthropic as a vision-capable auxiliary provider:
   - Added _try_anthropic() to auxiliary_client.py using claude-sonnet-4
     as the vision model (Claude natively supports multimodal)
   - Added to the get_vision_auxiliary_client() auto-detection chain
     (after OpenRouter/Nous, before Codex/custom)

Cache tracking note: the Anthropic cache metrics branch in run_agent.py
(cache_read_input_tokens / cache_creation_input_tokens) is in the correct
place — it's response-level parsing, same location as the existing
OpenRouter cache tracking. auxiliary_client.py has no cache tracking.
2026-03-12 16:09:04 -07:00
Teknium
4cb553c765 fix: Slack thread handling — progress messages, responses, and session isolation (#1103)
fix: Slack thread handling — progress messages, responses, and session isolation
2026-03-12 16:07:05 -07:00
teknium1
987410fff3 fix: Slack thread handling — progress messages, responses, and session isolation
Three bugs fixed in the Slack adapter:

1. Tool progress messages leaked to main channel instead of thread.
   Root cause: metadata key mismatch — gateway uses 'thread_id' but
   Slack adapter checked for 'thread_ts'. Added _resolve_thread_ts()
   helper that checks both keys with correct precedence.

2. Bot responses could escape threads for replies.
   Root cause: reply_to was set to the child message's ts, but Slack
   API needs the parent message's ts for thread_ts. Now metadata
   thread_id (always the parent ts) takes priority over reply_to.

3. All Slack DMs shared one session key ('agent:main:slack:dm'),
   so a long-running task blocked all other DM conversations.
   Fix: DMs with thread_id now get per-thread session keys. Top-level
   DMs still share one session for conversation continuity.

Additional fix: All Slack media methods (send_image, send_voice,
send_video, send_document, send_image_file) now accept metadata
parameter for thread routing. Previously they only accepted reply_to,
which caused media to silently fail to post in threads.

Session key behavior after this change:
- Slack channel @mention: creates thread, thread = session
- Slack thread reply: stays in thread, same session
- Slack DM (top-level): one continuous session
- Slack DM (threaded): per-thread session
- Other platforms: unchanged
2026-03-12 16:05:45 -07:00
Teknium
4a8cd6f856 fix: stop rejecting unlisted models, accept with warning instead
* fix: use session_key instead of chat_id for adapter interrupt lookups

monitor_for_interrupt() in _run_agent was using source.chat_id to query
the adapter's has_pending_interrupt() and get_pending_message() methods.
But the adapter stores interrupt events under build_session_key(source),
which produces a different string (e.g. 'agent:main:telegram:dm' vs '123456').

This key mismatch meant the interrupt was never detected through the
adapter path, which is the only active interrupt path for all adapter-based
platforms (Telegram, Discord, Slack, etc.). The gateway-level interrupt
path (in dispatch_message) is unreachable because the adapter intercepts
the 2nd message in handle_message() before it reaches dispatch_message().

Result: sending a new message while subagents were running had no effect —
the interrupt was silently lost.

Fix: replace all source.chat_id references in the interrupt-related code
within _run_agent() with the session_key parameter, which matches the
adapter's storage keys.

Also adds regression tests verifying session_key vs chat_id consistency.

* debug: add file-based logging to CLI interrupt path

Temporary instrumentation to diagnose why message-based interrupts
don't seem to work during subagent execution. Logs to
~/.hermes/interrupt_debug.log (immune to redirect_stdout).

Two log points:
1. When Enter handler puts message into _interrupt_queue
2. When chat() reads it and calls agent.interrupt()

This will reveal whether the message reaches the queue and
whether the interrupt is actually fired.

* fix: accept unlisted models with warning instead of rejecting

validate_requested_model() previously hard-rejected any model not found
in the provider's API listing. This was too aggressive — users on higher
plan tiers (e.g. Z.AI Pro/Max) may have access to models not shown in
the public listing (like glm-5 on coding endpoints).

Changes:
- validate_requested_model: accept unlisted models with a warning note
  instead of blocking. The model is saved to config and used immediately.
- Z.AI setup: always offer glm-5 in the model list regardless of whether
  a coding endpoint was detected. Pro/Max plans support it.
- Z.AI setup detection message: softened from 'GLM-5 is not available'
  to 'GLM-5 may still be available depending on your plan tier'
2026-03-12 16:02:35 -07:00
teknium1
d7adfe8f61 fix(anthropic): address gaps found in deep-dive audit
After studying clawdbot (OpenClaw) and OpenCode implementations:

## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
  as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)

## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
  (image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs

## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge

## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool

## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
  (different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging

## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
  via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers

## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
2026-03-12 16:00:46 -07:00
Teknium
def7b84a12 Merge pull request #1098 from NousResearch/hermes/hermes-465f3702
fix: eliminate execute_code progress spam on gateway platforms
2026-03-12 15:55:02 -07:00
teknium1
8121aef83c fix: eliminate execute_code progress spam on gateway platforms
Root cause: two issues combined to create visual spam on Telegram/Discord:

1. build_tool_preview() preserved newlines from tool arguments. A preview
   like 'import os\nprint("...")' rendered as 2+ visual lines per
   progress entry on messaging platforms. This affected execute_code most
   (code always has newlines), but could also hit terminal, memory,
   send_message, session_search, and process tools.

2. No deduplication of identical progress messages. When models iterate
   with execute_code using the same boilerplate code (common pattern),
   each call produced an identical progress line. 9 calls x 2 visual
   lines = 18 lines of identical spam in one message bubble.

Fixes:
- Added _oneline() helper to collapse all whitespace (newlines, tabs) to
  single spaces. Applied to ALL code paths in build_tool_preview() —
  both the generic path and every early-return path that touches user
  content (memory, session_search, send_message, process).
- Added dedup in gateway progress_callback: consecutive identical messages
  are collapsed with a repeat counter, e.g. 'execute_code: ... (x9)'
  instead of 9 identical lines. The send_progress_messages async loop
  handles dedup tuples by updating the last progress_line in-place.
2026-03-12 15:53:02 -07:00
Teknium
1bb8ed4495 chore: lower default compression threshold from 85% to 50% (#1096)
* fix: ClawHub skill install — use /download ZIP endpoint

The ClawHub API v1 version endpoint only returns file metadata
(path, size, sha256, contentType) without inline content or download
URLs. Our code was looking for inline content in the metadata, which
never existed, causing all ClawHub installs to fail with:
'no inline/raw file content was available'

Fix: Use the /api/v1/download endpoint (same as the official clawhub
CLI) to download skills as ZIP bundles and extract files in-memory.

Changes:
- Add _download_zip() method that downloads and extracts ZIP bundles
- Retry on 429 rate limiting with Retry-After header support
- Path sanitization and binary file filtering for security
- Keep _extract_files() as a fallback for inline/raw content
- Also fix nested file lookup (version_data.version.files)

* chore: lower default compression threshold from 85% to 50%

Triggers context compression earlier — at 50% of the model's context
window instead of 85%. Updated in all four places where the default
is defined: context_compressor.py, cli.py, run_agent.py, config.py,
and gateway/run.py.
2026-03-12 15:51:50 -07:00
teknium1
5e12442b4b feat: native Anthropic provider with Claude Code credential auto-discovery
Add Anthropic as a first-class inference provider, bypassing OpenRouter
for direct API access. Uses the native Anthropic SDK with a full format
adapter (same pattern as the codex_responses api_mode).

## Auth (three methods, priority order)
1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-*)
2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-*)
3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription)
   - Reads Claude Code's OAuth credentials
   - Checks token expiry with 60s buffer
   - Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header
   - Regular API keys use standard x-api-key header

## Changes by file

### New files
- agent/anthropic_adapter.py — Client builder, message/tool/response
  format conversion, Claude Code credential reader, token resolver.
  Handles system prompt extraction, tool_use/tool_result blocks,
  thinking/reasoning, orphaned tool_use cleanup, cache_control.
- tests/test_anthropic_adapter.py — 36 tests covering all adapter logic

### Modified files
- pyproject.toml — Add anthropic>=0.39.0 dependency
- hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with
  three env vars, plus 'claude'/'claude-code' aliases
- hermes_cli/models.py — Add model catalog, labels, aliases, provider order
- hermes_cli/main.py — Add 'anthropic' to --provider CLI choices
- hermes_cli/runtime_provider.py — Add Anthropic branch returning
  api_mode='anthropic_messages' (before generic api_key fallthrough)
- hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code
  credential auto-discovery, model selection, OpenRouter tools prompt
- agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model
- agent/model_metadata.py — Add bare Claude model context lengths
- run_agent.py — Add anthropic_messages api_mode:
  * Client init (Anthropic SDK instead of OpenAI)
  * API call dispatch (_anthropic_client.messages.create)
  * Response validation (content blocks)
  * finish_reason mapping (stop_reason -> finish_reason)
  * Token usage (input_tokens/output_tokens)
  * Response normalization (normalize_anthropic_response)
  * Client interrupt/rebuild
  * Prompt caching auto-enabled for native Anthropic
- tests/test_run_agent.py — Update test_anthropic_base_url_accepted to
  expect native routing, add test_prompt_caching_native_anthropic
2026-03-12 15:47:45 -07:00
Erosika
fefc709b2c merge: resolve conflict with main in subagent interrupt test 2026-03-12 16:28:57 -04:00
Erosika
45d3e83ad1 fix(honcho): normalize legacy recallMode values like 'auto' to 'hybrid' 2026-03-12 16:27:49 -04:00
Erosika
0aed9bfde1 refactor(honcho): rename memory tools to Honcho tools, clarify recall mode language
Replace "memory tools" with "Honcho tools" and "pre-warmed/prefetch"
with "auto-injected context" in all user-facing strings and docs.
2026-03-12 16:26:10 -04:00
Erosika
ae2a5e5743 refactor(honcho): remove local memory mode
The "local" memoryMode was redundant with enabled: false. Simplifies
the mode system to hybrid and honcho only.
2026-03-12 16:23:34 -04:00
Erosika
f896bb5d8c fix(test): patch correct method in subagent interrupt test
build_system_prompt was refactored to AIAgent._build_system_prompt
but the test still patched the non-existent module-level function.
2026-03-12 15:05:42 -04:00
Erosika
cd6e5e44e4 feat(honcho): show clickable session line on CLI startup
Display a one-line Honcho session indicator with an OSC 8 terminal
hyperlink after the banner. Also shown when /title remaps the session.
2026-03-12 12:30:42 -04:00
teknium1
47e49da77c feat: make tinker-atropos RL training fully optional
The tinker-atropos submodule and its heavy dependencies (atroposlib, tinker,
wandb, fastapi, uvicorn) were being installed for all users by default,
adding significant install time and disk usage for most users who don't
need RL training capabilities.

Changes:
- install.sh: Only init mini-swe-agent submodule by default; skip
  tinker-atropos clone and install entirely
- install.sh: Remove --recurse-submodules from git clone (only fetches
  what's needed)
- pyproject.toml: Add [rl] optional dependency group for explicit opt-in
- rl_training_tool.py: Move LOGS_DIR.mkdir() from module-level to lazy
  init (_ensure_logs_dir) to avoid side effects on import
- README.md: Update contributor quick start to not auto-fetch
  tinker-atropos; add RL opt-in instructions

Users who want RL training can opt in with:
  git submodule update --init tinker-atropos
  uv pip install -e ./tinker-atropos
2026-03-12 09:11:44 -07:00
Teknium
e004c094ea fix: use session_key instead of chat_id for adapter interrupt lookups
* fix: use session_key instead of chat_id for adapter interrupt lookups

monitor_for_interrupt() in _run_agent was using source.chat_id to query
the adapter's has_pending_interrupt() and get_pending_message() methods.
But the adapter stores interrupt events under build_session_key(source),
which produces a different string (e.g. 'agent:main:telegram:dm' vs '123456').

This key mismatch meant the interrupt was never detected through the
adapter path, which is the only active interrupt path for all adapter-based
platforms (Telegram, Discord, Slack, etc.). The gateway-level interrupt
path (in dispatch_message) is unreachable because the adapter intercepts
the 2nd message in handle_message() before it reaches dispatch_message().

Result: sending a new message while subagents were running had no effect —
the interrupt was silently lost.

Fix: replace all source.chat_id references in the interrupt-related code
within _run_agent() with the session_key parameter, which matches the
adapter's storage keys.

Also adds regression tests verifying session_key vs chat_id consistency.

* debug: add file-based logging to CLI interrupt path

Temporary instrumentation to diagnose why message-based interrupts
don't seem to work during subagent execution. Logs to
~/.hermes/interrupt_debug.log (immune to redirect_stdout).

Two log points:
1. When Enter handler puts message into _interrupt_queue
2. When chat() reads it and calls agent.interrupt()

This will reveal whether the message reaches the queue and
whether the interrupt is actually fired.
2026-03-12 08:35:45 -07:00
Teknium
5c54128475 fix: ClawHub skill install — use /download ZIP endpoint (#1060)
The ClawHub API v1 version endpoint only returns file metadata
(path, size, sha256, contentType) without inline content or download
URLs. Our code was looking for inline content in the metadata, which
never existed, causing all ClawHub installs to fail with:
'no inline/raw file content was available'

Fix: Use the /api/v1/download endpoint (same as the official clawhub
CLI) to download skills as ZIP bundles and extract files in-memory.

Changes:
- Add _download_zip() method that downloads and extracts ZIP bundles
- Retry on 429 rate limiting with Retry-After header support
- Path sanitization and binary file filtering for security
- Keep _extract_files() as a fallback for inline/raw content
- Also fix nested file lookup (version_data.version.files)
2026-03-12 08:26:24 -07:00
Teknium
42cf66ae39 feat: add 'hermes claw migrate' command + migration docs (#1059)
feat: add 'hermes claw migrate' command + migration docs
2026-03-12 08:23:05 -07:00
Teknium
73ea5102dc Merge pull request #1058 from NousResearch/hermes/hermes-465f3702
fix: strip call_id/response_item_id from tool_calls for Mistral compatibility
2026-03-12 08:21:36 -07:00
teknium1
d53035ad82 feat: add 'hermes claw migrate' command + migration docs
- Add hermes_cli/claw.py with full CLI migration handler:
  - hermes claw migrate (interactive migration with confirmation)
  - --dry-run, --preset, --overwrite, --skill-conflict flags
  - --source for custom OpenClaw path
  - --yes to skip confirmation
  - Clean formatted output matching setup wizard style

- Fix Python 3.11+ @dataclass compatibility bug in dynamic module loading:
  - Register module in sys.modules before exec_module()
  - Fixes both setup.py (PR #981) and new claw.py

- Add 16 tests in tests/hermes_cli/test_claw.py covering:
  - Script discovery (project root, installed, missing)
  - Command routing
  - Dry-run, execute, cancellation, error handling
  - Preset/secrets behavior, report formatting

- Documentation updates:
  - README.md: Add 'hermes claw migrate' to Getting Started, new Migration section
  - docs/migration/openclaw.md: Full migration guide with all options
  - SKILL.md: Add CLI Command section at top of openclaw-migration skill
2026-03-12 08:20:12 -07:00
Teknium
5a4348d046 Merge pull request #1053 from NousResearch/hermes/hermes-c877bdeb
chore(skills): clean up PR #862 + feat(docs): add search to Docusaurus
2026-03-12 08:20:10 -07:00
teknium1
400b8d92b7 fix: strip call_id/response_item_id from tool_calls for Mistral compatibility
Mistral's API strictly validates the Chat Completions schema and rejects
unknown fields (call_id, response_item_id) with 422. These fields are
added by _build_assistant_message() for Codex Responses API support.

This fix:
- Only strips when targeting Mistral (api.mistral.ai in base_url)
- Creates new tool_call dicts instead of mutating originals (shallow
  copy safety — msg.copy() shares the tool_calls list)
- Preserves call_id/response_item_id in the internal message history
  so _chat_messages_to_responses_input() can still read them if the
  session falls back to a Codex provider mid-conversation

Applied in all 3 API message building locations:
- Main conversation loop (run_conversation)
- _handle_max_iterations()
- flush_memories()

Inspired by PR #864 (unmodeled-tyler) which identified the issue but
applied the fix unconditionally and mutated originals via shallow copy.

Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>
2026-03-12 08:18:27 -07:00