* feat: nix flake, uv2nix build, dev shell and home manager
* fixed nix run, updated docs for setup
* feat(nix): NixOS module with persistent container mode, managed guards, checks
- Replace homeModules.nix with nixosModules.nix (two deployment modes)
- Mode A (native): hardened systemd service with ProtectSystem=strict
- Mode B (container): persistent Ubuntu container with /nix/store bind-mount,
identity-hash-based recreation, GC root protection, symlink-based updates
- Add HERMES_MANAGED guards blocking CLI config mutation (config set, setup,
gateway install/uninstall) when running under NixOS module
- Add nix/checks.nix with build-time verification (binary, CLI, managed guard)
- Remove container.nix (no Nix-built OCI image; pulls ubuntu:24.04 at runtime)
- Simplify packages.nix (drop fetchFromGitHub submodules, PYTHONPATH wrappers)
- Rewrite docs/nixos-setup.md with full options reference, container
architecture, secrets management, and troubleshooting guide
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update config.py
* feat(nix): add CI workflow and enhanced build checks
- GitHub Actions workflow for nix flake check + build on linux/macOS
- Entry point sync check to catch pyproject.toml drift
- Expanded managed-guard check to cover config edit
- Wrap hermes-acp binary in Nix package
- Fix Path type mismatch in is_managed()
* Update MCP server package name; bundled skills support
* fix reading .env. instead have container user a common mounted .env file
* feat(nix): container entrypoint with privilege drop and sudo provisioning
Container was running as non-root via --user, which broke apt/pip installs
and caused crashes when $HOME didn't exist. Replace --user with a Nix-built
entrypoint script that provisions the hermes user, sudo (NOPASSWD), and
/home/hermes inside the container on first boot, then drops privileges via
setpriv. Writable layer persists so setup only runs once.
Also expands MCP server options to support HTTP transport and sampling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix group and user creation in container mode
* feat(nix): persistent /home/hermes and MESSAGING_CWD in container mode
Container mode now bind-mounts ${stateDir}/home to /home/hermes so the
agent's home directory survives container recreation. Previously it lived
in the writable layer and was lost on image/volume/options changes.
Also passes MESSAGING_CWD to the container so the agent finds its
workspace and documents, matching native mode behavior.
Other changes:
- Extract containerDataDir/containerHomeDir bindings (no more magic strings)
- Fix entrypoint chown to run unconditionally (volume mounts always exist)
- Add schema field to container identity hash for auto-recreation
- Add idempotency test (Scenario G) to config-roundtrip check
* docs: add Nix & NixOS setup guide to docs site
Add comprehensive Nix documentation to the Docusaurus site at
website/docs/getting-started/nix-setup.md, covering nix run/profile
install, NixOS module (native + container modes), declarative settings,
secrets management, MCP servers, managed mode, container architecture,
dev shell, flake checks, and full options reference.
- Register nix-setup in sidebar after installation page
- Add Nix callout tip to installation.md linking to new guide
- Add canonical version pointer in docs/nixos-setup.md
* docs: remove docs/nixos-setup.md, consolidate into website docs
Backfill missing details (restart/restartSec in full example,
gateway.pid, 0750 permissions, docker inspect commands) into
the canonical website/docs/getting-started/nix-setup.md and
delete the old standalone file.
* fix(nix): add compression.protect_last_n and target_ratio to config-keys.json
New keys were added to DEFAULT_CONFIG on main, causing the
config-drift check to fail in CI.
* fix(nix): skip checks on aarch64-darwin (onnxruntime wheel missing)
The full Python venv includes onnxruntime (via faster-whisper/STT)
which lacks a compatible uv2nix wheel on aarch64-darwin. Gate all
checks behind stdenv.hostPlatform.isLinux. The package and devShell
still evaluate on macOS.
* fix(nix): skip flake check and build on macOS CI
onnxruntime (transitive dep via faster-whisper) lacks a compatible
uv2nix wheel on aarch64-darwin. Run full checks and build on Linux
only; macOS CI verifies the flake evaluates without building.
* fix(nix): preserve container writable layer across nixos-rebuild
The container identity hash included the entrypoint's Nix store path,
which changes on every nixpkgs update (due to runtimeShell/stdenv
input-addressing). This caused false-positive identity mismatches,
triggering container recreation and losing the persistent writable layer.
- Use stable symlink (current-entrypoint) like current-package already does
- Remove entrypoint from identity hash (only image/volumes/options matter)
- Add GC root for entrypoint so nix-collect-garbage doesn't break it
- Remove global HERMES_HOME env var from addToSystemPackages (conflicted
with interactive CLI use, service already sets its own)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move OpenRouter to position 1 in the setup wizard's provider list
to match hermes model ordering. Update default selection index and
fix test expectations for the new ordering.
Setup order: OpenRouter → Nous Portal → Codex → Custom → ...
Reverts the sanitizer addition from PR #2466 (originally #2129).
We already have _empty_content_retries handling for reasoning-only
responses. The trailing strip risks silently eating valid messages
and is redundant with existing empty-content handling.
Same bug as opencode-zen/go — alibaba fell through to the OpenRouter
model list instead of using _setup_provider_model_selection() which
probes the provider's own /models endpoint.
All user-selectable providers now have correct model selection routing.
After selecting OpenCode Zen or Go as provider in hermes setup, the
model selection page showed OpenRouter models because these providers
weren't in the list that routes to _setup_provider_model_selection().
They fell through to the else branch which shows the OpenRouter catalog.
Users ended up with an OpenCode API key but an OpenRouter model name,
causing 'Provider resolver returned an empty API key' on first use.
Fix: add opencode-zen and opencode-go to the provider list that uses
_setup_provider_model_selection() for live /models detection.
Replace the fragile hardcoded context length system with a multi-source
resolution chain that correctly identifies context windows per provider.
Key changes:
- New agent/models_dev.py: Fetches and caches the models.dev registry
(3800+ models across 100+ providers with per-provider context windows).
In-memory cache (1hr TTL) + disk cache for cold starts.
- Rewritten get_model_context_length() resolution chain:
0. Config override (model.context_length)
1. Custom providers per-model context_length
2. Persistent disk cache
3. Endpoint /models (local servers)
4. Anthropic /v1/models API (max_input_tokens, API-key only)
5. OpenRouter live API (existing, unchanged)
6. Nous suffix-match via OpenRouter (dot/dash normalization)
7. models.dev registry lookup (provider-aware)
8. Thin hardcoded defaults (broad family patterns)
9. 128K fallback (was 2M)
- Provider-aware context: same model now correctly resolves to different
context windows per provider (e.g. claude-opus-4.6: 1M on Anthropic,
128K on GitHub Copilot). Provider name flows through ContextCompressor.
- DEFAULT_CONTEXT_LENGTHS shrunk from 80+ entries to ~16 broad patterns.
models.dev replaces the per-model hardcoding.
- CONTEXT_PROBE_TIERS changed from [2M, 1M, 512K, 200K, 128K, 64K, 32K]
to [128K, 64K, 32K, 16K, 8K]. Unknown models no longer start at 2M.
- hermes model: prompts for context_length when configuring custom
endpoints. Supports shorthand (32k, 128K). Saved to custom_providers
per-model config.
- custom_providers schema extended with optional models dict for
per-model context_length (backward compatible).
- Nous Portal: suffix-matches bare IDs (claude-opus-4-6) against
OpenRouter's prefixed IDs (anthropic/claude-opus-4.6) with dot/dash
normalization. Handles all 15 current Nous models.
- Anthropic direct: queries /v1/models for max_input_tokens. Only works
with regular API keys (sk-ant-api*), not OAuth tokens. Falls through
to models.dev for OAuth users.
Tests: 5574 passed (18 new tests for models_dev + updated probe tiers)
Docs: Updated configuration.md context length section, AGENTS.md
Co-authored-by: Test <test@test.com>
MiniMax: Add M2.7 and M2.7-highspeed as new defaults across provider
model lists, auxiliary client, metadata, setup wizard, RL training tool,
fallback tests, and docs. Retain M2.5/M2.1 as alternatives.
OpenRouter: Add grok-4.20-beta, nemotron-3-super-120b-a12b:free,
trinity-large-preview:free, glm-5-turbo, and hunter-alpha to the
model catalog.
MiniMax changes based on PR #1882 by @octo-patch (applied manually
due to stale conflicts in refactored pricing module).
Add first-class GitHub Copilot and Copilot ACP provider support across
model selection, runtime provider resolution, CLI sessions, delegated
subagents, cron jobs, and the Telegram gateway.
This also normalizes Copilot model catalogs and API modes, introduces a
Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by
resolving Homebrew-installed gh binaries under launchd.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The OpenCode Zen and OpenCode Go setup sections used prompt_text()
which is undefined. All other providers correctly use the local
prompt() function defined in setup.py. Fixes crash during
'hermes setup' when selecting either OpenCode provider.
Salvage of PR #1707 by @kshitijk4poor (cherry-picked with authorship preserved).
Adds Tavily as a third web backend alongside Firecrawl and Parallel, using the Tavily REST API via httpx.
- Backend selection via hermes tools → saved as web.backend in config.yaml
- All three tools supported: search, extract, crawl
- TAVILY_API_KEY in config registry, doctor, status, setup wizard
- 15 new Tavily tests + 9 backend selection tests + 5 config tests
- Backward compatible
Closes#1707
* feat(web): add Parallel as alternative web search/extract backend
Adds Parallel (parallel.ai) as a drop-in alternative to Firecrawl for
web_search and web_extract tools using the official parallel-web SDK.
- Backend selection via WEB_SEARCH_BACKEND env var (auto/parallel/firecrawl)
- Auto mode prefers Firecrawl when both keys present; Parallel when sole backend
- web_crawl remains Firecrawl-only with clear error when unavailable
- Lazy SDK imports, interrupt support, singleton clients
- 16 new unit tests for backend selection and client config
Co-authored-by: s-jag <s-jag@users.noreply.github.com>
* fix: add PARALLEL_API_KEY to config registry and fix web_crawl policy tests
Follow-up for Parallel backend integration:
- Add PARALLEL_API_KEY to OPTIONAL_ENV_VARS (hermes doctor, env blocklist)
- Add to set_config_value api_keys list (hermes config set)
- Add to doctor keys display
- Fix 2 web_crawl policy tests that didn't set FIRECRAWL_API_KEY
(needed now that web_crawl has a Firecrawl availability guard)
* refactor: explicit backend selection via hermes tools, not auto-detect
Replace the auto-detect backend selection with explicit user choice:
- hermes tools saves WEB_SEARCH_BACKEND to .env when user picks a provider
- _get_backend() reads the explicit choice first
- Fallback only for manual/legacy config (uses whichever key is present)
- _is_provider_active() shows [active] for the selected web backend
- Updated tests, docs, and .env.example to remove 'auto' mode language
* refactor: use config.yaml for web backend, not env var
Match the TTS/browser pattern — web.backend is stored in config.yaml
(set by hermes tools), not as a WEB_SEARCH_BACKEND env var.
- _load_web_config() reads web: section from config.yaml
- _get_backend() reads web.backend from config, falls back to key detection
- _configure_provider() saves to config dict (saved to config.yaml)
- _is_provider_active() reads from config dict
- Removed WEB_SEARCH_BACKEND from .env.example, set_config_value, docs
- Updated all tests to mock _load_web_config instead of env vars
---------
Co-authored-by: s-jag <s-jag@users.noreply.github.com>
The function uses subprocess.run() and subprocess.CalledProcessError but
never imported the module. This caused a NameError crash during setup
when users selected NeuTTS as their TTS provider.
Fixes#1698
Add support for Mattermost (self-hosted Slack alternative) and Matrix
(federated messaging protocol) as messaging platforms.
Mattermost adapter:
- REST API v4 client for posts, files, channels, typing indicators
- WebSocket listener for real-time 'posted' events with reconnect backoff
- Thread support via root_id
- File upload/download with auth-aware caching
- Dedup cache (5min TTL, 2000 entries)
- Full self-hosted instance support
Matrix adapter:
- matrix-nio AsyncClient with sync loop
- Dual auth: access token or user_id + password
- Optional E2EE via matrix-nio[e2e] (libolm)
- Thread support via m.thread (MSC3440)
- Reply support via m.in_reply_to with fallback stripping
- Media upload/download via mxc:// URLs (authenticated v1.11+ endpoint)
- Auto-join on room invite
- DM detection via m.direct account data with sync fallback
- Markdown to HTML conversion
Fixes applied over original PR #1225 by @cyb0rgk1tty:
- Mattermost: add timeout to file downloads, wrap API helpers in
try/except for network errors, download incoming files immediately
with auth headers instead of passing auth-required URLs
- Matrix: use authenticated media endpoint (/_matrix/client/v1/media/),
robust m.direct cache with sync fallback, prefer aiohttp over httpx
Install Matrix support: pip install 'hermes-agent[matrix]'
Mattermost needs no extra deps (uses aiohttp).
Salvaged from PR #1225 by @cyb0rgk1tty with fixes.
* feat(gateway): add DingTalk platform adapter
Add DingTalk as a messaging platform using the dingtalk-stream SDK
for real-time message reception via Stream Mode (no webhook needed).
Replies are sent via session webhook using markdown format.
Features:
- Stream Mode connection (long-lived WebSocket, no public URL needed)
- Text and rich text message support
- DM and group chat support
- Message deduplication with 5-minute window
- Auto-reconnection with exponential backoff
- Session webhook caching for reply routing
Configuration:
export DINGTALK_CLIENT_ID=your-app-key
export DINGTALK_CLIENT_SECRET=your-app-secret
# or in config.yaml:
platforms:
dingtalk:
enabled: true
extra:
client_id: your-app-key
client_secret: your-app-secret
Files:
- gateway/platforms/dingtalk.py (340 lines) — adapter implementation
- gateway/config.py — add DINGTALK to Platform enum
- gateway/run.py — add DingTalk to _create_adapter
- hermes_cli/config.py — add env vars to _EXTRA_ENV_KEYS
- hermes_cli/tools_config.py — add dingtalk to PLATFORMS
- tests/gateway/test_dingtalk.py — 21 tests
* docs: add Alibaba Cloud and DingTalk to setup wizard and docs
Wire Alibaba Cloud (DashScope) into hermes setup and hermes model
provider selection flows. Add DingTalk env vars to documentation.
Changes:
- setup.py: Add Alibaba Cloud as provider choice (index 11) with
DASHSCOPE_API_KEY prompt and model studio link
- main.py: Add alibaba to provider_labels, providers list, and
model flow dispatch
- environment-variables.md: Add DASHSCOPE_API_KEY, DINGTALK_CLIENT_ID,
DINGTALK_CLIENT_SECRET, and alibaba to HERMES_INFERENCE_PROVIDER
Add Kilo Gateway (kilo.ai) as an API-key provider with OpenAI-compatible
endpoint at https://api.kilo.ai/api/gateway. Supports 500+ models from
Anthropic, OpenAI, Google, xAI, Mistral, MiniMax via a single API key.
- Register kilocode in PROVIDER_REGISTRY with aliases (kilo, kilo-code,
kilo-gateway) and KILOCODE_API_KEY / KILOCODE_BASE_URL env vars
- Add to model catalog, CLI provider menu, setup wizard, doctor checks
- Add google/gemini-3-flash-preview as default aux model
- 12 new tests covering registration, aliases, credential resolution,
runtime config
- Documentation updates (env vars, config, fallback providers)
- Fix setup test index shift from provider insertion
Inspired by PR #1473 by @amanning3390.
Co-authored-by: amanning3390 <amanning3390@users.noreply.github.com>
Remove the optional skill (redundant now that NeuTTS is a built-in TTS
provider). Replace neutts_cli dependency with a standalone synthesis
helper (tools/neutts_synth.py) that calls the neutts Python API directly
in a subprocess.
Add TTS provider selection to hermes setup:
- 'hermes setup' now prompts for TTS provider after model selection
- 'hermes setup tts' available as standalone section
- Selecting NeuTTS checks for deps and offers to install:
espeak-ng (system) + neutts[all] (pip)
- ElevenLabs/OpenAI selections prompt for API keys
- Tool status display shows NeuTTS install state
Changes:
- Remove optional-skills/mlops/models/neutts/ (skill + CLI scaffold)
- Add tools/neutts_synth.py (standalone synthesis subprocess helper)
- Move jo.wav/jo.txt to tools/neutts_samples/ (bundled default voice)
- Refactor _generate_neutts() — uses neutts API via subprocess, no
neutts_cli dependency, config-driven ref_audio/ref_text/model/device
- Add TTS setup to hermes_cli/setup.py (SETUP_SECTIONS, tool status)
- Update config.py defaults (ref_audio, ref_text, model, device)
Add support for OpenCode Zen (pay-as-you-go, 35+ curated models) and
OpenCode Go ($10/month subscription, open models) as first-class providers.
Both are OpenAI-compatible endpoints resolved via the generic api_key
provider flow — no custom adapter needed.
Files changed:
- hermes_cli/auth.py — ProviderConfig entries + aliases
- hermes_cli/config.py — OPENCODE_ZEN/GO API key env vars
- hermes_cli/models.py — model catalogs, labels, aliases, provider order
- hermes_cli/main.py — provider labels, menu entries, model flow dispatch
- hermes_cli/setup.py — setup wizard branches (idx 10, 11)
- agent/model_metadata.py — context lengths for all OpenCode models
- agent/auxiliary_client.py — default aux models
- .env.example — documentation
Co-authored-by: DevAgarwal2 <DevAgarwal2@users.noreply.github.com>
* feat: add Vercel AI Gateway as a first-class provider
Adds AI Gateway (ai-gateway.vercel.sh) as a new inference provider
with AI_GATEWAY_API_KEY authentication, live model discovery, and
reasoning support via extra_body.reasoning.
Based on PR #1492 by jerilynzheng.
* feat: add AI Gateway to setup wizard, doctor, and fallback providers
* test: add AI Gateway to api_key_providers test suite
* feat: add AI Gateway to hermes model CLI and model metadata
Wire AI Gateway into the interactive model selection menu and add
context lengths for AI Gateway model IDs in model_metadata.py.
* feat: use claude-haiku-4.5 as AI Gateway auxiliary model
* revert: use gemini-3-flash as AI Gateway auxiliary model
* fix: move AI Gateway below established providers in selection order
---------
Co-authored-by: jerilynzheng <jerilynzheng@users.noreply.github.com>
Co-authored-by: jerilynzheng <zheng.jerilyn@gmail.com>
_update_config_for_provider() was called immediately after provider
selection for zai, kimi-coding, minimax, minimax-cn, and anthropic —
before model selection happened. Since the gateway re-reads config.yaml
per-message, this created a race where the gateway would pick up the
new provider but still use the old (incompatible) model name.
Capture selected_base_url in each provider block, then call
_update_config_for_provider() once, after model selection completes,
right before save_config(). The in-memory _set_model_provider() calls
stay in place so the config object remains consistent during setup.
Closes#1182
- mark private-channel scopes/events as optional
- note reinstall requirement after scope/event changes
- correct Slack allowlist messaging to match gateway behavior
Add regression coverage for the new provider-aware vision setup flow and make the default OpenAI choice write AUXILIARY_VISION_MODEL so auxiliary vision requests don't fall back to the main model slug.
The old flow blindly asked for an OpenRouter API key after ANY non-OR
provider selection, even for Nous Portal and Codex which already
support vision natively. This was confusing and annoying.
New behavior:
- OpenRouter: skip — vision uses Gemini via their OR key
- Nous Portal OAuth: skip — vision uses Gemini via Nous
- OpenAI Codex: skip — gpt-5.3-codex supports vision
- Custom endpoint (api.openai.com): show OpenAI vision model picker
(gpt-4o, gpt-4o-mini, gpt-4.1, etc.), saves AUXILIARY_VISION_MODEL
- Custom (other) / z.ai / kimi / minimax / nous-api:
- First checks if existing OR/Nous creds already cover vision
- If not, offers friendly choice: OpenRouter / OpenAI / Skip
- No more 'enter OpenRouter key' thrown in your face
Also fixes the setup summary to check actual vision availability
across all providers instead of hardcoding 'requires OPENROUTER_API_KEY'.
MoA still correctly requires OpenRouter (calls multiple frontier models).
hermes setup hung indefinitely on headless SSH sessions, Docker
containers, and CI/CD environments because the interactive provider
selection menu could not receive input.
Two-layer fix:
1. sys.stdin.isatty() check — auto-detects non-interactive environments
2. --non-interactive flag support — already in CLI parser, now honored
In both cases the wizard exits immediately with helpful guidance
pointing users to 'hermes config set' commands.
Closes#905
Users sometimes paste Discord IDs with prefixes like 'user:123456',
'<@123456>', or '<@!123456>' from Discord's UI or third-party tools.
This caused auth failures since the allowlist contained 'user:123' but
the actual user_id from messages was just '123'.
Fixes:
- Added _clean_discord_id() helper in discord.py to strip common prefixes
- Applied sanitization at runtime when parsing DISCORD_ALLOWED_USERS env var
- Applied sanitization in hermes setup and hermes gateway setup input flows
- Handles user:, <@>, and <@!> prefix formats
When _update_config_for_provider() writes the new provider and base_url
to config.yaml, the gateway (which re-reads config per-message) can pick
up the change before model selection completes. This causes the old model
name (e.g. 'anthropic/claude-opus-4.6') to be sent to the new provider's
API (e.g. MiniMax), which fails.
Changes:
- _update_config_for_provider() now accepts an optional default_model
parameter. When provided and the current model.default is empty or
uses OpenRouter format (contains '/'), it sets a safe default model
for the new provider.
- All setup.py callers for direct-API providers (zai, kimi, minimax,
minimax-cn, anthropic) now pass a provider-appropriate default model.
- _setup_provider_model_selection() now validates the 'Keep current'
choice: if the current model uses OpenRouter format and wouldn't work
with the new provider, it warns and switches to the provider's first
default model instead of silently keeping the incompatible name.
Reported by a user on Home Assistant whose gateway started sending
'anthropic/claude-opus-4.6' to MiniMax's API after running hermes setup.
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
Remove 50 lines of unreachable duplicate model selection logic in
setup_model_provider() for zai/kimi-coding/minimax/minimax-cn providers.
The code referenced undefined `is_coding_plan` variable, crashing setup.
_setup_provider_model_selection() already handles these providers correctly
via _DEFAULT_PROVIDER_MODELS dict.
Fixes from comprehensive code review and cross-referencing with
clawdbot/OpenCode implementations:
CRITICAL:
- Add one-shot guard (anthropic_auth_retry_attempted) to prevent
infinite 401 retry loops when credentials keep changing
- Fix _is_oauth_token(): managed keys from ~/.claude.json are NOT
regular API keys (don't start with sk-ant-api). Inverted the logic:
only sk-ant-api* is treated as API key auth, everything else uses
Bearer auth + oauth beta headers
HIGH:
- Wrap json.loads(args) in try/except in message conversion — malformed
tool_call arguments no longer crash the entire conversation
- Raise AuthError in runtime_provider when no Anthropic token found
(was silently passing empty string, causing confusing API errors)
- Remove broken _try_anthropic() from auxiliary vision chain — the
centralized router creates an OpenAI client for api_key providers
which doesn't work with Anthropic's Messages API
MEDIUM:
- Handle empty assistant message content — Anthropic rejects empty
content blocks, now inserts '(empty)' placeholder
- Fix setup.py existing_key logic — set to 'KEEP' sentinel instead
of None to prevent falling through to the auth choice prompt
- Add debug logging to _fetch_anthropic_models on failure
Tests: 43 adapter tests (2 new for token detection), 3197 total passed
- Add _fetch_anthropic_models() to hermes_cli/models.py — hits the
Anthropic /v1/models endpoint to get the live model catalog. Handles
both API key and OAuth token auth headers.
- Wire it into provider_model_ids() so both 'hermes model' and
'hermes setup model' show the live list instead of a stale static one.
- Update static _PROVIDER_MODELS fallback with full current catalog:
opus-4-6, sonnet-4-6, opus-4-5, sonnet-4-5, opus-4, sonnet-4, haiku-4-5
- Update model_metadata.py with context lengths for all current models.
- Fix thinking parameter for 4.5+ models: use type='adaptive' instead
of type='enabled' (Anthropic deprecated 'enabled' for newer models,
warns at runtime). Detects model version from the model name string.
Verified live:
hermes model → Anthropic → auto-detected creds → shows 7 live models
hermes chat --provider anthropic --model claude-opus-4-6 → works
Both 'hermes model' and 'hermes setup model' now present a clear
two-option auth flow when no credentials are found:
1. Claude Pro/Max subscription (setup-token)
- Step-by-step instructions to run 'claude setup-token'
- User pastes the resulting sk-ant-oat01-... token
2. Anthropic API key (pay-per-token)
- Link to console.anthropic.com/settings/keys
- User pastes sk-ant-api03-... key
Also handles:
- Auto-detection of existing Claude Code creds (~/.claude/.credentials.json)
- Existing credentials shown with option to update
- Consistent UX between 'hermes model' and 'hermes setup model'