The credential pool seeder and runtime credential resolver hardcoded
api.z.ai/api/paas/v4 for all Z.AI keys. Keys on the Coding Plan (or CN
endpoint) would hit the wrong endpoint, causing 401/429 errors on the
first request even though a working endpoint exists.
Add _resolve_zai_base_url() that:
- Respects GLM_BASE_URL env var (no probe when explicitly set)
- Probes all candidate endpoints (global, cn, coding-global, coding-cn)
via detect_zai_endpoint() to find one that returns HTTP 200
- Caches the detected endpoint in provider state (auth.json) keyed on
a SHA-256 hash of the API key so subsequent starts skip the probe
- Falls back to the default URL if all probes fail
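A rough sketch of that resolution order (the probe and state callbacks here
are illustrative stand-ins, not the actual signatures):

```python
import hashlib
import os

# Candidate order is illustrative; the real list comes from detect_zai_endpoint().
_ZAI_CANDIDATES = [
    "https://api.z.ai/api/paas/v4",   # global default
    # ... cn, coding-global, coding-cn variants
]

def _resolve_zai_base_url(api_key, load_state, save_state, probe):
    """Env override > cached detection > live probe > default."""
    env_url = os.environ.get("GLM_BASE_URL")
    if env_url:
        return env_url                      # explicit override, no probe

    key_hash = hashlib.sha256(api_key.encode()).hexdigest()
    state = load_state()                    # provider state (auth.json)
    cached = state.get("zai_endpoints", {}).get(key_hash)
    if cached:
        return cached                       # subsequent starts skip the probe

    for url in _ZAI_CANDIDATES:
        if probe(url, api_key):             # stand-in for detect_zai_endpoint()
            state.setdefault("zai_endpoints", {})[key_hash] = url
            save_state(state)
            return url

    return _ZAI_CANDIDATES[0]               # all probes failed: default URL
```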
Wire into both _seed_from_env() in the credential pool and
resolve_api_key_provider_credentials() in the runtime resolver,
matching the pattern from the kimi-coding fix (PR #5566).
Fixes the same class of bug as #5561 but for the zai provider.
Three bugs causing OpenAI Codex sessions to fail silently:
1. Credential pool vs legacy store disconnect: hermes auth and hermes
model store device_code tokens in the credential pool, but
get_codex_auth_status(), resolve_codex_runtime_credentials(), and
_model_flow_openai_codex() only read from the legacy provider state.
Fresh pool tokens were invisible to the auth status checks and model
selection flow.
2. _import_codex_cli_tokens() imported expired tokens from ~/.codex/
without checking JWT expiry. Combined with _login_openai_codex()
saying 'Login successful!' for expired credentials, users got stuck
in a loop of dead tokens being recycled.
3. _login_openai_codex() accepted expired tokens from
resolve_codex_runtime_credentials() without validating expiry before
telling the user login succeeded.
Fixes:
- get_codex_auth_status() now checks credential pool first, falls back
to legacy provider state
- _model_flow_openai_codex() uses pool-aware auth status for token
retrieval when fetching model lists
- _import_codex_cli_tokens() validates the JWT exp claim and rejects
expired tokens (see the sketch after this list)
- _login_openai_codex() verifies resolved token isn't expiring before
accepting existing credentials
- _run_codex_stream() logs response.incomplete/failed terminal events
with status and incomplete_details for diagnostics
- Codex empty output recovery: captures streamed text during streaming
and synthesizes a response when get_final_response() returns empty
output (handles chatgpt.com backend-api edge cases)
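The expiry validation behind the import and login fixes amounts to decoding
the JWT payload and comparing the exp claim against the clock; a hedged
sketch (helper name is illustrative):

```python
import base64
import json
import time

def _jwt_is_expired(token: str, leeway: int = 60) -> bool:
    """Decode the JWT payload (no signature check) and compare exp to now."""
    try:
        payload_b64 = token.split(".")[1]
        payload_b64 += "=" * (-len(payload_b64) % 4)   # restore base64url padding
        payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    except (IndexError, ValueError):
        return True                    # malformed token: treat as unusable
    if not isinstance(payload, dict):
        return True
    exp = payload.get("exp")
    if exp is None:
        return False                   # no exp claim: assume still valid
    return time.time() >= exp - leeway
```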
Two fixes:
1. Replace all stale 'hermes login' references with 'hermes auth' across
auth.py, auxiliary_client.py, delegate_tool.py, config.py, run_agent.py,
and documentation. The 'hermes login' command was deprecated; 'hermes auth'
now handles OAuth credential management.
2. Fix credential removal not persisting for singleton-sourced credentials
(device_code for openai-codex/nous, hermes_pkce for anthropic).
auth_remove_command already cleared env vars for env-sourced credentials,
but singleton credentials stored in the auth store were re-seeded by
_seed_from_singletons() on the next load_pool() call. Now clears the
underlying auth store entry when removing singleton-sourced credentials.
When the Codex CLI (or VS Code extension) consumes a refresh token before
Hermes can use it, Hermes previously surfaced a generic 401 error with no
actionable guidance.
- In `refresh_codex_oauth_pure`: detect `refresh_token_reused` from the
OAuth endpoint and raise an AuthError explaining the cause and the exact
steps to recover (run `codex` to refresh, then `hermes login`); see the
sketch below.
- In `run_agent.py`: when provider is `openai-codex` and HTTP 401 is
received, show Codex-specific recovery steps instead of the generic
"check your API key" message.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add validate_config_structure() that catches common config.yaml mistakes:
- custom_providers as dict instead of list (missing '-' in YAML)
- fallback_model accidentally nested inside another section
- custom_providers entries missing required fields (name, base_url)
- Missing model section when custom_providers is configured
- Root-level keys that look like misplaced custom_providers fields
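A hedged sketch of what these checks can look like (the exact rules, and
where fields like fallback_model are expected to live, are assumptions):

```python
def validate_config_structure(config: dict) -> list[str]:
    """Return human-readable warnings for common config.yaml mistakes."""
    issues = []
    custom = config.get("custom_providers")

    if isinstance(custom, dict):
        issues.append("custom_providers is a mapping, not a list - "
                      "each entry should start with '-' in YAML")
    elif isinstance(custom, list):
        for i, entry in enumerate(custom):
            if not isinstance(entry, dict):
                issues.append(f"custom_providers[{i}] is not a mapping")
                continue
            for required in ("name", "base_url"):
                if not entry.get(required):
                    issues.append(f"custom_providers[{i}] is missing '{required}'")
        if "model" not in config:
            issues.append("custom_providers is set but the 'model' section is missing")

    for section, value in config.items():
        if isinstance(value, dict) and "fallback_model" in value and section != "model":
            issues.append(f"fallback_model is nested under '{section}' and may be misplaced")

    return issues
```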
Surface these diagnostics at three levels:
1. Startup: print_config_warnings() runs at CLI and gateway module load,
so users see issues before hitting cryptic errors
2. Error time: 'Unknown provider' errors in auth.py and model_switch.py
now include config diagnostics with fix suggestions
3. Doctor: 'hermes doctor' shows a Config Structure section with all
issues and fix hints
Also adds a warning log in runtime_provider.py when custom_providers
is a dict (previously returned None silently).
Motivated by a Discord user who had malformed custom_providers YAML
and got only 'Unknown Provider' with no guidance on what was wrong.
17 new tests covering all validation paths.
The _login_nous() call site was pre-filling portal_base_url,
inference_base_url, client_id, and scope with pconfig defaults before
passing them to _nous_device_code_login(). Since pconfig defaults are
always truthy, the env var checks inside the function (HERMES_PORTAL_BASE_URL,
NOUS_PORTAL_BASE_URL, NOUS_INFERENCE_BASE_URL) could never take effect.
Fix: pass None from the call site when no CLI flag is provided, letting
the function's own priority chain handle defaults correctly:
explicit CLI flag > env var > pconfig default.
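Roughly (the helper and flag wiring shown here are illustrative):

```python
import os

def _portal_base_url(cli_value, pconfig_default):
    """Priority chain: explicit CLI flag > env var > pconfig default."""
    if cli_value:                                   # only truthy when the flag was passed
        return cli_value
    return (os.environ.get("HERMES_PORTAL_BASE_URL")
            or os.environ.get("NOUS_PORTAL_BASE_URL")
            or pconfig_default)

# The call site now passes None instead of pre-filling the pconfig default,
# so the env var branch can actually be reached.
```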
Addresses the issue reported in PR #5397 by jquesnelle.
Display live per-million-token pricing from /v1/models when listing
models for OpenRouter or Nous Portal. Prices are shown in a
column-aligned table with decimal points vertically aligned for
easy comparison.
Pricing appears in three places:
- /provider slash command (table with In/Out headers)
- hermes model picker (aligned columns in both TerminalMenu and
numbered fallback)
Implementation:
- Add fetch_models_with_pricing() in models.py with per-base_url
module-level cache (one network call per endpoint per session)
- Add _format_price_per_mtok() with fixed 2-decimal formatting
- Add format_model_pricing_table() for terminal table display
- Add get_pricing_for_provider() convenience wrapper
- Update _prompt_model_selection() to accept optional pricing dict
- Wire pricing through _model_flow_openrouter/nous in main.py
- Update test mocks for new pricing parameter
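The alignment idea, sketched (actual signatures in models.py may differ):
fixed two-decimal prices, right-aligned per column, so the decimal points
stack vertically.

```python
def _format_price_per_mtok(price_per_token):
    """Convert a per-token price into a fixed 2-decimal $/Mtok string."""
    if price_per_token is None:
        return "-"
    return f"${price_per_token * 1_000_000:.2f}"

def format_model_pricing_table(rows):
    """rows: [(model_id, in_price_per_token, out_price_per_token), ...]"""
    if not rows:
        return ""
    cells = [(m, _format_price_per_mtok(i), _format_price_per_mtok(o)) for m, i, o in rows]
    name_w = max(len(m) for m, _, _ in cells)
    in_w = max(len("In"), *(len(i) for _, i, _ in cells))
    out_w = max(len("Out"), *(len(o) for _, _, o in cells))
    lines = [f"{'Model':<{name_w}}  {'In':>{in_w}}  {'Out':>{out_w}}"]
    lines += [f"{m:<{name_w}}  {i:>{in_w}}  {o:>{out_w}}" for m, i, o in cells]
    return "\n".join(lines)
```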
OpenCode Zen and Go are mixed-API-surface providers — different models
behind them use different API surfaces (GPT on Zen uses codex_responses,
Claude on Zen uses anthropic_messages, MiniMax on Go uses
anthropic_messages, GLM/Kimi on Go use chat_completions).
Changes:
- Add normalize_opencode_model_id() and opencode_model_api_mode() to
models.py for model ID normalization and API surface routing
- Add _provider_supports_explicit_api_mode() to runtime_provider.py
to prevent stale api_mode from leaking across provider switches
- Wire opencode routing into all three api_mode resolution paths:
pool entry, api_key provider, and explicit runtime
- Add api_mode field to ModelSwitchResult for propagation through the
switch pipeline
- Consolidate _PROVIDER_MODELS from main.py into models.py (single
source of truth, eliminates duplicate dict)
- Add opencode normalization to setup wizard and model picker flows
- Add opencode block to _normalize_model_for_provider in CLI
- Add opencode-zen/go fallback model lists to setup.py
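A sketch of the routing described above (the exact model-ID matching rules
are assumptions; the real logic lives in models.py):

```python
def opencode_model_api_mode(provider: str, model_id: str) -> str:
    """Map a normalized OpenCode model ID to the API surface it expects."""
    name = model_id.lower()
    if provider == "opencode-zen":
        if name.startswith("gpt") or "codex" in name:
            return "codex_responses"
        if "claude" in name:
            return "anthropic_messages"
    elif provider == "opencode-go":
        if "minimax" in name:
            return "anthropic_messages"
        # GLM / Kimi (and anything unrecognized) stay on chat completions.
    return "chat_completions"
```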
Tests: 160 targeted tests pass (26 new tests covering normalization,
api_mode routing per provider/model, persistence, and setup wizard
normalization).
Based on PR #3017 by SaM13997.
Co-authored-by: SaM13997 <139419381+SaM13997@users.noreply.github.com>
* feat(auth): add same-provider credential pools and rotation UX
Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.
- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647
* fix(tests): prevent pool auto-seeding from host env in credential pool tests
Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.
- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test
* feat(auth): add thread safety, least_used strategy, and request counting
- Add threading.Lock to CredentialPool for gateway thread safety
(concurrent requests from multiple gateway sessions could race on
pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
thread safety (4 threads × 20 selects with no corruption)
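The core of the strategy and locking, sketched (the real pool also persists
state and exposes mark_used() separately):

```python
import threading
from dataclasses import dataclass
from typing import Optional

@dataclass
class PooledCredential:
    provider: str
    secret: str
    request_count: int = 0
    exhausted: bool = False

class CredentialPool:
    def __init__(self, entries):
        self._entries = list(entries)
        self._lock = threading.Lock()          # guards all pool-state mutations

    def select(self, strategy: str = "least_used") -> Optional[PooledCredential]:
        with self._lock:
            available = [e for e in self._entries if not e.exhausted]
            if not available:
                return None
            if strategy == "least_used":
                chosen = min(available, key=lambda e: e.request_count)
            else:
                chosen = available[0]          # fill_first-style default
            chosen.request_count += 1          # mark_used() folded in for brevity
            return chosen
```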
* feat(auth): add interactive mode for bare 'hermes auth' command
When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:
1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
add flow explicitly asks 'API key or OAuth login?' — making it
clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection
The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.
* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)
Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.
* feat(auth): support custom endpoint credential pools keyed by provider name
Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).
- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing
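A sketch of the key derivation (the normalization rules shown are an
assumption):

```python
from urllib.parse import urlparse

def custom_pool_key(name, base_url):
    """Derive the 'custom:<normalized_name>' pool key for a custom endpoint."""
    if name:
        normalized = name.strip().lower().replace(" ", "-")
    else:
        # No configured name: fall back to the endpoint hostname.
        normalized = (urlparse(base_url).netloc or base_url).lower()
    return f"custom:{normalized}"

# custom_pool_key("Together.AI", "https://api.together.xyz/v1") -> "custom:together.ai"
```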
* docs: add Excalidraw diagram of full credential pool flow
Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration
Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g
* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow
The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached
* docs: add comprehensive credential pool documentation
- New page: website/docs/user-guide/features/credential-pools.md
Full guide covering quick start, CLI commands, rotation strategies,
error recovery, custom endpoint pools, auto-discovery, thread safety,
architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide
* chore: remove excalidraw diagram from repo (external link only)
* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns
- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
(token_type, scope, client_id, portal_base_url, obtained_at,
expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
agent_key_obtained_at, tls) into a single extra dict with
__getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider
Net -17 lines. All 383 targeted tests pass.
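The extra-dict collapse keeps old attribute access working via __getattr__;
roughly (field list abbreviated, real class shape may differ):

```python
class PooledCredential:
    _EXTRA_FIELDS = {"token_type", "scope", "client_id", "portal_base_url",
                     "obtained_at", "expires_in", "tls"}  # abbreviated

    def __init__(self, provider, secret, extra=None):
        self.provider = provider
        self.secret = secret
        self.extra = dict(extra or {})         # round-trip-only fields live here

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so real attributes are unaffected.
        if name in self._EXTRA_FIELDS:
            return self.extra.get(name)
        raise AttributeError(name)
```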
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
* fix(alibaba): use standard DashScope international endpoint
The Alibaba Cloud provider was hardcoded to the coding-intl endpoint
(https://coding-intl.dashscope.aliyuncs.com/v1) which only accepts
Alibaba Coding Plan API keys.
Standard DashScope API keys fail with invalid_api_key error against
this endpoint. Changed to the international compatible-mode endpoint
(https://dashscope-intl.aliyuncs.com/compatible-mode/v1) which works
with standard DashScope keys.
Users with Coding Plan keys or China-region keys can still override
via DASHSCOPE_BASE_URL or config.yaml base_url.
Fixes #3912
* fix: update test to match new DashScope default endpoint
---------
Co-authored-by: kagura-agent <kagura.chen28@gmail.com>
All three Nous Portal model selection paths (hermes model, first-time
login, setup wizard) were hitting the live /models endpoint and showing
every model available — potentially hundreds. Now uses the curated
_PROVIDER_MODELS['nous'] list (25 agentic models matching OpenRouter
defaults) with 'Enter custom model name' for anything else.
Fixed in:
- hermes_cli/main.py: _model_flow_nous()
- hermes_cli/auth.py: _login_nous() model selection
- hermes_cli/setup.py: post-login model selection
Previously, when no API keys or provider credentials were found, Hermes
silently defaulted to OpenRouter + Claude Opus. This caused confusion
when users configured local servers (LM Studio, Ollama, etc.) with a
typo or unrecognized provider name — the system would silently route to
OpenRouter instead of telling them something was wrong.
Changes:
- resolve_provider() now raises AuthError when no credentials are found
instead of returning 'openrouter' as a silent fallback
- Added local server aliases: lmstudio, ollama, vllm, llamacpp → custom
- Removed hardcoded 'anthropic/claude-opus-4.6' fallback from gateway
and cron scheduler (they read from config.yaml instead)
- Updated cli-config.yaml.example with complete provider documentation
including all supported providers, aliases, and local server setup
When a user runs 'hermes update', the Python process caches old modules
in sys.modules. After git pull updates files on disk, lazy imports of
newly-updated modules fail because they try to import display_hermes_home
from the cached (old) hermes_constants which doesn't have the function.
This specifically broke the gateway auto-restart in cmd_update — importing
hermes_cli/gateway.py triggered the top-level 'from hermes_constants
import display_hermes_home' against the cached old module. The ImportError
was silently caught, so the gateway was never restarted after update.
Users with a running gateway then hit the ImportError on their next
Telegram/Discord message when the stale gateway process lazily loaded
run_agent.py (new version) which also had the top-level import.
Fixes:
- hermes_cli/gateway.py: lazy import at call site (line 940)
- run_agent.py: lazy import at call site (line 6927)
- tools/terminal_tool.py: lazy imports at 3 call sites
- tools/tts_tool.py: static schema string (no module-level call)
- hermes_cli/auth.py: lazy import at call site (line 2024)
- hermes_cli/main.py: reload hermes_constants after git pull in cmd_update
Also fixes 4 pre-existing test failures in test_parse_env_var caused by
NameError on display_hermes_home in terminal_tool.py.
Prep for profiles: user-facing messages now use display_hermes_home() so
diagnostic output shows the correct path for each profile.
New helper: display_hermes_home() in hermes_constants.py
12 files swept, ~30 user-facing string replacements.
Includes dynamic TTS schema description.
- Change default inference_base_url from dashscope-intl Anthropic-compat
endpoint to coding-intl OpenAI-compat /v1 endpoint. The old Anthropic
endpoint 404'd when used with the OpenAI SDK (which appends
/chat/completions to a /apps/anthropic base URL).
- Update curated model list: remove models unavailable on coding-intl
(qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max), add
third-party models available on the platform (glm-5, glm-4.7,
kimi-k2.5, MiniMax-M2.5).
- URL-based api_mode auto-detection still works: overriding
DASHSCOPE_BASE_URL to an /apps/anthropic endpoint automatically
switches to anthropic_messages mode.
- Update provider description and env var descriptions to reflect the
coding-intl multi-provider platform.
- Update tests to match new default URL and test the anthropic override
path instead.
Salvage of PR #1747 (original PR #1171 by @davanstrien) onto current main.
Registers Hugging Face Inference Providers (router.huggingface.co/v1) as a named provider:
- hermes chat --provider huggingface (or --provider hf)
- 18 curated open models via hermes model picker
- HF_TOKEN in ~/.hermes/.env
- OpenAI-compatible endpoint with automatic failover (Groq, Together, SambaNova, etc.)
Files: auth.py, models.py, main.py, setup.py, config.py, model_metadata.py, .env.example, 5 docs pages, 17 new tests.
Co-authored-by: Daniel van Strien <davanstrien@gmail.com>
- add managed modal and gateway-backed tool integrations
- improve CLI setup, auth, and configuration for subscriber flows
- expand tests and docs for managed tool support
Three categories of cleanup, all with zero behavioral change:
1. F-strings without placeholders (154 fixes across 29 files)
- Converted f'...' to '...' where no {expression} was present
- Heaviest files: run_agent.py (24), cli.py (20), honcho_integration/cli.py (34)
2. Simplify defensive patterns in run_agent.py
- Added explicit self._is_anthropic_oauth = False in __init__ (before
the api_mode branch that conditionally sets it)
- Replaced 7x getattr(self, '_is_anthropic_oauth', False) with direct
self._is_anthropic_oauth (attribute always initialized now)
- Added _is_openrouter_url() and _is_anthropic_url() helper methods
- Replaced 3 inline 'openrouter' in self._base_url_lower checks
3. Remove dead code in small files
- hermes_cli/claw.py: removed unused 'total' computation
- tools/fuzzy_match.py: removed unused strip_indent() function and
pattern_stripped variable
Full test suite: 6184 passed, 0 failures
E2E PTY: banner clean, tool calls work, zero garbled ANSI
resolve_provider('custom') was silently returning 'openrouter', causing
users who set provider: custom in config.yaml to unknowingly route
through OpenRouter instead of their local/custom endpoint. The display
showed 'via openrouter' even when the user explicitly chose custom.
Changes:
- auth.py: Split the conditional so 'custom' returns 'custom' as-is
- runtime_provider.py: _resolve_named_custom_runtime now returns
provider='custom' instead of 'openrouter'
- runtime_provider.py: _resolve_openrouter_runtime returns
provider='custom' when that was explicitly requested
- Add 'no-key-required' placeholder for keyless local servers
- Update existing test + add 5 new tests covering the fix
Fixes #2562
auth_type was "***" instead of "api_key" and api_key_env_vars was
("OPEN...",) instead of ("OPENCODE_GO_API_KEY",). This was introduced
in 35d948b6 when a secret redaction tool masked these values during
the Kilo Code provider commit. OpenCode Go provider was completely
broken as a result.
Add has_usable_secret() to reject empty, short (<4 char), and common
placeholder API key values (changeme, your_api_key, placeholder, etc.)
throughout the auth/runtime resolution chain.
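A sketch of the check (the placeholder list is illustrative, not the full
set):

```python
_PLACEHOLDERS = {"changeme", "change-me", "your_api_key", "your-api-key",
                 "placeholder", "none", "null", "xxx"}

def has_usable_secret(value):
    """Reject empty, too-short, and obvious placeholder API key values."""
    if not value:
        return False
    stripped = value.strip()
    if len(stripped) < 4:
        return False
    return stripped.lower() not in _PLACEHOLDERS
```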
Update list_available_providers() to use provider-specific auth status
via get_auth_status() instead of resolve_runtime_provider(), preventing
cross-provider key fallback from making providers appear available when
they aren't actually configured.
Preserve keyless custom endpoint support by checking via base URL.
Cherry-picked from PR #2121 by aashizpoudel.
MiniMax's default base URL was /v1 which caused runtime_provider to
default to chat_completions mode (OpenAI-style Authorization: Bearer
header). MiniMax rejects this with a 401 because they require the
Anthropic-style x-api-key header.
Changes:
- auth.py: Change default inference_base_url for minimax and minimax-cn
from /v1 to /anthropic
- runtime_provider.py: Auto-correct stale /v1 URLs from existing .env
files to /anthropic, and always default minimax/minimax-cn providers
to anthropic_messages mode
- Update tests to reflect new defaults, add tests for stale URL
auto-correction and explicit api_mode override
Based on PR #2100 by @devorun. Fixes #2094.
Co-authored-by: Test <test@test.com>
Builds on PR #1879's Copilot integration with critical auth improvements
modeled after opencode's implementation:
- Add hermes_cli/copilot_auth.py with:
- OAuth device code flow (copilot_device_code_login) using the same
client_id (Ov23li8tweQw6odWQebz) as opencode and Copilot CLI
- Token type validation: reject classic PATs (ghp_*) with a clear
error message explaining supported token types
- Proper env var priority: COPILOT_GITHUB_TOKEN > GH_TOKEN > GITHUB_TOKEN
(matching Copilot CLI documentation)
- copilot_request_headers() with Openai-Intent, x-initiator, and
Copilot-Vision-Request headers (matching opencode)
- Update auth.py:
- PROVIDER_REGISTRY copilot entry uses correct env var order
- _resolve_api_key_provider_secret delegates to copilot_auth for
the copilot provider with proper token validation
- Update models.py:
- copilot_default_headers() now includes Openai-Intent and x-initiator
- Update main.py:
- _model_flow_copilot offers OAuth device code login when no token
is found, with manual token entry as fallback
- Shows supported vs unsupported token types
- 22 new tests covering token validation, env var priority, header
generation, and integration with existing auth infrastructure
Add first-class GitHub Copilot and Copilot ACP provider support across
model selection, runtime provider resolution, CLI sessions, delegated
subagents, cron jobs, and the Telegram gateway.
This also normalizes Copilot model catalogs and API modes, introduces a
Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by
resolving Homebrew-installed gh binaries under launchd.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ANTHROPIC_BASE_URL collides with Claude Code and other Anthropic
tooling. Remove it from the Anthropic provider — base URL overrides
should go through config.yaml model.base_url instead.
The Alibaba/DashScope provider has its own dedicated base URL and
API key env vars which don't collide with anything.
Add Alibaba Cloud (DashScope) as a first-class inference provider
using the Anthropic-compatible endpoint. This gives access to Qwen
models (qwen3.5-plus, qwen3-max, qwen3-coder-plus, etc.) through
the same api_mode as native Anthropic.
Also add ANTHROPIC_BASE_URL env var support so users can point the
Anthropic provider at any compatible endpoint.
Changes:
- auth.py: Add alibaba ProviderConfig + ANTHROPIC_BASE_URL on anthropic
- models.py: Add alibaba to catalog, labels, aliases (dashscope/aliyun/qwen), provider order
- runtime_provider.py: Add alibaba resolution (anthropic_messages api_mode) + ANTHROPIC_BASE_URL
- model_metadata.py: Add Qwen model context lengths (128K)
- config.py: Add DASHSCOPE_API_KEY, DASHSCOPE_BASE_URL, ANTHROPIC_BASE_URL env vars
Usage:
hermes --provider alibaba --model qwen3.5-plus
# or via aliases:
hermes --provider qwen --model qwen3-max
Add Kilo Gateway (kilo.ai) as an API-key provider with OpenAI-compatible
endpoint at https://api.kilo.ai/api/gateway. Supports 500+ models from
Anthropic, OpenAI, Google, xAI, Mistral, MiniMax via a single API key.
- Register kilocode in PROVIDER_REGISTRY with aliases (kilo, kilo-code,
kilo-gateway) and KILOCODE_API_KEY / KILOCODE_BASE_URL env vars
- Add to model catalog, CLI provider menu, setup wizard, doctor checks
- Add google/gemini-3-flash-preview as default aux model
- 12 new tests covering registration, aliases, credential resolution,
runtime config
- Documentation updates (env vars, config, fallback providers)
- Fix setup test index shift from provider insertion
Inspired by PR #1473 by @amanning3390.
Co-authored-by: amanning3390 <amanning3390@users.noreply.github.com>
Add support for OpenCode Zen (pay-as-you-go, 35+ curated models) and
OpenCode Go ($10/month subscription, open models) as first-class providers.
Both are OpenAI-compatible endpoints resolved via the generic api_key
provider flow — no custom adapter needed.
Files changed:
- hermes_cli/auth.py — ProviderConfig entries + aliases
- hermes_cli/config.py — OPENCODE_ZEN/GO API key env vars
- hermes_cli/models.py — model catalogs, labels, aliases, provider order
- hermes_cli/main.py — provider labels, menu entries, model flow dispatch
- hermes_cli/setup.py — setup wizard branches (idx 10, 11)
- agent/model_metadata.py — context lengths for all OpenCode models
- agent/auxiliary_client.py — default aux models
- .env.example — documentation
Co-authored-by: DevAgarwal2 <DevAgarwal2@users.noreply.github.com>
* feat: add Vercel AI Gateway as a first-class provider
Adds AI Gateway (ai-gateway.vercel.sh) as a new inference provider
with AI_GATEWAY_API_KEY authentication, live model discovery, and
reasoning support via extra_body.reasoning.
Based on PR #1492 by jerilynzheng.
* feat: add AI Gateway to setup wizard, doctor, and fallback providers
* test: add AI Gateway to api_key_providers test suite
* feat: add AI Gateway to hermes model CLI and model metadata
Wire AI Gateway into the interactive model selection menu and add
context lengths for AI Gateway model IDs in model_metadata.py.
* feat: use claude-haiku-4.5 as AI Gateway auxiliary model
* revert: use gemini-3-flash as AI Gateway auxiliary model
* fix: move AI Gateway below established providers in selection order
---------
Co-authored-by: jerilynzheng <jerilynzheng@users.noreply.github.com>
Co-authored-by: jerilynzheng <zheng.jerilyn@gmail.com>
When typing /model deepseek-chat while on a different provider, the
model name now auto-resolves to the correct provider instead of
silently staying on the wrong one and causing API errors.
Detection priority:
1. Direct provider with credentials (e.g. DEEPSEEK_API_KEY set)
2. OpenRouter catalog match with proper slug remapping
3. Direct provider without creds (clear error beats silent failure)
Also adds DeepSeek as a first-class API-key provider — just set
DEEPSEEK_API_KEY and /model deepseek-chat routes directly.
Bare model names get remapped to proper OpenRouter slugs:
/model gpt-5.4 → openai/gpt-5.4
/model claude-opus-4.6 → anthropic/claude-opus-4.6
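A sketch of the slug remapping (catalog lookup details are assumptions):

```python
def _remap_to_openrouter_slug(model, catalog):
    """Map a bare model name onto a vendor-prefixed OpenRouter slug."""
    if "/" in model:
        return model if model in catalog else None
    matches = [slug for slug in catalog if slug.split("/", 1)[-1] == model]
    return matches[0] if len(matches) == 1 else None   # ambiguous or unknown: no remap

# _remap_to_openrouter_slug("gpt-5.4", {"openai/gpt-5.4"}) -> "openai/gpt-5.4"
```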
Salvages the concept from PR #1177 by @virtaava with credential
awareness and OpenRouter slug mapping added.
Co-authored-by: virtaava <virtaava@users.noreply.github.com>
When _update_config_for_provider() writes the new provider and base_url
to config.yaml, the gateway (which re-reads config per-message) can pick
up the change before model selection completes. This causes the old model
name (e.g. 'anthropic/claude-opus-4.6') to be sent to the new provider's
API (e.g. MiniMax), which fails.
Changes:
- _update_config_for_provider() now accepts an optional default_model
parameter. When provided and the current model.default is empty or
uses OpenRouter format (contains '/'), it sets a safe default model
for the new provider.
- All setup.py callers for direct-API providers (zai, kimi, minimax,
minimax-cn, anthropic) now pass a provider-appropriate default model.
- _setup_provider_model_selection() now validates the 'Keep current'
choice: if the current model uses OpenRouter format and wouldn't work
with the new provider, it warns and switches to the provider's first
default model instead of silently keeping the incompatible name.
Reported by a user on Home Assistant whose gateway started sending
'anthropic/claude-opus-4.6' to MiniMax's API after running hermes setup.
Fixes Anthropic OAuth/subscription authentication end-to-end:
Auth failures (401 errors):
- Add missing 'claude-code-20250219' beta header for OAuth tokens. Both
clawdbot and OpenCode include this alongside 'oauth-2025-04-20' — without
it, Anthropic's API rejects OAuth tokens with 401 authentication errors.
- Fix _fetch_anthropic_models() to use canonical beta headers from
_COMMON_BETAS + _OAUTH_ONLY_BETAS instead of hardcoding.
Token refresh:
- Add _refresh_oauth_token() — when Claude Code credentials from
~/.claude/.credentials.json are expired but have a refresh token,
automatically POST to console.anthropic.com/v1/oauth/token to get
a new access token. Uses the same client_id as Claude Code / OpenCode.
- Add _write_claude_code_credentials() — writes refreshed tokens back
to ~/.claude/.credentials.json, preserving other fields.
- resolve_anthropic_token() now auto-refreshes expired tokens before
returning None.
Config contamination:
- Anthropic's _model_flow_anthropic() no longer saves base_url to config.
Since resolve_runtime_provider() always hardcodes Anthropic's URL, the
stale base_url was contaminating other providers when users switched
without re-running 'hermes model' (e.g., Codex hitting api.anthropic.com).
- _update_config_for_provider() now pops base_url when passed empty string.
- Same fix in setup.py.
Flow/UX (hermes model command):
- CLAUDE_CODE_OAUTH_TOKEN env var now checked in credential detection
- Reauthentication option when existing credentials found
- run_oauth_setup_token() runs 'claude setup-token' as interactive
subprocess, then auto-detects saved credentials
- Clean has_creds/needs_auth flow in both main.py and setup.py
Tests (14 new):
- Beta header assertions for claude-code-20250219
- Token refresh: successful refresh with credential writeback, failed
refresh returns None, no refresh token returns None
- Credential writeback: new file creation, preserving existing fields
- Auto-refresh integration in resolve_anthropic_token()
- CLAUDE_CODE_OAUTH_TOKEN fallback, credential file auto-discovery
- run_oauth_setup_token() (5 scenarios)
Model selection now comes exclusively from config.yaml (set via
'hermes model' or 'hermes setup'). The LLM_MODEL env var is no longer
read or written anywhere in production code.
Why: env vars are per-process/per-user and would conflict in
multi-agent or multi-tenant setups. Config.yaml is file-based and
can be scoped per-user or eventually per-session.
Changes:
- cli.py: Read model from CLI_CONFIG only, not LLM_MODEL/OPENAI_MODEL
- hermes_cli/auth.py: _save_model_choice() no longer writes LLM_MODEL
to .env
- hermes_cli/setup.py: Remove 12 save_env_value('LLM_MODEL', ...)
calls from all provider setup flows
- gateway/run.py: Remove LLM_MODEL fallback (HERMES_MODEL still works
for gateway process runtime)
- cron/scheduler.py: Same
- agent/auxiliary_client.py: Remove LLM_MODEL from custom endpoint
model detection
fetch_nous_models() returned models in whatever order the API gave
them, which put sonnet near the top. Add a priority sort so users
see the best models first: opus > pro > other > sonnet.
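Roughly the following sort key (the substring matching is an assumption):

```python
def _nous_model_sort_key(model_id):
    """opus first, then pro, then everything else, sonnet last."""
    name = model_id.lower()
    if "opus" in name:
        rank = 0
    elif "pro" in name:
        rank = 1
    elif "sonnet" in name:
        rank = 3
    else:
        rank = 2
    return (rank, name)

# models.sort(key=_nous_model_sort_key)
```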
Two related bugs prevented users from reliably switching providers:
1. OPENAI_BASE_URL poisoning OpenRouter resolution: When a user with a
custom endpoint ran /model openrouter:model, _resolve_openrouter_runtime
picked up OPENAI_BASE_URL instead of the OpenRouter URL, causing model
validation to probe the wrong API and reject valid models.
Fix: skip OPENAI_BASE_URL when requested_provider is explicitly
'openrouter'.
2. Provider never saved to config: _save_model_choice() could save
config.model as a plain string. All five _model_flow_* functions then
checked isinstance(model, dict) before writing the provider — which
silently failed on strings. With no provider in config, auto-detection
would pick up stale credentials (e.g. Codex desktop app) instead of
the user's explicit choice.
Fix: _save_model_choice() now always saves as dict format. All flow
functions also normalize string->dict as a safety net before writing
provider.
Adds 4 regression tests. 2873 tests pass.
Add support for using Nous Portal via a direct API key, mirroring
how OpenRouter and other API-key providers work. This gives users a
simpler alternative to the OAuth device-code flow when they already
have a Nous API key.
Changes:
- Add 'nous-api' to PROVIDER_REGISTRY as an api_key provider
pointing to https://inference-api.nousresearch.com/v1
- Add NOUS_API_KEY and NOUS_BASE_URL to OPTIONAL_ENV_VARS
- Add NOUS_API_BASE_URL / NOUS_API_CHAT_URL to hermes_constants
- Add 'Nous Portal API key' as first option in setup wizard
- Add provider aliases (nous_api, nousapi, nous-portal-api)
- Add test for nous-api runtime provider resolution
Closes #644
Complements PR #453 by 0xbyt4. Adds isinstance(dict) guard in
run_agent.py to catch cases where json.loads returns non-dict
(e.g. null, list, string) before they reach downstream code.
Also adds 15 tests for build_tool_preview covering None args,
empty dicts, known/unknown tools, fallback keys, truncation,
and all special-cased tools (process, todo, memory, session_search).
Kimi Code (platform.kimi.ai) issues API keys prefixed sk-kimi- that require:
1. A different base URL: api.kimi.com/coding/v1 (not api.moonshot.ai/v1)
2. A User-Agent header identifying a recognized coding agent
Without this fix, sk-kimi- keys fail with 401 (wrong endpoint) or 403
('only available for Coding Agents') errors.
Changes:
- Auto-detect sk-kimi- key prefix and route to api.kimi.com/coding/v1
- Send User-Agent: KimiCLI/1.0 header for Kimi Code endpoints
- Legacy Moonshot keys (api.moonshot.ai) continue to work unchanged
- KIMI_BASE_URL env var override still takes priority over auto-detection
- Updated .env.example with correct docs and all endpoint options
- Fixed doctor.py health check for Kimi Code keys
Reference: https://github.com/MoonshotAI/kimi-cli (platforms.py)
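The routing boils down to a prefix check with the env override taking
priority; a sketch (function name is illustrative):

```python
import os

def _resolve_kimi_endpoint(api_key):
    """Pick the base URL and extra headers for a Kimi/Moonshot API key."""
    override = os.environ.get("KIMI_BASE_URL")
    if override:
        base_url = override                              # explicit override wins
    elif api_key.startswith("sk-kimi-"):
        base_url = "https://api.kimi.com/coding/v1"      # Kimi Code keys
    else:
        base_url = "https://api.moonshot.ai/v1"          # legacy Moonshot keys

    headers = {}
    if "api.kimi.com" in base_url:
        headers["User-Agent"] = "KimiCLI/1.0"            # identify as a coding agent
    return base_url, headers
```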
Detect the appropriate Z.AI endpoint for the provided API key, accommodating
different billing plans and regions. The setup process now probes the
available endpoints and updates the configuration with the working one,
reducing potential billing errors. The setup model provider function was
updated to integrate this detection logic.
Adds 4 new direct API-key providers (zai, kimi-coding, minimax, minimax-cn)
to the inference provider system. All use standard OpenAI-compatible
chat/completions endpoints with Bearer token auth.
Core changes:
- auth.py: Extended ProviderConfig with api_key_env_vars and base_url_env_var
fields. Added providers to PROVIDER_REGISTRY. Added provider aliases
(glm, z-ai, zhipu, kimi, moonshot). Added auto-detection of API-key
providers in resolve_provider(). Added resolve_api_key_provider_credentials()
and get_api_key_provider_status() helpers.
- runtime_provider.py: Added generic API-key provider branch in
resolve_runtime_provider() — any provider with auth_type='api_key'
is automatically handled.
- main.py: Added providers to hermes model menu with generic
_model_flow_api_key_provider() flow. Updated _has_any_provider_configured()
to check all provider env vars. Updated argparse --provider choices.
- setup.py: Added providers to setup wizard with API key prompts and
curated model lists.
- config.py: Added env vars (GLM_API_KEY, KIMI_API_KEY, MINIMAX_API_KEY,
etc.) to OPTIONAL_ENV_VARS.
- status.py: Added API key display and provider status section.
- doctor.py: Added connectivity checks for each provider endpoint.
- cli.py: Updated provider docstrings.
Docs: Updated README.md, .env.example, cli-config.yaml.example,
cli-commands.md, environment-variables.md, configuration.md.
Tests: 50 new tests covering registry, aliases, resolution, auto-detection,
credential resolution, and runtime provider dispatch.
Inspired by PR #33 (numman-ali) which proposed a provider registry approach.
Credit to tars90percent (PR #473) and manuelschipper (PR #420) for related
provider improvements merged earlier in this changeset.
fcntl is not available on Windows. This adds msvcrt.locking as a
fallback for cross-process advisory locking on Windows.
msvcrt.locking is not reentrant within the same thread, unlike fcntl.flock.
This matters because resolve_codex_runtime_credentials holds the lock and
then calls _save_codex_tokens, which tries to acquire it again. Without
reentrancy tracking, this deadlocks on Windows after a 15-second timeout.
Uses threading.local() to track lock depth per thread, allowing nested
acquisitions to pass through without re-acquiring the underlying lock.
Also handles msvcrt-specific requirements: file must be opened in r+ mode
(not a+), must have at least 1 byte of content, and the file pointer must
be at position 0 before locking.
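A sketch of the per-thread depth tracking, with the platform details as
described above (file creation and error handling trimmed; names are
illustrative):

```python
import threading

try:
    import msvcrt            # Windows
    _USE_MSVCRT = True
except ImportError:
    import fcntl             # POSIX
    _USE_MSVCRT = False

_local = threading.local()   # per-thread lock depth + file handle

class _ReentrantFileLock:
    def __init__(self, path):
        self._path = path

    def __enter__(self):
        depth = getattr(_local, "depth", 0)
        if depth == 0:
            open(self._path, "a").close()     # ensure the file exists
            fh = open(self._path, "r+")       # msvcrt needs r+, not a+
            if not fh.read(1):                # ...and at least one byte to lock
                fh.write("\0")
                fh.flush()
            fh.seek(0)                        # pointer must be at offset 0
            if _USE_MSVCRT:
                msvcrt.locking(fh.fileno(), msvcrt.LK_LOCK, 1)
            else:
                fcntl.flock(fh, fcntl.LOCK_EX)
            _local.fh = fh
        _local.depth = depth + 1              # nested acquire: just bump the counter
        return self

    def __exit__(self, *exc):
        _local.depth -= 1
        if _local.depth == 0:
            fh = _local.fh
            if _USE_MSVCRT:
                fh.seek(0)
                msvcrt.locking(fh.fileno(), msvcrt.LK_UNLCK, 1)
            else:
                fcntl.flock(fh, fcntl.LOCK_UN)
            fh.close()
```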