hermes-agent

Author	SHA1	Message	Date
Teknium	3576f44a57	feat: add Vercel AI Gateway provider (#1628 ) * feat: add Vercel AI Gateway as a first-class provider Adds AI Gateway (ai-gateway.vercel.sh) as a new inference provider with AI_GATEWAY_API_KEY authentication, live model discovery, and reasoning support via extra_body.reasoning. Based on PR #1492 by jerilynzheng. * feat: add AI Gateway to setup wizard, doctor, and fallback providers * test: add AI Gateway to api_key_providers test suite * feat: add AI Gateway to hermes model CLI and model metadata Wire AI Gateway into the interactive model selection menu and add context lengths for AI Gateway model IDs in model_metadata.py. * feat: use claude-haiku-4.5 as AI Gateway auxiliary model * revert: use gemini-3-flash as AI Gateway auxiliary model * fix: move AI Gateway below established providers in selection order --------- Co-authored-by: jerilynzheng <jerilynzheng@users.noreply.github.com> Co-authored-by: jerilynzheng <zheng.jerilyn@gmail.com>	2026-03-17 00:12:16 -07:00
teknium1	62abb453d3	Merge origin/main into hermes/hermes-daa73839	2026-03-14 23:44:47 -07:00
teknium1	735a6e7651	fix: convert anthropic image content blocks	2026-03-14 23:41:20 -07:00
teknium1	1337c9efd8	test: resolve auxiliary client merge conflict	2026-03-14 22:15:16 -07:00
teknium1	85ef09e520	Merge origin/main into hermes/hermes-dd253d81	2026-03-14 21:16:29 -07:00
teknium1	db362dbd4c	feat: add native Anthropic auxiliary vision	2026-03-14 21:14:20 -07:00
teknium1	9f6bccd76a	feat: add direct endpoint overrides for auxiliary and delegation Add base_url/api_key overrides for auxiliary tasks and delegation so users can route those flows straight to a custom OpenAI-compatible endpoint without having to rely on provider=main or named custom providers. Also clear gateway session env vars in test isolation so the full suite stays deterministic when run from a messaging-backed agent session.	2026-03-14 21:11:37 -07:00
Teknium	a86b487349	Merge pull request #1373 from NousResearch/hermes/hermes-781f9235 fix: restore config-saved custom endpoint resolution	2026-03-14 21:06:41 -07:00
teknium1	53d1043a50	fix: restore config-saved custom endpoint resolution	2026-03-14 20:58:12 -07:00
teknium1	dc11b86e4b	refactor: unify vision backend gating	2026-03-14 20:22:13 -07:00
0xIbra	437ec17125	fix(cli): respect HERMES_HOME in all remaining hardcoded ~/.hermes paths Several files resolved paths via Path.home() / ".hermes" or os.path.expanduser("~/.hermes/..."), bypassing the HERMES_HOME environment variable. This broke isolation when running multiple Hermes instances with distinct HERMES_HOME directories. Replace all hardcoded paths with calls to get_hermes_home() from hermes_cli.config, consistent with the rest of the codebase. Files fixed: - tools/process_registry.py (processes.json) - gateway/pairing.py (pairing/) - gateway/sticker_cache.py (sticker_cache.json) - gateway/channel_directory.py (channel_directory.json, sessions.json) - gateway/config.py (gateway.json, config.yaml, sessions_dir) - gateway/mirror.py (sessions/) - gateway/hooks.py (hooks/) - gateway/platforms/base.py (image_cache/, audio_cache/, document_cache/) - gateway/platforms/whatsapp.py (whatsapp/session) - gateway/delivery.py (cron/output) - agent/auxiliary_client.py (auth.json) - agent/prompt_builder.py (SOUL.md) - cli.py (config.yaml, images/, pastes/, history) - run_agent.py (logs/) - tools/environments/base.py (sandboxes/) - tools/environments/modal.py (modal_snapshots.json) - tools/environments/singularity.py (singularity_snapshots.json) - tools/tts_tool.py (audio_cache) - hermes_cli/status.py (cron/jobs.json, sessions.json) - hermes_cli/gateway.py (logs/, whatsapp session) - hermes_cli/main.py (whatsapp/session) Tests updated to use HERMES_HOME env var instead of patching Path.home(). Closes #892 (cherry picked from commit 78ac1bba43b8b74a934c6172f2c29bb4d03164b9)	2026-03-13 21:32:53 -07:00
Teknium	11b577671b	fix: auxiliary client uses main model for custom/local endpoints instead of gpt-4o-mini (#1189 ) * fix: prevent model/provider mismatch when switching providers during active gateway When _update_config_for_provider() writes the new provider and base_url to config.yaml, the gateway (which re-reads config per-message) can pick up the change before model selection completes. This causes the old model name (e.g. 'anthropic/claude-opus-4.6') to be sent to the new provider's API (e.g. MiniMax), which fails. Changes: - _update_config_for_provider() now accepts an optional default_model parameter. When provided and the current model.default is empty or uses OpenRouter format (contains '/'), it sets a safe default model for the new provider. - All setup.py callers for direct-API providers (zai, kimi, minimax, minimax-cn, anthropic) now pass a provider-appropriate default model. - _setup_provider_model_selection() now validates the 'Keep current' choice: if the current model uses OpenRouter format and wouldn't work with the new provider, it warns and switches to the provider's first default model instead of silently keeping the incompatible name. Reported by a user on Home Assistant whose gateway started sending 'anthropic/claude-opus-4.6' to MiniMax's API after running hermes setup. * fix: auxiliary client uses main model for custom/local endpoints instead of gpt-4o-mini When a user runs a local server (e.g. Qwen3.5-9B via OPENAI_BASE_URL), the auxiliary client (context compression, vision, session search) would send requests for 'gpt-4o-mini' or 'google/gemini-3-flash-preview' to the local server, which only serves one model — causing 404 errors mid-task. Changes: - _try_custom_endpoint() now reads the user's configured main model via _read_main_model() (checks OPENAI_MODEL → HERMES_MODEL → LLM_MODEL → config.yaml model.default) instead of hardcoding 'gpt-4o-mini'. - resolve_provider_client() auto mode now detects when an OpenRouter- formatted model override (containing '/') would be sent to a non- OpenRouter provider (like a local server) and drops it in favor of the provider's default model. - Test isolation fixes: properly clear env vars in 'nothing available' tests to prevent host environment leakage.	2026-03-13 10:02:16 -07:00
teknium1	e976879cf2	merge: resolve conflicts with main (URL update to hermes-agent.nousresearch.com)	2026-03-12 17:49:26 -07:00
teknium1	4068f20ce9	fix(anthropic): deep scan fixes — auth, retries, edge cases Fixes from comprehensive code review and cross-referencing with clawdbot/OpenCode implementations: CRITICAL: - Add one-shot guard (anthropic_auth_retry_attempted) to prevent infinite 401 retry loops when credentials keep changing - Fix _is_oauth_token(): managed keys from ~/.claude.json are NOT regular API keys (don't start with sk-ant-api). Inverted the logic: only sk-ant-api* is treated as API key auth, everything else uses Bearer auth + oauth beta headers HIGH: - Wrap json.loads(args) in try/except in message conversion — malformed tool_call arguments no longer crash the entire conversation - Raise AuthError in runtime_provider when no Anthropic token found (was silently passing empty string, causing confusing API errors) - Remove broken _try_anthropic() from auxiliary vision chain — the centralized router creates an OpenAI client for api_key providers which doesn't work with Anthropic's Messages API MEDIUM: - Handle empty assistant message content — Anthropic rejects empty content blocks, now inserts '(empty)' placeholder - Fix setup.py existing_key logic — set to 'KEEP' sentinel instead of None to prevent falling through to the auth choice prompt - Add debug logging to _fetch_anthropic_models on failure Tests: 43 adapter tests (2 new for token detection), 3197 total passed	2026-03-12 17:14:22 -07:00
Teknium	39f3c0aeb0	fix: use hermes-agent.nousresearch.com as OpenRouter HTTP-Referer * fix: stop rejecting unlisted models + auto-detect from /models endpoint validate_requested_model() now accepts models not in the provider's API listing with a warning instead of blocking. Removes hardcoded catalog fallback for validation — if API is unreachable, accepts with a warning. Model selection flows (setup + /model command) now probe the provider's /models endpoint to get the real available models. Falls back to hardcoded defaults with a clear warning when auto-detection fails: 'Could not auto-detect models — use Custom model if yours isn't listed.' Z.AI setup no longer excludes GLM-5 on coding plans. * fix: use hermes-agent.nousresearch.com as HTTP-Referer for OpenRouter OpenRouter scrapes the favicon/logo from the HTTP-Referer URL for app rankings. We were sending the GitHub repo URL, which gives us a generic GitHub logo. Changed to the proper website URL so our actual branding shows up in rankings. Changed in run_agent.py (main agent client) and auxiliary_client.py (vision/summarization clients).	2026-03-12 16:20:22 -07:00
teknium1	7086fde37e	fix(anthropic): revert inline vision, add hermes model flow, wire vision aux Feedback fixes: 1. Revert _convert_vision_content — vision is handled by the vision_analyze tool, not by converting image blocks inline in conversation messages. Removed the function and its tests. 2. Add Anthropic to 'hermes model' (cmd_model in main.py): - Added to provider_labels dict - Added to providers selection list - Added _model_flow_anthropic() with Claude Code credential auto-detection, API key prompting, and model selection from catalog. 3. Wire up Anthropic as a vision-capable auxiliary provider: - Added _try_anthropic() to auxiliary_client.py using claude-sonnet-4 as the vision model (Claude natively supports multimodal) - Added to the get_vision_auxiliary_client() auto-detection chain (after OpenRouter/Nous, before Codex/custom) Cache tracking note: the Anthropic cache metrics branch in run_agent.py (cache_read_input_tokens / cache_creation_input_tokens) is in the correct place — it's response-level parsing, same location as the existing OpenRouter cache tracking. auxiliary_client.py has no cache tracking.	2026-03-12 16:09:04 -07:00
teknium1	5e12442b4b	feat: native Anthropic provider with Claude Code credential auto-discovery Add Anthropic as a first-class inference provider, bypassing OpenRouter for direct API access. Uses the native Anthropic SDK with a full format adapter (same pattern as the codex_responses api_mode). ## Auth (three methods, priority order) 1. ANTHROPIC_API_KEY env var (regular API key, sk-ant-api-) 2. ANTHROPIC_TOKEN / CLAUDE_CODE_OAUTH_TOKEN env var (setup-token, sk-ant-oat-) 3. Auto-discovery from ~/.claude/.credentials.json (Claude Code subscription) - Reads Claude Code's OAuth credentials - Checks token expiry with 60s buffer - Setup tokens use Bearer auth + anthropic-beta: oauth-2025-04-20 header - Regular API keys use standard x-api-key header ## Changes by file ### New files - agent/anthropic_adapter.py — Client builder, message/tool/response format conversion, Claude Code credential reader, token resolver. Handles system prompt extraction, tool_use/tool_result blocks, thinking/reasoning, orphaned tool_use cleanup, cache_control. - tests/test_anthropic_adapter.py — 36 tests covering all adapter logic ### Modified files - pyproject.toml — Add anthropic>=0.39.0 dependency - hermes_cli/auth.py — Add 'anthropic' to PROVIDER_REGISTRY with three env vars, plus 'claude'/'claude-code' aliases - hermes_cli/models.py — Add model catalog, labels, aliases, provider order - hermes_cli/main.py — Add 'anthropic' to --provider CLI choices - hermes_cli/runtime_provider.py — Add Anthropic branch returning api_mode='anthropic_messages' (before generic api_key fallthrough) - hermes_cli/setup.py — Add Anthropic setup wizard with Claude Code credential auto-discovery, model selection, OpenRouter tools prompt - agent/auxiliary_client.py — Add claude-haiku-4-5 as aux model - agent/model_metadata.py — Add bare Claude model context lengths - run_agent.py — Add anthropic_messages api_mode: * Client init (Anthropic SDK instead of OpenAI) * API call dispatch (_anthropic_client.messages.create) * Response validation (content blocks) * finish_reason mapping (stop_reason -> finish_reason) * Token usage (input_tokens/output_tokens) * Response normalization (normalize_anthropic_response) * Client interrupt/rebuild * Prompt caching auto-enabled for native Anthropic - tests/test_run_agent.py — Update test_anthropic_base_url_accepted to expect native routing, add test_prompt_caching_native_anthropic	2026-03-12 15:47:45 -07:00
teknium1	9302690e1b	refactor: remove LLM_MODEL env var dependency — config.yaml is sole source of truth Model selection now comes exclusively from config.yaml (set via 'hermes model' or 'hermes setup'). The LLM_MODEL env var is no longer read or written anywhere in production code. Why: env vars are per-process/per-user and would conflict in multi-agent or multi-tenant setups. Config.yaml is file-based and can be scoped per-user or eventually per-session. Changes: - cli.py: Read model from CLI_CONFIG only, not LLM_MODEL/OPENAI_MODEL - hermes_cli/auth.py: _save_model_choice() no longer writes LLM_MODEL to .env - hermes_cli/setup.py: Remove 12 save_env_value('LLM_MODEL', ...) calls from all provider setup flows - gateway/run.py: Remove LLM_MODEL fallback (HERMES_MODEL still works for gateway process runtime) - cron/scheduler.py: Same - agent/auxiliary_client.py: Remove LLM_MODEL from custom endpoint model detection	2026-03-11 22:04:42 -07:00
teknium1	a29801286f	refactor: route main agent client + fallback through centralized router Phase 2 of the provider router migration — route the main agent's client construction and fallback activation through resolve_provider_client() instead of duplicated ad-hoc logic. run_agent.py: - __init__: When no explicit api_key/base_url, use resolve_provider_client(provider, raw_codex=True) for client construction. Explicit creds (from CLI/gateway runtime provider) still construct directly. - _try_activate_fallback: Replace _resolve_fallback_credentials and its duplicated _FALLBACK_API_KEY_PROVIDERS / _FALLBACK_OAUTH_PROVIDERS dicts with a single resolve_provider_client() call. The router handles all provider types (API-key, OAuth, Codex) centrally. - Remove _resolve_fallback_credentials method and both fallback dicts. agent/auxiliary_client.py: - Add raw_codex parameter to resolve_provider_client(). When True, returns the raw OpenAI client for Codex providers instead of wrapping in CodexAuxiliaryClient. The main agent needs this for direct responses.stream() access. 3251 passed, 2 pre-existing unrelated failures.	2026-03-11 21:38:29 -07:00
teknium1	0aa31cd3cb	feat: call_llm/async_call_llm + config slots + migrate all consumers Add centralized call_llm() and async_call_llm() functions that own the full LLM request lifecycle: 1. Resolve provider + model from task config or explicit args 2. Get or create a cached client for that provider 3. Format request args (max_tokens handling, provider extra_body) 4. Make the API call with max_tokens/max_completion_tokens retry 5. Return the response Config: expanded auxiliary section with provider:model slots for all tasks (compression, vision, web_extract, session_search, skills_hub, mcp, flush_memories). Config version bumped to 7. Migrated all auxiliary consumers: - context_compressor.py: uses call_llm(task='compression') - vision_tools.py: uses async_call_llm(task='vision') - web_tools.py: uses async_call_llm(task='web_extract') - session_search_tool.py: uses async_call_llm(task='session_search') - browser_tool.py: uses call_llm(task='vision'/'web_extract') - mcp_tool.py: uses call_llm(task='mcp') - skills_guard.py: uses call_llm(provider='openrouter') - run_agent.py flush_memories: uses call_llm(task='flush_memories') Tests updated for context_compressor and MCP tool. Some test mocks still need updating (15 remaining failures from mock pattern changes, 2 pre-existing).	2026-03-11 20:52:19 -07:00
teknium1	013cc4d2fc	chore: remove nous-api provider (API key path) Nous Portal only supports OAuth authentication. Remove the 'nous-api' provider which allowed direct API key access via NOUS_API_KEY env var. Removed from: - hermes_cli/auth.py: PROVIDER_REGISTRY entry + aliases - hermes_cli/config.py: OPTIONAL_ENV_VARS entry - hermes_cli/setup.py: setup wizard option + model selection handler (reindexed remaining provider choices) - agent/auxiliary_client.py: docstring references - tests/test_runtime_provider_resolution.py: nous-api test - tests/integration/test_web_tools.py: renamed dict key	2026-03-11 20:14:44 -07:00
teknium1	07f09ecd83	refactor: route ad-hoc LLM consumers through centralized provider router Route all remaining ad-hoc auxiliary LLM call sites through resolve_provider_client() so auth, headers, and API format (Chat Completions vs Responses API) are handled consistently in one place. Files changed: - tools/openrouter_client.py: Replace manual AsyncOpenAI construction with resolve_provider_client('openrouter', async_mode=True). The shared client module now delegates entirely to the router. - tools/skills_guard.py: Replace inline OpenAI client construction (hardcoded OpenRouter base_url, manual api_key lookup, manual headers) with resolve_provider_client('openrouter'). Remove unused OPENROUTER_BASE_URL import. - trajectory_compressor.py: Add _detect_provider() to map config base_url to a provider name, then route through resolve_provider_client. Falls back to raw construction for unrecognized custom endpoints. - mini_swe_runner.py: Route default case (no explicit api_key/base_url) through resolve_provider_client('openrouter') with auto-detection fallback. Preserves direct construction when explicit creds are passed via CLI args. - agent/auxiliary_client.py: Fix stale module docstring — vision auto mode now correctly documents that Codex and custom endpoints are tried (not skipped).	2026-03-11 20:02:36 -07:00
teknium1	8805e705a7	feat: centralized provider router + fix Codex vision bypass + vision error handling Three interconnected fixes for auxiliary client infrastructure: 1. CENTRALIZED PROVIDER ROUTER (auxiliary_client.py) Add resolve_provider_client(provider, model, async_mode) — a single entry point for creating properly configured clients. Given a provider name and optional model, it handles auth lookup (env vars, OAuth tokens, auth.json), base URL resolution, provider-specific headers, and API format differences (Chat Completions vs Responses API for Codex). All auxiliary consumers should route through this instead of ad-hoc env var lookups. Refactored get_text_auxiliary_client, get_async_text_auxiliary_client, and get_vision_auxiliary_client to use the router internally. 2. FIX CODEX VISION BYPASS (vision_tools.py) vision_tools.py was constructing a raw AsyncOpenAI client from the sync vision client's api_key/base_url, completely bypassing the Codex Responses API adapter. When the vision provider resolved to Codex, the raw client would hit chatgpt.com/backend-api/codex with chat.completions.create() which only supports the Responses API. Fix: Added get_async_vision_auxiliary_client() which properly wraps Codex into AsyncCodexAuxiliaryClient. vision_tools.py now uses this instead of manual client construction. 3. FIX COMPRESSION FALLBACK + VISION ERROR HANDLING - context_compressor.py: Removed _get_fallback_client() which blindly looked for OPENAI_API_KEY + OPENAI_BASE_URL (fails for Codex OAuth, API-key providers, users without OPENAI_BASE_URL set). Replaced with fallback loop through resolve_provider_client() for each known provider, with same-provider dedup. - vision_tools.py: Added error detection for vision capability failures. Returns clear message to the model when the configured model doesn't support vision, instead of a generic error. Addresses #886	2026-03-11 19:46:47 -07:00
teknium1	ef5d811aba	fix: vision auto-detection now falls back to custom/local endpoints Vision auto-mode previously only tried OpenRouter, Nous, and Codex for multimodal — deliberately skipping custom endpoints with the assumption they 'may not handle vision input.' This caused silent failures for users running local multimodal models (Qwen-VL, LLaVA, Pixtral, etc.) without any cloud API keys. Now custom endpoints are tried as a last resort in auto mode. If the model doesn't support vision, the API call fails gracefully — but users with local vision models no longer need to manually set auxiliary.vision.provider: main in config.yaml. Reported by @Spadav and @kotyKD.	2026-03-09 15:36:19 -07:00
teknium1	2d1a1c1c47	refactor: remove redundant 'openai' auxiliary provider, clean up docs The 'openai' provider was redundant — using OPENAI_BASE_URL + OPENAI_API_KEY with provider: 'main' already covers direct OpenAI API. Provider options are now: auto, openrouter, nous, codex, main. - Removed _try_openai(), _OPENAI_AUX_MODEL, _OPENAI_BASE_URL - Replaced openai tests with codex provider tests - Updated all docs to remove 'openai' option and clarify 'main' - 'main' description now explicitly mentions it works with OpenAI API, local models, and any OpenAI-compatible endpoint Tests: 2467 passed.	2026-03-08 18:50:26 -07:00
teknium1	71e81728ac	feat: Codex OAuth vision support + multimodal content adapter The Codex Responses API (chatgpt.com/backend-api/codex) supports vision via gpt-5.3-codex. This was verified with real API calls using image analysis. Changes to _CodexCompletionsAdapter: - Added _convert_content_for_responses() to translate chat.completions multimodal format to Responses API format: - {type: 'text'} → {type: 'input_text'} - {type: 'image_url', image_url: {url: '...'}} → {type: 'input_image', image_url: '...'} - Fixed: removed 'stream' from resp_kwargs (responses.stream() handles it) - Fixed: removed max_output_tokens and temperature (Codex endpoint rejects them) Provider changes: - Added 'codex' as explicit auxiliary provider option - Vision auto-fallback now includes Codex (OpenRouter → Nous → Codex) since gpt-5.3-codex supports multimodal input - Updated docs with Codex OAuth examples Tested with real Codex OAuth token + ~/.hermes/image2.png — confirmed working end-to-end through the full adapter pipeline. Tests: 2459 passed.	2026-03-08 18:44:33 -07:00
teknium1	ae4a674c84	feat: add 'openai' as auxiliary provider option Users can now set provider: "openai" for auxiliary tasks (vision, web extract, compression) to use OpenAI's API directly with their OPENAI_API_KEY. This hits api.openai.com/v1 with gpt-4o-mini as the default model — supports vision since GPT-4o handles image input. Provider options are now: auto, openrouter, nous, openai, main. Changes: - agent/auxiliary_client.py: added _try_openai(), "openai" case in _resolve_forced_provider(), updated auxiliary_max_tokens_param() to use max_completion_tokens for OpenAI - Updated docs: cli-config.yaml.example, AGENTS.md, and user-facing configuration.md with Common Setups section showing OpenAI, OpenRouter, and local model examples - 3 new tests for OpenAI provider resolution Tests: 2459 passed (was 2429).	2026-03-08 18:25:30 -07:00
teknium1	5ae0b731d0	fix: harden auxiliary model config — gateway bridge, vision safety, tests Improvements on top of PR #606 (auxiliary model configuration): 1. Gateway bridge: Added auxiliary.* and compression.summary_provider config bridging to gateway/run.py so config.yaml settings work from messaging platforms (not just CLI). Matches the pattern in cli.py. 2. Vision auto-fallback safety: In auto mode, vision now only tries OpenRouter + Nous Portal (known multimodal-capable providers). Custom endpoints, Codex, and API-key providers are skipped to avoid confusing errors from providers that don't support vision input. Explicit provider override (AUXILIARY_VISION_PROVIDER=main) still allows using any provider. 3. Comprehensive tests (46 new): - _get_auxiliary_provider env var resolution (8 tests) - _resolve_forced_provider with all provider types (8 tests) - Per-task provider routing integration (4 tests) - Vision auto-fallback safety (7 tests) - Config bridging logic (11 tests) - Gateway/CLI bridge parity (2 tests) - Vision model override via env var (2 tests) - DEFAULT_CONFIG shape validation (4 tests) 4. Docs: Added auxiliary_client.py to AGENTS.md project structure. Updated module docstring with separate text/vision resolution chains. Tests: 2429 passed (was 2383).	2026-03-08 18:06:47 -07:00
teknium1	d9f373654b	feat: enhance auxiliary model configuration and environment variable handling - Added support for auxiliary model overrides in the configuration, allowing users to specify providers and models for vision and web extraction tasks. - Updated the CLI configuration example to include new auxiliary model settings. - Enhanced the environment variable mapping in the CLI to accommodate auxiliary model configurations. - Improved the resolution logic for auxiliary clients to support task-specific provider overrides. - Updated relevant documentation and comments for clarity on the new features and their usage.	2026-03-08 18:06:47 -07:00
Christo Mitov	4447e7d71a	fix: add Kimi Code API support (api.kimi.com/coding/v1) Kimi Code (platform.kimi.ai) issues API keys prefixed sk-kimi- that require: 1. A different base URL: api.kimi.com/coding/v1 (not api.moonshot.ai/v1) 2. A User-Agent header identifying a recognized coding agent Without this fix, sk-kimi- keys fail with 401 (wrong endpoint) or 403 ('only available for Coding Agents') errors. Changes: - Auto-detect sk-kimi- key prefix and route to api.kimi.com/coding/v1 - Send User-Agent: KimiCLI/1.0 header for Kimi Code endpoints - Legacy Moonshot keys (api.moonshot.ai) continue to work unchanged - KIMI_BASE_URL env var override still takes priority over auto-detection - Updated .env.example with correct docs and all endpoint options - Fixed doctor.py health check for Kimi Code keys Reference: https://github.com/MoonshotAI/kimi-cli (platforms.py)	2026-03-07 21:00:12 -05:00
teknium1	e2821effb5	feat: add direct API-key providers as auxiliary client fallbacks When the user only has a z.ai/Kimi/MiniMax API key (no OpenRouter key), auxiliary tasks (context compression, web summarization, session search) now fall back to the configured direct provider instead of returning None. Resolution chain: OpenRouter -> Nous -> Custom endpoint -> Codex OAuth -> direct API-key providers -> None. Uses cheap/fast models for auxiliary tasks: - zai: glm-4.5-flash - kimi-coding: kimi-k2-turbo-preview - minimax/minimax-cn: MiniMax-M2.5-highspeed Vision auxiliary intentionally NOT modified — vision needs multimodal models (Gemini) that these providers don't serve.	2026-03-06 19:08:54 -08:00
teknium1	33ab5cec82	fix: handle None message content across codebase (fixes #276 ) The OpenAI API returns content: null on assistant messages with tool calls. msg.get('content', '') returns None when the key exists with value None, causing TypeError on len(), string concatenation, and .strip() in downstream code paths. Fixed 4 locations that process conversation messages: - agent/auxiliary_client.py:84 — None passed to API calls - cli.py:1288 — crash on content[:200] and len(content) - run_agent.py:3444 — crash on None.strip() - honcho_integration/session.py:445 — 'None' rendered in transcript 13 other instances were verified safe (already protected, only process user/tool messages, or use the safe pattern). Pattern: msg.get('content', '') → msg.get('content') or '' Fixes #276	2026-03-02 02:23:53 -08:00
teknium1	5e598a588f	refactor(auth): transition Codex OAuth tokens to Hermes auth store Updated the authentication mechanism to store Codex OAuth tokens in the Hermes auth store located at ~/.hermes/auth.json instead of the previous ~/.codex/auth.json. This change includes refactoring related functions for reading and saving tokens, ensuring better management of authentication states and preventing conflicts between different applications. Adjusted tests to reflect the new storage structure and improved error handling for missing or malformed tokens.	2026-03-01 19:59:24 -08:00
teknium1	500f0eab4a	refactor(cli): Finalize OpenAI Codex Integration with OAuth - Enhanced Codex model discovery by fetching available models from the API, with fallback to local cache and defaults. - Updated the context compressor's summary target tokens to 2500 for improved performance. - Added external credential detection for Codex CLI to streamline authentication. - Refactored various components to ensure consistent handling of authentication and model selection across the application.	2026-02-28 21:47:51 -08:00
teknium1	2205b22409	fix(headers): update X-OpenRouter-Categories to include 'productivity'	2026-02-28 10:38:49 -08:00
teknium1	58fce0a37b	feat(api): implement dynamic max tokens handling for various providers - Added _max_tokens_param method in AIAgent to return appropriate max tokens parameter based on the provider (OpenAI vs. others). - Updated API calls in AIAgent to utilize the new max tokens handling. - Introduced auxiliary_max_tokens_param function in auxiliary_client for consistent max tokens management across auxiliary clients. - Refactored multiple tools to use auxiliary_max_tokens_param for improved compatibility with different models and providers.	2026-02-26 20:23:56 -08:00
teknium1	7a3656aea2	refactor: integrate Nous Portal support in auxiliary client - Added functionality to include product attribution tags for Nous Portal in auxiliary API calls. - Introduced a mechanism to determine if the auxiliary client is backed by Nous Portal, affecting the extra body of requests. - Updated various tools to utilize the new extra body configuration for enhanced tracking in API calls.	2026-02-25 18:39:36 -08:00
teknium1	9a858b8d67	add identifier for openrouter calls	2026-02-25 16:34:47 -08:00
teknium1	ededaaa874	Hermes Agent UX Improvements	2026-02-22 02:16:11 -08:00

39 Commits