hermes-agent

Author	SHA1	Message	Date
teknium1	9302690e1b	refactor: remove LLM_MODEL env var dependency — config.yaml is sole source of truth Model selection now comes exclusively from config.yaml (set via 'hermes model' or 'hermes setup'). The LLM_MODEL env var is no longer read or written anywhere in production code. Why: env vars are per-process/per-user and would conflict in multi-agent or multi-tenant setups. Config.yaml is file-based and can be scoped per-user or eventually per-session. Changes: - cli.py: Read model from CLI_CONFIG only, not LLM_MODEL/OPENAI_MODEL - hermes_cli/auth.py: _save_model_choice() no longer writes LLM_MODEL to .env - hermes_cli/setup.py: Remove 12 save_env_value('LLM_MODEL', ...) calls from all provider setup flows - gateway/run.py: Remove LLM_MODEL fallback (HERMES_MODEL still works for gateway process runtime) - cron/scheduler.py: Same - agent/auxiliary_client.py: Remove LLM_MODEL from custom endpoint model detection	2026-03-11 22:04:42 -07:00
teknium1	a29801286f	refactor: route main agent client + fallback through centralized router Phase 2 of the provider router migration — route the main agent's client construction and fallback activation through resolve_provider_client() instead of duplicated ad-hoc logic. run_agent.py: - __init__: When no explicit api_key/base_url, use resolve_provider_client(provider, raw_codex=True) for client construction. Explicit creds (from CLI/gateway runtime provider) still construct directly. - _try_activate_fallback: Replace _resolve_fallback_credentials and its duplicated _FALLBACK_API_KEY_PROVIDERS / _FALLBACK_OAUTH_PROVIDERS dicts with a single resolve_provider_client() call. The router handles all provider types (API-key, OAuth, Codex) centrally. - Remove _resolve_fallback_credentials method and both fallback dicts. agent/auxiliary_client.py: - Add raw_codex parameter to resolve_provider_client(). When True, returns the raw OpenAI client for Codex providers instead of wrapping in CodexAuxiliaryClient. The main agent needs this for direct responses.stream() access. 3251 passed, 2 pre-existing unrelated failures.	2026-03-11 21:38:29 -07:00
teknium1	0aa31cd3cb	feat: call_llm/async_call_llm + config slots + migrate all consumers Add centralized call_llm() and async_call_llm() functions that own the full LLM request lifecycle: 1. Resolve provider + model from task config or explicit args 2. Get or create a cached client for that provider 3. Format request args (max_tokens handling, provider extra_body) 4. Make the API call with max_tokens/max_completion_tokens retry 5. Return the response Config: expanded auxiliary section with provider:model slots for all tasks (compression, vision, web_extract, session_search, skills_hub, mcp, flush_memories). Config version bumped to 7. Migrated all auxiliary consumers: - context_compressor.py: uses call_llm(task='compression') - vision_tools.py: uses async_call_llm(task='vision') - web_tools.py: uses async_call_llm(task='web_extract') - session_search_tool.py: uses async_call_llm(task='session_search') - browser_tool.py: uses call_llm(task='vision'/'web_extract') - mcp_tool.py: uses call_llm(task='mcp') - skills_guard.py: uses call_llm(provider='openrouter') - run_agent.py flush_memories: uses call_llm(task='flush_memories') Tests updated for context_compressor and MCP tool. Some test mocks still need updating (15 remaining failures from mock pattern changes, 2 pre-existing).	2026-03-11 20:52:19 -07:00
teknium1	013cc4d2fc	chore: remove nous-api provider (API key path) Nous Portal only supports OAuth authentication. Remove the 'nous-api' provider which allowed direct API key access via NOUS_API_KEY env var. Removed from: - hermes_cli/auth.py: PROVIDER_REGISTRY entry + aliases - hermes_cli/config.py: OPTIONAL_ENV_VARS entry - hermes_cli/setup.py: setup wizard option + model selection handler (reindexed remaining provider choices) - agent/auxiliary_client.py: docstring references - tests/test_runtime_provider_resolution.py: nous-api test - tests/integration/test_web_tools.py: renamed dict key	2026-03-11 20:14:44 -07:00
teknium1	07f09ecd83	refactor: route ad-hoc LLM consumers through centralized provider router Route all remaining ad-hoc auxiliary LLM call sites through resolve_provider_client() so auth, headers, and API format (Chat Completions vs Responses API) are handled consistently in one place. Files changed: - tools/openrouter_client.py: Replace manual AsyncOpenAI construction with resolve_provider_client('openrouter', async_mode=True). The shared client module now delegates entirely to the router. - tools/skills_guard.py: Replace inline OpenAI client construction (hardcoded OpenRouter base_url, manual api_key lookup, manual headers) with resolve_provider_client('openrouter'). Remove unused OPENROUTER_BASE_URL import. - trajectory_compressor.py: Add _detect_provider() to map config base_url to a provider name, then route through resolve_provider_client. Falls back to raw construction for unrecognized custom endpoints. - mini_swe_runner.py: Route default case (no explicit api_key/base_url) through resolve_provider_client('openrouter') with auto-detection fallback. Preserves direct construction when explicit creds are passed via CLI args. - agent/auxiliary_client.py: Fix stale module docstring — vision auto mode now correctly documents that Codex and custom endpoints are tried (not skipped).	2026-03-11 20:02:36 -07:00
teknium1	8805e705a7	feat: centralized provider router + fix Codex vision bypass + vision error handling Three interconnected fixes for auxiliary client infrastructure: 1. CENTRALIZED PROVIDER ROUTER (auxiliary_client.py) Add resolve_provider_client(provider, model, async_mode) — a single entry point for creating properly configured clients. Given a provider name and optional model, it handles auth lookup (env vars, OAuth tokens, auth.json), base URL resolution, provider-specific headers, and API format differences (Chat Completions vs Responses API for Codex). All auxiliary consumers should route through this instead of ad-hoc env var lookups. Refactored get_text_auxiliary_client, get_async_text_auxiliary_client, and get_vision_auxiliary_client to use the router internally. 2. FIX CODEX VISION BYPASS (vision_tools.py) vision_tools.py was constructing a raw AsyncOpenAI client from the sync vision client's api_key/base_url, completely bypassing the Codex Responses API adapter. When the vision provider resolved to Codex, the raw client would hit chatgpt.com/backend-api/codex with chat.completions.create() which only supports the Responses API. Fix: Added get_async_vision_auxiliary_client() which properly wraps Codex into AsyncCodexAuxiliaryClient. vision_tools.py now uses this instead of manual client construction. 3. FIX COMPRESSION FALLBACK + VISION ERROR HANDLING - context_compressor.py: Removed _get_fallback_client() which blindly looked for OPENAI_API_KEY + OPENAI_BASE_URL (fails for Codex OAuth, API-key providers, users without OPENAI_BASE_URL set). Replaced with fallback loop through resolve_provider_client() for each known provider, with same-provider dedup. - vision_tools.py: Added error detection for vision capability failures. Returns clear message to the model when the configured model doesn't support vision, instead of a generic error. Addresses #886	2026-03-11 19:46:47 -07:00
teknium1	ef5d811aba	fix: vision auto-detection now falls back to custom/local endpoints Vision auto-mode previously only tried OpenRouter, Nous, and Codex for multimodal — deliberately skipping custom endpoints with the assumption they 'may not handle vision input.' This caused silent failures for users running local multimodal models (Qwen-VL, LLaVA, Pixtral, etc.) without any cloud API keys. Now custom endpoints are tried as a last resort in auto mode. If the model doesn't support vision, the API call fails gracefully — but users with local vision models no longer need to manually set auxiliary.vision.provider: main in config.yaml. Reported by @Spadav and @kotyKD.	2026-03-09 15:36:19 -07:00
teknium1	2d1a1c1c47	refactor: remove redundant 'openai' auxiliary provider, clean up docs The 'openai' provider was redundant — using OPENAI_BASE_URL + OPENAI_API_KEY with provider: 'main' already covers direct OpenAI API. Provider options are now: auto, openrouter, nous, codex, main. - Removed _try_openai(), _OPENAI_AUX_MODEL, _OPENAI_BASE_URL - Replaced openai tests with codex provider tests - Updated all docs to remove 'openai' option and clarify 'main' - 'main' description now explicitly mentions it works with OpenAI API, local models, and any OpenAI-compatible endpoint Tests: 2467 passed.	2026-03-08 18:50:26 -07:00
teknium1	71e81728ac	feat: Codex OAuth vision support + multimodal content adapter The Codex Responses API (chatgpt.com/backend-api/codex) supports vision via gpt-5.3-codex. This was verified with real API calls using image analysis. Changes to _CodexCompletionsAdapter: - Added _convert_content_for_responses() to translate chat.completions multimodal format to Responses API format: - {type: 'text'} → {type: 'input_text'} - {type: 'image_url', image_url: {url: '...'}} → {type: 'input_image', image_url: '...'} - Fixed: removed 'stream' from resp_kwargs (responses.stream() handles it) - Fixed: removed max_output_tokens and temperature (Codex endpoint rejects them) Provider changes: - Added 'codex' as explicit auxiliary provider option - Vision auto-fallback now includes Codex (OpenRouter → Nous → Codex) since gpt-5.3-codex supports multimodal input - Updated docs with Codex OAuth examples Tested with real Codex OAuth token + ~/.hermes/image2.png — confirmed working end-to-end through the full adapter pipeline. Tests: 2459 passed.	2026-03-08 18:44:33 -07:00
teknium1	ae4a674c84	feat: add 'openai' as auxiliary provider option Users can now set provider: "openai" for auxiliary tasks (vision, web extract, compression) to use OpenAI's API directly with their OPENAI_API_KEY. This hits api.openai.com/v1 with gpt-4o-mini as the default model — supports vision since GPT-4o handles image input. Provider options are now: auto, openrouter, nous, openai, main. Changes: - agent/auxiliary_client.py: added _try_openai(), "openai" case in _resolve_forced_provider(), updated auxiliary_max_tokens_param() to use max_completion_tokens for OpenAI - Updated docs: cli-config.yaml.example, AGENTS.md, and user-facing configuration.md with Common Setups section showing OpenAI, OpenRouter, and local model examples - 3 new tests for OpenAI provider resolution Tests: 2459 passed (was 2429).	2026-03-08 18:25:30 -07:00
teknium1	5ae0b731d0	fix: harden auxiliary model config — gateway bridge, vision safety, tests Improvements on top of PR #606 (auxiliary model configuration): 1. Gateway bridge: Added auxiliary.* and compression.summary_provider config bridging to gateway/run.py so config.yaml settings work from messaging platforms (not just CLI). Matches the pattern in cli.py. 2. Vision auto-fallback safety: In auto mode, vision now only tries OpenRouter + Nous Portal (known multimodal-capable providers). Custom endpoints, Codex, and API-key providers are skipped to avoid confusing errors from providers that don't support vision input. Explicit provider override (AUXILIARY_VISION_PROVIDER=main) still allows using any provider. 3. Comprehensive tests (46 new): - _get_auxiliary_provider env var resolution (8 tests) - _resolve_forced_provider with all provider types (8 tests) - Per-task provider routing integration (4 tests) - Vision auto-fallback safety (7 tests) - Config bridging logic (11 tests) - Gateway/CLI bridge parity (2 tests) - Vision model override via env var (2 tests) - DEFAULT_CONFIG shape validation (4 tests) 4. Docs: Added auxiliary_client.py to AGENTS.md project structure. Updated module docstring with separate text/vision resolution chains. Tests: 2429 passed (was 2383).	2026-03-08 18:06:47 -07:00
teknium1	d9f373654b	feat: enhance auxiliary model configuration and environment variable handling - Added support for auxiliary model overrides in the configuration, allowing users to specify providers and models for vision and web extraction tasks. - Updated the CLI configuration example to include new auxiliary model settings. - Enhanced the environment variable mapping in the CLI to accommodate auxiliary model configurations. - Improved the resolution logic for auxiliary clients to support task-specific provider overrides. - Updated relevant documentation and comments for clarity on the new features and their usage.	2026-03-08 18:06:47 -07:00
Christo Mitov	4447e7d71a	fix: add Kimi Code API support (api.kimi.com/coding/v1) Kimi Code (platform.kimi.ai) issues API keys prefixed sk-kimi- that require: 1. A different base URL: api.kimi.com/coding/v1 (not api.moonshot.ai/v1) 2. A User-Agent header identifying a recognized coding agent Without this fix, sk-kimi- keys fail with 401 (wrong endpoint) or 403 ('only available for Coding Agents') errors. Changes: - Auto-detect sk-kimi- key prefix and route to api.kimi.com/coding/v1 - Send User-Agent: KimiCLI/1.0 header for Kimi Code endpoints - Legacy Moonshot keys (api.moonshot.ai) continue to work unchanged - KIMI_BASE_URL env var override still takes priority over auto-detection - Updated .env.example with correct docs and all endpoint options - Fixed doctor.py health check for Kimi Code keys Reference: https://github.com/MoonshotAI/kimi-cli (platforms.py)	2026-03-07 21:00:12 -05:00
teknium1	e2821effb5	feat: add direct API-key providers as auxiliary client fallbacks When the user only has a z.ai/Kimi/MiniMax API key (no OpenRouter key), auxiliary tasks (context compression, web summarization, session search) now fall back to the configured direct provider instead of returning None. Resolution chain: OpenRouter -> Nous -> Custom endpoint -> Codex OAuth -> direct API-key providers -> None. Uses cheap/fast models for auxiliary tasks: - zai: glm-4.5-flash - kimi-coding: kimi-k2-turbo-preview - minimax/minimax-cn: MiniMax-M2.5-highspeed Vision auxiliary intentionally NOT modified — vision needs multimodal models (Gemini) that these providers don't serve.	2026-03-06 19:08:54 -08:00
teknium1	33ab5cec82	fix: handle None message content across codebase (fixes #276 ) The OpenAI API returns content: null on assistant messages with tool calls. msg.get('content', '') returns None when the key exists with value None, causing TypeError on len(), string concatenation, and .strip() in downstream code paths. Fixed 4 locations that process conversation messages: - agent/auxiliary_client.py:84 — None passed to API calls - cli.py:1288 — crash on content[:200] and len(content) - run_agent.py:3444 — crash on None.strip() - honcho_integration/session.py:445 — 'None' rendered in transcript 13 other instances were verified safe (already protected, only process user/tool messages, or use the safe pattern). Pattern: msg.get('content', '') → msg.get('content') or '' Fixes #276	2026-03-02 02:23:53 -08:00
teknium1	5e598a588f	refactor(auth): transition Codex OAuth tokens to Hermes auth store Updated the authentication mechanism to store Codex OAuth tokens in the Hermes auth store located at ~/.hermes/auth.json instead of the previous ~/.codex/auth.json. This change includes refactoring related functions for reading and saving tokens, ensuring better management of authentication states and preventing conflicts between different applications. Adjusted tests to reflect the new storage structure and improved error handling for missing or malformed tokens.	2026-03-01 19:59:24 -08:00
teknium1	500f0eab4a	refactor(cli): Finalize OpenAI Codex Integration with OAuth - Enhanced Codex model discovery by fetching available models from the API, with fallback to local cache and defaults. - Updated the context compressor's summary target tokens to 2500 for improved performance. - Added external credential detection for Codex CLI to streamline authentication. - Refactored various components to ensure consistent handling of authentication and model selection across the application.	2026-02-28 21:47:51 -08:00
teknium1	2205b22409	fix(headers): update X-OpenRouter-Categories to include 'productivity'	2026-02-28 10:38:49 -08:00
teknium1	58fce0a37b	feat(api): implement dynamic max tokens handling for various providers - Added _max_tokens_param method in AIAgent to return appropriate max tokens parameter based on the provider (OpenAI vs. others). - Updated API calls in AIAgent to utilize the new max tokens handling. - Introduced auxiliary_max_tokens_param function in auxiliary_client for consistent max tokens management across auxiliary clients. - Refactored multiple tools to use auxiliary_max_tokens_param for improved compatibility with different models and providers.	2026-02-26 20:23:56 -08:00
teknium1	7a3656aea2	refactor: integrate Nous Portal support in auxiliary client - Added functionality to include product attribution tags for Nous Portal in auxiliary API calls. - Introduced a mechanism to determine if the auxiliary client is backed by Nous Portal, affecting the extra body of requests. - Updated various tools to utilize the new extra body configuration for enhanced tracking in API calls.	2026-02-25 18:39:36 -08:00
teknium1	9a858b8d67	add identifier for openrouter calls	2026-02-25 16:34:47 -08:00
teknium1	ededaaa874	Hermes Agent UX Improvements	2026-02-22 02:16:11 -08:00

22 Commits