hermes-agent

Author	SHA1	Message	Date
teknium1	9742f11fda	chore: add context lengths for Kimi and MiniMax models. Adds DEFAULT_CONTEXT_LENGTHS entries for kimi-k2.5 (262144), kimi-k2-thinking (262144), kimi-k2-turbo-preview (262144), kimi-k2-0905-preview (131072), MiniMax-M2.5/M2.5-highspeed/M2.1 (204800), and glm-4.5/4.5-flash (131072). Avoids an unnecessary 2M-token probe on first use with direct providers.	2026-03-06 19:01:38 -08:00
teknium1	3c6c11b7c9	Merge PR #420: fix: respect OPENAI_BASE_URL when resolving API key priority. Authored by manuelschipper. Adds GLM-4.7 and GLM-5 context lengths (202752) to model_metadata.py. The key priority fix (prefer OPENAI_API_KEY for non-OpenRouter endpoints) was already applied in PR #295; merged the Z.ai mention into the comment.	2026-03-06 18:43:13 -08:00
teknium1	e9f05b3524	test: comprehensive tests for model metadata + firecrawl config. model_metadata tests (61 tests, was 39): Token estimation (concrete value assertions, unicode, tool_call messages, vision multimodal content, additive verification); Context length resolution (cache-over-API priority, no-base_url skips cache, missing context_length key in API response); API metadata fetch (canonical_slug aliasing, TTL expiry with time mock, stale cache fallback on API failure, malformed JSON resilience); Probe tiers (above-max returns 2M, zero returns None); Error parsing (Anthropic format ('X > Y maximum'), LM Studio, empty string, unreasonably large numbers; also fixed parser to handle Anthropic format); Cache (corruption resilience for garbage YAML and wrong structure, value updates, special chars in model names). Firecrawl config tests (8 tests, was 4): Singleton caching (core purpose; verified constructor called once); Constructor failure recovery (retry after exception); Return value actually asserted (not just constructor args); Empty string env vars treated as absent; Proper setup/teardown for env var isolation.	2026-03-05 18:22:39 -08:00
teknium1	c886333d32	feat: smart context length probing with persistent caching + banner display. Replaces the unsafe 128K fallback for unknown models with a descending probe strategy (2M → 1M → 512K → 200K → 128K → 64K → 32K). When a context-length error occurs, the agent steps down tiers and retries. The discovered limit is cached per model+provider combo in ~/.hermes/context_length_cache.yaml so subsequent sessions skip probing. Also parses API error messages to extract the actual context limit (e.g. 'maximum context length is 32768 tokens') for instant resolution. The CLI banner now displays the context window size next to the model name (e.g. 'claude-opus-4 · 200K context · Nous Research'). Changes: agent/model_metadata.py (CONTEXT_PROBE_TIERS, persistent cache (save/load/get), parse_context_limit_from_error(), get_next_probe_tier()); agent/context_compressor.py (accepts base_url, passes to metadata); run_agent.py (step-down logic in context error handler, caches on success); cli.py + hermes_cli/banner.py (context length in welcome banner); tests (22 new tests for probing, parsing, and caching). Addresses #132. PR #319's approach (8K default) rejected as too conservative.	2026-03-05 16:09:57 -08:00
Dev User	3221818b6e	fix: respect OPENAI_BASE_URL when resolving API key priority. When base_url points to a non-OpenRouter endpoint (e.g. Z.ai), OPENROUTER_API_KEY incorrectly takes priority over OPENAI_API_KEY, sending the wrong credentials. This causes 401 errors on the main inference path and forces users to comment out OPENROUTER_API_KEY, which then breaks auxiliary clients (compression, vision). Fix: check whether base_url contains "openrouter" and swap the key priority accordingly. Also adds GLM-4.7 and GLM-5 context lengths to DEFAULT_CONTEXT_LENGTHS.	2026-03-05 08:25:16 +00:00
teknium1	9123cfb5dd	Refactor Terminal and AIAgent cleanup	2026-02-21 22:31:43 -08:00
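
The descending probe and error parsing described in commit c886333d32 can be sketched roughly as follows. This is an illustration, not the code in agent/model_metadata.py: the names CONTEXT_PROBE_TIERS, get_next_probe_tier and parse_context_limit_from_error come from the commit message, but their signatures, the exact tier values (the log does not say whether "K" means 1000 or 1024), and the single regex are assumptions.

```python
import re
from typing import Optional

# Assumed values: the commit lists 2M, 1M, 512K, 200K, 128K, 64K, 32K
# without stating exact token counts, so decimal units are used here.
CONTEXT_PROBE_TIERS = [2_000_000, 1_000_000, 512_000, 200_000, 128_000, 64_000, 32_000]


def get_next_probe_tier(current: int) -> Optional[int]:
    """Return the largest tier strictly below `current`, or None if none remains."""
    for tier in CONTEXT_PROBE_TIERS:  # list is sorted in descending order
        if tier < current:
            return tier
    return None


def parse_context_limit_from_error(message: str) -> Optional[int]:
    """Extract a limit from text like 'maximum context length is 32768 tokens'.

    Only this one phrasing is handled; the real parser reportedly covers
    more provider formats.
    """
    match = re.search(r"maximum context length is (\d+) tokens", message, re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1))
```

Under these assumptions, a context-length error handler would first try parse_context_limit_from_error on the error text and fall back to get_next_probe_tier(current_limit) when no number can be extracted, stopping when that returns None.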

6 Commits
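
The key priority fix from commit 3221818b6e (check whether base_url contains "openrouter" and swap the priority) might look something like the sketch below. The function name resolve_api_key, its signature, the case-insensitive match, and the behaviour when base_url is unset are all assumptions made for illustration; only the environment variable names and the substring check come from the commit message.

```python
from typing import Mapping, Optional


def resolve_api_key(base_url: Optional[str], env: Mapping[str, str]) -> Optional[str]:
    """Pick an API key for the main inference client (hypothetical helper).

    Empty strings are treated the same as missing variables.
    """
    openrouter_key = env.get("OPENROUTER_API_KEY") or None
    openai_key = env.get("OPENAI_API_KEY") or None

    # A custom endpoint that is not OpenRouter should get OPENAI_API_KEY first.
    if base_url and "openrouter" not in base_url.lower():
        return openai_key or openrouter_key

    # Assumed default: OpenRouter key first when no base_url is set
    # or when the base_url points at OpenRouter.
    return openrouter_key or openai_key
```

Passing the environment as a mapping (for example os.environ) rather than reading it inside the function keeps the sketch easy to test without touching real credentials.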