hermes-agent

Author	SHA1	Message	Date
0xbyt4	dbc25a386e	fix: auxiliary client skips expired Codex JWT and propagates Anthropic OAuth flag Two bugs in the auxiliary provider auto-detection chain: 1. Expired Codex JWT blocks the auto chain: _read_codex_access_token() returned any stored token without checking expiry, preventing fallback to working providers. Now decodes JWT exp claim and returns None for expired tokens. 2. Auxiliary Anthropic client missing OAuth identity transforms: _AnthropicCompletionsAdapter always called build_anthropic_kwargs with is_oauth=False, causing 400 errors for OAuth tokens. Now detects OAuth tokens via _is_oauth_token() and propagates the flag through the adapter chain. Cherry-picked from PR #2378 by 0xbyt4. Fixed test_api_key_no_oauth_flag to mock resolve_anthropic_token directly (env var alone was insufficient).	2026-03-21 17:36:25 -07:00
Teknium	0ea7d0ec80	fix(terminal): log disk warning check failures at debug level (salvage #2372 ) (#2394 ) * fix(terminal): log disk warning check failures at debug level * fix(terminal): guard _check_disk_usage_warning by moving scratch_dir into try --------- Co-authored-by: aydnOktay <xaydinoktay@gmail.com>	2026-03-21 17:10:17 -07:00
Teknium	1d28b4699b	fix(redact): safely handle non-string inputs (salvage #2369 ) fix(redact): safely handle non-string inputs (salvage #2369)	2026-03-21 17:10:14 -07:00
aydnOktay	40c9a13476	fix(redact): safely handle non-string inputs redact_sensitive_text() now returns early for None and coerces other non-string values to str before applying regex-based redaction, preventing TypeErrors in logging/tool-output paths. Cherry-picked from PR #2369 by aydnOktay.	2026-03-21 16:55:02 -07:00
teyrebaz33	bd49bce278	fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter On the native Anthropic Messages API path, convert_messages_to_anthropic() moves top-level cache_control on role:tool messages inside the tool_result block. On OpenRouter (chat_completions), no such conversion happens — the unexpected top-level field causes a silent hang on the second tool call. Add native_anthropic parameter to _apply_cache_marker() and apply_anthropic_cache_control(). When False (OpenRouter), role:tool messages are skipped entirely. When True (native Anthropic), existing behaviour is preserved. Fixes #2362	2026-03-21 16:54:43 -07:00
Teknium	52dd479214	Merge pull request #2361 from NousResearch/hermes/hermes-5d6932ba feat(gateway): cache AIAgent per session for prompt caching	2026-03-21 16:53:21 -07:00
Teknium	c57d5cbdde	fix(update): prompt before resetting working tree on stash conflicts (#2390 ) When 'hermes update' stashes local changes and the restore hits conflicts, the previous behavior silently ran 'git reset --hard HEAD' to clean up. This could surprise users who didn't realize their working tree was being nuked. Now the conflict handler: - Lists the specific conflicted files - Reassures the user their stash is preserved - Asks before resetting (interactive mode) - Auto-resets in non-interactive mode (prompt_user=False) - If declined, leaves the working tree as-is with guidance	2026-03-21 16:49:19 -07:00
Teknium	525caadd8c	fix: prevent Anthropic token leaking to third-party anthropic_messages providers (salvage #2383 ) (#2389 ) * fix: prevent Anthropic token fallback leaking to third-party anthropic_messages providers When provider is minimax/alibaba/etc and MINIMAX_API_KEY is not set, the code fell back to resolve_anthropic_token() sending Anthropic OAuth credentials to third-party endpoints, causing 401 errors. Now only provider=="anthropic" triggers the fallback. Generalizes the Alibaba-specific guard from #1739 to all non-Anthropic providers. * fix: set provider='anthropic' in credential refresh tests Follow-up for cherry-picked PR #2383 — existing tests didn't set agent.provider, which the new guard requires to allow Anthropic token refresh. --------- Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>	2026-03-21 16:42:46 -07:00
Teknium	342096b4bd	feat(gateway): cache AIAgent per session for prompt caching The gateway created a fresh AIAgent per message, rebuilding the system prompt (including memory, skills, context files) every turn. This broke prompt prefix caching — providers like Anthropic charge ~10x more for uncached prefixes. Now caches AIAgent instances per session_key with a config signature. The cached agent is reused across messages in the same session, preserving the frozen system prompt and tool schemas. Cache is invalidated when: - Config changes (model, provider, toolsets, reasoning, ephemeral prompt) — detected via signature mismatch - /new, /reset, /clear — explicit session reset - /model — global model change clears all cached agents - /reasoning — global reasoning change clears all cached agents Per-message state (callbacks, stream consumers, progress queues) is set on the agent instance before each run_conversation() call. This matches CLI behavior where a single AIAgent lives across all turns in a session, with _cached_system_prompt built once and reused.	2026-03-21 16:21:06 -07:00
Teknium	55510cbad2	Merge pull request #2388 from NousResearch/hermes/hermes-31d7db3b fix(provider): prevent Anthropic fallback from inheriting non-Anthropic base_url + fix(update): reset on stash conflict	2026-03-21 16:20:08 -07:00
Teknium	3ab50376b0	fix(update): reset working tree when stash restore leaves conflict markers When `hermes update` stashes local changes and the subsequent `git stash apply` fails or leaves unmerged files, the conflict markers (<<<<<<< etc.) were left in the working tree, making Hermes unrunnable until manually cleaned up. Now the update command runs `git reset --hard HEAD` to restore a clean working tree before exiting, and also detects unmerged files even when git stash apply reports success. Closes #2348	2026-03-21 16:16:35 -07:00
Teknium	2a5f86ed6d	Merge pull request #2343 from NousResearch/hermes/hermes-31d7db3b feat: @ context references + Honcho config fixes	2026-03-21 16:10:19 -07:00
Teknium	8da410ed95	feat(plugins): add slash command registration for plugins (#2359 ) Plugins can now register slash commands via ctx.register_command() in their register() function. Commands automatically appear in: - /help and COMMANDS_BY_CATEGORY (under 'Plugins' category) - Tab autocomplete in CLI - Telegram bot menu - Slack subcommand mapping - Gateway dispatch Handler signature: handler(args: str) -> str \| None Async handlers are supported in gateway context. Changes: - commands.py: add register_plugin_command() and rebuild_lookups() - plugins.py: add register_command() to PluginContext, track in PluginManager._plugin_commands and LoadedPlugin.commands_registered - cli.py: dispatch plugin commands in process_command() - gateway/run.py: dispatch plugin commands before skill commands - tests: 5 new tests for registration, help, tracking, handler, gateway - docs: update plugins feature page and build guide	2026-03-21 16:00:30 -07:00
Teknium	da44c196b6	feat: @ context references — inline file, folder, diff, git, and URL injection Add @file:path, @folder:dir, @diff, @staged, @git:N, and @url: references that expand inline before the message reaches the LLM. Supports line ranges (@file:main.py:10-50), token budget enforcement (soft warn at 25%, hard block at 50%), and path sandboxing for gateway. Core module from PR #2090 by @kshitijk4poor. CLI and gateway wiring rewritten against current main. Fixed asyncio.run() crash when called from inside a running event loop (gateway). Closes #682.	2026-03-21 15:57:13 -07:00
Gutslabs	0b9526b476	fix(acp): preserve session provider when switching models	2026-03-21 15:54:10 -07:00
Teknium	b73d221324	fix: Alibaba/DashScope: preserve model dots, fix 401 auth, fix dead provider check (salvage #1748 + fix #2314 ) fix: Alibaba/DashScope: preserve model dots, fix 401 auth, fix dead provider check (salvage #1748 + fix #2314)	2026-03-21 09:51:40 -07:00
Teknium	cc51ffdb57	Merge pull request #2340 from NousResearch/feat/streaming-default feat: enable streaming by default in CLI	2026-03-21 09:50:54 -07:00
unmodeled-tyler	fb48b8f0c5	fix(gateway): pass message_thread_id in send_image_file, send_document, send_video Fixes #1803. send_image_file, send_document, and send_video were missing message_thread_id forwarding, causing them to fail in Telegram forum/supergroups where thread_id is required. send_voice already handled this correctly. Adds metadata parameter + message_thread_id to all three methods, and adds tests covering the thread_id forwarding path.	2026-03-21 09:49:33 -07:00
Angello Picasso	5a9ab09bc3	feat(cli): add hermes plugins install/remove/list command Plugin management via git repos: - hermes plugins install <git-url\|owner/repo> - hermes plugins update <name> - hermes plugins remove <name> (aliases: rm, uninstall) - hermes plugins list (alias: ls) Security: path traversal protection, no shell injection, manifest version guard, insecure URL warnings. 42 tests covering security, dispatch, helpers, and commands. Based on work by Angello Picasso in PR #1785. Closes #1789.	2026-03-21 09:47:33 -07:00
Teknium	d70e07fc45	refactor(cli): add protected TUI extension hooks for wrapper CLIs Based on PR #1749 by @erosika (reimplemented on current main). Extracts three protected methods from run() so wrapper CLIs can extend the TUI without overriding the entire method: - _get_extra_tui_widgets(): inject widgets between spacer and status bar - _register_extra_tui_keybindings(kb, input_area): add keybindings - _build_tui_layout_children(**widgets): full control over ordering Default implementations reproduce existing layout exactly. The inline HSplit in run() now delegates to _build_tui_layout_children(). 5 tests covering defaults, widget insertion position, and keybinding registration.	2026-03-21 09:42:07 -07:00
Himess	5663980015	fix(mistral-parser): handle nested JSON in fallback extraction	2026-03-21 09:41:17 -07:00
Teknium	8304a7716d	fix(gateway): restart on whatsapp bridge child exit (#2334 ) Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>	2026-03-21 09:38:52 -07:00
crazywriter1	523d8c38f9	fix: Alibaba/DashScope: preserve model dots (qwen3.5-plus) and fix 401 auth When using Alibaba (DashScope) with an anthropic-compatible endpoint, model names like qwen3.5-plus were being normalized to qwen3-5-plus. Alibaba's API expects the dot. Added preserve_dots parameter to normalize_model_name() and build_anthropic_kwargs(). Also fixed 401 auth: when provider is alibaba or base_url contains dashscope/aliyuncs, use only the resolved API key (DASHSCOPE_API_KEY). Never fall back to resolve_anthropic_token(), and skip Anthropic credential refresh for DashScope endpoints. Cherry-picked from PR #1748 by crazywriter1. Fixes #1739.	2026-03-21 09:38:04 -07:00
Teknium	e183744cb5	feat(honcho): instance-local config via HERMES_HOME, default session strategy to per-directory - Add resolve_config_path(): checks $HERMES_HOME/honcho.json first, falls back to ~/.honcho/config.json. Enables isolated Hermes instances with independent Honcho credentials and settings. - Update CLI and doctor to use resolved path instead of hardcoded global. - Change default session_strategy from per-session to per-directory. Part 1 of #1962 by @erosika.	2026-03-21 09:34:00 -07:00
Himess	bc15f6cca3	fix(mattermost): use MIME types for media attachments Bare strings like "image", "audio", "document" were appended to media_types, but downstream run.py checks mtype.startswith("image/") and mtype.startswith("audio/"), which never matched. This caused all Mattermost file attachments to be silently dropped from vision/STT processing. Use the actual MIME type from file_info instead.	2026-03-21 09:31:15 -07:00
Teknium	28bb0e770f	fix(voice): enable TTS voice reply when streaming is active (#2322 ) When streaming is enabled, the base adapter receives None from _handle_message (already_sent=True) and cannot run auto-TTS for voice input. The runner was unconditionally skipping voice input TTS assuming the base adapter would handle it. Now the runner takes over TTS responsibility when streaming has already delivered the text response, so voice channel playback works with both streaming on and off. Streaming off behavior is unchanged (default already_sent=False preserves the original code path exactly). Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>	2026-03-21 08:08:37 -07:00
Teknium	453f4c5175	Merge pull request #2312 from NousResearch/hermes/hermes-31d7db3b fix(gateway): retry Telegram 409 polling conflicts before giving up	2026-03-21 07:19:43 -07:00
Teknium	37a9979459	fix(cron): stop injecting cron outputs into gateway session history (#2313 ) Cron deliveries were mirrored into the target gateway session as assistant-role messages, causing consecutive assistant messages that violate message alternation (issue #2221). Instead of fixing the role, remove the mirror injection entirely. Cron outputs already live in their own cron session and don't belong in the interactive conversation history. Delivered messages are now wrapped with a header (task name) and a footer noting the agent cannot see or respond to the message, so users have clear context about what they're reading. Closes #2221	2026-03-21 07:18:36 -07:00
Teknium	488a30e879	fix(gateway): retry Telegram 409 polling conflicts before giving up A single Telegram 409 Conflict from getUpdates permanently killed Telegram polling with no recovery possible (retryable=False on first occurrence). This is too aggressive for production use with process supervisors. Transient 409s are expected during: - --replace handoffs where the old long-poll session lingers on Telegram servers for a few seconds after SIGTERM - systemd Restart=on-failure respawns that overlap with the dying instance cleanup Now _handle_polling_conflict() retries up to 3 times with a 10-second delay between attempts. The 30-second total retry window lets stale server-side sessions expire. If all retries fail, the error is still marked as permanently fatal — preserving the original protection against genuine dual-instance conflicts. Tests updated: split the single conflict test into two — one verifying retry on transient conflict, one verifying fatal after exhausted retries. Closes #2296	2026-03-21 07:11:06 -07:00
Teknium	58b52dfb2f	Merge pull request #2303 from NousResearch/hermes/hermes-31d7db3b fix: remove synthetic error message injection, fix session resume after repeated failures	2026-03-21 07:03:54 -07:00
Teknium	2da79b13df	feat: priority-based context file selection + CLAUDE.md support (#2301 ) Previously, all project context files (AGENTS.md, .cursorrules, .hermes.md) were loaded and concatenated into the system prompt. This bloated the prompt with potentially redundant or conflicting instructions. Now only ONE project context type is loaded, using priority order: 1. .hermes.md / HERMES.md (walk to git root) 2. AGENTS.md / agents.md (recursive directory walk) 3. CLAUDE.md / claude.md (cwd only, NEW) 4. .cursorrules / .cursor/rules/*.mdc (cwd only) SOUL.md from HERMES_HOME remains independent and always loads. Also adds CLAUDE.md as a recognized context file format, matching the convention popularized by Claude Code. Refactored the monolithic function into four focused helpers: _load_hermes_md, _load_agents_md, _load_claude_md, _load_cursorrules. Tests: replaced 1 coexistence test with 10 new tests covering priority ordering, CLAUDE.md loading, case sensitivity, injection blocking.	2026-03-21 06:26:20 -07:00
Test	1870069f80	fix(session_search): exclude current session lineage Cherry-picked from PR #2201 by @Gutslabs. session_search resolved hits to parent/root sessions but only excluded the exact current_session_id. If the active session was a child continuation (compression/delegation), its parent could still appear as a 'past' conversation result. Fix: resolve current_session_id to its lineage root before filtering, so the entire active lineage (parent and children) is excluded.	2026-03-20 21:07:48 -07:00
Test	10d719ac1b	fix(security): require opt-in for project plugin discovery	2026-03-20 20:50:30 -07:00
Teknium	4263350c5b	fix: remove post-compression file-read history injection (#2226 ) Remove the [Files already read — do NOT re-read these] user message that was injected into the conversation after context compression. This message used role='user' for system-generated content, creating a fake user turn that confused models about conversation state and could contribute to task-redo behavior. The file_tools.py read tracker (warn on 3rd consecutive read, block on 4th+) already handles re-read prevention inline without injecting synthetic messages. Closes #2224. Co-authored-by: Test <test@test.com>	2026-03-20 14:54:25 -07:00
Teknium	ba0b77a803	Merge pull request #2214 from NousResearch/fix/event-loop-closed-delegate Completes the event loop lifecycle fix trilogy (#2190 → #2207 → #2214). Per-thread persistent loops for worker threads prevent GC crashes on cached async clients.	2026-03-20 12:54:19 -07:00
Teknium	f853e50589	Merge pull request #2199 from llbn/fix/telegram-markdownv2-features Clean PR, well-tested. Adds MarkdownV2 strikethrough, spoiler, and blockquote support to Telegram adapter.	2026-03-20 12:45:47 -07:00
emozilla	ab6abc2c13	fix: use per-thread persistent event loops in worker threads Replace asyncio.run() with thread-local persistent event loops for worker threads (e.g., delegate_task's ThreadPoolExecutor). asyncio.run() creates and closes a fresh loop on every call, leaving cached httpx/AsyncOpenAI clients bound to a dead loop — causing 'Event loop is closed' errors during GC when parallel subagents clean up connections. The fix mirrors the main thread's _get_tool_loop() pattern but uses threading.local() so each worker thread gets its own long-lived loop, avoiding both cross-thread contention and the create-destroy lifecycle. Added 4 regression tests covering worker loop persistence, reuse, per-thread isolation, and separation from the main thread's loop.	2026-03-20 15:41:06 -04:00
llbn	43b3a0ac66	fix(telegram): escape backslashes and backticks inside code entities for MarkdownV2 - Escape \ → \\ inside inline code and fenced code blocks - Escape ` → \` inside fenced code block bodies (not delimiters) - Add regression tests for code entity backslash handling	2026-03-20 18:32:45 +01:00
llbn	02f639e561	fix(telegram): add MarkdownV2 support for strikethrough, spoiler, and blockquotes - Convert ~~text~~ to ~text~ (MarkdownV2 strikethrough) - Protect \|\|text\|\| from pipe escaping (MarkdownV2 spoiler) - Preserve > at line start as blockquote instead of escaping it - Update _strip_mdv2() to strip ~strikethrough~ and \|\|spoiler\|\| markers - Add tests covering new formatting paths and edge cases	2026-03-20 18:21:24 +01:00
Teknium	7a427d7b03	fix: persistent event loop in _run_async prevents 'Event loop is closed' (#2190 ) Cherry-picked from PR #2146 by @crazywriter1. Fixes #2104. asyncio.run() creates and closes a fresh event loop each call. Cached httpx/AsyncOpenAI clients bound to the dead loop crash on GC with 'Event loop is closed'. This hit vision_analyze on first use in CLI. Two-layer fix: - model_tools._run_async(): replace asyncio.run() with persistent loop via _get_tool_loop() + run_until_complete() - auxiliary_client._get_cached_client(): track which loop created each async client, discard stale entries if loop is closed 6 regression tests covering loop lifecycle, reuse, and full vision dispatch chain. Co-authored-by: Test <test@test.com>	2026-03-20 09:44:50 -07:00
Teknium	2ea4dd30c6	fix(gateway): strip orphaned tool_results + let /reset bypass running agent (#2180 ) Two fixes for Telegram/gateway-specific bugs: 1. Anthropic adapter: strip orphaned tool_result blocks (mirror of existing tool_use stripping). Context compression or session truncation can remove an assistant message containing a tool_use while leaving the subsequent tool_result intact. Anthropic rejects these with a 400: 'unexpected tool_use_id found in tool_result blocks'. The adapter now collects all tool_use IDs and filters out any tool_result blocks referencing IDs not in that set. 2. Gateway: /reset and /new now bypass the running-agent guard (like /status already does). Previously, sending /reset while an agent was running caused the raw text to be queued and later fed back as a user message with the same broken history — replaying the corrupted session instead of resetting it. Now the running agent is interrupted, pending messages are cleared, and the reset command dispatches immediately. Tests updated: existing tests now include proper tool_use→tool_result pairs; two new tests cover orphaned tool_result stripping. Co-authored-by: Test <test@test.com>	2026-03-20 08:39:49 -07:00
Teknium	c52353cf8a	feat: context pressure warnings for CLI and gateway (#2159 ) * feat: context pressure warnings for CLI and gateway User-facing notifications as context approaches the compaction threshold. Warnings fire at 60% and 85% of the way to compaction — relative to the configured compression threshold, not the raw context window. CLI: Formatted line with a progress bar showing distance to compaction. Cyan at 60% (approaching), bold yellow at 85% (imminent). ◐ context ▰▰▰▰▰▰▰▰▰▰▰▰▱▱▱▱▱▱▱▱ 60% to compaction 100k threshold (50%) · approaching compaction ⚠ context ▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▱▱▱ 85% to compaction 100k threshold (50%) · compaction imminent Gateway: Plain-text notification sent to the user's chat via the new status_callback mechanism (asyncio.run_coroutine_threadsafe bridge, same pattern as step_callback). Does NOT inject into the message stream. The LLM never sees these warnings. Flags reset after each compaction cycle. Files changed: - agent/display.py — format_context_pressure(), format_context_pressure_gateway() - run_agent.py — status_callback param, _context_50/70_warned flags, _emit_context_pressure(), flag reset in _compress_context() - gateway/run.py — _status_callback_sync bridge, wired to AIAgent - tests/test_context_pressure.py — 23 tests * Merge remote-tracking branch 'origin/main' into hermes/hermes-7ea545bf --------- Co-authored-by: Test <test@test.com>	2026-03-20 08:37:36 -07:00
Test	e140c02d51	feat(gateway): add webhook platform adapter for external event triggers Add a generic webhook platform adapter that receives HTTP POSTs from external services (GitHub, GitLab, JIRA, Stripe, etc.), validates HMAC signatures, transforms payloads into agent prompts, and routes responses back to the source or to another platform. Features: - Configurable routes with per-route HMAC secrets, event filters, prompt templates with dot-notation payload access, skill loading, and pluggable delivery (github_comment, telegram, discord, log) - HMAC signature validation (GitHub SHA-256, GitLab token, generic) - Rate limiting (30 req/min per route, configurable) - Idempotency cache (1hr TTL, prevents duplicate runs on retries) - Body size limits (1MB default, checked before reading payload) - Setup wizard integration with security warnings and docs links - 33 tests (29 unit + 4 integration), all passing Security: - HMAC secret required per route (startup validation) - Setup wizard warns about internet exposure for webhook/SMS platforms - Sandboxing (Docker/VM) recommended in docs for public-facing deployments Files changed: - gateway/config.py — Platform.WEBHOOK enum + env var overrides - gateway/platforms/webhook.py — WebhookAdapter (~420 lines) - gateway/run.py — factory wiring + auth bypass for webhook events - hermes_cli/config.py — WEBHOOK_* env var definitions - hermes_cli/setup.py — webhook section in setup_gateway() - tests/gateway/test_webhook_adapter.py — 29 unit tests - tests/gateway/test_webhook_integration.py — 4 integration tests - website/docs/user-guide/messaging/webhooks.md — full user docs - website/docs/reference/environment-variables.md — WEBHOOK_* vars - website/sidebars.ts — nav entry	2026-03-20 06:33:36 -07:00
Teknium	88643a1ba9	feat: overhaul context length detection with models.dev and provider-aware resolution (#2158 ) Replace the fragile hardcoded context length system with a multi-source resolution chain that correctly identifies context windows per provider. Key changes: - New agent/models_dev.py: Fetches and caches the models.dev registry (3800+ models across 100+ providers with per-provider context windows). In-memory cache (1hr TTL) + disk cache for cold starts. - Rewritten get_model_context_length() resolution chain: 0. Config override (model.context_length) 1. Custom providers per-model context_length 2. Persistent disk cache 3. Endpoint /models (local servers) 4. Anthropic /v1/models API (max_input_tokens, API-key only) 5. OpenRouter live API (existing, unchanged) 6. Nous suffix-match via OpenRouter (dot/dash normalization) 7. models.dev registry lookup (provider-aware) 8. Thin hardcoded defaults (broad family patterns) 9. 128K fallback (was 2M) - Provider-aware context: same model now correctly resolves to different context windows per provider (e.g. claude-opus-4.6: 1M on Anthropic, 128K on GitHub Copilot). Provider name flows through ContextCompressor. - DEFAULT_CONTEXT_LENGTHS shrunk from 80+ entries to ~16 broad patterns. models.dev replaces the per-model hardcoding. - CONTEXT_PROBE_TIERS changed from [2M, 1M, 512K, 200K, 128K, 64K, 32K] to [128K, 64K, 32K, 16K, 8K]. Unknown models no longer start at 2M. - hermes model: prompts for context_length when configuring custom endpoints. Supports shorthand (32k, 128K). Saved to custom_providers per-model config. - custom_providers schema extended with optional models dict for per-model context_length (backward compatible). - Nous Portal: suffix-matches bare IDs (claude-opus-4-6) against OpenRouter's prefixed IDs (anthropic/claude-opus-4.6) with dot/dash normalization. Handles all 15 current Nous models. - Anthropic direct: queries /v1/models for max_input_tokens. Only works with regular API keys (sk-ant-api*), not OAuth tokens. Falls through to models.dev for OAuth users. Tests: 5574 passed (18 new tests for models_dev + updated probe tiers) Docs: Updated configuration.md context length section, AGENTS.md Co-authored-by: Test <test@test.com>	2026-03-20 06:04:33 -07:00
Teknium	b7b585656b	Merge pull request #2110 from NousResearch/hermes/hermes-5d6932ba fix: session reset + custom provider model switch + honcho base_url	2026-03-20 06:01:44 -07:00
Teknium	3ec6c71e43	fix: update claude 4.6 context length from 200K to 1M (#2155 ) * fix: preserve Ollama model:tag colons in context length detection The colon-split logic in get_model_context_length() and _query_local_context_length() assumed any colon meant provider:model format (e.g. "local:my-model"). But Ollama uses model:tag format (e.g. "qwen3.5:27b"), so the split turned "qwen3.5:27b" into just "27b" — which matches nothing, causing a fallback to the 2M token probe tier. Now only recognised provider prefixes (local, openrouter, anthropic, etc.) are stripped. Ollama model:tag names pass through intact. * fix: update claude-opus-4-6 and claude-sonnet-4-6 context length from 200K to 1M Both models support 1,000,000 token context windows. The hardcoded defaults were set before Anthropic expanded the context for the 4.6 generation. Verified via models.dev and OpenRouter API data. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Co-authored-by: Test <test@test.com>	2026-03-20 04:38:59 -07:00
Test	4ad0083118	fix(honcho): read HONCHO_BASE_URL for local/self-hosted instances Cherry-picked from PR #2120 by @unclebumpy. - from_env() now reads HONCHO_BASE_URL and enables Honcho when base_url is set, even without an API key - from_global_config() reads baseUrl from config root with HONCHO_BASE_URL env var as fallback - get_honcho_client() guard relaxed to allow base_url without api_key for no-auth local instances - Added HONCHO_BASE_URL to OPTIONAL_ENV_VARS registry Result: Setting HONCHO_BASE_URL=http://localhost:8000 in ~/.hermes/.env now correctly routes the Honcho client to a local instance.	2026-03-20 04:36:06 -07:00
Test	1055d4356a	fix: skip model auto-detection for custom/local providers When the user is on a custom provider (provider=custom, localhost, or 127.0.0.1 endpoint), /model <name> no longer tries to auto-detect a provider switch. The model name changes on the current endpoint as-is. To switch away from a custom endpoint, users must use explicit provider:model syntax (e.g. /model openai-codex:gpt-5.2-codex). A helpful tip is printed when changing models on a custom endpoint. This prevents the confusing case where someone on LM Studio types /model gpt-5.2-codex, the auto-detection tries to switch providers, fails or partially succeeds, and requests still go to the old endpoint. Also fixes the missing prompt_toolkit.auto_suggest mock stub in test_cli_init.py (same issue already fixed in test_cli_new_session.py).	2026-03-20 04:35:17 -07:00
Test	5822711ae6	fix: complete session reset — missing compressor counters + test Follow-up to PR #2101 (InB4DevOps). Adds three missing context compressor resets in reset_session_state(): - compression_count (displayed in status bar) - last_total_tokens - _context_probed (stale context-error flag) Also fixes the test_cli_new_session.py prompt_toolkit mock (missing auto_suggest stub) and adds a regression test for #2099 that verifies all token counters and compressor state are zeroed on /new.	2026-03-20 04:35:17 -07:00
Teknium	471ea81a7d	fix: preserve Ollama model:tag colons in context length detection (#2149 ) The colon-split logic in get_model_context_length() and _query_local_context_length() assumed any colon meant provider:model format (e.g. "local:my-model"). But Ollama uses model:tag format (e.g. "qwen3.5:27b"), so the split turned "qwen3.5:27b" into just "27b" — which matches nothing, causing a fallback to the 2M token probe tier. Now only recognised provider prefixes (local, openrouter, anthropic, etc.) are stripped. Ollama model:tag names pass through intact. Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-20 03:19:31 -07:00

1 2 3 4 5 ...

920 Commits