hermes-agent

Author	SHA1	Message	Date
Teknium	326b146d68	fix: prevent systemd restart storm on gateway connection failure Cherry-picked from PR #2319 by @itenev. When the gateway fails to connect (e.g. PrivilegedIntentsRequired, missing token), systemd's default RestartSec=10 with no start rate limit causes rapid reconnect storms flooding logs and triggering platform-side rate limits. - StartLimitIntervalSec=600 + StartLimitBurst=5 in [Unit] (max 5 restarts per 10 min) - RestartSec: 10 → 30 - Applied to both templates in gateway.py and scripts/hermes-gateway	2026-03-21 09:26:39 -07:00
Teknium	58b52dfb2f	Merge pull request #2303 from NousResearch/hermes/hermes-31d7db3b fix: remove synthetic error message injection, fix session resume after repeated failures	2026-03-21 07:03:54 -07:00
Teknium	651e92fbbf	fix: use git pull --ff-only in update/install to avoid divergent branch error (#2274) fix: use git pull --rebase in update/install to avoid divergent branch error	2026-03-21 06:33:22 -07:00
Test	71cf7ad11a	fix(setup): add alibaba to provider model selection Same bug as opencode-zen/go — alibaba fell through to the OpenRouter model list instead of using _setup_provider_model_selection() which probes the provider's own /models endpoint. All user-selectable providers now have correct model selection routing.	2026-03-20 22:48:59 -07:00
Test	7289256114	fix(setup): OpenCode Zen/Go show OpenRouter models instead of their own After selecting OpenCode Zen or Go as provider in hermes setup, the model selection page showed OpenRouter models because these providers weren't in the list that routes to _setup_provider_model_selection(). They fell through to the else branch which shows the OpenRouter catalog. Users ended up with an OpenCode API key but an OpenRouter model name, causing 'Provider resolver returned an empty API key' on first use. Fix: add opencode-zen and opencode-go to the provider list that uses _setup_provider_model_selection() for live /models detection.	2026-03-20 22:42:14 -07:00
Test	870ebb8850	fix: use git pull --ff-only in update/install to avoid divergent branch error Fresh installs without pull.rebase configured hit a git error when running hermes update because git doesn't know how to reconcile divergent branches. --ff-only is the right strategy: it works for the normal case (local branch is behind remote) and fails cleanly if the user somehow has local commits, rather than silently rebasing them.	2026-03-20 22:28:55 -07:00
Test	d0ac8d9fc7	chore: remove dead top-level toolsets config key The top-level 'toolsets' key in config.yaml was never read at runtime. Tool selection uses platform_toolsets (per-platform) or the --toolsets CLI flag. The key existed in load_cli_config() defaults and the example config as 'toolsets: [all]', misleading users into thinking it controlled tool availability. - Remove from load_cli_config() hardcoded defaults - Remove from hermes config show output - Replace in cli-config.yaml.example with deprecation note pointing to platform_toolsets and hermes tools	2026-03-20 22:27:13 -07:00
Test	173a5c6290	fix(tools): disabled toolsets re-enable themselves after hermes tools Two bugs in the save/load roundtrip for platform_toolsets: 1. _save_platform_tools preserved composite toolset entries (hermes-cli, hermes-telegram, etc.) because they weren't in configurable_keys. These composites include ALL _HERMES_CORE_TOOLS, so having hermes-cli in the saved list alongside individual keys negated any disables — the subset check always found the disabled toolset's tools via the composite entry. Fix: also filter out known TOOLSETS keys from preserved entries. Only truly unknown entries (MCP server names, custom entries) are kept. 2. _get_platform_tools used reverse subset inference to determine which configurable toolsets were enabled. This is inherently broken when tools appear in multiple toolsets (e.g. HA tools in both the homeassistant toolset and _HERMES_CORE_TOOLS). Fix: when the saved list contains explicit configurable keys (meaning the user has configured this platform), use direct membership instead of subset inference. The fallback path still handles legacy configs that only have a composite entry like hermes-cli.	2026-03-20 21:11:54 -07:00
Test	10d719ac1b	fix(security): require opt-in for project plugin discovery	2026-03-20 20:50:30 -07:00
Teknium	2416b2b7af	refactor(cli, banner): update gold ANSI color to true-color format (#2246) - Changed the ANSI escape code for gold color in cli.py and banner.py to use true-color format (#FFD700) for better visual consistency. - Enhanced the _on_tool_progress method in HermesCLI to update the TUI spinner with tool execution status, improving user feedback during operations. These changes improve the visual representation and user experience in the command-line interface. Co-authored-by: Test <test@test.com>	2026-03-20 18:17:38 -07:00
Teknium	66a1942524	feat: add /queue command to queue prompts without interrupting (#2191) Adds /queue <prompt> (alias /q) that queues a message for the next turn while the agent is busy, without interrupting the current run. - CLI: /queue <prompt> puts it in _pending_input for the next turn - Gateway: /queue <prompt> creates a pending MessageEvent on the adapter, picked up after the current agent run finishes - Enter still interrupts as usual (no behavior change) - /queue with no prompt shows usage - /queue when agent is idle tells user to just type normally Co-authored-by: Test <test@test.com>	2026-03-20 09:44:27 -07:00
Test	e140c02d51	feat(gateway): add webhook platform adapter for external event triggers Add a generic webhook platform adapter that receives HTTP POSTs from external services (GitHub, GitLab, JIRA, Stripe, etc.), validates HMAC signatures, transforms payloads into agent prompts, and routes responses back to the source or to another platform. Features: - Configurable routes with per-route HMAC secrets, event filters, prompt templates with dot-notation payload access, skill loading, and pluggable delivery (github_comment, telegram, discord, log) - HMAC signature validation (GitHub SHA-256, GitLab token, generic) - Rate limiting (30 req/min per route, configurable) - Idempotency cache (1hr TTL, prevents duplicate runs on retries) - Body size limits (1MB default, checked before reading payload) - Setup wizard integration with security warnings and docs links - 33 tests (29 unit + 4 integration), all passing Security: - HMAC secret required per route (startup validation) - Setup wizard warns about internet exposure for webhook/SMS platforms - Sandboxing (Docker/VM) recommended in docs for public-facing deployments Files changed: - gateway/config.py — Platform.WEBHOOK enum + env var overrides - gateway/platforms/webhook.py — WebhookAdapter (~420 lines) - gateway/run.py — factory wiring + auth bypass for webhook events - hermes_cli/config.py — WEBHOOK_* env var definitions - hermes_cli/setup.py — webhook section in setup_gateway() - tests/gateway/test_webhook_adapter.py — 29 unit tests - tests/gateway/test_webhook_integration.py — 4 integration tests - website/docs/user-guide/messaging/webhooks.md — full user docs - website/docs/reference/environment-variables.md — WEBHOOK_* vars - website/sidebars.ts — nav entry	2026-03-20 06:33:36 -07:00
Teknium	88643a1ba9	feat: overhaul context length detection with models.dev and provider-aware resolution (#2158) Replace the fragile hardcoded context length system with a multi-source resolution chain that correctly identifies context windows per provider. Key changes: - New agent/models_dev.py: Fetches and caches the models.dev registry (3800+ models across 100+ providers with per-provider context windows). In-memory cache (1hr TTL) + disk cache for cold starts. - Rewritten get_model_context_length() resolution chain: 0. Config override (model.context_length) 1. Custom providers per-model context_length 2. Persistent disk cache 3. Endpoint /models (local servers) 4. Anthropic /v1/models API (max_input_tokens, API-key only) 5. OpenRouter live API (existing, unchanged) 6. Nous suffix-match via OpenRouter (dot/dash normalization) 7. models.dev registry lookup (provider-aware) 8. Thin hardcoded defaults (broad family patterns) 9. 128K fallback (was 2M) - Provider-aware context: same model now correctly resolves to different context windows per provider (e.g. claude-opus-4.6: 1M on Anthropic, 128K on GitHub Copilot). Provider name flows through ContextCompressor. - DEFAULT_CONTEXT_LENGTHS shrunk from 80+ entries to ~16 broad patterns. models.dev replaces the per-model hardcoding. - CONTEXT_PROBE_TIERS changed from [2M, 1M, 512K, 200K, 128K, 64K, 32K] to [128K, 64K, 32K, 16K, 8K]. Unknown models no longer start at 2M. - hermes model: prompts for context_length when configuring custom endpoints. Supports shorthand (32k, 128K). Saved to custom_providers per-model config. - custom_providers schema extended with optional models dict for per-model context_length (backward compatible). - Nous Portal: suffix-matches bare IDs (claude-opus-4-6) against OpenRouter's prefixed IDs (anthropic/claude-opus-4.6) with dot/dash normalization. Handles all 15 current Nous models. - Anthropic direct: queries /v1/models for max_input_tokens. Only works with regular API keys (sk-ant-api*), not OAuth tokens. Falls through to models.dev for OAuth users. Tests: 5574 passed (18 new tests for models_dev + updated probe tiers) Docs: Updated configuration.md context length section, AGENTS.md Co-authored-by: Test <test@test.com>	2026-03-20 06:04:33 -07:00
Test	b1d05dfe8b	fix(openai): route api.openai.com to Responses API for GPT-5.x Based on PR #1859 by @magi-morph (too stale to cherry-pick, reimplemented). GPT-5.x models reject tool calls + reasoning_effort on /v1/chat/completions with a 400 error directing to /v1/responses. This auto-detects api.openai.com in the base URL and switches to codex_responses mode in three places: - AIAgent.__init__: upgrades chat_completions → codex_responses - _try_activate_fallback(): same routing for fallback model - runtime_provider.py: _detect_api_mode_for_url() for both custom provider and openrouter runtime resolution paths Also extracts _is_direct_openai_url() helper to replace the inline check in _max_tokens_param().	2026-03-20 05:09:41 -07:00
Test	4ad0083118	fix(honcho): read HONCHO_BASE_URL for local/self-hosted instances Cherry-picked from PR #2120 by @unclebumpy. - from_env() now reads HONCHO_BASE_URL and enables Honcho when base_url is set, even without an API key - from_global_config() reads baseUrl from config root with HONCHO_BASE_URL env var as fallback - get_honcho_client() guard relaxed to allow base_url without api_key for no-auth local instances - Added HONCHO_BASE_URL to OPTIONAL_ENV_VARS registry Result: Setting HONCHO_BASE_URL=http://localhost:8000 in ~/.hermes/.env now correctly routes the Honcho client to a local instance.	2026-03-20 04:36:06 -07:00
Teknium	d8081790f3	Merge pull request #2102 from NousResearch/hermes/hermes-6757a563 fix(tools,cli): normalise MCP schemas + expand session list columns	2026-03-19 19:06:56 -07:00
Test	2f07df3177	fix(cli): expand session list columns for full ID visibility Show complete session IDs in 'hermes sessions list' instead of truncating to 20 characters. Widens title column from 20→30 chars and adjusts header widths accordingly. Fixes #2068. Based on PR #2085 by @Nebula037 with a correction to preserve the no-titles layout (the original PR accidentally replaced the Preview/Src header with a duplicate Title/Preview header).	2026-03-19 18:17:28 -07:00
Teknium	6bcec1ac25	fix: resolve MiniMax 401 auth error by defaulting to anthropic_messages (#2103 ) MiniMax's default base URL was /v1 which caused runtime_provider to default to chat_completions mode (OpenAI-style Authorization: Bearer header). MiniMax rejects this with a 401 because they require the Anthropic-style x-api-key header. Changes: - auth.py: Change default inference_base_url for minimax and minimax-cn from /v1 to /anthropic - runtime_provider.py: Auto-correct stale /v1 URLs from existing .env files to /anthropic, and always default minimax/minimax-cn providers to anthropic_messages mode - Update tests to reflect new defaults, add tests for stale URL auto-correction and explicit api_mode override Based on PR #2100 by @devorun. Fixes #2094. Co-authored-by: Test <test@test.com>	2026-03-19 17:47:05 -07:00
Teknium	4c0c7f4c6e	fix: /model command — bare provider names, custom endpoint display Two issues with /model preventing proper provider switching: 1. Bare provider names not detected: typing '/model nous' treated 'nous' as a model name instead of triggering a provider switch. Fixed by adding step 0 in detect_provider_for_model() that checks if the input matches a known provider name/alias (excluding 'custom'/'openrouter' which need explicit model names) and returns that provider's default model. 2. Custom endpoint details hidden: /model (no args) showed '[custom]' with just a usage hint but no endpoint URL or model name. Now displays the configured base_url for custom providers in both CLI and gateway. Note: config base_url and OPENAI_BASE_URL are intentionally NOT cleared on provider switch — dedicated provider paths (nous, anthropic, codex) have their own credential resolution that ignores these, and clearing them would destroy the user's custom endpoint config, preventing switching back. Co-authored-by: Test <test@test.com>	2026-03-19 12:06:48 -07:00
Teknium	d76fa7fc37	fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051) * fix: detect context length for custom model endpoints via fuzzy matching + config override Custom model endpoints (non-OpenRouter, non-known-provider) were silently falling back to 2M tokens when the model name didn't exactly match what the endpoint's /v1/models reported. This happened because: 1. Endpoint metadata lookup used exact match only — model name mismatches (e.g. 'qwen3.5:9b' vs 'Qwen3.5-9B-Q4_K_M.gguf') caused a miss 2. Single-model servers (common for local inference) required exact name match even though only one model was loaded 3. No user escape hatch to manually set context length Changes: - Add fuzzy matching for endpoint model metadata: single-model servers use the only available model regardless of name; multi-model servers try substring matching in both directions - Add model.context_length config override (highest priority) so users can explicitly set their model's context length in config.yaml - Log an informative message when falling back to 2M probe, telling users about the config override option - Thread config_context_length through ContextCompressor and AIAgent init Tests: 6 new tests covering fuzzy match, single-model fallback, config override (including zero/None edge cases). * fix: auto-detect local model name and context length for local servers Cherry-picked from PR #2043 by sudoingX. - Auto-detect model name from local server's /v1/models when only one model is loaded (no manual model name config needed) - Add n_ctx_train and n_ctx to context length detection keys for llama.cpp - Query llama.cpp /props endpoint for actual allocated context (not just training context from GGUF metadata) - Strip .gguf suffix from display in banner and status bar - _auto_detect_local_model() in runtime_provider.py for CLI init Co-authored-by: sudo <sudoingx@users.noreply.github.com> * fix: revert accidental summary_target_tokens change + add docs for context_length config - Revert summary_target_tokens from 2500 back to 500 (accidental change during patching) - Add 'Context Length Detection' section to Custom & Self-Hosted docs explaining model.context_length config override --------- Co-authored-by: Test <test@test.com> Co-authored-by: sudo <sudoingx@users.noreply.github.com>	2026-03-19 06:01:16 -07:00
Teknium	7b6d14e62a	fix(gateway): replace bare text approval with /approve and /deny commands (#2002 ) The gateway approval system previously intercepted bare 'yes'/'no' text from the user's next message to approve/deny dangerous commands. This was fragile and dangerous — if the agent asked a clarify question and the user said 'yes' to answer it, the gateway would execute the pending dangerous command instead. (Fixes #1888) Changes: - Remove bare text matching ('yes', 'y', 'approve', 'ok', etc.) from _handle_message approval check - Add /approve and /deny as gateway-only slash commands in the command registry - /approve supports scoping: /approve (one-time), /approve session, /approve always (permanent) - Add 5-minute timeout for stale approvals - Gateway appends structured instructions to the agent response when a dangerous command is pending, telling the user exactly how to respond - 9 tests covering approve, deny, timeout, scoping, and verification that bare 'yes' no longer triggers execution Credit to @solo386 and @FlyByNight69420 for identifying and reporting this security issue in PR #1971 and issue #1888. Co-authored-by: Test <test@test.com>	2026-03-18 16:58:20 -07:00
Teknium	67d707e851	fix: respect config.yaml model.base_url for Anthropic provider (#1948) (#1998) After #1675 removed ANTHROPIC_BASE_URL env var support, the Anthropic provider base URL was hardcoded to https://api.anthropic.com. Now reads model.base_url from config.yaml as an override, falling back to the default when not set. Also applies to the auxiliary client. Cherry-picked from PR #1949 by @rivercrab26. Co-authored-by: rivercrab26 <rivercrab26@users.noreply.github.com>	2026-03-18 16:51:24 -07:00
Teknium	a7cc1cf309	fix: support Anthropic-compatible endpoints for third-party providers (#1997 ) Three bugs prevented providers like MiniMax from using their Anthropic-compatible endpoints (e.g. api.minimax.io/anthropic): 1. _VALID_API_MODES was missing 'anthropic_messages', so explicit api_mode config was silently rejected and defaulted to chat_completions. 2. API-key provider resolution hardcoded api_mode to 'chat_completions' without checking model config or detecting Anthropic-compatible URLs. 3. run_agent.py auto-detection only recognized api.anthropic.com, not third-party endpoints using the /anthropic URL convention. Fixes: - Add 'anthropic_messages' to _VALID_API_MODES - API-key providers now check model config api_mode and auto-detect URLs ending in /anthropic - run_agent.py and fallback logic detect /anthropic URL convention - 5 new tests covering all scenarios Users can now either: - Set MINIMAX_BASE_URL=https://api.minimax.io/anthropic (auto-detected) - Set api_mode: anthropic_messages in model config (explicit) - Use custom_providers with api_mode: anthropic_messages Co-authored-by: Test <test@test.com>	2026-03-18 16:26:06 -07:00
Teknium	f24db23458	fix: custom provider uses config base_url and api_key over env vars (#1760) (#1994) When provider: custom is set in config.yaml with base_url and api_key, those values are now used instead of falling back to OPENAI_BASE_URL and OPENAI_API_KEY env vars. Also reads the 'api' field as an alternative to 'api_key' for config compatibility. Cherry-picked from PR #1762 by crazywriter1. Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>	2026-03-18 16:00:14 -07:00
Test	e7844e9c8d	Merge origin/main, resolve conflicts (self._base_url_lower)	2026-03-18 04:09:00 -07:00
Test	36921a3e98	fix: correct Copilot API mode selection to match opencode The previous copilot_model_api_mode() checked the catalog's supported_endpoints first and picked /chat/completions when a model supported both endpoints. This is wrong — GPT-5+ models should use the Responses API even when the catalog lists both. Replicate opencode's shouldUseCopilotResponsesApi() logic: - GPT-5+ models (gpt-5.4, gpt-5.3-codex, etc.) → Responses API - gpt-5-mini → Chat Completions (explicit exception) - Everything else (gpt-4o, claude, gemini, etc.) → Chat Completions - Model ID pattern is the primary signal, catalog is secondary The catalog fallback now only matters for non-GPT-5 models that might exclusively support /v1/messages (e.g. Claude via Copilot). Models are auto-detected from the live catalog at api.githubcopilot.com/models — no hardcoded list required for supported models, only a static fallback for when the API is unreachable.	2026-03-18 03:54:50 -07:00
Test	c1750bb32d	feat(cli): add /statusbar command to toggle context bar Adds /statusbar (alias /sb) to show/hide the bottom status bar that displays model name, context usage, and session duration. Uses ConditionalContainer so the bar takes zero space when hidden rather than leaving a blank line.	2026-03-18 03:49:49 -07:00
Test	b05f9b6256	chore: reorder OpenRouter catalog — glm-5-turbo under glm-5, minimax under stepfun	2026-03-18 03:31:04 -07:00
Test	cb54750e07	feat: reorder OpenRouter catalog, add haiku-4.5, fix minimax slug - Add anthropic/claude-haiku-4.5 - Move gpt-5.4-pro and gpt-5.4-nano to bottom - Fix minimax/minimax-m2.7 → minimax-m2.5 (m2.7 not on OpenRouter) - Tag hunter-alpha and healer-alpha as free - Place hunter/healer-alpha right below gpt-5.4-mini	2026-03-18 03:26:06 -07:00
Test	21c45ba0ac	feat: proper Copilot auth with OAuth device code flow and token validation Builds on PR #1879's Copilot integration with critical auth improvements modeled after opencode's implementation: - Add hermes_cli/copilot_auth.py with: - OAuth device code flow (copilot_device_code_login) using the same client_id (Ov23li8tweQw6odWQebz) as opencode and Copilot CLI - Token type validation: reject classic PATs (ghp_*) with a clear error message explaining supported token types - Proper env var priority: COPILOT_GITHUB_TOKEN > GH_TOKEN > GITHUB_TOKEN (matching Copilot CLI documentation) - copilot_request_headers() with Openai-Intent, x-initiator, and Copilot-Vision-Request headers (matching opencode) - Update auth.py: - PROVIDER_REGISTRY copilot entry uses correct env var order - _resolve_api_key_provider_secret delegates to copilot_auth for the copilot provider with proper token validation - Update models.py: - copilot_default_headers() now includes Openai-Intent and x-initiator - Update main.py: - _model_flow_copilot offers OAuth device code login when no token is found, with manual token entry as fallback - Shows supported vs unsupported token types - 22 new tests covering token validation, env var priority, header generation, and integration with existing auth infrastructure	2026-03-18 03:25:58 -07:00
Teknium	050b43108c	feat: add gpt-5.4-mini, gpt-5.4-nano, healer-alpha to OpenRouter catalog (#1913) feat: add gpt-5.4-mini, gpt-5.4-nano, healer-alpha to OpenRouter catalog	2026-03-18 03:23:36 -07:00
Test	00cc0c6a28	feat: add gpt-5.4-mini, gpt-5.4-nano, healer-alpha to OpenRouter catalog	2026-03-18 03:23:20 -07:00
Test	f814787144	fix(banner): normalize toolset labels and use skin colors - Strip '_tools' suffix from internal toolset identifiers in the banner (e.g. 'web_tools' -> 'web', 'homeassistant_tools' -> 'homeassistant') - Stop appending '_tools' to unavailable toolset names - Replace 6 hardcoded hex colors (#B8860B, #FFBF00, #FFF8DC) in toolset rows, overflow line, and MCP server rows with the skin variables (dim, accent, text) already resolved at the top of the function Inspired by PR #1871 by @kshitijk4poor. Adds 4 tests.	2026-03-18 03:22:58 -07:00
Teknium	b70dd51cfa	fix: disabled skills respected across banner, system prompt, slash commands, and skill_view (#1897 ) * fix: banner skill count now respects disabled skills and platform filtering The banner's get_available_skills() was doing a raw rglob scan of ~/.hermes/skills/ without checking: - Whether skills are disabled (skills.disabled config) - Whether skills match the current platform (platforms: frontmatter) This caused the banner to show inflated skill counts (e.g. '100 skills' when many are disabled) and list macOS-only skills on Linux. Fix: delegate to _find_all_skills() from tools/skills_tool which already handles both platform gating and disabled-skill filtering. * fix: system prompt and slash commands now respect disabled skills Two more places where disabled skills were still surfaced: 1. build_skills_system_prompt() in prompt_builder.py — disabled skills appeared in the <available_skills> system prompt section, causing the agent to suggest/load them despite being disabled. 2. scan_skill_commands() in skill_commands.py — disabled skills still registered as /skill-name slash commands in CLI help and could be invoked. Both now load _get_disabled_skill_names() and filter accordingly. * fix: skill_view blocks disabled skills skill_view() checked platform compatibility but not disabled state, so the agent could still load and read disabled skills directly. Now returns a clear error when a disabled skill is requested, telling the user to enable it via hermes skills or inspect the files manually. --------- Co-authored-by: Test <test@test.com>	2026-03-18 03:17:37 -07:00
TheSameCat2	5c4c4b8b7d	fix(gateway): detect script-style gateway processes for --replace Recognize hermes_cli/main.py gateway command lines in gateway process detection and PID validation so --replace reliably finds existing gateway instances. Adds a regression test covering script-style cmdline detection. Closes #1830	2026-03-18 03:12:59 -07:00
Teknium	11f029c311	fix(tts): document NeuTTS provider and align install guidance (#1903) Co-authored-by: charles-édouard <59705750+ccbbccbb@users.noreply.github.com>	2026-03-18 02:55:30 -07:00
Test	ace2cc6257	fix(gateway): PID-based wait with force-kill for gateway restart Add _wait_for_gateway_exit() that polls get_running_pid() to confirm the old gateway process has actually exited before starting a new one. If the process doesn't exit within 5s, sends SIGKILL to the specific PID. Uses the saved PID from gateway.pid (not launchd labels) so it works correctly with multiple gateway instances under separate HERMES_HOME directories. Applied to both launchd_restart() and the manual restart path (replaces the blind time.sleep(2)). Inspired by PR #1881 by @AzothZephyr (race condition diagnosis). Adds 4 tests.	2026-03-18 02:54:18 -07:00
octo-patch	e4043633fc	feat: upgrade MiniMax default to M2.7 + add new OpenRouter models MiniMax: Add M2.7 and M2.7-highspeed as new defaults across provider model lists, auxiliary client, metadata, setup wizard, RL training tool, fallback tests, and docs. Retain M2.5/M2.1 as alternatives. OpenRouter: Add grok-4.20-beta, nemotron-3-super-120b-a12b:free, trinity-large-preview:free, glm-5-turbo, and hunter-alpha to the model catalog. MiniMax changes based on PR #1882 by @octo-patch (applied manually due to stale conflicts in refactored pricing module).	2026-03-18 02:42:58 -07:00
max	0c392e7a87	feat: integrate GitHub Copilot providers across Hermes Add first-class GitHub Copilot and Copilot ACP provider support across model selection, runtime provider resolution, CLI sessions, delegated subagents, cron jobs, and the Telegram gateway. This also normalizes Copilot model catalogs and API modes, introduces a Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by resolving Homebrew-installed gh binaries under launchd. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-17 23:40:22 -07:00
Test	a71e3f4d98	fix: add /browser to COMMAND_REGISTRY so it shows in help and autocomplete The /browser command handler existed in cli.py but was never added to COMMAND_REGISTRY after the centralized command registry refactor. This meant: - /browser didn't appear in /help - No tab-completion or subcommand suggestions - Dispatch used _base_word fallback instead of canonical resolution Added CommandDef with connect/disconnect/status subcommands and switched dispatch to use canonical instead of _base_word.	2026-03-17 13:29:36 -07:00
Teknium	dd60bcbfb7	feat: OpenAI-compatible API server + WhatsApp configurable reply prefix (#1756) * feat: OpenAI-compatible API server platform adapter Salvaged from PR #956, updated for current main. Adds an HTTP API server as a gateway platform adapter that exposes hermes-agent via the OpenAI Chat Completions and Responses APIs. Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat, AnythingLLM, NextChat, ChatBox, etc.) can connect by pointing at http://localhost:8642/v1. Endpoints: - POST /v1/chat/completions — stateless Chat Completions API - POST /v1/responses — stateful Responses API with chaining - GET /v1/responses/{id} — retrieve stored response - DELETE /v1/responses/{id} — delete stored response - GET /v1/models — list hermes-agent as available model - GET /health — health check Features: - Real SSE streaming via stream_delta_callback (uses main's streaming) - In-memory LRU response store for Responses API conversation chaining - Named conversations via 'conversation' parameter - Bearer token auth (optional, via API_SERVER_KEY) - CORS support for browser-based frontends - System prompt layering (frontend system messages on top of core) - Real token usage tracking in responses Integration points: - Platform.API_SERVER in gateway/config.py - _create_adapter() branch in gateway/run.py - API_SERVER_* env vars in hermes_cli/config.py - Env var overrides in gateway/config.py _apply_env_overrides() Changes vs original PR #956: - Removed streaming infrastructure (already on main via stream_consumer.py) - Removed Telegram reply_to_mode (separate feature, not included) - Updated _resolve_model() -> _resolve_gateway_model() - Updated stream_callback -> stream_delta_callback - Updated connect()/disconnect() to use _mark_connected()/_mark_disconnected() - Adapted to current Platform enum (includes MATTERMOST, MATRIX, DINGTALK) Tests: 72 new tests, all passing Docs: API server guide, Open WebUI integration guide, env var reference * feat(whatsapp): make reply prefix configurable via config.yaml Reworked from PR #1764 (ifrederico) to use config.yaml instead of .env. The WhatsApp bridge prepends a header to every outgoing message. This was hardcoded to '⚕ Hermes Agent'. Users can now customize or disable it via config.yaml: whatsapp: reply_prefix: '' # disable header reply_prefix: '🤖 My Bot\n───\n' # custom prefix How it works: - load_gateway_config() reads whatsapp.reply_prefix from config.yaml and stores it in PlatformConfig.extra['reply_prefix'] - WhatsAppAdapter reads it from config.extra at init - When spawning bridge.js, the adapter passes it as WHATSAPP_REPLY_PREFIX in the subprocess environment - bridge.js handles undefined (default), empty (no header), or custom values with \\n escape support - Self-chat echo suppression uses the configured prefix Also fixes _config_version: was 9 but ENV_VARS_BY_VERSION had a key 10 (TAVILY_API_KEY), so existing users at v9 would never be prompted for Tavily. Bumped to 10 to close the gap. Added a regression test to prevent this from happening again. Credit: ifrederico (PR #1764) for the bridge.js implementation and the config version gap discovery. --------- Co-authored-by: Test <test@test.com>	2026-03-17 10:44:37 -07:00
Teknium	088d65605a	fix: NameError in OpenCode provider setup (prompt_text -> prompt) (#1779) The OpenCode Zen and OpenCode Go setup sections used prompt_text() which is undefined. All other providers correctly use the local prompt() function defined in setup.py. Fixes crash during 'hermes setup' when selecting either OpenCode provider.	2026-03-17 10:30:16 -07:00
teknium1	c881209b92	Revert "feat(cli): skin-aware light/dark theme mode with terminal auto-detection" This reverts commit `a1c81360a5`.	2026-03-17 10:04:53 -07:00
Teknium	df74f86955	Merge pull request #1767 from sai-samarth/fix/systemd-node-path-whatsapp Clean fix for nvm/non-standard Node.js paths in systemd units. Merges cleanly.	2026-03-17 09:41:39 -07:00
sai-samarth	b8eb7c5fed	fix(gateway): include resolved node path in systemd unit	2026-03-17 15:11:28 +00:00
Teknium	af118501b9	Merge pull request #1733 from NousResearch/fix/defensive-hardening fix: defensive hardening — logging, dedup, locks, dead code	2026-03-17 04:46:20 -07:00
Teknium	d1d17f4f0a	feat(compression): add summary_base_url + move compression config to YAML-only - Add summary_base_url config option to compression block for custom OpenAI-compatible endpoints (e.g. zai, DeepSeek, Ollama) - Remove compression env var bridges from cli.py and gateway/run.py (CONTEXT_COMPRESSION_* env vars no longer set from config) - Switch run_agent.py to read compression config directly from config.yaml instead of env vars - Fix backwards-compat block in _resolve_task_provider_model to also fire when auxiliary.compression.provider is 'auto' (DEFAULT_CONFIG sets this, which was silently preventing the compression section's summary_* keys from being read) - Add test for summary_base_url config-to-client flow - Update docs to show compression as config.yaml-only Closes #1591 Based on PR #1702 by @uzaylisak	2026-03-17 04:46:15 -07:00
teknium1	847ee20390	fix: defensive hardening — logging, dedup, locks, dead code Four small fixes: 1. model_tools.py: Tool import failures logged at WARNING instead of DEBUG. If a tool module fails to import (syntax error, missing dep), the user now sees a warning instead of the tool silently vanishing. 2. hermes_cli/config.py: Remove duplicate 'import sys' (lines 19, 21). 3. agent/model_metadata.py: Remove 6 duplicate entries in DEFAULT_CONTEXT_LENGTHS dict. Python keeps the last value, so no functional change, but removes maintenance confusion. 4. hermes_state.py: Add missing self._lock to the LIKE query in resolve_session_id(). The exact-match path used get_session() (which locks internally), but the prefix fallback queried _conn without the lock.	2026-03-17 04:31:26 -07:00
teknium1	0897e4350e	merge: resolve conflicts with origin/main	2026-03-17 04:30:37 -07:00
Teknium	d2b10545db	feat(web): add Tavily as web search/extract/crawl backend (#1731) Salvage of PR #1707 by @kshitijk4poor (cherry-picked with authorship preserved). Adds Tavily as a third web backend alongside Firecrawl and Parallel, using the Tavily REST API via httpx. - Backend selection via hermes tools → saved as web.backend in config.yaml - All three tools supported: search, extract, crawl - TAVILY_API_KEY in config registry, doctor, status, setup wizard - 15 new Tavily tests + 9 backend selection tests + 5 config tests - Backward compatible Closes #1707	2026-03-17 04:28:03 -07:00
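
Note on e140c02d51 (webhook adapter): the commit message mentions HMAC signature validation in the GitHub SHA-256 style. A minimal sketch of that kind of check follows, assuming the signature arrives as a "sha256=<hex digest>" string, as GitHub sends it in the X-Hub-Signature-256 header. The function name verify_github_signature is hypothetical and is not taken from gateway/platforms/webhook.py; the actual adapter may be structured differently.

```python
import hashlib
import hmac


def verify_github_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Check a GitHub-style 'sha256=<hex>' signature against the raw request body.

    Hypothetical helper for illustration; not the adapter's real API.
    """
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    # Compute the HMAC-SHA256 of the raw body with the per-route secret.
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    received = header_value[len(prefix):]
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The signature must be computed over the raw bytes of the request body, before any JSON parsing, since re-serializing the payload can change whitespace and break the match.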


548 Commits
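
Note on 88643a1ba9 (context length detection): the commit message says the hermes model prompt accepts shorthand such as 32k and 128K for context_length. A minimal sketch of such a parser follows. The function name parse_context_length is hypothetical, and whether "k" means 1000 or 1024 is not stated in the log, so the multiplier is exposed as a parameter with 1000 as an assumed default.

```python
import re
from typing import Optional

# Digits with an optional k/m suffix, e.g. "8192", "32k", "128K", "1m".
_SHORTHAND = re.compile(r"^\s*(\d+)\s*([kKmM]?)\s*$")


def parse_context_length(text: str, unit: int = 1000) -> Optional[int]:
    """Parse '32k' / '128K' / '8192' into a token count, or None if invalid.

    Hypothetical helper; `unit` (tokens per 'k') is an assumption, not a
    value confirmed by the commit log.
    """
    match = _SHORTHAND.match(text)
    if match is None:
        return None
    number = int(match.group(1))
    suffix = match.group(2).lower()
    scale = {"": 1, "k": unit, "m": unit * unit}[suffix]
    value = number * scale
    # Treat zero as invalid so callers can fall back to auto-detection.
    return value if value > 0 else None
```

Returning None rather than raising lets a caller fall through to the next step of a resolution chain when the user leaves the prompt blank or types something unparseable.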