hermes-agent

Author	SHA1	Message	Date
Google AI Agent	e208885de6	feat: wire telemetry hooks into auxiliary client	2026-04-22 13:54:32 +00:00
Google AI Agent	cd84fa2084	feat: add telemetry logger for token accounting	2026-04-22 13:54:30 +00:00
Google AI Agent	f3d88ec31d	Merge pull request '[claude] Wire Gemma 4 vision into browser_tool for screenshot analysis (#816)' (#947) from claude/issue-816 into main	2026-04-22 13:36:20 +00:00
Google AI Agent	d6ec32fe93	Merge pull request 'feat: implement SHIELD Multilingual Defense & Input Sanitization' (#918) from feat/shield-multilingual-1776700482647 into main	2026-04-22 13:36:05 +00:00
Alexander Whitestone	17cc4bac90	feat: complete Gemma 4 browser_vision wiring — task routing, timeout, tests Building on the Gemma 4 default already on this branch: - Change call_llm() task from "vision" to "browser_vision" in browser_vision() so auxiliary.browser_vision.* config is consulted for provider/model/timeout - Route call_llm(task="browser_vision") through the vision provider resolution path in auxiliary_client.py (same as task="vision") - Fix timeout resolution: check auxiliary.browser_vision.timeout before auxiliary.vision.timeout (allows browser-specific timeout override) - Add timeout option to auxiliary.browser_vision in cli-config.yaml.example - Add test_browser_vision_gemma4.py covering: task routing assertions, call_llm() vision branch routing, and timeout config key ordering Refs #816 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 19:43:42 -04:00
Alexander Whitestone	4214082fb6	feat: A2A auth — mutual TLS between fleet agents Implements mTLS for securing agent-to-agent communication in the Hermes fleet. Fixes #806. Changes: - scripts/gen_fleet_ca.sh: generate a self-signed Fleet CA (4096-bit RSA, 10-year validity) that signs all agent certificates - scripts/gen_agent_cert.sh: generate per-agent certs (Timmy, Allegro, Ezra) signed by the fleet CA with SAN entries and clientAuth/serverAuth extended key usage - agent/mtls.py: new module providing: - build_server_ssl_context() — TLS_SERVER context with CERT_REQUIRED, enforces client cert against Fleet CA - build_client_ssl_context() — TLS_CLIENT context for outbound A2A calls - MTLSMiddleware — ASGI middleware that rejects unauthenticated requests to A2A routes (/.well-known/agent-card, /api/agent-card, /a2a/) with HTTP 403 when mTLS is enabled - is_mtls_configured() — checks HERMES_MTLS_CERT/KEY/CA env vars - hermes_cli/web_server.py: wire MTLSMiddleware into the FastAPI app; pass SSL context to uvicorn when HERMES_MTLS_ env vars are set so the server runs TLS with mandatory client cert verification - ansible/roles/hermes_mtls/: Ansible role to distribute Fleet CA cert, agent cert, and agent key to fleet nodes; writes an env file with HERMES_MTLS_* vars and restarts the hermes-gateway service - ansible/fleet_mtls.yml: fleet-wide playbook referencing the role for Timmy, Allegro, and Ezra nodes - tests/test_mtls.py: 15 tests covering is_mtls_configured, SSL context creation with real cryptography-generated certs, and MTLSMiddleware (unauthorized agent rejected → 403, authorized agent accepted → 200) mTLS is opt-in: set HERMES_MTLS_CERT, HERMES_MTLS_KEY, and HERMES_MTLS_CA to enable. When unset, the server behaves exactly as before. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 18:04:00 -04:00
Alexander Whitestone	ac28444bf2	feat: add A2AMTLSServer routing API, A2AMTLSClient, and expand tests to 20 (#806) Builds on the existing A2AServer / build_*_ssl_context foundation: - agent/a2a_mtls.py: - Add A2AMTLSServer: routing-based HTTPS server with add_route() and context-manager (__enter__/__exit__) lifecycle support - Add A2AMTLSClient: fleet-cert-presenting HTTP client with .get() / .post() - Widen imports (json, Callable, Dict, urlopen) - tests/agent/test_a2a_mtls.py: - Fix datetime.utcnow() deprecation — use datetime.now(timezone.utc) - Add TestA2AMTLSServerAndClient (9 tests): routing GET/POST, 404, context-manager stop, rogue-cert rejection, A2AMTLSClient, concurrency - Total: 11 → 20 passing tests Refs #806	2026-04-21 15:21:10 -04:00
Alexander Whitestone	91faf6f956	feat: A2A auth — mutual TLS between fleet agents Implements mutual TLS for secure agent-to-agent communication (#806). - scripts/gen_fleet_ca.sh: generate fleet CA (4096-bit RSA, 10-year) - scripts/gen_agent_cert.sh: per-agent cert signed by fleet CA (timmy, allegro, ezra) - agent/a2a_mtls.py: A2AServer requiring client cert verification (CERT_REQUIRED), build_server_ssl_context / build_client_ssl_context helpers, server_from_env() - ansible/roles/fleet_mtls_certs/: distribute CA + per-agent certs to fleet nodes, write /etc/hermes/a2a.env, notify hermes-a2a service on change - ansible/fleet_mtls.yml + ansible/inventory/fleet.ini.example: playbook + example inventory - tests/agent/test_a2a_mtls.py: 11 tests — authorized agent accepted (200/202), self-signed cert rejected, no-cert rejected, lifecycle, env-var wiring Fixes #806 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 13:28:28 -04:00
Alexander Whitestone	46668505bc	Merge pull request 'feat: tool fixation detection — break repetitive loops (#886)' (#914) from fix/886 into main	2026-04-21 15:35:08 +00:00
Alexander Whitestone	cac0c8224e	Merge pull request 'fix: circuit breaker for error cascading (2.33x amplification)' (#927) from fix/885-circuit-breaker into main	2026-04-21 15:35:04 +00:00
Alexander Whitestone	690d100afc	Merge pull request 'feat: Poka-yoke token budget — progressive context overflow guard (#925)' (#943) from burn/925-1776770102 into main	2026-04-21 15:29:02 +00:00
Alexander Whitestone	bc55f40505	Merge pull request 'feat: time-aware model routing for cron jobs (#889)' (#909) from fix/889 into main	2026-04-21 15:26:43 +00:00
Alexander Whitestone	2adc72335e	Merge pull request 'fix: profile session isolation — tag and filter by profile' (#907) from fix/891-profile-isolation into main	2026-04-21 15:26:39 +00:00
Alexander Whitestone	719bb537c0	Merge pull request 'feat: provider preflight validation before session start (#924)' (#932) from fix/924 into main	2026-04-21 15:23:02 +00:00
Alexander Whitestone	27d2f2ca0e	Merge pull request 'feat: Prevent context window overflow via proactive token counting (#838)' (#905) from fix/838-1776402240 into main	2026-04-21 15:22:31 +00:00
Alexander Whitestone	8ac26f54a5	feat: token budget with progressive poka-yoke thresholds (#925)	2026-04-21 11:40:39 +00:00
Alexander Whitestone	bdd0f2709b	feat: provider preflight validation before session start (#924)	2026-04-21 04:48:57 +00:00
Alexander Whitestone	ccaa1cb021	feat: circuit breaker for error cascading Closes #885 2.33x error cascade factor detected. After 3 consecutive errors, circuit opens and agent must take corrective action. Recovery pattern: terminal is the safety net (2300 recoveries).	2026-04-21 00:28:14 +00:00
Google AI Agent	3d8cf5122a	feat: add agent/shield.py for SHIELD defense	2026-04-20 15:54:48 +00:00
Google AI Agent	9a749d2854	feat: add agent/input_sanitizer.py for SHIELD defense	2026-04-20 15:54:45 +00:00
Alexander Whitestone	ab968e910c	feat: tool fixation detection — break repetitive loops (#886) Marathon sessions show tool fixation: agent latches onto one tool and calls it repeatedly. Observed streaks of 8-25 identical calls. New agent/tool_fixation_detector.py: - ToolFixationDetector: tracks consecutive tool calls - record(tool_name): returns nudge prompt when threshold reached - Default threshold: 5 consecutive calls (configurable via TOOL_FIXATION_THRESHOLD env var) - Nudge prompt explains the fixation and suggests alternatives: 1. Read error carefully 2. Try different tool 3. Ask user for clarification 4. Check if task is complete - get_streak_info(): current streak state - format_report(): human-readable fixation events - Singleton via get_fixation_detector() Config: - TOOL_FIXATION_THRESHOLD (default: 5) - TOOL_FIXATION_WINDOW (default: 10) Tests: tests/test_tool_fixation_detector.py (9 tests) Closes #886	2026-04-17 01:57:37 -04:00
Alexander Whitestone	0b72884750	feat: time-aware model routing for cron jobs (#889) Error rate peaks at 18:00 (9.4%) during evening cron batches vs 4.0% at 09:00 during interactive work. Route cron tasks to stronger models during off-hours when user is not present to correct errors. New agent/time_aware_routing.py: - resolve_time_aware_model(): routes based on hour, error rate, task type - Interactive sessions: always use base model (user corrects errors) - Cron during business hours: use base model (low error rate) - Cron during off-hours with high error rate (>6%): upgrade to strong model - get_hour_error_rate(): error rates by hour from empirical audit - is_off_hours(): 18:00-05:59 = off-hours - RoutingDecision: model, provider, reason, hour, error_rate - get_routing_report(): 24h forecast of routing decisions Config via env vars: - CRON_STRONG_MODEL (default: xiaomi/mimo-v2-pro) - CRON_CHEAP_MODEL (default: qwen2.5:7b) - CRON_ERROR_THRESHOLD (default: 6.0%) Tests: tests/test_time_aware_routing.py (9 tests) Closes #889	2026-04-17 01:15:09 -04:00
Alexander Whitestone	b5ba272efe	feat: profile session isolation Closes #891 Tags sessions with originating profile and provides filtered access so profiles cannot see each other's data.	2026-04-17 05:13:01 +00:00
Alexander Whitestone	e3436e36c3	feat: Add context budget tracker for overflow prevention (#838)	2026-04-17 05:06:08 +00:00
Timmy Time	dbabe0e6ae	feat: 988 Suicide & Crisis Lifeline integration (#673) agent/crisis_resources.py provides all 988 Lifeline contact methods: phone (988), text (HOME to 988), chat, Spanish line. Also Crisis Text Line (741741) and 911. Closes #673	2026-04-17 05:04:48 +00:00
Hermes Merge Bot	dff451081d	Merge PR #856	2026-04-16 02:05:42 -04:00
Hermes Merge Bot	05086e58ea	Merge PR #871	2026-04-16 02:00:55 -04:00
Alexander Whitestone	e63cdaf16f	feat: self-modifying agent that improves its own prompts (#813) Resolves #813. Agent analyzes session transcripts for failure patterns and generates prompt patches to prevent future failures. agent/self_modify.py (PromptLearner class): - analyze_session(): detects 5 failure types from transcripts: retry_loop, timeout, hallucination, context_loss, tool_failure - generate_patches(): converts patterns to prompt patches with confidence scoring (frequency-based) - apply_patches(): appends learned rules to system prompt with backup and rollback support - learn_from_session(): full cycle analyze → patch → apply Failures → patterns → patches → improved prompts → fewer failures. Safety: patches only ADD rules (append-only), never remove. Rollback: restores from timestamped backup.	2026-04-16 01:23:48 -04:00
Google AI Agent	a474eb8459	fix: add agent/agent_card.py for agent card discovery	2026-04-16 03:45:01 +00:00
Alexander Whitestone	13ef670c05	feat: session compaction with fact extraction (#748) Before compressing conversation context, extract durable facts (user preferences, corrections, project details) and save to fact store so they survive compression. New agent/session_compactor.py: - extract_facts_from_messages(): scans user messages for preferences, corrections, project/infra facts using regex - 3 pattern categories: user_pref (5 patterns), correction (3 patterns), project (4 patterns) - ExtractedFact: category, entity, content, confidence, source_turn - save_facts_to_store(): saves to fact store (callback or auto-detect) - extract_and_save_facts(): one-call extraction + persistence - Deduplication by category+content - Skips tool results, short messages, system messages - format_facts_summary(): human-readable summary Tests: tests/test_session_compactor.py (9 tests) Closes #748	2026-04-15 22:41:54 -04:00
Alexander Whitestone	435d790201	feat: add agent/privacy_filter.py from PR #397	2026-04-16 01:35:14 +00:00
Google AI Agent	dfe23f66b1	feat: add ToolOrchestrator with circuit breaker	2026-04-15 15:49:00 +00:00
Teknium	722331a57d	fix: replace hardcoded ~/.hermes with display_hermes_home() in agent-facing text (#10285) Tool schema descriptions and tool return values contained hardcoded ~/.hermes paths that the model sees and uses. When HERMES_HOME is set to a custom path (Docker containers, profiles), the agent would still reference ~/.hermes — looking at the wrong directory. Fixes 6 locations across 5 files: - tools/tts_tool.py: output_path schema description - tools/cronjob_tools.py: script path schema description - tools/skill_manager_tool.py: skill_manage schema description - tools/skills_tool.py: two tool return messages - agent/skill_commands.py: skill config injection text All now use display_hermes_home() which resolves to the actual HERMES_HOME path (e.g. /opt/data for Docker, ~/.hermes/profiles/X for profiles, ~/.hermes for default). Reported by: Sandeep Narahari (PrithviDevs)	2026-04-15 04:57:55 -07:00
Teknium	772cfb6c4e	fix: stale agent timeout, uv venv detection, empty response after tools, compression model fallback (#9051, #8620, #9400) (#10093) Four independent fixes: 1. Reset activity timestamp on cached agent reuse (#9051) When the gateway reuses a cached AIAgent for a new turn, the _last_activity_ts from the previous turn (possibly hours ago) carried over. The inactivity timeout handler immediately saw the agent as idle for hours and killed it. Fix: reset _last_activity_ts, _last_activity_desc, and _api_call_count when retrieving an agent from the cache. 2. Detect uv-managed virtual environments (#8620 sub-issue 1) The systemd unit generator fell back to sys.executable (uv's standalone Python) when running under 'uv run', because sys.prefix == sys.base_prefix. The generated ExecStart pointed to a Python binary without site-packages. Fix: check VIRTUAL_ENV env var before falling back to sys.executable. uv sets VIRTUAL_ENV even when sys.prefix doesn't reflect the venv. 3. Nudge model to continue after empty post-tool response (#9400) Weaker models sometimes return empty after tool calls. The agent silently abandoned the remaining work. Fix: append assistant('(empty)') + user nudge message and retry once. Resets after each successful tool round. 4. Compression model fallback on permanent errors (#8620 sub-issue 4) When the default summary model (gemini-3-flash) returns 503 'model_not_found' on custom proxies, the compressor entered a 600s cooldown, leaving context growing unbounded. Fix: detect permanent model-not-found errors (503, 404, 'model_not_found', 'no available channel') and fall back to using the main model for compression instead of entering cooldown. One-time fallback with immediate retry. Test plan: 40 compressor tests + 97 gateway/CLI tests + 9 venv tests pass	2026-04-14 22:38:17 -07:00
kshitijk4poor	9855190f23	feat(compressor): smart collapse, dedup, anti-thrashing, template upgrade, hardening Combined salvage of PRs #9661, #9663, #9674, #9677, #9678 by kshitijk4poor. - Smart tool output collapse: informative 1-line summaries replace generic placeholder - Dedup identical tool results via MD5 hash, truncate large tool_call arguments - Anti-thrashing: skip compression after 2 consecutive <10% savings passes - Structured action-log summary template with numbered actions and Active State - Hardening: max_tokens 1.3x cap, multimodal safety, note idempotency, adaptive cooldown Follow-up fixes applied during salvage: - web_extract: reads 'urls' (list) not 'url' (original PR bug) - Multimodal list content guards in dedup and prune passes - Kept 'Relevant Files' section in template (original PR removed it) Skipped PRs #9665 (user msg preservation — duplication risk) and #9675 (dead code).	2026-04-14 22:21:25 -07:00
Julien Talbot	3b50821555	feat(xai): add xAI/Grok to provider prefix stripping Add 'xai', 'x-ai', 'x.ai', 'grok' to _PROVIDER_PREFIXES so that colon-prefixed model names (e.g. xai:grok-4.20) are stripped correctly for context length lookups. Cherry-picked from PR #9184 by @Julientalbot.	2026-04-14 16:43:42 -07:00
Teknium	6448e1da23	feat(zai): add GLM-5V-Turbo support for coding plan (#9907) - Add glm-5v-turbo to OpenRouter, Nous, and native Z.AI model lists - Add glm-5v context length entry (200K tokens) to model metadata - Update Z.AI endpoint probe to try multiple candidate models per endpoint (glm-5.1, glm-5v-turbo, glm-4.7) — fixes detection for newer coding plan accounts that lack older models - Add zai to _PROVIDER_VISION_MODELS so auxiliary vision tasks (vision_analyze, browser screenshots) route through 5v Fixes #9888	2026-04-14 16:26:01 -07:00
Teknium	a37a095980	fix: detect qwen-oauth provider via CLI tokens in /model picker Seed qwen-oauth credentials from resolve_qwen_runtime_credentials() in _seed_from_singletons(). Users who authenticate via 'qwen auth qwen-oauth' store tokens in ~/.qwen/oauth_creds.json which the runtime resolver reads but the credential pool couldn't detect — same gap pattern as copilot. Uses refresh_if_expiring=False to avoid network calls during discovery.	2026-04-14 11:16:26 -07:00
Marvae	0bd3f521ae	fix: detect copilot provider via gh auth token in /model picker Seed copilot credentials from resolve_copilot_token() in the credential pool's _seed_from_singletons(), alongside the existing anthropic and openai-codex seeding logic. This makes copilot appear in the /model provider picker when the user authenticates solely through gh auth token. Cherry-picked from PR #9767 by Marvae.	2026-04-14 11:16:26 -07:00
N0nb0at	b21b3bfd68	feat(plugins): namespaced skill registration for plugin skill bundles Add ctx.register_skill() API so plugins can ship SKILL.md files under a 'plugin:skill' namespace, preventing name collisions with built-in Hermes skills. skill_view() detects the ':' separator and routes to the plugin registry while bare names continue through the existing flat-tree scan unchanged. Key additions: - agent/skill_utils: parse_qualified_name(), is_valid_namespace() - hermes_cli/plugins: PluginContext.register_skill(), PluginManager skill registry (find/list/remove) - tools/skills_tool: qualified name dispatch in skill_view(), _serve_plugin_skill() with full guards (disabled, platform, injection scan), bundle context banner with sibling listing, stale registry self-heal - Hoisted _INJECTION_PATTERNS to module level (dedup) - Updated skill_view schema description Based on PR #9334 by N0nb0at. Lean P1 salvage — omits autogen shim (P2) for a simpler first merge. Closes #8422	2026-04-14 10:42:58 -07:00
walli	884cd920d4	feat(gateway): unify QQBot branding, add PLATFORM_HINTS, fix streaming, restore missing setup functions - Rename platform from 'qq' to 'qqbot' across all integration points (Platform enum, toolset, config keys, import paths, file rename qq.py → qqbot.py) - Add PLATFORM_HINTS for QQBot in prompt_builder (QQ supports markdown) - Set SUPPORTS_MESSAGE_EDITING = False to skip streaming on QQ (prevents duplicate messages from non-editable partial + final sends) - Add _send_qqbot() standalone send function for cron/send_message tool - Add interactive _setup_qq() wizard in hermes_cli/setup.py - Restore missing _setup_signal/email/sms/dingtalk/feishu/wecom/wecom_callback functions that were lost during the original merge	2026-04-14 00:11:49 -07:00
Kenny Xie	cdd44817f2	fix(anthropic): send fast mode speed via extra_body	2026-04-13 22:32:39 -07:00
Teknium	943c01536f	feat: add openrouter/elephant-alpha to curated model lists (#9378) * Add hermes debug share instructions to all issue templates - bug_report.yml: Add required Debug Report section with hermes debug share and /debug instructions, make OS/Python/Hermes version optional (covered by debug report), demote old logs field to optional supplementary - setup_help.yml: Replace hermes doctor reference with hermes debug share, add Debug Report section with fallback chain (debug share -> --local -> doctor) - feature_request.yml: Add optional Debug Report section for environment context All templates now guide users to run hermes debug share (or /debug in chat) and paste the resulting paste.rs links, giving maintainers system info, config, and recent logs in one step. * feat: add openrouter/elephant-alpha to curated model lists - Add to OPENROUTER_MODELS (free, positioned above GPT models) - Add to _PROVIDER_MODELS["nous"] mirror list - Add 256K context window fallback in model_metadata.py	2026-04-13 21:16:14 -07:00
Teknium	d15efc9c1b	fix: correct GPT-5 family context lengths in fallback defaults (#9309) The generic 'gpt-5' fallback was set to 128,000 — which is the max OUTPUT tokens, not the context window. GPT-5 base and most variants (codex, mini) have 400,000 context. This caused /model to report 128k for models like gpt-5.3-codex when models.dev was unavailable. Added specific entries for GPT-5 variants with different context sizes: - gpt-5.4, gpt-5.4-pro: 1,050,000 (1.05M) - gpt-5.4-mini, gpt-5.4-nano: 400,000 - gpt-5.3-codex-spark: 128,000 (reduced) - gpt-5.1-chat: 128,000 (chat variant) - gpt-5 (catch-all): 400,000 Sources: https://developers.openai.com/api/docs/models	2026-04-13 19:22:23 -07:00
Teknium	f324222b79	fix: add vLLM/local server error patterns + MCP initial connection retry (#9281) Port two improvements inspired by Kilo-Org/kilocode analysis: 1. Error classifier: add context overflow patterns for vLLM, Ollama, and llama.cpp/llama-server. These local inference servers return different error formats than cloud providers (e.g., 'exceeds the max_model_len', 'context length exceeded', 'slot context'). Without these patterns, context overflow errors from local servers are misclassified as format errors, causing infinite retries instead of triggering compression. 2. MCP initial connection retry: previously, if the very first connection attempt to an MCP server failed (e.g., transient DNS blip at startup), the server was permanently marked as failed with no retry. Post-connect reconnection had 5 retries with exponential backoff, but initial connection had zero. Now initial connections retry up to 3 times with backoff before giving up, matching the resilience of post-connect reconnection. (Inspired by Kilo Code's MCP server disappearing fix in v1.3.3) Tests: 6 new error classifier tests, 4 new MCP retry tests, 1 updated existing test. All 276 affected tests pass.	2026-04-13 18:46:14 -07:00
arthurbr11	0a4cf5b3e1	feat(providers): add Arcee AI as direct API provider Adds Arcee AI as a standard direct provider (ARCEEAI_API_KEY) with Trinity models: trinity-large-thinking, trinity-large-preview, trinity-mini. Standard OpenAI-compatible provider checklist: auth.py, config.py, models.py, main.py, providers.py, doctor.py, model_normalize.py, model_metadata.py, setup.py, trajectory_compressor.py. Based on PR #9274 by arthurbr11, simplified to a standard direct provider without dual-endpoint OpenRouter routing.	2026-04-13 18:40:06 -07:00
Teknium	8d023e43ed	refactor: remove dead code — 1,784 lines across 77 files (#9180) Deep scan with vulture, pyflakes, and manual cross-referencing identified: - 41 dead functions/methods (zero callers in production) - 7 production-dead functions (only test callers, tests deleted) - 5 dead constants/variables - ~35 unused imports across agent/, hermes_cli/, tools/, gateway/ Categories of dead code removed: - Refactoring leftovers: _set_default_model, _setup_copilot_reasoning_selection, rebuild_lookups, clear_session_context, get_logs_dir, clear_session - Unused API surface: search_models_dev, get_pricing, skills_categories, get_read_files_summary, clear_read_tracker, menu_labels, get_spinner_list - Dead compatibility wrappers: schedule_cronjob, list_cronjobs, remove_cronjob - Stale debug helpers: get_debug_session_info copies in 4 tool files (centralized version in debug_helpers.py already exists) - Dead gateway methods: send_emote, send_notice (matrix), send_reaction (bluebubbles), _normalize_inbound_text (feishu), fetch_room_history (matrix), _start_typing_indicator (signal), parse_feishu_post_content - Dead constants: NOUS_API_BASE_URL, SKILLS_TOOL_DESCRIPTION, FILE_TOOLS, VALID_ASPECT_RATIOS, MEMORY_DIR - Unused UI code: _interactive_provider_selection, _interactive_model_selection (superseded by prompt_toolkit picker) Test suite verified: 609 tests covering affected files all pass. Tests for removed functions deleted. Tests using removed utilities (clear_read_tracker, MEMORY_DIR) updated to use internal APIs directly.	2026-04-13 16:32:04 -07:00
Teknium	b27eaaa4db	fix: improve ACP type check and restore comment accuracy - Use isinstance() with try/except import for CopilotACPClient check in _to_async_client instead of fragile __class__.__name__ string check - Restore accurate comment: GPT-5.x models require (not 'often require') the Responses API on OpenAI/OpenRouter; ACP is the exception, not a softening of the requirement - Add inline comment explaining the ACP exclusion rationale	2026-04-13 16:17:43 -07:00
helix4u	8680f61f8b	fix(copilot-acp): keep acp runtime off responses path	2026-04-13 16:17:43 -07:00
Teknium	0e60a9dc25	fix: add kimi-coding-cn to remaining provider touchpoints Follow-up for salvaged PR #7637. Adds kimi-coding-cn to: - model_normalize.py (prefix strip) - providers.py (models.dev mapping) - runtime_provider.py (credential resolution) - setup.py (model list + setup label) - doctor.py (health check) - trajectory_compressor.py (URL detection) - models_dev.py (registry mapping) - integrations/providers.md (docs)	2026-04-13 11:20:37 -07:00

473 Commits
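
The routing rules listed in commit 0b72884750 (time-aware model routing for cron jobs) can be sketched roughly as below. This is an illustration only, not the code in agent/time_aware_routing.py: the function signature, the reason strings, and the idea of passing the error rate in as an argument are assumptions, and the RoutingDecision here omits the provider field the commit mentions. The off-hours window, the 6.0% threshold, and the strong-model default are taken from the commit message.

```python
from dataclasses import dataclass

# Defaults quoted in the commit message (CRON_STRONG_MODEL, CRON_ERROR_THRESHOLD).
STRONG_MODEL = "xiaomi/mimo-v2-pro"
ERROR_THRESHOLD = 6.0  # percent


@dataclass
class RoutingDecision:
    # Simplified: the commit also lists a `provider` field, left out here.
    model: str
    reason: str
    hour: int
    error_rate: float


def is_off_hours(hour: int) -> bool:
    """Off-hours run from 18:00 through 05:59, per the commit message."""
    return hour >= 18 or hour < 6


def resolve_time_aware_model(base_model: str, hour: int,
                             error_rate: float, is_cron: bool) -> RoutingDecision:
    """Hypothetical signature; the real function's parameters are not shown in the log."""
    if not is_cron:
        # Interactive sessions always use the base model (the user corrects errors).
        return RoutingDecision(base_model, "interactive session", hour, error_rate)
    if is_off_hours(hour) and error_rate > ERROR_THRESHOLD:
        # Cron during off-hours with a high error rate: upgrade to the strong model.
        return RoutingDecision(STRONG_MODEL, "cron, off-hours, high error rate",
                               hour, error_rate)
    # Cron during business hours, or off-hours with a low error rate.
    return RoutingDecision(base_model, "cron, base model sufficient", hour, error_rate)
```

With the figures quoted in the commit, a cron task at 18:00 (9.4% error rate) would be upgraded to the strong model, while one at 09:00 (4.0%) would stay on the base model.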