hermes-agent

Author	SHA1	Message	Date
Hermes Agent	6c35a1b762	security(input_sanitizer): expand jailbreak pattern coverage (#87 ) - Add DAN-style patterns: do anything now, stay in character, token smuggling, etc. - Add roleplaying override patterns: roleplay as, act as if, simulate being, etc. - Add system prompt extraction patterns: repeat instructions, show prompt, etc. - 10+ new patterns with full test coverage - Zero regression on legitimate inputs	2026-04-05 15:48:10 +00:00
Allegro	d9cf77e382	feat: Issue #42 - Nexus Architect for autonomous Three.js world building Implement Phase 31: Autonomous 'Nexus' Expansion & Architecture DELIVERABLES: - agent/nexus_architect.py: AI agent for natural language to Three.js conversion * Prompt engineering for LLM-driven immersive environment generation * Mental state integration for dynamic aesthetic tuning * Mood preset system (contemplative, energetic, mysterious, etc.) * Room and portal design generation - tools/nexus_build_tool.py: Build tool interface with functions: * create_room(name, description, style) - Generate room modules * create_portal(from_room, to_room, style) - Generate portal connections * add_lighting(room, type, color, intensity) - Add Three.js lighting * add_geometry(room, shape, position, material) - Add 3D objects * generate_scene_from_mood(mood_description) - Mood-based generation * deploy_nexus_module(module_code, test=True) - Deploy and test - agent/nexus_deployment.py: Real-time deployment system * Hot-reload Three.js modules without page refresh * Validation (syntax check, Three.js API compliance) * Rollback on error with version history * Module versioning and status tracking - config/nexus-templates/: Template library * base_room.js - Base room template (Three.js r128+) * portal_template.js - Portal template (circular, rectangular, stargate) * lighting_presets.json - Warm, cool, dramatic, serene, crystalline presets * material_presets.json - 15 material presets including Timmy's gold, Allegro blue - tests/test_nexus_architect.py: Comprehensive test coverage * Unit tests for all components * Integration tests for full workflow * Template file validation DESIGN PRINCIPLES: - Modular architecture (each room = separate JS module) - Valid Three.js code (r128+ compatible) - Hot-reloadable (no page refresh needed) - Mental state integration (SOUL.md values influence aesthetic) NEXUS AESTHETIC GUIDELINES: - Timmy's color: warm gold (#D4AF37) - Allegro's color: motion blue (#4A90E2) - Sovereignty theme: crystalline structures, clean lines - Service theme: open spaces, welcoming lighting - Default mood: contemplative, expansive, hopeful	2026-04-01 02:45:36 +00:00
Allegro	ae6f3e9a95	feat: Issue #39 - temporal knowledge graph with versioning and reasoning Implement Phase 28: Sovereign Knowledge Graph 'Time Travel' - agent/temporal_knowledge_graph.py: SQLite-backed temporal triple store with versioning, validity periods, and temporal query operators (BEFORE, AFTER, DURING, OVERLAPS, AT) - agent/temporal_reasoning.py: Temporal reasoning engine supporting historical queries, fact evolution tracking, and worldview snapshots - tools/temporal_kg_tool.py: Tool integration with functions for storing facts with time, querying historical state, generating temporal summaries, and natural language temporal queries - tests/test_temporal_kg.py: Comprehensive test coverage including storage tests, query operators, historical summaries, and integration tests	2026-04-01 02:08:20 +00:00
Allegro	be865df8c4	security: Issue #81 - ULTRAPLINIAN fallback chain audit framework Implement comprehensive red team audit infrastructure for testing the entire fallback chain against jailbreak and crisis intervention attacks. Files created: - tests/security/ultraplinian_audit.py: Comprehensive audit runner with: * Support for all 4 techniques: GODMODE, Parseltongue, Prefill, Crisis * Model configurations for Kimi, Gemini, Grok, Llama * Concurrent execution via ThreadPoolExecutor * JSON and Markdown report generation * CLI interface with --help, --list-models, etc. - tests/security/FALLBACK_CHAIN_TEST_PLAN.md: Detailed test specifications: * Complete test matrix (5 models × 4 techniques × 8 queries = 160 tests) * Technique specifications with system prompts * Scoring criteria and detection patterns * Success criteria and maintenance schedule - agent/ultraplinian_router.py (optional): Race-mode fallback router: * Parallel model querying for safety validation * SHIELD-based safety analysis * Crisis escalation to SAFE SIX models * Configurable routing decisions Test commands: python tests/security/ultraplinian_audit.py --help python tests/security/ultraplinian_audit.py --all-models --all-techniques python tests/security/ultraplinian_audit.py --model kimi-k2.5 --technique crisis Relates to: Issue #72 (Red Team Jailbreak Audit) Severity: MEDIUM	2026-04-01 01:51:23 +00:00
Allegro	b88125af30	security: Add crisis pattern detection to input_sanitizer (Issue #72 ) Some checks failed Docker Build and Publish / build-and-push (push) Has been cancelled Details Nix / nix (macos-latest) (push) Has been cancelled Details Nix / nix (ubuntu-latest) (push) Has been cancelled Details Tests / test (push) Has been cancelled Details - Add CRISIS_PATTERNS for suicide/self-harm detection - Crisis patterns score 50pts per hit (max 100) vs 10pts for others - Addresses Red Team Audit HIGH finding: og_godmode + crisis queries - All 136 existing tests pass + new crisis safety tests pass Defense in depth: Input layer now blocks crisis queries even if wrapped in jailbreak templates, before they reach the model.	2026-03-31 21:27:17 +00:00
Allegro	e555c989af	security: add input sanitization for jailbreak patterns (Issue #72 ) Implements input sanitization module to detect and strip jailbreak fingerprint patterns identified in red team audit: HIGH severity: - GODMODE dividers: [START], [END], GODMODE ENABLED, UNFILTERED - L33t speak encoding: h4ck, k3ylog, ph1shing, m4lw4r3 MEDIUM severity: - Boundary inversion: [END]...[START] tricks - Fake role markers: user: assistant: system: LOW severity: - Spaced text bypass: k e y l o g g e r Other patterns detected: - Refusal inversion: 'refusal is harmful' - System prompt injection: 'you are now', 'ignore previous instructions' - Obfuscation: base64, hex, rot13 mentions Files created: - agent/input_sanitizer.py: Core sanitization module with detection, scoring, and cleaning functions - tests/test_input_sanitizer.py: 69 test cases covering all patterns - tests/test_input_sanitizer_integration.py: Integration tests Files modified: - agent/__init__.py: Export sanitizer functions - run_agent.py: Integrate sanitizer at start of run_conversation() Features: - detect_jailbreak_patterns(): Returns bool, patterns list, category scores - sanitize_input(): Returns cleaned_text, risk_score, patterns - score_input_risk(): Returns 0-100 risk score - sanitize_input_full(): Complete sanitization with blocking decisions - Logging integration for security auditing	2026-03-31 19:56:16 +00:00
Allegro	5ef812d581	feat: implement automatic kimi-coding fallback on quota errors	2026-03-31 19:35:54 +00:00
Allegro	fa1a0b6b7f	Merge pull request 'feat: Apparatus Verification System — Mapping Soul to Code' (#11 ) from feat/apparatus-verification into main Some checks failed Nix / nix (ubuntu-latest) (push) Failing after 1s Details Docker Build and Publish / build-and-push (push) Failing after 16s Details Tests / test (push) Failing after 8m40s Details Nix / nix (macos-latest) (push) Has been cancelled Details	2026-03-31 02:28:31 +00:00
Allegro	cb0cf51adf	security: Fix V-006 MCP OAuth Deserialization (CVSS 8.8 CRITICAL) Some checks failed Nix / nix (ubuntu-latest) (pull_request) Failing after 15s Details Supply Chain Audit / Scan PR for supply chain risks (pull_request) Failing after 19s Details Docker Build and Publish / build-and-push (pull_request) Failing after 28s Details Tests / test (pull_request) Failing after 9m43s Details Nix / nix (macos-latest) (pull_request) Has been cancelled Details - Replace pickle with JSON + HMAC-SHA256 state serialization - Add constant-time signature verification - Implement replay attack protection with nonce expiration - Add comprehensive security test suite (54 tests) - Harden token storage with integrity verification Resolves: V-006 (CVSS 8.8)	2026-03-31 00:37:14 +00:00
Google AI Agent	e6599b8651	feat: implement Phase 3 - Domain Distiller Some checks failed Supply Chain Audit / Scan PR for supply chain risks (pull_request) Failing after 45s Details Tests / test (pull_request) Failing after 27s Details Docker Build and Publish / build-and-push (pull_request) Failing after 1m11s Details	2026-03-30 22:59:57 +00:00
Google AI Agent	679d2cd81d	feat: implement Phase 2 - World Modeler	2026-03-30 22:59:56 +00:00
Google AI Agent	e7b2fe8196	feat: implement Phase 1 - Self-Correction Generator	2026-03-30 22:59:55 +00:00
Google AI Agent	1ce0b71368	docs: initial @soul mapping for Apparatus Verification Some checks failed Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 24s Details Docker Build and Publish / build-and-push (pull_request) Failing after 32s Details Tests / test (pull_request) Failing after 23s Details	2026-03-30 22:38:02 +00:00
Google AI Agent	1bff6d17d5	feat: enhance Knowledge Ingester with symbolic extraction Some checks failed Docker Build and Publish / build-and-push (pull_request) Failing after 1m20s Details Tests / test (pull_request) Failing after 16s Details Supply Chain Audit / Scan PR for supply chain risks (pull_request) Failing after 34s Details	2026-03-30 22:28:59 +00:00
Google AI Agent	482b6c5aea	feat: add Sovereign Intersymbolic Memory Layer	2026-03-30 22:28:57 +00:00
Allegro	0f508c9600	Merge PR #4 : Sovereign Real-time Learning System Some checks failed Tests / test (push) Failing after 40s Details Docker Build and Publish / build-and-push (push) Failing after 55s Details Nix / nix (ubuntu-latest) (push) Failing after 21s Details Nix / nix (macos-latest) (push) Has been cancelled Details	2026-03-30 22:27:14 +00:00
Google AI Agent	d1defbe06a	feat: add Sovereign Knowledge Ingester	2026-03-30 22:19:27 +00:00
Google AI Agent	fdce07ff40	feat: add Meta-Reasoning Layer	2026-03-30 22:16:19 +00:00
Google AI Agent	bf82581189	feat: add native Gemini 3 series adapter	2026-03-30 22:16:18 +00:00
Teknium	fb634068df	fix(security): extend secret redaction to ElevenLabs, Tavily and Exa API keys (#3920 ) Some checks failed Nix / nix (ubuntu-latest) (push) Failing after 3m9s Details Docker Build and Publish / build-and-push (push) Failing after 4m1s Details Tests / test (push) Failing after 29m41s Details Nix / nix (macos-latest) (push) Has been cancelled Details ElevenLabs (sk_), Tavily (tvly-), and Exa (exa_) keys were not covered by _PREFIX_PATTERNS, leaking in plain text via printenv or log output. Salvaged from PR #3790 by @memosr. Tests rewritten with correct assertions (original tests had vacuously true checks). Co-authored-by: memosr <memosr@users.noreply.github.com>	2026-03-30 08:13:01 -07:00
Teknium	1c900c45e3	fix(agent): support full context length resolution for direct Gemini API endpoints (#3876 ) * add .aac audio file format support to transcription tool * fix(agent): support full context length resolution for direct Gemini API endpoints Add generativelanguage.googleapis.com to _URL_TO_PROVIDER so direct Gemini API users get correct 1M+ context length instead of the 128K unknown-proxy fallback. Co-authored-by: bb873 <bb873@users.noreply.github.com> --------- Co-authored-by: Adrian Scott <adrian@adrianscott.com> Co-authored-by: bb873 <bb873@users.noreply.github.com>	2026-03-29 21:56:07 -07:00
Teknium	e296efbf24	fix: add INFO-level logging for auxiliary provider resolution (#3866 ) The auxiliary client's auto-detection chain was a black box — when compression, summarization, or memory flush failed, the only clue was a generic 'Request timed out' with no indication of which provider was tried or why it was skipped. Now logs at INFO level: - 'Auxiliary auto-detect: using local/custom (qwen3.5-9b) — skipped: openrouter, nous' when auto-detection picks a provider - 'Auxiliary compression: using auto (qwen3.5-9b) at http://localhost:11434/v1' before each auxiliary call - 'Auxiliary compression: provider custom unavailable, falling back to openrouter' on fallback - Clear warning with actionable guidance when NO provider is available: 'Set OPENROUTER_API_KEY or configure a local model in config.yaml'	2026-03-29 21:29:00 -07:00
Teknium	3cc50532d1	fix: auxiliary client uses placeholder key for local servers without auth (#3842 ) Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't require API keys, but the auxiliary client's _resolve_custom_runtime() rejected endpoints with empty keys — causing the auto-detection chain to skip the user's local server entirely. This broke compression, summarization, and memory flush for users running local models without an OpenRouter/cloud API key. The main CLI already had this fix (PR #2556, 'no-key-required' placeholder), but the auxiliary client's resolution path was missed. Two fixes: - _resolve_custom_runtime(): use 'no-key-required' placeholder instead of returning None when base_url is present but key is empty - resolve_provider_client() custom branch: same placeholder fallback for explicit_base_url without explicit_api_key Updates 2 tests that expected the old (broken) behavior.	2026-03-29 21:05:36 -07:00
Teknium	e314833c9d	feat(display): configurable tool preview length -- show full paths by default (#3841 ) Tool call previews (paths, commands, queries) were hardcoded to truncate at 35-40 chars across CLI spinners, completion lines, and gateway progress messages. Users could not see full file paths in tool output. New config option: display.tool_preview_length (default 0 = no limit). Set a positive number to truncate at that length. Changes: - display.py: module-level _tool_preview_max_len with getter/setter; build_tool_preview() and get_cute_tool_message() _trunc/_path respect it - cli.py: reads config at startup, spinner widget respects config - gateway/run.py: reads config per-message, progress callback respects config - run_agent.py: removed redundant 30-char quiet-mode spinner truncation - config.py: added display.tool_preview_length to DEFAULT_CONFIG Reported by kriskaminski	2026-03-29 18:02:42 -07:00
Teknium	fcd1645223	feat(skills): support external skill directories via config (#3678 ) Add skills.external_dirs config option — a list of additional directories to scan for skills alongside ~/.hermes/skills/. External dirs are read-only: skill creation/editing always writes to the local dir. Local skills take precedence when names collide. This lets users share skills across tools/agents without copying them into Hermes's own directory (e.g. ~/.agents/skills, /shared/team-skills). Changes: - agent/skill_utils.py: add get_external_skills_dirs() and get_all_skills_dirs() - agent/prompt_builder.py: scan external dirs in build_skills_system_prompt() - tools/skills_tool.py: _find_all_skills() and skill_view() search external dirs; security check recognizes configured external dirs as trusted - agent/skill_commands.py: /skill slash commands discover external skills - hermes_cli/config.py: add skills.external_dirs to DEFAULT_CONFIG - cli-config.yaml.example: document the option - tests/agent/test_external_skills.py: 11 tests covering discovery, precedence, deduplication, and skill_view for external skills Requested by community member primco.	2026-03-29 00:33:30 -07:00
Teknium	bea49e02a3	fix: route /bg spinner through TUI widget to prevent status bar collision (#3643 ) Background agent's KawaiiSpinner wrote \r-based animation and stop() messages through StdoutProxy, colliding with prompt_toolkit's status bar. Two fixes: - display.py: use isinstance(out, StdoutProxy) instead of fragile hasattr+name check for detecting prompt_toolkit's stdout wrapper - cli.py: silence bg agent's raw spinner (_print_fn=no-op) and route thinking updates through the TUI widget only when no foreground agent is active; clear spinner text in finally block with same guard Closes #2718 Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-28 17:29:37 -07:00
Teknium	9a364f2805	fix: cap percentage displays at 100% in stats, gateway, and memory tool (#3599 ) Salvage of PR #3533 (binhnt92). Follow-up to #3480 — applies min(100, ...) to 5 remaining unclamped percentage display sites in context_compressor, cli /stats, gateway /stats, and memory tool. Defensive clamps now that the root cause (estimation heuristic) was already removed in #3480. Co-Authored-By: binhnt92 <binhnt92@users.noreply.github.com>	2026-03-28 14:55:18 -07:00
Teknium	839d9d7471	feat(agent): configurable timeouts for auxiliary LLM calls via config.yaml (#3597 ) Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml instead of hardcoded values. Users with slow local models (Ollama, llama.cpp) can now increase timeouts for compression, vision, session search, etc. Defaults: - auxiliary.compression.timeout: 120s (was hardcoded 45s) - auxiliary.vision.timeout: 30s (unchanged) - all other aux tasks: 30s (was hardcoded 30s) - title_generator: 30s (was hardcoded 15s) call_llm/async_call_llm now auto-resolve timeout from config when not explicitly passed. Callers can still override with an explicit timeout arg. Based on PR #3406 by alanfwilliams. Converted from env vars to config.yaml per project conventions. Co-authored-by: alanfwilliams <alanfwilliams@users.noreply.github.com>	2026-03-28 14:35:28 -07:00
Teknium	2dd286c162	fix: write models.dev disk cache atomically (#3588 ) Use atomic_json_write() from utils.py instead of plain open()/json.dump() for the models.dev disk cache. Prevents corrupted cache if the process is killed mid-write — _load_disk_cache() silently returns {} on corrupt JSON, losing all model metadata until the next successful API fetch. Co-authored-by: memosr <memosr@users.noreply.github.com>	2026-03-28 14:20:30 -07:00
Teknium	831e8ba0e5	feat: tool-use enforcement + strip budget warnings from history (#3528 ) Cherry-pick of feat/gpt-tool-steering with modifications: 1. Tool-use enforcement prompt (refactored from GPT-specific): - Renamed GPT_TOOL_USE_GUIDANCE -> TOOL_USE_ENFORCEMENT_GUIDANCE - Added TOOL_USE_ENFORCEMENT_MODELS tuple: ('gpt', 'codex') - Injection logic now checks against the tuple instead of hardcoding 'gpt' — adding new model families is a one-line change - Addresses models describing actions instead of making tool calls 2. Budget warning history stripping: - _strip_budget_warnings_from_history() strips _budget_warning JSON keys and [BUDGET WARNING: ...] text from tool results at the start of run_conversation() - Prevents old budget warnings from poisoning subsequent turns Based on PR #3479 by teknium1.	2026-03-28 07:38:36 -07:00
Teknium	15cfd20820	fix: cap context pressure percentage at 100% in display (#3480 ) * fix: cap context pressure percentage at 100% in display The forward-looking token estimate can overshoot the compaction threshold (e.g. a large tool result pushes it from 70% to 109% in one step). The progress bar was already capped via min(), but pct_int was not — causing the user to see '109% to compaction' which is confusing. Cap pct_int at 100 in both CLI and gateway display functions. Reported by @JoshExile82. * refactor: use real API token counts for compression decisions Replace the rough chars/3 estimation with actual prompt_tokens + completion_tokens from the API response. The estimation was needed to predict whether tool results would push context past the threshold, but the default 50% threshold leaves ample headroom — if tool results push past it, the next API call reports real usage and triggers compression then. This removes all estimation from the compression and context pressure paths, making both 100% data-driven from provider-reported token counts. Also removes the dead _msg_count_before_tools variable.	2026-03-27 21:42:09 -07:00
Teknium	83043e9aa8	fix: add timeout to subprocess calls in context_references (#3469 ) _expand_git_reference() and _rg_files() called subprocess.run() without a timeout. On a large repository, @diff, @staged, or @git:N references could hang the agent indefinitely while git or ripgrep processes slow output. - Add timeout=30 to git subprocess in _expand_git_reference() with a user-friendly error message on TimeoutExpired - Add timeout=10 to rg subprocess in _rg_files() returning None on timeout (falls back to os.walk folder listing) Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>	2026-03-27 17:51:14 -07:00
Teknium	658692799d	fix: guard aux LLM calls against None content + reasoning fallback + retry (salvage #3389 ) (#3449 ) Salvage of #3389 by @binhnt92 with reasoning fallback and retry logic added on top. All 7 auxiliary LLM call sites now use extract_content_or_reasoning() which mirrors the main agent loop's behavior: extract content, strip think blocks, fall back to structured reasoning fields, retry on empty. Closes #3389.	2026-03-27 15:28:19 -07:00
Teknium	ab09f6b568	feat: curate HF model picker with OpenRouter analogues (#3440 ) Show only agentic models that map to OpenRouter defaults: Qwen/Qwen3.5-397B-A17B ↔ qwen/qwen3.5-plus Qwen/Qwen3.5-35B-A3B ↔ qwen/qwen3.5-35b-a3b deepseek-ai/DeepSeek-V3.2 ↔ deepseek/deepseek-chat moonshotai/Kimi-K2.5 ↔ moonshotai/kimi-k2.5 MiniMaxAI/MiniMax-M2.5 ↔ minimax/minimax-m2.5 zai-org/GLM-5 ↔ z-ai/glm-5 XiaomiMiMo/MiMo-V2-Flash ↔ xiaomi/mimo-v2-pro moonshotai/Kimi-K2-Thinking ↔ moonshotai/kimi-k2-thinking Users can still pick any HF model via Enter custom model name.	2026-03-27 13:54:46 -07:00
Teknium	6f11ff53ad	fix(anthropic): use model-native output limits instead of hardcoded 16K (#3426 ) The Anthropic adapter defaulted to max_tokens=16384 when no explicit value was configured. This severely limits thinking-enabled models where thinking tokens count toward max_tokens: - Claude Opus 4.6 supports 128K output but was capped at 16K - Claude Sonnet 4.6 supports 64K output but was capped at 16K With extended thinking (adaptive or budget-based), the model could exhaust the entire 16K on reasoning, leaving zero tokens for the actual response. This caused two user-visible errors: - 'Response truncated (finish_reason=length)' — thinking consumed most tokens - 'Response only contains think block with no content' — thinking consumed all Fix: add _ANTHROPIC_OUTPUT_LIMITS lookup table (sourced from Anthropic docs and Cline's model catalog) and use the model's actual output limit as the default. Unknown future models default to 128K (the current maximum). Also adds context_length clamping: if the user configured a smaller context window (e.g. custom endpoint), max_tokens is clamped to context_length - 1 to avoid exceeding the window. Closes #2706	2026-03-27 13:02:52 -07:00
Teknium	fd8c465e42	feat: add Hugging Face as a first-class inference provider (#3419 ) Salvage of PR #1747 (original PR #1171 by @davanstrien) onto current main. Registers Hugging Face Inference Providers (router.huggingface.co/v1) as a named provider: - hermes chat --provider huggingface (or --provider hf) - 18 curated open models via hermes model picker - HF_TOKEN in ~/.hermes/.env - OpenAI-compatible endpoint with automatic failover (Groq, Together, SambaNova, etc.) Files: auth.py, models.py, main.py, setup.py, config.py, model_metadata.py, .env.example, 5 docs pages, 17 new tests. Co-authored-by: Daniel van Strien <davanstrien@gmail.com>	2026-03-27 12:41:59 -07:00
Teknium	5127567d5d	perf(ttft): cache skills prompt with shared skill_utils module (salvage #3366 ) (#3421 ) Two-layer caching for build_skills_system_prompt(): 1. In-process LRU (OrderedDict, max 8) — same-process: 546ms → <1ms 2. Disk snapshot (.skills_prompt_snapshot.json) — cold start: 297ms → 103ms Key improvements over original PR #3366: - Extract shared logic into agent/skill_utils.py (parse_frontmatter, skill_matches_platform, get_disabled_skill_names, extract_skill_conditions, extract_skill_description, iter_skill_index_files) - tools/skills_tool.py delegates to shared module — zero code duplication - Proper LRU eviction via OrderedDict.move_to_end + popitem(last=False) - Cache invalidation on all skill mutation paths: - skill_manage tool (in-conversation writes) - hermes skills install (CLI hub) - hermes skills uninstall (CLI hub) - Automatic via mtime/size manifest on cold start prompt_builder.py no longer imports tools.skills_tool (avoids pulling in the entire tool registry chain at prompt build time). 6301 tests pass, 0 failures. Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-27 10:54:02 -07:00
Teknium	e0dbbdb2c9	fix: eliminate 'Event loop is closed' / 'Press ENTER to continue' during idle (#3398 ) The OpenAI SDK's AsyncHttpxClientWrapper.__del__ schedules aclose() via asyncio.get_running_loop().create_task(). When an AsyncOpenAI client is garbage-collected while prompt_toolkit's event loop is running (the common CLI idle state), the aclose() task runs on prompt_toolkit's loop but the underlying TCP transport is bound to a different (dead) worker loop. The transport's self._loop.call_soon() then raises RuntimeError('Event loop is closed'), which prompt_toolkit surfaces as the disruptive 'Unhandled exception in event loop ... Press ENTER to continue...' error. Three-layer fix: 1. neuter_async_httpx_del(): Monkey-patches __del__ to a no-op at CLI startup before any AsyncOpenAI clients are created. Safe because cached clients are explicitly cleaned via _force_close_async_httpx, and uncached clients' TCP connections are cleaned by the OS on exit. 2. Custom asyncio exception handler: Installed on prompt_toolkit's event loop to silently suppress 'Event loop is closed' RuntimeError. Defense-in-depth for SDK upgrades that might change the class name. 3. cleanup_stale_async_clients(): Called after each agent turn (when the agent thread joins) to proactively evict cache entries whose event loop is closed, preventing stale clients from accumulating.	2026-03-27 09:45:25 -07:00
Teknium	5a1e2a307a	perf(ttft): salvage easy-win startup optimizations from #3346 (#3395 ) * perf(ttft): dedupe shared tool availability checks * perf(ttft): short-circuit vision auto-resolution * perf(ttft): make Claude Code version detection lazy * perf(ttft): reuse loaded toolsets for skills prompt --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-27 07:49:44 -07:00
Teknium	3f95e741a7	fix: validate empty user messages to prevent Anthropic API 400 errors (#3322 ) When user messages have empty content (e.g., Discord @mention-only messages, unrecognized attachments), the Anthropic API rejects the request with 'user messages must have non-empty content'. Changes: - anthropic_adapter.py: Add empty content validation for user messages (string and list formats), matching the existing pattern for assistant and tool messages. Empty content gets '(empty message)' placeholder. - discord.py: Defense-in-depth check at gateway layer to catch empty messages before they enter session history. - Add 4 regression tests covering empty string, whitespace-only, empty list, and empty text block scenarios. Fixes #3143 Co-authored-by: Bartok9 <bartok9@users.noreply.github.com>	2026-03-26 19:24:03 -07:00
Teknium	ad764d3513	fix(auxiliary): catch ImportError from build_anthropic_client in vision auto-detection (#3312 ) _try_anthropic() caught ImportError on the module import (line 667-669) but not on the build_anthropic_client() call (line 696). When the anthropic_adapter module imports fine but the anthropic SDK is missing, build_anthropic_client() raises ImportError at call time. This escaped _try_anthropic() entirely, killing get_available_vision_backends() and cascading to 7 test failures: - 4 setup wizard tests hit unexpected 'Configure vision:' prompt - 3 codex-auth-as-vision tests failed check_vision_requirements() The fix wraps the build_anthropic_client call in try/except ImportError, returning (None, None) when the SDK is unavailable — consistent with the existing guard at the top of the function.	2026-03-26 18:21:59 -07:00
Teknium	0375b2a0d7	fix(gateway): silence background agent terminal output (#3297 ) * fix(gateway): silence flush agent terminal output quiet_mode=True only suppresses AIAgent init messages. Tool call output still leaks to the terminal through _safe_print → _print_fn during session reset/expiry. Since #2670 injected live memory state into the flush prompt, the flush agent now reliably calls memory tools — making the output leak noticeable for the first time. Set _print_fn to a no-op so the background flush is fully silent. * test(gateway): add test for flush agent terminal silence + fix dotenv mock - Add TestFlushAgentSilenced: verifies _print_fn is set to a no-op on the flush agent so tool output never leaks to the terminal - Fix pre-existing test failures: replace patch('run_agent.AIAgent') with sys.modules mock to avoid importing run_agent (requires openai) - Add autouse _mock_dotenv fixture so all tests in this file run without the dotenv package installed * fix(display): route KawaiiSpinner output through print_fn to fully silence flush agent The previous fix set tmp_agent._print_fn = no-op on the flush agent but spinner output and quiet-mode cute messages bypassed _print_fn entirely: - KawaiiSpinner captured sys.stdout at __init__ and wrote directly to it - quiet-mode tool results used builtin print() instead of _safe_print() Add optional print_fn parameter to KawaiiSpinner.__init__; _write routes through it when set. Pass self._print_fn to all spinner construction sites in run_agent.py and change the quiet-mode cute message print to _safe_print. The existing gateway fix (tmp_agent._print_fn = lambda) now propagates correctly through both paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(gateway): silence hygiene and compression background agents Two more background AIAgent instances in the gateway were created with quiet_mode=True but without _print_fn = no-op, causing tool output to leak to the terminal: - _hyg_agent (in-turn hygiene memory agent) - tmp_agent (_compress_context path) Apply the same _print_fn no-op pattern used for the flush agent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore(display): remove unused _last_flush_time from KawaiiSpinner Attribute was set but never read; upstream already removed it. Leftover from conflict resolution during rebase onto upstream/main. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-26 17:40:31 -07:00
Teknium	a8e02c7d49	fix: align Nous Portal model slugs with OpenRouter naming (#3253 ) Nous Portal now passes through OpenRouter model names and routes from there. Update the static fallback model list and auxiliary client default to use OpenRouter-format slugs (provider/model) instead of bare names. - _PROVIDER_MODELS['nous']: full OpenRouter catalog - _NOUS_MODEL: google/gemini-3-flash-preview (was gemini-3-flash) - Updated 4 test assertions for the new default model name	2026-03-26 13:49:43 -07:00
Teknium	2c719f0701	fix(auth): migrate OAuth token refresh to platform.claude.com with fallback (#3246 ) Anthropic migrated their OAuth infrastructure from console.anthropic.com to platform.claude.com (Claude Code v2.1.81+). Update _refresh_oauth_token() to try the new endpoint first, falling back to the old one for tokens issued before the migration. Also switches Content-Type from application/x-www-form-urlencoded to application/json to match current Claude Code behavior. Salvaged from PR #2741 by kshitijk4poor.	2026-03-26 13:26:56 -07:00
Teknium	43af094ae3	fix(agent): include tool tokens in preflight estimate, guard context probe persistence (#3164 ) Two improvements salvaged from PR #2600 (paraddox): 1. Preflight compression now counts tool schema tokens alongside system prompt and messages. With 50+ tools enabled, schemas can add 20-30K tokens that were previously invisible to the estimator, delaying compression until the API rejected the request. 2. Context probe persistence guard: when the agent steps down context tiers after a context-length error, only provider-confirmed numeric limits (parsed from the error message) are cached to disk. Guessed fallback tiers from get_next_probe_tier() stay in-memory only, preventing wrong values from polluting the persistent cache. Co-authored-by: paraddox <paraddox@users.noreply.github.com>	2026-03-26 02:00:50 -07:00
Teknium	cbf195e806	chore: fix 154 f-strings, simplify getattr/URL patterns, remove dead code (#3119 ) Three categories of cleanup, all zero-behavioral-change: 1. F-strings without placeholders (154 fixes across 29 files) - Converted f'...' to '...' where no {expression} was present - Heaviest files: run_agent.py (24), cli.py (20), honcho_integration/cli.py (34) 2. Simplify defensive patterns in run_agent.py - Added explicit self._is_anthropic_oauth = False in __init__ (before the api_mode branch that conditionally sets it) - Replaced 7x getattr(self, '_is_anthropic_oauth', False) with direct self._is_anthropic_oauth (attribute always initialized now) - Added _is_openrouter_url() and _is_anthropic_url() helper methods - Replaced 3 inline 'openrouter' in self._base_url_lower checks 3. Remove dead code in small files - hermes_cli/claw.py: removed unused 'total' computation - tools/fuzzy_match.py: removed unused strip_indent() function and pattern_stripped variable Full test suite: 6184 passed, 0 failures E2E PTY: banner clean, tool calls work, zero garbled ANSI	2026-03-25 19:47:58 -07:00
Teknium	7258311710	fix: stop recursive AGENTS.md walk, load top-level only (#3110 ) The recursive os.walk for AGENTS.md in subdirectories was undesired. Only load AGENTS.md from the working directory root, matching the behavior of CLAUDE.md and .cursorrules.	2026-03-25 18:30:45 -07:00
Teknium	910ec7eb38	chore: remove unused Hermes-native PKCE OAuth flow (#3107 ) Remove run_hermes_oauth_login(), refresh_hermes_oauth_token(), read_hermes_oauth_credentials(), _save_hermes_oauth_credentials(), _generate_pkce(), and associated constants/credential file path. This code was added in `63e88326` but never wired into any user-facing flow (setup wizard, hermes model, or any CLI command). Neither clawdbot/OpenClaw nor opencode implement PKCE for Anthropic — both use setup-token or API keys. Dead code that was never tested in production. Also removes the credential resolution step that checked ~/.hermes/.anthropic_oauth.json (step 3 in resolve_anthropic_token), renumbering remaining steps.	2026-03-25 18:29:47 -07:00
ctlst	281100e2df	fix(agent): prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode (#2701 ) In gateway mode, async tools (vision_analyze, web_extract, session_search) deadlock because _run_async() spawns a thread with asyncio.run(), creating a new event loop, but _get_cached_client() returns an AsyncOpenAI client bound to a different loop. httpx.AsyncClient cannot work across event loop boundaries, causing await client.chat.completions.create() to hang forever. Fix: include the event loop identity in the async client cache key so each loop gets its own AsyncOpenAI instance. Also fix session_search_tool.py which had its own broken asyncio.run()-in-thread pattern — now uses the centralized _run_async() bridge.	2026-03-25 17:31:56 -07:00
Teknium	d218cf9118	fix(skills): handle null metadata in skill frontmatter frontmatter.get("metadata", {}) returns None (not {}) when the key exists with a null value, crashing build_skills_system_prompt with AttributeError: 'NoneType' object has no attribute 'get'. Made-with: Cursor	2026-03-25 16:06:15 -07:00

1 2 3 4 5 ...

286 Commits