Enhanced the _run_single_child function by introducing max_tokens, reasoning_config, and prefill_messages parameters from the parent agent. This allows for more flexible configuration of child agents, improving their operational capabilities.
Previously, when a session expired (idle/daily reset), the memory flush
ran synchronously inside get_or_create_session — blocking the user's
message for 10-60s while an LLM call saved memories.
Now a background watcher task (_session_expiry_watcher) runs every 5 min,
detects expired sessions, and flushes memories proactively in a thread
pool. By the time the user sends their next message, memories are
already saved and the response is immediate.
Changes:
- Add _is_session_expired(entry) to SessionStore — works from entry
alone without needing a SessionSource
- Add _pre_flushed_sessions set to track already-flushed sessions
- Remove sync _on_auto_reset callback from get_or_create_session
- Refactor flush into _flush_memories_for_session (sync worker) +
_async_flush_memories (thread pool wrapper)
- Add _session_expiry_watcher background task, started in start()
- Simplify /reset command to use shared fire-and-forget flush
- Add 10 tests for expiry detection, callback removal, tracking
Reduces token usage and latency for most tasks by defaulting to
medium reasoning effort instead of xhigh. Users can still override
via config or CLI flag. Updates code, tests, example config, and docs.
When running under systemd, the gateway could enter restart loops in two
scenarios:
1. The previous gateway process hasn't fully exited when systemd starts
a new one, causing 'Gateway already running (PID ...)' → exit 1 →
restart → same error → infinite loop.
2. The interactive CLI exits immediately in non-TTY mode, and systemd
keeps restarting it.
Changes:
- Add --replace flag to 'hermes gateway run' that gracefully kills any
existing gateway instance (SIGTERM → wait 10s → SIGKILL) before
starting, preventing the PID-lock deadlock.
- Update the generated systemd unit template to use --replace by default,
add ExecStop for clean shutdown, set KillMode=mixed and
TimeoutStopSec=15 for proper process management.
- Existing behavior (without --replace) is unchanged: still prints the
error message and exits, now also mentioning the --replace option.
Fixes#576
Eliminated the model parameter from the AIAgent class initialization, streamlining the constructor and ensuring consistent behavior across agent instances. This change aligns with recent updates to the task delegation logic.
Added functionality to detect the appropriate Z.AI endpoint based on the provided API key, accommodating different billing plans and regions. The setup process now probes available endpoints and updates the configuration accordingly, enhancing user experience and reducing potential billing errors. Updated the setup model provider function to integrate this new detection logic.
Eliminated the model parameter from the delegate_task function and its associated schema, defaulting to None for subagent calls. This change simplifies the function signature and enforces consistent behavior across task delegation.
Added logic to manage multiple compression attempts for large payloads and context length errors. Introduced limits on compression attempts to prevent infinite retries, with appropriate logging and error handling. This ensures better resilience and user feedback when facing compression issues during API calls.
_incomplete_scratchpad_retries and _codex_incomplete_retries were not
reset at the start of run_conversation(). In CLI mode, where the same
AIAgent instance is reused across conversations, stale counters from
a previous conversation could carry over, causing premature retry
exhaustion and partial responses.
Updated the default model version from "anthropic/claude-sonnet-4-20250514" to "anthropic/claude-sonnet-4.6" across multiple files including AGENTS.md, batch_runner.py, mini_swe_runner.py, and run_agent.py for consistency and to reflect the latest model improvements.
_approval_callback() had no return statement after the timeout break,
causing it to return None. Callers expect a string ("once", "session",
"always", or "deny"), so None could lead to undefined behavior when
approving dangerous commands.
_convert_to_png() renamed the original file to .bmp before calling
ImageMagick convert, then unconditionally deleted the .bmp regardless
of whether convert succeeded. If convert failed, both files were gone.
- Only delete .bmp after confirmed successful conversion
- Restore original file on convert failure, timeout, or missing binary
- Add 3 tests covering failure, not-installed, and timeout scenarios
_make_cli() did not clear HERMES_MAX_ITERATIONS env var, so tests
failed in CI where the var was set externally. Also, default max_turns
changed from 60 to 90 in 0a82396 but tests were not updated.
- Clear HERMES_MAX_ITERATIONS in _make_cli() for proper isolation
- Add env_overrides parameter for tests that need specific env values
- Update hardcoded 60 assertions to 90 to match new default
- Simplify test_env_var_max_turns using env_overrides
Subagent tool calls now count toward the same session-wide iteration
limit as the parent agent. Previously, each subagent had its own
independent counter, so a parent with max_iterations=60 could spawn
3 subagents each doing 50 calls = 150 total tool calls unmetered.
Changes:
- IterationBudget: thread-safe shared counter (run_agent.py)
- consume(): try to use one iteration, returns False if exhausted
- refund(): give back one iteration (for execute_code turns)
- Thread-safe via Lock (subagents run in ThreadPoolExecutor)
- Parent creates the budget, children inherit it via delegate_tool.py
- execute_code turns are refunded (don't count against budget)
- Default raised from 60 → 90 to account for shared consumption
- Per-child cap (50) still applies as a safety valve
The per-child max_iterations (default 50) remains as a per-child
ceiling, but the shared budget is the hard session-wide limit.
A child stops at whichever comes first.
Enhance message compression by adding a method to clean up orphaned tool-call and tool-result pairs. This ensures that the API receives well-formed messages, preventing errors related to mismatched IDs. The new functionality includes removing orphaned results and adding stub results for missing calls, improving overall message integrity during compression.
When MarkdownV2 parsing fails, _strip_mdv2() removes escape backslashes
and bold markers (*text*) but missed italic markers (_text_). Users saw
raw underscores around italic text in the plaintext fallback.
- Add regex to strip _text_ italic markers in _strip_mdv2()
- Use word boundary lookaround to preserve snake_case identifiers
- Add tests for _strip_mdv2 covering italic, bold, snake_case, and edge cases
Add Python DDGS library examples for all 4 search types (text, news,
images, videos) with return field documentation, quick reference table,
and validated gotchas. Reorganize to put Python API primary, CLI secondary.
Soften Firecrawl-fallback framing. All examples validated on ddgs==9.11.2.
Check how many commits behind origin/main the local repo is and
display a warning in the welcome banner:
⚠ 12 commits behind — run hermes update to update
- git fetch cached for 6 hours (avoids repeated network calls)
- Falls back gracefully if offline or not a git repo
- Never breaks the banner — all errors silently caught
- tools_config.py: Add 'Local Browser' as first provider option
(no API keys needed, same npm install for agent-browser)
- setup.py: Show 'Browser Automation (local)' when agent-browser
CLI is found but no Browserbase key is set
- config.py: Mark BROWSERBASE_* descriptions as optional
- status.py: Note that local browser works without Browserbase
When Telegram's MarkdownV2 parser rejects a message, the send() fallback
was sending the already-escaped text as plain text. This caused users to
see raw backslashes before every special character (periods, dashes,
parentheses, etc.) — e.g. 'sentence\.' or '\-\-auto\-approve'.
Changes:
- Add _strip_mdv2() to reverse MarkdownV2 escaping for clean plaintext
- Use stripped text in the send() fallback path instead of raw escaped chunk
- Add logging when the MDV2 fallback is triggered for diagnostics
- Add logger to telegram.py (was missing)
The edit_message() fallback already correctly used the original content;
this brings send() in line with that behavior.
Add local browser mode as an automatic fallback when Browserbase
credentials are not configured. Uses the same agent-browser CLI with
--session (local Chromium) instead of --cdp (cloud Browserbase).
The agent-facing API is completely unchanged — all 10 browser_* tools
produce identical output in both modes. Auto-detection:
- BROWSERBASE_API_KEY set → cloud mode (existing behavior)
- No key → local mode (new, free, headless Chromium)
Changes:
- _is_local_mode(): auto-detect based on env vars
- _create_local_session(): lightweight session (no API call)
- _get_session_info(): branches on local vs cloud
- _run_browser_command(): --session in local, --cdp in cloud
- check_browser_requirements(): only needs agent-browser CLI in local mode
- _emergency_cleanup: CLI close in local, API release in cloud
- cleanup_browser/browser_close: skip BB API calls in local mode
- Registry: removed requires_env — check_fn handles both modes
Setup for local mode:
npm install -g agent-browser
agent-browser install # downloads Chromium
agent-browser install --with-deps # also installs system libs (Docker/Debian)
Closes#374 (Phase 1)
Add a 'platforms' field to SKILL.md frontmatter that restricts skills
to specific operating systems. Skills with platforms: [macos] only
appear in the system prompt, skills_list(), and slash commands on macOS.
Skills without the field load everywhere (backward compatible).
Implementation:
- skill_matches_platform() in tools/skills_tool.py — core filter
- Wired into all 3 discovery paths: prompt_builder.py, skills_tool.py,
skill_commands.py
- 28 new tests across 3 test files
New bundled Apple/macOS skills (all platforms: [macos]):
- imessage — Send/receive iMessages via imsg CLI
- apple-reminders — Manage Reminders via remindctl CLI
- apple-notes — Manage Notes via memo CLI
- findmy — Track devices/AirTags via AppleScript + screen capture
Docs updated: CONTRIBUTING.md, AGENTS.md, creating-skills.md,
skills.md (user guide)
The previous 'get_env_value' in dir() check always evaluated to False
(dir() returns local scope, not module scope), making the left branch
dead code. Simplified to just os.getenv() which was the fallback anyway.
Authored by areu01or00. Adds timezone support via hermes_time.now() helper
with IANA timezone resolution (HERMES_TIMEZONE env → config.yaml → server-local).
Updates system prompt timestamp, cron scheduling, and execute_code sandbox TZ
injection. Includes config migration (v4→v5) and comprehensive test coverage.
uv pip install requires a virtual environment by default. When hermes
is installed system-wide or via pipx, the setup wizard's SDK installs
(daytona, swe-rex[modal], tinker-atropos) fail with 'No virtual
environment found'. Fix by passing --python sys.executable to uv,
which targets the correct Python regardless of venv state.
Also show the actual error message on install failure so users can
debug.
config['model'] can be a dict (old format: {default, base_url, provider})
or a string (new format). The setup wizard was showing the raw dict in
'Keep current' and 'Model set to' messages. Now extracts the model name
from either format.
Both 'hermes tools' and 'hermes setup tools' now use the same unified
flow in tools_config.py:
1. Select platform (CLI, Telegram, Discord, etc.)
2. Toggle all 18 toolsets on/off in checklist
3. Newly enabled tools that need API keys → provider-aware config
(e.g., TTS shows Edge/OpenAI/ElevenLabs picker)
4. Already-configured tools that stay enabled → silent, no prompts
5. Menu option: 'Reconfigure an existing tool' for updating
providers or API keys on tools that are already set up
Key changes:
- Move TOOL_CATEGORIES, provider config, and post-setup hooks from
setup.py to tools_config.py
- Replace flat _check_and_prompt_requirements() with provider-aware
_configure_toolset() that uses TOOL_CATEGORIES
- Add _reconfigure_tool() flow for updating existing configs
- setup.py's setup_tools() now delegates to tools_command()
- tools_command() menu adds 'Reconfigure' option alongside platforms
- Only prompt for API keys on tools that are NEWLY toggled on AND
don't already have keys configured
No breaking changes. All 2013 tests pass.
simple_term_menu miscalculates string widths when labels contain
ANSI escape codes (from color()) or em dashes, causing duplicated
and garbled lines on arrow key navigation.
Replace color() status indicators with plain text [configured]/[active]
and em dashes with regular dashes in all prompt_choice/prompt_checklist
labels.
Restructure the monolithic hermes setup wizard into independently-runnable
sections with a category-first tool configuration experience.
Changes:
- Break setup into 5 sections: model, terminal, gateway, tools, agent
- Each section is a standalone function, runnable individually via
'hermes setup model', 'hermes setup terminal', etc.
- Returning users get a menu: Quick Setup / Full Setup / individual sections
- First-time users get a guided walkthrough of all sections
Tool Configuration UX overhaul:
- Replace flat API key checklist with category-first approach
- Show tool types (TTS, Web Search, Image Gen, etc.) as top-level items
- Within each category, let users pick a provider:
- TTS: Microsoft Edge (Free), OpenAI, ElevenLabs
- Web: Firecrawl Cloud, Firecrawl Self-Hosted
- Image Gen: FAL.ai
- Browser: Browserbase
- Smart Home: Home Assistant
- RL Training: Tinker/Atropos
- GitHub: Personal Access Token
- Shows configured status on each tool and provider
- Only prompts for API keys after provider selection
Also:
- Add section argument to setup argparse parser in main.py
- Update summary to show new section commands
- Add self-hosted Firecrawl and Home Assistant to tool setup
- All 2013 tests pass
When `fetch_nous_models()` fails silently during setup, the model
selection falls through to the OpenRouter static list. Users then pick
models in OpenRouter format (e.g. `anthropic/claude-opus-4.6`) which
the Nous inference API rejects with a 400 "missing model" error.
Add an explicit `elif selected_provider == "nous"` branch that prompts
for manual model entry instead of falling through to the generic
OpenRouter fallback.
Adds eval-only benchmark for YC-Bench (collinear-ai/yc-bench), a
deterministic long-horizon benchmark where the agent acts as CEO of an
AI startup over a simulated 1-3 year run.
Key design decisions verified against the official yc-bench repo:
- Uses 'sim init' (NOT 'yc-bench run') to avoid starting a competing
built-in agent loop
- Correct DB table names: 'companies' and 'sim_events'
- Correct 4 domains: research, inference, data_environment, training
- Penalty values are preset-dependent (not hardcoded in system prompt)
- Sequential evaluation (each run is 100-500 turns)
- Follows TerminalBench2 patterns: KeyboardInterrupt handling,
cleanup_all_environments(), tqdm logging handler, streaming JSONL
yc-bench added as optional dependency: pip install hermes-agent[yc-bench]
Closes#340
These direct providers don't return cost in API responses and their
per-token pricing isn't readily available externally. Treat as local
models with zero cost so they appear in /insights without fake estimates.
When the user only has a z.ai/Kimi/MiniMax API key (no OpenRouter key),
auxiliary tasks (context compression, web summarization, session search)
now fall back to the configured direct provider instead of returning None.
Resolution chain: OpenRouter -> Nous -> Custom endpoint -> Codex OAuth
-> direct API-key providers -> None.
Uses cheap/fast models for auxiliary tasks:
- zai: glm-4.5-flash
- kimi-coding: kimi-k2-turbo-preview
- minimax/minimax-cn: MiniMax-M2.5-highspeed
Vision auxiliary intentionally NOT modified — vision needs multimodal
models (Gemini) that these providers don't serve.
Adds DEFAULT_CONTEXT_LENGTHS entries for kimi-k2.5 (262144), kimi-k2-thinking
(262144), kimi-k2-turbo-preview (262144), kimi-k2-0905-preview (131072),
MiniMax-M2.5/M2.5-highspeed/M2.1 (204800), and glm-4.5/4.5-flash (131072).
Avoids unnecessary 2M-token probe on first use with direct providers.
Introduces a new OpenClaw-to-Hermes migration skill with a Python
helper script that handles importing SOUL.md, memories, user profiles,
messaging settings, command allowlists, skills, TTS assets, and
workspace instructions.
Supports two migration presets (user-data / full), three skill conflict
modes (skip / overwrite / rename), overflow file export for entries that
exceed character limits, and granular include/exclude option filtering.
Includes detailed SKILL.md agent instructions covering the clarify-tool
interaction protocol, decision-to-command mapping, post-run reporting
rules, and path resolution guidance.
Adds dynamic panel width calculation to CLI clarify/approval widgets so
panels adapt to content and terminal size.
Includes 7 new tests covering presets, include/exclude, conflict modes,
overflow exports, and skills_guard integration.
Adds 4 new direct API-key providers (zai, kimi-coding, minimax, minimax-cn)
to the inference provider system. All use standard OpenAI-compatible
chat/completions endpoints with Bearer token auth.
Core changes:
- auth.py: Extended ProviderConfig with api_key_env_vars and base_url_env_var
fields. Added providers to PROVIDER_REGISTRY. Added provider aliases
(glm, z-ai, zhipu, kimi, moonshot). Added auto-detection of API-key
providers in resolve_provider(). Added resolve_api_key_provider_credentials()
and get_api_key_provider_status() helpers.
- runtime_provider.py: Added generic API-key provider branch in
resolve_runtime_provider() — any provider with auth_type='api_key'
is automatically handled.
- main.py: Added providers to hermes model menu with generic
_model_flow_api_key_provider() flow. Updated _has_any_provider_configured()
to check all provider env vars. Updated argparse --provider choices.
- setup.py: Added providers to setup wizard with API key prompts and
curated model lists.
- config.py: Added env vars (GLM_API_KEY, KIMI_API_KEY, MINIMAX_API_KEY,
etc.) to OPTIONAL_ENV_VARS.
- status.py: Added API key display and provider status section.
- doctor.py: Added connectivity checks for each provider endpoint.
- cli.py: Updated provider docstrings.
Docs: Updated README.md, .env.example, cli-config.yaml.example,
cli-commands.md, environment-variables.md, configuration.md.
Tests: 50 new tests covering registry, aliases, resolution, auto-detection,
credential resolution, and runtime provider dispatch.
Inspired by PR #33 (numman-ali) which proposed a provider registry approach.
Credit to tars90percent (PR #473) and manuelschipper (PR #420) for related
provider improvements merged earlier in this changeset.