The setup wizard imported `get_codex_models` which does not exist;
the actual function is `get_codex_model_ids`. This caused a runtime
ImportError when selecting the openai-codex provider during setup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Images pasted in the CLI were embedded as raw base64 image_url content
parts in the conversation history, which only works with vision-capable
models. If the main model (e.g. Nous API) doesn't support vision, this
breaks the request and poisons all subsequent messages.
Now the CLI uses the same approach as the messaging gateway: images are
pre-processed through the auxiliary vision model (Gemini Flash via
OpenRouter or Nous Portal) and converted to text descriptions. The
local file path is included so the agent can re-examine via
vision_analyze if needed. Works with any model.
Fixes#638.
User messaging improvements:
- Rejection: '(>_<) Error: not a valid model' instead of '(^_^) Warning: Error:'
- Rejection: shows 'Model unchanged' + tip about /model and /provider
- Session-only: explains 'this session only' with reason and 'will revert on restart'
- Saved: clear '(saved to config)' confirmation
Docs updated:
- cli-commands.md, cli.md, messaging/index.md: /model now shows
provider:model syntax, /provider command added to tables
Test fixes: deduplicated test names, assertions match new messages.
/provider command (CLI + gateway):
Shows all providers with auth status (✓/✗), aliases, and active marker.
Users can now discover what provider names work with provider:model syntax.
Gateway bugs fixed:
- Config was saved even when validation.persist=False (told user 'session
only' but actually persisted the unvalidated model)
- HERMES_INFERENCE_PROVIDER env var not set on provider switch, causing
the switch to be silently overridden if that env var was already set
parse_model_input hardened:
- Colon only treated as provider delimiter if left side is a recognized
provider name or alias. 'anthropic/claude-3.5-sonnet:beta' now passes
through as a model name instead of trying provider='anthropic/claude-3.5-sonnet'.
- HTTP URLs, random colons no longer misinterpreted.
56 tests passing across model validation, CLI commands, and integration.
'auto' doesn't always mean openrouter — it could be nous, zai,
kimi-coding, etc. depending on configured credentials. Reverted the
hardcoded mapping and now both CLI and gateway call
resolve_provider() to detect the actual active provider when 'auto'
is set. Falls back to openrouter only if resolution fails.
Added a system to track environment variables introduced in each config version, allowing migration prompts to only mention new variables since the user's last version. Updated the interactive configuration process to offer users the option to set these new optional keys during migration.
- normalize_provider('auto') now returns 'openrouter' (the default)
so /model shows the curated model list instead of nothing
- CLI /model display uses normalize_provider before looking up labels
- Gateway /model handler now uses the same validation logic as CLI:
live API probe, provider:model syntax, curated model list display
Add provider:model syntax to /model command for runtime provider switching:
/model zai:glm-5 → switch to Z.AI provider with glm-5
/model nous:hermes-3 → switch to Nous Portal with hermes-3
/model openrouter:anthropic/claude-sonnet-4.5 → explicit OpenRouter
When switching providers, credentials are resolved via resolve_runtime_provider
and validated before committing. Both model and provider are saved to config.
Provider aliases work (glm: → zai, kimi: → kimi-coding, etc.).
Enhanced /model (no args) display now shows:
- Current model and provider
- Curated model list for the current provider with ← marker
- Usage examples including provider:model syntax
39 tests covering parse_model_input, curated_models_for_provider,
provider switching (success + credential failure), and display output.
The 200 lines of prompt_toolkit/rich/fire stubs added in PR #650 were
guarded by 'if module in sys.modules: return' and never activated since
those dependencies are always installed. Removed to keep the test file
lean. Also removed unused MagicMock and pytest imports.
Not all providers require 'provider/model' format. Removing the rigid
format check lets the live API probe handle all validation uniformly.
If someone types 'gpt-5.4' on OpenRouter, the probe won't find it and
will suggest 'openai/gpt-5.4' — better UX than a format rejection.
Replace the static catalog-based model validation with a live API probe.
The /model command now hits the provider's /models endpoint to check if
the requested model actually exists:
- Model found in API → accepted + saved to config
- Model NOT found in API → rejected with 'Error: not a valid model'
and fuzzy-match suggestions from the live model list
- API unreachable → graceful fallback to hardcoded catalog (session-only
for unrecognized models)
- Format errors (empty, spaces, missing '/') still caught instantly
without a network call
The API probe takes ~0.2s for OpenRouter (346 models) and works with any
OpenAI-compatible endpoint (Ollama, vLLM, custom, etc.).
32 tests covering all paths: format checks, API found, API not found,
API unreachable fallback, CLI integration.
- Wrap validate_requested_model in try/except so /model doesn't crash
if validation itself fails (falls back to old accept+save behavior)
- Remove unnecessary sys.path.insert from both test files
- Expand test_model_validation.py: 4 → 23 tests covering normalize_provider,
provider_model_ids, empty/whitespace/spaces rejection, OpenRouter format
validation, custom endpoints, nous provider, provider aliases, unknown
providers, fuzzy suggestions
- Expand test_cli_model_command.py: 2 → 5 tests adding known-model save,
validation crash fallback, and /model with no argument
Updated the systemd unit generation to include the virtual environment and node modules in the PATH, improving the execution context for the hermes CLI. Additionally, added support for installing Playwright and its dependencies on Arch/Manjaro systems in the install script, ensuring a smoother setup process for browser tools.
Enhanced the environment setup for browser commands by ensuring the PATH variable includes standard directories, addressing potential issues with minimal PATH in systemd services. Additionally, updated the logging of stderr to use a warning level on failure for better visibility of errors. This change improves the robustness of subprocess execution in the browser tool.
Renamed _find_shell to _find_bash to clarify its purpose of specifically locating bash. Improved the shell detection logic to prioritize bash over the user's $SHELL, ensuring compatibility with the fence wrapper's syntax requirements. Added a backward compatibility alias for _find_shell to maintain existing imports in process_registry.py.
Updated the LocalEnvironment class to ensure the PATH variable includes standard directories. This change addresses issues with systemd services and terminal multiplexers that inherit a minimal PATH, improving the execution environment for subprocesses.
Updated the _find_shell function to improve shell detection on non-Windows systems. The function now checks for the existence of /usr/bin/bash and /bin/bash before falling back to /bin/sh, ensuring a more robust shell resolution process.
Two fixes:
1. Gateway CWD override: TERMINAL_CWD from config.yaml was being
unconditionally overwritten by the messaging_cwd fallback (line 114).
Now explicit paths in config.yaml are respected — only '.' / 'auto' /
'cwd' (or unset) fall back to MESSAGING_CWD or home directory.
2. sandbox_dir config: Added terminal.sandbox_dir to config.yaml bridge
in gateway/run.py, cli.py, and hermes_cli/config.py. Maps to
TERMINAL_SANDBOX_DIR env var, which get_sandbox_dir() reads to
determine where Docker/Singularity sandbox data is stored (default:
~/.hermes/sandboxes/). Users can now set:
hermes config set terminal.sandbox_dir /data/hermes-sandboxes
Skills can now declare runtime prerequisites (env vars, CLI binaries) via
YAML frontmatter. Skills with unmet prerequisites are excluded from the
system prompt so the agent never claims capabilities it can't deliver, and
skill_view() warns the agent about what's missing.
Three layers of defense:
- build_skills_system_prompt() filters out unavailable skills
- _find_all_skills() flags unmet prerequisites in metadata
- skill_view() returns prerequisites_warning with actionable details
Tagged 12 bundled skills that have hard runtime dependencies:
gif-search (TENOR_API_KEY), notion (NOTION_API_KEY), himalaya, imessage,
apple-notes, apple-reminders, openhue, duckduckgo-search, codebase-inspection,
blogwatcher, songsee, mcporter.
Closes#658Fixes#630
Removed the hard block on base_url containing 'api.anthropic.com'.
Anthropic now offers an OpenAI-compatible /chat/completions endpoint,
so blocking their URL prevents legitimate use. If the endpoint isn't
compatible, the API call will fail with a proper error anyway.
Removed from: run_agent.py, mini_swe_runner.py
Updated test to verify Anthropic URLs are accepted.
browser_vision now saves screenshots persistently to ~/.hermes/browser_screenshots/
and returns the screenshot_path in its JSON response. The model can include
MEDIA:<path> in its response to share screenshots as native photos.
Changes:
- browser_tool.py: Save screenshots persistently, return screenshot_path,
auto-cleanup files older than 24 hours, mkdir moved inside try/except
- telegram.py: Add send_image_file() — sends local images via bot.send_photo()
- discord.py: Add send_image_file() — sends local images via discord.File
- slack.py: Add send_image_file() — sends local images via files_upload_v2()
(WhatsApp already had send_image_file — no changes needed)
- prompt_builder.py: Updated Telegram hint to list image extensions,
added Discord and Slack MEDIA: platform hints
- browser.md: Document screenshot sharing and 24h cleanup
- send_file_integration_map.md: Updated to reflect send_image_file is now
implemented on Telegram/Discord/Slack
- test_send_image_file.py: 19 tests covering MEDIA: .png extraction,
send_image_file on all platforms, and screenshot cleanup
Partially addresses #466 (Phase 0: platform adapter gaps for send_image_file).
Authored by christomitov. Auto-detects sk-kimi- key prefix and routes
to api.kimi.com/coding/v1. Adds User-Agent header for Kimi Code API
compatibility. Legacy Moonshot keys continue to work unchanged.
Adds --worktree (-w) flag to hermes CLI for isolated git worktree sessions.
Multiple agents can work on the same repo concurrently without collisions.
Closes#652
Telegram's send_photo via URL has a ~5MB limit. Upscaled images from
fal.ai's Clarity Upscaler often exceed this, causing 'Wrong type of
web page content' or 'Failed to get http url content' errors.
Fix: Add download-and-upload fallback in Telegram's send_image().
When URL-based send_photo fails, download the image via httpx and
re-upload as bytes (supports up to 10MB file uploads).
Also: convert print() to logger.warning/error in image sending path
for proper log visibility (print goes to socket, invisible in logs).
Critical fixes:
- Add --worktree/-w to hermes_cli/main.py argparse (both chat
subcommand and top-level parser) so 'hermes -w' works via the
actual CLI entry point, not just 'python cli.py -w'
- Pass worktree flag through cmd_chat() kwargs to cli_main()
- Handle worktree attr in bare 'hermes' and --resume/--continue paths
Bug fixes in cli.py:
- Skip worktree creation for --list-tools/--list-toolsets (wasteful)
- Wrap git worktree subprocess.run in try/except (crash on timeout)
- Add stale worktree pruning on startup (_prune_stale_worktrees):
removes clean worktrees older than 24h left by crashed/killed sessions
Documentation updates:
- AGENTS.md: add --worktree to CLI commands table
- cli-config.yaml.example: add worktree config section
- website/docs/reference/cli-commands.md: add to core commands
- website/docs/user-guide/cli.md: add usage examples
- website/docs/user-guide/configuration.md: add config docs
Test improvements (17 → 31 tests):
- Stale worktree pruning (prune old clean, keep recent, keep dirty)
- Directory symlink via .worktreeinclude
- Edge cases (no commits, not a repo, pre-existing .worktrees/)
- CLI flag/config OR logic
- TERMINAL_CWD integration
- System prompt injection format
Add a --worktree (-w) flag to the hermes CLI that creates an isolated
git worktree for the session. This allows running multiple hermes-agent
instances concurrently on the same repo without file collisions.
How it works:
- On startup with -w: detects git repo, creates .worktrees/<session>/
with its own branch (hermes/<session-id>), sets TERMINAL_CWD to it
- Each agent works in complete isolation — independent HEAD, index,
and working tree, shared git object store
- On exit: auto-removes worktree and branch if clean, warns and
keeps if there are uncommitted changes
- .worktreeinclude file support: list gitignored files (.env, .venv/)
to auto-copy/symlink into new worktrees
- .worktrees/ is auto-added to .gitignore
- Agent gets a system prompt note about the worktree context
- Config support: set worktree: true in config.yaml to always enable
Usage:
hermes -w # Interactive mode in worktree
hermes -w -q "Fix issue #123" # Single query in worktree
# Or in config.yaml:
worktree: true
Includes 17 tests covering: repo detection, worktree creation,
independence verification, cleanup (clean/dirty), .worktreeinclude,
.gitignore management, and 10 concurrent worktrees.
Closes#652
Add official optional skill for qmd (tobi/qmd), a local on-device
search engine for personal knowledge bases, notes, docs, and meeting
transcripts.
Covers:
- Installation and setup for macOS and Linux
- Collection management and context annotations
- All search modes: BM25, vector, hybrid with reranking
- MCP integration (stdio and HTTP daemon modes)
- Structured query patterns and best practices
- systemd/launchd service configs for daemon persistence
Placed in optional-skills/ due to heavyweight requirements
(Node >= 22, ~2GB local models).
Long-lived gateway sessions can accumulate enough history that every new
message rehydrates an oversized transcript, causing repeated truncation
failures (finish_reason=length).
Add a session hygiene check in _handle_message that runs right after
loading the transcript and before invoking the agent:
1. Estimate message count and rough token count of the transcript
2. If above configurable thresholds (default: 200 msgs or 100K tokens),
auto-compress the transcript proactively
3. Notify the user about the compression with before/after stats
4. If still above warn threshold (default: 200K tokens) after
compression, suggest /reset
5. If compression fails on a dangerously large session, warn the user
to use /compress or /reset manually
Thresholds are configurable via config.yaml:
session_hygiene:
auto_compress_tokens: 100000
auto_compress_messages: 200
warn_tokens: 200000
This complements the agent's existing preflight compression (which
runs inside run_conversation) by catching pathological sessions at
the gateway layer before the agent is even created.
Includes 12 tests for threshold detection and token estimation.
The web_extract_tool was stripping the 'url' key during its output
trimming step, but documentation in 3 places claimed it was present.
This caused KeyError when accessing result['url'] in execute_code
scripts, especially when extracting from multiple URLs.
Changes:
- web_tools.py: Add 'url' back to trimmed_results output
- code_execution_tool.py: Add 'title' to _TOOL_STUBS docstring and
_TOOL_DOC_LINES so docs match actual {url, title, content, error}
response format
Kimi Code (platform.kimi.ai) issues API keys prefixed sk-kimi- that require:
1. A different base URL: api.kimi.com/coding/v1 (not api.moonshot.ai/v1)
2. A User-Agent header identifying a recognized coding agent
Without this fix, sk-kimi- keys fail with 401 (wrong endpoint) or 403
('only available for Coding Agents') errors.
Changes:
- Auto-detect sk-kimi- key prefix and route to api.kimi.com/coding/v1
- Send User-Agent: KimiCLI/1.0 header for Kimi Code endpoints
- Legacy Moonshot keys (api.moonshot.ai) continue to work unchanged
- KIMI_BASE_URL env var override still takes priority over auto-detection
- Updated .env.example with correct docs and all endpoint options
- Fixed doctor.py health check for Kimi Code keys
Reference: https://github.com/MoonshotAI/kimi-cli (platforms.py)
Adds market-data/polymarket skill — read-only access to Polymarket's public
prediction market APIs. Zero dependencies, zero auth required.
Addresses #589.
Adds a new market-data/polymarket skill for querying Polymarket's public
prediction market APIs. Pure read-only, zero authentication required,
zero external dependencies (stdlib only).
Includes:
- SKILL.md: Agent instructions with key concepts and workflow
- references/api-endpoints.md: Full API reference (Gamma, CLOB, Data APIs)
- scripts/polymarket.py: CLI helper for search, trending, prices, orderbooks,
price history, and recent trades
Addresses #589.
Root cause: fal_client.AsyncClient uses @cached_property for its
httpx.AsyncClient, creating it once and caching forever. In the gateway,
the agent runs in a thread pool where _run_async() calls asyncio.run()
which creates a temporary event loop. The first call works, but
asyncio.run() closes that loop. On the next call, a new loop is created
but the cached httpx.AsyncClient still references the old closed loop,
causing 'Event loop is closed'.
Fix: Switch from async fal_client API (submit_async/handler.get with
await) to sync API (submit/handler.get). The sync API uses httpx.Client
which has no event loop dependency. Since the tool already runs in a
thread pool via the gateway, async adds no benefit here.
Changes:
- image_generate_tool: async def -> def
- _upscale_image: async def -> def
- fal_client.submit_async -> fal_client.submit
- await handler.get() -> handler.get()
- is_async=True -> is_async=False in registry
- Remove unused asyncio import