Add support for using Nous Portal via a direct API key, mirroring
how OpenRouter and other API-key providers work. This gives users a
simpler alternative to the OAuth device-code flow when they already
have a Nous API key.
Changes:
- Add 'nous-api' to PROVIDER_REGISTRY as an api_key provider
pointing to https://inference-api.nousresearch.com/v1
- Add NOUS_API_KEY and NOUS_BASE_URL to OPTIONAL_ENV_VARS
- Add NOUS_API_BASE_URL / NOUS_API_CHAT_URL to hermes_constants
- Add 'Nous Portal API key' as first option in setup wizard
- Add provider aliases (nous_api, nousapi, nous-portal-api)
- Add test for nous-api runtime provider resolution
Closes#644
Move all new tests (schema, env filtering, edge cases, interrupt) into
the existing test_code_execution.py instead of a separate file.
Delete the now-redundant test_code_execution_schema.py.
build_execute_code_schema(set()) produced "from hermes_tools import , ..."
in the code property description — invalid Python syntax shown to the model.
This triggers when a user enables only the code_execution toolset without
any of the sandbox-allowed tools (e.g. `hermes tools code_execution`),
because SANDBOX_ALLOWED_TOOLS & {"execute_code"} = empty set.
Also adds 29 unit tests covering build_execute_code_schema, environment
variable filtering, execute_code edge cases, and interrupt handling.
Authored by tripledoublev.
After context compression on 413/400 errors, the inner retry loop was
reusing the stale pre-compression api_messages payload. Fix breaks out
of the inner retry loop so the outer loop rebuilds api_messages from
the now-compressed messages list. Adds regression test verifying the
second request actually contains the compressed payload.
Add display.background_process_notifications config option to control
how chatty the gateway process watcher is when using
terminal(background=true, check_interval=...) from messaging platforms.
Modes:
- all: running-output updates + final message (default, current behavior)
- result: only the final completion message
- error: only the final message when exit code != 0
- off: no watcher messages at all
Also supports HERMES_BACKGROUND_NOTIFICATIONS env var override.
Includes 12 tests (5 config loading + 7 watcher behavior).
Inspired by @PeterFile's PR #593. Closes#592.
Automatic filesystem snapshots before destructive file operations,
with user-facing rollback. Inspired by PR #559 (by @alireza78a).
Architecture:
- Shadow git repos at ~/.hermes/checkpoints/{hash}/ via GIT_DIR
- CheckpointManager: take/list/restore, turn-scoped dedup, pruning
- Transparent — the LLM never sees it, no tool schema, no tokens
- Once per turn — only first write_file/patch triggers a snapshot
Integration:
- Config: checkpoints.enabled + checkpoints.max_snapshots
- CLI flag: hermes --checkpoints
- Trigger: run_agent.py _execute_tool_calls() before write_file/patch
- /rollback slash command in CLI + gateway (list, restore by number)
- Pre-rollback snapshot auto-created on restore (undo the undo)
Safety:
- Never blocks file operations — all errors silently logged
- Skips root dir, home dir, dirs >50K files
- Disables gracefully when git not installed
- Shadow repo completely isolated from project git
Tests: 35 new tests, all passing (2798 total suite)
Docs: feature page, config reference, CLI commands reference
Authored by unmodeled-tyler. Adds openclaw-migration skill to optional-skills/
with migration script, SKILL.md, and 7 tests. Also improves clarify/approval
panel rendering with dynamic width calculation.
Authored by 0xbyt4. Fixes #N/A (no linked issue).
- Sanitize user input before FTS5 MATCH to prevent OperationalError on
special characters (C++, unbalanced quotes, dangling operators, etc.)
- Close SessionDB connection in mirror._append_to_sqlite() via finally block
- Added tests for both fixes
Platform.SIGNAL was missing from default_toolset_map and platform_config_key
in gateway/run.py, causing Signal to silently fall back to hermes-telegram
toolset (same bug as HomeAssistant, fixed in PR #538).
Also updates:
- tests/test_toolsets.py: include hermes-signal and hermes-homeassistant in
the platform core-tools consistency check
- cli-config.yaml.example: document signal and homeassistant platform keys
When a user runs `hermes -w -c Pokemon Agent Dev` without quoting the
session name, argparse would fail with:
error: argument command: invalid choice: 'Agent'
This is because argparse parses `-c Pokemon` (consuming one token via
nargs='?'), then sees 'Agent' and tries to match it as a subcommand.
Fix: add _coalesce_session_name_args() that pre-processes sys.argv before
argparse, joining consecutive non-flag, non-subcommand tokens after -c or
-r into a single argument. This makes both quoted and unquoted multi-word
session names work transparently.
Includes 17 tests covering all edge cases: multi-word names, single-word,
bare flags, flag ordering, subcommand boundaries, and passthrough.
Complements PR #453 by 0xbyt4. Adds isinstance(dict) guard in
run_agent.py to catch cases where json.loads returns non-dict
(e.g. null, list, string) before they reach downstream code.
Also adds 15 tests for build_tool_preview covering None args,
empty dicts, known/unknown tools, fallback keys, truncation,
and all special-cased tools (process, todo, memory, session_search).
Validate that the username portion of ~username paths contains only
valid characters (alphanumeric, dot, hyphen, underscore) before passing
to shell echo for expansion. Previously, paths like '~; rm -rf /'
would be passed unquoted to self._exec(f'echo {path}'), allowing
arbitrary command execution.
The approach validates the username rather than using shlex.quote(),
which would prevent tilde expansion from working at all since
echo '~user' outputs the literal string instead of expanding it.
Added tests for injection blocking and valid ~username/path expansion.
Credit to @alireza78a for reporting (PR #442, issue #442).
Vision auto-mode previously only tried OpenRouter, Nous, and Codex
for multimodal — deliberately skipping custom endpoints with the
assumption they 'may not handle vision input.' This caused silent
failures for users running local multimodal models (Qwen-VL, LLaVA,
Pixtral, etc.) without any cloud API keys.
Now custom endpoints are tried as a last resort in auto mode. If the
model doesn't support vision, the API call fails gracefully — but
users with local vision models no longer need to manually set
auxiliary.vision.provider: main in config.yaml.
Reported by @Spadav and @kotyKD.
- Register no-op app_mention event handler to suppress Bolt 404 errors.
The 'message' handler already processes @mentions in channels, so
app_mention is acknowledged without duplicate processing.
- Add send_document() for native file attachments (PDFs, CSVs, etc.)
via files_upload_v2, matching the pattern from Telegram PR #779.
- Add send_video() for native video uploads via files_upload_v2.
- Handle incoming document attachments from users: download, cache,
and inject text content for .txt/.md files (capped at 100KB),
following the same pattern as the Telegram adapter.
- Add _download_slack_file_bytes() helper for raw byte downloads.
- Add 24 new tests covering all new functionality.
Fixes the unhandled app_mention events reported in gateway logs.
Add MCP sampling/createMessage capability via SamplingHandler class.
Text-only sampling + tool use in sampling with governance (rate limits,
model whitelist, token caps, tool loop limits). Per-server audit metrics.
Based on concept from PR #366 by eren-karakus0. Restructured as class-based
design with bug fixes and tests using real MCP SDK types.
50 new tests, 2600 total passing.
Two changes to prevent unnecessary Anthropic prompt cache misses in the
gateway, where a fresh AIAgent is created per user message:
1. Reuse stored system prompt for continuing sessions:
When conversation_history is non-empty, load the system prompt from
the session DB instead of rebuilding from disk. The model already has
updated memory in its conversation history (it wrote it!), so
re-reading memory from disk produces a different system prompt that
breaks the cache prefix.
2. Stabilize Honcho context per session:
- Only prefetch Honcho context on the first turn (empty history)
- Bake Honcho context into the cached system prompt and store to DB
- Remove the per-turn Honcho injection from the API call loop
This ensures the system message is identical across all turns in a
session. Previously, re-fetching Honcho could return different context
on each turn, changing the system message and invalidating the cache.
Both changes preserve the existing behavior for compression (which
invalidates the prompt and rebuilds from scratch) and for the CLI
(where the same AIAgent persists and the cached prompt is already
stable across turns).
Tests: 2556 passed (6 new)
Terminal output was already redacted via redact_sensitive_text() but
read_file and search_files returned raw content. Now both tools
redact secrets before returning results to the LLM.
Based on PR #372 by @teyrebaz33 (closes#363) — applied manually
due to branch conflicts with the current codebase.
The summary message was always injected as 'user' role, which causes
consecutive user messages when the last preserved head message is also
'user'. Some APIs reject this (400 error), and it produces malformed
training data.
Fix: check the role of the last head message and pick the opposite role
for the summary — 'user' after assistant/tool, 'assistant' after user.
Based on PR #328 by johnh4098. Closes#328.
Split fallback provider handling into two clean registries:
_FALLBACK_API_KEY_PROVIDERS — env-var-based (openrouter, zai, kimi, minimax)
_FALLBACK_OAUTH_PROVIDERS — OAuth-based (openai-codex, nous)
New _resolve_fallback_credentials() method handles all three cases
(OAuth, API key, custom endpoint) and returns a uniform (key, url, mode)
tuple. _try_activate_fallback() is now just validation + client build.
Adds Nous Portal as a fallback provider — uses the same OAuth flow
as the primary provider (hermes login), returns chat_completions mode.
OAuth providers get credential refresh for free: the existing 401
retry handlers (_try_refresh_codex/nous_client_credentials) check
self.provider, which is set correctly after fallback activation.
4 new tests (nous activation, nous no-login, codex retained).
27 total fallback tests passing, 2548 full suite.
Codex OAuth uses a different auth flow (OAuth tokens, not env vars)
and a different API mode (codex_responses, not chat_completions).
The fallback now handles this specially:
- Resolves credentials via resolve_codex_runtime_credentials()
- Sets api_mode to codex_responses
- Fails gracefully if no Codex OAuth session exists
Also added to the commented-out config.yaml example.
2 new tests (codex activation + graceful failure).
The gateway had a SEPARATE compression system ('session hygiene')
with hardcoded thresholds (100k tokens / 200 messages) that were
completely disconnected from the model's context length and the
user's compression config in config.yaml. This caused premature
auto-compression on Telegram/Discord — triggering at ~60k tokens
(from the 200-message threshold) or inconsistent token counts.
Changes:
- Gateway hygiene now reads model name from config.yaml and uses
get_model_context_length() to derive the actual context limit
- Compression threshold comes from compression.threshold in
config.yaml (default 0.85), same as the agent's ContextCompressor
- Removed the message-count-based trigger (was redundant and caused
false positives in tool-heavy sessions)
- Removed the undocumented session_hygiene config section — the
standard compression.* config now controls everything
- Env var overrides (CONTEXT_COMPRESSION_THRESHOLD,
CONTEXT_COMPRESSION_ENABLED) are respected
- Warn threshold is now 95% of model context (was hardcoded 200k)
- Updated tests to verify model-aware thresholds, scaling across
models, and that message count alone no longer triggers compression
For claude-opus-4.6 (200k context) at 85% threshold: gateway
hygiene now triggers at 170k tokens instead of the old 100k.
New browser capabilities and a built-in skill for agent-driven web QA.
## New tool: browser_console
Returns console messages (log/warn/error/info) AND uncaught JavaScript
exceptions in a single call. Uses agent-browser's 'console' and 'errors'
commands through the existing session plumbing. Supports --clear to reset
buffers. Verified working in both local and Browserbase cloud modes.
## Enhanced tool: browser_vision(annotate=True)
New boolean parameter on browser_vision. When true, agent-browser overlays
numbered [N] labels on interactive elements — each [N] maps to ref @eN.
Annotation data (element name, role, bounding box) returned alongside the
vision analysis. Useful for QA reports and spatial reasoning.
## Config: browser.record_sessions
Auto-record browser sessions as WebM video files when enabled:
- Starts recording on first browser_navigate
- Stops and saves on browser_close
- Saves to ~/.hermes/browser_recordings/
- Works in both local and cloud modes (verified)
- Disabled by default
## Built-in skill: dogfood
Systematic exploratory QA testing for web applications. Teaches the agent
a 5-phase workflow:
1. Plan — accept URL, create output dirs, set scope
2. Explore — systematic crawl with annotated screenshots
3. Collect Evidence — screenshots, console errors, JS exceptions
4. Categorize — severity (Critical/High/Medium/Low) and category
(Functional/Visual/Accessibility/Console/UX/Content)
5. Report — structured markdown with per-issue evidence
Includes:
- skills/dogfood/SKILL.md — full workflow instructions
- skills/dogfood/references/issue-taxonomy.md — severity/category defs
- skills/dogfood/templates/dogfood-report-template.md — report template
## Tests
21 new tests covering:
- browser_console message/error parsing, clear flag, empty/failed states
- browser_console schema registration
- browser_vision annotate schema and flag passing
- record_sessions config defaults and recording lifecycle
- Dogfood skill file existence and content validation
Addresses #315.
Remove hallucinated providers (openai, deepseek, together, groq,
fireworks, mistral, gemini, nous) from the fallback provider map.
These don't exist in hermes-agent's provider system.
The real supported providers for fallback are:
openrouter (OPENROUTER_API_KEY)
zai (ZAI_API_KEY)
kimi-coding (KIMI_API_KEY)
minimax (MINIMAX_API_KEY)
minimax-cn (MINIMAX_CN_API_KEY)
For any other OpenAI-compatible endpoint, users can use the
base_url + api_key_env overrides in the config.
Also adds Kimi User-Agent header for kimi fallback (matching
the main provider system).
When the primary model/provider fails after retries (rate limit, overload,
auth errors, connection failures), Hermes automatically switches to a
configured fallback model for the remainder of the session.
Config (in ~/.hermes/config.yaml):
fallback_model:
provider: openrouter
model: anthropic/claude-sonnet-4
Supports all major providers: OpenRouter, OpenAI, Nous, DeepSeek, Together,
Groq, Fireworks, Mistral, Gemini — plus custom endpoints via base_url and
api_key_env overrides.
Design principles:
- Dead simple: one fallback model, not a chain
- One-shot: switches once, doesn't ping-pong back
- Zero new dependencies: uses existing OpenAI client
- Minimal code: ~100 lines in run_agent.py, ~5 lines in cli.py/gateway
- Three trigger points: max retries exhausted, non-retryable client errors,
and invalid response exhaustion
Does NOT trigger on context overflow or payload-too-large errors (those
are handled by the existing compression system).
Addresses #737.
25 new tests, 2492 total passing.
Complete Signal adapter using signal-cli daemon HTTP API.
Based on PR #268 by ibhagwan, rebuilt on current main with bug fixes.
Architecture:
- SSE streaming for inbound messages with exponential backoff (2s→60s)
- JSON-RPC 2.0 for outbound (send, typing, attachments, contacts)
- Health monitor detects stale SSE connections (120s threshold)
- Phone number redaction in all logs and global redact.py
Features:
- DM and group message support with separate access policies
- DM policies: pairing (default), allowlist, open
- Group policies: disabled (default), allowlist, open
- Attachment download with magic-byte type detection
- Typing indicators (8s refresh interval)
- 100MB attachment size limit, 8000 char message limit
- E.164 phone + UUID allowlist support
Integration:
- Platform.SIGNAL enum in gateway/config.py
- Signal in _is_user_authorized() allowlist maps (gateway/run.py)
- Adapter factory in _create_adapter() (gateway/run.py)
- user_id_alt/chat_id_alt fields in SessionSource for UUIDs
- send_message tool support via httpx JSON-RPC (not aiohttp)
- Interactive setup wizard in 'hermes gateway setup'
- Connectivity testing during setup (pings /api/v1/check)
- signal-cli detection and install guidance
Bug fixes from PR #268:
- Timestamp reads from envelope_data (not outer wrapper)
- Uses httpx consistently (not aiohttp in send_message tool)
- SIGNAL_DEBUG scoped to signal logger (not root)
- extract_images regex NOT modified (preserves group numbering)
- pairing.py NOT modified (no cross-platform side effects)
- No dual authorization (adapter defers to run.py for user auth)
- Wildcard uses set membership ('*' in set, not list equality)
- .zip default for PK magic bytes (not .docx)
No new Python dependencies — uses httpx (already core).
External requirement: signal-cli daemon (user-installed).
Tests: 30 new tests covering config, init, helpers, session source,
phone redaction, authorization, and send_message integration.
Co-authored-by: ibhagwan <ibhagwan@users.noreply.github.com>
The gateway had a SEPARATE compression system ('session hygiene')
with hardcoded thresholds (100k tokens / 200 messages) that were
completely disconnected from the model's context length and the
user's compression config in config.yaml. This caused premature
auto-compression on Telegram/Discord — triggering at ~60k tokens
(from the 200-message threshold) or inconsistent token counts.
Changes:
- Gateway hygiene now reads model name from config.yaml and uses
get_model_context_length() to derive the actual context limit
- Compression threshold comes from compression.threshold in
config.yaml (default 0.85), same as the agent's ContextCompressor
- Removed the message-count-based trigger (was redundant and caused
false positives in tool-heavy sessions)
- Removed the undocumented session_hygiene config section — the
standard compression.* config now controls everything
- Env var overrides (CONTEXT_COMPRESSION_THRESHOLD,
CONTEXT_COMPRESSION_ENABLED) are respected
- Warn threshold is now 95% of model context (was hardcoded 200k)
- Updated tests to verify model-aware thresholds, scaling across
models, and that message count alone no longer triggers compression
For claude-opus-4.6 (200k context) at 85% threshold: gateway
hygiene now triggers at 170k tokens instead of the old 100k.
The 'openai' provider was redundant — using OPENAI_BASE_URL +
OPENAI_API_KEY with provider: 'main' already covers direct OpenAI API.
Provider options are now: auto, openrouter, nous, codex, main.
- Removed _try_openai(), _OPENAI_AUX_MODEL, _OPENAI_BASE_URL
- Replaced openai tests with codex provider tests
- Updated all docs to remove 'openai' option and clarify 'main'
- 'main' description now explicitly mentions it works with OpenAI API,
local models, and any OpenAI-compatible endpoint
Tests: 2467 passed.
The Codex Responses API (chatgpt.com/backend-api/codex) supports
vision via gpt-5.3-codex. This was verified with real API calls
using image analysis.
Changes to _CodexCompletionsAdapter:
- Added _convert_content_for_responses() to translate chat.completions
multimodal format to Responses API format:
- {type: 'text'} → {type: 'input_text'}
- {type: 'image_url', image_url: {url: '...'}} → {type: 'input_image', image_url: '...'}
- Fixed: removed 'stream' from resp_kwargs (responses.stream() handles it)
- Fixed: removed max_output_tokens and temperature (Codex endpoint rejects them)
Provider changes:
- Added 'codex' as explicit auxiliary provider option
- Vision auto-fallback now includes Codex (OpenRouter → Nous → Codex)
since gpt-5.3-codex supports multimodal input
- Updated docs with Codex OAuth examples
Tested with real Codex OAuth token + ~/.hermes/image2.png — confirmed
working end-to-end through the full adapter pipeline.
Tests: 2459 passed.
The Codex model normalization was rejecting any model without 'codex'
in its name, forcing a fallback to gpt-5.3-codex. This blocked models
like gpt-5.4 that the Codex API actually supports.
The fix simplifies _normalize_model_for_provider() to two operations:
1. Strip provider prefixes (API needs bare slugs)
2. Replace the *untouched default* model with a Codex-compatible one
If the user explicitly chose a model — any model — we trust them and
let the API be the judge. No allowlists, no slug checks.
Also removes the 'codex not in slug' filter from _read_cache_models()
so the local cache preserves all API-available models.
Inspired by OpenClaw's approach which explicitly lists non-codex models
(gpt-5.4, gpt-5.2) as valid Codex models.
Users can now set provider: "openai" for auxiliary tasks (vision, web
extract, compression) to use OpenAI's API directly with their
OPENAI_API_KEY. This hits api.openai.com/v1 with gpt-4o-mini as the
default model — supports vision since GPT-4o handles image input.
Provider options are now: auto, openrouter, nous, openai, main.
Changes:
- agent/auxiliary_client.py: added _try_openai(), "openai" case in
_resolve_forced_provider(), updated auxiliary_max_tokens_param()
to use max_completion_tokens for OpenAI
- Updated docs: cli-config.yaml.example, AGENTS.md, and user-facing
configuration.md with Common Setups section showing OpenAI,
OpenRouter, and local model examples
- 3 new tests for OpenAI provider resolution
Tests: 2459 passed (was 2429).
Improvements on top of PR #606 (auxiliary model configuration):
1. Gateway bridge: Added auxiliary.* and compression.summary_provider
config bridging to gateway/run.py so config.yaml settings work from
messaging platforms (not just CLI). Matches the pattern in cli.py.
2. Vision auto-fallback safety: In auto mode, vision now only tries
OpenRouter + Nous Portal (known multimodal-capable providers).
Custom endpoints, Codex, and API-key providers are skipped to avoid
confusing errors from providers that don't support vision input.
Explicit provider override (AUXILIARY_VISION_PROVIDER=main) still
allows using any provider.
3. Comprehensive tests (46 new):
- _get_auxiliary_provider env var resolution (8 tests)
- _resolve_forced_provider with all provider types (8 tests)
- Per-task provider routing integration (4 tests)
- Vision auto-fallback safety (7 tests)
- Config bridging logic (11 tests)
- Gateway/CLI bridge parity (2 tests)
- Vision model override via env var (2 tests)
- DEFAULT_CONFIG shape validation (4 tests)
4. Docs: Added auxiliary_client.py to AGENTS.md project structure.
Updated module docstring with separate text/vision resolution chains.
Tests: 2429 passed (was 2383).
- Added support for auxiliary model overrides in the configuration, allowing users to specify providers and models for vision and web extraction tasks.
- Updated the CLI configuration example to include new auxiliary model settings.
- Enhanced the environment variable mapping in the CLI to accommodate auxiliary model configurations.
- Improved the resolution logic for auxiliary clients to support task-specific provider overrides.
- Updated relevant documentation and comments for clarity on the new features and their usage.