Comprehensive 16-point checklist covering every integration point
needed when adding a new messaging platform to the gateway. Built
from the Signal integration experience where 7 integration points
were initially missed.
Covers: adapter, config enum, factory, auth maps, session source,
prompt hints, toolsets, cron delivery, send_message tool, cronjob
tool schema, channel directory, status display, setup wizard,
redaction, documentation, and tests.
Remove hallucinated providers (openai, deepseek, together, groq,
fireworks, mistral, gemini, nous) from the fallback provider map.
These don't exist in hermes-agent's provider system.
The real supported providers for fallback are:
- openrouter (OPENROUTER_API_KEY)
- zai (ZAI_API_KEY)
- kimi-coding (KIMI_API_KEY)
- minimax (MINIMAX_API_KEY)
- minimax-cn (MINIMAX_CN_API_KEY)
For any other OpenAI-compatible endpoint, users can use the
base_url + api_key_env overrides in the config.
Also adds Kimi User-Agent header for kimi fallback (matching
the main provider system).
The config comment now shows the complete list of built-in providers
that the fallback system supports, each with the env var it reads
for the API key. Also clarifies that custom OpenAI-compatible endpoints
work via base_url + api_key_env.
- website/docs/user-guide/messaging/signal.md: Full setup guide with
prerequisites, step-by-step instructions, access policies, features,
troubleshooting, security notes, and env var reference
- website/docs/user-guide/messaging/index.md: Added Signal to architecture
diagram, platform toolset table, security examples, and Next Steps links
- website/docs/reference/environment-variables.md: All 7 SIGNAL_* env vars
- README.md: Signal in feature table and documentation table
- AGENTS.md: Signal in gateway description and env var config section
All documentation migrated to website/docs/ (Docusaurus). The docs/
directory only contained:
- README.md: redirect saying 'docs moved to website' (redundant)
- send_file_integration_map.md: internal engineering notes, unreferenced
by any file in the codebase
The landing page at landingpage/ is still actively used by the
deploy-site.yml GitHub Actions workflow.
When the primary model/provider fails after retries (rate limit, overload,
auth errors, connection failures), Hermes automatically switches to a
configured fallback model for the remainder of the session.
Config (in ~/.hermes/config.yaml):
  fallback_model:
    provider: openrouter
    model: anthropic/claude-sonnet-4
Supports all major providers: OpenRouter, OpenAI, Nous, DeepSeek, Together,
Groq, Fireworks, Mistral, Gemini — plus custom endpoints via base_url and
api_key_env overrides.
Design principles:
- Dead simple: one fallback model, not a chain
- One-shot: switches once, doesn't ping-pong back
- Zero new dependencies: uses existing OpenAI client
- Minimal code: ~100 lines in run_agent.py, ~5 lines in cli.py/gateway
- Three trigger points: max retries exhausted, non-retryable client errors,
and invalid response exhaustion
Does NOT trigger on context overflow or payload-too-large errors (those
are handled by the existing compression system).
Addresses #737.
25 new tests, 2492 total passing.
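
A minimal sketch of the one-shot switch described above. The `FallbackState` class and its field names are hypothetical and only illustrate the "switch once, never ping-pong" rule; the actual logic in run_agent.py may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FallbackState:
    """Hypothetical holder for the active and fallback model (illustration only)."""
    provider: str
    model: str
    fallback_provider: Optional[str] = None
    fallback_model: Optional[str] = None
    activated: bool = False

    def try_activate(self) -> bool:
        """Switch to the fallback once; return False if none is set or it was already used."""
        if self.activated or not (self.fallback_provider and self.fallback_model):
            return False
        self.provider, self.model = self.fallback_provider, self.fallback_model
        self.activated = True
        return True


state = FallbackState("primary", "primary-model", "openrouter", "anthropic/claude-sonnet-4")
first = state.try_activate()   # primary exhausted its retries: switch
second = state.try_activate()  # a later failure does not switch again
```

Keeping a single boolean rather than a chain of candidates is what makes the behavior predictable for the rest of the session.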
Complete Signal adapter using the signal-cli daemon HTTP API.
Based on PR #268 by ibhagwan, rebuilt on current main with bug fixes.
Architecture:
- SSE streaming for inbound messages with exponential backoff (2s→60s)
- JSON-RPC 2.0 for outbound (send, typing, attachments, contacts)
- Health monitor detects stale SSE connections (120s threshold)
- Phone number redaction in all logs and global redact.py
Features:
- DM and group message support with separate access policies
- DM policies: pairing (default), allowlist, open
- Group policies: disabled (default), allowlist, open
- Attachment download with magic-byte type detection
- Typing indicators (8s refresh interval)
- 100MB attachment size limit, 8000 char message limit
- E.164 phone + UUID allowlist support
Integration:
- Platform.SIGNAL enum in gateway/config.py
- Signal in _is_user_authorized() allowlist maps (gateway/run.py)
- Adapter factory in _create_adapter() (gateway/run.py)
- user_id_alt/chat_id_alt fields in SessionSource for UUIDs
- send_message tool support via httpx JSON-RPC (not aiohttp)
- Interactive setup wizard in 'hermes gateway setup'
- Connectivity testing during setup (pings /api/v1/check)
- signal-cli detection and install guidance
Bug fixes from PR #268:
- Timestamp reads from envelope_data (not outer wrapper)
- Uses httpx consistently (not aiohttp in send_message tool)
- SIGNAL_DEBUG scoped to signal logger (not root)
- extract_images regex NOT modified (preserves group numbering)
- pairing.py NOT modified (no cross-platform side effects)
- No dual authorization (adapter defers to run.py for user auth)
- Wildcard uses set membership ('*' in set, not list equality)
- .zip default for PK magic bytes (not .docx)
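
As a rough illustration of magic-byte detection with the .zip default for PK archives, the sketch below uses a hypothetical signature table and function name; the adapter's real table may cover more types or use different fallbacks.

```python
# Hypothetical signature table (illustration only).
_MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF8", ".gif"),
    (b"%PDF", ".pdf"),
    # PK also prefixes .docx/.xlsx containers, so .zip is the neutral default.
    (b"PK\x03\x04", ".zip"),
]


def guess_extension(data: bytes, default: str = ".bin") -> str:
    """Return a file extension based on the leading bytes of an attachment."""
    for magic, extension in _MAGIC_SIGNATURES:
        if data.startswith(magic):
            return extension
    return default
```

Defaulting PK to .zip avoids mislabeling an ordinary archive as a Word document, which is the failure the fix above addresses.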
No new Python dependencies — uses httpx (already core).
External requirement: signal-cli daemon (user-installed).
Tests: 30 new tests covering config, init, helpers, session source,
phone redaction, authorization, and send_message integration.
Co-authored-by: ibhagwan <ibhagwan@users.noreply.github.com>
All failure paths in _run_browser_command now log at WARNING level,
which means they automatically land in ~/.hermes/logs/errors.log
(the persistent error log captures WARNING+).
What's now logged:
- agent-browser CLI not found (warning)
- Session creation failure with task ID (warning)
- Command entry with socket_dir path and length (debug)
- Non-zero return code with stderr (warning)
- Non-JSON output from agent-browser (warning — version mismatch/crash)
- Command timeout with task ID and socket path (warning)
- Unexpected exceptions with full traceback (warning + exc_info)
- browser_vision: which model is used and screenshot size (debug)
- browser_vision: LLM analysis failure with full traceback (warning)
Also fixed: _get_vision_model() was called twice in browser_vision —
now called once and reused.
macOS sets TMPDIR to /var/folders/xx/.../T/ (~51 chars). Combined with
agent-browser session names, socket paths reach 121 chars — exceeding
the 104-byte macOS AF_UNIX limit. This causes 'Screenshot file was not
created' errors and silent browser_vision failures on macOS.
Fix: use /tmp/ on macOS (symlink to /private/tmp, sticky-bit protected).
On Linux, tempfile.gettempdir() typically returns /tmp already, so behavior is unchanged.
Changes in browser_tool.py:
- Add _socket_safe_tmpdir() helper — returns /tmp on macOS, gettempdir()
elsewhere
- Replace all 3 tempfile.gettempdir() calls for socket dirs
- Set mode=0o700 on socket dirs for privacy (was using default umask)
- Guard vision/text client init with try/except — a broken auxiliary
config no longer prevents the entire browser_tool module from importing
(which would disable all 10 browser tools, not just vision)
- Improve screenshot error messages with mode info and diagnostic hints
- Don't delete screenshots when LLM analysis fails — the capture was
valid, only the vision API call failed. Screenshots are still cleaned
up by the existing 24-hour _cleanup_old_screenshots mechanism.
Changes in code_execution_tool.py:
- Same /tmp fix for RPC socket path (was 103 chars on macOS — one char
from the 104-byte limit)
Enhancements to the Solana blockchain skill (PR #212 by gizdusum):
- CoinGecko price integration (free, no API key)
  - Wallet shows tokens with USD values, sorted by value
  - Token info includes price and market cap
  - Transaction details show USD amounts for balance changes
  - Whale detector shows USD alongside SOL amounts
  - Stats includes SOL price and market cap
  - New `price` command for quick lookups by symbol or mint
- Smart wallet output
  - Tokens sorted by USD value (highest first)
  - Default limit of 20 tokens (--limit N to adjust)
  - Dust filtering (< $0.01 tokens hidden, count shown)
  - --all flag to see everything
  - --no-prices flag for fast RPC-only mode
  - NFT summary (count + first 10)
  - Portfolio total in USD
- Token name resolution
  - 25+ well-known tokens mapped (SOL, USDC, BONK, JUP, etc.)
  - CoinGecko fallback for unknown tokens
  - Abbreviated mint addresses for unlabeled tokens
- Reliability
  - Retry with exponential backoff on 429 rate-limit (RPC + CoinGecko)
  - Graceful degradation when price data unavailable
  - Capped API calls to respect CoinGecko free-tier limits
- Updated SKILL.md with all new capabilities and flags
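
The wallet sorting and dust filtering above can be sketched as follows. The function name, the dict keys, and the return shape are hypothetical; only the 20-token default, the $0.01 threshold, and the --all behavior come from the list.

```python
def summarize_tokens(tokens, limit=20, dust_threshold=0.01, show_all=False):
    """Sort holdings by USD value and hide dust.

    `tokens` is a list of dicts with hypothetical keys "symbol" and "usd_value".
    Returns (visible_tokens, hidden_dust_count, portfolio_total_usd).
    """
    ranked = sorted(tokens, key=lambda t: t["usd_value"], reverse=True)
    total = sum(t["usd_value"] for t in ranked)
    if show_all:
        return ranked, 0, total
    kept = [t for t in ranked if t["usd_value"] >= dust_threshold]
    return kept[:limit], len(ranked) - len(kept), total


holdings = [
    {"symbol": "BONK", "usd_value": 0.004},
    {"symbol": "SOL", "usd_value": 150.0},
    {"symbol": "USDC", "usd_value": 25.0},
]
visible, hidden_dust, total_usd = summarize_tokens(holdings)
```

The total is computed before filtering so that hidden dust still counts toward the portfolio value in this sketch.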
When the agent is interrupted, the model now receives descriptive
context instead of a generic 'Operation interrupted.' string:
- Tool skip messages include the tool name:
'[Tool execution cancelled — terminal was skipped due to user interrupt]'
'[Tool execution skipped — web_search was not started. User sent a new message]'
- API call interrupts include timing:
'Operation interrupted: waiting for model response (4.2s elapsed).'
- Retry/error interrupts include retry context:
'Operation interrupted: retrying API call after rate limit (retry 2/5).'
'Operation interrupted: handling API error (Timeout: connection timed out).'
This helps the model understand what was happening when it was
interrupted, reducing wasted iterations spent re-discovering state.
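
A small sketch of how the timing and retry messages could be assembled. The function names and parameters are hypothetical; only the message wording is taken from the examples above.

```python
def api_interrupt_message(elapsed_seconds: float) -> str:
    """Describe an interrupt that arrived while waiting on the model."""
    return (
        "Operation interrupted: waiting for model response "
        f"({elapsed_seconds:.1f}s elapsed)."
    )


def retry_interrupt_message(reason: str, attempt: int, max_attempts: int) -> str:
    """Describe an interrupt that arrived during a retry loop."""
    return (
        f"Operation interrupted: retrying API call after {reason} "
        f"(retry {attempt}/{max_attempts})."
    )
```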
Solana blockchain queries are a niche use case — not needed by every user.
Moved from skills/ (bundled) to optional-skills/ (installable via Skills Hub).
The 'openai' provider was redundant — using OPENAI_BASE_URL +
OPENAI_API_KEY with provider: 'main' already covers direct OpenAI API.
Provider options are now: auto, openrouter, nous, codex, main.
- Removed _try_openai(), _OPENAI_AUX_MODEL, _OPENAI_BASE_URL
- Replaced openai tests with codex provider tests
- Updated all docs to remove 'openai' option and clarify 'main'
- 'main' description now explicitly mentions it works with OpenAI API,
local models, and any OpenAI-compatible endpoint
Tests: 2467 passed.
The Codex Responses API (chatgpt.com/backend-api/codex) supports
vision via gpt-5.3-codex. This was verified with real API calls
using image analysis.
Changes to _CodexCompletionsAdapter:
- Added _convert_content_for_responses() to translate chat.completions
  multimodal format to Responses API format:
  - {type: 'text'} → {type: 'input_text'}
  - {type: 'image_url', image_url: {url: '...'}} → {type: 'input_image', image_url: '...'}
- Fixed: removed 'stream' from resp_kwargs (responses.stream() handles it)
- Fixed: removed max_output_tokens and temperature (Codex endpoint rejects them)
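
The content translation above could be sketched like this. It is an illustration, not the adapter's actual code: carrying the `text` field over unchanged and passing unknown part types through are assumptions.

```python
def convert_content_for_responses(content):
    """Translate chat.completions content parts into Responses API parts."""
    if isinstance(content, str):
        return content  # plain string content passes through unchanged
    converted = []
    for part in content:
        kind = part.get("type")
        if kind == "text":
            converted.append({"type": "input_text", "text": part.get("text", "")})
        elif kind == "image_url":
            image = part.get("image_url", {})
            # chat.completions nests the URL in a dict; Responses wants a bare string
            url = image.get("url", "") if isinstance(image, dict) else image
            converted.append({"type": "input_image", "image_url": url})
        else:
            converted.append(part)  # assumption: leave unknown part types as-is
    return converted


example = convert_content_for_responses([
    {"type": "text", "text": "What is in this image?"},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
])
```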
Provider changes:
- Added 'codex' as explicit auxiliary provider option
- Vision auto-fallback now includes Codex (OpenRouter → Nous → Codex)
since gpt-5.3-codex supports multimodal input
- Updated docs with Codex OAuth examples
Tested with real Codex OAuth token + ~/.hermes/image2.png — confirmed
working end-to-end through the full adapter pipeline.
Tests: 2459 passed.
The Codex model normalization was rejecting any model without 'codex'
in its name, forcing a fallback to gpt-5.3-codex. This blocked models
like gpt-5.4 that the Codex API actually supports.
The fix simplifies _normalize_model_for_provider() to two operations:
1. Strip provider prefixes (API needs bare slugs)
2. Replace the *untouched default* model with a Codex-compatible one
If the user explicitly chose a model — any model — we trust them and
let the API be the judge. No allowlists, no slug checks.
Also removes the 'codex not in slug' filter from _read_cache_models()
so the local cache preserves all API-available models.
Inspired by OpenClaw's approach, which explicitly lists non-codex models
(gpt-5.4, gpt-5.2) as valid Codex models.
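
The two operations could be sketched as below. The function name, the `model_is_default` parameter, and the hard-coded replacement slug are assumptions for illustration; the real _normalize_model_for_provider() may pick the replacement differently.

```python
CODEX_DEFAULT = "gpt-5.3-codex"  # assumed replacement for the untouched default


def normalize_model_for_codex(model: str, model_is_default: bool) -> str:
    """Strip provider prefixes; replace only the untouched default model."""
    if model_is_default:
        return CODEX_DEFAULT
    # An explicit user choice is trusted: keep the bare slug and let the API decide.
    return model.rsplit("/", 1)[-1]
```

Because there is no allowlist, a newly released model works as soon as the API accepts it, without a code change.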
Users can now set provider: "openai" for auxiliary tasks (vision, web
extract, compression) to use OpenAI's API directly with their
OPENAI_API_KEY. This hits api.openai.com/v1 with gpt-4o-mini as the
default model, which supports vision because the GPT-4o family accepts
image input.
Provider options are now: auto, openrouter, nous, openai, main.
Changes:
- agent/auxiliary_client.py: added _try_openai(), "openai" case in
_resolve_forced_provider(), updated auxiliary_max_tokens_param()
to use max_completion_tokens for OpenAI
- Updated docs: cli-config.yaml.example, AGENTS.md, and user-facing
configuration.md with Common Setups section showing OpenAI,
OpenRouter, and local model examples
- 3 new tests for OpenAI provider resolution
Tests: 2459 passed (was 2429).
Adds clear how-to documentation for changing the vision model, web
extraction model, and compression model to the user-facing docs site
(website/docs/user-guide/configuration.md).
Includes:
- Full auxiliary config.yaml example
- 'Changing the Vision Model' walkthrough with config + env var options
- Provider options table (auto/openrouter/nous/main)
- Multimodal safety warning for vision
- Environment variable reference table
- Updated the warning about OpenRouter-dependent tools to mention
auxiliary model configuration
Major issues fixed:
- Removed dead APIs: artii.herokuapp.com (404 since Heroku free tier
ended 2022), patorjk.com TAAG AJAX endpoint (404)
- Removed unusable sources: emojicombos.com (3.3MB JS blob, not
curl-accessible), asciiart.eu (art loads via JavaScript only)
New working sources added:
- asciified API (asciified.thelicato.io): free text-to-ASCII REST API,
250+ FIGlet fonts, returns plain text, no auth — perfect remote
alternative when pyfiglet isn't installed
- ascii.co.uk: classic ASCII art archive, art in <pre> tags,
extractable with simple curl + Python parsing
- qrenco.de: QR codes as ASCII art via curl
- wttr.in: weather and moon phase as ASCII art via curl
Also fixed: Tool 6 no longer relies on web_extract inside
execute_code (which was the original #662 bug). All web lookups
now use terminal curl which is universally available.
Clear how-to documentation for changing the vision model, web extraction
model, and compression model. Includes config.yaml examples, env var
alternatives, provider options table, and multimodal safety notes.
Improvements on top of PR #606 (auxiliary model configuration):
1. Gateway bridge: Added auxiliary.* and compression.summary_provider
config bridging to gateway/run.py so config.yaml settings work from
messaging platforms (not just CLI). Matches the pattern in cli.py.
2. Vision auto-fallback safety: In auto mode, vision now only tries
OpenRouter + Nous Portal (known multimodal-capable providers).
Custom endpoints, Codex, and API-key providers are skipped to avoid
confusing errors from providers that don't support vision input.
Explicit provider override (AUXILIARY_VISION_PROVIDER=main) still
allows using any provider.
3. Comprehensive tests (46 new):
- _get_auxiliary_provider env var resolution (8 tests)
- _resolve_forced_provider with all provider types (8 tests)
- Per-task provider routing integration (4 tests)
- Vision auto-fallback safety (7 tests)
- Config bridging logic (11 tests)
- Gateway/CLI bridge parity (2 tests)
- Vision model override via env var (2 tests)
- DEFAULT_CONFIG shape validation (4 tests)
4. Docs: Added auxiliary_client.py to AGENTS.md project structure.
Updated module docstring with separate text/vision resolution chains.
Tests: 2429 passed (was 2383).
- Added support for auxiliary model overrides in the configuration, allowing users to specify providers and models for vision and web extraction tasks.
- Updated the CLI configuration example to include new auxiliary model settings.
- Enhanced the environment variable mapping in the CLI to accommodate auxiliary model configurations.
- Improved the resolution logic for auxiliary clients to support task-specific provider overrides.
- Updated relevant documentation and comments for clarity on the new features and their usage.
- sessions.md: New 'Conversation Recap on Resume' subsection with visual
example, feature bullet points, and config snippet
- cli.md: New 'Session Resume Display' subsection with cross-reference
- configuration.md: Add resume_display to display settings YAML block
- AGENTS.md: Add _preload_resumed_session() and _display_resumed_history()
to key components, add UX note about resume panel
Add contextual [Hint: ...] suffixes to tool results where they save
real iterations:
- patch (no match): suggests read_file/search_files to verify content
before retrying — addresses the common pattern where the agent retries
with stale old_string instead of re-reading the file.
- search_files (truncated): provides explicit next offset and suggests
narrowing the search — clearer than relying on total_count inference.
Other hints proposed in #722 (terminal, web_search, web_extract,
browser_snapshot, search zero-results, search content-matches) were
evaluated and found to be low-value: either already covered by existing
mechanisms (read_file pagination, similar-files, schema descriptions)
or guidance the agent already follows from its own reasoning.
5 new tests covering hint presence/absence for both tools.
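
One way the truncation hint for search_files might be computed is sketched here. The function name, its parameters, and the exact hint wording are hypothetical; the message above only specifies that the hint carries an explicit next offset and suggests narrowing the search.

```python
from typing import Optional


def search_truncation_hint(offset: int, returned: int, total_count: int) -> Optional[str]:
    """Return a [Hint: ...] suffix when more results remain, else None."""
    next_offset = offset + returned
    if next_offset >= total_count:
        return None  # nothing was cut off, so no hint is attached
    return (
        f"[Hint: results truncated. Use offset={next_offset} to see more, "
        "or narrow the search.]"
    )
```

Returning None when nothing was truncated keeps ordinary results free of noise, which matches the presence/absence behavior the tests cover.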
When resuming a session via --continue or --resume, show a compact recap
of the previous conversation inside a Rich panel before the input prompt.
This gives users immediate visual context about what was discussed.
Changes:
- Add _preload_resumed_session() to load session history early (in run(),
before banner) so _init_agent() doesn't need a separate DB round-trip
- Add _display_resumed_history() that renders a formatted recap panel:
* User messages shown with gold bullet (truncated at 300 chars)
* Assistant responses shown with green diamond (truncated at 200 chars / 3 lines)
* Tool calls collapsed to count + tool names
* System messages and tool results hidden
* <REASONING_SCRATCHPAD> blocks stripped from display
* Pure-reasoning messages (no visible output) skipped entirely
* Capped at last 10 exchanges with 'N earlier messages' indicator
* Dim/muted styling distinguishes recap from active conversation
- Add display.resume_display config option: 'full' (default) or 'minimal'
- Store resume_display as instance variable (like compact) for testability
- 27 new tests covering all display scenarios, config, and edge cases
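
The truncation rules for the recap could be sketched as follows. The helper name and the trailing "..." marker are assumptions; the limits (300 chars for user messages, 200 chars / 3 lines for assistant responses) come from the list above.

```python
from typing import Optional


def truncate_for_recap(text: str, max_chars: int, max_lines: Optional[int] = None) -> str:
    """Clip text to a line budget first, then a character budget."""
    truncated = False
    if max_lines is not None:
        lines = text.splitlines()
        if len(lines) > max_lines:
            text = "\n".join(lines[:max_lines])
            truncated = True
    if len(text) > max_chars:
        text = text[:max_chars]
        truncated = True
    return text + "..." if truncated else text


user_preview = truncate_for_recap("a" * 350, max_chars=300)
assistant_preview = truncate_for_recap("l1\nl2\nl3\nl4", max_chars=200, max_lines=3)
```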
Closes #719
NOUS_API_KEY is unused — vision tools use OPENROUTER_API_KEY or Nous
Portal OAuth (auth.json), and MoA tools use OPENROUTER_API_KEY.
Removed from:
- hermes_cli/config.py: api_keys allowlist for config set routing
- .env.example: example env file entry and comment
- tests/hermes_cli/test_set_config_value.py: parametrize test data
- tests/integration/test_web_tools.py: updated comments and log
messages to reference 'auxiliary LLM provider' instead of NOUS_API_KEY
No HECATE references found in codebase (already cleaned up).
Add `hermes sessions browse` — a curses-based interactive session picker
with live type-to-search filtering, arrow key navigation, and seamless
session resume via Enter.
Features:
- Arrow keys to navigate, Enter to select and resume, Esc/q to quit
- Type characters to live-filter sessions by title, preview, source, or ID
- Backspace to edit filter, first Esc clears filter, second Esc exits
- Adaptive column layout (title/preview, last active, source, ID)
- Scrolling support for long session lists
- --source flag to filter by platform (cli, telegram, discord, etc.)
- --limit flag to control how many sessions to load (default: 50)
- Windows fallback: numbered list with input prompt
- After selection, seamlessly execs into `hermes --resume <id>`
Design decisions:
- Separate subcommand (not a flag on -c) — preserves `hermes -c` as-is
for instant most-recent-session resume
- Uses curses (not simple_term_menu) per Known Pitfalls to avoid the
arrow-key ghost-duplication rendering bug in tmux/iTerm
- Follows existing curses pattern from hermes_cli/tools_config.py
Also fixes: removed redundant `import os` inside cmd_sessions stats
block that shadowed the module-level import (would cause UnboundLocalError
if browse action was taken in the same function).
Tests: 33 new tests covering curses picker, fallback mode, filtering,
navigation, edge cases, and argument parser registration.
Verifies that setup.py imports the correct function name
(get_codex_model_ids) from codex_models.py. This would have caught
the ImportError bug before it reached users.
Source code (hermes_cli/clipboard.py):
- _convert_to_png() lost the file when both Pillow and ImageMagick were
unavailable: path.rename(tmp) moved the file to .bmp, then subprocess.run
raised FileNotFoundError, but the file was never renamed back. The final
fallback 'return path.exists()' returned False.
- Fix: restore the original file in both except handlers by renaming tmp
back to path when the original is missing.
Test (tests/tools/test_clipboard.py):
- test_file_still_usable_when_no_converter expected 'from PIL import Image'
to raise an Exception, but Pillow is installed so pytest.raises fired
'DID NOT RAISE'. The test also never called _convert_to_png().
- Fix: properly mock PIL unavailability via patch.dict(sys.modules),
actually call _convert_to_png(), and assert the correct result.
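
The restore-on-failure pattern from the source fix can be illustrated with the simplified sketch below. It is not the real _convert_to_png(): the function name, the `converter_cmd` parameter, and the single-converter flow are assumptions made to keep the example short.

```python
import subprocess
import tempfile
from pathlib import Path


def convert_with_restore(path: Path, converter_cmd: str = "convert") -> bool:
    """Try an external converter; put the original back if the attempt fails."""
    tmp = path.with_suffix(".bmp")
    path.rename(tmp)
    try:
        subprocess.run([converter_cmd, str(tmp), str(path)],
                       check=True, capture_output=True)
        tmp.unlink(missing_ok=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        # Converter missing or failed: restore the original if it is gone.
        if not path.exists() and tmp.exists():
            tmp.rename(path)
    return path.exists()


with tempfile.TemporaryDirectory() as workdir:
    image = Path(workdir) / "clip.png"
    image.write_bytes(b"fake image bytes")
    survived = convert_with_restore(image, converter_cmd="no-such-converter-binary")
    restored_bytes = image.read_bytes()
```

Without the rename in the except branch, the final existence check would report False even though the data was still on disk under the temporary name.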
Previously, --worktree printed a yellow warning and continued without
isolation, silently defeating the purpose of the flag. Now it prints
a clear error message and exits immediately.
Add a detailed section for /compress in the CLI Commands Reference,
explaining what it does, when to use it, requirements, and output format.
Previously only had a one-line table entry.
Telegram: add /insights, /update, /reload_mcp (underscore variant since
Telegram BotCommand names don't allow hyphens).
Discord: add /insights (with days parameter), /reload-mcp.
Also add reload_mcp as an alias for reload-mcp in the gateway command
dispatcher so Telegram's underscore form works, and add resume/provider
to the _known_commands set for hook emission.
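
A sketch of how the alias could be resolved in a dispatcher. The `COMMAND_ALIASES` table and `canonical_command` function are hypothetical names; the gateway's real dispatcher may handle this differently.

```python
# Hypothetical alias table: Telegram's underscore form maps to the canonical name.
COMMAND_ALIASES = {"reload_mcp": "reload-mcp"}


def canonical_command(text: str) -> str:
    """Extract the command name from a message and resolve known aliases."""
    parts = text.strip().split()
    if not parts:
        return ""
    name = parts[0].lstrip("/").lower()
    return COMMAND_ALIASES.get(name, name)
```

Resolving the alias once at the dispatch boundary means the rest of the gateway only ever sees the hyphenated form.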
Add /title, /resume, /compress, /provider, /usage to Telegram's
set_my_commands so they appear in the / autocomplete menu.
Add /title, /resume, /compress, /provider, /usage, /help as Discord
slash commands so they appear in Discord's native command picker.
These commands were functional via text but not registered with the
platform-native command menus, so users couldn't discover them.
Messaging users can now switch back to previously-named sessions:
- /resume My Project — resolves the title (with auto-lineage) and
restores that session's conversation history
- /resume (no args) — lists recent titled sessions to choose from
Adds SessionStore.switch_session() which ends the current session and
points the session entry at the target session ID so the old transcript
is loaded on the next message. Running agents are cleared on switch.
Completes the session naming feature from PR #720 for gateway users.
8 new tests covering: name resolution, lineage auto-latest,
already-on-session check, nonexistent names, agent cleanup, no-DB fallback,
and listing titled sessions.
When _ensure_runtime_credentials() resolves the provider to openai-codex,
check if the active model is Codex-compatible. If not (e.g. the default
anthropic/claude-opus-4.6), swap it for the best available Codex model.
Also strips provider prefixes the Codex API rejects (openai/gpt-5.3-codex
→ gpt-5.3-codex).
Adds _model_is_default flag so warnings are only shown when the user
explicitly chose an incompatible model (not when it's the config default).
Fixes #651.
Co-inspired-by: stablegenius49 (PR #661)
Co-inspired-by: teyrebaz33 (PR #696)
Previously, search_files would silently return 0 results when the
search path didn't exist (e.g., /root/.hermes/... when HOME is
/home/user). The path was passed to rg/grep/find which would fail
silently, and the empty stdout was parsed as 'no matches found'.
Changes:
- Add path existence check at the top of search() using test -e.
Returns SearchResult with a clear error message when path doesn't exist.
- Add exit code 2 checks in _search_with_rg() and _search_with_grep()
as secondary safety net for other error types (bad regex, permissions).
- Add 4 new tests covering: nonexistent path (content mode), nonexistent
path (files mode), existing path proceeds normally, rg error exit code.
Tests: 37 → 41 in test_file_operations.py, full suite 2330 passed.
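
The two guards above can be sketched as below. This is a simplified local version: the message says the real check uses test -e, whereas this sketch uses os.path.exists, and the `SearchResult` fields, error wording, and `interpret_exit_code` helper are assumptions.

```python
import os
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SearchResult:
    """Simplified stand-in for the real result type (fields are assumed)."""
    matches: List[str] = field(default_factory=list)
    error: Optional[str] = None


def check_search_path(path: str) -> Optional[SearchResult]:
    """Return an error result if the path is missing, else None to proceed."""
    if not os.path.exists(path):
        return SearchResult(error=f"Path does not exist: {path}")
    return None


def interpret_exit_code(returncode: int, stderr: str) -> Optional[str]:
    """rg and grep exit 0 on matches, 1 on no matches, and 2 on errors."""
    if returncode == 2:
        return f"Search failed: {stderr.strip() or 'unknown error'}"
    return None
```

Separating "no matches" (exit code 1) from "search failed" (exit code 2) is what stops a bad path or regex from looking like an empty result.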
- website/docs/user-guide/sessions.md: New 'Session Naming' section
with /title usage, title rules, auto-lineage, gateway support.
Updated 'Resume by Name' section, 'Rename a Session' subsection,
updated sessions list output format, updated DB schema description.
- website/docs/reference/cli-commands.md: Added -c "name" and
--resume by title to Core Commands, sessions rename to Sessions
table, /title to slash commands.
- website/docs/user-guide/cli.md: Added -c "name" and --resume by
title to resume options.
- AGENTS.md: Added -c, --resume, sessions list/rename to CLI commands
table. Added hermes_state.py to project structure.
- CONTRIBUTING.md: Updated hermes_state.py and session persistence
descriptions to mention titles.
- hermes_cli/main.py: Fixed sessions help string to include 'rename'.