Two issues caused Matrix E2EE to silently not work in encrypted rooms:
1. When matrix-nio is installed without the [e2e] extra (no python-olm /
libolm), nio.crypto.ENCRYPTION_ENABLED is False and client.olm is
never initialized. The adapter logged warnings but returned True from
connect(), so the bot appeared online but could never decrypt messages.
Now: check_matrix_requirements() and connect() both hard-fail with a
clear error message when MATRIX_ENCRYPTION=true but E2EE deps are
missing.
2. Without a stable device_id, the bot gets a new device identity on each
restart. Other clients see it as "unknown device" and refuse to share
Megolm session keys. Now: MATRIX_DEVICE_ID env var lets users pin a
stable device identity that persists across restarts and is passed to
nio.AsyncClient constructor + restore_login().
Changes:
- gateway/platforms/matrix.py: add _check_e2ee_deps(), hard-fail in
connect() and check_matrix_requirements(), MATRIX_DEVICE_ID support
in constructor + restore_login
- gateway/config.py: plumb MATRIX_DEVICE_ID into platform extras
- hermes_cli/config.py: add MATRIX_DEVICE_ID to OPTIONAL_ENV_VARS
Closes#3521
When the Codex CLI (or VS Code extension) consumes a refresh token before
Hermes can use it, Hermes previously surfaced a generic 401 error with no
actionable guidance.
- In `refresh_codex_oauth_pure`: detect `refresh_token_reused` from the
OAuth endpoint and raise an AuthError explaining the cause and the exact
steps to recover (run `codex` to refresh, then `hermes login`).
- In `run_agent.py`: when provider is `openai-codex` and HTTP 401 is
received, show Codex-specific recovery steps instead of the generic
"check your API key" message.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- avoid hard-coded ~/.hermes paths in the setup and API shorthands
- prefer HERMES_HOME with a sane default to /Users/peteradams/.hermes
- keep the examples aligned with profile-aware Hermes installs
- fall back to adding the repo root to sys.path when hermes_constants is not importable
- fixes direct execution of setup.py and google_api.py from the repo checkout
- keeps the upstream PR scoped to the google-workspace compatibility fix
The Mattermost adapter downloads file attachments correctly but
never updates msg_type from TEXT to DOCUMENT. This means the
document enrichment block in gateway/run.py (which requires
MessageType.DOCUMENT) never executes — text files are not
inlined, and the agent is never notified about attached files.
The user sends a file, the adapter downloads it to the local
cache, but the agent sees an empty message and responds with
'I didn't receive any file'.
Set msg_type to DOCUMENT when file_ids is non-empty, matching
the behavior of the Telegram and Discord adapters.
The _async_flush_memories() helper accepts (session_id) but both the
/new and /resume handlers passed two arguments (session_id, session_key).
The TypeError was silently swallowed at DEBUG level, so memory extraction
never ran when users typed /new or /resume.
One call site (the session expiry watcher) was already fixed in 9c96f669,
but /new and /resume were missed.
- gateway/run.py:3247 — remove stray session_key from /new handler
- gateway/run.py:4989 — remove stray session_key from /resume handler
- tests/gateway/test_resume_command.py:222 — update test assertion
Previously the scheduler checked startswith('[SILENT]'), so agents that
appended [SILENT] after an explanation (e.g. 'N items filtered.\n\n[SILENT]')
would still trigger delivery.
Change the check to 'in' so the marker is caught regardless of position.
Add test_silent_trailing_suppresses_delivery to cover this case.
Ensures pending sessions are committed on process exit even if
shutdown_memory_provider is never called (gateway crash, SIGKILL,
or exception in _async_flush_memories preventing shutdown).
Also reorders on_session_end to wait for the pending sync thread
before checking turn_count, so the last turn's messages are flushed.
Based on PR #4919 by dagbs.
Updates the plugin build guide and features page to reflect the
interactive env var prompting added in PR #5470. Documents the rich
manifest format (name/description/url/secret) alongside the simple
string format.
- {__raw__} in webhook prompt templates dumps the full JSON payload (truncated at 4000 chars)
- _deliver_cross_platform now passes thread_id/message_thread_id from deliver_extra as metadata, enabling Telegram forum topic delivery
- Tests for both features
Follow-up to PR #5470. Replaces hardcoded ~/.hermes/.env references with
display_hermes_home() for correct behavior under profiles. Also updates
PluginManifest.requires_env type hint to List[Union[str, Dict[str, Any]]]
to document the rich format introduced in #5470.
Adds obsidian-headless (npm) setup guide to the Obsidian Integration
section — Node 22+, ob login, sync-create-remote, sync-setup, systemd
service for continuous background sync. Covers the full headless
workflow for agents running on servers syncing to Obsidian desktop on
other devices.
* feat(tools): add Firecrawl cloud browser provider
Adds Firecrawl (https://firecrawl.dev) as a cloud browser provider
alongside Browserbase and Browser Use. All browser tools route through
Firecrawl's cloud browser via CDP when selected.
- tools/browser_providers/firecrawl.py — FirecrawlProvider
- tools/browser_tool.py — register in _PROVIDER_REGISTRY
- hermes_cli/tools_config.py — add to onboarding provider picker
- hermes_cli/setup.py — add to setup summary
- hermes_cli/config.py — add FIRECRAWL_BROWSER_TTL config
- website/docs/ — browser docs and env var reference
Based on #4490 by @developersdigest.
Co-Authored-By: Developers Digest <124798203+developersdigest@users.noreply.github.com>
* refactor: simplify FirecrawlProvider.emergency_cleanup
Use self._headers() and self._api_url() instead of duplicating
env-var reads and header construction.
* fix: recognize Firecrawl in subscription browser detection
_resolve_browser_feature_state() now handles "firecrawl" as a direct
browser provider (same pattern as "browser-use"), so hermes setup
summary correctly shows "Browser Automation (Firecrawl)" instead of
misreporting as "Local browser".
Also fixes test_config_version_unchanged assertion (11 → 12).
---------
Co-authored-by: Developers Digest <124798203+developersdigest@users.noreply.github.com>
Skills can now declare config.yaml settings via metadata.hermes.config
in their SKILL.md frontmatter. Values are stored under skills.config.*
namespace, prompted during hermes config migrate, shown in hermes config
show, and injected into the skill context at load time.
Also adds the llm-wiki skill (Karpathy's LLM Wiki pattern) as the first
skill to use the new config interface, declaring wiki.path.
Skill config interface (new):
- agent/skill_utils.py: extract_skill_config_vars(), discover_all_skill_config_vars(),
resolve_skill_config_values(), SKILL_CONFIG_PREFIX
- agent/skill_commands.py: _inject_skill_config() injects resolved values
into skill messages as [Skill config: ...] block
- hermes_cli/config.py: get_missing_skill_config_vars(), skill config
prompting in migrate_config(), Skill Settings in show_config()
LLM Wiki skill (skills/research/llm-wiki/SKILL.md):
- Three-layer architecture (raw sources, wiki pages, schema)
- Three operations (ingest, query, lint)
- Session orientation, page thresholds, tag taxonomy, update policy,
scaling guidance, log rotation, archiving workflow
Docs: creating-skills.md, configuration.md, skills.md, skills-catalog.md
Closes#5100
The canonical config key for model name is model.default (used by setup,
auth, runtime_provider, profile list, and CLI startup). But /model --global
wrote to model.name in both gateway and CLI paths.
This caused:
- hermes profile list showing the old model (reads model.default)
- Gateway restart reverting to the old model (_resolve_gateway_model reads model.default)
- CLI startup using the old model (main.py reads model.default)
The only reason it appeared to work in Telegram was the cached agent
staying alive with the in-place switch.
Fix: change all 3 write/read sites to use model.default.
When parent_agent.enabled_toolsets is None (the default, meaning all tools
are enabled), subagents incorrectly fell back to DEFAULT_TOOLSETS
(['terminal', 'file', 'web']) instead of inheriting the parent's full
toolset.
Root cause:
- Line 188 used 'or' fallback: None or DEFAULT_TOOLSETS evaluates to
DEFAULT_TOOLSETS
- Line 192 checked truthiness: None is falsy, falling through to else
Fix:
- Use 'is not None' checks instead of truthiness
- When enabled_toolsets is None, derive effective toolsets from
parent_agent.valid_tool_names via the tool registry
Fixes the bug introduced in f75b1d21b and repeated in e5d14445e (PR #3269).
When an alias like 'claude' can't be resolved on the current provider,
_resolve_alias_fallback() tries other providers. Previously it hardcoded
('openrouter', 'nous') — so '/model claude' on z.ai would resolve to
openrouter even if the user doesn't have openrouter credentials but does
have anthropic.
Now the fallback uses the user's actual authenticated providers (detected
via list_authenticated_providers which is backed by the models.dev
in-memory cache). If no authenticated providers are found, falls back to
the old ('openrouter', 'nous') for backwards compatibility.
New helper: get_authenticated_provider_slugs() returns just the slug
strings from list_authenticated_providers().
launchctl kickstart returns exit code 113 ("Could not find service") when
the plist exists but the job hasn't been bootstrapped into the runtime domain.
The existing recovery path only caught exit code 3 ("unloaded"), causing an
unhandled CalledProcessError.
Exit code 113 means the same thing practically -- the service definition needs
bootstrapping before it can be kicked. Add it to the same recovery path that
already handles exit 3, matching the existing pattern in launchd_stop().
Follow-up: add a unit test covering the 113 recovery path.
Shell injection via unquoted workdir interpolation in docker, singularity,
and SSH backends. When workdir contained shell metacharacters (e.g.
~/;id), arbitrary commands could execute.
Changes:
- Add shlex.quote() at each interpolation point in docker.py,
singularity.py, and ssh.py with tilde-aware quoting (keep ~
unquoted for shell expansion, quote only the subpath)
- Add _validate_workdir() allowlist in terminal_tool.py as
defense-in-depth before workdir reaches any backend
Original work by Mariano A. Nicolini (PR #5620). Salvaged with fixes
for tilde expansion (shlex.quote breaks cd ~/path) and replaced
incomplete deny-list with strict character allowlist.
Co-authored-by: Mariano A. Nicolini <entropidelic@users.noreply.github.com>
Windows users running Hermes in WSL2 with model servers on the Windows
host hit 'connection refused' because WSL2's NAT networking means
localhost points to the VM, not Windows.
Covers:
- Mirrored networking mode (Win 11 22H2+) — makes localhost work
- NAT mode fallback using the host IP via ip route
- Per-server bind address table (Ollama, LM Studio, llama-server,
vLLM, SGLang)
- Detailed Ollama Windows service config for OLLAMA_HOST
- Windows Firewall rules for WSL2 connections
- Quick verification steps
- Cross-reference from Troubleshooting section
The ContextVar migration removed 'from pathlib import Path' but Path
is still used in _load_config_passthrough(). Without this import,
config-based env passthrough would raise NameError.
When a user runs out of OpenRouter credits and switches to Codex (or any
other provider), auxiliary tasks (compression, vision, web_extract) would
still try OpenRouter first and fail with 402. Two fixes:
1. Payment fallback in call_llm(): When a resolved provider returns HTTP 402
or a credit-related error, automatically retry with the next available
provider in the auto-detection chain. Skips the depleted provider and
tries Nous → Custom → Codex → API-key providers.
2. Remove hardcoded OpenRouter fallback: The old code fell back specifically
to OpenRouter when auto/custom resolution returned no client. Now falls
back to the full auto-detection chain, which handles any available
provider — not just OpenRouter.
Also extracts _get_provider_chain() as a shared function (replaces inline
tuple in _resolve_auto and the new fallback), built at call time so test
patches on _try_* functions remain visible.
Adds 16 tests covering _is_payment_error(), _get_provider_chain(),
_try_payment_fallback(), and call_llm() integration with 402 retry.
Centralize the skill → slash command registration that Telegram already had
in commands.py so Discord uses the exact same priority system, filtering,
and cap enforcement:
1. Core/built-in commands (never trimmed)
2. Plugin commands (never trimmed)
3. Skill commands (fill remaining slots, alphabetical, only tier trimmed)
Changes:
hermes_cli/commands.py:
- Rename _TG_NAME_LIMIT → _CMD_NAME_LIMIT (32 chars shared by both platforms)
- Rename _clamp_telegram_names → _clamp_command_names (generic)
- Extract _collect_gateway_skill_entries() — shared plugin + skill
collection with platform filtering, name sanitization, description
truncation, and cap enforcement
- Refactor telegram_menu_commands() to use the shared helper
- Add discord_skill_commands() that returns (name, desc, cmd_key) triples
- Preserve _sanitize_telegram_name() for Telegram-specific name cleaning
gateway/platforms/discord.py:
- Call discord_skill_commands() from _register_slash_commands()
- Create app_commands.Command per skill entry with cmd_key callback
- Respect 100-command global Discord limit
- Log warning when skills are skipped due to cap
Backward-compat aliases preserved for _TG_NAME_LIMIT and
_clamp_telegram_names.
Tests: 9 new tests (7 Discord + 2 backward-compat), 98 total pass.
Inspired by PR #5498 (sprmn24). Closes#5480.
When using xAI's API directly (base_url contains x.ai), send the
x-grok-conv-id header set to the Hermes session_id. This routes
consecutive requests to the same server, maximizing automatic
prompt cache hits.
Ref: https://docs.x.ai/developers/advanced-api-usage/prompt-caching
The cron scheduler delivery path passed raw text including MEDIA: tags
to _send_to_platform(), so media attachments were delivered as literal
text instead of actual files. The send function already supports
media_files= but the cron path never used it.
Now calls BasePlatformAdapter.extract_media() to split media paths
from text before sending, matching the gateway's normal message flow.
Salvaged from PR #4877 by robert-hoffmann.
The Signal adapter inherited base class defaults for send_image_file(),
send_voice(), and send_video() which only sent the file path as text
(e.g. '🖼️ Image: /tmp/chart.png') instead of actually delivering the file
as a Signal attachment.
When agent responses contain MEDIA:/path/to/file tags, the gateway
media pipeline extracts them and routes through these methods by file
type. Without proper overrides, image/audio/video files were never
actually delivered to Signal users.
Extract a shared _send_attachment() helper that handles all file
validation, size checking, group/DM routing, and RPC dispatch. The four
public methods (send_document, send_image_file, send_voice, send_video)
now delegate to this helper, following the same pattern used by WhatsApp
(_send_media_to_bridge) and Discord (_send_file_attachment).
The helper also uses a single stat() call with try/except FileNotFoundError
instead of the previous exists() + stat() two-syscall pattern, eliminating
a TOCTOU race. As a bonus, send_document() now gains the 100MB size check
that was previously missing (inconsistency with send_image).
Add 25 tests covering all methods plus MEDIA: tag extraction integration,
method-override guards, and send_document's new size check.
Fixes#5105
Telegram Bot API requires command names to contain only lowercase a-z,
digits 0-9, and underscores. Skill/plugin names containing characters
like +, /, @, or . caused set_my_commands to fail with
Bot_command_invalid.
Two-layer fix:
- scan_skill_commands(): strip non-alphanumeric/non-hyphen chars from
cmd_key at source, collapse consecutive hyphens, trim edges, skip
names that sanitize to empty string
- _sanitize_telegram_name(): centralized helper used by all 3 Telegram
name generation sites (core commands, plugin commands, skill commands)
with empty-name guard at each call site
Closes#5534
OpenCode Go model names with dots (minimax-m2.7, glm-4.5, kimi-k2.5)
were being mangled to hyphens (minimax-m2-7), causing HTTP 401 errors.
Two code paths were affected:
1. model_normalize.py: opencode-go was incorrectly in DOT_TO_HYPHEN_PROVIDERS
2. run_agent.py: _anthropic_preserve_dots() did not check for opencode-go
Fix:
- Remove opencode-go from _DOT_TO_HYPHEN_PROVIDERS (dots are correct for Go)
- Add opencode-go to _anthropic_preserve_dots() provider check
- Add opencode.ai/zen/go to base_url fallback check
- Add regression tests in tests/test_model_normalize.py
Co-authored-by: jacob3712 <jacob3712@users.noreply.github.com>
Grok models (x-ai/grok-4.20-beta, grok-code-fast-1) now receive tool-use
enforcement guidance, steering them to actually call tools instead of
describing intended actions. Matches both OpenRouter (x-ai/grok-*) and
direct xAI API usage.