Two remaining gaps from the codex empty-output spec:
1. Normalize dict-shaped streamed items: output_item.done events may
yield dicts (raw/fallback paths) instead of SDK objects. The
extraction loop now uses _item_get() that handles both getattr
and dict .get() access.
2. Avoid plain-text synthesis when function_call events were streamed:
tracks has_function_calls during streaming and skips text-delta
synthesis when tool calls are present — prevents collapsing a
tool-call response into a fake text message.
When a user replies in a Slack thread where the bot has an active
conversation session, the bot now processes the message even without
an explicit @mention. This improves UX for ongoing threaded
discussions.
Changes:
- Added set_session_store() to BasePlatformAdapter for adapters to
check active sessions
- Modified SlackAdapter to detect thread replies and check if a
session exists for that thread before requiring @mentions
- Updated GatewayRunner to inject the session store into adapters
- Added comprehensive tests for the new behavior
Fixes: Thread replies without @jarvis are now processed if there is
an active session, matching user expectations for conversation flow
The edit_message method was sending raw content directly to Slack's
chat_update API without converting standard markdown to Slack's mrkdwn
format. This caused broken formatting and malformed URLs (e.g., trailing
** from bold syntax became part of clickable links → 404 errors).
The send() method already calls format_message() to handle this conversion,
but edit_message() was bypassing it. This change ensures edited messages
receive the same markdown → mrkdwn transformation as new messages.
Closes: PR #5558 formatting issue where links had trailing markdown syntax.
signal-cli is not available via apt or snap. Replace the incorrect
'sudo apt install signal-cli' with the official install method:
downloading from GitHub releases (Linux) or brew (macOS).
Updated both signal.md docs and the gateway.py setup hint.
Inspired by PR #4225 (which proposed snap, also incorrect).
Comprehensive guide covering:
- llama.cpp and MLX (omlx) setup on Apple Silicon
- Model selection and memory optimization (quantized KV cache)
- Real benchmarks on M5 Max comparing both backends
- Hermes connection instructions
Cherry-picked from PR #2590.
Adds a concise post-update validation checklist (git status, hermes
doctor, version check, gateway status). Adapted from PR #3050 with
corrections — removed inaccurate submodule claim (hermes update
already handles submodules) and tightened the checklist.
Cherry-picked and adapted from PR #3050.
The docker container needs the explicit 'setup' subcommand to launch
the setup wizard. Without it, the container starts in default mode.
Co-authored-by: Omar <omar2535@users.noreply.github.com>
Cherry-picked from PR #4896 (also submitted independently as PR #5532).
The _CodexCompletionsAdapter (used for compression, vision, web_extract,
session_search, and memory flush when on the codex provider) streamed
responses but discarded all events with 'for _event in stream: pass'.
When get_final_response() returned empty output (the same chatgpt.com
backend-api shape change), auxiliary calls silently returned None content.
Now collects response.output_item.done and text deltas during streaming
and backfills empty output — same pattern as _run_codex_stream().
Tested live against chatgpt.com/backend-api/codex with OAuth.
Two gaps in the codex empty-output handling:
1. _run_codex_create_stream_fallback() skipped all non-terminal events,
so when the fallback path was used and the terminal response had
empty output, there was no recovery. Now collects output_item.done
and text deltas during the fallback stream, backfills on empty output.
2. _normalize_codex_response() hard-crashed with RuntimeError when
output was empty, even when the response had output_text set. The
function already had fallback logic at line 3562 to use output_text,
but the guard at line 3446 killed it first. Now checks output_text
before raising and synthesizes a minimal output item.
Salvages the core fix from PR #5673 (egerev) onto current main.
The chatgpt.com/backend-api/codex endpoint streams valid output items
via response.output_item.done events, but the OpenAI SDK's
get_final_response() returns an empty output list. This caused every
Codex response to be rejected as invalid.
Fix: collect output_item.done events during streaming and backfill
response.output when get_final_response() returns empty. Falls back
to synthesizing from text deltas when no done events were received.
Also moves the synthesis logic from the validation loop (too late, from
#5681) into _run_codex_stream() (before the response leaves the
streaming function), and simplifies the validation to just log
diagnostics since recovery now happens upstream.
Co-authored-by: Egor <egerev@users.noreply.github.com>
OpenAI OAuth refresh tokens are single-use and rotate on every refresh.
When the Codex CLI (or another Hermes profile) refreshes its token, the
pool entry's refresh_token becomes stale. Subsequent refresh attempts
fail with invalid_grant, and the entry enters a 24-hour exhaustion
cooldown with no recovery path.
This mirrors the existing _sync_anthropic_entry_from_credentials_file()
pattern: when an openai-codex entry is exhausted, compare its
refresh_token against ~/.codex/auth.json and sync the fresh pair if
they differ.
Fixes the common scenario where users run 'codex login' to refresh
their token externally and Hermes never picks it up.
Co-authored-by: David Andrews (LexGenius.ai) <david@lexgenius.ai>
Three bugs causing OpenAI Codex sessions to fail silently:
1. Credential pool vs legacy store disconnect: hermes auth and hermes
model store device_code tokens in the credential pool, but
get_codex_auth_status(), resolve_codex_runtime_credentials(), and
_model_flow_openai_codex() only read from the legacy provider state.
Fresh pool tokens were invisible to the auth status checks and model
selection flow.
2. _import_codex_cli_tokens() imported expired tokens from ~/.codex/
without checking JWT expiry. Combined with _login_openai_codex()
saying 'Login successful!' for expired credentials, users got stuck
in a loop of dead tokens being recycled.
3. _login_openai_codex() accepted expired tokens from
resolve_codex_runtime_credentials() without validating expiry before
telling the user login succeeded.
Fixes:
- get_codex_auth_status() now checks credential pool first, falls back
to legacy provider state
- _model_flow_openai_codex() uses pool-aware auth status for token
retrieval when fetching model lists
- _import_codex_cli_tokens() validates JWT exp claim, rejects expired
- _login_openai_codex() verifies resolved token isn't expiring before
accepting existing credentials
- _run_codex_stream() logs response.incomplete/failed terminal events
with status and incomplete_details for diagnostics
- Codex empty output recovery: captures streamed text during streaming
and synthesizes a response when get_final_response() returns empty
output (handles chatgpt.com backend-api edge cases)
Two fixes:
1. Replace all stale 'hermes login' references with 'hermes auth' across
auth.py, auxiliary_client.py, delegate_tool.py, config.py, run_agent.py,
and documentation. The 'hermes login' command was deprecated; 'hermes auth'
now handles OAuth credential management.
2. Fix credential removal not persisting for singleton-sourced credentials
(device_code for openai-codex/nous, hermes_pkce for anthropic).
auth_remove_command already cleared env vars for env-sourced credentials,
but singleton credentials stored in the auth store were re-seeded by
_seed_from_singletons() on the next load_pool() call. Now clears the
underlying auth store entry when removing singleton-sourced credentials.
Add fine-grained authorization policies per Feishu group chat via
platforms.feishu.extra configuration.
- Add global bot-level admins that bypass all group restrictions
- Add per-group policies: open, allowlist, blacklist, admin_only, disabled
- Add default_group_policy fallback for chats without explicit rules
- Thread chat_id through group message gate for per-chat rule selection
- Match both open_id and user_id for backward compatibility
- Preserve existing FEISHU_ALLOWED_USERS / FEISHU_GROUP_POLICY behavior
- Add focused regression tests for all policy modes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidate coercion functions, extract loop readiness check, and deduplicate test mock setup to improve maintainability without changing behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reapply local reconnect and ping settings after the Feishu SDK refreshes its client config so user-provided websocket tuning actually takes effect.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Allow Feishu websocket keepalive timing to be configured via platform
extra config so disconnects can be detected faster in unstable networks.
New optional extra settings:
- ws_ping_interval
- ws_ping_timeout
These values are applied only when explicitly configured. Invalid values
fall back to the websocket library defaults by leaving the options unset.
This complements the reconnect timing settings added previously and helps
reduce total recovery time after network interruptions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Allow users to configure websocket reconnect behavior via platform extra
config to reduce reconnect latency in production environments.
The official Feishu SDK defaults to:
- First reconnect: random jitter 0-30 seconds
- Subsequent retries: 120 second intervals
This can cause 20-30 second delays before reconnection after network
interruptions. This commit makes these values configurable while keeping
the SDK defaults for backward compatibility.
Configuration via ~/.hermes/config.yaml:
```yaml
platforms:
feishu:
extra:
ws_reconnect_nonce: 0 # Disable first-reconnect jitter (default: 30)
ws_reconnect_interval: 3 # Retry every 3 seconds (default: 120)
```
Invalid values (negative numbers, non-integers) fall back to SDK defaults.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit fixes two critical bugs in the Feishu adapter that affect
message reliability and process lifecycle.
**Bug Fix 1: Intermittent Message Drops**
Root cause: Event handler was created once in __init__ and reused across
reconnects, causing callbacks to capture stale loop references. When the
adapter disconnected and reconnected, old callbacks continued firing with
invalid loop references, resulting in dropped messages with warnings:
"[Feishu] Dropping inbound message before adapter loop is ready"
Fix:
- Rebuild event handler on each connect (websocket/webhook)
- Clear handler on disconnect
- Ensure callbacks always capture current valid loop
- Add defensive loop.is_closed() checks with getattr for test compatibility
- Unify webhook dispatch path to use same loop checks as websocket mode
**Bug Fix 2: Process Hangs on Ctrl+C / SIGTERM**
Root cause: Feishu SDK's websocket client runs in a background thread with
an infinite _select() loop that never exits naturally. The thread was never
properly joined on disconnect, causing processes to hang indefinitely after
Ctrl+C or gateway stop commands.
Fix:
- Store reference to thread-local event loop (_ws_thread_loop)
- On disconnect, cancel all tasks in thread loop and stop it gracefully
via call_soon_threadsafe()
- Await thread future with 10s timeout
- Clean up pending tasks in thread's finally block before closing loop
- Add detailed debug logging for disconnect flow
**Additional Improvements:**
- Add regression tests for disconnect cleanup and webhook dispatch
- Ensure all event callbacks check loop readiness before dispatching
Tested on Linux with websocket mode. All Feishu tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two issues caused Matrix E2EE to silently not work in encrypted rooms:
1. When matrix-nio is installed without the [e2e] extra (no python-olm /
libolm), nio.crypto.ENCRYPTION_ENABLED is False and client.olm is
never initialized. The adapter logged warnings but returned True from
connect(), so the bot appeared online but could never decrypt messages.
Now: check_matrix_requirements() and connect() both hard-fail with a
clear error message when MATRIX_ENCRYPTION=true but E2EE deps are
missing.
2. Without a stable device_id, the bot gets a new device identity on each
restart. Other clients see it as "unknown device" and refuse to share
Megolm session keys. Now: MATRIX_DEVICE_ID env var lets users pin a
stable device identity that persists across restarts and is passed to
nio.AsyncClient constructor + restore_login().
Changes:
- gateway/platforms/matrix.py: add _check_e2ee_deps(), hard-fail in
connect() and check_matrix_requirements(), MATRIX_DEVICE_ID support
in constructor + restore_login
- gateway/config.py: plumb MATRIX_DEVICE_ID into platform extras
- hermes_cli/config.py: add MATRIX_DEVICE_ID to OPTIONAL_ENV_VARS
Closes#3521
When the Codex CLI (or VS Code extension) consumes a refresh token before
Hermes can use it, Hermes previously surfaced a generic 401 error with no
actionable guidance.
- In `refresh_codex_oauth_pure`: detect `refresh_token_reused` from the
OAuth endpoint and raise an AuthError explaining the cause and the exact
steps to recover (run `codex` to refresh, then `hermes login`).
- In `run_agent.py`: when provider is `openai-codex` and HTTP 401 is
received, show Codex-specific recovery steps instead of the generic
"check your API key" message.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- avoid hard-coded ~/.hermes paths in the setup and API shorthands
- prefer HERMES_HOME with a sane default to /Users/peteradams/.hermes
- keep the examples aligned with profile-aware Hermes installs
- fall back to adding the repo root to sys.path when hermes_constants is not importable
- fixes direct execution of setup.py and google_api.py from the repo checkout
- keeps the upstream PR scoped to the google-workspace compatibility fix
The Mattermost adapter downloads file attachments correctly but
never updates msg_type from TEXT to DOCUMENT. This means the
document enrichment block in gateway/run.py (which requires
MessageType.DOCUMENT) never executes — text files are not
inlined, and the agent is never notified about attached files.
The user sends a file, the adapter downloads it to the local
cache, but the agent sees an empty message and responds with
'I didn't receive any file'.
Set msg_type to DOCUMENT when file_ids is non-empty, matching
the behavior of the Telegram and Discord adapters.
The _async_flush_memories() helper accepts (session_id) but both the
/new and /resume handlers passed two arguments (session_id, session_key).
The TypeError was silently swallowed at DEBUG level, so memory extraction
never ran when users typed /new or /resume.
One call site (the session expiry watcher) was already fixed in 9c96f669,
but /new and /resume were missed.
- gateway/run.py:3247 — remove stray session_key from /new handler
- gateway/run.py:4989 — remove stray session_key from /resume handler
- tests/gateway/test_resume_command.py:222 — update test assertion
Previously the scheduler checked startswith('[SILENT]'), so agents that
appended [SILENT] after an explanation (e.g. 'N items filtered.\n\n[SILENT]')
would still trigger delivery.
Change the check to 'in' so the marker is caught regardless of position.
Add test_silent_trailing_suppresses_delivery to cover this case.
Ensures pending sessions are committed on process exit even if
shutdown_memory_provider is never called (gateway crash, SIGKILL,
or exception in _async_flush_memories preventing shutdown).
Also reorders on_session_end to wait for the pending sync thread
before checking turn_count, so the last turn's messages are flushed.
Based on PR #4919 by dagbs.
Updates the plugin build guide and features page to reflect the
interactive env var prompting added in PR #5470. Documents the rich
manifest format (name/description/url/secret) alongside the simple
string format.
- {__raw__} in webhook prompt templates dumps the full JSON payload (truncated at 4000 chars)
- _deliver_cross_platform now passes thread_id/message_thread_id from deliver_extra as metadata, enabling Telegram forum topic delivery
- Tests for both features
Follow-up to PR #5470. Replaces hardcoded ~/.hermes/.env references with
display_hermes_home() for correct behavior under profiles. Also updates
PluginManifest.requires_env type hint to List[Union[str, Dict[str, Any]]]
to document the rich format introduced in #5470.
Adds obsidian-headless (npm) setup guide to the Obsidian Integration
section — Node 22+, ob login, sync-create-remote, sync-setup, systemd
service for continuous background sync. Covers the full headless
workflow for agents running on servers syncing to Obsidian desktop on
other devices.
* feat(tools): add Firecrawl cloud browser provider
Adds Firecrawl (https://firecrawl.dev) as a cloud browser provider
alongside Browserbase and Browser Use. All browser tools route through
Firecrawl's cloud browser via CDP when selected.
- tools/browser_providers/firecrawl.py — FirecrawlProvider
- tools/browser_tool.py — register in _PROVIDER_REGISTRY
- hermes_cli/tools_config.py — add to onboarding provider picker
- hermes_cli/setup.py — add to setup summary
- hermes_cli/config.py — add FIRECRAWL_BROWSER_TTL config
- website/docs/ — browser docs and env var reference
Based on #4490 by @developersdigest.
Co-Authored-By: Developers Digest <124798203+developersdigest@users.noreply.github.com>
* refactor: simplify FirecrawlProvider.emergency_cleanup
Use self._headers() and self._api_url() instead of duplicating
env-var reads and header construction.
* fix: recognize Firecrawl in subscription browser detection
_resolve_browser_feature_state() now handles "firecrawl" as a direct
browser provider (same pattern as "browser-use"), so hermes setup
summary correctly shows "Browser Automation (Firecrawl)" instead of
misreporting as "Local browser".
Also fixes test_config_version_unchanged assertion (11 → 12).
---------
Co-authored-by: Developers Digest <124798203+developersdigest@users.noreply.github.com>