Commit Graph

2106 Commits

Author SHA1 Message Date
Teknium
e2e53d497f fix: recognize Claude Code OAuth credentials in startup gate (#1455)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

* fix(approval): show full command in dangerous command approval (#1553)

Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:

- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
  in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests

Cherry-picked from PR #1566 by crazywriter1.

* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)

The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.

Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.

* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)

When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.

Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.

Cherry-picked from PR #1593 by ygd58.

* fix(security): block sandbox backend creds from subprocess env (#1264)

Add Modal and Daytona sandbox credentials to the subprocess env
blocklist so they're not leaked to agent terminal sessions via
printenv/env.

Cherry-picked from PR #1571 by ygd58.

* fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816)

When a user sends multiple messages while the agent keeps failing,
_run_agent() calls itself recursively with no depth limit. This can
exhaust stack/memory if the agent is in a failure loop.

Add _MAX_INTERRUPT_DEPTH = 3. When exceeded, the pending message is
logged and the current result is returned instead of recursing deeper.

The log handler duplication bug described in #816 was already fixed
separately (AIAgent.__init__ deduplicates handlers).

* fix(gateway): /model shows active fallback model instead of config default (#1615)

When the agent falls back to a different model (e.g. due to rate
limiting), /model still showed the config default. Now tracks the
effective model/provider after each agent run and displays it.

Cleared when the primary model succeeds again or the user explicitly
switches via /model.

Cherry-picked from PR #1616 by MaxKerkula. Added hasattr guard for
test compatibility.

* feat(gateway): inject reply-to message context for out-of-session replies (#1594)

When a user replies to a Telegram message, check if the quoted text
exists in the current session transcript. If missing (from cron jobs,
background tasks, or old sessions), prepend [Replying to: "..."] to
the message so the agent has context about what's being referenced.

- Add reply_to_text field to MessageEvent (base.py)
- Populate from Telegram's reply_to_message (text or caption)
- Inject context in _handle_message when not found in history

Based on PR #1596 by anpicasso (cherry-picked reply-to feature only,
excluded unrelated /server command and background delegation changes).

* fix: recognize Claude Code OAuth credentials in startup gate (#1455)

The _has_any_provider_configured() startup check didn't look for
Claude Code OAuth credentials (~/.claude/.credentials.json). Users
with only Claude Code auth got the setup wizard instead of starting.

Cherry-picked from PR #1455 by kshitijk4poor.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
Co-authored-by: Max K <MaxKerkula@users.noreply.github.com>
Co-authored-by: Angello Picasso <angello.picasso@devsu.com>
Co-authored-by: kshitij <kshitijk4poor@users.noreply.github.com>
2026-03-17 02:32:16 -07:00
teknium1
693f5786ac perf: use ripgrep for file search (200x faster than find)
search_files(target='files') now uses rg --files -g instead of find.
Ripgrep respects .gitignore, excludes hidden dirs by default, and has
parallel directory traversal — ~200x faster on wide trees (0.14s vs 34s
benchmarked on 164-repo tree).

Falls back to find when rg is unavailable, preserving hidden-dir
exclusion and BSD find compatibility.

Salvaged from PR #1464 by @light-merlin-dark (Merlin) — adapted to
preserve hidden-dir exclusion added since the original PR.
2026-03-17 02:32:02 -07:00
Teknium
9ece1ce2de feat(gateway): inject reply-to message context for out-of-session replies (#1594)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

* fix(approval): show full command in dangerous command approval (#1553)

Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:

- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
  in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests

Cherry-picked from PR #1566 by crazywriter1.

* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)

The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.

Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.

* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)

When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.

Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.

Cherry-picked from PR #1593 by ygd58.

* fix(security): block sandbox backend creds from subprocess env (#1264)

Add Modal and Daytona sandbox credentials to the subprocess env
blocklist so they're not leaked to agent terminal sessions via
printenv/env.

Cherry-picked from PR #1571 by ygd58.

* fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816)

When a user sends multiple messages while the agent keeps failing,
_run_agent() calls itself recursively with no depth limit. This can
exhaust stack/memory if the agent is in a failure loop.

Add _MAX_INTERRUPT_DEPTH = 3. When exceeded, the pending message is
logged and the current result is returned instead of recursing deeper.

The log handler duplication bug described in #816 was already fixed
separately (AIAgent.__init__ deduplicates handlers).

* fix(gateway): /model shows active fallback model instead of config default (#1615)

When the agent falls back to a different model (e.g. due to rate
limiting), /model still showed the config default. Now tracks the
effective model/provider after each agent run and displays it.

Cleared when the primary model succeeds again or the user explicitly
switches via /model.

Cherry-picked from PR #1616 by MaxKerkula. Added hasattr guard for
test compatibility.

* feat(gateway): inject reply-to message context for out-of-session replies (#1594)

When a user replies to a Telegram message, check if the quoted text
exists in the current session transcript. If missing (from cron jobs,
background tasks, or old sessions), prepend [Replying to: "..."] to
the message so the agent has context about what's being referenced.

- Add reply_to_text field to MessageEvent (base.py)
- Populate from Telegram's reply_to_message (text or caption)
- Inject context in _handle_message when not found in history

Based on PR #1596 by anpicasso (cherry-picked reply-to feature only,
excluded unrelated /server command and background delegation changes).

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
Co-authored-by: Max K <MaxKerkula@users.noreply.github.com>
Co-authored-by: Angello Picasso <angello.picasso@devsu.com>
2026-03-17 02:31:27 -07:00
Teknium
36a76bf9db Merge pull request #1661 from NousResearch/fix/discord-thread-persistence
fix(discord): persist thread participation across gateway restarts
2026-03-17 02:27:09 -07:00
Teknium
d0faf77208 fix(gateway): /model shows active fallback model instead of config default (#1615)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

* fix(approval): show full command in dangerous command approval (#1553)

Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:

- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
  in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests

Cherry-picked from PR #1566 by crazywriter1.

* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)

The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.

Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.

* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)

When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.

Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.

Cherry-picked from PR #1593 by ygd58.

* fix(security): block sandbox backend creds from subprocess env (#1264)

Add Modal and Daytona sandbox credentials to the subprocess env
blocklist so they're not leaked to agent terminal sessions via
printenv/env.

Cherry-picked from PR #1571 by ygd58.

* fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816)

When a user sends multiple messages while the agent keeps failing,
_run_agent() calls itself recursively with no depth limit. This can
exhaust stack/memory if the agent is in a failure loop.

Add _MAX_INTERRUPT_DEPTH = 3. When exceeded, the pending message is
logged and the current result is returned instead of recursing deeper.

The log handler duplication bug described in #816 was already fixed
separately (AIAgent.__init__ deduplicates handlers).

* fix(gateway): /model shows active fallback model instead of config default (#1615)

When the agent falls back to a different model (e.g. due to rate
limiting), /model still showed the config default. Now tracks the
effective model/provider after each agent run and displays it.

Cleared when the primary model succeeds again or the user explicitly
switches via /model.

Cherry-picked from PR #1616 by MaxKerkula. Added hasattr guard for
test compatibility.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
Co-authored-by: Max K <MaxKerkula@users.noreply.github.com>
2026-03-17 02:26:51 -07:00
teknium1
c8582fc4a2 fix(discord): persist thread participation across gateway restarts
_bot_participated_threads was an in-memory set — lost on every restart.
After restart, the bot forgot which threads it was active in, requiring
fresh @mentions and potentially creating duplicate threads instead of
continuing existing conversations.

Changes:
- Persist thread IDs to ~/.hermes/discord_threads.json
- Load on adapter init, save on every new thread participation
- _track_thread() replaces direct .add() calls for atomic persist
- Cap at 500 tracked threads to prevent unbounded growth
- /thread slash command also tracks participation
- 7 new tests covering persistence, restart survival, corruption
  recovery, cap enforcement
2026-03-17 02:26:34 -07:00
Teknium
60b67e2b47 fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

* fix(approval): show full command in dangerous command approval (#1553)

Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:

- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
  in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests

Cherry-picked from PR #1566 by crazywriter1.

* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)

The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.

Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.

* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)

When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.

Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.

Cherry-picked from PR #1593 by ygd58.

* fix(security): block sandbox backend creds from subprocess env (#1264)

Add Modal and Daytona sandbox credentials to the subprocess env
blocklist so they're not leaked to agent terminal sessions via
printenv/env.

Cherry-picked from PR #1571 by ygd58.

* fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816)

When a user sends multiple messages while the agent keeps failing,
_run_agent() calls itself recursively with no depth limit. This can
exhaust stack/memory if the agent is in a failure loop.

Add _MAX_INTERRUPT_DEPTH = 3. When exceeded, the pending message is
logged and the current result is returned instead of recursing deeper.

The log handler duplication bug described in #816 was already fixed
separately (AIAgent.__init__ deduplicates handlers).

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
2026-03-17 02:23:07 -07:00
Teknium
2c7c30be69 fix(security): harden terminal safety and sandbox file writes (#1653)
* fix(security): harden terminal safety and sandbox file writes

Two security improvements:

1. Dangerous command detection: expand shell -c pattern to catch
   combined flags (bash -lc, bash -ic, ksh -c) that were previously
   undetected. Pattern changed from matching only 'bash -c' to
   matching any shell invocation with -c anywhere in the flags.

2. File write sandboxing: add HERMES_WRITE_SAFE_ROOT env var that
   constrains all write_file/patch operations to a configured directory
   tree. Opt-in — when unset, behavior is unchanged. Useful for
   gateway/messaging deployments that should only touch a workspace.

Based on PR #1085 by ismoilh.

* fix: correct "POSIDEON" typo to "POSEIDON" in banner ASCII art

The poseidon skin's banner_logo had the E and I letters swapped,
spelling "POSIDEON-AGENT" instead of "POSEIDON-AGENT".

---------

Co-authored-by: ismoilh <ismoilh@users.noreply.github.com>
Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>
2026-03-17 02:22:12 -07:00
Teknium
6a320e8bfe fix(security): block sandbox backend creds from subprocess env (#1264)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

* fix(approval): show full command in dangerous command approval (#1553)

Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:

- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
  in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests

Cherry-picked from PR #1566 by crazywriter1.

* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)

The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.

Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.

* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)

When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.

Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.

Cherry-picked from PR #1593 by ygd58.

* fix(security): block sandbox backend creds from subprocess env (#1264)

Add Modal and Daytona sandbox credentials to the subprocess env
blocklist so they're not leaked to agent terminal sessions via
printenv/env.

Cherry-picked from PR #1571 by ygd58.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
2026-03-17 02:20:42 -07:00
Teknium
cb0deb5f9d feat: add NeuTTS optional skill + local TTS provider backend
* feat(skills): add bundled neutts optional skill

Add NeuTTS optional skill with CLI scaffold, bootstrap helper, and
sample voice profile. Also fixes skills_hub.py to handle binary
assets (WAV files) during skill installation.

Changes:
- optional-skills/mlops/models/neutts/ — skill + CLI scaffold
- tools/skills_hub.py — binary asset support (read_bytes, write_bytes)
- tests/tools/test_skills_hub.py — regression tests for binary assets

* feat(tts): add NeuTTS as local TTS provider backend

Add NeuTTS as a fourth TTS provider option alongside Edge, ElevenLabs,
and OpenAI. NeuTTS runs fully on-device via neutts_cli — no API key
needed.

Provider behavior:
- Explicit: set tts.provider to 'neutts' in config.yaml
- Fallback: when Edge TTS is unavailable and neutts_cli is installed,
  automatically falls back to NeuTTS instead of failing
- check_tts_requirements() now includes NeuTTS in availability checks

NeuTTS outputs WAV natively. For Telegram voice bubbles, ffmpeg
converts to Opus (same pattern as Edge TTS).

Changes:
- tools/tts_tool.py — _generate_neutts(), _check_neutts_available(),
  provider dispatch, fallback logic, Opus conversion
- hermes_cli/config.py — tts.neutts config defaults

---------

Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>
2026-03-17 02:13:34 -07:00
Teknium
766f4aae2b refactor: tie api_mode to provider config instead of env var (#1656)
Remove HERMES_API_MODE env var. api_mode is now configured where the
endpoint is defined:

- model.api_mode in config.yaml (for the active model config)
- custom_providers[].api_mode (for named custom providers)

Replace _get_configured_api_mode() with _parse_api_mode() which just
validates a value against the whitelist without reading env vars.

Both paths (model config and named custom providers) now read api_mode
from their respective config entries rather than a global override.
2026-03-17 02:13:26 -07:00
Teknium
4e66d22151 fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

* fix(approval): show full command in dangerous command approval (#1553)

Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:

- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
  in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests

Cherry-picked from PR #1566 by crazywriter1.

* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)

The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.

Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.

* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)

When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.

Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.

Cherry-picked from PR #1593 by ygd58.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
2026-03-17 02:10:36 -07:00
Teknium
8992babaa3 fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

* fix(approval): show full command in dangerous command approval (#1553)

Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:

- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
  in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests

Cherry-picked from PR #1566 by crazywriter1.

* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)

The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.

Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
2026-03-17 02:09:26 -07:00
Teknium
49043b7b7d feat: add /tools disable/enable/list slash commands with session reset (#1652)
Add in-session tool management via /tools disable/enable/list, plus
hermes tools list/disable/enable CLI subcommands. Supports both
built-in toolsets (web, memory) and MCP tools (github:create_issue).

To preserve prompt caching, /tools disable/enable in a chat session
saves the change to config and resets the session cleanly — the user
is asked to confirm before the reset happens.

Also improves prefix matching: /qui now dispatches to /quit instead
of showing ambiguous when longer skill commands like /quint-pipeline
are installed.

Based on PR #1520 by @YanSte.

Co-authored-by: Yannick Stephan <YanSte@users.noreply.github.com>
2026-03-17 02:05:26 -07:00
Teknium
f2414bfd45 feat: allow custom endpoints to use responses API via api_mode override (#1651)
Add HERMES_API_MODE env var and model.api_mode config field to let
custom OpenAI-compatible endpoints opt into codex_responses mode
without requiring the OpenAI Codex OAuth provider path.

- _get_configured_api_mode() reads HERMES_API_MODE env (precedence)
  then model.api_mode from config.yaml; validates against whitelist
- Applied in both _resolve_openrouter_runtime() and
  _resolve_named_custom_runtime() (original PR only covered openrouter)
- Fix _dump_api_request_debug() to show /responses URL when in
  codex_responses mode instead of always showing /chat/completions
- Tests for config override, env override, invalid values, named
  custom providers, and debug dump URL for both API modes

Inspired by PR #1041 by @mxyhi.

Co-authored-by: mxyhi <mxyhi@users.noreply.github.com>
2026-03-17 02:04:36 -07:00
0xbyt4
68fbcdaa06 fix: add browser_console to browser toolset and core tools list (#1084)
browser_console was registered in the tool registry but missing from
all toolset definitions (TOOLSETS, _HERMES_CORE_TOOLS, _LEGACY_TOOLSET_MAP),
so the agent could never discover or use it.

Added to all 4 locations + 4 wiring tests.

Cherry-picked from PR #1084 by @0xbyt4 (authorship preserved in tests).
2026-03-17 02:02:57 -07:00
teknium1
7d91b436e4 fix: exclude hidden directories from find/grep search backends (#1558)
The primary injection vector in #1558 was search_files discovering
catalog cache files in .hub/index-cache/ via find or grep, which
don't skip hidden directories like ripgrep does by default.

Three-layer fix:

1. _search_files (find): add -not -path '*/.*' to exclude hidden
   directories, matching ripgrep's default behavior.

2. _search_with_grep: add --exclude-dir='.*' to skip hidden
   directories in the grep fallback path.

3. _write_index_cache: write a .ignore file to .hub/ so ripgrep
   also skips it even when invoked with --hidden (belt-and-suspenders).

This makes all three search backends (rg, grep, find) consistently
exclude hidden directories, preventing the agent from discovering
and reading unvetted community content in hub cache files.
2026-03-17 02:02:57 -07:00
Teknium
40e2f8d9f0 feat(provider): add OpenCode Zen and OpenCode Go providers
Add support for OpenCode Zen (pay-as-you-go, 35+ curated models) and
OpenCode Go ($10/month subscription, open models) as first-class providers.

Both are OpenAI-compatible endpoints resolved via the generic api_key
provider flow — no custom adapter needed.

Files changed:
- hermes_cli/auth.py — ProviderConfig entries + aliases
- hermes_cli/config.py — OPENCODE_ZEN/GO API key env vars
- hermes_cli/models.py — model catalogs, labels, aliases, provider order
- hermes_cli/main.py — provider labels, menu entries, model flow dispatch
- hermes_cli/setup.py — setup wizard branches (idx 10, 11)
- agent/model_metadata.py — context lengths for all OpenCode models
- agent/auxiliary_client.py — default aux models
- .env.example — documentation

Co-authored-by: DevAgarwal2 <DevAgarwal2@users.noreply.github.com>
2026-03-17 02:02:43 -07:00
Teknium
4cb6735541 fix(approval): show full command in dangerous command approval (#1553)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

* fix(approval): show full command in dangerous command approval (#1553)

Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:

- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
  in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests

Cherry-picked from PR #1566 by crazywriter1.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
2026-03-17 02:02:33 -07:00
teknium1
0351e4fa90 fix: add metadata param to base send_image and forward in send_animation
_send_response_parts() calls send_image(metadata=_thread_metadata) but
the base class signature didn't accept metadata, crashing platforms that
don't override send_image. send_animation already had the param but
wasn't forwarding it.

Credit: @0xbyt4 (PR #1077)
2026-03-17 02:02:28 -07:00
Teknium
1b2d6c424c fix: add --yes flag to bypass confirmation in /skills install and uninstall (#1647)
Fixes hanging when using /skills install or /skills uninstall from the
TUI — bare input() calls hang inside prompt_toolkit's event loop.

Changes:
- Add skip_confirm parameter to do_install() and do_uninstall()
- Separate --yes/-y (confirmation bypass) from --force (scan override)
  in both argparse and slash command handlers
- Update usage hint for /skills uninstall to show [--yes]

The original PR (#1595) accidentally deleted the install_from_quarantine()
call, which would have broken all installs. That bug is not present here.

Based on PR #1595 by 333Alden333.

Co-authored-by: 333Alden333 <333Alden333@users.noreply.github.com>
2026-03-17 01:59:07 -07:00
Teknium
28c35d045d Merge pull request #1537 from aydnOktay/improve/skill-manager-error-logging
Improve error logging in skill manager tool
2026-03-17 01:53:58 -07:00
Teknium
1f6a1f0028 fix(tools): chunk long messages in send_message_tool before platform dispatch
* add base support

* fix: correct skill author attribution to youssefea

* fix(tools): chunk long messages in send_message_tool before platform dispatch

  - Convert BasePlatformAdapter.truncate_message() to @staticmethod
  - Apply truncate_message() in _send_to_platform() with per-platform
    max lengths
  - Remove naive character split in _send_discord()
  - Attach media files to last chunk only for Telegram
  - Add regression tests for chunking and media placement

---------

Co-authored-by: youssefea <youcefea99@gmail.com>
Co-authored-by: llbn <46884939+llbn@users.noreply.github.com>
2026-03-17 01:52:51 -07:00
Teknium
d7029489d6 fix: show custom endpoint models in /model via live API probe (#1645)
Add 'custom' to the provider order so custom OpenAI-compatible
endpoints appear in /model list. Probes the endpoint's /models API
to dynamically discover available models.

Changes:
- Add 'custom' to _PROVIDER_ORDER in list_available_providers()
- Add _get_custom_base_url() helper to read model.base_url from config
- Add custom branch in provider_model_ids() using fetch_api_models()
- Custom endpoint detection via base_url presence for has_creds check

Based on PR #1612 by @aashizpoudel.

Co-authored-by: Aashish Poudel <aashizpoudel@users.noreply.github.com>
2026-03-17 01:52:46 -07:00
Teknium
12afccd9ca fix(tools): chunk long messages in send_message_tool before dispatch (#1552)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
2026-03-17 01:52:43 -07:00
Teknium
81f76111b0 Merge pull request #1560 from eren-karakus0/fix/singularity-preflight-check
fix(terminal): add Singularity/Apptainer preflight availability check
2026-03-17 01:52:03 -07:00
Teknium
96dac22194 fix: prevent infinite 400 loop on context overflow + block prompt injection via cache files (#1630, #1558)
* fix: prevent infinite 400 failure loop on context overflow (#1630)

When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.

* fix(skills): detect prompt injection patterns and block cache file reads

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.

---------

Co-authored-by: buray <ygd58@users.noreply.github.com>
2026-03-17 01:50:59 -07:00
Teknium
2d36819503 feat: add Base blockchain optional skill
* add base support

* fix: correct skill author attribution to youssefea

---------

Co-authored-by: youssefea <youcefea99@gmail.com>
2026-03-17 01:50:03 -07:00
Teknium
8e20a7e035 fix(gateway): strip MEDIA: and [[audio_as_voice]] tags from message body
* fix(gateway): strip MEDIA: and [[audio_as_voice]] tags from message body

Closes #1561

* fix: remove redundant re import, use existing import

---------

Co-authored-by: mettin4 <coktinmetin@gmail.com>
2026-03-17 01:47:35 -07:00
Teknium
4920c5940f feat: auto-detect local file paths in gateway responses for native media delivery (#1640)
Small models (7B-14B) can't reliably use MEDIA: or IMAGE: syntax. This
adds extract_local_files() to BasePlatformAdapter that regex-detects
bare local file paths ending in image/video extensions, validates them
with os.path.isfile(), and delivers them as native platform attachments.

Hardened over the original PR:
- Code-block exclusion: paths inside fenced blocks and inline code are
  skipped so code samples are never mutilated
- URL rejection: negative lookbehind prevents matching path segments
  inside HTTP URLs
- Relative path rejection: ./foo.png no longer matches
- Tilde path cleanup: raw ~/... form is removed from response text
- Deduplication by expanded path
- Added .webm to _VIDEO_EXTS
- Fallback to send_document for unrecognized media extensions

Based on PR #1636 by sudoingX.

Co-authored-by: sudoingX <sudoingX@users.noreply.github.com>
2026-03-17 01:47:34 -07:00
Teknium
3744118311 feat(cli): two-stage /model autocomplete with ghost text suggestions (#1641)
* feat(cli): two-stage /model autocomplete with ghost text suggestions

- SlashCommandCompleter: Tab-complete providers first (anthropic:, openrouter:, etc.)
  then models within the selected provider
- SlashCommandAutoSuggest: inline ghost text for slash commands, subcommands,
  and /model provider:model two-stage suggestions
- Custom Tab key binding: accepts provider completion and immediately
  re-triggers completions to show that provider's models
- COMMANDS_BY_CATEGORY: structured format with explicit subcommands for
  tab completion and ghost text (prompt, reasoning, voice, skills, cron, browser)
- SUBCOMMANDS dict auto-extracted from command definitions
- Model/provider info cached 60s for responsive completions

* fix: repair test regression and restore gold color from PR #1622

- Fix test_unknown_command_still_shows_error: patch _cprint instead of
  console.print to match the _cprint switch in process_command()
- Restore gold color on 'Type /help' hint using _DIM + _GOLD constants
  instead of bare \033[2m (was losing the #B8860B gold)
- Use _GOLD constant for ambiguous command message for consistency
- Add clarifying comment on SUBCOMMANDS regex fallback

---------

Co-authored-by: Lars van der Zande <lmvanderzande@gmail.com>
2026-03-17 01:47:32 -07:00
Teknium
5ada0b95e9 Merge pull request #1609 from 0xbyt4/fix/context-counter-cache-tokens
fix: context counter shows cached token count in status bar
2026-03-17 01:45:12 -07:00
teknium1
19eaf5d956 test: fix telegram mock to include ParseMode constant
The MarkdownV2 formatting change imports telegram.constants.ParseMode,
which the test mock didn't provide. Add ParseMode to the mock so
existing tests continue working.
2026-03-17 01:44:11 -07:00
Alex Ferrari
365d175100 fix: apply MarkdownV2 formatting in _send_telegram for proper rendering
The _send_telegram() function was sending raw markdown text without
parse_mode, causing bold, links, and headers to render as plain text.
This fix reuses the gateway adapter's format_message() to convert
markdown to Telegram's MarkdownV2 format, with a fallback to plain
text if parsing fails.
2026-03-17 01:44:11 -07:00
Teknium
c3ca68d25b Merge pull request #1614 from PeterFile/fix/launchd-service-recovery
fix(gateway): recover stale launchd service state
2026-03-17 01:43:07 -07:00
Teknium
eaa9ceeb43 Merge pull request #1621 from Death-Incarnate/main
fix: isolate test_anthropic_adapter from local credentials
2026-03-17 01:40:39 -07:00
Teknium
949fac192f fix(tools): remove unnecessary crontab requirement from cronjob tool (#1638)
* fix(tools): remove unnecessary crontab requirement from cronjob tool

The hermes cron system is internal — it uses a JSON-based scheduler
ticked by the gateway (cron/scheduler.py), not system crontab.

The check for shutil.which('crontab') was preventing the cronjob tool
from being available in environments without crontab installed (e.g.
minimal Ubuntu containers).

Changes:
- Remove shutil.which('crontab') check from check_cronjob_requirements()
- Remove unused shutil import
- Update docstring to clarify internal scheduler is used
- Update tests to reflect new behavior and add coverage for all
  session modes (interactive, gateway, exec_ask)

Fixes #1589

* test: add HERMES_EXEC_ASK coverage for cronjob requirements

Adds missing test for the exec_ask session mode, complementing
the cherry-picked fix from PR #1633.

---------

Co-authored-by: Bartok9 <bartokmagic@proton.me>
2026-03-17 01:40:02 -07:00
Teknium
4b96d10bc3 fix(cli): invalidate update-check cache after hermes update
Signed-off-by: nidhi-singh02 <nidhi2894@gmail.com>
Co-authored-by: nidhi-singh02 <nidhi2894@gmail.com>
2026-03-17 01:38:11 -07:00
teknium1
c16870277c test: add regression test for stale PID in gateway_state.json (#1631)
Verifies that write_runtime_status() overwrites pid and start_time
from a previous process rather than preserving them via setdefault().
Covers the fix from PR #1632.
2026-03-17 01:35:02 -07:00
Teknium
247e3c1470 Merge pull request #1632 from nidhi-singh02/fix/stale-pid-gateway-state
fix(gateway): overwrite stale PID in gateway_state.json on restart
2026-03-17 01:34:24 -07:00
Teknium
2af4af6390 Merge pull request #1635 from NousResearch/hermes/hermes-a86162db
fix: sanitize corrupted .env files on read and during migration
2026-03-17 01:33:36 -07:00
Teknium
749e9977a0 Merge pull request #1629 from NousResearch/hermes/hermes-6891ac11
feat(browser): multi-provider cloud browser support + Browser Use integration
2026-03-17 01:32:38 -07:00
teknium1
1c61ab6bd9 fix: unconditionally clear ANTHROPIC_TOKEN on v8→v9 migration
No conditional checks — just clear it. The new auth flow doesn't use
this env var. Anyone upgrading gets it wiped once, then it's done.
2026-03-17 01:31:20 -07:00
teknium1
e9f1a8e39b fix: gate ANTHROPIC_TOKEN cleanup to config version 8→9 migration
- Bump _config_version 8 → 9
- Move stale ANTHROPIC_TOKEN clearing into 'if current_ver < 9' block
  so it only runs once during the upgrade, not on every migrate_config()
- ANTHROPIC_TOKEN is still a valid auth path (OAuth flow), so we don't
  want to clear it repeatedly — only during the one-time migration from
  old setups that left it stale
- Add test_skips_on_version_9_or_later to verify one-time behavior
- All tests set config version 8 to trigger migration
2026-03-17 01:28:38 -07:00
teknium1
b6a51c955e fix: clear stale ANTHROPIC_TOKEN during migration, remove false *** detection
- Remove *** placeholder detection from _sanitize_env_lines (was based on
  confusing terminal redaction with literal file content)
- Add migrate_config() logic to clear stale ANTHROPIC_TOKEN when better
  credentials exist (ANTHROPIC_API_KEY or Claude Code auto-discovery)
- Old ANTHROPIC_TOKEN values shadow Claude Code credential fallthrough,
  breaking auth for users who updated without re-running setup
- Preserves ANTHROPIC_TOKEN when it's the only auth method available
- 3 new migration tests, updated existing tests
2026-03-17 01:26:23 -07:00
teknium1
634c1f6752 fix: sanitize corrupted .env files on read and during migration
Fixes two corruption patterns that break API keys during updates:

1. Concatenated KEY=VALUE pairs on a single line due to missing newlines
   (e.g. ANTHROPIC_API_KEY=sk-...OPENAI_BASE_URL=https://...). Uses a
   known-keys set to safely detect and split concatenated entries without
   false-splitting values that contain uppercase text.

2. Stale KEY=*** placeholder entries left by incomplete setup runs that
   never get updated and shadow real credentials.

Changes:
- Add _sanitize_env_lines() that splits concatenated known keys and drops
  *** placeholders
- Add sanitize_env_file() public API for explicit repair
- Call sanitization in save_env_value() on every read (self-healing)
- Call sanitize_env_file() at the start of migrate_config() so existing
  corrupted files are repaired on update
- 12 new tests covering splits, placeholders, edge cases, and integration
2026-03-17 01:13:34 -07:00
Teknium
6ebb816e56 Merge pull request #1634 from NousResearch/hermes/hermes-a86162db
chore: release v0.3.0 (v2026.3.17)
2026-03-17 00:55:51 -07:00
teknium1
37862f74fa chore: release v0.3.0 (v2026.3.17)
- Bump version 0.2.0 → 0.3.0
- Add comprehensive changelog (248 merged PRs, 15 contributors)
- CalVer tag: v2026.3.17
2026-03-17 00:38:48 -07:00
nidhi-singh02
67546746d4 fix(gateway): overwrite stale PID in gateway_state.json on restart
Signed-off-by: nidhi-singh02 <nidhi2894@gmail.com>
2026-03-17 13:01:55 +05:30
ShawnPana
d44b6b7f1b feat(browser): multi-provider cloud browser support + Browser Use integration
Introduce a cloud browser provider abstraction so users can switch
between Local Browser, Browserbase, and Browser Use (or future providers)
via hermes tools / hermes setup.

Cloud browser providers are behind an ABC (tools/browser_providers/base.py)
so adding a new provider is a single-file addition with no changes to
browser_tool.py internals.

Changes:
- tools/browser_providers/ package with ABC, Browserbase extraction,
  and Browser Use provider
- browser_tool.py refactored to use _PROVIDER_REGISTRY + _get_cloud_provider()
  (cached) instead of hardcoded _is_local_mode() / _create_browserbase_session()
- tools_config.py: generic _is_provider_active() / _detect_active_provider_index()
  replace TTS-only logic; Browser Use added as third browser option
- config.py: BROWSER_USE_API_KEY added to OPTIONAL_ENV_VARS + show_config + allowlist
- subprocess pipe hang fix: agent-browser daemon inherits pipe fds,
  communicate() blocks. Replaced with Popen + temp files.

Original PR: #1208
Co-authored-by: ShawnPana <shawnpana@users.noreply.github.com>
2026-03-17 00:16:34 -07:00