fix(gateway): /model shows active fallback model instead of config default (#1615)
* fix: prevent infinite 400 failure loop on context overflow (#1630) When a gateway session exceeds the model's context window, Anthropic may return a generic 400 invalid_request_error with just 'Error' as the message. This bypassed the phrase-based context-length detection, causing the agent to treat it as a non-retryable client error. Worse, the failed user message was still persisted to the transcript, making the session even larger on each attempt — creating an infinite loop. Three-layer fix: 1. run_agent.py — Fallback heuristic: when a 400 error has a very short generic message AND the session is large (>40% of context or >80 messages), treat it as a probable context overflow and trigger compression instead of aborting. 2. run_agent.py + gateway/run.py — Don't persist failed messages: when the agent returns failed=True before generating any response, skip writing the user's message to the transcript/DB. This prevents the session from growing on each failure. 3. gateway/run.py — Smarter error messages: detect context-overflow failures and suggest /compact or /reset specifically, instead of a generic 'try again' that will fail identically. * fix(skills): detect prompt injection patterns and block cache file reads Adds two security layers to prevent prompt injection via skills hub cache files (#1558): 1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json was the original injection vector — untrusted skill descriptions in the catalog contained adversarial text that the model executed. 2. skill_view: warns when skills are loaded from outside the trusted ~/.hermes/skills/ directory, and detects common injection patterns in skill content ("ignore previous instructions", "<system>", etc.). Cherry-picked from PR #1562 by ygd58. * fix(tools): chunk long messages in send_message_tool before dispatch (#1552) Long messages sent via send_message tool or cron delivery silently failed when exceeding platform limits. Gateway adapters handle this via truncate_message(), but the standalone senders in send_message_tool bypassed that entirely. - Apply truncate_message() chunking in _send_to_platform() before dispatching to individual platform senders - Remove naive message[i:i+2000] character split in _send_discord() in favor of centralized smart splitting - Attach media files to last chunk only for Telegram - Add regression tests for chunking and media placement Cherry-picked from PR #1557 by llbn. * fix(approval): show full command in dangerous command approval (#1553) Previously the command was truncated to 80 chars in CLI (with a [v]iew full option), 500 chars in Discord embeds, and missing entirely in Telegram/Slack approval messages. Now the full command is always displayed everywhere: - CLI: removed 80-char truncation and [v]iew full menu option - Gateway (TG/Slack): approval_required message includes full command in a code block - Discord: embed shows full command up to 4096-char limit - Windows: skip SIGALRM-based test timeout (Unix-only) - Updated tests: replaced view-flow tests with direct approval tests Cherry-picked from PR #1566 by crazywriter1. * fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624) The interrupt polling loop in chat() waited on the queue without invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy buffer only flushed on input events, causing the CLI to appear frozen during tool execution until the user typed a key. Fix: call _invalidate() on each queue timeout (every ~100ms, throttled to 150ms) to force the renderer to flush buffered agent output. * fix(claw): warn when API keys are skipped during OpenClaw migration (#1580) When --migrate-secrets is not passed (the default), API keys like OPENROUTER_API_KEY are silently skipped with no warning. Users don't realize their keys weren't migrated until the agent fails to connect. Add a post-migration warning with actionable instructions: either re-run with --migrate-secrets or add the key manually via hermes config set. Cherry-picked from PR #1593 by ygd58. * fix(security): block sandbox backend creds from subprocess env (#1264) Add Modal and Daytona sandbox credentials to the subprocess env blocklist so they're not leaked to agent terminal sessions via printenv/env. Cherry-picked from PR #1571 by ygd58. * fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816) When a user sends multiple messages while the agent keeps failing, _run_agent() calls itself recursively with no depth limit. This can exhaust stack/memory if the agent is in a failure loop. Add _MAX_INTERRUPT_DEPTH = 3. When exceeded, the pending message is logged and the current result is returned instead of recursing deeper. The log handler duplication bug described in #816 was already fixed separately (AIAgent.__init__ deduplicates handlers). * fix(gateway): /model shows active fallback model instead of config default (#1615) When the agent falls back to a different model (e.g. due to rate limiting), /model still showed the config default. Now tracks the effective model/provider after each agent run and displays it. Cleared when the primary model succeeds again or the user explicitly switches via /model. Cherry-picked from PR #1616 by MaxKerkula. Added hasattr guard for test compatibility. --------- Co-authored-by: buray <ygd58@users.noreply.github.com> Co-authored-by: lbn <llbn@users.noreply.github.com> Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com> Co-authored-by: Max K <MaxKerkula@users.noreply.github.com>
This commit is contained in:
@@ -342,7 +342,13 @@ class GatewayRunner:
|
||||
# Key: session_key, Value: AIAgent instance
|
||||
self._running_agents: Dict[str, Any] = {}
|
||||
self._pending_messages: Dict[str, str] = {} # Queued messages during interrupt
|
||||
|
||||
|
||||
# Track active fallback model/provider when primary is rate-limited.
|
||||
# Set after an agent run where fallback was activated; cleared when
|
||||
# the primary model succeeds again or the user switches via /model.
|
||||
self._effective_model: Optional[str] = None
|
||||
self._effective_provider: Optional[str] = None
|
||||
|
||||
# Track pending exec approvals per session
|
||||
# Key: session_key, Value: {"command": str, "pattern_key": str, ...}
|
||||
self._pending_approvals: Dict[str, Dict[str, Any]] = {}
|
||||
@@ -2204,6 +2210,21 @@ class GatewayRunner:
|
||||
current_provider = "custom"
|
||||
|
||||
if not args:
|
||||
# If a fallback model is active, show it instead of config
|
||||
if self._effective_model:
|
||||
eff_provider = self._effective_provider or 'unknown'
|
||||
eff_label = _PROVIDER_LABELS.get(eff_provider, eff_provider)
|
||||
cfg_label = _PROVIDER_LABELS.get(current_provider, current_provider)
|
||||
lines = [
|
||||
f"🤖 **Active model:** `{self._effective_model}` (fallback)",
|
||||
f"**Provider:** {eff_label}",
|
||||
f"**Primary model** (`{current}` via {cfg_label}) is rate-limited.",
|
||||
"",
|
||||
]
|
||||
lines.append("To change: `/model model-name`")
|
||||
lines.append("Switch provider: `/model provider:model-name`")
|
||||
return "\n".join(lines)
|
||||
|
||||
provider_label = _PROVIDER_LABELS.get(current_provider, current_provider)
|
||||
lines = [
|
||||
f"🤖 **Current model:** `{current}`",
|
||||
@@ -2303,6 +2324,9 @@ class GatewayRunner:
|
||||
persist_note = "saved to config"
|
||||
else:
|
||||
persist_note = "this session only — will revert on restart"
|
||||
# Clear fallback state since user explicitly chose a model
|
||||
self._effective_model = None
|
||||
self._effective_provider = None
|
||||
return f"🤖 Model changed to `{new_model}` ({persist_note}){provider_note}{warning}\n_(takes effect on next message)_"
|
||||
|
||||
async def _handle_provider_command(self, event: MessageEvent) -> str:
|
||||
@@ -4531,7 +4555,21 @@ class GatewayRunner:
|
||||
# Run in thread pool to not block
|
||||
loop = asyncio.get_event_loop()
|
||||
response = await loop.run_in_executor(None, run_sync)
|
||||
|
||||
|
||||
# Track fallback model state: if the agent switched to a
|
||||
# fallback model during this run, persist it so /model shows
|
||||
# the actually-active model instead of the config default.
|
||||
_agent = agent_holder[0]
|
||||
if _agent is not None and hasattr(_agent, 'model'):
|
||||
_cfg_model = _resolve_gateway_model()
|
||||
if _agent.model != _cfg_model:
|
||||
self._effective_model = _agent.model
|
||||
self._effective_provider = getattr(_agent, 'provider', None)
|
||||
else:
|
||||
# Primary model worked — clear any stale fallback state
|
||||
self._effective_model = None
|
||||
self._effective_provider = None
|
||||
|
||||
# Check if we were interrupted and have a pending message
|
||||
result = result_holder[0]
|
||||
adapter = self.adapters.get(source.platform)
|
||||
|
||||
Reference in New Issue
Block a user