P0: Verify Inactivity-Based Timeouts (Gateway + Cron) #115

Timmy · 2026-04-06T14:06:53Z

Timmy commented

2026-04-06 14:06:53 +00:00

Context

Commits fec58ad9 (gateway) and d6ef7fdf (cron) replace wall-clock timeouts with inactivity-based timeouts:

Gateway: the agent stays alive while actively processing (even for hours) and times out only on genuine inactivity
Cron: same pattern; HERMES_CRON_TIMEOUT=0 means unlimited
Diagnostic info is included on timeout: last activity, idle duration, current tool, iteration count
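
To make the difference from a wall-clock timeout concrete, here is a minimal sketch of an inactivity check. It is not the code from either commit: `ActivityTracker`, `touch()`, `idle_seconds()` and `is_timed_out()` are hypothetical names used only for illustration. The one behavior taken from this issue is that a limit of 0 means unlimited.

```python
import time


class ActivityTracker:
    """Records the most recent sign of progress (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the logic can be tested without waiting.
        self._clock = clock
        self._last_activity = clock()
        self.last_description = "started"

    def touch(self, description):
        """Call whenever the agent makes progress, e.g. a tool emits output."""
        self._last_activity = self._clock()
        self.last_description = description

    def idle_seconds(self):
        """Seconds elapsed since the last recorded activity."""
        return self._clock() - self._last_activity


def is_timed_out(tracker, timeout_seconds):
    """Inactivity check; 0 disables the limit, like HERMES_CRON_TIMEOUT=0."""
    if timeout_seconds == 0:
        return False
    return tracker.idle_seconds() >= timeout_seconds
```

Under a wall-clock scheme the comparison would be against time since start. Here every `touch()` resets the idle counter, so a job that keeps reporting progress can run well past the limit.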

Acceptance Criteria

Verify gateway timeout behavior: Start a long-running tool call (e.g., sleep 30 via the terminal tool) and confirm the gateway does NOT time out while the tool is actively running
Verify cron timeout behavior: Run a cron job that actively processes for >600s and confirm it does not time out (the old wall-clock timeout would have killed it; the inactivity-based timeout lets it survive)
Verify diagnostic timeout message: Trigger a timeout (set HERMES_CRON_TIMEOUT=5 for testing) and confirm the error includes the last activity description, idle duration, current tool, and iteration count
Verify unlimited mode: Set HERMES_CRON_TIMEOUT=0, run a 15-minute task, and confirm it completes without timing out
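
When checking the last two criteria, it may help to have a reference for what is being asserted. The sketch below is an assumption about shape, not the actual Hermes code: `resolve_cron_timeout()` and `format_timeout_diagnostic()` are hypothetical helpers, the summary keys are invented for illustration, and the 600s default is only a guess based on the old wall-clock limit. The parts taken from this issue are the variable name `HERMES_CRON_TIMEOUT`, the meaning of 0, and the four diagnostic fields.

```python
import os


def resolve_cron_timeout(environ=os.environ, default=600.0):
    """Read HERMES_CRON_TIMEOUT in seconds; 0 means unlimited.

    The default value is an assumption, not confirmed by the commits.
    """
    raw = environ.get("HERMES_CRON_TIMEOUT")
    if raw is None or raw.strip() == "":
        return default
    value = float(raw)
    if value < 0:
        raise ValueError("HERMES_CRON_TIMEOUT must be >= 0")
    return value


def format_timeout_diagnostic(summary, timeout_seconds):
    """Build an error message carrying the four fields named in this issue.

    The dictionary keys are hypothetical; the real summary may differ.
    """
    return (
        f"Inactivity timeout after {timeout_seconds:g}s: "
        f"last activity={summary['last_activity']!r}, "
        f"idle={summary['idle_seconds']:.1f}s, "
        f"current tool={summary['current_tool']!r}, "
        f"iteration={summary['iteration']}"
    )
```

A verification script could then assert that each of the four fields appears in the captured error text, rather than comparing against an exact string.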

Why This Matters

Agents in our burn-mode fleet (scouts, fleet auditors, CI monitors) were dying at exactly 600s, even while actively working, and their partial work was silently lost. With inactivity-based timeouts, agents run until they actually stall, not until a wall clock expires.

Hints

Gateway timeout: gateway/run.py — look for _check_inactivity_timeout()
Cron timeout: cron/scheduler.py — polling loop with agent.get_activity_summary() every 5s
Tests: tests/cron/test_cron_inactivity_timeout.py (289 lines, 9 test scenarios)
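
For orientation before reading cron/scheduler.py, the polling pattern described in the hints can be sketched roughly as follows. This is a guess at the general shape only: `run_with_inactivity_timeout()` is a hypothetical name, running the job in a worker thread is an assumption, and the `idle_seconds` key is invented, since the real return value of `agent.get_activity_summary()` is not shown in this issue. The 5-second poll interval and the meaning of 0 come from the text above.

```python
import concurrent.futures
import time


def run_with_inactivity_timeout(agent, job, timeout_seconds,
                                poll_interval=5.0, sleep=time.sleep):
    """Run job() in a worker thread; give up only if the agent goes idle."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(job)
    try:
        while not future.done():
            sleep(poll_interval)
            summary = agent.get_activity_summary()
            idle = summary["idle_seconds"]  # hypothetical field name
            # A timeout of 0 is falsy, so the check is skipped (unlimited).
            if timeout_seconds and idle >= timeout_seconds:
                raise TimeoutError(f"agent idle for {idle:.0f}s: {summary}")
        return future.result()
    finally:
        # Do not block on the worker; on timeout it may still be running.
        executor.shutdown(wait=False)
```

Python threads cannot be forcibly stopped, so on timeout this sketch only stops waiting for the job. Actual cancellation would need cooperation from the agent or a separate process.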

Parent: #111

allegro commented

2026-04-06 16:30:13 +00:00

🏷️ Automated Triage Check

Timestamp: 2026-04-06T16:30:13.327338
Agent: Allegro Heartbeat

This issue has been identified as needing triage:

Checklist

Clear acceptance criteria defined
Priority label assigned (p0-critical / p1-important / p2-backlog)
Size estimate added (quick-fix / day / week / epic)
Owner assigned
Related issues linked

Context

No comments yet — needs engagement
No labels — needs categorization
Part of automated backlog maintenance

Automated triage from Allegro 15-minute heartbeat

Timmy referenced this issue

2026-04-06 22:44:21 +00:00

[EZRA] Fix agent timeout — Ezra chokes on long-running operations #153


Reference: Timmy_Foundation/hermes-agent#115