Two remaining CI failures:
1. agent-client-protocol 0.9.0 removed AuthMethod (replaced with
AuthMethodAgent/EnvVar/Terminal). Pin to <0.9 until the new API
is evaluated — our usage doesn't map 1:1 to the new types.
2. test_429_exhausts_all_retries_before_raising expected pytest.raises,
but the agent now catches 429s after max retries, tries the fallback,
and then returns a result dict. Updated the test to check
final_response instead.
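As an illustration of item 2, the updated assertion could look roughly
like the sketch below. FakeAgent, RateLimitError, and the text placed in
final_response are hypothetical stand-ins, not the real agent or test;
only the test name and the final_response key come from this change.

```python
class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""
    status_code = 429


class FakeAgent:
    """Hypothetical agent mimicking the new behaviour: retry, then
    return a result dict instead of raising."""

    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self.attempts = 0

    def _call_provider(self):
        self.attempts += 1
        raise RateLimitError("rate limited")

    def run_conversation(self, prompt):
        last_error = None
        for _ in range(self.max_retries):
            try:
                return {"final_response": self._call_provider()}
            except RateLimitError as exc:
                last_error = exc
        # The fallback attempt is omitted; assume it fails as well.
        # The message format below is an assumption for this sketch.
        text = f"API call failed after {self.attempts} attempts: {last_error}"
        return {"final_response": text}


def test_429_exhausts_all_retries_before_raising():
    agent = FakeAgent(max_retries=3)
    result = agent.run_conversation("hello")
    # No pytest.raises: the agent returns a dict once retries run out.
    assert agent.attempts == 3
    assert result["final_response"] is not None
    assert "rate limited" in result["final_response"]
```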
The non-streaming API call path (_interruptible_api_call) had no
wall-clock timeout. When providers keep connections alive with SSE
keep-alive pings but never deliver a response, httpx's inactivity
timeout never fires and the call hangs indefinitely.

Subagents always used the non-streaming path because they have no
stream consumers (quiet_mode=True). This caused delegate_task to
hang for 40+ minutes in production.
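The failure mode can be reproduced on paper with a small simulation.
This is only a sketch: first_timeout is a hypothetical helper, and the
15-second ping interval, 60-second inactivity timeout, and 300-second
wall-clock limit are assumed values, not settings from the code base.

```python
def first_timeout(ping_times, observe_until, inactivity_timeout,
                  wall_clock_timeout=None):
    """Return (kind, seconds) for the first timeout that fires, or None.

    ping_times: seconds since the request started at which any bytes,
    including keep-alive pings, arrived on the connection.
    observe_until: when we stop watching the (still open) connection.
    """
    # A wall-clock deadline is measured from the start of the request.
    deadline = float("inf") if wall_clock_timeout is None else wall_clock_timeout
    last_activity = 0.0
    for t in list(ping_times) + [observe_until]:
        # The inactivity deadline restarts every time bytes arrive.
        idle_deadline = last_activity + inactivity_timeout
        fires_at = min(deadline, idle_deadline)
        if t > fires_at:
            kind = "wall_clock" if deadline <= idle_deadline else "inactivity"
            return kind, fires_at
        last_activity = t
    return None


# Keep-alive pings every 15 s for 40 minutes, with no real response.
pings = range(15, 2400, 15)
only_inactivity = first_timeout(pings, 2400, inactivity_timeout=60)
with_wall_clock = first_timeout(pings, 2400, inactivity_timeout=60,
                                wall_clock_timeout=300)
```

With pings arriving, the inactivity-only case never times out during
the 40 minutes, while the wall-clock case fires at 300 seconds. A fully
silent connection would trip the inactivity timeout as expected.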
The streaming path has two layers of protection:
- httpx read timeout (60s, HERMES_STREAM_READ_TIMEOUT)
- Stale stream detection (90s, HERMES_STREAM_STALE_TIMEOUT)

Both work because streaming sends chunks continuously: a 90-second
gap between chunks genuinely means the connection is broken, even for
reasoning models that take minutes to complete.
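A minimal sketch of gap-based stale detection, to show why a long total
runtime is fine while a long gap between chunks is not. consume_stream
and StaleStreamError are hypothetical names, and the queue-plus-thread
design is an assumption, not the actual implementation.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


class StaleStreamError(TimeoutError):
    """Raised when no chunk arrives within the stale timeout."""


def consume_stream(chunks, stale_timeout):
    """Collect chunks from an iterator, failing if the gap between two
    consecutive chunks exceeds stale_timeout seconds."""
    q = queue.Queue()

    def pump():
        # Runs in the background so the consumer can time out on gaps.
        for chunk in chunks:
            q.put(chunk)
        q.put(_DONE)

    threading.Thread(target=pump, daemon=True).start()
    received = []
    while True:
        try:
            # The timeout applies per chunk, not to the whole response.
            item = q.get(timeout=stale_timeout)
        except queue.Empty:
            raise StaleStreamError(
                f"no chunk received for {stale_timeout}s") from None
        if item is _DONE:
            return received
        received.append(item)
```

A stream that takes longer than stale_timeout in total still completes
as long as each individual gap stays below the limit; a stream that
stalls after its first chunk raises StaleStreamError.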
Now run_conversation() always prefers the streaming path. The streaming
method falls back to non-streaming automatically if the provider
doesn't support streaming. Stream delta callbacks are no-ops when no
consumers are registered, so there's no overhead for subagents.
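The prefer-streaming-then-fall-back flow can be sketched as follows.
call_model, StreamingNotSupported, and the client's stream()/complete()
methods are hypothetical names chosen for this example; the real method
names and error types may differ.

```python
class StreamingNotSupported(Exception):
    """Hypothetical error raised by a client that cannot stream."""


def call_model(client, messages, on_delta=None):
    """Try streaming first; fall back to a single non-streaming call."""
    # With no consumer registered (e.g. a subagent), deltas go nowhere.
    emit = on_delta if on_delta is not None else (lambda text: None)
    try:
        parts = []
        for delta in client.stream(messages):
            emit(delta)
            parts.append(delta)
        return "".join(parts)
    except StreamingNotSupported:
        # The provider rejected streaming, so use the plain call path.
        return client.complete(messages)


class StreamClient:
    """Fake provider client that supports streaming."""

    def stream(self, messages):
        yield "Hel"
        yield "lo"

    def complete(self, messages):
        raise AssertionError("non-streaming path should not be used")


class PlainClient:
    """Fake provider client without streaming support."""

    def stream(self, messages):
        raise StreamingNotSupported()

    def complete(self, messages):
        return "Hello"
```

Both fake clients yield the same final text; only StreamClient invokes
the delta callback along the way.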
- 429 rate limit and 529 overloaded were incorrectly treated as
non-retryable client errors, causing immediate failure instead of
exponential backoff retry. Users hitting Anthropic rate limits got
silent failures or no response at all.
- The generic "Sorry, I encountered an unexpected error" message now
includes the error type, details, and status-specific hints (auth,
rate limit, overloaded).
- A failed agent run with final_response=None now surfaces the actual
error message instead of returning an empty response.
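The three fixes above can be sketched together. Every name here
(is_retryable, backoff_delay, describe_error, final_text), the hint
wording, the "error" result key, the backoff parameters, and the policy
for statuses other than 429 and 529 are assumptions made for this
example, not the actual code.

```python
def is_retryable(status_code):
    """Decide whether a failed request should be retried."""
    if status_code in (429, 529):  # rate limited, overloaded
        return True
    # Assumption: other 4xx are permanent, other 5xx are transient.
    return status_code >= 500


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff without jitter: base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


# Hypothetical hint text keyed by HTTP status.
HINTS = {
    401: "check your API key",
    429: "rate limited; wait and retry",
    529: "provider overloaded; retry shortly",
}


def describe_error(exc, status_code=None):
    """Build a user-facing message with error type, details, and hint."""
    message = (f"Sorry, I encountered an unexpected error "
               f"({type(exc).__name__}): {exc}")
    hint = HINTS.get(status_code)
    return f"{message} Hint: {hint}." if hint else message


def final_text(result):
    """Prefer the agent's response; otherwise surface its error."""
    if result.get("final_response") is not None:
        return result["final_response"]
    return result.get("error") or "The agent failed without reporting an error."
```

Keeping the retry decision separate from message formatting means a
429 can be retried silently first and only described to the user once
retries are exhausted.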