base.py's _keep_typing refresh loop calls send_typing every ~2s while
the agent is processing. If signal-cli returns NETWORK_FAILURE for the
recipient (offline, unroutable, group membership lost), the unmitigated
path was a WARNING log every 2 seconds for as long as the agent stayed
busy — a user report showed 1048 warnings in 41 minutes for one
offline contact, plus the matching volume of pointless RPC traffic to
signal-cli.
- _rpc() accepts log_failures=False so callers can route repeated
expected failures (typing) to DEBUG while keeping send/receive at
WARNING.
- send_typing() tracks consecutive failures per chat. First failure
still logs WARNING so transport issues remain visible; subsequent
failures log at DEBUG. After three consecutive failures we skip the
RPC during an exponential cooldown (16s, 32s, 60s cap) so we stop
hammering signal-cli for a recipient it can't deliver to. A
successful sendTyping resets the counters.
- _stop_typing_indicator() clears the backoff state so the next agent
turn starts fresh.
E2E simulation against the reported 41-minute window: RPCs drop from
1230 to 45 (-96%), log lines from 1048 WARNINGs to 1 WARNING + 44
DEBUGs.
Credits kshitijk4poor (#12056) for the _rpc log_failures kwarg idea;
the broader restructure in that PR (nested per-chat loop inside
send_typing) is avoided here in favour of stateful backoff that
preserves base.py's existing _keep_typing architecture.