Salvage of PR #1321 by @alireza78a (cherry-picked concept, reimplemented
against current main).
Phase 1 — Pre-call message sanitization:
_sanitize_api_messages() now runs unconditionally before every LLM call.
Previously gated on context_compressor being present, so sessions loaded
from disk or running without compression could accumulate dangling
tool_call/tool_result pairs causing API errors.
Phase 2a — Delegate task cap:
_cap_delegate_task_calls() truncates excess delegate_task calls per turn
to MAX_CONCURRENT_CHILDREN. The existing cap in delegate_tool.py only
limits the task array within a single call; this catches multiple
separate delegate_task tool_calls in one turn.
Phase 2b — Tool call deduplication:
_deduplicate_tool_calls() drops duplicate (tool_name, arguments) pairs
within a single turn when models stutter.
All three are static methods on AIAgent, independently testable.
29 tests covering happy paths and edge cases.