Thorough code review found 5 issues across run_agent.py, cli.py, and gateway/:
1. CRITICAL — Gateway stream consumer task never started: stream_consumer_holder
was checked BEFORE run_sync populated it. Fixed with async polling pattern
(same as track_agent).
2. MEDIUM-HIGH — Streaming fallback after partial delivery caused double-response:
if streaming failed after some tokens were delivered, the fallback would
re-deliver the full response. Now tracks deltas_were_sent and only falls
back when no tokens reached consumers yet.
3. MEDIUM — Codex mode lost on_first_delta spinner callback: _run_codex_stream
now accepts on_first_delta parameter, fires it on first text delta. Passed
through from _interruptible_streaming_api_call via _codex_on_first_delta
instance attribute.
4. MEDIUM — CLI close-tag after-text bypassed tag filtering: text after a
reasoning close tag was sent directly to _emit_stream_text, skipping
open-tag detection. Now routes through _stream_delta for full filtering.
5. LOW — Removed 140 lines of dead code: old _streaming_api_call method
(superseded by _interruptible_streaming_api_call). Updated 13 tests in
test_run_agent.py and test_openai_client_lifecycle.py to use the new
method name and signature.
4573 tests passing.
Salvaged from PR #1470 by adavyas.
Core fix: Honcho tool calls in a multi-session gateway could route to
the wrong session because honcho_tools.py relied on process-global
state. Now threads session context through the call chain:
AIAgent._invoke_tool() → handle_function_call() → registry.dispatch()
→ handler **kw → _resolve_session_context()
Changes:
- Add _resolve_session_context() to prefer per-call context over globals
- Plumb honcho_manager + honcho_session_key through handle_function_call
- Add sync_honcho=False to run_conversation() for synthetic flush turns
- Pass honcho_session_key through gateway memory flush lifecycle
- Harden gateway PID detection when /proc cmdline is unreadable
- Make interrupt test scripts import-safe for pytest-xdist
- Wrap BibTeX examples in Jekyll raw blocks for docs build
- Fix thread-order-dependent assertion in client lifecycle test
- Expand Honcho docs: session isolation, lifecycle, routing internals
Dropped from original PR:
- Indentation change in _create_request_openai_client that would move
client creation inside the lock (causes unnecessary contention)
Co-authored-by: adavyas <adavyas@users.noreply.github.com>
Use per-request OpenAI clients inside _interruptible_api_call so interrupts and transport failures do not poison later retries. Also add closed-client detection/recreation for the shared client and regression tests covering retry and concurrency behavior.