[claude] fix SSE stream registry race condition at 60-second timeout boundary (#16) #56

Merged
Rockachopa merged 2 commits from claude/issue-16 into main 2026-03-23 14:52:55 +00:00

2 Commits

Author SHA1 Message Date
Alexander Whitestone
d6ab748943 fix: handle SSE stream registry race condition at timeout boundary
Some checks failed
CI / Typecheck & Lint (pull_request) Failing after 0s
When stub mode (or very fast work) completes before the SSE client
attaches to the stream, streamRegistry.get() returns null because
end() was already called. Previously this fell through to a generic
timeout error even though the job succeeded.

Changes:
- Add DB polling fallback (2s interval, 120s max) when stream is null
  but job state is "executing" — waits for terminal state then replays
  the result via token+done SSE events
- Add unit tests covering the race condition: instant completion before
  client attach, normal live streaming, and the DB replay fallback path

The 90s bus-wait timeout and post-wait DB re-check were already in place.
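
For illustration, the polling fallback described above might look roughly like this TypeScript sketch. Only the 2s interval, the 120s cap, and the token+done replay come from the commit message. The names `JobRow`, `SseEvent`, `pollUntilTerminal`, `replayEvents`, and the `fetchJob` callback are hypothetical, and the way failures and timeouts are reported is an assumption rather than the repository's actual behavior.

```typescript
// Hypothetical shapes: the real job model and DB accessor are not shown in this PR.
type JobState = "executing" | "complete" | "failed";

interface JobRow {
  state: JobState;
  result?: string;
  error?: string;
}

interface SseEvent {
  event: "token" | "done" | "error";
  data: string;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Polls `fetchJob` until the job leaves "executing" or `maxMs` elapses.
 * Resolves with the terminal row, or null if the deadline passes first.
 */
async function pollUntilTerminal(
  fetchJob: () => Promise<JobRow>,
  intervalMs = 2_000, // 2s interval, per the commit message
  maxMs = 120_000, // 120s cap, per the commit message
): Promise<JobRow | null> {
  const deadline = Date.now() + maxMs;
  for (;;) {
    const job = await fetchJob();
    if (job.state !== "executing") return job;
    // Give up if the next poll would land past the deadline.
    if (Date.now() + intervalMs > deadline) return null;
    await sleep(intervalMs);
  }
}

/** Turns a terminal row (or a timeout) into the SSE events to replay. */
function replayEvents(job: JobRow | null): SseEvent[] {
  // Assumption: a timeout is still surfaced as an error event.
  if (job === null) return [{ event: "error", data: "Stream timed out" }];
  // Assumption: a failed job is reported as an error event.
  if (job.state === "failed") {
    return [{ event: "error", data: job.error ?? "Job failed" }];
  }
  // Successful job: replay the whole result as one token, then signal done.
  return [
    { event: "token", data: job.result ?? "" },
    { event: "done", data: "" },
  ];
}
```

In this sketch the whole stored result is replayed as a single token event, so a late client sees the same final text as a live one, just without incremental delivery.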

Fixes #16

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 21:57:36 -04:00
Alexander Whitestone
06d0d6220f fix: resolve SSE stream registry race condition at completion boundary
Some checks failed
CI / Typecheck & Lint (pull_request) Failing after 1s
Fixes #16

1. stream-registry: Don't delete stream from map in end() — let the
   "close" event handle cleanup after consumers drain buffered data.
   This prevents the race where a late-attaching SSE client calls get()
   after end() but before reading buffered tokens.

2. stream-registry: Add hasEnded() method to check if a stream's
   writable side has ended (used for diagnostics).

3. jobs SSE endpoint: When job is "executing" but stream slot is gone
   (ended before client attached), poll DB every 2s (max 120s) until
   the job completes, then replay the full result. Previously this
   case returned a "Stream timed out" error.

4. Timeout was already 90s (updated in prior work); corrected the
   docstring comment, which still said 60s.

5. Added unit test covering the race: simulates instant stub completion
   (write + end before consumer attaches) and verifies buffered data
   is still readable by a late consumer.
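
As a rough sketch of items 1, 2, and 5 above, a registry built on Node's `PassThrough` stream could leave the entry in the map when `end()` is called and remove it only on the stream's "close" event. The `StreamRegistry` class, its `register()` and `write()` methods, and the job ID keys are hypothetical. Only `get()`, `end()`, and `hasEnded()` are named in the commit message, and the real implementation may differ.

```typescript
import { PassThrough } from "node:stream";

// Hypothetical registry: a minimal illustration, not the repository's code.
class StreamRegistry {
  private readonly streams = new Map<string, PassThrough>();

  register(jobId: string): PassThrough {
    const stream = new PassThrough();
    // Cleanup is tied to "close", which a PassThrough emits only after the
    // writable side has finished AND the readable side has been fully consumed.
    stream.once("close", () => {
      this.streams.delete(jobId);
    });
    this.streams.set(jobId, stream);
    return stream;
  }

  get(jobId: string): PassThrough | null {
    return this.streams.get(jobId) ?? null;
  }

  write(jobId: string, chunk: string): void {
    this.streams.get(jobId)?.write(chunk);
  }

  end(jobId: string): void {
    // Deliberately no map deletion here: a late consumer must still be
    // able to find the stream and read whatever is buffered.
    this.streams.get(jobId)?.end();
  }

  hasEnded(jobId: string): boolean {
    // True once end() has been called on the writable side.
    return this.streams.get(jobId)?.writableEnded ?? false;
  }
}

// Reads every buffered chunk from a stream and joins it into one string.
async function readAll(stream: PassThrough): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.toString();
  }
  return text;
}
```

A late consumer in this sketch would call `get()` after `end()`, still receive the stream, and drain it with `readAll()`. The entry disappears from the map once the stream closes.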

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 21:11:52 -04:00