fix(#1356): replace HTTPServer with ThreadingHTTPServer in multi-user bridge #1366

Closed
Rockachopa wants to merge 1 commits from fix/issue-1356-threading-http-server into main
Owner

Fix for #1356

The multi-user bridge uses Python's single-threaded HTTPServer, which processes requests sequentially. Each LLM call blocks for 5-10 seconds, queuing all other users. Under 10 concurrent requests, 60% timeout at 30s.

Changes

  1. multi_user_bridge.py line 2883: Changed HTTPServer(...)ThreadingHTTPServer(...) (class already defined at line 37)

  2. world/multi_user_bridge.py: Added ThreadingMixIn import and ThreadingHTTPServer class, updated server instantiation

Impact

  • Before: 4/10 users complete (40%), 7.8s avg
  • After: expected 10/10 users (100%), ~5s avg per paper Experiment 4

Refs #1356

## Fix for #1356 The multi-user bridge uses Python's single-threaded `HTTPServer`, which processes requests sequentially. Each LLM call blocks for 5-10 seconds, queuing all other users. Under 10 concurrent requests, 60% timeout at 30s. ### Changes 1. **`multi_user_bridge.py` line 2883**: Changed `HTTPServer(...)` → `ThreadingHTTPServer(...)` (class already defined at line 37) 2. **`world/multi_user_bridge.py`**: Added `ThreadingMixIn` import and `ThreadingHTTPServer` class, updated server instantiation ### Impact - Before: 4/10 users complete (40%), 7.8s avg - After: expected 10/10 users (100%), ~5s avg per paper Experiment 4 Refs #1356
Rockachopa added 1 commit 2026-04-13 21:24:14 +00:00
fix(#1356): replace HTTPServer with ThreadingHTTPServer in multi-user bridge
Some checks failed
CI / test (pull_request) Failing after 2s
Review Approval Gate / verify-review (pull_request) Failing after 8s
CI / validate (pull_request) Failing after 40s
93b8475405
The single-threaded HTTPServer queues requests sequentially, causing
60% timeout under 10 concurrent users (30s timeout, 7.8s avg).

Fix: use ThreadingHTTPServer (already defined in multi_user_bridge.py)
for thread-per-request concurrency.

- multi_user_bridge.py: line 2883 HTTPServer -> ThreadingHTTPServer
- world/multi_user_bridge.py: add ThreadingMixIn import + class, fix server init

Refs #1356
Rockachopa requested review from perplexity 2026-04-13 21:24:15 +00:00
Timmy approved these changes 2026-04-13 22:08:56 +00:00
Dismissed
Timmy left a comment
Owner

Review: Replace HTTPServer with ThreadingHTTPServer

Correct fix for a real concurrency problem. When LLM calls block for 5-10 seconds on a single-threaded server, all other users queue up. Threading is the right approach here.

Changes

  1. multi_user_bridge.py (root) — Single-line change: HTTPServer(...) to ThreadingHTTPServer(...). The ThreadingHTTPServer class is already defined at line 37 of this file, so this just uses it.

  2. world/multi_user_bridge.py — Adds the ThreadingMixIn import, defines ThreadingHTTPServer class with daemon_threads = True, and updates the server instantiation.

Analysis

  • daemon_threads = True is the right choice — threads die when the main process exits, preventing zombie threads on shutdown.
  • The ThreadingMixIn must come before HTTPServer in the MRO (it does: class ThreadingHTTPServer(ThreadingMixIn, HTTPServer)), so the method resolution is correct.
  • Thread safety concern: If BridgeHandler accesses shared mutable state (global dicts, lists, counters), those accesses now need synchronization (locks). The diff does not show any locking changes. Verify that BridgeHandler.do_POST/do_GET methods are thread-safe. Common issues include concurrent writes to shared user registries, room state, or chat logs. If ChatLog already uses locks (the test suite in PR #1369 tests thread safety), that part may be fine, but other shared state in the handler needs auditing.

Verdict

The threading change itself is correct and necessary. The risk is in shared state that was previously safe under single-threaded execution. If the handler already uses locks or thread-safe data structures, this is good to merge. If not, this change could introduce race conditions that are harder to debug than the original timeout issue.

Approving because the fix is directionally correct and the single-threaded server is a clear bottleneck, but please audit shared mutable state in BridgeHandler for thread safety.

## Review: Replace HTTPServer with ThreadingHTTPServer Correct fix for a real concurrency problem. When LLM calls block for 5-10 seconds on a single-threaded server, all other users queue up. Threading is the right approach here. ### Changes 1. **`multi_user_bridge.py` (root)** — Single-line change: `HTTPServer(...)` to `ThreadingHTTPServer(...)`. The `ThreadingHTTPServer` class is already defined at line 37 of this file, so this just uses it. 2. **`world/multi_user_bridge.py`** — Adds the `ThreadingMixIn` import, defines `ThreadingHTTPServer` class with `daemon_threads = True`, and updates the server instantiation. ### Analysis - **`daemon_threads = True`** is the right choice — threads die when the main process exits, preventing zombie threads on shutdown. - The `ThreadingMixIn` must come **before** `HTTPServer` in the MRO (it does: `class ThreadingHTTPServer(ThreadingMixIn, HTTPServer)`), so the method resolution is correct. - Thread safety concern: If `BridgeHandler` accesses shared mutable state (global dicts, lists, counters), those accesses now need synchronization (locks). The diff does not show any locking changes. **Verify that `BridgeHandler.do_POST`/`do_GET` methods are thread-safe.** Common issues include concurrent writes to shared user registries, room state, or chat logs. If `ChatLog` already uses locks (the test suite in PR #1369 tests thread safety), that part may be fine, but other shared state in the handler needs auditing. ### Verdict The threading change itself is correct and necessary. The risk is in shared state that was previously safe under single-threaded execution. If the handler already uses locks or thread-safe data structures, this is good to merge. If not, this change could introduce race conditions that are harder to debug than the original timeout issue. Approving because the fix is directionally correct and the single-threaded server is a clear bottleneck, but please audit shared mutable state in `BridgeHandler` for thread safety.
Timmy approved these changes 2026-04-14 00:10:44 +00:00
Timmy left a comment
Owner

Good improvement: switching to threaded HTTP server to prevent request blocking.

The change from HTTPServer to ThreadingHTTPServer is correct and well-implemented. Using ThreadingMixIn with daemon_threads=True is the right pattern — daemon threads ensure the server shuts down cleanly without waiting for in-flight requests. The class is properly defined in world/multi_user_bridge.py with the import of ThreadingMixIn from socketserver.

One concern: both multi_user_bridge.py (root) and world/multi_user_bridge.py are modified, but only world/multi_user_bridge.py defines the ThreadingHTTPServer class and imports ThreadingMixIn. The root multi_user_bridge.py references ThreadingHTTPServer at line 2883 but does not appear to import or define it in this diff. If the root file does not already have a matching import/class definition elsewhere, this will crash with a NameError at startup. Please verify that the root multi_user_bridge.py has the ThreadingHTTPServer class available in scope.

Also note: thread-per-request servers can have issues with shared mutable state. If BridgeHandler accesses any global data structures, ensure they are protected with locks. This is especially relevant now that requests will be handled concurrently.

**Good improvement: switching to threaded HTTP server to prevent request blocking.** The change from HTTPServer to ThreadingHTTPServer is correct and well-implemented. Using ThreadingMixIn with daemon_threads=True is the right pattern — daemon threads ensure the server shuts down cleanly without waiting for in-flight requests. The class is properly defined in world/multi_user_bridge.py with the import of ThreadingMixIn from socketserver. One concern: both multi_user_bridge.py (root) and world/multi_user_bridge.py are modified, but only world/multi_user_bridge.py defines the ThreadingHTTPServer class and imports ThreadingMixIn. The root multi_user_bridge.py references ThreadingHTTPServer at line 2883 but does not appear to import or define it in this diff. If the root file does not already have a matching import/class definition elsewhere, this will crash with a NameError at startup. Please verify that the root multi_user_bridge.py has the ThreadingHTTPServer class available in scope. Also note: thread-per-request servers can have issues with shared mutable state. If BridgeHandler accesses any global data structures, ensure they are protected with locks. This is especially relevant now that requests will be handled concurrently.
Author
Owner

Closing: dup of 1389 for #1356. Triage by KETER — fleet-ops#143.

Closing: dup of 1389 for #1356. Triage by KETER — fleet-ops#143.
Rockachopa closed this pull request 2026-04-14 00:30:48 +00:00
Some checks failed
CI / test (pull_request) Failing after 2s
Review Approval Gate / verify-review (pull_request) Failing after 8s
CI / validate (pull_request) Failing after 40s

Pull request closed

Sign in to join this conversation.
No Reviewers
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: Timmy_Foundation/the-nexus#1366