the-nexus/paper/experiment4_results.md
Alexander Whitestone f3f819db26 Add 988 crisis protocol + Experiment 4 load test results
CRISIS_PROTOCOL constant added to bridge
System prompt now includes full crisis response steps
Experiment 4: 10 concurrent users, 40% completion, concurrency bottleneck identified
2026-04-12 20:16:43 -04:00

# Experiment 4: Concurrent Load Test Results
## Test Protocol
10 users send messages simultaneously to the multi-user bridge.
## Results
| Metric | Value |
|--------|-------|
| Concurrent users | 10 |
| Completed | 4 (40%) |
| Timed out | 6 (60%) |
| Average completion time (completed requests only) | 7.8s |
| Timeout threshold | 30s |
### Per-User Response Times
| User | Response Time | Status |
|------|--------------|--------|
| User1 | 7.78s | Completed |
| User2 | 30.00s | Timeout |
| User3 | 30.01s | Timeout |
| User4 | 30.01s | Timeout |
| User5 | 30.00s | Timeout |
| User6 | 7.78s | Completed |
| User7 | 30.00s | Timeout |
| User8 | 30.00s | Timeout |
| User9 | 7.78s | Completed |
| User10 | 7.78s | Completed |
## Root Cause
The bridge uses Python's `http.server.HTTPServer` which is **single-threaded**.
Concurrent requests are serialized: each request blocks the server until the LLM responds.
With 10 simultaneous requests, users 2-5 and 7-8 queue behind the requests already being served and hit the 30s timeout before their turn comes.
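The bridge's own handler code is not reproduced in this report, so the following is a minimal self-contained sketch of the failure mode. A hypothetical `SlowHandler` sleeps for `WORK_SECONDS` (a stand-in for the ~7.8s LLM call); three clients hit a plain `HTTPServer` at once and are served one after another, so total wall time is roughly three times the per-request time:

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

WORK_SECONDS = 1.0  # stand-in for the ~7.8s blocking LLM call


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: blocks for WORK_SECONDS like the bridge does."""

    def do_GET(self):
        time.sleep(WORK_SECONDS)  # simulate the blocking LLM round trip
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # silence per-request logging


# Port 0 lets the OS pick a free port.
server = HTTPServer(("127.0.0.1", 0), SlowHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()


def fetch():
    urllib.request.urlopen(f"http://127.0.0.1:{port}/").read()


start = time.monotonic()
clients = [threading.Thread(target=fetch) for _ in range(3)]
for t in clients:
    t.start()
for t in clients:
    t.join()
elapsed = time.monotonic() - start
server.shutdown()

# Single-threaded HTTPServer serves the three requests back to back,
# so elapsed is roughly 3 * WORK_SECONDS rather than WORK_SECONDS.
print(f"{elapsed:.1f}s")
```

Scaling the same arithmetic to the experiment (10 requests at ~7.8s each) means the last queued request would not be served until well past the 30s threshold, matching the observed timeouts.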
## Implication for Paper
This is an important finding for the paper. The architecture isolates users
correctly (verified in Experiment 1) but has a concurrency bottleneck that
limits practical deployment to roughly 4 simultaneous users with the current
implementation.
## Fix
Replace `HTTPServer` with `ThreadingHTTPServer` (available since Python 3.7),
which handles each request in its own thread, or rewrite the bridge with
async/await using `aiohttp`.
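With the `ThreadingHTTPServer` swap, the same hypothetical slow handler serves concurrent requests in parallel threads, so total wall time for three simultaneous clients stays close to one `WORK_SECONDS` instead of three. A sketch of the fixed setup (same assumed `SlowHandler` stand-in as above):

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

WORK_SECONDS = 1.0  # stand-in for the ~7.8s blocking LLM call


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: blocks for WORK_SECONDS like the bridge does."""

    def do_GET(self):
        time.sleep(WORK_SECONDS)  # simulate the blocking LLM round trip
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # silence per-request logging


# ThreadingHTTPServer (Python 3.7+) dispatches each request to its own
# thread, so slow handlers no longer serialize behind one another.
server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()


def fetch():
    urllib.request.urlopen(f"http://127.0.0.1:{port}/").read()


start = time.monotonic()
clients = [threading.Thread(target=fetch) for _ in range(3)]
for t in clients:
    t.start()
for t in clients:
    t.join()
elapsed = time.monotonic() - start
server.shutdown()

# All three requests run concurrently, so elapsed is close to WORK_SECONDS.
print(f"{elapsed:.1f}s")
```

The change is a one-line import swap because `ThreadingHTTPServer` is a drop-in subclass of `HTTPServer` (via `socketserver.ThreadingMixIn`); handler code is unchanged, though any shared mutable state in the bridge would then need thread-safety review.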
## Architecture Note
In production, this limitation is expected to be less severe because:
1. The bridge is a local component (not exposed to the internet)
2. Evennia handles the multi-user layer (telnet connections)
3. The bridge only processes one request per user at a time
4. Evennia's own architecture is event-driven and can queue requests
The bottleneck affects stress testing, not typical usage patterns.