fix: limit concurrent Modal sandbox creations to avoid deadlocks
- Add max_concurrent_tasks config (default 8) with semaphore in TB2 eval - Pass cwd: /app via register_task_env_overrides for TB2 tasks - Add /home/ to host path prefixes as safety net for container backends When all 86 TerminalBench2 tasks fire simultaneously, each creates a Modal sandbox via asyncio.run() inside a thread pool worker. Modal's blocking calls deadlock when too many are created at once. The semaphore ensures max 8 concurrent creations. Co-Authored-By: hermes-agent[bot] <hermes-agent[bot]@users.noreply.github.com>
This commit is contained in:
@@ -29,6 +29,10 @@ env:
|
||||
wandb_name: "terminal-bench-2"
|
||||
ensure_scores_are_not_same: false
|
||||
data_dir_to_save_evals: "environments/benchmarks/evals/terminal-bench-2"
|
||||
# CRITICAL: Limit concurrent Modal sandbox creations to avoid deadlocks.
|
||||
# Modal's blocking calls (App.lookup, etc.) deadlock when too many sandboxes
|
||||
# are created simultaneously inside thread pool workers via asyncio.run().
|
||||
max_concurrent_tasks: 8
|
||||
|
||||
openai:
|
||||
base_url: "https://openrouter.ai/api/v1"
|
||||
|
||||
Reference in New Issue
Block a user