- Add max_concurrent_tasks config (default 8) with semaphore in TB2 eval - Pass cwd: /app via register_task_env_overrides for TB2 tasks - Add /home/ to host path prefixes as safety net for container backends When all 86 TerminalBench2 tasks fire simultaneously, each creates a Modal sandbox via asyncio.run() inside a thread pool worker. Modal's blocking calls deadlock when too many are created at once. The semaphore ensures max 8 concurrent creations. Co-Authored-By: hermes-agent[bot] <hermes-agent[bot]@users.noreply.github.com>
1.7 KiB
1.7 KiB