refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (#2804)
Drop the mini-swe-agent git submodule. All terminal backends now use
hermes-agent's own environment implementations directly.
Docker backend:
- Inline the `docker run -d` container startup (was 15 lines in
minisweagent's DockerEnvironment). Our wrapper already handled
execute(), cleanup(), security hardening, volumes, and resource limits.
Modal backend:
- Import swe-rex's ModalDeployment directly instead of going through
minisweagent's 90-line passthrough wrapper.
- Bake the _AsyncWorker pattern (from environments/patches.py) directly
into ModalEnvironment for Atropos compatibility without monkey-patching.
Cleanup:
- Remove minisweagent_path.py (submodule path resolution helper)
- Remove submodule init/install from install.sh and setup-hermes.sh
- Remove mini-swe-agent from .gitmodules
- environments/patches.py is now a no-op (kept for backward compat)
- terminal_tool.py no longer does sys.path hacking for minisweagent
- mini_swe_runner.py guards imports (optional, for RL training only)
- Update all affected tests to mock the new direct subprocess calls
- Update README.md, CONTRIBUTING.md
No functionality change — all Docker, Modal, local, SSH, Singularity,
and Daytona backends behave identically. 6093 tests pass.
2026-03-24 07:30:25 -07:00
|
|
|
"""Docker execution environment for sandboxed command execution.
|
2026-02-21 22:31:43 -08:00
|
|
|
|
2026-03-24 07:30:25 -07:00
|
|
|
Security hardened (cap-drop ALL, no-new-privileges, PID limits),
|
fix(docker): remove --read-only and allow exec on /tmp for package installs
The Docker sandbox previously used --read-only on the root filesystem and
noexec on /tmp. This broke 30+ skills that need to install packages:
- npm install -g (codex, claude-code, mcporter, powerpoint)
- pip install (20+ mlops/media/productivity skills)
- apt install (minecraft-modpack-server, ml-paper-writing)
- Build tools that compile in /tmp (pip wheels, node-gyp)
The container is already fully isolated from the host. Industry standard
(E2B, Docker Sandboxes, OpenAI Codex) does not use --read-only — the
container itself is the security boundary.
Retained security hardening:
- --cap-drop ALL (zero capabilities)
- --security-opt no-new-privileges (no escalation)
- --pids-limit 256 (no fork bombs)
- Size-limited tmpfs for /tmp, /var/tmp, /run
- nosuid on all tmpfs mounts
- noexec on /var/tmp and /run (rarely need exec there)
- Resource limits (CPU, memory, disk)
- Ephemeral containers (destroyed after use)
Fixes #189.
2026-03-02 01:09:34 -08:00
|
|
|
configurable resource limits (CPU, memory, disk), and optional filesystem
|
|
|
|
|
persistence via bind mounts.
|
2026-02-23 02:11:33 -08:00
|
|
|
"""
|
|
|
|
|
|
|
|
|
|
import logging
|
2026-02-21 22:31:43 -08:00
|
|
|
import os
|
2026-03-17 02:34:25 -07:00
|
|
|
import re
|
fix: Docker backend fails when docker is not in PATH (macOS gateway)
On macOS, Docker Desktop installs the CLI to /usr/local/bin/docker, but
when Hermes runs as a gateway service (launchd) or in other non-login
contexts, /usr/local/bin is often not in PATH. This causes the Docker
requirements check to fail with 'No such file or directory: docker' even
though docker works fine from the user's terminal.
Add find_docker() helper that uses shutil.which() first, then probes
common Docker Desktop install paths on macOS (/usr/local/bin,
/opt/homebrew/bin, Docker.app bundle). The resolved path is cached and
passed to mini-swe-agent via its 'executable' parameter.
- tools/environments/docker.py: add find_docker(), use it in
_storage_opt_supported() and pass to _Docker(executable=...)
- tools/terminal_tool.py: use find_docker() in requirements check
- tests/tools/test_docker_find.py: 4 tests (PATH, fallback, not found, cache)
2877 tests pass.
2026-03-10 20:45:13 -07:00
|
|
|
import shutil
|
2026-02-21 22:31:43 -08:00
|
|
|
import subprocess
|
2026-02-25 22:31:05 -05:00
|
|
|
import sys
|
2026-02-23 02:11:33 -08:00
|
|
|
import threading
|
|
|
|
|
import time
|
2026-03-24 07:30:25 -07:00
|
|
|
import uuid
|
2026-02-23 02:11:33 -08:00
|
|
|
from typing import Optional
|
2026-02-21 22:31:43 -08:00
|
|
|
|
|
|
|
|
from tools.environments.base import BaseEnvironment
|
2026-02-23 02:11:33 -08:00
|
|
|
from tools.interrupt import is_interrupted
|
|
|
|
|
|
|
|
|
|
logger = logging.getLogger(__name__)
|
|
|
|
|
|
|
|
|
|
|
2026-03-10 20:45:13 -07:00
|
|
|
# Common Docker Desktop install paths checked when 'docker' is not in PATH.
|
|
|
|
|
# macOS Intel: /usr/local/bin, macOS Apple Silicon (Homebrew): /opt/homebrew/bin,
|
|
|
|
|
# Docker Desktop app bundle: /Applications/Docker.app/Contents/Resources/bin
|
|
|
|
|
_DOCKER_SEARCH_PATHS = [
|
|
|
|
|
"/usr/local/bin/docker",
|
|
|
|
|
"/opt/homebrew/bin/docker",
|
|
|
|
|
"/Applications/Docker.app/Contents/Resources/bin/docker",
|
|
|
|
|
]
|
|
|
|
|
|
|
|
|
|
_docker_executable: Optional[str] = None # resolved once, cached
|
2026-03-17 02:34:25 -07:00
|
|
|
_ENV_VAR_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _normalize_forward_env_names(forward_env: list[str] | None) -> list[str]:
|
|
|
|
|
"""Return a deduplicated list of valid environment variable names."""
|
|
|
|
|
normalized: list[str] = []
|
|
|
|
|
seen: set[str] = set()
|
|
|
|
|
|
|
|
|
|
for item in forward_env or []:
|
|
|
|
|
if not isinstance(item, str):
|
|
|
|
|
logger.warning("Ignoring non-string docker_forward_env entry: %r", item)
|
|
|
|
|
continue
|
|
|
|
|
|
|
|
|
|
key = item.strip()
|
|
|
|
|
if not key:
|
|
|
|
|
continue
|
|
|
|
|
if not _ENV_VAR_NAME_RE.match(key):
|
|
|
|
|
logger.warning("Ignoring invalid docker_forward_env entry: %r", item)
|
|
|
|
|
continue
|
|
|
|
|
if key in seen:
|
|
|
|
|
continue
|
|
|
|
|
|
|
|
|
|
seen.add(key)
|
|
|
|
|
normalized.append(key)
|
|
|
|
|
|
|
|
|
|
return normalized
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
def _load_hermes_env_vars() -> dict[str, str]:
|
|
|
|
|
"""Load ~/.hermes/.env values without failing Docker command execution."""
|
|
|
|
|
try:
|
|
|
|
|
from hermes_cli.config import load_env
|
|
|
|
|
|
|
|
|
|
return load_env() or {}
|
|
|
|
|
except Exception:
|
|
|
|
|
return {}
|
2026-03-10 20:45:13 -07:00
|
|
|
|
|
|
|
|
|
|
|
|
|
def find_docker() -> Optional[str]:
|
|
|
|
|
"""Locate the docker CLI binary.
|
|
|
|
|
|
|
|
|
|
Checks ``shutil.which`` first (respects PATH), then probes well-known
|
|
|
|
|
install locations on macOS where Docker Desktop may not be in PATH
|
|
|
|
|
(e.g. when running as a gateway service via launchd).
|
|
|
|
|
|
|
|
|
|
Returns the absolute path, or ``None`` if docker cannot be found.
|
|
|
|
|
"""
|
|
|
|
|
global _docker_executable
|
|
|
|
|
if _docker_executable is not None:
|
|
|
|
|
return _docker_executable
|
|
|
|
|
|
|
|
|
|
found = shutil.which("docker")
|
|
|
|
|
if found:
|
|
|
|
|
_docker_executable = found
|
|
|
|
|
return found
|
|
|
|
|
|
|
|
|
|
for path in _DOCKER_SEARCH_PATHS:
|
|
|
|
|
if os.path.isfile(path) and os.access(path, os.X_OK):
|
|
|
|
|
_docker_executable = path
|
|
|
|
|
logger.info("Found docker at non-PATH location: %s", path)
|
|
|
|
|
return path
|
|
|
|
|
|
|
|
|
|
return None
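The lookup order used by `find_docker()` (PATH first, then fixed fallback locations) can be tried in isolation with a minimal sketch. The helper name `locate_executable`, the fake binary name, and the temporary directory are assumptions made for this example and are not part of the module. Unlike `find_docker()`, the sketch does not cache its result.

```python
import os
import shutil
import stat
import tempfile
from typing import Optional, Sequence


def locate_executable(name: str, fallback_paths: Sequence[str]) -> Optional[str]:
    """Hypothetical helper: PATH lookup first, then explicit fallback locations."""
    found = shutil.which(name)
    if found:
        return found
    for path in fallback_paths:
        # Accept only regular files that the current user may execute.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


with tempfile.TemporaryDirectory() as tmp:
    # A stand-in "binary" that lives outside PATH, like Docker Desktop's CLI
    # under launchd.
    fake = os.path.join(tmp, "hypothetical-docker-cli")
    with open(fake, "w") as fh:
        fh.write("#!/bin/sh\nexit 0\n")
    os.chmod(fake, os.stat(fake).st_mode | stat.S_IXUSR)

    resolved = locate_executable("hypothetical-docker-cli", [fake])
    missing = locate_executable(
        "hypothetical-docker-cli", [os.path.join(tmp, "does-not-exist")]
    )
    print(resolved, missing)
```

Checking `os.access(path, os.X_OK)` in addition to `os.path.isfile` skips a file that exists but is not executable, which would otherwise fail later when the command is run.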
|
|
|
|
|
|
2026-02-23 02:11:33 -08:00
|
|
|
|
2026-03-02 01:09:34 -08:00
|
|
|
# Security flags applied to every container.
|
|
|
|
|
# The container itself is the security boundary (isolated from host).
|
2026-03-09 17:52:33 -07:00
|
|
|
# We drop all capabilities then add back the minimum needed:
|
|
|
|
|
# DAC_OVERRIDE - root can write to bind-mounted dirs owned by host user
|
|
|
|
|
# CHOWN/FOWNER - package managers (pip, npm, apt) need to set file ownership
|
|
|
|
|
# Block privilege escalation and limit PIDs.
|
2026-03-02 01:09:34 -08:00
|
|
|
# /tmp is size-limited and nosuid but allows exec (needed by pip/npm builds).
|
2026-02-23 02:11:33 -08:00
|
|
|
_SECURITY_ARGS = [
|
|
|
|
|
"--cap-drop", "ALL",
|
2026-03-09 17:52:33 -07:00
|
|
|
"--cap-add", "DAC_OVERRIDE",
|
|
|
|
|
"--cap-add", "CHOWN",
|
|
|
|
|
"--cap-add", "FOWNER",
|
2026-02-23 02:11:33 -08:00
|
|
|
"--security-opt", "no-new-privileges",
|
|
|
|
|
"--pids-limit", "256",
|
2026-03-02 01:09:34 -08:00
|
|
|
"--tmpfs", "/tmp:rw,nosuid,size=512m",
|
2026-02-23 02:11:33 -08:00
|
|
|
"--tmpfs", "/var/tmp:rw,noexec,nosuid,size=256m",
|
|
|
|
|
"--tmpfs", "/run:rw,noexec,nosuid,size=64m",
|
|
|
|
|
]
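As a rough sketch of how the flags in `_SECURITY_ARGS` could end up on a `docker run -d` command line, the snippet below assembles an argument vector and prints it without running it. The helper name `build_run_command`, the image name, and the `sleep infinity` keep-alive command are assumptions for illustration. The real startup code may order or extend the arguments differently.

```python
import shlex

# Copied from the module-level list above.
SECURITY_ARGS = [
    "--cap-drop", "ALL",
    "--cap-add", "DAC_OVERRIDE",
    "--cap-add", "CHOWN",
    "--cap-add", "FOWNER",
    "--security-opt", "no-new-privileges",
    "--pids-limit", "256",
    "--tmpfs", "/tmp:rw,nosuid,size=512m",
    "--tmpfs", "/var/tmp:rw,noexec,nosuid,size=256m",
    "--tmpfs", "/run:rw,noexec,nosuid,size=64m",
]


def build_run_command(docker_exe, image, extra_args=()):
    """Hypothetical helper: assemble a detached `docker run` argv list."""
    return [
        docker_exe, "run", "-d",
        *SECURITY_ARGS,
        *extra_args,
        image,
        "sleep", "infinity",  # assumed keep-alive command, not taken from the source
    ]


# The image name and memory limit are placeholder values.
cmd = build_run_command("docker", "python:3.11-slim", ["--memory", "512m"])
print(shlex.join(cmd))
```

Passing the command as a list to `subprocess.run` avoids shell quoting issues; `shlex.join` is used here only to display it.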
|
2026-02-21 22:31:43 -08:00
|
|
|
|
|
|
|
|
|
2026-02-26 01:15:56 -08:00
|
|
|
_storage_opt_ok: Optional[bool] = None # cached result across instances
|
|
|
|
|
|
|
|
|
|
|
2026-03-14 02:53:02 -07:00
|
|
|
def _ensure_docker_available() -> None:
|
|
|
|
|
"""Best-effort check that the docker CLI is available before use.
|
|
|
|
|
|
|
|
|
|
Reuses ``find_docker()`` so this preflight stays consistent with the rest of
|
|
|
|
|
the Docker backend, including known non-PATH Docker Desktop locations.
|
|
|
|
|
"""
|
|
|
|
|
docker_exe = find_docker()
|
|
|
|
|
if not docker_exe:
|
|
|
|
|
logger.error(
|
|
|
|
|
"Docker backend selected but no docker executable was found in PATH "
|
|
|
|
|
"or known install locations. Install Docker Desktop and ensure the "
|
|
|
|
|
"CLI is available."
|
|
|
|
|
)
|
|
|
|
|
raise RuntimeError(
|
|
|
|
|
"Docker executable not found in PATH or known install locations. "
|
|
|
|
|
"Install Docker and ensure the 'docker' command is available."
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
try:
|
|
|
|
|
result = subprocess.run(
|
|
|
|
|
[docker_exe, "version"],
|
|
|
|
|
capture_output=True,
|
|
|
|
|
text=True,
|
|
|
|
|
timeout=5,
|
|
|
|
|
)
|
|
|
|
|
except FileNotFoundError:
|
|
|
|
|
logger.error(
|
|
|
|
|
"Docker backend selected but the resolved docker executable '%s' could "
|
|
|
|
|
"not be executed.",
|
|
|
|
|
docker_exe,
|
|
|
|
|
exc_info=True,
|
|
|
|
|
)
|
|
|
|
|
raise RuntimeError(
|
|
|
|
|
"Docker executable could not be executed. Check your Docker installation."
|
|
|
|
|
)
|
|
|
|
|
except subprocess.TimeoutExpired:
|
|
|
|
|
logger.error(
|
|
|
|
|
"Docker backend selected but '%s version' timed out. "
|
|
|
|
|
"The Docker daemon may not be running.",
|
|
|
|
|
docker_exe,
|
|
|
|
|
exc_info=True,
|
|
|
|
|
)
|
|
|
|
|
raise RuntimeError(
|
|
|
|
|
"Docker daemon is not responding. Ensure Docker is running and try again."
|
|
|
|
|
)
|
|
|
|
|
except Exception:
|
|
|
|
|
logger.error(
|
|
|
|
|
"Unexpected error while checking Docker availability.",
|
|
|
|
|
exc_info=True,
|
|
|
|
|
)
|
|
|
|
|
raise
|
|
|
|
|
else:
|
|
|
|
|
if result.returncode != 0:
|
|
|
|
|
logger.error(
|
|
|
|
|
"Docker backend selected but '%s version' failed "
|
|
|
|
|
"(exit code %d, stderr=%s)",
|
|
|
|
|
docker_exe,
|
|
|
|
|
result.returncode,
|
|
|
|
|
result.stderr.strip(),
|
|
|
|
|
)
|
|
|
|
|
raise RuntimeError(
|
|
|
|
|
"Docker command is available but 'docker version' failed. "
|
|
|
|
|
"Check your Docker installation."
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
|
2026-02-21 22:31:43 -08:00
|
|
|
class DockerEnvironment(BaseEnvironment):
|
2026-02-23 02:11:33 -08:00
|
|
|
"""Hardened Docker container execution with resource limits and persistence.
|
|
|
|
|
|
2026-03-02 01:09:34 -08:00
|
|
|
Security: capabilities dropped to a minimum (DAC_OVERRIDE, CHOWN, FOWNER), no privilege escalation, PID limits,
|
|
|
|
|
size-limited tmpfs for scratch dirs. The container itself is the security
|
|
|
|
|
boundary — the filesystem inside is writable so agents can install packages
|
|
|
|
|
(pip, npm, apt) as needed. Writable workspace via tmpfs or bind mounts.
|
2026-02-21 22:31:43 -08:00
|
|
|
|
2026-03-02 01:09:34 -08:00
|
|
|
Persistence: when enabled, bind mounts preserve /workspace and /root
|
|
|
|
|
across container restarts.
|
2026-02-21 22:31:43 -08:00
|
|
|
"""
|
|
|
|
|
|
2026-02-23 02:11:33 -08:00
|
|
|
def __init__(
|
|
|
|
|
self,
|
|
|
|
|
image: str,
|
2026-02-25 22:31:05 -05:00
|
|
|
cwd: str = "/root",
|
2026-02-23 02:11:33 -08:00
|
|
|
timeout: int = 60,
|
|
|
|
|
cpu: float = 0,
|
|
|
|
|
memory: int = 0,
|
|
|
|
|
disk: int = 0,
|
|
|
|
|
persistent_filesystem: bool = False,
|
|
|
|
|
task_id: str = "default",
|
2026-02-28 07:12:48 +10:00
|
|
|
volumes: Optional[list] = None,
|
2026-03-17 02:34:25 -07:00
|
|
|
forward_env: list[str] | None = None,
|
2026-02-23 02:11:33 -08:00
|
|
|
network: bool = True,
|
2026-03-16 03:35:35 -04:00
|
|
|
host_cwd: Optional[str] = None,
|
2026-03-16 05:19:43 -07:00
|
|
|
auto_mount_cwd: bool = False,
|
2026-02-23 02:11:33 -08:00
|
|
|
):
|
2026-02-25 22:31:05 -05:00
|
|
|
if cwd == "~":
|
|
|
|
|
cwd = "/root"
|
2026-02-21 22:31:43 -08:00
|
|
|
super().__init__(cwd=cwd, timeout=timeout)
|
2026-02-23 02:11:33 -08:00
|
|
|
self._base_image = image
|
|
|
|
|
self._persistent = persistent_filesystem
|
|
|
|
|
self._task_id = task_id
|
2026-03-17 02:34:25 -07:00
|
|
|
self._forward_env = _normalize_forward_env_names(forward_env)
|
2026-02-23 02:11:33 -08:00
|
|
|
self._container_id: Optional[str] = None
|
2026-02-28 07:12:48 +10:00
|
|
|
logger.info("DockerEnvironment volumes: %s", volumes)
|
|
|
|
|
# Ensure volumes is a list (config.yaml could be malformed)
|
|
|
|
|
if volumes is not None and not isinstance(volumes, list):
|
|
|
|
|
logger.warning("docker_volumes config is not a list: %r", volumes)
|
|
|
|
|
volumes = []
|
2026-02-23 02:11:33 -08:00
|
|
|
|
2026-03-24 07:30:25 -07:00
|
|
|
# Fail fast if Docker is not available.
|
2026-03-14 02:53:02 -07:00
|
|
|
_ensure_docker_available()
|
|
|
|
|
|
2026-02-23 02:11:33 -08:00
|
|
|
# Build resource limit args
|
|
|
|
|
resource_args = []
|
|
|
|
|
if cpu > 0:
|
|
|
|
|
resource_args.extend(["--cpus", str(cpu)])
|
|
|
|
|
if memory > 0:
|
|
|
|
|
resource_args.extend(["--memory", f"{memory}m"])
|
2026-02-26 11:37:38 -08:00
|
|
|
if disk > 0 and sys.platform != "darwin":
|
|
|
|
|
if self._storage_opt_supported():
|
|
|
|
|
resource_args.extend(["--storage-opt", f"size={disk}m"])
|
|
|
|
|
else:
|
|
|
|
|
logger.warning(
|
|
|
|
|
"Docker storage driver does not support per-container disk limits "
|
|
|
|
|
"(requires overlay2 on XFS with pquota). Container will run without disk quota."
|
|
|
|
|
)
|
2026-02-23 02:11:33 -08:00
|
|
|
if not network:
|
|
|
|
|
resource_args.append("--network=none")
|
|
|
|
|
|
2026-02-23 21:15:35 -08:00
|
|
|
# Persistent workspace via bind mounts from a configurable host directory
|
|
|
|
|
# (TERMINAL_SANDBOX_DIR, default ~/.hermes/sandboxes/). Non-persistent
|
|
|
|
|
# mode uses tmpfs (ephemeral, fast, gone on cleanup).
|
|
|
|
|
from tools.environments.base import get_sandbox_dir
|
|
|
|
|
|
2026-02-28 07:12:48 +10:00
|
|
|
# User-configured volume mounts (from config.yaml docker_volumes)
|
|
|
|
|
volume_args = []
|
2026-03-16 05:19:43 -07:00
|
|
|
workspace_explicitly_mounted = False
|
2026-02-28 07:12:48 +10:00
|
|
|
for vol in (volumes or []):
|
|
|
|
|
if not isinstance(vol, str):
|
|
|
|
|
logger.warning("Docker volume entry is not a string: %r", vol)
|
|
|
|
|
continue
|
|
|
|
|
vol = vol.strip()
|
|
|
|
|
if not vol:
|
|
|
|
|
continue
|
|
|
|
|
if ":" in vol:
|
|
|
|
|
volume_args.extend(["-v", vol])
|
2026-03-16 05:19:43 -07:00
|
|
|
if ":/workspace" in vol:
|
|
|
|
|
workspace_explicitly_mounted = True
|
2026-02-28 07:12:48 +10:00
|
|
|
else:
|
|
|
|
|
logger.warning("Docker volume '%s' missing colon, skipping", vol)
|
|
|
|
|
|
2026-03-16 05:19:43 -07:00
|
|
|
host_cwd_abs = os.path.abspath(os.path.expanduser(host_cwd)) if host_cwd else ""
|
|
|
|
|
bind_host_cwd = (
|
|
|
|
|
auto_mount_cwd
|
|
|
|
|
and bool(host_cwd_abs)
|
|
|
|
|
and os.path.isdir(host_cwd_abs)
|
|
|
|
|
and not workspace_explicitly_mounted
|
|
|
|
|
)
|
|
|
|
|
if auto_mount_cwd and host_cwd and not os.path.isdir(host_cwd_abs):
|
|
|
|
|
logger.debug("Skipping docker cwd mount: host_cwd is not a valid directory: %s", host_cwd)
|
|
|
|
|
|
|
|
|
|
self._workspace_dir: Optional[str] = None
|
|
|
|
|
self._home_dir: Optional[str] = None
|
|
|
|
|
writable_args = []
|
|
|
|
|
if self._persistent:
|
|
|
|
|
sandbox = get_sandbox_dir() / "docker" / task_id
|
|
|
|
|
self._home_dir = str(sandbox / "home")
|
|
|
|
|
os.makedirs(self._home_dir, exist_ok=True)
|
|
|
|
|
writable_args.extend([
|
|
|
|
|
"-v", f"{self._home_dir}:/root",
|
|
|
|
|
])
|
|
|
|
|
if not bind_host_cwd and not workspace_explicitly_mounted:
|
|
|
|
|
self._workspace_dir = str(sandbox / "workspace")
|
|
|
|
|
os.makedirs(self._workspace_dir, exist_ok=True)
|
|
|
|
|
writable_args.extend([
|
|
|
|
|
"-v", f"{self._workspace_dir}:/workspace",
|
|
|
|
|
])
|
|
|
|
|
else:
|
|
|
|
|
if not bind_host_cwd and not workspace_explicitly_mounted:
|
|
|
|
|
writable_args.extend([
|
|
|
|
|
"--tmpfs", "/workspace:rw,exec,size=10g",
|
|
|
|
|
])
|
|
|
|
|
writable_args.extend([
|
|
|
|
|
"--tmpfs", "/home:rw,exec,size=1g",
|
|
|
|
|
"--tmpfs", "/root:rw,exec,size=1g",
|
|
|
|
|
])
|
|
|
|
|
|
|
|
|
|
if bind_host_cwd:
|
|
|
|
|
logger.info("Mounting configured host cwd to /workspace: %s", host_cwd_abs)
|
|
|
|
|
volume_args = ["-v", f"{host_cwd_abs}:/workspace", *volume_args]
|
|
|
|
|
elif workspace_explicitly_mounted:
|
|
|
|
|
logger.debug("Skipping docker cwd mount: /workspace already mounted by user config")
|
2026-03-16 03:35:35 -04:00
|
|
|
|
2026-03-28 23:53:40 -07:00
|
|
|
# Mount credential files (OAuth tokens, etc.) declared by skills.
|
|
|
|
|
# Read-only so the container can authenticate but not modify host creds.
|
|
|
|
|
try:
|
feat: mount skills directory into all remote backends with live sync (#3890)
Skills with scripts/, templates/, and references/ subdirectories need
those files available inside sandboxed execution environments. Previously
the skills directory was missing entirely from remote backends.
Live sync — files stay current as credentials refresh and skills update:
- Docker/Singularity: bind mounts are inherently live (host changes
visible immediately)
- Modal: _sync_files() runs before each command with mtime+size caching,
pushing only changed credential and skill files (~13μs no-op overhead)
- SSH: rsync --safe-links before each command (naturally incremental)
- Daytona: _upload_if_changed() with mtime+size caching before each command
Security — symlink filtering:
- Docker/Singularity: sanitized temp copy when symlinks detected
- Modal/Daytona: iter_skills_files() skips symlinks
- SSH: rsync --safe-links skips symlinks pointing outside source tree
- Temp dir cleanup via atexit + reuse across calls
Non-root user support:
- SSH: detects remote home via echo $HOME, syncs to $HOME/.hermes/
- Daytona: detects sandbox home before sync, uploads to $HOME/.hermes/
- Docker/Modal/Singularity: run as root, /root/.hermes/ is correct
Also:
- credential_files.py: fix name/path key fallback in required_credential_files
- Singularity, SSH, Daytona: gained credential file support
- 14 tests covering symlink filtering, name/path fallback, iter_skills_files
2026-03-30 02:45:41 -07:00
|
|
|
from tools.credential_files import get_credential_file_mounts, get_skills_directory_mount
|
2026-03-28 23:53:40 -07:00
|
|
|
|
|
|
|
|
for mount_entry in get_credential_file_mounts():
|
|
|
|
|
volume_args.extend([
|
|
|
|
|
"-v",
|
|
|
|
|
f"{mount_entry['host_path']}:{mount_entry['container_path']}:ro",
|
|
|
|
|
])
|
|
|
|
|
logger.info(
|
|
|
|
|
"Docker: mounting credential %s -> %s",
|
|
|
|
|
mount_entry["host_path"],
|
|
|
|
|
mount_entry["container_path"],
|
|
|
|
|
)
|
2026-03-30 02:45:41 -07:00
            # Mount the skills directory so skill scripts/templates are
            # available inside the container at the same relative path.
            skills_mount = get_skills_directory_mount()
            if skills_mount:
                volume_args.extend([
                    "-v",
                    f"{skills_mount['host_path']}:{skills_mount['container_path']}:ro",
                ])
                logger.info(
                    "Docker: mounting skills dir %s -> %s",
                    skills_mount["host_path"],
                    skills_mount["container_path"],
                )
2026-03-28 23:53:40 -07:00
        except Exception as e:
            logger.debug("Docker: could not load credential file mounts: %s", e)
2026-02-28 07:12:48 +10:00
        logger.info(f"Docker volume_args: {volume_args}")
        all_run_args = list(_SECURITY_ARGS) + writable_args + resource_args + volume_args
        logger.info(f"Docker run_args: {all_run_args}")
fix: Docker backend fails when docker is not in PATH (macOS gateway)
On macOS, Docker Desktop installs the CLI to /usr/local/bin/docker, but
when Hermes runs as a gateway service (launchd) or in other non-login
contexts, /usr/local/bin is often not in PATH. This causes the Docker
requirements check to fail with 'No such file or directory: docker' even
though docker works fine from the user's terminal.
Add find_docker() helper that uses shutil.which() first, then probes
common Docker Desktop install paths on macOS (/usr/local/bin,
/opt/homebrew/bin, Docker.app bundle). The resolved path is cached and
passed to mini-swe-agent via its 'executable' parameter.
- tools/environments/docker.py: add find_docker(), use it in
_storage_opt_supported() and pass to _Docker(executable=...)
- tools/terminal_tool.py: use find_docker() in requirements check
- tests/tools/test_docker_find.py: 4 tests (PATH, fallback, not found, cache)
2877 tests pass.
2026-03-10 20:45:13 -07:00
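The lookup order described in the commit message above (`shutil.which()` first, then well-known install locations, with the result cached) might look roughly like this. It is a sketch under assumptions, not the project's `find_docker()`: the names `resolve_docker` and `find_docker_sketch` are hypothetical, and the exact path inside the Docker.app bundle is assumed.

```python
import functools
import os
import shutil
from typing import Callable, Iterable, Optional

# Locations named in the commit message; the Docker.app path is an assumption.
FALLBACK_PATHS = (
    "/usr/local/bin/docker",
    "/opt/homebrew/bin/docker",
    "/Applications/Docker.app/Contents/Resources/bin/docker",
)


def resolve_docker(
    which: Callable[[str], Optional[str]] = shutil.which,
    candidates: Iterable[str] = FALLBACK_PATHS,
) -> Optional[str]:
    """Return a docker executable path, or None if nothing usable is found."""
    found = which("docker")
    if found:
        return found
    for candidate in candidates:
        # Only accept regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


@functools.lru_cache(maxsize=1)
def find_docker_sketch() -> Optional[str]:
    """Cached wrapper so the filesystem is probed once per process."""
    return resolve_docker()
```

Keeping the lookup in a function that takes `which` and `candidates` as parameters makes the PATH, fallback, and not-found cases easy to test without touching the real filesystem.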
        # Resolve the docker executable once so it works even when
        # /usr/local/bin is not in PATH (common on macOS gateway/service).
2026-03-24 07:30:25 -07:00
        self._docker_exe = find_docker() or "docker"

        # Start the container directly via `docker run -d`.
        container_name = f"hermes-{uuid.uuid4().hex[:8]}"
        run_cmd = [
            self._docker_exe, "run", "-d",
            "--name", container_name,
            "-w", cwd,
            *all_run_args,
            image,
            "sleep", "2h",
        ]
        logger.debug(f"Starting container: {' '.join(run_cmd)}")
        result = subprocess.run(
            run_cmd,
            capture_output=True,
            text=True,
            timeout=120,  # image pull may take a while
            check=True,
2026-02-23 02:11:33 -08:00
        )
2026-03-24 07:30:25 -07:00
        self._container_id = result.stdout.strip()
        logger.info(f"Started container {container_name} ({self._container_id[:12]})")
2026-02-26 01:15:56 -08:00
    @staticmethod
    def _storage_opt_supported() -> bool:
        """Check if Docker's storage driver supports --storage-opt size=.

        Only overlay2 on XFS with pquota supports per-container disk quotas.
        Ubuntu (and most distros) default to ext4, where this flag errors out.
        """
        global _storage_opt_ok
        if _storage_opt_ok is not None:
            return _storage_opt_ok
        try:
2026-03-10 20:45:13 -07:00
            docker = find_docker() or "docker"
2026-02-26 01:15:56 -08:00
            result = subprocess.run(
2026-03-10 20:45:13 -07:00
                [docker, "info", "--format", "{{.Driver}}"],
2026-02-26 01:15:56 -08:00
                capture_output=True, text=True, timeout=10,
            )
            driver = result.stdout.strip().lower()
            if driver != "overlay2":
                _storage_opt_ok = False
                return False
            # overlay2 only supports storage-opt on XFS with pquota.
            # Probe by attempting a dry-ish run, the fastest reliable check.
            probe = subprocess.run(
2026-03-10 20:45:13 -07:00
                [docker, "create", "--storage-opt", "size=1m", "hello-world"],
2026-02-26 01:15:56 -08:00
                capture_output=True, text=True, timeout=15,
            )
            if probe.returncode == 0:
                # Clean up the created container
                container_id = probe.stdout.strip()
                if container_id:
2026-03-10 20:45:13 -07:00
                    subprocess.run([docker, "rm", container_id],
2026-02-26 01:15:56 -08:00
                                   capture_output=True, timeout=5)
                _storage_opt_ok = True
            else:
                _storage_opt_ok = False
        except Exception:
            _storage_opt_ok = False
        logger.debug("Docker --storage-opt support: %s", _storage_opt_ok)
        return _storage_opt_ok
2026-02-21 22:31:43 -08:00
    def execute(self, command: str, cwd: str = "", *,
                timeout: int | None = None,
                stdin_data: str | None = None) -> dict:
2026-03-08 17:46:11 +03:30
        exec_command, sudo_stdin = self._prepare_command(command)
2026-02-21 22:31:43 -08:00
        work_dir = cwd or self.cwd
        effective_timeout = timeout or self.timeout
2026-03-08 17:46:11 +03:30
        # Merge sudo password (if any) with caller-supplied stdin_data.
        if sudo_stdin is not None and stdin_data is not None:
            effective_stdin = sudo_stdin + stdin_data
        elif sudo_stdin is not None:
            effective_stdin = sudo_stdin
        else:
            effective_stdin = stdin_data
2026-02-23 21:15:35 -08:00
        # docker exec -w doesn't expand ~, so prepend a cd into the command
        if work_dir == "~" or work_dir.startswith("~/"):
            exec_command = f"cd {work_dir} && {exec_command}"
            work_dir = "/"
2026-03-24 07:30:25 -07:00
        assert self._container_id, "Container not started"
        cmd = [self._docker_exe, "exec"]
2026-03-08 17:46:11 +03:30
        if effective_stdin is not None:
2026-02-21 22:31:43 -08:00
            cmd.append("-i")
        cmd.extend(["-w", work_dir])
2026-03-28 23:53:40 -07:00
        # Combine explicit docker_forward_env with skill-declared env_passthrough
        # vars so skills that declare required_environment_variables (e.g. Notion)
        # have their keys forwarded into the container automatically.
        forward_keys = set(self._forward_env)
        try:
            from tools.env_passthrough import get_all_passthrough
            forward_keys |= get_all_passthrough()
        except Exception:
            pass
        hermes_env = _load_hermes_env_vars() if forward_keys else {}
        for key in sorted(forward_keys):
2026-03-17 02:34:25 -07:00
            value = os.getenv(key)
            if value is None:
                value = hermes_env.get(key)
            if value is not None:
2026-02-21 22:31:43 -08:00
                cmd.extend(["-e", f"{key}={value}"])
2026-03-24 07:30:25 -07:00
        cmd.extend([self._container_id, "bash", "-lc", exec_command])
2026-02-21 22:31:43 -08:00
        try:
2026-02-23 02:11:33 -08:00
            _output_chunks = []
            proc = subprocess.Popen(
                cmd,
                stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
2026-03-08 17:46:11 +03:30
                stdin=subprocess.PIPE if effective_stdin else subprocess.DEVNULL,
2026-02-23 02:11:33 -08:00
                text=True,
            )
2026-03-08 17:46:11 +03:30
            if effective_stdin:
2026-02-23 02:11:33 -08:00
                try:
2026-03-08 17:46:11 +03:30
                    proc.stdin.write(effective_stdin)
2026-02-23 02:11:33 -08:00
                    proc.stdin.close()
                except Exception:
                    pass

            def _drain():
                try:
                    for line in proc.stdout:
                        _output_chunks.append(line)
                except Exception:
                    pass

            reader = threading.Thread(target=_drain, daemon=True)
            reader.start()
            deadline = time.monotonic() + effective_timeout

            while proc.poll() is None:
                if is_interrupted():
                    proc.terminate()
                    try:
                        proc.wait(timeout=1)
                    except subprocess.TimeoutExpired:
                        proc.kill()
                    reader.join(timeout=2)
                    return {
                        "output": "".join(_output_chunks) + "\n[Command interrupted]",
                        "returncode": 130,
                    }
                if time.monotonic() > deadline:
                    proc.kill()
                    reader.join(timeout=2)
                    return self._timeout_result(effective_timeout)
                time.sleep(0.2)

            reader.join(timeout=5)
            return {"output": "".join(_output_chunks), "returncode": proc.returncode}
        except Exception as e:
            return {"output": f"Docker execution error: {e}", "returncode": 1}
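The drain-thread plus polling-deadline pattern used by `execute()` can be tried in isolation with the sketch below. It is a simplified stand-in, not the method itself: `run_with_deadline` is a hypothetical name, interrupt handling is left out, and the 124 return code on timeout is an assumption (the source delegates to `self._timeout_result()`, whose output is not shown here).

```python
import subprocess
import sys
import threading
import time


def run_with_deadline(cmd: list[str], timeout: float, poll_interval: float = 0.05) -> dict:
    """Run cmd, collecting merged stdout/stderr, and kill it after `timeout` seconds."""
    chunks: list[str] = []
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        stdin=subprocess.DEVNULL,
        text=True,
    )

    def drain() -> None:
        # Reading in a thread keeps the pipe from filling up and blocking the child.
        for line in proc.stdout:
            chunks.append(line)

    reader = threading.Thread(target=drain, daemon=True)
    reader.start()
    deadline = time.monotonic() + timeout

    while proc.poll() is None:
        if time.monotonic() > deadline:
            proc.kill()
            proc.wait()  # reap the killed child
            reader.join(timeout=2)
            # 124 is an assumed convention for "timed out".
            return {"output": "".join(chunks), "returncode": 124, "timed_out": True}
        time.sleep(poll_interval)

    reader.join(timeout=5)
    return {"output": "".join(chunks), "returncode": proc.returncode, "timed_out": False}


if __name__ == "__main__":
    print(run_with_deadline([sys.executable, "-c", "print('hi')"], timeout=10))
```

Polling with a short sleep, rather than a blocking `proc.wait(timeout=...)`, is what leaves room for the loop to check an interrupt flag between iterations.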
2026-02-21 22:31:43 -08:00
    def cleanup(self):
2026-02-23 21:15:35 -08:00
        """Stop and remove the container. Bind-mount dirs persist if persistent=True."""
2026-03-24 07:30:25 -07:00
        if self._container_id:
2026-03-17 04:02:01 -07:00
            try:
2026-03-30 23:15:11 +00:00
                # SECURITY FIX: Use list-based commands instead of shell=True
                # to prevent command injection via malicious container IDs
2026-03-24 07:30:25 -07:00
                # Stop in background so cleanup doesn't block
2026-03-30 23:15:11 +00:00
                container_id = self._container_id
                # Validate container ID format to prevent injection
                if not re.match(r'^[a-f0-9]{12,64}$', container_id):
                    logger.warning("Invalid container ID format: %s", container_id)
                    return

                # Use subprocess with list args instead of shell=True
                subprocess.Popen(
                    ["timeout", "60", self._docker_exe, "stop", container_id],
                    stdout=subprocess.DEVNULL,
                    stderr=subprocess.DEVNULL,
2026-03-17 04:02:01 -07:00
                )
            except Exception as e:
2026-03-24 07:30:25 -07:00
                logger.warning("Failed to stop container %s: %s", self._container_id, e)

            if not self._persistent:
                # Also schedule removal (stop only leaves it as stopped)
                try:
2026-03-30 23:15:11 +00:00
                    # Use a delayed removal via threading instead of shell
                    def delayed_remove(docker_exe, container_id, delay=3):
                        import time
                        time.sleep(delay)
                        try:
                            subprocess.run(
                                [docker_exe, "rm", "-f", container_id],
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL,
                                check=False,
                            )
                        except Exception:
                            pass

                    import threading
                    remove_thread = threading.Thread(
                        target=delayed_remove,
                        args=(self._docker_exe, self._container_id, 3),
                        daemon=True,
2026-03-24 07:30:25 -07:00
                    )
2026-03-30 23:15:11 +00:00
                    remove_thread.start()
2026-03-24 07:30:25 -07:00
                except Exception:
                    pass
2026-03-17 04:02:01 -07:00
            self._container_id = None
2026-02-23 21:15:35 -08:00
        if not self._persistent:
            for d in (self._workspace_dir, self._home_dir):
                if d:
                    shutil.rmtree(d, ignore_errors=True)
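The container ID check in `cleanup()` can be exercised on its own with the sketch below. The helper name `is_container_id` is hypothetical. It uses `re.fullmatch` rather than `re.match` with `^...$`, because `$` also matches just before a trailing newline; the source is not affected in practice since `_container_id` comes from `result.stdout.strip()`.

```python
import re

# 12 to 64 lowercase hex characters, the same pattern cleanup() checks.
_CONTAINER_ID_RE = re.compile(r"[a-f0-9]{12,64}")


def is_container_id(value: str) -> bool:
    """Return True only if the whole string is a short or full container ID."""
    return _CONTAINER_ID_RE.fullmatch(value) is not None
```

Passing the validated ID as a list element to `subprocess` (never through a shell string) is what removes the injection risk; the format check is a second layer on top of that.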