Add background process management with process tool, wait, PTY, and stdin support

New process registry and tool for managing long-running background processes across all terminal backends (local, Docker, Singularity, Modal, SSH). Process Registry (tools/process_registry.py): - ProcessSession tracking with rolling 200KB output buffer - spawn_local() with optional PTY via ptyprocess for interactive CLIs - spawn_via_env() for non-local backends (runs inside sandbox, never on host) - Background reader threads per process (Popen stdout or PTY) - wait() with timeout clamping, interrupt support, and transparent limit reporting - JSON checkpoint to ~/.hermes/processes.json for gateway crash recovery - Module-level singleton shared across agent loop, gateway, and RL Process Tool (model_tools.py): - 7 actions: list, poll, log, wait, kill, write, submit - Paired with terminal in all toolsets (CLI, messaging, RL) - Timeout clamping with transparent notes in response Terminal Tool Updates (tools/terminal_tool.py): - Replaced nohup background mode with registry spawn (returns session_id) - Added workdir parameter for per-command working directory - Added check_interval parameter for gateway auto-check watchers - Added pty parameter for interactive CLI tools (Codex, Claude Code) - Updated TERMINAL_TOOL_DESCRIPTION with full background workflow docs - Cleanup thread now respects active background processes (won't reap sandbox) Gateway Integration (gateway/run.py, session.py, config.py): - Session reset protection: sessions with active processes exempt from reset - Default idle timeout increased from 2 hours to 24 hours - from_dict fallback aligned to match (was 120, now 1440) - session_key env var propagated to process registry for session mapping - Crash recovery on gateway startup via checkpoint probe - check_interval watcher: asyncio task polls process, delivers updates to platform RL Safety (environments/): - tool_context.py cleanup() kills background processes on episode end - hermes_base_env.py warns when enabled_toolsets is None (loads all tools) - Process tool safe in RL via wait() blocking the agent loop Also: - Added ptyprocess as optional dependency (in pyproject.toml [pty] extra + [all]) - Fixed pre-existing bug: rl_test_inference missing from TOOL_TO_TOOLSET_MAP - Updated AGENTS.md with process management docs and project structure - Updated README.md terminal section with process management overview
2026-02-17 02:51:31 -08:00
parent 48b5cfd085
commit 061fa70907
12 changed files with 1142 additions and 40 deletions
--- a/environments/hermes_base_env.py
+++ b/environments/hermes_base_env.py
@@ -258,6 +258,11 @@ class HermesAgentBaseEnv(BaseEnv):
            logger.info("Sampled toolsets from '%s': %s", config.distribution, group_toolsets)
        else:
            group_toolsets = config.enabled_toolsets  # None means "all available"
+            if group_toolsets is None:
+                logger.warning(
+                    "enabled_toolsets is None -- loading ALL tools including messaging. "
+                    "Set explicit enabled_toolsets for RL training."
+                )

        tools = get_tool_definitions(
            enabled_toolsets=group_toolsets,
--- a/environments/tool_context.py
+++ b/environments/tool_context.py
@@ -438,11 +438,21 @@ class ToolContext:

    def cleanup(self):
        """
-        Release all resources (terminal VMs, browser sessions) for this rollout.
+        Release all resources (terminal VMs, browser sessions, background processes)
+        for this rollout.

        Called automatically by the base environment via try/finally after
        compute_reward() completes. You generally don't need to call this yourself.
        """
+        # Kill any background processes from this rollout (safety net)
+        try:
+            from tools.process_registry import process_registry
+            killed = process_registry.kill_all(task_id=self.task_id)
+            if killed:
+                logger.debug("Process cleanup for task %s: killed %d process(es)", self.task_id, killed)
+        except Exception as e:
+            logger.debug("Process cleanup for task %s: %s", self.task_id, e)
+
        try:
            cleanup_vm(self.task_id)
        except Exception as e: