security: fix race condition in interrupt propagation (V-007)
Some checks failed
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 29s
Docker Build and Publish / build-and-push (pull_request) Failing after 38s
Tests / test (pull_request) Failing after 28s

Add proper RLock synchronization to prevent race conditions when multiple
threads access interrupt state simultaneously.

Changes:
- tools/interrupt.py: Add RLock, nesting count tracking, new APIs
- tools/terminal_tool.py: Remove direct _interrupt_event exposure
- tests/tools/test_interrupt.py: Comprehensive race condition tests

CVSS: 8.5 (High)
Refs: V-007, Issue #48
Fixes: CWE-362: Concurrent Execution using Shared Resource
This commit is contained in:
2026-03-30 23:47:04 +00:00
parent 6da1fc11a2
commit 13265971df
3 changed files with 231 additions and 210 deletions

View File

@@ -4,6 +4,9 @@ Provides a global threading.Event that any tool can check to determine
if the user has requested an interrupt. The agent's interrupt() method
sets this event, and tools poll it during long-running operations.
SECURITY FIX (V-007): Added proper locking to prevent race conditions
in interrupt propagation. Uses RLock for thread-safe nested access.
Usage in tools:
from tools.interrupt import is_interrupted
if is_interrupted():
@@ -12,17 +15,79 @@ Usage in tools:
import threading
# Global interrupt event with proper synchronization
_interrupt_event = threading.Event()
_interrupt_lock = threading.RLock()
_interrupt_count = 0 # Track nested interrupts for idempotency
def set_interrupt(active: bool) -> None:
"""Called by the agent to signal or clear the interrupt."""
if active:
_interrupt_event.set()
else:
_interrupt_event.clear()
"""Called by the agent to signal or clear the interrupt.
SECURITY FIX: Uses RLock to prevent race conditions when multiple
threads attempt to set/clear the interrupt simultaneously.
"""
global _interrupt_count
with _interrupt_lock:
if active:
_interrupt_count += 1
_interrupt_event.set()
else:
_interrupt_count = 0
_interrupt_event.clear()
def is_interrupted() -> bool:
"""Check if an interrupt has been requested. Safe to call from any thread."""
return _interrupt_event.is_set()
def get_interrupt_count() -> int:
"""Get the current interrupt nesting count (for debugging).
Returns the number of times set_interrupt(True) has been called
without a corresponding clear.
"""
with _interrupt_lock:
return _interrupt_count
def wait_for_interrupt(timeout: float = None) -> bool:
"""Block until interrupt is set or timeout expires.
Args:
timeout: Maximum time to wait in seconds
Returns:
True if interrupt was set, False if timeout expired
"""
return _interrupt_event.wait(timeout)
class InterruptibleContext:
"""Context manager for interruptible operations.
Usage:
with InterruptibleContext() as ctx:
while ctx.should_continue():
do_work()
"""
def __init__(self, check_interval: int = 100):
self.check_interval = check_interval
self._iteration = 0
self._interrupted = False
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
pass
def should_continue(self) -> bool:
"""Check if operation should continue (not interrupted)."""
self._iteration += 1
if self._iteration % self.check_interval == 0:
self._interrupted = is_interrupted()
return not self._interrupted

View File

@@ -47,7 +47,8 @@ logger = logging.getLogger(__name__)
# The terminal tool polls this during command execution so it can kill
# long-running subprocesses immediately instead of blocking until timeout.
# ---------------------------------------------------------------------------
from tools.interrupt import is_interrupted, _interrupt_event # noqa: F401 — re-exported
from tools.interrupt import is_interrupted # noqa: F401 — re-exported
# SECURITY: Don't expose _interrupt_event directly - use proper API
# display_hermes_home imported lazily at call site (stale-module safety during hermes update)