Authored by dmahan93. Adds HERMES_YOLO_MODE env var and --yolo CLI flag to auto-approve all dangerous command prompts. Post-merge: renamed --fuck-it-ship-it to --yolo for brevity, resolved conflict with --checkpoints flag.
7.8 KiB
Checkpoint & Rollback — Implementation Plan
Goal
Automatic filesystem snapshots before destructive file operations, with user-facing rollback. The agent never sees or interacts with this — it's transparent infrastructure.
Design Principles
- Not a tool — the LLM never knows about it. Zero prompt tokens, zero tool schema overhead.
- Once per turn — checkpoint at most once per conversation turn (user message → agent response cycle), triggered lazily on the first file-mutating operation. Not on every write.
- Opt-in via config — disabled by default, enabled with
checkpoints: truein config.yaml. - Works on any directory — uses a shadow git repo completely separate from the user's project git. Works on git repos, non-git directories, anything.
- User-facing rollback —
/rollbackslash command (CLI + gateway) to list and restore checkpoints. Alsohermes rollbackCLI subcommand.
Architecture
~/.hermes/checkpoints/
{sha256(abs_dir)[:16]}/ # Shadow git repo per working directory
HEAD, refs/, objects/... # Standard git internals
HERMES_WORKDIR # Original dir path (for display)
info/exclude # Default excludes (node_modules, .env, etc.)
Core: CheckpointManager (new file: tools/checkpoint_manager.py)
Adapted from PR #559's CheckpointStore. Key changes from the PR:
- Not a tool — no schema, no registry entry, no handler
- Turn-scoped deduplication — tracks
_checkpointed_dirs: Set[str]per turn - Configurable — reads
checkpointsconfig key - Pruning — keeps last N snapshots per directory (default 50), prunes on take
class CheckpointManager:
def __init__(self, enabled: bool = False, max_snapshots: int = 50):
self.enabled = enabled
self.max_snapshots = max_snapshots
self._checkpointed_dirs: Set[str] = set() # reset each turn
def new_turn(self):
"""Call at start of each conversation turn to reset dedup."""
self._checkpointed_dirs.clear()
def ensure_checkpoint(self, working_dir: str, reason: str = "auto") -> None:
"""Take a checkpoint if enabled and not already done this turn."""
if not self.enabled:
return
abs_dir = str(Path(working_dir).resolve())
if abs_dir in self._checkpointed_dirs:
return
self._checkpointed_dirs.add(abs_dir)
try:
self._take(abs_dir, reason)
except Exception as e:
logger.debug("Checkpoint failed (non-fatal): %s", e)
def list_checkpoints(self, working_dir: str) -> List[dict]:
"""List available checkpoints for a directory."""
...
def restore(self, working_dir: str, commit_hash: str) -> dict:
"""Restore files to a checkpoint state."""
...
def _take(self, working_dir: str, reason: str):
"""Shadow git: add -A + commit. Prune if over max_snapshots."""
...
def _prune(self, shadow_repo: Path):
"""Keep only last max_snapshots commits."""
...
Integration Point: run_agent.py
The AIAgent already owns the conversation loop. Add CheckpointManager as an instance attribute:
class AIAgent:
def __init__(self, ...):
...
# Checkpoint manager — reads config to determine if enabled
self._checkpoint_mgr = CheckpointManager(
enabled=config.get("checkpoints", False),
max_snapshots=config.get("checkpoint_max_snapshots", 50),
)
Turn boundary — in run_conversation(), call new_turn() at the start of each agent iteration (before processing tool calls):
# Inside the main loop, before _execute_tool_calls():
self._checkpoint_mgr.new_turn()
Trigger point — in _execute_tool_calls(), before dispatching file-mutating tools:
# Before the handle_function_call dispatch:
if function_name in ("write_file", "patch"):
# Determine working dir from the file path in the args
file_path = function_args.get("path", "") or function_args.get("old_string", "")
if file_path:
work_dir = str(Path(file_path).parent.resolve())
self._checkpoint_mgr.ensure_checkpoint(work_dir, f"before {function_name}")
This means:
- First
write_filein a turn → checkpoint (fast, onegit add -A && git commit) - Subsequent writes in the same turn → no-op (already checkpointed)
- Next turn (new user message) → fresh checkpoint eligibility
Config
Add to DEFAULT_CONFIG in hermes_cli/config.py:
"checkpoints": False, # Enable filesystem checkpoints before destructive ops
"checkpoint_max_snapshots": 50, # Max snapshots to keep per directory
User enables with:
# ~/.hermes/config.yaml
checkpoints: true
User-Facing Rollback
CLI slash command — add /rollback to process_command() in cli.py:
/rollback — List recent checkpoints for the current directory
/rollback <hash> — Restore files to that checkpoint
Shows a numbered list:
📸 Checkpoints for /home/user/project:
1. abc1234 2026-03-09 21:15 before write_file (3 files changed)
2. def5678 2026-03-09 20:42 before patch (1 file changed)
3. ghi9012 2026-03-09 20:30 before write_file (2 files changed)
Use /rollback <number> to restore, e.g. /rollback 1
Gateway slash command — add /rollback to gateway/run.py with the same behavior.
CLI subcommand — hermes rollback (optional, lower priority).
What Gets Excluded (not checkpointed)
Same as the PR's defaults — written to the shadow repo's info/exclude:
node_modules/
dist/
build/
.env
.env.*
__pycache__/
*.pyc
.DS_Store
*.log
.cache/
.venv/
.git/
Also respects the project's .gitignore if present (shadow repo can read it via core.excludesFile).
Safety
ensure_checkpoint()wraps everything in try/except — a checkpoint failure never blocks the actual file operation- Shadow repo is completely isolated — GIT_DIR + GIT_WORK_TREE env vars, never touches user's .git
- If git isn't installed, checkpoints silently disable
- Large directories: add a file count check — skip checkpoint if >50K files to avoid slowdowns
Files to Create/Modify
| File | Change |
|---|---|
tools/checkpoint_manager.py |
NEW — CheckpointManager class (adapted from PR #559) |
run_agent.py |
Add CheckpointManager init + trigger in _execute_tool_calls() |
hermes_cli/config.py |
Add checkpoints + checkpoint_max_snapshots to DEFAULT_CONFIG |
cli.py |
Add /rollback slash command handler |
gateway/run.py |
Add /rollback slash command handler |
tests/tools/test_checkpoint_manager.py |
NEW — tests (adapted from PR #559's tests) |
What We Take From PR #559
_shadow_repo_path()— deterministic path hashing ✅_git_env()— GIT_DIR/GIT_WORK_TREE isolation ✅_run_git()— subprocess wrapper with timeout ✅_init_shadow_repo()— shadow repo initialization ✅DEFAULT_EXCLUDESlist ✅- Test structure and patterns ✅
What We Change From PR #559
- Remove tool schema/registry — not a tool
- Remove injection into file_operations.py and patch_parser.py — trigger from run_agent.py instead
- Add turn-scoped deduplication — one checkpoint per turn, not per operation
- Add pruning — keep last N snapshots
- Add config flag — opt-in, not mandatory
- Add /rollback command — user-facing restore UI
- Add file count guard — skip huge directories
Implementation Order
tools/checkpoint_manager.py— core class with take/list/restore/prunetests/tools/test_checkpoint_manager.py— testshermes_cli/config.py— config keysrun_agent.py— integration (init + trigger)cli.py—/rollbackslash commandgateway/run.py—/rollbackslash command- Full test suite run + manual smoke test