Compare commits


11 Commits

Author SHA1 Message Date
dd3a037e84 feat: poka-yoke auto-revert incomplete skill edits on failure (#295)
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 1m10s
Add test file
2026-04-14 02:43:36 +00:00
23a5a6771b feat: poka-yoke auto-revert incomplete skill edits on failure (#295)
Update tools/skill_manager_tool.py
2026-04-14 02:42:56 +00:00
954fd992eb Merge pull request 'perf: lazy session creation — defer DB write until first message (#314)' (#449) from whip/314-1776127532 into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 55s
Forge CI / smoke-and-build (pull_request) Failing after 1m12s
perf: lazy session creation (#314)

Closes #314.
2026-04-14 01:08:13 +00:00
Metatron
f35f56e397 perf: lazy session creation — defer DB write until first message (closes #314)
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 56s
Remove eager create_session() call from AIAgent.__init__(). Sessions
are now created lazily on first _flush_messages_to_session_db() call
via ensure_session() which uses INSERT OR IGNORE.

Impact: eliminates 32.4% of sessions (3,564 of 10,985) that were
created at agent init but never received any messages.

The existing ensure_session() fallback in _flush_messages_to_session_db()
already handles this pattern — it was originally designed for recovery
after transient SQLite lock failures. Now it's the primary creation path.

Compression-initiated sessions still use create_session() directly
(line ~5995) since they have messages to write immediately.
2026-04-13 20:52:06 -04:00
8d0cad13c4 Merge pull request 'fix: watchdog config drift check uses YAML parse, not grep (#377)' (#398) from burn/377-1776117775 into main
Some checks failed
Forge CI / smoke-and-build (push) Failing after 28s
2026-04-14 00:34:14 +00:00
b9aca0a3b4 Merge pull request 'feat: time-aware model routing for cron jobs (#317)' (#432) from burn/317-1776125702 into main
Some checks failed
Forge CI / smoke-and-build (push) Has been cancelled
2026-04-14 00:34:06 +00:00
99d36533d5 Merge pull request 'feat: add /debug slash command with paste service upload (#320)' (#416) from burn/320-1776120221 into main
Some checks failed
Forge CI / smoke-and-build (push) Has been cancelled
2026-04-14 00:33:59 +00:00
Alexander Whitestone
5989600d80 feat: time-aware model routing for cron jobs (#317)
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 1m1s
Empirical audit: cron error rate peaks at 18:00 (9.4%) vs 4.0% at 09:00.
During configured high-error windows, automatically route cron jobs to
more capable models when the user is not present to correct errors.

- agent/smart_model_routing.py: resolve_cron_model() + _hour_in_window()
- cron/scheduler.py: wired into run_job() after base model resolution
- tests/test_cron_model_routing.py: 16 tests

Config:
  cron_model_routing:
    enabled: true
    fallback_model: "anthropic/claude-sonnet-4"
    fallback_provider: "openrouter"
    windows:
      - {start_hour: 17, end_hour: 22, reason: evening_error_peak}
      - {start_hour: 2, end_hour: 5, reason: overnight_api_instability}

Features: midnight-wrap, per-window overrides, first-match-wins,
graceful degradation on malformed config.

Closes #317
2026-04-13 20:19:37 -04:00
f1626a932c feat: add /debug command handler with paste service upload (#320)
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 1m1s
2026-04-13 22:48:33 +00:00
d68ab4cff4 feat: add /debug slash command to command registry (#320)
2026-04-13 22:47:51 +00:00
87867f3d10 fix: config drift check uses YAML parse not grep (#377)
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 59s
2026-04-13 22:12:56 +00:00
8 changed files with 1274 additions and 83 deletions


@@ -1,10 +1,11 @@
-"""Helpers for optional cheap-vs-strong model routing."""
+"""Helpers for optional cheap-vs-strong and time-aware model routing."""
from __future__ import annotations
import os
import re
-from typing import Any, Dict, Optional
+from datetime import datetime
+from typing import Any, Dict, List, Optional
from utils import is_truthy_value
@@ -192,3 +193,104 @@ def resolve_turn_route(user_message: str, routing_config: Optional[Dict[str, Any
tuple(runtime.get("args") or ()),
),
}
# =========================================================================
# Time-aware cron model routing
# =========================================================================
#
# Empirical finding: cron error rate peaks at 18:00 (9.4%) vs 4.0% at 09:00.
# During high-error windows, route cron jobs to more capable models.
#
# Config (config.yaml):
# cron_model_routing:
# enabled: true
# fallback_model: "anthropic/claude-sonnet-4"
# fallback_provider: "openrouter"
# windows:
# - start_hour: 17
# end_hour: 22
# reason: "evening_error_peak"
# - start_hour: 2
# end_hour: 5
# reason: "overnight_api_instability"
# =========================================================================
def _hour_in_window(hour: int, start: int, end: int) -> bool:
"""Check if hour falls in [start, end) window, handling midnight wrap."""
if start <= end:
return start <= hour < end
else:
# Wraps midnight: e.g., 22-06
return hour >= start or hour < end
def resolve_cron_model(
base_model: str,
routing_config: Optional[Dict[str, Any]],
now: Optional[datetime] = None,
) -> Dict[str, Any]:
"""Apply time-aware model override for cron jobs.
During configured high-error windows, returns a stronger model config.
Outside windows, returns the base model unchanged.
Args:
base_model: The model string already resolved (from job/config/env).
routing_config: The cron_model_routing dict from config.yaml.
now: Override current time (for testing). Defaults to datetime.now().
Returns:
Dict with keys: model, provider, overridden, reason.
- model: the effective model string to use
- provider: provider override (empty string = use default)
- overridden: True if time-based override was applied
- reason: why override was applied (empty string if not)
"""
cfg = routing_config or {}
if not _coerce_bool(cfg.get("enabled"), False):
return {"model": base_model, "provider": "", "overridden": False, "reason": ""}
windows = cfg.get("windows") or []
if not isinstance(windows, list) or not windows:
return {"model": base_model, "provider": "", "overridden": False, "reason": ""}
current = now or datetime.now()
current_hour = current.hour
matched_window = None
for window in windows:
if not isinstance(window, dict):
continue
start = _coerce_int(window.get("start_hour"), -1)
end = _coerce_int(window.get("end_hour"), -1)
if start < 0 or end < 0:
continue
if _hour_in_window(current_hour, start, end):
matched_window = window
break
if not matched_window:
return {"model": base_model, "provider": "", "overridden": False, "reason": ""}
# Window matched — use the override model from window or global fallback
override_model = str(matched_window.get("model") or "").strip()
override_provider = str(matched_window.get("provider") or "").strip()
if not override_model:
override_model = str(cfg.get("fallback_model") or "").strip()
if not override_provider:
override_provider = str(cfg.get("fallback_provider") or "").strip()
if not override_model:
return {"model": base_model, "provider": "", "overridden": False, "reason": ""}
reason = str(matched_window.get("reason") or "time_window").strip()
return {
"model": override_model,
"provider": override_provider,
"overridden": True,
"reason": f"cron_routing:{reason}(hour={current_hour})",
}
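The half-open window test and the first-match-wins rule from the hunk above can be exercised in isolation. The following is a minimal sketch, not the project's code: `hour_in_window` mirrors the logic of `_hour_in_window`, while `first_matching_reason` and the `WINDOWS` list are hypothetical names introduced only for illustration.

```python
from __future__ import annotations

from datetime import datetime


def hour_in_window(hour: int, start: int, end: int) -> bool:
    """Return True if hour lies in the half-open window [start, end)."""
    if start <= end:
        return start <= hour < end
    # start > end means the window crosses midnight, e.g. 22 -> 6.
    return hour >= start or hour < end


def first_matching_reason(now: datetime, windows: list[dict]) -> str | None:
    """Return the reason of the first window containing now.hour, else None."""
    for window in windows:
        if hour_in_window(now.hour, window["start_hour"], window["end_hour"]):
            return window.get("reason", "time_window")
    return None


# Hypothetical configuration: one plain window and one that wraps midnight.
WINDOWS = [
    {"start_hour": 17, "end_hour": 22, "reason": "evening_error_peak"},
    {"start_hour": 22, "end_hour": 6, "reason": "overnight"},
]

if __name__ == "__main__":
    for hour in (9, 17, 21, 22, 3, 6):
        print(hour, first_matching_reason(datetime(2026, 4, 13, hour), WINDOWS))
```

Because the end hour is exclusive, adjacent windows such as 17-22 and 22-6 can share a boundary without overlapping.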

cli.py (192 changes)

@@ -3134,6 +3134,196 @@ class HermesCLI:
print(f" Home: {display}")
print()
def _handle_debug_command(self, command: str):
"""Generate a debug report with system info and logs, upload to paste service."""
import platform
import sys
import time as _time
# Parse optional lines argument
parts = command.split(maxsplit=1)
log_lines = 50
if len(parts) > 1:
try:
log_lines = min(int(parts[1]), 500)
except ValueError:
pass
_cprint(" Collecting debug info...")
# Collect system info
lines = []
lines.append("=== HERMES DEBUG REPORT ===")
lines.append(f"Generated: {_time.strftime('%Y-%m-%d %H:%M:%S %z')}")
lines.append("")
lines.append("--- System ---")
lines.append(f"Python: {sys.version}")
lines.append(f"Platform: {platform.platform()}")
lines.append(f"Architecture: {platform.machine()}")
lines.append(f"Hostname: {platform.node()}")
lines.append("")
# Hermes info
lines.append("--- Hermes ---")
try:
from hermes_constants import get_hermes_home, display_hermes_home
lines.append(f"Home: {display_hermes_home()}")
except Exception:
lines.append("Home: unknown")
try:
from hermes_constants import __version__
lines.append(f"Version: {__version__}")
except Exception:
lines.append("Version: unknown")
lines.append(f"Profile: {getattr(self, '_profile_name', 'default')}")
lines.append(f"Session: {self.session_id}")
lines.append(f"Model: {self.model}")
lines.append(f"Provider: {getattr(self, '_provider_name', 'unknown')}")
try:
lines.append(f"Working dir: {os.getcwd()}")
except Exception:
pass
# Config (redacted)
lines.append("")
lines.append("--- Config (redacted) ---")
try:
from hermes_constants import get_hermes_home
config_path = get_hermes_home() / "config.yaml"
if config_path.exists():
import yaml
with open(config_path) as f:
cfg = yaml.safe_load(f) or {}
# Redact secrets
for key in ("api_key", "token", "secret", "password"):
if key in cfg:
cfg[key] = "***REDACTED***"
lines.append(yaml.dump(cfg, default_flow_style=False)[:2000])
else:
lines.append("(no config file found)")
except Exception as e:
lines.append(f"(error reading config: {e})")
# Recent logs
lines.append("")
lines.append(f"--- Recent Logs (last {log_lines} lines) ---")
try:
from hermes_constants import get_hermes_home
log_dir = get_hermes_home() / "logs"
if log_dir.exists():
for log_file in sorted(log_dir.glob("*.log")):
try:
content = log_file.read_text(encoding="utf-8", errors="replace")
tail = content.strip().split("\n")[-log_lines:]
if tail:
lines.append(f"\n[{log_file.name}]")
lines.extend(tail)
except Exception:
pass
else:
lines.append("(no logs directory)")
except Exception:
lines.append("(error reading logs)")
# Tool info
lines.append("")
lines.append("--- Enabled Toolsets ---")
try:
lines.append(", ".join(self.enabled_toolsets) if self.enabled_toolsets else "(none)")
except Exception:
lines.append("(unknown)")
report = "\n".join(lines)
report_size = len(report)
# Try to upload to paste services
paste_url = None
services = [
("dpaste", _upload_dpaste),
("0x0.st", _upload_0x0st),
]
for name, uploader in services:
try:
url = uploader(report)
if url:
paste_url = url
break
except Exception:
continue
print()
if paste_url:
_cprint(f" Debug report uploaded: {paste_url}")
_cprint(f" Size: {report_size} bytes, {len(lines)} lines")
else:
# Fallback: save locally
try:
from hermes_constants import get_hermes_home
debug_path = get_hermes_home() / "debug-report.txt"
debug_path.write_text(report, encoding="utf-8")
_cprint(f" Paste services unavailable. Report saved to: {debug_path}")
_cprint(f" Size: {report_size} bytes, {len(lines)} lines")
except Exception as e:
_cprint(f" Failed to save report: {e}")
_cprint(f" Report ({report_size} bytes):")
print(report)
print()
def _upload_dpaste(content: str) -> str | None:
"""Upload content to dpaste.org. Returns URL or None."""
import urllib.request
import urllib.parse
data = urllib.parse.urlencode({
"content": content,
"syntax": "text",
"expiry_days": 7,
}).encode()
req = urllib.request.Request(
"https://dpaste.org/api/",
data=data,
headers={"User-Agent": "hermes-agent/debug"},
)
with urllib.request.urlopen(req, timeout=10) as resp:
url = resp.read().decode().strip()
if url.startswith("http"):
return url
return None
def _upload_0x0st(content: str) -> str | None:
"""Upload content to 0x0.st. Returns URL or None."""
import urllib.request
import io
# 0x0.st expects multipart form with a file field
boundary = "----HermesDebugBoundary"
body = (
f"--{boundary}\r\n"
f'Content-Disposition: form-data; name="file"; filename="debug.txt"\r\n'
f"Content-Type: text/plain\r\n\r\n"
f"{content}\r\n"
f"--{boundary}--\r\n"
).encode()
req = urllib.request.Request(
"https://0x0.st",
data=body,
headers={
"Content-Type": f"multipart/form-data; boundary={boundary}",
"User-Agent": "hermes-agent/debug",
},
)
with urllib.request.urlopen(req, timeout=10) as resp:
url = resp.read().decode().strip()
if url.startswith("http"):
return url
return None
def show_config(self):
"""Display current configuration with kawaii ASCII art."""
# Get terminal config from environment (which was set from cli-config.yaml)
@@ -4321,6 +4511,8 @@ class HermesCLI:
self.show_help()
elif canonical == "profile":
self._handle_profile_command()
elif canonical == "debug":
self._handle_debug_command(cmd_original)
elif canonical == "tools":
self._handle_tools_command(cmd_original)
elif canonical == "toolsets":
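The redaction loop in `_handle_debug_command` above checks only top-level keys of the parsed config, so a secret nested under another mapping (for example a key inside a `model:` block) would pass through to the uploaded report. A minimal sketch of a recursive alternative follows. The helper name `redact`, the suffix-matching rule, and the sample keys are assumptions made for illustration and are not part of the diff.

```python
from __future__ import annotations

from typing import Any

SENSITIVE_NAMES = ("api_key", "token", "secret", "password")
REDACTED = "***REDACTED***"


def _is_sensitive(key: Any) -> bool:
    """Match 'token' and 'auth_token' but not 'max_tokens'."""
    name = str(key).lower()
    return any(name == s or name.endswith("_" + s) for s in SENSITIVE_NAMES)


def redact(value: Any) -> Any:
    """Return a copy of value with sensitive mapping entries masked at any depth."""
    if isinstance(value, dict):
        return {
            key: REDACTED if _is_sensitive(key) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


if __name__ == "__main__":
    sample = {
        "model": {"provider": "openrouter", "api_key": "sk-example"},
        "max_tokens": 4096,
        "providers": [{"name": "x", "auth_token": "abc"}],
    }
    print(redact(sample))
```

The function builds a new structure instead of mutating its input, so the caller's loaded config stays intact. Exact or suffix matching is a deliberate choice here: plain substring matching would also mask harmless keys such as `max_tokens`.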


@@ -718,6 +718,22 @@ def run_job(job: dict) -> tuple[bool, str, str, Optional[str]]:
# Reasoning config from env or config.yaml
from hermes_constants import parse_reasoning_effort
# Time-aware cron model routing — override model during high-error windows
try:
from agent.smart_model_routing import resolve_cron_model
_cron_routing_cfg = (_cfg.get("cron_model_routing") or {})
_cron_route = resolve_cron_model(model, _cron_routing_cfg)
if _cron_route["overridden"]:
_original_model = model
model = _cron_route["model"]
logger.info(
"Job '%s': cron model override %s -> %s (%s)",
job_id, _original_model, model, _cron_route["reason"],
)
except Exception as _e:
logger.debug("Job '%s': cron model routing skipped: %s", job_id, _e)
effort = os.getenv("HERMES_REASONING_EFFORT", "")
if not effort:
effort = str(_cfg.get("agent", {}).get("reasoning_effort", "")).strip()
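The hunk above, as shown, reads only `model` and `reason` from the dict returned by `resolve_cron_model`. Whether the returned `provider` is applied elsewhere in `run_job` is not visible in this diff. The sketch below shows one way a caller could consume both fields while keeping the same fail-open behaviour. `apply_cron_route` and the injected `resolver` parameter are hypothetical names, not part of the project.

```python
from __future__ import annotations

import logging
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger("cron.routing.sketch")

# Any callable with the same shape as resolve_cron_model(base_model, routing_config).
Resolver = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def apply_cron_route(
    model: str,
    provider: str,
    routing_cfg: Dict[str, Any],
    resolver: Resolver,
) -> Tuple[str, str]:
    """Return the (model, provider) pair a job should run with."""
    try:
        route = resolver(model, routing_cfg)
    except Exception as exc:
        # Routing is optional: any failure falls back to the base settings.
        logger.debug("cron model routing skipped: %s", exc)
        return model, provider
    if not route.get("overridden"):
        return model, provider
    logger.info(
        "cron model override %s -> %s (%s)",
        model, route["model"], route.get("reason", ""),
    )
    # An empty provider string means "keep the current provider".
    return route["model"], route.get("provider") or provider
```

Passing the resolver as an argument keeps the helper testable without importing the scheduler or reading a real config file.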

model-watchdog.py (new file, 286 changes)

@@ -0,0 +1,286 @@
#!/usr/bin/env python3
"""
Model Watchdog — monitors tmux panes for model drift.
Checks all hermes TUI sessions in dev and timmy tmux sessions.
If any pane is running a non-mimo model, kills and restarts it.
Usage: python3 ~/.hermes/bin/model-watchdog.py [--fix]
--fix Actually restart drifted panes (default: dry-run)
"""
import subprocess
import sys
import re
import time
import os
ALLOWED_MODEL = "mimo-v2-pro"
# Profile -> expected model. If a pane is running this profile with this model, it's healthy.
# Profiles not in this map are checked against ALLOWED_MODEL.
PROFILE_MODELS = {
"default": "mimo-v2-pro",
"timmy-sprint": "mimo-v2-pro",
"fenrir": "mimo-v2-pro",
"bezalel": "gpt-5.4",
"burn": "mimo-v2-pro",
"creative": "claude-sonnet",
"research": "claude-sonnet",
"review": "claude-sonnet",
}
TMUX_SESSIONS = ["dev", "timmy"]
LOG_FILE = os.path.expanduser("~/.hermes/logs/model-watchdog.log")
def log(msg):
os.makedirs(os.path.dirname(LOG_FILE), exist_ok=True)
ts = time.strftime("%Y-%m-%d %H:%M:%S")
line = f"[{ts}] {msg}"
print(line)
with open(LOG_FILE, "a") as f:
f.write(line + "\n")
def run(cmd):
r = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=10)
return r.stdout.strip(), r.returncode
def get_panes(session):
"""Get all pane info from ALL windows in a tmux session."""
# First get all windows
win_out, win_rc = run(f"tmux list-windows -t {session} -F '#{{window_name}}' 2>/dev/null")
if win_rc != 0:
return []
panes = []
for window_name in win_out.split("\n"):
if not window_name.strip():
continue
target = f"{session}:{window_name}"
out, rc = run(f"tmux list-panes -t {target} -F '#{{pane_index}}|#{{pane_pid}}|#{{pane_tty}}' 2>/dev/null")
if rc != 0:
continue
for line in out.split("\n"):
if "|" in line:
idx, pid, tty = line.split("|")
panes.append({
"session": session,
"window": window_name,
"index": int(idx),
"pid": int(pid),
"tty": tty,
})
return panes
def get_hermes_pid_for_tty(tty):
"""Find hermes process running on a specific TTY."""
out, _ = run(f"ps aux | grep '{tty}' | grep '[h]ermes' | grep -v 'gateway' | grep -v 'node' | awk '{{print $2}}'")
if out:
return int(out.split("\n")[0])
return None
def get_model_from_pane(session, pane_idx, window=None):
"""Capture the pane and extract the model from the status bar."""
target = f"{session}:{window}.{pane_idx}" if window else f"{session}.{pane_idx}"
out, _ = run(f"tmux capture-pane -t {target} -p 2>/dev/null | tail -30")
# Look for model in status bar: ⚕ model-name │
matches = re.findall(r'\s+(\S+)\s+│', out)
if matches:
return matches[0]
return None
def check_session_meta(session_id):
"""Check what model a hermes session was last using from its session file."""
import json
session_file = os.path.expanduser(f"~/.hermes/sessions/session_{session_id}.json")
if os.path.exists(session_file):
try:
with open(session_file) as f:
data = json.load(f)
return data.get("model"), data.get("provider")
except Exception:
pass
# Try jsonl
jsonl_file = os.path.expanduser(f"~/.hermes/sessions/{session_id}.jsonl")
if os.path.exists(jsonl_file):
try:
with open(jsonl_file) as f:
for line in f:
d = json.loads(line.strip())
if d.get("role") == "session_meta":
return d.get("model"), d.get("provider")
break
except Exception:
pass
return None, None
def is_drifted(model_name, profile=None):
"""Check if a model name indicates drift from the expected model for this profile."""
if model_name is None:
return False, "no-model-detected"
# If we know the profile, check against its expected model
if profile and profile in PROFILE_MODELS:
expected = PROFILE_MODELS[profile]
if expected in model_name:
return False, model_name
return True, model_name
# No profile known — fall back to ALLOWED_MODEL
if ALLOWED_MODEL in model_name:
return False, model_name
return True, model_name
def get_profile_from_pane(tty):
"""Detect which hermes profile a pane is running by inspecting its process args."""
# ps shows short TTY (s031) not full path (/dev/ttys031)
short_tty = tty.replace("/dev/ttys", "s")
out, _ = run(f"ps aux | grep '{short_tty}' | grep '[h]ermes' | grep -v 'gateway' | grep -v 'node' | grep -v cron")
if not out:
return None
# Look for -p <profile> in the command line
match = re.search(r'-p\s+(\S+)', out)
if match:
return match.group(1)
return None
def kill_and_restart(session, pane_idx, window=None):
"""Kill the hermes process in a pane and restart it with the same profile."""
target = f"{session}:{window}.{pane_idx}" if window else f"{session}.{pane_idx}"
# Get the pane's TTY
out, _ = run(f"tmux list-panes -t {target} -F '#{{pane_tty}}'")
tty = out.strip()
# Detect which profile was running
profile = get_profile_from_pane(tty)
# Find and kill hermes on that TTY
hermes_pid = get_hermes_pid_for_tty(tty)
if hermes_pid:
log(f"Killing hermes PID {hermes_pid} on {target} (tty={tty}, profile={profile})")
run(f"kill {hermes_pid}")
time.sleep(2)
# Send Ctrl+C to clear any state
run(f"tmux send-keys -t {target} C-c")
time.sleep(1)
# Restart hermes with the same profile
if profile:
cmd = f"hermes -p {profile} chat"
else:
cmd = "hermes chat"
run(f"tmux send-keys -t {target} '{cmd}' Enter")
log(f"Restarted hermes in {target} with: {cmd}")
# Wait and verify
time.sleep(8)
new_model = get_model_from_pane(session, pane_idx, window)
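# Verify against the model expected for this profile, not only the global default
expected = PROFILE_MODELS.get(profile, ALLOWED_MODEL)
if new_model and expected in new_model: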
log(f"{target} now on {new_model}")
return True
else:
log(f"{target} model after restart: {new_model}")
return False
def verify_expected_model(provider_yaml, expected):
"""Compare actual provider in a YAML config against expected value."""
return provider_yaml.strip() == expected.strip()
def check_config_drift():
"""Scan all relevant config.yaml files for provider drift. Does NOT modify anything.
Returns list of drift issues found."""
issues = []
CONFIGS = {
"main_config": (os.path.expanduser("~/.hermes/config.yaml"), "nous"),
"fenrir": (os.path.expanduser("~/.hermes/profiles/fenrir/config.yaml"), "nous"),
"timmy_sprint": (os.path.expanduser("~/.hermes/profiles/timmy-sprint/config.yaml"), "nous"),
"default_profile": (os.path.expanduser("~/.hermes/profiles/default/config.yaml"), "nous"),
}
for name, (path, expected_provider) in CONFIGS.items():
if not os.path.exists(path):
continue
try:
with open(path, "r") as f:
content = f.read()
# Parse YAML to correctly read model.provider (not the first provider: line)
try:
import yaml
cfg = yaml.safe_load(content) or {}
except ImportError:
# Fallback: find provider under model: block via indentation-aware scan
cfg = {}
in_model = False
for line in content.split("\n"):
stripped = line.strip()
indent = len(line) - len(line.lstrip())
if stripped.startswith("model:") and indent == 0:
in_model = True
continue
if in_model and indent == 0 and stripped:
in_model = False
if in_model and stripped.startswith("provider:"):
cfg = {"model": {"provider": stripped.split(":", 1)[1].strip()}}
break
actual = (cfg.get("model") or {}).get("provider", "")
if actual and expected_provider and actual != expected_provider:
issues.append(f"CONFIG DRIFT [{name}]: provider is '{actual}' (expected '{expected_provider}')")
except Exception as e:
issues.append(f"CONFIG CHECK ERROR [{name}]: {e}")
return issues
def main():
fix_mode = "--fix" in sys.argv
drift_found = False
issues = []
# Always check config files for provider drift (read-only, never writes)
config_drift_issues = check_config_drift()
if config_drift_issues:
for issue in config_drift_issues:
# Issue strings already carry a "CONFIG DRIFT [...]" or "CONFIG CHECK ERROR [...]" prefix
log(issue)
for session in TMUX_SESSIONS:
panes = get_panes(session)
for pane in panes:
window = pane.get("window")
target = f"{session}:{window}.{pane['index']}" if window else f"{session}.{pane['index']}"
# Detect profile from running process
out, _ = run(f"tmux list-panes -t {target} -F '#{{pane_tty}}'")
tty = out.strip()
profile = get_profile_from_pane(tty)
model = get_model_from_pane(session, pane["index"], window)
drifted, model_name = is_drifted(model, profile)
if drifted:
drift_found = True
issues.append(f"{target}: {model_name} (profile={profile})")
log(f"DRIFT DETECTED: {target} is on '{model_name}' (profile={profile}, expected='{PROFILE_MODELS.get(profile, ALLOWED_MODEL)}')")
if fix_mode:
log(f"Auto-fixing {target}...")
success = kill_and_restart(session, pane["index"], window)
if not success:
issues.append(f" ↳ RESTART FAILED for {target}")
if not drift_found:
total = sum(len(get_panes(s)) for s in TMUX_SESSIONS)
log(f"All {total} panes healthy (on {ALLOWED_MODEL})")
# Print summary for cron output
if issues or config_drift_issues:
print("\n=== MODEL DRIFT REPORT ===")
for issue in issues:
print(f" [PANE] {issue}")
if config_drift_issues:
for issue in config_drift_issues:
print(f" [CONFIG] {issue}")
if not fix_mode:
print("\nRun with --fix to auto-restart drifted panes.")
return 1
return 0
if __name__ == "__main__":
sys.exit(main())
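The comment in `check_config_drift` notes that the check must read `model.provider` and not the first `provider:` line in the file. The sketch below contrasts the two approaches using only the standard library, in the spirit of the indentation-aware fallback above. The function names and the `auxiliary:` block in the sample are assumptions for illustration. A real YAML parser remains the more robust choice, since this scan ignores inline comments and deeper nesting.

```python
from __future__ import annotations


def provider_under_model(content: str) -> str:
    """Return the provider nested under a top-level model: block, or ""."""
    in_model = False
    for line in content.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            # Every top-level key either opens or closes the model block.
            in_model = stripped.startswith("model:")
            continue
        if in_model and stripped.startswith("provider:"):
            return stripped.split(":", 1)[1].strip().strip("'\"")
    return ""


def first_provider_line(content: str) -> str:
    """Grep-style lookup: the first line that starts with provider:."""
    for line in content.splitlines():
        if line.strip().startswith("provider:"):
            return line.split(":", 1)[1].strip()
    return ""


# Hypothetical config in which another block mentions a provider first.
SAMPLE = """\
auxiliary:
  provider: openrouter
model:
  default: mimo-v2-pro
  provider: nous
"""

if __name__ == "__main__":
    print("grep-style:", first_provider_line(SAMPLE))
    print("model block:", provider_under_model(SAMPLE))
```

On this sample the grep-style lookup reports the auxiliary provider, which is the kind of false drift report the YAML-based check is meant to avoid.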


@@ -1001,30 +1001,10 @@ class AIAgent:
self._session_db = session_db
self._parent_session_id = parent_session_id
self._last_flushed_db_idx = 0 # tracks DB-write cursor to prevent duplicate writes
-if self._session_db:
-try:
-self._session_db.create_session(
-session_id=self.session_id,
-source=self.platform or os.environ.get("HERMES_SESSION_SOURCE", "cli"),
-model=self.model,
-model_config={
-"max_iterations": self.max_iterations,
-"reasoning_config": reasoning_config,
-"max_tokens": max_tokens,
-},
-user_id=None,
-parent_session_id=self._parent_session_id,
-)
-except Exception as e:
-# Transient SQLite lock contention (e.g. CLI and gateway writing
-# concurrently) must NOT permanently disable session_search for
-# this agent. Keep _session_db alive; subsequent message
-# flushes and session_search calls will still work once the
-# lock clears. The session row may be missing from the index
-# for this run, but that is recoverable (flushes upsert rows).
-logger.warning(
-"Session DB create_session failed (session_search still available): %s", e
-)
+# Lazy session creation: defer until first message flush (#314).
+# _flush_messages_to_session_db() calls ensure_session() which uses
+# INSERT OR IGNORE, creating the row only when messages arrive.
+# This eliminates 32% of sessions that are created but never used.
# In-memory todo list for task planning (one per agent/session)
from tools.todo_tool import TodoStore
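The lazy-creation pattern described in this hunk can be sketched with the standard `sqlite3` module. `SessionStore`, its table layout, and its method names are hypothetical stand-ins chosen for illustration. The project's real session DB class and schema are not shown in this diff.

```python
from __future__ import annotations

import sqlite3


class SessionStore:
    """Hypothetical stand-in for the session DB, not the project's class."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, model TEXT)"
        )
        conn.execute(
            "CREATE TABLE IF NOT EXISTS messages "
            "(session_id TEXT, role TEXT, content TEXT)"
        )

    def ensure_session(self, session_id: str, model: str) -> None:
        # INSERT OR IGNORE is idempotent: repeated calls keep the first row.
        self.conn.execute(
            "INSERT OR IGNORE INTO sessions (id, model) VALUES (?, ?)",
            (session_id, model),
        )

    def flush_messages(self, session_id: str, model: str, messages: list) -> None:
        if not messages:
            return  # nothing to write, so no session row is created
        self.ensure_session(session_id, model)
        self.conn.executemany(
            "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
            [(session_id, role, content) for role, content in messages],
        )
        self.conn.commit()

    def session_count(self) -> int:
        return self.conn.execute("SELECT COUNT(*) FROM sessions").fetchone()[0]


if __name__ == "__main__":
    store = SessionStore(sqlite3.connect(":memory:"))
    store.flush_messages("s1", "mimo", [])                 # agent never spoke
    store.flush_messages("s1", "mimo", [("user", "hi")])   # first real message
    print(store.session_count())
```

An agent that is constructed but never flushes a message leaves no row behind, which is the effect the commit message attributes to removing the eager `create_session()` call.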


@@ -0,0 +1,128 @@
"""Tests for time-aware cron model routing — Issue #317."""
import pytest
from datetime import datetime
from agent.smart_model_routing import resolve_cron_model, _hour_in_window
class TestHourInWindow:
"""Hour-in-window detection including midnight wrap."""
def test_normal_window(self):
assert _hour_in_window(18, 17, 22) is True
assert _hour_in_window(16, 17, 22) is False
assert _hour_in_window(22, 17, 22) is False
def test_midnight_wrap(self):
assert _hour_in_window(23, 22, 6) is True
assert _hour_in_window(3, 22, 6) is True
assert _hour_in_window(10, 22, 6) is False
def test_edge_cases(self):
assert _hour_in_window(0, 0, 24) is True
assert _hour_in_window(23, 0, 24) is True
assert _hour_in_window(0, 22, 6) is True
assert _hour_in_window(5, 22, 6) is True
assert _hour_in_window(6, 22, 6) is False
class TestResolveCronModel:
"""Time-aware model resolution for cron jobs."""
def _config(self, **overrides):
base = {
"enabled": True,
"fallback_model": "anthropic/claude-sonnet-4",
"fallback_provider": "openrouter",
"windows": [
{"start_hour": 17, "end_hour": 22, "reason": "evening_error_peak"},
],
}
base.update(overrides)
return base
def test_disabled_returns_base(self):
result = resolve_cron_model("mimo", {"enabled": False}, now=datetime(2026, 4, 12, 18, 0))
assert result["model"] == "mimo"
assert result["overridden"] is False
def test_no_config_returns_base(self):
result = resolve_cron_model("mimo", None)
assert result["model"] == "mimo"
assert result["overridden"] is False
def test_no_windows_returns_base(self):
result = resolve_cron_model("mimo", {"enabled": True, "windows": []}, now=datetime(2026, 4, 12, 18, 0))
assert result["overridden"] is False
def test_evening_window_overrides(self):
result = resolve_cron_model("mimo", self._config(), now=datetime(2026, 4, 12, 18, 0))
assert result["model"] == "anthropic/claude-sonnet-4"
assert result["provider"] == "openrouter"
assert result["overridden"] is True
assert "evening_error_peak" in result["reason"]
assert "hour=18" in result["reason"]
def test_outside_window_keeps_base(self):
result = resolve_cron_model("mimo", self._config(), now=datetime(2026, 4, 12, 9, 0))
assert result["model"] == "mimo"
assert result["overridden"] is False
def test_window_boundary_start_inclusive(self):
result = resolve_cron_model("mimo", self._config(), now=datetime(2026, 4, 12, 17, 0))
assert result["overridden"] is True
def test_window_boundary_end_exclusive(self):
result = resolve_cron_model("mimo", self._config(), now=datetime(2026, 4, 12, 22, 0))
assert result["overridden"] is False
def test_midnight_window(self):
config = self._config(windows=[{"start_hour": 22, "end_hour": 6, "reason": "overnight"}])
assert resolve_cron_model("mimo", config, now=datetime(2026, 4, 12, 23, 0))["overridden"] is True
assert resolve_cron_model("mimo", config, now=datetime(2026, 4, 13, 3, 0))["overridden"] is True
assert resolve_cron_model("mimo", config, now=datetime(2026, 4, 12, 10, 0))["overridden"] is False
def test_per_window_model_override(self):
config = self._config(windows=[{
"start_hour": 17, "end_hour": 22,
"model": "anthropic/claude-opus-4-6", "provider": "anthropic", "reason": "peak",
}])
result = resolve_cron_model("mimo", config, now=datetime(2026, 4, 12, 18, 0))
assert result["model"] == "anthropic/claude-opus-4-6"
assert result["provider"] == "anthropic"
def test_first_matching_window_wins(self):
config = self._config(windows=[
{"start_hour": 17, "end_hour": 20, "model": "strong-1", "provider": "p1", "reason": "w1"},
{"start_hour": 19, "end_hour": 22, "model": "strong-2", "provider": "p2", "reason": "w2"},
])
result = resolve_cron_model("mimo", config, now=datetime(2026, 4, 12, 19, 0))
assert result["model"] == "strong-1"
def test_no_fallback_model_keeps_base(self):
config = {"enabled": True, "windows": [{"start_hour": 17, "end_hour": 22, "reason": "test"}]}
result = resolve_cron_model("mimo", config, now=datetime(2026, 4, 12, 18, 0))
assert result["overridden"] is False
assert result["model"] == "mimo"
def test_malformed_windows_skipped(self):
config = self._config(windows=[
"not-a-dict",
{"start_hour": 17},
{"end_hour": 22},
{"start_hour": "bad", "end_hour": "bad"},
{"start_hour": 17, "end_hour": 22, "reason": "valid"},
])
result = resolve_cron_model("mimo", config, now=datetime(2026, 4, 12, 18, 0))
assert result["overridden"] is True
assert "valid" in result["reason"]
def test_multiple_windows_coverage(self):
config = self._config(windows=[
{"start_hour": 17, "end_hour": 22, "reason": "evening"},
{"start_hour": 2, "end_hour": 5, "reason": "overnight"},
])
assert resolve_cron_model("mimo", config, now=datetime(2026, 4, 12, 20, 0))["overridden"] is True
assert resolve_cron_model("mimo", config, now=datetime(2026, 4, 13, 3, 0))["overridden"] is True
assert resolve_cron_model("mimo", config, now=datetime(2026, 4, 12, 10, 0))["overridden"] is False


@@ -0,0 +1,298 @@
"""Tests for poka-yoke skill edit revert and validate action."""
import json
import os
import shutil
import tempfile
from pathlib import Path
from unittest.mock import patch
import pytest
@pytest.fixture()
def isolated_skills_dir(tmp_path, monkeypatch):
"""Point SKILLS_DIR at a temp directory for test isolation."""
skills_dir = tmp_path / "skills"
skills_dir.mkdir()
monkeypatch.setattr("tools.skill_manager_tool.SKILLS_DIR", skills_dir)
monkeypatch.setattr("tools.skills_tool.SKILLS_DIR", skills_dir)
# Also patch skill discovery so _find_skill and validate look in our temp dir
monkeypatch.setattr(
"agent.skill_utils.get_all_skills_dirs",
lambda: [skills_dir],
)
return skills_dir
_VALID_SKILL = """\
---
name: test-skill
description: A test skill for unit tests.
---
# Test Skill
Instructions here.
"""
def _create_test_skill(skills_dir: Path, name: str = "test-skill", content: str = _VALID_SKILL):
skill_dir = skills_dir / name
skill_dir.mkdir(parents=True, exist_ok=True)
(skill_dir / "SKILL.md").write_text(content)
return skill_dir
# ---------------------------------------------------------------------------
# _edit_skill revert on failure
# ---------------------------------------------------------------------------
class TestEditRevert:
def test_edit_preserves_original_on_invalid_frontmatter(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
bad_content = "---\nname: test-skill\n---\n" # missing description
result = json.loads(skill_manage("edit", "test-skill", content=bad_content))
assert result["success"] is False
assert "Original file preserved" in result["error"]
# Original should be untouched
original = (isolated_skills_dir / "test-skill" / "SKILL.md").read_text()
assert "A test skill" in original
def test_edit_preserves_original_on_empty_body(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
bad_content = "---\nname: test-skill\ndescription: ok\n---\n"
result = json.loads(skill_manage("edit", "test-skill", content=bad_content))
assert result["success"] is False
assert "Original file preserved" in result["error"]
original = (isolated_skills_dir / "test-skill" / "SKILL.md").read_text()
assert "Instructions here" in original
def test_edit_reverts_on_write_error(self, isolated_skills_dir, monkeypatch):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
def boom(*a, **kw):
raise OSError("disk full")
monkeypatch.setattr("tools.skill_manager_tool._atomic_write_text", boom)
result = json.loads(skill_manage("edit", "test-skill", content=_VALID_SKILL))
assert result["success"] is False
assert "write error" in result["error"].lower()
assert "Original file preserved" in result["error"]
def test_edit_reverts_on_security_scan_block(self, isolated_skills_dir, monkeypatch):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
monkeypatch.setattr(
"tools.skill_manager_tool._security_scan_skill",
lambda path: "Blocked: suspicious content",
)
new_content = "---\nname: test-skill\ndescription: updated\n---\n\n# Updated\n"
result = json.loads(skill_manage("edit", "test-skill", content=new_content))
assert result["success"] is False
assert "Original file preserved" in result["error"]
original = (isolated_skills_dir / "test-skill" / "SKILL.md").read_text()
assert "A test skill" in original
# ---------------------------------------------------------------------------
# _patch_skill revert on failure
# ---------------------------------------------------------------------------
class TestPatchRevert:
def test_patch_preserves_original_on_no_match(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
result = json.loads(skill_manage(
"patch", "test-skill",
old_string="NONEXISTENT_TEXT",
new_string="replacement",
))
assert result["success"] is False
assert "Original file preserved" in result["error"]
original = (isolated_skills_dir / "test-skill" / "SKILL.md").read_text()
assert "Instructions here" in original
def test_patch_preserves_original_on_broken_frontmatter(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
# Patch that would remove the required description field from the frontmatter
result = json.loads(skill_manage(
"patch", "test-skill",
old_string="description: A test skill for unit tests.",
new_string="", # removing description
))
assert result["success"] is False
assert "Original file preserved" in result["error"]
original = (isolated_skills_dir / "test-skill" / "SKILL.md").read_text()
assert "A test skill" in original
def test_patch_reverts_on_write_error(self, isolated_skills_dir, monkeypatch):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
def boom(*a, **kw):
raise OSError("disk full")
monkeypatch.setattr("tools.skill_manager_tool._atomic_write_text", boom)
result = json.loads(skill_manage(
"patch", "test-skill",
old_string="Instructions here.",
new_string="New instructions.",
))
assert result["success"] is False
assert "write error" in result["error"].lower()
assert "Original file preserved" in result["error"]
def test_patch_reverts_on_security_scan_block(self, isolated_skills_dir, monkeypatch):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
monkeypatch.setattr(
"tools.skill_manager_tool._security_scan_skill",
lambda path: "Blocked: malicious code",
)
result = json.loads(skill_manage(
"patch", "test-skill",
old_string="Instructions here.",
new_string="New instructions.",
))
assert result["success"] is False
assert "Original file preserved" in result["error"]
original = (isolated_skills_dir / "test-skill" / "SKILL.md").read_text()
assert "Instructions here" in original
def test_patch_successful_writes_new_content(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
result = json.loads(skill_manage(
"patch", "test-skill",
old_string="Instructions here.",
new_string="Updated instructions.",
))
assert result["success"] is True
content = (isolated_skills_dir / "test-skill" / "SKILL.md").read_text()
assert "Updated instructions" in content
assert "Instructions here" not in content
# ---------------------------------------------------------------------------
# _write_file revert on failure
# ---------------------------------------------------------------------------
class TestWriteFileRevert:
def test_write_file_reverts_on_security_scan_block(self, isolated_skills_dir, monkeypatch):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
monkeypatch.setattr(
"tools.skill_manager_tool._security_scan_skill",
lambda path: "Blocked: malicious",
)
result = json.loads(skill_manage(
"write_file", "test-skill",
file_path="references/notes.md",
file_content="# Some notes",
))
assert result["success"] is False
assert "Original file preserved" in result["error"]
# ---------------------------------------------------------------------------
# validate action
# ---------------------------------------------------------------------------
class TestValidateAction:
def test_validate_passes_on_good_skill(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir)
result = json.loads(skill_manage("validate", "test-skill"))
assert result["success"] is True
assert result["errors"] == 0
assert result["results"][0]["valid"] is True
def test_validate_finds_missing_description(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage
bad = "---\nname: bad-skill\n---\n\nBody here.\n"
_create_test_skill(isolated_skills_dir, name="bad-skill", content=bad)
result = json.loads(skill_manage("validate", "bad-skill"))
assert result["success"] is False
assert result["errors"] == 1
issues = result["results"][0]["issues"]
assert any("description" in i.lower() for i in issues)
def test_validate_finds_empty_body(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage
empty_body = "---\nname: empty-skill\ndescription: test\n---\n"
_create_test_skill(isolated_skills_dir, name="empty-skill", content=empty_body)
result = json.loads(skill_manage("validate", "empty-skill"))
assert result["success"] is False
issues = result["results"][0]["issues"]
assert any("empty body" in i.lower() for i in issues)
def test_validate_all_skills(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage
_create_test_skill(isolated_skills_dir, name="good-1")
_create_test_skill(isolated_skills_dir, name="good-2")
bad = "---\nname: bad\n---\n\nBody.\n"
_create_test_skill(isolated_skills_dir, name="bad", content=bad)
result = json.loads(skill_manage("validate", ""))
assert result["total"] == 3
assert result["errors"] == 1
def test_validate_nonexistent_skill(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage
result = json.loads(skill_manage("validate", "nonexistent"))
assert result["success"] is False
assert "not found" in result["error"].lower()
# ---------------------------------------------------------------------------
# Modification log
# ---------------------------------------------------------------------------
class TestModificationLog:
def test_edit_logs_on_success(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage, _MOD_LOG_FILE
_create_test_skill(isolated_skills_dir)
new = "---\nname: test-skill\ndescription: updated\n---\n\n# Updated\n"
skill_manage("edit", "test-skill", content=new)
assert _MOD_LOG_FILE.exists()
lines = _MOD_LOG_FILE.read_text().strip().split("\n")
entry = json.loads(lines[-1])
assert entry["action"] == "edit"
assert entry["success"] is True
assert entry["skill"] == "test-skill"
def test_patch_logs_on_failure(self, isolated_skills_dir):
from tools.skill_manager_tool import skill_manage, _MOD_LOG_FILE
_create_test_skill(isolated_skills_dir)
# Use a no-match patch to trigger a failure before any write happens.
skill_manage(
"patch", "test-skill",
old_string="NONEXISTENT",
new_string="replacement",
)
# No log entry is expected here: a match error makes patch return early,
# before the write, and the log only records write-side outcomes.
# Nothing was written, so there is nothing to log.
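
The comments above describe when the modification log fires. A minimal, self-contained sketch of that flow follows, under stated assumptions: `guarded_write`, `validate`, and `scan` are hypothetical names rather than functions from `tools/skill_manager_tool.py`, and a plain `write_text` stands in for the atomic write.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def guarded_write(
    target: Path,
    new_text: str,
    log_file: Path,
    validate: Callable[[str], Optional[str]],
    scan: Callable[[Path], Optional[str]],
) -> dict:
    """Hypothetical sketch: validate -> snapshot -> write -> scan -> revert."""
    # Validation failures return before any write, so nothing is logged.
    err = validate(new_text)
    if err:
        return {"success": False, "error": f"{err} Original file preserved."}

    # Snapshot the current content so a blocked scan can be rolled back.
    original = target.read_text(encoding="utf-8") if target.exists() else None
    target.write_text(new_text, encoding="utf-8")

    scan_error = scan(target)
    if scan_error:
        if original is not None:
            target.write_text(original, encoding="utf-8")  # restore snapshot
        else:
            target.unlink(missing_ok=True)  # file was new: remove it

    # Only write-side outcomes (success or scan block) reach the log.
    entry = {"file": target.name, "success": scan_error is None}
    with open(log_file, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

    if scan_error:
        return {"success": False, "error": f"{scan_error} Original file preserved."}
    return {"success": True}
```

In this sketch a rejected validation leaves both the target and the log untouched, while a blocked scan restores the snapshot and appends one failure entry.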


@@ -40,10 +40,55 @@ import shutil
import tempfile
from pathlib import Path
from hermes_constants import get_hermes_home
from typing import Dict, Any, Optional
from typing import Dict, Any, Optional, Tuple
logger = logging.getLogger(__name__)
# Skill modification log file — stores before/after snapshots for audit trail
_MOD_LOG_DIR = get_hermes_home() / "cron" / "output"
_MOD_LOG_FILE = get_hermes_home() / "skills" / ".modification_log.jsonl"
def _log_skill_modification(
action: str,
skill_name: str,
target_file: str,
original_content: Optional[str],
new_content: Optional[str],
success: bool,
error: Optional[str] = None,
) -> None:
"""Log a skill modification with before/after snapshot for audit trail.
Appends JSONL entries to ~/.hermes/skills/.modification_log.jsonl.
Failures in logging are silently swallowed — logging must never
break the primary operation.
"""
try:
import time
entry = {
"timestamp": time.time(),
"action": action,
"skill": skill_name,
"file": target_file,
"success": success,
"original_len": len(original_content) if original_content else 0,
"new_len": len(new_content) if new_content else 0,
}
if error:
entry["error"] = error
# Truncate snapshots to 2048 characters each for log hygiene
if original_content:
entry["original_preview"] = original_content[:2048]
if new_content:
entry["new_preview"] = new_content[:2048]
_MOD_LOG_FILE.parent.mkdir(parents=True, exist_ok=True)
with open(_MOD_LOG_FILE, "a", encoding="utf-8") as f:
f.write(json.dumps(entry, ensure_ascii=False) + "\n")
except Exception:
logger.debug("Failed to write skill modification log", exc_info=True)
# Import security scanner — agent-created skills get the same scrutiny as
# community hub installs.
try:
@@ -92,11 +137,6 @@ VALID_NAME_RE = re.compile(r'^[a-z0-9][a-z0-9._-]*$')
ALLOWED_SUBDIRS = {"references", "templates", "scripts", "assets"}
def check_skill_manage_requirements() -> bool:
"""Skill management has no external requirements -- always available."""
return True
# =============================================================================
# Validation helpers
# =============================================================================
@@ -224,13 +264,15 @@ def _validate_file_path(file_path: str) -> Optional[str]:
Validate a file path for write_file/remove_file.
Must be under an allowed subdirectory and not escape the skill dir.
"""
from tools.path_security import has_traversal_component
if not file_path:
return "file_path is required."
normalized = Path(file_path)
# Prevent path traversal
if ".." in normalized.parts:
if has_traversal_component(file_path):
return "Path traversal ('..') is not allowed."
# Must be under an allowed subdirectory
@@ -245,6 +287,17 @@ def _validate_file_path(file_path: str) -> Optional[str]:
return None
def _resolve_skill_target(skill_dir: Path, file_path: str) -> Tuple[Optional[Path], Optional[str]]:
"""Resolve a supporting-file path and ensure it stays within the skill directory."""
from tools.path_security import validate_within_dir
target = skill_dir / file_path
error = validate_within_dir(target, skill_dir)
if error:
return None, error
return target, None
def _atomic_write_text(file_path: Path, content: str, encoding: str = "utf-8") -> None:
"""
Atomically write text content to a file.
@@ -339,31 +392,45 @@ def _create_skill(name: str, content: str, category: str = None) -> Dict[str, An
def _edit_skill(name: str, content: str) -> Dict[str, Any]:
"""Replace the SKILL.md of any existing skill (full rewrite)."""
"""Replace the SKILL.md of any existing skill (full rewrite).
Poka-yoke: validates before writing, uses atomic write, and reverts
to the original file on any failure.
"""
err = _validate_frontmatter(content)
if err:
return {"success": False, "error": err}
return {"success": False, "error": f"Edit failed: {err} Original file preserved."}
err = _validate_content_size(content)
if err:
return {"success": False, "error": err}
return {"success": False, "error": f"Edit failed: {err} Original file preserved."}
existing = _find_skill(name)
if not existing:
return {"success": False, "error": f"Skill '{name}' not found. Use skills_list() to see available skills."}
skill_md = existing["path"] / "SKILL.md"
# Back up original content for rollback
# Snapshot original for rollback
original_content = skill_md.read_text(encoding="utf-8") if skill_md.exists() else None
_atomic_write_text(skill_md, content)
try:
_atomic_write_text(skill_md, content)
except Exception as exc:
_log_skill_modification("edit", name, "SKILL.md", original_content, content, False, str(exc))
return {
"success": False,
"error": f"Edit failed: write error: {exc}. Original file preserved.",
}
# Security scan — roll back on block
scan_error = _security_scan_skill(existing["path"])
if scan_error:
if original_content is not None:
_atomic_write_text(skill_md, original_content)
return {"success": False, "error": scan_error}
_log_skill_modification("edit", name, "SKILL.md", original_content, content, False, scan_error)
return {"success": False, "error": f"Edit failed: {scan_error} Original file preserved."}
_log_skill_modification("edit", name, "SKILL.md", original_content, content, True)
return {
"success": True,
"message": f"Skill '{name}' updated.",
@@ -380,6 +447,9 @@ def _patch_skill(
) -> Dict[str, Any]:
"""Targeted find-and-replace within a skill file.
Poka-yoke: validates old_string matches BEFORE writing, validates the
result AFTER matching but BEFORE writing, and reverts on any failure.
Defaults to SKILL.md. Use file_path to patch a supporting file instead.
Requires a unique match unless replace_all is True.
"""
@@ -399,7 +469,9 @@ def _patch_skill(
err = _validate_file_path(file_path)
if err:
return {"success": False, "error": err}
target = skill_dir / file_path
target, err = _resolve_skill_target(skill_dir, file_path)
if err:
return {"success": False, "error": err}
else:
# Patching SKILL.md
target = skill_dir / "SKILL.md"
@@ -415,7 +487,7 @@ def _patch_skill(
# from exact-match failures on minor formatting mismatches.
from tools.fuzzy_match import fuzzy_find_and_replace
new_content, match_count, match_error = fuzzy_find_and_replace(
new_content, match_count, _strategy, match_error = fuzzy_find_and_replace(
content, old_string, new_string, replace_all
)
if match_error:
@@ -423,7 +495,7 @@ def _patch_skill(
preview = content[:500] + ("..." if len(content) > 500 else "")
return {
"success": False,
"error": match_error,
"error": f"Patch failed: {match_error} Original file preserved.",
"file_preview": preview,
}
@@ -431,7 +503,7 @@ def _patch_skill(
target_label = "SKILL.md" if not file_path else file_path
err = _validate_content_size(new_content, label=target_label)
if err:
return {"success": False, "error": err}
return {"success": False, "error": f"Patch failed: {err} Original file preserved."}
# If patching SKILL.md, validate frontmatter is still intact
if not file_path:
@@ -439,18 +511,27 @@ def _patch_skill(
if err:
return {
"success": False,
"error": f"Patch would break SKILL.md structure: {err}",
"error": f"Patch failed: would break SKILL.md structure: {err} Original file preserved.",
}
original_content = content # for rollback
_atomic_write_text(target, new_content)
try:
_atomic_write_text(target, new_content)
except Exception as exc:
_log_skill_modification("patch", name, target_label, original_content, new_content, False, str(exc))
return {
"success": False,
"error": f"Patch failed: write error: {exc}. Original file preserved.",
}
# Security scan — roll back on block
scan_error = _security_scan_skill(skill_dir)
if scan_error:
_atomic_write_text(target, original_content)
return {"success": False, "error": scan_error}
_log_skill_modification("patch", name, target_label, original_content, new_content, False, scan_error)
return {"success": False, "error": f"Patch failed: {scan_error} Original file preserved."}
_log_skill_modification("patch", name, target_label, original_content, new_content, True)
return {
"success": True,
"message": f"Patched {'SKILL.md' if not file_path else file_path} in skill '{name}' ({match_count} replacement{'s' if match_count > 1 else ''}).",
@@ -478,7 +559,10 @@ def _delete_skill(name: str) -> Dict[str, Any]:
def _write_file(name: str, file_path: str, file_content: str) -> Dict[str, Any]:
"""Add or overwrite a supporting file within any skill directory."""
"""Add or overwrite a supporting file within any skill directory.
Poka-yoke: reverts to original on failure.
"""
err = _validate_file_path(file_path)
if err:
return {"success": False, "error": err}
@@ -499,17 +583,27 @@ def _write_file(name: str, file_path: str, file_content: str) -> Dict[str, Any]:
}
err = _validate_content_size(file_content, label=file_path)
if err:
return {"success": False, "error": err}
return {"success": False, "error": f"Write failed: {err} Original file preserved."}
existing = _find_skill(name)
if not existing:
return {"success": False, "error": f"Skill '{name}' not found. Create it first with action='create'."}
target = existing["path"] / file_path
target, err = _resolve_skill_target(existing["path"], file_path)
if err:
return {"success": False, "error": err}
target.parent.mkdir(parents=True, exist_ok=True)
# Back up for rollback
# Snapshot for rollback
original_content = target.read_text(encoding="utf-8") if target.exists() else None
_atomic_write_text(target, file_content)
try:
_atomic_write_text(target, file_content)
except Exception as exc:
_log_skill_modification("write_file", name, file_path, original_content, file_content, False, str(exc))
return {
"success": False,
"error": f"Write failed: {exc}. Original file preserved.",
}
# Security scan — roll back on block
scan_error = _security_scan_skill(existing["path"])
@@ -518,8 +612,10 @@ def _write_file(name: str, file_path: str, file_content: str) -> Dict[str, Any]:
_atomic_write_text(target, original_content)
else:
target.unlink(missing_ok=True)
return {"success": False, "error": scan_error}
_log_skill_modification("write_file", name, file_path, original_content, file_content, False, scan_error)
return {"success": False, "error": f"Write failed: {scan_error} Original file preserved."}
_log_skill_modification("write_file", name, file_path, original_content, file_content, True)
return {
"success": True,
"message": f"File '{file_path}' written to skill '{name}'.",
@@ -538,7 +634,9 @@ def _remove_file(name: str, file_path: str) -> Dict[str, Any]:
return {"success": False, "error": f"Skill '{name}' not found."}
skill_dir = existing["path"]
target = skill_dir / file_path
target, err = _resolve_skill_target(skill_dir, file_path)
if err:
return {"success": False, "error": err}
if not target.exists():
# List what's actually there for the model to see
available = []
@@ -554,6 +652,8 @@ def _remove_file(name: str, file_path: str) -> Dict[str, Any]:
"available_files": available if available else None,
}
# Snapshot for potential undo
removed_content = target.read_text(encoding="utf-8")
target.unlink()
# Clean up empty subdirectories
@@ -561,12 +661,96 @@ def _remove_file(name: str, file_path: str) -> Dict[str, Any]:
if parent != skill_dir and parent.exists() and not any(parent.iterdir()):
parent.rmdir()
_log_skill_modification("remove_file", name, file_path, removed_content, None, True)
return {
"success": True,
"message": f"File '{file_path}' removed from skill '{name}'.",
}
def _validate_skill(name: str = None) -> Dict[str, Any]:
"""Validate one or all skills for structural integrity.
Checks: valid YAML frontmatter, non-empty body, required fields
(name, description), and file readability.
Pass name=None to validate all skills.
"""
from agent.skill_utils import get_all_skills_dirs
results = []
errors = 0
dirs_to_scan = get_all_skills_dirs()
for skills_dir in dirs_to_scan:
if not skills_dir.exists():
continue
for skill_md in skills_dir.rglob("SKILL.md"):
skill_name = skill_md.parent.name
if name and skill_name != name:
continue
issues = []
try:
content = skill_md.read_text(encoding="utf-8")
except Exception as exc:
issues.append(f"Cannot read file: {exc}")
results.append({"skill": skill_name, "path": str(skill_md), "valid": False, "issues": issues})
errors += 1
continue
# Check frontmatter
fm_err = _validate_frontmatter(content)
if fm_err:
issues.append(fm_err)
# Check YAML parse and required fields
if content.startswith("---"):
import re as _re
end_match = _re.search(r'\n---\s*\n', content[3:])
if end_match:
yaml_content = content[3:end_match.start() + 3]
try:
parsed = yaml.safe_load(yaml_content)
if isinstance(parsed, dict):
if not parsed.get("name"):
issues.append("Missing 'name' in frontmatter")
if not parsed.get("description"):
issues.append("Missing 'description' in frontmatter")
else:
issues.append("Frontmatter is not a YAML mapping")
except yaml.YAMLError as e:
issues.append(f"YAML parse error: {e}")
else:
issues.append("Frontmatter not properly closed")
else:
issues.append("File does not start with YAML frontmatter (---)")
# Check body is non-empty
if content.startswith("---"):
import re as _re
end_match = _re.search(r'\n---\s*\n', content[3:])
if end_match:
body = content[end_match.end() + 3:].strip()
if not body:
issues.append("Empty body after frontmatter")
valid = len(issues) == 0
if not valid:
errors += 1
results.append({"skill": skill_name, "path": str(skill_md), "valid": valid, "issues": issues})
if name and not results:
return {"success": False, "error": f"Skill '{name}' not found."}
return {
"success": errors == 0,
"total": len(results),
"errors": errors,
"results": results,
}
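
The frontmatter check and the body check in `_validate_skill` both locate the closing `---` with the same regular expression. As a sketch only, the split could be done once by a small helper; `split_frontmatter` is a hypothetical name and is not part of this change.

```python
import re
from typing import Optional, Tuple

# Same pattern as the validation code: a line holding only '---'.
_FM_END = re.compile(r"\n---\s*\n")


def split_frontmatter(content: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (yaml_text, body), or (None, None) if frontmatter is absent or unclosed."""
    if not content.startswith("---"):
        return None, None
    rest = content[3:]  # skip the opening '---'
    match = _FM_END.search(rest)
    if match is None:
        return None, None  # frontmatter never closed
    yaml_text = rest[:match.start()]
    body = rest[match.end():].strip()
    return yaml_text, body
```

A caller could then parse `yaml_text` and test `body` for emptiness without searching the content twice.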
# =============================================================================
# Main entry point
# =============================================================================
@@ -589,19 +773,19 @@ def skill_manage(
"""
if action == "create":
if not content:
return json.dumps({"success": False, "error": "content is required for 'create'. Provide the full SKILL.md text (frontmatter + body)."}, ensure_ascii=False)
return tool_error("content is required for 'create'. Provide the full SKILL.md text (frontmatter + body).", success=False)
result = _create_skill(name, content, category)
elif action == "edit":
if not content:
return json.dumps({"success": False, "error": "content is required for 'edit'. Provide the full updated SKILL.md text."}, ensure_ascii=False)
return tool_error("content is required for 'edit'. Provide the full updated SKILL.md text.", success=False)
result = _edit_skill(name, content)
elif action == "patch":
if not old_string:
return json.dumps({"success": False, "error": "old_string is required for 'patch'. Provide the text to find."}, ensure_ascii=False)
return tool_error("old_string is required for 'patch'. Provide the text to find.", success=False)
if new_string is None:
return json.dumps({"success": False, "error": "new_string is required for 'patch'. Use empty string to delete matched text."}, ensure_ascii=False)
return tool_error("new_string is required for 'patch'. Use empty string to delete matched text.", success=False)
result = _patch_skill(name, old_string, new_string, file_path, replace_all)
elif action == "delete":
@@ -609,18 +793,21 @@ def skill_manage(
elif action == "write_file":
if not file_path:
return json.dumps({"success": False, "error": "file_path is required for 'write_file'. Example: 'references/api-guide.md'"}, ensure_ascii=False)
return tool_error("file_path is required for 'write_file'. Example: 'references/api-guide.md'", success=False)
if file_content is None:
return json.dumps({"success": False, "error": "file_content is required for 'write_file'."}, ensure_ascii=False)
return tool_error("file_content is required for 'write_file'.", success=False)
result = _write_file(name, file_path, file_content)
elif action == "remove_file":
if not file_path:
return json.dumps({"success": False, "error": "file_path is required for 'remove_file'."}, ensure_ascii=False)
return tool_error("file_path is required for 'remove_file'.", success=False)
result = _remove_file(name, file_path)
elif action == "validate":
result = _validate_skill(name if name else None)
else:
result = {"success": False, "error": f"Unknown action '{action}'. Use: create, edit, patch, delete, write_file, remove_file"}
result = {"success": False, "error": f"Unknown action '{action}'. Use: create, edit, patch, delete, write_file, remove_file, validate"}
if result.get("success"):
try:
@@ -638,38 +825,40 @@ def skill_manage(
SKILL_MANAGE_SCHEMA = {
"name": "skill_manage",
"description": (
"Manage skills (create, update, delete). Skills are your procedural "
"memory reusable approaches for recurring task types. "
"New skills go to ~/.hermes/skills/; existing skills can be modified wherever they live.\n\n"
"Actions: create (full SKILL.md + optional category), "
"patch (old_string/new_string preferred for fixes), "
"edit (full SKILL.md rewrite major overhauls only), "
"delete, write_file, remove_file.\n\n"
"Create when: complex task succeeded (5+ calls), errors overcome, "
"user-corrected approach worked, non-trivial workflow discovered, "
"or user asks you to remember a procedure.\n"
"Update when: instructions stale/wrong, OS-specific failures, "
"missing steps or pitfalls found during use. "
"If you used a skill and hit issues not covered by it, patch it immediately.\n\n"
"After difficult/iterative tasks, offer to save as a skill. "
"Skip for simple one-offs. Confirm with user before creating/deleting.\n\n"
"Good skills: trigger conditions, numbered steps with exact commands, "
"pitfalls section, verification steps. Use skill_view() to see format examples."
),
"description": (
"Manage skills (create, update, delete, validate). Skills are your procedural "
"memory \u2014 reusable approaches for recurring task types. "
"New skills go to ~/.hermes/skills/; existing skills can be modified wherever they live.\n\n"
"Actions: create (full SKILL.md + optional category), "
"patch (old_string/new_string \u2014 preferred for fixes), "
"edit (full SKILL.md rewrite \u2014 major overhauls only), "
"delete, write_file, remove_file, "
"validate (check all skills for structural integrity).\n\n"
"Create when: complex task succeeded (5+ calls), errors overcome, "
"user-corrected approach worked, non-trivial workflow discovered, "
"or user asks you to remember a procedure.\n"
"Update when: instructions stale/wrong, OS-specific failures, "
"missing steps or pitfalls found during use. "
"If you used a skill and hit issues not covered by it, patch it immediately.\n\n"
"After difficult/iterative tasks, offer to save as a skill. "
"Skip for simple one-offs. Confirm with user before creating/deleting.\n\n"
"Good skills: trigger conditions, numbered steps with exact commands, "
"pitfalls section, verification steps. Use skill_view() to see format examples."
),
"parameters": {
"type": "object",
"properties": {
"action": {
"type": "string",
"enum": ["create", "patch", "edit", "delete", "write_file", "remove_file"],
"enum": ["create", "patch", "edit", "delete", "write_file", "remove_file", "validate"],
"description": "The action to perform."
},
"name": {
"type": "string",
"description": (
"Skill name (lowercase, hyphens/underscores, max 64 chars). "
"Must match an existing skill for patch/edit/delete/write_file/remove_file."
"Required for create/patch/edit/delete/write_file/remove_file. "
"Optional for validate: omit to check all skills, provide to check one."
)
},
"content": {
@@ -727,7 +916,7 @@ SKILL_MANAGE_SCHEMA = {
# --- Registry ---
from tools.registry import registry
from tools.registry import registry, tool_error
registry.register(
name="skill_manage",