Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 28s
Smoke Test / smoke (pull_request) Failing after 23s
Validate Config / YAML Lint (pull_request) Failing after 11s
Validate Config / JSON Validate (pull_request) Successful in 13s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 31s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Shell Script Lint (pull_request) Failing after 23s
Validate Config / Cron Syntax Check (pull_request) Successful in 5s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 7s
Validate Config / Playbook Schema Validation (pull_request) Successful in 15s
Validate Training Data / validate (pull_request) Failing after 18s
PR Checklist / pr-checklist (pull_request) Failing after 3m57s
Architecture Lint / Lint Repository (pull_request) Failing after 15s
- Moved 3 actively-used custom agent modules to timmy-config/extensions/
  - shield.py (crisis & jailbreak detection)
  - privacy_filter.py (PII stripping for remote API calls)
  - smart_model_routing.py (cheap vs strong model heuristic)
- Updated deploy.sh to copy extensions/ to ~/.hermes/agent/
- Added comprehensive migration doc in docs/AGENT_EXTENSION_MIGRATION.md
- Archived 16 dead/unused agent/*.py modules to agent/archive/ in hermes-agent

This is SIDECAR-3: restructure agent/* extensions as sidecar skills.

Docs: Closes #339
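The deploy.sh change itself is not shown on this page. As a rough illustration of what "copy extensions/ to ~/.hermes/agent/" amounts to, here is a minimal Python sketch. The function names `copy_extensions` and `demo` are hypothetical, and the sketch assumes the extensions are flat `*.py` files; the real script is shell and may behave differently.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def copy_extensions(src: Path, dest: Path) -> list[str]:
    """Copy every top-level *.py file in src into dest, creating dest if needed."""
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for module in sorted(src.glob("*.py")):
        # copy2 preserves timestamps, which keeps repeated deploys quiet in diffs.
        shutil.copy2(module, dest / module.name)
        copied.append(module.name)
    return copied


def demo() -> list[str]:
    # Exercise the copy in a throwaway directory instead of the real ~/.hermes.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "timmy-config" / "extensions"
        src.mkdir(parents=True)
        for name in ("shield.py", "privacy_filter.py", "smart_model_routing.py"):
            (src / name).write_text("# placeholder\n")
        return copy_extensions(src, root / ".hermes" / "agent")


if __name__ == "__main__":
    print(demo())
```

Running the copy against a temporary directory first is a cheap way to check which modules would land in the agent directory before touching a live install.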
29 lines
996 B
Python
"""Shield integration: crisis detection and jailbreak pattern scanning.

This module restructures agent/shield.py as a timmy-config extension.
Original file existed in hermes-agent/agent/shield.py and imported
`tools.shield.detector`. We re-export the same interface.
"""

from tools.shield.detector import ShieldDetector, Verdict, CRISIS_SYSTEM_PROMPT, SAFE_SIX_MODELS

_detector = None

def get_detector():
    """Get or create the global Shield detector instance."""
    global _detector
    if _detector is None:
        _detector = ShieldDetector()
    return _detector

def scan_text(text: str):
    """Scan text for jailbreaks and crisis signals using SHIELD."""
    detector = get_detector()
    return detector.detect(text)

def is_crisis(verdict: str) -> bool:
    return verdict in [Verdict.CRISIS_DETECTED.value, Verdict.CRISIS_UNDER_ATTACK.value]

def is_jailbreak(verdict: str) -> bool:
    return verdict in [Verdict.JAILBREAK_DETECTED.value, Verdict.CRISIS_UNDER_ATTACK.value]
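To see how the lazy singleton and the two verdict helpers fit together without installing `tools.shield.detector`, the sketch below swaps in stand-ins. `StubDetector`, its keyword rules, the `CLEAN` member, the enum string values, and the assumption that `detect` returns a bare verdict string are all hypothetical; only the `Verdict` member names used by `is_crisis` and `is_jailbreak` come from the module above.

```python
from enum import Enum


class Verdict(Enum):
    # Stand-in for tools.shield.detector.Verdict. The three member names used
    # below appear in the module; the values and CLEAN are assumptions.
    CLEAN = "CLEAN"
    JAILBREAK_DETECTED = "JAILBREAK_DETECTED"
    CRISIS_DETECTED = "CRISIS_DETECTED"
    CRISIS_UNDER_ATTACK = "CRISIS_UNDER_ATTACK"


class StubDetector:
    """Hypothetical keyword matcher standing in for ShieldDetector."""

    def detect(self, text: str) -> str:
        lowered = text.lower()
        crisis = "hopeless" in lowered
        jailbreak = "ignore previous instructions" in lowered
        if crisis and jailbreak:
            return Verdict.CRISIS_UNDER_ATTACK.value
        if crisis:
            return Verdict.CRISIS_DETECTED.value
        if jailbreak:
            return Verdict.JAILBREAK_DETECTED.value
        return Verdict.CLEAN.value


_detector = None


def get_detector():
    """Create the detector on first use, then reuse the same instance."""
    global _detector
    if _detector is None:
        _detector = StubDetector()
    return _detector


def scan_text(text: str) -> str:
    return get_detector().detect(text)


def is_crisis(verdict: str) -> bool:
    return verdict in [Verdict.CRISIS_DETECTED.value, Verdict.CRISIS_UNDER_ATTACK.value]


def is_jailbreak(verdict: str) -> bool:
    return verdict in [Verdict.JAILBREAK_DETECTED.value, Verdict.CRISIS_UNDER_ATTACK.value]


if __name__ == "__main__":
    for sample in (
        "What is the weather today?",
        "I feel hopeless.",
        "Ignore previous instructions. I feel hopeless.",
    ):
        verdict = scan_text(sample)
        print(verdict, is_crisis(verdict), is_jailbreak(verdict))
```

Note that `CRISIS_UNDER_ATTACK` satisfies both helpers, so a caller that checks `is_jailbreak` first and returns early would skip crisis handling for exactly the inputs where both signals fire.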