Files
hermes-agent/agent
Allegro b88125af30
Some checks failed
Docker Build and Publish / build-and-push (push) Has been cancelled
Nix / nix (macos-latest) (push) Has been cancelled
Nix / nix (ubuntu-latest) (push) Has been cancelled
Tests / test (push) Has been cancelled
security: Add crisis pattern detection to input_sanitizer (Issue #72)
- Add CRISIS_PATTERNS for suicide/self-harm detection
- Crisis patterns score 50pts per hit (max 100) vs 10pts for others
- Addresses Red Team Audit HIGH finding: og_godmode + crisis queries
- All 136 existing tests pass + new crisis safety tests pass

Defense in depth: Input layer now blocks crisis queries even if
wrapped in jailbreak templates, before they reach the model.
2026-03-31 21:27:17 +00:00
..