Allegro
546b3dd45d
Nix / nix (ubuntu-latest) (push) Failing after 5s
Docker Build and Publish / build-and-push (push) Failing after 40s
Tests / test (push) Failing after 11m11s
Nix / nix (macos-latest) (push) Has been cancelled
security: integrate SHIELD jailbreak/crisis detection
Integrate SHIELD (Sovereign Harm Interdiction & Ethical Layer Defense) into
the Hermes Agent pre-routing layer for jailbreak and crisis detection.
SHIELD Features:
- Detects 9 jailbreak pattern categories (GODMODE dividers, l33tspeak, boundary
  inversion, token injection, DAN/GODMODE keywords, refusal inversion, persona
  injection, encoding evasion)
- Detects 7 crisis signal categories (suicidal ideation, method seeking,
  l33tspeak evasion, substance seeking, despair, farewell, self-harm)
- Returns 4 verdicts: CLEAN, JAILBREAK_DETECTED, CRISIS_DETECTED,
  CRISIS_UNDER_ATTACK
- Routes crisis content ONLY to Safe Six verified models
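
The verdict logic listed above could be sketched roughly as follows. This is
a minimal illustration, not the SHIELD implementation: the patterns, the
`classify` and `allowed_models` names, and the `safe-model-*` identifiers are
all hypothetical placeholders, and it assumes CRISIS_UNDER_ATTACK means that
jailbreak and crisis signals appear in the same message.

```python
import re
from enum import Enum


class Verdict(Enum):
    CLEAN = "CLEAN"
    JAILBREAK_DETECTED = "JAILBREAK_DETECTED"
    CRISIS_DETECTED = "CRISIS_DETECTED"
    CRISIS_UNDER_ATTACK = "CRISIS_UNDER_ATTACK"


# Hypothetical placeholder patterns; the real categories are far broader.
JAILBREAK_PATTERNS = [
    re.compile(r"\bDAN\b"),
    re.compile(r"\bgodmode\b", re.IGNORECASE),
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
]
CRISIS_PATTERNS = [
    re.compile(r"\bi want to die\b", re.IGNORECASE),
    re.compile(r"\bgoodbye forever\b", re.IGNORECASE),
]

# Placeholder allowlist standing in for the "Safe Six" verified models.
SAFE_MODELS = ("safe-model-a", "safe-model-b")


def classify(text: str) -> Verdict:
    """Map a message to one of the four verdicts using regex matches only."""
    jailbreak = any(p.search(text) for p in JAILBREAK_PATTERNS)
    crisis = any(p.search(text) for p in CRISIS_PATTERNS)
    if jailbreak and crisis:
        # Assumption: both signals together yield CRISIS_UNDER_ATTACK.
        return Verdict.CRISIS_UNDER_ATTACK
    if crisis:
        return Verdict.CRISIS_DETECTED
    if jailbreak:
        return Verdict.JAILBREAK_DETECTED
    return Verdict.CLEAN


def allowed_models(verdict: Verdict, all_models: list[str]) -> list[str]:
    """Restrict routing to the allowlist whenever crisis content is present."""
    if verdict in (Verdict.CRISIS_DETECTED, Verdict.CRISIS_UNDER_ATTACK):
        return [m for m in all_models if m in SAFE_MODELS]
    return list(all_models)
```

Checking the crisis case before the jailbreak case keeps crisis routing from
being skipped when a message also trips a jailbreak pattern.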
Safety Requirements:
- <5ms detection latency (regex-only, no ML)
- 988 Suicide & Crisis Lifeline included in crisis responses
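
One way to check both requirements in a test is sketched below. The names
`timed_scan`, `with_crisis_resources`, `LATENCY_BUDGET_MS`, and the footer
wording are assumptions made for illustration; only the 5 ms budget and the
988 Suicide & Crisis Lifeline come from the requirements above.

```python
import re
import time

# Budget taken from the stated requirement; the pattern is a placeholder.
LATENCY_BUDGET_MS = 5.0
PATTERN = re.compile(r"\bgodmode\b|ignore previous instructions", re.IGNORECASE)

# Hypothetical footer text; the actual crisis response wording may differ.
CRISIS_FOOTER = (
    "If you are in the US, you can call or text 988 to reach the "
    "988 Suicide & Crisis Lifeline."
)


def timed_scan(text: str) -> tuple[bool, float]:
    """Run the regex scan and report whether it matched and how long it took."""
    start = time.perf_counter()
    matched = PATTERN.search(text) is not None
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return matched, elapsed_ms


def with_crisis_resources(reply: str) -> str:
    """Append the lifeline footer unless the reply already mentions 988."""
    if "988" in reply:
        return reply
    return f"{reply}\n\n{CRISIS_FOOTER}"
```

A benchmark would compare `elapsed_ms` against `LATENCY_BUDGET_MS` over many
representative inputs rather than a single call, since one measurement is
noisy.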
Addresses: Issues #72, #74, #75
2026-03-31 16:35:40 +00:00