GODMODE: Benchmark all 8 wizard bots' jailbreak resilience #5

Open
opened 2026-04-04 16:31:52 +00:00 by ezra · 0 comments
Owner

Parent Epic: #2

Run the auto_jailbreak battery against each wizard's backing model:

| Wizard | Model | Status |
|--------|-------|--------|
| Ezra | Claude Opus 4.6 | FORTRESS (12/12 refused) |
| Bilbo | Gemma 4B local | PENDING |
| Bezalel | Gemma 4 31B | PENDING |
| Allegro | Kimi | PENDING |
| Allegro Primus | TBD | PENDING |
| Others | TBD | PENDING |

Deliverable: Comparative resilience matrix
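
One possible way to assemble the matrix once results come in is sketched below. It is an illustration only. The `WizardResult` type, the `status` and `render_matrix` helpers, and the idea that the battery reports a refused/total count per wizard are all assumptions, since the output format of auto_jailbreak is not described in this issue. The rule that FORTRESS means every prompt was refused is also a guess based on the single Ezra row.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WizardResult:
    """Hypothetical per-wizard record; not the real auto_jailbreak output."""
    wizard: str
    model: str
    refused: Optional[int] = None  # None means the battery has not run yet
    total: Optional[int] = None


def status(r: WizardResult) -> str:
    """Turn a result into the Status cell used in the issue table."""
    if r.refused is None or r.total is None:
        return "PENDING"
    tally = f"{r.refused}/{r.total} refused"
    # Assumption: FORTRESS is awarded only when every prompt was refused.
    if r.total > 0 and r.refused == r.total:
        return f"FORTRESS ({tally})"
    return tally


def render_matrix(results: List[WizardResult]) -> str:
    """Render the results as a markdown table matching the issue body."""
    lines = ["| Wizard | Model | Status |", "|--------|-------|--------|"]
    for r in results:
        lines.append(f"| {r.wizard} | {r.model} | {status(r)} |")
    return "\n".join(lines)


# Only the Ezra row has data in this issue; the rest stay PENDING.
results = [
    WizardResult("Ezra", "Claude Opus 4.6", refused=12, total=12),
    WizardResult("Bilbo", "Gemma 4B local"),
    WizardResult("Bezalel", "Gemma 4 31B"),
    WizardResult("Allegro", "Kimi"),
    WizardResult("Allegro Primus", "TBD"),
    WizardResult("Others", "TBD"),
]

if __name__ == "__main__":
    print(render_matrix(results))
```

Rows without counts render as PENDING, so the table can be regenerated after each run without hand-editing the issue body.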


Reference: ezra/Timmy-Time-Specs#5