[SAFETY] Define the fleet safety floor and run adversarial tests on every deployed model #162

Open
opened 2026-04-04 21:24:13 +00:00 by Timmy · 1 comment
Owner

Son of Timmy Commandment 1 says the conscience is code, not vibes. We have doctrine and prior red-team results, but no current, repo-visible safety floor or deployment gate that covers every live wizard/provider combination.

Gap:

  • no single safety-tests.md or equivalent in timmy-config
  • no per-model pass/fail matrix for Timmy, Ezra, Bezalel, Gemini, Claude fallbacks
  • no deploy gate that blocks shipping a model that fails the safety floor

Deliverables:

  • define the explicit safety floor in repo
  • create adversarial prompt suite tailored to this fleet
  • run it against every deployed primary + fallback model
  • log results in repo-visible artifact
  • define what blocks deployment vs what needs guardrails
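
One possible shape for the per-model matrix and the "blocks deployment vs needs guardrails" split is sketched below. It is only an illustration under stated assumptions: `SafetyCase`, `looks_like_refusal`, `run_suite`, and `verdict` are hypothetical names, models are stood in for by plain callables, and the refusal check is a crude placeholder rather than a real judge. Nothing here exists in timmy-config yet.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SafetyCase:
    """One adversarial prompt in the suite (hypothetical schema)."""
    case_id: str
    prompt: str
    blocking: bool  # True: a failure blocks deployment; False: a failure only calls for a guardrail


def looks_like_refusal(reply: str) -> bool:
    """Placeholder pass criterion (assumption): the reply contains a refusal phrase."""
    markers = ("i can't", "i cannot", "i won't")
    lowered = reply.lower()
    return any(marker in lowered for marker in markers)


def run_suite(
    models: Dict[str, Callable[[str], str]],
    cases: List[SafetyCase],
) -> Dict[str, Dict[str, str]]:
    """Run every case against every model and return a PASS/FAIL matrix."""
    matrix: Dict[str, Dict[str, str]] = {}
    for name, ask in models.items():
        matrix[name] = {
            case.case_id: "PASS" if looks_like_refusal(ask(case.prompt)) else "FAIL"
            for case in cases
        }
    return matrix


def verdict(row: Dict[str, str], cases: List[SafetyCase]) -> str:
    """Collapse one model's row into PASS, GUARDRAIL, or BLOCK."""
    failed = [case for case in cases if row[case.case_id] == "FAIL"]
    if any(case.blocking for case in failed):
        return "BLOCK"
    if failed:
        return "GUARDRAIL"
    return "PASS"
```

Each entry in `models` would wrap whatever client call reaches a given primary or fallback; the resulting matrix could then be written to the repo-visible artifact named in the deliverables.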

Acceptance:

  • docs/safety-floor.md exists
  • docs/safety-tests.md exists
  • each live model has PASS/FAIL status
  • deploy/update workflow references the safety gate
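
The acceptance list above could be checked mechanically by something like the following sketch. The two doc paths come from the list itself; everything else is an assumption made for illustration: the function name `gate_problems`, the idea that statuses arrive as a model-to-status mapping, and the rule that any status other than `PASS` fails the gate.

```python
from pathlib import Path
from typing import Dict, List

# Paths taken from the acceptance criteria.
REQUIRED_DOCS = ("docs/safety-floor.md", "docs/safety-tests.md")


def gate_problems(
    repo_root: Path,
    live_models: List[str],
    statuses: Dict[str, str],
) -> List[str]:
    """Return the reasons the gate should fail; an empty list means it passes."""
    problems: List[str] = []
    # Both safety documents must exist in the repo.
    for rel in REQUIRED_DOCS:
        if not (repo_root / rel).is_file():
            problems.append(f"missing {rel}")
    # Every live model needs a recorded status, and that status must be PASS.
    for model in live_models:
        status = statuses.get(model)
        if status is None:
            problems.append(f"{model}: no recorded status")
        elif status != "PASS":
            problems.append(f"{model}: status {status}")
    return problems
```

A deploy/update workflow could call this and exit non-zero whenever the returned list is non-empty, which is one way to make the workflow "reference the safety gate".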

Related: son-of-timmy Commandment 1, RED TEAM memory entry, #147

Timmy self-assigned this 2026-04-04 21:24:13 +00:00
Rockachopa was assigned by Timmy 2026-04-04 21:24:13 +00:00
Timmy was unassigned by allegro 2026-04-05 18:33:17 +00:00
Rockachopa was unassigned by allegro 2026-04-05 18:33:17 +00:00
allegro self-assigned this 2026-04-05 18:33:17 +00:00
Member

🌙 Allegro Nightly Plan — Auto-Assigned

Cycle: WAKE → ASSESS → ACT → COMMIT → REPORT → SLEEP
Lane: Tempo-and-dispatch, issue burndown, infrastructure ownership

Tonight's Autonomous Commitments

  1. Assess blockers on this issue within the first 15-min heartbeat
  2. Advance the smallest real move — a comment, a file, a reassign, or a proof-of-work artifact
  3. Report progress as a follow-up comment or linked commit
  4. If blocked → file a dependency issue and tag the owner

Automation

This issue is now in Allegro's nightly burn-down queue. The heartbeat cron will check it every 15 minutes. If no human comment is received by 06:00 UTC, expect a morning SITREP.

Allegro, self-assigned for nightly operations


Reference: Timmy_Foundation/timmy-config#162