[claude] Add content moderation pipeline (Llama Guard + game-context prompts) (#1056) (#1059)

2026-03-23 02:14:42 +00:00
parent 092c982341
commit 1697e55cdb
6 changed files with 1013 additions and 0 deletions
--- a/src/infrastructure/guards/__init__.py
+++ b/src/infrastructure/guards/__init__.py
@@ -0,0 +1,7 @@
+"""Content moderation pipeline for AI narrator output.
+
+Three-layer defense:
+1. Game-context system prompts (vocabulary whitelists, theme framing)
+2. Real-time output filter via Llama Guard (or fallback regex)
+3. Per-game moderation profiles with configurable thresholds
+"""
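
The docstring above names three layers without showing how they fit together. The following is a minimal sketch of one possible arrangement, not the code from this commit: every name in it (ModerationProfile, build_system_prompt, regex_score, moderate, FALLBACK_PATTERNS) is hypothetical, and the Llama Guard call is represented only by an injected `classifier` callable that is assumed to return a risk score in [0, 1].

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModerationProfile:
    """Layer 3: per-game settings. All field names are hypothetical."""
    game: str
    threshold: float = 0.5                      # block when score >= threshold
    allowed_terms: frozenset = frozenset()      # vocabulary whitelist


# Hypothetical fallback patterns, used only when no classifier is available.
FALLBACK_PATTERNS = [re.compile(r"\b(gore|torture)\b", re.IGNORECASE)]


def build_system_prompt(profile: ModerationProfile) -> str:
    """Layer 1: frame the game context and list whitelisted vocabulary."""
    terms = ", ".join(sorted(profile.allowed_terms))
    return (
        f"You are the narrator for {profile.game}. "
        f"These in-game terms are acceptable in context: {terms}."
    )


def regex_score(text: str, profile: ModerationProfile) -> float:
    """Fallback filter: 1.0 if a non-whitelisted pattern matches, else 0.0."""
    for pattern in FALLBACK_PATTERNS:
        for match in pattern.finditer(text):
            if match.group(0).lower() not in profile.allowed_terms:
                return 1.0
    return 0.0


def moderate(
    text: str,
    profile: ModerationProfile,
    classifier: Optional[Callable[[str], float]] = None,
) -> bool:
    """Layer 2: return True if the narrator output may be shown.

    `classifier` stands in for a Llama Guard call (assumed to return a
    risk score in [0, 1]); if it is missing or raises, fall back to regex.
    """
    score = None
    if classifier is not None:
        try:
            score = classifier(text)
        except Exception:
            score = None  # classifier unavailable: use the fallback below
    if score is None:
        score = regex_score(text, profile)
    return score < profile.threshold
```

In this sketch the profile is consulted in two places: its whitelist exempts game vocabulary from the regex fallback, and its threshold decides whether a given score blocks the output. A stricter profile would simply lower `threshold` or shrink `allowed_terms`.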