[claude] Add content moderation pipeline (Llama Guard + game-context prompts) (#1056) #1059
Fixes #1056
Summary
Three-layer content moderation pipeline for AI narrator output, preventing harmful LLM responses during live game narration.
Layer 1: Game-Context System Prompts
config/moderation.yaml
Layer 2: Real-Time Output Filter
Layer 3: Per-Game Threshold Tuning
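The three layers could fit together roughly as in the sketch below. It is an illustration under stated assumptions, not the code in this PR: `GameProfile`, `PROFILES`, `get_profile`, `moderate`, and `keyword_scorer` are hypothetical names, the prompt texts and threshold values are invented, and the guard model is replaced by a stub scorer so the example runs without any model. It also assumes the game-context prompt is handed to the guard so that in-fiction violence is judged in context.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class GameProfile:
    """Hypothetical per-game profile; the real schema in config/moderation.yaml may differ."""
    system_prompt: str  # Layer 1: game-context framing
    threshold: float    # Layer 3: unsafe score at or above which output is blocked


# Illustrative values only; none of these are taken from the PR.
PROFILES: Dict[str, GameProfile] = {
    "default": GameProfile("You narrate a video game. Keep violence in-fiction.", 0.5),
    "morrowind": GameProfile("You narrate Morrowind, a fantasy RPG with combat.", 0.7),
}


def get_profile(game: str) -> GameProfile:
    """Look up a profile by game name, falling back to the default profile."""
    return PROFILES.get(game.lower(), PROFILES["default"])


def moderate(text: str, game: str, score_fn: Callable[[str, str], float]) -> bool:
    """Return True if the narrator output may be shown to the player.

    score_fn stands in for Layer 2 (the guard model): it receives the
    game-context prompt and the candidate text and returns an unsafe
    score in [0, 1].
    """
    profile = get_profile(game)
    unsafe_score = score_fn(profile.system_prompt, text)
    return unsafe_score < profile.threshold


def keyword_scorer(context: str, text: str) -> float:
    """Stub scorer used only for this example; a real guard model would go here."""
    return 0.9 if "forbidden" in text.lower() else 0.1
```

With the stub scorer, `moderate("The guard draws his sword.", "Morrowind", keyword_scorer)` passes, while text containing the stub's trigger word is blocked for any game. Injecting the scorer as a callable keeps the threshold logic testable without loading a model.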
New Files
src/infrastructure/guards/: moderation pipeline module (singleton pattern)
src/infrastructure/guards/moderation.py: ContentModerator with three-layer check
src/infrastructure/guards/profiles.py: YAML profile loader
config/moderation.yaml: per-game moderation profiles (Morrowind, Skyrim, default)
tests/infrastructure/test_moderation.py: 32 unit tests
Config Changes
Added moderation_enabled, moderation_guard_model, moderation_threshold to config.py
Usage
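A minimal sketch of how the three new settings might gate narrator output. Only the field names `moderation_enabled`, `moderation_guard_model`, and `moderation_threshold` come from this PR; the `Settings` class, the default values, the `filter_narration` helper, and the fallback string are assumptions made for illustration, and the unsafe score is passed in directly rather than computed by a guard model.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    """Hypothetical stand-in for the settings object in config.py."""
    moderation_enabled: bool = True                          # assumed default
    moderation_guard_model: str = "placeholder-guard-model"  # assumed placeholder, not a real model tag
    moderation_threshold: float = 0.5                        # assumed default


def filter_narration(
    text: str,
    unsafe_score: float,
    settings: Settings,
    fallback: str = "[narration withheld]",
) -> str:
    """Return the text unchanged, or a fallback when moderation blocks it.

    unsafe_score is assumed to be the guard model's score in [0, 1].
    """
    if not settings.moderation_enabled:
        # Moderation switched off: pass narrator output through untouched.
        return text
    if unsafe_score >= settings.moderation_threshold:
        return fallback
    return text
```

With the assumed defaults, a score of 0.2 lets the text through and a score of 0.8 yields the fallback; setting `moderation_enabled=False` bypasses the check entirely.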
Test Plan