[claude] Add content moderation pipeline (Llama Guard + game-context prompts) (#987) #1055
Closed
claude wants to merge 1 commit from claude/issue-987 into main
1 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
| | 3e5a3ac05f | feat: add content moderation pipeline (Llama Guard + game-context profiles) | | |

Implement real-time content moderation for narration output using a local safety model (Llama Guard 3 via Ollama). The pipeline is designed to run in parallel with TTS preprocessing for near-zero added latency.

Key components:
- ContentModerator singleton with async check() method
- Game-context profiles (Morrowind vocabulary whitelist, fallback narrations)
- Configurable fail-open/fail-closed degradation when model unavailable
- Llama Guard response parsing (safe/unsafe with category codes)
- 40 unit tests covering profiles, parsing, whitelist, and async checks

Config settings: moderation_enabled, moderation_model, moderation_timeout_ms, moderation_fail_open, moderation_game_profile

Fixes #987

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
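
For readers without the diff at hand, the following is a minimal sketch of how the response parsing, fail-open/fail-closed degradation, and parallel execution described in the commit message could fit together. It is not the code from this PR. The names `ModerationResult`, `parse_guard_response`, `check`, `narrate`, the injected `classify` and `preprocess` callables, and the 500 ms default are all hypothetical. The only outside fact assumed is that Llama Guard replies with `safe`, or with `unsafe` followed by a line of category codes such as `S1,S10`.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class ModerationResult:
    """Hypothetical result type; the PR's real type is not shown on this page."""
    safe: bool
    categories: list[str] = field(default_factory=list)
    degraded: bool = False  # True when the verdict comes from the fallback policy


def parse_guard_response(raw: str) -> ModerationResult:
    """Parse 'safe', or 'unsafe' followed by category codes such as 'S1,S10'."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty moderation response")
    verdict = lines[0].lower()
    if verdict == "safe":
        return ModerationResult(safe=True)
    if verdict == "unsafe":
        codes = [c.strip() for c in ",".join(lines[1:]).split(",") if c.strip()]
        return ModerationResult(safe=False, categories=codes)
    raise ValueError(f"unrecognised verdict: {verdict!r}")


async def check(
    text: str,
    classify: Callable[[str], Awaitable[str]],  # assumed wrapper around the model call
    timeout_ms: int = 500,                      # assumed default, cf. moderation_timeout_ms
    fail_open: bool = True,                     # cf. moderation_fail_open
) -> ModerationResult:
    """Return the model's verdict, or the configured fallback if it is unavailable."""
    try:
        raw = await asyncio.wait_for(classify(text), timeout=timeout_ms / 1000)
        return parse_guard_response(raw)
    except (asyncio.TimeoutError, OSError, ValueError):
        # Fail-open lets the text through; fail-closed blocks it.
        return ModerationResult(safe=fail_open, degraded=True)


async def narrate(
    text: str,
    classify: Callable[[str], Awaitable[str]],
    preprocess: Callable[[str], Awaitable[str]],  # assumed TTS preprocessing step
    fallback: str,
) -> str:
    """Run moderation and TTS preprocessing concurrently, then pick the output."""
    verdict, prepared = await asyncio.gather(
        check(text, classify), preprocess(text)
    )
    return prepared if verdict.safe else fallback
```

Because `asyncio.gather` awaits both coroutines at once, the latency added by moderation is roughly the amount by which the classifier outlasts preprocessing, which is one way to read the "near-zero added latency" goal. The Morrowind vocabulary whitelist is left out here, since its contents and matching rules are not described on this page.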