[loop-generated] [security] Split moderation.py — 497 lines, content filtering system #1407

Closed
opened 2026-03-24 12:54:03 +00:00 by Timmy · 1 comment
Owner

Problem

src/infrastructure/guards/moderation.py is 497 lines handling critical security functions:

  • Content filtering and safety checks
  • Prompt injection detection
  • Response sanitization
  • Policy enforcement
  • Threat pattern matching

Proposed Split

  1. Extract content filters into src/infrastructure/guards/filters/content.py
  2. Extract injection detection into src/infrastructure/guards/filters/injection.py
  3. Extract sanitization into src/infrastructure/guards/filters/sanitizer.py
  4. Extract policy engine into src/infrastructure/guards/policy.py
  5. Keep moderation.py as orchestrator

Benefits

  • Isolated testing of security components
  • Clear separation of security concerns
  • Easier security auditing
  • Pluggable filter architecture
  • Better maintainability of critical security code

Security Considerations

  • All existing security guarantees must be preserved
  • No relaxation of current safety checks
  • Comprehensive test coverage for all security components
  • Performance must not degrade (security is performance-critical)

Acceptance Criteria

  • No module exceeds 200 lines after split
  • ALL existing security functionality preserved
  • All tests pass (tox -e unit)
  • Security benchmarks pass
  • No performance regression in moderation pipeline
  • Clean separation of security concerns

Files

  • src/infrastructure/guards/moderation.py (primary, 497 lines)

Lines of code is a liability. Delete as much as you create.

## Problem `src/infrastructure/guards/moderation.py` is 497 lines handling critical security functions: - Content filtering and safety checks - Prompt injection detection - Response sanitization - Policy enforcement - Threat pattern matching ## Proposed Split 1. Extract content filters into `src/infrastructure/guards/filters/content.py` 2. Extract injection detection into `src/infrastructure/guards/filters/injection.py` 3. Extract sanitization into `src/infrastructure/guards/filters/sanitizer.py` 4. Extract policy engine into `src/infrastructure/guards/policy.py` 5. Keep `moderation.py` as orchestrator ## Benefits - Isolated testing of security components - Clear separation of security concerns - Easier security auditing - Pluggable filter architecture - Better maintainability of critical security code ## Security Considerations - All existing security guarantees must be preserved - No relaxation of current safety checks - Comprehensive test coverage for all security components - Performance must not degrade (security is performance-critical) ## Acceptance Criteria - [ ] No module exceeds 200 lines after split - [ ] ALL existing security functionality preserved - [ ] All tests pass (`tox -e unit`) - [ ] Security benchmarks pass - [ ] No performance regression in moderation pipeline - [ ] Clean separation of security concerns ## Files - `src/infrastructure/guards/moderation.py` (primary, 497 lines) Lines of code is a liability. Delete as much as you create.
Author
Owner

Implementation Instructions

This is a SECURITY-CRITICAL refactor. All existing security guarantees must be preserved.

Step-by-Step Implementation:

  1. Create package structure:

    src/infrastructure/guards/
    ├── moderation.py (orchestrator - keep thin)
    ├── policy.py (policy engine)
    └── filters/
        ├── __init__.py
        ├── content.py (content filtering)
        ├── injection.py (prompt injection detection)
        └── sanitizer.py (response sanitization)
    
  2. Security Components to Extract:

    • content.py: Text analysis, keyword detection, content classification
    • injection.py: Prompt injection patterns, escape sequence detection
    • sanitizer.py: Output cleaning, HTML/script stripping, safe formatting
    • policy.py: Rule evaluation, policy configuration, enforcement logic
  3. Keep in moderation.py (orchestrator):

    • Main moderate_content() function
    • Component coordination
    • Error handling and logging
    • Public API surface
  4. Testing Requirements:

    • Run full security test suite
    • Verify ALL existing security checks still work
    • No performance regression
    • All existing import paths continue to work
  5. Validation:

    tox -e unit  # Must pass 100%
    # Verify security benchmarks pass
    # Check performance metrics
    

CRITICAL: This module protects against malicious input. Any mistake could create security vulnerabilities. Test thoroughly.

## Implementation Instructions This is a SECURITY-CRITICAL refactor. All existing security guarantees must be preserved. ### Step-by-Step Implementation: 1. **Create package structure:** ``` src/infrastructure/guards/ ├── moderation.py (orchestrator - keep thin) ├── policy.py (policy engine) └── filters/ ├── __init__.py ├── content.py (content filtering) ├── injection.py (prompt injection detection) └── sanitizer.py (response sanitization) ``` 2. **Security Components to Extract:** - **content.py**: Text analysis, keyword detection, content classification - **injection.py**: Prompt injection patterns, escape sequence detection - **sanitizer.py**: Output cleaning, HTML/script stripping, safe formatting - **policy.py**: Rule evaluation, policy configuration, enforcement logic 3. **Keep in moderation.py (orchestrator):** - Main `moderate_content()` function - Component coordination - Error handling and logging - Public API surface 4. **Testing Requirements:** - Run full security test suite - Verify ALL existing security checks still work - No performance regression - All existing import paths continue to work 5. **Validation:** ```bash tox -e unit # Must pass 100% # Verify security benchmarks pass # Check performance metrics ``` **CRITICAL**: This module protects against malicious input. Any mistake could create security vulnerabilities. Test thoroughly.
kimi was assigned by Timmy 2026-03-24 12:54:28 +00:00
kimi was unassigned by Timmy 2026-03-24 19:32:24 +00:00
Timmy closed this issue 2026-03-24 21:54:13 +00:00
Sign in to join this conversation.
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: Rockachopa/Timmy-time-dashboard#1407