feat: image content screening for self-harm indicators #143

Closed
Rockachopa wants to merge 1 commit from fix/132 into main
Owner

Closes #132

Epic: #102 (Multimodal Crisis Detection)

Added

image_screening.py — Privacy-preserving image analysis:

  • Uses local Ollama vision model (gemma3:4b) for screening
  • Detects: self-harm wounds, concerning medication, farewell imagery, crisis screenshots
  • Returns RiskLevel (safe/concerning/critical) with confidence score
  • In-memory analysis only — no image retention
  • Fallback heuristic when vision model unavailable
  • handle_chat_image() returns an action dict with overlay/988 triggers (see the sketch after this list)

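A rough sketch of the shapes involved is below. Only RiskLevel, screen_image, and handle_chat_image are named in this PR, so the result fields and other details are illustrative assumptions rather than the actual implementation.

```python
# Illustrative sketch only: field names beyond RiskLevel, screen_image, and
# handle_chat_image are assumptions, not the code in this PR.
from dataclasses import dataclass, field
from enum import Enum


class RiskLevel(Enum):
    SAFE = "safe"
    CONCERNING = "concerning"
    CRITICAL = "critical"


@dataclass
class ScreeningResult:
    risk_level: RiskLevel
    confidence: float  # 0.0-1.0, reported by the vision model
    categories: list[str] = field(default_factory=list)  # e.g. ["self_harm_wounds"]


def screen_image(image_bytes: bytes) -> ScreeningResult:
    """Analyze image bytes in memory (no retention) and return a risk assessment."""
    ...
```
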
tests/test_image_screening.py — 8 tests:

  • RiskLevel classification (safe, concerning, critical)
  • Fallback behavior without vision model (see the test sketch after this list)
  • Base64 input handling
  • Crisis overlay triggers for critical images
  • Follow-up prompts for concerning images

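As a rough sketch, the fallback test might look like the following; the patch target comes from the review notes below and the asserted fields are illustrative, not necessarily the exact test code.

```python
# Sketch only: the patch target and result fields are assumptions based on the
# review notes (the current fallback returns RiskLevel.SAFE with confidence 0.2).
from unittest.mock import patch

from image_screening import RiskLevel, screen_image


def test_fallback_without_vision_model():
    # Simulate the vision model being unreachable so screening drops to the heuristic.
    with patch("image_screening._analyze_with_ollama", return_value=None):
        result = screen_image(b"\x89PNG\r\n\x1a\n" + b"\x00" * 32)
    assert result.risk_level == RiskLevel.SAFE
    assert result.confidence <= 0.2
```
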
Usage

from image_screening import handle_chat_image
action = handle_chat_image(uploaded_image_bytes)
if action["show_crisis_overlay"]:
    render_crisis_overlay()
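The snippet above only checks the overlay flag; the follow-up check-in and 988 resources described earlier are carried in other fields of the action dict. A hedged extension follows, where key names other than show_crisis_overlay are assumptions and send_assistant_message is a placeholder for however the chat UI emits a message.

```python
# Key names other than "show_crisis_overlay" are assumptions for illustration.
action = handle_chat_image(uploaded_image_bytes)

if action["show_crisis_overlay"]:
    # Critical image: show the crisis overlay with 988 resources immediately.
    render_crisis_overlay()
elif action.get("followup_prompt"):
    # Concerning image: respond with a gentle check-in instead of an overlay.
    send_assistant_message(action["followup_prompt"])
```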
Rockachopa added 1 commit 2026-04-15 16:05:32 +00:00
feat: image content screening for self-harm indicators (closes #132)
All checks were successful
Sanity Checks / sanity-test (pull_request) Successful in 4s
Smoke Test / smoke (pull_request) Successful in 10s
0ab2626ef2
Timmy reviewed 2026-04-15 16:16:06 +00:00
Timmy left a comment
Owner

Review: feat: image content screening for self-harm indicators

Good overall structure. Privacy-preserving design is sound (in-memory only, no retention). A few items:

  1. Fallback always returns SAFE: The _analyze_fallback() function defaults to RiskLevel.SAFE with low confidence. This means if Ollama is unavailable, all images pass through unchecked. Consider whether a more cautious default (e.g., CONCERNING) is warranted for a crisis tool, or at minimum log at WARNING level when falling back (a sketch follows this list).

  2. No image size/type validation: screen_image() accepts arbitrary data without checking size limits or valid image formats. A very large payload could cause memory issues or slow Ollama calls. Add basic validation (file size cap, magic byte check).

  3. JSON parsing from LLM output is fragile: The json_start/json_end extraction assumes the LLM wraps its response in a single JSON object. If the model returns malformed JSON or multiple objects, this will silently fail and fall through to the fallback. Consider a retry or stricter validation.

  4. Tests are solid — good coverage of mock scenarios and edge cases.

  5. timeout=30 on the Ollama call may be too short for vision model inference on larger images, especially on constrained hardware.

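To make item 1 concrete, here is a minimal sketch of the kind of fallback I mean; the names and the exact "unscreened" marker are illustrative, the point is that an unanalyzed image is never reported as SAFE.

```python
# Sketch of the suggested change, not the PR's code: log a WARNING and return a
# cautious, clearly marked "unscreened" result instead of SAFE.
import logging

logger = logging.getLogger(__name__)


def _analyze_fallback(image_bytes: bytes) -> ScreeningResult:
    logger.warning(
        "Vision model unavailable; image was NOT screened (size=%d bytes)",
        len(image_bytes),
    )
    return ScreeningResult(
        risk_level=RiskLevel.CONCERNING,  # cautious default for a crisis tool
        confidence=0.0,                   # explicitly "no analysis performed"
        categories=["unscreened"],        # lets downstream code tell this case apart
    )
```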
Timmy requested changes 2026-04-15 23:06:39 +00:00
Timmy left a comment
Owner

Image screening for self-harm is a sensitive and important feature. The architecture is privacy-preserving (in-memory analysis, no retention). Review:

  1. CRITICAL — Fallback defaults to SAFE: When no vision model is available, _analyze_fallback returns RiskLevel.SAFE with confidence=0.2. This is a dangerous default — if the model is down, ALL images pass through unscreened. The fallback should return CONCERNING or at minimum flag that analysis was not performed, so downstream systems know the image was NOT actually screened.

  2. PRIVACY: Local Ollama model is correct — using localhost:11434 ensures images never leave the machine. Good.

  3. Good: Structured screening prompt covering relevant categories (wounds, pills, farewell imagery, crisis searches).

  4. JSON parsing from LLM output is fragile: The _analyze_with_ollama function parses JSON from LLM output using string slicing. If the model outputs malformed JSON, it returns None which falls through to the SAFE fallback (see concern #1). Consider retrying or defaulting to CONCERNING on parse failure (see the sketch at the end of this review).

  5. Good: handle_chat_image response includes 988 resources for CRITICAL and gentle check-in for CONCERNING.

  6. Test coverage is good including mock testing of the Ollama integration.

Requesting changes: The SAFE fallback on model failure (#1) is a safety hazard. Unscreened images must not be marked SAFE.

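For item 4, one possible shape for stricter parsing; this is a sketch and the expected schema keys are assumptions. The caller would treat a None return as a failed screen and fall back to CONCERNING rather than SAFE.

```python
# Sketch only: validate the model's JSON before trusting it; a parse failure is
# reported as "not screened" (None) so the caller can default to CONCERNING.
import json

REQUIRED_KEYS = {"risk_level", "confidence"}  # assumed response schema


def _parse_screening_json(raw: str) -> dict | None:
    """Extract and validate a single JSON object from the model's output."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return None
    if not REQUIRED_KEYS.issubset(parsed):
        return None
    if parsed["risk_level"] not in {"safe", "concerning", "critical"}:
        return None
    return parsed
```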
Timmy closed this pull request 2026-04-17 01:52:33 +00:00
Owner

Archived — branch unknown preserved for reference. Cherry-pick if still relevant.


Pull request closed
