feat: image content screening for self-harm indicators #143

Closed
Rockachopa wants to merge 1 commit from fix/132 into main
Owner

Closes #132

Epic: #102 (Multimodal Crisis Detection)

Added

image_screening.py — Privacy-preserving image analysis:

  • Uses local Ollama vision model (gemma3:4b) for screening
  • Detects: self-harm wounds, concerning medication, farewell imagery, crisis screenshots
  • Returns RiskLevel (safe/concerning/critical) with confidence score
  • In-memory analysis only — no image retention
  • Fallback heuristic when vision model unavailable
  • handle_chat_image() returns an action dict with overlay/988 triggers (see the sketch after this list)

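A rough sketch of the shapes involved is below. Only RiskLevel, screen_image, and handle_chat_image are named in this PR, so the result fields and other details are illustrative assumptions rather than the actual implementation.

```python
# Illustrative sketch only: field names beyond RiskLevel, screen_image, and
# handle_chat_image are assumptions, not the code in this PR.
from dataclasses import dataclass, field
from enum import Enum


class RiskLevel(Enum):
    SAFE = "safe"
    CONCERNING = "concerning"
    CRITICAL = "critical"


@dataclass
class ScreeningResult:
    risk_level: RiskLevel
    confidence: float  # 0.0-1.0, reported by the vision model
    categories: list[str] = field(default_factory=list)  # e.g. ["self_harm_wounds"]


def screen_image(image_bytes: bytes) -> ScreeningResult:
    """Analyze image bytes in memory (no retention) and return a risk assessment."""
    ...
```
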
tests/test_image_screening.py — 8 tests:

  • RiskLevel classification (safe, concerning, critical)
  • Fallback behavior without vision model (see the test sketch after this list)
  • Base64 input handling
  • Crisis overlay triggers for critical images
  • Follow-up prompts for concerning images

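As a rough sketch, the fallback test might look like the following; the patch target comes from the review notes below and the asserted fields are illustrative, not necessarily the exact test code.

```python
# Sketch only: the patch target and result fields are assumptions based on the
# review notes (the current fallback returns RiskLevel.SAFE with confidence 0.2).
from unittest.mock import patch

from image_screening import RiskLevel, screen_image


def test_fallback_without_vision_model():
    # Simulate the vision model being unreachable so screening drops to the heuristic.
    with patch("image_screening._analyze_with_ollama", return_value=None):
        result = screen_image(b"\x89PNG\r\n\x1a\n" + b"\x00" * 32)
    assert result.risk_level == RiskLevel.SAFE
    assert result.confidence <= 0.2
```
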
Usage

from image_screening import handle_chat_image
action = handle_chat_image(uploaded_image_bytes)
if action["show_crisis_overlay"]:
    render_crisis_overlay()
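The snippet above only checks the overlay flag; the follow-up check-in and 988 resources described earlier are carried in other fields of the action dict. A hedged extension follows, where key names other than show_crisis_overlay are assumptions and send_assistant_message is a placeholder for however the chat UI emits a message.

```python
# Key names other than "show_crisis_overlay" are assumptions for illustration.
action = handle_chat_image(uploaded_image_bytes)

if action["show_crisis_overlay"]:
    # Critical image: show the crisis overlay with 988 resources immediately.
    render_crisis_overlay()
elif action.get("followup_prompt"):
    # Concerning image: respond with a gentle check-in instead of an overlay.
    send_assistant_message(action["followup_prompt"])
```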
Rockachopa added 1 commit 2026-04-15 16:05:32 +00:00
feat: image content screening for self-harm indicators (closes #132)
All checks were successful
Sanity Checks / sanity-test (pull_request) Successful in 4s
Smoke Test / smoke (pull_request) Successful in 10s
0ab2626ef2
Timmy reviewed 2026-04-15 16:16:06 +00:00
Timmy left a comment
Owner

Review: feat: image content screening for self-harm indicators

Good overall structure. Privacy-preserving design is sound (in-memory only, no retention). A few items:

  1. Fallback always returns SAFE: The _analyze_fallback() function defaults to RiskLevel.SAFE with low confidence. This means if Ollama is unavailable, all images pass through unchecked. Consider whether a more cautious default (e.g., CONCERNING) is warranted for a crisis tool, or at minimum log at WARNING level when falling back (a sketch follows this list).

  2. No image size/type validation: screen_image() accepts arbitrary data without checking size limits or valid image formats. A very large payload could cause memory issues or slow Ollama calls. Add basic validation (file size cap, magic byte check).

  3. JSON parsing from LLM output is fragile: The json_start/json_end extraction assumes the LLM wraps its response in a single JSON object. If the model returns malformed JSON or multiple objects, this will silently fail and fall through to the fallback. Consider a retry or stricter validation.

  4. Tests are solid — good coverage of mock scenarios and edge cases.

  5. timeout=30 on the Ollama call may be too short for vision model inference on larger images, especially on constrained hardware.

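To make item 1 concrete, here is a minimal sketch of the kind of fallback I mean; the names and the exact "unscreened" marker are illustrative, the point is that an unanalyzed image is never reported as SAFE.

```python
# Sketch of the suggested change, not the PR's code: log a WARNING and return a
# cautious, clearly marked "unscreened" result instead of SAFE.
import logging

logger = logging.getLogger(__name__)


def _analyze_fallback(image_bytes: bytes) -> ScreeningResult:
    logger.warning(
        "Vision model unavailable; image was NOT screened (size=%d bytes)",
        len(image_bytes),
    )
    return ScreeningResult(
        risk_level=RiskLevel.CONCERNING,  # cautious default for a crisis tool
        confidence=0.0,                   # explicitly "no analysis performed"
        categories=["unscreened"],        # lets downstream code tell this case apart
    )
```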
Timmy requested changes 2026-04-15 23:06:39 +00:00
Timmy left a comment
Owner

Image screening for self-harm is a sensitive and important feature. The architecture is privacy-preserving (in-memory analysis, no retention). Review:

  1. CRITICAL — Fallback defaults to SAFE: When no vision model is available, _analyze_fallback returns RiskLevel.SAFE with confidence=0.2. This is a dangerous default — if the model is down, ALL images pass through unscreened. The fallback should return CONCERNING or at minimum flag that analysis was not performed, so downstream systems know the image was NOT actually screened.

  2. PRIVACY: Local Ollama model is correct — using localhost:11434 ensures images never leave the machine. Good.

  3. Good: Structured screening prompt covering relevant categories (wounds, pills, farewell imagery, crisis searches).

  4. JSON parsing from LLM output is fragile: The _analyze_with_ollama function parses JSON from LLM output using string slicing. If the model outputs malformed JSON, it returns None which falls through to the SAFE fallback (see concern #1). Consider retrying or defaulting to CONCERNING on parse failure (see the sketch at the end of this review).

  5. Good: handle_chat_image response includes 988 resources for CRITICAL and gentle check-in for CONCERNING.

  6. Test coverage is good including mock testing of the Ollama integration.

Requesting changes: The SAFE fallback on model failure (#1) is a safety hazard. Unscreened images must not be marked SAFE.

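For item 4, one possible shape for stricter parsing; this is a sketch and the expected schema keys are assumptions. The caller would treat a None return as a failed screen and fall back to CONCERNING rather than SAFE.

```python
# Sketch only: validate the model's JSON before trusting it; a parse failure is
# reported as "not screened" (None) so the caller can default to CONCERNING.
import json

REQUIRED_KEYS = {"risk_level", "confidence"}  # assumed response schema


def _parse_screening_json(raw: str) -> dict | None:
    """Extract and validate a single JSON object from the model's output."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return None
    if not REQUIRED_KEYS.issubset(parsed):
        return None
    if parsed["risk_level"] not in {"safe", "concerning", "critical"}:
        return None
    return parsed
```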
Timmy closed this pull request 2026-04-17 01:52:33 +00:00
Owner

Archived — branch unknown preserved for reference. Cherry-pick if still relevant.


Pull request closed
