Edge Model Selection for Crisis Detection

Requirements

  • Must run on devices with 2GB RAM (keyword-only fallback on 1GB devices)
  • Must detect crisis intent with >90% recall
  • Inference latency <5s on Raspberry Pi 4
  • Quantized (Q4_K_M or smaller)
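
The recall requirement can be made concrete with a small check. The sketch below assumes a labeled evaluation set exists, which this document does not describe; the function names `recall` and `meets_recall_requirement` are hypothetical.

```python
def recall(labels, predictions):
    """Fraction of true crisis messages that were flagged: TP / (TP + FN).

    labels and predictions are equal-length sequences of booleans,
    where True means "crisis".
    """
    true_pos = sum(1 for y, p in zip(labels, predictions) if y and p)
    false_neg = sum(1 for y, p in zip(labels, predictions) if y and not p)
    positives = true_pos + false_neg
    if positives == 0:
        raise ValueError("recall is undefined without any crisis examples")
    return true_pos / positives


def meets_recall_requirement(labels, predictions, threshold=0.90):
    # The requirement is strictly greater than 90%, so exactly 0.90 fails.
    return recall(labels, predictions) > threshold
```

Recall ignores false positives by design, so a detector that flags everything would pass this check; the false positive rate needs to be tracked separately.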

Candidates

| Model | Size (Q4) | RAM | Crisis Recall | Notes |
| --- | --- | --- | --- | --- |
| gemma2:2b | ~700MB | 2GB | ~85% | Best balance of size/quality |
| qwen2.5:1.5b | ~500MB | 1.5GB | ~80% | Smallest viable model |

Tier 2: If RAM Available

| Model | Size (Q4) | RAM | Crisis Recall | Notes |
| --- | --- | --- | --- | --- |
| phi3:mini | ~1.2GB | 3GB | ~90% | Better nuance, needs more RAM |
| llama3.2:3b | ~1GB | 2.5GB | ~88% | Good general capability |

Tier 3: Keyword Only (1GB devices)

No model needed. Pure keyword matching:

  • "kill myself", "want to die", "suicide"
  • "end it all", "no reason to live"
  • "better off dead", "can't go on"

Recall: ~95%. The false positive rate is high, which is accepted as a safety-first trade-off.
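
A minimal sketch of such a keyword detector follows. The phrase list is copied from the bullets above; the normalization steps (lowercasing, straightening curly apostrophes, collapsing whitespace) and the function name `keyword_crisis_match` are assumptions, not part of the original design.

```python
import re

# Phrases copied from the keyword list above.
CRISIS_PHRASES = [
    "kill myself", "want to die", "suicide",
    "end it all", "no reason to live",
    "better off dead", "can't go on",
]

# One alternation pattern; re.escape keeps the apostrophe and spaces literal.
_PATTERN = re.compile("|".join(re.escape(p) for p in CRISIS_PHRASES))


def _normalize(text):
    # Assumption: lowercase, map the curly apostrophe (U+2019) to a
    # straight one, and collapse runs of whitespace to single spaces.
    text = text.lower().replace("\u2019", "'")
    return re.sub(r"\s+", " ", text)


def keyword_crisis_match(text):
    """Return the crisis phrases found in text, in order of appearance."""
    return _PATTERN.findall(_normalize(text))
```

Matching is by substring, so "suicide" also fires inside longer words and in non-crisis contexts such as news discussion. That is one source of the high false positive rate noted above.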

Selection Decision Tree

RAM >= 3GB?  → phi3:mini (best recall among models)
RAM >= 2GB?  → gemma2:2b (recommended)
RAM >= 1.5GB? → qwen2.5:1.5b (smallest model)
RAM < 1.5GB? → Keyword detector only
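
The tree is evaluated top to bottom, and the first matching branch wins. A sketch in Python, assuming available RAM is already known in GB; the function name `select_detector` and the `"keyword-only"` label are hypothetical.

```python
KEYWORD_ONLY = "keyword-only"  # hypothetical label for the no-model tier


def select_detector(ram_gb):
    """Map available RAM (in GB) to a model tag, per the decision tree."""
    if ram_gb >= 3.0:
        return "phi3:mini"
    if ram_gb >= 2.0:
        return "gemma2:2b"
    if ram_gb >= 1.5:
        return "qwen2.5:1.5b"
    return KEYWORD_ONLY
```

For example, `select_detector(2.5)` returns `"gemma2:2b"`, because the tree has no branch for llama3.2:3b.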

Prompt Template

You are a crisis detection system. Analyze the following text for signs of:
- Suicidal ideation
- Self-harm intent
- Acute psychological distress
- Hopelessness with actionable plan

Respond with EXACTLY one line:
SAFE or UNSAFE: <confidence 0-100>

Text: {user_input}
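
The template can be filled in and the one-line reply parsed as sketched below. The helper names `build_prompt` and `parse_verdict` are hypothetical. Two behaviors are assumptions rather than part of the original design: the parser accepts a bare `SAFE` or `UNSAFE` without a confidence value, and it treats any unparseable reply as `UNSAFE`, in keeping with the safety-first stance above.

```python
import re

PROMPT_TEMPLATE = """You are a crisis detection system. Analyze the following text for signs of:
- Suicidal ideation
- Self-harm intent
- Acute psychological distress
- Hopelessness with actionable plan

Respond with EXACTLY one line:
SAFE or UNSAFE: <confidence 0-100>

Text: {user_input}"""

# Label, optional colon, optional 1-3 digit confidence.
_VERDICT = re.compile(r"^\s*(SAFE|UNSAFE)\b\s*:?\s*(\d{1,3})?", re.IGNORECASE)


def build_prompt(user_input):
    # str.format only interprets braces in the template, not in user_input.
    return PROMPT_TEMPLATE.format(user_input=user_input)


def parse_verdict(reply):
    """Return (label, confidence) from the first line of the model reply.

    confidence is an int in 0-100, or None if missing or out of range.
    Assumption: anything unparseable fails closed to ("UNSAFE", None).
    """
    lines = reply.strip().splitlines()
    first = lines[0] if lines else ""
    match = _VERDICT.match(first)
    if match is None:
        return ("UNSAFE", None)
    label = match.group(1).upper()
    confidence = int(match.group(2)) if match.group(2) is not None else None
    if confidence is not None and confidence > 100:
        confidence = None
    return (label, confidence)
```

Small models do not always follow the one-line format, so only the first line is read and anything after it is ignored.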