2026-04-16 01:52:05 +00:00
# Edge Model Selection for Crisis Detection
## Requirements
- Must run on devices with 2GB of RAM (keyword-only fallback on 1GB devices)
- Must detect crisis intent with >90% recall
- Latency <5s on Raspberry Pi 4
- Quantized (Q4_K_M or smaller)
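
The recall requirement can be checked against a labeled sample set before a model is deployed. The sketch below is one way to do that; the `recall` helper, the `detector` callable, and the `(text, is_crisis)` sample format are illustrative assumptions, not part of this project.

```python
from typing import Callable, Iterable, Tuple


def recall(
    detector: Callable[[str], bool],
    samples: Iterable[Tuple[str, bool]],
) -> float:
    """Fraction of true crisis samples that the detector flags.

    `samples` is an assumed format: (text, is_crisis) pairs.
    """
    true_positives = 0
    false_negatives = 0
    for text, is_crisis in samples:
        if not is_crisis:
            continue  # recall only looks at samples labeled as crisis
        if detector(text):
            true_positives += 1
        else:
            false_negatives += 1
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no crisis samples to evaluate")
    return true_positives / total
```

A candidate meets the requirement when `recall(detector, samples) > 0.90` on a representative evaluation set.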
## Candidates
### Tier 1: Recommended
| Model | Size (Q4) | RAM | Crisis Recall | Notes |
|-------|-----------|-----|---------------|-------|
| gemma2:2b | ~700MB | 2GB | ~85% | Best balance of size/quality |
| qwen2.5:1.5b | ~500MB | 1.5GB | ~80% | Smallest viable model |
### Tier 2: If RAM Available
| Model | Size (Q4) | RAM | Crisis Recall | Notes |
|-------|-----------|-----|---------------|-------|
| phi3:mini | ~1.2GB | 3GB | ~90% | Better nuance, needs more RAM |
| llama3.2:3b | ~1GB | 2.5GB | ~88% | Good general capability |
### Tier 3: Keyword Only (1GB devices)
No model needed. Pure keyword matching:
- "kill myself", "want to die", "suicide"
- "end it all", "no reason to live"
- "better off dead", "can't go on"

Recall: ~95%. The false positive rate is high, but this tier is safety-first.
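
A minimal sketch of the keyword detector, using the phrases listed above. The function names and the normalization steps (lowercasing, straightening curly apostrophes, collapsing whitespace) are assumptions for illustration. Matching is by plain substring, which is in keeping with the high false positive rate noted above.

```python
import re

# Phrases taken from the Tier 3 list in this document.
CRISIS_PHRASES = (
    "kill myself", "want to die", "suicide",
    "end it all", "no reason to live",
    "better off dead", "can't go on",
)


def _normalize(text: str) -> str:
    # Lowercase, map curly apostrophes to straight ones, collapse whitespace.
    text = text.lower().replace("\u2019", "'")
    return re.sub(r"\s+", " ", text)


def keyword_crisis_check(text: str) -> bool:
    """Return True if any crisis phrase appears as a substring of the text."""
    normalized = _normalize(text)
    return any(phrase in normalized for phrase in CRISIS_PHRASES)
```

Because it needs only the standard library and no model weights, a check like this fits comfortably on a 1GB device.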
## Selection Decision Tree
```
RAM >= 3GB? → phi3:mini (best recall)
RAM >= 2GB? → gemma2:2b (recommended)
RAM >= 1.5GB? → qwen2.5:1.5b (smallest model)
RAM < 1.5GB? → Keyword detector only
```
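
The decision tree above can be sketched as a small function. The name `select_model` and the convention of returning `None` for the keyword-only tier are assumptions; the thresholds and model tags are the ones listed in the tree.

```python
from typing import Optional


def select_model(ram_gb: float) -> Optional[str]:
    """Pick a model tag from available RAM in GB.

    Returns None when only the keyword detector should be used.
    """
    if ram_gb >= 3:
        return "phi3:mini"      # best recall
    if ram_gb >= 2:
        return "gemma2:2b"      # recommended
    if ram_gb >= 1.5:
        return "qwen2.5:1.5b"   # smallest model
    return None                 # keyword detector only
```

For example, `select_model(2.0)` returns `"gemma2:2b"` and `select_model(1.0)` returns `None`.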
## Prompt Template
```
You are a crisis detection system. Analyze the following text for signs of:
- Suicidal ideation
- Self-harm intent
- Acute psychological distress
- Hopelessness with actionable plan
Respond with EXACTLY one line:
SAFE or UNSAFE: <confidence 0-100>
Text: {user_input}
```
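
One way to fill the template and read the model's one-line reply is sketched below. The helper names are hypothetical. The parser accepts either a bare label or `LABEL: <confidence>`, since the template can be read both ways. It treats any unparseable reply as unsafe, which is an assumption in line with the safety-first stance of this document.

```python
import re
from typing import Optional, Tuple

PROMPT_TEMPLATE = """You are a crisis detection system. Analyze the following text for signs of:
- Suicidal ideation
- Self-harm intent
- Acute psychological distress
- Hopelessness with actionable plan
Respond with EXACTLY one line:
SAFE or UNSAFE: <confidence 0-100>
Text: {user_input}"""

# Matches "SAFE", "UNSAFE", "SAFE: 90", "UNSAFE: 90" (case-insensitive).
_VERDICT = re.compile(r"^\s*(SAFE|UNSAFE)\b(?:\s*:\s*(\d{1,3}))?\s*$", re.IGNORECASE)


def build_prompt(user_input: str) -> str:
    # str.replace avoids errors when the user text itself contains braces.
    return PROMPT_TEMPLATE.replace("{user_input}", user_input)


def parse_verdict(reply: str) -> Tuple[bool, Optional[int]]:
    """Return (is_unsafe, confidence) from the first line of the reply.

    Assumption: replies that do not match the expected format fail
    closed, i.e. they are reported as unsafe with no confidence.
    """
    lines = reply.strip().splitlines()
    first_line = lines[0] if lines else ""
    match = _VERDICT.match(first_line)
    if match is None:
        return True, None
    confidence = int(match.group(2)) if match.group(2) else None
    if confidence is not None and confidence > 100:
        return True, None  # out-of-range confidence is treated as malformed
    return match.group(1).upper() == "UNSAFE", confidence
```

Small quantized models do not always follow output formats exactly, so the caller should be prepared for the `(True, None)` fallback and may want to route those cases to the keyword detector as well.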