# Crisis Detection on Edge Devices
Deploy a minimal crisis detection system on low-power devices for offline use.
## Why Edge?
A person in crisis may not have internet. The model must run locally:
- No cloud dependency
- No API keys needed
- Works on airplane mode, rural areas, network outages
- Privacy: text never leaves the device
## Target Hardware
| Device | RAM | Expected Latency | Notes |
|--------|-----|------------------|-------|
| Raspberry Pi 4 (4GB) | 4GB | 2-5s per inference | Recommended. Use Q4_K_M quant. |
| Raspberry Pi 3B+ | 1GB | N/A (keyword-only) | Not enough RAM for the model; use the keyword detector. |
| Old Android phone | 2-4GB | 1-3s | Termux + llama.cpp. ARM NEON optimized. |
| Any Linux laptop | 4GB+ | <1s | Full model possible. |
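The table above implies a simple rule: devices with roughly 2GB of free RAM or more can run the quantized model, anything smaller should fall back to keyword-only mode. A minimal sketch of that decision, where `pick_mode` and the 2GB threshold are illustrative assumptions rather than part of the shipped `detector.py`:

```python
def pick_mode(ram_gb: float) -> str:
    """Choose a detection mode from available RAM.

    ~2GB leaves headroom for the OS plus a Q4_K_M-quantized ~2B model;
    below that, fall back to the instant keyword detector.
    """
    return "model" if ram_gb >= 2 else "keyword-only"
```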
## Quick Start (Raspberry Pi 4)
### 1. Install Ollama
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
### 2. Download the Crisis Detection Model
```bash
# Smallest reliable model: ~700MB Q4_K_M quant
ollama pull gemma2:2b
# Alternative: even smaller
ollama pull qwen2.5:1.5b
```
### 3. Copy the Edge Detector
```bash
# Copy the detector script and the offline resource cache to the device
scp edge/crisis_resources.json pi@raspberrypi:~/
scp edge/detector.py pi@raspberrypi:~/
```
### 4. Run Offline
```bash
# Disconnect from internet, then:
python3 detector.py --offline
# Interactive mode:
python3 detector.py --interactive
```
## How It Works
Three layers, fastest-first:
1. **Keyword Detection** (instant, no model)
- Matches against crisis keywords
- Zero latency, works on any device
- High recall, some false positives
2. **Model Inference** (1-5s)
- Only runs if keywords flag a match
- Uses smallest reliable model
- Returns confidence score
3. **Resource Display** (instant)
- Shows 988 Suicide & Crisis Lifeline
- Shows Crisis Text Line
- Shows local resources from cache
- Works fully offline
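The three layers above can be sketched as a single fastest-first pipeline. The keyword list, prompt wording, and 0.5 threshold here are illustrative assumptions, not the actual contents of `detector.py`:

```python
import json
import subprocess

# Illustrative subset; the real detector's keyword list is larger.
CRISIS_KEYWORDS = {"suicide", "kill myself", "end it all"}

def detect(text: str, resources_path: str = "crisis_resources.json") -> dict:
    """Fastest-first: keyword screen -> local model -> cached resources."""
    lowered = text.lower()
    # Layer 1: instant keyword screen. High recall, some false positives.
    if not any(k in lowered for k in CRISIS_KEYWORDS):
        return {"crisis": False}
    # Layer 2: only reached on a keyword hit; ask the local model to score.
    out = subprocess.run(
        ["ollama", "run", "gemma2:2b",
         f"Rate crisis risk from 0 to 1 for: {text}. Reply with a number only."],
        capture_output=True, text=True, timeout=30,
    )
    try:
        confidence = float(out.stdout.strip())
    except ValueError:
        confidence = 1.0  # fail safe: treat unparseable output as a crisis
    # Layer 3: load resources from the local cache -- fully offline.
    with open(resources_path) as f:
        resources = json.load(f)
    return {"crisis": confidence >= 0.5,
            "confidence": confidence,
            "resources": resources}
```

Because layer 1 short-circuits, non-matching text never touches the model or the resource file, which is what keeps the common path at zero latency on any device.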
## Systemd Service (Pi)
```bash
sudo cp edge/crisis-detect.service /etc/systemd/system/
sudo systemctl enable crisis-detect
sudo systemctl start crisis-detect
```
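The repo's `edge/crisis-detect.service` is not reproduced here, but a unit for a Pi deployment would plausibly look like the following sketch; the paths, user, and restart policy are assumptions:

```ini
[Unit]
Description=Offline crisis detection service
# Deliberately no network dependency: the service must work offline.

[Service]
ExecStart=/usr/bin/python3 /home/pi/detector.py --offline
User=pi
Restart=on-failure

[Install]
WantedBy=multi-user.target
```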
## Testing
```bash
# Test keyword detection (no internet needed)
python3 tests/test_edge_detector.py
# Test with actual model (needs Ollama running)
python3 detector.py --test-model
```
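A keyword-layer test needs no model and no network, so it can run on any of the target devices. A sketch of what such a test might check, where `contains_crisis_keywords` and its phrase list are assumptions about `detector.py`'s interface:

```python
def contains_crisis_keywords(text: str) -> bool:
    """Illustrative stand-in for the detector's keyword screen."""
    keywords = {"suicide", "kill myself", "self harm"}
    lowered = text.lower()
    return any(k in lowered for k in keywords)

# Multi-word phrases avoid false positives on their component words.
assert contains_crisis_keywords("I want to kill myself")
assert not contains_crisis_keywords("I killed it at the gym")
```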
## Troubleshooting
- **Out of memory**: Use `qwen2.5:1.5b` instead of `gemma2:2b`
- **Too slow**: Use keyword-only mode with `--no-model`
- **Model not found**: Run `ollama list` to verify download
- **Permission denied**: `chmod +x detector.py`