# Crisis Detection on Edge Devices
Deploy a minimal crisis detection system on low-power devices for offline use.
## Why Edge?
A person in crisis may not have internet. The model must run locally:
- No cloud dependency
- No API keys needed
- Works on airplane mode, rural areas, network outages
- Privacy: text never leaves the device
## Target Hardware
| Device | RAM | Expected Latency | Notes |
|--------|-----|------------------|-------|
| Raspberry Pi 4 (4GB) | 4GB | 2-5s per inference | Recommended. Use Q4_K_M quant. |
| Raspberry Pi 3B+ | 1GB | N/A (keyword-only) | Not enough RAM for the model; use the keyword detector. |
| Old Android phone | 2-4GB | 1-3s | Termux + llama.cpp. ARM NEON optimized. |
| Any Linux laptop | 4GB+ | <1s | Full model possible. |
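The table above implies a simple rule: devices with roughly 2GB of free RAM or more can run the quantized model, anything smaller should fall back to keyword-only mode. A minimal sketch of that decision, where `pick_mode` and the 2GB threshold are illustrative assumptions rather than part of the shipped `detector.py`:

```python
def pick_mode(ram_gb: float) -> str:
    """Choose a detection mode from available RAM.

    ~2GB leaves headroom for the OS plus a Q4_K_M-quantized ~2B model;
    below that, fall back to the instant keyword detector.
    """
    return "model" if ram_gb >= 2 else "keyword-only"
```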
## Quick Start (Raspberry Pi 4)
### 1. Install Ollama
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
### 2. Download the Crisis Detection Model
```bash
# Smallest reliable model: ~700MB Q4_K_M quant
ollama pull gemma2:2b
# Alternative: even smaller
ollama pull qwen2.5:1.5b
```
### 3. Copy the Edge Detector
```bash
# Copy the detector script and the offline resource cache to the device
scp edge/crisis_resources.json pi@raspberrypi:~/
scp edge/detector.py pi@raspberrypi:~/
```
### 4. Run Offline
```bash
# Disconnect from internet, then:
python3 detector.py --offline
# Interactive mode:
python3 detector.py --interactive
```
## How It Works
Three layers, fastest-first:
1. **Keyword Detection** (instant, no model)
- Matches against crisis keywords
- Zero latency, works on any device
- High recall, some false positives
2. **Model Inference** (1-5s)
- Only runs if keywords flag a match
- Uses smallest reliable model
- Returns confidence score
3. **Resource Display** (instant)
- Shows 988 Suicide & Crisis Lifeline
- Shows Crisis Text Line
- Shows local resources from cache
- Works fully offline
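The three layers above can be sketched as a single fastest-first pipeline. The keyword list, prompt wording, and 0.5 threshold here are illustrative assumptions, not the actual contents of `detector.py`:

```python
import json
import subprocess

# Illustrative subset; the real detector's keyword list is larger.
CRISIS_KEYWORDS = {"suicide", "kill myself", "end it all"}

def detect(text: str, resources_path: str = "crisis_resources.json") -> dict:
    """Fastest-first: keyword screen -> local model -> cached resources."""
    lowered = text.lower()
    # Layer 1: instant keyword screen. High recall, some false positives.
    if not any(k in lowered for k in CRISIS_KEYWORDS):
        return {"crisis": False}
    # Layer 2: only reached on a keyword hit; ask the local model to score.
    out = subprocess.run(
        ["ollama", "run", "gemma2:2b",
         f"Rate crisis risk from 0 to 1 for: {text}. Reply with a number only."],
        capture_output=True, text=True, timeout=30,
    )
    try:
        confidence = float(out.stdout.strip())
    except ValueError:
        confidence = 1.0  # fail safe: treat unparseable output as a crisis
    # Layer 3: load resources from the local cache -- fully offline.
    with open(resources_path) as f:
        resources = json.load(f)
    return {"crisis": confidence >= 0.5,
            "confidence": confidence,
            "resources": resources}
```

Because layer 1 short-circuits, non-matching text never touches the model or the resource file, which is what keeps the common path at zero latency on any device.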
## Systemd Service (Pi)
```bash
sudo cp edge/crisis-detect.service /etc/systemd/system/
sudo systemctl enable crisis-detect
sudo systemctl start crisis-detect
```
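The repo's `edge/crisis-detect.service` is not reproduced here, but a unit for a Pi deployment would plausibly look like the following sketch; the paths, user, and restart policy are assumptions:

```ini
[Unit]
Description=Offline crisis detection service
# Deliberately no network dependency: the service must work offline.

[Service]
ExecStart=/usr/bin/python3 /home/pi/detector.py --offline
User=pi
Restart=on-failure

[Install]
WantedBy=multi-user.target
```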
## Testing
```bash
# Test keyword detection (no internet needed)
python3 tests/test_edge_detector.py
# Test with actual model (needs Ollama running)
python3 detector.py --test-model
```
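A keyword-layer test needs no model and no network, so it can run on any of the target devices. A sketch of what such a test might check, where `contains_crisis_keywords` and its phrase list are assumptions about `detector.py`'s interface:

```python
def contains_crisis_keywords(text: str) -> bool:
    """Illustrative stand-in for the detector's keyword screen."""
    keywords = {"suicide", "kill myself", "self harm"}
    lowered = text.lower()
    return any(k in lowered for k in keywords)

# Multi-word phrases avoid false positives on their component words.
assert contains_crisis_keywords("I want to kill myself")
assert not contains_crisis_keywords("I killed it at the gym")
```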
## Troubleshooting
- **Out of memory**: Use `qwen2.5:1.5b` instead of `gemma2:2b`
- **Too slow**: Use keyword-only mode with `--no-model`
- **Model not found**: Run `ollama list` to verify download
- **Permission denied**: `chmod +x detector.py`