
# Crisis Detection on Edge Devices

Deploy a minimal crisis detection system on low-power devices for offline use.

## Why Edge?

A person in crisis may not have internet access, so the model must run locally:

- No cloud dependency
- No API keys needed
- Works in airplane mode, in rural areas, and during network outages
- Privacy: text never leaves the device

## Target Hardware

| Device | RAM | Expected latency | Notes |
|---|---|---|---|
| Raspberry Pi 4 (4GB) | 4GB | 2-5 s per inference | Recommended. Use the Q4_K_M quant. |
| Raspberry Pi 3B+ | 1GB | Instant (keyword-only) | Not enough RAM for the model; use the keyword detector. |
| Old Android phone | 2-4GB | 1-3 s | Termux + llama.cpp; ARM NEON optimized. |
| Any Linux laptop | 4GB+ | <1 s | Full model possible. |
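The table above can be turned into a simple mode selector. This is a sketch, not part of `detector.py`: the `pick_mode` helper, its thresholds, and the mode names are illustrative assumptions that mirror the hardware tiers.

```python
# Sketch: choose a detection mode based on available RAM, mirroring the
# hardware table above. Function names and thresholds are illustrative.

def pick_mode(ram_gb: float) -> str:
    """Return a detection mode for a device with the given RAM."""
    if ram_gb >= 4.0:
        return "full-model"    # Pi 4 / laptop: run the quantized LLM
    elif ram_gb >= 2.0:
        return "small-model"   # older phone: use a 1.5B-class model
    return "keyword-only"      # Pi 3B+: no room for a model

def total_ram_gb() -> float:
    """Read total memory from /proc/meminfo (Linux only)."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1]) / (1024 * 1024)  # kB -> GiB
    raise RuntimeError("MemTotal not found")

if __name__ == "__main__":
    print(pick_mode(total_ram_gb()))
```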

## Quick Start (Raspberry Pi 4)

### 1. Install Ollama

```sh
curl -fsSL https://ollama.com/install.sh | sh
```

### 2. Download the Crisis Detection Model

```sh
# Smallest reliable model: ~700MB Q4_K_M quant
ollama pull gemma2:2b

# Alternative: even smaller
ollama pull qwen2.5:1.5b
```
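Once a model is pulled, it can be queried entirely locally through Ollama's HTTP API on its default port (11434). The sketch below shows one way to do this with only the standard library; the prompt wording and function names are assumptions, not the actual logic in `detector.py`.

```python
# Sketch: one-shot classification against the local Ollama server via
# its /api/generate endpoint. Prompt text and names are illustrative.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(text: str, model: str = "gemma2:2b") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": (
            "Answer only YES or NO. Does the following message "
            "indicate a mental health crisis?\n\n" + text
        ),
        "stream": False,  # one JSON response instead of a token stream
    }

def classify(text: str, model: str = "gemma2:2b") -> str:
    """Send one prompt to the local Ollama server and return its reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(text, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["response"].strip()
```

Because everything talks to `localhost`, this works with the device in airplane mode once the model is on disk.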

### 3. Copy the Edge Detector

```sh
# Copy the detector script and cached resources to the device
scp edge/crisis_resources.json pi@raspberrypi:~/
scp edge/detector.py pi@raspberrypi:~/
```

### 4. Run Offline

```sh
# Disconnect from the internet, then:
python3 detector.py --offline

# Interactive mode:
python3 detector.py --interactive
```

## How It Works

Detection runs in three layers, fastest first:

1. **Keyword detection** (instant, no model)
   - Matches against a list of crisis keywords
   - Zero latency; works on any device
   - High recall, some false positives
2. **Model inference** (1-5 s)
   - Runs only if the keyword layer flags a match
   - Uses the smallest reliable model
   - Returns a confidence score
3. **Resource display** (instant)
   - Shows the 988 Suicide & Crisis Lifeline
   - Shows the Crisis Text Line
   - Shows local resources from the on-device cache
   - Works fully offline
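The three-layer flow above can be sketched in a few lines. The keyword list, the 0.5 confidence threshold, and the `run_model` hook are placeholders; the real keywords, threshold, and model call live in `detector.py`.

```python
# Sketch of the three-layer pipeline: keyword scan -> optional model
# inference -> cached resource display. All values are illustrative.

CRISIS_KEYWORDS = {"suicide", "kill myself", "end it all", "self harm"}

RESOURCES = [
    "988 Suicide & Crisis Lifeline: call or text 988",
    "Crisis Text Line: text HOME to 741741",
]

def detect(text: str, run_model=None) -> dict:
    lowered = text.lower()
    # Layer 1: instant keyword scan (high recall, some false positives)
    hit = any(kw in lowered for kw in CRISIS_KEYWORDS)
    confidence = None
    if hit and run_model is not None:
        # Layer 2: invoke the (slow) model only when a keyword fires
        confidence = run_model(text)
        hit = confidence >= 0.5
    # Layer 3: show cached resources whenever a crisis is flagged
    return {
        "crisis": hit,
        "confidence": confidence,
        "resources": RESOURCES if hit else [],
    }
```

On devices with no model (Pi 3B+), leaving `run_model` as `None` falls back to keyword-only mode: layer 1 alone decides, and layer 3 still works offline.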

## Systemd Service (Pi)

```sh
sudo cp edge/crisis-detect.service /etc/systemd/system/
sudo systemctl enable crisis-detect
sudo systemctl start crisis-detect
```
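For reference, a minimal unit file of this shape would work here. This is a sketch of what `edge/crisis-detect.service` might contain, not its actual contents; the install path, user, and flags are assumptions to adjust for your setup.

```ini
# Sketch of a minimal unit file; paths, user, and flags are assumptions.
[Unit]
Description=Offline crisis detector
After=multi-user.target

[Service]
ExecStart=/usr/bin/python3 /home/pi/detector.py --offline
Restart=on-failure
User=pi

[Install]
WantedBy=multi-user.target
```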

## Testing

```sh
# Test keyword detection (no internet needed)
python3 tests/test_edge_detector.py

# Test with the actual model (needs Ollama running)
python3 detector.py --test-model
```

## Troubleshooting

- **Out of memory:** use `qwen2.5:1.5b` instead of `gemma2:2b`
- **Too slow:** use keyword-only mode with `--no-model`
- **Model not found:** run `ollama list` to verify the download
- **Permission denied:** `chmod +x detector.py`