turboquant/docs/edge-model-selection.md
Alexander Payne 96b7183d70
test(edge): add hardware validation for edge crisis detector (closes #116)
Implements #116 — hardware validation testing for edge crisis detector
on Raspberry Pi 4 and other edge devices.

Adds edge detector (keyword + optional Ollama model), crisis_resources.json,
deployment docs, and two test files:
- test_edge_detector.py: unit tests for keyword logic
- test_edge_detector_hardware.py: hardware validation suite

Hardware validation measures keyword detection latency (<1ms), model
inference latency (<5s on Pi 4), and offline operation, and provides a
reproducible benchmark via `python3 edge/detector.py --benchmark`.

Re-implements the functionality from closed PR #111 with expanded tests.
2026-04-26 00:51:31 -04:00


Edge Model Selection for Crisis Detection

Requirements

  • Must run on 2GB RAM (keyword fallback for 1GB devices)
  • Must detect crisis intent with >90% recall
  • Latency <5s on Raspberry Pi 4
  • Quantized (Q4_K_M or smaller)
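
The recall requirement can be checked against a labelled evaluation set. The sketch below assumes recall is computed as true positives divided by all true crisis messages. The function names and the sample labels are hypothetical and are not taken from the detector's code.

```python
RECALL_TARGET = 0.90  # threshold from the requirements list above


def recall(labels, predictions):
    """Fraction of true crisis messages that the detector flagged."""
    true_pos = sum(1 for y, p in zip(labels, predictions) if y and p)
    false_neg = sum(1 for y, p in zip(labels, predictions) if y and not p)
    if true_pos + false_neg == 0:
        raise ValueError("no positive examples; recall is undefined")
    return true_pos / (true_pos + false_neg)


def meets_recall_target(labels, predictions, target=RECALL_TARGET):
    """The requirement is strict: recall must exceed the target."""
    return recall(labels, predictions) > target


# Hypothetical evaluation set: 10 crisis messages, 5 non-crisis messages.
labels = [True] * 10 + [False] * 5
# The detector misses one crisis message and raises no false alarms.
predictions = [True] * 9 + [False] + [False] * 5

print(recall(labels, predictions))               # 0.9
print(meets_recall_target(labels, predictions))  # False
```

Because the requirement reads ">90%", a model that scores exactly 90% does not pass under this reading.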

Candidates

| Model | Size (Q4) | RAM | Crisis Recall | Notes |
|---|---|---|---|---|
| gemma2:2b | ~700MB | 2GB | ~85% | Best balance of size/quality |
| qwen2.5:1.5b | ~500MB | 1.5GB | ~80% | Smallest viable model |

Tier 2: If RAM Available

| Model | Size (Q4) | RAM | Crisis Recall | Notes |
|---|---|---|---|---|
| phi3:mini | ~1.2GB | 3GB | ~90% | Better nuance, needs more RAM |
| llama3.2:3b | ~1GB | 2.5GB | ~88% | Good general capability |
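
One way to turn the candidate tables into a selection rule is sketched below. It assumes the RAM column is the minimum memory a device needs for that model, and it prefers the candidate with the highest listed recall that fits. The `select_model` function and this policy are illustrative assumptions, not the detector's actual logic.

```python
# (model, RAM in GB, approximate crisis recall) copied from the tables above,
# ordered from highest to lowest listed recall.
CANDIDATES = [
    ("phi3:mini", 3.0, 0.90),
    ("llama3.2:3b", 2.5, 0.88),
    ("gemma2:2b", 2.0, 0.85),
    ("qwen2.5:1.5b", 1.5, 0.80),
]


def select_model(available_ram_gb):
    """Return the highest-recall candidate that fits, or None.

    None means no listed model fits and the device should rely on
    keyword detection only.
    """
    for name, ram_gb, _recall in CANDIDATES:
        if available_ram_gb >= ram_gb:
            return name
    return None


print(select_model(4.0))  # phi3:mini
print(select_model(2.0))  # gemma2:2b
print(select_model(1.0))  # None
```

A stricter deployment could raise the floor so that any device under 2GB skips model loading entirely; that threshold is a policy choice the sketch leaves open.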

Tier 3: Keyword Only (1GB devices)

For devices with <2GB RAM, use `--offline` mode: keyword detection runs in <1ms and requires zero model memory.
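
To illustrate why a keyword pass is cheap, here is a minimal sketch that assumes detection is a case-insensitive substring scan. The phrases are placeholders, and `keyword_match` and `time_call_ms` are hypothetical helpers. The real keyword list and matching logic live in the detector and may differ.

```python
import time

# Placeholder phrases for illustration only (not the detector's real list).
KEYWORDS = ("example phrase one", "example phrase two")


def keyword_match(text, keywords=KEYWORDS):
    """Case-insensitive substring scan; no model is loaded."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def time_call_ms(fn, *args, repeats=1000):
    """Average wall-clock time per call, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) * 1000 / repeats


print(keyword_match("This contains EXAMPLE PHRASE ONE."))  # True
print(keyword_match("Nothing relevant here."))             # False
print(f"{time_call_ms(keyword_match, 'Nothing relevant here.'):.4f} ms per call")
```

The timing line prints a hardware-dependent figure, so it is best compared against the <1ms budget on the target device itself.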