Alexander Payne
|
16e73dd143
|
feat(edge-crisis): add complete offline crisis detection deployment for edge devices
Smoke Test / smoke (pull_request) Successful in 11s
Deliverables for issue #102:
1. Deployment guide: docs/edge-crisis-deployment.md (11KB)
- Hardware targets: Raspberry Pi 4, Android Termux, old laptops
- Model selection: Bonsai-1.7B (primary, F1 0.86), Falcon-H1-Tiny-90M (fallback, 300MB)
- TurboQuant integration: llama-cpp-turboquant build + turbo4 KV compression
- Offline resource cache: 988 phone/text, Crisis Text Line (741741), SAMHSA, Trevor Project
- Crisis detection wrapper script + troubleshooting guide
2. Edge device profile: profiles/edge-crisis.yaml
- Hermes profile for local llama.cpp server with TurboQuant
- turbo4 compression on keys and values
- Minimal offline-only toolset (memory, read_file, write_file)
- Platform tuning: Pi 4 (4 threads), Android Termux (2 threads)
3. Offline resource cache: resources/crisis_resources.json
- Hotline database with multiple national services
- Local resource discovery pattern
- Self-care steps for acute crisis management
4. Offline test script: tests/test_edge_crisis_offline.sh
- End-to-end verification: prerequisites, server startup, health check
- Offline validation guidance (user performs network disconnect)
- Resource cache integrity check
- Clean bash-n syntax
Model rationale: Bonsai-1.7B (1.1GB GGUF Q4) runs ~8 tok/s on Pi 4 with TurboQuant
turbo4 reducing KV cache from 8GB to 2.2GB, enabling 8K context on 4GB RAM devices.
Falcon-H1-Tiny-90M (300MB) serves severely constrained hardware (<2GB RAM).
Closes #102
|
2026-04-29 00:05:43 -04:00 |
|