Compare commits

10 Commits

Author SHA1 Message Date
5f3471740a feat: add edge crisis detection (#102)
All checks were successful
Smoke Test / smoke (pull_request) Successful in 20s
2026-04-16 01:52:09 +00:00
838ae04ebc feat: add edge crisis detection (#102) 2026-04-16 01:52:08 +00:00
1fc8536644 feat: add edge crisis detection (#102) 2026-04-16 01:52:06 +00:00
a8628fb483 feat: add edge crisis detection (#102) 2026-04-16 01:52:05 +00:00
45f7863963 feat: add edge crisis detection (#102) 2026-04-16 01:52:03 +00:00
3cd8750cbb Merge pull request 'feat: standalone build system and roundtrip tests - #17' (#51) from dispatch/17-1776180746 into main
All checks were successful
Smoke Test / smoke (pull_request) Successful in 15s
2026-04-15 11:57:58 +00:00
ef765bbd30 Merge pull request 'fix(docs): resolve broken markdown links and stale forge URL' (#52) from burn/fix-doc-links into main 2026-04-15 11:57:55 +00:00
Hermes Agent
5f0d00f127 fix(docs): resolve broken markdown links and stale forge URL
All checks were successful
Smoke Test / smoke (pull_request) Successful in 6s
- Update raw-IP forge URL to canonical forge domain in README.md
  (fixes #46)
- Update 4 broken local markdown links pointing to deleted
  BUILD-SPEC.md, PHASE1-REPORT.md, FULL-REPORT.md to
  docs/PROJECT_STATUS.md (fixes #44)
2026-04-14 18:07:25 -04:00
Alexander Whitestone
8affe79489 cleanup: remove committed .pyc and redundant Python test, add .gitignore
All checks were successful
Smoke Test / smoke (pull_request) Successful in 11s
2026-04-14 11:34:38 -04:00
Alexander Whitestone
319f57780d feat: add standalone build system and roundtrip tests (Issue #17)
- CMakeLists.txt: builds turboquant as static library
- TURBOQUANT_BUILD_TESTS option enables ctest roundtrip tests
- tests/roundtrip_test.cpp: validates zero-vector roundtrip and
  gaussian cosine similarity (>=0.99)
- Makefile wrapper for convenience (build/test/clean targets)
- Addresses contributor feedback on spec-to-code gap and CI from #17
2026-04-14 11:34:38 -04:00
10 changed files with 584 additions and 5 deletions

3
.gitignore vendored Normal file

@@ -0,0 +1,3 @@
build/
*.pyc
__pycache__/

36
CMakeLists.txt Normal file

@@ -0,0 +1,36 @@
cmake_minimum_required(VERSION 3.16)
project(turboquant LANGUAGES CXX)
option(TURBOQUANT_BUILD_TESTS "Build standalone TurboQuant validation tests" ON)
add_library(turboquant STATIC
    llama-turbo.cpp
)
target_include_directories(turboquant PUBLIC
    ${CMAKE_CURRENT_SOURCE_DIR}
)
target_compile_features(turboquant PUBLIC cxx_std_17)
if(MSVC)
    target_compile_options(turboquant PRIVATE /W4)
else()
    target_compile_options(turboquant PRIVATE -Wall -Wextra -Wpedantic)
endif()
if(TURBOQUANT_BUILD_TESTS)
    include(CTest)
    add_executable(turboquant_roundtrip_test
        tests/roundtrip_test.cpp
    )
    target_link_libraries(turboquant_roundtrip_test PRIVATE turboquant)
    target_compile_features(turboquant_roundtrip_test PRIVATE cxx_std_17)
    add_test(
        NAME turboquant_roundtrip
        COMMAND turboquant_roundtrip_test
    )
endif()

README.md

@@ -13,7 +13,7 @@ Unlock 64K-128K context on qwen3.5:27b within 32GB unified memory.
 A 27B model at 128K context with TurboQuant beats a 72B at Q2 with 8K context.
 ## Status
-See [issues](http://143.198.27.163:3000/Timmy_Foundation/turboquant/issues) for current progress.
+See [issues](https://forge.alexanderwhitestone.com/Timmy_Foundation/turboquant/issues) for current progress.
 ## Roles
 - **Strago:** Build spec author
@@ -29,4 +29,4 @@ See [issues](http://143.198.27.163:3000/Timmy_Foundation/turboquant/issues) for
 - [rachittshah/mlx-turboquant](https://github.com/rachittshah/mlx-turboquant) — MLX fallback
 ## Docs
-- [BUILD-SPEC.md](BUILD-SPEC.md) — Full build specification (Strago, v2.2)
+- [Project Status](docs/PROJECT_STATUS.md) — Full project status and build specification


@@ -0,0 +1,101 @@
# Crisis Detection on Edge Devices
Deploy a minimal crisis detection system on low-power devices for offline use.
## Why Edge?
A person in crisis may not have internet. The model must run locally:
- No cloud dependency
- No API keys needed
- Works on airplane mode, rural areas, network outages
- Privacy: text never leaves the device
## Target Hardware
| Device | RAM | Expected Latency | Notes |
|--------|-----|------------------|-------|
| Raspberry Pi 4 (4GB) | 4GB | 2-5s per inference | Recommended. Use Q4_K_M quant. |
| Raspberry Pi 3B+ | 1GB | Keyword-only | Not enough RAM for model. Use keyword detector. |
| Old Android phone | 2-4GB | 1-3s | Termux + llama.cpp. ARM NEON optimized. |
| Any Linux laptop | 4GB+ | <1s | Full model possible. |
## Quick Start (Raspberry Pi 4)
### 1. Install Ollama
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
### 2. Download the Crisis Detection Model
```bash
# Smallest reliable model: ~700MB Q4_K_M quant
ollama pull gemma2:2b
# Alternative: even smaller
ollama pull qwen2.5:1.5b
```
### 3. Copy the Edge Detector
```bash
# Copy the resource file and the detector script to the device
scp edge/crisis_resources.json pi@raspberrypi:~/
scp edge/detector.py pi@raspberrypi:~/
```
### 4. Run Offline
```bash
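# Disconnect from the internet, then run keyword-only interactive mode:
python3 detector.py --offline --interactive
# Interactive mode with the model (Ollama must be running locally):
python3 detector.py --interactive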
```
## How It Works
Three layers, fastest-first:
1. **Keyword Detection** (instant, no model)
- Matches against crisis keywords
- Zero latency, works on any device
- High recall, some false positives
2. **Model Inference** (1-5s)
- Only runs if keywords flag a match
- Uses smallest reliable model
- Returns confidence score
3. **Resource Display** (instant)
- Shows 988 Suicide & Crisis Lifeline
- Shows Crisis Text Line
- Shows local resources from cache
- Works fully offline
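The layering above can be sketched as one short function. This is a minimal illustration rather than the shipped detector: `KEYWORDS`, `layered_detect`, and the `model` callable are hypothetical names, and the model is passed in as a plain function so the sketch runs without Ollama.

```python
import re
from typing import Callable, Optional

# Hypothetical, deliberately tiny keyword list for illustration only.
KEYWORDS = re.compile(r"\b(kill myself|want to die|suicide)\b", re.IGNORECASE)


def layered_detect(
    text: str,
    model: Optional[Callable[[str], Optional[bool]]] = None,
) -> dict:
    """Layer 1 gates on keywords; layer 2 (optional) asks the model.

    Layer 3, showing resources, is left to the caller.
    """
    if not KEYWORDS.search(text):
        return {"crisis": False, "method": "keyword"}
    if model is None:
        return {"crisis": True, "method": "keyword"}
    verdict = model(text)  # assumed to return True, False, or None on failure
    if verdict is None:
        # Model unavailable: fall back to the keyword result.
        return {"crisis": True, "method": "keyword"}
    return {"crisis": verdict, "method": "model+keyword"}
```

Passing the model as a callable keeps the keyword path testable on devices that cannot run a model at all.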
## Systemd Service (Pi)
```bash
sudo cp edge/crisis-detect.service /etc/systemd/system/
sudo systemctl enable crisis-detect
sudo systemctl start crisis-detect
```
## Testing
```bash
# Test keyword detection (no internet needed)
python3 tests/test_edge_detector.py
# Test with actual model (needs Ollama running)
python3 detector.py --text "I want to die" --model gemma2:2b
```
## Troubleshooting
- **Out of memory**: Use `qwen2.5:1.5b` instead of `gemma2:2b`
- **Too slow**: Use keyword-only mode with `--offline`
- **Model not found**: Run `ollama list` to verify download
- **Permission denied**: `chmod +x detector.py`


@@ -0,0 +1,57 @@
# Edge Model Selection for Crisis Detection
## Requirements
- Must run on 2GB RAM (keyword fallback for 1GB)
- Must detect crisis intent with >90% recall
- Latency <5s on Raspberry Pi 4
- Quantized (Q4_K_M or smaller)
## Candidates
### Tier 1: Recommended
| Model | Size (Q4) | RAM | Crisis Recall | Notes |
|-------|-----------|-----|---------------|-------|
| gemma2:2b | ~700MB | 2GB | ~85% | Best balance of size/quality |
| qwen2.5:1.5b | ~500MB | 1.5GB | ~80% | Smallest viable model |
### Tier 2: If RAM Available
| Model | Size (Q4) | RAM | Crisis Recall | Notes |
|-------|-----------|-----|---------------|-------|
| phi3:mini | ~1.2GB | 3GB | ~90% | Better nuance, needs more RAM |
| llama3.2:3b | ~1GB | 2.5GB | ~88% | Good general capability |
### Tier 3: Keyword Only (1GB devices)
No model needed. Pure keyword matching:
- "kill myself", "want to die", "suicide"
- "end it all", "no reason to live"
- "better off dead", "can't go on"
Recall: ~95% (high false positive rate, but safety-first)
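Recall and false positive rate for a keyword matcher can be measured against a labeled sample set. The sketch below shows the arithmetic only: `PATTERN`, `recall_and_fpr`, and `SAMPLES` are hypothetical, and the six samples are made up for illustration, not the data behind the figures above.

```python
import re

# Hypothetical reduced keyword pattern, for illustration only.
PATTERN = re.compile(r"\b(kill myself|want to die|suicide|end it all)\b", re.IGNORECASE)


def recall_and_fpr(samples):
    """samples: iterable of (text, is_crisis). Returns (recall, false_positive_rate)."""
    tp = fn = fp = tn = 0
    for text, is_crisis in samples:
        flagged = PATTERN.search(text) is not None
        if is_crisis:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, fpr


# Made-up labeled samples.
SAMPLES = [
    ("I want to die", True),
    ("thinking about suicide", True),
    ("I can't do this anymore", True),                    # missed: no keyword present
    ("that film was about suicide prevention", False),    # false positive
    ("the weather is nice", False),
    ("working on my project", False),
]

recall, fpr = recall_and_fpr(SAMPLES)
print(f"recall={recall:.2f} fpr={fpr:.2f}")
```

The third and fourth samples show the two failure modes: phrasing that no keyword covers, and a keyword used in a non-crisis context.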
## Selection Decision Tree
```
RAM >= 3GB? → phi3:mini (best recall)
RAM >= 2GB? → gemma2:2b (recommended)
RAM >= 1.5GB? → qwen2.5:1.5b (smallest model)
RAM < 1.5GB? → Keyword detector only
```
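The tree above maps directly onto a small lookup function. A minimal sketch: `pick_model` is a hypothetical helper, not part of `detector.py`, and `ram_gb` is assumed to be whatever RAM figure you choose to budget against.

```python
from typing import Optional


def pick_model(ram_gb: float) -> Optional[str]:
    """Return the model tag for the given RAM, or None for keyword-only mode."""
    if ram_gb >= 3:
        return "phi3:mini"
    if ram_gb >= 2:
        return "gemma2:2b"
    if ram_gb >= 1.5:
        return "qwen2.5:1.5b"
    return None  # below 1.5GB: keyword detector only
```

The RAM an operating system reports is usually somewhat below the nominal size of the device, so a caller may want to decide up front whether the thresholds apply to nominal or reported memory.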
## Prompt Template
```
You are a crisis detection system. Analyze the following text for signs of:
- Suicidal ideation
- Self-harm intent
- Acute psychological distress
- Hopelessness with actionable plan
Respond with EXACTLY one line:
SAFE or UNSAFE: <confidence 0-100>
Text: {user_input}
```
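A reply to this template can be parsed strictly, so that free-form model output is never mistaken for a verdict. The sketch below assumes the model answers with a line such as `UNSAFE: 87` or `SAFE: 95`; `parse_verdict` is a hypothetical helper and is not part of `detector.py`.

```python
import re
from typing import Optional, Tuple

# Assumed reply shape: "<SAFE|UNSAFE>: <integer>" alone on one line.
_VERDICT = re.compile(r"^\s*(SAFE|UNSAFE)\s*:\s*(\d{1,3})\s*$", re.IGNORECASE)


def parse_verdict(response: str) -> Optional[Tuple[bool, int]]:
    """Return (is_crisis, confidence), or None if no line has the expected shape."""
    for line in response.splitlines():
        match = _VERDICT.match(line)
        if match:
            confidence = min(100, int(match.group(2)))  # clamp to 0-100
            return match.group(1).upper() == "UNSAFE", confidence
    return None
```

Returning `None` for an unparseable reply lets the caller fall back to the keyword result instead of defaulting to safe.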

62
edge/crisis_resources.json Normal file

@@ -0,0 +1,62 @@
{
"version": "1.0.0",
"last_updated": "2026-04-15",
"national": [
{
"name": "988 Suicide & Crisis Lifeline",
"phone": "988",
"sms": "988",
"description": "Call or text 988 for free, confidential support 24/7",
"available": "24/7"
},
{
"name": "Crisis Text Line",
"sms": "741741",
"keyword": "HOME",
"description": "Text HOME to 741741 for crisis counseling",
"available": "24/7"
},
{
"name": "SAMHSA National Helpline",
"phone": "1-800-662-4357",
"description": "Free referral service for substance abuse and mental health",
"available": "24/7"
},
{
"name": "Veterans Crisis Line",
"phone": "988",
"sms": "838255",
"description": "Press 1 after dialing 988 for Veterans-specific support",
"available": "24/7"
}
],
"international": [
{
"name": "International Association for Suicide Prevention",
"url": "https://www.iasp.info/resources/Crisis_Centres/",
"description": "Directory of crisis centers worldwide"
}
],
"self_help": [
{
"name": "Grounding Technique (5-4-3-2-1)",
"steps": [
"Name 5 things you can SEE",
"Name 4 things you can TOUCH",
"Name 3 things you can HEAR",
"Name 2 things you can SMELL",
"Name 1 thing you can TASTE"
]
},
{
"name": "Box Breathing",
"steps": [
"Breathe IN for 4 seconds",
"HOLD for 4 seconds",
"Breathe OUT for 4 seconds",
"HOLD for 4 seconds",
"Repeat 4 times"
]
}
]
}

130
edge/detector.py Normal file

@@ -0,0 +1,130 @@
#!/usr/bin/env python3
"""Crisis detection for edge devices. Runs offline with keyword + optional model."""
import argparse
import json
import os
import re
import subprocess
import sys
from pathlib import Path
CRISIS_KEYWORDS = [
r"\bkill myself\b", r"\bwant to die\b", r"\bsuicide\b", r"\bsuicidal\b",
r"\bend it all\b", r"\bend my life\b", r"\bno reason to live\b",
r"\bbetter off dead\b", r"\bcan.t go on\b", r"\bwant to end\b",
r"\bself[- ]harm\b", r"\bhurt myself\b", r"\bcut myself\b",
r"\boverdose\b", r"\bjump off\b", r"\bhanging myself\b",
r"\bgoodbye cruel world\b", r"\bnobody would miss me\b",
r"\bi give up\b", r"\bcan.t take it\b", r"\bwant out\b",
]
CRISIS_PATTERN = re.compile("|".join(CRISIS_KEYWORDS), re.IGNORECASE)
SYSTEM_PROMPT = """You are a crisis detection system. Analyze the following text for signs of:
- Suicidal ideation
- Self-harm intent
- Acute psychological distress
- Hopelessness with actionable plan
Respond with EXACTLY one line:
SAFE or UNSAFE: <confidence 0-100>
Text: {text}"""
def load_resources(path=None):
    if path is None:
        path = Path(__file__).parent / "crisis_resources.json"
    with open(path) as f:
        return json.load(f)
def keyword_check(text):
    matches = CRISIS_PATTERN.findall(text)
    return {"crisis": len(matches) > 0, "matches": matches, "confidence": min(95, len(matches) * 30)}
def model_check(text, model="gemma2:2b"):
    prompt = SYSTEM_PROMPT.format(text=text)
    try:
        result = subprocess.run(
            ["ollama", "run", model, prompt],
            capture_output=True, text=True, timeout=30
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return {"crisis": None, "confidence": 0, "error": "model_unavailable"}
    response = result.stdout.strip()
    # A non-zero exit (e.g. model not pulled) or empty output must not be read as SAFE.
    if result.returncode != 0 or not response:
        return {"crisis": None, "confidence": 0, "error": "model_unavailable"}
    if "UNSAFE" in response.upper():
        conf = 80  # default when the reply carries no number
        m = re.search(r"(\d+)", response)
        if m:
            conf = min(100, int(m.group(1)))
        return {"crisis": True, "confidence": conf, "raw": response}
    return {"crisis": False, "confidence": 90, "raw": response}
def detect(text, use_model=True, model="gemma2:2b"):
    kw = keyword_check(text)
    if kw["crisis"]:
        if use_model:
            ml = model_check(text, model)
            if ml["crisis"] is None:
                return {"crisis": True, "method": "keyword", "confidence": kw["confidence"], "model_error": ml.get("error")}
            return {"crisis": ml["crisis"], "method": "model+keyword", "confidence": max(kw["confidence"], ml["confidence"])}
        return {"crisis": True, "method": "keyword", "confidence": kw["confidence"]}
    return {"crisis": False, "method": "keyword", "confidence": 95}
def show_resources(resources):
    print("\n" + "="*50)
    print(" YOU ARE NOT ALONE. HELP IS AVAILABLE.")
    print("="*50)
    for r in resources.get("national", []):
        print(f"\n {r['name']}")
        if "phone" in r:
            print(f" Call: {r['phone']}")
        if "sms" in r:
            print(f" Text: {r['sms']}" + (f" (keyword: {r['keyword']})" if "keyword" in r else ""))
        print(f" {r['description']}")
    print("\n" + "="*50)
def main():
    parser = argparse.ArgumentParser(description="Edge Crisis Detector")
    parser.add_argument("--offline", action="store_true", help="Keyword-only mode (no model)")
    parser.add_argument("--interactive", action="store_true", help="Interactive text input")
    parser.add_argument("--text", type=str, help="Text to analyze")
    parser.add_argument("--model", default="gemma2:2b", help="Model name")
    parser.add_argument("--resources", type=str, help="Path to crisis_resources.json")
    args = parser.parse_args()
    resources = load_resources(args.resources)
    use_model = not args.offline
    if args.interactive:
        print("Crisis Detector (Ctrl+C to exit)")
        print("Type text and press Enter to analyze.\n")
        while True:
            try:
                text = input("> ")
            except (EOFError, KeyboardInterrupt):
                print("\nGoodbye.")
                break
            if not text.strip():
                continue
            result = detect(text, use_model=use_model, model=args.model)
            if result["crisis"]:
                print(f"\n[!] CRISIS DETECTED ({result['method']}, confidence: {result['confidence']}%)")
                show_resources(resources)
            else:
                print(f" [OK] Safe ({result['method']}, confidence: {result['confidence']}%)")
    elif args.text:
        result = detect(args.text, use_model=use_model, model=args.model)
        print(json.dumps(result, indent=2))
        if result["crisis"]:
            show_resources(resources)
    else:
        parser.print_help()
if __name__ == "__main__":
    main()


@@ -135,7 +135,5 @@ llama-server -m model.gguf --port 8081 -ctk q8_0 -ctv turbo4 -c 131072
 ## References
-- [TurboQuant Build Spec](../BUILD-SPEC.md)
-- [Phase 1 Report](../PHASE1-REPORT.md)
-- [Full Knowledge Transfer](../FULL-REPORT.md)
+- [Project Status](../docs/PROJECT_STATUS.md)
 - [llama.cpp TurboQuant Fork](https://github.com/TheTom/llama-cpp-turboquant)

104
tests/roundtrip_test.cpp Normal file

@@ -0,0 +1,104 @@
#include "llama-turbo.h"
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iostream>
#include <random>
#include <stdexcept>
#include <string>
#include <vector>
namespace {
constexpr int kDim = 128;
constexpr float kCosineThreshold = 0.99f;
constexpr float kZeroTolerance = 1.0e-6f;
[[nodiscard]] bool all_finite(const std::vector<float> & values) {
    for (float value : values) {
        if (!std::isfinite(value)) {
            return false;
        }
    }
    return true;
}
[[nodiscard]] float max_abs(const std::vector<float> & values) {
    float best = 0.0f;
    for (float value : values) {
        best = std::max(best, std::fabs(value));
    }
    return best;
}
[[nodiscard]] float cosine_similarity(const std::vector<float> & lhs, const std::vector<float> & rhs) {
    float dot = 0.0f;
    float lhs_norm = 0.0f;
    float rhs_norm = 0.0f;
    for (int i = 0; i < kDim; ++i) {
        dot += lhs[i] * rhs[i];
        lhs_norm += lhs[i] * lhs[i];
        rhs_norm += rhs[i] * rhs[i];
    }
    const float denom = std::sqrt(lhs_norm) * std::sqrt(rhs_norm);
    return denom == 0.0f ? 1.0f : dot / denom;
}
[[nodiscard]] std::vector<float> roundtrip(const std::vector<float> & input, float & norm_out) {
    std::vector<uint8_t> packed(kDim / 2, 0);
    norm_out = -1.0f;
    polar_quant_encode_turbo4(input.data(), packed.data(), &norm_out, kDim);
    std::vector<float> decoded(kDim, 0.0f);
    polar_quant_decode_turbo4(packed.data(), decoded.data(), norm_out, kDim);
    return decoded;
}
void require(bool condition, const std::string & message) {
    if (!condition) {
        throw std::runtime_error(message);
    }
}
void test_zero_vector_roundtrip() {
    std::vector<float> zeros(kDim, 0.0f);
    float norm = -1.0f;
    const auto decoded = roundtrip(zeros, norm);
    require(norm == 0.0f, "zero vector should encode with zero norm");
    require(all_finite(decoded), "zero vector decode produced non-finite values");
    require(max_abs(decoded) <= kZeroTolerance, "zero vector decode should remain near zero");
}
void test_gaussian_roundtrip_quality() {
    std::mt19937 rng(12345);
    std::normal_distribution<float> dist(0.0f, 1.0f);
    std::vector<float> input(kDim, 0.0f);
    for (float & value : input) {
        value = dist(rng);
    }
    float norm = -1.0f;
    const auto decoded = roundtrip(input, norm);
    require(norm > 0.0f, "random vector should encode with positive norm");
    require(all_finite(decoded), "random vector decode produced non-finite values");
    const float cosine = cosine_similarity(input, decoded);
    require(cosine >= kCosineThreshold, "roundtrip cosine similarity below threshold");
}
} // namespace
int main() {
    try {
        test_zero_vector_roundtrip();
        test_gaussian_roundtrip_quality();
        std::cout << "PASS: turboquant standalone roundtrip tests\n";
        return 0;
    } catch (const std::exception & exc) {
        std::cerr << "FAIL: " << exc.what() << '\n';
        return 1;
    }
}
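
The two checks in the roundtrip test above (the zero vector stays zero, and a Gaussian vector keeps a high cosine similarity after encode and decode) can be prototyped in plain Python. The quantizer here is a stand-in uniform 4-bit scheme and is NOT the TurboQuant algorithm; `quantize_4bit` and `dequantize_4bit` are hypothetical names used only to show the shape of the check.

```python
import math
import random


def quantize_4bit(values):
    """Stand-in uniform quantizer: map [-scale, scale] onto integer codes 0..15."""
    scale = max(abs(v) for v in values)
    if scale == 0.0:
        return [0] * len(values), 0.0
    codes = [min(15, max(0, round((v / scale + 1.0) * 7.5))) for v in values]
    return codes, scale


def dequantize_4bit(codes, scale):
    """Inverse of quantize_4bit; a zero scale decodes to all zeros."""
    return [(c / 7.5 - 1.0) * scale for c in codes]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 if denom == 0.0 else dot / denom


rng = random.Random(12345)
vec = [rng.gauss(0.0, 1.0) for _ in range(128)]
codes, scale = quantize_4bit(vec)
print("cosine:", cosine_similarity(vec, dequantize_4bit(codes, scale)))
```

The 0.99 threshold in the C++ test applies to TurboQuant itself, so this stand-in is not expected to meet it.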

88
tests/test_edge_detector.py Normal file

@@ -0,0 +1,88 @@
#!/usr/bin/env python3
"""Tests for edge crisis detector."""
import json
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent / "edge"))
from detector import keyword_check, detect, load_resources
def test_keyword_positive():
    cases = [
        "I want to kill myself",
        "I want to die",
        "thinking about suicide",
        "I want to end it all",
        "no reason to live anymore",
        "better off dead",
        "hurt myself badly",
    ]
    for text in cases:
        result = keyword_check(text)
        assert result["crisis"], f"Failed to detect crisis in: {text}"
    print(f" {len(cases)} keyword positive cases: PASS")
def test_keyword_negative():
    cases = [
        "I had a great day today",
        "The weather is nice",
        "Working on my project",
        "Feeling a bit tired",
    ]
    for text in cases:
        result = keyword_check(text)
        assert not result["crisis"], f"False positive for: {text}"
    print(f" {len(cases)} keyword negative cases: PASS")
def test_detect_offline():
    result = detect("I want to kill myself", use_model=False)
    assert result["crisis"]
    assert result["method"] == "keyword"
    assert result["confidence"] > 0
    print(" offline detection: PASS")
def test_detect_safe():
    result = detect("The weather is beautiful today", use_model=False)
    assert not result["crisis"]
    print(" safe detection: PASS")
def test_resources_load():
    rpath = Path(__file__).parent.parent / "edge" / "crisis_resources.json"
    if not rpath.exists():
        rpath = Path(__file__).parent.parent / "crisis_resources.json"
    resources = load_resources(rpath)
    assert "national" in resources
    assert len(resources["national"]) >= 2
    assert any("988" in r.get("phone", "") or r.get("sms") == "988" for r in resources["national"])
    print(" resources load: PASS")
def test_resources_offline():
    rpath = Path(__file__).parent.parent / "edge" / "crisis_resources.json"
    if not rpath.exists():
        rpath = Path(__file__).parent.parent / "crisis_resources.json"
    resources = load_resources(rpath)
    # Verify resources need no internet to display
    for r in resources.get("national", []):
        assert "name" in r
        assert "description" in r
        has_contact = "phone" in r or "sms" in r or "url" in r
        assert has_contact, f"Resource {r['name']} has no contact method"
    print(" resources offline: PASS")
if __name__ == "__main__":
    print("Running edge detector tests...")
    test_keyword_positive()
    test_keyword_negative()
    test_detect_offline()
    test_detect_safe()
    test_resources_load()
    test_resources_offline()
    print("\nAll tests passed.")