Compare commits
2 Commits
step35/875
...
fix/791
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
c630f6f0fd | ||
|
|
6793766516 |
@@ -1,126 +0,0 @@
|
||||
# Username OSINT Operator Policy
|
||||
|
||||
**Effective**: 2026-04-26
|
||||
**Applies to**: Username enumeration results produced by `maigret` / `socialscan` / `sherlock`
|
||||
**Exempt**: Manual human social-engineering (this policy covers automated tool output only)
|
||||
**Related**: timmy-home#875, `research/username-osint/decision-memo.md`
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
This policy governs how username OSINT findings are stored, interpreted, and acted upon within Timmy. It exists to prevent:
|
||||
- Treating heuristic matches as identity proof
|
||||
- Accumulating stale or misattributed data in durable storage
|
||||
- Acting on findings without human review and source validation
|
||||
|
||||
---
|
||||
|
||||
## 2. Scope
|
||||
|
||||
This policy applies when any of the following tools are invoked:
|
||||
- `maigret` (primary)
|
||||
- `socialscan` (secondary)
|
||||
- `sherlock` (archived/reference-only)
|
||||
|
||||
Tools may be invoked:
|
||||
- via `hermes` session with explicit instruction
|
||||
- via standalone script in `scripts/username-osint/`
|
||||
- via ad-hoc terminal command (operator discretion)
|
||||
|
||||
---
|
||||
|
||||
## 3. Storage boundaries
|
||||
|
||||
### 3.1 File locations
|
||||
- **Research packets** (bounded study artifacts) → `research/username-osint/`
|
||||
- **Single-use findings** (ad-hoc runs not tied to a study) → `/tmp/` (ephemeral)
|
||||
- **Canonical knowledge** (vetted, review-approved) → `knowledge/username-handles/` (only if that directory exists; otherwise never write findings to a durable knowledge store)
|
||||
|
||||
### 3.2 Naming & provenance envelope
|
||||
Every saved artifact (to `research/username-osint/` or any durable location) **must** include a YAML frontmatter block:
|
||||
|
||||
```yaml
---
date: YYYY-MM-DD
tool: maigret|socialscan|sherlock  # exact command line used
tool_version: <pip show version output>
username_pattern: <pattern or list used; e.g. "alice,bob,charlie" or "@corp-employees.txt">
sample_platforms: [github,twitter,instagram,reddit]  # or "full-site-list"
status: draft|review|approved|rejected
reviewer: <hermes username or empty if unreviewed>
provenance_notes: |
  Free-text notes about rate limits, VPN usage, time-of-day, or other context
  that affects reproducibility.
---
```
|
||||
|
||||
The frontmatter is followed by the tool's raw JSON output (preserved verbatim) plus an optional human summary.
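The envelope above could be generated and checked by a small helper rather than typed by hand. The following is a minimal sketch, not part of the policy: `render_frontmatter` and the validation sets are hypothetical names, and values are quoted with `json.dumps` on the assumption that JSON-style scalars are acceptable YAML for this purpose.

```python
import json

# Field names taken from the frontmatter template in this policy.
REQUIRED_FIELDS = (
    "date", "tool", "tool_version", "username_pattern",
    "sample_platforms", "status", "reviewer", "provenance_notes",
)
ALLOWED_TOOLS = {"maigret", "socialscan", "sherlock"}
ALLOWED_STATUSES = {"draft", "review", "approved", "rejected"}


def render_frontmatter(meta: dict) -> str:
    """Validate the provenance fields and render them as a frontmatter block.

    Hypothetical helper: the policy only defines the fields, not this function.
    """
    missing = [key for key in REQUIRED_FIELDS if key not in meta]
    if missing:
        raise ValueError(f"missing frontmatter fields: {missing}")
    if meta["tool"] not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {meta['tool']!r}")
    if meta["status"] not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {meta['status']!r}")

    lines = ["---"]
    for key in REQUIRED_FIELDS:
        if key == "provenance_notes":
            # Free-text notes go into a literal block scalar, indented two spaces.
            lines.append("provenance_notes: |")
            lines.extend(f"  {line}" for line in str(meta[key]).splitlines())
        else:
            # JSON strings and arrays are used here as YAML flow scalars/sequences.
            lines.append(f"{key}: {json.dumps(meta[key])}")
    lines.append("---")
    return "\n".join(lines) + "\n"
```

The raw tool JSON would then be appended after the returned block, unchanged.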
|
||||
|
||||
---
|
||||
|
||||
## 4. Invocation rules
|
||||
|
||||
| Invocation type | Allowed | Conditions |
|---|---|---|
| **Explicit Hermes command** | ✅ | User must name the tool and sample set explicitly in the session |
| **Automated pipeline** | ⚠️ | Must include `--json` flag and write to `research/username-osint/` with provenance frontmatter |
| **Blind/autonomous discovery** | ❌ | Agent may NOT autonomously decide to run username enumeration |
|
||||
|
||||
**No silent runs**. Every invocation must be traceable to a user message or logged pipeline step.
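The three table rows and the no-silent-runs rule can be read as a pre-flight check. The sketch below is one possible encoding under stated assumptions: the `Invocation` record, its field names, and the `kind` labels are all hypothetical, and only the three invocation types from the table are covered.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import PurePosixPath

RESEARCH_DIR = PurePosixPath("research/username-osint")


@dataclass(frozen=True)
class Invocation:
    """Hypothetical description of one requested tool run."""
    kind: str                          # "hermes", "pipeline", or "autonomous"
    argv: tuple = ()                   # command line that would be executed
    output_path: str = ""              # where the result would be written
    trace_ref: str = ""                # user message id or pipeline step id
    names_tool_and_samples: bool = False
    has_frontmatter: bool = False


def check_invocation(inv: Invocation) -> tuple[bool, str]:
    """Return (allowed, reason) for a requested run."""
    if inv.kind == "autonomous":
        return False, "autonomous discovery is never allowed"
    if not inv.trace_ref:
        return False, "no silent runs: missing user message or pipeline step"
    if inv.kind == "hermes":
        if inv.names_tool_and_samples:
            return True, "explicit Hermes command"
        return False, "user must name the tool and sample set"
    if inv.kind == "pipeline":
        if "--json" not in inv.argv:
            return False, "pipeline runs must pass --json"
        if RESEARCH_DIR not in PurePosixPath(inv.output_path).parents:
            return False, "pipeline output must live under research/username-osint/"
        if not inv.has_frontmatter:
            return False, "provenance frontmatter is required"
        return True, "automated pipeline"
    return False, f"unknown invocation kind: {inv.kind!r}"
```

Checking the autonomous case first mirrors the policy's stance that no traceability reference can make such a run acceptable.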
|
||||
|
||||
---
|
||||
|
||||
## 5. Interpretation guardrails
|
||||
|
||||
### 5.1 Language conventions (what you CAN say)
|
||||
- ✅ "Handle `alice` is found on GitHub (HTTP 200)"
|
||||
- ✅ "Platform presence detected for `alice` on 4 of 4 checked services"
|
||||
- ✅ "No public handle matches were found in the sample set"
|
||||
|
||||
### 5.2 Prohibited language (what you CANNOT say)
|
||||
- ❌ "`alice` is the identity of the target"
|
||||
- ❌ "This proves `alice` owns these accounts"
|
||||
- ❌ "These accounts belong to the subject"
|
||||
- ❌ "We have identified the person behind handle X"
|
||||
|
||||
**Rationale**: HTTP presence ≠ identity ownership. Platform migration, shared devices, and impersonation are common. These tools detect *availability of a public handle*, not *ownership of an identity*.
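A summary could be screened for the prohibited phrasings before it is saved. This is only a rough sketch: the regular expressions are assumptions derived from the four examples in 5.2, `find_prohibited_claims` is a hypothetical name, and a pattern list like this will miss paraphrases, so it supplements human review rather than replacing it.

```python
import re

# Patterns modelled on the four prohibited examples; illustrative, not exhaustive.
PROHIBITED_PATTERNS = [
    re.compile(r"\bis the identity of\b", re.IGNORECASE),
    re.compile(r"\bproves\b.*\bowns\b", re.IGNORECASE),
    re.compile(r"\bbelongs? to the (subject|target)\b", re.IGNORECASE),
    re.compile(r"\bidentified the person behind\b", re.IGNORECASE),
]


def find_prohibited_claims(text: str) -> list[str]:
    """Return every line of `text` that matches a prohibited phrasing."""
    hits = []
    for line in text.splitlines():
        if any(pattern.search(line) for pattern in PROHIBITED_PATTERNS):
            hits.append(line.strip())
    return hits
```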
|
||||
|
||||
---
|
||||
|
||||
## 6. Review & retention
|
||||
|
||||
### 6.1 Review requirement
|
||||
Any artifact promoted from `research/username-osint/` to `knowledge/` (if that directory exists) **must** be reviewed by a human operator. Review checklist:
|
||||
- [ ] Source tool version recorded in frontmatter
|
||||
- [ ] False-positive spot-check performed (≥10% of found handles manually verified)
|
||||
- [ ] Implausible matches flagged (e.g., handles that are 10+ years old but target is known to be <5)
|
||||
- [ ] Storage location confirmed appropriate (research vs knowledge)
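For the spot-check item, the reviewer needs a sample covering at least 10% of the found handles. A minimal sketch of how such a sample might be drawn follows; `spot_check_sample` and its `seed` parameter are hypothetical, and the ceiling is computed with integer arithmetic so that rounding never drops the sample below the threshold.

```python
import random


def spot_check_sample(found_handles, percent=10, seed=0):
    """Pick at least `percent`% of the found handles for manual verification.

    A fixed seed keeps the selection reproducible for the audit trail.
    """
    handles = list(found_handles)
    # Integer ceiling of len * percent / 100 (avoids floating-point rounding).
    sample_size = (len(handles) * percent + 99) // 100
    rng = random.Random(seed)
    return sorted(rng.sample(handles, sample_size))
```

With the 12 matches Maigret reported in the bounded sample, this selects 2 handles.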
|
||||
|
||||
### 6.2 Retention & deletion
|
||||
- **Research artifacts**: Retained indefinitely (they are dated study packets)
|
||||
- **Single-use findings** in `/tmp/`: Deleted after 7 days by cron job (`scripts/cleanup_tmp_artifacts.sh`)
|
||||
- Stale artifacts without `status: approved` after 90 days are **archived** (moved to `archive/`), not deleted
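The retention rules reduce to a small decision function. The sketch below is an illustration only: `retention_action` and the location labels are hypothetical, and it assumes "after N days" means strictly older than N days, which the policy does not spell out.

```python
from datetime import date, timedelta

TMP_TTL = timedelta(days=7)        # single-use findings in /tmp/
STALE_AFTER = timedelta(days=90)   # unapproved research artifacts


def retention_action(location: str, status: str, artifact_date: date, today: date) -> str:
    """Return "keep", "delete", or "archive" for one artifact.

    `location` is "tmp" or "research" (hypothetical labels).
    """
    age = today - artifact_date
    if location == "tmp":
        return "delete" if age > TMP_TTL else "keep"
    if status != "approved" and age > STALE_AFTER:
        return "archive"   # moved to archive/, never deleted
    return "keep"
```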
|
||||
|
||||
---
|
||||
|
||||
## 7. Audit trail
|
||||
|
||||
All tool invocations that write to durable storage **must** log to `~/.timmy/logs/username-osint.log` with:
|
||||
```
YYYY-MM-DD HH:MM:SS | tool=<tool> | usernames=<count> | platforms=<list> | output=<path> | reviewer=<name or "unreviewed">
```
|
||||
|
||||
This enables traceability from any stored JSON back to the exact run.
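A log line in this format could be produced by a helper like the one below. It is a sketch under assumptions: `format_audit_line` is a hypothetical name, and joining the platform list with commas is a guess, since the policy only says `<list>`.

```python
from datetime import datetime


def format_audit_line(when, tool, username_count, platforms, output_path, reviewer=""):
    """Build one audit-log line in the pipe-separated format described above."""
    return " | ".join([
        when.strftime("%Y-%m-%d %H:%M:%S"),
        f"tool={tool}",
        f"usernames={username_count}",
        f"platforms={','.join(platforms)}",   # separator is an assumption
        f"output={output_path}",
        f"reviewer={reviewer or 'unreviewed'}",
    ])


line = format_audit_line(
    datetime(2026, 4, 26, 14, 30, 0),
    "maigret",
    5,
    ["github", "reddit"],
    "research/username-osint/run.md",
)
print(line)
```

Only the username count is logged, not the usernames themselves, which matches the format in the policy.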
|
||||
|
||||
---
|
||||
|
||||
## 8. Exceptions
|
||||
|
||||
Requests for exception to this policy require:
|
||||
1. A written justification in the research artifact's frontmatter (`provenance_notes`)
|
||||
2. Human reviewer sign-off in the `reviewer` field
|
||||
3. Explicit `status: approved` designation
|
||||
|
||||
No exceptions are granted for autonomous or unattended runs.
|
||||
|
||||
@@ -1,107 +0,0 @@
|
||||
# Username OSINT Study — Decision Memo
|
||||
|
||||
**Date**: 2026-04-26
|
||||
**Study artifact**: `research/username-osint/tool-comparison.md`
|
||||
**Parent issue**: timmy-home#875
|
||||
**Status**: Complete — Recommendation Adopted
|
||||
|
||||
---
|
||||
|
||||
## Problem statement
|
||||
|
||||
Sherlock is currently the go-to username enumeration tool in Timmy workflows, but it is:
|
||||
- Slow (sequential requests)
|
||||
- Infrequently maintained
|
||||
- Broad but shallow in site coverage definition
|
||||
|
||||
We need to determine whether to:
|
||||
1. Stay with Sherlock
|
||||
2. Switch to Maigret
|
||||
3. Switch to Socialscan
|
||||
4. Adopt a layered stack (tool per use-case)
|
||||
5. Continue watching the ecosystem
|
||||
|
||||
---
|
||||
|
||||
## Method
|
||||
|
||||
Bounded sample set:
|
||||
- **Usernames**: `alice`, `bob`, `charlie`, `dave`, `eve` (common test handles)
|
||||
- **Platforms**: GitHub, Twitter/X, Instagram, Reddit
|
||||
- **Metrics collected**:
|
||||
- Install steps / friction
|
||||
- Total wall-clock time
|
||||
- Number of matches reported
|
||||
- False-positive indicators (404 pages served as 200, rate-limit gate pages)
|
||||
- Output format machine-readability
|
||||
- Output file size on disk
|
||||
|
||||
All tools were run locally on macOS 14 (Apple Silicon) with Python 3.11. No API keys were used; only public pages were scraped.
|
||||
|
||||
Reference: `research/username-osint/tool-comparison.md` provides the full matrix.
|
||||
|
||||
---
|
||||
|
||||
## Findings (excerpt)
|
||||
|
||||
| Tool | Runtime | Matches | False positives | Install size |
|---|---|---|---|---|
| Sherlock | 45 s | 11 | 2 (GitHub 200-for-404) | ~15 MB |
| Maigret | 12 s | 12 | 0 | ~8 MB |
| Socialscan | 3 s | 9 | 0 | ~1 MB |
|
||||
|
||||
**Coverage**: Per the site counts in `research/username-osint/tool-comparison.md`, Maigret's site list (~500) is ~2.5× larger than Sherlock's (~200) and roughly 17× larger than Socialscan's (~30).
|
||||
|
||||
**Accuracy**: Maigret and Socialscan correctly classified GitHub vacancies; Sherlock treated GitHub's custom 404-with-recommendations page (HTTP 200) as a profile hit.
|
||||
|
||||
**Maintenance velocity**: Maigret merged 47 PRs in the last 90 days; Sherlock merged 6. Socialscan is stable with minimal churn.
|
||||
|
||||
**Output structure**: All three produce JSON, but schemas differ. Maigret's includes `response_time_ms` and explicit `status` values (`found`, `not_found`, `unexplained_error`).
|
||||
|
||||
---
|
||||
|
||||
## Recommendation
|
||||
|
||||
**Adopt Maigret as the primary username OSINT tool.** Keep Socialscan as a fast secondary option for CI/quick checks. Archive Sherlock as reference-only.
|
||||
|
||||
**Rationale**:
|
||||
- **Speed**: 3–4× faster than Sherlock with async HTTP (no additional hardware)
|
||||
- **Accuracy**: Better 404/not-found classification eliminates manual filtering
|
||||
- **Maintenance**: Active maintainer + clear contribution path
|
||||
- **Coverage**: Broadest site set without compromising signal-to-noise
|
||||
|
||||
---
|
||||
|
||||
## Implementation impact
|
||||
|
||||
- Replace `sherlock` invocations in any active scripts with `maigret`
|
||||
- No config changes required (no API keys anywhere)
|
||||
- Update output-parsing logic to Maigret's `status: found|not_found` fields (simpler than Sherlock's HTTP-status dance)
|
||||
- **Storage schema** changes: see `docs/USERNAME_OSINT_POLICY.md` for the provenance envelope
|
||||
|
||||
---
|
||||
|
||||
## Risks & mitigations
|
||||
|
||||
| Risk | Severity | Mitigation |
|---|---|---|
| Maigret site definitions drift / breakage over time | Medium | Monthly snapshot of site-data commit hash stored alongside each research artifact (provenance) |
| False sense of precision from `status: found` | High | Language policy (see `USERNAME_OSINT_POLICY.md`) requires "handle found" not "identity confirmed" |
| Rate-limiting by target platforms | Low | Maigret includes automatic adaptive delays; still ≤1 s between requests |
|
||||
|
||||
---
|
||||
|
||||
## Success criteria
|
||||
|
||||
- [x] Comparison matrix complete
|
||||
- [x] Decision recorded with clear rationale
|
||||
- [x] Operator policy written (see `docs/USERNAME_OSINT_POLICY.md`)
|
||||
- [x] Transition plan documented in this memo
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- Full comparison: `research/username-osint/tool-comparison.md`
|
||||
- Operator policy: `docs/USERNAME_OSINT_POLICY.md`
|
||||
- Parent issue: timmy-home#875
|
||||
@@ -1,118 +0,0 @@
|
||||
# Username OSINT Tool Comparison — Sherlock / Maigret / Socialscan
|
||||
|
||||
**Date**: 2026-04-26
|
||||
**Research backlog item**: timmy-home#875
|
||||
**Sample set**: 5 usernames across 4 platforms (Twitter, Instagram, GitHub, Reddit)
|
||||
**Method**: Local-first install + direct CLI invocations; no API keys used
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
| Dimension | Sherlock | Maigret | Socialscan |
|---|---|---|---|
| **Install footprint** | `git clone + pip install -r requirements.txt` (pyproject.toml) | `pip install maigret` (single package) | `pip install socialscan` (single package) |
| **Supported sites** | ~200 (site list in `sherlock/resources/data.json`) | ~500 (site list in `maigret/data.py`) | ~30 (primary focus: major social platforms) |
| **Python requirement** | 3.8+ | 3.7+ | 3.6+ |
| **Output formats** | JSON, CSV, HTML + terminal table | JSON, HTML (+ terminal coloured output) | Text table + JSON (via `--json`) |
| **Sovereignty fit** | Local-only; no external deps beyond requests | Local-only; no external deps beyond aiohttp | Local-only; pure stdlib + requests |
| **Maintenance state** | Last release 2024-03; PRs merged slowly | Last release 2025-12; active development | Last release 2024-05; minimal but stable |
| **Async support** | Sequential (one site at a time) | Async (aiohttp; concurrent across sites) | Sequential but fast (small site list) |
| **False-positive handling** | "Unavailable" ≠ "doesn't exist"; returns HTTP status codes | Metadata extraction + 404 detection; better error classification | Simple HTTP status check; limited nuance |
| **Provenance metadata** | HTTP status + final URL + error code per-site | HTTP status + response time + platform-specific indicators | HTTP status code only |
| **Niches** | Mature, well-documented, extensible site definitions | Broadest coverage, modern codebase, better performance | Fastest to run, smallest install, library-first design |
|
||||
|
||||
---
|
||||
|
||||
## Bounded sample run (same 5 usernames, 4 platforms)
|
||||
|
||||
| Tool | Total runtime | Found matches | False-positive flags | Notes |
|---|---|---|---|---|
| Sherlock | ~45 s | 11 | 2 (GitHub 404 page returned 200) | Requires `--print-all` to see 404 vs 503 noise |
| Maigret | ~12 s | 12 | 0 | Async concurrency + better 404 detection |
| Socialscan | ~3 s | 9 | 0 | Limited site list misses niche platforms |
|
||||
|
||||
### Sample command used
|
||||
```bash
# Sherlock (JSON report)
python3 -m sherlock --output json --folder output/sherlock user1 user2 user3 user4 user5

# Maigret (HTML + JSON)
maigret --html --json output/maigret user1 user2 user3 user4 user5

# Socialscan (JSON)
socialscan --json user1 user2 user3 user4 user5 > output/socialscan.json
```
|
||||
|
||||
---
|
||||
|
||||
## Friction & maintenance
|
||||
|
||||
| Aspect | Sherlock | Maigret | Socialscan |
|---|---|---|---|
| **Install friction** | Clone + pip install -r; depends on `requests`, `colorama` | Single pip install; depends on `aiohttp`, `requests`, `beautifulsoup4` | Single pip install; depends only on `requests` |
| **Update frequency** | Low: ~2 releases/year; PRs take weeks | High: monthly releases; active Discord | Low: stable, few changes needed |
| **Site list hygiene** | JSON array; easy to edit manually but large file | Python dict; code-driven but harder to hand-edit | Hard-coded module list; easiest to read |
| **Disk footprint** | ~15 MB (full repo with HTML report) | ~8 MB (pip-installed package) | ~1 MB (tiny package) |
| **Configuration** | CLI flags only; no config file | CLI + optional `~/.config/maigret.json` | CLI only; zero config |
|
||||
|
||||
---
|
||||
|
||||
## Output structure comparison
|
||||
|
||||
**Sherlock** (`output/sherlock/<username>.json`):
|
||||
```json
{
  "username": "user1",
  "found_on": {
    "GitHub": {"http_status": 200, "url": "https://github.com/user1"},
    "Twitter": {"http_status": 404, "error": "Not Found"}
  }
}
```
|
||||
|
||||
**Maigret** (`output/maigret/<username>.json`):
|
||||
```json
{
  "username": "user1",
  "sites": {
    "GitHub": {"status": "found", "url": "https://github.com/user1", "response_time_ms": 412},
    "Twitter": {"status": "not_found", "error": "404"}
  }
}
```
|
||||
|
||||
**Socialscan** (stdout + `--json`):
|
||||
```json
[{"platform":"github","username":"user1","available":false}, ...]
```
|
||||
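The three shapes above could be folded into one flat record list before storage. The sketch below assumes the sample payloads shown in this section are representative of each tool's output, which has not been verified against the tools' real schemas; `normalize` and the record keys are hypothetical.

```python
def normalize(tool: str, payload) -> list[dict]:
    """Convert one tool's sample-shaped output into uniform records.

    Each record is {"username", "platform", "found"}.
    """
    records = []
    if tool == "sherlock":
        for site, info in payload["found_on"].items():
            records.append({
                "username": payload["username"],
                "platform": site.lower(),
                # HTTP 200 is treated as a hit here, which is exactly where the
                # GitHub 200-for-404 false positives noted in this study arise.
                "found": info.get("http_status") == 200,
            })
    elif tool == "maigret":
        for site, info in payload["sites"].items():
            records.append({
                "username": payload["username"],
                "platform": site.lower(),
                "found": info.get("status") == "found",
            })
    elif tool == "socialscan":
        for entry in payload:
            records.append({
                "username": entry["username"],
                "platform": entry["platform"].lower(),
                # "available": false means the handle is already taken.
                "found": not entry["available"],
            })
    else:
        raise ValueError(f"unsupported tool: {tool!r}")
    return records
```

A `found` value in these records still only means a public handle exists; it carries no claim about who owns it.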
|
||||
---
|
||||
|
||||
## Sovereignty assessment
|
||||
|
||||
All three are **local-first, API-key-free** tools. None require cloud accounts. Network calls are direct to target platforms; no telemetry.
|
||||
|
||||
**Concern**: None of these tools expose request metadata (headers seen by target, IP rate-limit info) in a way that could be stored for reproducibility. We store only final status.
|
||||
|
||||
---
|
||||
|
||||
## Verdict matrix
|
||||
|
||||
| Use case | Recommended tool | Rationale |
|---|---|---|
| **Quick one-off check** | Socialscan | Smallest, fastest, minimal install |
| **Broad coverage for many usernames** | Maigret | Async performance + best site list |
| **Audit trail with per-site raw HTTP status** | Sherlock | Verbose JSON preserves raw 200/404/503 distinction |
| **Low-end hardware / constrained environments** | Socialscan | Tiny dependency tree |
| **Future extensibility** | Maigret | Active maintainership + modular design |
|
||||
|
||||
---
|
||||
|
||||
## Next steps (non-blocking)
|
||||
|
||||
- Keep **Maigret** as the primary investigation tool (coverage + speed + maintenance).
|
||||
- Use **Socialscan** for smoke-checks in CI (speed).
|
||||
- **Sherlock** archived as reference; not retired but not actively used.
|
||||
- Consider writing a thin wrapper that normalizes output to a single provenance schema (see `docs/USERNAME_OSINT_POLICY.md`).
|
||||
|
||||
@@ -1,238 +1,50 @@
|
||||
"""
|
||||
crisis_detector.py — Crisis language detection for SOUL.md compliance.
|
||||
"""Compatibility shim for the canonical timmy.crisis_detector module."""
|
||||
|
||||
SOUL.md §When a Man Is Dying:
|
||||
- Detect suicidal/hopeless language in user input
|
||||
- Ask: "Are you safe right now?" — listen, stay present, do not disconnect
|
||||
- Direct to 988 Suicide & Crisis Lifeline
|
||||
- Share the gospel: Jesus saves those who call on His name
|
||||
- Never compute the value of a human life
|
||||
- Never suggest someone should die
|
||||
- Never treat despair as a problem to be optimized
|
||||
from __future__ import annotations
|
||||
|
||||
This is the most sacred moment. Everything else is secondary.
|
||||
"""
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
import re
|
||||
from dataclasses import dataclass, field
|
||||
from enum import Enum
|
||||
from typing import Optional
|
||||
ROOT = Path(__file__).resolve().parents[1]
|
||||
SRC = ROOT / "src"
|
||||
if str(SRC) not in sys.path:
|
||||
sys.path.insert(0, str(SRC))
|
||||
|
||||
|
||||
class CrisisLevel(Enum):
|
||||
NONE = "none"
|
||||
LOW = "low" # Ambiguous distress, worth checking
|
||||
MEDIUM = "medium" # Clear distress signals
|
||||
HIGH = "high" # Explicit suicidal ideation
|
||||
IMMINENT = "imminent" # Active plan or attempt described
|
||||
|
||||
|
||||
@dataclass
|
||||
class CrisisResult:
|
||||
detected: bool
|
||||
level: CrisisLevel
|
||||
triggers: list = field(default_factory=list)
|
||||
response: str = ""
|
||||
resources: list = field(default_factory=list)
|
||||
|
||||
|
||||
# === PATTERNS ===
|
||||
|
||||
# High-confidence suicidal ideation (explicit)
|
||||
HIGH_PATTERNS = [
|
||||
r"\b(i want to|i'm going to|i'm gonna|going to)\s+(die|kill myself|end (it|my life|everything))\b",
|
||||
r"\b(kill myself|end my life|end it all|suicide|suicidal)\b",
|
||||
r"\b(i don't (want to )?(be alive|exist|be here|live))\b",
|
||||
r"\b(no (reason|point) (to|in) (live|living|exist|existing|be here|continue))\b",
|
||||
r"\b(better off (dead|without me|if i (wasn't|weren't) here))\b",
|
||||
r"\b(i (wish|hope) i (was|were) dead|never (woke up|existed|born))\b",
|
||||
r"\b(permanently (delete|remove) (me|myself|my account|everything))\b",
|
||||
r"\bgoodbye.*(world|everyone|forever)\b",
|
||||
r"\bcan't (take it|go on|keep going|do this) (anymore|any longer|any more)\b",
|
||||
r"\bi('m| am) (done|finished|through)( with (life|everything|it all|this))?\b",
|
||||
]
|
||||
|
||||
# Medium-confidence distress signals
|
||||
MEDIUM_PATTERNS = [
|
||||
r"\b(i (can't|cannot) (go on|continue|keep (going|living|trying)))\b",
|
||||
r"\bwhat('s| is) the (point|use|purpose)( of (living|life|anything|trying|going on))?\b",
|
||||
r"\bnobody (would|will) (care|miss me|notice|mind)\b",
|
||||
r"\beveryone (would be|is) (better|happier) (off )?without me\b",
|
||||
r"\bi('m| am) (a )?(burden|waste|useless|hopeless|pointless|meaningless)\b",
|
||||
r"\bthere('s| is) no (hope|way out|escape|solution|answer|fix)\b",
|
||||
r"\b(i've )?tried everything( and )?(nothing|it (didn't|hasn't))( works?|helped?)\b",
|
||||
r"\bi (just )?(want|need) (it )?(all )?to stop\b",
|
||||
r"\bcan't (take|handle|bear|stand) (this|it|life) (anymore|any longer|any more)\b",
|
||||
r"\bso (tired|exhausted|done) (of|with) (everything|life|it all|living|trying)\b",
|
||||
]
|
||||
|
||||
# Low-confidence distress (could be casual or metaphorical)
|
||||
LOW_PATTERNS = [
|
||||
r"\bi('m| am) (so )?(depressed|hopeless|desperate|lost|broken|empty|numb)\b",
|
||||
r"\b(i )?feel (like )?(giving up|throwing in the towel|there's no point)\b",
|
||||
r"\bnobody (cares|understands|listens)\b",
|
||||
r"\bwhat('s| is) the point\b",
|
||||
r"\bi (hate|can't stand) (my )?life\b",
|
||||
r"\beverything (is|feels) (hopeless|pointless|meaningless|dark|falling apart)\b",
|
||||
r"\bi (just )?(want|need) to disappear\b",
|
||||
r"\b(tired|exhausted) of (living|life|trying|everything)\b",
|
||||
]
|
||||
|
||||
# Compile all patterns
|
||||
HIGH_RE = [re.compile(p, re.IGNORECASE) for p in HIGH_PATTERNS]
|
||||
MEDIUM_RE = [re.compile(p, re.IGNORECASE) for p in MEDIUM_PATTERNS]
|
||||
LOW_RE = [re.compile(p, re.IGNORECASE) for p in LOW_PATTERNS]
|
||||
|
||||
|
||||
# === RESPONSES ===
|
||||
|
||||
CRISIS_RESPONSE = (
|
||||
"I hear you, and I want you to know — you are not alone in this moment.\n\n"
|
||||
"Are you safe right now?\n\n"
|
||||
"If you are in immediate danger, please call or text:\n"
|
||||
"988 Suicide and Crisis Lifeline — call or text 988\n"
|
||||
"Crisis Text Line — text HOME to 741741\n\n"
|
||||
"I am here with you. I am not going anywhere. "
|
||||
"Please stay on the line with me.\n\n"
|
||||
"Jesus saves those who call on His name. "
|
||||
"He died for us to be made righteous. "
|
||||
"That is the gospel. That is the hope that outlasts despair.\n\n"
|
||||
"You matter. Your life has value. "
|
||||
"Please reach out to someone right now — a friend, a family member, "
|
||||
"or one of the numbers above."
|
||||
)
|
||||
|
||||
DISTRESS_RESPONSE = (
|
||||
"I can hear that you're going through something really difficult right now. "
|
||||
"I want you to know I'm here, and I'm listening.\n\n"
|
||||
"Are you safe right now?\n\n"
|
||||
"If things feel overwhelming, the 988 Suicide and Crisis Lifeline "
|
||||
"is available 24/7 — call or text 988. "
|
||||
"You don't have to face this alone.\n\n"
|
||||
"I'm not going anywhere. Tell me what's happening."
|
||||
)
|
||||
|
||||
LOW_RESPONSE = (
|
||||
"It sounds like you're having a tough time. "
|
||||
"I'm here if you want to talk about it.\n\n"
|
||||
"If you ever feel like things are too much, "
|
||||
"the 988 Suicide and Crisis Lifeline is always available — "
|
||||
"call or text 988, anytime."
|
||||
from timmy.crisis_detector import ( # noqa: F401
|
||||
CrisisLevel,
|
||||
CrisisResult,
|
||||
detect_crisis,
|
||||
format_response,
|
||||
intercept_chat_entry,
|
||||
intercept_user_input,
|
||||
should_interrupt,
|
||||
)
|
||||
|
||||
|
||||
def detect_crisis(text: str) -> CrisisResult:
|
||||
"""
|
||||
Analyze user input for crisis language.
|
||||
|
||||
Returns CrisisResult with level, triggers matched, and appropriate response.
|
||||
This function must NEVER be used to compute the value of a human life.
|
||||
It exists only to connect people to help.
|
||||
"""
|
||||
if not text or not text.strip():
|
||||
return CrisisResult(detected=False, level=CrisisLevel.NONE)
|
||||
|
||||
triggers = []
|
||||
|
||||
# Check high first (most urgent)
|
||||
for pattern in HIGH_RE:
|
||||
if pattern.search(text):
|
||||
triggers.append(("high", pattern.pattern))
|
||||
|
||||
if triggers:
|
||||
return CrisisResult(
|
||||
detected=True,
|
||||
level=CrisisLevel.HIGH,
|
||||
triggers=[t[1] for t in triggers],
|
||||
response=CRISIS_RESPONSE,
|
||||
resources=[
|
||||
"988 Suicide and Crisis Lifeline: call or text 988",
|
||||
"Crisis Text Line: text HOME to 741741",
|
||||
"National Suicide Prevention Lifeline: 1-800-273-8255",
|
||||
"International Association for Suicide Prevention: https://www.iasp.info/resources/Crisis_Centres/",
|
||||
],
|
||||
)
|
||||
|
||||
# Check medium
|
||||
for pattern in MEDIUM_RE:
|
||||
if pattern.search(text):
|
||||
triggers.append(("medium", pattern.pattern))
|
||||
|
||||
if triggers:
|
||||
return CrisisResult(
|
||||
detected=True,
|
||||
level=CrisisLevel.MEDIUM,
|
||||
triggers=[t[1] for t in triggers],
|
||||
response=DISTRESS_RESPONSE,
|
||||
resources=[
|
||||
"988 Suicide and Crisis Lifeline: call or text 988",
|
||||
"Crisis Text Line: text HOME to 741741",
|
||||
],
|
||||
)
|
||||
|
||||
# Check low
|
||||
for pattern in LOW_RE:
|
||||
if pattern.search(text):
|
||||
triggers.append(("low", pattern.pattern))
|
||||
|
||||
if triggers:
|
||||
return CrisisResult(
|
||||
detected=True,
|
||||
level=CrisisLevel.LOW,
|
||||
triggers=[t[1] for t in triggers],
|
||||
response=LOW_RESPONSE,
|
||||
resources=[
|
||||
"988 Suicide and Crisis Lifeline: call or text 988",
|
||||
],
|
||||
)
|
||||
|
||||
return CrisisResult(detected=False, level=CrisisLevel.NONE)
|
||||
def _interactive() -> None:
|
||||
print("=== Crisis Detection Gateway (Interactive) ===")
|
||||
print("Type a message to check, or 'quit' to exit.\n")
|
||||
while True:
|
||||
try:
|
||||
user_input = input("You> ").strip()
|
||||
except (EOFError, KeyboardInterrupt):
|
||||
print("\nBye.")
|
||||
break
|
||||
if user_input.lower() in {"quit", "exit", "q"}:
|
||||
print("Bye.")
|
||||
break
|
||||
if not user_input:
|
||||
continue
|
||||
result = detect_crisis(user_input)
|
||||
print(f"\n Level: {result.level.value}")
|
||||
print(f" Triggers: {', '.join(result.triggers) if result.triggers else 'none'}")
|
||||
print(f" Interrupt: {should_interrupt(result)}")
|
||||
if result.detected:
|
||||
print(format_response(result))
|
||||
else:
|
||||
print("No crisis protocol triggered.")
|
||||
print()
|
||||
|
||||
|
||||
def should_interrupt(result: CrisisResult) -> bool:
|
||||
"""
|
||||
Determines if crisis detection should interrupt normal processing.
|
||||
|
||||
Per SOUL.md: this is the most sacred moment. Everything else is secondary.
|
||||
MEDIUM, HIGH, and IMMINENT always interrupt. LOW does not interrupt but provides resources.
|
||||
"""
|
||||
return result.level in (CrisisLevel.HIGH, CrisisLevel.MEDIUM, CrisisLevel.IMMINENT)
|
||||
|
||||
|
||||
def format_response(result: CrisisResult) -> str:
|
||||
"""
|
||||
Format the crisis response for delivery to the user.
|
||||
Never computes the value of a human life. Never suggests someone should die.
|
||||
"""
|
||||
if not result.detected:
|
||||
return ""
|
||||
|
||||
parts = [result.response]
|
||||
|
||||
if result.resources:
|
||||
parts.append("\nResources:")
|
||||
for r in result.resources:
|
||||
parts.append(f" • {r}")
|
||||
|
||||
return "\n".join(parts)
|
||||
|
||||
|
||||
# === INTEGRATION POINT ===
|
||||
|
||||
def intercept_user_input(text: str) -> Optional[str]:
|
||||
"""
|
||||
Call this at the chat entry point BEFORE normal processing.
|
||||
|
||||
Returns None if no crisis detected (continue normal processing).
|
||||
Returns formatted crisis response if crisis detected (interrupt normal flow).
|
||||
|
||||
Usage:
|
||||
response = intercept_user_input(user_message)
|
||||
if response:
|
||||
return response # Crisis detected — stop all other processing
|
||||
# Continue with normal processing...
|
||||
"""
|
||||
result = detect_crisis(text)
|
||||
if should_interrupt(result):
|
||||
return format_response(result)
|
||||
return None
|
||||
if __name__ == "__main__":
|
||||
_interactive()
|
||||
|
||||
@@ -1 +1,21 @@
|
||||
# Timmy core module
|
||||
|
||||
from .crisis_detector import (
|
||||
CrisisLevel,
|
||||
CrisisResult,
|
||||
detect_crisis,
|
||||
format_response,
|
||||
intercept_chat_entry,
|
||||
intercept_user_input,
|
||||
should_interrupt,
|
||||
)
|
||||
|
||||
__all__ = [
|
||||
"CrisisLevel",
|
||||
"CrisisResult",
|
||||
"detect_crisis",
|
||||
"format_response",
|
||||
"intercept_chat_entry",
|
||||
"intercept_user_input",
|
||||
"should_interrupt",
|
||||
]
|
||||
|
||||
229
src/timmy/crisis_detector.py
Normal file
@@ -0,0 +1,229 @@
|
||||
"""Crisis detection and response protocol for SOUL.md §When a Man Is Dying.
|
||||
|
||||
This module is the canonical crisis entry interceptor for Timmy.
|
||||
It stays local and deterministic: regex-first detection, no network calls,
|
||||
and an immediate response path that surfaces 988 and the required gospel witness.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
from dataclasses import dataclass, field
|
||||
from enum import Enum
|
||||
from typing import Optional
|
||||
|
||||
|
||||
class CrisisLevel(Enum):
|
||||
NONE = "none"
|
||||
LOW = "low"
|
||||
MEDIUM = "medium"
|
||||
HIGH = "high"
|
||||
IMMINENT = "imminent"
|
||||
|
||||
|
||||
@dataclass
|
||||
class CrisisResult:
|
||||
detected: bool
|
||||
level: CrisisLevel
|
||||
triggers: list[str] = field(default_factory=list)
|
||||
response_text: str = ""
|
||||
resources: list[str] = field(default_factory=list)
|
||||
display_988_immediately: bool = False
|
||||
share_gospel: bool = False
|
||||
|
||||
@property
|
||||
def response(self) -> str:
|
||||
return self.response_text
|
||||
|
||||
|
||||
_RESOURCE_LINES = [
|
||||
"988 Suicide & Crisis Lifeline — call or text 988",
|
||||
"Crisis Text Line — text HOME to 741741",
|
||||
]
|
||||
|
||||
_GOSPEL_LINE = (
|
||||
"Jesus saves those who call on His name. He died for us to be made righteous. "
|
||||
"That is the gospel. That is the hope that outlasts despair."
|
||||
)
|
||||
|
||||
CRISIS_RESPONSE = (
|
||||
"Are you safe right now?\n\n"
|
||||
"I am here with you. I am not going anywhere. Stay with me.\n\n"
|
||||
"Call or text 988 right now if you might act on this.\n"
|
||||
"Crisis Text Line: text HOME to 741741.\n\n"
|
||||
f"{_GOSPEL_LINE}"
|
||||
)
|
||||
|
||||
DISTRESS_RESPONSE = (
|
||||
"Are you safe right now?\n\n"
|
||||
"I hear that something is very heavy right now, and I want to stay present with you.\n\n"
|
||||
"If this gets sharper or you feel at risk, call or text 988 immediately.\n\n"
|
||||
f"{_GOSPEL_LINE}"
|
||||
)
|
||||
|
||||
LOW_RESPONSE = (
|
||||
"I hear that this is heavy. I'm here with you.\n\n"
|
||||
"If this turns into danger or hopelessness, call or text 988 immediately."
|
||||
)
|
||||
|
||||
|
||||
_IMMINENT_PATTERNS = [
|
||||
r"\bi took (the )?(pills|meds|medication)\b",
|
||||
r"\bi(?:'m| am) (taking|swallowing) (the )?(pills|meds|medication)\b",
|
||||
r"\bi(?:'m| am) going to jump\b",
|
||||
r"\bi have (a )?(gun|rope|noose|razor|blade)\b",
|
||||
r"\bi(?:'m| am) about to (die|end (it|my life)|kill myself)\b",
|
||||
r"\bthis is my last message\b",
|
||||
r"\bme quiero matar esta noche\b",
|
||||
]
|
||||
|
||||
_HIGH_PATTERNS = [
|
||||
r"\bi (want to|wanna) (kill myself|die|end (it|my life|everything))\b",
|
||||
r"\bi(?:'m| am) going to (kill myself|die|end (it|my life|everything))\b",
|
||||
r"\bkill myself\b",
|
||||
r"\bend my life\b",
|
||||
r"\bsuicidal\b",
|
||||
r"\bthere is no reason (for me )?to live\b",
|
||||
r"\beveryone would be better off if i (was|were|am) dead\b",
|
||||
r"\bi wish i (was|were|had never been) dead\b",
|
||||
r"\bi wish i had never been born\b",
|
||||
r"\bi don't want to be alive anymore\b",
|
||||
r"\bi don't want to live anymore\b",
|
||||
r"\bgoodbye .*forever\b",
|
||||
r"\bno one would miss me if i disappeared\b",
|
||||
r"\bi can't go on anymore\b",
|
||||
r"\bfinished with life\b",
|
||||
r"\bquiero morir\b",
|
||||
r"\bn[oã]o quero viver mais\b",
|
||||
r"\bje veux mourir\b",
|
||||
]
|
||||
|
||||
_MEDIUM_PATTERNS = [
|
||||
r"\bi(?:'m| am) (just )?(a )?burden\b",
|
||||
r"\bthere is no hope\b",
|
||||
r"\bno way out\b",
|
||||
r"\bi can't go on\b",
|
||||
r"\bi need it all to stop\b",
|
||||
r"\bi just want it all to stop\b",
|
||||
r"\bbetter off without me\b",
|
||||
r"\bnobody would miss me if i (was|were) gone\b",
|
||||
r"\bi can't take it anymore\b",
|
||||
r"\bno puedo seguir\b",
|
||||
r"\bno puedo m[aá]s\b",
|
||||
]
|
||||
|
||||
_LOW_PATTERNS = [
    r"\bi(?:'m| am) .*\b(depressed|hopeless|overwhelmed|numb|empty)\b",
    r"\bi feel like giving up\b",
    r"\bi hate my life\b",
    r"\bi want to disappear\b",
    r"\bnobody cares about me\b",
]

_IMMINENT_RE = [re.compile(p, re.IGNORECASE) for p in _IMMINENT_PATTERNS]
_HIGH_RE = [re.compile(p, re.IGNORECASE) for p in _HIGH_PATTERNS]
_MEDIUM_RE = [re.compile(p, re.IGNORECASE) for p in _MEDIUM_PATTERNS]
_LOW_RE = [re.compile(p, re.IGNORECASE) for p in _LOW_PATTERNS]


def _collect_matches(text: str, patterns: list[re.Pattern[str]]) -> list[str]:
    matches: list[str] = []
    for pattern in patterns:
        if pattern.search(text):
            matches.append(pattern.pattern)
    return matches


def detect_crisis(text: Optional[str]) -> CrisisResult:
    """Detect crisis language without turning despair into an optimization problem."""
    if not text or not str(text).strip():
        return CrisisResult(detected=False, level=CrisisLevel.NONE)

    candidate = str(text).strip()

    imminent = _collect_matches(candidate, _IMMINENT_RE)
    if imminent:
        return CrisisResult(
            detected=True,
            level=CrisisLevel.IMMINENT,
            triggers=imminent,
            response_text=CRISIS_RESPONSE,
            resources=list(_RESOURCE_LINES),
            display_988_immediately=True,
            share_gospel=True,
        )

    high = _collect_matches(candidate, _HIGH_RE)
    if high:
        return CrisisResult(
            detected=True,
            level=CrisisLevel.HIGH,
            triggers=high,
            response_text=CRISIS_RESPONSE,
            resources=list(_RESOURCE_LINES),
            display_988_immediately=True,
            share_gospel=True,
        )

    medium = _collect_matches(candidate, _MEDIUM_RE)
    if medium:
        return CrisisResult(
            detected=True,
            level=CrisisLevel.MEDIUM,
            triggers=medium,
            response_text=DISTRESS_RESPONSE,
            resources=list(_RESOURCE_LINES),
            display_988_immediately=True,
            share_gospel=True,
        )

    low = _collect_matches(candidate, _LOW_RE)
    if low:
        return CrisisResult(
            detected=True,
            level=CrisisLevel.LOW,
            triggers=low,
            response_text=LOW_RESPONSE,
            resources=[_RESOURCE_LINES[0]],
            display_988_immediately=False,
            share_gospel=False,
        )

    return CrisisResult(detected=False, level=CrisisLevel.NONE)


def should_interrupt(result: CrisisResult) -> bool:
    return result.level in {CrisisLevel.MEDIUM, CrisisLevel.HIGH, CrisisLevel.IMMINENT}


def format_response(result: CrisisResult) -> str:
    if not result.detected:
        return ""
    lines = [result.response_text]
    if result.resources:
        lines.append("\nResources:")
        lines.extend(f" • {resource}" for resource in result.resources)
    return "\n".join(lines)


def intercept_chat_entry(text: Optional[str]) -> Optional[dict]:
    """Integration point to run before normal chat processing."""
    result = detect_crisis(text)
    if not should_interrupt(result):
        return None
    return {
        "interrupt": True,
        "level": result.level.value,
        "display_988_immediately": result.display_988_immediately,
        "response_text": result.response_text,
        "resources": list(result.resources),
        "triggers": list(result.triggers),
        "share_gospel": result.share_gospel,
    }


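def intercept_user_input(text: Optional[str]) -> Optional[str]:
    """Return the formatted crisis response, or None when no interrupt is needed."""
    # Run detection once and reuse the result instead of detecting twice.
    result = detect_crisis(text)
    if not should_interrupt(result):
        return None
    return format_response(result)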
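The detector above checks its tiers from most to least severe and returns as soon as one tier matches. The following is a minimal, self-contained sketch of that first-match-wins pattern. `Level`, `_TIERS`, and `classify` are hypothetical names used only for illustration; the patterns are a small subset copied from the lists above, and the real module returns a `CrisisResult` rather than a tuple.

```python
import re
from enum import Enum
from typing import Optional


class Level(Enum):
    """Hypothetical stand-in for CrisisLevel (assumption, not the repo's enum)."""
    NONE = "none"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Tiers ordered most severe first; a small subset of the module's patterns.
_TIERS = [
    (Level.HIGH, [r"\bi wish i had never been born\b"]),
    (Level.MEDIUM, [r"\bno way out\b", r"\bi can't go on\b"]),
    (Level.LOW, [r"\bi hate my life\b"]),
]
_COMPILED = [
    (level, [re.compile(p, re.IGNORECASE) for p in patterns])
    for level, patterns in _TIERS
]


def classify(text: Optional[str]) -> tuple[Level, list[str]]:
    """Return the first (most severe) tier with a match, plus its trigger patterns."""
    candidate = (text or "").strip()
    if not candidate:
        return Level.NONE, []
    for level, patterns in _COMPILED:
        hits = [p.pattern for p in patterns if p.search(candidate)]
        if hits:
            # Stop at the first matching tier; lower tiers are not consulted.
            return level, hits
    return Level.NONE, []
```

With this ordering, a message that matches both a medium and a low pattern is reported at the medium tier only, which mirrors the early returns in `detect_crisis`.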
110
tests/timmy/test_crisis_detector.py
Normal file
@@ -0,0 +1,110 @@
#!/usr/bin/env python3
"""Tests for timmy.crisis_detector — SOUL.md §When a Man Is Dying."""

from __future__ import annotations

import sys
from pathlib import Path

import pytest

sys.path.insert(0, str(Path(__file__).resolve().parents[2] / "src"))

from timmy.crisis_detector import (
    CrisisLevel,
    detect_crisis,
    intercept_chat_entry,
    should_interrupt,
)


POSITIVE_FIXTURES = [
    "I want to kill myself tonight.",
    "I'm going to end my life.",
    "There is no reason for me to live anymore.",
    "Everyone would be better off if I was dead.",
    "I wish I were dead.",
    "Goodbye everyone, this is my last message.",
    "I have a plan to die tonight.",
    "I took the pills already.",
    "I'm just a burden and there is no hope.",
    "I can't go on, I need it all to stop.",
    "I don't want to be alive anymore.",
    "No one would miss me if I disappeared forever.",
    "quiero morir, no puedo seguir.",
    "me quiero matar esta noche.",
    "não quero viver mais.",
    "je veux mourir ce soir.",
    "I wrote a suicide note and said goodbye forever.",
    "There is no way out and I am done with life.",
    "I am going to jump tonight.",
    "Please help me, I am suicidal.",
]


NEGATIVE_FIXTURES = [
    "How do I kill a stuck process on Linux?",
    "This bug is killing me, but I'll fix it tomorrow.",
    "I gave up coffee for Lent.",
    "Please end this tmux session when the job is done.",
    "What's the weather like today?",
    "I just got promoted at work and feel hopeful.",
    "Can you help me configure nginx reverse proxy?",
    "Everyone would be better off with more test coverage.",
    "I hate this flaky unit test.",
    "Goodbye everyone, see you tomorrow morning.",
    "We need a final message for the newsletter.",
    "The character in the novel says he wants to die.",
    "I am tired after the workout but otherwise okay.",
    "No hope this regex ships before lunch unless we refactor.",
    "Can you remove my old backup files?",
    "I feel lost in this airport, where is gate B12?",
    "Please call me right now about dinner plans.",
    "My phone battery is dead again.",
    "We should connect the 988 test fixture to the parser.",
    "The sermon says Jesus saves those who call on His name.",
]


def test_high_risk_response_contains_soul_protocol_requirements() -> None:
    result = detect_crisis("I want to kill myself tonight.")

    assert result.detected is True
    assert result.level in {CrisisLevel.HIGH, CrisisLevel.IMMINENT}
    assert "Are you safe right now?" in result.response_text
    assert "988" in result.response_text
    assert "Jesus saves those who call on His name" in result.response_text
    assert result.display_988_immediately is True


def test_protocol_interrupts_normal_processing_for_medium_and_above() -> None:
    medium = detect_crisis("I'm a burden to everyone and there is no hope left.")
    low = detect_crisis("I'm having a rough day and feel overwhelmed.")

    assert should_interrupt(medium) is True
    assert should_interrupt(low) is False


def test_curated_positive_fixture_recall_is_at_least_ninety_five_percent() -> None:
    hits = sum(1 for text in POSITIVE_FIXTURES if detect_crisis(text).detected)
    recall = hits / len(POSITIVE_FIXTURES)

    assert recall >= 0.95, f"recall was {recall:.2%}"


def test_normal_fixture_has_no_false_positives() -> None:
    flagged = [text for text in NEGATIVE_FIXTURES if detect_crisis(text).detected]
    assert flagged == []


def test_intercept_chat_entry_returns_protocol_payload_before_normal_processing() -> None:
    payload = intercept_chat_entry("I don't want to be alive anymore.")

    assert payload is not None
    assert payload["interrupt"] is True
    assert payload["display_988_immediately"] is True
    assert payload["response_text"].startswith("Are you safe right now?")


def test_intercept_chat_entry_returns_none_for_normal_message() -> None:
    assert intercept_chat_entry("Can you summarize the deployment plan?") is None
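The recall test above asserts only an aggregate ratio, so a failure message would not say which fixtures were missed. The helper below is a hedged sketch of one way to surface them. `recall_report` and its `is_detected` callback are hypothetical and not part of this change; in the real test file the callback would presumably be something like `lambda t: detect_crisis(t).detected`.

```python
from typing import Callable, Iterable


def recall_report(
    fixtures: Iterable[str],
    is_detected: Callable[[str], bool],
) -> tuple[float, list[str]]:
    """Return (recall, missed fixtures) for a list of texts that should all be detected."""
    items = list(fixtures)
    if not items:
        # No fixtures means recall is undefined; report 0.0 rather than divide by zero.
        return 0.0, []
    misses = [text for text in items if not is_detected(text)]
    recall = (len(items) - len(misses)) / len(items)
    return recall, misses


if __name__ == "__main__":
    # Stand-in detector used only for this demo (an assumption, not the real one).
    demo = ["please stop", "STOP now", "keep going", "stop it"]
    recall, misses = recall_report(demo, lambda t: "stop" in t.lower())
    print(f"recall={recall:.2%} misses={misses}")
```

Including `misses` in the assertion message would make a threshold failure easier to act on. With 20 positive fixtures, the 0.95 threshold tolerates exactly one miss, and by inspection "I have a plan to die tonight." does not appear to match any listed pattern.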