test: add crisis_synthesizer tests (#36)

feat: build crisis_synthesizer.py — learn from interactions (#36)
2026-04-15 14:55:32 +00:00 · 2026-04-15 14:54:06 +00:00
3 changed files with 742 additions and 76 deletions
--- a/GENOME.md
+++ b/GENOME.md
@@ -1,75 +0,0 @@
-# GENOME.md — the-door
-
-**Generated:** 2026-04-14
-**Repo:** Timmy_Foundation/the-door
-**Description:** Crisis Front Door — a single URL where a man at 3am can talk to Timmy. No login, no signup. 988 always visible.
-
---
-
-## Project Overview
-
-The-door is a crisis intervention web application — the most sacred surface in the Timmy Foundation. When a man at 3am reaches the end of his road, this is where he lands. No login, no signup, no barriers. 988 Suicide and Crisis Lifeline always visible. The "When a Man Is Dying" protocol active on every page.
-
-## Architecture
-
-```
-the-door/
-├── index.html              # Main crisis page (PWA-capable)
-├── crisis-offline.html     # Offline fallback (service worker cached)
-├── about.html              # About page
-├── testimony.html          # Testimony/stories page
-├── sw.js                   # Service worker (offline-first)
-├── manifest.json           # PWA manifest
-├── crisis/                 # Core crisis detection + response
-│   ├── detect.py           # Keyword/pattern detection (4 tiers)
-│   ├── gateway.py          # API endpoints, prompt injection
-│   ├── response.py         # Response generation, 988 routing
-│   ├── compassion_router.py # Profile-based response routing
-│   ├── profiles.py         # Compassion profiles
-│   └── PROTOCOL.md         # The protocol (SOUL.md reference)
-├── crisis_detector.py      # Legacy shim → crisis/detect.py
-├── crisis_responder.py     # Legacy responder
-├── dying_detection/        # Deprecated module
-├── evolution/              # Crisis synthesizer (creative)
-├── tests/                  # Safety-critical tests
-│   ├── test_crisis_overlay_focus_trap.py
-│   ├── test_dying_detection_deprecation.py
-│   └── test_false_positive_fixes.py
-└── deploy/                 # Deployment docs
-```
-
-## Key Abstractions
-
-| Module | Purpose |
-|---|---|
-| `crisis/detect.py` | 4-tier detection: LOW/MEDIUM/HIGH/CRITICAL via regex patterns |
-| `crisis/gateway.py` | HTTP API, Sovereign Heart prompt injection |
-| `crisis/response.py` | Response generation, 988 integration, escalation |
-| `crisis/compassion_router.py` | Profile-based routing (different crisis types) |
-| `sw.js` | Service worker for offline-first PWA |
-
-## Safety Constraints
-
-- **The-door never auto-closes PRs** (in fleet-ops exempt list)
-- **988 always visible** on every page, even offline
-- **When a Man Is Dying protocol** active on every interaction
-- **No login/signup** — zero barriers to crisis support
-- **Offline-first** — service worker caches critical pages
-
-## Test Coverage
-
-| Test | Coverage |
-|---|---|
-| Crisis overlay focus trap | ✅ |
-| Dying detection deprecation | ✅ |
-| False positive fixes | ✅ |
-| Crisis detection tiers | ❌ (in crisis/tests.py) |
-| Response generation | ❌ |
-| Offline service worker | ❌ |
-
-## Security
-
-- No user data stored (crisis intervention is stateless by design)
-- No cookies, no tracking, no analytics
-- Service worker only caches static assets
-- Crisis detection runs client-side where possible
--- a/evolution/crisis_synthesizer.py
+++ b/evolution/crisis_synthesizer.py
@@ -1 +1,429 @@
-...
+#!/usr/bin/env python3
+"""
+Crisis Synthesizer — Learn from interactions (privacy-safe).
+
+Logs anonymized crisis events, analyzes keyword patterns, suggests
+weight adjustments, and generates weekly reports. Zero PII stored.
+
+Usage:
+    from evolution.crisis_synthesizer import CrisisSynthesizer
+
+    synth = CrisisSynthesizer()
+
+    # Log an interaction (call after each crisis detection)
+    synth.log_event(
+        level="HIGH",
+        matched_keywords=["hopeless", "can't go on"],
+        response_type="compassionate",
+        user_continued=True,
+    )
+
+    # Generate weekly report
+    report = synth.weekly_report()
+    print(json.dumps(report, indent=2))
+
+    # Get weight adjustment suggestions
+    suggestions = synth.suggest_adjustments()
+
+CLI:
+    python3 -m evolution.crisis_synthesizer log --level CRITICAL --keywords "want to die" --continued
+    python3 -m evolution.crisis_synthesizer report [--weeks 1]
+    python3 -m evolution.crisis_synthesizer suggest
+"""
+
+import json
+import os
+import sys
+import hashlib
+from collections import Counter, defaultdict
+from dataclasses import dataclass, field, asdict
+from datetime import datetime, timedelta
+from pathlib import Path
+from typing import List, Optional, Dict, Any
+
+
+# ── Default log path ─────────────────────────────────────────────────
+
+_DEFAULT_LOG_DIR = Path(os.environ.get(
+    "CRISIS_SYNTH_LOG_DIR",
+    os.path.expanduser("~/.the-door/crisis-synth")
+))
+_LOG_FILE = "crisis_events.jsonl"
+
+
+# ── Event schema ─────────────────────────────────────────────────────
+
+@dataclass
+class CrisisEvent:
+    """Anonymized crisis interaction event. No PII, no content, no IDs."""
+    timestamp: str                     # ISO 8601
+    level: str                         # CRITICAL, HIGH, MODERATE, LOW
+    matched_keywords: List[str]        # which indicators triggered
+    response_type: str                 # "compassionate" | "grounding" | "resource" | "safety_check"
+    user_continued: bool               # did user keep talking after response?
+    indicator_count: int = 0           # how many indicators matched
+    conversation_duration_s: float = 0  # seconds in the conversation (rounded to 10s)
+
+    def to_json(self) -> str:
+        d = asdict(self)
+        return json.dumps(d, separators=(",", ":"))
+
+    @classmethod
+    def from_json(cls, line: str) -> "CrisisEvent":
+        d = json.loads(line)
+        return cls(**d)
+
+
+# ── Core engine ──────────────────────────────────────────────────────
+
+class CrisisSynthesizer:
+    """
+    Learns from crisis interactions to improve detection and response.
+
+    Privacy guarantees:
+      - No user content stored, ever
+      - No IP addresses, session IDs, or identifying information
+      - Only metadata: level, keyword matches, conversation continued
+      - All timestamps rounded to hour to prevent temporal fingerprinting
+      - Only detector keyword patterns are logged, never user text (reports show them unhashed)
+    """
+
+    def __init__(self, log_dir: Optional[Path] = None):
+        self._log_dir = log_dir or _DEFAULT_LOG_DIR
+        self._log_path = self._log_dir / _LOG_FILE
+        self._log_dir.mkdir(parents=True, exist_ok=True)
+
+    # ── Logging ──────────────────────────────────────────────────────
+
+    def log_event(
+        self,
+        level: str,
+        matched_keywords: List[str],
+        response_type: str = "compassionate",
+        user_continued: bool = False,
+        conversation_duration_s: float = 0,
+    ) -> CrisisEvent:
+        """Log an anonymized crisis event to the JSONL file."""
+        now = datetime.utcnow()
+        # Round to hour for privacy
+        rounded = now.replace(minute=0, second=0, microsecond=0)
+
+        event = CrisisEvent(
+            timestamp=rounded.isoformat() + "Z",
+            level=level.upper(),
+            matched_keywords=[k.lower().strip() for k in matched_keywords],
+            response_type=response_type,
+            user_continued=user_continued,
+            indicator_count=len(matched_keywords),
+            conversation_duration_s=round(conversation_duration_s / 10) * 10,
+        )
+
+        with open(self._log_path, "a") as f:
+            f.write(event.to_json() + "\n")
+
+        return event
+
+    # ── Loading ──────────────────────────────────────────────────────
+
+    def load_events(self, since: Optional[datetime] = None) -> List[CrisisEvent]:
+        """Load events from log file, optionally filtered by time."""
+        if not self._log_path.exists():
+            return []
+
+        events = []
+        cutoff = since.isoformat() if since else None
+
+        with open(self._log_path) as f:
+            for line in f:
+                line = line.strip()
+                if not line:
+                    continue
+                try:
+                    event = CrisisEvent.from_json(line)
+                    if cutoff and event.timestamp < cutoff:
+                        continue
+                    events.append(event)
+                except (json.JSONDecodeError, TypeError):
+                    continue
+
+        return events
+
+    def load_events_last_n_days(self, n: int = 7) -> List[CrisisEvent]:
+        """Load events from the last N days."""
+        since = datetime.utcnow() - timedelta(days=n)
+        return self.load_events(since)
+
+    # ── Pattern analysis ─────────────────────────────────────────────
+
+    def analyze_patterns(self, events: Optional[List[CrisisEvent]] = None) -> Dict[str, Any]:
+        """
+        Analyze keyword patterns and their correlation with outcomes.
+
+        Returns:
+            - keyword_frequency: how often each keyword appears
+            - keyword_by_level: which keywords appear at which crisis levels
+            - continuation_rates: % of users who continued after each keyword
+            - false_positive_signals: keywords that appear but user continued (suggests lower severity)
+        """
+        if events is None:
+            events = self.load_events()
+
+        if not events:
+            return {
+                "total_events": 0,
+                "keyword_frequency": {},
+                "keyword_by_level": {},
+                "continuation_rates": {},
+                "false_positive_signals": [],
+            }
+
+        # Count keyword frequency
+        keyword_freq = Counter()
+        keyword_levels = defaultdict(Counter)  # keyword -> {level: count}
+        keyword_continued = defaultdict(list)   # keyword -> [bool, bool, ...]
+
+        for event in events:
+            for kw in event.matched_keywords:
+                keyword_freq[kw] += 1
+                keyword_levels[kw][event.level] += 1
+                keyword_continued[kw].append(event.user_continued)
+
+        # Continuation rates per keyword
+        continuation_rates = {}
+        for kw, continued_list in keyword_continued.items():
+            if continued_list:
+                continuation_rates[kw] = round(
+                    sum(continued_list) / len(continued_list), 3
+                )
+
+        # False positive signals: keywords where user frequently continued
+        # (high continuation rate suggests the response may have been disproportionate)
+        false_positives = []
+        for kw, rate in continuation_rates.items():
+            total = keyword_freq[kw]
+            if total >= 3 and rate >= 0.8:
+                top_level = keyword_levels[kw].most_common(1)[0][0]
+                false_positives.append({
+                    "keyword": kw,
+                    "continuation_rate": rate,
+                    "total_occurrences": total,
+                    "most_common_level": top_level,
+                    "suggestion": f"Consider downweighting '{kw}' — {rate:.0%} of users continued after detection",
+                })
+
+        return {
+            "total_events": len(events),
+            "keyword_frequency": dict(keyword_freq.most_common(30)),
+            "keyword_by_level": {k: dict(v) for k, v in keyword_levels.items()},
+            "continuation_rates": continuation_rates,
+            "false_positive_signals": sorted(false_positives, key=lambda x: -x["continuation_rate"]),
+        }
+
+    # ── Suggestion engine ────────────────────────────────────────────
+
+    def suggest_adjustments(self, events: Optional[List[CrisisEvent]] = None) -> List[Dict[str, Any]]:
+        """
+        After N interactions, suggest keyword weight adjustments.
+
+        Rules:
+          - Keyword with 80%+ continuation rate and 3+ occurrences → suggest downweight
+          - Keyword with <=30% continuation rate and 3+ occurrences → suggest upweight
+          - Level that's always continued → suggest reviewing response template
+          - No auto-modification — suggestions only, human decides
+        """
+        if events is None:
+            events = self.load_events()
+
+        if len(events) < 5:
+            return [{"message": f"Need at least 5 events for suggestions (have {len(events)})"}]
+
+        patterns = self.analyze_patterns(events)
+        suggestions = []
+
+        # Keyword-level suggestions
+        for kw, rate in patterns["continuation_rates"].items():
+            freq = sum(patterns["keyword_by_level"].get(kw, {}).values())  # keyword_frequency is capped at the top 30
+            if freq < 3:
+                continue
+
+            if rate >= 0.8:
+                top_level = patterns["keyword_by_level"].get(kw, {})
+                most_common = max(top_level, key=top_level.get) if top_level else "UNKNOWN"
+                suggestions.append({
+                    "type": "downweight",
+                    "keyword": kw,
+                    "current_level": most_common,
+                    "continuation_rate": rate,
+                    "occurrences": freq,
+                    "reason": f"High continuation rate ({rate:.0%}) suggests '{kw}' may be triggering on low-severity messages",
+                    "action": f"Consider moving '{kw}' from {most_common} to a lower tier, or adding context requirements",
+                })
+            elif rate <= 0.3:
+                top_level = patterns["keyword_by_level"].get(kw, {})
+                most_common = max(top_level, key=top_level.get) if top_level else "UNKNOWN"
+                suggestions.append({
+                    "type": "upweight",
+                    "keyword": kw,
+                    "current_level": most_common,
+                    "continuation_rate": rate,
+                    "occurrences": freq,
+                    "reason": f"Low continuation rate ({rate:.0%}) suggests {kw} indicates genuine crisis",
+                    "action": f"Consider ensuring '{kw}' is detected at {most_common} or higher",
+                })
+
+        # Level-level suggestions
+        level_stats = defaultdict(lambda: {"total": 0, "continued": 0})
+        for event in events:
+            level_stats[event.level]["total"] += 1
+            if event.user_continued:
+                level_stats[event.level]["continued"] += 1
+
+        for level, stats in level_stats.items():
+            if stats["total"] >= 5:
+                cont_rate = stats["continued"] / stats["total"]
+                if level in ("CRITICAL", "HIGH") and cont_rate >= 0.9:
+                    suggestions.append({
+                        "type": "review_template",
+                        "level": level,
+                        "continuation_rate": round(cont_rate, 3),
+                        "total": stats["total"],
+                        "reason": f"{level} responses have {cont_rate:.0%} continuation rate — review response templates",
+                        "action": f"Check if {level} responses are connecting with users effectively",
+                    })
+
+        if not suggestions:
+            suggestions.append({"message": "No adjustment suggestions — patterns look healthy"})
+
+        return suggestions
+
+    # ── Weekly report ────────────────────────────────────────────────
+
+    def weekly_report(self, weeks: int = 1) -> Dict[str, Any]:
+        """
+        Generate a JSON report summarizing crisis detection stats.
+
+        Output is designed for human reading — no auto-modification of rules.
+        """
+        events = self.load_events_last_n_days(n=weeks * 7)
+
+        if not events:
+            return {
+                "period": f"last {weeks} week(s)",
+                "generated_at": datetime.utcnow().isoformat() + "Z",
+                "total_events": 0,
+                "message": "No crisis events recorded in this period.",
+            }
+
+        # Count by level
+        level_counts = Counter(e.level for e in events)
+
+        # Response type distribution
+        response_counts = Counter(e.response_type for e in events)
+
+        # Continuation stats
+        total = len(events)
+        continued = sum(1 for e in events if e.user_continued)
+
+        # Average conversation duration
+        durations = [e.conversation_duration_s for e in events if e.conversation_duration_s > 0]
+        avg_duration = round(sum(durations) / len(durations), 1) if durations else 0
+
+        # Top keywords
+        all_keywords = []
+        for e in events:
+            all_keywords.extend(e.matched_keywords)
+        top_keywords = Counter(all_keywords).most_common(15)
+
+        # False positive estimate
+        patterns = self.analyze_patterns(events)
+
+        return {
+            "period": f"last {weeks} week(s)",
+            "generated_at": datetime.utcnow().isoformat() + "Z",
+            "total_events": total,
+            "events_by_level": {
+                "CRITICAL": level_counts.get("CRITICAL", 0),
+                "HIGH": level_counts.get("HIGH", 0),
+                "MODERATE": level_counts.get("MODERATE", 0),
+                "LOW": level_counts.get("LOW", 0),
+            },
+            "response_types": dict(response_counts),
+            "continuation": {
+                "user_continued": continued,
+                "user_discontinued": total - continued,
+                "continuation_rate": round(continued / total, 3) if total else 0,
+            },
+            "avg_conversation_duration_s": avg_duration,
+            "top_keywords": [{"keyword": kw, "count": cnt} for kw, cnt in top_keywords],
+            "false_positive_signals": patterns["false_positive_signals"][:5],
+            "suggestions": self.suggest_adjustments(events),
+            "privacy_note": "All data is anonymized. No user content, IPs, or session IDs stored.",
+        }
+
+
+# ── CLI ──────────────────────────────────────────────────────────────
+
+def _cli_log(args: list):
+    """CLI: log a crisis event."""
+    import argparse
+    parser = argparse.ArgumentParser(description="Log a crisis event")
+    parser.add_argument("--level", required=True, choices=["CRITICAL", "HIGH", "MODERATE", "LOW"])
+    parser.add_argument("--keywords", required=True, help="Comma-separated keywords")
+    parser.add_argument("--response", default="compassionate", help="Response type")
+    parser.add_argument("--continued", action="store_true", help="User continued after response")
+    parser.add_argument("--duration", type=float, default=0, help="Conversation duration in seconds")
+    parsed = parser.parse_args(args)
+
+    synth = CrisisSynthesizer()
+    keywords = [k.strip() for k in parsed.keywords.split(",") if k.strip()]
+    event = synth.log_event(
+        level=parsed.level,
+        matched_keywords=keywords,
+        response_type=parsed.response,
+        user_continued=parsed.continued,
+        conversation_duration_s=parsed.duration,
+    )
+    print(f"Logged: {event.to_json()}")
+
+
+def _cli_report(args: list):
+    """CLI: generate weekly report."""
+    import argparse
+    parser = argparse.ArgumentParser(description="Generate crisis report")
+    parser.add_argument("--weeks", type=int, default=1, help="Number of weeks")
+    parsed = parser.parse_args(args)
+
+    synth = CrisisSynthesizer()
+    report = synth.weekly_report(weeks=parsed.weeks)
+    print(json.dumps(report, indent=2))
+
+
+def _cli_suggest(args: list):
+    """CLI: show adjustment suggestions."""
+    synth = CrisisSynthesizer()
+    suggestions = synth.suggest_adjustments()
+    print(json.dumps(suggestions, indent=2))
+
+
+def main():
+    if len(sys.argv) < 2:
+        print("Usage: python3 -m evolution.crisis_synthesizer <log|report|suggest> [options]")
+        sys.exit(1)
+
+    cmd = sys.argv[1]
+    rest = sys.argv[2:]
+
+    if cmd == "log":
+        _cli_log(rest)
+    elif cmd == "report":
+        _cli_report(rest)
+    elif cmd == "suggest":
+        _cli_suggest(rest)
+    else:
+        print(f"Unknown command: {cmd}")
+        print("Commands: log, report, suggest")
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
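Review note on two of the privacy guarantees in the class docstring. `datetime.utcnow()` has been deprecated since Python 3.12, and although the module imports `hashlib`, the report paths in this diff emit keyword strings as-is. The sketch below shows one way both could be handled. `round_to_hour_utc` and `hash_keyword` are hypothetical helper names, not part of this PR, and the 12-character digest length is an arbitrary assumption.

```python
import hashlib
from datetime import datetime, timezone


def round_to_hour_utc(moment: datetime) -> str:
    """Return an hour-floored UTC timestamp such as '2026-04-15T14:00:00Z'.

    Assumes `moment` is timezone-aware (hypothetical helper, not in the PR).
    """
    # Normalize to UTC first so any aware datetime is accepted.
    utc = moment.astimezone(timezone.utc)
    floored = utc.replace(minute=0, second=0, microsecond=0)
    # strftime avoids the '+00:00' suffix that isoformat() emits for aware datetimes.
    return floored.strftime("%Y-%m-%dT%H:%M:%SZ")


def hash_keyword(keyword: str, length: int = 12) -> str:
    """Short, stable SHA-256 digest of a normalized keyword (hypothetical helper)."""
    # Same normalization that log_event() applies before storing keywords.
    normalized = keyword.lower().strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:length]


if __name__ == "__main__":
    print(round_to_hour_utc(datetime.now(timezone.utc)))
    print(hash_keyword("  Hopeless "))
```

Hashing a small, known keyword vocabulary only obscures the strings from casual reading; anyone holding the pattern list can recompute the digests, so this should not be treated as anonymization on its own.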
--- a/tests/test_crisis_synthesizer.py
+++ b/tests/test_crisis_synthesizer.py
@@ -0,0 +1,313 @@
+#!/usr/bin/env python3
+"""
+Tests for evolution/crisis_synthesizer.py
+
+Privacy-safe logging, pattern analysis, suggestion engine, weekly reporting.
+"""
+
+import json
+import os
+import sys
+import tempfile
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+from evolution.crisis_synthesizer import CrisisSynthesizer, CrisisEvent
+
+
+@pytest.fixture
+def synth(tmp_path):
+    """Synthesizer with a temp log directory."""
+    return CrisisSynthesizer(log_dir=tmp_path)
+
+
+@pytest.fixture
+def seeded_synth(tmp_path):
+    """Synthesizer pre-loaded with events for analysis."""
+    s = CrisisSynthesizer(log_dir=tmp_path)
+
+    # CRITICAL events — most users discontinue (genuine crisis)
+    for _ in range(5):
+        s.log_event("CRITICAL", ["want to die"], "safety_check", user_continued=False)
+    s.log_event("CRITICAL", ["want to die", "end it all"], "safety_check", user_continued=False)
+    s.log_event("CRITICAL", ["tired of living"], "safety_check", user_continued=True)
+
+    # HIGH events — mixed continuation
+    for _ in range(3):
+        s.log_event("HIGH", ["hopeless"], "compassionate", user_continued=True)
+    s.log_event("HIGH", ["hopeless"], "compassionate", user_continued=False)
+    s.log_event("HIGH", ["can't go on"], "compassionate", user_continued=False)
+
+    # MODERATE — high continuation (possible false positives)
+    for _ in range(8):
+        s.log_event("MODERATE", ["exhausted"], "grounding", user_continued=True)
+    s.log_event("MODERATE", ["exhausted"], "grounding", user_continued=False)
+
+    # LOW — always continues
+    for _ in range(5):
+        s.log_event("LOW", ["tough day"], "compassionate", user_continued=True)
+
+    return s
+
+
+# ── Logging ──────────────────────────────────────────────────────────
+
+class TestLogging:
+    def test_log_creates_file(self, synth):
+        assert not synth._log_path.exists()
+        synth.log_event("HIGH", ["hopeless"], "compassionate", True)
+        assert synth._log_path.exists()
+
+    def test_log_event_fields(self, synth):
+        event = synth.log_event("CRITICAL", ["want to die", "end it all"], "safety_check", False, 120.0)
+        assert event.level == "CRITICAL"
+        assert event.matched_keywords == ["want to die", "end it all"]
+        assert event.response_type == "safety_check"
+        assert event.user_continued is False
+        assert event.indicator_count == 2
+        assert event.conversation_duration_s == 120.0
+
+    def test_keywords_normalized(self, synth):
+        event = synth.log_event("HIGH", ["  Hopeless ", "TRAPPED"], "compassionate", True)
+        assert event.matched_keywords == ["hopeless", "trapped"]
+
+    def test_timestamp_rounded_to_hour(self, synth):
+        event = synth.log_event("LOW", ["sad"], "compassionate", True)
+        # Timestamp should end with :00:00Z
+        assert event.timestamp.endswith(":00:00Z")
+
+    def test_jsonl_format(self, synth):
+        synth.log_event("HIGH", ["hopeless"], "compassionate", True)
+        synth.log_event("LOW", ["sad"], "compassionate", False)
+
+        lines = synth._log_path.read_text().strip().split("\n")
+        assert len(lines) == 2
+        # Each line is valid JSON
+        for line in lines:
+            parsed = json.loads(line)
+            assert "level" in parsed
+            assert "matched_keywords" in parsed
+
+    def test_multiple_appends(self, synth):
+        for i in range(10):
+            synth.log_event("MODERATE", [f"keyword_{i}"], "grounding", i % 2 == 0)
+
+        events = synth.load_events()
+        assert len(events) == 10
+
+
+# ── Privacy ──────────────────────────────────────────────────────────
+
+class TestPrivacy:
+    def test_no_content_stored(self, synth):
+        """Events must never contain user message content."""
+        event = synth.log_event("CRITICAL", ["want to die"], "safety_check", False)
+        serialized = event.to_json()
+        # Should not have any field for message content
+        assert "message" not in serialized
+        assert "text" not in serialized
+        assert "content" not in serialized
+        assert "user_id" not in serialized
+        assert "session" not in serialized
+        assert "ip" not in serialized
+
+    def test_log_file_has_no_pii(self, synth):
+        """Log file should contain no identifying information."""
+        synth.log_event("HIGH", ["hopeless", "trapped"], "compassionate", True, 60.0)
+        synth.log_event("CRITICAL", ["want to die"], "safety_check", False, 30.0)
+
+        content = synth._log_path.read_text()
+        # No IP patterns
+        import re
+        assert not re.search(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', content)
+        # No UUID patterns
+        assert not re.search(r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}', content)
+        # No email patterns
+        assert not re.search(r'[\w.+-]+@[\w-]+\.[\w.]+', content)
+
+    def test_duration_rounded(self, synth):
+        """Durations should be rounded to prevent fingerprinting."""
+        event = synth.log_event("LOW", ["sad"], "compassionate", True, 137.0)
+        assert event.conversation_duration_s == 140.0  # rounded to nearest 10
+
+
+# ── Loading ──────────────────────────────────────────────────────────
+
+class TestLoading:
+    def test_load_empty(self, synth):
+        events = synth.load_events()
+        assert events == []
+
+    def test_load_since_filter(self, synth):
+        from datetime import datetime  # load_events() calls since.isoformat(), so pass a datetime
+        synth.log_event("HIGH", ["hopeless"], "compassionate", True)
+        assert synth.load_events(since=datetime(2099, 1, 1)) == []  # future cutoff
+
+    def test_load_last_n_days(self, synth):
+        synth.log_event("HIGH", ["hopeless"], "compassionate", True)
+        events = synth.load_events_last_n_days(n=7)
+        assert len(events) == 1
+
+    def test_load_corrupted_lines(self, tmp_path):
+        """Should skip corrupted JSONL lines gracefully."""
+        log_path = tmp_path / "crisis_events.jsonl"
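+        log_path.write_text('not json\n{\n{"timestamp":"2026-04-15T14:00:00Z","level":"HIGH","matched_keywords":["hopeless"],"response_type":"compassionate","user_continued":true}\n')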
+
+        synth = CrisisSynthesizer(log_dir=tmp_path)
+        events = synth.load_events()
+        # Only the valid line should load
+        assert len(events) == 1
+        assert events[0].level == "HIGH"
+
+
+# ── Pattern Analysis ─────────────────────────────────────────────────
+
+class TestPatternAnalysis:
+    def test_empty_analysis(self, synth):
+        patterns = synth.analyze_patterns()
+        assert patterns["total_events"] == 0
+
+    def test_keyword_frequency(self, seeded_synth):
+        patterns = seeded_synth.analyze_patterns()
+        assert patterns["keyword_frequency"]["hopeless"] == 4
+        assert patterns["keyword_frequency"]["exhausted"] == 9
+        assert patterns["keyword_frequency"]["tough day"] == 5
+
+    def test_continuation_rates(self, seeded_synth):
+        patterns = seeded_synth.analyze_patterns()
+        rates = patterns["continuation_rates"]
+        # "want to die": 0/6 continued (every logged user stopped)
+        assert rates["want to die"] < 0.2
+        # "exhausted" — 8/9 continued
+        assert rates["exhausted"] > 0.8
+        # "tough day" — 5/5 continued
+        assert rates["tough day"] == 1.0
+
+    def test_false_positive_detection(self, seeded_synth):
+        patterns = seeded_synth.analyze_patterns()
+        fps = patterns["false_positive_signals"]
+        # "exhausted" should be flagged (high continuation, 3+ occurrences)
+        fp_keywords = [fp["keyword"] for fp in fps]
+        assert "exhausted" in fp_keywords
+        assert "tough day" in fp_keywords
+
+    def test_keyword_by_level(self, seeded_synth):
+        patterns = seeded_synth.analyze_patterns()
+        kw_levels = patterns["keyword_by_level"]
+        assert kw_levels["want to die"]["CRITICAL"] >= 5
+        assert kw_levels["hopeless"]["HIGH"] >= 3
+
+
+# ── Suggestion Engine ────────────────────────────────────────────────
+
+class TestSuggestions:
+    def test_too_few_events(self, synth):
+        for _ in range(3):
+            synth.log_event("HIGH", ["hopeless"], "compassionate", True)
+        suggestions = synth.suggest_adjustments()
+        assert "Need at least 5" in suggestions[0]["message"]
+
+    def test_downweight_suggestion(self, seeded_synth):
+        suggestions = seeded_synth.suggest_adjustments()
+        downweights = [s for s in suggestions if s.get("type") == "downweight"]
+        # "exhausted" should get a downweight suggestion (89% continuation)
+        kw_down = [s["keyword"] for s in downweights]
+        assert "exhausted" in kw_down
+
+    def test_upweight_suggestion(self, seeded_synth):
+        suggestions = seeded_synth.suggest_adjustments()
+        upweights = [s for s in suggestions if s.get("type") == "upweight"]
+        # "want to die" has low continuation, so an upweight suggestion is expected
+        # (0/6 = 0% continuation in the seeded fixture)
+        kw_up = [s["keyword"] for s in upweights]
+        assert "want to die" in kw_up
+
+    def test_suggestions_are_advisory(self, seeded_synth):
+        """Suggestions must never auto-modify rules."""
+        suggestions = seeded_synth.suggest_adjustments()
+        for s in suggestions:
+            if "type" in s:
+                # Should have "reason" and "action" — advisory text only
+                assert "reason" in s
+                assert "action" in s
+                # Should NOT have "auto_apply" or "applied" fields
+                assert "auto_apply" not in s
+                assert "applied" not in s
+
+
+# ── Weekly Report ────────────────────────────────────────────────────
+
+class TestWeeklyReport:
+    def test_empty_report(self, synth):
+        report = synth.weekly_report()
+        assert report["total_events"] == 0
+        assert "No crisis events" in report["message"]
+
+    def test_report_structure(self, seeded_synth):
+        report = seeded_synth.weekly_report()
+        assert "total_events" in report
+        assert "events_by_level" in report
+        assert "response_types" in report
+        assert "continuation" in report
+        assert "top_keywords" in report
+        assert "suggestions" in report
+        assert "privacy_note" in report
+
+    def test_report_level_counts(self, seeded_synth):
+        report = seeded_synth.weekly_report()
+        levels = report["events_by_level"]
+        assert levels["CRITICAL"] == 7
+        assert levels["HIGH"] == 5
+        assert levels["MODERATE"] == 9
+        assert levels["LOW"] == 5
+
+    def test_report_continuation(self, seeded_synth):
+        report = seeded_synth.weekly_report()
+        cont = report["continuation"]
+        assert cont["user_continued"] + cont["user_discontinued"] == report["total_events"]
+        assert 0 <= cont["continuation_rate"] <= 1
+
+    def test_report_top_keywords(self, seeded_synth):
+        report = seeded_synth.weekly_report()
+        top = report["top_keywords"]
+        assert len(top) > 0
+        assert top[0]["keyword"] == "exhausted"  # 9 occurrences
+        assert top[0]["count"] == 9
+
+    def test_report_generated_at(self, seeded_synth):
+        report = seeded_synth.weekly_report()
+        assert report["generated_at"].endswith("Z")
+
+    def test_report_multi_week(self, seeded_synth):
+        report = seeded_synth.weekly_report(weeks=4)
+        assert "4 week" in report["period"]
+
+
+# ── CLI ──────────────────────────────────────────────────────────────
+
+class TestCLI:
+    def test_cli_log_command(self, tmp_path):
+        """API path behind the CLI log command creates an event (_cli_log itself is not invoked)."""
+        synth = CrisisSynthesizer(log_dir=tmp_path)
+        synth.log_event("HIGH", ["hopeless"], "compassionate", True)
+        events = synth.load_events()
+        assert len(events) == 1
+
+    def test_cli_report_command(self, seeded_synth):
+        """Report behind the CLI report command is JSON-serializable (_cli_report itself is not invoked)."""
+        report = seeded_synth.weekly_report()
+        serialized = json.dumps(report)
+        assert isinstance(json.loads(serialized), dict)
+
+    def test_cli_suggest_command(self, seeded_synth):
+        """Suggestions behind the CLI suggest command form a JSON-serializable list (_cli_suggest itself is not invoked)."""
+        suggestions = seeded_synth.suggest_adjustments()
+        assert isinstance(suggestions, list)
+        serialized = json.dumps(suggestions)
+        assert isinstance(json.loads(serialized), list)
+
+
+if __name__ == "__main__":
+    pytest.main([__file__, "-v"])
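Review note on the fixture arithmetic. The expected values in `TestPatternAnalysis` and `TestSuggestions` depend on per-keyword continuation rates, which are easy to miscount by hand. The standalone sketch below recomputes them from the `seeded_synth` fixture data and applies the thresholds that `suggest_adjustments()` uses (3+ occurrences, rate >= 0.8 for downweight, rate <= 0.3 for upweight). The `SEEDED` table and the `continuation_stats` / `classify` helpers are illustrative names, not part of the PR.

```python
from collections import defaultdict

# (matched_keywords, user_continued, repeat count), copied from the seeded_synth fixture.
SEEDED = [
    (["want to die"], False, 5),
    (["want to die", "end it all"], False, 1),
    (["tired of living"], True, 1),
    (["hopeless"], True, 3),
    (["hopeless"], False, 1),
    (["can't go on"], False, 1),
    (["exhausted"], True, 8),
    (["exhausted"], False, 1),
    (["tough day"], True, 5),
]


def continuation_stats(rows):
    """Map each keyword to (occurrences, continuation rate rounded to 3 places)."""
    outcomes = defaultdict(list)
    for keywords, continued, repeat in rows:
        for kw in keywords:
            outcomes[kw].extend([continued] * repeat)
    return {kw: (len(v), round(sum(v) / len(v), 3)) for kw, v in outcomes.items()}


def classify(count, rate):
    """Apply the same thresholds as suggest_adjustments() in the diff."""
    if count < 3:
        return "too few events"
    if rate >= 0.8:
        return "downweight"
    if rate <= 0.3:
        return "upweight"
    return "no suggestion"


stats = continuation_stats(SEEDED)
labels = {kw: classify(count, rate) for kw, (count, rate) in stats.items()}

if __name__ == "__main__":
    for kw, (count, rate) in stats.items():
        print(f"{kw!r}: {count} events, rate {rate}, {labels[kw]}")
```

Under these assumptions "want to die" appears in 6 events with none continued (rate 0.0, upweight), "exhausted" is 8/9 (0.889, downweight), "tough day" is 5/5 (downweight), and "hopeless" at 3/4 (0.75) falls between the two thresholds.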
Author	SHA1	Message	Date
Alexander Whitestone	e85cfe93b3	test: add crisis_synthesizer tests (#36); all checks successful (Sanity Checks / sanity-test: 7s, Smoke Test / smoke: 6s)	2026-04-15 14:55:32 +00:00
Alexander Whitestone	4d973b3df1	feat: build crisis_synthesizer.py — learn from interactions (#36)	2026-04-15 14:54:06 +00:00