Compare commits

2 Commits

Author | SHA1 | Message | Date
Alexander Whitestone | 4dfa001b9a | docs: refresh the-door genome analysis (#673) | 2026-04-21 03:55:52 -04:00
    Some checks failed:
    Self-Healing Smoke / self-healing-smoke (pull_request) Failing after 27s
    Smoke Test / smoke (pull_request) Failing after 27s
    Agent PR Gate / gate (pull_request) Failing after 39s
    Agent PR Gate / report (pull_request) Successful in 9s
Alexander Whitestone | 0173ed67e2 | wip: strengthen the-door genome regression for #673 | 2026-04-21 03:51:28 -04:00
2 changed files with 41 additions and 8 deletions

View File

@@ -28,6 +28,11 @@ def test_the_door_genome_has_required_sections() -> None:
 def test_the_door_genome_captures_repo_specific_findings() -> None:
     content = _content()
+    assert "19 Python files" in content
+    assert "146 passed, 3 subtests passed" in content
+    assert "crisis/session_tracker.py" in content
+    assert "tests/test_session_tracker.py" in content
+    assert "tests/test_false_positive_fixes.py" in content
     assert "lastUserMessage" in content
     assert "localStorage" in content
     assert "crisis-offline.html" in content
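
The assertions in this hunk rely on a `_content()` helper that is defined earlier in the test file and is not shown in the diff. The following is only a sketch of how such a helper could look, paired with a small function that reports every missing snippet at once instead of stopping at the first failed `assert`. The path `GENOME_PATH` and the function `missing_snippets` are hypothetical names introduced for illustration; only the snippet strings come from the hunk above.

```python
from pathlib import Path
from typing import Iterable, List

# Hypothetical location: the diff does not show where the genome file lives.
GENOME_PATH = Path("GENOME.md")

# Strings the regression test expects to find, copied from the hunk above.
REQUIRED_SNIPPETS = (
    "19 Python files",
    "146 passed, 3 subtests passed",
    "crisis/session_tracker.py",
    "tests/test_session_tracker.py",
    "tests/test_false_positive_fixes.py",
    "lastUserMessage",
    "localStorage",
    "crisis-offline.html",
)


def _content(path: Path = GENOME_PATH) -> str:
    """Read the genome document as text (assumed to be UTF-8 Markdown)."""
    return path.read_text(encoding="utf-8")


def missing_snippets(
    content: str, required: Iterable[str] = REQUIRED_SNIPPETS
) -> List[str]:
    """Return the required snippets that do not occur in the content."""
    return [snippet for snippet in required if snippet not in content]
```

A test built on this could assert `missing_snippets(_content()) == []`, so a single failure message lists every stale claim in the document.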

View File

@@ -11,10 +11,11 @@ The Door is a crisis-first front door to Timmy: one URL, no account wall, no app
 What the codebase actually contains today:
 - 1 primary browser app: `index.html`
 - 4 companion browser assets/pages: `about.html`, `testimony.html`, `crisis-offline.html`, `sw.js`
-- 17 Python files across canonical crisis logic, legacy shims, wrappers, and tests
+- 19 Python files across canonical crisis logic, session tracking, legacy shims, wrappers, and tests
 - 5 tracked pytest files under `tests/`
 - 2 Gitea workflows: `smoke.yml`, `sanity.yml`
 - 1 systemd unit: `deploy/hermes-gateway.service`
-- full test suite currently passing: `115 passed, 3 subtests passed`
+- full test suite currently passing: `146 passed, 3 subtests passed`
 The repo is small, but it is not simple. The true architecture is a layered safety system:
 1. immediate browser-side crisis escalation
@@ -44,8 +45,10 @@ graph TD
 H --> G[crisis/gateway.py]
 G --> D[crisis/detect.py]
+G --> S[crisis/session_tracker.py]
 G --> R[crisis/response.py]
 D --> CR[CrisisDetectionResult]
+S --> SS[SessionState / CrisisSessionTracker]
 R --> RESP[CrisisResponse]
 D --> LEG[Legacy shims\ncrisis_detector.py\ncrisis_responder.py\ndying_detection]
@@ -78,8 +81,10 @@ graph TD
   - canonical detection engine and public detection API
 - `crisis/response.py`
   - canonical response generator, UI flags, prompt modifier, grounding helpers
+- `crisis/session_tracker.py`
+  - in-memory session escalation/de-escalation tracking and session-aware prompt modifiers
 - `crisis/gateway.py`
-  - integration layer for `check_crisis()` and `get_system_prompt()`
+  - integration layer for `check_crisis()`, `check_crisis_with_session()`, and `get_system_prompt()`
 - `crisis/compassion_router.py`
   - profile-based prompt routing abstraction parallel to `response.py`
 - `crisis_detector.py`
@@ -166,7 +171,25 @@ In `crisis/response.py`, the canonical response dataclass ties backend detection
 - `provide_988`
 - `escalate`
-### 6. Legacy compatibility layer
+### 6. `CrisisSessionTracker` and `SessionState`
+`crisis/session_tracker.py` adds a privacy-first in-memory session layer on top of per-message detection:
+- `SessionState`
+  - `current_level`
+  - `peak_level`
+  - `message_count`
+  - `level_history`
+  - `is_escalating`
+  - `is_deescalating`
+  - `escalation_rate`
+  - `consecutive_low_messages`
+- `CrisisSessionTracker`
+  - `record()` for per-message updates
+  - `get_session_modifier()` for prompt augmentation
+  - `get_ui_hints()` for frontend-facing advisory state
+This is the clearest new architecture addition since the earlier genome pass: The Door now reasons about trajectory within a conversation, not just isolated message severity.
+### 7. Legacy compatibility layer
 The repo still carries older interfaces:
 - `crisis_detector.py`
 - `crisis_responder.py`
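
To make the trajectory idea from the new `CrisisSessionTracker` section concrete, here is a minimal sketch of how the listed `SessionState` fields and tracker methods could fit together. Only the class, field, and method names come from the genome text. Everything else is an assumption made for illustration: severity is modeled as an integer (0 meaning no signal), `record()` takes that integer rather than a detection result, the constants `LOW_LEVEL` and `DEESCALATION_STREAK` are invented, and the modifier wording is placeholder text. The real `crisis/session_tracker.py` may differ on all of these points.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SessionState:
    """Field names follow the genome doc; types and defaults are assumed."""
    current_level: int = 0
    peak_level: int = 0
    message_count: int = 0
    level_history: List[int] = field(default_factory=list)
    is_escalating: bool = False
    is_deescalating: bool = False
    escalation_rate: float = 0.0
    consecutive_low_messages: int = 0


class CrisisSessionTracker:
    LOW_LEVEL = 0            # assumed: 0 means "no crisis signal"
    DEESCALATION_STREAK = 3  # assumed: low messages needed to call it easing

    def __init__(self) -> None:
        self.state = SessionState()

    def record(self, level: int) -> SessionState:
        """Fold one message's severity level into the in-memory state."""
        s = self.state
        previous = s.current_level
        s.current_level = level
        s.peak_level = max(s.peak_level, level)
        s.message_count += 1
        s.level_history.append(level)
        s.consecutive_low_messages = (
            s.consecutive_low_messages + 1 if level <= self.LOW_LEVEL else 0
        )
        # Escalating: this message is more severe than the previous one.
        s.is_escalating = s.message_count > 1 and level > previous
        # De-escalating: an earlier peak followed by a run of low messages.
        s.is_deescalating = (
            s.peak_level > self.LOW_LEVEL
            and s.consecutive_low_messages >= self.DEESCALATION_STREAK
        )
        # Average level change per message since the session started.
        steps = max(s.message_count - 1, 1)
        s.escalation_rate = (level - s.level_history[0]) / steps
        return s

    def get_session_modifier(self) -> str:
        """Return extra prompt text; the wording is placeholder only."""
        if self.state.is_escalating:
            return "Session note: severity is rising across messages."
        if self.state.is_deescalating:
            return "Session note: severity has eased after an earlier peak."
        return ""

    def get_ui_hints(self) -> Dict[str, object]:
        """Expose advisory state a frontend could read."""
        return {
            "peak_level": self.state.peak_level,
            "is_escalating": self.state.is_escalating,
            "is_deescalating": self.state.is_deescalating,
        }
```

Keeping the state in a plain dataclass held by the tracker instance matches the "in-memory" description: nothing is persisted, so discarding the tracker discards the session.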
@@ -177,7 +200,7 @@ These preserve compatibility, but they also create drift risk:
 - two different `CrisisResponse` contracts
 - two prompt-routing paths (`response.py` vs `compassion_router.py`)
-### 7. Browser persistence contract
+### 8. Browser persistence contract
 `localStorage` is a real part of runtime state despite some docs claiming otherwise.
 Keys:
 - `timmy_chat_history`
@@ -215,7 +238,11 @@ Expected response shape:
 - `crisis.response.generate_response(detection)`
 - `crisis.response.process_message(text)`
 - `crisis.response.get_system_prompt_modifier(detection)`
+- `crisis.session_tracker.CrisisSessionTracker.record(detection)`
+- `crisis.session_tracker.CrisisSessionTracker.get_session_modifier()`
+- `crisis.session_tracker.check_crisis_with_session(text, tracker=None)`
 - `crisis.gateway.check_crisis(text)`
+- `crisis.gateway.check_crisis_with_session(text, tracker=None)`
 - `crisis.gateway.get_system_prompt(base_prompt, text="")`
 - `crisis.gateway.format_gateway_response(text, pretty=True)`
@@ -229,12 +256,13 @@ Expected response shape:
 ### Current state
 Verified on fresh `main` clone of `the-door`:
-- `python3 -m pytest -q` -> `115 passed, 3 subtests passed`
+- `python3 -m pytest -q` -> `146 passed, 3 subtests passed`
 What is already covered well:
 - canonical crisis detection tiers
 - response flags and gateway structure
-- many false-positive regressions
+- many false-positive regressions (`tests/test_false_positive_fixes.py`)
+- session escalation/de-escalation tracking (`tests/test_session_tracker.py`)
 - service-worker offline crisis fallback
 - crisis overlay focus trap string-level assertions
 - deprecated wrapper behavior
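
Hard-coded counts such as the pytest summary quoted in this hunk go stale whenever tests are added. As a rough sketch of how that drift could be caught mechanically, the helpers below parse a summary line of the form shown above and flag test counts in a document that disagree with it. The function names and regular expressions are hypothetical and are not part of the repo. The patterns only cover the phrasing seen in this diff (`N passed`, `N subtests passed`, `N tests`).

```python
import re
from typing import List, Tuple

# Matches e.g. "146 passed, 3 subtests passed" or just "146 passed".
SUMMARY_RE = re.compile(r"(\d+) passed(?:, (\d+) subtests passed)?")
# Matches count claims in prose, e.g. "49 tests" or "146 passed".
CLAIM_RE = re.compile(r"\b(\d+) (?:tests|passed)\b")


def parse_summary(line: str) -> Tuple[int, int]:
    """Return (passed, subtests_passed) from a pytest summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"no pytest summary found in {line!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def stale_counts(doc_text: str, actual_passed: int) -> List[int]:
    """Return every claimed test count that differs from the real one."""
    return [
        int(count)
        for count in CLAIM_RE.findall(doc_text)
        if int(count) != actual_passed
    ]
```

Feeding `parse_summary` the last line of `python3 -m pytest -q` output and running `stale_counts` over each doc file would list outdated numbers without anyone having to reread the docs by hand.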
@@ -399,7 +427,7 @@ The repo's deploy surface is not fully coherent:
 7. Align or remove resilience scripts targeting the wrong port/service.
 8. Resolve doc drift:
   - ARCHITECTURE says “close tab = gone,” but implementation uses `localStorage`
-  - BACKEND_SETUP still says 49 tests, while current verified suite is 115 + 3 subtests
+  - BACKEND_SETUP still says 49 tests, while current verified suite is 146 + 3 subtests
   - audit docs understate current automation coverage
 ### Strategic debt