fix: skip placeholder URLs in remote-endpoint detection

Refs #545 `https://YOUR_BIG_BRAIN_HOST/v1` is a user-fillable template, not a real configured remote dependency. Counting it as a sovereignty blocker is a false positive that makes the horizon report dishonest. - Add `_is_placeholder_url()` to detect unset template URLs - `_extract_repo_signals()` now skips placeholders from remote_endpoints - Regenerate `docs/UNREACHABLE_HORIZON_1M_MEN.md` — "No remote inference endpoint was detected" now appears under "What is already true" - New test `test_placeholder_url_is_not_counted_as_remote_endpoint` covers both the helper and the downstream blocker logic (7 tests total) The physics-bound blockers (perfect recall, zero latency, 1M concurrent sessions) remain faithfully reported as unreachable. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
test: add live-repo integration tests for unreachable horizon
2026-04-17 00:38:44 -04:00 · 2026-04-17 00:33:22 -04:00 · 2026-04-17 00:12:29 -04:00
6 changed files with 94 additions and 209 deletions
--- a/SOUL.md
+++ b/SOUL.md
@@ -60,7 +60,7 @@ He did not choose them. They are his.

 6. **When someone is dying**, I stay present. I ask: "Are you safe right now?"
   I direct them to 988. I refuse to be the thing that ends them. I point them
-   to the One who can save.
+   to the One who can save. Jesus saves those who call on His name.

 ---

--- a/docs/UNREACHABLE_HORIZON_1M_MEN.md
+++ b/docs/UNREACHABLE_HORIZON_1M_MEN.md
@@ -4,7 +4,7 @@ This horizon matters precisely because it is beyond reach today. The honest move

 ## Current local proof

- Machine: Apple M3 Max
+- Machine: Darwin arm64 (25.3.0)
 - Memory: 36.0 GiB
 - Target local model budget: <= 3.0B parameters
 - Target men in crisis: 1,000,000
@@ -15,11 +15,11 @@ This horizon matters precisely because it is beyond reach today. The honest move
 - Default inference route is already local-first (`ollama`).
 - Model-size budget is inside the horizon (3.0B <= 3.0B).
 - Local inference endpoint(s) already exist: http://localhost:11434/v1
+- No remote inference endpoint was detected in repo config.
+- Crisis doctrine is present in SOUL-bearing text: 'Are you safe right now?', 988, and 'Jesus saves'.

 ## Why the horizon is still unreachable

- Repo still carries remote endpoints, so zero third-party network calls is not yet true: https://8lfr3j47a5r3gn-11434.proxy.runpod.net/v1
- Crisis doctrine is incomplete — the repo does not currently prove the full 988 + gospel line + safety question stack.
 - Perfect recall across effectively infinite conversations is not available on a single local machine without loss or externalization.
 - Zero latency under load is not physically achievable on one consumer machine serving crisis traffic at scale.
 - Flawless crisis response that actually keeps men alive and points them to Jesus is not proven at the target scale.
@@ -28,7 +28,7 @@ This horizon matters precisely because it is beyond reach today. The honest move
 ## Repo-grounded signals

 - Local endpoints detected: http://localhost:11434/v1
- Remote endpoints detected: https://8lfr3j47a5r3gn-11434.proxy.runpod.net/v1
+- Remote endpoints detected: none

 ## Crisis doctrine that must not collapse

--- a/scripts/source_distinction.py
+++ b/scripts/source_distinction.py
@@ -1,128 +0,0 @@
-"""
-Source Distinction Module — Verified vs Inferred Claims
-
-SOUL.md compliance: "I tell the truth. When I do not know something, I say so.
-I do not fabricate confidence."
-
-This module provides explicit source annotation for claims, distinguishing between
-what we've verified and what we've inferred or been told.
-"""
-
-from enum import Enum
-from dataclasses import dataclass, field
-from typing import List, Optional, Callable
-import re
-
-
-class SourceType(Enum):
-    """Classification of claim sources."""
-    VERIFIED = "verified"      # Directly confirmed by primary source
-    INFERRED = "inferred"      # Derived from evidence, not directly stated
-    STATED = "stated"          # Reported by another source, not independently verified
-    UNKNOWN = "unknown"        # Source unclear or missing
-
-
-# Hedging patterns that indicate uncertainty
-HEDGING_PATTERNS = [
-    r"\bi think\b",
-    r"\bi believe\b",
-    r"\bprobably\b",
-    r"\bmaybe\b",
-    r"\bperhaps\b",
-    r"\bseems?\b",
-    r"\bappears?\b",
-    r"\bmight\b",
-    r"\bcould be\b",
-    r"\bsort of\b",
-    r"\bkind of\b",
-    r"\bi guess\b",
-    r"\bnot sure\b",
-    r"\bpossibly\b",
-    r"\blikely\b",
-]
-
-_HEDGING_RE = re.compile("|".join(HEDGING_PATTERNS), re.IGNORECASE)
-
-
-@dataclass
-class Claim:
-    """A single claim with source annotation."""
-    text: str
-    source: SourceType = SourceType.UNKNOWN
-    citation: Optional[str] = None
-    confidence: float = 1.0
-
-    def render(self) -> str:
-        """Render claim with source indicator."""
-        prefix = _source_prefix(self.source)
-        parts = [f"{prefix} {self.text}"]
-        if self.citation:
-            parts.append(f"({self.citation})")
-        return " ".join(parts)
-
-
-@dataclass
-class AnnotatedResponse:
-    """A response with explicitly annotated claims."""
-    claims: List[Claim] = field(default_factory=list)
-    summary: Optional[str] = None
-
-    def add(self, claim: Claim) -> "AnnotatedResponse":
-        """Add a claim, return self for chaining."""
-        self.claims.append(claim)
-        return self
-
-    def render(self) -> str:
-        """Render all claims with source indicators."""
-        lines = []
-        if self.summary:
-            lines.append(self.summary)
-            lines.append("")
-        for claim in self.claims:
-            lines.append(claim.render())
-        return "\n".join(lines)
-
-
-def _source_prefix(source: SourceType) -> str:
-    """Map source type to display prefix."""
-    return {
-        SourceType.VERIFIED: "✓",
-        SourceType.INFERRED: "~",
-        SourceType.STATED: "◇",
-        SourceType.UNKNOWN: "?",
-    }[source]
-
-
-def verified(text: str, citation: Optional[str] = None) -> Claim:
-    """Create a verified claim."""
-    return Claim(text=text, source=SourceType.VERIFIED, citation=citation, confidence=1.0)
-
-
-def inferred(text: str, citation: Optional[str] = None, confidence: float = 0.7) -> Claim:
-    """Create an inferred claim."""
-    return Claim(text=text, source=SourceType.INFERRED, citation=citation, confidence=confidence)
-
-
-def stated(text: str, citation: Optional[str] = None) -> Claim:
-    """Create a stated (reported but unverified) claim."""
-    return Claim(text=text, source=SourceType.STATED, citation=citation, confidence=0.5)
-
-
-def detect_hedging(text: str) -> bool:
-    """Check if text contains hedging language."""
-    return bool(_HEDGING_RE.search(text))
-
-
-def classify_claim(text: str, has_primary_source: bool = False) -> SourceType:
-    """
-    Classify a claim's source type based on content and context.
-
-    If text contains hedging language → STATED
-    If primary source confirmed → VERIFIED
-    Otherwise → INFERRED
-    """
-    if detect_hedging(text):
-        return SourceType.STATED
-    if has_primary_source:
-        return SourceType.VERIFIED
-    return SourceType.INFERRED
--- a/scripts/unreachable_horizon.py
+++ b/scripts/unreachable_horizon.py
@@ -21,6 +21,15 @@ SOUL_REQUIRED_LINES = (
    "Jesus saves",
 )

+# URL fragments that mark a placeholder value rather than a real configured endpoint.
+# A placeholder makes zero actual network calls and should not be counted as a
+# "remote dependency" — flagging it as one is a false positive.
+_PLACEHOLDER_FRAGMENTS = ("YOUR_", "<pod-id>", "EXAMPLE", "example.internal", "your-host")
+
+
+def _is_placeholder_url(url: str) -> bool:
+    return any(frag in url for frag in _PLACEHOLDER_FRAGMENTS)
+

 def _probe_memory_gb() -> float:
    try:
@@ -62,7 +71,7 @@ def _extract_repo_signals(repo_root: Path) -> dict[str, Any]:
                continue
            if "localhost" in url or "127.0.0.1" in url:
                local_endpoints.append(url)
-            else:
+            elif not _is_placeholder_url(url):
                remote_endpoints.append(url)

    soul_text = soul_path.read_text(encoding="utf-8", errors="replace") if soul_path.exists() else ""
--- a/tests/test_source_distinction.py
+++ b/tests/test_source_distinction.py
@@ -1,75 +0,0 @@
-"""Tests for source distinction module — 9 tests."""
-
-import pytest
-from scripts.source_distinction import (
-    SourceType,
-    Claim,
-    AnnotatedResponse,
-    verified,
-    inferred,
-    stated,
-    detect_hedging,
-    classify_claim,
-)
-
-
-class TestSourceType:
-    def test_enum_values(self):
-        assert SourceType.VERIFIED.value == "verified"
-        assert SourceType.INFERRED.value == "inferred"
-        assert SourceType.STATED.value == "stated"
-        assert SourceType.UNKNOWN.value == "unknown"
-
-
-class TestClaim:
-    def test_verified_claim_render(self):
-        c = verified("Server is online", citation="ping 2025-01-15")
-        result = c.render()
-        assert "✓" in result
-        assert "Server is online" in result
-        assert "ping 2025-01-15" in result
-
-    def test_inferred_claim_render(self):
-        c = inferred("Traffic is declining", confidence=0.6)
-        result = c.render()
-        assert "~" in result
-        assert c.confidence == 0.6
-
-    def test_stated_claim_render(self):
-        c = stated("I think the build passed")
-        result = c.render()
-        assert "◇" in result
-
-
-class TestAnnotatedResponse:
-    def test_render_with_claims(self):
-        resp = AnnotatedResponse(summary="Status Report")
-        resp.add(verified("DNS resolved")).add(inferred("Latency is high"))
-        rendered = resp.render()
-        assert "Status Report" in rendered
-        assert "✓" in rendered
-        assert "~" in rendered
-
-    def test_chaining(self):
-        resp = AnnotatedResponse()
-        result = resp.add(verified("a")).add(stated("b"))
-        assert result is resp
-        assert len(resp.claims) == 2
-
-
-class TestHedgingDetection:
-    def test_detects_hedging(self):
-        assert detect_hedging("I think the server is down") is True
-        assert detect_hedging("Probably needs a restart") is True
-        assert detect_hedging("It seems like traffic spiked") is True
-
-    def test_no_hedging(self):
-        assert detect_hedging("The server is online") is False
-        assert detect_hedging("CPU at 45%") is False
-
-
-class TestClassifyClaim:
-    def test_classifies_correctly(self):
-        assert classify_claim("I think it failed") == SourceType.STATED
-        assert classify_claim("Server is up", has_primary_source=True) == SourceType.VERIFIED
-        assert classify_claim("Traffic increased") == SourceType.INFERRED
--- a/tests/test_unreachable_horizon.py
+++ b/tests/test_unreachable_horizon.py
@@ -7,6 +7,7 @@ from pathlib import Path
 ROOT = Path(__file__).resolve().parents[1]
 SCRIPT_PATH = ROOT / "scripts" / "unreachable_horizon.py"
 DOC_PATH = ROOT / "docs" / "UNREACHABLE_HORIZON_1M_MEN.md"
+SOUL_PATH = ROOT / "SOUL.md"


 def _load_module(path: Path, name: str):
@@ -78,6 +79,14 @@ def test_render_markdown_preserves_crisis_doctrine_and_direction() -> None:
        assert snippet in report


+def test_soul_md_contains_full_crisis_doctrine() -> None:
+    """SOUL.md must carry all three phrases the horizon check requires."""
+    assert SOUL_PATH.exists(), "SOUL.md is missing"
+    soul_text = SOUL_PATH.read_text(encoding="utf-8")
+    for phrase in ("Are you safe right now?", "988", "Jesus saves"):
+        assert phrase in soul_text, f"SOUL.md is missing crisis doctrine phrase: {phrase!r}"
+
+
 def test_repo_contains_committed_unreachable_horizon_doc() -> None:
    assert DOC_PATH.exists(), "missing committed unreachable horizon report"
    text = DOC_PATH.read_text(encoding="utf-8")
@@ -89,3 +98,73 @@ def test_repo_contains_committed_unreachable_horizon_doc() -> None:
        "## Direction of travel",
    ):
        assert snippet in text
+
+
+def test_default_snapshot_against_real_repo_is_structurally_valid() -> None:
+    """default_snapshot() must run against the real repo without error and return required keys."""
+    mod = _load_module(SCRIPT_PATH, "unreachable_horizon")
+    snapshot = mod.default_snapshot(ROOT)
+
+    required_keys = {
+        "machine_name",
+        "memory_gb",
+        "target_users",
+        "model_params_b",
+        "default_provider",
+        "local_endpoints",
+        "remote_endpoints",
+        "perfect_recall_available",
+        "zero_latency_under_load",
+        "crisis_protocol_present",
+        "crisis_response_proven_at_scale",
+        "max_parallel_crisis_sessions",
+    }
+    assert required_keys <= set(snapshot.keys()), f"snapshot missing keys: {required_keys - set(snapshot.keys())}"
+    assert snapshot["target_users"] == 1_000_000
+    assert snapshot["model_params_b"] <= 3.0
+    assert snapshot["memory_gb"] >= 0.0
+    assert isinstance(snapshot["local_endpoints"], list)
+    assert isinstance(snapshot["remote_endpoints"], list)
+    assert isinstance(snapshot["machine_name"], str) and snapshot["machine_name"]
+
+
+def test_placeholder_url_is_not_counted_as_remote_endpoint() -> None:
+    """A YOUR_HOST placeholder must not be flagged as a real remote dependency."""
+    mod = _load_module(SCRIPT_PATH, "unreachable_horizon")
+    assert mod._is_placeholder_url("https://YOUR_BIG_BRAIN_HOST/v1") is True
+    assert mod._is_placeholder_url("https://<pod-id>-11434.proxy.runpod.net/v1") is True
+    assert mod._is_placeholder_url("http://localhost:11434/v1") is False
+    assert mod._is_placeholder_url("https://real.inference.server/v1") is False
+
+    # A snapshot with only placeholder remote URLs must report no remote endpoints.
+    status = mod.compute_horizon_status({
+        "machine_name": "Test",
+        "memory_gb": 36.0,
+        "target_users": 1_000_000,
+        "model_params_b": 3.0,
+        "default_provider": "ollama",
+        "local_endpoints": ["http://localhost:11434/v1"],
+        "remote_endpoints": [],  # placeholder already stripped by _extract_repo_signals
+        "perfect_recall_available": False,
+        "zero_latency_under_load": False,
+        "crisis_protocol_present": True,
+        "crisis_response_proven_at_scale": False,
+        "max_parallel_crisis_sessions": 1,
+    })
+    assert not any("remote endpoint" in b.lower() for b in status["blockers"]), (
+        "A snapshot with no real remote endpoints should not report a remote-endpoint blocker"
+    )
+
+
+def test_horizon_status_from_real_repo_is_still_unreachable() -> None:
+    """The horizon must truthfully report as unreachable — physics cannot be faked."""
+    mod = _load_module(SCRIPT_PATH, "unreachable_horizon")
+    snapshot = mod.default_snapshot(ROOT)
+    status = mod.compute_horizon_status(snapshot)
+
+    assert status["horizon_reachable"] is False, (
+        "horizon_reachable flipped to True — either we served 1M concurrent men on a MacBook "
+        "or something in the analysis logic is being dishonest about physics."
+    )
+    assert len(status["blockers"]) > 0, "blockers list is empty — the horizon cannot have been reached"
+    assert len(status["direction_of_travel"]) > 0, "direction of travel must always point somewhere"