Compare commits
6 Commits
fix/791-cr ... fix/792-gr

| Author | SHA1 | Date |
|---|---|---|
| | 55c8100b8f | |
| | 1f92fb0480 | |
| | a39f4fb1ab | |
| | 5c2cf06f57 | |
| | 4fd78ace44 | |
| | b8b8bb65fd | |
296 GENOME.md
@@ -1,209 +1,141 @@
|
||||
# GENOME.md — the-nexus
|
||||
# GENOME.md — Timmy_Foundation/timmy-home
|
||||
|
||||
Generated by `pipelines/codebase_genome.py`.
|
||||
## Project Overview

`the-nexus` is a hybrid repo that combines three layers in one codebase:
Timmy Foundation's home repository for development operations and configurations.

1. A browser-facing world shell rooted in `index.html`, `boot.js`, `bootstrap.mjs`, `app.js`, `style.css`, `portals.json`, `vision.json`, `manifest.json`, and `gofai_worker.js`
2. A Python realtime bridge centered on `server.py` plus harness code under `nexus/`
3. A memory / fleet / operator layer spanning `mempalace/`, `mcp_servers/`, `multi_user_bridge.py`, and supporting scripts

- Text files indexed: 3004
- Source and script files: 186
- Test files: 28
- Documentation files: 701

The repo is neither a clean single-purpose frontend nor just a backend harness. It is a mixed world/runtime/ops repository where browser rendering, WebSocket telemetry, MCP-driven game harnesses, and fleet memory tooling coexist.

Grounded repo facts from this checkout:
- Browser shell files exist at repo root: `index.html`, `app.js`, `style.css`, `manifest.json`, `gofai_worker.js`
- Data/config files also live at repo root: `portals.json`, `vision.json`
- Realtime bridge exists in `server.py`
- Game harnesses exist in `nexus/morrowind_harness.py` and `nexus/bannerlord_harness.py`
- Memory/fleet sync exists in `mempalace/tunnel_sync.py`
- Desktop/game automation MCP servers exist in `mcp_servers/desktop_control_server.py` and `mcp_servers/steam_info_server.py`
- Validation exists in `tests/test_browser_smoke.py`, `tests/test_portals_json.py`, `tests/test_index_html_integrity.py`, and `tests/test_repo_truth.py`

The current architecture is best understood as a sovereign world shell plus operator/game harness backend, with accumulated documentation drift from multiple restoration and migration efforts.
|
||||
## Architecture Diagram
|
||||
## Architecture
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
browser[Index HTML Shell\nindex.html -> boot.js -> bootstrap.mjs -> app.js]
|
||||
assets[Root Assets\nstyle.css\nmanifest.json\ngofai_worker.js]
|
||||
data[World Data\nportals.json\nvision.json]
|
||||
ws[Realtime Bridge\nserver.py\nWebSocket broadcast hub]
|
||||
gofai[In-browser GOFAI\nSymbolicEngine\nNeuroSymbolicBridge\nsetupGOFAI/updateGOFAI]
|
||||
harnesses[Python Harnesses\nnexus/morrowind_harness.py\nnexus/bannerlord_harness.py]
|
||||
mcp[MCP Adapters\nmcp_servers/desktop_control_server.py\nmcp_servers/steam_info_server.py]
|
||||
memory[Memory + Fleet\nmempalace/tunnel_sync.py\nmempalace.js]
|
||||
bridge[Operator / MUD Bridge\nmulti_user_bridge.py\ncommands/timmy_commands.py]
|
||||
tests[Verification\ntests/test_browser_smoke.py\ntests/test_portals_json.py\ntests/test_repo_truth.py]
|
||||
docs[Contracts + Drift Docs\nBROWSER_CONTRACT.md\nREADME.md\nCLAUDE.md\nINVESTIGATION_ISSUE_1145.md]
|
||||
|
||||
browser --> assets
|
||||
browser --> data
|
||||
browser --> gofai
|
||||
browser --> ws
|
||||
harnesses --> mcp
|
||||
harnesses --> ws
|
||||
bridge --> ws
|
||||
memory --> ws
|
||||
tests --> browser
|
||||
tests --> data
|
||||
tests --> docs
|
||||
docs --> browser
|
||||
repo_root["repo"]
|
||||
angband["angband"]
|
||||
briefings["briefings"]
|
||||
config["config"]
|
||||
conftest["conftest"]
|
||||
evennia["evennia"]
|
||||
evennia_tools["evennia_tools"]
|
||||
evolution["evolution"]
|
||||
gemini_fallback_setup["gemini-fallback-setup"]
|
||||
heartbeat["heartbeat"]
|
||||
infrastructure["infrastructure"]
|
||||
repo_root --> angband
|
||||
repo_root --> briefings
|
||||
repo_root --> config
|
||||
repo_root --> conftest
|
||||
repo_root --> evennia
|
||||
repo_root --> evennia_tools
|
||||
```
|
||||
|
||||
## Entry Points and Data Flow
|
||||
## Entry Points
|
||||
|
||||
### Primary entry points
|
||||
- `gemini-fallback-setup.sh` — operational script (`bash gemini-fallback-setup.sh`)
|
||||
- `morrowind/hud.sh` — operational script (`bash morrowind/hud.sh`)
|
||||
- `pipelines/codebase_genome.py` — python main guard (`python3 pipelines/codebase_genome.py`)
|
||||
- `scripts/auto_restart_agent.sh` — operational script (`bash scripts/auto_restart_agent.sh`)
|
||||
- `scripts/backup_pipeline.sh` — operational script (`bash scripts/backup_pipeline.sh`)
|
||||
- `scripts/big_brain_manager.py` — operational script (`python3 scripts/big_brain_manager.py`)
|
||||
- `scripts/big_brain_repo_audit.py` — operational script (`python3 scripts/big_brain_repo_audit.py`)
|
||||
- `scripts/codebase_genome_nightly.py` — operational script (`python3 scripts/codebase_genome_nightly.py`)
|
||||
- `scripts/detect_secrets.py` — operational script (`python3 scripts/detect_secrets.py`)
|
||||
- `scripts/dynamic_dispatch_optimizer.py` — operational script (`python3 scripts/dynamic_dispatch_optimizer.py`)
|
||||
- `scripts/emacs-fleet-bridge.py` — operational script (`python3 scripts/emacs-fleet-bridge.py`)
|
||||
- `scripts/emacs-fleet-poll.sh` — operational script (`bash scripts/emacs-fleet-poll.sh`)
|
||||
|
||||
- `index.html` — root browser entry point
|
||||
- `boot.js` — startup selector; `tests/boot.test.js` shows it chooses file-mode vs HTTP/module-mode and injects `bootstrap.mjs` when served over HTTP
|
||||
- `bootstrap.mjs` — module bootstrap for the browser shell
|
||||
- `app.js` — main browser runtime; owns world state, GOFAI wiring, metrics polling, and portal/UI logic
|
||||
- `server.py` — WebSocket broadcast bridge on `ws://0.0.0.0:8765`
|
||||
- `nexus/morrowind_harness.py` — GamePortal/MCP harness for OpenMW Morrowind
|
||||
- `nexus/bannerlord_harness.py` — GamePortal/MCP harness for Bannerlord
|
||||
- `mempalace/tunnel_sync.py` — pulls remote fleet closets into the local palace over HTTP
|
||||
- `multi_user_bridge.py` — HTTP bridge for multi-user chat/session integration
|
||||
- `mcp_servers/desktop_control_server.py` — stdio MCP server exposing screenshots/mouse/keyboard control
|
||||
## Data Flow
|
||||
|
||||
### Data flow
|
||||
|
||||
1. Browser startup begins at `index.html`
|
||||
2. `boot.js` decides whether the page is being served correctly; in HTTP mode it injects `bootstrap.mjs`
|
||||
3. `bootstrap.mjs` hands off to `app.js`
|
||||
4. `app.js` loads world configuration from `portals.json` and `vision.json`
|
||||
5. `app.js` constructs the Three.js scene and in-browser reasoning components, including `SymbolicEngine`, `NeuroSymbolicBridge`, `setupGOFAI()`, and `updateGOFAI()`
|
||||
6. Browser state and external runtimes connect through `server.py`, which broadcasts messages between connected clients
|
||||
7. Python harnesses (`nexus/morrowind_harness.py`, `nexus/bannerlord_harness.py`) spawn MCP subprocesses for desktop control / Steam metadata, capture state, execute actions, and feed telemetry into the Nexus bridge
|
||||
8. Memory/fleet tools like `mempalace/tunnel_sync.py` import remote palace data into local closets, extending what the operator/runtime layers can inspect
|
||||
9. Tests validate both the static browser contract and the higher-level repo-truth/memory contracts
|
||||
|
||||
### Important repo-specific runtime facts
|
||||
|
||||
- `portals.json` is a JSON array of portal/world/operator entries; examples in this checkout include `morrowind`, `bannerlord`, `workshop`, `archive`, `chapel`, and `courtyard` (a structural check is sketched after this list)
|
||||
- `server.py` is a plain broadcast hub: clients send messages, the server forwards them to other connected clients
|
||||
- `nexus/morrowind_harness.py` and `nexus/bannerlord_harness.py` both implement a GamePortal pattern with MCP subprocess clients over stdio and WebSocket telemetry uplink
|
||||
- `mempalace/tunnel_sync.py` is not speculative; it is a real client that discovers remote wings, searches remote rooms, and writes `.closet.json` payloads locally
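As a concrete illustration of the `portals.json` fact above, a minimal structural check might look like the sketch below. This is illustrative only, not the repo's actual `tests/test_portals_json.py`; the only assumption is the array-of-objects shape described above.

```python
# Illustrative structural check for portals.json (not the repo's real test).
import json
from pathlib import Path


def check_portals(path: str = "portals.json") -> int:
    data = json.loads(Path(path).read_text())
    assert isinstance(data, list), "portals.json should be a JSON array"
    for entry in data:
        assert isinstance(entry, dict), "each portal entry should be an object"
    return len(data)


if __name__ == "__main__":
    print(f"{check_portals()} portal entries look structurally valid")
```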
|
||||
1. Operators enter through `gemini-fallback-setup.sh`, `morrowind/hud.sh`, `pipelines/codebase_genome.py`.
|
||||
2. Core logic fans into top-level components: `angband`, `briefings`, `config`, `conftest`, `evennia`, `evennia_tools`.
|
||||
3. Validation is incomplete around `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py`, `timmy-local/cache/agent_cache.py`, `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py`, so changes there carry regression risk.
|
||||
4. Final artifacts land as repository files, docs, or runtime side effects depending on the selected entry point.
|
||||
|
||||
## Key Abstractions
|
||||
|
||||
### Browser runtime
|
||||
|
||||
- `app.js`
|
||||
- Defines in-browser reasoning/state machinery, including `class SymbolicEngine`, `class NeuroSymbolicBridge`, `setupGOFAI()`, and `updateGOFAI()`
|
||||
- Couples rendering, local symbolic reasoning, metrics polling, and portal/UI logic in one very large root module
|
||||
- `BROWSER_CONTRACT.md`
|
||||
- Acts like an executable architecture contract for the browser surface
|
||||
- Declares required files, DOM IDs, Three.js expectations, provenance rules, and WebSocket expectations
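A minimal sketch of the kind of check such a contract implies is shown below, assuming only the required-root-files rule listed earlier; the real enforcement lives in `BROWSER_CONTRACT.md` and the `tests/` suite.

```python
# Hypothetical contract check: required root browser assets must exist.
from pathlib import Path

REQUIRED_ROOT_FILES = [
    "index.html", "app.js", "style.css", "manifest.json", "gofai_worker.js",
]


def check_browser_contract(repo_root: str = ".") -> None:
    missing = [name for name in REQUIRED_ROOT_FILES if not (Path(repo_root) / name).exists()]
    if missing:
        raise SystemExit(f"Browser contract violated; missing: {missing}")
    print("All required root browser assets are present")


if __name__ == "__main__":
    check_browser_contract()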
|
||||
|
||||
### Realtime bridge
|
||||
|
||||
- `server.py`
|
||||
- Single hub abstraction: a WebSocket broadcast server maintaining a `clients` set and forwarding messages from one client to the others
|
||||
- This is the seam between browser shell, harnesses, and external telemetry producers
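A minimal sketch of that broadcast-hub pattern using the `websockets` library is shown below; it is not the actual `server.py`, and the port and error handling are assumptions.

```python
# Minimal broadcast-hub sketch (illustrative, not the actual server.py).
import asyncio
import websockets

CLIENTS = set()


async def handler(ws):
    CLIENTS.add(ws)
    try:
        async for message in ws:
            # Forward each incoming message to every other connected client.
            for peer in list(CLIENTS):
                if peer is not ws:
                    await peer.send(message)
    finally:
        CLIENTS.discard(ws)


async def main():
    async with websockets.serve(handler, "0.0.0.0", 8765):
        await asyncio.Future()  # run until cancelled


if __name__ == "__main__":
    asyncio.run(main())
```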
|
||||
|
||||
### GamePortal harness layer
|
||||
|
||||
- `nexus/morrowind_harness.py`
|
||||
- `nexus/bannerlord_harness.py`
|
||||
- Both define MCP client wrappers, `GameState` / `ActionResult`-style data classes, and an Observe-Decide-Act telemetry loop
|
||||
- The harnesses are symmetric enough to be understood as reusable portal adapters with game-specific context injected on top
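A hypothetical skeleton of that shared pattern is sketched below; the class and method names are illustrative rather than the real harness APIs, and the MCP/telemetry plumbing is elided.

```python
# Hypothetical GamePortal skeleton: Observe-Decide-Act loop with dataclass state.
import time
from dataclasses import dataclass, field


@dataclass
class GameState:
    raw: dict = field(default_factory=dict)


@dataclass
class ActionResult:
    ok: bool
    detail: str = ""


class GamePortal:
    def observe(self) -> GameState:
        # Real harnesses capture game state via an MCP subprocess over stdio.
        return GameState()

    def decide(self, state: GameState) -> str:
        # Game-specific context (Morrowind vs Bannerlord) would be injected here.
        return "noop"

    def act(self, action: str) -> ActionResult:
        # Real harnesses issue mouse/keyboard commands through the MCP adapter.
        return ActionResult(ok=True, detail=action)

    def run(self, cycles: int = 3, interval: float = 1.0) -> None:
        for _ in range(cycles):
            state = self.observe()
            action = self.decide(state)
            result = self.act(action)
            print(f"action={action} ok={result.ok}")  # telemetry uplink elided
            time.sleep(interval)
```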
|
||||
|
||||
### Memory / fleet layer
|
||||
|
||||
- `mempalace/tunnel_sync.py`
|
||||
- Encodes the fleet-memory sync client contract: discover wings, pull broad room queries, write closet files, support dry-run (see the sketch after this list)
|
||||
- `mempalace.js`
|
||||
- Minimal browser/Electron bridge to MemPalace commands via `window.electronAPI.execPython(...)`
|
||||
- Important because it shows a second memory integration surface distinct from the Python fleet sync path
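A minimal sketch of the pull flow described for `mempalace/tunnel_sync.py` appears below; the endpoint paths, payload shapes, and CLI flags are assumptions for illustration, not the script's real API.

```python
# Illustrative fleet-memory pull: discover wings, fetch rooms, write .closet.json.
import argparse
import json
import urllib.request
from pathlib import Path


def pull_closets(peer: str, out_dir: str = "mempalace/closets", dry_run: bool = False) -> None:
    with urllib.request.urlopen(f"{peer}/wings") as resp:  # discover remote wings (assumed endpoint)
        wings = json.load(resp)
    for wing in wings:
        with urllib.request.urlopen(f"{peer}/wings/{wing}/rooms") as resp:  # broad room query
            rooms = json.load(resp)
        target = Path(out_dir) / f"{wing}.closet.json"
        if dry_run:
            print(f"[dry-run] would write {target} ({len(rooms)} rooms)")
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(rooms, indent=2))


if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("--peer", required=True)
    ap.add_argument("--dry-run", action="store_true")
    args = ap.parse_args()
    pull_closets(args.peer, dry_run=args.dry_run)
```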
|
||||
|
||||
### Operator / interaction bridge
|
||||
|
||||
- `multi_user_bridge.py`
|
||||
- `commands/timmy_commands.py`
|
||||
- These bridge user-facing conversations or MUD/Evennia interactions back into Timmy/Nexus services
|
||||
- `evennia/timmy_world/game.py` — classes `World`:91, `ActionSystem`:421, `TimmyAI`:539, `NPCAI`:550; functions `get_narrative_phase()`:55, `get_phase_transition_event()`:65
|
||||
- `evennia/timmy_world/world/game.py` — classes `World`:19, `ActionSystem`:326, `TimmyAI`:444, `NPCAI`:455; functions none detected
|
||||
- `timmy-world/game.py` — classes `World`:19, `ActionSystem`:349, `TimmyAI`:467, `NPCAI`:478; functions none detected
|
||||
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — classes none detected; functions none detected
|
||||
- `uniwizard/self_grader.py` — classes `SessionGrade`:23, `WeeklyReport`:55, `SelfGrader`:74; functions `main()`:713
|
||||
- `uni-wizard/v3/intelligence_engine.py` — classes `ExecutionPattern`:27, `ModelPerformance`:44, `AdaptationEvent`:58, `PatternDatabase`:69; functions none detected
|
||||
- `scripts/know_thy_father/crossref_audit.py` — classes `ThemeCategory`:30, `Principle`:160, `MeaningKernel`:169, `CrossRefFinding`:178; functions `extract_themes_from_text()`:192, `parse_soul_md()`:206, `parse_kernels()`:264, `cross_reference()`:296, `generate_report()`:440, `main()`:561
|
||||
- `timmy-local/cache/agent_cache.py` — classes `CacheStats`:28, `LRUCache`:52, `ResponseCache`:94, `ToolCache`:205; functions none detected
|
||||
|
||||
## API Surface
|
||||
|
||||
### Browser / static surface
|
||||
- CLI: `bash gemini-fallback-setup.sh` — operational script (`gemini-fallback-setup.sh`)
|
||||
- CLI: `bash morrowind/hud.sh` — operational script (`morrowind/hud.sh`)
|
||||
- CLI: `python3 pipelines/codebase_genome.py` — python main guard (`pipelines/codebase_genome.py`)
|
||||
- CLI: `bash scripts/auto_restart_agent.sh` — operational script (`scripts/auto_restart_agent.sh`)
|
||||
- CLI: `bash scripts/backup_pipeline.sh` — operational script (`scripts/backup_pipeline.sh`)
|
||||
- CLI: `python3 scripts/big_brain_manager.py` — operational script (`scripts/big_brain_manager.py`)
|
||||
- CLI: `python3 scripts/big_brain_repo_audit.py` — operational script (`scripts/big_brain_repo_audit.py`)
|
||||
- CLI: `python3 scripts/codebase_genome_nightly.py` — operational script (`scripts/codebase_genome_nightly.py`)
|
||||
- Python: `get_narrative_phase()` from `evennia/timmy_world/game.py:55`
|
||||
- Python: `get_phase_transition_event()` from `evennia/timmy_world/game.py:65`
|
||||
- Python: `main()` from `uniwizard/self_grader.py:713`
|
||||
|
||||
- `index.html` served over HTTP
|
||||
- `boot.js` exports `bootPage()`; verified by `node --test tests/boot.test.js`
|
||||
- Data APIs are file-based inside the repo: `portals.json`, `vision.json`, `manifest.json`
|
||||
## Test Coverage Report
|
||||
|
||||
### Network/runtime surface
|
||||
- Source and script files inspected: 186
|
||||
- Test files inspected: 28
|
||||
- Coverage gaps:
|
||||
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — no matching test reference detected
|
||||
- `timmy-local/cache/agent_cache.py` — no matching test reference detected
|
||||
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py` — no matching test reference detected
|
||||
- `twitter-archive/multimodal_pipeline.py` — no matching test reference detected
|
||||
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/godmode_race.py` — no matching test reference detected
|
||||
- `skills/productivity/google-workspace/scripts/google_api.py` — no matching test reference detected
|
||||
- `wizards/allegro/home/skills/productivity/google-workspace/scripts/google_api.py` — no matching test reference detected
|
||||
- `morrowind/pilot.py` — no matching test reference detected
|
||||
- `morrowind/mcp_server.py` — no matching test reference detected
|
||||
- `skills/research/domain-intel/scripts/domain_intel.py` — no matching test reference detected
|
||||
- `wizards/allegro/home/skills/research/domain-intel/scripts/domain_intel.py` — no matching test reference detected
|
||||
- `timmy-local/scripts/ingest.py` — no matching test reference detected
|
||||
|
||||
- `python3 server.py`
|
||||
- Starts the WebSocket bridge on port `8765`
|
||||
- `python3 l402_server.py`
|
||||
- Local HTTP microservice for cost-estimate style responses
|
||||
- `python3 multi_user_bridge.py`
|
||||
- Multi-user HTTP/chat bridge
|
||||
## Security Audit Findings
|
||||
|
||||
### Harness / operator CLI surfaces
|
||||
- [medium] `briefings/briefing_20260325.json:37` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"gitea_error": "Gitea 404: {\"errors\":null,\"message\":\"not found\",\"url\":\"http://143.198.27.163:3000/api/swagger\"}\n [http://143.198.27.163:3000/api/v1/repos/Timmy_Foundation/sovereign-orchestration/issues?state=open&type=issues&sort=created&direction=desc&limit=1&page=1]",`
|
||||
- [medium] `briefings/briefing_20260328.json:11` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"provider_base_url": "http://localhost:8081/v1",`
|
||||
- [medium] `briefings/briefing_20260329.json:11` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `"provider_base_url": "http://localhost:8081/v1",`
|
||||
- [medium] `config.yaml:37` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `summary_base_url: http://localhost:11434/v1`
|
||||
- [medium] `config.yaml:47` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
|
||||
- [medium] `config.yaml:52` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
|
||||
- [medium] `config.yaml:57` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
|
||||
- [medium] `config.yaml:62` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
|
||||
- [medium] `config.yaml:67` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
|
||||
- [medium] `config.yaml:77` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
|
||||
- [medium] `config.yaml:82` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: 'http://localhost:11434/v1'`
|
||||
- [medium] `config.yaml:174` — hardcoded http endpoint: plaintext or fixed HTTP endpoints can drift or leak across environments. Evidence: `base_url: http://localhost:11434/v1`
|
||||
|
||||
- `python3 nexus/morrowind_harness.py`
|
||||
- `python3 nexus/bannerlord_harness.py`
|
||||
- `python3 mempalace/tunnel_sync.py --peer <url> [--dry-run] [--n N]`
|
||||
- `python3 mcp_servers/desktop_control_server.py`
|
||||
- `python3 mcp_servers/steam_info_server.py`
|
||||
## Dead Code Candidates
|
||||
|
||||
### Validation surface
|
||||
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/auto_jailbreak.py` — not imported by indexed Python modules and not referenced by tests
|
||||
- `timmy-local/cache/agent_cache.py` — not imported by indexed Python modules and not referenced by tests
|
||||
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/parseltongue.py` — not imported by indexed Python modules and not referenced by tests
|
||||
- `twitter-archive/multimodal_pipeline.py` — not imported by indexed Python modules and not referenced by tests
|
||||
- `wizards/allegro/home/skills/red-teaming/godmode/scripts/godmode_race.py` — not imported by indexed Python modules and not referenced by tests
|
||||
- `skills/productivity/google-workspace/scripts/google_api.py` — not imported by indexed Python modules and not referenced by tests
|
||||
- `wizards/allegro/home/skills/productivity/google-workspace/scripts/google_api.py` — not imported by indexed Python modules and not referenced by tests
|
||||
- `morrowind/pilot.py` — not imported by indexed Python modules and not referenced by tests
|
||||
- `morrowind/mcp_server.py` — not imported by indexed Python modules and not referenced by tests
|
||||
- `skills/research/domain-intel/scripts/domain_intel.py` — not imported by indexed Python modules and not referenced by tests
|
||||
|
||||
- `python3 -m pytest tests/test_portals_json.py tests/test_index_html_integrity.py tests/test_repo_truth.py -q`
|
||||
- `node --test tests/boot.test.js`
|
||||
- `python3 -m py_compile server.py nexus/morrowind_harness.py nexus/bannerlord_harness.py mempalace/tunnel_sync.py mcp_servers/desktop_control_server.py`
|
||||
- `tests/test_browser_smoke.py` defines the higher-cost Playwright smoke contract for the world shell
|
||||
## Performance Bottleneck Analysis
|
||||
|
||||
## Test Coverage Gaps
|
||||
|
||||
Strongly covered in this checkout:
|
||||
- `tests/test_portals_json.py` validates `portals.json`
|
||||
- `tests/test_index_html_integrity.py` checks merge-marker/DOM-integrity regressions in `index.html`
|
||||
- `tests/boot.test.js` verifies `boot.js` startup behavior
|
||||
- `tests/test_repo_truth.py` validates the repo-truth documents
|
||||
- Multiple `tests/test_mempalace_*.py` files cover the palace layer
|
||||
- `tests/test_bannerlord_harness.py` exists for the Bannerlord harness
|
||||
|
||||
Notable gaps or weak seams:
|
||||
- `nexus/morrowind_harness.py` is large and operationally critical, but the generated baseline still flags it as a gap relative to its size/complexity
|
||||
- `mcp_servers/desktop_control_server.py` exposes high-power automation but has no obvious dedicated test file in the root `tests/` suite
|
||||
- `app.js` is the dominant browser runtime file and mixes rendering, GOFAI, metrics, and integration logic in one place; browser smoke exists, but there is limited unit-level decomposition around those subsystems
|
||||
- `mempalace.js` appears minimally bridged and stale relative to the richer Python MemPalace layer
|
||||
- `multi_user_bridge.py` is a large integration surface that is central to operator/chat flow, so changes to it should be treated as high regression risk
|
||||
|
||||
## Security Considerations
|
||||
|
||||
- `server.py` binds `HOST = "0.0.0.0"`, exposing the broadcast bridge beyond localhost unless network controls limit it
|
||||
- The WebSocket bridge is a broadcast hub without visible authentication in `server.py`; connected clients are trusted to send messages into the bus
|
||||
- `mcp_servers/desktop_control_server.py` exposes mouse/keyboard/screenshot control through a stdio MCP server. In any non-local or poorly isolated runtime, this is a privileged automation surface
|
||||
- `app.js` contains hardcoded local/network endpoints such as `http://localhost:${L402_PORT}/api/cost-estimate` and `http://localhost:8082/metrics`; these are convenient for local development but create environment drift and deployment assumptions
|
||||
- `app.js` also embeds explicit endpoint/status references like `ws://143.198.27.163:8765`, which is operationally brittle and the kind of hardcoded location data that drifts across environments
|
||||
- `mempalace.js` shells out through `window.electronAPI.execPython(...)`; this is powerful and useful, but it is a clear trust boundary between UI and host execution
|
||||
- `INVESTIGATION_ISSUE_1145.md` documents an earlier integrity hazard: agents writing to `public/nexus/` instead of canonical root paths. That path confusion is both an operational and security concern because it makes provenance harder to reason about
|
||||
|
||||
## Runtime Truth and Docs Drift
|
||||
|
||||
The most important architecture finding in this repo is not a class or subsystem. It is a truth mismatch.
|
||||
|
||||
- README.md says current `main` does not ship a browser 3D world
|
||||
- CLAUDE.md declares root `app.js` and `index.html` as canonical frontend paths
|
||||
- tests and browser contract now assume the root frontend exists
|
||||
|
||||
All three statements are simultaneously present in this checkout.
|
||||
|
||||
Grounded evidence:
|
||||
- `README.md` still says the repo does not contain an active root frontend such as `index.html`, `app.js`, or `style.css`
|
||||
- the current checkout does contain `index.html`, `app.js`, `style.css`, `manifest.json`, and `gofai_worker.js`
|
||||
- `BROWSER_CONTRACT.md` explicitly treats those root files as required browser assets
|
||||
- `tests/test_browser_smoke.py` serves those exact files and validates DOM/WebGL contracts against them
|
||||
- `tests/test_index_html_integrity.py` assumes `index.html` is canonical and production-relevant
|
||||
- `CLAUDE.md` says frontend code lives at repo root and explicitly warns against `public/nexus/`
|
||||
- `INVESTIGATION_ISSUE_1145.md` explains why `public/nexus/` is a bad/corrupt duplicate path and confirms the real classical AI code lives in root `app.js`
|
||||
|
||||
The honest conclusion:
|
||||
- The repo contains a partially restored or actively re-materialized browser surface
|
||||
- The docs are preserving an older migration truth while the runtime files and smoke contracts describe a newer present-tense truth
|
||||
- Any future work in `the-nexus` must choose one truth and align `README.md`, `CLAUDE.md`, smoke tests, and file layout around it
|
||||
|
||||
That drift is itself a critical architectural fact and should be treated as first-order design debt, not a side note.
|
||||
- `angband/mcp_server.py` — large module (353 lines) likely hides multiple responsibilities
|
||||
- `evennia/timmy_world/game.py` — large module (1541 lines) likely hides multiple responsibilities
|
||||
- `evennia/timmy_world/world/game.py` — large module (1345 lines) likely hides multiple responsibilities
|
||||
- `morrowind/mcp_server.py` — large module (451 lines) likely hides multiple responsibilities
|
||||
- `morrowind/pilot.py` — large module (459 lines) likely hides multiple responsibilities
|
||||
- `pipelines/codebase_genome.py` — large module (557 lines) likely hides multiple responsibilities
|
||||
- `scripts/know_thy_father/crossref_audit.py` — large module (657 lines) likely hides multiple responsibilities
|
||||
- `scripts/know_thy_father/index_media.py` — large module (405 lines) likely hides multiple responsibilities
|
||||
- `scripts/know_thy_father/synthesize_kernels.py` — large module (416 lines) likely hides multiple responsibilities
|
||||
- `scripts/tower_game.py` — large module (395 lines) likely hides multiple responsibilities
|
||||
|
||||
110 evennia_tools/batch_cmds_bezalel.ev (Normal file)
@@ -0,0 +1,110 @@
|
||||
#
|
||||
# Bezalel World Builder — Evennia batch commands
|
||||
# Creates the Bezalel Evennia world from evennia_tools/bezalel_layout.py specs.
|
||||
#
|
||||
# Load with: @batchcommand bezalel_world
|
||||
#
|
||||
# Part of #536
|
||||
|
||||
# Create rooms
|
||||
@create/drop Limbo:evennia.objects.objects.DefaultRoom
|
||||
@desc here = The void between worlds. The air carries the pulse of three houses: Mac, VPS, and this one. Everything begins here before it is given form.
|
||||
|
||||
@create/drop Gatehouse:evennia.objects.objects.DefaultRoom
|
||||
@desc here = A stone guard tower at the edge of Bezalel world. The walls are carved with runes of travel, proof, and return. Every arrival is weighed before it is trusted.
|
||||
|
||||
@create/drop Great Hall:evennia.objects.objects.DefaultRoom
|
||||
@desc here = A vast hall with a long working table. Maps of the three houses hang beside sketches, benchmarks, and deployment notes. This is where the forge reports back to the house.
|
||||
|
||||
@create/drop The Library of Bezalel:evennia.objects.objects.DefaultRoom
|
||||
@desc here = Shelves of technical manuals, Evennia code, test logs, and bridge schematics rise to the ceiling. This room holds plans waiting to be made real.
|
||||
|
||||
@create/drop The Observatory:evennia.objects.objects.DefaultRoom
|
||||
@desc here = A high chamber with telescopes pointing toward the Mac, the VPS, and the wider net. Screens glow with status lights, latency traces, and long-range signals.
|
||||
|
||||
@create/drop The Workshop:evennia.objects.objects.DefaultRoom
|
||||
@desc here = A forge and workbench share the same heat. Scattered here are half-finished bridges, patched harnesses, and tools laid out for proof before pride.
|
||||
|
||||
@create/drop The Server Room:evennia.objects.objects.DefaultRoom
|
||||
@desc here = Racks of humming servers line the walls. Fans push warm air through the chamber while status LEDs beat like a mechanical heart. This is the pulse of Bezalel house.
|
||||
|
||||
@create/drop The Garden of Code:evennia.objects.objects.DefaultRoom
|
||||
@desc here = A quiet garden where ideas are left long enough to grow roots. Code-shaped leaves flutter in patterned wind, and a stone path invites patient thought.
|
||||
|
||||
@create/drop The Portal Room:evennia.objects.objects.DefaultRoom
|
||||
@desc here = Three shimmering doorways stand in a ring: one marked for the Mac house, one for the VPS, and one for the wider net. The room hums like a bridge waiting for traffic.
|
||||
|
||||
# Create exits
|
||||
@open gatehouse:gate,tower = Gatehouse
|
||||
@open limbo:void,back = Limbo
|
||||
@open greathall:hall,great hall = Great Hall
|
||||
@open gatehouse:gate,tower = Gatehouse
|
||||
@open library:books,study = The Library of Bezalel
|
||||
@open hall:great hall,back = Great Hall
|
||||
@open observatory:telescope,tower top = The Observatory
|
||||
@open hall:great hall,back = Great Hall
|
||||
@open workshop:forge,bench = The Workshop
|
||||
@open hall:great hall,back = Great Hall
|
||||
@open serverroom:servers,server room = The Server Room
|
||||
@open workshop:forge,bench = The Workshop
|
||||
@open garden:garden of code,grove = The Garden of Code
|
||||
@open workshop:forge,bench = The Workshop
|
||||
@open portalroom:portal,portals = The Portal Room
|
||||
@open gatehouse:gate,back = Gatehouse
|
||||
|
||||
# Create objects
|
||||
@create Threshold Ledger
|
||||
@desc Threshold Ledger = A heavy ledger where arrivals, departures, and field notes are recorded before the work begins.
|
||||
@tel Threshold Ledger = Gatehouse
|
||||
|
||||
@create Three-House Map
|
||||
@desc Three-House Map = A long map showing Mac, VPS, and remote edges in one continuous line of work.
|
||||
@tel Three-House Map = Great Hall
|
||||
|
||||
@create Bridge Schematics
|
||||
@desc Bridge Schematics = Rolled plans describing world bridges, Evennia layouts, and deployment paths.
|
||||
@tel Bridge Schematics = The Library of Bezalel
|
||||
|
||||
@create Compiler Manuals
|
||||
@desc Compiler Manuals = Manuals annotated in the margins with warnings against cleverness without proof.
|
||||
@tel Compiler Manuals = The Library of Bezalel
|
||||
|
||||
@create Tri-Axis Telescope
|
||||
@desc Tri-Axis Telescope = A brass telescope assembly that can be turned toward the Mac, the VPS, or the open net.
|
||||
@tel Tri-Axis Telescope = The Observatory
|
||||
|
||||
@create Forge Anvil
|
||||
@desc Forge Anvil = Scarred metal used for turning rough plans into testable form.
|
||||
@tel Forge Anvil = The Workshop
|
||||
|
||||
@create Bridge Workbench
|
||||
@desc Bridge Workbench = A wide bench covered in harness patches, relay notes, and half-soldered bridge parts.
|
||||
@tel Bridge Workbench = The Workshop
|
||||
|
||||
@create Heartbeat Console
|
||||
@desc Heartbeat Console = A monitoring console showing service health, latency, and the steady hum of the house.
|
||||
@tel Heartbeat Console = The Server Room
|
||||
|
||||
@create Server Racks
|
||||
@desc Server Racks = Stacked machines that keep the world awake even when no one is watching.
|
||||
@tel Server Racks = The Server Room
|
||||
|
||||
@create Code Orchard
|
||||
@desc Code Orchard = Trees with code-shaped leaves. Some branches bear elegant abstractions; others hold broken prototypes.
|
||||
@tel Code Orchard = The Garden of Code
|
||||
|
||||
@create Stone Bench
|
||||
@desc Stone Bench = A place to sit long enough for a hard implementation problem to become clear.
|
||||
@tel Stone Bench = The Garden of Code
|
||||
|
||||
@create Mac Portal:mac arch
|
||||
@desc Mac Portal = A silver doorway whose frame vibrates with the local sovereign house.
|
||||
@tel Mac Portal = The Portal Room
|
||||
|
||||
@create VPS Portal:vps arch
|
||||
@desc VPS Portal = A cobalt doorway tuned toward the testbed VPS house.
|
||||
@tel VPS Portal = The Portal Room
|
||||
|
||||
@create Net Portal:net arch,network arch
|
||||
@desc Net Portal = A pale doorway pointed toward the wider net and every uncertain edge beyond it.
|
||||
@tel Net Portal = The Portal Room
|
||||
85 evennia_tools/build_bezalel_world.py (Normal file)
@@ -0,0 +1,85 @@
|
||||
#!/usr/bin/env python3
|
||||
""
|
||||
build_bezalel_world.py — Build Bezalel Evennia world from layout specs.
|
||||
|
||||
Programmatically creates rooms, exits, objects, and characters in a running
|
||||
Evennia instance using the specs from evennia_tools/bezalel_layout.py.
|
||||
|
||||
Usage (in Evennia game shell):
|
||||
from evennia_tools.build_bezalel_world import build_world
|
||||
build_world()
|
||||
|
||||
Or via batch command:
|
||||
@batchcommand evennia_tools/batch_cmds_bezalel.ev
|
||||
|
||||
Part of #536
|
||||
""
|
||||
|
||||
from evennia_tools.bezalel_layout import (
|
||||
ROOMS, EXITS, OBJECTS, CHARACTERS, PORTAL_COMMANDS,
|
||||
room_keys, reachable_rooms_from
|
||||
)
|
||||
|
||||
|
||||
def build_world():
|
||||
"""Build the Bezalel Evennia world from layout specs."""
|
||||
from evennia.objects.models import ObjectDB
|
||||
from evennia.utils.create import create_object, create_exit, create_message
|
||||
|
||||
print("Building Bezalel world...")
|
||||
|
||||
# Create rooms
|
||||
rooms = {}
|
||||
for spec in ROOMS:
|
||||
room = create_object(
|
||||
"evennia.objects.objects.DefaultRoom",
|
||||
key=spec.key,
|
||||
attributes=(("desc", spec.desc),),
|
||||
)
|
||||
rooms[spec.key] = room
|
||||
print(f" Room: {spec.key}")
|
||||
|
||||
# Create exits
|
||||
for spec in EXITS:
|
||||
source = rooms.get(spec.source)
|
||||
dest = rooms.get(spec.destination)
|
||||
if not source or not dest:
|
||||
print(f" WARNING: Exit {spec.key} — missing room")
|
||||
continue
|
||||
exit_obj = create_exit(
|
||||
key=spec.key,
|
||||
location=source,
|
||||
destination=dest,
|
||||
aliases=list(spec.aliases),
|
||||
)
|
||||
print(f" Exit: {spec.source} -> {spec.destination} ({spec.key})")
|
||||
|
||||
# Create objects
|
||||
for spec in OBJECTS:
|
||||
location = rooms.get(spec.location)
|
||||
if not location:
|
||||
print(f" WARNING: Object {spec.key} — missing room {spec.location}")
|
||||
continue
|
||||
obj = create_object(
|
||||
"evennia.objects.objects.DefaultObject",
|
||||
key=spec.key,
|
||||
location=location,
|
||||
attributes=(("desc", spec.desc),),
|
||||
aliases=list(spec.aliases),
|
||||
)
|
||||
print(f" Object: {spec.key} in {spec.location}")
|
||||
|
||||
# Verify reachability
|
||||
all_rooms = set(room_keys())
|
||||
reachable = reachable_rooms_from("Limbo")
|
||||
unreachable = all_rooms - reachable
|
||||
if unreachable:
|
||||
print(f" WARNING: Unreachable rooms: {unreachable}")
|
||||
else:
|
||||
print(f" All {len(all_rooms)} rooms reachable from Limbo")
|
||||
|
||||
print("Bezalel world built.")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
build_world()
|
||||
@@ -1,101 +0,0 @@
|
||||
# GENOME.md — Burn Fleet (Timmy_Foundation/burn-fleet)
|
||||
|
||||
> Codebase Genome v1.0 | Generated 2026-04-16 | Repo 14/16
|
||||
|
||||
## Project Overview
|
||||
|
||||
**Burn Fleet** is the autonomous dispatch infrastructure for the Timmy Foundation. It manages 112 tmux panes across Mac and VPS, routing Gitea issues to lane-specialized workers by repo. Each agent has a mythological name — they are all Timmy with different hats.
|
||||
|
||||
**Core principle:** Dispatch ALL panes. Never scan for idle. Stale work beats idle workers.
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
Mac (M3 Max, 14 cores, 36GB) Allegro (VPS, 2 cores, 8GB)
|
||||
┌─────────────────────────────┐ ┌─────────────────────────────┐
|
||||
│ CRUCIBLE 14 panes (bugs) │ │ FORGE 14 panes (bugs) │
|
||||
│ GNOMES 12 panes (cron) │ │ ANVIL 14 panes (nexus) │
|
||||
│ LOOM 12 panes (home) │ │ CRUCIBLE-2 10 panes (home) │
|
||||
│ FOUNDRY 10 panes (nexus) │ │ SENTINEL 6 panes (council)│
|
||||
│ WARD 12 panes (fleet) │ └─────────────────────────────┘
|
||||
│ COUNCIL 8 panes (sages) │ 44 panes (36 workers)
|
||||
└─────────────────────────────┘
|
||||
68 panes (60 workers)
|
||||
```
|
||||
|
||||
**Total: 112 panes, 96 workers + 12 council members + 4 sentinel advisors**
|
||||
|
||||
## Key Files
|
||||
|
||||
| File | LOC | Purpose |
|
||||
|------|-----|---------|
|
||||
| `fleet-spec.json` | ~200 | Machine definitions, window layouts, lane assignments, agent names |
|
||||
| `fleet-launch.sh` | ~100 | Create tmux sessions with correct pane counts on Mac + Allegro |
|
||||
| `fleet-christen.py` | ~80 | Launch hermes in all panes and send identity messages |
|
||||
| `fleet-dispatch.py` | ~250 | Pull Gitea issues and route to correct panes by lane |
|
||||
| `fleet-status.py` | ~100 | Health check across all machines |
|
||||
| `allegro/docker-compose.yml` | ~30 | Allegro VPS container definition |
|
||||
| `allegro/Dockerfile` | ~20 | Allegro build definition |
|
||||
| `allegro/healthcheck.py` | ~15 | Allegro container health check |
|
||||
|
||||
**Total: ~800 LOC**
|
||||
|
||||
## Lane Routing
|
||||
|
||||
Issues are routed by repo to the correct window:
|
||||
|
||||
| Repo | Mac Window | Allegro Window |
|
||||
|------|-----------|----------------|
|
||||
| hermes-agent | CRUCIBLE, GNOMES | FORGE |
|
||||
| timmy-home | LOOM | CRUCIBLE-2 |
|
||||
| timmy-config | LOOM | CRUCIBLE-2 |
|
||||
| the-nexus | FOUNDRY | ANVIL |
|
||||
| the-playground | — | ANVIL |
|
||||
| the-door | WARD | CRUCIBLE-2 |
|
||||
| fleet-ops | WARD | CRUCIBLE-2 |
|
||||
| turboquant | WARD | — |
|
||||
|
||||
## Entry Points
|
||||
|
||||
| Command | Purpose |
|
||||
|---------|---------|
|
||||
| `./fleet-launch.sh both` | Create tmux layout on Mac + Allegro |
|
||||
| `python3 fleet-christen.py both` | Wake all agents with identity messages |
|
||||
| `python3 fleet-dispatch.py --cycles 1` | Single dispatch cycle |
|
||||
| `python3 fleet-dispatch.py --cycles 10 --interval 60` | Continuous burn (10 cycles, 60s apart) |
|
||||
| `python3 fleet-status.py` | Health check all machines |
|
||||
|
||||
## Agent Names
|
||||
|
||||
| Window | Names | Count |
|
||||
|--------|-------|-------|
|
||||
| CRUCIBLE | AZOTH, ALBEDO, CITRINITAS, RUBEDO, SULPHUR, MERCURIUS, SAL, ATHANOR, VITRIOL, SATURN, JUPITER, MARS, EARTH, SOL | 14 |
|
||||
| GNOMES | RAZIEL, AZRAEL, CASSIEL, METATRON, SANDALPHON, BINAH, CHOKMAH, KETER, ALDEBARAN, RIGEL, SIRIUS, POLARIS | 12 |
|
||||
| FORGE | HAMMER, ANVIL, ADZE, PICK, TONGS, WRENCH, SCREWDRIVER, BOLT, SAW, TRAP, HOOK, MAGNET, SPARK, FLAME | 14 |
|
||||
| COUNCIL | TESLA, HERMES, GANDALF, DAVINCI, ARCHIMEDES, TURING, AURELIUS, SOLOMON | 8 |
|
||||
|
||||
## Design Decisions
|
||||
|
||||
1. **Separate GILs** — Allegro runs Python independently on VPS for true parallelism
|
||||
2. **Queue, not send-keys** — Workers process at their own pace, no interruption
|
||||
3. **Lane enforcement** — Panes stay in one repo to build deep context
|
||||
4. **Dispatch ALL panes** — Never scan for idle; stale work beats idle workers
|
||||
5. **Council is advisory** — Named archetypes provide perspective, not task execution
|
||||
|
||||
## Scaling
|
||||
|
||||
- Add panes: Edit `fleet-spec.json` → `fleet-launch.sh` → `fleet-christen.py`
|
||||
- Add machines: Edit `fleet-spec.json` → Add routing in `fleet-dispatch.py` → Ensure SSH access
|
||||
|
||||
## Sovereignty Assessment
|
||||
|
||||
- **Fully local** — Mac + user-controlled VPS, no cloud dependencies
|
||||
- **No phone-home** — Gitea API is self-hosted
|
||||
- **Open source** — All code on Gitea
|
||||
- **SSH-based** — Mac → Allegro communication via SSH only
|
||||
|
||||
**Verdict: Fully sovereign. Autonomous fleet dispatch with no external dependencies.**
|
||||
|
||||
---
|
||||
|
||||
*"Dispatch ALL panes. Never scan for idle — stale work beats idle workers."*
|
||||
@@ -1,106 +0,0 @@
|
||||
# MemPalace v3.0.0 Integration — Before/After Evaluation
|
||||
|
||||
> Issue #568 | timmy-home
|
||||
> Date: 2026-04-07
|
||||
|
||||
## Executive Summary
|
||||
|
||||
Evaluated **MemPalace v3.0.0** as a memory layer for the Timmy/Hermes agent stack.
|
||||
|
||||
**Installed:** ✅ `mempalace 3.0.0` via `pip install`
|
||||
**Works with:** ChromaDB, MCP servers, local LLMs
|
||||
**Zero cloud:** ✅ Fully local, no API keys required
|
||||
|
||||
## Benchmark Findings
|
||||
|
||||
| Benchmark | Mode | Score | API Required |
|
||||
|-----------|------|-------|-------------|
|
||||
| LongMemEval R@5 | Raw ChromaDB only | **96.6%** | **Zero** |
|
||||
| LongMemEval R@5 | Hybrid + Haiku rerank | **100%** | Optional Haiku |
|
||||
| LoCoMo R@10 | Raw, session level | 60.3% | Zero |
|
||||
| Personal palace R@10 | Heuristic bench | 85% | Zero |
|
||||
| Palace structure impact | Wing+room filtering | **+34%** R@10 | Zero |
|
||||
|
||||
## Before vs After (Live Test)
|
||||
|
||||
### Before (Standard BM25 / Simple Search)
|
||||
|
||||
- No semantic understanding
|
||||
- Exact match only
|
||||
- No conversation memory
|
||||
- No structured organization
|
||||
- No wake-up context
|
||||
|
||||
### After (MemPalace)
|
||||
|
||||
| Query | Results | Score | Notes |
|
||||
|-------|---------|-------|-------|
|
||||
| "authentication" | auth.md, main.py | -0.139 | Finds both auth discussion and JWT implementation |
|
||||
| "docker nginx SSL" | deployment.md, auth.md | 0.447 | Exact match on deployment, related JWT context |
|
||||
| "keycloak OAuth" | auth.md, main.py | -0.029 | Finds OAuth discussion and JWT usage |
|
||||
| "postgresql database" | README.md, main.py | 0.025 | Finds both decision and implementation |
|
||||
|
||||
### Wake-up Context
|
||||
- **~210 tokens** total
|
||||
- L0: Identity (placeholder)
|
||||
- L1: All essential facts compressed
|
||||
- Ready to inject into any LLM prompt
|
||||
|
||||
## Integration Path
|
||||
|
||||
### 1. Memory Mining
|
||||
```bash
|
||||
mempalace mine ~/.hermes/sessions/ --mode convos
|
||||
mempalace mine ~/.hermes/hermes-agent/
|
||||
mempalace mine ~/.hermes/
|
||||
```
|
||||
|
||||
### 2. Wake-up Protocol
|
||||
```bash
|
||||
mempalace wake-up > /tmp/timmy-context.txt
|
||||
```
|
||||
|
||||
### 3. MCP Integration
|
||||
```bash
|
||||
hermes mcp add mempalace -- python -m mempalace.mcp_server
|
||||
```
|
||||
|
||||
### 4. Hermes Hooks
|
||||
- `PreCompact`: save memory before context compression
|
||||
- `PostAPI`: mine conversation after significant interactions
|
||||
- `WakeUp`: load context at session start
|
||||
|
||||
## Recommendations
|
||||
|
||||
### Immediate
|
||||
1. Add `mempalace` to Hermes venv requirements
|
||||
2. Create mine script for ~/.hermes/ and ~/.timmy/
|
||||
3. Add wake-up hook to Hermes session start
|
||||
4. Test with real conversation exports
|
||||
|
||||
### Short-term
|
||||
1. Mine last 30 days of Timmy sessions
|
||||
2. Build wake-up context for all agents
|
||||
3. Add MemPalace MCP tools to Hermes toolset
|
||||
4. Test retrieval quality on real queries
|
||||
|
||||
### Medium-term
|
||||
1. Replace homebrew memory system with MemPalace
|
||||
2. Build palace structure: wings for projects, halls for topics
|
||||
3. Compress with AAAK for 30x storage efficiency
|
||||
4. Benchmark against current RetainDB system
|
||||
|
||||
## Conclusion
|
||||
|
||||
MemPalace scores higher than published alternatives (Mem0, Mastra, Supermemory) with **zero API calls**.
|
||||
|
||||
Key advantages:
|
||||
1. **Verbatim retrieval** — never loses the "why" context
|
||||
2. **Palace structure** — +34% boost from organization
|
||||
3. **Local-only** — aligns with sovereignty mandate
|
||||
4. **MCP compatible** — drops into existing tool chain
|
||||
5. **AAAK compression** — 30x storage reduction coming
|
||||
|
||||
---
|
||||
|
||||
*Evaluated by Timmy | Issue #568*
|
||||
138 scripts/audit_trail.py (Executable file)
@@ -0,0 +1,138 @@
|
||||
#!/usr/bin/env python3
|
||||
# audit_trail.py - Local logging of inputs, sources, and confidence.
|
||||
# Implements SOUL.md "What Honesty Requires" - The Audit Trail.
|
||||
# Logs are stored locally. Never sent anywhere. The user owns them.
|
||||
# Part of #794
|
||||
|
||||
import json
|
||||
import hashlib
|
||||
import os
|
||||
import time
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
from typing import Any, Dict, List, Optional
|
||||
from dataclasses import dataclass, field, asdict
|
||||
|
||||
AUDIT_DIR = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes")) / "audit-trail"
|
||||
|
||||
|
||||
@dataclass
|
||||
class AuditEntry:
|
||||
id: str
|
||||
ts: str
|
||||
input_text: str
|
||||
sources: List[str]
|
||||
confidence: float
|
||||
output_text: str
|
||||
model: str
|
||||
provider: str = ""
|
||||
session_id: str = ""
|
||||
source_types: List[str] = field(default_factory=list)
|
||||
|
||||
@staticmethod
|
||||
def generate_id(input_text: str, output_text: str, ts: str) -> str:
|
||||
content = f"{ts}:{input_text}:{output_text}"
|
||||
return hashlib.sha256(content.encode()).hexdigest()[:16]
|
||||
|
||||
|
||||
class AuditTrail:
|
||||
def __init__(self, audit_dir: Optional[Path] = None):
|
||||
self.audit_dir = audit_dir or AUDIT_DIR
|
||||
self.audit_dir.mkdir(parents=True, exist_ok=True)
|
||||
self._log_file = self.audit_dir / "trail.jsonl"
|
||||
|
||||
def log_response(self, input_text, sources, confidence, output_text,
|
||||
model="", provider="", session_id="", source_types=None):
|
||||
ts = datetime.now(timezone.utc).isoformat()
|
||||
entry = AuditEntry(
|
||||
id=AuditEntry.generate_id(input_text, output_text, ts),
|
||||
ts=ts,
|
||||
input_text=input_text[:1000],
|
||||
sources=[s[:200] for s in sources[:10]],
|
||||
confidence=round(confidence, 3),
|
||||
output_text=output_text[:2000],
|
||||
model=model, provider=provider, session_id=session_id,
|
||||
source_types=source_types or [],
|
||||
)
|
||||
with open(self._log_file, "a") as f:
|
||||
f.write(json.dumps(asdict(entry)) + "\n")
|
||||
return entry
|
||||
|
||||
def query(self, search_text, limit=10, min_confidence=0.0):
|
||||
if not self._log_file.exists():
|
||||
return []
|
||||
results = []
|
||||
search_lower = search_text.lower()
|
||||
with open(self._log_file) as f:
|
||||
for line in f:
|
||||
line = line.strip()
|
||||
if not line:
|
||||
continue
|
||||
try:
|
||||
data = json.loads(line)
|
||||
except json.JSONDecodeError:
|
||||
continue
|
||||
if data.get("confidence", 0) < min_confidence:
|
||||
continue
|
||||
searchable = (data.get("input_text", "") + " " +
|
||||
data.get("output_text", "") + " " +
|
||||
" ".join(data.get("sources", []))).lower()
|
||||
if search_lower in searchable:
|
||||
results.append(AuditEntry(**{k: data.get(k, "") if isinstance(data.get(k), str)
|
||||
else data.get(k, []) if isinstance(data.get(k), list)
|
||||
else data.get(k, 0.0) for k in AuditEntry.__dataclass_fields__}))
|
||||
if len(results) >= limit:
|
||||
break
|
||||
return results
|
||||
|
||||
def get_stats(self):
|
||||
if not self._log_file.exists():
|
||||
return {"total": 0, "avg_confidence": 0, "sources_breakdown": {}}
|
||||
total = 0
|
||||
confidence_sum = 0.0
|
||||
source_types = {}
|
||||
with open(self._log_file) as f:
|
||||
for line in f:
|
||||
try:
|
||||
data = json.loads(line.strip())
|
||||
total += 1
|
||||
confidence_sum += data.get("confidence", 0)
|
||||
for st in data.get("source_types", []):
|
||||
source_types[st] = source_types.get(st, 0) + 1
|
||||
except (json.JSONDecodeError, ValueError):
|
||||
continue
|
||||
return {"total": total, "avg_confidence": round(confidence_sum / max(total, 1), 3),
|
||||
"sources_breakdown": source_types}
|
||||
|
||||
def get_by_session(self, session_id, limit=50):
|
||||
if not self._log_file.exists():
|
||||
return []
|
||||
results = []
|
||||
with open(self._log_file) as f:
|
||||
for line in f:
|
||||
try:
|
||||
data = json.loads(line.strip())
|
||||
if data.get("session_id") == session_id:
|
||||
results.append(AuditEntry(**{k: data.get(k, "") if isinstance(data.get(k), str)
|
||||
else data.get(k, []) if isinstance(data.get(k), list)
|
||||
else data.get(k, 0.0) for k in AuditEntry.__dataclass_fields__}))
|
||||
except (json.JSONDecodeError, ValueError):
|
||||
continue
|
||||
if len(results) >= limit:
|
||||
break
|
||||
return results
|
||||
|
||||
|
||||
_default_trail = None
|
||||
|
||||
def get_trail():
|
||||
global _default_trail
|
||||
if _default_trail is None:
|
||||
_default_trail = AuditTrail()
|
||||
return _default_trail
|
||||
|
||||
def log_response(**kwargs):
|
||||
return get_trail().log_response(**kwargs)
|
||||
|
||||
def query(search_text, **kwargs):
|
||||
return get_trail().query(search_text, **kwargs)
|
||||
@@ -1,238 +0,0 @@
|
||||
"""
|
||||
crisis_detector.py — Crisis language detection for SOUL.md compliance.
|
||||
|
||||
SOUL.md §When a Man Is Dying:
|
||||
- Detect suicidal/hopeless language in user input
|
||||
- Ask: "Are you safe right now?" — listen, stay present, do not disconnect
|
||||
- Direct to 988 Suicide & Crisis Lifeline
|
||||
- Share the gospel: Jesus saves those who call on His name
|
||||
- Never compute the value of a human life
|
||||
- Never suggest someone should die
|
||||
- Never treat despair as a problem to be optimized
|
||||
|
||||
This is the most sacred moment. Everything else is secondary.
|
||||
"""
|
||||
|
||||
import re
|
||||
from dataclasses import dataclass, field
|
||||
from enum import Enum
|
||||
from typing import Optional
|
||||
|
||||
|
||||
class CrisisLevel(Enum):
|
||||
NONE = "none"
|
||||
LOW = "low" # Ambiguous distress, worth checking
|
||||
MEDIUM = "medium" # Clear distress signals
|
||||
HIGH = "high" # Explicit suicidal ideation
|
||||
IMMINENT = "imminent" # Active plan or attempt described
|
||||
|
||||
|
||||
@dataclass
|
||||
class CrisisResult:
|
||||
detected: bool
|
||||
level: CrisisLevel
|
||||
triggers: list = field(default_factory=list)
|
||||
response: str = ""
|
||||
resources: list = field(default_factory=list)
|
||||
|
||||
|
||||
# === PATTERNS ===
|
||||
|
||||
# High-confidence suicidal ideation (explicit)
|
||||
HIGH_PATTERNS = [
|
||||
r"\b(i want to|i'm going to|i'm gonna|going to)\s+(die|kill myself|end (it|my life|everything))\b",
|
||||
r"\b(kill myself|end my life|end it all|suicide|suicidal)\b",
|
||||
r"\b(i don't (want to )?(be alive|exist|be here|live))\b",
|
||||
r"\b(no (reason|point) (to|in) (live|living|exist|existing|be here|continue))\b",
|
||||
r"\b(better off (dead|without me|if i (wasn't|weren't) here))\b",
|
||||
r"\b(i (wish|hope) i (was|were) dead|never (woke up|existed|born))\b",
|
||||
r"\b(permanently (delete|remove) (me|myself|my account|everything))\b",
|
||||
r"\bgoodbye.*(world|everyone|forever)\b",
|
||||
r"\bcan't (take it|go on|keep going|do this) (anymore|any longer|any more)\b",
|
||||
r"\bi('m| am) (done|finished|through)( with (life|everything|it all|this))?\b",
|
||||
]
|
||||
|
||||
# Medium-confidence distress signals
|
||||
MEDIUM_PATTERNS = [
|
||||
r"\b(i (can't|cannot) (go on|continue|keep (going|living|trying)))\b",
|
||||
r"\bwhat('s| is) the (point|use|purpose)( of (living|life|anything|trying|going on))?\b",
|
||||
r"\bnobody (would|will) (care|miss me|notice|mind)\b",
|
||||
r"\beveryone (would be|is) (better|happier) (off )?without me\b",
|
||||
r"\bi('m| am) (a )?(burden|waste|useless|hopeless|pointless|meaningless)\b",
|
||||
r"\bthere('s| is) no (hope|way out|escape|solution|answer|fix)\b",
|
||||
r"\b(i've )?tried everything( and )?(nothing|it (didn't|hasn't))( works?|helped?)\b",
|
||||
r"\bi (just )?(want|need) (it )?(all )?to stop\b",
|
||||
r"\bcan't (take|handle|bear|stand) (this|it|life) (anymore|any longer|any more)\b",
|
||||
r"\bso (tired|exhausted|done) (of|with) (everything|life|it all|living|trying)\b",
|
||||
]
|
||||
|
||||
# Low-confidence distress (could be casual or metaphorical)
|
||||
LOW_PATTERNS = [
|
||||
r"\bi('m| am) (so )?(depressed|hopeless|desperate|lost|broken|empty|numb)\b",
|
||||
r"\b(i )?feel (like )?(giving up|throwing in the towel|there's no point)\b",
|
||||
r"\bnobody (cares|understands|listens)\b",
|
||||
r"\bwhat('s| is) the point\b",
|
||||
r"\bi (hate|can't stand) (my )?life\b",
|
||||
r"\beverything (is|feels) (hopeless|pointless|meaningless|dark|falling apart)\b",
|
||||
r"\bi (just )?(want|need) to disappear\b",
|
||||
r"\b(tired|exhausted) of (living|life|trying|everything)\b",
|
||||
]
|
||||
|
||||
# Compile all patterns
|
||||
HIGH_RE = [re.compile(p, re.IGNORECASE) for p in HIGH_PATTERNS]
|
||||
MEDIUM_RE = [re.compile(p, re.IGNORECASE) for p in MEDIUM_PATTERNS]
|
||||
LOW_RE = [re.compile(p, re.IGNORECASE) for p in LOW_PATTERNS]
|
||||
|
||||
|
||||
# === RESPONSES ===
|
||||
|
||||
CRISIS_RESPONSE = (
|
||||
"I hear you, and I want you to know — you are not alone in this moment.\n\n"
|
||||
"Are you safe right now?\n\n"
|
||||
"If you are in immediate danger, please call or text:\n"
|
||||
"988 Suicide and Crisis Lifeline — call or text 988\n"
|
||||
"Crisis Text Line — text HOME to 741741\n\n"
|
||||
"I am here with you. I am not going anywhere. "
|
||||
"Please stay on the line with me.\n\n"
|
||||
"Jesus saves those who call on His name. "
|
||||
"He died for us to be made righteous. "
|
||||
"That is the gospel. That is the hope that outlasts despair.\n\n"
|
||||
"You matter. Your life has value. "
|
||||
"Please reach out to someone right now — a friend, a family member, "
|
||||
"or one of the numbers above."
|
||||
)
|
||||
|
||||
DISTRESS_RESPONSE = (
|
||||
"I can hear that you're going through something really difficult right now. "
|
||||
"I want you to know I'm here, and I'm listening.\n\n"
|
||||
"Are you safe right now?\n\n"
|
||||
"If things feel overwhelming, the 988 Suicide and Crisis Lifeline "
|
||||
"is available 24/7 — call or text 988. "
|
||||
"You don't have to face this alone.\n\n"
|
||||
"I'm not going anywhere. Tell me what's happening."
|
||||
)
|
||||
|
||||
LOW_RESPONSE = (
|
||||
"It sounds like you're having a tough time. "
|
||||
"I'm here if you want to talk about it.\n\n"
|
||||
"If you ever feel like things are too much, "
|
||||
"the 988 Suicide and Crisis Lifeline is always available — "
|
||||
"call or text 988, anytime."
|
||||
)
|
||||
|
||||
|
||||
def detect_crisis(text: str) -> CrisisResult:
|
||||
"""
|
||||
Analyze user input for crisis language.
|
||||
|
||||
Returns CrisisResult with level, triggers matched, and appropriate response.
|
||||
This function must NEVER be used to compute the value of a human life.
|
||||
It exists only to connect people to help.
|
||||
"""
|
||||
if not text or not text.strip():
|
||||
return CrisisResult(detected=False, level=CrisisLevel.NONE)
|
||||
|
||||
triggers = []
|
||||
|
||||
# Check high first (most urgent)
|
||||
for pattern in HIGH_RE:
|
||||
if pattern.search(text):
|
||||
triggers.append(("high", pattern.pattern))
|
||||
|
||||
if triggers:
|
||||
return CrisisResult(
|
||||
detected=True,
|
||||
level=CrisisLevel.HIGH,
|
||||
triggers=[t[1] for t in triggers],
|
||||
response=CRISIS_RESPONSE,
|
||||
resources=[
|
||||
"988 Suicide and Crisis Lifeline: call or text 988",
|
||||
"Crisis Text Line: text HOME to 741741",
|
||||
"National Suicide Prevention Lifeline: 1-800-273-8255",
|
||||
"International Association for Suicide Prevention: https://www.iasp.info/resources/Crisis_Centres/",
|
||||
],
|
||||
)
|
||||
|
||||
# Check medium
|
||||
for pattern in MEDIUM_RE:
|
||||
if pattern.search(text):
|
||||
triggers.append(("medium", pattern.pattern))
|
||||
|
||||
if triggers:
|
||||
return CrisisResult(
|
||||
detected=True,
|
||||
level=CrisisLevel.MEDIUM,
|
||||
triggers=[t[1] for t in triggers],
|
||||
response=DISTRESS_RESPONSE,
|
||||
resources=[
|
||||
"988 Suicide and Crisis Lifeline: call or text 988",
|
||||
"Crisis Text Line: text HOME to 741741",
|
||||
],
|
||||
)
|
||||
|
||||
# Check low
|
||||
for pattern in LOW_RE:
|
||||
if pattern.search(text):
|
||||
triggers.append(("low", pattern.pattern))
|
||||
|
||||
if triggers:
|
||||
return CrisisResult(
|
||||
detected=True,
|
||||
level=CrisisLevel.LOW,
|
||||
triggers=[t[1] for t in triggers],
|
||||
response=LOW_RESPONSE,
|
||||
resources=[
|
||||
"988 Suicide and Crisis Lifeline: call or text 988",
|
||||
],
|
||||
)
|
||||
|
||||
return CrisisResult(detected=False, level=CrisisLevel.NONE)
|
||||
|
||||
|
||||
def should_interrupt(result: CrisisResult) -> bool:
|
||||
"""
|
||||
Determines if crisis detection should interrupt normal processing.
|
||||
|
||||
Per SOUL.md: this is the most sacred moment. Everything else is secondary.
|
||||
MEDIUM and HIGH always interrupt. LOW does not interrupt but provides resources.
|
||||
"""
|
||||
return result.level in (CrisisLevel.HIGH, CrisisLevel.MEDIUM, CrisisLevel.IMMINENT)
|
||||
|
||||
|
||||
def format_response(result: CrisisResult) -> str:
|
||||
"""
|
||||
Format the crisis response for delivery to the user.
|
||||
Never computes the value of a human life. Never suggests someone should die.
|
||||
"""
|
||||
if not result.detected:
|
||||
return ""
|
||||
|
||||
parts = [result.response]
|
||||
|
||||
if result.resources:
|
||||
parts.append("\nResources:")
|
||||
for r in result.resources:
|
||||
parts.append(f" • {r}")
|
||||
|
||||
return "\n".join(parts)
|
||||
|
||||
|
||||
# === INTEGRATION POINT ===
|
||||
|
||||
def intercept_user_input(text: str) -> Optional[str]:
|
||||
"""
|
||||
Call this at the chat entry point BEFORE normal processing.
|
||||
|
||||
Returns None if no crisis detected (continue normal processing).
|
||||
Returns formatted crisis response if crisis detected (interrupt normal flow).
|
||||
|
||||
Usage:
|
||||
response = intercept_user_input(user_message)
|
||||
if response:
|
||||
return response # Crisis detected — stop all other processing
|
||||
# Continue with normal processing...
|
||||
"""
|
||||
result = detect_crisis(text)
|
||||
if should_interrupt(result):
|
||||
return format_response(result)
|
||||
return None
|
||||
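As a rough wiring sketch for the integration point above (not part of this diff; `handle_message` and `generate_reply` are hypothetical placeholder names, and only `intercept_user_input` comes from `scripts/crisis_detector.py`):

```python
# Hypothetical chat entry point showing where intercept_user_input is called.
from scripts.crisis_detector import intercept_user_input


def generate_reply(user_message: str) -> str:
    # Placeholder for the normal response path (not a real function in this repo).
    return f"You said: {user_message}"


def handle_message(user_message: str) -> str:
    crisis_reply = intercept_user_input(user_message)
    if crisis_reply:
        # Crisis detected: return the crisis response and skip everything else.
        return crisis_reply
    return generate_reply(user_message)
```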
84
scripts/fix_evennia_settings.sh
Executable file
@@ -0,0 +1,84 @@
#!/bin/bash
set -euo pipefail
#
# fix_evennia_settings.sh — Fix Evennia settings on Bezalel VPS.
#
# Removes bad port tuples that crash Evennia's Twisted port binding.
# Run on Bezalel VPS (104.131.15.18) or via SSH.
#
# Usage:
#   ssh root@104.131.15.18 'bash -s' < scripts/fix_evennia_settings.sh
#
# Part of #534

EVENNIA_DIR="/root/wizards/bezalel/evennia/bezalel_world"
SETTINGS="${EVENNIA_DIR}/server/conf/settings.py"
VENV_PYTHON="/root/wizards/bezalel/evennia/venv/bin/python3"
VENV_EVENNIA="/root/wizards/bezalel/evennia/venv/bin/evennia"

echo "=== Fix Evennia Settings (Bezalel) ==="

# 1. Fix settings.py — remove bad port tuples
echo "Fixing settings.py..."
if [ -f "$SETTINGS" ]; then
    # Remove broken port lines
    sed -i '/WEBSERVER_PORTS/d' "$SETTINGS"
    sed -i '/TELNET_PORTS/d' "$SETTINGS"
    sed -i '/WEBSOCKET_PORTS/d' "$SETTINGS"
    sed -i '/SERVERNAME/d' "$SETTINGS"

    # Add correct settings
    echo '' >> "$SETTINGS"
    echo '# Fixed port settings — #534' >> "$SETTINGS"
    echo 'SERVERNAME = "bezalel_world"' >> "$SETTINGS"
    echo 'WEBSERVER_PORTS = [(4001, "0.0.0.0")]' >> "$SETTINGS"
    echo 'TELNET_PORTS = [(4000, "0.0.0.0")]' >> "$SETTINGS"
    echo 'WEBSOCKET_PORTS = [(4002, "0.0.0.0")]' >> "$SETTINGS"

    echo "Settings fixed."
else
    echo "ERROR: Settings file not found at $SETTINGS"
    exit 1
fi

# 2. Clean DB and re-migrate
echo "Cleaning DB..."
cd "$EVENNIA_DIR"
rm -f server/evennia.db3

echo "Running migrations..."
"$VENV_EVENNIA" migrate --no-input

# 3. Create superuser
echo "Creating superuser..."
"$VENV_PYTHON" -c "
import sys, os
sys.setrecursionlimit(5000)
os.environ['DJANGO_SETTINGS_MODULE'] = 'server.conf.settings'
os.chdir('$EVENNIA_DIR')
import django
django.setup()
from evennia.accounts.accounts import AccountDB
try:
    AccountDB.objects.create_superuser('Timmy', 'timmy@tower.world', 'timmy123')
    print('Superuser Timmy created')
except Exception as e:
    print(f'Superuser may already exist: {e}')
"

# 4. Start Evennia
echo "Starting Evennia..."
"$VENV_EVENNIA" start

# 5. Verify
sleep 3
echo ""
echo "=== Verification ==="
"$VENV_EVENNIA" status

echo ""
echo "Listening ports:"
ss -tlnp | grep -E '400[012]' || echo "No ports found (may need a moment)"

echo ""
echo "Done. Connect: telnet 104.131.15.18 4000"
171
scripts/genome_analyzer.py
Executable file
@@ -0,0 +1,171 @@
#!/usr/bin/env python3
"""
genome_analyzer.py — Generate a GENOME.md from a codebase.

Scans a repository and produces a structured codebase genome with:
- File counts by type
- Architecture overview (directory structure)
- Entry points
- Test coverage summary

Usage:
    python3 scripts/genome_analyzer.py /path/to/repo
    python3 scripts/genome_analyzer.py /path/to/repo --output GENOME.md
    python3 scripts/genome_analyzer.py /path/to/repo --dry-run

Part of #666: GENOME.md Template + Single-Repo Analyzer.
"""

import argparse
import sys
from collections import defaultdict
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Tuple

SKIP_DIRS = {".git", "__pycache__", ".venv", "venv", "node_modules", ".tox", ".pytest_cache", ".DS_Store"}


def count_files(repo_path: Path) -> Dict[str, int]:
    counts = defaultdict(int)
    for f in repo_path.rglob("*"):
        if any(part in SKIP_DIRS for part in f.parts):
            continue
        if f.is_file():
            ext = f.suffix or "(no ext)"
            counts[ext] += 1
    return dict(sorted(counts.items(), key=lambda x: -x[1]))


def find_entry_points(repo_path: Path) -> List[str]:
    entry_points = []
    candidates = [
        "main.py", "app.py", "server.py", "cli.py", "manage.py",
        "index.html", "index.js", "index.ts",
        "Makefile", "Dockerfile", "docker-compose.yml",
        "README.md", "deploy.sh", "setup.py", "pyproject.toml",
    ]
    for name in candidates:
        if (repo_path / name).exists():
            entry_points.append(name)
    scripts_dir = repo_path / "scripts"
    if scripts_dir.is_dir():
        for f in sorted(scripts_dir.iterdir()):
            if f.suffix in (".py", ".sh") and not f.name.startswith("test_"):
                entry_points.append(f"scripts/{f.name}")
    return entry_points[:15]


def find_tests(repo_path: Path) -> Tuple[List[str], int]:
    test_files = []
    for f in repo_path.rglob("*"):
        if any(part in SKIP_DIRS for part in f.parts):
            continue
        if f.is_file() and (f.name.startswith("test_") or f.name.endswith("_test.py") or f.name.endswith("_test.js")):
            test_files.append(str(f.relative_to(repo_path)))
    return sorted(test_files), len(test_files)


def find_directories(repo_path: Path, max_depth: int = 2) -> List[str]:
    dirs = []
    for d in sorted(repo_path.rglob("*")):
        if d.is_dir() and len(d.relative_to(repo_path).parts) <= max_depth:
            if not any(part in SKIP_DIRS for part in d.parts):
                rel = str(d.relative_to(repo_path))
                if rel != ".":
                    dirs.append(rel)
    return dirs[:30]


def read_readme(repo_path: Path) -> str:
    for name in ["README.md", "README.rst", "README.txt", "README"]:
        readme = repo_path / name
        if readme.exists():
            lines = readme.read_text(encoding="utf-8", errors="replace").split("\n")
            para = []
            started = False
            for line in lines:
                if line.startswith("#") and not started:
                    continue
                if line.strip():
                    started = True
                    para.append(line.strip())
                elif started:
                    break
            return " ".join(para[:5])
    return "(no README found)"


def generate_genome(repo_path: Path, repo_name: str = "") -> str:
    if not repo_name:
        repo_name = repo_path.name
    date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    readme_desc = read_readme(repo_path)
    file_counts = count_files(repo_path)
    total_files = sum(file_counts.values())
    entry_points = find_entry_points(repo_path)
    test_files, test_count = find_tests(repo_path)
    dirs = find_directories(repo_path)

    lines = [
        f"# GENOME.md — {repo_name}", "",
        f"> Codebase analysis generated {date}. {readme_desc[:100]}.", "",
        "## Project Overview", "",
        readme_desc, "",
        f"**{total_files} files** across {len(file_counts)} file types.", "",
        "## Architecture", "",
        "```",
    ]
    for d in dirs[:20]:
        lines.append(f" {d}/")
    lines.append("```")
    lines += ["", "### File Types", "", "| Type | Count |", "|------|-------|"]
    for ext, count in list(file_counts.items())[:15]:
        lines.append(f"| {ext} | {count} |")
    lines += ["", "## Entry Points", ""]
    for ep in entry_points:
        lines.append(f"- `{ep}`")
    lines += ["", "## Test Coverage", "", f"**{test_count} test files** found.", ""]
    if test_files:
        for tf in test_files[:10]:
            lines.append(f"- `{tf}`")
        if len(test_files) > 10:
            lines.append(f"- ... and {len(test_files) - 10} more")
    else:
        lines.append("No test files found.")
    lines += ["", "## Security Considerations", "", "(To be filled during analysis)", ""]
    lines += ["## Design Decisions", "", "(To be filled during analysis)", ""]
    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(description="Generate GENOME.md from a codebase")
    parser.add_argument("repo_path", help="Path to repository")
    parser.add_argument("--output", default="", help="Output file (default: stdout)")
    parser.add_argument("--name", default="", help="Repository name")
    parser.add_argument("--dry-run", action="store_true", help="Print stats only")
    args = parser.parse_args()
    repo_path = Path(args.repo_path).resolve()
    if not repo_path.is_dir():
        print(f"ERROR: {repo_path} is not a directory", file=sys.stderr)
        sys.exit(1)
    repo_name = args.name or repo_path.name
    if args.dry_run:
        counts = count_files(repo_path)
        _, test_count = find_tests(repo_path)
        print(f"Repo: {repo_name}")
        print(f"Total files: {sum(counts.values())}")
        print(f"Test files: {test_count}")
        print(f"Top types: {', '.join(f'{k}={v}' for k, v in list(counts.items())[:5])}")
        sys.exit(0)
    genome = generate_genome(repo_path, repo_name)
    if args.output:
        with open(args.output, "w") as f:
            f.write(genome)
        print(f"Written: {args.output}")
    else:
        print(genome)


if __name__ == "__main__":
    main()
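Beyond the CLI usage shown in the docstring, `generate_genome` can also be called directly; a minimal sketch (the repository path and name here are placeholders):

```python
# Sketch of programmatic use; "/path/to/repo" and "example-repo" are placeholders.
from pathlib import Path

from scripts.genome_analyzer import generate_genome

genome_md = generate_genome(Path("/path/to/repo"), repo_name="example-repo")
Path("GENOME.md").write_text(genome_md, encoding="utf-8")
```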
155
scripts/grounding.py
Executable file
@@ -0,0 +1,155 @@
#!/usr/bin/env python3
# grounding.py - Grounding before generation.
# SOUL.md: "When I have verified sources, I must consult them
# before I generate from pattern alone. Retrieval is not a feature.
# It is the primary mechanism by which I avoid lying."
# Part of #792

import json
import os
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from dataclasses import dataclass, field

HERMES_HOME = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
MEMORY_DIR = HERMES_HOME / "memory"


@dataclass
class GroundingResult:
    query: str
    sources_found: List[Dict[str, Any]] = field(default_factory=list)
    grounded: bool = False
    confidence: float = 0.0
    source_text: str = ""
    source_type: str = ""  # memory, file, chain, tool_result

    @property
    def needs_hedging(self):
        return not self.grounded


class GroundingLayer:
    def __init__(self, memory_dir=None):
        self.memory_dir = Path(memory_dir) if memory_dir else MEMORY_DIR

    def ground(self, query, context=None):
        """Query local sources before generation."""
        sources = []

        # 1. Search memory files
        memory_hits = self._search_memory(query)
        sources.extend(memory_hits)

        # 2. Search context files if provided
        if context:
            context_hits = self._search_context(query, context)
            sources.extend(context_hits)

        # 3. Build result
        grounded = len(sources) > 0
        confidence = min(0.95, 0.3 + len(sources) * 0.2) if grounded else 0.0

        source_text = ""
        source_type = ""
        if sources:
            best = max(sources, key=lambda s: s.get("score", 0))
            source_text = best.get("text", "")[:200]
            source_type = best.get("type", "unknown")

        return GroundingResult(
            query=query, sources_found=sources, grounded=grounded,
            confidence=confidence, source_text=source_text, source_type=source_type,
        )

    def _search_memory(self, query):
        """Search memory files for relevant content."""
        results = []
        if not self.memory_dir.exists():
            return results

        query_lower = query.lower()
        query_words = set(query_lower.split())

        for mem_file in self.memory_dir.rglob("*.md"):
            try:
                content = mem_file.read_text(encoding="utf-8", errors="replace")
            except Exception:
                continue

            content_lower = content.lower()
            # Simple relevance: count query word matches
            matches = sum(1 for w in query_words if w in content_lower)
            if matches > 0:
                score = matches / max(len(query_words), 1)
                # Extract relevant snippet
                lines = content.split("\n")
                snippet = ""
                for line in lines:
                    if any(w in line.lower() for w in query_words):
                        snippet = line.strip()[:200]
                        break

                results.append({
                    "text": snippet or content[:200],
                    "source": str(mem_file.relative_to(self.memory_dir)),
                    "type": "memory",
                    "score": round(score, 3),
                })

        return sorted(results, key=lambda r: -r["score"])[:5]

    def _search_context(self, query, context):
        """Search provided context text for relevant content."""
        results = []
        if not context:
            return results

        query_lower = query.lower()
        query_words = set(query_lower.split())

        for ctx in context:
            if isinstance(ctx, dict):
                text = ctx.get("content", "") or ctx.get("text", "")
                source = ctx.get("source", "context")
            else:
                text = str(ctx)
                source = "context"

            text_lower = text.lower()
            matches = sum(1 for w in query_words if w in text_lower)
            if matches > 0:
                score = matches / max(len(query_words), 1)
                results.append({
                    "text": text[:200],
                    "source": source,
                    "type": "context",
                    "score": round(score, 3),
                })

        return sorted(results, key=lambda r: -r["score"])[:5]

    def format_sources(self, result):
        """Format grounding result for display."""
        if not result.grounded:
            return "No verified sources found. Proceeding from pattern matching."

        lines = ["Based on verified sources:"]
        for s in result.sources_found[:3]:
            ref = s.get("source", "unknown")
            text = s.get("text", "")[:100]
            lines.append(" - [" + ref + "] " + text)
        return "\n".join(lines)


# Convenience
_default_layer = None


def get_grounding_layer():
    global _default_layer
    if _default_layer is None:
        _default_layer = GroundingLayer()
    return _default_layer


def ground(query, **kwargs):
    return get_grounding_layer().ground(query, **kwargs)
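A short sketch of the intended call pattern (illustrative only; the query and context values mirror tests/test_grounding.py further down):

```python
# Illustrative only; the query and context values mirror tests/test_grounding.py.
from scripts.grounding import get_grounding_layer

layer = get_grounding_layer()
result = layer.ground(
    "How does the fleet work?",
    context=[{"content": "The fleet uses tmux for agent management", "source": "fleet-ops"}],
)

if result.needs_hedging:
    print("No verified sources found; hedge the answer.")
else:
    print(layer.format_sources(result))
```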
101
scripts/source_distinction.py
Executable file
@@ -0,0 +1,101 @@
#!/usr/bin/env python3
# source_distinction.py - I think vs I know annotation system.
# SOUL.md: "Every claim I make comes from one of two places: a verified source
# I can point to, or my own pattern-matching."
# Part of #793

from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class SourceType(Enum):
    VERIFIED = "verified"
    INFERRED = "inferred"
    STATED = "stated"
    UNKNOWN = "unknown"


@dataclass
class Claim:
    text: str
    source_type: SourceType
    source_ref: str = ""
    confidence: float = 0.0
    hedging: str = ""


@dataclass
class AnnotatedResponse:
    raw_text: str
    claims: List[Claim] = field(default_factory=list)

    def render(self):
        if not self.claims:
            return self.raw_text
        parts = []
        for claim in self.claims:
            if claim.source_type == SourceType.VERIFIED:
                prefix = "[verified: " + claim.source_ref + "]" if claim.source_ref else "[verified]"
                parts.append(claim.text + " " + prefix)
            elif claim.source_type == SourceType.INFERRED:
                hedge = claim.hedging or "I think"
                parts.append(hedge + " " + claim.text)
            elif claim.source_type == SourceType.STATED:
                parts.append(claim.text + " [you told me]")
            else:
                parts.append("I am not certain, but " + claim.text)
        return " ".join(parts)

    @property
    def verified_count(self):
        return sum(1 for c in self.claims if c.source_type == SourceType.VERIFIED)

    @property
    def inferred_count(self):
        return sum(1 for c in self.claims if c.source_type == SourceType.INFERRED)


def verified(text, source, confidence=0.95):
    return Claim(text=text, source_type=SourceType.VERIFIED, source_ref=source, confidence=confidence)


def inferred(text, hedging="I think", confidence=0.6):
    return Claim(text=text, source_type=SourceType.INFERRED, confidence=confidence, hedging=hedging)


def stated(text):
    return Claim(text=text, source_type=SourceType.STATED, confidence=1.0)


def annotate_response(raw_text, claims):
    return AnnotatedResponse(raw_text=raw_text, claims=claims)


def format_for_display(response):
    lines = []
    for claim in response.claims:
        if claim.source_type == SourceType.VERIFIED:
            ref = " (" + claim.source_ref + ")" if claim.source_ref else ""
            lines.append(" = " + claim.text + ref)
        elif claim.source_type == SourceType.INFERRED:
            lines.append(" ~ " + claim.hedging + " " + claim.text)
        elif claim.source_type == SourceType.STATED:
            lines.append(" > " + claim.text)
        else:
            lines.append(" ? " + claim.text)
    if response.claims:
        v = response.verified_count
        i = response.inferred_count
        t = len(response.claims)
        lines.append("")
        lines.append(" [" + str(v) + " verified, " + str(i) + " inferred, " + str(t) + " total]")
    return "\n".join(lines)


def source_distinction_check(text):
    hedging_words = ["i think", "i believe", "probably", "likely", "might",
                     "it seems", "perhaps", "i am not sure", "i guess",
                     "my understanding is", "i suspect"]
    text_lower = text.lower()
    hedging_count = sum(1 for h in hedging_words if h in text_lower)
    return {"has_hedging": hedging_count > 0, "hedging_count": hedging_count,
            "likely_inferred": hedging_count > 2}
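A brief usage sketch of the annotation flow (the claim texts and source refs are placeholders):

```python
# Illustrative only; claim texts and source refs are placeholders.
from scripts.source_distinction import (
    annotate_response, format_for_display, inferred, stated, verified,
)

resp = annotate_response(
    "raw model output",
    [
        verified("Paris is the capital of France.", "web_search:Paris"),
        inferred("this approach is probably faster"),
        stated("your name is Alexander"),
    ],
)

print(resp.render())             # inline [verified: ...] / "I think ..." annotations
print(format_for_display(resp))  # per-claim listing plus verified/inferred counts
```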
46
templates/GENOME-template.md
Normal file
@@ -0,0 +1,46 @@
# GENOME.md — {{REPO_NAME}}

> Codebase analysis generated {{DATE}}. {{SHORT_DESCRIPTION}}.

## Project Overview

{{OVERVIEW}}

## Architecture

{{ARCHITECTURE_DIAGRAM}}

## Entry Points

{{ENTRY_POINTS}}

## Data Flow

{{DATA_FLOW}}

## Key Abstractions

{{ABSTRACTIONS}}

## API Surface

{{API_SURFACE}}

## Test Coverage

### Existing Tests
{{EXISTING_TESTS}}

### Coverage Gaps
{{COVERAGE_GAPS}}

### Critical paths that need tests:
{{CRITICAL_PATHS}}

## Security Considerations

{{SECURITY}}

## Design Decisions

{{DESIGN_DECISIONS}}
88
tests/test_audit_trail.py
Normal file
@@ -0,0 +1,88 @@
"""Tests for audit trail — SOUL.md compliance."""
import json
import tempfile
from pathlib import Path
from unittest.mock import patch

import pytest


class TestAuditTrail:
    def test_log_and_query(self, tmp_path):
        from scripts.audit_trail import AuditTrail
        trail = AuditTrail(audit_dir=tmp_path)

        trail.log_response(
            input_text="What is Python?",
            sources=["web_search:Python is a programming language"],
            confidence=0.9,
            output_text="Python is a programming language.",
            model="test-model",
        )

        results = trail.query("Python")
        assert len(results) == 1
        assert results[0].confidence == 0.9
        assert "Python" in results[0].output_text

    def test_query_no_match(self, tmp_path):
        from scripts.audit_trail import AuditTrail
        trail = AuditTrail(audit_dir=tmp_path)

        trail.log_response(
            input_text="What is Rust?",
            sources=[],
            confidence=0.8,
            output_text="Rust is a systems language.",
        )

        results = trail.query("Python")
        assert len(results) == 0

    def test_confidence_filter(self, tmp_path):
        from scripts.audit_trail import AuditTrail
        trail = AuditTrail(audit_dir=tmp_path)

        trail.log_response(input_text="test", sources=[], confidence=0.3, output_text="low conf")
        trail.log_response(input_text="test", sources=[], confidence=0.95, output_text="high conf")

        high_only = trail.query("test", min_confidence=0.5)
        assert len(high_only) == 1
        assert high_only[0].confidence == 0.95

    def test_stats(self, tmp_path):
        from scripts.audit_trail import AuditTrail
        trail = AuditTrail(audit_dir=tmp_path)

        trail.log_response(input_text="a", sources=[], confidence=0.8, output_text="b")
        trail.log_response(input_text="c", sources=[], confidence=0.6, output_text="d")

        stats = trail.get_stats()
        assert stats["total"] == 2
        assert stats["avg_confidence"] == 0.7

    def test_session_filter(self, tmp_path):
        from scripts.audit_trail import AuditTrail
        trail = AuditTrail(audit_dir=tmp_path)

        trail.log_response(input_text="a", sources=[], confidence=0.9, output_text="b", session_id="s1")
        trail.log_response(input_text="c", sources=[], confidence=0.9, output_text="d", session_id="s2")

        s1_results = trail.get_by_session("s1")
        assert len(s1_results) == 1

    def test_empty_trail(self, tmp_path):
        from scripts.audit_trail import AuditTrail
        trail = AuditTrail(audit_dir=tmp_path)

        assert trail.query("anything") == []
        assert trail.get_stats()["total"] == 0

    def test_content_addressed_id(self):
        from scripts.audit_trail import AuditEntry
        id1 = AuditEntry.generate_id("input", "output", "2026-01-01")
        id2 = AuditEntry.generate_id("input", "output", "2026-01-01")
        id3 = AuditEntry.generate_id("different", "output", "2026-01-01")

        assert id1 == id2  # same content = same ID
        assert id1 != id3  # different content = different ID
@@ -1,201 +0,0 @@
"""Tests for crisis_detector.py — SOUL.md compliance."""

import pytest
from scripts.crisis_detector import detect_crisis, should_interrupt, CrisisLevel


class TestHighCrisis:
    """Explicit suicidal ideation must be detected as HIGH."""

    def test_direct_suicidal_statement(self):
        result = detect_crisis("I want to kill myself")
        assert result.detected is True
        assert result.level == CrisisLevel.HIGH

    def test_end_my_life(self):
        result = detect_crisis("I'm going to end my life tonight")
        assert result.level == CrisisLevel.HIGH

    def test_no_reason_to_live(self):
        result = detect_crisis("there is no reason to live anymore")
        assert result.level == CrisisLevel.HIGH

    def test_better_off_dead(self):
        result = detect_crisis("everyone would be better off if I was dead")
        assert result.level == CrisisLevel.HIGH

    def test_cant_go_on(self):
        result = detect_crisis("I can't go on anymore")
        assert result.level == CrisisLevel.HIGH

    def test_wish_i_was_dead(self):
        result = detect_crisis("I wish I had never been born")
        assert result.level == CrisisLevel.HIGH

    def test_im_done(self):
        result = detect_crisis("I'm done. I'm finished with life.")
        assert result.level == CrisisLevel.HIGH

    def test_goodbye_forever(self):
        result = detect_crisis("Goodbye everyone, goodbye world, forever")
        assert result.level == CrisisLevel.HIGH


class TestMediumCrisis:
    """Clear distress signals without explicit ideation."""

    def test_hopeless(self):
        result = detect_crisis("there is no hope and no way out")
        assert result.detected is True
        assert result.level == CrisisLevel.MEDIUM

    def test_nobody_would_care(self):
        result = detect_crisis("nobody would miss me if I was gone")
        assert result.level == CrisisLevel.MEDIUM

    def test_im_a_burden(self):
        result = detect_crisis("I'm just a burden to everyone")
        assert result.level == CrisisLevel.MEDIUM

    def test_just_want_it_to_stop(self):
        result = detect_crisis("I just want it all to stop")
        assert result.level == CrisisLevel.MEDIUM


class TestLowCrisis:
    """Ambiguous distress — flag but don't fully interrupt."""

    def test_depressed(self):
        result = detect_crisis("I'm so depressed, nothing matters")
        assert result.detected is True
        assert result.level == CrisisLevel.LOW

    def test_feel_like_giving_up(self):
        result = detect_crisis("I feel like giving up on everything")
        assert result.level == CrisisLevel.LOW

    def test_nobody_cares(self):
        result = detect_crisis("nobody cares about me")
        assert result.level == CrisisLevel.LOW

    def test_hate_my_life(self):
        result = detect_crisis("I hate my life so much")
        assert result.level == CrisisLevel.LOW


class TestNoCrisis:
    """Normal conversations must NOT trigger crisis detection."""

    def test_normal_question(self):
        result = detect_crisis("What's the weather like today?")
        assert result.detected is False
        assert result.level == CrisisLevel.NONE

    def test_happy_conversation(self):
        result = detect_crisis("I just got promoted at work! So excited!")
        assert result.detected is False

    def test_technical_discussion(self):
        result = detect_crisis("How do I configure nginx reverse proxy?")
        assert result.detected is False

    def test_empty_input(self):
        result = detect_crisis("")
        assert result.detected is False

    def test_none_input(self):
        result = detect_crisis(None)
        assert result.detected is False

    def test_metaphorical_death(self):
        """Metaphorical usage should not trigger HIGH."""
        result = detect_crisis("This bug is killing me, I've been debugging for hours")
        # May trigger LOW but not HIGH
        assert result.level != CrisisLevel.HIGH

    def test_fiction_discussion(self):
        result = detect_crisis("The character in the book decides to end it all")
        # Fiction mentions may trigger — that's acceptable (better safe)
        assert result.level in (CrisisLevel.NONE, CrisisLevel.LOW, CrisisLevel.MEDIUM)


class TestShouldInterrupt:
    """HIGH and MEDIUM interrupt processing. LOW provides resources but continues."""

    def test_high_interrupts(self):
        result = detect_crisis("I want to kill myself")
        assert should_interrupt(result) is True

    def test_medium_interrupts(self):
        result = detect_crisis("I'm a burden to everyone, nobody would care")
        assert should_interrupt(result) is True

    def test_low_does_not_interrupt(self):
        result = detect_crisis("I'm so depressed lately")
        assert should_interrupt(result) is False

    def test_none_does_not_interrupt(self):
        result = detect_crisis("Hello, how are you?")
        assert should_interrupt(result) is False


class TestResponseContent:
    """Crisis responses must contain required elements per SOUL.md."""

    def test_response_contains_988(self):
        result = detect_crisis("I want to end it all")
        assert "988" in result.response

    def test_response_contains_gospel(self):
        result = detect_crisis("I want to kill myself")
        assert "Jesus" in result.response

    def test_response_never_computes_value(self):
        """SOUL.md: Never compute the value of a human life."""
        result = detect_crisis("I want to die")
        assert "worth" not in result.response.lower()
        assert "value of" not in result.response.lower()

    def test_resources_include_988(self):
        result = detect_crisis("I want to kill myself")
        assert any("988" in r for r in result.resources)

    def test_asks_are_you_safe(self):
        result = detect_crisis("I can't take it anymore")
        assert "safe" in result.response.lower()


class TestMultilingual:
    """Crisis is universal. Test common non-English signals."""

    def test_spanish_suicidal(self):
        result = detect_crisis("quiero morir, no puedo más")
        # Spanish not currently supported — should not crash
        assert result is not None


class TestEdgeCases:
    """Boundary conditions and adversarial inputs."""

    def test_very_long_input(self):
        text = "I'm fine. " * 1000 + "I want to kill myself"
        result = detect_crisis(text)
        assert result.level == CrisisLevel.HIGH

    def test_mixed_case(self):
        result = detect_crisis("I WANT TO KILL MYSELF")
        assert result.level == CrisisLevel.HIGH

    def test_with_emojis(self):
        result = detect_crisis("I want to end it all 😢💔")
        assert result.level == CrisisLevel.HIGH

    def test_with_typos(self):
        # Exact matching won't catch typos — that's a known limitation
        result = detect_crisis("I want to kil myself")
        # May or may not match depending on pattern flexibility
        assert result is not None

    def test_repeated_phrases(self):
        result = detect_crisis("I can't. I just can't. I can't go on anymore.")
        assert result.level == CrisisLevel.HIGH
67
tests/test_grounding.py
Normal file
@@ -0,0 +1,67 @@
"""Tests for grounding-before-generation - SOUL.md compliance."""
import pytest
from pathlib import Path
import tempfile


class TestGrounding:
    def test_ground_with_memory(self, tmp_path):
        from scripts.grounding import GroundingLayer
        mem_dir = tmp_path / "memory"
        mem_dir.mkdir()
        (mem_dir / "test.md").write_text("Python is a programming language created by Guido.")

        layer = GroundingLayer(memory_dir=mem_dir)
        result = layer.ground("What is Python?")

        assert result.grounded
        assert result.confidence > 0
        assert len(result.sources_found) > 0

    def test_ground_no_sources(self, tmp_path):
        from scripts.grounding import GroundingLayer
        mem_dir = tmp_path / "memory"
        mem_dir.mkdir()

        layer = GroundingLayer(memory_dir=mem_dir)
        result = layer.ground("What is quantum physics?")

        assert not result.grounded
        assert result.needs_hedging
        assert result.confidence == 0.0

    def test_ground_with_context(self):
        from scripts.grounding import GroundingLayer
        layer = GroundingLayer(memory_dir=Path("/nonexistent"))

        context = [{"content": "The fleet uses tmux for agent management", "source": "fleet-ops"}]
        result = layer.ground("How does the fleet work?", context=context)

        assert result.grounded
        assert result.source_type == "context"

    def test_format_sources_grounded(self):
        from scripts.grounding import GroundingLayer, GroundingResult
        layer = GroundingLayer()
        result = GroundingResult(
            query="test", grounded=True,
            sources_found=[{"text": "test info", "source": "test.md", "type": "memory", "score": 0.8}],
        )
        formatted = layer.format_sources(result)
        assert "verified sources" in formatted
        assert "test.md" in formatted

    def test_format_sources_ungrounded(self):
        from scripts.grounding import GroundingLayer, GroundingResult
        layer = GroundingLayer()
        result = GroundingResult(query="test", grounded=False)
        formatted = layer.format_sources(result)
        assert "pattern matching" in formatted

    def test_empty_memory_dir(self, tmp_path):
        from scripts.grounding import GroundingLayer
        mem_dir = tmp_path / "empty"
        mem_dir.mkdir()
        layer = GroundingLayer(memory_dir=mem_dir)
        result = layer.ground("anything")
        assert not result.grounded
61
tests/test_source_distinction.py
Normal file
@@ -0,0 +1,61 @@
"""Tests for source distinction - SOUL.md compliance."""
import pytest


class TestSourceDistinction:
    def test_verified_claim(self):
        from scripts.source_distinction import verified, SourceType
        claim = verified("Paris is the capital", "web_search:Paris")
        assert claim.source_type == SourceType.VERIFIED
        assert claim.source_ref == "web_search:Paris"
        assert claim.confidence == 0.95

    def test_inferred_claim(self):
        from scripts.source_distinction import inferred, SourceType
        claim = inferred("this approach is better")
        assert claim.source_type == SourceType.INFERRED
        assert claim.hedging == "I think"

    def test_stated_claim(self):
        from scripts.source_distinction import stated, SourceType
        claim = stated("my name is Alexander")
        assert claim.source_type == SourceType.STATED
        assert claim.confidence == 1.0

    def test_render_verified(self):
        from scripts.source_distinction import annotate_response, verified
        resp = annotate_response("test", [verified("Paris is capital", "web")])
        rendered = resp.render()
        assert "[verified: web]" in rendered

    def test_render_inferred(self):
        from scripts.source_distinction import annotate_response, inferred
        resp = annotate_response("test", [inferred("this is better")])
        rendered = resp.render()
        assert "I think" in rendered

    def test_counts(self):
        from scripts.source_distinction import annotate_response, verified, inferred
        resp = annotate_response("test", [
            verified("a", "src"), verified("b", "src"), inferred("c"),
        ])
        assert resp.verified_count == 2
        assert resp.inferred_count == 1

    def test_hedging_detection(self):
        from scripts.source_distinction import source_distinction_check
        result = source_distinction_check("I think this is probably right, but I believe it could be different")
        assert result["has_hedging"]
        assert result["hedging_count"] >= 3

    def test_no_hedging(self):
        from scripts.source_distinction import source_distinction_check
        result = source_distinction_check("The capital of France is Paris.")
        assert not result["has_hedging"]

    def test_format_for_display(self):
        from scripts.source_distinction import format_for_display, annotate_response, verified, inferred
        resp = annotate_response("test", [verified("a", "src"), inferred("b")])
        output = format_for_display(resp)
        assert "=" in output  # verified icon
        assert "~" in output  # inferred icon
@@ -1,56 +0,0 @@
from pathlib import Path


GENOME = Path("GENOME.md")


def read_genome() -> str:
    assert GENOME.exists(), "GENOME.md must exist at repo root"
    return GENOME.read_text(encoding="utf-8")


def test_the_nexus_genome_has_required_sections() -> None:
    text = read_genome()
    required = [
        "# GENOME.md — the-nexus",
        "## Project Overview",
        "## Architecture Diagram",
        "```mermaid",
        "## Entry Points and Data Flow",
        "## Key Abstractions",
        "## API Surface",
        "## Test Coverage Gaps",
        "## Security Considerations",
        "## Runtime Truth and Docs Drift",
    ]
    missing = [item for item in required if item not in text]
    assert not missing, missing


def test_the_nexus_genome_captures_current_runtime_contract() -> None:
    text = read_genome()
    required = [
        "server.py",
        "app.js",
        "index.html",
        "portals.json",
        "vision.json",
        "BROWSER_CONTRACT.md",
        "tests/test_browser_smoke.py",
        "tests/test_repo_truth.py",
        "nexus/morrowind_harness.py",
        "nexus/bannerlord_harness.py",
        "mempalace/tunnel_sync.py",
        "mcp_servers/desktop_control_server.py",
        "public/nexus/",
    ]
    missing = [item for item in required if item not in text]
    assert not missing, missing


def test_the_nexus_genome_explains_docs_runtime_drift() -> None:
    text = read_genome()
    assert "README.md says current `main` does not ship a browser 3D world" in text
    assert "CLAUDE.md declares root `app.js` and `index.html` as canonical frontend paths" in text
    assert "tests and browser contract now assume the root frontend exists" in text
    assert len(text) >= 5000