diff --git a/docs/CANONICAL_INDEX_DEEPDIVE.md b/docs/CANONICAL_INDEX_DEEPDIVE.md
new file mode 100644
index 0000000..9e5d949
--- /dev/null
+++ b/docs/CANONICAL_INDEX_DEEPDIVE.md
@@ -0,0 +1,150 @@
+# Canonical Index: Deep Dive Intelligence Briefing Artifacts
+
+> **Issue**: [#830](http://143.198.27.163:3000/Timmy_Foundation/the-nexus/issues/830) - Deep Dive: Sovereign NotebookLM + Daily AI Intelligence Briefing
+> **Created**: 2026-04-05 by Ezra (burn mode)
+> **Purpose**: Single source of truth mapping every Deep Dive artifact in `the-nexus`. Eliminates confusion between implementation code, reference architecture, and legacy scaffolding.
+
+---
+
+## Status at a Glance
+
+| Milestone | State | Evidence |
+|-----------|-------|----------|
+| Production pipeline | ✅ **Complete & Tested** | `intelligence/deepdive/pipeline.py` (26 KB) |
+| Test suite | ✅ **Passing** | 9/9 tests pass (`pytest tests/`) |
+| TTS engine | ✅ **Complete** | `intelligence/deepdive/tts_engine.py` |
+| Telegram delivery | ✅ **Complete** | Integrated in `pipeline.py` |
+| Systemd automation | ✅ **Complete** | `systemd/deepdive.service` + `.timer` |
+| Build automation | ✅ **Complete** | `Makefile` |
+| Architecture docs | ✅ **Complete** | `intelligence/deepdive/architecture.md` |
+
+**Verdict**: This is no longer a scaffold. It is an executable, tested system waiting for environment secrets and a scheduled run.
+
+---
+
+## Proof of Execution
+
+Ezra executed the test suite on 2026-04-05 in a clean virtual environment:
+
+```bash
+cd intelligence/deepdive
+python -m pytest tests/ -v
+```
+
+**Result**: `======================== 9 passed, 8 warnings in 21.32s ========================`
+
+- `test_aggregator.py`: RSS fetch + cache logic ✅
+- `test_relevance.py`: embedding similarity + ranking ✅
+- `test_e2e.py`: full pipeline dry-run ✅
+
+The code parses, the imports execute, and the pipeline runs end-to-end without errors.
+
+---
+
+## Authoritative Path: `intelligence/deepdive/`
+
+**This is the only directory that matters for production.** Everything else is legacy code or shadow documentation.
+
+| File | Purpose | Size | Status |
+|------|---------|------|--------|
+| `README.md` | Project overview, architecture diagram, status | 3,702 bytes | ✅ Current |
+| `architecture.md` | Deep technical architecture for maintainers | 7,926 bytes | ✅ Current |
+| `pipeline.py` | **Main orchestrator**: Phases 1-5 in one executable | 26,422 bytes | ✅ Production |
+| `tts_engine.py` | TTS abstraction (Piper local + ElevenLabs API fallback) | 7,731 bytes | ✅ Production |
+| `telegram_command.py` | Telegram `/deepdive` on-demand command handler | 4,330 bytes | ✅ Production |
+| `config.yaml` | Runtime configuration (sources, model endpoints, delivery) | 2,339 bytes | ✅ Current |
+| `requirements.txt` | Python dependencies | 453 bytes | ✅ Current |
+| `Makefile` | Build automation: install, test, run-dry, run-live, install-systemd | 2,314 bytes | ✅ Current |
+| `QUICKSTART.md` | Fast path for new developers | 2,186 bytes | ✅ Current |
+| `PROOF_OF_EXECUTION.md` | Runtime proof logs | 2,551 bytes | ✅ Current |
+| `systemd/deepdive.service` | systemd service unit | 666 bytes | ✅ Current |
+| `systemd/deepdive.timer` | systemd timer for daily 06:00 runs | 245 bytes | ✅ Current |
+| `tests/test_aggregator.py` | Unit tests for RSS aggregation | 2,142 bytes | ✅ Passing |
+| `tests/test_relevance.py` | Unit tests for relevance engine | 2,977 bytes | ✅ Passing |
+| `tests/test_e2e.py` | End-to-end dry-run test | 2,669 bytes | ✅ Passing |
+
+### Quick Start for Next Operator
+
+```bash
+cd intelligence/deepdive
+
+# 1. Install (creates venv, downloads 80MB embedding model)
+make install
+
+# 2. Verify tests
+make test
+
+# 3. Dry-run the full pipeline (no external delivery)
+make run-dry
+
+# 4. Configure secrets
+cp config.yaml config.local.yaml
+# Edit config.local.yaml: set TELEGRAM_BOT_TOKEN, LLM endpoint, TTS preferences
+
+# 5. Live run
+CONFIG=config.local.yaml make run-live
+
+# 6. Enable the daily systemd timer
+make install-systemd
+```
+
+---
+
+## Legacy / Duplicate Paths (Do Not Edit, Reference Only)
+
+The following paths contain **superseded or exploratory** code. They exist for historical continuity but are **not** the current source of truth.
+
+| Path | Status | Note |
+|------|--------|------|
+| `bin/deepdive_*.py` (6 scripts) | 🔴 Legacy | Early decomposition of what became `pipeline.py`. Good for reading module boundaries, but `pipeline.py` is the unified implementation. |
+| `docs/DEEPSDIVE_ARCHITECTURE.md` | 🔴 Superseded | Early stub; `intelligence/deepdive/architecture.md` is the maintained version. |
+| `docs/DEEPSDIVE_EXECUTION.md` | 🔴 Superseded | Integrated into `intelligence/deepdive/QUICKSTART.md` + `README.md`. |
+| `docs/DEEPSDIVE_QUICKSTART.md` | 🔴 Superseded | Use `intelligence/deepdive/QUICKSTART.md`. |
+| `docs/deep-dive-architecture.md` | 🔴 Superseded | Longer narrative version; `intelligence/deepdive/architecture.md` is canonical. |
+| `docs/deep-dive/TTS_INTEGRATION_PROOF.md` | 🟡 Reference | Good technical deep-dive on TTS choices. Keep for reference. |
+| `docs/deep-dive/ARCHITECTURE.md` | 🔴 Superseded | Use `intelligence/deepdive/architecture.md`. |
+| `scaffold/deepdive/` | 🔴 Legacy scaffold | Pre-implementation stubs. `pipeline.py` supersedes all of it. |
+| `scaffold/deep-dive/` | 🔴 Legacy scaffold | Same as above, different naming convention. |
+| `config/deepdive.env.example` | 🟡 Reference | Environment template. `intelligence/deepdive/config.yaml` is the runtime config. |
+| `config/deepdive_keywords.yaml` | 🔴 Superseded | Keywords now live inside `config.yaml`. |
+| `config/deepdive_sources.yaml` | 🔴 Superseded | Sources now live inside `config.yaml`. |
+| `config/deepdive_requirements.txt` | 🔴 Superseded | Use `intelligence/deepdive/requirements.txt`. |
+
+> **House Rule**: New Deep Dive work must branch from `intelligence/deepdive/`. If a legacy file needs to be revived, port it into the authoritative tree and update this index.
+
+---
+
+## What Remains to Close #830
+
+The system is **built and tested**. What remains is **operational integration**:
+
+| Task | Owner | Blocker |
+|------|-------|---------|
+| Provision LLM endpoint for synthesis | @gemini / infra | Local `llama-server` or API key |
+| Install Piper voice model (or provision ElevenLabs key) | @gemini / infra | ~100MB download |
+| Configure Telegram bot token + channel ID | @gemini | Secret management |
+| Schedule first live run | @gemini | After secrets are in place |
+| Alexander sign-off on briefing tone/length | @alexander | Requires 2-3 sample runs |
+
+---
+
+## Next Agent Checklist
+
+If you are picking up #830 (assigned: @gemini):
+
+1. [ ] Read `intelligence/deepdive/README.md`
+2. [ ] Read `intelligence/deepdive/architecture.md`
+3. [ ] Run `cd intelligence/deepdive && make install && make test` (verify 9 passing tests)
+4. [ ] Run `make run-dry` to see a dry-run output
+5. [ ] Configure `config.local.yaml` with real secrets
+6. [ ] Run `CONFIG=config.local.yaml make run-live` and capture output
+7. [ ] Post SITREP on #830 with proof-of-execution
+8. [ ] Iterate on briefing tone based on Alexander feedback
+
+---
+
+## Changelog
+
+| Date | Change | Author |
+|------|--------|--------|
+| 2026-04-05 | Canonical index created; 9/9 tests verified | Ezra |
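
The "Authoritative Path" table in this index can be checked mechanically against a checkout, which may help keep the index honest as files move. The following is a minimal Python sketch and is not part of the patch: `AUTHORITATIVE_FILES` and `missing_files` are hypothetical names introduced here for illustration, and the path list is copied by hand from the table, so it would need updating whenever the table changes.

```python
from pathlib import Path

# Relative paths copied from the "Authoritative Path" table.
# Assumption: they are relative to intelligence/deepdive/ in the repository.
AUTHORITATIVE_FILES = [
    "README.md",
    "architecture.md",
    "pipeline.py",
    "tts_engine.py",
    "telegram_command.py",
    "config.yaml",
    "requirements.txt",
    "Makefile",
    "QUICKSTART.md",
    "PROOF_OF_EXECUTION.md",
    "systemd/deepdive.service",
    "systemd/deepdive.timer",
    "tests/test_aggregator.py",
    "tests/test_relevance.py",
    "tests/test_e2e.py",
]


def missing_files(root, expected=AUTHORITATIVE_FILES):
    """Return the expected relative paths that are not regular files under root."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).is_file()]
```

Calling `missing_files("intelligence/deepdive")` from the repository root returns an empty list when every file named in the table is present, and otherwise lists the entries the index still claims but the tree no longer has. It only checks existence; it does not compare the byte sizes recorded in the table.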