Deep Dive Scaffold

Parent: the-nexus#830
Created: 2026-04-05

This directory contains phase-by-phase implementation skeletons for the Deep Dive automated intelligence briefing system.

Directory Structure

scaffold/deepdive/
├── phase1/          # Source aggregation (ZERO blockers, can start now)
│   ├── arxiv_aggregator.py   ← Run this today
│   ├── blog_scraper.py       (stub)
│   └── config.yaml
├── phase2/          # Relevance engine (needs Phase 1)
│   ├── relevance_engine.py   (stub)
│   └── embeddings.py         (stub)
├── phase3/          # Synthesis (needs Phase 2)
│   ├── synthesis.py            (stub)
│   └── briefing_template.md
├── phase4/          # TTS pipeline (needs Phase 3)
│   ├── tts_pipeline.py         (stub)
│   └── piper_config.json
└── phase5/          # Delivery (needs Phase 4)
    ├── telegram_delivery.py    (stub)
    └── deepdive_command.py     (stub)

Quick Start

Phase 1 (Today)

cd the-nexus/scaffold/deepdive/phase1
python3 arxiv_aggregator.py

Requirements: Python 3.8+, internet connection, no API keys.

Output: data/deepdive/raw/arxiv-YYYY-MM-DD.jsonl
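To inspect that output, a minimal sketch along these lines may help. It assumes only that the file is standard JSONL (one JSON object per line); the record fields are not specified here, so none are assumed. The helper names `raw_arxiv_path` and `parse_jsonl` are hypothetical and not part of the scaffold.

```python
import json
from datetime import date
from pathlib import Path
from typing import Any, Dict, Iterable, List


def raw_arxiv_path(day: date, root: Path = Path("data/deepdive/raw")) -> Path:
    """Build the arxiv-YYYY-MM-DD.jsonl path for a given day (hypothetical helper)."""
    return root / f"arxiv-{day.isoformat()}.jsonl"


def parse_jsonl(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse JSONL text: one JSON object per non-blank line."""
    records = []
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines between records
            records.append(json.loads(line))
    return records


if __name__ == "__main__":
    path = raw_arxiv_path(date.today())
    if path.exists():
        with path.open(encoding="utf-8") as fh:
            records = parse_jsonl(fh)
        print(f"{path}: {len(records)} records")
    else:
        print(f"{path} not found; run arxiv_aggregator.py first")
```

Run it from the directory that contains data/, since the default root is a relative path.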

Sovereignty Preservation

Component    Local Option                      Cloud Fallback
Embeddings   nomic-embed-text via llama.cpp    OpenAI
LLM          Gemma 4 via Hermes                Kimi K2.5
TTS          Piper                             ElevenLabs

Rule: Implement the local option first; add a cloud fallback only if the local option's quality is unacceptable.
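One way to encode this rule is a small wrapper that always tries the local backend first. This is a sketch under assumptions: `local_first`, its parameters, and the `acceptable` quality check are hypothetical names, and the real backends (llama.cpp, Piper, and so on) would be passed in as plain callables.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def local_first(
    local: Callable[[], T],
    cloud: Optional[Callable[[], T]] = None,
    acceptable: Callable[[T], bool] = lambda result: True,
) -> T:
    """Run the local backend; use the cloud fallback only on failure or poor quality."""
    try:
        result = local()
    except Exception:
        if cloud is None:
            raise  # no fallback configured, so surface the local error
        return cloud()
    # Keep the local result if it passes the quality check,
    # or if there is no cloud fallback to switch to.
    if acceptable(result) or cloud is None:
        return result
    return cloud()
```

Leaving `cloud` as `None` by default keeps a component fully local until a fallback is explicitly wired in.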

Next Steps

  1. Phase 1: Run arxiv_aggregator.py to validate fetch pipeline
  2. Phase 2: Implement relevance_engine.py with embeddings
  3. Phase 3: Draft synthesis.py with prompt templates
  4. Phase 4: Test tts_pipeline.py with Piper
  5. Phase 5: Integrate telegram_delivery.py with Hermes gateway
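As a starting point for step 2, the sketch below ranks items by cosine similarity between embedding vectors. Cosine similarity is an assumption here, not something the scaffold specifies, and `cosine_similarity` and `rank_by_relevance` are hypothetical names. The vectors would come from whichever embedding backend is chosen.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_relevance(
    query_vec: Sequence[float],
    item_vecs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (item index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(item_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The pure-Python version keeps the sketch dependency-free; a vectorized implementation would likely be preferable once the number of items grows.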

See docs/deep-dive-architecture.md for full technical specification.