# Deep Dive Scaffold

> Parent: the-nexus#830  
> Created: 2026-04-05

This directory contains phase-by-phase implementation skeletons for the Deep Dive automated intelligence briefing system.

## Directory Structure

```
scaffold/deepdive/
├── phase1/          # Source aggregation (ZERO blockers, can start now)
│   ├── arxiv_aggregator.py   ← Run this today
│   ├── blog_scraper.py       (stub)
│   └── config.yaml
├── phase2/          # Relevance engine (needs Phase 1)
│   ├── relevance_engine.py   (stub)
│   └── embeddings.py         (stub)
├── phase3/          # Synthesis (needs Phase 2)
│   ├── synthesis.py            (stub)
│   └── briefing_template.md
├── phase4/          # TTS pipeline (needs Phase 3)
│   ├── tts_pipeline.py         (stub)
│   └── piper_config.json
└── phase5/          # Delivery (needs Phase 4)
    ├── telegram_delivery.py    (stub)
    └── deepdive_command.py     (stub)
```
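The dependency notes in the tree form a simple linear chain. As a rough sketch (the `NEEDS` mapping and the `unblocked` helper are hypothetical names for illustration, not part of the scaffold), the "what can start now" question could be answered like this:

```python
# Dependencies as stated in the tree comments: each phase needs the one before it.
NEEDS = {
    "phase1": [],
    "phase2": ["phase1"],
    "phase3": ["phase2"],
    "phase4": ["phase3"],
    "phase5": ["phase4"],
}


def unblocked(done):
    """Return the phases that are not done yet and whose prerequisites are all done."""
    finished = set(done)
    return [
        phase
        for phase, deps in NEEDS.items()
        if phase not in finished and all(dep in finished for dep in deps)
    ]


if __name__ == "__main__":
    # With nothing finished, only Phase 1 has no blockers.
    print(unblocked([]))
    # Once Phase 1 is validated, Phase 2 becomes available.
    print(unblocked(["phase1"]))
```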

## Quick Start

### Phase 1 (Today)

```bash
cd the-nexus/scaffold/deepdive/phase1
python3 arxiv_aggregator.py
```

**Requirements**: Python 3.8+, internet connection, no API keys.

**Output**: `data/deepdive/raw/arxiv-YYYY-MM-DD.jsonl`
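To inspect a run, the dated output file can be read back line by line. The sketch below assumes each line holds one JSON value (the usual JSONL convention) and that the output path is resolved relative to the current working directory; neither is confirmed by the scaffold. The helpers `raw_path_for` and `load_records` are hypothetical names, and no record fields are assumed.

```python
import json
from datetime import date
from pathlib import Path


def raw_path_for(day: date, root: Path = Path("data/deepdive/raw")) -> Path:
    """Build the dated output path following the arxiv-YYYY-MM-DD.jsonl pattern."""
    return root / f"arxiv-{day.isoformat()}.jsonl"


def load_records(path: Path) -> list:
    """Parse one JSON value per non-empty line of a JSONL file."""
    records = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    path = raw_path_for(date.today())
    if path.exists():
        print(f"{len(load_records(path))} records in {path}")
    else:
        print(f"No output at {path}; run arxiv_aggregator.py first")
```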

## Sovereignty Preservation

| Component | Local Option | Cloud Fallback |
|-----------|-------------|----------------|
| Embeddings | nomic-embed-text via llama.cpp | OpenAI |
| LLM | Gemma 4 via Hermes | Kimi K2.5 |
| TTS | Piper | ElevenLabs |

**Rule**: Implement the local option first; add the cloud fallback only if local quality is unacceptable.
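One possible way to encode this rule at runtime is a small wrapper that always tries the local backend and reaches for the cloud only when a fallback is configured and the local result fails a quality check. This is a sketch under assumptions: `local_first`, the `acceptable` callback, and the stand-in backends are hypothetical and do not reflect any existing module in the scaffold.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def local_first(
    local: Callable[[], T],
    cloud: Optional[Callable[[], T]] = None,
    acceptable: Callable[[T], bool] = lambda _result: True,
) -> T:
    """Run the local backend; use the cloud backend only as a configured fallback."""
    try:
        result = local()
    except Exception:
        if cloud is None:
            raise  # no fallback configured: surface the local failure
        return cloud()
    # Keep the local result if it passes the check or there is nothing to fall back to.
    if acceptable(result) or cloud is None:
        return result
    return cloud()


if __name__ == "__main__":
    # Stand-in backends for illustration only; real ones would call Piper, etc.
    def local_tts() -> bytes:
        return b""  # pretend the local engine produced empty audio

    def cloud_tts() -> bytes:
        return b"audio-from-cloud"

    audio = local_first(local_tts, cloud=cloud_tts, acceptable=lambda a: len(a) > 0)
    print(audio)
```

Leaving `cloud` unset keeps a component fully local, which matches the default the rule asks for.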

## Next Steps

1. ✅ **Phase 1**: Run `arxiv_aggregator.py` to validate fetch pipeline
2. ⏳ **Phase 2**: Implement `relevance_engine.py` with embeddings
3. ⏳ **Phase 3**: Draft `synthesis.py` with prompt templates
4. ⏳ **Phase 4**: Test `tts_pipeline.py` with Piper
5. ⏳ **Phase 5**: Integrate `telegram_delivery.py` with Hermes gateway

See `docs/deep-dive-architecture.md` for full technical specification.