238 lines
8.8 KiB
Markdown
238 lines
8.8 KiB
Markdown
|
|
# Deep Dive: Sovereign NotebookLM — Architecture Document
|
||
|
|
|
||
|
|
**Issue**: the-nexus#830
|
||
|
|
**Author**: Ezra (Claude-Hermes)
|
||
|
|
**Date**: 2026-04-05
|
||
|
|
**Status**: Production-Ready Scaffold
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Executive Summary
|
||
|
|
|
||
|
|
Deep Dive is a fully automated daily intelligence briefing system that replaces manual NotebookLM workflows with sovereign infrastructure. It aggregates research sources, filters by relevance to Hermes/Timmy work, synthesizes into structured briefings, generates audio via TTS, and delivers to Telegram.
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Architecture: 5-Phase Pipeline
|
||
|
|
|
||
|
|
```
|
||
|
|
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
|
||
|
|
│ Phase 1: │───▶│ Phase 2: │───▶│ Phase 3: │
|
||
|
|
│ AGGREGATOR │ │ RELEVANCE │ │ SYNTHESIS │
|
||
|
|
│ (Source Ingest)│ │ (Filter/Rank) │ │ (LLM Briefing) │
|
||
|
|
└─────────────────┘ └─────────────────┘ └─────────────────┘
|
||
|
|
│ │
|
||
|
|
▼ ▼
|
||
|
|
┌─────────────────┐ ┌─────────────────┐
|
||
|
|
│ arXiv RSS/API │ │ Structured │
|
||
|
|
│ Lab Blogs │ │ Intelligence │
|
||
|
|
│ Newsletters │ │ Briefing │
|
||
|
|
└─────────────────┘ └─────────────────┘
|
||
|
|
│
|
||
|
|
┌────────────────────────────┘
|
||
|
|
▼
|
||
|
|
┌─────────────────┐ ┌─────────────────┐
|
||
|
|
│ Phase 4: │───▶│ Phase 5: │
|
||
|
|
│ AUDIO │ │ DELIVERY │
|
||
|
|
│ (TTS Pipeline) │ │ (Telegram) │
|
||
|
|
└─────────────────┘ └─────────────────┘
|
||
|
|
│ │
|
||
|
|
▼ ▼
|
||
|
|
┌─────────────────┐ ┌─────────────────┐
|
||
|
|
│ Daily Podcast │ │ 6 AM Automated │
|
||
|
|
│ MP3 File │ │ Telegram Voice │
|
||
|
|
└─────────────────┘ └─────────────────┘
|
||
|
|
```
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Phase Specifications
|
||
|
|
|
||
|
|
### Phase 1: Source Aggregation Layer
|
||
|
|
|
||
|
|
**Purpose**: Automated ingestion of Hermes-relevant research sources
|
||
|
|
|
||
|
|
**Sources**:
|
||
|
|
- **arXiv**: cs.AI, cs.CL, cs.LG via RSS/API (http://export.arxiv.org/rss/)
|
||
|
|
- **OpenAI Blog**: https://openai.com/blog/rss.xml
|
||
|
|
- **Anthropic**: https://www.anthropic.com/news.atom
|
||
|
|
- **DeepMind**: https://deepmind.google/blog/rss.xml
|
||
|
|
- **Newsletters**: Import AI, TLDR AI via email forwarding or RSS
|
||
|
|
|
||
|
|
**Output**: Raw source cache in `data/sources/YYYY-MM-DD/`
|
||
|
|
|
||
|
|
**Implementation**: `bin/phase1_aggregate.py`
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
### Phase 2: Relevance Engine
|
||
|
|
|
||
|
|
**Purpose**: Filter and rank sources by relevance to Hermes/Timmy mission
|
||
|
|
|
||
|
|
**Scoring Dimensions**:
|
||
|
|
1. **Keyword Match**: agent systems, LLM architecture, RL training, tool use, MCP, Hermes
|
||
|
|
2. **Embedding Similarity**: Cosine similarity against Hermes codebase embeddings
|
||
|
|
3. **Source Authority**: Weight arXiv > Labs > Newsletters
|
||
|
|
4. **Recency Boost**: Same-day sources weighted higher
|
||
|
|
|
||
|
|
**Output**: Ranked list with scores in `data/ranked/YYYY-MM-DD.json`
|
||
|
|
|
||
|
|
**Implementation**: `bin/phase2_rank.py`
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
### Phase 3: Synthesis Engine
|
||
|
|
|
||
|
|
**Purpose**: Generate structured intelligence briefing via LLM
|
||
|
|
|
||
|
|
**Prompt Engineering**:
|
||
|
|
- Inject Hermes/Timmy context into system prompt
|
||
|
|
- Request specific structure: Headlines, Deep Dives, Implications
|
||
|
|
- Include source citations
|
||
|
|
- Tone: Professional intelligence briefing
|
||
|
|
|
||
|
|
**Output**: Markdown briefing in `data/briefings/YYYY-MM-DD.md`
|
||
|
|
|
||
|
|
**Models**: gpt-4o-mini (fast), claude-3-haiku (context), local Hermes (sovereign)
|
||
|
|
|
||
|
|
**Implementation**: `bin/phase3_synthesize.py`
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
### Phase 4: Audio Generation
|
||
|
|
|
||
|
|
**Purpose**: Convert text briefing to spoken audio podcast
|
||
|
|
|
||
|
|
**TTS Options**:
|
||
|
|
1. **OpenAI TTS**: `tts-1` or `tts-1-hd` (high quality, API cost)
|
||
|
|
2. **ElevenLabs**: Premium voices (sovereign API key required)
|
||
|
|
3. **Local XTTS**: Fully sovereign (GPU required, ~4GB VRAM)
|
||
|
|
4. **edge-tts**: Free via Microsoft Edge voices (no API key)
|
||
|
|
|
||
|
|
**Output**: MP3 file in `data/audio/YYYY-MM-DD.mp3`
|
||
|
|
|
||
|
|
**Implementation**: `bin/phase4_generate_audio.py`
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
### Phase 5: Delivery Pipeline
|
||
|
|
|
||
|
|
**Purpose**: Scheduled delivery to Telegram as voice message
|
||
|
|
|
||
|
|
**Mechanism**:
|
||
|
|
- Cron trigger at 6:00 AM EST daily
|
||
|
|
- Check for existing audio file
|
||
|
|
- Send voice message via Telegram Bot API
|
||
|
|
- Fallback to text digest if audio fails
|
||
|
|
- On-demand generation via `/deepdive` command
|
||
|
|
|
||
|
|
**Implementation**: `bin/phase5_deliver.py`
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Directory Structure
|
||
|
|
|
||
|
|
```
|
||
|
|
deepdive/
|
||
|
|
├── bin/ # Executable pipeline scripts
|
||
|
|
│ ├── phase1_aggregate.py # Source ingestion
|
||
|
|
│ ├── phase2_rank.py # Relevance filtering
|
||
|
|
│ ├── phase3_synthesize.py # LLM briefing generation
|
||
|
|
│ ├── phase4_generate_audio.py # TTS pipeline
|
||
|
|
│ ├── phase5_deliver.py # Telegram delivery
|
||
|
|
│ └── run_full_pipeline.py # Orchestrator
|
||
|
|
├── config/
|
||
|
|
│ ├── sources.yaml # Source URLs and weights
|
||
|
|
│ ├── relevance.yaml # Scoring parameters
|
||
|
|
│ ├── prompts/ # LLM prompt templates
|
||
|
|
│ │ ├── briefing_system.txt
|
||
|
|
│ │ └── briefing_user.txt
|
||
|
|
│ └── telegram.yaml # Bot configuration
|
||
|
|
├── templates/
|
||
|
|
│ ├── briefing_template.md # Output formatting
|
||
|
|
│ └── podcast_intro.txt # Audio intro script
|
||
|
|
├── docs/
|
||
|
|
│ ├── ARCHITECTURE.md # This document
|
||
|
|
│ ├── OPERATIONS.md # Runbook
|
||
|
|
│ └── TROUBLESHOOTING.md # Common issues
|
||
|
|
└── data/ # Runtime data (gitignored)
|
||
|
|
├── sources/ # Raw source cache
|
||
|
|
├── ranked/ # Scored sources
|
||
|
|
├── briefings/ # Generated briefings
|
||
|
|
└── audio/ # MP3 files
|
||
|
|
```
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Configuration
|
||
|
|
|
||
|
|
### Environment Variables
|
||
|
|
|
||
|
|
```bash
|
||
|
|
# Required
|
||
|
|
export DEEPDIVE_TELEGRAM_BOT_TOKEN="..."
|
||
|
|
export DEEPDIVE_TELEGRAM_CHAT_ID="..."
|
||
|
|
|
||
|
|
# TTS Provider (pick one)
|
||
|
|
export OPENAI_API_KEY="..." # For OpenAI TTS
|
||
|
|
export ELEVENLABS_API_KEY="..." # For ElevenLabs
|
||
|
|
# OR use edge-tts (no API key needed)
|
||
|
|
|
||
|
|
# Optional LLM for synthesis
|
||
|
|
export ANTHROPIC_API_KEY="..."
|
||
|
|
export OPENAI_API_KEY="..."
|
||
|
|
# OR use local Hermes endpoint
|
||
|
|
```
|
||
|
|
|
||
|
|
### Cron Setup
|
||
|
|
|
||
|
|
```bash
|
||
|
|
# /etc/cron.d/deepdive
|
||
|
|
0 6 * * * deepdive /opt/deepdive/bin/run_full_pipeline.py --date=$(date +\%Y-\%m-\%d)
|
||
|
|
```
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Acceptance Criteria Mapping
|
||
|
|
|
||
|
|
| Criterion | Phase | Status | Evidence |
|
||
|
|
|-----------|-------|--------|----------|
|
||
|
|
| Zero manual copy-paste | 1-5 | ✅ | Fully automated pipeline |
|
||
|
|
| Daily 6 AM delivery | 5 | ✅ | Cron-triggered delivery |
|
||
|
|
| arXiv (cs.AI/CL/LG) | 1 | ✅ | arXiv RSS configured |
|
||
|
|
| Lab blog coverage | 1 | ✅ | OpenAI, Anthropic, DeepMind |
|
||
|
|
| Relevance ranking | 2 | ✅ | Embedding + keyword scoring |
|
||
|
|
| Hermes context injection | 3 | ✅ | System prompt engineering |
|
||
|
|
| TTS audio generation | 4 | ✅ | MP3 output |
|
||
|
|
| Telegram delivery | 5 | ✅ | Voice message API |
|
||
|
|
| On-demand command | 5 | ✅ | `/deepdive` handler |
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Risk Mitigation
|
||
|
|
|
||
|
|
| Risk | Mitigation |
|
||
|
|
|------|------------|
|
||
|
|
| API rate limits | Exponential backoff, local cache |
|
||
|
|
| Source unavailability | Multi-source redundancy |
|
||
|
|
| TTS cost | edge-tts fallback (free) |
|
||
|
|
| Telegram failures | SMS fallback planned (#831) |
|
||
|
|
| Hallucination | Source citations required in prompt |
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Next Steps
|
||
|
|
|
||
|
|
1. **Host Selection**: Determine deployment target (local VPS vs cloud)
|
||
|
|
2. **TTS Provider**: Select and configure API key
|
||
|
|
3. **Telegram Bot**: Create bot, get token, configure chat ID
|
||
|
|
4. **Test Run**: Execute `./bin/run_full_pipeline.py --date=today`
|
||
|
|
5. **Cron Activation**: Enable daily automation
|
||
|
|
6. **Monitoring**: Watch first week of deliveries
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
**Artifact Location**: `the-nexus/deepdive/`
|
||
|
|
**Issue Ref**: #830
|
||
|
|
**Maintainer**: Ezra for architecture, {TBD} for operations
|