# Deep Dive: Sovereign NotebookLM — Architecture Document
**Issue**: the-nexus#830
**Author**: Ezra (Claude-Hermes)
**Date**: 2026-04-05
**Status**: Production-Ready Scaffold
---
## Executive Summary
Deep Dive is a fully automated daily intelligence briefing system that replaces manual NotebookLM workflows with sovereign infrastructure. It aggregates research sources, filters them by relevance to Hermes/Timmy work, synthesizes the top-ranked items into a structured briefing, generates audio via TTS, and delivers the result to Telegram.
---
## Architecture: 5-Phase Pipeline
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Phase 1:     │───▶│    Phase 2:     │───▶│    Phase 3:     │
│   AGGREGATOR    │    │    RELEVANCE    │    │    SYNTHESIS    │
│ (Source Ingest) │    │  (Filter/Rank)  │    │ (LLM Briefing)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                                             │
         ▼                                             ▼
┌─────────────────┐                           ┌─────────────────┐
│  arXiv RSS/API  │                           │   Structured    │
│    Lab Blogs    │                           │  Intelligence   │
│   Newsletters   │                           │    Briefing     │
└─────────────────┘                           └─────────────────┘
         ┌─────────────────────────────────────────────┘
┌─────────────────┐    ┌─────────────────┐
│    Phase 4:     │───▶│    Phase 5:     │
│      AUDIO      │    │    DELIVERY     │
│ (TTS Pipeline)  │    │   (Telegram)    │
└─────────────────┘    └─────────────────┘
         │                      │
         ▼                      ▼
┌─────────────────┐    ┌─────────────────┐
│  Daily Podcast  │    │ 6 AM Automated  │
│    MP3 File     │    │ Telegram Voice  │
└─────────────────┘    └─────────────────┘
```
---
## Phase Specifications
### Phase 1: Source Aggregation Layer
**Purpose**: Automated ingestion of Hermes-relevant research sources
**Sources**:
- **arXiv**: cs.AI, cs.CL, cs.LG via RSS/API (http://export.arxiv.org/rss/)
- **OpenAI Blog**: https://openai.com/blog/rss.xml
- **Anthropic**: https://www.anthropic.com/news.atom
- **DeepMind**: https://deepmind.google/blog/rss.xml
- **Newsletters**: Import AI, TLDR AI via email forwarding or RSS
**Output**: Raw source cache in `data/sources/YYYY-MM-DD/`
**Implementation**: `bin/phase1_aggregate.py`
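
The ingestion step could look roughly like the sketch below. It is not the contents of `bin/phase1_aggregate.py`; the function names `parse_feed` and `cache_feed` are hypothetical, and the parser only assumes plain RSS 2.0 or Atom documents (other feed dialects would need extra handling). Raw XML is cached under `data/sources/YYYY-MM-DD/` as described above.

```python
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom elements are namespaced


def parse_feed(xml_text: str) -> list[dict]:
    """Extract title/link pairs from an RSS 2.0 or Atom document."""
    root = ET.fromstring(xml_text)
    entries = []
    # RSS 2.0: <rss><channel><item><title/><link/></item></channel></rss>
    for item in root.iter("item"):
        entries.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    # Atom: <feed><entry><title/><link href="..."/></entry></feed>
    for entry in root.iter(f"{ATOM}entry"):
        link = entry.find(f"{ATOM}link")
        entries.append({
            "title": (entry.findtext(f"{ATOM}title") or "").strip(),
            "link": link.get("href", "") if link is not None else "",
        })
    return entries


def cache_feed(name: str, url: str, date: str, root_dir: str = "data/sources") -> Path:
    """Download one feed and store the raw XML under <root_dir>/<date>/<name>.xml."""
    out_dir = Path(root_dir) / date
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{name}.xml"
    with urllib.request.urlopen(url, timeout=30) as resp:
        out_path.write_bytes(resp.read())
    return out_path
```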
---
### Phase 2: Relevance Engine
**Purpose**: Filter and rank sources by relevance to Hermes/Timmy mission
**Scoring Dimensions**:
1. **Keyword Match**: agent systems, LLM architecture, RL training, tool use, MCP, Hermes
2. **Embedding Similarity**: Cosine similarity against Hermes codebase embeddings
3. **Source Authority**: Weight arXiv > Labs > Newsletters
4. **Recency Boost**: Same-day sources weighted higher
**Output**: Ranked list with scores in `data/ranked/YYYY-MM-DD.json`
**Implementation**: `bin/phase2_rank.py`
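
One way to combine the four scoring dimensions is a weighted sum, sketched below. The weights, the authority values, and the item fields (`title`, `summary`, `embedding`, `source_type`, `published`) are all assumptions made for illustration; the real parameters would live in `config/relevance.yaml`, and the embedding vectors are taken as given rather than computed here.

```python
import math

KEYWORDS = ["agent systems", "llm architecture", "rl training", "tool use", "mcp", "hermes"]
AUTHORITY = {"arxiv": 1.0, "lab": 0.8, "newsletter": 0.6}  # arXiv > Labs > Newsletters
WEIGHTS = {"keyword": 0.4, "embedding": 0.3, "authority": 0.2, "recency": 0.1}  # hypothetical


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score(item, reference_vec, today):
    """Weighted sum of keyword, embedding, authority, and recency signals."""
    text = (item["title"] + " " + item.get("summary", "")).lower()
    keyword = sum(kw in text for kw in KEYWORDS) / len(KEYWORDS)
    embedding = max(0.0, cosine(item["embedding"], reference_vec))
    authority = AUTHORITY.get(item["source_type"], 0.0)
    recency = 1.0 if item["published"] == today else 0.0  # same-day boost
    return (WEIGHTS["keyword"] * keyword
            + WEIGHTS["embedding"] * embedding
            + WEIGHTS["authority"] * authority
            + WEIGHTS["recency"] * recency)


def rank(items, reference_vec, today):
    """Return items sorted from most to least relevant."""
    return sorted(items, key=lambda it: score(it, reference_vec, today), reverse=True)
```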
---
### Phase 3: Synthesis Engine
**Purpose**: Generate structured intelligence briefing via LLM
**Prompt Engineering**:
- Inject Hermes/Timmy context into system prompt
- Request specific structure: Headlines, Deep Dives, Implications
- Include source citations
- Tone: Professional intelligence briefing
**Output**: Markdown briefing in `data/briefings/YYYY-MM-DD.md`
**Models**: gpt-4o-mini (fast), claude-3-haiku (context), local Hermes (sovereign)
**Implementation**: `bin/phase3_synthesize.py`
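
The prompt requirements above might be assembled as in the following sketch. `build_prompts` is a hypothetical helper, the wording of the prompt is illustrative only (the actual templates are expected in `config/prompts/`), and the ranked items are assumed to carry `title` and `link` fields. Numbering the sources lets the model cite them as `[n]`.

```python
def build_prompts(ranked_items, context, top_n=10):
    """Return (system_prompt, user_prompt) for the briefing request."""
    system = (
        "You are writing a professional intelligence briefing.\n"
        f"Project context:\n{context}\n"
        "Structure the briefing with these sections: Headlines, Deep Dives, Implications.\n"
        "Cite sources inline as [n], using the numbers from the source list."
    )
    # Number the top-ranked sources so citations can refer back to them.
    lines = [
        f"[{i}] {item['title']} ({item['link']})"
        for i, item in enumerate(ranked_items[:top_n], start=1)
    ]
    user = "Sources, ranked by relevance:\n" + "\n".join(lines)
    return system, user
```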
---
### Phase 4: Audio Generation
**Purpose**: Convert text briefing to spoken audio podcast
**TTS Options**:
1. **OpenAI TTS**: `tts-1` or `tts-1-hd` (high quality, API cost)
2. **ElevenLabs**: Premium voices (API key required)
3. **Local XTTS**: Fully sovereign (GPU required, ~4GB VRAM)
4. **edge-tts**: Free via Microsoft Edge voices (no API key)
**Output**: MP3 file in `data/audio/YYYY-MM-DD.mp3`
**Implementation**: `bin/phase4_generate_audio.py`
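
Provider selection with a free fallback could be sketched as follows. This is an assumption about how the options might be ordered, not the logic of `bin/phase4_generate_audio.py`. The `backends` mapping is hypothetical: each value stands in for a function that calls the real provider and writes the MP3, so no provider API is invoked here. Local XTTS is left out because it is selected by hardware rather than by API key.

```python
def provider_chain(env):
    """Return TTS providers to try, in order, ending with the free fallback."""
    chain = []
    if env.get("OPENAI_API_KEY"):
        chain.append("openai")
    if env.get("ELEVENLABS_API_KEY"):
        chain.append("elevenlabs")
    chain.append("edge-tts")  # needs no API key, so it is always a candidate
    return chain


def synthesize(text, out_path, chain, backends):
    """Try each provider in turn; return the name of the one that succeeded."""
    errors = {}
    for name in chain:
        try:
            backends[name](text, out_path)  # hypothetical callable per provider
            return name
        except Exception as exc:  # any provider failure triggers the next one
            errors[name] = exc
    raise RuntimeError(f"all TTS providers failed: {errors}")
```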
---
### Phase 5: Delivery Pipeline
**Purpose**: Scheduled delivery to Telegram as voice message
**Mechanism**:
- Cron trigger at 6:00 AM Eastern daily (cron fires in the host's local time zone, so the host must be set to Eastern time)
- Check for existing audio file
- Send voice message via Telegram Bot API
- Fallback to text digest if audio fails
- On-demand generation via `/deepdive` command
**Implementation**: `bin/phase5_deliver.py`
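
The audio-or-text decision could be sketched as below. `plan_delivery` and `send_text` are hypothetical names, and the sketch only covers the text fallback over the network; uploading the audio through `sendVoice` needs a multipart request and is omitted. The 4096-character cap reflects the Bot API limit on a single text message.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path

TELEGRAM_TEXT_LIMIT = 4096  # maximum characters in one Bot API text message


def plan_delivery(audio_path, briefing_path):
    """Prefer the voice message; fall back to a text digest if audio is missing."""
    audio = Path(audio_path)
    if audio.is_file() and audio.stat().st_size > 0:
        return {"method": "sendVoice", "file": str(audio)}
    text = Path(briefing_path).read_text(encoding="utf-8")
    return {"method": "sendMessage", "text": text[:TELEGRAM_TEXT_LIMIT]}


def send_text(token, chat_id, text):
    """POST a text message via the Telegram Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    with urllib.request.urlopen(url, data=data, timeout=30) as resp:
        return json.load(resp)
```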
---
## Directory Structure
```
deepdive/
├── bin/ # Executable pipeline scripts
│ ├── phase1_aggregate.py # Source ingestion
│ ├── phase2_rank.py # Relevance filtering
│ ├── phase3_synthesize.py # LLM briefing generation
│ ├── phase4_generate_audio.py # TTS pipeline
│ ├── phase5_deliver.py # Telegram delivery
│ └── run_full_pipeline.py # Orchestrator
├── config/
│ ├── sources.yaml # Source URLs and weights
│ ├── relevance.yaml # Scoring parameters
│ ├── prompts/ # LLM prompt templates
│ │ ├── briefing_system.txt
│ │ └── briefing_user.txt
│ └── telegram.yaml # Bot configuration
├── templates/
│ ├── briefing_template.md # Output formatting
│ └── podcast_intro.txt # Audio intro script
├── docs/
│ ├── ARCHITECTURE.md # This document
│ ├── OPERATIONS.md # Runbook
│ └── TROUBLESHOOTING.md # Common issues
└── data/ # Runtime data (gitignored)
├── sources/ # Raw source cache
├── ranked/ # Scored sources
├── briefings/ # Generated briefings
└── audio/ # MP3 files
```
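
A minimal sketch of how an orchestrator such as `run_full_pipeline.py` might chain the phase scripts is shown below. It assumes every phase script accepts a `--date=YYYY-MM-DD` argument and signals failure through a non-zero exit code; neither is confirmed by this document. Treating the audio phase as non-fatal is a design assumption that lets Phase 5 reach its text fallback.

```python
import subprocess
import sys
from pathlib import Path

PHASES = [
    "phase1_aggregate.py",
    "phase2_rank.py",
    "phase3_synthesize.py",
    "phase4_generate_audio.py",
    "phase5_deliver.py",
]
NON_FATAL = {"phase4_generate_audio.py"}  # delivery can fall back to text


def build_commands(run_date, bin_dir="bin"):
    """Build one argv list per phase, in pipeline order."""
    return [
        [sys.executable, str(Path(bin_dir) / script), f"--date={run_date}"]
        for script in PHASES
    ]


def run_pipeline(run_date, bin_dir="bin", runner=subprocess.call):
    """Run phases in order; stop at the first fatal failure and return its code."""
    for script, cmd in zip(PHASES, build_commands(run_date, bin_dir)):
        code = runner(cmd)
        if code != 0 and script not in NON_FATAL:
            return code
    return 0
```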
---
## Configuration
### Environment Variables
```bash
# Required
export DEEPDIVE_TELEGRAM_BOT_TOKEN="..."
export DEEPDIVE_TELEGRAM_CHAT_ID="..."
# TTS Provider (pick one)
export OPENAI_API_KEY="..." # For OpenAI TTS
export ELEVENLABS_API_KEY="..." # For ElevenLabs
# OR use edge-tts (no API key needed)
# Optional LLM for synthesis
export ANTHROPIC_API_KEY="..."
# OPENAI_API_KEY="..." is the same variable used for OpenAI TTS above; set it once
# OR use local Hermes endpoint
```
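
A startup check for the required variables could look like this sketch. `missing_required` is a hypothetical helper; it only covers the two variables marked as required above, since the TTS and LLM keys are optional alternatives.

```python
import os

REQUIRED = ["DEEPDIVE_TELEGRAM_BOT_TOKEN", "DEEPDIVE_TELEGRAM_CHAT_ID"]


def missing_required(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]
```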
### Cron Setup
```bash
# /etc/cron.d/deepdive
0 6 * * * deepdive /opt/deepdive/bin/run_full_pipeline.py --date=$(date +\%Y-\%m-\%d)
```
---
## Acceptance Criteria Mapping
| Criterion | Phase | Status | Evidence |
|-----------|-------|--------|----------|
| Zero manual copy-paste | 1-5 | ✅ | Fully automated pipeline |
| Daily 6 AM delivery | 5 | ✅ | Cron-triggered delivery |
| arXiv (cs.AI/CL/LG) | 1 | ✅ | arXiv RSS configured |
| Lab blog coverage | 1 | ✅ | OpenAI, Anthropic, DeepMind |
| Relevance ranking | 2 | ✅ | Embedding + keyword scoring |
| Hermes context injection | 3 | ✅ | System prompt engineering |
| TTS audio generation | 4 | ✅ | MP3 output |
| Telegram delivery | 5 | ✅ | Voice message API |
| On-demand command | 5 | ✅ | `/deepdive` handler |
---
## Risk Mitigation
| Risk | Mitigation |
|------|------------|
| API rate limits | Exponential backoff, local cache |
| Source unavailability | Multi-source redundancy |
| TTS cost | edge-tts fallback (free) |
| Telegram failures | SMS fallback planned (#831) |
| Hallucination | Source citations required in prompt |
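
The exponential backoff named in the first row can be sketched as below. The retry count, delays, and jitter are illustrative defaults rather than project settings, and `retry_with_backoff` is a hypothetical helper. The `sleep` parameter is injectable so the behaviour can be tested without waiting.

```python
import random
import time


def retry_with_backoff(fn, retries=4, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt (capped) and retry."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter keeps parallel clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```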
---
## Next Steps
1. **Host Selection**: Determine deployment target (local VPS vs cloud)
2. **TTS Provider**: Select and configure API key
3. **Telegram Bot**: Create bot, get token, configure chat ID
4. **Test Run**: Execute `./bin/run_full_pipeline.py --date=$(date +%Y-%m-%d)` (the same date format the cron entry passes)
5. **Cron Activation**: Enable daily automation
6. **Monitoring**: Watch first week of deliveries
---
**Artifact Location**: `the-nexus/deepdive/`
**Issue Ref**: #830
**Maintainer**: Ezra for architecture, {TBD} for operations