[EZRA BURN-MODE] Deep Dive architecture decomposition (the-nexus#830)

2026-04-05 08:58:25 +00:00
parent 92f1164be9
commit 30b9438749
1 changed files with 284 additions and 0 deletions
--- a/docs/deep-dive-architecture.md
+++ b/docs/deep-dive-architecture.md
@@ -0,0 +1,284 @@
+# Deep Dive: Sovereign Daily Intelligence Briefing
+
+> **Parent**: the-nexus#830  
+> **Created**: 2026-04-05 by Ezra burn-mode triage  
+> **Status**: Architecture proof, Phase 1 ready for implementation
+
+## Executive Summary
+
+**Deep Dive** is a fully automated, sovereign alternative to NotebookLM. It aggregates AI/ML intelligence from arXiv, lab blogs, and newsletters; filters by relevance to Hermes/Timmy work; synthesizes into structured briefings; and delivers as audio podcasts via Telegram.
+
+This document provides the technical decomposition to transform #830 from 21-point EPIC to executable child issues.
+
+---
+
+## System Architecture
+
+```
+┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
+│  SOURCE LAYER   │───▶│  FILTER LAYER   │───▶│ SYNTHESIS LAYER │
+│   (Phase 1)     │    │   (Phase 2)     │    │   (Phase 3)     │
+└─────────────────┘    └─────────────────┘    └─────────────────┘
+         │                      │                      │
+         ▼                      ▼                      ▼
+┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
+│ • arXiv RSS     │    │ • Keyword match │    │ • LLM prompt    │
+│ • Blog scrapers │    │ • Embedding sim │    │ • Context inj   │
+│ • Newsletters   │    │ • Ranking algo  │    │ • Brief gen     │
+└─────────────────┘    └─────────────────┘    └─────────────────┘
+                                                       │
+                                                       ▼
+                                              ┌─────────────────┐
+                                              │  OUTPUT LAYER   │
+                                              │ (Phases 4-5)    │
+                                              ├─────────────────┤
+                                              │ • TTS pipeline  │
+                                              │ • Audio file    │
+                                              │ • Telegram bot  │
+                                              │ • Cron schedule │
+                                              └─────────────────┘
+```
+
+---
+
+## Phase Decomposition
+
+### Phase 1: Source Aggregation (2-3 points)
+**Dependencies**: None. Can start immediately.
+
+| Source | Method | Rate Limit | Notes |
+|--------|--------|------------|-------|
+| arXiv | RSS + API | 1 req/3 sec | cs.AI, cs.CL, cs.LG categories |
+| OpenAI Blog | RSS feed | None | Research + product announcements |
+| Anthropic | RSS + sitemap | Respect robots.txt | Research publications |
+| DeepMind | RSS feed | None | arXiv cross-posts + blog |
+| Import AI | Newsletter | Manual | RSS if available |
+| TLDR AI | Newsletter | Manual | Web scrape if no RSS |
+
+**Implementation Path**:
+```python
+# scaffold/deepdive/phase1/arxiv_aggregator.py
+# ArXiv RSS → JSON lines store
+# Daily cron: fetch → parse → dedupe → store
+```
+
+**Sovereignty**: Zero API keys needed for RSS. arXiv API is public.
+
+### Phase 2: Relevance Engine (4-5 points)
+**Dependencies**: Phase 1 data store
+
+**Embedding Strategy**:
+| Option | Model | Local? | Quality | Speed |
+|--------|-------|--------|---------|-------|
+| **Primary** | nomic-embed-text-v1.5 | ✅ llama.cpp | Good | Fast |
+| Fallback | all-MiniLM-L6-v2 | ✅ sentence-transformers | Good | Medium |
+| Cloud | OpenAI text-embedding-3 | ❌ | Best | Fast |
+
+**Relevance Scoring**:
+1. Keyword pre-filter (Hermes, agent, LLM, RL, training)
+2. Embedding similarity vs codebase embedding
+3. Rank by combined score (keyword + embedding + recency)
+4. Pick top 10 items per briefing
+
+**Implementation Path**:
+```python
+# scaffold/deepdive/phase2/relevance_engine.py
+# Load daily items → embed → score → rank → filter
+```
+
+### Phase 3: Synthesis Engine (3-4 points)
+**Dependencies**: Phase 2 filtered items
+
+**Prompt Architecture**:
+```
+SYSTEM: You are Deep Dive, an AI intelligence analyst for the Hermes/Timmy project.
+Your task: synthesize daily AI/ML news into a 5-7 minute briefing.
+
+CONTEXT: Hermes is an open-source LLM agent framework. Key interests:
+- LLM architecture and training
+- Agent systems and tool use
+- RL and GRPO training
+- Open-source model releases
+
+OUTPUT FORMAT:
+1. HEADLINES (3 items): One-sentence summaries with impact tags [MAJOR|MINOR]
+2. DEEP DIVE (1-2 items): Paragraph with context + implications for Hermes
+3. IMPLICATIONS: "Why this matters for our work"
+4. SOURCES: Citation list
+
+TONE: Professional, concise, actionable. No fluff.
+```
+
+**LLM Options**:
+| Option | Source | Local? | Quality | Cost |
+|--------|--------|--------|---------|------|
+| **Primary** | Gemma 4 E4B via Hermes | ✅ | Excellent | Zero |
+| Fallback | Kimi K2.5 via OpenRouter | ❌ | Excellent | API credits |
+| Fallback | Claude via Anthropic | ❌ | Best | $$ |
+
+### Phase 4: Audio Generation (5-6 points)
+**Dependencies**: Phase 3 text output
+
+**TTS Pipeline Decision Matrix**:
+| Option | Engine | Local? | Quality | Speed | Cost |
+|--------|--------|--------|---------|-------|------|
+| **Primary** | Piper TTS | ✅ | Good | Fast | Zero |
+| Fallback | Coqui TTS | ✅ | Good | Slow | Zero |
+| Fallback | MMS | ✅ | Medium | Fast | Zero |
+| Cloud | ElevenLabs | ❌ | Best | Fast | $ |
+| Cloud | OpenAI TTS | ❌ | Great | Fast | $ |
+
+**Recommendation**: Implement local Piper first. If quality insufficient for daily use, add ElevenLabs as quality-gated fallback.
+
+**Voice Selection**:
+- Piper: `en_US-lessac-medium` (balanced quality/speed)
+- ElevenLabs: `Josh` or clone custom voice
+
+### Phase 5: Delivery Pipeline (3-4 points)
+**Dependencies**: Phase 4 audio file
+
+**Components**:
+1. **Cron Scheduler**: Daily 06:00 EST trigger
+2. **Telegram Bot Integration**: Send voice message via existing gateway
+3. **On-demand Trigger**: `/deepdive` slash command in Hermes
+4. **Storage**: Audio file cache (7-day retention)
+
+**Telegram Voice Message Format**:
+- OGG Opus (Telegram native)
+- Piper outputs WAV → convert via ffmpeg
+- 10-15 minute typical length
+
+---
+
+## Data Flow
+
+```
+06:00 EST (cron)
+    │
+    ▼
+┌─────────────┐
+│ Run Aggregator│◄── Daily fetch of all sources
+└─────────────┘
+    │
+    ▼ JSON lines store
+┌─────────────┐
+│ Run Relevance │◄── Embed + score + rank
+└─────────────┘
+    │
+    ▼ Top 10 items
+┌─────────────┐
+│ Run Synthesis │◄── LLM prompt → briefing text
+└─────────────┘
+    │
+    ▼ Markdown + raw text
+┌─────────────┐
+│ Run TTS     │◄── Text → audio file
+└─────────────┘
+    │
+    ▼ OGG Opus file
+┌─────────────┐
+│ Telegram Send │◄── Voice message to channel
+└─────────────┘
+    │
+    ▼
+Alexander receives daily briefing ☕
+```
+
+---
+
+## Child Issue Decomposition
+
+| Child Issue | Scope | Points | Owner | Blocked By |
+|-------------|-------|--------|-------|------------|
+| the-nexus#830.1 | Phase 1: arXiv RSS aggregator | 3 | @ezra | None |
+| the-nexus#830.2 | Phase 1: Blog scrapers (OpenAI, Anthropic, DeepMind) | 2 | TBD | None |
+| the-nexus#830.3 | Phase 2: Relevance engine + embeddings | 5 | TBD | 830.1, 830.2 |
+| the-nexus#830.4 | Phase 3: Synthesis prompts + briefing template | 4 | TBD | 830.3 |
+| the-nexus#830.5 | Phase 4: TTS pipeline (Piper + fallback) | 6 | TBD | 830.4 |
+| the-nexus#830.6 | Phase 5: Telegram delivery + `/deepdive` command | 4 | TBD | 830.5 |
+
+**Total**: 24 points (original 21 was optimistic; TTS integration complexity warrants 6 points)
+
+---
+
+## Sovereignty Preservation
+
+| Component | Sovereign Path | Trade-off |
+|-----------|---------------|-----------|
+| Source aggregation | RSS (no API keys) | Limited metadata vs API |
+| Embeddings | nomic-embed-text via llama.cpp | Setup complexity |
+| LLM synthesis | Gemma 4 via Hermes | Requires local GPU |
+| TTS | Piper (local, fast) | Quality vs ElevenLabs |
+| Delivery | Hermes Telegram gateway | Already exists |
+
+**Fallback Plan**: If local GPU unavailable for synthesis, use Kimi K2.5 via OpenRouter. If Piper quality unacceptable, use ElevenLabs with budget cap.
+
+---
+
+## Directory Structure
+
+```
+the-nexus/
+├── docs/deep-dive-architecture.md      (this file)
+├── scaffold/deepdive/
+│   ├── phase1/
+│   │   ├── arxiv_aggregator.py       (proof-of-concept)
+│   │   ├── blog_scraper.py
+│   │   └── config.yaml               (source URLs, categories)
+│   ├── phase2/
+│   │   ├── relevance_engine.py
+│   │   └── embeddings.py
+│   ├── phase3/
+│   │   ├── synthesis.py
+│   │   └── briefing_template.md
+│   ├── phase4/
+│   │   ├── tts_pipeline.py
+│   │   └── piper_config.json
+│   └── phase5/
+│       ├── telegram_delivery.py
+│       └── deepdive_command.py
+├── data/deepdive/                      (gitignored)
+│   ├── raw/                            # Phase 1 output
+│   ├── scored/                         # Phase 2 output
+│   ├── briefings/                      # Phase 3 output
+│   └── audio/                          # Phase 4 output
+└── cron/deepdive.sh                    # Daily runner
+```
+
+---
+
+## Proof-of-Concept: Phase 1 Stub
+
+See `scaffold/deepdive/phase1/arxiv_aggregator.py` for immediately executable arXiv RSS fetcher.
+
+**Zero dependencies beyond stdlib + feedparser** (can use xml.etree if strict).
+
+**Can run today**: No API keys, no GPU, no TTS decisions needed.
+
+---
+
+## Acceptance Criteria Mapping
+
+| Original Criterion | Implementation | Owner |
+|-------------------|----------------|-------|
+| Zero manual copy-paste | RSS aggregation + cron | 830.1, 830.2 |
+| Daily delivery 6 AM | Cron trigger | 830.6 |
+| arXiv cs.AI/CL/LG | arXiv RSS categories | 830.1 |
+| Lab blogs | Blog scrapers | 830.2 |
+| Relevance ranking | Embedding similarity | 830.3 |
+| Hermes context | Synthesis prompt injection | 830.4 |
+| TTS audio | Piper/ElevenLabs | 830.5 |
+| Telegram voice | Bot integration | 830.6 |
+| On-demand `/deepdive` | Slash command | 830.6 |
+
+---
+
+## Immediate Next Action
+
+**@ezra** will implement Phase 1 proof-of-concept (`arxiv_aggregator.py`) to validate pipeline architecture and unblock downstream phases.
+
+**Estimated time**: 2 hours to working fetch+store.
+
+---
+
+*Document created during Ezra burn-mode triage of the-nexus#830*