From 30b94387497598639dcbdd0acbba0cb95c06670c Mon Sep 17 00:00:00 2001
From: Ezra
Date: Sun, 5 Apr 2026 08:58:25 +0000
Subject: [PATCH] [EZRA BURN-MODE] Deep Dive architecture decomposition (the-nexus#830)

---
 docs/deep-dive-architecture.md | 284 +++++++++++++++++++++++++++++++++
 1 file changed, 284 insertions(+)
 create mode 100644 docs/deep-dive-architecture.md

diff --git a/docs/deep-dive-architecture.md b/docs/deep-dive-architecture.md
new file mode 100644
index 0000000..2d38a4f
--- /dev/null
+++ b/docs/deep-dive-architecture.md
@@ -0,0 +1,284 @@
+# Deep Dive: Sovereign Daily Intelligence Briefing
+
+> **Parent**: the-nexus#830
+> **Created**: 2026-04-05 by Ezra burn-mode triage
+> **Status**: Architecture proof, Phase 1 ready for implementation
+
+## Executive Summary
+
+**Deep Dive** is a fully automated, sovereign alternative to NotebookLM. It aggregates AI/ML intelligence from arXiv, lab blogs, and newsletters; filters by relevance to Hermes/Timmy work; synthesizes into structured briefings; and delivers as audio podcasts via Telegram.
+
+This document provides the technical decomposition to transform #830 from a 21-point EPIC into executable child issues.
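As a rough illustration of the aggregate → filter → synthesize → deliver flow described above, the stages can be modeled as plain callables chained by one function. Everything here (the `Item` fields, `run_pipeline`, the stub stages) is a hypothetical sketch, not code from the repository; audio generation is folded into the `deliver` callable for brevity, and the default `top_n=10` mirrors the top-10 cut-off planned for Phase 2.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Item:
    """One aggregated entry; the field names are illustrative assumptions."""
    title: str
    url: str
    summary: str


def run_pipeline(
    aggregate: Callable[[], List[Item]],
    rank: Callable[[List[Item]], List[Item]],
    synthesize: Callable[[List[Item]], str],
    deliver: Callable[[str], None],
    top_n: int = 10,
) -> str:
    """Run aggregate -> rank -> synthesize -> deliver and return the briefing text."""
    items = aggregate()
    selected = rank(items)[:top_n]  # keep only the highest-ranked items
    briefing = synthesize(selected)
    deliver(briefing)
    return briefing


# Stub stages so the sketch runs without network access, models, or Telegram.
def fake_aggregate() -> List[Item]:
    return [
        Item("Unrelated chemistry result", "https://example.org/2", "..."),
        Item("Agent tool use survey", "https://example.org/1", "..."),
    ]


def fake_rank(items: List[Item]) -> List[Item]:
    # Stable sort: titles mentioning "agent" come first.
    return sorted(items, key=lambda it: "agent" not in it.title.lower())


def fake_synthesize(items: List[Item]) -> str:
    return "\n".join(f"- {it.title} ({it.url})" for it in items)


sent: List[str] = []  # stands in for the delivery channel
briefing = run_pipeline(fake_aggregate, fake_rank, fake_synthesize, sent.append, top_n=1)
```

Passing the stages in as callables keeps each phase independently testable, which fits the plan to ship the phases as separate child issues.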
+
+---
+
+## System Architecture
+
+```
+┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
+│  SOURCE LAYER   │───▶│  FILTER LAYER   │───▶│ SYNTHESIS LAYER │
+│    (Phase 1)    │    │    (Phase 2)    │    │    (Phase 3)    │
+└─────────────────┘    └─────────────────┘    └─────────────────┘
+         │                      │                      │
+         ▼                      ▼                      ▼
+┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
+│ • arXiv RSS     │    │ • Keyword match │    │ • LLM prompt    │
+│ • Blog scrapers │    │ • Embedding sim │    │ • Context inj   │
+│ • Newsletters   │    │ • Ranking algo  │    │ • Brief gen     │
+└─────────────────┘    └─────────────────┘    └─────────────────┘
+                                                       │
+                                                       ▼
+                                              ┌─────────────────┐
+                                              │  OUTPUT LAYER   │
+                                              │  (Phases 4-5)   │
+                                              ├─────────────────┤
+                                              │ • TTS pipeline  │
+                                              │ • Audio file    │
+                                              │ • Telegram bot  │
+                                              │ • Cron schedule │
+                                              └─────────────────┘
+```
+
+---
+
+## Phase Decomposition
+
+### Phase 1: Source Aggregation (2-3 points)
+**Dependencies**: None. Can start immediately.
+
+| Source | Method | Rate Limit | Notes |
+|--------|--------|------------|-------|
+| arXiv | RSS + API | 1 req/3 sec | cs.AI, cs.CL, cs.LG categories |
+| OpenAI Blog | RSS feed | None | Research + product announcements |
+| Anthropic | RSS + sitemap | Respect robots.txt | Research publications |
+| DeepMind | RSS feed | None | arXiv cross-posts + blog |
+| Import AI | Newsletter | Manual | RSS if available |
+| TLDR AI | Newsletter | Manual | Web scrape if no RSS |
+
+**Implementation Path**:
+```python
+# scaffold/deepdive/phase1/arxiv_aggregator.py
+# ArXiv RSS → JSON lines store
+# Daily cron: fetch → parse → dedupe → store
+```
+
+**Sovereignty**: Zero API keys needed for RSS. arXiv API is public.
+
+### Phase 2: Relevance Engine (4-5 points)
+**Dependencies**: Phase 1 data store
+
+**Embedding Strategy**:
+| Option | Model | Local? | Quality | Speed |
+|--------|-------|--------|---------|-------|
+| **Primary** | nomic-embed-text-v1.5 | ✅ llama.cpp | Good | Fast |
+| Fallback | all-MiniLM-L6-v2 | ✅ sentence-transformers | Good | Medium |
+| Cloud | OpenAI text-embedding-3 | ❌ | Best | Fast |
+
+**Relevance Scoring**:
+1. Keyword pre-filter (Hermes, agent, LLM, RL, training)
+2. Embedding similarity vs codebase embedding
+3. Rank by combined score (keyword + embedding + recency)
+4. Pick top 10 items per briefing
+
+**Implementation Path**:
+```python
+# scaffold/deepdive/phase2/relevance_engine.py
+# Load daily items → embed → score → rank → filter
+```
+
+### Phase 3: Synthesis Engine (3-4 points)
+**Dependencies**: Phase 2 filtered items
+
+**Prompt Architecture**:
+```
+SYSTEM: You are Deep Dive, an AI intelligence analyst for the Hermes/Timmy project.
+Your task: synthesize daily AI/ML news into a 5-7 minute briefing.
+
+CONTEXT: Hermes is an open-source LLM agent framework. Key interests:
+- LLM architecture and training
+- Agent systems and tool use
+- RL and GRPO training
+- Open-source model releases
+
+OUTPUT FORMAT:
+1. HEADLINES (3 items): One-sentence summaries with impact tags [MAJOR|MINOR]
+2. DEEP DIVE (1-2 items): Paragraph with context + implications for Hermes
+3. IMPLICATIONS: "Why this matters for our work"
+4. SOURCES: Citation list
+
+TONE: Professional, concise, actionable. No fluff.
+```
+
+**LLM Options**:
+| Option | Source | Local? | Quality | Cost |
+|--------|--------|--------|---------|------|
+| **Primary** | Gemma 4 E4B via Hermes | ✅ | Excellent | Zero |
+| Fallback | Kimi K2.5 via OpenRouter | ❌ | Excellent | API credits |
+| Fallback | Claude via Anthropic | ❌ | Best | $$ |
+
+### Phase 4: Audio Generation (5-6 points)
+**Dependencies**: Phase 3 text output
+
+**TTS Pipeline Decision Matrix**:
+| Option | Engine | Local? | Quality | Speed | Cost |
+|--------|--------|--------|---------|-------|------|
+| **Primary** | Piper TTS | ✅ | Good | Fast | Zero |
+| Fallback | Coqui TTS | ✅ | Good | Slow | Zero |
+| Fallback | MMS | ✅ | Medium | Fast | Zero |
+| Cloud | ElevenLabs | ❌ | Best | Fast | $ |
+| Cloud | OpenAI TTS | ❌ | Great | Fast | $ |
+
+**Recommendation**: Implement local Piper first. If quality is insufficient for daily use, add ElevenLabs as a quality-gated fallback.
+
+**Voice Selection**:
+- Piper: `en_US-lessac-medium` (balanced quality/speed)
+- ElevenLabs: `Josh` or clone custom voice
+
+### Phase 5: Delivery Pipeline (3-4 points)
+**Dependencies**: Phase 4 audio file
+
+**Components**:
+1. **Cron Scheduler**: Daily 06:00 EST trigger
+2. **Telegram Bot Integration**: Send voice message via existing gateway
+3. **On-demand Trigger**: `/deepdive` slash command in Hermes
+4. **Storage**: Audio file cache (7-day retention)
+
+**Telegram Voice Message Format**:
+- OGG Opus (Telegram native)
+- Piper outputs WAV → convert via ffmpeg
+- 10-15 minute typical length
+
+---
+
+## Data Flow
+
+```
+06:00 EST (cron)
+        │
+        ▼
+┌───────────────┐
+│ Run Aggregator│◄── Daily fetch of all sources
+└───────────────┘
+        │
+        ▼ JSON lines store
+┌───────────────┐
+│ Run Relevance │◄── Embed + score + rank
+└───────────────┘
+        │
+        ▼ Top 10 items
+┌───────────────┐
+│ Run Synthesis │◄── LLM prompt → briefing text
+└───────────────┘
+        │
+        ▼ Markdown + raw text
+┌───────────────┐
+│ Run TTS       │◄── Text → audio file
+└───────────────┘
+        │
+        ▼ OGG Opus file
+┌───────────────┐
+│ Telegram Send │◄── Voice message to channel
+└───────────────┘
+        │
+        ▼
+Alexander receives daily briefing ☕
+```
+
+---
+
+## Child Issue Decomposition
+
+| Child Issue | Scope | Points | Owner | Blocked By |
+|-------------|-------|--------|-------|------------|
+| the-nexus#830.1 | Phase 1: arXiv RSS aggregator | 3 | @ezra | None |
+| the-nexus#830.2 | Phase 1: Blog scrapers (OpenAI, Anthropic, DeepMind) | 2 | TBD | None |
+| the-nexus#830.3 | Phase 2: Relevance engine + embeddings | 5 | TBD | 830.1, 830.2 |
+| the-nexus#830.4 | Phase 3: Synthesis prompts + briefing template | 4 | TBD | 830.3 |
+| the-nexus#830.5 | Phase 4: TTS pipeline (Piper + fallback) | 6 | TBD | 830.4 |
+| the-nexus#830.6 | Phase 5: Telegram delivery + `/deepdive` command | 4 | TBD | 830.5 |
+
+**Total**: 24 points (original 21 was optimistic; TTS integration complexity warrants 6 points)
+
+---
+
+## Sovereignty Preservation
+
+| Component | Sovereign Path | Trade-off |
+|-----------|---------------|-----------|
+| Source aggregation | RSS (no API keys) | Limited metadata vs API |
+| Embeddings | nomic-embed-text via llama.cpp | Setup complexity |
+| LLM synthesis | Gemma 4 via Hermes | Requires local GPU |
+| TTS | Piper (local, fast) | Quality vs ElevenLabs |
+| Delivery | Hermes Telegram gateway | Already exists |
+
+**Fallback Plan**: If a local GPU is unavailable for synthesis, use Kimi K2.5 via OpenRouter. If Piper quality is unacceptable, use ElevenLabs with a budget cap.
+
+---
+
+## Directory Structure
+
+```
+the-nexus/
+├── docs/deep-dive-architecture.md (this file)
+├── scaffold/deepdive/
+│   ├── phase1/
+│   │   ├── arxiv_aggregator.py (proof-of-concept)
+│   │   ├── blog_scraper.py
+│   │   └── config.yaml (source URLs, categories)
+│   ├── phase2/
+│   │   ├── relevance_engine.py
+│   │   └── embeddings.py
+│   ├── phase3/
+│   │   ├── synthesis.py
+│   │   └── briefing_template.md
+│   ├── phase4/
+│   │   ├── tts_pipeline.py
+│   │   └── piper_config.json
+│   └── phase5/
+│       ├── telegram_delivery.py
+│       └── deepdive_command.py
+├── data/deepdive/ (gitignored)
+│   ├── raw/        # Phase 1 output
+│   ├── scored/     # Phase 2 output
+│   ├── briefings/  # Phase 3 output
+│   └── audio/      # Phase 4 output
+└── cron/deepdive.sh # Daily runner
+```
+
+---
+
+## Proof-of-Concept: Phase 1 Stub
+
+See `scaffold/deepdive/phase1/arxiv_aggregator.py` for an immediately executable arXiv RSS fetcher.
+
+**Zero dependencies beyond stdlib + feedparser** (can use xml.etree if strict).
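The stdlib-only variant mentioned above might look roughly like the sketch below, covering the parse → dedupe → store part of the Phase 1 path. It assumes an RSS 2.0 layout with `<item>`, `<title>`, `<link>`, and `<description>` elements and parses an inline, made-up sample rather than a live arXiv feed; the real feed may use different elements or XML namespaces. The helper names (`parse_items`, `dedupe`, `to_json_lines`) are illustrative and not taken from `arxiv_aggregator.py`.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for a fetched feed (assumption: RSS 2.0 layout).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Paper A</title>
      <link>https://example.org/abs/0001</link>
      <description>First abstract.</description>
    </item>
    <item>
      <title>Paper A (duplicate)</title>
      <link>https://example.org/abs/0001</link>
      <description>First abstract again.</description>
    </item>
    <item>
      <title>Paper B</title>
      <link>https://example.org/abs/0002</link>
      <description>Second abstract.</description>
    </item>
  </channel>
</rss>
"""


def parse_items(feed_xml):
    """Parse RSS 2.0 <item> elements into plain dicts."""
    root = ET.fromstring(feed_xml)
    items = []
    for node in root.iter("item"):
        items.append({
            "title": (node.findtext("title") or "").strip(),
            "link": (node.findtext("link") or "").strip(),
            "summary": (node.findtext("description") or "").strip(),
        })
    return items


def dedupe(items, seen_links=None):
    """Keep the first item per link; skip items with an empty link."""
    seen = set(seen_links or ())
    unique = []
    for item in items:
        link = item["link"]
        if not link or link in seen:
            continue
        seen.add(link)
        unique.append(item)
    return unique


def to_json_lines(items):
    """Serialize items as JSON lines (one JSON object per line)."""
    return "\n".join(json.dumps(item, sort_keys=True) for item in items)


records = dedupe(parse_items(SAMPLE_FEED))
json_lines = to_json_lines(records)
```

Deduping on the link keeps the sketch simple. A real aggregator would probably persist seen links between daily runs, which is what the `seen_links` parameter hints at.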
+
+**Can run today**: No API keys, no GPU, no TTS decisions needed.
+
+---
+
+## Acceptance Criteria Mapping
+
+| Original Criterion | Implementation | Child Issue |
+|-------------------|----------------|-------------|
+| Zero manual copy-paste | RSS aggregation + cron | 830.1, 830.2 |
+| Daily delivery 6 AM | Cron trigger | 830.6 |
+| arXiv cs.AI/CL/LG | arXiv RSS categories | 830.1 |
+| Lab blogs | Blog scrapers | 830.2 |
+| Relevance ranking | Embedding similarity | 830.3 |
+| Hermes context | Synthesis prompt injection | 830.4 |
+| TTS audio | Piper/ElevenLabs | 830.5 |
+| Telegram voice | Bot integration | 830.6 |
+| On-demand `/deepdive` | Slash command | 830.6 |
+
+---
+
+## Immediate Next Action
+
+**@ezra** will implement the Phase 1 proof-of-concept (`arxiv_aggregator.py`) to validate the pipeline architecture and unblock downstream phases.
+
+**Estimated time**: 2 hours to working fetch+store.
+
+---
+
+*Document created during Ezra burn-mode triage of the-nexus#830*