[EZRA BURN-MODE] Deep Dive architecture decomposition (the-nexus#830)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
This commit is contained in:
284
docs/deep-dive-architecture.md
Normal file
284
docs/deep-dive-architecture.md
Normal file
@@ -0,0 +1,284 @@
|
||||
# Deep Dive: Sovereign Daily Intelligence Briefing
|
||||
|
||||
> **Parent**: the-nexus#830
|
||||
> **Created**: 2026-04-05 by Ezra burn-mode triage
|
||||
> **Status**: Architecture proof, Phase 1 ready for implementation
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Deep Dive** is a fully automated, sovereign alternative to NotebookLM. It aggregates AI/ML intelligence from arXiv, lab blogs, and newsletters; filters by relevance to Hermes/Timmy work; synthesizes into structured briefings; and delivers as audio podcasts via Telegram.
|
||||
|
||||
This document provides the technical decomposition to transform #830 from 21-point EPIC to executable child issues.
|
||||
|
||||
---
|
||||
|
||||
## System Architecture
|
||||
|
||||
```
|
||||
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
|
||||
│ SOURCE LAYER │───▶│ FILTER LAYER │───▶│ SYNTHESIS LAYER │
|
||||
│ (Phase 1) │ │ (Phase 2) │ │ (Phase 3) │
|
||||
└─────────────────┘ └─────────────────┘ └─────────────────┘
|
||||
│ │ │
|
||||
▼ ▼ ▼
|
||||
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
|
||||
│ • arXiv RSS │ │ • Keyword match │ │ • LLM prompt │
|
||||
│ • Blog scrapers │ │ • Embedding sim │ │ • Context inj │
|
||||
│ • Newsletters │ │ • Ranking algo │ │ • Brief gen │
|
||||
└─────────────────┘ └─────────────────┘ └─────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────┐
|
||||
│ OUTPUT LAYER │
|
||||
│ (Phases 4-5) │
|
||||
├─────────────────┤
|
||||
│ • TTS pipeline │
|
||||
│ • Audio file │
|
||||
│ • Telegram bot │
|
||||
│ • Cron schedule │
|
||||
└─────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase Decomposition
|
||||
|
||||
### Phase 1: Source Aggregation (2-3 points)
|
||||
**Dependencies**: None. Can start immediately.
|
||||
|
||||
| Source | Method | Rate Limit | Notes |
|
||||
|--------|--------|------------|-------|
|
||||
| arXiv | RSS + API | 1 req/3 sec | cs.AI, cs.CL, cs.LG categories |
|
||||
| OpenAI Blog | RSS feed | None | Research + product announcements |
|
||||
| Anthropic | RSS + sitemap | Respect robots.txt | Research publications |
|
||||
| DeepMind | RSS feed | None | arXiv cross-posts + blog |
|
||||
| Import AI | Newsletter | Manual | RSS if available |
|
||||
| TLDR AI | Newsletter | Manual | Web scrape if no RSS |
|
||||
|
||||
**Implementation Path**:
|
||||
```python
|
||||
# scaffold/deepdive/phase1/arxiv_aggregator.py
|
||||
# ArXiv RSS → JSON lines store
|
||||
# Daily cron: fetch → parse → dedupe → store
|
||||
```
|
||||
|
||||
**Sovereignty**: Zero API keys needed for RSS. arXiv API is public.
|
||||
|
||||
### Phase 2: Relevance Engine (4-5 points)
|
||||
**Dependencies**: Phase 1 data store
|
||||
|
||||
**Embedding Strategy**:
|
||||
| Option | Model | Local? | Quality | Speed |
|
||||
|--------|-------|--------|---------|-------|
|
||||
| **Primary** | nomic-embed-text-v1.5 | ✅ llama.cpp | Good | Fast |
|
||||
| Fallback | all-MiniLM-L6-v2 | ✅ sentence-transformers | Good | Medium |
|
||||
| Cloud | OpenAI text-embedding-3 | ❌ | Best | Fast |
|
||||
|
||||
**Relevance Scoring**:
|
||||
1. Keyword pre-filter (Hermes, agent, LLM, RL, training)
|
||||
2. Embedding similarity vs codebase embedding
|
||||
3. Rank by combined score (keyword + embedding + recency)
|
||||
4. Pick top 10 items per briefing
|
||||
|
||||
**Implementation Path**:
|
||||
```python
|
||||
# scaffold/deepdive/phase2/relevance_engine.py
|
||||
# Load daily items → embed → score → rank → filter
|
||||
```
|
||||
|
||||
### Phase 3: Synthesis Engine (3-4 points)
|
||||
**Dependencies**: Phase 2 filtered items
|
||||
|
||||
**Prompt Architecture**:
|
||||
```
|
||||
SYSTEM: You are Deep Dive, an AI intelligence analyst for the Hermes/Timmy project.
|
||||
Your task: synthesize daily AI/ML news into a 5-7 minute briefing.
|
||||
|
||||
CONTEXT: Hermes is an open-source LLM agent framework. Key interests:
|
||||
- LLM architecture and training
|
||||
- Agent systems and tool use
|
||||
- RL and GRPO training
|
||||
- Open-source model releases
|
||||
|
||||
OUTPUT FORMAT:
|
||||
1. HEADLINES (3 items): One-sentence summaries with impact tags [MAJOR|MINOR]
|
||||
2. DEEP DIVE (1-2 items): Paragraph with context + implications for Hermes
|
||||
3. IMPLICATIONS: "Why this matters for our work"
|
||||
4. SOURCES: Citation list
|
||||
|
||||
TONE: Professional, concise, actionable. No fluff.
|
||||
```
|
||||
|
||||
**LLM Options**:
|
||||
| Option | Source | Local? | Quality | Cost |
|
||||
|--------|--------|--------|---------|------|
|
||||
| **Primary** | Gemma 4 E4B via Hermes | ✅ | Excellent | Zero |
|
||||
| Fallback | Kimi K2.5 via OpenRouter | ❌ | Excellent | API credits |
|
||||
| Fallback | Claude via Anthropic | ❌ | Best | $$ |
|
||||
|
||||
### Phase 4: Audio Generation (5-6 points)
|
||||
**Dependencies**: Phase 3 text output
|
||||
|
||||
**TTS Pipeline Decision Matrix**:
|
||||
| Option | Engine | Local? | Quality | Speed | Cost |
|
||||
|--------|--------|--------|---------|-------|------|
|
||||
| **Primary** | Piper TTS | ✅ | Good | Fast | Zero |
|
||||
| Fallback | Coqui TTS | ✅ | Good | Slow | Zero |
|
||||
| Fallback | MMS | ✅ | Medium | Fast | Zero |
|
||||
| Cloud | ElevenLabs | ❌ | Best | Fast | $ |
|
||||
| Cloud | OpenAI TTS | ❌ | Great | Fast | $ |
|
||||
|
||||
**Recommendation**: Implement local Piper first. If quality insufficient for daily use, add ElevenLabs as quality-gated fallback.
|
||||
|
||||
**Voice Selection**:
|
||||
- Piper: `en_US-lessac-medium` (balanced quality/speed)
|
||||
- ElevenLabs: `Josh` or clone custom voice
|
||||
|
||||
### Phase 5: Delivery Pipeline (3-4 points)
|
||||
**Dependencies**: Phase 4 audio file
|
||||
|
||||
**Components**:
|
||||
1. **Cron Scheduler**: Daily 06:00 EST trigger
|
||||
2. **Telegram Bot Integration**: Send voice message via existing gateway
|
||||
3. **On-demand Trigger**: `/deepdive` slash command in Hermes
|
||||
4. **Storage**: Audio file cache (7-day retention)
|
||||
|
||||
**Telegram Voice Message Format**:
|
||||
- OGG Opus (Telegram native)
|
||||
- Piper outputs WAV → convert via ffmpeg
|
||||
- 10-15 minute typical length
|
||||
|
||||
---
|
||||
|
||||
## Data Flow
|
||||
|
||||
```
|
||||
06:00 EST (cron)
|
||||
│
|
||||
▼
|
||||
┌─────────────┐
|
||||
│ Run Aggregator│◄── Daily fetch of all sources
|
||||
└─────────────┘
|
||||
│
|
||||
▼ JSON lines store
|
||||
┌─────────────┐
|
||||
│ Run Relevance │◄── Embed + score + rank
|
||||
└─────────────┘
|
||||
│
|
||||
▼ Top 10 items
|
||||
┌─────────────┐
|
||||
│ Run Synthesis │◄── LLM prompt → briefing text
|
||||
└─────────────┘
|
||||
│
|
||||
▼ Markdown + raw text
|
||||
┌─────────────┐
|
||||
│ Run TTS │◄── Text → audio file
|
||||
└─────────────┘
|
||||
│
|
||||
▼ OGG Opus file
|
||||
┌─────────────┐
|
||||
│ Telegram Send │◄── Voice message to channel
|
||||
└─────────────┘
|
||||
│
|
||||
▼
|
||||
Alexander receives daily briefing ☕
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Child Issue Decomposition
|
||||
|
||||
| Child Issue | Scope | Points | Owner | Blocked By |
|
||||
|-------------|-------|--------|-------|------------|
|
||||
| the-nexus#830.1 | Phase 1: arXiv RSS aggregator | 3 | @ezra | None |
|
||||
| the-nexus#830.2 | Phase 1: Blog scrapers (OpenAI, Anthropic, DeepMind) | 2 | TBD | None |
|
||||
| the-nexus#830.3 | Phase 2: Relevance engine + embeddings | 5 | TBD | 830.1, 830.2 |
|
||||
| the-nexus#830.4 | Phase 3: Synthesis prompts + briefing template | 4 | TBD | 830.3 |
|
||||
| the-nexus#830.5 | Phase 4: TTS pipeline (Piper + fallback) | 6 | TBD | 830.4 |
|
||||
| the-nexus#830.6 | Phase 5: Telegram delivery + `/deepdive` command | 4 | TBD | 830.5 |
|
||||
|
||||
**Total**: 24 points (original 21 was optimistic; TTS integration complexity warrants 6 points)
|
||||
|
||||
---
|
||||
|
||||
## Sovereignty Preservation
|
||||
|
||||
| Component | Sovereign Path | Trade-off |
|
||||
|-----------|---------------|-----------|
|
||||
| Source aggregation | RSS (no API keys) | Limited metadata vs API |
|
||||
| Embeddings | nomic-embed-text via llama.cpp | Setup complexity |
|
||||
| LLM synthesis | Gemma 4 via Hermes | Requires local GPU |
|
||||
| TTS | Piper (local, fast) | Quality vs ElevenLabs |
|
||||
| Delivery | Hermes Telegram gateway | Already exists |
|
||||
|
||||
**Fallback Plan**: If local GPU unavailable for synthesis, use Kimi K2.5 via OpenRouter. If Piper quality unacceptable, use ElevenLabs with budget cap.
|
||||
|
||||
---
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
|
||||
the-nexus/
|
||||
├── docs/deep-dive-architecture.md (this file)
|
||||
├── scaffold/deepdive/
|
||||
│ ├── phase1/
|
||||
│ │ ├── arxiv_aggregator.py (proof-of-concept)
|
||||
│ │ ├── blog_scraper.py
|
||||
│ │ └── config.yaml (source URLs, categories)
|
||||
│ ├── phase2/
|
||||
│ │ ├── relevance_engine.py
|
||||
│ │ └── embeddings.py
|
||||
│ ├── phase3/
|
||||
│ │ ├── synthesis.py
|
||||
│ │ └── briefing_template.md
|
||||
│ ├── phase4/
|
||||
│ │ ├── tts_pipeline.py
|
||||
│ │ └── piper_config.json
|
||||
│ └── phase5/
|
||||
│ ├── telegram_delivery.py
|
||||
│ └── deepdive_command.py
|
||||
├── data/deepdive/ (gitignored)
|
||||
│ ├── raw/ # Phase 1 output
|
||||
│ ├── scored/ # Phase 2 output
|
||||
│ ├── briefings/ # Phase 3 output
|
||||
│ └── audio/ # Phase 4 output
|
||||
└── cron/deepdive.sh # Daily runner
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Proof-of-Concept: Phase 1 Stub
|
||||
|
||||
See `scaffold/deepdive/phase1/arxiv_aggregator.py` for immediately executable arXiv RSS fetcher.
|
||||
|
||||
**Zero dependencies beyond stdlib + feedparser** (can use xml.etree if strict).
|
||||
|
||||
**Can run today**: No API keys, no GPU, no TTS decisions needed.
|
||||
|
||||
---
|
||||
|
||||
## Acceptance Criteria Mapping
|
||||
|
||||
| Original Criterion | Implementation | Owner |
|
||||
|-------------------|----------------|-------|
|
||||
| Zero manual copy-paste | RSS aggregation + cron | 830.1, 830.2 |
|
||||
| Daily delivery 6 AM | Cron trigger | 830.6 |
|
||||
| arXiv cs.AI/CL/LG | arXiv RSS categories | 830.1 |
|
||||
| Lab blogs | Blog scrapers | 830.2 |
|
||||
| Relevance ranking | Embedding similarity | 830.3 |
|
||||
| Hermes context | Synthesis prompt injection | 830.4 |
|
||||
| TTS audio | Piper/ElevenLabs | 830.5 |
|
||||
| Telegram voice | Bot integration | 830.6 |
|
||||
| On-demand `/deepdive` | Slash command | 830.6 |
|
||||
|
||||
---
|
||||
|
||||
## Immediate Next Action
|
||||
|
||||
**@ezra** will implement Phase 1 proof-of-concept (`arxiv_aggregator.py`) to validate pipeline architecture and unblock downstream phases.
|
||||
|
||||
**Estimated time**: 2 hours to working fetch+store.
|
||||
|
||||
---
|
||||
|
||||
*Document created during Ezra burn-mode triage of the-nexus#830*
|
||||
Reference in New Issue
Block a user