[ezra] Deep Dive scaffold #830: DEEPSDIVE_ARCHITECTURE.md
docs/DEEPSDIVE_ARCHITECTURE.md
@@ -0,0 +1,88 @@
# Deep Dive — Sovereign NotebookLM Architecture

> Parent: [#830](http://143.198.27.163:3000/Timmy_Foundation/the-nexus/issues/830)
> Status: Architecture committed, awaiting infrastructure decisions
> Owner: @ezra
> Created: 2026-04-05

## Vision

**Deep Dive** is a fully automated daily intelligence briefing system that eliminates the 20+ minutes of manual research overhead. It produces a personalized, AI-generated podcast (or text briefing) with **zero manual input**.

Unlike NotebookLM, which requires manual source curation, Deep Dive collects and filters its sources autonomously.

## Architecture Overview

```
┌──────────────────────────────────────────────────────────────────────────────┐
│                     D E E P   D I V E   P I P E L I N E                      │
├──────────────────────────────────────────────────────────────────────────────┤
│  ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌────────┐  │
│  │ AGGREGATE │──▶│  FILTER   │──▶│ SYNTHESIZE│──▶│   AUDIO   │──▶│DELIVER │  │
│  │ arXiv RSS │   │ Keywords  │   │ LLM brief │   │ TTS voice │   │Telegram│  │
│  └───────────┘   └───────────┘   └───────────┘   └───────────┘   └────────┘  │
└──────────────────────────────────────────────────────────────────────────────┘
```
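The five stages in the diagram compose as a chain in which each stage consumes the previous stage's output. The sketch below illustrates that shape only; `run_pipeline` and the stand-in stage functions are hypothetical names and do not reflect the contents of `bin/deepdive_orchestrator.py`.

```python
from typing import Any, Callable, Sequence

# Hypothetical stage type: each stage receives the previous stage's output.
Stage = Callable[[Any], Any]

def run_pipeline(stages: Sequence[Stage], seed: Any = None) -> Any:
    """Feed the output of each stage into the next, in order."""
    result = seed
    for stage in stages:
        result = stage(result)
    return result

# Stand-in stages, for illustration only.
def aggregate(_):
    return ["Agent tool use survey", "Cooking with cast iron"]

def keep_relevant(items):
    return [title for title in items if "agent" in title.lower()]

def synthesize(items):
    return "HEADLINES\n" + "\n".join(f"- {title}" for title in items)

briefing = run_pipeline([aggregate, keep_relevant, synthesize])
print(briefing)
```

Keeping every stage behind the same call shape would let the audio and delivery stages be added later without changing the controller loop.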

## Phase Specifications

### Phase 1: Aggregate

Fetches items from the arXiv RSS feeds (cs.AI, cs.CL, cs.LG), lab blogs, and newsletters.

**Output**: `List[RawItem]`

**Implementation**: `bin/deepdive_aggregator.py`
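As an illustration of what a source adapter might return, here is a minimal sketch that parses an RSS 2.0 document into `RawItem` records using only the standard library. The field names on `RawItem`, the `parse_rss` function, and the sample feed are assumptions; the document only names the `List[RawItem]` output type.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

@dataclass
class RawItem:
    # Field names are assumptions; the source only names the type.
    title: str
    link: str
    summary: str
    source: str

def parse_rss(xml_text: str, source: str) -> List[RawItem]:
    """Extract every <item> element from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for node in root.iter("item"):
        items.append(RawItem(
            title=(node.findtext("title") or "").strip(),
            link=(node.findtext("link") or "").strip(),
            summary=(node.findtext("description") or "").strip(),
            source=source,
        ))
    return items

# Inline sample feed so the sketch runs without network access.
SAMPLE = """<rss version="2.0"><channel>
<item><title>Tool use in LLM agents</title>
<link>https://example.org/1</link>
<description>A study of agents.</description></item>
</channel></rss>"""

items = parse_rss(SAMPLE, source="arxiv:cs.AI")
print(items[0].title)
```

Fetching the feed itself (for example with `urllib.request.urlopen`) is left out so that parsing can be tested in isolation.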

### Phase 2: Filter

Ranks items by keyword relevance to Hermes/Timmy work.

**Scoring Algorithm (MVP)**:

```python
keywords = ["agent", "llm", "tool use", "rlhf", "alignment"]
text = content.lower()  # content: the item's text; lowercased so "LLM" matches "llm"
score = sum(1 for kw in keywords if kw in text)  # one point per keyword present
```
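To show how the score could drive ranking, the sketch below wraps the rule in a function and sorts a batch of texts by it. The `rank` helper, the `top_n` cutoff, the lowercasing step, and the choice to drop zero-score items are assumptions added for illustration.

```python
from typing import List, Tuple

KEYWORDS = ["agent", "llm", "tool use", "rlhf", "alignment"]

def keyword_score(content: str) -> int:
    """Count how many keywords occur in the text, ignoring case."""
    text = content.lower()
    return sum(1 for kw in KEYWORDS if kw in text)

def rank(texts: List[str], top_n: int = 5) -> List[Tuple[int, str]]:
    """Return up to top_n (score, text) pairs, highest score first."""
    scored = [(keyword_score(t), t) for t in texts]
    relevant = [pair for pair in scored if pair[0] > 0]  # drop non-matches
    # sorted() is stable, so items with equal scores keep their input order.
    return sorted(relevant, key=lambda pair: pair[0], reverse=True)[:top_n]

ranked = rank([
    "RLHF and alignment for LLM agents",
    "A history of typography",
    "Tool use benchmarks",
])
print(ranked)
```

Because matching is by substring, "agent" also matches "agents" and unrelated words such as "reagent"; that is one reason a later version might move to embedding-based filtering.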

### Phase 3: Synthesize

An LLM generates a structured briefing with three sections: HEADLINES, DEEP DIVES, and BOTTOM LINE.
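One way to hold the model to that structure is to name the sections in the prompt and check the reply for them. The sketch below assumes nothing about which LLM is used; the prompt wording and the `build_prompt` and `has_all_sections` helpers are placeholders, not the project's actual prompt.

```python
from typing import List

SECTIONS = ["HEADLINES", "DEEP DIVES", "BOTTOM LINE"]

def build_prompt(item_summaries: List[str]) -> str:
    """Assemble a synthesis prompt that names the required sections."""
    numbered = "\n".join(
        f"{i}. {summary}" for i, summary in enumerate(item_summaries, start=1)
    )
    headings = ", ".join(SECTIONS)
    return (
        "Write a daily briefing from the items below.\n"
        f"Use exactly these section headings, in order: {headings}.\n\n"
        f"Items:\n{numbered}\n"
    )

def has_all_sections(briefing: str) -> bool:
    """Check that each heading appears, in the expected order."""
    pos = -1
    for name in SECTIONS:
        pos = briefing.find(name, pos + 1)
        if pos == -1:
            return False
    return True

prompt = build_prompt(["Paper A on agents", "Paper B on RLHF"])
print(prompt)
```

A failed `has_all_sections` check could trigger a retry before the briefing moves on to the audio stage.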

### Phase 4: Audio

TTS converts the briefing to an MP3 of 10-15 minutes.

**Decision needed**: local (Piper/Coqui) vs. API (ElevenLabs/OpenAI)
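Whichever TTS engine is chosen, the 10-15 minute target implies a word budget for the synthesized briefing. The sketch below assumes a speaking rate of 150 words per minute, which is an illustrative figure and not a measured property of any of the engines listed; the helper names are hypothetical.

```python
from typing import Tuple

ASSUMED_WPM = 150  # assumption: typical narration pace, not engine-specific

def word_budget(min_minutes: float, max_minutes: float,
                words_per_minute: int = ASSUMED_WPM) -> Tuple[int, int]:
    """Return the (low, high) word counts that fill the duration range."""
    return (int(min_minutes * words_per_minute),
            int(max_minutes * words_per_minute))

def estimated_minutes(text: str, words_per_minute: int = ASSUMED_WPM) -> float:
    """Estimate spoken duration from a whitespace word count."""
    return len(text.split()) / words_per_minute

low, high = word_budget(10, 15)
print(low, high)  # 1500 2250
```

Under that assumption, the synthesis stage would aim for roughly 1,500 to 2,250 words.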

### Phase 5: Deliver

A Telegram voice message is delivered at the scheduled time (default 6 AM).
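For reference, delivery through the Telegram Bot API is an HTTPS POST to a method endpoint. The sketch below builds, but does not send, a `sendMessage` request for a text briefing; the `build_send_message` helper and the placeholder token and chat ID are assumptions. Voice delivery would use the Bot API's `sendVoice` method with a file upload, which is not shown here.

```python
import urllib.parse
import urllib.request

API_ROOT = "https://api.telegram.org"

def build_send_message(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a Bot API sendMessage request."""
    body = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{API_ROOT}/bot{token}/sendMessage", data=body, method="POST"
    )

# Placeholder credentials; real values would come from configuration.
req = build_send_message("TOKEN", "12345", "Deep Dive: 3 new items")
print(req.full_url)
# Sending would be: urllib.request.urlopen(req, timeout=30)
```

Separating request construction from sending keeps the delivery step testable without network access or a real bot token.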

## Implementation Path

### MVP (2 hours, Phases 1-2+5)

arXiv RSS → keyword filter → text briefing → Telegram text at 6 AM
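The "text briefing" step of the MVP can be as simple as rendering the ranked items into one message. A minimal sketch follows, assuming items arrive as `(score, title, link)` tuples; the tuple shape, the layout, and the `format_text_briefing` name are illustrative. The 4096-character cap is the Bot API limit for a single text message.

```python
from typing import List, Tuple

TELEGRAM_TEXT_LIMIT = 4096  # Bot API limit for a single text message

def format_text_briefing(ranked: List[Tuple[int, str, str]], date: str) -> str:
    """Render (score, title, link) tuples as a plain-text briefing."""
    lines = [f"Deep Dive briefing for {date}", ""]
    for position, (score, title, link) in enumerate(ranked, start=1):
        lines.append(f"{position}. {title} (score {score})")
        lines.append(f"   {link}")
    text = "\n".join(lines)
    if len(text) > TELEGRAM_TEXT_LIMIT:
        # Truncate rather than split; a fuller version might send several messages.
        text = text[: TELEGRAM_TEXT_LIMIT - 3] + "..."
    return text

briefing = format_text_briefing(
    [(3, "Tool use in LLM agents", "https://example.org/1")], date="2026-04-05"
)
print(briefing)
```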

### V1 (1 week, Phases 1-3+5)

Add LLM synthesis and more sources.

### V2 (2 weeks, Full)

Add TTS audio and embedding-based filtering.

## Integration Points

| System | Integration point | Status |
|--------|-------------------|--------|
| Hermes | `/deepdive` command | Pending |
| timmy-config | `cron/jobs.json` entry | Ready |
| Telegram | Voice delivery | Existing |
| TTS Service | Local vs API | **NEEDS DECISION** |

## Files
- `docs/DEEPSDIVE_ARCHITECTURE.md` — This document
- `bin/deepdive_aggregator.py` — Phase 1 source adapters
- `bin/deepdive_orchestrator.py` — Pipeline controller

## Blockers

| # | Item | Status |
|---|------|--------|
| 1 | TTS Service decision | **NEEDS DECISION** |
| 2 | `/deepdive` command registration | Pending |

**Ezra, Architect**, 2026-04-05