Deep Dive — Sovereign NotebookLM Architecture

Parent: #830
Status: Architecture committed, awaiting infrastructure decisions
Owner: @ezra
Created: 2026-04-05

Vision

Deep Dive is a fully automated daily intelligence briefing system that eliminates the 20+ minute manual research overhead. It produces a personalized AI-generated podcast (or text briefing) with zero manual input.

Unlike NotebookLM, which requires manual source curation, Deep Dive operates autonomously.

Architecture Overview

┌──────────────────────────────────────────────────────────────────────────────┐
│                    D E E P   D I V E   P I P E L I N E                       │
├──────────────────────────────────────────────────────────────────────────────┤
│  ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌────────┐ │
│  │ AGGREGATE │──▶│  FILTER   │──▶│ SYNTHESIZE│──▶│   AUDIO   │──▶│DELIVER │ │
│  │ arXiv RSS │   │ Keywords  │   │ LLM brief │   │ TTS voice │   │Telegram│ │
│  └───────────┘   └───────────┘   └───────────┘   └───────────┘   └────────┘ │
└──────────────────────────────────────────────────────────────────────────────┘
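The diagram describes a linear chain in which each stage consumes the previous stage's output. A minimal sketch of that control flow follows. The `run_pipeline` helper and the stand-in stages are hypothetical and are not taken from bin/deepdive_orchestrator.py.

```python
from typing import Any, Callable, Sequence

# Hypothetical type: every stage takes the previous stage's output.
Stage = Callable[[Any], Any]


def run_pipeline(stages: Sequence[tuple[str, Stage]], seed: Any = None) -> Any:
    """Run stages in order, feeding each result into the next stage."""
    result = seed
    for name, stage in stages:
        try:
            result = stage(result)
        except Exception as exc:
            # Name the failing phase so a scheduled run is easy to debug.
            raise RuntimeError(f"stage {name!r} failed") from exc
    return result


# Stand-in stages for illustration only; real ones would do I/O.
demo_stages = [
    ("aggregate", lambda _: ["paper on llm agents", "paper on graph theory"]),
    ("filter", lambda items: [i for i in items if "llm" in i]),
    ("synthesize", lambda items: "HEADLINES\n" + "\n".join(items)),
    ("deliver", lambda text: {"sent": True, "text": text}),
]

if __name__ == "__main__":
    print(run_pipeline(demo_stages))
```

Passing stages as a list of named callables keeps the order explicit, so a phase can be dropped by removing its entry. The MVP, for example, would run without the audio stage.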

Phase Specifications

Phase 1: Aggregate

Fetches items from arXiv RSS feeds (cs.AI, cs.CL, cs.LG), lab blogs, and newsletters.

Output: List[RawItem]
Implementation: bin/deepdive_aggregator.py
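As an illustration of the aggregation step, the sketch below parses an RSS document into a list of items using only the standard library. The `RawItem` fields, the `parse_rss` helper, and the sample feed are assumptions; the source names the `RawItem` type but does not define it. The sketch also assumes a plain RSS 2.0 feed whose `<item>` elements carry no XML namespace.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class RawItem:
    # Field names are assumptions; the source only names the type.
    source: str
    title: str
    link: str
    summary: str


def parse_rss(xml_text: str, source: str) -> list[RawItem]:
    """Extract one RawItem per <item> element of an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for node in root.iter("item"):
        items.append(
            RawItem(
                source=source,
                title=(node.findtext("title") or "").strip(),
                link=(node.findtext("link") or "").strip(),
                summary=(node.findtext("description") or "").strip(),
            )
        )
    return items


# Illustrative feed content, not real arXiv data.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>cs.AI</title>
<item><title>Tool use in LLM agents</title>
<link>https://example.org/abs/0001</link>
<description>An illustrative abstract.</description></item>
</channel></rss>"""

if __name__ == "__main__":
    for item in parse_rss(SAMPLE_FEED, source="arxiv:cs.AI"):
        print(item)
```

Keeping parsing separate from fetching lets each source adapter be tested against saved feed text without network access.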

Phase 2: Filter

Ranks items by keyword relevance to Hermes/Timmy work.

Scoring Algorithm (MVP):

keywords = ["agent", "llm", "tool use", "rlhf", "alignment"]
# content is the item's text; lowercase it so "LLM" and "RLHF" still match
score = sum(1 for kw in keywords if kw in content.lower())
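The scoring rule can be wrapped into a small ranking helper. This is a sketch under assumptions: the `keyword_score` and `rank` names and the `min_score` threshold are hypothetical, and matching is made case-insensitive so that titles such as "RLHF for LLM agents" are counted.

```python
KEYWORDS = ["agent", "llm", "tool use", "rlhf", "alignment"]


def keyword_score(text: str, keywords: list[str] = KEYWORDS) -> int:
    """Count how many keywords occur as substrings, ignoring case."""
    lowered = text.lower()
    return sum(1 for kw in keywords if kw in lowered)


def rank(texts: list[str], min_score: int = 1) -> list[tuple[int, str]]:
    """Return (score, text) pairs at or above min_score, best first."""
    scored = [(keyword_score(t), t) for t in texts]
    kept = [pair for pair in scored if pair[0] >= min_score]
    # sorted() is stable, so equal scores keep their input order.
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    print(rank(["RLHF for LLM agents", "Graph coloring bounds", "Alignment survey"]))
```

Substring matching is deliberately crude: "agent" also matches "reagent". Each keyword counts at most once per item, however often it appears.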

Phase 3: Synthesize

LLM generates structured briefing: HEADLINES, DEEP DIVES, BOTTOM LINE.
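One way to hold the model to the three-section structure is to name the headings in the prompt and check the reply for them. The sketch below assumes items arrive as dicts with `title` and `summary` keys. The prompt wording and both helper names are hypothetical. No particular LLM client is assumed, so the model call itself is left out.

```python
SECTIONS = ("HEADLINES", "DEEP DIVES", "BOTTOM LINE")


def build_prompt(items: list[dict[str, str]]) -> str:
    """Build an LLM prompt asking for the three briefing sections."""
    sources = "\n".join(
        f"{n}. {item['title']}: {item['summary']}"
        for n, item in enumerate(items, start=1)
    )
    layout = "\n".join(f"{name}:" for name in SECTIONS)
    return (
        "Write a daily briefing from the sources below.\n"
        f"Use exactly these section headings, in order:\n{layout}\n\n"
        f"Sources:\n{sources}\n"
    )


def missing_sections(briefing: str) -> list[str]:
    """List required headings absent from the model's reply."""
    return [name for name in SECTIONS if name not in briefing]


if __name__ == "__main__":
    print(build_prompt([{"title": "Tool use in agents", "summary": "A survey."}]))
```

A non-empty result from `missing_sections` could trigger a retry or a fallback to the plain filtered list.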

Phase 4: Audio

TTS converts briefing to MP3 (10-15 min).

Decision needed: Local (Piper/coqui) vs API (ElevenLabs/OpenAI)
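Whichever TTS backend is chosen, the 10-15 minute target implies a script length that the synthesis phase has to hit. The arithmetic is sketched below, assuming a speaking rate of 150 words per minute. That rate is an assumption, not a figure from this document, and real TTS voices vary.

```python
def word_budget(min_minutes: float, max_minutes: float, wpm: int = 150) -> tuple[int, int]:
    """Convert a target audio duration into a script word-count range."""
    if min_minutes > max_minutes:
        raise ValueError("min_minutes must not exceed max_minutes")
    return int(min_minutes * wpm), int(max_minutes * wpm)


def estimated_minutes(script: str, wpm: int = 150) -> float:
    """Estimate spoken duration of a script from its word count."""
    return len(script.split()) / wpm


if __name__ == "__main__":
    low, high = word_budget(10, 15)
    print(f"Target script length: {low}-{high} words")
```

Under that assumption the briefing should run roughly 1500 to 2250 words, which can be passed to the synthesis prompt as a length constraint.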

Phase 5: Deliver

Telegram voice message delivered at scheduled time (default 6 AM).
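The scheduled-time behavior can be illustrated with a helper that computes the next delivery instant. This is only a sketch of the default 6 AM rule. The `next_delivery` name is hypothetical, and the actual scheduling is expected to come from the cron entry listed under Integration Points.

```python
from datetime import datetime, time, timedelta


def next_delivery(now: datetime, at: time = time(6, 0)) -> datetime:
    """Return the next occurrence of the delivery time strictly after now."""
    # Reuse now's tzinfo so naive and aware datetimes both work.
    candidate = datetime.combine(now.date(), at, tzinfo=now.tzinfo)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_delivery(datetime.now()))
```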

Implementation Path

MVP (2 hours, Phases 1, 2, and 5)

arXiv RSS → keyword filter → text briefing → Telegram text at 6 AM
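For the text-only MVP, the last two steps reduce to rendering the ranked items as plain text and keeping each message within Telegram's 4096-character limit for a text message. A minimal sketch follows. The `format_briefing` and `split_message` helpers and the briefing layout are assumptions, and the actual Bot API call is omitted.

```python
TELEGRAM_TEXT_LIMIT = 4096  # Bot API cap on sendMessage text length


def format_briefing(items: list[tuple[str, str]]) -> str:
    """Render (title, link) pairs as a numbered plain-text briefing."""
    lines = ["Deep Dive briefing"]
    for n, (title, link) in enumerate(items, start=1):
        lines.append(f"{n}. {title}\n   {link}")
    return "\n".join(lines)


def split_message(text: str, limit: int = TELEGRAM_TEXT_LIMIT) -> list[str]:
    """Split text on line boundaries into chunks of at most limit chars."""
    chunks, current = [], ""
    for line in text.splitlines():
        # Hard-wrap any single line that is itself over the limit.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        candidate = f"{current}\n{line}" if current else line
        if len(candidate) > limit:
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    briefing = format_briefing([("Tool use in agents", "https://example.org/abs/0001")])
    for chunk in split_message(briefing):
        print(chunk)
```

Splitting on line boundaries keeps each numbered entry readable when a long briefing spans several messages.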

V1 (1 week, Phases 1-3+5)

Add LLM synthesis, more sources

V2 (2 weeks, Full)

Add TTS audio, embedding-based filtering
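Embedding-based filtering would replace the keyword count with a similarity score between each item's vector and a vector describing the topics of interest. The sketch below shows only the ranking step, using cosine similarity on hand-written toy vectors. The embedding model is left unspecified, and the function names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(
    query: list[float], items: dict[str, list[float]]
) -> list[tuple[str, float]]:
    """Sort item names by similarity of their vector to the query vector."""
    scored = [(name, cosine(query, vec)) for name, vec in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy 2-d vectors stand in for real embeddings.
    toy = {"on_topic": [2.0, 0.0], "off_topic": [0.0, 3.0], "mixed": [1.0, 1.0]}
    print(rank_by_similarity([1.0, 0.0], toy))
```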

Integration Points

| System       | Point                | Status         |
|--------------|----------------------|----------------|
| Hermes       | /deepdive command    | Pending        |
| timmy-config | cron/jobs.json entry | Ready          |
| Telegram     | Voice delivery       | Existing       |
| TTS Service  | Local vs API         | NEEDS DECISION |

Files

  • docs/DEEPSDIVE_ARCHITECTURE.md — This document
  • bin/deepdive_aggregator.py — Phase 1 source adapters
  • bin/deepdive_orchestrator.py — Pipeline controller

Blockers

| # | Item                           | Status         |
|---|--------------------------------|----------------|
| 1 | TTS Service decision           | NEEDS DECISION |
| 2 | /deepdive command registration | Pending        |

Ezra, Architect — 2026-04-05