Timmy_Foundation/the-nexus

Ezra 0380bc065e [ezra] Deep Dive scaffold for #830 : DEEPSDIVE_ARCHITECTURE.md

2026-04-05 01:48:58 +00:00

Deep Dive — Sovereign NotebookLM Architecture

Parent: #830
Status: Architecture committed, awaiting infrastructure decisions
Owner: @ezra
Created: 2026-04-05

Vision

Deep Dive is a fully automated daily intelligence briefing system that eliminates the 20+ minute manual research overhead. It produces a personalized AI-generated podcast (or text briefing) with zero manual input.

Unlike NotebookLM, which requires manual source curation, Deep Dive operates autonomously.

Architecture Overview

┌──────────────────────────────────────────────────────────────────────────────┐
│                    D E E P   D I V E   P I P E L I N E                       │
├──────────────────────────────────────────────────────────────────────────────┤
│  ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌────────┐ │
│  │ AGGREGATE │──▶│  FILTER   │──▶│ SYNTHESIZE│──▶│   AUDIO   │──▶│DELIVER │ │
│  │ arXiv RSS │   │ Keywords  │   │ LLM brief │   │ TTS voice │   │Telegram│ │
│  └───────────┘   └───────────┘   └───────────┘   └───────────┘   └────────┘ │
└──────────────────────────────────────────────────────────────────────────────┘

Phase Specifications

Phase 1: Aggregate

Fetches items from arXiv RSS (cs.AI, cs.CL, cs.LG), lab blogs, and newsletters.

Output: List[RawItem]
Implementation: bin/deepdive_aggregator.py
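A minimal sketch of what one source adapter could look like, using only the standard library. The `RawItem` field names, the `parse_rss` helper, and the sample feed are assumptions made for illustration; the document only fixes the output type as `List[RawItem]`. The parser also assumes a plain RSS 2.0 feed with non-namespaced `item` elements, which a real feed may not match.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class RawItem:
    """One fetched entry before filtering (field names are assumptions)."""
    source: str
    title: str
    link: str
    summary: str


def parse_rss(xml_text: str, source: str) -> list[RawItem]:
    """Parse an RSS 2.0 document into RawItem records."""
    root = ET.fromstring(xml_text)
    items = []
    for node in root.iter("item"):
        items.append(
            RawItem(
                source=source,
                title=(node.findtext("title") or "").strip(),
                link=(node.findtext("link") or "").strip(),
                summary=(node.findtext("description") or "").strip(),
            )
        )
    return items


# Hand-written sample feed so the sketch runs without network access.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>cs.AI updates</title>
  <item>
    <title>Tool use in LLM agents</title>
    <link>https://example.org/paper-1</link>
    <description>A study of agent tool use.</description>
  </item>
</channel></rss>"""

items = parse_rss(SAMPLE_FEED, source="arxiv:cs.AI")
```

Keeping the parser separate from the HTTP fetch lets each adapter be tested against a saved feed.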

Phase 2: Filter

Ranks items by keyword relevance to Hermes/Timmy work.

Scoring Algorithm (MVP):

keywords = ["agent", "llm", "tool use", "rlhf", "alignment"]
# Lowercase the text first so "LLM" and "RLHF" match the lowercase keywords.
score = sum(1 for kw in keywords if kw in content.lower())
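The scoring rule above can be wrapped in a function and used to rank a batch of items. This is a sketch under assumptions: the `keyword_score` and `rank` names, the `top_n` cutoff, and the decision to drop zero-score items are not specified in the document. The text is lowercased before matching so that "LLM" counts as a hit for "llm".

```python
KEYWORDS = ["agent", "llm", "tool use", "rlhf", "alignment"]


def keyword_score(content: str, keywords: list[str] = KEYWORDS) -> int:
    """Count how many keywords occur in the text (case-insensitive substring match)."""
    text = content.lower()
    return sum(1 for kw in keywords if kw in text)


def rank(texts: list[str], top_n: int = 10) -> list[tuple[int, str]]:
    """Return up to top_n (score, text) pairs, highest score first, dropping zero scores."""
    scored = [(keyword_score(t), t) for t in texts]
    scored = [pair for pair in scored if pair[0] > 0]
    # sorted() is stable, so items with equal scores keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_n]
```

Substring matching is deliberately crude: "agent" also matches "reagent". Word-boundary matching would tighten this if false positives become a problem.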

Phase 3: Synthesize

LLM generates structured briefing: HEADLINES, DEEP DIVES, BOTTOM LINE.
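One way to hold the model to the three-section layout is to name the headings in the prompt and check the reply for them. The sketch below makes no LLM call. The prompt wording, the `build_prompt` and `missing_sections` helpers, and the item shape (dicts with `title` and `summary` keys) are all assumptions.

```python
SECTIONS = ("HEADLINES", "DEEP DIVES", "BOTTOM LINE")


def build_prompt(items: list[dict[str, str]]) -> str:
    """Assemble a synthesis prompt from ranked items (each with 'title' and 'summary')."""
    sources = "\n".join(
        f"{i}. {item['title']}: {item['summary']}" for i, item in enumerate(items, start=1)
    )
    layout = "\n".join(f"{name}:" for name in SECTIONS)
    return (
        "Write a daily briefing from the sources below.\n"
        f"Use exactly these section headings, in this order:\n{layout}\n\n"
        f"Sources:\n{sources}"
    )


def missing_sections(briefing: str) -> list[str]:
    """Return the required headings that the model's output lacks."""
    return [name for name in SECTIONS if name not in briefing]
```

If `missing_sections` returns a non-empty list, the orchestrator could retry or fall back to the plain-text briefing instead of delivering a malformed one.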

Phase 4: Audio

TTS converts briefing to MP3 (10-15 min).

Decision needed: local (Piper/Coqui) vs. API (ElevenLabs/OpenAI)
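Until the backend is chosen, the rest of the pipeline could code against a small interface. The sketch below also turns the 10-15 minute target into a word budget for Phase 3. The 150 words-per-minute pace is an assumed figure, and `TTSBackend`, `SilentBackend`, and `render_audio` are hypothetical names; none of them mirrors the API of Piper, Coqui, ElevenLabs, or OpenAI.

```python
from typing import Protocol

WORDS_PER_MINUTE = 150  # assumed narration pace, not a figure from this document


def word_budget(min_minutes: int = 10, max_minutes: int = 15) -> tuple[int, int]:
    """Translate the target audio length into a word-count range for the briefing."""
    return (min_minutes * WORDS_PER_MINUTE, max_minutes * WORDS_PER_MINUTE)


class TTSBackend(Protocol):
    """Hypothetical seam: any local or API engine that turns text into MP3 bytes."""

    def synthesize(self, text: str) -> bytes: ...


class SilentBackend:
    """Stand-in used for tests; returns no real audio."""

    def synthesize(self, text: str) -> bytes:
        return b""


def render_audio(briefing: str, backend: TTSBackend) -> bytes:
    """Run Phase 4 against whichever backend is chosen."""
    return backend.synthesize(briefing)
```

With this seam, the local-versus-API decision changes only which class is passed to `render_audio`.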

Phase 5: Deliver

Delivers the briefing as a Telegram voice message at a scheduled time (default 6 AM).
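For illustration, the "next 6 AM" calculation can be sketched with the standard library. The `next_delivery` helper is hypothetical; in practice the schedule presumably lives in the `cron/jobs.json` entry listed under Integration Points, whose format is not shown here.

```python
from datetime import datetime, time, timedelta


def next_delivery(now: datetime, at: time = time(6, 0)) -> datetime:
    """Return the next occurrence of the delivery time strictly after `now`."""
    candidate = datetime.combine(now.date(), at, tzinfo=now.tzinfo)
    if candidate <= now:
        # Today's slot has already passed (or is exactly now), so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate
```

The result keeps whatever timezone `now` carries, so the caller decides whether 6 AM means local time or UTC.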

Implementation Path

MVP (2 hours, Phases 1-2+5)

arXiv RSS → keyword filter → text briefing → Telegram text at 6 AM
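The last two steps of the MVP, rendering a text briefing and preparing it for Telegram, might look like the sketch below. The briefing layout and all three helper names are assumptions. The 4096-character cap is the Bot API limit for a single text message, and `chat_id` and `text` are the required `sendMessage` parameters; the HTTP call itself is left out.

```python
TELEGRAM_TEXT_LIMIT = 4096  # maximum characters per Bot API text message


def format_briefing(ranked: list[tuple[int, str, str]]) -> str:
    """Render (score, title, link) tuples as a plain-text briefing."""
    lines = ["Deep Dive briefing", ""]
    for score, title, link in ranked:
        lines.append(f"[{score}] {title}")
        lines.append(link)
        lines.append("")
    return "\n".join(lines).rstrip()


def split_for_telegram(text: str, limit: int = TELEGRAM_TEXT_LIMIT) -> list[str]:
    """Split text on line boundaries so each chunk fits in one message."""
    chunks, current = [], ""
    for line in text.split("\n"):
        line = line[:limit]  # a single oversized line is truncated in this sketch
        candidate = f"{current}\n{line}" if current else line
        if len(candidate) > limit:
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def send_message_payload(chat_id: str, text: str) -> dict[str, str]:
    """Build the JSON body for the Bot API sendMessage method."""
    return {"chat_id": chat_id, "text": text}
```

Each chunk from `split_for_telegram` would be sent as its own message, in order.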

V1 (1 week, Phases 1-3+5)

Add LLM synthesis, more sources

V2 (2 weeks, Full)

Add TTS audio, embedding-based filtering

Integration Points

| System | Point | Status |
|--------|-------|--------|
| Hermes | `/deepdive` command | Pending |
| timmy-config | `cron/jobs.json` entry | Ready |
| Telegram | Voice delivery | Existing |
| TTS Service | Local vs API | NEEDS DECISION |

Files

docs/DEEPSDIVE_ARCHITECTURE.md — This document
bin/deepdive_aggregator.py — Phase 1 source adapters
bin/deepdive_orchestrator.py — Pipeline controller

Blockers

| # | Item | Status |
|---|------|--------|
| 1 | TTS Service decision | NEEDS DECISION |
| 2 | `/deepdive` command registration | Pending |

Ezra, Architect — 2026-04-05
