the-nexus/research/deep-dive/ARCHITECTURE.md
Ezra 6aaf04dc04 [#830] Deep Dive architecture scaffold - ARCHITECTURE.md
Full system design for automated daily AI intelligence briefing:
- 5-phase pipeline: Aggregate → Rank → Synthesize → Narrate → Deliver
- Source coverage: ArXiv, lab blogs, newsletters
- TTS options: Piper (sovereign) / ElevenLabs (cloud)
- Story points: 21 (broken down by phase)
2026-04-05 03:31:04 +00:00

Deep Dive: Sovereign NotebookLM + Daily AI Intelligence Briefing

Issue: #830
Type: EPIC (21 story points)
Owner: Ezra (assigned by Alexander)
Status: Architecture complete → Phase 1 ready for implementation


Vision

A fully automated daily intelligence briefing system that delivers a personalized AI-generated podcast briefing with zero manual input.

Inspiration: NotebookLM workflow (ingest → rank → synthesize → narrate → deliver) — but automated, scheduled, and sovereign.


5-Phase Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                         DEEP DIVE PIPELINE                              │
├───────────────┬───────────────┬───────────────┬───────────────┬─────────┤
│   PHASE 1     │   PHASE 2     │   PHASE 3     │   PHASE 4     │ PHASE 5 │
├───────────────┼───────────────┼───────────────┼───────────────┼─────────┤
│  AGGREGATE    │    RANK       │  SYNTHESIZE   │   NARRATE     │ DELIVER │
├───────────────┼───────────────┼───────────────┼───────────────┼─────────┤
│ ArXiv RSS     │ Embedding     │ LLM briefing  │ TTS engine    │Telegram │
│ Lab feeds     │ similarity    │ generator     │ (Piper /      │ voice   │
│ Newsletters   │ vs codebase   │               │ ElevenLabs)   │ message │
│ HackerNews    │               │               │               │         │
└───────────────┴───────────────┴───────────────┴───────────────┴─────────┘

Timeline: 05:00  →  05:15  →  05:30  →  05:45  →  06:00
          Fetch    Score    Generate   Audio      Deliver

Phase 1: Source Aggregation (5 points)

Data Sources

| Source | URL/API | Frequency | Priority |
|---|---|---|---|
| ArXiv cs.AI | http://export.arxiv.org/rss/cs.AI | Daily 5 AM | P1 |
| ArXiv cs.CL | http://export.arxiv.org/rss/cs.CL | Daily 5 AM | P1 |
| ArXiv cs.LG | http://export.arxiv.org/rss/cs.LG | Daily 5 AM | P1 |
| OpenAI Blog | https://openai.com/blog/rss.xml | Daily 5 AM | P1 |
| Anthropic | https://www.anthropic.com/blog/rss.xml | Daily 5 AM | P1 |
| DeepMind | https://deepmind.google/blog/rss.xml | Daily 5 AM | P2 |
| Google Research | https://research.google/blog/rss.xml | Daily 5 AM | P2 |
| Import AI | Newsletter (email/IMAP) | Daily 5 AM | P2 |
| TLDR AI | https://tldr.tech/ai/rss | Daily 5 AM | P2 |
| HackerNews | https://hnrss.org/newest?points=100 | Daily 5 AM | P3 |

Storage Format

{
  "fetched_at": "2025-01-15T05:00:00Z",
  "source": "arxiv_cs_ai",
  "items": [
    {
      "id": "arxiv:2501.01234",
      "title": "Attention is All You Need: The Sequel",
      "abstract": "...",
      "url": "https://arxiv.org/abs/2501.01234",
      "authors": ["..."],
      "published": "2025-01-14",
      "raw_text": "title + abstract"
    }
  ]
}

Output

data/deep-dive/raw/YYYY-MM-DD-{source}.jsonl
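
As a rough illustration of how a fetched feed could be mapped onto the storage format above, here is a minimal sketch using only the standard library. It assumes a plain RSS 2.0 feed with un-namespaced `item` elements; `parse_rss`, `iso_date`, and `to_jsonl_line` are hypothetical helper names, and author extraction is left out because it is feed-specific.

```python
import json
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def iso_date(pub_date: str) -> str:
    """Convert an RFC 822 pubDate to YYYY-MM-DD; return the input if unparsable."""
    try:
        return parsedate_to_datetime(pub_date).date().isoformat()
    except (TypeError, ValueError):
        return pub_date


def parse_rss(xml_text: str, source: str) -> dict:
    """Build one storage record (fetched_at, source, items) from RSS 2.0 text."""
    root = ET.fromstring(xml_text)
    items = []
    for node in root.iter("item"):
        title = (node.findtext("title") or "").strip()
        abstract = (node.findtext("description") or "").strip()
        link = (node.findtext("link") or "").strip()
        items.append({
            "id": (node.findtext("guid") or link).strip(),  # assumption: guid, else link
            "title": title,
            "abstract": abstract,
            "url": link,
            "authors": [],  # assumption: left empty in this sketch
            "published": iso_date((node.findtext("pubDate") or "").strip()),
            "raw_text": f"{title} {abstract}".strip(),
        })
    return {
        "fetched_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "source": source,
        "items": items,
    }


def to_jsonl_line(record: dict) -> str:
    """Serialize a record as a single line suitable for a .jsonl file."""
    return json.dumps(record, ensure_ascii=False)


SAMPLE = """<rss version="2.0"><channel>
<item><title>Example paper</title><link>https://arxiv.org/abs/2501.01234</link>
<description>An abstract.</description><guid>arxiv:2501.01234</guid>
<pubDate>Tue, 14 Jan 2025 00:00:00 GMT</pubDate></item>
</channel></rss>"""

record = parse_rss(SAMPLE, "arxiv_cs_ai")
print(to_jsonl_line(record))
```

Writing each record as one JSON line keeps the file valid JSONL even when a source is fetched more than once per day.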


Phase 2: Relevance Engine (6 points)

Scoring Approach

Multi-factor relevance score (0-100). Each factor is normalized to 0-1 and the weights sum to 1.0, so the weighted sum is scaled by 100:

score = 100 * (
    embedding_similarity * 0.40 +    # Cosine sim vs Hermes codebase
    keyword_match_score * 0.30 +     # Title/abstract keyword hits
    source_priority * 0.15 +         # ArXiv cs.AI = 1.0, HN = 0.3
    recency_boost * 0.10 +           # Today = 1.0, -0.1 per day
    user_feedback * 0.05             # Past thumbs up/down
)
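
The weighting above could be implemented roughly as follows. This is a sketch under two assumptions not stated in the design: every factor is clamped to 0-1 before weighting, and the recency boost is floored at 0 so old items are never penalized below zero. `relevance_score` and `recency_boost` are illustrative names.

```python
from datetime import date

# Weights taken from the scoring formula; they sum to 1.0.
WEIGHTS = {
    "embedding_similarity": 0.40,
    "keyword_match_score": 0.30,
    "source_priority": 0.15,
    "recency_boost": 0.10,
    "user_feedback": 0.05,
}


def recency_boost(published: date, today: date) -> float:
    """1.0 for items published today, minus 0.1 per day of age, floored at 0."""
    age_days = max((today - published).days, 0)
    return max(1.0 - 0.1 * age_days, 0.0)


def relevance_score(factors: dict) -> float:
    """Weighted sum of the factors (each clamped to 0-1), scaled to 0-100."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(max(factors.get(name, 0.0), 0.0), 1.0)
        total += weight * value
    return round(100 * total, 2)


example = relevance_score({
    "embedding_similarity": 0.8,
    "keyword_match_score": 0.5,
    "source_priority": 1.0,
    "recency_boost": recency_boost(date(2025, 1, 14), date(2025, 1, 15)),
    "user_feedback": 0.0,
})
print(example)
```

Missing factors default to 0, so an item with no feedback history simply forgoes that 5% of the score.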

Keyword Priority List

high_value:
  - "transformer"
  - "attention mechanism"
  - "large language model"
  - "LLM"
  - "agent"
  - "multi-agent"
  - "reasoning"
  - "chain-of-thought"
  - "RLHF"
  - "fine-tuning"
  - "retrieval augmented"
  - "RAG"
  - "vector database"
  - "embedding"
  - "tool use"
  - "function calling"

medium_value:
  - "BERT"
  - "GPT"
  - "training efficiency"
  - "inference optimization"
  - "quantization"
  - "distillation"
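
One way to turn the keyword lists into the 0-1 `keyword_match_score` factor is sketched below. The design does not specify how hits are weighted, so the values here are assumptions: a high-value hit counts 1.0, a medium-value hit 0.5, and the score saturates at three points. Matching is on whole words (with an optional plural "s") so that "RAG" does not fire on "fragment". Only a subset of the lists is shown.

```python
import re

# Subset of the keyword lists above, for brevity.
HIGH_VALUE = ["LLM", "agent", "RAG", "tool use"]
MEDIUM_VALUE = ["quantization", "distillation"]


def count_hits(keywords, text):
    """Count keywords that occur as whole words or phrases, case-insensitively."""
    return sum(
        1
        for kw in keywords
        if re.search(rf"(?<!\w){re.escape(kw)}s?(?!\w)", text, re.IGNORECASE)
    )


def keyword_match_score(text, saturation=3.0):
    """Return a 0-1 score; assumed weights: high = 1.0, medium = 0.5."""
    points = count_hits(HIGH_VALUE, text) + 0.5 * count_hits(MEDIUM_VALUE, text)
    return min(points / saturation, 1.0)


print(keyword_match_score("Multi-agent tool use with a quantized LLM"))
```

Each keyword counts at most once per item, which keeps a long abstract that repeats one term from outranking a short one that covers several topics.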

Vector Database Decision Matrix

| Option | Pros | Cons | Recommendation |
|---|---|---|---|
| Chroma | SQLite-backed, zero ops, local | Scales to ~1M docs max | Default |
| PostgreSQL + pgvector | Enterprise proven, ACID | Requires Postgres | If Nexus uses Postgres |
| FAISS (in-memory) | Fastest search | Rebuild daily | Budget option |

Output

data/deep-dive/scored/YYYY-MM-DD-ranked.json

Top 10 items selected for synthesis.
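
Independently of which vector store is chosen, the similarity and selection steps reduce to a cosine computation and a top-k cut. The following sketch uses plain Python lists for the vectors; in practice the embeddings would come from the chosen store or from sentence-transformers. The function names and the `score` field are assumptions for illustration.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(items, k=10):
    """Return the k highest-scoring items, best first."""
    return heapq.nlargest(k, items, key=lambda item: item["score"])


ranked = top_k(
    [{"id": "a", "score": 41.0}, {"id": "b", "score": 87.5}, {"id": "c", "score": 63.2}],
    k=2,
)
print([item["id"] for item in ranked])
```

Cosine similarity ranges from -1 to 1, so negative values need clamping to 0 before they enter the weighted score.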


Phase 3: Synthesis Engine (3 points)

Prompt Architecture

You are Deep Dive, a technical intelligence briefing AI for the Hermes/Timmy
agent system. Your audience is an AI agent builder working on sovereign,
local-first AI infrastructure.

SOURCE MATERIAL:
{ranked_items}

GENERATE:
1. **Headlines** (3 bullets): Key announcements in 20 words each
2. **Deep Dives** (2-3): Important papers with technical summary and
   implications for agent systems
3. **Quick Hits** (3-5): Brief mentions worth knowing
4. **Context Bridge**: Connect to Hermes/Timmy current work
   - Mention if papers relate to RL training, tool calling, local inference,
     or multi-agent coordination

TONE: Professional, concise, technically precise
TARGET LENGTH: 1500-2250 words (10-15 min spoken at ~150 WPM)
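
Filling the `{ranked_items}` placeholder might look like the sketch below. The per-item layout (numbered title, URL, abstract) is an assumption, as are the helper names. `str.replace` is used instead of `str.format` so that any other braces in the template are left untouched.

```python
def format_items(items):
    """Render ranked items as numbered blocks of title, URL, and abstract."""
    blocks = []
    for index, item in enumerate(items, start=1):
        blocks.append(f"[{index}] {item['title']} ({item['url']})\n{item['abstract']}")
    return "\n\n".join(blocks)


def render_prompt(template, items):
    """Substitute the formatted items for the {ranked_items} placeholder."""
    return template.replace("{ranked_items}", format_items(items))


prompt = render_prompt(
    "SOURCE MATERIAL:\n{ranked_items}",
    [{"title": "Example paper", "url": "https://arxiv.org/abs/2501.01234", "abstract": "..."}],
)
print(prompt)
```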

Output Format (Markdown)

# Deep Dive: YYYY-MM-DD

## Headlines
- [Item 1]
- [Item 2]
- [Item 3]

## Deep Dives

### [Paper Title]
**Source**: ArXiv cs.AI | **Authors**: [...]

[Technical summary]

**Why it matters for Hermes**: [...]

## Quick Hits
- [...]

## Context Bridge
[Connection to current work]

Output

data/deep-dive/briefings/YYYY-MM-DD-briefing.md


Phase 4: Audio Generation (4 points)

TTS Engine Options

| Engine | Cost | Quality | Latency | Sovereignty |
|---|---|---|---|---|
| Piper (local) | Free | Good | Medium | 100% |
| Coqui TTS (local) | Free | Medium-High | High | 100% |
| ElevenLabs API | $0.05/min | Excellent | Low | Cloud |
| OpenAI TTS | $0.015/min | Excellent | Low | Cloud |
| Google Cloud TTS | $0.004/min | Good | Low | Cloud |

Recommendation

Hybrid approach:

  • Default: Piper (on-device, sovereign)
  • Override flag: ElevenLabs/OpenAI for special episodes

Piper Configuration

# High-quality English voice
model = "en_US-lessac-high"

# Speaking rate: ~150 WPM for technical content
length_scale = 1.1

# Output format: Piper itself writes WAV, so transcode with ffmpeg
output_format = "mp3"  # 128kbps
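
A minimal sketch of driving these settings from Python follows. It assumes the `piper` and `ffmpeg` executables are on PATH and that the voice has been downloaded as a local `.onnx` file; the file names and helper functions are hypothetical. Piper reads text on standard input and writes WAV, so the MP3 step is a separate ffmpeg call.

```python
import subprocess


def piper_command(model_path, wav_path, length_scale=1.1):
    """Build the Piper CLI invocation; the text is supplied on stdin."""
    return [
        "piper",
        "--model", model_path,
        "--length_scale", str(length_scale),
        "--output_file", wav_path,
    ]


def mp3_command(wav_path, mp3_path, bitrate="128k"):
    """Build the ffmpeg invocation that transcodes WAV to MP3."""
    return ["ffmpeg", "-y", "-i", wav_path, "-codec:a", "libmp3lame", "-b:a", bitrate, mp3_path]


def narrate(text, model_path, wav_path, mp3_path):
    """Synthesize speech with Piper, then transcode; raises if either tool fails."""
    subprocess.run(piper_command(model_path, wav_path), input=text.encode("utf-8"), check=True)
    subprocess.run(mp3_command(wav_path, mp3_path), check=True)


print(" ".join(piper_command("en_US-lessac-high.onnx", "speech.wav")))
```

Keeping the command construction separate from execution makes the flags easy to unit-test without either tool installed.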

Audio Enhancement

# Add intro/outro jingles
ffmpeg -i intro.mp3 -i speech.mp3 -i outro.mp3 \
       -filter_complex "[0:a][1:a][2:a]concat=n=3:v=0:a=1" \
       YYYY-MM-DD-deep-dive.mp3

Output

data/deep-dive/audio/YYYY-MM-DD-deep-dive.mp3 (12-18 MB)
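
Episode length and file size follow directly from the word count, the speaking rate, and the bitrate. The sketch below shows the arithmetic, which also yields a duration estimate for delivery; the function names are illustrative, and jingles are not counted.

```python
def spoken_minutes(words, wpm=150):
    """Approximate speaking time in minutes at a given words-per-minute rate."""
    return words / wpm


def mp3_megabytes(minutes, kbps=128):
    """Approximate size of a constant-bitrate MP3 in megabytes (10^6 bytes)."""
    bits = minutes * 60 * kbps * 1000
    return bits / 8 / 1_000_000


minutes = spoken_minutes(1500)
print(minutes, round(minutes * 60), round(mp3_megabytes(minutes), 1))
```

At 128 kbps one minute of audio is just under 1 MB, so a 12-18 MB file corresponds to roughly 12.5 to 19 minutes.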


Phase 5: Delivery Pipeline (3 points)

Cron Schedule

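# Daily at 6:00 AM EST (assumes the host clock is set to US Eastern time)
0 6 * * * cd /path/to/deep-dive && ./run-daily.sh

# Or: staggered phases for visibility
0 5 * * * cd /path/to/deep-dive && ./phase1-fetch.sh
15 5 * * * cd /path/to/deep-dive && ./phase2-rank.sh
30 5 * * * cd /path/to/deep-dive && ./phase3-synthesize.sh
45 5 * * * cd /path/to/deep-dive && ./phase4-narrate.sh
0 6 * * * cd /path/to/deep-dive && ./phase5-deliver.sh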
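
With staggered cron entries a failed phase does not stop the later ones from running on stale input. A single orchestrator can avoid that by running the phases in order and halting at the first failure. The sketch below is one possible shape for that logic, written in Python for illustration rather than as the actual `run-daily.sh`; it assumes the script names from the directory structure and that each script exits non-zero on failure.

```python
import subprocess
import sys

PHASES = [
    "phase1-aggregate.py",
    "phase2-rank.py",
    "phase3-synthesize.py",
    "phase4-narrate.py",
    "phase5-deliver.py",
]


def run_script(script):
    """Run one phase script with the current interpreter; return its exit code."""
    return subprocess.run([sys.executable, f"scripts/{script}"]).returncode


def run_pipeline(phases=PHASES, run=run_script):
    """Run phases in order; return the first failing script name, or None."""
    for script in phases:
        if run(script) != 0:
            return script
    return None


if __name__ == "__main__":
    failed = run_pipeline()
    if failed:
        print(f"Deep Dive stopped at {failed}", file=sys.stderr)
        sys.exit(1)
```

The `run` parameter exists so the sequencing can be tested with a stub instead of real subprocesses.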

Telegram Integration

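# Via Hermes gateway or direct bot
# bot, TELEGRAM_HOME_CHANNEL, date, headline_summary, and estimated_seconds
# are supplied by the caller; the file is closed once the upload finishes
with open(f"data/deep-dive/audio/{date}-deep-dive.mp3", "rb") as audio:
    bot.send_voice(
        chat_id=TELEGRAM_HOME_CHANNEL,
        voice=audio,
        caption=f"📻 Deep Dive for {date}: {headline_summary}",
        duration=estimated_seconds,
    )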

On-Demand Command

/deepdive [date]

# Fetches briefing for specified date (default: today)
# If audio exists: sends voice message
# If not: generates on-demand (may take 2-3 min)
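
The on-demand behaviour described above can be sketched as a small handler. The `send_voice` and `generate` callables stand in for the Telegram delivery and the synthesis pipeline respectively; both, along with the function names, are assumptions for illustration. Dates are expected in YYYY-MM-DD form.

```python
from datetime import date
from pathlib import Path


def parse_deepdive_command(text, today):
    """Return the requested date, defaulting to today when no argument is given."""
    parts = text.split()
    if not parts or parts[0] != "/deepdive":
        raise ValueError("not a /deepdive command")
    if len(parts) == 1:
        return today
    return date.fromisoformat(parts[1])  # raises ValueError on a malformed date


def audio_path(day, root=Path("data/deep-dive/audio")):
    """Location of the episode for a given day, following the Phase 4 output name."""
    return root / f"{day.isoformat()}-deep-dive.mp3"


def handle_deepdive(text, today, send_voice, generate, root=Path("data/deep-dive/audio")):
    """Send the existing episode, generating it first if the file is missing."""
    day = parse_deepdive_command(text, today)
    path = audio_path(day, root)
    if not path.exists():
        generate(day, path)
    send_voice(path)
    return path
```

Because generation can take a few minutes, a real handler would probably acknowledge the request before calling `generate`.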

Implementation Roadmap

Quick Win: Phase 1 Only (2-3 hours)

Goal: Prove value with text-only digests

# 1. ArXiv RSS fetcher
# 2. Simple keyword filter
# 3. Text digest via Telegram
# 4. Cron schedule

Result: Daily 8 AM text briefing

MVP: Phases 1, 3, and 5 (Skip 2 and 4)

Goal: Working system without embedding/audio complexity

Fetch → Keyword filter → LLM synthesize → Text delivery

Duration: 1-2 days

Full Implementation: All 5 Phases

Goal: Complete automated podcast system

Duration: 1-2 weeks (parallel development possible)


Directory Structure

the-nexus/
└── research/
    └── deep-dive/
        ├── ARCHITECTURE.md          # This file
        ├── IMPLEMENTATION.md        # Detailed dev guide
        ├── config/
        │   ├── sources.yaml         # RSS/feed URLs
        │   ├── keywords.yaml        # Relevance keywords
        │   └── prompts/
        │       ├── synthesis.txt    # LLM prompt template
        │       └── headlines.txt    # Headline-only prompt
        ├── scripts/
        │   ├── phase1-aggregate.py
        │   ├── phase2-rank.py
        │   ├── phase3-synthesize.py
        │   ├── phase4-narrate.py
        │   ├── phase5-deliver.py
        │   └── run-daily.sh         # Orchestrator
        └── data/                    # .gitignored
            ├── raw/                 # Fetched sources
            ├── scored/              # Ranked items
            ├── briefings/           # Markdown outputs
            └── audio/               # MP3 files

Acceptance Criteria

| # | Criterion | Phase |
|---|---|---|
| 1 | Zero manual copy-paste | 1-5 |
| 2 | Daily 6 AM delivery | 5 |
| 3 | ArXiv coverage (cs.AI, cs.CL, cs.LG) | 1 |
| 4 | Lab blog coverage | 1 |
| 5 | Relevance ranking by Hermes context | 2 |
| 6 | Written briefing generation | 3 |
| 7 | TTS audio production | 4 |
| 8 | Telegram voice delivery | 5 |
| 9 | On-demand /deepdive command | 5 |

Risk Matrix

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| ArXiv rate limiting | Medium | Medium | Exponential backoff, caching |
| RSS feed changes | Medium | Low | Health checks, fallback sources |
| TTS quality poor | Low (Piper) | High | Cloud override flag |
| Vector DB too slow | Low | Medium | Batch overnight, cache embeddings |
| Telegram file size | Low | Medium | Compress audio, split long episodes |
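
The exponential backoff mitigation could take the following shape. It is a sketch with assumed defaults (four attempts, delays of 1 s, 2 s, 4 s); `fetch` is any zero-argument callable that raises `OSError` on a transient failure, which covers the exceptions raised by `requests` and `urllib` since both derive from it.

```python
import time


def fetch_with_backoff(fetch, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), doubling the wait after each failure; re-raise on the last attempt."""
    for attempt in range(retries):
        try:
            return fetch()
        except OSError:
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the retry schedule can be verified in tests without waiting.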

Dependencies

Required

  • Python 3.10+
  • feedparser (RSS)
  • requests (HTTP)
  • chromadb or sqlite3 (storage)
  • Hermes LLM client (synthesis)
  • Piper TTS (local audio)

Optional

  • sentence-transformers (embeddings)
  • ffmpeg (audio post-processing)
  • ElevenLabs API key (cloud TTS fallback)

Related

  • #830 (Parent EPIC)
  • Commandment 6: Human-to-fleet comms
  • #166: Matrix/Conduit deployment

Next Steps

  1. Decision: Vector DB selection (Chroma vs pgvector)
  2. Implementation: Phase 1 skeleton (ArXiv fetcher)
  3. Integration: Hermes cron registration
  4. Testing: 3-day dry run (text only)
  5. Enhancement: Add TTS (Phase 4)

Architecture document version 1.0 — Ezra, 2026-04-05