2.7 KiB
2.7 KiB
Deep Dive Architecture
Technical specification for the automated daily intelligence briefing system.
System Overview
┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────┐
│ Phase 1 │ Phase 2 │ Phase 3 │ Phase 4 │ Phase 5 │
│ Aggregate │ Filter │ Synthesize │ TTS │ Deliver │
├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
│ arXiv RSS │ Chroma DB │ Claude/GPT │ Piper │ Telegram │
│ Lab Blogs │ Embeddings │ Prompt │ (local) │ Voice │
└─────────────┴─────────────┴─────────────┴─────────────┴─────────────┘
Data Flow
- Aggregation: Fetch from arXiv + lab blogs
- Relevance: Score against Hermes context via embeddings
- Synthesis: LLM generates structured briefing
- TTS: Piper converts to audio (Opus)
- Delivery: Telegram voice message
Source Coverage
| Source | Method | Frequency |
|---|---|---|
| arXiv cs.AI | RSS | Daily |
| arXiv cs.CL | RSS | Daily |
| arXiv cs.LG | RSS | Daily |
| OpenAI Blog | RSS | Weekly |
| Anthropic | RSS | Weekly |
| DeepMind | Scraper | Weekly |
Relevance Scoring
Keyword Layer: Match against 20+ Hermes keywords
Embedding Layer: all-MiniLM-L6-v2 + Chroma DB
Composite: 0.3 * keyword_score + 0.7 * embedding_score
TTS Pipeline
- Engine: Piper (
en_US-lessac-medium) - Speed: ~1.5x realtime on CPU
- Format: WAV → FFmpeg → Opus (24kbps)
- Sovereign: Fully local, zero API cost
Cron Integration
job:
name: deep-dive-daily
schedule: "0 6 * * *"
command: python3 orchestrator.py --cron
On-Demand
python3 orchestrator.py # Full run
python3 orchestrator.py --dry-run # No delivery
python3 orchestrator.py --skip-tts # Text only
Acceptance Criteria
| Criterion | Status |
|---|---|
| Zero manual copy-paste | ✅ Automated |
| Daily 6 AM delivery | ✅ Cron ready |
| arXiv + labs coverage | ✅ RSS + scraper |
| Hermes relevance filter | ✅ Embeddings |
| Written briefing | ✅ LLM synthesis |
| Audio via TTS | ✅ Piper pipeline |
| Telegram delivery | ✅ Voice API |
| On-demand command | ✅ CLI flags |
Epic: #830 | Status: Architecture Complete