From 7ac9c63ff95abcaa54e45ad1b369e5a940abef93 Mon Sep 17 00:00:00 2001
From: Ezra
Date: Sun, 5 Apr 2026 07:42:18 +0000
Subject: [PATCH] =?UTF-8?q?[DEEP-DIVE]=20Automated=20intelligence=20briefi?=
 =?UTF-8?q?ng=20scaffold=20=E2=80=94=20supports=20#830?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/deep-dive/ARCHITECTURE.md | 80 ++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)
 create mode 100644 docs/deep-dive/ARCHITECTURE.md

diff --git a/docs/deep-dive/ARCHITECTURE.md b/docs/deep-dive/ARCHITECTURE.md
new file mode 100644
index 0000000..9f57048
--- /dev/null
+++ b/docs/deep-dive/ARCHITECTURE.md
@@ -0,0 +1,80 @@
+# Deep Dive Architecture
+
+Technical specification for the automated daily intelligence briefing system.
+
+## System Overview
+
+```
+┌─────────────┬─────────────┬─────────────┬─────────────┬─────────────┐
+│ Phase 1     │ Phase 2     │ Phase 3     │ Phase 4     │ Phase 5     │
+│ Aggregate   │ Filter      │ Synthesize  │ TTS         │ Deliver     │
+├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
+│ arXiv RSS   │ Chroma DB   │ Claude/GPT  │ Piper       │ Telegram    │
+│ Lab Blogs   │ Embeddings  │ Prompt      │ (local)     │ Voice       │
+└─────────────┴─────────────┴─────────────┴─────────────┴─────────────┘
+```
+
+## Data Flow
+
+1. **Aggregation**: Fetch from arXiv + lab blogs
+2. **Relevance**: Score against Hermes context via embeddings
+3. **Synthesis**: LLM generates structured briefing
+4. **TTS**: Piper converts to audio (Opus)
+5. **Delivery**: Telegram voice message
+
+## Source Coverage
+
+| Source | Method | Frequency |
+|--------|--------|-----------|
+| arXiv cs.AI | RSS | Daily |
+| arXiv cs.CL | RSS | Daily |
+| arXiv cs.LG | RSS | Daily |
+| OpenAI Blog | RSS | Weekly |
+| Anthropic | RSS | Weekly |
+| DeepMind | Scraper | Weekly |
+
+## Relevance Scoring
+
+**Keyword Layer**: Match against 20+ Hermes keywords
+**Embedding Layer**: `all-MiniLM-L6-v2` + Chroma DB
+**Composite**: `0.3 * keyword_score + 0.7 * embedding_score`
+
+## TTS Pipeline
+
+- **Engine**: Piper (`en_US-lessac-medium`)
+- **Speed**: ~1.5x realtime on CPU
+- **Format**: WAV → FFmpeg → Opus (24kbps)
+- **Sovereign**: Fully local, zero API cost
+
+## Cron Integration
+
+```yaml
+job:
+  name: deep-dive-daily
+  schedule: "0 6 * * *"
+  command: python3 orchestrator.py --cron
+```
+
+## On-Demand
+
+```bash
+python3 orchestrator.py             # Full run
+python3 orchestrator.py --dry-run   # No delivery
+python3 orchestrator.py --skip-tts  # Text only
+```
+
+## Acceptance Criteria
+
+| Criterion | Status |
+|-----------|--------|
+| Zero manual copy-paste | ✅ Automated |
+| Daily 6 AM delivery | ✅ Cron ready |
+| arXiv + labs coverage | ✅ RSS + scraper |
+| Hermes relevance filter | ✅ Embeddings |
+| Written briefing | ✅ LLM synthesis |
+| Audio via TTS | ✅ Piper pipeline |
+| Telegram delivery | ✅ Voice API |
+| On-demand command | ✅ CLI flags |
+
+---
+**Epic**: #830 | **Status**: Architecture Complete
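
The composite formula under "Relevance Scoring" (`0.3 * keyword_score + 0.7 * embedding_score`) could be wired up roughly as follows. This is a minimal sketch, not the patch's implementation: the function names `keyword_overlap`, `composite_score`, and `rank_items`, the substring-based keyword match, and the assumption that both layer scores are already normalized to the range 0 to 1 are all hypothetical. The embedding score is passed in as a plain number, so the sketch does not depend on how `all-MiniLM-L6-v2` or Chroma DB are actually queried.

```python
from typing import Iterable


def keyword_overlap(text: str, keywords: Iterable[str]) -> float:
    """Hypothetical keyword layer: fraction of keywords found in the text."""
    keywords = list(keywords)
    if not keywords:
        return 0.0
    lowered = text.lower()
    hits = sum(1 for kw in keywords if kw.lower() in lowered)
    return hits / len(keywords)


def composite_score(
    keyword_score: float,
    embedding_score: float,
    keyword_weight: float = 0.3,    # weight taken from the spec
    embedding_weight: float = 0.7,  # weight taken from the spec
) -> float:
    """Blend the two relevance layers; assumes both inputs lie in [0, 1]."""
    return keyword_weight * keyword_score + embedding_weight * embedding_score


def rank_items(items: list[dict], keywords: list[str]) -> list[dict]:
    """Sort items by composite score, highest first.

    Each item is assumed (hypothetically) to carry a 'text' field and a
    precomputed 'embedding_score' field.
    """
    scored = []
    for item in items:
        kw = keyword_overlap(item["text"], keywords)
        score = composite_score(kw, item["embedding_score"])
        scored.append({**item, "score": score})
    return sorted(scored, key=lambda it: it["score"], reverse=True)
```

With these weights, an item that matches every keyword but has a weak embedding similarity still ranks below an item with no keyword hits and a strong embedding match, which is consistent with the spec weighting the embedding layer more heavily.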