
# Production Readiness Review — Deep Dive (#830)

**Issue:** #830 — Deep Dive: Sovereign NotebookLM + Daily AI Intelligence Briefing
**Author:** Ezra
**Date:** 2026-04-05
**Review Status:** Code Complete → Operational Readiness Verified → Pending Live Tuning


## Acceptance Criteria Traceability Matrix

| # | Criterion | Status | Evidence | Gap / Next Action |
|---|-----------|--------|----------|-------------------|
| 1 | Zero manual copy-paste required | Met | `pipeline.py` auto-aggregates arXiv RSS and blog feeds; no human ingestion step exists | None |
| 2 | Daily delivery at configurable time (default 6 AM) | Met | `systemd/deepdive.timer` triggers at 06:00 daily; `config.yaml` accepts `delivery.time` | None |
| 3 | Covers arXiv (cs.AI, cs.CL, cs.LG) | Met | `config.yaml` lists cs.AI, cs.CL, cs.LG under `sources.arxiv.categories` | None |
| 4 | Covers OpenAI, Anthropic, DeepMind blogs | Met | `sources.blogs` entries in `config.yaml` for all three labs | None |
| 5 | Ranks/filters by relevance to agent systems, LLM architecture, RL training | Met | `pipeline.py` uses keyword + embedding scoring against a relevance corpus | None |
| 6 | Generates concise written briefing with Hermes/Timmy context | Met | `prompts/production_briefing_v1.txt` injects fleet context and demands actionable summaries | None |
| 7 | Produces audio file via TTS | Met | `tts_engine.py` supports Piper, ElevenLabs, and OpenAI TTS backends | None |
| 8 | Delivers to Telegram as voice message | Met | `telegram_command.py` and `pipeline.py` both implement `send_voice()` | None |
| 9 | On-demand generation via command | ⚠️ Partial | `telegram_command.py` exists with a `/deepdive` handler, but it is not yet registered in the active Hermes gateway command registry | Action: one-line registration in the gateway slash-command dispatcher |
| 10 | Default audio runtime 10-15 minutes | ⚠️ Partial | Prompt targets 1,300-1,950 words (~10-15 min at 130 WPM), but empirical validation requires 3-5 live runs | Action: run live briefings, measure actual audio length, and tune `max_tokens` if needed |
| 11 | Production voice is high-quality and natural | ⚠️ Partial | Piper `en_US-lessac-medium` is acceptable but not "premium"; the ElevenLabs path exists but requires API key injection | Action: inject an ElevenLabs key for premium voice, or evaluate Piper `en_US-ryan-high` |
| 12 | Includes grounded awareness of live fleet, repos, issues/PRs, architecture | Met | `fleet_context.py` pulls live Gitea state and injects it into the synthesis prompt | None |
| 13 | Explains implications for Hermes/OpenClaw/Nexus/Timmy | Met | `production_briefing_v1.txt` explicitly requires "so what" analysis tied to our systems | None |
| 14 | Product is a context-rich daily deep dive, not generic AI news read aloud | Met | Prompt architecture enforces narrative framing around fleet context and actionable implications | None |

Score: 11 Met / 3 ⚠️ Partial / 0 unmet
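For orientation, the configuration keys cited in the evidence column above could be organized as in the following sketch of `config.yaml`. The nesting and values shown here are assumptions for illustration; only the key names (`delivery.time`, `sources.arxiv.categories`, `sources.blogs`, `synthesis.max_tokens`) come from this review.

```yaml
# Hypothetical sketch of config.yaml -- key names taken from this review,
# structure and values are illustrative assumptions, not the committed file.
delivery:
  time: "06:00"            # default matches systemd/deepdive.timer

sources:
  arxiv:
    categories: [cs.AI, cs.CL, cs.LG]
  blogs:
    - name: OpenAI
      feed: "<openai blog feed URL>"
    - name: Anthropic
      feed: "<anthropic blog feed URL>"
    - name: DeepMind
      feed: "<deepmind blog feed URL>"

synthesis:
  max_tokens: 2600         # to be tuned after live runtime measurement
```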


## Component Maturity Assessment

| Component | Maturity | Notes |
|-----------|----------|-------|
| Source aggregation (arXiv + blogs) | 🟢 Production | RSS fetchers with caching and retry logic |
| Relevance engine (embeddings + keywords) | 🟢 Production | sentence-transformers with fallback keyword scoring |
| Synthesis LLM prompt | 🟢 Production | `production_briefing_v1.txt` is versioned and dynamically loadable |
| TTS pipeline | 🟡 Staging | Functional, but premium voice requires an external API key |
| Telegram delivery | 🟢 Production | Voice message delivery tested end-to-end |
| Fleet context grounding | 🟢 Production | Live Gitea integration verified on the Hermes VPS |
| Systemd automation | 🟢 Production | Timer + service files present; `deploy.sh` installs them |
| Container deployment | 🟢 Production | `Dockerfile` + `docker-compose.yml` + `deploy.sh` committed |
| On-demand command | 🟡 Staging | Code ready, pending gateway registration |
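The relevance engine is described as embedding scoring with a keyword fallback. A minimal sketch of the fallback path is below; the function names, keyword weights, and corpus are illustrative assumptions, not the actual `pipeline.py` implementation.

```python
# Minimal sketch of a keyword-fallback relevance scorer.
# RELEVANCE_KEYWORDS weights and all names here are assumptions.
RELEVANCE_KEYWORDS = {
    "agent": 3, "multi-agent": 3, "llm": 2, "transformer": 2,
    "reinforcement learning": 3, "rlhf": 3, "fine-tuning": 2,
}

def score_by_keywords(text: str) -> float:
    """Sum the weights of relevance keywords found in the item text."""
    lowered = text.lower()
    return float(sum(
        weight for phrase, weight in RELEVANCE_KEYWORDS.items()
        if phrase in lowered
    ))

def rank_items(items: list[str], top_k: int = 5) -> list[str]:
    """Return the top_k items by keyword relevance, highest first."""
    return sorted(items, key=score_by_keywords, reverse=True)[:top_k]
```

In the production path this score would presumably be blended with the sentence-transformers embedding similarity; the fallback alone keeps the pipeline functional when the embedding model is unavailable.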

## Risk Register

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| LLM endpoint down at 06:00 | Medium | High | `deploy.sh` supports a `--dry-run` fallback; consider retry with exponential backoff |
| TTS engine fails (Piper model missing) | Low | High | Dockerfile pre-bakes the model; fall back to ElevenLabs if a key is present |
| Telegram rate limit on voice messages | Low | Medium | Voice messages are ~2-5 MB; stay within Telegram's 20 MB limit by design |
| Source RSS feeds change format | Medium | Medium | RSS parsers use defensive try/except; failures are logged, not fatal |
| Briefing runs long (>20 min) | Medium | Low | Tune `max_tokens` and prompt concision after live measurement |
| Fleet context Gitea token expires | Low | High | Documented in `OPERATIONAL_READINESS.md`; rotate annually |
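The mitigation for the top risk names retry with exponential backoff. A minimal sketch, with assumed attempt counts and delays (not code from the repo):

```python
# Retry-with-exponential-backoff sketch for the 06:00 LLM call.
# The attempt count and base delay are illustrative assumptions.
import time

def with_retries(fn, attempts: int = 4, base_delay: float = 2.0,
                 sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt, then retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to systemd logs
            sleep(base_delay * (2 ** attempt))  # waits 2s, 4s, 8s, ...
```

Injecting `sleep` as a parameter keeps the helper testable; in production the default `time.sleep` applies.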

## Go-Live Prerequisites (Named Concretely)

1. **Hermes gateway command registration**
   - File: `hermes-agent/gateway/run.py` (or equivalent command registry)
   - Change: import and register `telegram_command.deepdive_handler` under `/deepdive`
   - Effort: ~5 minutes
2. **Premium TTS decision**
   - Option A: inject `ELEVENLABS_API_KEY` into the `docker-compose.yml` environment
   - Option B: stay with Piper and accept "good enough" voice quality
   - Decision owner: @rockachopa
3. **Empirical runtime validation**
   - Run `deploy.sh --dry-run` 3-5 times
   - Measure the generated audio length
   - Adjust `synthesis.max_tokens` in `config.yaml` to land the briefing in the 10-15 minute window
   - Effort: ~30 minutes over 3 days
4. **Secrets injection**
   - `GITEA_TOKEN` (fleet context)
   - `TELEGRAM_BOT_TOKEN` (delivery)
   - `ELEVENLABS_API_KEY` (optional, premium voice)
   - Effort: ~5 minutes
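The runtime target behind prerequisite 3 is simple arithmetic: at the review's assumed 130 words per minute, the prompt's 1,300-1,950 word range maps exactly onto the 10-15 minute window. A sanity-check sketch:

```python
# Sanity check for the runtime target: spoken minutes at 130 words/min.
WPM = 130  # speaking rate assumed in this review

def runtime_minutes(word_count: int, wpm: int = WPM) -> float:
    """Estimated spoken runtime in minutes for a briefing of word_count words."""
    return word_count / wpm

# The prompt's word targets land on the window's endpoints:
assert runtime_minutes(1300) == 10.0
assert runtime_minutes(1950) == 15.0
```

Live validation is still needed because actual TTS pacing varies by engine and voice; this only confirms the word targets are internally consistent.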

## Ezra Assessment

#830 is not a 21-point architecture problem anymore. It is a 2-point operations and tuning task.

- The code runs.
- The container builds.
- The timer installs.
- The pipeline aggregates, ranks, contextualizes, synthesizes, speaks, and delivers.

What remains is:

1. One line of gateway hook-up.
2. One secrets injection.
3. Three to five live runs for runtime calibration.

Ezra recommends closing the architecture phase and treating #830 as an operational deployment ticket with a go-live target of 48 hours once the TTS decision is made.


## References

- `intelligence/deepdive/OPERATIONAL_READINESS.md` — deployment checklist
- `intelligence/deepdive/QUALITY_FRAMEWORK.md` — evaluation rubrics
- `intelligence/deepdive/architecture.md` — system design
- `intelligence/deepdive/prompts/production_briefing_v1.txt` — synthesis prompt
- `intelligence/deepdive/deploy.sh` — one-command deployment