Commit Graph

36 Commits

Author SHA1 Message Date
perplexity
9f90392a93 feat: full-history persistent dedup index for DPO training pairs
Replace the 5-file sliding window cross-run dedup with a persistent
hash index that covers ALL historical training data. Overfitting risk
compounds across the full dataset — a 5-file window lets old duplicates
leak back into training after enough overnight runs.

New module: dedup_index.py (DedupIndex)
- Persistent JSON index (.dpo_dedup_index.json) alongside JSONL files
- Append-on-export: new prompt hashes registered after each successful
  export — no full rescan needed for normal operations
- Incremental sync: on load, detects JSONL files not yet indexed and
  ingests them automatically (handles files from other tools)
- Full rebuild: rebuild() scans ALL deepdive_*.jsonl + pairs_*.jsonl
  to reconstruct from scratch (first run, corruption recovery)
- Atomic writes (write-to-tmp + rename) to prevent index corruption
- Standalone CLI: python3 dedup_index.py <dir> --rebuild --stats
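
The index behaviour described above (persistent JSON file, incremental sync, append after export, atomic write) can be sketched as follows. This is a minimal illustration and not the actual dedup_index.py: the class name SketchDedupIndex, the prompt normalization, the SHA-256 hash, and the on-disk layout of the JSON file are all assumptions.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def prompt_hash(prompt: str) -> str:
    """Hash a lowercased, whitespace-normalized prompt (normalization is an assumption)."""
    normalized = " ".join(prompt.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SketchDedupIndex:
    """Hypothetical stand-in for DedupIndex; the JSON layout here is assumed."""

    PATTERNS = ("deepdive_*.jsonl", "pairs_*.jsonl")

    def __init__(self, directory, index_name=".dpo_dedup_index.json"):
        self.directory = Path(directory)
        self.path = self.directory / index_name
        self.hashes = set()
        self.files = set()
        if self.path.exists():
            data = json.loads(self.path.read_text(encoding="utf-8"))
            self.hashes = set(data.get("hashes", []))
            self.files = set(data.get("files", []))

    def sync(self) -> int:
        """Ingest JSONL files not yet recorded in the index; return how many were added."""
        added = 0
        for pattern in self.PATTERNS:
            for jsonl in sorted(self.directory.glob(pattern)):
                if jsonl.name in self.files:
                    continue
                with jsonl.open(encoding="utf-8") as fh:
                    for line in fh:
                        line = line.strip()
                        if not line:
                            continue
                        prompt = json.loads(line).get("prompt")
                        if prompt:
                            self.hashes.add(prompt_hash(prompt))
                self.files.add(jsonl.name)
                added += 1
        if added:
            self.save()
        return added

    def register(self, prompts) -> int:
        """Append hashes for newly exported prompts; return the number of new hashes."""
        before = len(self.hashes)
        self.hashes.update(prompt_hash(p) for p in prompts)
        self.save()
        return len(self.hashes) - before

    def contains(self, prompt: str) -> bool:
        return prompt_hash(prompt) in self.hashes

    def save(self) -> None:
        """Atomic write: dump to a temp file in the same directory, then rename over the index."""
        payload = {"hashes": sorted(self.hashes), "files": sorted(self.files)}
        fd, tmp = tempfile.mkstemp(dir=self.directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                json.dump(payload, fh)
            os.replace(tmp, self.path)
        except BaseException:
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise
```

Writing the temp file in the same directory as the index keeps the final os.replace on one filesystem, which is what makes the rename atomic on POSIX systems.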

Modified: dpo_quality.py
- Imports DedupIndex with graceful degradation
- Replaces _load_history_hashes() with persistent index lookup
- Fallback: if index unavailable, scans ALL files in-memory (not just 5)
- New register_exported_hashes() method called after export
- Config key: dedup_full_history (replaces dedup_history_files)

Modified: dpo_generator.py
- Calls validator.register_exported_hashes() after successful export
  to keep the persistent index current without rescanning

Modified: config.yaml
- Replaced dedup_history_files: 5 with dedup_full_history: true

Tested — 7 integration tests:
  ✓ Fresh index build from empty directory
  ✓ Build from 3 existing JSONL files (15 unique hashes)
  ✓ Incremental sync when new file appears between runs
  ✓ Append after export + persistence across reloads
  ✓ Rebuild from scratch (recovers from corruption)
  ✓ Validator catches day-1 dupe from 20-day history (missed by the 5-file window)
  ✓ Full pipeline: generate → validate → export → register → re-run detects duplicates
2026-04-15 21:24:01 -04:00
perplexity
d15a82ff1e feat: DPO pair quality validator — gate before overnight training
Add DPOQualityValidator that catches bad training pairs before they
enter the tightening loop. Wired into DPOPairGenerator between
generate() and export() as an automatic quality gate.

New module: dpo_quality.py
- 5 single-pair quality checks:
  1. Field length minimums (prompt ≥40, chosen ≥80, rejected ≥30 chars)
  2. Chosen/rejected length ratio (chosen must be ≥1.3x the length of rejected)
  3. Chosen≈rejected similarity (Jaccard ≤0.70 — catches low-contrast)
  4. Vocabulary diversity in chosen (unique word ratio ≥0.30)
  5. Substance markers in chosen (≥2 fleet/training/action terms)
- 2 cross-pair quality checks:
  6. Near-duplicate prompts within batch (Jaccard ≤0.85)
  7. Cross-run dedup against recent JSONL history files
- Two modes: 'drop' (filter out bad pairs) or 'flag' (export with warning)
- BatchReport with per-pair diagnostics, pass rates, and warnings
- Standalone CLI: python3 dpo_quality.py <file.jsonl> [--strict] [--json]
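
A few of the checks listed above can be sketched as follows, using the thresholds stated in this commit message. This is an illustration only, not the code in dpo_quality.py: word-level tokenization for the Jaccard measure, the function names, and the problem labels are assumptions, and the substance-marker and cross-run checks are omitted because their term list and history format are not given here.

```python
import re


def tokens(text: str) -> set:
    """Lowercased word set (word-level tokenization is an assumption)."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the two word sets: |A ∩ B| / |A ∪ B|."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def check_pair(pair, min_prompt=40, min_chosen=80, min_rejected=30,
               min_ratio=1.3, max_similarity=0.70, min_diversity=0.30):
    """Return a list of problem labels for one pair; an empty list means it passed."""
    prompt, chosen, rejected = pair["prompt"], pair["chosen"], pair["rejected"]
    problems = []
    if len(prompt) < min_prompt:
        problems.append("prompt_too_short")
    if len(chosen) < min_chosen:
        problems.append("chosen_too_short")
    if len(rejected) < min_rejected:
        problems.append("rejected_too_short")
    if len(chosen) < min_ratio * len(rejected):
        problems.append("length_ratio")
    if jaccard(chosen, rejected) > max_similarity:
        problems.append("low_contrast")
    words = re.findall(r"\w+", chosen.lower())
    if words and len(set(words)) / len(words) < min_diversity:
        problems.append("low_diversity")
    return problems


def dedupe_batch(pairs, max_prompt_similarity=0.85):
    """Keep the first of any group of pairs whose prompts are near-duplicates."""
    kept = []
    for pair in pairs:
        if all(jaccard(pair["prompt"], k["prompt"]) <= max_prompt_similarity for k in kept):
            kept.append(pair)
    return kept
```

In 'drop' mode a caller would discard pairs with a non-empty problem list; in 'flag' mode it would keep them and attach the labels as warnings.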

Modified: dpo_generator.py
- Imports DPOQualityValidator with graceful degradation
- Initializes from config validation section (enabled by default)
- Validates between generate() and export() in run()
- Quality report included in pipeline result dict
- Validator failure never blocks — falls back to unvalidated export
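
The "graceful degradation" and "never blocks" behaviour described above might look roughly like this. It is a sketch under assumptions: load_optional and validate_or_passthrough are hypothetical helper names, and the validator is assumed to expose a validate(pairs) method returning a (kept_pairs, report) tuple, which the commit message does not confirm.

```python
import importlib
import logging

logger = logging.getLogger("deepdive.sketch")


def load_optional(module_name: str, attr: str):
    """Return module.attr, or None if the module or the attribute is unavailable."""
    try:
        return getattr(importlib.import_module(module_name), attr)
    except (ImportError, AttributeError) as exc:
        logger.warning("%s.%s unavailable, continuing without it: %s", module_name, attr, exc)
        return None


def validate_or_passthrough(pairs, validator):
    """Validate pairs if possible; on any failure fall back to the unvalidated pairs."""
    if validator is None:
        return pairs, None
    try:
        # Assumed interface: validate() returns (kept_pairs, report).
        return validator.validate(pairs)
    except Exception as exc:
        logger.warning("validator failed, exporting unvalidated pairs: %s", exc)
        return pairs, None
```

The trade-off in this design is that a broken validator silently lets unchecked pairs through, so the warning log is the only signal that the quality gate was skipped.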

Modified: config.yaml
- New deepdive.training.dpo.validation section with all tunable knobs:
  enabled, flagged_pair_action, similarity thresholds, length minimums,
  dedup_history_files

Integration tested — 6 test cases covering:
  ✓ Good pairs pass (3/3 accepted)
  ✓ Bad pairs caught: too-short, high-similarity, inverted signal (0/3)
  ✓ Near-duplicate prompt detection (1/2 deduped)
  ✓ Flag mode preserves pairs with warnings (3/3 flagged)
  ✓ Cross-run deduplication against history (1 dupe caught)
  ✓ Full generator→validator→export pipeline (6/6 validated)
2026-04-15 21:24:01 -04:00
perplexity
c3b455bd9c feat: Phase 3.5 — DPO training pair generation from Deep Dive pipeline
Wire arXiv relevance filter output directly into DPO pair generation,
closing the loop between research synthesis and overnight training data.

New module: dpo_generator.py
- DPOPairGenerator class with 3 pair strategies:
  * summarize: paper → fleet-grounded analysis (chosen) vs generic (rejected)
  * relevance: 'what matters to Hermes?' → scored context vs vague
  * implication: 'what should we do?' → actionable insight vs platitude
- Extracts synthesis excerpts matched to each ranked item
- Outputs to ~/.timmy/training-data/dpo-pairs/deepdive_{timestamp}.jsonl
- Format: {prompt, chosen, rejected, task_type, evidence_ids,
  source_session, safety_flags, metadata}
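
As an illustration of the output format above, one JSON object per line could be written like this. The field names come from this commit message; the write_pairs helper, the timestamp format in the filename, and the up-front field check are assumptions rather than the behaviour of dpo_generator.py.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

REQUIRED_FIELDS = (
    "prompt", "chosen", "rejected", "task_type",
    "evidence_ids", "source_session", "safety_flags", "metadata",
)


def write_pairs(pairs, output_dir, now=None):
    """Write pairs to deepdive_{timestamp}.jsonl and return the path."""
    # Check every record before touching the filesystem so a bad pair
    # cannot leave a half-written file behind.
    for pair in pairs:
        missing = [f for f in REQUIRED_FIELDS if f not in pair]
        if missing:
            raise ValueError(f"pair missing fields: {missing}")

    now = now or datetime.now(timezone.utc)
    out_dir = Path(output_dir).expanduser()
    out_dir.mkdir(parents=True, exist_ok=True)
    # Timestamp format is an assumption; the source only shows {timestamp}.
    path = out_dir / f"deepdive_{now.strftime('%Y%m%d_%H%M%S')}.jsonl"
    with path.open("w", encoding="utf-8") as fh:
        for pair in pairs:
            fh.write(json.dumps(pair, ensure_ascii=False) + "\n")
    return path
```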

Pipeline changes (pipeline.py):
- Import DPOPairGenerator with graceful degradation
- Initialize from config deepdive.training.dpo section
- Execute as Phase 3.5 between synthesis and audio
- DPO results included in pipeline return dict
- Wrapped in try/except — DPO failure never blocks delivery

Config changes (config.yaml):
- New deepdive.training.dpo section with:
  enabled, output_dir, min_score, max_pairs_per_run, pair_types

Integration tested: 2 mock items × 3 pair types = 6 valid JSONL pairs.
Chosen responses consistently richer than rejected (assert-verified).
2026-04-15 21:24:01 -04:00
61c24c390b purge: remove Anthropic from the-nexus fleet + deepdive (#1346) 2026-04-15 21:24:01 -04:00
Alexander Whitestone
557713501c fix: closes #830 2026-04-15 21:24:01 -04:00
Alexander Whitestone
ef74536e33 feat: add edge-tts as zero-cost voice output provider
Some checks failed
CI / test (pull_request) Failing after 33s
CI / validate (pull_request) Failing after 26s
Review Approval Gate / verify-review (pull_request) Failing after 5s
- Add EdgeTTSAdapter to bin/deepdive_tts.py (provider key: "edge-tts")
  default voice: en-US-GuyNeural, no API key required
- Add EdgeTTS class to intelligence/deepdive/tts_engine.py
- Update HybridTTS to try edge-tts as fallback between piper and elevenlabs
- Add --voice-memo flag to bin/night_watch.py for spoken nightly reports
- Add edge-tts>=6.1.9 to requirements.txt
- Create docs/voice-output.md documenting all providers and fallback chain
- Add tests/test_edge_tts.py with 17 unit tests (all mocked, no network)
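
The fallback chain mentioned above (piper, then edge-tts, then elevenlabs) can be sketched generically. This is not the HybridTTS implementation: synthesize_with_fallback is a hypothetical helper, and each provider is modelled as a plain callable taking text and returning audio bytes, so no real TTS library API is assumed.

```python
def synthesize_with_fallback(text, providers):
    """Try each (name, synth) provider in order; return (name, audio) from the first success.

    A provider counts as failed if it raises or returns empty output.
    """
    errors = []
    for name, synth in providers:
        try:
            audio = synth(text)
        except Exception as exc:
            errors.append(f"{name}: {exc}")
            continue
        if audio:
            return name, audio
        errors.append(f"{name}: empty output")
    raise RuntimeError("all TTS providers failed: " + "; ".join(errors))
```

A caller would pass the providers in the documented order, for example `[("piper", piper_fn), ("edge-tts", edge_fn), ("elevenlabs", eleven_fn)]`, where the three functions are placeholders for the real adapters.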

Fixes #1126

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 06:29:26 -04:00
34862cf5e5 feat(fleet): promote Ollama to first-class provider, assign Gemma 4 across fleet
Some checks failed
Deploy Nexus / deploy (push) Failing after 3s
Staging Verification Gate / verify-staging (push) Failing after 3s
- lazarus-registry.yaml: replace big_brain/RunPod with local ollama/gemma4:12b
- fleet-routing.json: assign ollama:gemma4:12b to carnice, bilbobagginshire, substratum
- intelligence/deepdive/config.yaml: local model -> gemma4:12b
2026-04-07 15:55:52 +00:00
Ezra
ce2cd85adc [ezra] Production Readiness Review for Deep Dive (#830) 2026-04-05 21:00:26 +00:00
Ezra (Archivist)
d2f103654f intelligence(deepdive): Docker deployment scaffold for #830
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
- Add Dockerfile for production containerized pipeline
- Add docker-compose.yml for full stack deployment
- Add .dockerignore for clean builds
- Add deploy.sh: one-command build, test, and systemd timer install

This provides a sovereign, reproducible deployment path for the
Deep Dive daily briefing pipeline.
2026-04-05 20:40:58 +00:00
Ezra (Archivist)
4b1873d76e feat(deepdive): production briefing prompt + prompt engineering KT
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
- production_briefing_v1.txt: podcast-script prompt engineered for
  10-15 min premium audio, grounded fleet context, and actionable tone.
- PROMPT_ENGINEERING_KT.md: A/B testing protocol, failure modes,
  and maintenance checklist.
- pipeline.py: load external prompt_file from config.yaml.

Refs #830
2026-04-05 20:19:20 +00:00
Ezra (Archivist)
9ad2132482 [ezra] #830: Operational readiness checklist + fix Gitea URL to forge
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 19:54:47 +00:00
Ezra
3df184e1e6 feat(deepdive): quality evaluation framework
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
- Add quality_eval.py: automated briefing quality scorer with drift detection
- Add QUALITY_FRAMEWORK.md: rubric, usage guide, and production integration spec

Refs #830
2026-04-05 19:03:05 +00:00
Ezra (Archivist)
00600a7e67 [BURN] Deep Dive proof-of-life, fleet context fix, dry-run repair
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
- Fix fleet_context.py env-var substitution for 0c16baadaebaaabc2c8390f35ef5e9aa2f4db671
- Remove non-existent wizard-checkpoints from config.yaml
- Fix bin/deepdive_orchestrator.py dry-run mock items
- Add PROOF_OF_LIFE.md with live execution output including fleet context

Progresses #830
2026-04-05 18:42:18 +00:00
Ezra (Archivist)
014bb3b71e [ezra] Gemini handoff for Deep Dive (#830)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
- Add GEMINI_HANDOFF.md with codebase map, secrets inventory,
  production checklist, and recommended next steps
- Continuity from Ezra scaffold to Gemini production-hardening
2026-04-05 18:20:53 +00:00
b6a473d808 test(deepdive): add fleet context unit tests (#830)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 17:32:25 +00:00
5f4cc8cae2 config(deepdive): enable fleet context grounding (#830)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 17:32:24 +00:00
ca1a11f66b feat(deepdive): integrate Phase 0 fleet context into synthesis (#830)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 17:32:23 +00:00
7189565d4d feat(deepdive): add Phase 0 fleet context grounding module (#830)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 17:32:22 +00:00
b3bec469b1 [ezra] #830: Pipeline proof-of-execution document
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 12:46:03 +00:00
16bd546fc9 [ezra] #830: Fix config wrapper, add arXiv API fallback, implement voice delivery, fix datetime
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 12:45:07 +00:00
76c973c0c2 Update README to reflect production implementation status (#830)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 12:18:18 +00:00
fc237e67d7 Add Telegram /deepdive command handler for on-demand briefings (#830)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
Hermes-compatible command handler that parses /deepdive args,
runs the pipeline, and returns status + audio to Telegram.
2026-04-05 12:17:17 +00:00
25a45467ac Add QUICKSTART.md for Deep Dive pipeline (#830)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
Step-by-step guide for installation, dry-run testing, live
delivery, systemd timer enablement, and Telegram command setup.
2026-04-05 12:17:16 +00:00
92f1164be9 Add TTS engine implementation for Deep Dive (#830)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
Executable Phase 4 component: PiperTTS, ElevenLabsTTS, HybridTTS
classes with chunking, concatenation, error handling.

Ready for integration with Phase 3 synthesizer.

Burn mode artifact by Ezra.
2026-04-05 08:31:34 +00:00
6c5ac52374 [BURN] #830: End-to-end pipeline test (dry-run validation)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 08:08:11 +00:00
b131a12592 [BURN] #830: Phase 2 tests (relevance scoring)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 08:08:10 +00:00
ffae1b6285 [BURN] #830: Phase 1 tests (arXiv RSS aggregation)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 08:08:08 +00:00
f8634c0105 [BURN] #830: Systemd timer for daily 06:00 execution
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 08:08:07 +00:00
c488bb7e94 [BURN] #830: Systemd service unit
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 08:08:07 +00:00
66f632bd99 [BURN] #830: Build automation (Makefile)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 08:06:12 +00:00
44302bbdf9 [BURN] #830: Working pipeline.py implementation (645 lines, executable)
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 08:06:11 +00:00
88af4870d3 [scaffold] Deep Dive intelligence pipeline: intelligence/deepdive/requirements.txt
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 06:19:51 +00:00
cca5909cf9 [scaffold] Deep Dive intelligence pipeline: intelligence/deepdive/config.yaml
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 06:19:50 +00:00
a8b4f7a8c0 [scaffold] Deep Dive intelligence pipeline: intelligence/deepdive/pipeline.py
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 06:19:49 +00:00
949becff22 [scaffold] Deep Dive intelligence pipeline: intelligence/deepdive/architecture.md
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 06:19:48 +00:00
fc11ea8a28 [scaffold] Deep Dive intelligence pipeline: intelligence/deepdive/README.md
Some checks failed
Deploy Nexus / deploy (push) Has been cancelled
2026-04-05 06:19:47 +00:00