[manus] Modular DPO Dataset Builder for MLX (#5) #6

Closed
manus wants to merge 1 commit from manus/dpo-data-pipeline into main

1 Commit

Author SHA1 Message Date
manus
2fe6e33c05 feat: implement modular DPO dataset builder for MLX (#5)
- Created training/build_dpo_pairs.py, a modular script (< 100 lines) that transforms curated chat logs into (prompt, chosen, rejected) DPO pairs.
- Implemented rule-based logic to generate 'Rejected' responses that violate Timmy's SOUL.md values (verbosity, corporate tone, disclaimers).
- Verified the output schema against mlx-lm requirements.
- Generated a local DPO_REPORT.md with validation metrics.
- Unblocks issue #5: DPO training on MLX.
2026-03-25 21:17:07 -04:00
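
For reviewers who want a concrete picture of the pipeline described in the commit message, here is a minimal sketch. It is not the contents of training/build_dpo_pairs.py. It assumes a chat log is a list of `{"role", "content"}` dicts and that the output is JSONL with exactly the keys `prompt`, `chosen`, and `rejected`. The helper names (`make_rejected`, `build_pairs`, `validate`, `to_jsonl`) and the degradation strings are hypothetical stand-ins for the rule-based logic.

```python
import json

# Hypothetical degradation rules: each one injects a trait the commit lists
# as a violation (disclaimers, corporate tone, verbosity).
DISCLAIMER = "As an AI language model, I cannot guarantee accuracy. "
CORPORATE = "Thank you for reaching out! We value your feedback. "
FILLER = " Please do not hesitate to contact us if you need any further assistance."


def make_rejected(chosen: str) -> str:
    """Degrade a good reply by wrapping it in a disclaimer, corporate tone, and padding."""
    return DISCLAIMER + CORPORATE + chosen + FILLER


def build_pairs(chat_log: list[dict]) -> list[dict]:
    """Turn each user turn followed directly by an assistant turn into one DPO record."""
    pairs = []
    for prev, cur in zip(chat_log, chat_log[1:]):
        if prev["role"] == "user" and cur["role"] == "assistant":
            pairs.append({
                "prompt": prev["content"],
                "chosen": cur["content"],
                "rejected": make_rejected(cur["content"]),
            })
    return pairs


def validate(record: dict) -> bool:
    """Check the assumed schema: exactly three non-empty string fields, chosen != rejected."""
    keys = ("prompt", "chosen", "rejected")
    return (
        set(record) == set(keys)
        and all(isinstance(record[k], str) and record[k].strip() for k in keys)
        and record["chosen"] != record["rejected"]
    )


def to_jsonl(pairs: list[dict]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)
```

Calling `to_jsonl(build_pairs(chat_log))` yields the text to write to a training file. Counting how many records pass `validate` is one way to produce the kind of validation metrics a report like DPO_REPORT.md could carry.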