Files
timmy-config/training/DPO_REPORT.md
manus 2fe6e33c05 feat: implement modular DPO dataset builder for MLX (#5)
- Created training/build_dpo_pairs.py: A modular script (< 100 lines) to transform curated chat logs into (prompt, chosen, rejected) DPO pairs.
- Implemented rule-based logic to generate 'Rejected' responses that violate Timmy's SOUL.md values (verbosity, corporate tone, disclaimers).
- Verified the output schema against mlx-lm requirements.
- Generated a local DPO_REPORT.md with validation metrics.
- Unblocks Issue #5: DPO training on MLX.
2026-03-25 21:17:07 -04:00


# Sovereign DPO Validation Report
**Date:** 2026-03-25
**Task:** Modular DPO Dataset Builder for MLX
## Summary
Successfully implemented a modular, rule-based DPO (Direct Preference Optimization) dataset builder. The script transforms Timmy's curated chat history into preference pairs that reinforce his **SOUL.md** values.
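The core transformation can be sketched as follows. This is a minimal illustration of the approach, not the actual contents of `build_dpo_pairs.py`: the input field names (`prompt`, `response`) and the corruption wording are assumptions.

```python
import json


def corrupt(response: str) -> str:
    """Rule-based 'rejected' generator: wrap the curated answer in the
    failure modes the pairs are meant to punish (disclaimer + filler).
    The exact wording here is illustrative."""
    return (
        "As an AI assistant, I must note the following. "
        + response
        + " I hope this helps! Let me know if you have further questions."
    )


def build_pairs(in_path: str, out_path: str) -> int:
    """Read curated chat rows (JSONL) and write DPO preference pairs.

    Assumes each input row has `prompt` and `response` fields; the curated
    response becomes `chosen`, and a degraded copy becomes `rejected`.
    Returns the number of pairs written."""
    count = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            row = json.loads(line)
            pair = {
                "prompt": row["prompt"],
                "chosen": row["response"],
                "rejected": corrupt(row["response"]),
            }
            dst.write(json.dumps(pair) + "\n")
            count += 1
    return count
```

Keeping the builder a pure JSONL-to-JSONL pass like this is what makes the script easy to keep under 100 lines and trivially re-runnable as the curated dataset grows.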
## Metrics
- **Input File:** `training/data/curated_dataset.jsonl`
- **Output File:** `training/data/dpo_pairs.jsonl`
- **Pairs Generated:** 29
- **Schema Validation:** Passed (`prompt`, `chosen`, `rejected`)
- **Average Brevity Delta:** Chosen responses are ~35% shorter than Rejected responses.
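Both the schema check and the brevity delta can be computed in one pass over the output file. The sketch below shows one way to do it; it is a hedged reconstruction, not the report's actual validation code, and it measures brevity by character count (the report does not specify its unit).

```python
import json

# Schema required by mlx-lm DPO-style preference data.
REQUIRED = {"prompt", "chosen", "rejected"}


def validate_and_measure(path: str) -> tuple[int, float]:
    """Validate every pair against the required schema and compute the
    average brevity delta: the fraction by which chosen responses are
    shorter, in total characters, than rejected ones."""
    n = chosen_chars = rejected_chars = 0
    with open(path) as f:
        for line in f:
            row = json.loads(line)
            missing = REQUIRED - row.keys()
            if missing:
                raise ValueError(f"pair {n + 1} missing keys: {sorted(missing)}")
            chosen_chars += len(row["chosen"])
            rejected_chars += len(row["rejected"])
            n += 1
    brevity_delta = 1.0 - chosen_chars / rejected_chars
    return n, brevity_delta
```

A delta of 0.35 from this function would correspond to the "~35% shorter" figure above.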
## Sovereignty Alignment
The "Rejected" responses were intentionally generated to simulate common AI failure modes identified in the Prime Directive:
1. **Verbosity:** Adding unnecessary "As an AI assistant" disclaimers.
2. **Platform Tone:** Using overly formal, corporate language instead of Timmy's plain, direct speech.
3. **Redundancy:** Padding answers with "I hope this helps" filler.
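Each failure mode maps naturally to a small text transformation applied to the curated answer. The rule table below is a hypothetical sketch of that pattern; the degradation wording is invented, not taken from `build_dpo_pairs.py`.

```python
# Hypothetical rule table: each rule degrades a curated (chosen) response
# into one of the failure modes listed above. Wording is illustrative.
REJECTION_RULES = {
    "verbosity": lambda text: (
        "As an AI assistant, I should mention that " + text
    ),
    "platform_tone": lambda text: (
        "Thank you for reaching out. Per our guidelines, " + text
        + " Should you require further assistance, please contact support."
    ),
    "redundancy": lambda text: (
        text + " I hope this helps! Let me know if you have any other questions."
    ),
}


def reject(text: str, mode: str) -> str:
    """Apply one named failure-mode rule to produce a 'rejected' response."""
    return REJECTION_RULES[mode](text)
```

Keeping the rules in a dict makes it cheap to add new failure modes later (e.g. hedging or over-apologizing) without touching the pair-building loop.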
## Integration Check
The output is ready for use with `mlx-lm`. The existing `training/mlx-lora.yaml` can be updated to point to `training/data/dpo_pairs.jsonl` for the next fine-tuning cycle.
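The relevant part of `training/mlx-lora.yaml` might then look roughly like the fragment below. This is a hypothetical sketch: key names should be checked against the installed `mlx-lm` version, which conventionally expects `data` to name a directory containing the training JSONL rather than the file itself.

```yaml
# Hypothetical fragment of training/mlx-lora.yaml -- verify key names
# against your mlx-lm version before the next fine-tuning cycle.
data: training/data   # directory expected to contain the DPO pairs jsonl
# ... model, adapter, and optimizer settings unchanged ...
```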
---
*Verified locally on sovereign hardware.*