- Created training/build_dpo_pairs.py: a modular script (< 100 lines) that transforms curated chat logs into (prompt, chosen, rejected) DPO pairs.
- Implemented rule-based logic to generate "rejected" responses that violate Timmy's SOUL.md values (verbosity, corporate tone, disclaimers).
- Verified the output schema against mlx-lm requirements.
- Generated a local DPO_REPORT.md with validation metrics.
- Unblocks Issue #5: DPO training on MLX.
Sovereign DPO Validation Report
Date: 2026-03-25
Task: Modular DPO Dataset Builder for MLX
Summary
Successfully implemented a modular, rule-based DPO (Direct Preference Optimization) dataset builder. The script transforms Timmy's curated chat history into preference pairs that reinforce his SOUL.md values.
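The transformation described above can be sketched as follows. This is a minimal illustration, not the actual build_dpo_pairs.py; the chat-log field names (`prompt`, `response`) and the `make_rejected` callback are assumptions.

```python
import json

def build_pairs(in_path, out_path, make_rejected):
    """Read curated chat turns (JSONL) and emit (prompt, chosen, rejected) DPO pairs.

    Field names 'prompt'/'response' are assumed; adjust to the curated schema.
    """
    pairs = []
    with open(in_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            turn = json.loads(line)
            pairs.append({
                "prompt": turn["prompt"],
                # The curated answer already reflects SOUL.md values.
                "chosen": turn["response"],
                # Rule-based corruption produces the dispreferred response.
                "rejected": make_rejected(turn["response"]),
            })
    with open(out_path, "w") as f:
        for p in pairs:
            f.write(json.dumps(p) + "\n")
    return pairs
```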
Metrics
- Input File: training/data/curated_dataset.jsonl
- Output File: training/data/dpo_pairs.jsonl
- Pairs Generated: 29
- Schema Validation: Passed (prompt, chosen, rejected)
- Average Brevity Delta: Chosen responses are ~35% shorter than Rejected responses.
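For reference, a brevity delta like the one reported above could be computed as below. This is a sketch; measuring length in characters and averaging per-pair ratios are assumptions about how the metric was defined.

```python
def brevity_delta(pairs):
    """Average fraction by which 'chosen' is shorter than 'rejected'.

    Length is measured in characters (an assumption); pairs with an
    empty 'rejected' field are skipped to avoid division by zero.
    """
    deltas = [
        1 - len(p["chosen"]) / len(p["rejected"])
        for p in pairs
        if p["rejected"]
    ]
    return sum(deltas) / len(deltas) if deltas else 0.0
```

A delta of 0.35 would match the ~35% figure in the metrics above.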
Sovereignty Alignment
The "Rejected" responses were intentionally generated to simulate common AI failure modes identified in the Prime Directive:
- Verbosity: Adding unnecessary "As an AI assistant" disclaimers.
- Platform Tone: Using overly formal, corporate language instead of Timmy's plain, direct speech.
- Redundancy: Padding answers with "I hope this helps" filler.
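The three failure modes above could be simulated with a rule like the following. The exact boilerplate strings are illustrative; the real script's wording and rule set may differ.

```python
def make_rejected(chosen: str) -> str:
    """Corrupt a good answer into a 'rejected' one using the failure modes
    listed above: disclaimers, corporate tone, and filler padding.
    (Illustrative phrasing only; not the actual script's rules.)
    """
    return (
        "As an AI assistant, I should note that there are many "
        "considerations at play here. "                      # verbosity / disclaimer
        + chosen
        + " Per established best practices, please consult the "
          "relevant documentation for further details. "      # corporate tone
        + "I hope this helps!"                                # redundant filler
    )
```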
Integration Check
The output is ready for use with mlx-lm. The existing training/mlx-lora.yaml can be updated to point to training/data/dpo_pairs.jsonl for the next fine-tuning cycle.
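For illustration only, the pointer update might look like the fragment below. The key name and whether mlx-lm's LoRA tooling accepts a single DPO pairs file at that key are assumptions to verify against the mlx-lm documentation.

```yaml
# Hypothetical fragment of training/mlx-lora.yaml (key name is an assumption):
data: training/data/dpo_pairs.jsonl
```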
Verified locally on sovereign hardware.