[CONFIG] Migrate autolora data and configs into timmy-config/training/ #556

Closed
opened 2026-03-25 23:02:35 +00:00 by perplexity · 1 comment
Member

Per direction shift (#542). The autolora repo is ~1,500 lines of custom pipeline code wrapping tools that already exist. The valuable parts are data and config — those belong in timmy-config.

What moves to timmy-config/training/

Data (yours, irreplaceable):

  • data/curated_dataset.jsonl — 26 hand-crafted exemplar conversations (crisis, pastoral, sovereignty, honesty, concision)
  • data/preference_pairs.jsonl — DPO preference pairs
  • data/mlx_curated/ — train/valid splits
  • eval/prompts_vibes.yaml — custom vibe check prompts
  • eval/prompts_nexus_vibes.yaml — Nexus-specific eval prompts
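Since the JSONL data is the irreplaceable part, a quick format sanity check before migrating is cheap insurance. A minimal sketch, assuming a chat-message schema of `{"messages": [{"role", "content"}, …]}` — the field names are an assumption, not confirmed from the repo:

```python
import json

def validate_jsonl_line(line: str) -> bool:
    """Check one JSONL record for a plausible chat-format shape.

    Assumes each record holds a "messages" list of {"role", "content"}
    dicts -- a guess at the schema, adjust to the actual files.
    """
    record = json.loads(line)  # raises on malformed JSON
    messages = record.get("messages", [])
    if not messages:
        return False
    return all(
        isinstance(m, dict)
        and m.get("role") in {"system", "user", "assistant"}
        and isinstance(m.get("content"), str)
        for m in messages
    )

# One well-formed and one broken record for illustration
good = '{"messages": [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}]}'
bad = '{"messages": [{"role": "narrator", "content": "hi"}]}'
```

Running this over each line of `curated_dataset.jsonl` before the move would catch any records that silently break the downstream trainers.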

Config (replaces custom code):

  • training/axolotl.yaml — replaces train_modal.py (239 lines → ~20-line config)
  • training/eval-tasks.yaml — replaces eval/run_eval.py (300 lines → lm-evaluation-harness task definition)
  • training/Makefile — 3 targets:
    • make train → calls axolotl or mlx-lm lora with the YAML
    • make eval → calls lm-eval with the task YAML against Ollama
    • make vibes → pipes prompts through ollama run and diffs output
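A sketch of what that Makefile could look like. The target names are from above; the model name, the `lm_eval` invocation against Ollama's OpenAI-compatible endpoint, and the `yq` prompt extraction are illustrative assumptions, not a final implementation:

```make
# Sketch only -- model name, port, task name, and flags are placeholders.
MODEL ?= timmy-candidate

train:
	axolotl train axolotl.yaml

eval:
	lm_eval --include_path . --tasks timmy_vibes \
	        --model local-completions \
	        --model_args base_url=http://localhost:11434/v1

vibes:
	# shell loop over ollama run; assumes a top-level `prompts:` list
	yq -r '.prompts[]' data/prompts_vibes.yaml | while read -r p; do \
	  echo "$$p" | ollama run $(MODEL); \
	done
```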

Thin glue (keep as small scripts):

  • training/ingest_trajectories.py — quality filter for heartbeat cycle data (174 lines, domain-specific, keep)
  • training/build_curated.py — the exemplar generator (this is data authoring, keep)
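`ingest_trajectories.py` stays because its filtering heuristics are domain-specific. For a sense of the shape such a gate takes, here is a sketch — the field names, thresholds, and rejection rules are hypothetical stand-ins, not the actual 174-line implementation:

```python
def keep_trajectory(record: dict, min_turns: int = 2, min_chars: int = 40) -> bool:
    """Hypothetical quality gate for heartbeat-cycle trajectories.

    The real ingest_trajectories.py applies its own domain-specific
    rules; these checks (turn count, reply length, leaked error
    output) are illustrative placeholders.
    """
    messages = record.get("messages", [])
    if len(messages) < min_turns:
        return False  # too short to be a useful exemplar
    replies = [m.get("content", "") for m in messages if m.get("role") == "assistant"]
    if not replies or any(len(r) < min_chars for r in replies):
        return False  # thin or missing assistant turns
    if any("Traceback" in r for r in replies):
        return False  # leaked error output
    return True
```

The point of keeping this as a script rather than config is exactly that these rules are judgment calls over your own data, which no off-the-shelf tool encodes.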

What gets deleted (replaced by imports)

  • train_modal.py → axolotl handles Modal/cloud GPU dispatch natively
  • scripts/convert_to_mlx.py → mlx-lm accepts chat format directly
  • eval/run_eval.py → lm-evaluation-harness with custom YAML tasks
  • eval/run_vibes.py → shell loop over ollama run
  • eval/compare.py → lm-eval-harness comparison or diff
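For reference, the custom YAML task in eval-tasks.yaml that replaces run_eval.py could take roughly this shape. The task name, dataset path, and templates are assumptions; the keys themselves (`task`, `dataset_path`, `doc_to_text`, `metric_list`, …) follow lm-evaluation-harness's custom-task format:

```yaml
# Sketch of a custom lm-eval task; names, paths, and templates are placeholders.
task: timmy_vibes
dataset_path: json
dataset_kwargs:
  data_files: data/curated_dataset.jsonl
output_type: generate_until
doc_to_text: "{{prompt}}"
doc_to_target: "{{reference}}"
generation_kwargs:
  until: ["</s>"]
metric_list:
  - metric: exact_match
```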

Resulting structure

timmy-config/
├── training/
│   ├── README.md                 ← 3-command pipeline docs
│   ├── Makefile                  ← train, eval, vibes targets
│   ├── axolotl.yaml              ← training config (replaces train_modal.py)
│   ├── eval-tasks.yaml           ← lm-eval-harness task definition
│   ├── ingest_trajectories.py    ← quality filter (thin glue)
│   ├── build_curated.py          ← exemplar data authoring
│   └── data/
│       ├── curated_dataset.jsonl
│       ├── preference_pairs.jsonl
│       ├── prompts_vibes.yaml
│       └── prompts_nexus_vibes.yaml
├── SOUL.md
├── config.yaml
└── ... (existing timmy-config structure)

Requirements

pip install axolotl mlx-lm lm-evaluation-harness

No other dependencies. No custom training harness. No custom eval framework.

perplexity added the p0-critical and assigned-perplexity labels 2026-03-25 23:02:35 +00:00
perplexity self-assigned this 2026-03-25 23:02:35 +00:00
Author
Member

Done. training/ directory committed to timmy-config with data, configs, and Makefile. Commit: 6507cff


Reference: Timmy_Foundation/the-nexus#556