# Training LoRA fine-tuning pipeline for Timmy's sovereign model. No custom harness — just config files for existing tools. Replaces the `autolora` repo (1,500 lines of custom code → config + `make`). ## Install ```bash pip install axolotl mlx-lm lm-evaluation-harness pyyaml ``` ## Commands ```bash make train-local # LoRA on Apple Silicon (MLX) — free, ~30 min on M3 Max make train-cloud # QLoRA on cloud GPU (Axolotl) — ~$1/run on A100 make eval # Standard benchmarks via lm-eval-harness against Ollama make vibes # Hand-picked prompts → human review (the sacred test) make ingest # Pull heartbeat trajectories into training data make curated # Regenerate curated exemplar dataset make convert # Convert merged data to MLX train/valid format make help # Show all targets ``` ## Files ``` training/ ├── Makefile ← All commands ├── axolotl.yaml ← Cloud training config (replaces train_modal.py) ├── mlx-lora.yaml ← Local training config (Apple Silicon) ├── eval-tasks.yaml ← Benchmark config ├── build_curated.py ← Exemplar data authoring (the soul conversations) ├── ingest_trajectories.py ← Quality filter for heartbeat cycle data └── data/ ├── curated_dataset.jsonl ← 26 gold-standard conversations (proprietary) ├── preference_pairs.jsonl ← DPO preference pairs (proprietary) ├── prompts_vibes.yaml ← Custom eval prompts ├── prompts_nexus_vibes.yaml ← Nexus-specific eval prompts └── mlx_curated/ ← MLX-format train/valid splits ``` ## What's proprietary The data (curated exemplars, preference pairs, trained weights) is proprietary. The configs and process are open. ## Training Results (March 2026) ### timmy:v0.1-q4 | Detail | Value | |--------|-------| | Base model | mlx-community/Hermes-3-Llama-3.1-8B-4bit | | Training data | 1,214 samples from Hermes session DB | | Method | LoRA rank 8, 16 layers, lr 2e-6, 1000 iters | | Peak memory | 7.8 GB (Apple Silicon) | | Best val loss | 2.134 (iter 800) | | Final model | timmy:v0.1-q4 in Ollama (4.9GB, Q4_K_M) | | Inference speed | ~48 tok/s on M3 Max | ### Key Insight The base model's RLHF priors override LoRA on crisis/faith — the most important parts of SOUL.md. Fix: inference-time grounding (inject SOUL.md crisis protocol) + larger pure-Timmy corpus over time.