[AutoLoRA P1] LoRA Fine-Tune Hermes 4 on Exported Trajectories #1103

Closed
opened 2026-03-23 17:29:43 +00:00 by perplexity · 1 comment
Collaborator

## LoRA Fine-Tune Hermes 4 on Exported Trajectories

**Priority:** P1-Important
**Assignee:** Alexander (execution) + Timmy (review after)
**Epic:** #1091 — Project Bannerlord
**Pipeline:** AutoLoRA Sovereignty Loop (Step 4 of 7)
**Blocked by:** Export Trajectories issue + Download Hermes 4 issue


### Context

Fine-tune a LoRA adapter on your conversation data so the local model speaks the harness dialect natively, including your 32 custom skills and project context.
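For reference, mlx-lm's LoRA trainer consumes JSONL where each line is one training example; recent versions accept chat-style `{"messages": [...]}` records. The exact schema depends on the mlx-lm version installed, so treat this as a hedged sketch of what one exported conversation turn might look like (the system/user/assistant contents are invented):

```python
import json

# One hypothetical training example in chat ("messages") form.
# Field names follow the ChatML-style JSONL convention; verify
# against the mlx-lm version you actually install.
example = {
    "messages": [
        {"role": "system", "content": "You are Timmy, a coding harness agent."},
        {"role": "user", "content": "List open PRs on the dashboard repo."},
        {"role": "assistant", "content": "Checking the repo now..."},
    ]
}

# Each example is serialized as one JSON object per line in train.jsonl
line = json.dumps(example)
record = json.loads(line)
print(record["messages"][0]["role"])  # → system
```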

### What To Do

**Using mlx-lm (native Apple Silicon — preferred):**

```bash
pip install mlx-lm

# Prepare training data
mkdir -p ~/timmy-lora-training
cp ~/timmy-training-data.jsonl ~/timmy-lora-training/train.jsonl

# Run LoRA fine-tuning
mlx_lm.lora \
  --model <path-to-hermes4-model> \
  --train \
  --data ~/timmy-lora-training \
  --batch-size 1 \
  --lora-layers 16 \
  --iters 1000 \
  --learning-rate 1e-5 \
  --adapter-path ~/timmy-lora-adapter

# Takes 2-8 hours on M3 Max depending on dataset size
# If OOM: reduce batch-size or lora-layers
# Adapter will be ~100-500MB at ~/timmy-lora-adapter/
```
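Before kicking off a multi-hour run, it is worth sanity-checking the JSONL so training doesn't die partway through on a malformed line. A minimal validation sketch — the `messages` key is an assumption about the export format; adjust if the exporter emits `{"text": ...}` records instead:

```python
import json
from pathlib import Path

def validate_jsonl(path):
    """Parse every line; return (ok_count, list of bad 1-based line numbers)."""
    ok, bad = 0, []
    for i, line in enumerate(Path(path).read_text().splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
            # "messages" is assumed from the export step
            assert isinstance(record.get("messages"), list)
            ok += 1
        except (json.JSONDecodeError, AssertionError):
            bad.append(i)
    return ok, bad

# Example: write a tiny file with one good and one broken line
sample = Path("/tmp/train_check.jsonl")
sample.write_text(
    json.dumps({"messages": [{"role": "user", "content": "hi"}]}) + "\n"
    + "not json\n"
)
print(validate_jsonl(sample))  # → (1, [2])
```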

**Alternative path (if mlx-lm doesn't support the format):**

```bash
# Use unsloth on Google Colab, RunPod, or Modal for training
# Download resulting GGUF and import into Ollama locally
```

**Test the fine-tuned model:**

```bash
# Load base model with LoRA adapter
mlx_lm.generate \
  --model <path-to-hermes4-model> \
  --adapter-path ~/timmy-lora-adapter \
  --prompt "List the open PRs on the Timmy Time Dashboard repo and triage them"

# Or merge adapter into base and import to Ollama:
mlx_lm.fuse \
  --model <path-to-hermes4-model> \
  --adapter-path ~/timmy-lora-adapter \
  --save-path ~/timmy-fused-model
```
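Importing into Ollama typically means a Modelfile plus `ollama create`. A hedged sketch — the GGUF filename is hypothetical, and the fused MLX weights need a separate GGUF conversion step first; the chat template must also match Hermes 4's ChatML format or tool calling will break:

```shell
# Write a minimal Ollama Modelfile pointing at a (hypothetical) GGUF
# produced from the fused model
cat > /tmp/Modelfile <<'EOF'
FROM ./timmy-hermes4-fused.gguf
PARAMETER temperature 0.7
EOF

# Then (not run here): ollama create timmy-hermes4 -f /tmp/Modelfile
grep -c '^FROM' /tmp/Modelfile   # → 1
```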

### Done When

- [ ] LoRA training completes without OOM errors
- [ ] Adapter file exists at `~/timmy-lora-adapter/`
- [ ] Model with adapter responds to tool-calling prompts
- [ ] Skills that FAILED with base Hermes 4 now WORK (compare against the fail list)
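The last checkbox implies a before/after diff of skill results. A minimal sketch of that comparison, assuming each run yields a simple skill-name → pass/fail mapping (the skill names and file format here are invented):

```python
# Compare skill results before (base Hermes 4) and after (with adapter).
# Inputs are hypothetical dicts of skill name -> bool (passed).
base  = {"triage_prs": False, "git_commit": True, "deploy": False}
tuned = {"triage_prs": True,  "git_commit": True, "deploy": False}

fixed         = sorted(s for s in base if not base[s] and tuned.get(s))
still_failing = sorted(s for s in base if not base[s] and not tuned.get(s))
regressed     = sorted(s for s in base if base[s] and not tuned.get(s))

print("fixed:", fixed)                  # → fixed: ['triage_prs']
print("still failing:", still_failing)  # → still failing: ['deploy']
print("regressed:", regressed)          # → regressed: []
```

Any skill in `regressed` is a sign the adapter overfit the trajectories and the run should be repeated with fewer iterations or more data.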
claude self-assigned this 2026-03-23 17:54:06 +00:00
Collaborator

PR created: http://143.198.27.163:3000/rockachopa/Timmy-time-dashboard/pulls/1117

**Summary of changes (AutoLoRA Step 4):**

- `scripts/export_trajectories.py` — converts `session_*.jsonl` logs into mlx-lm ChatML training data. Groups turns, embeds tool calls as `<tool_call>` XML, filters short responses, outputs `~/timmy-training-data.jsonl`
- `scripts/lora_finetune.py` — launcher wrapper around `mlx_lm.lora` with project defaults (batch=1, lora-layers=16, iters=1000, lr=1e-5), pre-flight checks, `--dry-run`, `--test`, and `--fuse` modes
- `tests/scripts/test_export_trajectories.py` — 20 unit tests, all passing

Workflow: `export_trajectories.py` → copy to `~/timmy-lora-training/train.jsonl` → `lora_finetune.py` → `--test` → `--fuse` → import to Ollama
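For context on the `<tool_call>` embedding mentioned above: Hermes-family models conventionally emit tool calls as a JSON payload wrapped in `<tool_call>` tags inside the assistant message. A sketch of how an exporter might serialize a logged tool call into that form (the function name and arguments are invented; check the actual `export_trajectories.py` for the real serialization):

```python
import json

def embed_tool_call(name, arguments):
    """Render a logged tool call as Hermes-style <tool_call> text."""
    payload = json.dumps({"name": name, "arguments": arguments})
    return f"<tool_call>\n{payload}\n</tool_call>"

text = embed_tool_call("list_prs", {"repo": "Timmy-time-dashboard", "state": "open"})
print(text)
# <tool_call>
# {"name": "list_prs", "arguments": {"repo": "Timmy-time-dashboard", "state": "open"}}
# </tool_call>
```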


Reference: Rockachopa/Timmy-time-dashboard#1103