[AutoLoRA P1] LoRA Fine-Tune Hermes 4 on Exported Trajectories #1103

Closed
opened 2026-03-23 17:29:43 +00:00 by perplexity · 1 comment
Collaborator

## LoRA Fine-Tune Hermes 4 on Exported Trajectories

**Priority:** P1-Important
**Assignee:** Alexander (execution) + Timmy (review after)
**Epic:** #1091 — Project Bannerlord
**Pipeline:** AutoLoRA Sovereignty Loop (Step 4 of 7)
**Blocked by:** Export Trajectories issue + Download Hermes 4 issue


### Context

Fine-tune a LoRA adapter on your conversation data so the local model speaks the harness dialect natively, including your 32 custom skills and project context.
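For reference, mlx-lm's LoRA trainer consumes JSONL where each line is one training example; recent versions accept chat-style `{"messages": [...]}` records. The exact schema depends on the mlx-lm version installed, so treat this as a hedged sketch of what one exported conversation turn might look like (the system/user/assistant contents are invented):

```python
import json

# One hypothetical training example in chat ("messages") form.
# Field names follow the ChatML-style JSONL convention; verify
# against the mlx-lm version you actually install.
example = {
    "messages": [
        {"role": "system", "content": "You are Timmy, a coding harness agent."},
        {"role": "user", "content": "List open PRs on the dashboard repo."},
        {"role": "assistant", "content": "Checking the repo now..."},
    ]
}

# Each example is serialized as one JSON object per line in train.jsonl
line = json.dumps(example)
record = json.loads(line)
print(record["messages"][0]["role"])  # → system
```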

### What To Do

**Using mlx-lm (native Apple Silicon — preferred):**

```bash
pip install mlx-lm

# Prepare training data
mkdir -p ~/timmy-lora-training
cp ~/timmy-training-data.jsonl ~/timmy-lora-training/train.jsonl

# Run LoRA fine-tuning
mlx_lm.lora \
  --model <path-to-hermes4-model> \
  --train \
  --data ~/timmy-lora-training \
  --batch-size 1 \
  --lora-layers 16 \
  --iters 1000 \
  --learning-rate 1e-5 \
  --adapter-path ~/timmy-lora-adapter

# Takes 2-8 hours on M3 Max depending on dataset size
# If OOM: reduce batch-size or lora-layers
# Adapter will be ~100-500MB at ~/timmy-lora-adapter/
```
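Before kicking off a multi-hour run, it is worth sanity-checking the JSONL so training doesn't die partway through on a malformed line. A minimal validation sketch — the `messages` key is an assumption about the export format; adjust if the exporter emits `{"text": ...}` records instead:

```python
import json
from pathlib import Path

def validate_jsonl(path):
    """Parse every line; return (ok_count, list of bad 1-based line numbers)."""
    ok, bad = 0, []
    for i, line in enumerate(Path(path).read_text().splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
            # "messages" is assumed from the export step
            assert isinstance(record.get("messages"), list)
            ok += 1
        except (json.JSONDecodeError, AssertionError):
            bad.append(i)
    return ok, bad

# Example: write a tiny file with one good and one broken line
sample = Path("/tmp/train_check.jsonl")
sample.write_text(
    json.dumps({"messages": [{"role": "user", "content": "hi"}]}) + "\n"
    + "not json\n"
)
print(validate_jsonl(sample))  # → (1, [2])
```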

**Alternative path (if mlx-lm doesn't support the format):**

```bash
# Use unsloth on Google Colab, RunPod, or Modal for training
# Download resulting GGUF and import into Ollama locally
```

**Test the fine-tuned model:**

```bash
# Load base model with LoRA adapter
mlx_lm.generate \
  --model <path-to-hermes4-model> \
  --adapter-path ~/timmy-lora-adapter \
  --prompt "List the open PRs on the Timmy Time Dashboard repo and triage them"

# Or merge adapter into base and import to Ollama:
mlx_lm.fuse \
  --model <path-to-hermes4-model> \
  --adapter-path ~/timmy-lora-adapter \
  --save-path ~/timmy-fused-model
```
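Importing into Ollama typically means a Modelfile plus `ollama create`. A hedged sketch — the GGUF filename is hypothetical, and the fused MLX weights need a separate GGUF conversion step first; the chat template must also match Hermes 4's ChatML format or tool calling will break:

```shell
# Write a minimal Ollama Modelfile pointing at a (hypothetical) GGUF
# produced from the fused model
cat > /tmp/Modelfile <<'EOF'
FROM ./timmy-hermes4-fused.gguf
PARAMETER temperature 0.7
EOF

# Then (not run here): ollama create timmy-hermes4 -f /tmp/Modelfile
grep -c '^FROM' /tmp/Modelfile   # → 1
```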

### Done When

- [ ] LoRA training completes without OOM errors
- [ ] Adapter file exists at `~/timmy-lora-adapter/`
- [ ] Model with adapter responds to tool-calling prompts
- [ ] Skills that FAILED with base Hermes 4 now WORK (compare against the fail list)
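The last checkbox implies a before/after diff of skill results. A minimal sketch of that comparison, assuming each run yields a simple skill-name → pass/fail mapping (the skill names and file format here are invented):

```python
# Compare skill results before (base Hermes 4) and after (with adapter).
# Inputs are hypothetical dicts of skill name -> bool (passed).
base  = {"triage_prs": False, "git_commit": True, "deploy": False}
tuned = {"triage_prs": True,  "git_commit": True, "deploy": False}

fixed         = sorted(s for s in base if not base[s] and tuned.get(s))
still_failing = sorted(s for s in base if not base[s] and not tuned.get(s))
regressed     = sorted(s for s in base if base[s] and not tuned.get(s))

print("fixed:", fixed)                  # → fixed: ['triage_prs']
print("still failing:", still_failing)  # → still failing: ['deploy']
print("regressed:", regressed)          # → regressed: []
```

Any skill in `regressed` is a sign the adapter overfit the trajectories and the run should be repeated with fewer iterations or more data.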
claude self-assigned this 2026-03-23 17:54:06 +00:00
Collaborator

PR created: http://143.198.27.163:3000/rockachopa/Timmy-time-dashboard/pulls/1117

**Summary of changes (AutoLoRA Step 4):**

- `scripts/export_trajectories.py` — converts `session_*.jsonl` logs into mlx-lm ChatML training data. Groups turns, embeds tool calls as `<tool_call>` XML, filters short responses, outputs `~/timmy-training-data.jsonl`
- `scripts/lora_finetune.py` — launcher wrapper around `mlx_lm.lora` with project defaults (batch=1, lora-layers=16, iters=1000, lr=1e-5), pre-flight checks, `--dry-run`, `--test`, and `--fuse` modes
- `tests/scripts/test_export_trajectories.py` — 20 unit tests, all passing

Workflow: `export_trajectories.py` → copy to `~/timmy-lora-training/train.jsonl` → `lora_finetune.py` → `--test` → `--fuse` → import to Ollama
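For context on the `<tool_call>` embedding mentioned above: Hermes-family models conventionally emit tool calls as a JSON payload wrapped in `<tool_call>` tags inside the assistant message. A sketch of how an exporter might serialize a logged tool call into that form (the function name and arguments are invented; check the actual `export_trajectories.py` for the real serialization):

```python
import json

def embed_tool_call(name, arguments):
    """Render a logged tool call as Hermes-style <tool_call> text."""
    payload = json.dumps({"name": name, "arguments": arguments})
    return f"<tool_call>\n{payload}\n</tool_call>"

text = embed_tool_call("list_prs", {"repo": "Timmy-time-dashboard", "state": "open"})
print(text)
# <tool_call>
# {"name": "list_prs", "arguments": {"repo": "Timmy-time-dashboard", "state": "open"}}
# </tool_call>
```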


Reference: Rockachopa/Timmy-time-dashboard#1103