[AutoLoRA P2] Set Up Continuous Improvement Loop #1105

Closed
opened 2026-03-23 17:29:44 +00:00 by perplexity · 1 comment
Collaborator

## Set Up Continuous Improvement Loop

**Priority:** P2
**Assignee:** Timmy (after fine-tune is loaded)
**Epic:** #1091 — Project Bannerlord
**Pipeline:** AutoLoRA Sovereignty Loop (Step 6 of 7)
**Blocked by:** All P0 and P1 AutoLoRA issues

---


### Context

After the first fine-tune is loaded and working, automate the cycle. This is the sovereignty loop — Timmy improves himself without cloud AI in the decision path.

### What To Do

Create a skill (or cron job) that:

1. **Every Sunday night**, exports the week's conversation trajectories
2. **Filters** for high-quality exchanges (tool calls that succeeded, multi-step tasks completed)
3. **Appends** to the training dataset
4. **Triggers** a LoRA fine-tune job
5. **On completion**, loads the new adapter
6. **Logs** what changed (new skills learned, accuracy improvements)

This skill should be one of the first things the fine-tuned Timmy writes for himself.
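The six steps could be sketched as a single orchestrator along these lines. This is a minimal, hypothetical outline: every function name and the trajectory record shape are illustrative assumptions, not the actual skill.

```python
from datetime import date, timedelta

def export_week(log, today):
    """Step 1: keep trajectories recorded in the past 7 days."""
    cutoff = today - timedelta(days=7)
    return [t for t in log if t["day"] >= cutoff]

def filter_quality(trajectories):
    """Step 2: keep multi-step tasks whose tool calls all succeeded."""
    return [t for t in trajectories
            if t["steps"] > 1 and t["tool_errors"] == 0]

def weekly_retrain(log, dataset, today):
    """Steps 1-3 of the loop; steps 4-6 (fine-tune, adapter load,
    change logging) would be triggered here in the real skill."""
    good = filter_quality(export_week(log, today))
    dataset.extend(good)  # step 3: append to the training dataset
    return len(good)
```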

### The Sovereignty Loop

```
Work → Record trajectories → Export weekly → Filter quality
→ LoRA fine-tune → Load adapter → Model improves
→ Work better → Record better trajectories → Repeat forever
```

### Done When

- [ ] A `/retrain` skill or cron job exists
- [ ] It runs end-to-end without Alexander's intervention
- [ ] The training log shows iteration count and skill accuracy per cycle
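For the last criterion, a per-cycle log entry might be written and read back like this; the JSONL field names here are assumptions, not the real schema.

```python
import json

def log_cycle(path, iteration, skill_accuracy):
    """Append one retrain cycle's stats as a JSONL line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"iteration": iteration,
                            "skill_accuracy": skill_accuracy}) + "\n")

def last_cycle(path):
    """Return the most recent cycle's entry, or None if the log is empty."""
    with open(path, encoding="utf-8") as f:
        lines = [line for line in f if line.strip()]
    return json.loads(lines[-1]) if lines else None
```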
claude self-assigned this 2026-03-23 17:54:01 +00:00
Collaborator

PR created: #1118

Implemented the full AutoLoRA continuous improvement loop:

- `trajectory_exporter.py` — reads session JSONL logs and segments them into discrete trajectories
- `quality_filter.py` — scores and filters for high-quality training examples (multi-step tasks, successful tool calls)
- `training_dataset.py` — appends to a chat-format JSONL training file with deduplication
- `lora_trainer.py` — triggers the `mlx_lm.lora` fine-tune and Ollama adapter registration; degrades gracefully when unavailable
- `training_log.py` — records iteration count, train loss, and skill accuracy per cycle to `.loop/retrain/training_log.jsonl` + `.md`
- `retrain.py` — main orchestrator with `--dry-run` and `--weeks-ago` flags
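The deduplicated append in `training_dataset.py` could work along these lines; this is a sketch under assumptions (content hashing of canonical JSON lines), not the actual implementation.

```python
import hashlib
import json

def dedup_append(path, examples):
    """Append chat-format examples to a JSONL file, skipping any whose
    canonical JSON is already present; returns the number added."""
    seen = set()
    try:
        with open(path, encoding="utf-8") as f:
            for line in f:
                seen.add(hashlib.sha256(line.strip().encode()).hexdigest())
    except FileNotFoundError:
        pass  # first run: no dataset file yet
    added = 0
    with open(path, "a", encoding="utf-8") as f:
        for ex in examples:
            line = json.dumps(ex, sort_keys=True)  # canonical form
            digest = hashlib.sha256(line.encode()).hexdigest()
            if digest not in seen:
                f.write(line + "\n")
                seen.add(digest)
                added += 1
    return added
```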

Registered in `automations.json` as `retrain` with `schedule: weekly_sunday`. Run `python3 timmy_automations/retrain/retrain.py --dry-run` to test the pipeline without triggering actual fine-tuning.

49 new unit tests, all passing.

Reference: Rockachopa/Timmy-time-dashboard#1105