Commit Graph

3 Commits

Author SHA1 Message Date
Alexander Whitestone
83457cc9a9 feat(#691): training pair provenance tracking — source session + model
ProvenanceTracker: added add_provenance(), extract_provenance_from_existing(),
filter_by_provenance(), generate_report() methods.

Fixed save_jsonl() to accept both (path, entries) and (entries, path)
argument orders for backward compatibility.
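A minimal sketch of how a save_jsonl() could accept either argument order. This is an illustration, not the repository's actual code: it assumes the path is a str or os.PathLike and that entries are an iterable of JSON-serializable dicts. The parameter names and the returned count are hypothetical.

```python
import json
import os


def save_jsonl(first, second):
    """Write entries as JSON Lines.

    Accepts (path, entries) or (entries, path). Hypothetical sketch:
    the real signature in the repository is not shown in the commit.
    """
    # Whichever argument looks like a filesystem path is treated as the path.
    if isinstance(first, (str, os.PathLike)):
        path, entries = first, second
    elif isinstance(second, (str, os.PathLike)):
        entries, path = first, second
    else:
        raise TypeError("save_jsonl expects one path and one iterable of entries")

    entries = list(entries)
    with open(path, "w", encoding="utf-8") as fh:
        for entry in entries:
            # One JSON object per line.
            fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return len(entries)
```

Dispatching on type keeps old call sites working without a keyword-only migration. The cost is that a bare string passed as entries would be mistaken for a path.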

build_curated.py: every exemplar now gets provenance metadata
(source=curated, source_session_id, model=timmy-curated, timestamp).
Provenance coverage reported in build output.

Acceptance criteria:
- [x] Add metadata to each pair: source_session_id, model, timestamp
- [x] Filter pairs by provenance (exclude_models, exclude_sources)
- [x] Report: pair count by source model
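The three criteria above can be sketched as plain functions. The field names (source, source_session_id, model, timestamp) and the exclude_models / exclude_sources parameters come from the commit message. The function signatures, the "unknown" fallback, and the example values are assumptions; the actual ProvenanceTracker methods may differ.

```python
from collections import Counter
from datetime import datetime, timezone


def add_provenance(pair, source, source_session_id, model, timestamp=None):
    """Return a copy of `pair` with provenance fields attached (assumed shape)."""
    tagged = dict(pair)
    tagged.update({
        "source": source,
        "source_session_id": source_session_id,
        "model": model,
        # Default to the current UTC time when no timestamp is supplied.
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
    })
    return tagged


def filter_by_provenance(pairs, exclude_models=(), exclude_sources=()):
    """Drop pairs whose model or source is in an exclusion set."""
    return [
        p for p in pairs
        if p.get("model") not in exclude_models
        and p.get("source") not in exclude_sources
    ]


def generate_report(pairs):
    """Count pairs per source model; untagged pairs fall under 'unknown'."""
    return dict(Counter(p.get("model", "unknown") for p in pairs))
```

For example, tagging one pair with model="timmy-curated" and another with a hypothetical model="other-model", then calling filter_by_provenance(pairs, exclude_models={"other-model"}), would keep only the curated pair.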

Closes #691
2026-04-21 07:14:27 -04:00
1c69029d9c feat: integrate provenance tracking with build_curated.py (#752)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 25s
Validate Config / YAML Lint (pull_request) Failing after 13s
Validate Config / JSON Validate (pull_request) Successful in 14s
Validate Config / Shell Script Lint (pull_request) Failing after 41s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m6s
Validate Training Data / validate (pull_request) Successful in 12s
PR Checklist / pr-checklist (pull_request) Failing after 5m22s
Smoke Test / smoke (pull_request) Failing after 17s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 7s
Validate Config / Playbook Schema Validation (pull_request) Successful in 17s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 05:25:49 +00:00
perplexity
6507cffc15 feat: migrate autolora pipeline into training/
Per direction shift (the-nexus#542).

Replaces the autolora repo (1,500 lines of custom pipeline code)
with config files for existing tools:

- axolotl.yaml: replaces train_modal.py (239 lines)
- mlx-lora.yaml: replaces MLX training scripts
- eval-tasks.yaml: replaces run_eval.py (300 lines)
- Makefile: replaces run_vibes.py, compare.py, convert_to_mlx.py

Data migrated as-is:
- curated_dataset.jsonl (26 gold-standard conversations)
- preference_pairs.jsonl (DPO pairs)
- prompts_vibes.yaml, prompts_nexus_vibes.yaml
- v0-baseline eval results (historical record)

Thin glue kept:
- build_curated.py (data authoring, not infrastructure)
- ingest_trajectories.py (domain-specific quality filter)

Dependencies: pip install axolotl mlx-lm lm-evaluation-harness
2026-03-25 23:05:50 +00:00