feat: Training data pipeline — knowledge entries → JSONL training pairs #199

Open
opened 2026-04-15 15:18:07 +00:00 by Rockachopa · 0 comments

Epic: #136 (Knowledge Pipeline v2)

Task

Convert quality-gated knowledge entries into training data pairs.

Flow

Sessions → Harvester → Dedup → Provenance → Quality Gate → Training Pairs → JSONL
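One way to read the flow above is as a chain of stages, each consuming and yielding a stream of entries. The sketch below is only an illustration of that chaining: the `dedup` and `quality_gate` bodies, the `text` and `confidence` field names, and the 0.8 threshold are all assumptions, not the actual Harvester, Dedup, or Quality Gate implementations.

```python
from functools import reduce


def dedup(entries):
    """Hypothetical dedup stage: drop entries whose normalized text was already seen."""
    seen = set()
    for entry in entries:
        key = entry["text"].strip().lower()  # "text" is an assumed field name
        if key not in seen:
            seen.add(key)
            yield entry


def quality_gate(entries, threshold=0.8):
    """Hypothetical quality gate: keep entries at or above an assumed confidence threshold."""
    for entry in entries:
        if entry["confidence"] >= threshold:  # "confidence" is an assumed field name
            yield entry


def run_pipeline(entries, stages):
    """Feed the entry stream through each stage in order."""
    return reduce(lambda stream, stage: stage(stream), stages, entries)
```

Because every stage is a generator, entries flow through lazily, and stages can be added, removed, or reordered by editing the list passed to `run_pipeline`.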

Training Pair Format

{
  "terse": "How do I fix Gitea clone timeout?",
  "rich": "When Gitea clone times out on large repos, use the REST API workaround: POST /branches to create branch, then POST/PUT /contents/{path} to commit files directly. See gitea-api-workarounds skill for full pattern.",
  "domain": "devops",
  "source_confidence": 0.9,
  "source_model": "xiaomi/mimo-v2-pro"
}
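A minimal sketch of how a pair in this format might be checked and serialized, one JSON object per line. The field names come from the example above; the helper names `validate_pair` and `write_jsonl`, and the assumption that `source_confidence` lies in [0, 1], are illustrative only.

```python
import io
import json

# Field names taken from the example pair; expected types are inferred from it.
REQUIRED_FIELDS = {
    "terse": str,
    "rich": str,
    "domain": str,
    "source_confidence": (int, float),
    "source_model": str,
}


def validate_pair(pair):
    """Return a list of problems; an empty list means the pair looks valid."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in pair:
            problems.append(f"missing field: {name}")
        elif isinstance(pair[name], bool) or not isinstance(pair[name], expected):
            problems.append(f"wrong type for {name}")
    conf = pair.get("source_confidence")
    # Assumption: confidence is a probability-like score in [0, 1].
    if isinstance(conf, (int, float)) and not isinstance(conf, bool):
        if not 0.0 <= conf <= 1.0:
            problems.append("source_confidence outside [0, 1]")
    return problems


def write_jsonl(pairs, stream):
    """Write each valid pair as one JSON line; return the number written."""
    written = 0
    for pair in pairs:
        if validate_pair(pair):
            continue  # skip invalid pairs rather than emit broken lines
        stream.write(json.dumps(pair, ensure_ascii=False) + "\n")
        written += 1
    return written
```

`ensure_ascii=False` keeps any non-ASCII text in `terse` or `rich` readable in the output file, and `json.dumps` escapes embedded newlines so each pair stays on a single line.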

Deliverables

  • Pipeline script: compounding-intelligence/training_pipeline.py
  • End-to-end: session JSONL → training JSONL
  • Config: filter by model, confidence, date range
  • Output to timmy-config/training/generated/
  • Test: 10 sessions → verify valid training pairs
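The config deliverable (filter by model, confidence, date range) could be sketched as below. `FilterConfig`, `keep`, and the `created` date field are hypothetical: the training pair format shown above carries no date, so this assumes the upstream knowledge entry records an ISO `YYYY-MM-DD` creation date.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class FilterConfig:
    """Hypothetical filter settings; None means 'no restriction'."""
    models: Optional[FrozenSet[str]] = None
    min_confidence: float = 0.0
    start: Optional[date] = None  # inclusive
    end: Optional[date] = None    # inclusive


def keep(entry, cfg):
    """Return True if the entry passes the model, confidence, and date filters."""
    if cfg.models is not None and entry["source_model"] not in cfg.models:
        return False
    if entry["source_confidence"] < cfg.min_confidence:
        return False
    day = date.fromisoformat(entry["created"])  # "created" is an assumed field
    if cfg.start is not None and day < cfg.start:
        return False
    if cfg.end is not None and day > cfg.end:
        return False
    return True
```

For example, `FilterConfig(models=frozenset({"xiaomi/mimo-v2-pro"}), min_confidence=0.8)` would keep only entries from that model with confidence of at least 0.8, while a default `FilterConfig()` lets everything through.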

Labels: training-data, pipeline, priority:critical

hermes was assigned by Rockachopa 2026-04-15 16:23:27 +00:00
hermes was unassigned by Rockachopa 2026-04-17 05:06:15 +00:00
codex-agent was assigned by Rockachopa 2026-04-22 02:10:57 +00:00
Reference: Timmy_Foundation/compounding-intelligence#199