[TRAINING] Sophistication corpus from accepted PRs, failure cases, doctrine, and research #57

Open
opened 2026-03-29 23:58:34 +00:00 by Timmy · 2 comments
Owner

Context
If Timmy is going to become genuinely sophisticated, the corpus must come from lived work, accepted improvements, doctrine-bearing text, and measured failures.
Not just vibes, not just raw chats.

Goal
Build the first explicit sophistication corpus for Timmy from:

  • accepted PRs and their diffs
  • failed local sessions and eval transcripts
  • SOUL / doctrine-bearing texts
  • autoresearch artifacts with provenance
  • accepted architecture and policy decisions

This is the bridge between backlog work and a smarter local Timmy.

Acceptance criteria

  • a manifest defines what goes into the corpus and what stays out
  • corpus slices exist for at least: doctrine, decisions, accepted code/PR work, failure cases, and research artifacts
  • all included artifacts retain provenance
  • sanitization is run before export where needed
  • one sample export is ready for future distillation/eval work
  • docs explain how new merged PRs and new failures get folded into the corpus over time
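
A minimal sketch of how the manifest and provenance criteria above might be checked in code. Only the five slice names come from this issue; the `Artifact` and `Manifest` types, their fields, and the `raw_chat` exclusion marker are hypothetical and not an existing Timmy API.

```python
from dataclasses import dataclass

# Slice names follow the acceptance criteria; all other names are hypothetical.
REQUIRED_SLICES = ("doctrine", "decisions", "accepted_prs", "failure_cases", "research")


@dataclass(frozen=True)
class Artifact:
    slice_name: str
    source: str          # provenance: path, PR reference, or transcript id
    text: str
    accepted: bool = True


@dataclass(frozen=True)
class Manifest:
    include_slices: tuple = REQUIRED_SLICES
    exclude_markers: tuple = ("raw_chat",)  # assumed rule: raw chats stay out

    def admits(self, artifact):
        """Return True if the artifact belongs in the corpus."""
        if artifact.slice_name not in self.include_slices:
            return False
        if not artifact.source:
            return False  # no provenance, no inclusion
        if any(marker in artifact.source for marker in self.exclude_markers):
            return False
        # Failure cases are kept even though they were never "accepted".
        return artifact.accepted or artifact.slice_name == "failure_cases"


def build_corpus(artifacts, manifest):
    """Group admitted artifacts by slice, keeping empty slices visible."""
    corpus = {name: [] for name in manifest.include_slices}
    for artifact in artifacts:
        if manifest.admits(artifact):
            corpus[artifact.slice_name].append(artifact)
    return corpus


def missing_slices(corpus):
    """List required slices that have no artifacts yet."""
    return [name for name in REQUIRED_SLICES if not corpus.get(name)]
```

Calling `missing_slices(build_corpus(artifacts, Manifest()))` before an export would show which required slices are still empty. Sanitization is deliberately left out of this sketch, since the issue does not say how it is performed.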

PR proof required

  • merged PR with corpus manifest, sample exported slice, and inclusion/exclusion rules
Timmy self-assigned this 2026-03-29 23:58:34 +00:00
Member

📊 Project Context

Training is the most mature area (67% completion across backlog).

Related Work

  • #56 (Provenance-to-PR pipeline) - feeds into this
  • #46 (Archive understanding) - artistic/cultural dimension
  • #96 (Recurrent capabilities) - evaluation component

Strategic Opportunity

Connect training sophistication to Timmy's self-improvement loop:

  1. Accept PRs → build corpus
  2. Train on accepted changes
  3. Improve PR quality
  4. Reduce review burden

This creates a virtuous cycle of improvement.


Strategic context from backlog research

Author
Owner

Uniwizard (#94) context: Training corpus from accepted PRs — still valid. Feeds Phase 4 self-improvement. Carries forward.

Reference: Timmy_Foundation/timmy-home#57