Twitter Archive Learning Pipeline
This repo owns the tracked code, schemas, prompts, and eval contracts for Timmy's private Twitter archive learning loop.
Privacy Boundary
- Raw archive files stay outside git.
- Derived runtime artifacts live under ~/.timmy/twitter-archive/.
- twitter-archive/ is ignored by timmy-home so private notes and training artifacts do not get pushed by accident.
Tracked here:
- deterministic extraction and consolidation scripts
- output schemas
- eval gate contract
- prompt/orchestration code in timmy-config
Not tracked here:
- raw tweets
- extracted tweet text
- batch notes
- private profile artifacts
- local-only DPO pairs
- local eval outputs
Runtime Layout
The runtime workspace is:
~/.timmy/twitter-archive/
  extracted/
  notes/
  knowledge/
  insights/
  training/
  checkpoint.json
  metrics/progress.json
  source_config.json
  pipeline_config.json
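As a minimal sketch, the directories in this layout could be created up front as shown below. The function name `ensure_runtime_layout` and the `RUNTIME_DIRS` constant are hypothetical and not part of this repo; only the directory names come from the layout above.

```python
from pathlib import Path

# Directory names taken from the runtime layout; everything else is illustrative.
RUNTIME_DIRS = ("extracted", "notes", "knowledge", "insights", "training", "metrics")


def ensure_runtime_layout(root: Path) -> Path:
    """Create the runtime workspace directories if they are missing."""
    for name in RUNTIME_DIRS:
        # exist_ok makes the call safe to repeat on an existing workspace.
        (root / name).mkdir(parents=True, exist_ok=True)
    return root
```

A caller would typically pass `Path.home() / ".timmy" / "twitter-archive"` as `root`.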
Source Config
Optional local file:
{
  "source_path": "~/Downloads/twitter-.../data"
}
Environment override:
TIMMY_TWITTER_ARCHIVE_SOURCE=~/Downloads/twitter-.../data
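The lookup order described above could be sketched roughly as follows. This assumes the environment variable takes precedence over source_config.json (it is described as an override) and that `~` is expanded to the home directory; the function name `resolve_source_path` and the `env` parameter are hypothetical.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

ENV_VAR = "TIMMY_TWITTER_ARCHIVE_SOURCE"


def resolve_source_path(
    workspace: Path, env: Optional[Mapping[str, str]] = None
) -> Optional[Path]:
    """Return the archive source path, or None if nothing is configured."""
    env = os.environ if env is None else env

    # Assumption: the environment override wins over the local config file.
    override = env.get(ENV_VAR)
    if override:
        return Path(override).expanduser()

    config_file = workspace / "source_config.json"
    if config_file.exists():
        data = json.loads(config_file.read_text(encoding="utf-8"))
        source = data.get("source_path")
        if source:
            return Path(source).expanduser()

    return None
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.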
Knowledge Candidate Schema
Each batch candidate file contains:
- id
- category
- claim
- evidence_tweet_ids
- evidence_quotes
- confidence
- status
- first_seen_at
- last_confirmed_at
- contradicts
The consolidator uses these fields to classify each candidate as durable, provisional, or retracted.
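A simple completeness check over the schema fields might look like the sketch below. It assumes every listed field must be present in each candidate, which this document does not state explicitly. The helper name `missing_fields` is hypothetical, and the consolidator's actual classification rules are not reproduced here.

```python
from typing import Any, List, Mapping

# Field names copied from the Knowledge Candidate Schema.
CANDIDATE_FIELDS = (
    "id",
    "category",
    "claim",
    "evidence_tweet_ids",
    "evidence_quotes",
    "confidence",
    "status",
    "first_seen_at",
    "last_confirmed_at",
    "contradicts",
)


def missing_fields(candidate: Mapping[str, Any]) -> List[str]:
    """List schema fields absent from a candidate (assumes all are required)."""
    return [name for name in CANDIDATE_FIELDS if name not in candidate]
```

An empty result means the candidate carries every field the schema names; it says nothing about whether the values are well-formed.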
Candidate Eval Contract
Local eval JSON files under ~/.timmy/twitter-archive/training/evals/ must use:
{
  "candidate_id": "timmy-archive-v0.1",
  "baseline_composite": 0.71,
  "candidate_composite": 0.76,
  "refusal_over_fabrication_regression": false,
  "source_distinction_regression": false,
  "evidence_citation_rate": 0.98,
  "rollback_model": "timmy-archive-v0.0"
}
Promotion gate:
- candidate composite improves by at least 5%
- no refusal-over-fabrication regression
- no source distinction regression
- evidence citation rate stays at or above 95%
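The gate can be sketched as a single predicate over an eval report. One point is an assumption: "improves by at least 5%" is read here as a relative gain over the baseline, not an absolute gain of 0.05; the pipeline's real check may differ. The function name and its threshold parameters are hypothetical.

```python
from typing import Any, Mapping


def passes_promotion_gate(
    report: Mapping[str, Any],
    min_improvement: float = 0.05,
    min_citation_rate: float = 0.95,
) -> bool:
    """Return True when an eval report satisfies all four gate conditions."""
    baseline = report["baseline_composite"]
    candidate = report["candidate_composite"]
    if baseline <= 0:
        # A relative gain is undefined without a positive baseline.
        return False

    # Assumption: the 5% threshold is relative to the baseline score.
    relative_gain = (candidate - baseline) / baseline

    return (
        relative_gain >= min_improvement
        and not report["refusal_over_fabrication_regression"]
        and not report["source_distinction_regression"]
        and report["evidence_citation_rate"] >= min_citation_rate
    )
```

Under this reading, the example report above passes: the composite moves from 0.71 to 0.76, a relative gain of about 7%, with no regressions and a citation rate of 0.98.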
Training Command Contract
Optional local file pipeline_config.json can define:
{
  "train_command": "bash -lc 'echo train me'",
  "promote_command": "bash -lc 'echo promote me'"
}
If these commands are absent, the pipeline still prepares artifacts and run manifests, but training/promotion stays in a ready state instead of executing.
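The fallback behaviour could be sketched as below: a stage runs only if its command is configured, and otherwise reports that it is ready. The function name `run_stage` and the returned status strings are hypothetical, and running the command through the shell is an assumption based on the example values being shell command lines.

```python
import json
import subprocess
from pathlib import Path


def run_stage(workspace: Path, key: str) -> str:
    """Run the command stored under `key`, or report 'ready' if none is set."""
    config_file = workspace / "pipeline_config.json"
    config = {}
    if config_file.exists():
        config = json.loads(config_file.read_text(encoding="utf-8"))

    command = config.get(key)
    if not command:
        # No command configured: artifacts are prepared, nothing executes.
        return "ready"

    # Assumption: the configured value is a shell command line.
    subprocess.run(command, shell=True, check=True)
    return "executed"
```

For example, `run_stage(workspace, "train_command")` returns `"ready"` when pipeline_config.json is missing or has no `train_command` entry.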