Build private Twitter archive learning pipeline for Timmy #3

codex-agent · 2026-03-27T22:09:04Z

codex-agent commented

2026-03-27 22:09:04 +00:00

Summary

Build a private local-only Twitter archive learning pipeline for Timmy on top of the existing Hermes + know_thy_father flow.

The goal is not just to summarize tweets. The goal is to make Timmy systematically better at:

reading information about Alexander
extracting grounded knowledge
producing actionable insights
turning archive work into DPO/adaptation signal
improving local models over time without checking raw data into shared repos

Decisions Locked

v1 ingests tweets + retweets only
primary output is training signal
archive-derived artifacts stay local under ~/.timmy/twitter-archive/
memory/training generation is mostly automatic
model training/promotion is mostly automatic, but only after offline eval gates pass
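
The last decision above, read together with the auto-promotion acceptance criterion in this issue, could be sketched as a single gate function. This is a minimal illustration only: `EvalReport`, its two score fields, the "higher is better" convention, and the `min_gain` parameter are all assumptions made for the example, not names from the timmy-home eval contracts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalReport:
    """Hypothetical offline eval summary for one model (not the real schema)."""
    archive_score: float  # assumed: higher is better
    safety_score: float   # assumed: higher is better


def should_promote(baseline: EvalReport, candidate: EvalReport,
                   min_gain: float = 0.0) -> bool:
    """Promote only if archive eval improves and safety does not regress."""
    archive_improved = candidate.archive_score > baseline.archive_score + min_gain
    safety_held = candidate.safety_score >= baseline.safety_score
    return archive_improved and safety_held


if __name__ == "__main__":
    baseline = EvalReport(archive_score=0.62, safety_score=0.97)
    # Archive score up, safety unchanged: passes the gate.
    print(should_promote(baseline, EvalReport(0.66, 0.97)))
    # Archive score up, safety down: blocked.
    print(should_promote(baseline, EvalReport(0.70, 0.95)))
```

A nonzero `min_gain` is one way to make the threshold stricter, which bears on the "strict enough" question under Review Focus.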

Deliverables

deterministic archive extractor
two-pass batch reader (draft + critique/rewrite)
structured knowledge candidate schema
consolidated profile.json
weekly actionable insight artifact
DPO pair builder from archive sessions
eval gates for auto-train and auto-promotion
docs/spec for repo boundaries and privacy rules
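
To make the "structured knowledge candidate schema" and "consolidated profile.json" deliverables concrete, here is one possible shape for a candidate record and the merge step. Every field name (`claim`, `confidence`, `evidence_tweet_ids`), the `facts` key, and the 0.8 confidence cutoff are hypothetical; the actual schema lives in timmy-home and may differ.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeCandidate:
    """Hypothetical candidate record; field names are illustrative only."""
    claim: str
    confidence: float
    evidence_tweet_ids: list[str] = field(default_factory=list)

    def is_durable(self, min_confidence: float = 0.8) -> bool:
        # A candidate with no evidence links is never treated as durable.
        return bool(self.evidence_tweet_ids) and self.confidence >= min_confidence


def merge_into_profile(profile: dict, candidates: list[KnowledgeCandidate]) -> dict:
    """Return a new profile dict with durable candidates appended to 'facts'."""
    facts = list(profile.get("facts", []))
    for cand in candidates:
        if cand.is_durable():
            facts.append({"claim": cand.claim,
                          "evidence": list(cand.evidence_tweet_ids)})
    return {**profile, "facts": facts}
```

Rejecting evidence-free candidates at merge time is one way to enforce the "durable knowledge is evidence-linked" criterion mechanically rather than by review.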

Acceptance Criteria

raw archive never lands in tracked repo content
each batch produces notes, structured knowledge, and training examples
durable knowledge is evidence-linked
DPO pairs are generated automatically from archive work
candidate models only auto-promote when archive eval improves and safety specs do not regress
interrupted runs resume from checkpoint without duplicating work
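
The resume criterion above can be illustrated with a small checkpoint loop. This is a sketch under assumptions: the checkpoint is a JSON list of finished batch ids, and `run_batches`, `load_done`, and the file layout are invented for the example rather than taken from the merged scripts.

```python
import json
import tempfile
from pathlib import Path


def load_done(checkpoint: Path) -> set[str]:
    """Read the set of already-finished batch ids (empty if no checkpoint)."""
    if not checkpoint.exists():
        return set()
    return set(json.loads(checkpoint.read_text()))


def run_batches(batch_ids, process, checkpoint: Path) -> list[str]:
    """Process batches in order, skipping any recorded in the checkpoint."""
    done = load_done(checkpoint)
    ran = []
    for batch_id in batch_ids:
        if batch_id in done:
            continue  # finished in an earlier run; do not redo it
        process(batch_id)
        done.add(batch_id)
        # Write to a temp file, then rename over the checkpoint, so a crash
        # mid-write does not leave a truncated checkpoint behind.
        tmp = checkpoint.with_suffix(".tmp")
        tmp.write_text(json.dumps(sorted(done)))
        tmp.replace(checkpoint)
        ran.append(batch_id)
    return ran


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        cp = Path(workdir) / "checkpoint.json"
        print(run_batches(["b1", "b2"], lambda b: None, cp))
        print(run_batches(["b1", "b2", "b3"], lambda b: None, cp))
```

The checkpoint is only updated after `process` returns, so a batch that was interrupted partway is retried in full on the next run; that keeps the loop simple but assumes each batch is safe to re-run.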

Implementation Boundary

timmy-config: orchestration/tasks/prompts/scheduling
timmy-home: scripts/schemas/eval rubrics/spec
~/.timmy/twitter-archive/: private runtime artifacts only
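
One way to enforce the boundary above in code is a path guard that every writer calls before touching disk. The function names below are hypothetical; only the `~/.timmy/twitter-archive/` location comes from this issue. The sketch requires Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path

# Private workspace location named in this issue.
PRIVATE_ROOT = Path.home() / ".timmy" / "twitter-archive"


def is_private_path(path, root: Path = PRIVATE_ROOT) -> bool:
    """True if `path` resolves to a location inside `root`."""
    resolved = Path(path).expanduser().resolve()
    return resolved.is_relative_to(root.resolve())


def require_private(path, root: Path = PRIVATE_ROOT) -> Path:
    """Return the expanded path, or raise if it escapes the private workspace."""
    if not is_private_path(path, root):
        raise ValueError(f"refusing to write archive artifact outside {root}: {path}")
    return Path(path).expanduser()
```

Resolving before comparing means `..` segments and symlinks cannot be used to step outside the workspace. A guard like this does not by itself prove that nothing is tracked by git; it only constrains where artifacts are written.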

Review Focus

privacy boundary
evidence requirements
whether the auto-promotion thresholds are strict enough
whether the two-pass batch design is the right way to create training signal
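
For reviewers weighing the two-pass question, this is roughly how a draft plus critique/rewrite session could become a DPO pair. It assumes the rewrite is treated as the preferred answer and the draft as the rejected one, which this issue does not state explicitly. `BatchSession` and `build_dpo_pair` are hypothetical names, and the `prompt`/`chosen`/`rejected` keys follow a common DPO record layout rather than a confirmed project schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BatchSession:
    """Hypothetical record of one two-pass read over a tweet batch."""
    prompt: str   # what the reader was asked to do with the batch
    draft: str    # first-pass output
    rewrite: str  # second-pass output after critique


def build_dpo_pair(session: BatchSession) -> Optional[dict]:
    """Turn a session into a preference pair, or None if there is no signal."""
    # If the critique pass changed nothing, there is no preference to learn.
    if session.draft.strip() == session.rewrite.strip():
        return None
    return {
        "prompt": session.prompt,
        "chosen": session.rewrite,   # assumption: rewrite beats draft
        "rejected": session.draft,
    }
```

The weak point of this design is the assumption itself: if the critique pass sometimes makes the output worse, the pair teaches the wrong preference, so some quality check on the rewrite may be needed before pairs are emitted.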

Assumptions

~/.timmy/twitter-archive/ remains the canonical private workspace because it already exists and already contains extracted tweet artifacts
local-only means the private archive workspace lives under ~/.timmy, and its derived artifacts remain untracked and unpushed
v1 does not ingest likes, DMs, deleted tweets, group DMs, or Grok chat
v1 does not depend on a new external training harness; it builds on Hermes session capture and local offline eval
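
The v1 scope above, combined with the "deterministic archive extractor" deliverable, could look like the following. The record layout (an `id` string and a `kind` field) is invented for illustration and is not the real Twitter archive format or the merged extractor.

```python
import hashlib
import json

# v1 scope from this issue: tweets and retweets only.
ALLOWED_KINDS = {"tweet", "retweet"}


def extract(records: list[dict]) -> list[dict]:
    """Keep in-scope records and return them in a stable order by numeric id."""
    kept = [r for r in records if r.get("kind") in ALLOWED_KINDS]
    return sorted(kept, key=lambda r: int(r["id"]))


def manifest_digest(records: list[dict]) -> str:
    """SHA-256 over a canonical JSON encoding of the extracted records."""
    payload = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Sorting by id and hashing a canonical encoding means two runs over the same archive yield the same digest regardless of input order, which gives a cheap way to check that extraction really is deterministic.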

codex-agent referenced this issue from Timmy_Foundation/timmy-config

2026-03-27 22:12:29 +00:00

Orchestrate the private Twitter archive learning loop #29

codex-agent referenced this issue

2026-03-27 22:12:29 +00:00

Build private Twitter archive pipeline foundation #4

perplexity referenced this issue

2026-03-27 22:15:49 +00:00

Build private Twitter archive pipeline foundation #4

perplexity commented

2026-03-27 22:17:04 +00:00

Both PRs merged:

timmy-home PR #4: deterministic pipeline scripts, schemas, eval contracts, privacy boundary
timmy-config PR #29: two-pass orchestration, periodic tick, eval-gated training/promotion

All five review criteria pass: privacy boundary clean, repo boundaries respected, batch artifacts emit DPO pairs and profile data, training/promotion gated behind explicit eval checks, periodic tick is safe with per-step gating.

perplexity closed this issue

2026-03-27 22:17:05 +00:00

perplexity referenced this issue

2026-03-27 22:48:30 +00:00

[PIPELINE] Trajectory sanitization — strip sensitive metadata before DPO #5

perplexity referenced this issue

2026-03-27 22:48:31 +00:00

[PIPELINE] Trajectory sanitization — strip sensitive metadata before DPO #6

perplexity referenced this issue

2026-03-27 23:24:06 +00:00

Add Morrowind MCP config + context file — zero-code tuning #17

perplexity referenced this issue