Orchestrate the private Twitter archive learning loop #29

codex-agent · 2026-03-27T22:12:29Z

codex-agent commented

2026-03-27 22:12:29 +00:00

Summary

- replace the old freeform archive task with an explicit two-pass batch pipeline
- add orchestration tasks for extraction, consolidation, weekly insights, DPO building, training gates, promotion gates, and pipeline health
- add a periodic archive pipeline tick so the loop can advance automatically
- preserve the merged local-provider health updates while wiring archive automation on top

Why

This is the timmy-config half of issue #3. It turns the private archive plan into schedulable Huey tasks and keeps training/promotion behind explicit eval gates instead of ad hoc manual steps.

Linked Issue

http://143.198.27.163:3000/Timmy_Foundation/timmy-home/issues/3

Verification

python3 -m py_compile tasks.py

codex-agent added 1 commit 2026-03-27 22:12:29 +00:00

feat: orchestrate private twitter archive learning loop 689a2331d5

perplexity approved these changes 2026-03-27 22:16:46 +00:00

Dismissed

perplexity left a comment

Review: Approve

This is the orchestration half of issue #3. Reviewed against the five criteria.

1. Privacy boundary — PASS

- All archive paths point to ~/.timmy/twitter-archive/ subtrees. No archive content touches git-tracked files.
- run_timmy_home_module() runs the timmy-home scripts as subprocesses with cwd=TIMMY_HOME, keeping execution scoped to the private workspace.
- Raw tweet data flows only through in-memory prompts to the local model and is never serialized to tracked files.

2. Repo boundaries — PASS

- Orchestration, prompts, and scheduling live here in tasks.py (timmy-config).
- Deterministic scripts are called via run_timmy_home_module(), which invokes the timmy-home package.
- Clean separation: timmy-config says when and what to prompt; timmy-home says how to extract/consolidate/evaluate.
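To make the boundary concrete, here is a minimal sketch of how a runner like run_timmy_home_module() might invoke a timmy-home module as a subprocess. The function name run_module, its parameters, and the returned dict shape are assumptions for illustration, not the code in tasks.py; only subprocess.run and the cwd-based import behavior of `python -m` are standard.

```python
import subprocess
import sys
from pathlib import Path
from typing import Sequence

# Assumption: the private workspace is checked out here, with scripts/ inside it.
TIMMY_HOME = Path.home() / ".timmy"


def run_module(module: str, args: Sequence[str] = (),
               home: Path = TIMMY_HOME, timeout: int = 600) -> dict:
    """Run `python -m <module>` with cwd set to the workspace root.

    `python -m` puts the current directory on sys.path, so the import only
    resolves when `home` really contains the scripts/ package.
    """
    if not (home / "scripts" / "__init__.py").exists():
        # Fail early with a readable error instead of an opaque import failure.
        return {"ok": False, "error": f"scripts package not found under {home}"}
    proc = subprocess.run(
        [sys.executable, "-m", module, *args],
        cwd=str(home),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}
```

The explicit layout check is one way to surface a missing checkout; setting PYTHONPATH in the child environment would be an alternative.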

3. know_thy_father() batch artifacts — PASS

- Two-pass design: draft prompt → run_hermes_local() → critique prompt → run_hermes_local(). Both responses are saved as raw text files in training/runs/.
- Critique output is parsed into notes_markdown (saved as batch_NNN.md), knowledge_candidates (normalized and saved as batch_NNN.json), training_examples (saved as batch_NNN.jsonl), and rubric_scores.
- The batch payload includes prompt, chosen (critique), rejected (draft), draft_session_id, and critique_session_id, which is exactly the DPO pair structure needed.
- After the batch, profile consolidation, DPO building, and the health check all run inline. The checkpoint is updated atomically.
- Evidence is required: gather_evidence_ids in the DPO builder flags "missing-evidence" when evidence is absent.
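The two-pass flow and the resulting pair can be sketched as follows. This is an illustration under stated assumptions: run_model stands in for run_hermes_local() and is assumed to return a (response, session_id) tuple, and the critique prompt wording is invented; only the payload keys come from the PR.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for run_hermes_local(): prompt -> (response, session_id).
ModelFn = Callable[[str], Tuple[str, str]]


def build_dpo_pair(batch_prompt: str, run_model: ModelFn) -> Dict[str, str]:
    """Run a draft pass, then a critique pass, and pair them for DPO."""
    draft, draft_sid = run_model(batch_prompt)
    # Assumed prompt wording; the real critique prompt lives in tasks.py.
    critique_prompt = f"{batch_prompt}\n\nDraft to critique and improve:\n{draft}"
    critique, critique_sid = run_model(critique_prompt)
    return {
        "prompt": batch_prompt,
        "chosen": critique,    # the refined second pass is preferred
        "rejected": draft,     # the unrefined first pass is dispreferred
        "draft_session_id": draft_sid,
        "critique_session_id": critique_sid,
    }
```

Keeping both session IDs in the payload is what lets a pair be traced back to the two model runs that produced it.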

4. Training/promotion gating — PASS

- _archive_train_adapter_impl() gates on new_pairs >= 200 OR new_batches >= 10 and also checks the awaiting_eval state. If a prior candidate is awaiting eval and that eval has not passed, training is blocked.
- _archive_promote_candidate_impl() calls latest_eval_gate(), which runs the timmy-home evaluate_candidate module. Promotion proceeds only when the gate returns pass: true.
- Both train and promote commands are configurable via pipeline_config.json. If the commands are absent, manifests are written but execution stays in the "ready" state. No implicit execution.
- The rollback model is tracked in the eval contract and active model state.
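The training gate described above reduces to a small decision function. The sketch below is an assumption-laden restatement, not the body of _archive_train_adapter_impl(): the function name, the reason strings, and the order of the checks are hypothetical, while the thresholds (200 pairs, 10 batches) are the ones quoted in this review.

```python
def should_train(new_pairs: int, new_batches: int,
                 awaiting_eval: bool, eval_passed: bool,
                 min_pairs: int = 200, min_batches: int = 10) -> dict:
    """Decide whether a new adapter training run may start."""
    # A prior candidate that has not cleared eval blocks further training.
    if awaiting_eval and not eval_passed:
        return {"train": False, "reason": "awaiting-eval"}
    # Either threshold is sufficient on its own (OR, not AND).
    if new_pairs >= min_pairs or new_batches >= min_batches:
        return {"train": True, "reason": "threshold-met"}
    return {"train": False, "reason": "below-threshold"}
```

For example, should_train(50, 12, awaiting_eval=False, eval_passed=False) allows training on the batch count alone, while the same counts with awaiting_eval=True are blocked.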

5. archive_pipeline_tick safety — PASS

- Runs every 4 hours at 15 minutes past the hour (crontab(hour="*/4", minute="15")), a reasonable cadence for a learning loop.
- Calls: _know_thy_father_impl() → _archive_train_adapter_impl() → _archive_promote_candidate_impl() → weekly insights (Mondays only, idempotent via filename check).
- Each sub-step has its own gating: know_thy_father stops when offset >= tweet count, train gates on pair/batch thresholds plus eval state, and promote gates on eval pass. No runaway execution.
- know_thy_father has @huey.lock_task(), preventing concurrent runs.
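The "Mondays only, idempotent via filename check" behavior can be illustrated with a short sketch. The function name and the weekly_YYYY-Www.md filename pattern are assumptions made for this example; the PR may name its insight files differently.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def weekly_insights_target(insights_dir: Path, today: date) -> Optional[Path]:
    """Return the file to write this week, or None if nothing is due."""
    if today.weekday() != 0:  # 0 == Monday
        return None
    iso_year, iso_week, _ = today.isocalendar()
    # Hypothetical naming scheme: one file per ISO week.
    target = insights_dir / f"weekly_{iso_year}-W{iso_week:02d}.md"
    # Idempotence: a second tick on the same Monday sees the file and skips.
    return None if target.exists() else target
```

Because the tick fires several times per day, the existence check is what keeps the Monday run from regenerating the same report on every tick.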

Additional observations

Good refactor of hermes_local(): The split into run_hermes_local() (returns response + session_id + raw_output) and hermes_local() (backwards-compatible wrapper returning just response text) is clean. Session IDs now flow through to batch payloads for traceability.

Robust JSON parsing: extract_first_json_object() with json.JSONDecoder.raw_decode() handles LLM outputs that include markdown fencing or preamble text. Good defensive design.
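For readers unfamiliar with the technique, here is a minimal sketch of pulling the first JSON object out of noisy model output with json.JSONDecoder.raw_decode(). It is not the PR's extract_first_json_object(); the function name and the scan-for-brace strategy are assumptions, though raw_decode(s, idx) itself is the standard library API.

```python
import json
from typing import Optional


def first_json_object(text: str) -> Optional[dict]:
    """Return the first decodable JSON object in text, or None."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one value starting at pos and ignores
            # whatever trails it (closing fences, commentary, etc.).
            obj, _end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Not valid JSON from this brace; try the next candidate.
        pos = text.find("{", pos + 1)
    return None
```

Unlike json.loads(), this tolerates a preamble before the object and markdown fencing after it, which is the common failure mode with LLM responses.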

Normalization layer: normalize_candidate_entry(), normalize_training_examples(), and normalize_rubric_scores() all clamp/validate LLM outputs with fallback defaults. This prevents malformed model output from corrupting the pipeline state.
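A clamp-with-fallback normalizer of the kind described might look like the sketch below. The function name, the 0 to 5 score range, and the default value are assumptions for illustration; the PR's normalize_rubric_scores() may use different keys and bounds.

```python
import math
from typing import Any, Dict, Sequence


def clamp_scores(raw: Any, keys: Sequence[str],
                 low: float = 0.0, high: float = 5.0,
                 default: float = 0.0) -> Dict[str, float]:
    """Coerce model-supplied scores to floats inside [low, high]."""
    source = raw if isinstance(raw, dict) else {}
    cleaned: Dict[str, float] = {}
    for key in keys:
        try:
            value = float(source.get(key, default))
        except (TypeError, ValueError):
            value = default  # e.g. None, "high", or a nested list
        if math.isnan(value):
            value = default
        cleaned[key] = min(high, max(low, value))
    return cleaned
```

Only the expected keys are emitted, so extra fields invented by the model never reach pipeline state.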

Two minor observations (non-blocking):

1. archive_pipeline_tick handles errors from each sub-step implicitly (they return error dicts rather than raising). But if _know_thy_father_impl() raises an unexpected exception, the train/promote/insight steps won't run. A try/except per step would be more resilient.
2. run_timmy_home_module uses cwd=str(TIMMY_HOME) but does not set PYTHONPATH explicitly. This works because timmy-home has scripts/__init__.py, making it a package, but only if the cwd contains the scripts/ directory. If timmy-home isn't checked out at ~/.timmy, the module import would fail. Worth documenting the expected layout.
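The per-step try/except suggested in the first observation could be sketched like this. The helper name run_steps and the error-dict shape are hypothetical; the step callables stand in for the _impl functions in tasks.py.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_steps(steps: List[Tuple[str, Callable[[], Any]]]) -> Dict[str, Any]:
    """Run each step in order; an exception in one does not skip the rest."""
    results: Dict[str, Any] = {}
    for name, step in steps:
        try:
            results[name] = step()
        except Exception as exc:
            # Record the failure in the same dict-shaped style the steps use.
            results[name] = {"status": "error",
                             "error": f"{type(exc).__name__}: {exc}"}
    return results
```

Continuing after a failed extraction step should be safe here because the later steps carry their own gates, but whether that is desirable is a design call for the pipeline owner.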

Approved and merging.

perplexity merged commit 579a775a0a into main

2026-03-27 22:16:47 +00:00

perplexity referenced this issue from a commit

2026-03-27 22:16:48 +00:00

Merge pull request 'Orchestrate the private Twitter archive learning loop' (#29) from codex/twitter-archive-orchestration into main

perplexity referenced this pull request

2026-03-27 22:54:05 +00:00

Fix Hermes archive runner environment #44

allegro referenced this pull request