Orchestrate the private Twitter archive learning loop #29
Summary
Why
This is the timmy-config half of issue #3. It turns the private archive plan into schedulable Huey tasks and keeps training/promotion behind explicit eval gates instead of ad hoc manual steps.

Linked Issue
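The eval-gated promotion described above can be sketched as a plain predicate guarding the periodic task. Everything here is illustrative: `promotion_allowed`, `EVAL_THRESHOLD`, and the commented-out task wiring are assumptions, not the actual contents of tasks.py.

```python
EVAL_THRESHOLD = 0.90  # hypothetical gate value; the real threshold is not stated in this PR

def promotion_allowed(eval_scores, threshold=EVAL_THRESHOLD):
    """Promotion runs only when every rubric score clears the gate.

    An empty or missing score set fails closed, so a broken eval run
    can never promote a model by accident.
    """
    return bool(eval_scores) and all(s >= threshold for s in eval_scores.values())

# In tasks.py this predicate would guard a Huey periodic task, roughly:
#
# @huey.periodic_task(crontab(minute="*/15"))
# def archive_pipeline_tick():
#     if not promotion_allowed(load_latest_eval()):  # load_latest_eval is hypothetical
#         return  # skip training/promotion until evals pass
```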
Verification
python3 -m py_compile tasks.py

Review: Approve
This is the orchestration half of issue #3. Reviewed against the five criteria.
1. Privacy boundary — PASS
2. Repo boundaries — PASS
3. know_thy_father() batch artifacts — PASS
4. Training/promotion gating — PASS
5. archive_pipeline_tick safety — PASS
Additional observations
Good refactor of hermes_local(): The split into run_hermes_local() (returns response + session_id + raw_output) and hermes_local() (backwards-compatible wrapper returning just response text) is clean. Session IDs now flow through to batch payloads for traceability.
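The split can be illustrated with a minimal sketch of the wrapper pattern. The transport internals below are stubbed and entirely hypothetical; only the two function names and the response/session_id/raw_output shape come from the review.

```python
import json

def run_hermes_local(prompt):
    """Full-fidelity entry point: returns response text plus session metadata."""
    # Stubbed transport -- the real call to the local model is not shown in this PR.
    raw_output = '{"response": "ok", "session_id": "abc123"}'
    parsed = json.loads(raw_output)
    return {
        "response": parsed["response"],
        "session_id": parsed["session_id"],  # flows into batch payloads for traceability
        "raw_output": raw_output,
    }

def hermes_local(prompt):
    """Backwards-compatible wrapper: existing callers still get bare response text."""
    return run_hermes_local(prompt)["response"]
```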
Robust JSON parsing: extract_first_json_object() with json.JSONDecoder.raw_decode() handles LLM outputs that include markdown fencing or preamble text. Good defensive design.
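The technique is worth spelling out: `raw_decode()` can start parsing at an arbitrary offset, so scanning for each `{` and attempting a decode skips fences and preamble without any regex. This is a sketch of the approach, not the exact implementation in this PR.

```python
import json

def extract_first_json_object(text):
    """Return the first decodable JSON object embedded in text, or None.

    Tolerates markdown fencing and preamble by trying raw_decode at each
    '{' until one parse succeeds.
    """
    decoder = json.JSONDecoder()
    for idx, ch in enumerate(text):
        if ch == "{":
            try:
                obj, _end = decoder.raw_decode(text, idx)
                return obj
            except json.JSONDecodeError:
                continue  # stray brace; keep scanning
    return None
```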
Normalization layer: normalize_candidate_entry(), normalize_training_examples(), and normalize_rubric_scores() all clamp/validate LLM outputs with fallback defaults. This prevents malformed model output from corrupting the pipeline state.
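The clamp-with-fallback pattern can be sketched for one of the three normalizers. The rubric keys, score range, and default below are assumptions for illustration; only the function name `normalize_rubric_scores` comes from the review.

```python
def normalize_rubric_scores(raw, keys=("accuracy", "tone"), lo=0.0, hi=1.0, default=0.0):
    """Coerce LLM-emitted rubric scores into a known shape.

    Missing keys, non-dict input, and non-numeric values all fall back to
    a default, and every score is clamped into [lo, hi], so malformed model
    output cannot corrupt downstream pipeline state.
    """
    scores = {}
    for key in keys:
        value = raw.get(key, default) if isinstance(raw, dict) else default
        try:
            value = float(value)
        except (TypeError, ValueError):
            value = default  # non-numeric model output falls back, not crashes
        scores[key] = max(lo, min(hi, value))
    return scores
```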
Two minor observations (non-blocking):
Approved and merging.