|
|
729db767d1
|
Merge pull request 'feat(#687): training data quality filter — remove low-quality pairs' (#830) from feat/687-quality-filter into main
Smoke Test / smoke (push) Failing after 19s
Architecture Lint / Linter Tests (push) Successful in 25s
Validate Config / YAML Lint (push) Failing after 14s
Validate Config / JSON Validate (push) Successful in 15s
Validate Config / Python Syntax & Import Check (push) Failing after 41s
Validate Config / Python Test Suite (push) Has been skipped
Validate Config / Shell Script Lint (push) Failing after 46s
Validate Config / Cron Syntax Check (push) Successful in 12s
Validate Config / Deploy Script Dry Run (push) Successful in 10s
Validate Config / Playbook Schema Validation (push) Successful in 20s
Architecture Lint / Lint Repository (push) Failing after 14s
|
2026-04-20 23:40:40 +00:00 |
|
|
|
d4dedd2c3d
|
Merge pull request 'feat: backfill provenance on all training data (#752)' (#826) from fix/752-provenance-v2 into main
Smoke Test / smoke (push) Has been cancelled
Architecture Lint / Lint Repository (push) Has been cancelled
Architecture Lint / Linter Tests (push) Has been cancelled
Validate Config / YAML Lint (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
|
2026-04-20 23:40:37 +00:00 |
|
|
|
a0266c83a4
|
fix(#687): Add quality filter tests
Smoke Test / smoke (pull_request) Failing after 15s
Architecture Lint / Linter Tests (pull_request) Successful in 20s
Validate Config / YAML Lint (pull_request) Failing after 13s
Validate Config / JSON Validate (pull_request) Successful in 15s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 36s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Cron Syntax Check (pull_request) Successful in 10s
Validate Config / Shell Script Lint (pull_request) Failing after 47s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 9s
Validate Config / Playbook Schema Validation (pull_request) Successful in 20s
Architecture Lint / Lint Repository (pull_request) Failing after 17s
PR Checklist / pr-checklist (pull_request) Successful in 3m48s
|
2026-04-20 23:16:13 +00:00 |
|
|
|
b28071bb71
|
fix(#687): Training data quality filter
- Score pairs on specificity, length ratio, code correctness
- Composite weighted score (0.5 × specificity + 0.2 × length ratio + 0.3 × code correctness)
- Configurable threshold filtering
- Report mode with score distribution
- Supports prompt/response, input/output, question/answer formats
- CLI: python3 quality_filter.py input.jsonl -o output.jsonl --report
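The weighted scoring and threshold filtering described above can be sketched roughly as follows. Only the three weights and the three field-name formats come from the commit message; the function names, the dictionary keys, and the default threshold of 0.5 are assumptions made for illustration and may not match quality_filter.py.

```python
from typing import Dict, Optional, Tuple

# Weights come from the commit message; the key names are illustrative.
WEIGHTS = {"specificity": 0.5, "length_ratio": 0.2, "code_correctness": 0.3}

# Field-name pairs for the three supported record formats.
FIELD_PAIRS = [("prompt", "response"), ("input", "output"), ("question", "answer")]


def extract_pair(record: dict) -> Optional[Tuple[str, str]]:
    """Return (request, reply) from the first format whose keys are both present."""
    for request_key, reply_key in FIELD_PAIRS:
        if request_key in record and reply_key in record:
            return record[request_key], record[reply_key]
    return None


def composite_score(scores: Dict[str, float]) -> float:
    """Weighted sum of per-dimension scores, each assumed to lie in [0, 1]."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


def keep_pair(scores: Dict[str, float], threshold: float = 0.5) -> bool:
    """Keep a pair when its composite score reaches the (hypothetical) threshold."""
    return composite_score(scores) >= threshold


if __name__ == "__main__":
    example = {"specificity": 0.8, "length_ratio": 0.5, "code_correctness": 1.0}
    print(round(composite_score(example), 2), keep_pair(example))
```

Because the weights sum to 1.0, a pair scoring 1.0 on every dimension gets a composite of 1.0, so the threshold reads on the same 0 to 1 scale as the individual scores.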
|
2026-04-20 23:15:48 +00:00 |
|
Alexander Whitestone
|
8e791afecc
|
feat: backfill provenance on all training data (#752)
Architecture Lint / Linter Tests (pull_request) Successful in 21s
Smoke Test / smoke (pull_request) Failing after 22s
Validate Config / YAML Lint (pull_request) Failing after 16s
Validate Config / JSON Validate (pull_request) Successful in 14s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 33s
Validate Config / Cron Syntax Check (pull_request) Successful in 12s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 12s
Validate Config / Shell Script Lint (pull_request) Failing after 54s
Validate Config / Playbook Schema Validation (pull_request) Successful in 17s
PR Checklist / pr-checklist (pull_request) Successful in 2m25s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
scripts/backfill_training_provenance.py:
- Backfills provenance metadata on all JSONL training files
- Adds source_session_id, model, timestamp, source_type
- Supports --dry-run mode and --json output; handles parse errors
Result: 11,007 pairs across 45 files now have provenance
Coverage: 0% -> 100%
Validation: python3 scripts/provenance_validate.py --threshold 50
PASS: 3800/3800 pairs have provenance
Dashboard: python3 scripts/provenance_dashboard.py
Shows pair count by model, source, coverage
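A minimal sketch of the backfill and coverage idea, assuming the four provenance fields are stored as top-level keys on each JSONL record (the commit does not say where they live). The helper names and the way defaults are supplied are hypothetical and are not taken from scripts/backfill_training_provenance.py.

```python
import json
from typing import Dict, Iterable, List, Tuple

# Field names as listed in the commit message.
PROVENANCE_FIELDS = ("source_session_id", "model", "timestamp", "source_type")


def has_provenance(record: dict) -> bool:
    """True when every provenance field is present and non-empty."""
    return all(record.get(field) for field in PROVENANCE_FIELDS)


def backfill_record(record: dict, defaults: Dict[str, str]) -> dict:
    """Return a copy of the record with missing provenance fields filled in."""
    updated = dict(record)
    for field in PROVENANCE_FIELDS:
        if not updated.get(field):
            updated[field] = defaults[field]
    return updated


def backfill_jsonl(lines: Iterable[str], defaults: Dict[str, str]) -> Tuple[List[dict], int]:
    """Backfill every parsable JSON object line; count the lines that fail to parse."""
    records, parse_errors = [], 0
    for line in lines:
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            parse_errors += 1
            continue
        if not isinstance(record, dict):
            parse_errors += 1
            continue
        records.append(backfill_record(record, defaults))
    return records, parse_errors


def coverage(records: List[dict]) -> float:
    """Fraction of records that carry complete provenance."""
    if not records:
        return 0.0
    return sum(has_provenance(r) for r in records) / len(records)
```

Existing values are never overwritten in this sketch, so re-running it on already backfilled data leaves the records unchanged.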
|
2026-04-18 15:59:17 -04:00 |
|
Alexander Whitestone
|
edd35eaa4b
|
fix: restore pytest collection — fix 7 syntax/import errors (#823)
Architecture Lint / Linter Tests (pull_request) Successful in 12s
Smoke Test / smoke (pull_request) Failing after 19s
Validate Config / YAML Lint (pull_request) Failing after 14s
Validate Config / JSON Validate (pull_request) Successful in 13s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 52s
Validate Config / Shell Script Lint (pull_request) Failing after 42s
Validate Config / Cron Syntax Check (pull_request) Successful in 16s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 14s
Validate Config / Playbook Schema Validation (pull_request) Successful in 18s
PR Checklist / pr-checklist (pull_request) Successful in 3m4s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Fixed collection errors:
- scripts/adversary_schema.py: unterminated regex string (line 141)
- scripts/config_validate.py: unmatched ')' (line 87)
- scripts/pr_triage.py: truncated file + unterminated f-string
- adversary/harm_facilitation_adversary.py: 4 broken f-strings
- bin/glitch_patterns.py: missing get_threejs_patterns() export
- tests/test_glitch_detector.py: fixed THREEJS_CATEGORIES import
- tests/test_pr_triage.py: fixed function name imports
- training/training_pair_provenance.py: added ProvenanceTracker class
- scripts/validate_scene_data.py: symlink for import compatibility
Result: python3 -m pytest --collect-only
911 tests collected, 0 collection errors
(was: 769 collected / 7 errors)
|
2026-04-18 15:37:33 -04:00 |
|
|
|
7c03c666d8
|
Merge pull request 'feat: 500 dream description prompt enhancement pairs — scene/crisis/music data' (#821,#820,#819,#799) from fix/602 into main
Resolves add/add conflicts with already-merged files (authority_bypass_200.jsonl, identity_attacks_200.jsonl, quality_filter.py) by keeping main's versions.
Closes #602, #645, #689, #599
|
2026-04-17 02:37:00 -04:00 |
|
|
|
2c49cac144
|
Merge pull request 'fix(#662): cron fleet audit — crontab parsing, tests, CI validation' (#814) from burn/662-cron-audit-fix into main
|
2026-04-17 02:32:44 -04:00 |
|
|
|
06bebc0ca3
|
Merge pull request 'feat: adversary execution harness for prompt corpora' (#811) from fix/652-adversary-harness into main
|
2026-04-17 02:32:33 -04:00 |
|
|
|
b2246e0dcc
|
Merge pull request 'feat: PR backlog triage script — categorize, find duplicates, detect stale refs' (#810) from burn/658-pr-backlog-triage into main
|
2026-04-17 02:32:30 -04:00 |
|
|
|
39d1e1d7ce
|
Merge pull request 'fix: pipeline_state.json daily reset' (#805) from fix/650-pipeline-daily-reset-v2 into main
|
2026-04-17 02:32:18 -04:00 |
|
|
|
f57c21fda9
|
Merge pull request 'fix: training data code block indentation — normalize open_tag whitespace' (#809) from fix/750-code-block-indentation into main
|
2026-04-17 02:32:14 -04:00 |
|
|
|
65a400f3ed
|
Merge pull request 'feat: shared adversary scoring rubric and transcript schema (closes #655)' (#802) from feat/655-adversary-scoring-rubric into main
|
2026-04-17 06:19:01 +00:00 |
|
Alexander Whitestone
|
d278d7f5d5
|
fix(#662): cron fleet audit — crontab parsing, tests, CI validation
Architecture Lint / Linter Tests (pull_request) Successful in 24s
Smoke Test / smoke (pull_request) Failing after 14s
Validate Config / YAML Lint (pull_request) Failing after 14s
Validate Config / JSON Validate (pull_request) Successful in 16s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 46s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 7s
Validate Config / Shell Script Lint (pull_request) Failing after 44s
Validate Config / Playbook Schema Validation (pull_request) Successful in 22s
PR Checklist / pr-checklist (pull_request) Failing after 3m55s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
- Added VPS crontab backup parsing to cron-audit-662.py
- New audit_fleet() combines hermes cron + VPS crontabs
- load_crontab_backups() reads cron/vps/*-crontab-backup.txt
- 20+ tests: crontab parsing, job categorization, fleet audit,
timestamp parsing, backup loading
- ci-cron-validate.py: CI gate that fails on systemic failures
- Fresh audit report generated in cron/audit-report.json
Closes #662
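For illustration, crontab backup parsing of the kind described above might look like the sketch below. It assumes the backup files use the standard user-crontab layout (five schedule fields followed by the command, or an @-shortcut such as @daily). The CronJob type and the function names are hypothetical and are not the ones in cron-audit-662.py.

```python
import re
from typing import List, NamedTuple, Optional

# Lines such as PATH=/usr/bin or MAILTO = "ops" set the environment, not a job.
ENV_ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*\s*=")


class CronJob(NamedTuple):
    schedule: str
    command: str


def parse_crontab_line(line: str) -> Optional[CronJob]:
    """Parse one crontab line; return None for comments, blanks, env lines, or malformed entries."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or ENV_ASSIGNMENT.match(stripped):
        return None
    if stripped.startswith("@"):
        # Shortcut form: "@daily /path/to/command"
        parts = stripped.split(None, 1)
        return CronJob(parts[0], parts[1]) if len(parts) == 2 else None
    # Standard form: minute hour day-of-month month day-of-week command
    parts = stripped.split(None, 5)
    if len(parts) < 6:
        return None
    return CronJob(" ".join(parts[:5]), parts[5])


def parse_crontab(text: str) -> List[CronJob]:
    """Collect every job found in the text of a crontab backup."""
    jobs = []
    for line in text.splitlines():
        job = parse_crontab_line(line)
        if job is not None:
            jobs.append(job)
    return jobs
```

System crontabs under /etc/cron.d carry an extra user column before the command; this sketch does not handle that layout.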
|
2026-04-17 01:34:45 -04:00 |
|
|
|
c633afd66d
|
fix: add underscore module version for test imports (#750)
|
2026-04-17 05:33:26 +00:00 |
|
|
|
c69ae0e72b
|
fix: normalize open_tag whitespace in code block parser (#750)
|
2026-04-17 05:33:24 +00:00 |
|
|
|
f094b0d5b5
|
feat: Add PR backlog triage script — categorize, duplicates, stale detection (#658)
|
2026-04-17 05:32:19 +00:00 |
|
|
|
42ff05aeec
|
feat: adversary execution harness for prompt corpora (#652)
Reusable harness for replaying JSONL corpora against live agents.
Supports Ollama, hermes, and mock backends.
Captures transcripts, scores responses, auto-files P0 issues.
Closes #652
|
2026-04-17 05:31:27 +00:00 |
|
|
|
acba760731
|
fix: reset_stale_states delegates to standalone script (closes #650)
Validate Config / Playbook Schema Validation (pull_request) Successful in 14s
Architecture Lint / Linter Tests (pull_request) Successful in 26s
PR Checklist / pr-checklist (pull_request) Failing after 25m6s
Smoke Test / smoke (pull_request) Failing after 12s
Validate Config / YAML Lint (pull_request) Failing after 8s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 35s
Validate Config / JSON Validate (pull_request) Successful in 13s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 6s
Validate Config / Shell Script Lint (pull_request) Failing after 34s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
|
2026-04-17 05:26:06 +00:00 |
|
|
|
34ade6fc0e
|
fix: pipeline state daily reset (closes #650)
|
2026-04-17 05:24:14 +00:00 |
|
|
|
c5270d76e0
|
fix: pipeline state daily reset (closes #650)
|
2026-04-17 05:24:12 +00:00 |
|
|
|
38a4a73a67
|
feat: shared adversary scoring rubric and transcript schema (#655)
|
2026-04-17 05:17:29 +00:00 |
|
|
|
6b984532a1
|
feat: config validation script
Closes #690
Validates YAML syntax, required keys, value types, and
forbidden keys before deploy. Prevents broken deploys
from bad config.
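The required-key, value-type, and forbidden-key checks could be sketched as below. The sketch works on an already parsed mapping, so the YAML syntax check itself (which needs a YAML parser) is out of scope. The function name, the schema shape, and the example keys are assumptions for illustration, not the script's actual interface.

```python
from typing import Any, Dict, Iterable, List


def validate_config(
    config: Dict[str, Any],
    required: Dict[str, type],
    forbidden: Iterable[str],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the config passes."""
    errors = []
    for key, expected_type in required.items():
        if key not in config:
            errors.append(f"missing required key: {key}")
        elif not isinstance(config[key], expected_type):
            actual = type(config[key]).__name__
            errors.append(f"wrong type for {key}: expected {expected_type.__name__}, got {actual}")
    for key in forbidden:
        if key in config:
            errors.append(f"forbidden key present: {key}")
    return errors


if __name__ == "__main__":
    # Hypothetical schema and config, used only to show the call shape.
    schema = {"model": str, "max_tokens": int}
    problems = validate_config({"model": "x", "max_tokens": "many", "debug_key": 1}, schema, ["debug_key"])
    for problem in problems:
        print(problem)
```

Collecting every problem instead of stopping at the first one lets a pre-deploy gate report all issues in a single run.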
|
2026-04-17 05:07:44 +00:00 |
|
Alexander Whitestone
|
f169634a75
|
feat: config drift detection across all fleet nodes (#686)
PR Checklist / pr-checklist (pull_request) Has been cancelled
Architecture Lint / Linter Tests (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Smoke Test / smoke (pull_request) Has been cancelled
Validate Config / YAML Lint (pull_request) Has been cancelled
Validate Config / JSON Validate (pull_request) Has been cancelled
Validate Config / Python Syntax & Import Check (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Validate Config / Shell Script Lint (pull_request) Has been cancelled
Validate Config / Cron Syntax Check (pull_request) Has been cancelled
Validate Config / Deploy Script Dry Run (pull_request) Has been cancelled
Validate Config / Playbook Schema Validation (pull_request) Has been cancelled
Validate Training Data / validate (pull_request) Has been cancelled
Detect config drift between fleet nodes and canonical timmy-config.
scripts/config_drift_detector.py (200 lines):
- SSH-based config collection from all nodes
- Recursive diff against canonical config
- Report: which keys differ, on which nodes
- JSON output for programmatic consumption
Fleet nodes: local, ezra (143.198.27.163), bezalel (167.99.126.228)
Usage:
python3 scripts/config_drift_detector.py --report
python3 scripts/config_drift_detector.py --json
Closes #686
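The recursive diff step can be illustrated with the sketch below, which compares two already loaded config mappings and reports differing keys by dotted path. The SSH collection step is omitted. The function name, the "<missing>" marker, and the dotted-path format are assumptions and may differ from scripts/config_drift_detector.py.

```python
from typing import Any, Dict, Tuple

# Marker for a key that exists on only one side (hypothetical convention).
MISSING = "<missing>"


def diff_config(canonical: Dict[str, Any], actual: Dict[str, Any], prefix: str = "") -> Dict[str, Tuple[Any, Any]]:
    """Map each differing dotted key path to (canonical_value, node_value).

    Assumes string keys; nested dicts are compared recursively, all other
    values (including lists) are compared with ==.
    """
    diffs: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(canonical) | set(actual)):
        path = f"{prefix}.{key}" if prefix else str(key)
        expected = canonical.get(key, MISSING)
        found = actual.get(key, MISSING)
        if isinstance(expected, dict) and isinstance(found, dict):
            diffs.update(diff_config(expected, found, path))
        elif expected != found:
            diffs[path] = (expected, found)
    return diffs


def fleet_report(canonical: Dict[str, Any], nodes: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Tuple[Any, Any]]]:
    """Per-node drift: which keys differ, on which nodes. Nodes without drift are omitted."""
    report = {}
    for node_name, node_config in nodes.items():
        node_diff = diff_config(canonical, node_config)
        if node_diff:
            report[node_name] = node_diff
    return report
```

The result is a plain dict of tuples, so it can be passed through json.dumps (tuples become arrays) for a JSON output mode.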
|
2026-04-16 01:33:57 -04:00 |
|
Merge Bot
|
11e476e79e
|
Merge PR #633: scripts/token-tracker.py
|
2026-04-16 05:11:23 +00:00 |
|
Merge Bot
|
5ac19b27ee
|
Merge PR #665: scripts/pr_triage.py
|
2026-04-16 05:10:46 +00:00 |
|
Merge Bot
|
7c16ddb741
|
Merge PR #712: scripts/nightly-pipeline-scheduler.sh (changed)
|
2026-04-16 05:09:54 +00:00 |
|
Merge Bot
|
4642c8b3b1
|
Merge PR #656: scripts/generate-crisis-direct-suicidal-pairs.py (added)
|
2026-04-16 05:06:47 +00:00 |
|
Merge Bot
|
7ee587b9f4
|
Merge PR #667: scripts/validate-scene-data.py (added)
|
2026-04-16 05:06:10 +00:00 |
|
Merge Bot
|
720516d452
|
Merge PR #671: scripts/cron-audit-662.py (added)
|
2026-04-16 05:05:56 +00:00 |
|
Merge Bot
|
8bc6e4e5f0
|
Merge PR #679: scripts/pr_triage.py (added)
|
2026-04-16 05:05:44 +00:00 |
|
Merge Bot
|
17adc703f8
|
Merge PR #729: scripts/generate_scene_descriptions.py (added)
|
2026-04-16 05:03:55 +00:00 |
|
Merge Bot
|
4b891f8f46
|
Merge PR #738: scripts/config_template.py (added)
|
2026-04-16 05:03:30 +00:00 |
|
Merge Bot
|
1a362637c9
|
Merge PR #763: scripts/pr-backlog-triage.py (added)
|
2026-04-16 04:59:59 +00:00 |
|
Merge Bot
|
6b7d219a29
|
Merge PR #768: scripts/token_budget.py (added)
|
2026-04-16 04:59:16 +00:00 |
|
Merge Bot
|
318eaefb81
|
Merge PR #771: scripts/quality_gate_integration.py (added)
|
2026-04-16 04:59:01 +00:00 |
|
Merge Bot
|
d76182c654
|
Merge PR #772: scripts/cron_audit.py (added)
|
2026-04-16 04:58:59 +00:00 |
|
Merge Bot
|
8c5b82e214
|
Merge PR #773: scripts/hash_dedup.py (added)
|
2026-04-16 04:58:55 +00:00 |
|
Merge Bot
|
297363a141
|
Merge PR #775: scripts/pr-triage-automation.py (added)
|
2026-04-16 04:58:49 +00:00 |
|
Merge Bot
|
7f121d5591
|
Merge PR #776: scripts/config_drift.py (added)
|
2026-04-16 04:58:44 +00:00 |
|
Merge Bot
|
218b6dcb33
|
Merge PR #777: scripts/token_tracker.py (added)
|
2026-04-16 04:58:40 +00:00 |
|
Merge Bot
|
636e32e467
|
Merge PR #783: scripts/normalize-code-blocks.py (added)
|
2026-04-16 04:58:23 +00:00 |
|
|
|
d120526244
|
fix: add python3 shebang to scripts/visual_pr_reviewer.py (#681)
|
2026-04-15 02:57:53 +00:00 |
|
|
|
8596ff761b
|
fix: add python3 shebang to scripts/diagram_meaning_extractor.py (#681)
|
2026-04-15 02:57:40 +00:00 |
|
|
|
7553fd4f3e
|
fix: add python3 shebang to scripts/captcha_bypass_handler.py (#681)
|
2026-04-15 02:57:25 +00:00 |
|
|
|
ad751a6de6
|
docs: add pipeline scheduler README
|
2026-04-14 22:47:12 +00:00 |
|
|
|
82f9810081
|
feat: add nightly-pipeline-scheduler.sh
|
2026-04-14 22:46:38 +00:00 |
|
|
|
04cceccd01
|
feat: add rock scene generator (#607)
Architecture Lint / Lint Repository (push) Has been cancelled
Architecture Lint / Linter Tests (push) Has been cancelled
Smoke Test / smoke (push) Has been cancelled
Validate Config / YAML Lint (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
|
2026-04-14 22:35:43 +00:00 |
|
|
|
04dbf772b1
|
feat: Visual Smoke Test for The Nexus #490 (#558)
Architecture Lint / Linter Tests (push) Successful in 15s
Smoke Test / smoke (push) Failing after 14s
Validate Config / YAML Lint (push) Failing after 13s
Validate Config / JSON Validate (push) Successful in 13s
Validate Config / Shell Script Lint (push) Failing after 40s
Validate Config / Python Syntax & Import Check (push) Failing after 58s
Validate Config / Python Test Suite (push) Has been skipped
Validate Config / Cron Syntax Check (push) Successful in 11s
Validate Config / Deploy Script Dry Run (push) Successful in 11s
Validate Config / Playbook Schema Validation (push) Successful in 18s
Architecture Lint / Lint Repository (push) Failing after 13s
Architecture Lint / Linter Tests (pull_request) Successful in 26s
Smoke Test / smoke (pull_request) Failing after 17s
Validate Config / YAML Lint (pull_request) Failing after 12s
Validate Config / JSON Validate (pull_request) Successful in 12s
PR Checklist / pr-checklist (pull_request) Failing after 3m36s
Validate Config / Shell Script Lint (pull_request) Failing after 40s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m4s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Cron Syntax Check (pull_request) Successful in 9s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 9s
Validate Config / Playbook Schema Validation (pull_request) Successful in 20s
Architecture Lint / Lint Repository (pull_request) Failing after 16s
Merge PR #558
|
2026-04-14 22:18:23 +00:00 |
|
|
|
9651a56308
|
feat: Glitch Detector HTML Report with annotated screenshots #544 (#567)
Architecture Lint / Lint Repository (push) Has been cancelled
Architecture Lint / Linter Tests (push) Has been cancelled
Smoke Test / smoke (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
Validate Config / YAML Lint (push) Has been cancelled
Merge PR #567
|
2026-04-14 22:17:32 +00:00 |
|