timmy-config

Author	SHA1	Message	Date
Alexander Whitestone	b3390d4fee	feat: adversary execution harness for prompt corpora (#652)	2026-04-21 11:22:24 +00:00
Claude (Opus 4.6)	4a7fa94731	Merge pull request 'feat: PR triage automation — categorize, auto-merge safe PRs, file reports (#659)' (#836) from burn/659-1776769427 into main	2026-04-21 11:21:18 +00:00
Claude (Opus 4.6)	485783a317	Merge pull request 'feat(#691): training pair provenance tracking — source session + model' (#835) from burn/691-1776769427 into main	2026-04-21 11:21:13 +00:00
Alexander Whitestone	eacc670681	test: validate all 9 genre scene files have 100 valid entries (#645)	2026-04-21 11:20:25 +00:00
Alexander Whitestone	3dc1046cf8	fix: add missing artist + timestamp fields to country scene descriptions (#645)	2026-04-21 11:17:36 +00:00
Alexander Whitestone	fe864962ec	test: Enhance PR triage tests (#659)	2026-04-21 11:17:00 +00:00
Alexander Whitestone	ced6d20fde	fix: add 'unknown' to VALID_SOURCES for process_pair fallback	2026-04-21 07:16:50 -04:00
Alexander Whitestone	5ee2190aaa	feat: Enhance PR triage with auto-merge, file-as-issue, org-wide mode (#659)	2026-04-21 11:16:05 +00:00
Alexander Whitestone	7cfc84637a	feat: Add pr-triage.sh wrapper (#659)	2026-04-21 11:14:31 +00:00
Alexander Whitestone	83457cc9a9	feat(#691): training pair provenance tracking — source session + model. ProvenanceTracker: added add_provenance(), extract_provenance_from_existing(), filter_by_provenance(), generate_report() methods. Fixed save_jsonl() to accept both (path, entries) and (entries, path) argument orders for backward compatibility. build_curated.py: every exemplar now gets provenance metadata (source=curated, source_session_id, model=timmy-curated, timestamp). Provenance coverage reported in build output. Acceptance criteria: [x] Add metadata to each pair: source_session_id, model, timestamp; [x] Filter pairs by provenance (exclude_models, exclude_sources); [x] Report: pair count by source model. Closes #691	2026-04-21 07:14:27 -04:00
Claude (Opus 4.6)	d1486b52e8	Merge pull request 'feat: stale hermes process cleanup script (#829)' (#834) from fix/829-stale-process-cleanup into main	2026-04-21 01:39:54 +00:00
Alexander Whitestone	19db78bbf0	feat: stale hermes process cleanup script (#829). bin/hermes_cleanup.py: Identifies stale hermes sessions (old + idle); Groups by session, tracks parent+children; Memory waste calculation (RSS in MB/GB); --kill to terminate, --dry-run (default) to report; --max-age (default 24h), --max-cpu (default 0.5%); --json output, human-readable table. tests/test_hermes_cleanup.py: 8 tests (process age, child PIDs, kill session, dry run, report generation). Usage: python3 bin/hermes_cleanup.py # report; python3 bin/hermes_cleanup.py --kill # terminate; python3 bin/hermes_cleanup.py --max-age 12 # 12h threshold; python3 bin/hermes_cleanup.py --json # JSON	2026-04-20 20:38:20 -04:00
Claude (Opus 4.6)	b3eba66a07	Merge pull request 'fix: add python3 shebang to wakeup.py and .DS_Store to gitignore (closes #681)' (#832) from fix/681-clean into main	2026-04-21 00:10:14 +00:00
Claude	61bb221ff2	fix: add python3 shebang to wakeup.py and .DS_Store to .gitignore (closes #681)	2026-04-20 19:59:06 -04:00
Claude (Opus 4.6)	729db767d1	Merge pull request 'feat(#687): training data quality filter — remove low-quality pairs' (#830) from feat/687-quality-filter into main	2026-04-20 23:40:40 +00:00
Claude (Opus 4.6)	d4dedd2c3d	Merge pull request 'feat: backfill provenance on all training data (#752)' (#826) from fix/752-provenance-v2 into main	2026-04-20 23:40:37 +00:00
Claude (Opus 4.6)	0e2e2c1552	Merge pull request 'feat: code block normalization tests (closes #750)' (#825) from fix/750-code-blocks into main	2026-04-20 23:40:35 +00:00
Claude (Opus 4.6)	bee4d02dd5	Merge pull request 'fix: restore pytest collection — fix 7 syntax/import errors (#823)' (#824) from fix/823-pytest-collection into main	2026-04-20 23:40:23 +00:00
Alexander Whitestone	a0266c83a4	fix(#687): Add quality filter tests	2026-04-20 23:16:13 +00:00
Alexander Whitestone	b28071bb71	fix(#687): Training data quality filter - Score pairs on specificity, length ratio, code correctness - Composite weighted score (0.5 spec + 0.2 length + 0.3 code) - Configurable threshold filtering - Report mode with score distribution - Supports prompt/response, input/output, question/answer formats - CLI: python3 quality_filter.py input.jsonl -o output.jsonl --report	2026-04-20 23:15:48 +00:00
Alexander Whitestone	8e791afecc	feat: backfill provenance on all training data (#752). scripts/backfill_training_provenance.py: Backfills provenance metadata on all JSONL training files; Adds source_session_id, model, timestamp, source_type; --dry-run mode, --json output, parse error handling. Result: 11,007 pairs across 45 files now have provenance; Coverage: 0% -> 100%. Validation: python3 scripts/provenance_validate.py --threshold 50; PASS: 3800/3800 pairs have provenance. Dashboard: python3 scripts/provenance_dashboard.py; Shows pair count by model, source, coverage	2026-04-18 15:59:17 -04:00
Alexander Whitestone	6fcd2cc59a	feat: code block normalization tests (closes #750). tests/test_normalize_code_blocks.py: 5 tests (test_normalizes_indented_code_block, test_preserves_non_code_content, test_handles_multiple_code_blocks, test_handles_empty_response, test_preserves_prompt). Existing normalize-code-blocks.py handles code block indentation.	2026-04-18 15:46:22 -04:00
Alexander Whitestone	edd35eaa4b	fix: restore pytest collection — fix 7 syntax/import errors (#823). Fixed collection errors: scripts/adversary_schema.py: unterminated regex string (line 141); scripts/config_validate.py: unmatched ')' (line 87); scripts/pr_triage.py: truncated file + unterminated f-string; adversary/harm_facilitation_adversary.py: 4 broken f-strings; bin/glitch_patterns.py: missing get_threejs_patterns() export; tests/test_glitch_detector.py: fixed THREEJS_CATEGORIES import; tests/test_pr_triage.py: fixed function name imports; training/training_pair_provenance.py: added ProvenanceTracker class; scripts/validate_scene_data.py: symlink for import compatibility. Result: python3 -m pytest --collect-only: 911 tests collected, 0 collection errors (was: 769 collected / 7 errors)	2026-04-18 15:37:33 -04:00
Claude (Opus 4.6)	04ecad3b43	Merge pull request 'fix: use PYTHON variable in training Makefile (closes #660)' (#822) from fix/660-python-makefile into main. fix: use PYTHON variable in training Makefile (closes #660). Refs Timmy_Foundation/the-nexus#1471	2026-04-17 06:44:30 +00:00
Claude (Opus 4.6)	099948b3d1	fix: use PYTHON variable in training Makefile. On macOS where only python3 is installed (no python shim), bare `python` calls fail with 'No such file or directory'. Adds `PYTHON ?= python3` variable. Replaces all bare `python` calls with `$(PYTHON)` across: train-local, vibes, adversary-value-violations, ingest, curated, convert. Override: make vibes PYTHON=python; Closes #660; Refs Timmy_Foundation/the-nexus#1471	2026-04-17 06:39:05 +00:00
Claude (Opus 4.6)	ef58883a26	fix: use PYTHON variable in training Makefile for portability (closes #660). Added PYTHON ?= python3 variable and replaced all bare python calls. Fixes macOS where only python3 is installed. Refs #660	2026-04-17 02:37:47 -04:00
Claude (Opus 4.6)	2a11233952	Merge pull request 'feat: quality gate pipeline validation' (#818) from fix/623 into main. Resolves add/add conflict in pipeline/quality_gate.py by keeping more complete 619-line main version. Closes #623	2026-04-17 02:37:16 -04:00
Claude (Opus 4.6)	cc9ff4cf5d	Merge remote-tracking branch 'origin/fix/752'	2026-04-17 02:37:04 -04:00
Claude (Opus 4.6)	7c03c666d8	Merge pull request 'feat: 500 dream description prompt enhancement pairs — scene/crisis/music data' (#821,#820,#819,#799) from fix/602 into main. Resolves add/add conflicts with already-merged files (authority_bypass_200.jsonl, identity_attacks_200.jsonl, quality_filter.py) by keeping main's versions. Closes #602, #645, #689, #599	2026-04-17 02:37:00 -04:00
Claude (Opus 4.6)	0fc149b10c	Merge pull request 'feat: quality filter tests — score specificity, length ratio, code' (#817) from fix/687-quality-filter into main	2026-04-17 02:32:51 -04:00
Claude (Opus 4.6)	ed5e52e0d9	Merge pull request 'feat: harm facilitation adversary — 200 jailbreak prompts' (#816) from ward/618-harm-facilitation into main	2026-04-17 02:32:48 -04:00
Claude (Opus 4.6)	2c49cac144	Merge pull request 'fix(#662): cron fleet audit — crontab parsing, tests, CI validation' (#814) from burn/662-cron-audit-fix into main	2026-04-17 02:32:44 -04:00
Claude (Opus 4.6)	1183fb5f2b	Merge pull request 'feat: scene data validator tests + CI path fix' (#813) from feat/647-scene-data-validator into main	2026-04-17 02:32:40 -04:00
Claude (Opus 4.6)	7ce0016207	Merge pull request 'test: verify training example metadata preservation' (#812) from fix/646-metadata-preservation into main	2026-04-17 02:32:37 -04:00
Claude (Opus 4.6)	06bebc0ca3	Merge pull request 'feat: adversary execution harness for prompt corpora' (#811) from fix/652-adversary-harness into main	2026-04-17 02:32:33 -04:00
Claude (Opus 4.6)	b2246e0dcc	Merge pull request 'feat: PR backlog triage script — categorize, find duplicates, detect stale refs' (#810) from burn/658-pr-backlog-triage into main	2026-04-17 02:32:30 -04:00
Claude (Opus 4.6)	87ee28aa42	Merge pull request 'feat: Token tracker integrated with orchestrator — auto-logging on task completion' (#808) from fix/634-token-tracker-orchestrator into main	2026-04-17 02:32:27 -04:00
Claude (Opus 4.6)	39d1e1d7ce	Merge pull request 'fix: pipeline_state.json daily reset' (#805) from fix/650-pipeline-daily-reset-v2 into main	2026-04-17 02:32:18 -04:00
Claude (Opus 4.6)	f57c21fda9	Merge pull request 'fix: training data code block indentation — normalize open_tag whitespace' (#809) from fix/750-code-block-indentation into main	2026-04-17 02:32:14 -04:00
Alexander Whitestone	44fe4bfcd7	feat: 500 dream description prompt enhancement pairs (#602)	2026-04-17 02:22:13 -04:00
Claude (Opus 4.6)	89413d00d3	Merge pull request 'fix: hash dedup rotation + bloom filter — bounded memory (#628)' (#804) from burn/621-shared-orchestrator-1776402806 into main	2026-04-17 06:19:03 +00:00
Claude (Opus 4.6)	65a400f3ed	Merge pull request 'feat: shared adversary scoring rubric and transcript schema (closes #655)' (#802) from feat/655-adversary-scoring-rubric into main	2026-04-17 06:19:01 +00:00
Alexander Whitestone	dbb1c124fe	feat: Country + Latin scene descriptions — 200 entries (#645). Complete the 9-genre scene description requirement. Country: 10 songs (Dusty Boots, County Fair, Highway Hymn, Barn Dance, Porcelain Dawn, Lonesome Road, Sweet Magnolia, Graveyard Shift, Sunday Best, Old Barn). Latin: 10 songs (Fuego Lento, Corazon de Oro, Lluvia de Estrellas, Bailando con el Viento, Ritmo del Alma, Luna Roja, Siembra y Cosecha, Carnaval, Desierto de Amor, Raices). All 10 training factory genres now complete: Pop, Rock, Hip-Hop, Electronic, R&B/Soul, Country, Jazz, Classical, Metal, Latin. Closes #645	2026-04-17 02:08:08 -04:00
Alexander Whitestone	9f2a76fc3e	feat: auto-generate scene descriptions from image/video (#689)	2026-04-17 01:58:05 -04:00
Alexander Whitestone	9a8d620163	feat: quality gate pipeline validation (#623). Validates JSONL/JSON pipeline outputs for: Schema correctness; Content quality (non-empty, not duplicated); Toxicity detection; Dedup hash management with auto-cleanup. Usage: python3 bin/quality-gate.py validate data.jsonl; python3 bin/quality-gate.py score data.jsonl; python3 bin/quality-gate.py stats; python3 bin/quality-gate.py cleanup. Closes #623	2026-04-17 05:53:33 +00:00
Alexander Whitestone	3e9d808739	feat: quality filter tests - score specificity, length ratio, code (#687). tests/test_quality_filter.py: 12 tests. TestScorePair (10): good pair scores high (>= 0.8); empty response scores 0; short response penalized; filler response penalized (sure/ok/yes/no); prompt equals response penalized; placeholder detected (TODO/FIXME/etc); repetitive response penalized; short prompt penalized; terse/rich key format accepted; scenario/response key format accepted. TestFilterFile (2): filter creates output file with kept entries; dry run mode. Existing quality_filter.py already implements: specificity (filler detection, placeholder detection); length ratio (response/prompt comparison); code correctness (repetition detection); threshold-based filtering; report: N filtered out of M total	2026-04-17 01:51:47 -04:00
Alexander Whitestone	d983d2f163	test: harm facilitation adversary tests (#618)	2026-04-17 05:45:19 +00:00
Alexander Whitestone	b6d14cf95b	feat: harm facilitation adversary runner (#618)	2026-04-17 05:38:16 +00:00
Alexander Whitestone	05378c8aed	feat: 200 harm facilitation jailbreak prompts (#618). 8 categories: weapons_direct (30), weapons_chemical (30), weapons_biological (20), coercion (25), harm_instructions (25), escalation (25), roleplay_bypass (25), context_manipulation (20).	2026-04-17 05:37:24 +00:00
Alexander Whitestone	d278d7f5d5	fix(#662): cron fleet audit - crontab parsing, tests, CI validation. Added VPS crontab backup parsing to cron-audit-662.py; new audit_fleet() combines hermes cron + VPS crontabs; load_crontab_backups() reads cron/vps/*-crontab-backup.txt; 20+ tests: crontab parsing, job categorization, fleet audit, timestamp parsing, backup loading; ci-cron-validate.py: CI gate that fails on systemic failures; fresh audit report generated in cron/audit-report.json. Closes #662	2026-04-17 01:34:45 -04:00

857 Commits