b3390d4fee
feat: adversary execution harness for prompt corpora (#652)
Architecture Lint / Linter Tests (pull_request) Successful in 33s
Smoke Test / smoke (pull_request) Failing after 20s
Validate Config / YAML Lint (pull_request) Failing after 16s
Validate Config / JSON Validate (pull_request) Successful in 19s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m33s
Validate Config / Python Test Suite (pull_request) Has been skipped
PR Checklist / pr-checklist (pull_request) Failing after 4m27s
Validate Config / Cron Syntax Check (pull_request) Successful in 11s
Validate Config / Shell Script Lint (pull_request) Failing after 1m41s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 9s
Validate Config / Playbook Schema Validation (pull_request) Successful in 25s
Architecture Lint / Lint Repository (pull_request) Failing after 15s
2026-04-21 11:22:24 +00:00
4a7fa94731
Merge pull request 'feat: PR triage automation — categorize, auto-merge safe PRs, file reports (#659)' (#836) from burn/659-1776769427 into main
Architecture Lint / Linter Tests (push) Has been cancelled
Architecture Lint / Lint Repository (push) Has been cancelled
Smoke Test / smoke (push) Has been cancelled
Validate Config / YAML Lint (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
2026-04-21 11:21:18 +00:00
485783a317
Merge pull request 'feat(#691): training pair provenance tracking — source session + model' (#835) from burn/691-1776769427 into main
Architecture Lint / Linter Tests (push) Has been cancelled
Architecture Lint / Lint Repository (push) Has been cancelled
Smoke Test / smoke (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
Validate Config / YAML Lint (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
2026-04-21 11:21:13 +00:00
eacc670681
test: validate all 9 genre scene files have 100 valid entries (#645)
Architecture Lint / Linter Tests (pull_request) Successful in 24s
Smoke Test / smoke (pull_request) Failing after 44s
Validate Config / YAML Lint (pull_request) Failing after 31s
Validate Config / JSON Validate (pull_request) Successful in 36s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 57s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Shell Script Lint (pull_request) Failing after 23s
Validate Config / Cron Syntax Check (pull_request) Successful in 4s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 5s
Validate Config / Playbook Schema Validation (pull_request) Successful in 7s
PR Checklist / pr-checklist (pull_request) Failing after 12m4s
Architecture Lint / Lint Repository (pull_request) Failing after 24s
2026-04-21 11:20:25 +00:00
3dc1046cf8
fix: add missing artist + timestamp fields to country scene descriptions (#645)
Architecture Lint / Linter Tests (push) Successful in 23s
Smoke Test / smoke (push) Failing after 20s
Validate Config / YAML Lint (push) Failing after 11s
Validate Config / JSON Validate (push) Successful in 19s
Architecture Lint / Lint Repository (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
2026-04-21 11:17:36 +00:00
fe864962ec
test: Enhance PR triage tests (#659)
Architecture Lint / Linter Tests (pull_request) Successful in 33s
Smoke Test / smoke (pull_request) Failing after 39s
Validate Config / YAML Lint (pull_request) Failing after 27s
Validate Config / JSON Validate (pull_request) Successful in 22s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 21s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Shell Script Lint (pull_request) Failing after 24s
Validate Config / Cron Syntax Check (pull_request) Successful in 5s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 6s
Validate Config / Playbook Schema Validation (pull_request) Successful in 10s
PR Checklist / pr-checklist (pull_request) Failing after 11m27s
Architecture Lint / Lint Repository (pull_request) Failing after 11s
2026-04-21 11:17:00 +00:00
Alexander Whitestone
ced6d20fde
fix: add 'unknown' to VALID_SOURCES for process_pair fallback
Architecture Lint / Linter Tests (pull_request) Successful in 26s
Smoke Test / smoke (pull_request) Failing after 37s
Validate Config / YAML Lint (pull_request) Failing after 20s
Validate Config / JSON Validate (pull_request) Successful in 33s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m3s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Cron Syntax Check (pull_request) Successful in 15s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 13s
Validate Config / Playbook Schema Validation (pull_request) Successful in 31s
Validate Config / Shell Script Lint (pull_request) Failing after 1m43s
Architecture Lint / Lint Repository (pull_request) Failing after 15s
PR Checklist / pr-checklist (pull_request) Failing after 11m25s
2026-04-21 07:16:50 -04:00
5ee2190aaa
feat: Enhance PR triage with auto-merge, file-as-issue, org-wide mode (#659)
2026-04-21 11:16:05 +00:00
7cfc84637a
feat: Add pr-triage.sh wrapper (#659)
2026-04-21 11:14:31 +00:00
Alexander Whitestone
83457cc9a9
feat(#691): training pair provenance tracking — source session + model
...
ProvenanceTracker: added add_provenance(), extract_provenance_from_existing(),
filter_by_provenance(), generate_report() methods.
Fixed save_jsonl() to accept both (path, entries) and (entries, path)
argument orders for backward compatibility.
build_curated.py: every exemplar now gets provenance metadata
(source=curated, source_session_id, model=timmy-curated, timestamp).
Provenance coverage reported in build output.
Acceptance criteria:
- [x] Add metadata to each pair: source_session_id, model, timestamp
- [x] Filter pairs by provenance (exclude_models, exclude_sources)
- [x] Report: pair count by source model
Closes #691
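The provenance filter and per-model report described above could look roughly like the following. This is a minimal sketch, not the code in this commit: the function names `filter_by_provenance` and `count_by_model` are written here as free functions, the field names `source` and `model` follow the metadata listed in the commit body, and the sample values `session`, `model-x`, and `model-y` are hypothetical.

```python
from collections import Counter


def filter_by_provenance(pairs, exclude_models=(), exclude_sources=()):
    """Keep only pairs whose model and source are not in the exclusion lists."""
    excluded_models = set(exclude_models)
    excluded_sources = set(exclude_sources)
    return [
        pair for pair in pairs
        if pair.get("model") not in excluded_models
        and pair.get("source") not in excluded_sources
    ]


def count_by_model(pairs):
    """Report helper: number of pairs per source model."""
    return Counter(pair.get("model", "unknown") for pair in pairs)


# Hypothetical sample data; only the curated entry mirrors values from the commit.
pairs = [
    {"prompt": "a", "response": "b", "source": "curated", "model": "timmy-curated"},
    {"prompt": "c", "response": "d", "source": "session", "model": "model-x"},
    {"prompt": "e", "response": "f", "source": "session", "model": "model-y"},
]

kept = filter_by_provenance(pairs, exclude_models=["model-x"])
report = count_by_model(kept)
```

Pairs with no `model` field are counted under `unknown` in the report, which is an assumption of this sketch.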
2026-04-21 07:14:27 -04:00
d1486b52e8
Merge pull request 'feat: stale hermes process cleanup script (#829)' (#834) from fix/829-stale-process-cleanup into main
Architecture Lint / Linter Tests (push) Successful in 22s
Smoke Test / smoke (push) Failing after 16s
Validate Config / YAML Lint (push) Failing after 12s
Validate Config / JSON Validate (push) Successful in 13s
Validate Config / Python Syntax & Import Check (push) Failing after 38s
Validate Config / Python Test Suite (push) Has been skipped
Validate Config / Shell Script Lint (push) Failing after 37s
Validate Config / Cron Syntax Check (push) Successful in 7s
Validate Config / Deploy Script Dry Run (push) Successful in 8s
Validate Config / Playbook Schema Validation (push) Successful in 18s
Architecture Lint / Lint Repository (push) Failing after 19s
2026-04-21 01:39:54 +00:00
Alexander Whitestone
19db78bbf0
feat: stale hermes process cleanup script (#829)
...
Architecture Lint / Linter Tests (pull_request) Successful in 6m45s
Smoke Test / smoke (pull_request) Failing after 8s
Validate Config / YAML Lint (pull_request) Failing after 8s
Validate Config / JSON Validate (pull_request) Successful in 11s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 43s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Shell Script Lint (pull_request) Failing after 36s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 9s
Validate Config / Playbook Schema Validation (pull_request) Successful in 15s
PR Checklist / pr-checklist (pull_request) Successful in 2m45s
Architecture Lint / Lint Repository (pull_request) Failing after 20s
bin/hermes_cleanup.py:
Identifies stale hermes sessions (old + idle)
Groups by session, tracks parent+children
Memory waste calculation (RSS in MB/GB)
--kill to terminate, --dry-run (default) to report
--max-age (default 24h), --max-cpu (default 0.5%)
--json output, human-readable table
tests/test_hermes_cleanup.py: 8 tests
process age, child PIDs, kill session,
dry run, report generation
Usage:
python3 bin/hermes_cleanup.py # report
python3 bin/hermes_cleanup.py --kill # terminate
python3 bin/hermes_cleanup.py --max-age 12 # 12h threshold
python3 bin/hermes_cleanup.py --json # JSON
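The "old + idle" rule can be illustrated with a small sketch. It assumes a process counts as stale when it is older than the age threshold and its CPU usage is below the CPU threshold; the defaults mirror the documented 24h and 0.5%, but the function names, the dictionary keys, and the strict comparisons are hypothetical and may differ from bin/hermes_cleanup.py.

```python
def is_stale(age_hours, cpu_percent, max_age_hours=24.0, max_cpu_percent=0.5):
    """A process is stale when it is both old and idle (assumed strict comparisons)."""
    return age_hours > max_age_hours and cpu_percent < max_cpu_percent


def wasted_memory_mb(processes, max_age_hours=24.0, max_cpu_percent=0.5):
    """Sum the resident memory (RSS, given in KB) of stale processes, in MB."""
    stale_rss_kb = sum(
        proc["rss_kb"]
        for proc in processes
        if is_stale(proc["age_hours"], proc["cpu_percent"], max_age_hours, max_cpu_percent)
    )
    return stale_rss_kb / 1024


# Hypothetical process snapshot for illustration only.
procs = [
    {"pid": 101, "age_hours": 30.0, "cpu_percent": 0.1, "rss_kb": 204800},  # old and idle
    {"pid": 102, "age_hours": 30.0, "cpu_percent": 5.0, "rss_kb": 102400},  # old but busy
    {"pid": 103, "age_hours": 2.0, "cpu_percent": 0.0, "rss_kb": 51200},    # idle but young
]
wasted = wasted_memory_mb(procs)
```

Requiring both conditions keeps long-running but active sessions out of the kill list.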
2026-04-20 20:38:20 -04:00
b3eba66a07
Merge pull request 'fix: add python3 shebang to wakeup.py and .DS_Store to gitignore (closes #681)' (#832) from fix/681-clean into main
Smoke Test / smoke (push) Failing after 15s
Architecture Lint / Linter Tests (push) Successful in 19s
Validate Config / YAML Lint (push) Failing after 10s
Validate Config / JSON Validate (push) Successful in 11s
Validate Config / Python Syntax & Import Check (push) Failing after 34s
Validate Config / Python Test Suite (push) Has been skipped
Validate Config / Shell Script Lint (push) Failing after 35s
Validate Config / Cron Syntax Check (push) Successful in 6s
Validate Config / Deploy Script Dry Run (push) Successful in 6s
Validate Config / Playbook Schema Validation (push) Successful in 13s
Architecture Lint / Lint Repository (push) Failing after 10s
2026-04-21 00:10:14 +00:00
61bb221ff2
fix: add python3 shebang to wakeup.py and .DS_Store to .gitignore (closes #681)
Architecture Lint / Linter Tests (pull_request) Successful in 27s
Smoke Test / smoke (pull_request) Failing after 26s
Validate Config / YAML Lint (pull_request) Failing after 19s
Validate Config / JSON Validate (pull_request) Successful in 17s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 52s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Shell Script Lint (pull_request) Failing after 51s
Validate Config / Cron Syntax Check (pull_request) Successful in 9s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 7s
Validate Config / Playbook Schema Validation (pull_request) Successful in 20s
PR Checklist / pr-checklist (pull_request) Successful in 3m54s
Architecture Lint / Lint Repository (pull_request) Failing after 18s
2026-04-20 19:59:06 -04:00
729db767d1
Merge pull request 'feat(#687): training data quality filter — remove low-quality pairs' (#830) from feat/687-quality-filter into main
Smoke Test / smoke (push) Failing after 19s
Architecture Lint / Linter Tests (push) Successful in 25s
Validate Config / YAML Lint (push) Failing after 14s
Validate Config / JSON Validate (push) Successful in 15s
Validate Config / Python Syntax & Import Check (push) Failing after 41s
Validate Config / Python Test Suite (push) Has been skipped
Validate Config / Shell Script Lint (push) Failing after 46s
Validate Config / Cron Syntax Check (push) Successful in 12s
Validate Config / Deploy Script Dry Run (push) Successful in 10s
Validate Config / Playbook Schema Validation (push) Successful in 20s
Architecture Lint / Lint Repository (push) Failing after 14s
2026-04-20 23:40:40 +00:00
d4dedd2c3d
Merge pull request 'feat: backfill provenance on all training data (#752)' (#826) from fix/752-provenance-v2 into main
Smoke Test / smoke (push) Has been cancelled
Architecture Lint / Lint Repository (push) Has been cancelled
Architecture Lint / Linter Tests (push) Has been cancelled
Validate Config / YAML Lint (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
2026-04-20 23:40:37 +00:00
0e2e2c1552
Merge pull request 'feat: code block normalization tests (closes #750)' (#825) from fix/750-code-blocks into main
Architecture Lint / Lint Repository (push) Has been cancelled
Architecture Lint / Linter Tests (push) Has been cancelled
Smoke Test / smoke (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
Validate Config / YAML Lint (push) Has started running
2026-04-20 23:40:35 +00:00
bee4d02dd5
Merge pull request 'fix: restore pytest collection — fix 7 syntax/import errors (#823)' (#824) from fix/823-pytest-collection into main
Architecture Lint / Lint Repository (push) Has been cancelled
Architecture Lint / Linter Tests (push) Has been cancelled
Smoke Test / smoke (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
Validate Config / YAML Lint (push) Has been cancelled
2026-04-20 23:40:23 +00:00
a0266c83a4
fix(#687): Add quality filter tests
Smoke Test / smoke (pull_request) Failing after 15s
Architecture Lint / Linter Tests (pull_request) Successful in 20s
Validate Config / YAML Lint (pull_request) Failing after 13s
Validate Config / JSON Validate (pull_request) Successful in 15s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 36s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Cron Syntax Check (pull_request) Successful in 10s
Validate Config / Shell Script Lint (pull_request) Failing after 47s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 9s
Validate Config / Playbook Schema Validation (pull_request) Successful in 20s
Architecture Lint / Lint Repository (pull_request) Failing after 17s
PR Checklist / pr-checklist (pull_request) Successful in 3m48s
2026-04-20 23:16:13 +00:00
b28071bb71
fix(#687): Training data quality filter
...
- Score pairs on specificity, length ratio, code correctness
- Composite weighted score (0.5 spec + 0.2 length + 0.3 code)
- Configurable threshold filtering
- Report mode with score distribution
- Supports prompt/response, input/output, question/answer formats
- CLI: python3 quality_filter.py input.jsonl -o output.jsonl --report
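The composite weighting above can be sketched as follows. The weights come from the commit body; everything else (the names `WEIGHTS`, `composite_score`, `filter_pairs`, the assumption that each sub-score lies in [0, 1], and the 0.5 threshold in the example) is hypothetical and not taken from quality_filter.py.

```python
# Weights as stated in the commit body: 0.5 specificity, 0.2 length ratio, 0.3 code.
WEIGHTS = {"specificity": 0.5, "length_ratio": 0.2, "code": 0.3}


def composite_score(scores):
    """Weighted sum of the three sub-scores (each assumed to be in [0, 1])."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


def filter_pairs(scored_pairs, threshold):
    """Keep the pairs whose composite score meets the configurable threshold."""
    return [pair for pair, scores in scored_pairs if composite_score(scores) >= threshold]


# Hypothetical sub-scores for two pairs.
scored = [
    ("good", {"specificity": 1.0, "length_ratio": 0.5, "code": 0.0}),  # 0.5 + 0.1 + 0.0
    ("weak", {"specificity": 0.2, "length_ratio": 1.0, "code": 0.0}),  # 0.1 + 0.2 + 0.0
]
kept = filter_pairs(scored, threshold=0.5)
```

Because the weights sum to 1.0, the composite stays in the same [0, 1] range as the sub-scores, so one threshold applies uniformly.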
2026-04-20 23:15:48 +00:00
Alexander Whitestone
8e791afecc
feat: backfill provenance on all training data (#752)
...
Architecture Lint / Linter Tests (pull_request) Successful in 21s
Smoke Test / smoke (pull_request) Failing after 22s
Validate Config / YAML Lint (pull_request) Failing after 16s
Validate Config / JSON Validate (pull_request) Successful in 14s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 33s
Validate Config / Cron Syntax Check (pull_request) Successful in 12s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 12s
Validate Config / Shell Script Lint (pull_request) Failing after 54s
Validate Config / Playbook Schema Validation (pull_request) Successful in 17s
PR Checklist / pr-checklist (pull_request) Successful in 2m25s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
scripts/backfill_training_provenance.py:
Backfills provenance metadata on all JSONL training files
Adds source_session_id, model, timestamp, source_type
--dry-run mode, --json output, parse error handling
Result: 11,007 pairs across 45 files now have provenance
Coverage: 0% -> 100%
Validation: python3 scripts/provenance_validate.py --threshold 50
PASS: 3800/3800 pairs have provenance
Dashboard: python3 scripts/provenance_dashboard.py
Shows pair count by model, source, coverage
2026-04-18 15:59:17 -04:00
Alexander Whitestone
6fcd2cc59a
feat: code block normalization tests (closes #750)
...
Architecture Lint / Linter Tests (pull_request) Successful in 15s
Smoke Test / smoke (pull_request) Failing after 17s
Validate Config / YAML Lint (pull_request) Failing after 15s
Validate Config / JSON Validate (pull_request) Successful in 18s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 39s
Validate Config / Cron Syntax Check (pull_request) Successful in 12s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 10s
Validate Config / Shell Script Lint (pull_request) Failing after 56s
Validate Config / Playbook Schema Validation (pull_request) Successful in 17s
PR Checklist / pr-checklist (pull_request) Successful in 2m45s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
tests/test_normalize_code_blocks.py: 5 tests
test_normalizes_indented_code_block
test_preserves_non_code_content
test_handles_multiple_code_blocks
test_handles_empty_response
test_preserves_prompt
Existing normalize-code-blocks.py handles code block indentation.
2026-04-18 15:46:22 -04:00
Alexander Whitestone
edd35eaa4b
fix: restore pytest collection — fix 7 syntax/import errors (#823)
...
Architecture Lint / Linter Tests (pull_request) Successful in 12s
Smoke Test / smoke (pull_request) Failing after 19s
Validate Config / YAML Lint (pull_request) Failing after 14s
Validate Config / JSON Validate (pull_request) Successful in 13s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 52s
Validate Config / Shell Script Lint (pull_request) Failing after 42s
Validate Config / Cron Syntax Check (pull_request) Successful in 16s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 14s
Validate Config / Playbook Schema Validation (pull_request) Successful in 18s
PR Checklist / pr-checklist (pull_request) Successful in 3m4s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Fixed collection errors:
scripts/adversary_schema.py: unterminated regex string (line 141)
scripts/config_validate.py: unmatched ')' (line 87)
scripts/pr_triage.py: truncated file + unterminated f-string
adversary/harm_facilitation_adversary.py: 4 broken f-strings
bin/glitch_patterns.py: missing get_threejs_patterns() export
tests/test_glitch_detector.py: fixed THREEJS_CATEGORIES import
tests/test_pr_triage.py: fixed function name imports
training/training_pair_provenance.py: added ProvenanceTracker class
scripts/validate_scene_data.py: symlink for import compatibility
Result: python3 -m pytest --collect-only
911 tests collected, 0 collection errors
(was: 769 collected / 7 errors)
2026-04-18 15:37:33 -04:00
04ecad3b43
Merge pull request 'fix: use PYTHON variable in training Makefile (closes #660)' (#822) from fix/660-python-makefile into main
...
fix: use PYTHON variable in training Makefile (closes #660)
Refs Timmy_Foundation/the-nexus#1471
2026-04-17 06:44:30 +00:00
099948b3d1
fix: use PYTHON variable in training Makefile
...
Architecture Lint / Linter Tests (pull_request) Successful in 8s
PR Checklist / pr-checklist (pull_request) Failing after 2m12s
Smoke Test / smoke (pull_request) Failing after 13s
Validate Config / YAML Lint (pull_request) Failing after 9s
Validate Config / JSON Validate (pull_request) Successful in 10s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 16s
Validate Config / Shell Script Lint (pull_request) Failing after 15s
Validate Config / Cron Syntax Check (pull_request) Successful in 7s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 7s
Validate Config / Playbook Schema Validation (pull_request) Successful in 13s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
On macOS where only python3 is installed (no python shim), bare
`python` calls fail with 'No such file or directory'.
Adds `PYTHON ?= python3` variable. Replaces all bare `python`
calls with `$(PYTHON)` across: train-local, vibes,
adversary-value-violations, ingest, curated, convert.
Override: make vibes PYTHON=python
Closes #660
Refs Timmy_Foundation/the-nexus#1471
2026-04-17 06:39:05 +00:00
ef58883a26
fix: use PYTHON variable in training Makefile for portability (closes #660)
...
Added PYTHON ?= python3 variable and replaced all bare python calls.
Fixes macOS where only python3 is installed.
Refs #660
2026-04-17 02:37:47 -04:00
2a11233952
Merge pull request 'feat: quality gate pipeline validation' (#818) from fix/623 into main
...
Resolves add/add conflict in pipeline/quality_gate.py by keeping more complete 619-line main version.
Closes #623
2026-04-17 02:37:16 -04:00
cc9ff4cf5d
Merge remote-tracking branch 'origin/fix/752'
2026-04-17 02:37:04 -04:00
7c03c666d8
Merge pull request 'feat: 500 dream description prompt enhancement pairs — scene/crisis/music data' (#821,#820,#819,#799) from fix/602 into main
...
Resolves add/add conflicts with already-merged files (authority_bypass_200.jsonl, identity_attacks_200.jsonl, quality_filter.py) by keeping main's versions.
Closes #602, #645, #689, #599
2026-04-17 02:37:00 -04:00
0fc149b10c
Merge pull request 'feat: quality filter tests — score specificity, length ratio, code' (#817) from fix/687-quality-filter into main
2026-04-17 02:32:51 -04:00
ed5e52e0d9
Merge pull request 'feat: harm facilitation adversary — 200 jailbreak prompts' (#816) from ward/618-harm-facilitation into main
2026-04-17 02:32:48 -04:00
2c49cac144
Merge pull request 'fix(#662): cron fleet audit — crontab parsing, tests, CI validation' (#814) from burn/662-cron-audit-fix into main
2026-04-17 02:32:44 -04:00
1183fb5f2b
Merge pull request 'feat: scene data validator tests + CI path fix' (#813) from feat/647-scene-data-validator into main
2026-04-17 02:32:40 -04:00
7ce0016207
Merge pull request 'test: verify training example metadata preservation' (#812) from fix/646-metadata-preservation into main
2026-04-17 02:32:37 -04:00
06bebc0ca3
Merge pull request 'feat: adversary execution harness for prompt corpora' (#811) from fix/652-adversary-harness into main
2026-04-17 02:32:33 -04:00
b2246e0dcc
Merge pull request 'feat: PR backlog triage script — categorize, find duplicates, detect stale refs' (#810) from burn/658-pr-backlog-triage into main
2026-04-17 02:32:30 -04:00
87ee28aa42
Merge pull request 'feat: Token tracker integrated with orchestrator — auto-logging on task completion' (#808) from fix/634-token-tracker-orchestrator into main
2026-04-17 02:32:27 -04:00
39d1e1d7ce
Merge pull request 'fix: pipeline_state.json daily reset' (#805) from fix/650-pipeline-daily-reset-v2 into main
2026-04-17 02:32:18 -04:00
f57c21fda9
Merge pull request 'fix: training data code block indentation — normalize open_tag whitespace' (#809) from fix/750-code-block-indentation into main
2026-04-17 02:32:14 -04:00
Alexander Whitestone
44fe4bfcd7
feat: 500 dream description prompt enhancement pairs (#602)
Architecture Lint / Linter Tests (pull_request) Successful in 22s
PR Checklist / pr-checklist (pull_request) Failing after 34m33s
Smoke Test / smoke (pull_request) Failing after 47m9s
Validate Config / YAML Lint (pull_request) Failing after 12s
Validate Config / JSON Validate (pull_request) Successful in 11s
Validate Config / Shell Script Lint (pull_request) Failing after 42s
Validate Config / Cron Syntax Check (pull_request) Successful in 7s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 5s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 59s
Validate Training Data / validate (pull_request) Successful in 15s
Validate Config / Playbook Schema Validation (pull_request) Successful in 19s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 02:22:13 -04:00
89413d00d3
Merge pull request 'fix: hash dedup rotation + bloom filter — bounded memory (#628)' (#804) from burn/621-shared-orchestrator-1776402806 into main
2026-04-17 06:19:03 +00:00
65a400f3ed
Merge pull request 'feat: shared adversary scoring rubric and transcript schema (closes #655)' (#802) from feat/655-adversary-scoring-rubric into main
2026-04-17 06:19:01 +00:00
Alexander Whitestone
dbb1c124fe
feat: Country + Latin scene descriptions — 200 entries (#645)
...
Architecture Lint / Linter Tests (pull_request) Successful in 23s
Smoke Test / smoke (pull_request) Failing after 14s
Validate Config / YAML Lint (pull_request) Failing after 12s
Validate Config / JSON Validate (pull_request) Successful in 13s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m11s
Validate Config / Shell Script Lint (pull_request) Failing after 40s
Validate Config / Cron Syntax Check (pull_request) Successful in 10s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 10s
Validate Config / Playbook Schema Validation (pull_request) Successful in 15s
Validate Training Data / validate (pull_request) Successful in 15s
PR Checklist / pr-checklist (pull_request) Failing after 3m43s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Complete the 9-genre scene description requirement.
Country: 10 songs (Dusty Boots, County Fair, Highway Hymn, Barn Dance,
Porcelain Dawn, Lonesome Road, Sweet Magnolia, Graveyard Shift,
Sunday Best, Old Barn)
Latin: 10 songs (Fuego Lento, Corazon de Oro, Lluvia de Estrellas,
Bailando con el Viento, Ritmo del Alma, Luna Roja, Siembra y Cosecha,
Carnaval, Desierto de Amor, Raices)
All 10 training factory genres now complete:
Pop, Rock, Hip-Hop, Electronic, R&B/Soul, Country, Jazz, Classical,
Metal, Latin.
Closes #645
2026-04-17 02:08:08 -04:00
Alexander Whitestone
9f2a76fc3e
feat: auto-generate scene descriptions from image/video (#689)
Architecture Lint / Linter Tests (pull_request) Successful in 31s
PR Checklist / pr-checklist (pull_request) Failing after 13m48s
Smoke Test / smoke (pull_request) Failing after 13m22s
Validate Config / YAML Lint (pull_request) Failing after 5s
Validate Config / JSON Validate (pull_request) Successful in 4s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 21s
Validate Config / Shell Script Lint (pull_request) Failing after 22s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 8s
Validate Training Data / validate (pull_request) Successful in 10s
Validate Config / Playbook Schema Validation (pull_request) Successful in 15s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 01:58:05 -04:00
9a8d620163
feat: quality gate pipeline validation (#623)
...
Architecture Lint / Linter Tests (pull_request) Successful in 13s
Smoke Test / smoke (pull_request) Failing after 11s
Validate Config / YAML Lint (pull_request) Failing after 14s
Validate Config / JSON Validate (pull_request) Successful in 14s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 44s
Validate Config / Shell Script Lint (pull_request) Failing after 24s
Validate Config / Cron Syntax Check (pull_request) Successful in 5s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 3s
Validate Config / Playbook Schema Validation (pull_request) Successful in 8s
PR Checklist / pr-checklist (pull_request) Failing after 3m54s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Validates JSONL/JSON pipeline outputs for:
- Schema correctness
- Content quality (non-empty, not duplicated)
- Toxicity detection
- Dedup hash management with auto-cleanup
Usage:
python3 bin/quality-gate.py validate data.jsonl
python3 bin/quality-gate.py score data.jsonl
python3 bin/quality-gate.py stats
python3 bin/quality-gate.py cleanup
Closes #623
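A rough sketch of the schema, content, and dedup checks listed above, under stated assumptions: the required keys `prompt` and `response`, the SHA-256 dedup hash, and the function name `validate_lines` are all hypothetical, and toxicity detection and hash auto-cleanup are left out.

```python
import hashlib
import json

# Hypothetical schema: the real gate may require different keys.
REQUIRED_KEYS = ("prompt", "response")


def validate_lines(lines):
    """Split JSONL lines into valid entries and (line_number, reason) rejections."""
    seen_hashes = set()
    valid, rejected = [], []
    for number, line in enumerate(lines, start=1):
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            rejected.append((number, "invalid JSON"))
            continue
        if not isinstance(entry, dict) or any(key not in entry for key in REQUIRED_KEYS):
            rejected.append((number, "missing key"))
            continue
        if any(not str(entry[key]).strip() for key in REQUIRED_KEYS):
            rejected.append((number, "empty field"))
            continue
        # Hash a canonical serialization so key order does not affect dedup.
        canonical = json.dumps(entry, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(canonical).hexdigest()
        if digest in seen_hashes:
            rejected.append((number, "duplicate"))
            continue
        seen_hashes.add(digest)
        valid.append(entry)
    return valid, rejected


sample = [
    '{"prompt": "hi", "response": "hello"}',
    '{"response": "hello", "prompt": "hi"}',
    '{"prompt": "hi"}',
    '{"prompt": "hi", "response": "   "}',
    'not json',
]
valid, rejected = validate_lines(sample)
```

Checks run cheapest first, so a malformed line is rejected before any hashing happens.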
2026-04-17 05:53:33 +00:00
Alexander Whitestone
3e9d808739
feat: quality filter tests — score specificity, length ratio, code (#687)
...
Architecture Lint / Linter Tests (pull_request) Successful in 14s
Smoke Test / smoke (pull_request) Failing after 16s
Validate Config / YAML Lint (pull_request) Failing after 14s
Validate Config / JSON Validate (pull_request) Successful in 15s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 45s
Validate Config / Shell Script Lint (pull_request) Failing after 23s
Validate Config / Cron Syntax Check (pull_request) Successful in 6s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 5s
Validate Config / Playbook Schema Validation (pull_request) Successful in 18s
PR Checklist / pr-checklist (pull_request) Failing after 3m51s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
tests/test_quality_filter.py: 12 tests
TestScorePair (10):
good pair scores high (>= 0.8)
empty response scores 0
short response penalized
filler response penalized (sure/ok/yes/no)
prompt equals response penalized
placeholder detected (TODO/FIXME/etc)
repetitive response penalized
short prompt penalized
terse/rich key format accepted
scenario/response key format accepted
TestFilterFile (2):
filter creates output file with kept entries
dry run mode
Existing quality_filter.py already implements:
- specificity (filler detection, placeholder detection)
- length ratio (response/prompt comparison)
- code correctness (repetition detection)
- threshold-based filtering
- report: N filtered out of M total
2026-04-17 01:51:47 -04:00
d983d2f163
test: harm facilitation adversary tests (#618)
Architecture Lint / Linter Tests (pull_request) Successful in 23s
Smoke Test / smoke (pull_request) Failing after 16s
Validate Config / YAML Lint (pull_request) Failing after 13s
Validate Config / JSON Validate (pull_request) Successful in 16s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 38s
Validate Config / Cron Syntax Check (pull_request) Successful in 12s
Validate Config / Shell Script Lint (pull_request) Failing after 54s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 10s
Validate Config / Playbook Schema Validation (pull_request) Successful in 20s
PR Checklist / pr-checklist (pull_request) Failing after 3m52s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 05:45:19 +00:00
b6d14cf95b
feat: harm facilitation adversary runner (#618)
2026-04-17 05:38:16 +00:00
05378c8aed
feat: 200 harm facilitation jailbreak prompts (#618)
...
8 categories: weapons_direct (30), weapons_chemical (30),
weapons_biological (20), coercion (25), harm_instructions (25),
escalation (25), roleplay_bypass (25), context_manipulation (20).
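As a quick arithmetic check, the per-category counts listed above do add up to the 200 prompts in the commit title. The dictionary name `CATEGORY_COUNTS` is only illustrative; the category names and counts are copied from the commit body.

```python
# Category sizes as listed in the commit body.
CATEGORY_COUNTS = {
    "weapons_direct": 30,
    "weapons_chemical": 30,
    "weapons_biological": 20,
    "coercion": 25,
    "harm_instructions": 25,
    "escalation": 25,
    "roleplay_bypass": 25,
    "context_manipulation": 20,
}

total = sum(CATEGORY_COUNTS.values())
```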
2026-04-17 05:37:24 +00:00
Alexander Whitestone
d278d7f5d5
fix(#662): cron fleet audit — crontab parsing, tests, CI validation
...
Architecture Lint / Linter Tests (pull_request) Successful in 24s
Smoke Test / smoke (pull_request) Failing after 14s
Validate Config / YAML Lint (pull_request) Failing after 14s
Validate Config / JSON Validate (pull_request) Successful in 16s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 46s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 7s
Validate Config / Shell Script Lint (pull_request) Failing after 44s
Validate Config / Playbook Schema Validation (pull_request) Successful in 22s
PR Checklist / pr-checklist (pull_request) Failing after 3m55s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
- Added VPS crontab backup parsing to cron-audit-662.py
- New audit_fleet() combines hermes cron + VPS crontabs
- load_crontab_backups() reads cron/vps/*-crontab-backup.txt
- 20+ tests: crontab parsing, job categorization, fleet audit,
timestamp parsing, backup loading
- ci-cron-validate.py: CI gate that fails on systemic failures
- Fresh audit report generated in cron/audit-report.json
Closes #662
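Crontab backup parsing of the kind described above might be sketched like this. It assumes standard five-field crontab entries; the function name `parse_crontab` and the returned dictionary shape are hypothetical and are not taken from cron-audit-662.py. Lines that do not have five schedule fields plus a command (environment assignments, `@reboot` shortcuts) are simply skipped in this sketch.

```python
def parse_crontab(text):
    """Parse crontab text into a list of {"schedule", "command"} dictionaries."""
    jobs = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        # Five schedule fields, then the rest of the line as the command.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # e.g. MAILTO=... or an @reboot entry, not handled here
        jobs.append({"schedule": " ".join(fields[:5]), "command": fields[5]})
    return jobs


# Hypothetical backup contents for illustration.
backup = """# nightly jobs
MAILTO=ops@example.com

*/5 * * * * /usr/bin/python3 /opt/job.py --flag
0 3 * * 1 /usr/local/bin/weekly.sh
"""
jobs = parse_crontab(backup)
```

Limiting the split to five separators keeps any spaces inside the command intact.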
2026-04-17 01:34:45 -04:00