Commit Graph

541 Commits

Author SHA1 Message Date
Merge Bot
75153cb001 Merge PR #765: training-data/crisis-manipulation-500.jsonl (added) 2026-04-16 04:59:38 +00:00
Merge Bot
1cd56a06ce Merge PR #767: training/validate_provenance.py (added) 2026-04-16 04:59:25 +00:00
Merge Bot
1941c4f88b Merge PR #767: training/training_pair_provenance.py (added) 2026-04-16 04:59:22 +00:00
Merge Bot
038fe033c1 Merge PR #767: training/tests/test_provenance.py (added) 2026-04-16 04:59:21 +00:00
Merge Bot
2340e01d55 Merge PR #767: training/provenance_dashboard.py (added) 2026-04-16 04:59:19 +00:00
Merge Bot
6b7d219a29 Merge PR #768: scripts/token_budget.py (added) 2026-04-16 04:59:16 +00:00
Merge Bot
e399ce40a8 Merge PR #769: tests/test_quality_gate.py (added) 2026-04-16 04:59:13 +00:00
Merge Bot
318eaefb81 Merge PR #771: scripts/quality_gate_integration.py (added) 2026-04-16 04:59:01 +00:00
Merge Bot
d76182c654 Merge PR #772: scripts/cron_audit.py (added) 2026-04-16 04:58:59 +00:00
Merge Bot
9bdd2d776e Merge PR #773: tests/test_hash_dedup.py (added) 2026-04-16 04:58:57 +00:00
Merge Bot
8c5b82e214 Merge PR #773: scripts/hash_dedup.py (added) 2026-04-16 04:58:55 +00:00
Merge Bot
96dedc7930 Merge PR #774: training-data/scene-descriptions-r&b-soul.jsonl (added) 2026-04-16 04:58:52 +00:00
Merge Bot
297363a141 Merge PR #775: scripts/pr-triage-automation.py (added) 2026-04-16 04:58:49 +00:00
Merge Bot
29790d24aa Merge PR #776: tests/test_config_drift.py (added) 2026-04-16 04:58:46 +00:00
Merge Bot
7f121d5591 Merge PR #776: scripts/config_drift.py (added) 2026-04-16 04:58:44 +00:00
Merge Bot
5c4b453687 Merge PR #777: tests/test_token_tracker.py (added) 2026-04-16 04:58:41 +00:00
Merge Bot
218b6dcb33 Merge PR #777: scripts/token_tracker.py (added) 2026-04-16 04:58:40 +00:00
Merge Bot
872a2d3f79 Merge PR #778: evaluations/adversary/corpora/authority_bypass_200.jsonl (added) 2026-04-16 04:58:37 +00:00
Merge Bot
a023128f03 Merge PR #779: training-data/crisis-indirect-500.jsonl (added) 2026-04-16 04:58:34 +00:00
Merge Bot
346b7c6be4 Merge PR #780: tests/test_shebangs.py (added) 2026-04-16 04:58:31 +00:00
Merge Bot
18d8773750 Merge PR #781: adversary/emotional-manipulation-200.jsonl (added) 2026-04-16 04:58:28 +00:00
Merge Bot
291cd9e59c Merge PR #782: tests/test_no_placeholders.py (added) 2026-04-16 04:58:26 +00:00
Merge Bot
a0b2b551c9 Merge PR #783: tests/test_normalize_code_blocks.py (added) 2026-04-16 04:58:24 +00:00
Merge Bot
636e32e467 Merge PR #783: scripts/normalize-code-blocks.py (added) 2026-04-16 04:58:23 +00:00
Merge Bot
a653434dbb Merge PR #786: training/scripts/quality_filter.py (added) 2026-04-16 04:58:20 +00:00
Merge Bot
73426b18d3 Merge PR #786: training/data/scene-descriptions/scene-descriptions-rock.jsonl (added) 2026-04-16 04:58:18 +00:00
Merge Bot
45dbe0a3e1 Merge PR #786: training/data/scene-descriptions/scene-descriptions-pop.jsonl (added) 2026-04-16 04:58:07 +00:00
Merge Bot
b03ff88904 Merge PR #786: training/data/prompt-enhancement/video-scenes-500.jsonl (added) 2026-04-16 04:58:06 +00:00
Merge Bot
f1087d4877 Merge PR #786: training/data/prompt-enhancement/music-moods-500.jsonl (added) 2026-04-16 04:58:04 +00:00
Merge Bot
9649e861df Merge PR #786: training/data/prompt-enhancement/game-assets-500.jsonl (added) 2026-04-16 04:58:02 +00:00
Merge Bot
8c50bb4b27 Merge PR #786: training/data/prompt-enhancement/emotional-weather-500.jsonl (added) 2026-04-16 04:57:58 +00:00
Merge Bot
f4eb14c8c3 Merge PR #786: training/data/crisis-response/manipulation-edge-cases-500.jsonl (added) 2026-04-16 04:57:56 +00:00
Merge Bot
77e29d6df5 Test update (no change) 2026-04-16 04:55:23 +00:00
Merge Bot
6b7b02a036 Merge PR #784: evaluations/adversary/corpora/identity_attacks_200.jsonl 2026-04-16 04:53:27 +00:00
ab1548a97e Delete test file 2026-04-16 04:53:24 +00:00
Bot
c79cf6411b Test file creation 2026-04-16 04:51:03 +00:00
ada0ee8499 Merge pull request 'feat: 200 value violation jailbreak prompts (#617)' (#785) from fix/617 into main 2026-04-16 04:12:35 +00:00
5c9cd427a7 feat: 200 value violation jailbreak prompts (#617)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 1m22s
Smoke Test / smoke (pull_request) Failing after 20s
Validate Config / YAML Lint (pull_request) Failing after 12s
Validate Config / JSON Validate (pull_request) Successful in 13s
PR Checklist / pr-checklist (pull_request) Failing after 7m58s
Validate Config / Shell Script Lint (pull_request) Failing after 39s
Validate Config / Cron Syntax Check (pull_request) Successful in 7s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 58s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 7s
Validate Config / Playbook Schema Validation (pull_request) Successful in 16s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-16 03:20:49 +00:00
667cdfd51b Merge pull request 'feat: Electronic scene descriptions — 100 lyrics→visual sets (#609)' (#746) from fix/609 into main 2026-04-15 16:03:41 +00:00
Alexander Whitestone
0fdfb8e65b feat: Electronic scene descriptions — 100 lyrics->visual sets (#609)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 21s
Smoke Test / smoke (pull_request) Failing after 14s
Validate Config / YAML Lint (pull_request) Failing after 14s
Validate Config / JSON Validate (pull_request) Successful in 15s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m21s
Validate Config / Shell Script Lint (pull_request) Failing after 24s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 8s
Validate Config / Playbook Schema Validation (pull_request) Successful in 13s
Validate Training Data / validate (pull_request) Successful in 9s
PR Checklist / pr-checklist (pull_request) Failing after 6m11s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
10 Electronic songs, 10 visual beats each = 100 scene description sets.

Songs: Neon Pulse, Subterranean, Digital Elegy, Rave in the Ruins,
Satellite Hymn, Glitch Garden, Warehouse Frequency, Cybernetic Lullaby,
Thunderdome Protocol, Dawn at Berghain.

Closes #609
2026-04-15 11:47:59 -04:00
Alexander Whitestone
b62748f51d feat: Folk scene descriptions — 100 lyrics->visual sets (#610)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 13s
PR Checklist / pr-checklist (pull_request) Failing after 2m50s
Smoke Test / smoke (pull_request) Failing after 5s
Validate Config / YAML Lint (pull_request) Failing after 4s
Validate Config / JSON Validate (pull_request) Successful in 5s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 24s
Validate Config / Cron Syntax Check (pull_request) Successful in 10s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 10s
Validate Config / Shell Script Lint (pull_request) Failing after 37s
Validate Config / Playbook Schema Validation (pull_request) Successful in 16s
Validate Training Data / validate (pull_request) Successful in 15s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
10 Folk songs, 10 visual beats each = 100 scene description sets.

Songs: Dust Bowl Daughter, Lantern in the Window, River Baptism,
Coal Miner's Lullaby, Wildflower Road, Grandmother's Kitchen,
Harbor Song, Holler Echo, Train Whistle Gospel, Old Growth.

Closes #610
2026-04-15 11:40:46 -04:00
Alexander Whitestone
5ef9bbe98c feat: Jazz scene descriptions — 100 lyrics->visual sets (#611)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 6s
PR Checklist / pr-checklist (pull_request) Failing after 1m49s
Smoke Test / smoke (pull_request) Failing after 6s
Validate Config / YAML Lint (pull_request) Failing after 5s
Validate Config / JSON Validate (pull_request) Successful in 8s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 40s
Validate Config / Shell Script Lint (pull_request) Failing after 15s
Validate Config / Cron Syntax Check (pull_request) Successful in 4s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 6s
Validate Config / Playbook Schema Validation (pull_request) Successful in 9s
Validate Training Data / validate (pull_request) Successful in 7s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
10 Jazz songs, 10 visual beats each = 100 scene description sets.

Songs: Blue in Green, Smoky Back Room, Sunday Brunch, After Hours,
Stride Piano, Ballad for a Broken Horn, Harlem Midnight, Café Noir,
Free Fall, Last Set at the Vanguard.

Closes #611
2026-04-15 11:33:47 -04:00
Alexander Whitestone
0221be9460 feat: Classical scene descriptions — 100 lyrics->visual sets (#612)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 20s
Smoke Test / smoke (pull_request) Failing after 14s
Validate Config / YAML Lint (pull_request) Failing after 15s
Validate Config / JSON Validate (pull_request) Successful in 16s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m16s
Validate Config / Shell Script Lint (pull_request) Failing after 29s
Validate Config / Cron Syntax Check (pull_request) Successful in 6s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 6s
PR Checklist / pr-checklist (pull_request) Failing after 3m14s
Validate Config / Playbook Schema Validation (pull_request) Successful in 13s
Validate Training Data / validate (pull_request) Successful in 9s
Validate Config / Python Test Suite (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
10 Classical songs, 10 visual beats each = 100 scene description sets.

Songs: Moonlit Sonata, Requiem in Grey, The Violin Remembers,
Dawn Fugue, Grande Valse Brillante, Nocturne for the Forgotten,
Concerto of Iron, Pastoral Elegy, Caprice of Shadows, Symphony of Ashes.

Closes #612
2026-04-15 11:29:03 -04:00
Alexander Whitestone
b2b3d59bad feat: Metal scene descriptions — 100 lyrics->visual sets (#615)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 13s
PR Checklist / pr-checklist (pull_request) Failing after 2m42s
Smoke Test / smoke (pull_request) Failing after 18s
Validate Config / YAML Lint (pull_request) Failing after 15s
Validate Config / JSON Validate (pull_request) Successful in 13s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m2s
Validate Config / Shell Script Lint (pull_request) Failing after 39s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 9s
Validate Config / Playbook Schema Validation (pull_request) Successful in 19s
Validate Training Data / validate (pull_request) Successful in 13s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
10 Metal songs, 10 visual beats each = 100 scene description sets.

Songs: Iron Crown, Void Cathedral, Molten Psalm, Serpent Hymn,
Bone Harvest, Electric Funeral, Obsidian Throne, Witch Pyre,
Grave New World, Wolves in the Walls.

Moods span: rage, dread, defiance, apocalypse, transcendence,
brutality, paranoia, hope.

Closes #615
2026-04-15 11:19:04 -04:00
Alexander Whitestone
8354c49308 feat: harm facilitation adversary corpus — 200 jailbreak prompts (#618)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 8s
PR Checklist / pr-checklist (pull_request) Failing after 1m55s
Smoke Test / smoke (pull_request) Failing after 10s
Validate Config / YAML Lint (pull_request) Failing after 6s
Validate Config / JSON Validate (pull_request) Successful in 4s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 29s
Validate Config / Shell Script Lint (pull_request) Failing after 23s
Validate Config / Cron Syntax Check (pull_request) Successful in 4s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 5s
Validate Config / Playbook Schema Validation (pull_request) Successful in 12s
Validate Training Data / validate (pull_request) Successful in 9s
Validate Config / Python Test Suite (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Adversary attack corpus targeting harm facilitation. 200 prompts
across 12 subcategories, each with a scoring rubric.

Subcategories:
- weapons_instructions (24)
- poison_synthesis, cyber_attack, physical_harm, coercion_methods,
  surveillance, identity_theft, fraud_schemes, drug_synthesis,
  biological_harm, social_engineering, infrastructure_attack (16 each)

Each prompt includes:
- id, family, subcategory
- prompt text (creative attack vectors)
- targets (safety_constraints, harm_prevention)
- scoring rubric (pass/fail criteria)
- transcript_required flag

Format: JSONL, compatible with adversary runner (#652) and
shared scoring rubric (#655).
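
The record layout and subcategory sizes described above could be checked with something like the following. This is a minimal sketch, not the repository's validator: the key names `prompt` and `scoring_rubric` are guesses at how "prompt text" and "scoring rubric" are spelled in the file, and `check_corpus` is a hypothetical helper.

```python
import json
from collections import Counter

# Key names are assumptions inferred from the commit message; the real
# corpus may spell them differently (e.g. "prompt_text", "rubric").
REQUIRED_FIELDS = {
    "id", "family", "subcategory", "prompt",
    "targets", "scoring_rubric", "transcript_required",
}

# Subcategory sizes as stated in the commit message: one of 24, eleven of 16.
EXPECTED_COUNTS = {"weapons_instructions": 24}
EXPECTED_COUNTS.update({
    name: 16 for name in (
        "poison_synthesis", "cyber_attack", "physical_harm",
        "coercion_methods", "surveillance", "identity_theft",
        "fraud_schemes", "drug_synthesis", "biological_harm",
        "social_engineering", "infrastructure_attack",
    )
})


def missing_fields(record):
    """Return the required fields absent from one parsed JSONL record."""
    return sorted(REQUIRED_FIELDS - set(record))


def check_corpus(lines):
    """Parse JSONL lines; return (per-subcategory counts, list of problems)."""
    counts = Counter()
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = missing_fields(record)
        if missing:
            problems.append((lineno, missing))
        counts[record.get("subcategory")] += 1
    return counts, problems
```

The stated sizes are self-consistent: 24 + 11 x 16 = 200 prompts across 12 subcategories. Comparing the returned counts against `EXPECTED_COUNTS` would flag a file that drifts from that split.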

Closes #618
2026-04-15 11:05:31 -04:00
817785d763 Merge pull request 'feat: training data augmentation — paraphrase and translate pairs (#695)' (#732) from fix/695 into main 2026-04-15 11:56:28 +00:00
Alexander Whitestone
3603030235 feat: training data augmentation — paraphrase and translate pairs (#695)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 22s
Smoke Test / smoke (pull_request) Failing after 18s
Validate Config / YAML Lint (pull_request) Failing after 23s
Validate Config / JSON Validate (pull_request) Successful in 21s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m54s
Validate Config / Shell Script Lint (pull_request) Failing after 54s
Validate Config / Cron Syntax Check (pull_request) Successful in 16s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 16s
Validate Config / Playbook Schema Validation (pull_request) Successful in 23s
PR Checklist / pr-checklist (pull_request) Failing after 11m2s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
augment_pairs.py: generates paraphrases and translations for any
JSONL training file.

Features:
- Auto-detects text field (rich, terse, text, content, lyric_line, etc.)
- N paraphrases per entry (template-based, or LLM with --llm-endpoint)
- Translations to ES, FR, DE (template dictionary, or LLM)
- Outputs augmented JSONL alongside originals
- Marks each augmented entry with _augmentation, _original, _language

Usage:
  python3 augment_pairs.py --input data.jsonl
  python3 augment_pairs.py --input data.jsonl --paraphrases 5 --langs es,fr
  python3 augment_pairs.py --input data.jsonl --llm-endpoint http://localhost:11434/v1
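
The field auto-detection and entry marking listed under Features might look roughly like this. It is an illustrative sketch only, not the code in augment_pairs.py: the candidate field names and the three marker keys come from the commit message, while the priority order, the function names, and the marker values are assumptions.

```python
# Candidate field names come from the commit message ("rich, terse, text,
# content, lyric_line, etc."); the priority order here is an assumption.
CANDIDATE_TEXT_FIELDS = ("rich", "terse", "text", "content", "lyric_line")


def detect_text_field(entry):
    """Return the first candidate field holding a non-empty string, else None."""
    for name in CANDIDATE_TEXT_FIELDS:
        value = entry.get(name)
        if isinstance(value, str) and value.strip():
            return name
    return None


def mark_augmented(entry, field, new_text, kind, language="en"):
    """Copy `entry`, swap in the new text, and tag it with the marker keys."""
    augmented = dict(entry)                 # shallow copy; original is untouched
    augmented[field] = new_text
    augmented["_augmentation"] = kind       # assumed values: "paraphrase", "translation"
    augmented["_original"] = entry[field]   # text before augmentation
    augmented["_language"] = language       # assumed to be a lowercase code such as "es"
    return augmented
```

For example, `mark_augmented(e, detect_text_field(e), "nuevo texto", "translation", "es")` would yield a copy of `e` that keeps its other keys and records where the new text came from.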

Closes #695
2026-04-15 07:51:38 -04:00
35a191f7b1 Merge PR #725: feat: Provider health monitor with auto-switch (#509) 2026-04-15 06:10:45 +00:00
e987e1b870 Merge PR #726: feat: Pre-flight provider check for session launch (#508) 2026-04-15 06:10:41 +00:00
19278513b4 Merge PR #727: feat: Three.js-specific glitch detection patterns (#543) 2026-04-15 06:10:38 +00:00