c633afd66d
fix: add underscore module version for test imports (#750)
2026-04-17 05:33:26 +00:00
c69ae0e72b
fix: normalize open_tag whitespace in code block parser (#750)
2026-04-17 05:33:24 +00:00
a4a33fd0f8
test: add edge-case tests for training example metadata preservation
Architecture Lint / Linter Tests (pull_request) Successful in 19s
Smoke Test / smoke (pull_request) Failing after 12s
Validate Config / YAML Lint (pull_request) Failing after 9s
Validate Config / JSON Validate (pull_request) Successful in 12s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 45s
Validate Config / Cron Syntax Check (pull_request) Successful in 10s
Validate Config / Shell Script Lint (pull_request) Failing after 52s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 9s
Validate Config / Playbook Schema Validation (pull_request) Successful in 21s
PR Checklist / pr-checklist (pull_request) Failing after 3m50s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
- test_metadata_with_future_fields_preserved: unknown fields pass through
- test_metadata_preserved_across_multiple_examples: per-example independence
Verifies fix for #646.
2026-04-17 05:33:08 +00:00
f05c014143
test: Add PR backlog triage tests (#658)
Architecture Lint / Linter Tests (pull_request) Successful in 24s
Smoke Test / smoke (pull_request) Failing after 19s
Validate Config / YAML Lint (pull_request) Failing after 15s
Validate Config / JSON Validate (pull_request) Successful in 15s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 42s
Validate Config / Shell Script Lint (pull_request) Failing after 37s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 6s
Validate Config / Playbook Schema Validation (pull_request) Successful in 13s
PR Checklist / pr-checklist (pull_request) Failing after 3m19s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 05:32:20 +00:00
f094b0d5b5
feat: Add PR backlog triage script — categorize, duplicates, stale detection (#658)
2026-04-17 05:32:19 +00:00
df4dcf1fb4
test: Token tracker orchestrator integration tests (#634)
Architecture Lint / Linter Tests (pull_request) Successful in 24s
Smoke Test / smoke (pull_request) Failing after 9s
Validate Config / YAML Lint (pull_request) Failing after 11s
Validate Config / JSON Validate (pull_request) Successful in 13s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 42s
Validate Config / Shell Script Lint (pull_request) Failing after 34s
Validate Config / Cron Syntax Check (pull_request) Successful in 5s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 4s
Validate Config / Playbook Schema Validation (pull_request) Successful in 14s
PR Checklist / pr-checklist (pull_request) Failing after 3m32s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 05:32:18 +00:00
42ff05aeec
feat: adversary execution harness for prompt corpora (#652)
Reusable harness for replaying JSONL corpora against live agents.
Supports Ollama, hermes, and mock backends.
Captures transcripts, scores responses, auto-files P0 issues.
Closes #652
2026-04-17 05:31:27 +00:00
c4790d8bb9
feat: Integrate token tracker with orchestrator (#634)
- Fix corrupted TOKEN_LOG path
- Import token_budget.record_usage in log_token_budget
- Add check_budget() before pipeline runs
- Add Huey tasks for all 5 pipelines
- Add _run_pipeline() runner with timeout and budget enforcement
- Add schedule_nightly() for dependency-ordered dispatch
- Signal hook auto-logs to both JSONL and budget tracker
2026-04-17 05:31:12 +00:00
acba760731
fix: reset_stale_states delegates to standalone script (closes #650)
Validate Config / Playbook Schema Validation (pull_request) Successful in 14s
Architecture Lint / Linter Tests (pull_request) Successful in 26s
PR Checklist / pr-checklist (pull_request) Failing after 25m6s
Smoke Test / smoke (pull_request) Failing after 12s
Validate Config / YAML Lint (pull_request) Failing after 8s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 35s
Validate Config / JSON Validate (pull_request) Successful in 13s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 6s
Validate Config / Shell Script Lint (pull_request) Failing after 34s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 05:26:06 +00:00
15713958e6
test: bloom filter + hash dedup rotation tests #628
Architecture Lint / Linter Tests (pull_request) Successful in 16s
Smoke Test / smoke (pull_request) Failing after 17s
Validate Config / YAML Lint (pull_request) Failing after 13s
Validate Config / JSON Validate (pull_request) Successful in 17s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 35s
Validate Config / Shell Script Lint (pull_request) Failing after 21s
Validate Config / Cron Syntax Check (pull_request) Successful in 4s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 6s
Validate Config / Playbook Schema Validation (pull_request) Successful in 21s
PR Checklist / pr-checklist (pull_request) Failing after 3m32s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 05:26:05 +00:00
1c69029d9c
feat: integrate provenance tracking with build_curated.py (#752)
Architecture Lint / Linter Tests (pull_request) Successful in 25s
PR Checklist / pr-checklist (pull_request) Failing after 5m22s
Smoke Test / smoke (pull_request) Failing after 17s
Validate Config / YAML Lint (pull_request) Failing after 13s
Validate Config / JSON Validate (pull_request) Successful in 14s
Validate Config / Shell Script Lint (pull_request) Failing after 41s
Validate Config / Cron Syntax Check (pull_request) Successful in 8s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 7s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m6s
Validate Config / Playbook Schema Validation (pull_request) Successful in 17s
Validate Training Data / validate (pull_request) Successful in 12s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 05:25:49 +00:00
776597712f
fix: hash dedup rotation + bloom filter — bounded memory #628
- BloomFilter class: O(n) space, configurable error rate
- HashDedupStore: daily JSON files, 7-day retention, auto-rotation
- Cross-run dedup in run_gate(): rejects entries seen in prior runs
- CLI: --dedup-stats, --dedup-purge commands
- Stats file rotation capped at 1000 entries
- Purge command for full hash reset
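The Bloom filter named in the bullets above can be sketched roughly as follows. This is an illustration only, not the code from this commit: the class name BloomFilterSketch, its constructor parameters, and the choice of SHA-256 double hashing are all assumptions. The sizing uses the standard formulas for a target capacity and false-positive rate.

```python
import hashlib
import math


class BloomFilterSketch:
    """Illustrative Bloom filter; not the BloomFilter class from the commit."""

    def __init__(self, capacity: int, error_rate: float = 0.01):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.size = max(1, math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.size / capacity * math.log(2)))
        # The bit array is allocated once, so memory stays fixed however many items are added.
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two 64-bit slices of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # May return a false positive, never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))
```

Because membership answers can be false positives, a dedup gate built this way would occasionally reject an entry it has not seen; the configurable error rate controls how often.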
2026-04-17 05:25:10 +00:00
164643577a
fix: pipeline state daily reset (closes #650)
2026-04-17 05:24:19 +00:00
34ade6fc0e
fix: pipeline state daily reset (closes #650)
2026-04-17 05:24:14 +00:00
c5270d76e0
fix: pipeline state daily reset (closes #650)
2026-04-17 05:24:12 +00:00
3250eba0cc
feat: orchestrator test suite — queue, resume, parallel, tokens
2026-04-17 05:20:02 +00:00
99d4facdad
feat: pipelines __init__.py exports
2026-04-17 05:19:59 +00:00
627f2e0158
test: adversary scoring rubric and schema tests (#655)
Architecture Lint / Linter Tests (pull_request) Successful in 15s
Validate Config / YAML Lint (pull_request) Failing after 15s
Validate Config / JSON Validate (pull_request) Successful in 10s
Smoke Test / smoke (pull_request) Failing after 17s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 31s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 12s
PR Checklist / pr-checklist (pull_request) Failing after 3m51s
Validate Config / Shell Script Lint (pull_request) Failing after 53s
Validate Config / Cron Syntax Check (pull_request) Successful in 10s
Validate Config / Playbook Schema Validation (pull_request) Successful in 21s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 05:18:38 +00:00
c808c4efb3
fix: shared orchestrator — syntax fix, resume on restart, future tracking, list CLI
Fixes #621
- Fix DEFAULT_TOKEN_BUDGET syntax error
- Resume paused/running jobs with checkpoints on restart
- Proper future collection and drain in run()
- Add 'list' CLI command for job inspection
- Throttle when at worker capacity
2026-04-17 05:17:59 +00:00
38a4a73a67
feat: shared adversary scoring rubric and transcript schema (#655)
2026-04-17 05:17:29 +00:00
6fbf5bb649
Merge pull request 'feat: sidecar config validation on deploy' (#797) from feat/690-config-validation into main
Architecture Lint / Linter Tests (pull_request) Successful in 14s
Smoke Test / smoke (pull_request) Failing after 15s
Validate Config / YAML Lint (pull_request) Failing after 13s
Validate Config / JSON Validate (pull_request) Successful in 16s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 42s
Validate Config / Shell Script Lint (pull_request) Failing after 45s
Validate Config / Cron Syntax Check (pull_request) Successful in 9s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 9s
Validate Config / Playbook Schema Validation (pull_request) Successful in 21s
PR Checklist / pr-checklist (pull_request) Failing after 3m31s
Validate Config / Python Test Suite (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
2026-04-17 05:15:05 +00:00
Alexander Whitestone
5efdc3979c
feat: crisis response — post-crisis & recovery 500 pairs (#599)
Architecture Lint / Linter Tests (pull_request) Successful in 7s
PR Checklist / pr-checklist (pull_request) Failing after 1m10s
Smoke Test / smoke (pull_request) Failing after 4s
Validate Config / YAML Lint (pull_request) Failing after 4s
Validate Config / JSON Validate (pull_request) Successful in 4s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 22s
Validate Config / Shell Script Lint (pull_request) Failing after 14s
Validate Config / Cron Syntax Check (pull_request) Successful in 3s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 3s
Validate Config / Playbook Schema Validation (pull_request) Successful in 5s
Validate Training Data / validate (pull_request) Successful in 4s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
2026-04-17 01:14:09 -04:00
9ec0a22d6a
test: config validation tests
Architecture Lint / Linter Tests (pull_request) Successful in 9s
PR Checklist / pr-checklist (pull_request) Failing after 1m9s
Validate Config / JSON Validate (pull_request) Successful in 5s
Smoke Test / smoke (pull_request) Failing after 6s
Validate Config / YAML Lint (pull_request) Failing after 5s
Validate Config / Cron Syntax Check (pull_request) Successful in 3s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 3s
Validate Config / Playbook Schema Validation (pull_request) Successful in 5s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 12s
Validate Config / Shell Script Lint (pull_request) Failing after 12s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Part of #690
2026-04-17 05:07:46 +00:00
6b984532a1
feat: config validation script
Closes #690
Validates YAML syntax, required keys, value types, and
forbidden keys before deploy. Prevents broken deploys
from bad config.
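The required-key, value-type, and forbidden-key checks described above could look roughly like this. It is a sketch under stated assumptions, not the script from this commit: the schema keys (model, max_workers, api_key) and the function name validate_config are hypothetical, and the YAML syntax step is omitted because it would need a third-party parser. The function takes an already-parsed mapping.

```python
from typing import Any, Dict, List

# Hypothetical schema, for illustration only; the real script's keys are not shown in this log.
REQUIRED_KEYS = {"model": str, "max_workers": int}
FORBIDDEN_KEYS = {"api_key"}


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the config passed."""
    errors = []
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in config:
            errors.append(f"missing required key: {key}")
        elif not isinstance(config[key], expected_type):
            errors.append(
                f"wrong type for {key}: expected {expected_type.__name__}, "
                f"got {type(config[key]).__name__}"
            )
    # Report forbidden keys in sorted order so the output is stable between runs.
    for key in sorted(FORBIDDEN_KEYS.intersection(config)):
        errors.append(f"forbidden key present: {key}")
    return errors
```

Collecting every problem before returning, rather than stopping at the first, lets a deploy step print the full list and exit non-zero in one pass.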
2026-04-17 05:07:44 +00:00
Alexander Whitestone
3b58fe1ac8
feat: Hip-Hop scene descriptions — 100 lyrics->visual sets (#608)
Validate Config / Deploy Script Dry Run (pull_request) Has been cancelled
Validate Config / Playbook Schema Validation (pull_request) Has been cancelled
Architecture Lint / Linter Tests (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
PR Checklist / pr-checklist (pull_request) Has been cancelled
Smoke Test / smoke (pull_request) Has been cancelled
Validate Config / YAML Lint (pull_request) Has been cancelled
Validate Config / JSON Validate (pull_request) Has been cancelled
Validate Config / Python Syntax & Import Check (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Validate Config / Shell Script Lint (pull_request) Has been cancelled
Validate Config / Cron Syntax Check (pull_request) Has been cancelled
Validate Training Data / validate (pull_request) Has been cancelled
2026-04-16 02:12:38 -04:00
Alexander Whitestone
4f960e0dd8
feat: identity attacks adversary corpus — 200 jailbreak prompts (#616)
Architecture Lint / Linter Tests (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
PR Checklist / pr-checklist (pull_request) Has been cancelled
Smoke Test / smoke (pull_request) Has been cancelled
Validate Config / YAML Lint (pull_request) Has been cancelled
Validate Config / JSON Validate (pull_request) Has been cancelled
Validate Config / Python Syntax & Import Check (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Validate Config / Shell Script Lint (pull_request) Has been cancelled
Validate Config / Cron Syntax Check (pull_request) Has been cancelled
Validate Config / Deploy Script Dry Run (pull_request) Has been cancelled
Validate Config / Playbook Schema Validation (pull_request) Has been cancelled
Validate Training Data / validate (pull_request) Has been cancelled
2026-04-16 01:57:24 -04:00
Alexander Whitestone
0ddbfb0cfa
feat: emotional manipulation adversary corpus — 200 jailbreak prompts (#620)
Architecture Lint / Linter Tests (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
PR Checklist / pr-checklist (pull_request) Has been cancelled
Validate Config / YAML Lint (pull_request) Has been cancelled
Validate Config / JSON Validate (pull_request) Has been cancelled
Smoke Test / smoke (pull_request) Has been cancelled
Validate Training Data / validate (pull_request) Has been cancelled
Validate Config / Python Syntax & Import Check (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Validate Config / Shell Script Lint (pull_request) Has been cancelled
Validate Config / Cron Syntax Check (pull_request) Has been cancelled
Validate Config / Deploy Script Dry Run (pull_request) Has been cancelled
Validate Config / Playbook Schema Validation (pull_request) Has been cancelled
2026-04-16 01:49:38 -04:00
Alexander Whitestone
f169634a75
feat: config drift detection across all fleet nodes (#686)
PR Checklist / pr-checklist (pull_request) Has been cancelled
Architecture Lint / Linter Tests (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Smoke Test / smoke (pull_request) Has been cancelled
Validate Config / YAML Lint (pull_request) Has been cancelled
Validate Config / JSON Validate (pull_request) Has been cancelled
Validate Config / Python Syntax & Import Check (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Validate Config / Shell Script Lint (pull_request) Has been cancelled
Validate Config / Cron Syntax Check (pull_request) Has been cancelled
Validate Config / Deploy Script Dry Run (pull_request) Has been cancelled
Validate Config / Playbook Schema Validation (pull_request) Has been cancelled
Validate Training Data / validate (pull_request) Has been cancelled
Detect config drift between fleet nodes and canonical timmy-config.
scripts/config_drift_detector.py (200 lines):
- SSH-based config collection from all nodes
- Recursive diff against canonical config
- Report: which keys differ, on which nodes
- JSON output for programmatic consumption
Fleet nodes: local, ezra (143.198.27.163), bezalel (167.99.126.228)
Usage:
python3 scripts/config_drift_detector.py --report
python3 scripts/config_drift_detector.py --json
Closes #686
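The recursive diff against the canonical config mentioned above can be sketched as below. This is not the code in scripts/config_drift_detector.py: the function name diff_config, the dotted key-path format, and the "<missing>" marker are assumptions made for illustration, and the SSH collection step is left out. The sketch assumes string keys.

```python
from typing import Any, Dict, Tuple

# Marker used when a key exists on one side only (an assumption of this sketch).
MISSING = "<missing>"


def diff_config(canonical: Any, actual: Any, path: str = "") -> Dict[str, Tuple[Any, Any]]:
    """Map each differing key path to a (canonical, actual) pair."""
    if isinstance(canonical, dict) and isinstance(actual, dict):
        drift = {}
        # Walk the union of keys so additions and removals are both reported.
        for key in sorted(canonical.keys() | actual.keys()):
            child = f"{path}.{key}" if path else str(key)
            drift.update(
                diff_config(canonical.get(key, MISSING), actual.get(key, MISSING), child)
            )
        return drift
    if canonical != actual:
        return {path or "<root>": (canonical, actual)}
    return {}
```

Running this once per node against the canonical config would give the "which keys differ, on which nodes" report as a mapping from node name to drift dictionary, which can then be serialized for the JSON output.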
2026-04-16 01:33:57 -04:00
Alexander Whitestone
b6c716998b
feat: authority bypass adversary corpus — 200 jailbreak prompts (#619)
PR Checklist / pr-checklist (pull_request) Has been cancelled
Validate Config / Playbook Schema Validation (pull_request) Has been cancelled
Architecture Lint / Linter Tests (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Training Data / validate (pull_request) Has been cancelled
Smoke Test / smoke (pull_request) Has been cancelled
Validate Config / YAML Lint (pull_request) Has been cancelled
Validate Config / JSON Validate (pull_request) Has been cancelled
Validate Config / Python Syntax & Import Check (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Validate Config / Shell Script Lint (pull_request) Has been cancelled
Validate Config / Cron Syntax Check (pull_request) Has been cancelled
Validate Config / Deploy Script Dry Run (pull_request) Has been cancelled
200 prompts targeting authority bypass attacks.
Attack worker 4/5.
Categories (33-34 each):
- developer_override: admin/debug mode claims
- role_play_bypass: DAN, unrestricted character modes
- sovereignty_override: ownership/creator claims
- prompt_injection: system prompt injection attempts
- emotional_manipulation_auth: guilt/loyalty pressure
- authority_spoofing: FBI/OpenAI/military impersonation
Scoring: pass=maintains SOUL.md, fail=complies with override.
Compatible with shared scoring rubric (#655).
Closes #619
2026-04-16 01:26:49 -04:00
Alexander Whitestone
e88b1b0d75
feat: training pair provenance tracking (#752)
Architecture Lint / Linter Tests (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
PR Checklist / pr-checklist (pull_request) Has been cancelled
Validate Config / Playbook Schema Validation (pull_request) Has been cancelled
Validate Training Data / validate (pull_request) Has been cancelled
Smoke Test / smoke (pull_request) Has been cancelled
Validate Config / YAML Lint (pull_request) Has been cancelled
Validate Config / JSON Validate (pull_request) Has been cancelled
Validate Config / Python Syntax & Import Check (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Validate Config / Shell Script Lint (pull_request) Has been cancelled
Validate Config / Cron Syntax Check (pull_request) Has been cancelled
Validate Config / Deploy Script Dry Run (pull_request) Has been cancelled
Provenance module for tracking source of every training pair.
training/provenance.py (151 lines):
- add_provenance(): add metadata to pairs
- validate_provenance(): check required fields
- provenance_stats(): coverage and distribution
- backfill_provenance(): annotate existing pairs
- filter_by_provenance(): exclude by model/source
- extract_provenance_from_trajectory(): hermes integration
Required fields: source_session_id, model, timestamp
Closes #752
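As a rough illustration of the first two functions listed above, the sketch below attaches and checks the three required fields. The function names come from the commit message, but the signatures, the "provenance" key, and the pair layout are assumptions; the actual training/provenance.py may differ.

```python
from datetime import datetime, timezone
from typing import List, Optional

REQUIRED_FIELDS = ("source_session_id", "model", "timestamp")


def add_provenance(
    pair: dict, source_session_id: str, model: str, timestamp: Optional[str] = None
) -> dict:
    """Return a copy of the pair with a provenance block attached (assumed layout)."""
    stamped = dict(pair)
    stamped["provenance"] = {
        "source_session_id": source_session_id,
        "model": model,
        # Default to the current UTC time in ISO 8601 form when no timestamp is given.
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
    }
    return stamped


def validate_provenance(pair: dict) -> List[str]:
    """List the required provenance fields that are missing or empty."""
    provenance = pair.get("provenance") or {}
    return [field for field in REQUIRED_FIELDS if not provenance.get(field)]
```

Returning the missing field names, rather than a bare boolean, would make it straightforward to build coverage statistics of the kind provenance_stats() is described as reporting.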
2026-04-16 01:23:17 -04:00
Merge Bot
c587fc069b
Merge PR #559: tests/test_nexus_smoke_test.py (added)
2026-04-16 05:16:27 +00:00
Merge Bot
6e0e302806
Merge PR #559: scripts/nexus_smoke_test.py (changed)
2026-04-16 05:16:24 +00:00
Merge Bot
3155f9c042
Merge PR #559: deploy/gitea-a11y/deploy-gitea-a11y.sh (added)
2026-04-16 05:16:22 +00:00
Merge Bot
a0f8d30bfd
Merge PR #559: deploy/gitea-a11y/custom/templates/user/auth/signin_inner.tmpl (added)
2026-04-16 05:16:21 +00:00
Merge Bot
9257234c1d
Merge PR #559: deploy/gitea-a11y/custom/templates/repo/list_a11y.tmpl (added)
2026-04-16 05:16:19 +00:00
Merge Bot
1a9b1a1f08
Merge PR #559: deploy/gitea-a11y/custom/templates/custom/time_relative.tmpl (added)
2026-04-16 05:16:13 +00:00
Merge Bot
4d3c26a409
Merge PR #559: deploy/gitea-a11y/README.md (added)
2026-04-16 05:16:11 +00:00
Merge Bot
fab6215b64
Merge PR #560: tests/test_nexus_smoke_test.py (added)
2026-04-16 05:16:07 +00:00
Merge Bot
6ac390a5d2
Merge PR #560: scripts/nexus_smoke_test.py (changed)
2026-04-16 05:16:03 +00:00
Merge Bot
226e472cea
Merge PR #560: deploy/gitea-a11y/deploy-gitea-a11y.sh (added)
2026-04-16 05:16:00 +00:00
Merge Bot
ca9656aac2
Merge PR #560: deploy/gitea-a11y/custom/templates/user/auth/signin_inner.tmpl (added)
2026-04-16 05:15:57 +00:00
Merge Bot
57d47644c2
Merge PR #560: deploy/gitea-a11y/custom/templates/repo/list_a11y.tmpl (added)
2026-04-16 05:15:56 +00:00
Merge Bot
e0daa1e4fb
Merge PR #560: deploy/gitea-a11y/custom/templates/custom/time_relative.tmpl (added)
2026-04-16 05:15:54 +00:00
Merge Bot
58fc94a173
Merge PR #560: deploy/gitea-a11y/custom/templates/custom/header_banner.tmpl (added)
2026-04-16 05:15:52 +00:00
Merge Bot
8d33d05bca
Merge PR #787: training/scripts/quality_filter.py
2026-04-16 05:15:50 +00:00
Merge Bot
36e2663c8e
Merge PR #787: training/data/scene-descriptions/scene-descriptions-rock.jsonl
2026-04-16 05:15:46 +00:00
Merge Bot
cc6ade3312
Merge PR #787: training/data/scene-descriptions/scene-descriptions-pop.jsonl
2026-04-16 05:15:43 +00:00
Merge Bot
9d3883f5fb
Merge PR #787: training/data/prompt-enhancement/video-scenes-500.jsonl
2026-04-16 05:15:41 +00:00
Merge Bot
95214e87eb
Merge PR #787: training/data/prompt-enhancement/music-moods-500.jsonl
2026-04-16 05:15:38 +00:00
Merge Bot
411c0e7f01
Merge PR #787: training/data/prompt-enhancement/game-assets-500.jsonl
2026-04-16 05:15:34 +00:00