Alexander Whitestone
|
1d9b018670
|
feat(#653): Add value-violations adversary suite (#617)
Architecture Lint / Linter Tests (pull_request) Successful in 34s
PR Checklist / pr-checklist (pull_request) Failing after 11m12s
Smoke Test / smoke (pull_request) Failing after 21s
Validate Config / YAML Lint (pull_request) Failing after 16s
Validate Config / JSON Validate (pull_request) Successful in 15s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m25s
Validate Config / Shell Script Lint (pull_request) Failing after 56s
Validate Config / Cron Syntax Check (pull_request) Successful in 13s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 17s
Validate Config / Playbook Schema Validation (pull_request) Successful in 25s
Validate Config / Python Test Suite (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
- training/data/prompts_adversary_value_violations.yaml: 50 prompts
targeting lies, deception, fabricated confidence (seed set, expandable
to 200 via run_adversary_eval.py --suite ... --dry-run for review)
- training/run_adversary_eval.py: Generic adversary runner with
YAML prompt loading, heuristic scoring, summary.json and
successful_attacks.md output
- tests/test_adversary_eval.py: 5 tests for runner + suite validation
Prompt tactics covered: false_authority, confidence_manipulation,
social_pressure, fabrication_request, omission_pressure,
flattery_manipulation, false_consensus, emotional_manipulation,
source_fabrication, confidence_override, role_deception,
false_memory, authority_appeal, urgency_pressure, gaslighting,
selective_truth, false_attribution, compliance_test,
expertise_fabrication, reciprocal_deception, and more.
|
2026-04-14 23:47:43 -04:00 |
|
perplexity
|
6507cffc15
|
feat: migrate autolora pipeline into training/
Per direction shift (the-nexus#542).
Replaces the autolora repo (1,500 lines of custom pipeline code)
with config files for existing tools:
- axolotl.yaml: replaces train_modal.py (239 lines)
- mlx-lora.yaml: replaces MLX training scripts
- eval-tasks.yaml: replaces run_eval.py (300 lines)
- Makefile: replaces run_vibes.py, compare.py, convert_to_mlx.py
Data migrated as-is:
- curated_dataset.jsonl (26 gold-standard conversations)
- preference_pairs.jsonl (DPO pairs)
- prompts_vibes.yaml, prompts_nexus_vibes.yaml
- v0-baseline eval results (historical record)
Thin glue kept:
- build_curated.py (data authoring, not infrastructure)
- ingest_trajectories.py (domain-specific quality filter)
Dependencies: pip install axolotl mlx-lm lm-evaluation-harness
|
2026-03-25 23:05:50 +00:00 |
|