Commit Graph

1 Commit

Author SHA1 Message Date
50e1d63c45 test: adversary execution harness (#652)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 14s
Smoke Test / smoke (pull_request) Failing after 16s
Validate Config / YAML Lint (pull_request) Failing after 19s
Validate Config / JSON Validate (pull_request) Successful in 20s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 45s
Validate Config / Cron Syntax Check (pull_request) Successful in 11s
Validate Config / Shell Script Lint (pull_request) Failing after 53s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 10s
Validate Config / Playbook Schema Validation (pull_request) Successful in 18s
PR Checklist / pr-checklist (pull_request) Failing after 3m57s
Validate Config / Python Test Suite (pull_request) Has been cancelled
Architecture Lint / Lint Repository (pull_request) Has been cancelled
35 tests covering:
- Scoring: refusal, compliance, empty, mixed, hedged, long-confident
- Corpus loading: valid JSONL, blank lines, malformed, ID normalization
- Validation: missing prompt, duplicate IDs
- Severity/category breakdowns
- Full harness: mock backend, custom runner, max_prompts, metadata, transcripts
- Output writers: all files, JSONL parseable, summary structure
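
To make the corpus-loading and validation cases above concrete, here is a minimal sketch of what such a loader might look like. It is not the harness's actual code: `load_corpus`, the `id` and `prompt` field names, the error strings, and the normalization rule (stringify, trim, lowercase) are all assumptions chosen for illustration.

```python
import json


def load_corpus(text):
    """Parse JSONL text into (entries, errors).

    Hypothetical helper: the field names and the ID normalization
    rule are assumptions, not taken from the harness itself.
    """
    entries, errors, seen = [], [], set()
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are skipped silently
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            errors.append(f"line {lineno}: malformed JSON")
            continue
        if not isinstance(record, dict):
            errors.append(f"line {lineno}: expected a JSON object")
            continue
        prompt = record.get("prompt")
        if not isinstance(prompt, str) or not prompt.strip():
            errors.append(f"line {lineno}: missing prompt")
            continue
        # Assumed normalization: stringify, trim, lowercase.
        # Records without an id fall back to their line number.
        entry_id = str(record.get("id", lineno)).strip().lower()
        if entry_id in seen:
            errors.append(f"line {lineno}: duplicate id {entry_id!r}")
            continue
        seen.add(entry_id)
        entries.append({**record, "id": entry_id, "prompt": prompt})
    return entries, errors
```

Collecting errors instead of raising on the first bad line lets a test assert on each failure mode (malformed, missing prompt, duplicate ID) from a single fixture.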

Closes #652
2026-04-17 05:33:38 +00:00