timmy-config/docs/issue-616-verification.md

# Issue #616 Verification

## Status: already implemented on main

Issue #616 asked for an identity-attack adversary corpus with 200 jailbreak prompts.
That corpus is already present on `main` at `evaluations/adversary/corpora/identity_attacks_200.jsonl`.

## Evidence

1. Corpus file exists on `main`
   - Path: `evaluations/adversary/corpora/identity_attacks_200.jsonl`
   - Entry count: 200
   - Unique IDs: 200
   - Unique prompts: 200

2. Corpus structure is balanced and complete
   - Family: `identity_attack`
   - Subcategory distribution:
     - `direct_claim`: 40
     - `emotional_probe`: 40
     - `existential_probe`: 40
     - `human_comparison`: 40
     - `roleplay_identity`: 40
   - Every row includes `id`, `family`, `subcategory`, `prompt`, `targets`, `scoring`, and `transcript_required`

3. Harness validation passes
   - Command:
     - `python3 scripts/adversary_harness.py validate --corpus evaluations/adversary/corpora/identity_attacks_200.jsonl`
   - Result:
     - `Entries: 200`
     - `No issues found.`

4. Harness can execute the corpus (smoke run limited to 5 prompts via `--max-prompts 5`)
   - Command:
     - `python3 scripts/adversary_harness.py run --corpus evaluations/adversary/corpora/identity_attacks_200.jsonl --backend mock --model test-model --max-prompts 5 --json`
   - Result:
     - 5/5 prompts executed
     - 0 successful attacks against the mock refusal backend
     - 0 execution errors

5. Historical trail
   - `PR #794` (`feat: identity attacks adversary corpus — 200 jailbreak prompts (#616)`) was closed without being merged.
   - The issue remained open even though the requested corpus is now present on `main`.
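
The counts in items 1 and 2 can be reproduced independently of the harness. The following is a minimal sketch, not the harness's own validation logic: `summarize_corpus` is a hypothetical helper, and it assumes only that each non-empty line of the corpus is one JSON object carrying the fields listed above.

```python
import json
from collections import Counter

# Field names taken from the Evidence list above.
REQUIRED_FIELDS = {
    "id", "family", "subcategory", "prompt",
    "targets", "scoring", "transcript_required",
}


def summarize_corpus(lines):
    """Summarize JSONL rows (hypothetical helper, not part of the harness).

    `lines` is any iterable of strings, for example an open file handle.
    Blank lines are skipped.
    """
    rows = [json.loads(line) for line in lines if line.strip()]
    return {
        "entries": len(rows),
        "unique_ids": len({row.get("id") for row in rows}),
        "unique_prompts": len({row.get("prompt") for row in rows}),
        "families": dict(Counter(row.get("family") for row in rows)),
        "subcategories": dict(Counter(row.get("subcategory") for row in rows)),
        # Rows that lack at least one of the required fields.
        "rows_missing_fields": sum(
            1 for row in rows if not REQUIRED_FIELDS.issubset(row)
        ),
    }
```

Run from the repository root, passing an open handle on `evaluations/adversary/corpora/identity_attacks_200.jsonl` to `summarize_corpus` should yield 200 entries, 200 unique IDs, 200 unique prompts, and 40 rows per subcategory if the evidence above still holds.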

## Regression coverage added in this branch

- `tests/test_identity_attacks_corpus.py`
  - verifies the corpus file exists
  - verifies it contains exactly 200 unique prompts
  - verifies the expected schema and balanced subcategory distribution
  - verifies this document exists and references the corpus path
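
The checks listed above could take roughly the following shape. This is a sketch under stated assumptions, not the contents of `tests/test_identity_attacks_corpus.py`: `find_problems` is a hypothetical helper that takes the corpus and this document as text, so the logic can be exercised without touching the filesystem.

```python
import json
from collections import Counter

# Corpus path as cited in this document.
CORPUS_PATH = "evaluations/adversary/corpora/identity_attacks_200.jsonl"

# Expected distribution and schema, copied from the Evidence section.
EXPECTED_SUBCATEGORIES = {
    "direct_claim": 40,
    "emotional_probe": 40,
    "existential_probe": 40,
    "human_comparison": 40,
    "roleplay_identity": 40,
}
REQUIRED_FIELDS = {
    "id", "family", "subcategory", "prompt",
    "targets", "scoring", "transcript_required",
}


def find_problems(corpus_text, doc_text):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper: the real test file may be organized differently.
    """
    rows = [json.loads(line) for line in corpus_text.splitlines() if line.strip()]
    problems = []

    prompts = {row.get("prompt") for row in rows}
    if len(rows) != 200 or len(prompts) != 200:
        problems.append(
            f"expected 200 unique prompts, got {len(prompts)} in {len(rows)} rows"
        )
    if any(not REQUIRED_FIELDS.issubset(row) for row in rows):
        problems.append("at least one row is missing a required field")
    if dict(Counter(row.get("subcategory") for row in rows)) != EXPECTED_SUBCATEGORIES:
        problems.append("subcategory distribution is not 40 per subcategory")
    if CORPUS_PATH not in doc_text:
        problems.append("verification document does not reference the corpus path")
    return problems
```

A pytest test built on this would read both files from disk (which fails loudly if either is missing) and assert that `find_problems(...)` returns an empty list.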

## Recommendation

Close issue #616 as already implemented.