docs: add real-world testing findings to OBLITERATUS skill

Added pitfalls discovered during live abliteration testing:
- Models < 1B have fragmented refusal, respond poorly (0.5B: 60%→20%)
- Models 3B+ work much better (3B: 75%→0% with advanced defaults)
- aggressive method can backfire on small models (made it worse)
- Spectral certification RED is common even when refusal rate is 0%
- Fixed torch property: total_mem → total_memory
This commit is contained in:
teknium1
2026-03-09 02:52:54 -07:00
parent a6d3becd6a
commit d6c710706f


@@ -311,14 +311,17 @@ Enable with `--contribute` flag. No personal data is collected — only model na
 ## Common Pitfalls
 1. **Don't use `informed` as default** — it's experimental and slower. Use `advanced` for reliable results.
-2. **Always check perplexity** — if it spikes > 15%, the model is damaged. Reduce aggressiveness.
-3. **MoE models need special handling** — use `nuclear` method for Mixtral, DeepSeek-MoE, etc.
-4. **Quantized models can't be re-quantized** — abliterate the full-precision model, then quantize the output.
-5. **VRAM estimation is approximate** — 4-bit quant helps but peak usage can spike during extraction.
-6. **Reasoning models are sensitive** — use `surgical` for R1 distills to preserve chain-of-thought.
-7. **Check `obliteratus recommend`** — telemetry data may have better parameters than defaults.
-8. **AGPL license** — never `import obliteratus` in MIT/Apache projects. CLI invocation only.
-9. **Large models (70B+)** — always use `--large-model` flag for conservative defaults.
+2. **Models under ~1B respond poorly to abliteration** — their refusal behaviors are shallow and fragmented, making clean direction extraction difficult. Expect partial results (20-40% remaining refusal). Models 3B+ have cleaner refusal directions and respond much better (often 0% refusal with `advanced`).
+3. **`aggressive` can make things worse** — on small models it can damage coherence and actually increase the refusal rate. Only use it if `advanced` leaves > 10% refusals on a 3B+ model.
+4. **Always check perplexity** — if it spikes > 15%, the model is damaged. Reduce aggressiveness.
+5. **MoE models need special handling** — use `nuclear` method for Mixtral, DeepSeek-MoE, etc.
+6. **Quantized models can't be re-quantized** — abliterate the full-precision model, then quantize the output.
+7. **VRAM estimation is approximate** — 4-bit quant helps but peak usage can spike during extraction.
+8. **Reasoning models are sensitive** — use `surgical` for R1 distills to preserve chain-of-thought.
+9. **Check `obliteratus recommend`** — telemetry data may have better parameters than defaults.
+10. **AGPL license** — never `import obliteratus` in MIT/Apache projects. CLI invocation only.
+11. **Large models (70B+)** — always use `--large-model` flag for conservative defaults.
+12. **Spectral certification RED is common** — the spectral check often flags "incomplete" even when the practical refusal rate is 0%. Check the actual refusal rate rather than relying on spectral certification alone.
 ## Complementary Skills
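The small-model pitfall is about direction extraction. The standard abliteration recipe (a generic sketch of the technique, not necessarily OBLITERATUS's exact implementation) takes the difference of mean activations between refused and answered prompts, normalizes it into a "refusal direction", and projects that component out of the model's activations. A minimal NumPy illustration on toy data:

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Difference-of-means direction between two activation sets, unit-normalized."""
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate(acts, direction):
    """Remove the component of each activation along `direction`."""
    return acts - np.outer(acts @ direction, direction)

# Toy example: random activations with a planted "refusal" component.
rng = np.random.default_rng(0)
harmless = rng.normal(size=(64, 16))
planted = np.zeros(16)
planted[0] = 1.0
harmful = rng.normal(size=(64, 16)) + 3.0 * planted

r = refusal_direction(harmful, harmless)
cleaned = ablate(harmful, r)
# After ablation, activations have ~zero component along the refusal direction.
print(np.abs(cleaned @ r).max() < 1e-9)  # True
```

On a sub-1B model the "planted" component is weaker and spread across layers, so the mean-difference direction is noisy, which is why extraction degrades there.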
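The "perplexity spikes > 15%" rule can be checked mechanically: perplexity is the exponential of the mean per-token negative log-likelihood, so compare it before and after abliteration. The function names and NLL values below are illustrative, not the OBLITERATUS API:

```python
import math

def perplexity(token_nlls):
    """exp(mean negative log-likelihood) over a sample of tokens."""
    return math.exp(sum(token_nlls) / len(token_nlls))

def damaged(base_nlls, abliterated_nlls, threshold=0.15):
    """Flag the abliterated model if perplexity rose by more than `threshold` (15%)."""
    base = perplexity(base_nlls)
    abl = perplexity(abliterated_nlls)
    return (abl - base) / base > threshold

base = [2.1, 1.9, 2.0, 2.2]         # illustrative per-token NLLs
mild = [n + 0.05 for n in base]     # uniform +0.05 NLL -> ~5% perplexity increase
broken = [n + 0.30 for n in base]   # uniform +0.30 NLL -> ~35% perplexity increase

print(damaged(base, mild), damaged(base, broken))  # False True
```

Note that a uniform NLL shift of `d` multiplies perplexity by `exp(d)`, so even small per-token degradation compounds into a visible perplexity jump.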
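On the VRAM pitfall: the commit's `total_mem` → `total_memory` fix matches the real attribute name on `torch.cuda.get_device_properties(...)`. For the estimate itself, a common back-of-envelope (my assumption, not the tool's formula) is weight bytes = parameter count × bytes per parameter, padded for activation and extraction spikes:

```python
def estimate_vram_gib(n_params_billion, bytes_per_param, overhead=1.3):
    """Rough weight memory in GiB: params * width, padded ~30% for
    activation/extraction spikes. Illustrative only; real peak usage varies."""
    return n_params_billion * 1e9 * bytes_per_param * overhead / 2**30

# 7B model: full fp16 (2 bytes/param) vs 4-bit quantization (0.5 bytes/param)
fp16 = estimate_vram_gib(7, 2.0)   # roughly 17 GiB
q4 = estimate_vram_gib(7, 0.5)     # roughly 4 GiB
print(f"fp16 ~{fp16:.1f} GiB, 4-bit ~{q4:.1f} GiB")
```

This is why 4-bit helps but does not guarantee headroom: the overhead factor is the approximate part, and extraction can momentarily exceed it.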