diff --git a/docs/big-brain-27b-test-omission.md b/docs/big-brain-27b-test-omission.md
new file mode 100644
index 0000000..c497b2d
--- /dev/null
+++ b/docs/big-brain-27b-test-omission.md
@@ -0,0 +1,53 @@
+# Big Brain 27B: Test Omission Pattern
+
+## Finding (2026-04-14)
+
+The 27B model (gemma4) consistently omits unit tests when asked to include them
+in the same prompt as implementation code. The model produces a complete, high-quality
+implementation but stops before the test class/function.
+
+**Affected models:** 1B, 7B, 27B (27B is most notable because its implementation quality is highest)
+
+**Root cause (initial hypothesis, revised in the Update section):** Models treat tests as optional even when the prompt explicitly requires them.
+
+## Workaround
+
+Split the prompt into two phases:
+
+### Phase 1: Implementation
+```
+Write a webhook parser with @dataclass, verify_signature(), parse_webhook().
+Include type hints and docstrings.
+```
+
+### Phase 2: Tests (separate prompt)
+```
+Write a unit test for the webhook parser above. Cover:
+- Valid signature verification
+- Invalid signature rejection
+- Malformed payload handling
+```
+
+## Prompt Engineering Notes
+
+- Do NOT combine "implement X" and "include unit test" in a single prompt
+- The model excels at implementation when focused
+- Test generation works better as a follow-up on the existing code
+- For critical code, always verify test presence manually
+
+## Impact
+
+Low: the workaround is simple (split the prompt). No data loss or corruption risk.
+
+## Source
+
+Benchmark runs documented in timmy-home #576.
+
+## Update (2026-04-14)
+
+**Correction:** 27B DOES include tests when the prompt is concise.
+- "Include type hints and one unit test." → tests included
+- "Include type hints, docstring, and one unit test." → tests omitted
+
+The issue is **prompt overload**, not a model limitation. Use short, focused
+test requirements. See #653.
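
The note "For critical code, always verify test presence manually" could be backed by a small automated check that decides whether the Phase 2 (tests) prompt is needed. The following is a minimal sketch, not part of the benchmark tooling referenced in timmy-home. It assumes the model output is plain Python source held in a string, and it assumes a test is any function named `test_*` or any class whose name starts with `Test`. The helper names `contains_tests` and `needs_followup_prompt` are hypothetical.

```python
import ast


def contains_tests(source: str) -> bool:
    """Return True if the Python source defines at least one test.

    Assumption: a test is a function named test_* or a class whose name
    starts with "Test". Source that does not parse (for example, output
    that was cut off mid-function) is treated as containing no tests.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False

    for node in ast.walk(tree):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name.startswith("test_"):
            return True
        if isinstance(node, ast.ClassDef) and node.name.startswith("Test"):
            return True
    return False


def needs_followup_prompt(model_output: str) -> bool:
    """Return True when the separate tests prompt (Phase 2) should be sent."""
    return not contains_tests(model_output)


# Hypothetical model outputs, used only to illustrate the check.
implementation_only = "def parse_webhook(payload: bytes) -> dict:\n    return {}\n"
with_tests = implementation_only + (
    "\ndef test_parse_webhook():\n    assert parse_webhook(b'') == {}\n"
)
```

With these inputs, `needs_followup_prompt(implementation_only)` is true and `needs_followup_prompt(with_tests)` is false. The check only confirms that something test-shaped exists; it says nothing about whether the tests cover the cases listed in the Phase 2 prompt, so a manual review is still worthwhile for critical code.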