timmy-home/docs/big-brain-27b-test-omission.md
Alexander Whitestone a5e2fb1ea5
docs: Big Brain 27B test omission workaround (#654)
Merge PR #654
2026-04-14 22:14:35 +00:00

Big Brain 27B — Test Omission Pattern

Finding (2026-04-14)

The 27B model (gemma4) consistently omits unit tests when asked to include them in the same prompt as the implementation code. The model produces a complete, high-quality implementation but stops before the test class or function.

Affected models: 1B, 7B, 27B (27B is the most notable because its implementation quality is the best)

Root cause (initial hypothesis, revised in the Update section at the end): models treat tests as optional even when the prompt explicitly requires them.

Workaround

Split the prompt into two phases:

Phase 1: Implementation

Write a webhook parser with @dataclass, verify_signature(), parse_webhook().
Include type hints and docstrings.

Phase 2: Tests (separate prompt)

Write a unit test for the webhook parser above. Cover:
- Valid signature verification
- Invalid signature rejection
- Malformed payload handling

Prompt Engineering Notes

  • Do NOT combine "implement X" and "include unit test" in a single prompt
  • The model excels at implementation when focused
  • Test generation works better as a follow-up on the existing code
  • For critical code, always verify test presence manually

Impact

Low — workaround is simple (split prompt). No data loss or corruption risk.

Source

Benchmark runs documented in timmy-home #576.

Update (2026-04-14)

Correction: 27B DOES include tests when the prompt is concise.

  • "Include type hints and one unit test." → tests included
  • "Include type hints, docstring, and one unit test." → tests omitted

The issue is prompt overload, not a model limitation. Keep test requirements short and focused. See #653.
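
One way to check whether this holds for a given model is to run each prompt variant several times and count how often tests are missing. The harness below is a rough sketch under stated assumptions: `generate` is a hypothetical prompt-in, text-out callable, and a test counts as present when any line starts with `def test_` or `class Test`.

```python
import re
from typing import Callable, Dict, Iterable

# Assumption: tests are recognisable by a line starting with
# "def test_" or "class Test" (leading indentation allowed).
TEST_MARKER = re.compile(r"^\s*(?:def test_|class Test)", re.MULTILINE)


def omission_rates(
    generate: Callable[[str], str],
    prompts: Iterable[str],
    runs: int = 5,
) -> Dict[str, float]:
    """Return, per prompt, the fraction of runs whose output has no test."""
    rates: Dict[str, float] = {}
    for prompt in prompts:
        missing = sum(
            1 for _ in range(runs) if not TEST_MARKER.search(generate(prompt))
        )
        rates[prompt] = missing / runs
    return rates
```

Called with the two prompt endings quoted above, it returns an omission rate per variant, which makes the effect of adding one more requirement directly comparable.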