timmy-home/timmy-config
Alexander Whitestone 09123b868d
Some checks failed
Smoke Test / smoke (pull_request) Failing after 10s
Big Brain Benchmark v7: 7B consistently finds both bugs
7B (qwen2.5:7b) found both async bugs in 2 consecutive runs (v6+v7).
Confirmed behavioral change: the quality gap vs 27B is narrowing.

Results: 27B wins 1/5, 1B wins 3/5, 7B wins 1/5. 27B is 5.6x slower.

Cumulative: 7B is now 2/7 on both-bugs (0 hits before v6).
27B remains 7/7. 1B remains 0/7.
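
The cumulative tallies above can be reproduced with a small sketch. The per-run layout is an assumption (one boolean per run, v1 through v7, True when the model found both async bugs); the values are reconstructed from the counts stated in this message, not taken from the benchmark's actual output files. The names `both_bugs`, `tally`, and `consecutive_tail` are hypothetical.

```python
# Hypothetical per-run record: True if the model found both async bugs
# in that run. Index 0 is v1, index 6 is v7. Values are reconstructed
# from the tallies in the commit message, not from real benchmark logs.
both_bugs = {
    "27B": [True] * 7,                 # 7/7
    "7B": [False] * 5 + [True, True],  # misses through v5, hits in v6 and v7
    "1B": [False] * 7,                 # 0/7
}


def tally(runs):
    """Return (hits, total) for a list of per-run booleans."""
    return sum(runs), len(runs)


def consecutive_tail(runs):
    """Count how many of the most recent runs were hits in a row."""
    streak = 0
    for hit in reversed(runs):
        if not hit:
            break
        streak += 1
    return streak


for model, runs in both_bugs.items():
    hits, total = tally(runs)
    print(f"{model}: {hits}/{total} both-bugs, "
          f"current streak {consecutive_tail(runs)}")
```

Tracking the trailing streak separately from the cumulative count is what distinguishes "2 consecutive runs" from two isolated hits.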

Prior PRs: #633, #642, #646, #651, #655, #660
Refs: Timmy_Foundation/timmy-home#576
2026-04-14 11:44:56 -04:00