benchmarks/compare_configs.py:
- Runs 4 configs (ollama, llama-f16, llama-turbo4, llama-turbo4-adaptive)
- Aggregates TTFT, tok/s, latency, peak memory
- Picks winner by highest tok/s
- Outputs JSON report + human-readable table
- --demo mode for testing without live servers

tests/test_compare_configs.py (13 tests):
- ConfigEntry, ConfigResult, default configs
- Aggregation logic, winner selection, table format
- Demo mode with and without output file
- Prompt loading from test_prompts.json

Closes #29.
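
For reviewers, a rough sketch of what the aggregation and winner-selection steps might look like. The class name ConfigResult comes from the test list above. Its fields, the helper names aggregate and pick_winner, and the choice of a plain mean are assumptions for illustration, not necessarily what compare_configs.py does.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Union


@dataclass
class ConfigResult:
    # Hypothetical fields; the real ConfigResult in the PR may differ.
    name: str
    ttft_ms: List[float]      # time to first token, one sample per prompt
    tok_per_s: List[float]    # throughput, one sample per prompt
    latency_ms: List[float]   # end-to-end latency, one sample per prompt
    peak_mem_mb: float        # peak memory observed across the run


Row = Dict[str, Union[str, float]]


def aggregate(result: ConfigResult) -> Row:
    """Collapse per-prompt samples into one summary row (mean is an assumption)."""
    return {
        "name": result.name,
        "ttft_ms": mean(result.ttft_ms),
        "tok_per_s": mean(result.tok_per_s),
        "latency_ms": mean(result.latency_ms),
        "peak_mem_mb": result.peak_mem_mb,
    }


def pick_winner(rows: List[Row]) -> str:
    """Return the name of the config with the highest aggregated tok/s."""
    return str(max(rows, key=lambda row: row["tok_per_s"])["name"])
```

Selecting on tok/s alone matches the "Picks winner by highest tok/s" bullet. TTFT, latency, and peak memory are reported but do not influence the choice in this sketch.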