Implements Issue #80: benchmark turboquant vs llama.cpp baseline on M1.

New files:
- benchmarks/run_m1_benchmark.py: comprehensive benchmark runner
- benchmarks/run_benchmark_m1.sh: shell wrapper for easy execution
- tests/test_m1_benchmark.py: unit tests for benchmark functions

Measures:
- Tokens/sec throughput (f16 vs turbo4, 3-run average)
- Memory usage (RSS monitoring during inference)
- Quality via perplexity (llama-perplexity on wikitext-2)

Generates:
- benchmarks/m1_benchmark_results.json: raw results
- benchmarks/m1_benchmark_report.md: markdown comparison table

Closes #80
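
For reviewers who want a feel for the aggregation step, here is a minimal sketch of how per-run measurements might be averaged and rendered as the markdown comparison table. It is not the code in benchmarks/run_m1_benchmark.py. The function names (`summarize_runs`, `build_report`) and the result keys (`tokens_per_sec_runs`, `peak_rss_mb`, `perplexity`) are hypothetical, chosen only for illustration.

```python
import json
from statistics import mean


def summarize_runs(tokens_per_sec_runs):
    """Average tokens/sec over repeated runs (the PR describes a 3-run average)."""
    return mean(tokens_per_sec_runs)


def build_report(results):
    """Render a markdown comparison table.

    `results` maps a config name (e.g. "f16", "turbo4") to a dict with the
    hypothetical keys 'tokens_per_sec_runs', 'peak_rss_mb' and 'perplexity'.
    """
    lines = [
        "| Config | Tokens/sec (avg) | Peak RSS (MB) | Perplexity |",
        "|---|---|---|---|",
    ]
    for name, r in results.items():
        avg = summarize_runs(r["tokens_per_sec_runs"])
        lines.append(
            f"| {name} | {avg:.2f} | {r['peak_rss_mb']:.1f} | {r['perplexity']:.3f} |"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values only, NOT measured results.
    demo = {
        "f16": {"tokens_per_sec_runs": [1.0, 2.0, 3.0], "peak_rss_mb": 0.0, "perplexity": 0.0},
        "turbo4": {"tokens_per_sec_runs": [1.0, 2.0, 3.0], "peak_rss_mb": 0.0, "perplexity": 0.0},
    }
    print(json.dumps(demo, indent=2))  # shape of a raw-results JSON file
    print(build_report(demo))          # shape of a markdown report
```

Keeping the raw per-run numbers in the JSON and computing the average only at report time would let the table be regenerated without rerunning inference. Whether the actual runner does this is not stated in the description.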