- Add --quality flag to run_benchmarks.py that delegates to llama-perplexity
- Clarify that tokens/sec is a throughput/efficiency metric, not a quality (perplexity) metric
- Ollama cannot provide true logprob-based PPL (no logprob API)
- Quality gate now runs llama-perplexity binary directly when requested
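The delegation could look roughly like the sketch below. This is illustrative only: the `--quality`, `--model`, and `--eval-file` argument names and defaults are assumptions about run_benchmarks.py, not its actual interface; `-m`/`-f` are llama.cpp's standard model/eval-file flags for `llama-perplexity`.

```python
import argparse
import shlex
import subprocess


def build_perplexity_cmd(model_path, text_file, binary="llama-perplexity"):
    """Build the llama-perplexity invocation (-m model, -f eval text)."""
    return [binary, "-m", model_path, "-f", text_file]


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--quality", action="store_true",
                        help="run a logprob-based perplexity pass via llama-perplexity")
    # Illustrative defaults; the real script's options may differ.
    parser.add_argument("--model", default="model.gguf")
    parser.add_argument("--eval-file", default="wiki.test.raw")
    args = parser.parse_args(argv)

    if args.quality:
        cmd = build_perplexity_cmd(args.model, args.eval_file)
        print("Running:", shlex.join(cmd))
        # Delegate directly to the external binary; fail the gate on nonzero exit.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main()
```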
Closes #63