TurboQuant KV cache compression for local inference — PolarQuant + QJL on M4 Max via llama.cpp/Ollama. Build spec from Strago, build by Cid, coordination by Frankie.
Updated 2026-04-01 03:51:43 +00:00