# turboquant
TurboQuant KV cache compression for local inference: PolarQuant + QJL on an M4 Max via llama.cpp/Ollama. Build spec by Strago, build by Cid, coordination by Frankie.