[P1-S1] Build llama.cpp fork with Metal backend on M4 Max #4
Parent: #1 | Depends on: #3 (fork assessment)
Build the llama.cpp TurboQuant fork with the Metal backend on a MacBook Pro M4 Max.
2-Hour Cap
If it doesn't compile and pass smoke test (load model, generate 10 tokens) within 2 hours, STOP. Pivot to MLX path. Report what broke.
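One way to make the cap mechanical rather than a judgment call is a small wrapper that gives each step only the time left in the budget. The sketch below is illustrative, not the procedure used for this issue: `run_within_cap` and `smoke_test_command` are hypothetical helper names, the prompt text is arbitrary, and the `-m`/`-p`/`-n` flags assume the fork keeps upstream `llama-cli` option names.

```python
import subprocess
import sys
import time

CAP_SECONDS = 2 * 60 * 60  # the 2-hour cap from this issue


def smoke_test_command(cli_path, model_path, n_tokens=10):
    """Build a llama-cli call that loads a model and generates n_tokens tokens.

    Assumption: the fork keeps upstream flag names (-m, -p, -n).
    """
    return [cli_path, "-m", model_path, "-p", "Hello", "-n", str(n_tokens)]


def run_within_cap(cmd, started_at, cap_seconds=CAP_SECONDS):
    """Run cmd with only the remaining budget as its timeout.

    started_at is a time.monotonic() reading taken when work began.
    Returns (ok, reason) so a failure can be reported before pivoting.
    """
    remaining = cap_seconds - (time.monotonic() - started_at)
    if remaining <= 0:
        return False, "cap exceeded before start"
    try:
        result = subprocess.run(
            cmd,
            timeout=remaining,
            stdin=subprocess.DEVNULL,  # never block waiting for interactive input
            capture_output=True,
            text=True,
        )
    except subprocess.TimeoutExpired:
        return False, "cap exceeded during run"
    except OSError as exc:
        return False, f"could not start: {exc}"
    if result.returncode != 0:
        return False, f"exit code {result.returncode}"
    return True, "ok"


if __name__ == "__main__":
    t0 = time.monotonic()
    # Stand-in command; replace with the build steps and smoke_test_command(...).
    print(run_within_cap([sys.executable, "-c", "print('ok')"], t0))
```

Any `(False, reason)` result is the signal to stop and report `reason` before moving to the MLX path. A zero exit code only shows that the binary ran; whether the generated text is coherent still needs a human look.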
Paths
Smoke Test
Acceptance Criteria
Build Complete ✅
Branch: feature/turboquant-kv-cache (commit adac2c6)
Build: cmake -DGGML_METAL=ON -DCMAKE_BUILD_TYPE=Release → 100% clean build
Metal init output confirms TurboQuant:
All binaries built: llama-cli, llama-bench, llama-perplexity, llama-server.
Smoke test PASSED: model loads, generates coherent text at ~34 tokens/s.
Build time: ~3 minutes (cmake configure + make -j14)
Path used: Direct build from feature branch (no cherry-picking needed)
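For anyone reproducing the build, the steps above can be sketched as a pair of command lists plus a check that the four binaries exist. This is a hedged reconstruction: the `-DGGML_METAL=ON` and `-DCMAKE_BUILD_TYPE=Release` options and the 14 parallel jobs come from this report, while the `build` directory name, the use of `cmake --build` in place of a direct `make -j14`, and the helper names `build_commands` and `missing_binaries` are assumptions.

```python
import os

# Binaries listed as built in this report.
EXPECTED_BINARIES = ("llama-cli", "llama-bench", "llama-perplexity", "llama-server")


def build_commands(source_dir=".", build_dir="build", jobs=14):
    """Return the configure and build commands as argument lists.

    Assumption: an out-of-tree build in `build_dir`; the report itself
    ran `make -j14` after configuring.
    """
    configure = [
        "cmake", "-S", source_dir, "-B", build_dir,
        "-DGGML_METAL=ON",
        "-DCMAKE_BUILD_TYPE=Release",
    ]
    build = ["cmake", "--build", build_dir, "-j", str(jobs)]
    return [configure, build]


def missing_binaries(bin_dir, names=EXPECTED_BINARIES):
    """Return the expected binaries that are absent or not executable in bin_dir."""
    return [
        name for name in names
        if not os.access(os.path.join(bin_dir, name), os.X_OK)
    ]


if __name__ == "__main__":
    for cmd in build_commands():
        print(" ".join(cmd))
    # Assumption: binaries land in build/bin, as in upstream llama.cpp.
    print("missing:", missing_binaries(os.path.join("build", "bin")))
```

Each list can be passed directly to `subprocess.run`. An empty result from `missing_binaries` corresponds to the "All binaries built" line above.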