[P2] Custom Ollama build + MacBook deployment #10
Parent: #1 | Depends on: #9 (API check)
Build custom Ollama using our llama.cpp fork as submodule. Deploy to MacBook.
Steps
Estimated Time: 15-25 min (once llama.cpp fork is validated)
Acceptance Criteria
Custom Ollama Build — DEFERRED
Three approaches were attempted; all failed.
Root Cause
Ollama vendors llama.cpp with deep modifications. The TurboQuant fork spans 30+ files (Metal shaders, CUDA kernels, CPU ops, KV cache code). Clean integration requires rebasing onto Ollama's exact pinned commit — estimated multi-day effort.
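The rebase effort can be sketched roughly as below. This is a sketch only, assuming Ollama still pins llama.cpp as a git submodule at `llm/llama.cpp` (true for older releases; newer releases vendor the sources instead, which changes where the pinned commit lives); the remote URL and paths are assumptions, and `<pinned-commit>` stands for whatever commit the audit turns up.

```shell
# Sketch, not a working recipe — submodule path and layout are assumptions.
# 1. Find the exact llama.cpp commit Ollama pins.
cd ollama
git submodule status llm/llama.cpp   # first column is the pinned commit hash

# 2. Replay the TurboQuant patches onto that commit.
cd /tmp/llama-cpp-turboquant
git remote add upstream https://github.com/ggerganov/llama.cpp.git
git fetch upstream
git rebase <pinned-commit>   # expect conflicts across the 30+ touched files
```

The multi-day estimate comes from step 2: each of the Metal/CUDA/CPU/KV-cache patches can conflict independently against the pinned commit.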
Recommended Path: llama-server
The fork's `llama-server` binary at `/tmp/llama-cpp-turboquant/build/bin/llama-server` can serve models directly:

```
llama-server -m <model.gguf> --port 11434 -ctk turbo4 -ctv turbo4
```

Deferred Work
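Since `llama-server` exposes an OpenAI-compatible `/v1/chat/completions` endpoint, clients that expect Ollama on port 11434 can be pointed at it with little change. A minimal smoke-test client, assuming the server is running on that port (the model name field is ignored by llama-server, which serves whatever model it was started with):

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434"  # matches --port 11434 above


def build_chat_request(prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt: str) -> str:
    """POST the prompt to llama-server and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

If this smoke test passes, deploying to the MacBook reduces to copying the binary and a launchd/shell wrapper rather than rebuilding Ollama.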
The custom Ollama build is saved as a future task. When Ollama updates its llama.cpp pin, the gap narrows; the Phase 4 upstream watch (#15) covers this.