[P2-S0] Ollama CGo API compatibility check #9

Closed
opened 2026-03-30 17:11:11 +00:00 by Timmy · 1 comment
Owner

Parent: #1 | Depends on: Phase 1 complete

Check if our llama.cpp fork's API surface is compatible with Ollama's CGo bindings.

Action

git clone https://github.com/ollama/ollama.git
cd ollama

# Find pinned llama.cpp commit
head -5 llm/llama.cpp/CMakeLists.txt

# Diff API surface
diff <(grep -h "^LLAMA_API\|^GGML_API" llm/llama.cpp/include/*.h | sort) \
     <(grep -h "^LLAMA_API\|^GGML_API" /path/to/our-fork/include/*.h | sort)

Assessment

  • TurboQuant changes additive only (new functions/types)? -> Safe
  • Existing signatures modified? -> Ollama's CGo bindings need updating
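The additive-only test above can be mechanized: if the diff output contains no removed (`<`) lines, nothing Ollama's bindings call has changed. A minimal sketch, using a synthetic two-line diff in place of the real output of the diff command above (the function names in the sample are made up for illustration):

```shell
# Classify the API-surface diff: '<' lines mean existing declarations changed
# (breaking); only '>' lines means additive-only. The sample file stands in
# for the real diff output; declaration names are illustrative.
cat > /tmp/api.diff <<'EOF'
> GGML_API void ggml_turbo_wht(void);
> LLAMA_API int llama_turbo_quantize(void);
EOF
removed=$(grep -c '^<' /tmp/api.diff || true)   # || true: grep -c exits 1 on zero matches
added=$(grep -c '^>' /tmp/api.diff)
if [ "$removed" -eq 0 ]; then
  echo "additive-only: $added new declarations -> safe"
else
  echo "breaking: $removed declarations changed or removed -> update CGo bindings"
fi
```

Pointing the two `cat`/`grep` inputs at the real header diff turns this into the actual acceptance check.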

Acceptance Criteria

  • Ollama's pinned llama.cpp commit identified
  • API surface diff generated
  • Compatibility assessment: compatible / needs binding updates
Timmy added this to the Phase 2 — Ollama Integration + Production milestone 2026-03-30 17:11:11 +00:00
Timmy added the phase-2, build, owner:cid labels 2026-03-30 17:11:11 +00:00
Author
Owner

Ollama CGo API Compatibility Check

Ollama Status

  • Fixed broken symlink (moved Ollama.app from ~/Downloads to /Applications)
  • Version: 0.17.7, server running on :11434

API Surface

TurboQuant changes are ADDITIVE (good):

  • 3 new GGML_TYPE enums: TURBO3_0=41, TURBO4_0=42, TURBO2_0=43
  • 1 new GGML_OP: TURBO_WHT
  • No existing enums/structs modified
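Because the changes are additive, their presence in the fork's headers can be spot-checked with grep. A sketch against a synthetic header excerpt (the full enum spellings such as `GGML_TYPE_TURBO3_0` are assumed; only the `TURBO*_0` names and the values 41-43 come from the assessment above):

```shell
# Synthetic excerpt standing in for the fork's ggml.h; enum spellings are
# assumed, values 41-43 are from the assessment above.
cat > /tmp/ggml_sample.h <<'EOF'
    GGML_TYPE_TURBO3_0 = 41,
    GGML_TYPE_TURBO4_0 = 42,
    GGML_TYPE_TURBO2_0 = 43,
EOF
grep -c 'TURBO[234]_0' /tmp/ggml_sample.h   # expect 3: all new types present
```

Run against the fork's real include/ directory, a count below 3 would mean a type is missing from the public header.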

Compatibility Problem

The fork is based on a NEWER upstream than Ollama's pinned commit (ec98e2002):

  • Ollama carries 34 custom patches on its vendored llama.cpp
  • Direct replacement fails (patch sha1 mismatches)
  • Cherry-picking TurboQuant onto Ollama's tree requires patching 30+ files
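The sha1-mismatch failure can be reproduced in miniature: a patch written against the pinned tree will not apply to a file whose content has moved on. A sketch with a one-file synthetic "repo" (file name and patch content are illustrative, not Ollama's actual vendored patches):

```shell
# Illustrative only: the fork's file carries newer content than the line the
# vendored patch expects, so git rejects it -- the same failure mode as
# dropping Ollama's 34 patches onto the TurboQuant fork.
tmp=$(mktemp -d) && cd "$tmp"
printf 'new upstream line\n' > ggml.c
cat > vendored.patch <<'EOF'
--- a/ggml.c
+++ b/ggml.c
@@ -1 +1 @@
-old pinned line
+old pinned line, patched
EOF
git apply --check vendored.patch 2>&1 || echo "patch does not apply: porting required"
```

`git apply --check` is the same dry-run git uses internally; running it over the real patch series is a cheap way to count exactly which of the 34 patches conflict.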

Verdict

A custom Ollama build would be a multi-day porting effort, not feasible in the Phase 2 timeframe. Deferred.

Production Alternative

The fork's llama-server binary is already built and speaks an OpenAI-compatible API (the same format Ollama exposes). Recommended as a drop-in replacement for the inference endpoint.
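Concretely, a client written for an OpenAI-compatible endpoint needs only a base-URL change to target llama-server. A sketch of the request shape (the port 8080 and model name "turboquant" are placeholders; `/v1/chat/completions` is the standard OpenAI-compatible path llama-server serves):

```shell
# Build and validate the OpenAI-style request body; the commented curl line
# shows how it would be sent to the fork's llama-server (port and model name
# are placeholders, not confirmed deployment values).
body='{"model":"turboquant","messages":[{"role":"user","content":"ping"}]}'
echo "$body" | python3 -c 'import json,sys; json.load(sys.stdin); print("payload ok")'
# curl -s http://localhost:8080/v1/chat/completions \
#   -H 'Content-Type: application/json' -d "$body"
```

Because the wire format matches, anything currently pointed at Ollama's :11434 endpoint should only need its base URL swapped.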

Timmy closed this issue 2026-03-30 21:04:01 +00:00