[P4] Upstream llama.cpp / Ollama TurboQuant watch #15
Parent: #1
Monitor llama.cpp upstream and Ollama for official TurboQuant support.
When It Lands
Owner: Locke (monitoring) + Cid (evaluation)
Watch Sources
Acceptance Criteria
🐺 Fenrir — Deep Technical Analysis (Burn Night)
Issue Assessment: Upstream llama.cpp / Ollama TurboQuant Watch
Classification: Phase 4 — long-term monitoring task
Labels: owner:locke, phase-4
Owner: Locke (monitoring) + Cid (evaluation)
Dependencies: None — this runs in parallel with all other phases
Current Upstream Landscape (as of 2026-04-03)
llama.cpp (ggerganov/llama.cpp)
KV cache quantization status in upstream llama.cpp:
--kv-type with types like q8_0, q4_0, q4_1, q5_0, q5_1 for KV cache quantization
ggml_type enum would need a new entry (GGML_TYPE_TURBO4) for PolarQuant support
Signals to watch:
ggml-metal.metal related to KV cache types
ggml.h changes to ggml_type enum
Ollama
Current state:
OLLAMA_KV_CACHE_TYPE environment variable
Signals to watch:
llm/llama.go CGo bindings for new quantization types
Related Academic / Community Work
amirzandieh/QJL): CUDA-only, author code. No Metal port exists upstream.
feature/turboquant-kv-cache branch. This is our primary source, but it hasn't been proposed as an upstream PR.
Monitoring Implementation Plan
Recommended Cadence: Bi-weekly (not weekly)
Weekly monitoring for Phase 4 is overkill — upstream changes of this magnitude move slowly. Bi-weekly check with automated assist:
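One of the signals listed above is a change to the ggml_type enum in ggml.h. A minimal sketch of how a bi-weekly check could flag new enum entries, assuming the header text has already been fetched by some other means. The baseline set KNOWN_TYPES and the function names are hypothetical, and the baseline is only an illustrative subset, so any entry missing from it (for example GGML_TYPE_COUNT) would also be reported.

```python
import re

# Hypothetical baseline: the type names we already know about (illustrative subset).
KNOWN_TYPES = {
    "GGML_TYPE_F32", "GGML_TYPE_F16",
    "GGML_TYPE_Q4_0", "GGML_TYPE_Q4_1",
    "GGML_TYPE_Q5_0", "GGML_TYPE_Q5_1",
    "GGML_TYPE_Q8_0",
}

ENUM_BODY = re.compile(r"enum\s+ggml_type\s*\{(.*?)\}", re.DOTALL)
ENTRY = re.compile(r"\b(GGML_TYPE_[A-Z0-9_]+)\b")


def enum_entries(header_text):
    """Return the set of GGML_TYPE_* names declared in the ggml_type enum."""
    match = ENUM_BODY.search(header_text)
    if match is None:
        return set()
    body = match.group(1)
    # Drop comments so names mentioned only in comments are not counted.
    body = re.sub(r"//[^\n]*", "", body)
    body = re.sub(r"/\*.*?\*/", "", body, flags=re.DOTALL)
    return set(ENTRY.findall(body))


def new_entries(header_text, known=KNOWN_TYPES):
    """Return enum entries that are not in the known baseline, sorted."""
    return sorted(enum_entries(header_text) - set(known))
```

Anything returned by new_entries is only a prompt for a human to look at the upstream diff; it does not by itself mean TurboQuant support has landed.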
Automated Alternative
Could set up a Gitea webhook or cron job that:
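A rough sketch of the matching logic such a cron job might run. It assumes the job has already collected (id, title) pairs for recent upstream issues, PRs, or commits; how they are fetched is deliberately left out. The keyword list, function names, and state-file format are all hypothetical.

```python
import json
from pathlib import Path

# Hypothetical keyword list, taken from the terms this issue tracks.
KEYWORDS = ("turboquant", "polarquant", "qjl", "ggml_type_turbo")


def matches(title, keywords=KEYWORDS):
    """Return the keywords found in a title (case-insensitive substring match)."""
    lowered = title.lower()
    return [k for k in keywords if k in lowered]


def scan(items, seen, keywords=KEYWORDS):
    """Return (id, title, keywords) for items that match and were not seen before."""
    hits = []
    for item_id, title in items:
        if item_id in seen:
            continue
        found = matches(title, keywords)
        if found:
            hits.append((item_id, title, found))
    return hits


def load_seen(path):
    """Load previously reported ids from a JSON file; empty set if the file is missing."""
    p = Path(path)
    return set(json.loads(p.read_text())) if p.exists() else set()


def save_seen(path, seen):
    """Persist reported ids so the next run does not repeat them."""
    Path(path).write_text(json.dumps(sorted(seen)))
```

Each run would call load_seen, pass the result to scan, post any hits as a comment on this issue, and then call save_seen with the updated set.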
Acceptance Criteria Assessment
When Upstream Lands — Decision Framework
Recommendation
Keep OPEN. This is a Phase 4 long-horizon issue by design. Actionable next steps:
UPSTREAM-WATCH.md in repo documenting what we're tracking and when last checked
The wolf watches the horizon. The prey hasn't appeared upstream yet; we hunt with our own fork. 🐺
🐺 Fenrir Burn Night Analysis — Issue #15: Implement Real-Time WebSocket Feed for Live Market Data
What This Issue Is Asking For
WebSocket-based real-time market data feed:
websockets library
Current Status
Zero WebSocket/asyncio code exists. The repo is a bare-bones portfolio optimizer.
Technical Architecture
Module Structure
Key Decisions
websockets + uvloop for 2-4x asyncio throughput
collections.deque(maxlen=1000) for ring buffer
orjson instead of json for serialization perf
compression='deflate'
Performance Feasibility
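The ring-buffer decision listed under Key Decisions can be sketched as follows. This is a minimal illustration, not the planned module: the class name TickBuffer and the message shape (symbol, price) are hypothetical, and the standard-library json module stands in for orjson so the example has no third-party dependencies.

```python
import json
from collections import deque


class TickBuffer:
    """Fixed-size buffer that keeps only the most recent ticks."""

    def __init__(self, maxlen=1000):
        # deque with maxlen silently drops the oldest entry once full.
        self._ticks = deque(maxlen=maxlen)

    def push(self, raw_message):
        """Decode one JSON message and append it to the buffer."""
        tick = json.loads(raw_message)
        self._ticks.append(tick)
        return tick

    def latest(self, n=1):
        """Return up to n most recent ticks, oldest first."""
        if n <= 0:
            return []
        return list(self._ticks)[-n:]

    def __len__(self):
        return len(self._ticks)
```

Because deque(maxlen=...) evicts the oldest element on append, memory use stays bounded regardless of how long the feed runs, which is the point of choosing it over an unbounded list.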
Blockers
Recommended Next Steps
Verdict: KEEP OPEN — Data backbone for live trading. Priority: HIGH.
A wolf's ears are always twitching. Real-time or nothing.
Triaged during backlog cleanup — priority confirmed. Needs owner assignment.