Files
turboquant/docs/STATUS_TRACKER.md

1.9 KiB

TurboQuant Living Status Tracker

Updated on each milestone. See PROJECT_STATUS.md for detailed phase reports.

Quick Status

Phase Status Last Updated Issue
Phase 1: PolarQuant MVP DONE 2026-03-30 #17
Phase 2: KV Cache Compression IN PROGRESS 2026-04-15 #99
Edge Crisis Detection DONE 2026-04-15 #102
Integration PR (upstream llama.cpp) NOT STARTED
QJL Quantization NOT STARTED
Ollama Integration NOT STARTED
Benchmark Suite IN PROGRESS 2026-04-13 #12

Phase Details

Phase 1: PolarQuant MVP — COMPLETE

  • PolarQuant KV cache compression working on Apple Silicon
  • 73% KV memory savings, 1% prompt overhead, 11% generation overhead
  • Metal shaders: flash attention, WHT rotation, codebooks
  • Hardware: M3 Max 36GB (corrected from spec)
  • Gate Check #2: PASSED

Phase 2: Edge Deployment — COMPLETE

  • Crisis detection on edge devices (Pi 4, old phones)
  • Keyword + model (gemma2:2b) + offline resources
  • Deployment guide, model selection, resource cache
  • See docs/edge-crisis-deployment.md

Phase 3: Upstream Integration — NOT STARTED

  • PR to llama.cpp for turbo quantization
  • Depends on Phase 2 benchmarks

Phase 4: QJL — NOT STARTED

  • Johnson-Lindenstrauss quantization
  • Lower memory than PolarQuant
  • Research phase

Recent Milestones

Date Milestone PR/Issue
2026-04-15 Edge crisis detection deployed #102 / PR #111
2026-04-14 KV cache compression profiles PR #68
2026-04-13 Benchmark suite expanded #12 / #39
2026-03-30 Phase 1 complete: PolarQuant MVP #17

Open Blockers

Blocker Impact Issue
None currently

Last auto-updated: 2026-04-15 This file is the single source of truth for project status. Update it on every milestone merge.