TurboQuant Living Status Tracker
Updated on each milestone. See PROJECT_STATUS.md for detailed phase reports.
Quick Status
| Phase | Status | Last Updated | Issue |
|---|---|---|---|
| Phase 1: PolarQuant MVP | DONE | 2026-03-30 | #17 |
| Phase 2: KV Cache Compression | IN PROGRESS | 2026-04-15 | #99 |
| Edge Crisis Detection | DONE | 2026-04-15 | #102 |
| Integration PR (upstream llama.cpp) | NOT STARTED | — | — |
| QJL Quantization | NOT STARTED | — | — |
| Ollama Integration | NOT STARTED | — | — |
| Benchmark Suite | IN PROGRESS | 2026-04-13 | #12 |
Phase Details
Phase 1: PolarQuant MVP — COMPLETE
- PolarQuant KV cache compression working on Apple Silicon
- 73% KV cache memory savings, at a cost of 1% prompt-processing overhead and 11% generation overhead
- Metal shaders: flash attention, WHT rotation, codebooks
- Hardware: M3 Max, 36 GB (corrected from spec)
- Gate Check #2: PASSED
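
For readers unfamiliar with the WHT rotation listed among the Metal shaders, here is a minimal reference sketch of a fast Walsh-Hadamard transform in plain Python. It is not the project's shader code: the function name `fwht` and the orthonormal `1/sqrt(n)` scaling are assumptions chosen for illustration, and the actual kernels may also apply sign flips or a different normalization.

```python
import math


def fwht(values):
    """Orthonormal fast Walsh-Hadamard transform (illustrative sketch only).

    `values` must have a power-of-two length. The 1/sqrt(n) scaling is an
    assumption made here so that the transform preserves the L2 norm.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [x * scale for x in out]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    y = fwht(x)
    # With orthonormal scaling, applying the transform twice recovers x.
    print(y, fwht(y))
```

Because the scaled transform is its own inverse and preserves vector norms, it can act as a cheap rotation that spreads energy across coordinates before quantization.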
Edge Crisis Detection — COMPLETE
- Crisis detection on edge devices (Pi 4, old phones)
- Keyword + model (gemma2:2b) + offline resources
- Deployment guide, model selection, resource cache
- See docs/edge-crisis-deployment.md
Phase 3: Upstream Integration — NOT STARTED
- PR to llama.cpp adding TurboQuant quantization support
- Depends on Phase 2 benchmarks
Phase 4: QJL — NOT STARTED
- Quantized Johnson-Lindenstrauss (QJL) transform for KV cache quantization
- Targets lower memory use than PolarQuant
- Research phase
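
Since this phase has not started, the following is only a rough sketch of the general 1-bit Johnson-Lindenstrauss idea as it is commonly described, not a planned implementation. A key vector is projected with a random Gaussian matrix, only the sign bits and the key's norm are kept, and a query's dot product with the key is estimated from them. All names (`make_projection`, `quantize_key`, `estimate_dot`) and sizes are hypothetical.

```python
import math
import random


def make_projection(m, d, seed=0):
    """Random m x d Gaussian matrix (hypothetical helper; sizes are assumptions)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]


def project(S, x):
    """Compute S @ x for a list-of-rows matrix S."""
    return [sum(s_i * x_i for s_i, x_i in zip(row, x)) for row in S]


def quantize_key(S, key):
    """Keep only the sign bits of S @ key, plus the key's L2 norm."""
    signs = [1 if v >= 0 else -1 for v in project(S, key)]
    norm = math.sqrt(sum(v * v for v in key))
    return signs, norm


def estimate_dot(S, query, signs, norm):
    """Estimate <query, key> from the stored sign bits and norm.

    Uses E[<s, q> * sign(<s, k>)] = sqrt(2/pi) * <q, k> / ||k|| for Gaussian s.
    """
    m = len(S)
    corr = sum(p * s for p, s in zip(project(S, query), signs))
    return math.sqrt(math.pi / 2) / m * norm * corr


if __name__ == "__main__":
    S = make_projection(4096, 8)
    key = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 1.0, 0.25]
    query = [1.0, 0.5, 1.0, -1.0, 0.5, 0.0, -0.5, 2.0]
    signs, norm = quantize_key(S, key)
    exact = sum(a * b for a, b in zip(key, query))
    print("exact:", exact, "estimate:", estimate_dot(S, query, signs, norm))
```

The estimate is noisy for small projection sizes and tightens as `m` grows, so any real design would need to trade bits per key against accuracy.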
Recent Milestones
| Date | Milestone | PR/Issue |
|---|---|---|
| 2026-04-15 | Edge crisis detection deployed | #102 / PR #111 |
| 2026-04-14 | KV cache compression profiles | PR #68 |
| 2026-04-13 | Benchmark suite expanded | #12 / #39 |
| 2026-03-30 | Phase 1 complete: PolarQuant MVP | #17 |
Open Blockers
| Blocker | Impact | Issue |
|---|---|---|
| None currently | — | — |
Last auto-updated: 2026-04-15

This file is the single source of truth for project status. Update it on every milestone merge.