# TurboQuant Living Status Tracker

Updated on each milestone. See PROJECT_STATUS.md for detailed phase reports.

## Quick Status

| Phase | Status | Last Updated | Issue |
|-------|--------|-------------|-------|
| Phase 1: PolarQuant MVP | DONE | 2026-03-30 | #17 |
| Phase 2: KV Cache Compression | IN PROGRESS | 2026-04-15 | #99 |
| Edge Crisis Detection | DONE | 2026-04-15 | #102 |
| Integration PR (upstream llama.cpp) | NOT STARTED | — | — |
| QJL Quantization | NOT STARTED | — | — |
| Ollama Integration | NOT STARTED | — | — |
| Benchmark Suite | IN PROGRESS | 2026-04-13 | #12 |

## Phase Details

### Phase 1: PolarQuant MVP — COMPLETE
- PolarQuant KV cache compression working on Apple Silicon
- 73% KV memory savings, 1% prompt overhead, 11% generation overhead
- Metal shaders: flash attention, WHT rotation, codebooks
- Hardware: M3 Max 36 GB (corrected from spec)
- Gate Check #2: PASSED

### Phase 2: Edge Deployment — COMPLETE
- Crisis detection on edge devices (Pi 4, old phones)
- Keyword + model (gemma2:2b) + offline resources
- Deployment guide, model selection, resource cache
- See docs/edge-crisis-deployment.md

### Phase 3: Upstream Integration — NOT STARTED
- PR to llama.cpp for turbo quantization
- Depends on Phase 2 benchmarks

### Phase 4: QJL — NOT STARTED
- Johnson-Lindenstrauss quantization
- Lower memory than PolarQuant
- Research phase

## Recent Milestones

| Date | Milestone | PR/Issue |
|------|-----------|----------|
| 2026-04-15 | Edge crisis detection deployed | #102 / PR #111 |
| 2026-04-14 | KV cache compression profiles | PR #68 |
| 2026-04-13 | Benchmark suite expanded | #12 / #39 |
| 2026-03-30 | Phase 1 complete: PolarQuant MVP | #17 |

## Open Blockers

| Blocker | Impact | Issue |
|---------|--------|-------|
| None currently | — | — |

---

*Last auto-updated: 2026-04-15*
*This file is the single source of truth for project status.*
*Update it on every milestone merge.*