Compare commits

2 Commits: burn/17-17 ... dispatch/1

| Author | SHA1 | Date |
|--------|------------|------|
|        | 8affe79489 |      |
|        | 319f57780d |      |
3  .gitignore  vendored  Normal file
@@ -0,0 +1,3 @@
build/
*.pyc
__pycache__/
36  CMakeLists.txt  Normal file
@@ -0,0 +1,36 @@
cmake_minimum_required(VERSION 3.16)

project(turboquant LANGUAGES CXX)

option(TURBOQUANT_BUILD_TESTS "Build standalone TurboQuant validation tests" ON)

add_library(turboquant STATIC
    llama-turbo.cpp
)

target_include_directories(turboquant PUBLIC
    ${CMAKE_CURRENT_SOURCE_DIR}
)

target_compile_features(turboquant PUBLIC cxx_std_17)

if(MSVC)
    target_compile_options(turboquant PRIVATE /W4)
else()
    target_compile_options(turboquant PRIVATE -Wall -Wextra -Wpedantic)
endif()

if(TURBOQUANT_BUILD_TESTS)
    include(CTest)

    add_executable(turboquant_roundtrip_test
        tests/roundtrip_test.cpp
    )
    target_link_libraries(turboquant_roundtrip_test PRIVATE turboquant)
    target_compile_features(turboquant_roundtrip_test PRIVATE cxx_std_17)

    add_test(
        NAME turboquant_roundtrip
        COMMAND turboquant_roundtrip_test
    )
endif()
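For reference, a typical out-of-source build and test run against this CMakeLists.txt might look like the following. This is a sketch, not part of the diff; paths and build type are assumptions, and tests build by default because `TURBOQUANT_BUILD_TESTS` is `ON`:

```shell
# Configure an out-of-source build in ./build (build type is an assumption).
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release

# Build the turboquant static library and the roundtrip test executable.
cmake --build build

# Run the registered turboquant_roundtrip test through CTest.
# (cd into the build tree for compatibility with CMake 3.16.)
(cd build && ctest --output-on-failure)
```

Passing `-DTURBOQUANT_BUILD_TESTS=OFF` at configure time would skip the test targets entirely.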
@@ -1,154 +0,0 @@
# TurboQuant Initiative Review & Contributor Feedback

**Issue:** #17
**Date:** 2026-04-14
**Reviewer:** Timmy (burn worker)

---

## Executive Summary

The TurboQuant initiative is **on track** with strong Phase 1 results. The 73% KV memory savings with minimal overhead is production-quality. However, the repository activity concern is valid — we need to accelerate from documentation to integration.

## Review Points

### 1. Repository Activity (3 commits)

**Current State:**
- 1 commit in main branch (long-session quality test)
- Implementation files exist but are not yet integrated into llama.cpp

**Recommendation:**
- Create a dedicated integration branch for llama.cpp
- Commit incrementally: shaders first, then CPU reference, then benchmarks
- Target: 10+ commits in the next sprint to demonstrate momentum

### 2. Metal Shaders Integration

**Current State:**
- `ggml-metal-turbo.metal` exists with production-quality kernels
- Full flash attention for turbo2/3/4
- WHT rotation kernels implemented
- Lloyd-Max codebooks hardcoded

**Gap:** Shaders are standalone, not integrated into the main llama.cpp fork.

**Action Items:**
1. Create integration PR to the `TheTom/llama-cpp-turboquant` feature branch
2. Add shader registration in `ggml-metal.m`
3. Update the CMake build to include the new files
4. Add CI validation for shader compilation

### 3. QJL Residual Correction Accuracy

**Current State:**
- QJL infrastructure exists in the Metal shaders
- `TURBO4_USE_4BIT=1` by default (QJL disabled)
- 4-bit PolarQuant delivers 73% savings without QJL

**Assessment:** QJL is **not needed** for current compression targets. The 4-bit PolarQuant already meets quality requirements.

**Oversight Needed:**
- If compression targets drop below 3 bits/channel, QJL becomes necessary
- The current Metal QJL implementation is infrastructure-only (no active kernels)
- Recommend: document QJL as "ready but disabled" and gate it on future need

### 4. Phase 1→2 Transition

**Current State:**
- Phase 1 complete (PolarQuant MVP)
- Phase 2 partially complete (Ollama deferred, llama-server available)
- 12/16 issues resolved

**Blockers:**
- Ollama integration requires a multi-day effort (34 custom patches)
- qwen3.5:27b model not downloaded
- PPL testing needs the wikitext corpus

**Recommendation:**
- Focus on llama-server deployment (immediate value)
- Defer Ollama to Phase 4 / upstream watch
- Download qwen3.5:27b and run production validation

---

## Contributor Feedback

### For @manus (Frequent Updates)

**Current:** PROJECT_STATUS.md is comprehensive but only updated at phase completion.

**Recommendation:**
- Weekly progress updates in issue comments
- Benchmark results as they happen (not batched)
- Blocker escalation within 24 hours

### For @Timmy (Spec Alignment)

**Current:** Build spec v2.2 is well-aligned with the implementation.

**Verification:**
- ✅ WHT rotation matches spec
- ✅ Lloyd-Max codebook matches spec
- ✅ No per-vector normalization (spec requirement)
- ⚠️ CPU turbo4 reference incompatible with Metal (documented)

**Recommendation:** The spec is stable. Focus on implementation velocity.

### For @Rockachopa (QJL Oversight)

**Current:** QJL is disabled by default. No accuracy risk at 4-bit compression.

**Oversight Framework:**
1. Gate QJL enablement on quality metrics (PPL delta ≤ 0.5)
2. Run A/B tests: turbo4 vs turbo4+QJL when QJL kernels are active
3. Monitor for accuracy regression in long sessions (>32K context)

**Recommendation:** The current approach is correct. QJL oversight can remain passive until needed.

---

## Action Items

### Immediate (This Week)
1. [ ] Create llama.cpp integration branch
2. [ ] Commit Metal shaders with registration
3. [ ] Download qwen3.5:27b model
4. [ ] Deploy llama-server for production testing

### Short Term (Next Sprint)
5. [ ] Run PPL test with wikitext corpus
6. [ ] Complete 10-prompt quality matrix
7. [ ] Weekly progress updates in issue comments
8. [ ] John quality sign-off

### Medium Term (Phase 3)
9. [ ] Ollama integration assessment (if upstream doesn't update)
10. [ ] QJL activation if compression needs exceed 4-bit

---

## Risk Assessment

| Risk | Status | Mitigation |
|------|--------|------------|
| Low repo activity | ⚠️ Active | Accelerate commits, weekly updates |
| Metal integration complexity | ✅ Low | Shaders exist, just need registration |
| QJL accuracy | ✅ Low | Disabled by default, gated on metrics |
| Ollama blockage | ⚠️ Active | Use llama-server instead |
| PPL regression | ⏸️ Untested | Download corpus, test in production |

---

## Recommendation

**PROCEED WITH CONFIDENCE.** The technical foundation is solid. The 73% KV savings is production-ready. Focus on:
1. Integration velocity (more commits)
2. Production deployment (llama-server)
3. Quality validation (PPL + prompt matrix)

The transition from spec to implementation is achievable in the next sprint.

---

*Review generated by burn worker for issue #17*
104  tests/roundtrip_test.cpp  Normal file
@@ -0,0 +1,104 @@
#include "llama-turbo.h"

#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iostream>
#include <random>
#include <stdexcept>
#include <string>
#include <vector>

namespace {

constexpr int kDim = 128;
constexpr float kCosineThreshold = 0.99f;
constexpr float kZeroTolerance = 1.0e-6f;

[[nodiscard]] bool all_finite(const std::vector<float> & values) {
    for (float value : values) {
        if (!std::isfinite(value)) {
            return false;
        }
    }
    return true;
}

[[nodiscard]] float max_abs(const std::vector<float> & values) {
    float best = 0.0f;
    for (float value : values) {
        best = std::max(best, std::fabs(value));
    }
    return best;
}

[[nodiscard]] float cosine_similarity(const std::vector<float> & lhs, const std::vector<float> & rhs) {
    float dot = 0.0f;
    float lhs_norm = 0.0f;
    float rhs_norm = 0.0f;
    for (int i = 0; i < kDim; ++i) {
        dot += lhs[i] * rhs[i];
        lhs_norm += lhs[i] * lhs[i];
        rhs_norm += rhs[i] * rhs[i];
    }

    const float denom = std::sqrt(lhs_norm) * std::sqrt(rhs_norm);
    return denom == 0.0f ? 1.0f : dot / denom;
}

[[nodiscard]] std::vector<float> roundtrip(const std::vector<float> & input, float & norm_out) {
    // 4-bit codes pack two channels per byte, hence kDim / 2 bytes.
    std::vector<uint8_t> packed(kDim / 2, 0);
    norm_out = -1.0f;
    polar_quant_encode_turbo4(input.data(), packed.data(), &norm_out, kDim);

    std::vector<float> decoded(kDim, 0.0f);
    polar_quant_decode_turbo4(packed.data(), decoded.data(), norm_out, kDim);
    return decoded;
}

void require(bool condition, const std::string & message) {
    if (!condition) {
        throw std::runtime_error(message);
    }
}

void test_zero_vector_roundtrip() {
    std::vector<float> zeros(kDim, 0.0f);
    float norm = -1.0f;
    const auto decoded = roundtrip(zeros, norm);

    require(norm == 0.0f, "zero vector should encode with zero norm");
    require(all_finite(decoded), "zero vector decode produced non-finite values");
    require(max_abs(decoded) <= kZeroTolerance, "zero vector decode should remain near zero");
}

void test_gaussian_roundtrip_quality() {
    std::mt19937 rng(12345);
    std::normal_distribution<float> dist(0.0f, 1.0f);

    std::vector<float> input(kDim, 0.0f);
    for (float & value : input) {
        value = dist(rng);
    }

    float norm = -1.0f;
    const auto decoded = roundtrip(input, norm);

    require(norm > 0.0f, "random vector should encode with positive norm");
    require(all_finite(decoded), "random vector decode produced non-finite values");

    const float cosine = cosine_similarity(input, decoded);
    require(cosine >= kCosineThreshold, "roundtrip cosine similarity below threshold");
}

} // namespace

int main() {
    try {
        test_zero_vector_roundtrip();
        test_gaussian_roundtrip_quality();
        std::cout << "PASS: turboquant standalone roundtrip tests\n";
        return 0;
    } catch (const std::exception & exc) {
        std::cerr << "FAIL: " << exc.what() << '\n';
        return 1;
    }
}