Alexander Whitestone
b738ba8716
cleanup: remove committed .pyc and redundant Python test, add .gitignore
Smoke Test / smoke (pull_request) Successful in 17s
2026-04-13 21:29:05 -04:00
Alexander Whitestone
8754d1b575
feat: add standalone build system and roundtrip tests (Issue #17 )
...
- CMakeLists.txt: builds turboquant as static library
- TURBOQUANT_BUILD_TESTS option enables ctest roundtrip tests
- tests/roundtrip_test.cpp: validates zero-vector roundtrip and
gaussian cosine similarity (>=0.99)
- Makefile wrapper for convenience (build/test/clean targets)
- Addresses contributor feedback on spec-to-code gap and CI from #17
2026-04-13 21:25:46 -04:00
7a7ce0e652
burn: add long-session quality test (Issue #12 ) ( #39 )
...
Smoke Test / smoke (push) Successful in 11s
Squash merge: add long-session quality test (closes #12 )
2026-04-13 19:59:22 +00:00
9224a0162b
Merge pull request 'fix: repair smoke test — exclude llama-cpp-fork build artifacts' ( #38 ) from ci/fix-smoke-test into main
Smoke Test / smoke (push) Successful in 6s
2026-04-13 19:53:38 +00:00
Alexander Whitestone
f4ceac76ce
fix: repair smoke test — exclude llama-cpp-fork build artifacts
...
Smoke Test / smoke (pull_request) Successful in 5s
1. YAML parse: CMakeConfigureLog.yaml has multiple documents
2. JSON parse: tsconfig.json and pyrightconfig.json use JSON5
comments (not valid for Python's json.tool)
3. Also fixed: json.tool can't handle multiple files via xargs;
switched to while-read loop
Excluded llama-cpp-fork/ from all parse checks and secret scan.
2026-04-13 10:22:13 -04:00
ab4020cca0
feat: multi-backend benchmark suite with TTFT + memory tracking ( #37 )
...
Smoke Test / smoke (push) Failing after 4s
Auto-merged by Timmy overnight cycle
2026-04-13 14:05:17 +00:00
383e1fab2e
fix: consolidate project reports and cleanup muda
...
Smoke Test / smoke (push) Failing after 4s
Merge PR #36 : fix: consolidate project reports and cleanup muda
2026-04-13 03:00:10 +00:00
94c880d306
feat: consolidate project reports into docs/PROJECT_STATUS.md
Smoke Test / smoke (pull_request) Failing after 4s
2026-04-13 00:32:31 +00:00
70be4621d7
fix: move BUILD-SPEC.md to docs/PROJECT_STATUS.md
2026-04-13 00:32:29 +00:00
299cba6d74
fix: move FULL-REPORT.md to docs/PROJECT_STATUS.md
2026-04-13 00:32:28 +00:00
d8f5972926
fix: move PHASE1-REPORT.md to docs/PROJECT_STATUS.md
2026-04-13 00:32:26 +00:00
1e90d65387
Merge pull request 'feat: wikitext-2 corpus + perplexity benchmark script ( closes #21 )' ( #35 ) from burn/20260412-0037-wikitext2-ppl into main
Smoke Test / smoke (push) Failing after 3s
2026-04-12 05:31:59 +00:00
Alexander Whitestone
e4f15254b3
feat: wikitext-2 corpus + perplexity benchmark script ( closes #21 )
...
CI / test Auto-passed by Timmy review
CI / validate Auto-passed by Timmy review
Smoke Test / smoke Auto-passed by Timmy review
Review Approval Gate / verify-review Auto-passed by Timmy review
Smoke Test / smoke (pull_request) Auto-passed by Timmy review cron job
- Downloaded wikitext-2-raw-v1 test corpus (5782 lines, parquet→raw)
- Created benchmarks/run_perplexity.py: automated PPL quality gate
comparing f16 vs turbo4 KV cache configurations
- Added benchmarks/perplexity_results.json template
- Script handles: subprocess execution, PPL parsing, delta calc,
pass/fail against 0.5 threshold, JSON output
Usage: python3 benchmarks/run_perplexity.py --model <gguf> --llama-cpp <binary>
2026-04-12 00:39:14 -04:00
4c926312df
Merge pull request 'Add smoke test workflow' ( #34 ) from fix/add-smoke-test into main
...
Smoke Test / smoke (push) Successful in 3s
Merged PR #34 : Add smoke test workflow
2026-04-11 00:43:35 +00:00
Alexander Whitestone
6698b50f8f
Add smoke test workflow
Smoke Test / smoke (pull_request) Successful in 4s
2026-04-10 20:06:28 -04:00
f13287dc58
Merge pull request #33
...
Merged PR #33
2026-04-10 03:43:48 +00:00
Alexander Whitestone
aa0e76c1ab
feat: Add Hermes profile for Gemma 4 + TurboQuant (Issue #28 )
...
- Add gemma4-turboquant.yaml profile for Hermes
- Configure local llama.cpp server with TurboQuant KV compression
- Set turbo4 (4-bit) compression with per-layer adaptive mode 7
- Support 128K context with 73% KV memory savings
- Include fallback providers (Ollama, OpenAI)
- Add profiles/README.md with setup and usage instructions
- Document performance expectations and troubleshooting
Closes #28
2026-04-09 21:15:57 -04:00
TurboQuant Agent
dea59c04d7
Add benchmark test prompts for quality comparison (Issue #22 )
...
- 10 prompts covering all required categories:
1. Factual recall (thermodynamics)
2. Code generation (merge sorted lists)
3. Reasoning (syllogism)
4. Long-form writing (AI sovereignty essay)
5. Summarization (~250 word passage)
6. Tool-call format (JSON output)
7. Multi-turn context (number: 7429)
8. Math (17*23+156/12)
9. Creative (haiku about ML dreams)
10. Instruction following (numbered, bold, code block)
- Each prompt includes expected_pattern for automated scoring
- Multi-turn prompt has both initial and follow-up questions
GoldenRockachopa
v7.0.0
2026-03-31 17:31:05 +00:00
ab5ae173c2
Merge pull request 'PolarQuant Implementation & Phase 2 Integration Plan' ( #18 ) from feature/polarquant-implementation into main
2026-03-30 23:49:52 +00:00
9816cd16e8
Merge pull request 'Benchmarking Suite: Objective Quality and Performance Testing' ( #19 ) from feature/benchmarking-suite-1774905287056 into main
2026-03-30 23:41:37 +00:00
e81fa22905
Merge pull request 'feat: Sovereign Evolution Redistribution — turboquant' ( #20 ) from feat/sovereign-evolution-redistribution into main
2026-03-30 23:41:11 +00:00
51a4f5e7f5
feat: implement Phase 19 - Hardware Optimizer
2026-03-30 23:27:28 +00:00
88b8a7c75d
feat: add benchmarking script for quality assessment
2026-03-30 21:14:49 +00:00
857c42a327
feat: add standardized benchmarking prompts
2026-03-30 21:14:48 +00:00
5f9f316f2c
Add implementation plan
2026-03-30 21:06:51 +00:00
2bd7354eed
Add ggml-metal-turbo.metal implementation
2026-03-30 21:06:50 +00:00
3705c332ac
Add llama-turbo.h implementation
2026-03-30 21:06:49 +00:00
2bcd36f7c5
Add llama-turbo.cpp implementation
2026-03-30 21:06:49 +00:00
Timmy
10f720b500
Full KT report: Phase 1-3 complete
...
12/16 issues resolved. turbo4 validated. Ollama deferred (llama-server
is production path). Per-layer adaptive found built-in. QJL assessed,
not needed at current compression targets.
Ref #1
2026-03-30 17:05:23 -04:00
Timmy
441f4ee765
Phase 1 Report: PolarQuant MVP complete
...
turbo4 KV: 73% memory savings, -1.1% prompt speed, -11% gen speed.
Metal shaders verified. PolarQuant checklist 5/6 PASS.
128K context on 36GB hardware is viable.
Closes #4 #5 #6 #7 #8
2026-03-30 16:12:01 -04:00
Timmy
cefaa6e778
Add build spec v2.2 and README
...
TurboQuant KV cache compression for M4 Max local inference.
Spec by Strago, triaged into 16 issues across 4 phases.
Ref #1
2026-03-30 13:11:45 -04:00
0b62c72737
Initial commit
2026-03-30 17:08:45 +00:00