turboquant

Timmy_Foundation/turboquant

Commit Graph

Author	SHA1	Message	Date

step35

cb2f7b0aa7

feat: add Allegro VPS benchmark infrastructure — presets, runner, tests

Smoke Test / smoke (pull_request) Successful in 8s

- profiles/allegro-cpu-presets.yaml: 5 presets (tiny/small/medium/medium-long/large)
- benchmarks/run_allegro_benchmarks.py: --dry-run, --all, --preset, --markdown
- benchmarks/allegro-2026-04-14.md: analysis & expected results
- tests/test_allegro_benchmarks.py: 19 smoke tests (preset validation, runner)

Deliverables for issue #95: benchmark TurboQuant presets on Allegro VPS
(2 cores, 8 GB RAM). Runner integrates with existing llama-server backend.
Presets tuned to ~6 GB usable memory budget; large preset needs swap.

Closes #95

2026-04-26 06:52:53 -04:00
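
The runner flags named in this commit (--dry-run, --all, --preset, --markdown) and the five preset names could be wired together roughly as below. This is a minimal sketch, not the contents of benchmarks/run_allegro_benchmarks.py: the function names `build_parser` and `select_presets`, the help strings, and the choice to make --all and --preset mutually exclusive are all assumptions for illustration.

```python
import argparse

# Preset names are taken from the commit message; everything else is illustrative.
PRESETS = ["tiny", "small", "medium", "medium-long", "large"]


def build_parser():
    """Build a CLI parser exposing the four flags listed in the commit (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="Allegro benchmark runner (sketch)")
    # Assumption: a run targets either every preset or exactly one of them.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--all", action="store_true", help="run every preset")
    target.add_argument("--preset", choices=PRESETS, help="run a single preset")
    parser.add_argument("--dry-run", action="store_true",
                        help="print what would run without starting a benchmark")
    parser.add_argument("--markdown", action="store_true",
                        help="emit results as a Markdown table")
    return parser


def select_presets(args):
    """Return the list of preset names selected by the parsed arguments."""
    return list(PRESETS) if args.all else [args.preset]
```

With this layout, `build_parser().parse_args(["--preset", "tiny", "--dry-run"])` selects only the tiny preset, and passing `--all` selects all five.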

Hermes Agent

5f0d00f127

fix(docs): resolve broken markdown links and stale forge URL

Smoke Test / smoke (pull_request) Successful in 6s

- Update raw-IP forge URL to canonical forge domain in README.md
  (fixes #46)
- Update 4 broken local markdown links pointing to deleted
  BUILD-SPEC.md, PHASE1-REPORT.md, FULL-REPORT.md to
  docs/PROJECT_STATUS.md (fixes #44)

2026-04-14 18:07:25 -04:00

Alexander Whitestone

aa0e76c1ab

feat: Add Hermes profile for Gemma 4 + TurboQuant (Issue #28)

- Add gemma4-turboquant.yaml profile for Hermes
- Configure local llama.cpp server with TurboQuant KV compression
- Set turbo4 (4-bit) compression with per-layer adaptive mode 7
- Support 128K context with 73% KV memory savings
- Include fallback providers (Ollama, OpenAI)
- Add profiles/README.md with setup and usage instructions
- Document performance expectations and troubleshooting

Closes #28

2026-04-09 21:15:57 -04:00
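
The 73% KV memory savings quoted in this commit can be related to the 4-bit setting with simple arithmetic. The sketch below assumes a 16-bit KV cache as the uncompressed baseline, which the commit message does not state; the function names are hypothetical. Under that assumption, a pure 4-bit encoding would save 75%, and a 73% saving corresponds to about 4.32 bits per stored value.

```python
def savings_for(bits_per_value, baseline_bits=16.0):
    """Fraction of KV memory saved when each value takes `bits_per_value` bits.

    Assumption: the uncompressed cache stores `baseline_bits` bits per value.
    """
    return 1.0 - bits_per_value / baseline_bits


def effective_bits(savings, baseline_bits=16.0):
    """Average bits per value implied by a stated savings fraction (same assumption)."""
    return baseline_bits * (1.0 - savings)


ideal = savings_for(4.0)        # savings for exactly 4 bits per value
implied = effective_bits(0.73)  # bits per value implied by the quoted 73%
```

The difference between the two figures is left unexplained here, since the commit message gives no breakdown of the compressed layout.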

3 Commits