Commit Graph

1 Commit

Author SHA1 Message Date
Alexander Whitestone
aa0e76c1ab feat: Add Hermes profile for Gemma 4 + TurboQuant (Issue #28)
- Add gemma4-turboquant.yaml profile for Hermes
- Configure local llama.cpp server with TurboQuant KV compression
- Set turbo4 (4-bit) compression with per-layer adaptive mode 7
- Support 128K context with 73% KV memory savings
- Include fallback providers (Ollama, OpenAI)
- Add profiles/README.md with setup and usage instructions
- Document performance expectations and troubleshooting

Closes #28
2026-04-09 21:15:57 -04:00