GEMMA 4 DEPLOYMENT - READY TO ACTIVATE
==================================================

MODEL:
Path: /root/wizards/ezra/home/models/gemma4/gemma-4-31B-it-Q4_K_M.gguf
Size: 18.3 GB
Quantization: Q4_K_M (4.77 bits per weight)
Context: 16k tokens (configurable up to 262k)
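
As a rough sanity check on the size and quantization figures above, file size is approximately parameter count times bits per weight, divided by 8. The sketch below assumes a nominal 31 billion parameters (read off the "31B" in the file name, not a confirmed count) and decimal gigabytes; the function name is illustrative only.

```python
def estimate_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate model file size in decimal GB (1 GB = 1e9 bytes)."""
    total_bits = n_params * bits_per_weight
    return total_bits / 8 / 1e9


# Assumption: "31B" in the file name means about 31e9 parameters.
size_gb = estimate_gguf_size_gb(31e9, 4.77)
print(f"{size_gb:.1f} GB")  # prints "18.5 GB"
```

The estimate lands in the same range as the 18.3 GB listed. A small gap is expected, since both the parameter count and the bits-per-weight figure are rounded.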

SERVER:
Port: 11435
URL: http://127.0.0.1:11435
Threads: 4 (CPU-only)
Max tokens: 4096
Tool calling: Enabled (--jinja)
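
The contents of start-gemma4.sh are not shown here, so the following is only a sketch of how the settings above might map onto a llama-server command line. It assumes "16k" means 16384 tokens and that the 4096 max-token limit is applied server-side via --n-predict; the helper name is hypothetical.

```python
import shlex


def build_llama_server_cmd(model_path: str) -> list[str]:
    """Return an argv list for llama-server mirroring the settings above."""
    return [
        "llama-server",
        "-m", model_path,
        "--host", "127.0.0.1",
        "--port", "11435",
        "--ctx-size", "16384",  # assumption: "16k" means 16384 tokens
        "--threads", "4",
        "--n-predict", "4096",  # assumption: max tokens is enforced by the server
        "--jinja",              # chat-template handling used for tool calling
    ]


cmd = build_llama_server_cmd(
    "/root/wizards/ezra/home/models/gemma4/gemma-4-31B-it-Q4_K_M.gguf"
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems if the model path ever contains spaces.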

TO ACTIVATE:
1. Start server: ~/home/start-gemma4.sh
2. Switch config: ~/home/switch-to-gemma4.sh
3. Restart Ezra
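
Between steps 1 and 2 it may be worth confirming that the server is actually answering before switching the config. A minimal sketch, assuming the server exposes llama-server's /health endpoint on the URL listed above; the function name is illustrative.

```python
import urllib.error
import urllib.request


def server_ready(base_url: str = "http://127.0.0.1:11435", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an error status while the model loads.
        return False


if __name__ == "__main__":
    print("ready" if server_ready() else "not ready")
```

A large model can take a while to load on CPU, so a caller would probably want to poll this in a loop with a delay rather than check once.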

TO REVERT:
Restore the config backup created automatically on switch
Or manually edit ~/home/config.yaml
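
For the manual route, taking a separate copy before editing keeps a known-good file to fall back on. This is a sketch only: the timestamped backup name is a hypothetical scheme and is not necessarily what switch-to-gemma4.sh produces.

```python
import shutil
import time
from pathlib import Path


def backup_config(config_path: str) -> Path:
    """Copy the config next to itself with a timestamp suffix; return the copy's path."""
    src = Path(config_path).expanduser()
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}.bak-{stamp}")  # hypothetical naming scheme
    shutil.copy2(src, dst)
    return dst


def restore_config(backup_path: Path, config_path: str) -> None:
    """Overwrite the live config with a previously saved backup."""
    shutil.copy2(backup_path, Path(config_path).expanduser())


# Example usage (paths taken from the notes above):
#   saved = backup_config("~/home/config.yaml")
#   ... edit the config, test, and if needed:
#   restore_config(saved, "~/home/config.yaml")
```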

STATUS:
Model: ✓ Downloaded
Config: ✓ Ready
Server: ⏳ Needs llama-server binary
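
The outstanding item can be checked programmatically. A small sketch, assuming the binary is expected on PATH under the name llama-server (it may equally live at a fixed path that the start script references):

```python
import shutil
from typing import Optional


def find_binary(name: str = "llama-server") -> Optional[str]:
    """Return the full path of an executable on PATH, or None if absent."""
    return shutil.which(name)


path = find_binary()
print(path if path else "llama-server not found on PATH")
```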

NOTE:
llama.cpp added Gemma 4 support in recent commits.
Prebuilt binaries will be available soon.
Or build from: https://github.com/ggerganov/llama.cpp