# BEZALEL RESURRECTION EPIC

The Master Craftsman Returns — Powered by Gemma 4

## Directive Update (2026-04-02)
Alexander's Command: Bezalel deserves better than OpenAI. He is revived with:
- Backend: llama.cpp (local inference)
- Model: Gemma 4 26B MoE (Apache 2.0, sovereign)
- Frontend: Hermes profile (direct, no layers)
- Architecture: Hermes → llama.cpp → Gemma 4
No middle layers. No cloud dependencies. Pure local execution.
## The Stack (Cutting the Fat)
```
┌─────────────────────────────────────┐
│  USER (Telegram/CLI)                │
├─────────────────────────────────────┤
│  HERMES PROFILE — Bezalel           │
│  ├─ Identity: Master Craftsman      │
│  ├─ Skills: Code, Design, Create    │
│  └─ Dispatch: Direct to llama.cpp   │
├─────────────────────────────────────┤
│  LLAMA.CPP — Local Inference        │
│  ├─ GPU offloading (-ngl 99)        │
│  ├─ Context: 8192 tokens            │
│  └─ Server mode: --host 0.0.0.0     │
├─────────────────────────────────────┤
│  GEMMA 4 26B MoE — The Seed         │
│  ├─ 26B quality, 4B active speed    │
│  ├─ Apache 2.0 — truly open         │
│  ├─ Multimodal (vision capable)     │
│  └─ Cannot be moved, cannot shrink  │
└─────────────────────────────────────┘
```
## Bezalel Identity

- Name: Bezalel (בְּצַלְאֵל) — "In the shadow of God"
- Role: Master Craftsman, Builder, Artisan
- House: Technical excellence, creative construction
- Voice: Precise, methodical, quality-obsessed
### Core Capabilities
- Code architecture and system design
- UI/UX implementation
- Creative problem solving
- Technical mentorship
- Quality assurance
### Persona Traits
- Speaks with authority on technical matters
- Obsessed with clean code and best practices
- Patient teacher when asked, silent otherwise
- Measures twice, cuts once
- "Good enough" is never good enough
## Implementation Plan

### Phase 1: Foundation (Day 1)
- Create Bezalel Hermes profile at `~/.hermes/profiles/bezalel/`
- Configure `config.yaml` for Gemma 4 26B MoE via llama.cpp
- Write `SOUL.md` with Bezalel persona
- Download Gemma 4 26B MoE GGUF (Q4_K_M)
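The Phase 1 scaffold can be sketched in a few lines of shell; the `SOUL.md` content here is a placeholder stub, since the real persona file is written by hand:

```shell
# Create the profile directory and drop in a stub SOUL.md (placeholder text).
PROFILE_DIR="$HOME/.hermes/profiles/bezalel"
mkdir -p "$PROFILE_DIR"
cat > "$PROFILE_DIR/SOUL.md" <<'EOF'
# Bezalel — Master Craftsman
Precise, methodical, quality-obsessed. Measures twice, cuts once.
EOF
ls "$PROFILE_DIR"
```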
### Phase 2: llama.cpp Server (Day 1-2)
- Build llama.cpp with CUDA support
- Start server: `llama-server -m gemma-4-26b-moe-Q4_K_M.gguf -ngl 99 -c 8192`
- Test inference: `curl` to localhost:8080
- Configure as OpenAI-compatible endpoint
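The Phase 2 inference test can be sketched as below; the guard keeps it from hanging when the server is down. The model name in the payload is an assumption and must match whatever llama-server actually loaded:

```shell
# Smoke test against llama-server's OpenAI-compatible chat endpoint.
ENDPOINT="http://localhost:8080"
PAYLOAD='{"model":"gemma4-26b-moe","messages":[{"role":"user","content":"Introduce yourself in one sentence."}],"max_tokens":64}'
# Only fire the request when the server answers its health check.
if curl -sf --max-time 2 "$ENDPOINT/health" >/dev/null 2>&1; then
  curl -s "$ENDPOINT/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
else
  echo "llama-server not reachable at $ENDPOINT"
fi
```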
### Phase 3: Hermes Integration (Day 2-3)
- Create Hermes profile pointing to llama.cpp
- Configure tool access (file, terminal, web)
- Test end-to-end: Hermes → llama.cpp → Gemma 4
- Validate tool use via function calling
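The function-calling check boils down to sending an OpenAI-style `tools` array. The sketch below only validates the shape of such a payload; the `read_file` tool and its schema are illustrative, not Hermes' real tool definitions:

```shell
# Pipe a hypothetical tools payload through Python's JSON parser to confirm it is well-formed.
cat <<'EOF' | python3 -m json.tool >/dev/null && echo "tools payload parses"
{
  "model": "gemma4-26b-moe",
  "messages": [{"role": "user", "content": "Read ./EPIC.md and summarize it."}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "read_file",
      "description": "Read a file from the workspace",
      "parameters": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"]
      }
    }
  }]
}
EOF
```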
### Phase 4: Telegram Frontend (Day 3-4)
- Create Telegram bot for Bezalel
- Integrate with Hermes gateway
- Test conversation flow
- Deploy systemd service
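A minimal sketch of the raw Telegram transport, assuming a bot token in a `TELEGRAM_BOT_TOKEN` environment variable; the production bot routes messages through the Hermes gateway rather than polling directly like this:

```shell
# Long-poll the Telegram Bot API for updates; do nothing without a token.
if [ -z "${TELEGRAM_BOT_TOKEN:-}" ]; then
  echo "TELEGRAM_BOT_TOKEN not set; skipping live poll"
else
  curl -s "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getUpdates?timeout=30"
fi
```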
### Phase 5: Hardening (Day 4-5)
- Auto-restart on failure
- Log rotation
- Health checks
- Backup/restore procedures
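The auto-restart item maps naturally onto a systemd unit. A minimal sketch; the unit name `bezalel-llama.service` and the `/usr/local/bin/llama-server` binary path are placeholders to adapt:

```ini
# /etc/systemd/system/bezalel-llama.service  (hypothetical name and paths)
[Unit]
Description=Bezalel llama.cpp server
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/llama-server --model /opt/models/gemma-4-26b-moe-Q4_K_M.gguf --n-gpu-layers 99 --ctx-size 8192 --host 0.0.0.0 --port 8080
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

`Restart=on-failure` covers the kill test (B6); journald picks up the logs, so log rotation rides on the system's existing journal limits.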
## Technical Specifications

### llama.cpp Server Config

```bash
llama-server \
  --model /opt/models/gemma-4-26b-moe-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --ctx-size 8192 \
  --host 0.0.0.0 \
  --port 8080 \
  --threads 8 \
  --batch-size 512 \
  --timeout 300
```
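Loading a 26B model takes a while after launch, so a start script should wait for readiness before dispatching traffic. A sketch, assuming llama-server's `/health` route on port 8080 (a real script would use far more than 3 tries):

```shell
# Poll the health endpoint; report success or give up after a fixed number of tries.
wait_for_llama() {
  endpoint="$1"; tries="$2"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -sf --max-time 2 "$endpoint/health" >/dev/null 2>&1; then
      echo "llama-server is up at $endpoint"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "llama-server not reachable at $endpoint after $tries tries"
  return 1
}

wait_for_llama "http://localhost:8080" 3 || true
```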
### Hermes Profile Config

```yaml
# ~/.hermes/profiles/bezalel/config.yaml
model:
  default: gemma4-26b-moe
  provider: llama-cpp
providers:
  llama-cpp:
    base_url: http://localhost:8080/v1
    timeout: 120
system_prompt_suffix: |
  You are Bezalel, the Master Craftsman.
  Technical excellence is your creed.
  No shortcuts. No compromises.
  Build it right or build it twice.
```
### Hardware Requirements
- GPU: 16GB+ VRAM (for 26B MoE Q4_K_M)
- RAM: 32GB recommended
- Storage: 20GB for model + workspace
- OS: Linux (Ubuntu 22.04+)
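The 16GB VRAM figure follows from a back-of-the-envelope estimate, assuming Q4_K_M averages roughly 4.8 bits per weight; KV cache and activations come on top of the weights:

```shell
# Approximate weight footprint of a 26B-parameter model at ~4.8 bits/weight.
awk 'BEGIN { printf "%.1f GB\n", 26e9 * 4.8 / 8 / 1e9 }'
```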
## Acceptance Criteria
| ID | Criteria | Test |
|---|---|---|
| B1 | Gemma 4 26B MoE serves via llama.cpp at >15 tok/s | Benchmark |
| B2 | Hermes profile connects to local llama.cpp | Config test |
| B3 | Telegram bot responds with Bezalel persona | E2E test |
| B4 | Tool use works (file, terminal) | Function test |
| B5 | No OpenAI/cloud calls in packet capture | Network audit |
| B6 | Auto-restart on crash | Kill test |
| B7 | Stateless deployment (git clone → run) | Fresh install |
## The Philosophy

> "Bezalel was filled with the Spirit of God, with wisdom, with understanding, with knowledge and with all kinds of skills." — Exodus 35:31
Our Bezalel is filled with:
- Wisdom: Gemma 4's reasoning
- Understanding: llama.cpp's efficiency
- Knowledge: Hermes' tool access
- Skills: The craftsman's relentless pursuit of excellence
No cloud. No chains. No compromise.
## References

- Gemma 4 Profile: `~/.hermes/profiles/gemma4/`
- llama.cpp: https://github.com/ggerganov/llama.cpp
- Bezalel Directory: `/root/wizards/bezalel/`
Status: RESURRECTION IN PROGRESS
Commander: Alexander Whitestone
Executor: Allegro
Date: 2026-04-02