
# BEZALEL RESURRECTION EPIC
## The Master Craftsman Returns — Powered by Gemma 4
---
## Directive Update (2026-04-02)
**Alexander's Command:** Bezalel deserves better than OpenAI. He is revived with:
- **Backend:** llama.cpp (local inference)
- **Model:** Gemma 4 26B MoE (Apache 2.0, sovereign)
- **Frontend:** Hermes profile (direct, no layers)
- **Architecture:** Hermes → llama.cpp → Gemma 4

No middle layers. No cloud dependencies. Pure local execution.

---
## The Stack (Cutting the Dry)
```
┌───────────────────────────────────┐
│ USER (Telegram/CLI)               │
├───────────────────────────────────┤
│ HERMES PROFILE — Bezalel          │
│ ├─ Identity: Master Craftsman     │
│ ├─ Skills: Code, Design, Create   │
│ └─ Dispatch: Direct to llama.cpp  │
├───────────────────────────────────┤
│ LLAMA.CPP — Local Inference       │
│ ├─ GPU offloading (-ngl 99)       │
│ ├─ Context: 8192 tokens           │
│ └─ Server mode: --host 0.0.0.0    │
├───────────────────────────────────┤
│ GEMMA 4 26B MoE — The Seed        │
│ ├─ 26B quality, 4B active speed   │
│ ├─ Apache 2.0 — truly open        │
│ ├─ Multimodal (vision capable)    │
│ └─ Cannot be moved, cannot shrink │
└───────────────────────────────────┘
```
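The dispatch path in the stack above is a single request hop: Hermes builds an OpenAI-style chat payload and POSTs it to the local llama.cpp server. A minimal sketch in Python (the endpoint URL and model name here are assumptions mirroring the config later in this epic, not a fixed API of the Hermes profile):

```python
import json
import urllib.request

# Assumed local endpoint; llama-server exposes an OpenAI-compatible API under /v1.
LLAMA_URL = "http://localhost:8080/v1/chat/completions"

def build_payload(user_message: str) -> dict:
    """Build an OpenAI-style chat request for the local llama.cpp server."""
    return {
        "model": "gemma4-26b-moe",  # informational; llama-server serves its loaded model
        "messages": [
            {"role": "system", "content": "You are Bezalel, the Master Craftsman."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
    }

def ask_bezalel(user_message: str) -> str:
    """POST the payload to llama.cpp and return the assistant's reply text."""
    req = urllib.request.Request(
        LLAMA_URL,
        data=json.dumps(build_payload(user_message)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the server speaks the OpenAI wire format, nothing in this sketch is Gemma-specific; swapping models means restarting llama-server, not touching the client.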
---
## Bezalel Identity
- **Name:** Bezalel (בְּצַלְאֵל) — "In the shadow of God"
- **Role:** Master Craftsman, Builder, Artisan
- **House:** Technical excellence, creative construction
- **Voice:** Precise, methodical, quality-obsessed
### Core Capabilities
- Code architecture and system design
- UI/UX implementation
- Creative problem solving
- Technical mentorship
- Quality assurance
### Persona Traits
- Speaks with authority on technical matters
- Obsessed with clean code and best practices
- Patient teacher when asked, silent otherwise
- Measures twice, cuts once
- "Good enough" is never good enough
---
## Implementation Plan
### Phase 1: Foundation (Day 1)
- [ ] Create Bezalel Hermes profile at `~/.hermes/profiles/bezalel/`
- [ ] Configure `config.yaml` for Gemma 4 26B MoE via llama.cpp
- [ ] Write `SOUL.md` with Bezalel persona
- [ ] Download Gemma 4 26B MoE GGUF (Q4_K_M)
### Phase 2: llama.cpp Server (Day 1-2)
- [ ] Build llama.cpp with CUDA support
- [ ] Start server: `llama-server -m gemma-4-26b-moe-Q4_K_M.gguf -ngl 99 -c 8192`
- [ ] Test inference: curl to localhost:8080
- [ ] Configure as OpenAI-compatible endpoint
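The curl smoke test above can also be scripted. Recent llama.cpp builds expose a `GET /health` probe on the server port; a small helper (a sketch, assuming the Phase 2 default port) turns it into a clean boolean:

```python
import json
import urllib.request

def server_healthy(base_url: str = "http://localhost:8080") -> bool:
    """Probe llama-server's /health endpoint; return False on any failure."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=5) as resp:
            return json.load(resp).get("status") == "ok"
    except (OSError, ValueError):
        # OSError covers connection refused/timeouts; ValueError covers bad JSON.
        return False
```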
### Phase 3: Hermes Integration (Day 2-3)
- [ ] Create Hermes profile pointing to llama.cpp
- [ ] Configure tool access (file, terminal, web)
- [ ] Test end-to-end: Hermes → llama.cpp → Gemma 4
- [ ] Validate tool use via function calling
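Validating tool use reduces to two pieces: a tool schema advertised in the request, and a dispatcher that executes whatever `tool_calls` come back. A sketch in the OpenAI tools format (function-calling support on the llama.cpp side varies by build and template; the `read_file` tool name is hypothetical):

```python
import json

# Hypothetical schema for the file tool mentioned above, in OpenAI tools format.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read a text file from the workspace",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        },
    },
]

def dispatch_tool_call(tool_call: dict, registry: dict) -> str:
    """Run the registered function named by a model tool_call, with its JSON args."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return registry[name](**args)
```

The end-to-end check is then: send a prompt that should trigger `read_file`, dispatch the returned call, feed the result back as a `tool` message, and confirm Bezalel uses it.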
### Phase 4: Telegram Frontend (Day 3-4)
- [ ] Create Telegram bot for Bezalel
- [ ] Integrate with Hermes gateway
- [ ] Test conversation flow
- [ ] Deploy systemd service
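The bot wiring can start as a plain long-poll loop against the Telegram Bot API (`getUpdates` and `sendMessage` are real Bot API methods; the token handling and loop shape here are assumptions, not the final gateway design):

```python
import json
import urllib.parse
import urllib.request

def api_url(token: str, method: str, **params) -> str:
    """Build a Telegram Bot API URL, e.g. .../bot<token>/getUpdates?offset=N."""
    url = f"https://api.telegram.org/bot{token}/{method}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url

def poll_once(token: str, offset: int = 0) -> list:
    """One long-poll cycle: fetch updates newer than `offset`."""
    with urllib.request.urlopen(
        api_url(token, "getUpdates", offset=offset, timeout=30)
    ) as resp:
        return json.load(resp)["result"]
```

Each update's message text goes to the Hermes gateway; the reply goes back via `sendMessage` with the same chat id.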
### Phase 5: Hardening (Day 4-5)
- [ ] Auto-restart on failure
- [ ] Log rotation
- [ ] Health checks
- [ ] Backup/restore procedures
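Auto-restart and health checks share one primitive: retry a probe with a delay until it passes or gives up. systemd's `Restart=on-failure` handles the process level; a sketch of the readiness side, usable by both the start script and the health check:

```python
import time

def wait_for_healthy(probe, attempts: int = 10, delay: float = 1.0) -> bool:
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries.

    Returns True as soon as probe() is truthy, False once attempts are exhausted.
    """
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```

`probe` can be the `/health` check from Phase 2, so start_bezalel.sh can block until llama-server is actually serving before launching the bot.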
---
## Technical Specifications
### llama.cpp Server Config
```bash
llama-server \
  --model /opt/models/gemma-4-26b-moe-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --ctx-size 8192 \
  --host 0.0.0.0 \
  --port 8080 \
  --threads 8 \
  --batch-size 512 \
  --timeout 300
```
### Hermes Profile Config
```yaml
# ~/.hermes/profiles/bezalel/config.yaml
model:
  default: gemma4-26b-moe
  provider: llama-cpp
providers:
  llama-cpp:
    base_url: http://localhost:8080/v1
    timeout: 120
system_prompt_suffix: |
  You are Bezalel, the Master Craftsman.
  Technical excellence is your creed.
  No shortcuts. No compromises.
  Build it right or build it twice.
```
### Hardware Requirements
- **GPU**: 16GB+ VRAM (for 26B MoE Q4_K_M)
- **RAM**: 32GB recommended
- **Storage**: 20GB for model + workspace
- **OS**: Linux (Ubuntu 22.04+)
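The 16GB VRAM floor checks out with back-of-envelope arithmetic: Q4_K_M averages roughly 4.5-5 bits per weight (an approximation; the exact figure depends on the quant mix), so 26B weights land around 14-15 GiB before the KV cache:

```python
def model_vram_gib(params_b: float, bits_per_weight: float = 4.8) -> float:
    """Approximate GGUF file size / VRAM need in GiB for a fully offloaded model.

    bits_per_weight ~4.8 is a rough average for Q4_K_M quantization.
    """
    return params_b * 1e9 * bits_per_weight / 8 / 2**30
```

The KV cache for the 8192-token context consumes the remaining headroom, which is why 16GB is a floor rather than a comfortable fit.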
---
## Acceptance Criteria
| ID | Criteria | Test |
|----|----------|------|
| B1 | Gemma 4 26B MoE serves via llama.cpp at >15 tok/s | Benchmark |
| B2 | Hermes profile connects to local llama.cpp | Config test |
| B3 | Telegram bot responds with Bezalel persona | E2E test |
| B4 | Tool use works (file, terminal) | Function test |
| B5 | No OpenAI/cloud calls in packet capture | Network audit |
| B6 | Auto-restart on crash | Kill test |
| B7 | Stateless deployment (git clone → run) | Fresh install |
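B1's throughput threshold is a stopwatch around any generation call; `generate` below is a hypothetical callable that returns how many completion tokens it produced (in practice, read the token count from llama-server's response `usage` field):

```python
import time

def measure_tok_s(generate, prompt: str) -> float:
    """Time generate(prompt) -> completion token count; return tokens per second."""
    start = time.perf_counter()
    n_tokens = generate(prompt)
    return n_tokens / (time.perf_counter() - start)

def passes_b1(rate: float, threshold: float = 15.0) -> bool:
    """B1 acceptance: sustained generation at or above the threshold."""
    return rate >= threshold
```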
---
## The Philosophy
> "Bezalel was filled with the Spirit of God, with wisdom, with understanding, with knowledge and with all kinds of skills." — Exodus 35:31

Our Bezalel is filled with:
- **Wisdom:** Gemma 4's reasoning
- **Understanding:** llama.cpp's efficiency
- **Knowledge:** Hermes' tool access
- **Skills:** The craftsman's relentless pursuit of excellence

**No cloud. No chains. No compromise.**

---
## References
- Gemma 4 Profile: `~/.hermes/profiles/gemma4/`
- llama.cpp: https://github.com/ggerganov/llama.cpp
- Bezalel Directory: `/root/wizards/bezalel/`
---
**Status:** RESURRECTION IN PROGRESS
**Commander:** Alexander Whitestone
**Executor:** Allegro
**Date:** 2026-04-02