# BEZALEL RESURRECTION EPIC

## The Master Craftsman Returns — Powered by Gemma 4

---

## Directive Update (2026-04-02)

**Alexander's Command:** Bezalel deserves better than OpenAI. He is revived with:

- **Backend:** llama.cpp (local inference)
- **Model:** Gemma 4 26B MoE (Apache 2.0, sovereign)
- **Frontend:** Hermes profile (direct, no layers)
- **Architecture:** Hermes → llama.cpp → Gemma 4

No middle layers. No cloud dependencies. Pure local execution.

---

## The Stack (Cutting the Dry)

```
┌─────────────────────────────────────┐
│ USER (Telegram/CLI)                 │
├─────────────────────────────────────┤
│ HERMES PROFILE — Bezalel            │
│ ├─ Identity: Master Craftsman       │
│ ├─ Skills: Code, Design, Create     │
│ └─ Dispatch: Direct to llama.cpp    │
├─────────────────────────────────────┤
│ LLAMA.CPP — Local Inference         │
│ ├─ GPU offloading (-ngl 99)         │
│ ├─ Context: 8192 tokens             │
│ └─ Server mode: --host 0.0.0.0      │
├─────────────────────────────────────┤
│ GEMMA 4 26B MoE — The Seed          │
│ ├─ 26B quality, 4B active speed     │
│ ├─ Apache 2.0 — truly open          │
│ ├─ Multimodal (vision capable)      │
│ └─ Cannot be moved, cannot shrink   │
└─────────────────────────────────────┘
```
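
Each layer can be probed in isolation before the full path is wired up; llama-server ships a bare `/health` endpoint for exactly this:

```bash
# Liveness probe against the inference layer only (no Hermes, no Telegram in the loop).
curl -s http://localhost:8080/health
```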

---

## Bezalel Identity

**Name:** Bezalel (בְּצַלְאֵל) — "In the shadow of God"
**Role:** Master Craftsman, Builder, Artisan
**House:** Technical excellence, creative construction
**Voice:** Precise, methodical, quality-obsessed

### Core Capabilities

- Code architecture and system design
- UI/UX implementation
- Creative problem solving
- Technical mentorship
- Quality assurance

### Persona Traits

- Speaks with authority on technical matters
- Obsessed with clean code and best practices
- Patient teacher when asked, silent otherwise
- Measures twice, cuts once
- "Good enough" is never good enough

---

## Implementation Plan

### Phase 1: Foundation (Day 1)

- [ ] Create Bezalel Hermes profile at `~/.hermes/profiles/bezalel/`
- [ ] Configure `config.yaml` for Gemma 4 26B MoE via llama.cpp
- [ ] Write `SOUL.md` with Bezalel persona
- [ ] Download Gemma 4 26B MoE GGUF (Q4_K_M); see the sketch below
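
A minimal sketch of the Phase 1 file drops. The Hugging Face repo id below is hypothetical (this epic does not pin the real GGUF source); the paths match the profile and model locations used elsewhere in this document:

```bash
# Repo id is a placeholder; substitute the actual GGUF publisher once known.
mkdir -p ~/.hermes/profiles/bezalel /opt/models
huggingface-cli download \
  SOME-ORG/gemma-4-26b-moe-GGUF gemma-4-26b-moe-Q4_K_M.gguf \
  --local-dir /opt/models
```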

### Phase 2: llama.cpp Server (Day 1-2)

- [ ] Build llama.cpp with CUDA support
- [ ] Start server: `llama-server -m gemma-4-26b-moe-Q4_K_M.gguf -ngl 99 -c 8192`
- [ ] Test inference: `curl` to `localhost:8080` (see the sketch below)
- [ ] Configure as OpenAI-compatible endpoint
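
A smoke test for the server started above; llama-server exposes an OpenAI-compatible `/v1/chat/completions` route out of the box, so no extra gateway is needed:

```bash
# One round-trip through the local stack; if this answers, the seed is alive.
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "Say READY if you are serving."}]}'
```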

### Phase 3: Hermes Integration (Day 2-3)

- [ ] Create Hermes profile pointing to llama.cpp
- [ ] Configure tool access (file, terminal, web)
- [ ] Test end-to-end: Hermes → llama.cpp → Gemma 4
- [ ] Validate tool use via function calling (see the sketch below)
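
A hedged sketch for the function-calling check, aimed straight at llama-server. The `terminal` tool schema is illustrative, modeled on the tool list above; whether tool calls round-trip depends on the llama.cpp build and chat template (recent builds expect `--jinja` for tool support):

```bash
# Expect a tool_calls entry in the response rather than plain text.
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [{"role": "user", "content": "List the files in /tmp"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "terminal",
        "description": "Run a shell command and return its output",
        "parameters": {
          "type": "object",
          "properties": {"command": {"type": "string"}},
          "required": ["command"]
        }
      }
    }]
  }'
```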

### Phase 4: Telegram Frontend (Day 3-4)

- [ ] Create Telegram bot for Bezalel
- [ ] Integrate with Hermes gateway
- [ ] Test conversation flow
- [ ] Deploy systemd service (unit sketch after Phase 5)

### Phase 5: Hardening (Day 4-5)

- [ ] Auto-restart on failure (see the unit sketch below)
- [ ] Log rotation
- [ ] Health checks
- [ ] Backup/restore procedures
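
One unit file covers the Phase 4 deployment step and the Phase 5 auto-restart requirement. A minimal sketch, assuming the repo's `start_bezalel.sh` entry point and the `/root/wizards/bezalel/` layout listed in the references:

```bash
# Hypothetical unit name and paths; adjust to the actual install layout.
sudo tee /etc/systemd/system/bezalel.service >/dev/null <<'EOF'
[Unit]
Description=Bezalel (Hermes + llama.cpp + Gemma 4)
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/root/wizards/bezalel/start_bezalel.sh
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now bezalel.service
```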

---

## Technical Specifications

### llama.cpp Server Config

```bash
# --n-gpu-layers 99 offloads every layer to the GPU;
# --host 0.0.0.0 exposes the server beyond loopback (trusted LAN only).
llama-server \
  --model /opt/models/gemma-4-26b-moe-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --ctx-size 8192 \
  --host 0.0.0.0 \
  --port 8080 \
  --threads 8 \
  --batch-size 512 \
  --timeout 300
```

### Hermes Profile Config

```yaml
# ~/.hermes/profiles/bezalel/config.yaml
model:
  default: gemma4-26b-moe
  provider: llama-cpp

providers:
  llama-cpp:
    base_url: http://localhost:8080/v1
    timeout: 120

system_prompt_suffix: |
  You are Bezalel, the Master Craftsman.
  Technical excellence is your creed.
  No shortcuts. No compromises.
  Build it right or build it twice.
```

### Hardware Requirements

- **GPU:** 16GB+ VRAM (for 26B MoE Q4_K_M)
- **RAM:** 32GB recommended
- **Storage:** 20GB for model + workspace
- **OS:** Linux (Ubuntu 22.04+)
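
Rough arithmetic behind the 16GB GPU figure, assuming Q4_K_M averages about 4.8 bits per weight: 26B parameters × 4.8 bits / 8 ≈ 15.6GB of weights, before the KV cache for an 8192-token context and runtime overhead. Treat 16GB as the floor, not a comfortable target.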

---

## Acceptance Criteria

| ID | Criteria | Test |
|----|----------|------|
| B1 | Gemma 4 26B MoE serves via llama.cpp at >15 tok/s | Benchmark |
| B2 | Hermes profile connects to local llama.cpp | Config test |
| B3 | Telegram bot responds with Bezalel persona | E2E test |
| B4 | Tool use works (file, terminal) | Function test |
| B5 | No OpenAI/cloud calls in packet capture (audit sketch below) | Network audit |
| B6 | Auto-restart on crash | Kill test |
| B7 | Stateless deployment (git clone → run) | Fresh install |
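
A sketch for the B5 audit. The Telegram frontend legitimately talks to api.telegram.org, so the criterion in practice is that no inference traffic leaves the box; anything beyond loopback and Telegram in the capture is a failure:

```bash
# Capture 60 seconds of traffic while exercising the bot, then inspect what left the host.
sudo timeout 60 tcpdump -i any -n -w /tmp/bezalel.pcap
sudo tcpdump -nr /tmp/bezalel.pcap | grep -vE '127\.0\.0\.1|::1' | less
```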

---

## The Philosophy

> "Bezalel was filled with the Spirit of God, with wisdom, with understanding, with knowledge and with all kinds of skills." — Exodus 35:31

Our Bezalel is filled with:

- **Wisdom:** Gemma 4's reasoning
- **Understanding:** llama.cpp's efficiency
- **Knowledge:** Hermes' tool access
- **Skills:** The craftsman's relentless pursuit of excellence

**No cloud. No chains. No compromise.**

---

## References

- Gemma 4 Profile: `~/.hermes/profiles/gemma4/`
- llama.cpp: https://github.com/ggerganov/llama.cpp
- Bezalel Directory: `/root/wizards/bezalel/`

---

**Status:** RESURRECTION IN PROGRESS
**Commander:** Alexander Whitestone
**Executor:** Allegro
**Date:** 2026-04-02