# Big Brain Provider Verification

Repo wiring for the `big_brain` provider used by Mac Hermes.

Issue #543: [PROVE-IT] Timmy: Wire RunPod/Vertex AI Gemma 4 to Mac Hermes
## What this repo now supports

The repo no longer hardcodes a single dead RunPod pod as the source of truth. Instead, it defines a Big Brain provider contract:

- provider name: `Big Brain`
- model: `gemma4:latest`
- endpoint style: OpenAI-compatible, `/v1` by default
- verification path: `scripts/verify_big_brain.py`
Supported deployment shapes:

- RunPod with an Ollama/OpenAI-compatible bridge
  - Example base URL: `https://<pod-id>-11434.proxy.runpod.net/v1`
- Vertex AI through an OpenAI-compatible bridge/proxy
  - Example base URL: `https://<your-bridge-host>/v1`
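Both deployment shapes converge on the same OpenAI-compatible paths once the base URL is known. A minimal sketch of that idea, where `join_endpoint` is a hypothetical helper (not part of the repo's actual code):

```python
def join_endpoint(base_url: str, path: str) -> str:
    """Join a provider base URL and an API path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


# Illustrative base URLs only; real hosts come from config.yaml.
runpod_base = "https://abc123-11434.proxy.runpod.net/v1"
bridge_base = "https://my-vertex-bridge.example.com/v1"

for base in (runpod_base, bridge_base):
    # Either shape ends up hitting the same /v1/chat/completions path.
    print(join_endpoint(base, "/chat/completions"))
```

Because the contract is just "OpenAI-compatible base URL ending in `/v1`", swapping RunPod for a Vertex bridge is a config change, not a code change.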
## Config wiring

`config.yaml` now carries a generic provider block:

```yaml
- name: Big Brain
  base_url: https://YOUR_BIG_BRAIN_HOST/v1
  api_key: ''
  model: gemma4:latest
```
Override at runtime if needed:

- `BIG_BRAIN_BASE_URL`
- `BIG_BRAIN_MODEL`
- `BIG_BRAIN_BACKEND` (`openai` or `ollama`)
- `BIG_BRAIN_API_KEY`
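A sketch of the env-over-config precedence this implies, assuming the provider block has already been loaded into a dict; the `resolved` helper is illustrative, not the repo's actual loader:

```python
import os

# Mirrors the documented config.yaml provider block (illustrative copy).
config = {
    "name": "Big Brain",
    "base_url": "https://YOUR_BIG_BRAIN_HOST/v1",
    "api_key": "",
    "model": "gemma4:latest",
}


def resolved(config: dict) -> dict:
    """Environment variables win over config.yaml values when set."""
    return {
        "base_url": os.environ.get("BIG_BRAIN_BASE_URL", config["base_url"]),
        "model": os.environ.get("BIG_BRAIN_MODEL", config["model"]),
        "backend": os.environ.get("BIG_BRAIN_BACKEND", "openai"),
        "api_key": os.environ.get("BIG_BRAIN_API_KEY", config["api_key"]),
    }


print(resolved(config))
```

With no `BIG_BRAIN_*` variables set, the resolved values simply fall back to the config block.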
## Verification scripts

### 1. `scripts/verify_big_brain.py`

Checks the configured provider using the right protocol for the chosen backend.

For `openai` backends it verifies:

- `GET /models`
- `POST /chat/completions`

For `ollama` backends it verifies:

- `GET /api/tags`
- `POST /api/generate`

Writes `big_brain_verification.json`.
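The backend-to-probe mapping above can be sketched as a small lookup; `probe_plan` and its return shape are hypothetical names for illustration, not the script's real API:

```python
def probe_plan(backend: str) -> list:
    """Return the (method, path) pairs verified for a given backend."""
    plans = {
        # OpenAI-compatible backends: model listing + chat completion.
        "openai": [("GET", "/models"), ("POST", "/chat/completions")],
        # Ollama-native backends: tag listing + raw generation.
        "ollama": [("GET", "/api/tags"), ("POST", "/api/generate")],
    }
    if backend not in plans:
        raise ValueError(f"unknown backend: {backend!r}")
    return plans[backend]


print(probe_plan("openai"))
```

Verifying both a read path (listing) and a write path (generation) matters: a proxy can 200 on listing while generation still 404s, which is exactly the stale-pod failure mode described below.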
### 2. `scripts/big_brain_manager.py`

A more verbose wrapper over the same provider contract.

Writes `pod_verification_results.json`.
## Usage

```bash
python3 scripts/verify_big_brain.py
python3 scripts/big_brain_manager.py
```
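For a sense of what the generation probe could look like on an `openai` backend, here is a minimal request-construction sketch using only the standard library; `build_probe` is a hypothetical helper and does not reflect the verifier's actual internals:

```python
import json
import urllib.request


def build_probe(base_url: str, model: str, api_key: str = "") -> urllib.request.Request:
    """Build the POST /chat/completions request a verifier would send."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 8,
    }).encode()
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers=headers,
        method="POST",
    )


req = build_probe("https://YOUR_BIG_BRAIN_HOST/v1", "gemma4:latest")
print(req.full_url)
```

Sending the request (`urllib.request.urlopen(req, timeout=...)`) only makes sense against a live endpoint, which is exactly the part the repo cannot guarantee on its own.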
## Honest current state

On a fresh `main` before this fix, the repo pointed at a stale RunPod endpoint:

- `https://8lfr3j47a5r3gn-11434.proxy.runpod.net`
- verification returned HTTP 404 for both model listing and generation

That meant the repo claimed Big Brain wiring existed, but the proof path was stale and tied to one specific dead pod.

This fix makes the repo wiring reusable and truthful, but it does not automatically provision a fresh paid GPU.
## Acceptance mapping

What this repo change satisfies:

- Mac Hermes has a `big_brain` provider contract in `config.yaml`
- The verification script checks that provider through the same API shape Hermes needs
- RunPod and Vertex-style wiring are documented without hardcoding a dead pod

What still depends on live infrastructure outside the repo:

- a GPU instance actually provisioned and running
- an endpoint that is responsive right now
- a live `hermes chat --provider big_brain` success against a real endpoint