# Big Brain Provider Verification

Repo wiring for the `big_brain` provider used by Mac Hermes.

## Issue #543 [PROVE-IT] Timmy: Wire RunPod/Vertex AI Gemma 4 to Mac Hermes

## What this repo now supports

The repo no longer hardcodes one dead RunPod pod as the source of truth. Instead, it defines a **Big Brain provider contract**:

- provider name: `Big Brain`
- model: `gemma4:latest`
- endpoint style: OpenAI-compatible `/v1` by default
- verification path: `scripts/verify_big_brain.py`

Supported deployment shapes:

1. **RunPod + Ollama/OpenAI-compatible bridge**
   - Example base URL: `https://YOUR_POD_ID-11434.proxy.runpod.net/v1`
2. **Vertex AI through an OpenAI-compatible bridge/proxy**
   - Example base URL: `https://YOUR_BRIDGE_HOST/v1`

## Config wiring

`config.yaml` now carries a generic provider block:

```yaml
- name: Big Brain
  base_url: https://YOUR_BIG_BRAIN_HOST/v1
  api_key: ''
  model: gemma4:latest
```

Override at runtime if needed:

- `BIG_BRAIN_BASE_URL`
- `BIG_BRAIN_MODEL`
- `BIG_BRAIN_BACKEND` (`openai` or `ollama`)
- `BIG_BRAIN_API_KEY`

## Verification scripts

### 1. `scripts/verify_big_brain.py`

Checks the configured provider using the right protocol for the chosen backend (a minimal sketch of these checks appears at the end of this README).

For `openai` backends it verifies:

- `GET /models`
- `POST /chat/completions`

For `ollama` backends it verifies:

- `GET /api/tags`
- `POST /api/generate`

Writes:

- `big_brain_verification.json`

### 2. `scripts/big_brain_manager.py`

A more verbose wrapper over the same provider contract.

Writes:

- `pod_verification_results.json`

## Usage

```bash
python3 scripts/verify_big_brain.py
python3 scripts/big_brain_manager.py
```

## Honest current state

On fresh main before this fix, the repo pointed at a stale RunPod endpoint:

- `https://8lfr3j47a5r3gn-11434.proxy.runpod.net`
- verification returned HTTP 404 for both model listing and generation

That meant the repo claimed Big Brain wiring existed, but the proof path was stale and tied to one specific dead pod. This fix makes the repo wiring reusable and truthful, but it does **not** provision a fresh paid GPU automatically.

## Acceptance mapping

What this repo change satisfies:

- [x] Mac Hermes has a `big_brain` provider contract in `config.yaml`
- [x] Verification script checks that provider through the same API shape Hermes needs
- [x] RunPod and Vertex-style wiring are documented without hardcoding a dead pod

What still depends on live infrastructure outside the repo:

- [ ] GPU instance actually provisioned and running
- [ ] endpoint responsive right now
- [ ] live `hermes chat --provider big_brain` success against a real endpoint
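
## Appendix: sketch of the backend checks

For reference, the sketch below shows roughly what the two backend check paths amount to. It is an illustration under assumptions, not the repo's actual script: the env variable names, probed endpoints, and output filename follow the contract documented above, but the request payloads, timeouts, result schema, and the `/v1`-stripping for the Ollama path are assumptions of this sketch.

```python
# Illustrative sketch only -- not scripts/verify_big_brain.py itself.
# Assumes the env overrides documented above; payloads and schema are guesses.
import json
import os

import requests

BASE_URL = os.environ.get("BIG_BRAIN_BASE_URL", "https://YOUR_BIG_BRAIN_HOST/v1").rstrip("/")
MODEL = os.environ.get("BIG_BRAIN_MODEL", "gemma4:latest")
BACKEND = os.environ.get("BIG_BRAIN_BACKEND", "openai")
API_KEY = os.environ.get("BIG_BRAIN_API_KEY", "")

HEADERS = {"Authorization": f"Bearer {API_KEY}"} if API_KEY else {}


def check(name, method, url, checks, **kwargs):
    """Record the status code and a success flag for one HTTP probe."""
    try:
        r = requests.request(method, url, headers=HEADERS, timeout=60, **kwargs)
        checks[name] = {"status": r.status_code, "ok": r.ok}
    except requests.RequestException as exc:
        checks[name] = {"ok": False, "error": str(exc)}


results = {"base_url": BASE_URL, "model": MODEL, "backend": BACKEND, "checks": {}}

if BACKEND == "openai":
    # OpenAI-compatible shape: list models, then request one chat completion.
    check("models", "GET", f"{BASE_URL}/models", results["checks"])
    check("chat", "POST", f"{BASE_URL}/chat/completions", results["checks"], json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "Reply with the word: ok"}],
        "max_tokens": 8,
    })
else:
    # Ollama's native API lives at the host root, not under /v1; stripping
    # the /v1 suffix here is an assumption of this sketch.
    root = BASE_URL[:-3] if BASE_URL.endswith("/v1") else BASE_URL
    check("tags", "GET", f"{root}/api/tags", results["checks"])
    check("generate", "POST", f"{root}/api/generate", results["checks"], json={
        "model": MODEL,
        "prompt": "Reply with the word: ok",
        "stream": False,
    })

with open("big_brain_verification.json", "w") as f:
    json.dump(results, f, indent=2)

print(json.dumps(results, indent=2))
```

Run it the same way you would run the real script, e.g. `BIG_BRAIN_BACKEND=ollama python3 sketch.py` (the filename `sketch.py` is hypothetical), then inspect `big_brain_verification.json`.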