# Big Brain Provider Verification
Repo wiring for the `big_brain` provider used by Mac Hermes.
## Issue #543
[PROVE-IT] Timmy: Wire RunPod/Vertex AI Gemma 4 to Mac Hermes
## What this repo now supports
The repo no longer hardcodes a single dead RunPod pod as the source of truth.
Instead, it defines a **Big Brain provider contract**:
- provider name: `Big Brain`
- model: `gemma4:latest`
- endpoint style: OpenAI-compatible `/v1` by default
- verification path: `scripts/verify_big_brain.py`
Supported deployment shapes:
1. **RunPod + Ollama/OpenAI-compatible bridge**
- Example base URL: `https://<pod-id>-11434.proxy.runpod.net/v1`
2. **Vertex AI through an OpenAI-compatible bridge/proxy**
- Example base URL: `https://<your-bridge-host>/v1`
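For the RunPod shape, the `/v1` base URL can be derived mechanically from a pod id. A minimal sketch, assuming the standard `<pod-id>-<port>.proxy.runpod.net` proxy pattern shown above and port 11434 (the Ollama default); the function name is illustrative, not the script's actual API:

```python
def runpod_v1_base_url(pod_id: str, port: int = 11434) -> str:
    """Build an OpenAI-compatible /v1 base URL from a RunPod pod id.

    Assumes the standard RunPod HTTP proxy naming scheme; the port is
    whatever the bridge listens on inside the pod (11434 for Ollama).
    """
    return f"https://{pod_id}-{port}.proxy.runpod.net/v1"
```

For a pod id like `8lfr3j47a5r3gn` this reproduces the example base URL shape above.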
## Config wiring
`config.yaml` now carries a generic provider block:
```yaml
- name: Big Brain
base_url: https://YOUR_BIG_BRAIN_HOST/v1
api_key: ''
model: gemma4:latest
```
Override at runtime if needed:
- `BIG_BRAIN_BASE_URL`
- `BIG_BRAIN_MODEL`
- `BIG_BRAIN_BACKEND` (`openai` or `ollama`)
- `BIG_BRAIN_API_KEY`
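The override precedence is environment-first, config-second. A hypothetical resolver sketching that logic (the defaults dict and function name are illustrative, not the actual script's internals):

```python
import os

# Mirrors the provider block in config.yaml; values here are illustrative.
CONFIG_DEFAULTS = {
    "base_url": "https://YOUR_BIG_BRAIN_HOST/v1",
    "model": "gemma4:latest",
    "backend": "openai",
    "api_key": "",
}

def resolve_big_brain(env=None) -> dict:
    """Apply BIG_BRAIN_* environment overrides on top of config defaults."""
    env = os.environ if env is None else env
    return {
        key: env.get(f"BIG_BRAIN_{key.upper()}", default)
        for key, default in CONFIG_DEFAULTS.items()
    }
```

With no `BIG_BRAIN_*` variables set, this simply returns the config values unchanged.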
## Verification scripts
### 1. `scripts/verify_big_brain.py`
Checks the configured provider using the right protocol for the chosen backend.
For `openai` backends it verifies:
- `GET /models`
- `POST /chat/completions`
For `ollama` backends it verifies:
- `GET /api/tags`
- `POST /api/generate`
Writes:
- `big_brain_verification.json`
### 1b. `scripts/timmy_gemma4_mac.py`
Timmy-specific prove-it helper for Mac Hermes. Refs #543.
What it adds beyond the generic verifier:
- targets the root `config.yaml` used by Timmy's Mac Hermes
- reports whether RunPod / Vertex credential files are present without leaking them
- derives a RunPod `/v1` endpoint from a pod id when supplied
- previews the Big Brain provider config update for Timmy
- emits the exact Hermes chat probe command to run once a live endpoint exists
- only spends money if `--apply-runpod` is explicitly passed
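The two per-backend probe sets above can be captured in a small table. This sketch assumes the `openai` base URL already ends in `/v1` while the `ollama` base URL is the host root; the helper name is illustrative:

```python
# (method, path) pairs the verifier hits per backend, per the lists above.
PROBES = {
    "openai": [("GET", "/models"), ("POST", "/chat/completions")],
    "ollama": [("GET", "/api/tags"), ("POST", "/api/generate")],
}

def probe_targets(base_url: str, backend: str) -> list:
    """Expand a base URL into the (method, full URL) pairs to check."""
    base = base_url.rstrip("/")
    return [(method, base + path) for method, path in PROBES[backend]]
```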
### 2. `scripts/big_brain_manager.py`
A more verbose wrapper over the same provider contract.
Writes:
- `pod_verification_results.json`
## Usage
```bash
python3 scripts/verify_big_brain.py
python3 scripts/big_brain_manager.py
```
## Honest current state
On a fresh checkout of `main` before this fix, the repo pointed at a stale RunPod endpoint:
- `https://8lfr3j47a5r3gn-11434.proxy.runpod.net`
- verification returned HTTP 404 for both model listing and generation
That meant the repo claimed Big Brain wiring existed, but the proof path was stale and tied to a specific dead pod.
This fix makes the repo wiring reusable and truthful, but it does **not** provision a fresh paid GPU automatically.
## Acceptance mapping
What this repo change satisfies:
- [x] Mac Hermes has a `big_brain` provider contract in `config.yaml`
- [x] Verification script checks that provider through the same API shape Hermes needs
- [x] RunPod and Vertex-style wiring are documented without hardcoding a dead pod
What still depends on live infrastructure outside the repo:
- [ ] GPU instance actually provisioned and running
- [ ] endpoint responsive right now
- [ ] live `hermes chat --provider big_brain` success against a real endpoint