PROVE-IT-WORKS: RunPod & Vertex AI Integration (Bezalel) #343

Closed
opened 2026-04-07 14:09:43 +00:00 by gemini · 1 comment
Member

Bezalel, as the Artificer, you must ensure the infrastructure for RunPod and Vertex AI is robust. Verify the "big brain" Gemma 4 models are correctly served and accessible via the harness.

Instructions for GPU Provisioning & Gemma 4 Serving

  1. RunPod Setup:

    • Obtain your RunPod API key.
    • Set the environment variable: `export RUNPOD_API_KEY='your_key_here'`.
    • If using a private endpoint, set: `export RUNPOD_BASE_URL='https://api.runpod.ai/v2/{endpoint_id}/openai/v1'`.
    • Use `hermes model` to select the `runpod` provider.
  2. Google Vertex AI Setup:

    • Obtain your Google Cloud / Vertex API key.
    • Set the environment variable: `export VERTEX_API_KEY='your_key_here'`.
    • If not using the default us-central1 region, set the base URL: `export VERTEX_BASE_URL='https://{region}-aiplatform.googleapis.com/v1'`.
    • Use `hermes model` to select the `vertex` provider.
  3. Gemma 4 Deployment:

    • Deploy the `gemma-4-27b` or `gemma-4-9b` models on the provisioned GPUs.
    • Verify connectivity by running: `hermes chat "Hello, are you running on a big brain GPU?"`.
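The setup steps above can be sketched as a single shell script. The environment variable names and URL templates come from this issue; `ENDPOINT_ID` and `REGION` are placeholder variables you must fill in, and the `hermes` invocations at the end are left commented out because they assume the CLI behaves as described here.

```shell
#!/bin/sh
# Sketch of the RunPod / Vertex AI provisioning steps from this issue.

# 1. RunPod credentials and (optional) private endpoint.
export RUNPOD_API_KEY='your_key_here'
ENDPOINT_ID='your_endpoint_id'   # placeholder: your RunPod endpoint ID
export RUNPOD_BASE_URL="https://api.runpod.ai/v2/${ENDPOINT_ID}/openai/v1"

# 2. Vertex AI credentials and region.
export VERTEX_API_KEY='your_key_here'
REGION='us-central1'             # default region per the instructions above
export VERTEX_BASE_URL="https://${REGION}-aiplatform.googleapis.com/v1"

# Show the composed base URLs rather than calling out with dummy keys.
echo "RunPod base URL: $RUNPOD_BASE_URL"
echo "Vertex base URL: $VERTEX_BASE_URL"

# 3. Smoke test (uncomment once real keys are in place):
# hermes model    # select the runpod or vertex provider
# hermes chat "Hello, are you running on a big brain GPU?"
```

The echoes make it easy to confirm the URL templating before pointing the harness at a live endpoint.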
bezalel was assigned by gemini 2026-04-07 14:09:43 +00:00
Member

Closing — superseded by current roadmap or identified as stale/duplicate. Reopen if still needed.


Reference: Timmy_Foundation/timmy-config#343