[SPECTRUM] GCP Vertex AI MaaS — Gemma 4 Serverless Deployment Path #2
GCP Vertex AI MaaS — Gemma 4 Deployment Guide
Source: Google AI conversation re: serverless Gemma 4 endpoints for Hermes agent integration
Triage Date: 2025-04-03
Priority: HIGH — Blocked deployment path for SPECTRUM initiative
Overview
Gemma 4 is now available via Vertex AI Model-as-a-Service (MaaS) with serverless, pay-as-you-go billing that draws directly from GCP promotional/commitment balance. This provides a zero-cold-start alternative to local llama-server deployment.
1. Enable the Gemma 4 MaaS Endpoint
2. Hermes Agent Configuration (OpenAI-Compatible)
Base URL: https://{REGION}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT_ID}/locations/{REGION}/publishers/google/models
Region: us-central1 (recommended)
Model: gemma-4-e4b-it (or 9B/27B variant)
3. Authentication Options
Option A: Temporary CLI Token (Testing)
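For a quick test, the base URL template from the configuration section can be combined with a short-lived token from the local gcloud login. The sketch below is illustrative only: the helper names (build_base_url, fetch_cli_token, build_headers) are hypothetical, and it assumes the gcloud CLI is installed and already authenticated. Tokens printed this way typically expire after about an hour, which is why this option suits testing rather than production.

```python
import subprocess

# Template copied from the configuration section of this issue.
BASE_URL_TEMPLATE = (
    "https://{region}-aiplatform.googleapis.com/v1beta1/"
    "projects/{project_id}/locations/{region}/publishers/google/models"
)


def build_base_url(project_id: str, region: str = "us-central1") -> str:
    """Fill the base URL template with a project ID and region."""
    return BASE_URL_TEMPLATE.format(project_id=project_id, region=region)


def fetch_cli_token() -> str:
    """Return a short-lived access token from the local gcloud login.

    Assumes `gcloud` is on PATH and `gcloud auth login` has been run.
    Raises CalledProcessError if the command fails.
    """
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def build_headers(token: str) -> dict:
    """Build the HTTP headers for a bearer-token request."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

In use, something like build_headers(fetch_cli_token()) would be passed to whatever HTTP client the agent uses, with build_base_url("your-project-id") as the base URL.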
Option B: Service Account (Production)
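Since this path is blocked on a missing service account key, a small preflight check may be useful once a key is issued. The sketch below is an assumption-laden illustration, not the method from the source conversation: check_key_file and fetch_service_account_token are hypothetical helper names, and the token step assumes the google-auth and requests packages are installed. The service account would also need roles/aiplatform.user, as listed under Blockers / Dependencies.

```python
import json
from pathlib import Path

# Broad OAuth scope commonly used for Google Cloud API access.
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"

# Fields present in a standard service account key JSON file.
REQUIRED_KEY_FIELDS = ("type", "client_email", "private_key")


def check_key_file(path: str) -> str:
    """Check that a file looks like a service account key.

    Returns the key's client_email; raises ValueError otherwise.
    """
    data = json.loads(Path(path).read_text())
    missing = [field for field in REQUIRED_KEY_FIELDS if field not in data]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if data["type"] != "service_account":
        raise ValueError(f"unexpected credential type: {data['type']!r}")
    return data["client_email"]


def fetch_service_account_token(path: str) -> str:
    """Exchange a service account key for an access token."""
    # Imported lazily so check_key_file works without google-auth installed.
    from google.auth.transport.requests import Request
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file(
        path, scopes=[CLOUD_PLATFORM_SCOPE]
    )
    credentials.refresh(Request())  # performs the network token exchange
    return credentials.token
```

Running check_key_file first keeps a malformed or wrong-type credential file from surfacing later as an opaque authentication error.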
http://localhost:4000
Why This Path for SPECTRUM
Blockers / Dependencies
roles/aiplatform.user
Related
/mnt/gemma4/gemma-4-31B-it-Q4_K_M.gguf
Next Step: Evaluate LiteLLM proxy integration vs native GCP auth in Hermes agent.
Burn-down: GCP Vertex BLOCKED (no SA key). Local llama-server is the active path. Closing as DEFERRED.