Files
hermes-agent/plans/gemini-oauth-provider.md
Teknium 26030266d2 docs: Gemini OAuth provider implementation plan (#2467)
* docs: add Gemini OAuth provider implementation plan

Planning doc for a standard-route Gemini provider using Google OAuth
(Authorization Code + PKCE) with the OpenAI-compatible endpoint at
generativelanguage.googleapis.com. Covers OAuth flow, token lifecycle,
file list, and estimated scope (~700 lines).

Replaces the Node.js bridge approach from PR #2042.

* chore: update OpenRouter model list

- Add xiaomi/mimo-v2-pro
- Add nvidia/nemotron-3-super-120b-a12b (paid, higher rate limits)
- Remove openrouter/hunter-alpha and openrouter/healer-alpha (discontinued)
2026-03-22 04:46:05 -07:00

4.2 KiB

Gemini OAuth Provider — Implementation Plan

Goal

Add a first-class gemini provider that authenticates via Google OAuth, using the standard Gemini API (not Cloud Code Assist). Users who have a Google AI subscription or Gemini API access can authenticate through the browser without needing to manually copy API keys.

Architecture Decision

  • Path A (chosen): Standard Gemini API at generativelanguage.googleapis.com/v1beta/openai/
  • NOT Path B: Cloud Code Assist (cloudcode-pa.googleapis.com) — rate-limited free tier, internal API, account ban risk
  • Standard chat_completions api_mode via OpenAI SDK — no new api_mode needed
  • Our own OAuth credentials — NOT sharing tokens with Gemini CLI

OAuth Flow

  • Type: Authorization Code + PKCE (S256) — same pattern as clawdbot/pi-mono
  • Auth URL: https://accounts.google.com/o/oauth2/v2/auth
  • Token URL: https://oauth2.googleapis.com/token
  • Redirect: http://localhost:8085/oauth2callback (localhost callback server)
  • Fallback: Manual URL paste for remote/WSL/headless environments
  • Scopes: https://www.googleapis.com/auth/cloud-platform, https://www.googleapis.com/auth/userinfo.email
  • PKCE: S256 code challenge, 32-byte random verifier

Client ID

  • Need to register a "Desktop app" OAuth client on a Nous Research GCP project
  • Ship client_id + client_secret in code (Google considers installed app secrets non-confidential)
  • Alternatively: accept user-provided client_id via env vars as override

Token Lifecycle

  • Store at ~/.hermes/gemini_oauth.json (NOT sharing with ~/.gemini/oauth_creds.json)
  • Fields: client_id, client_secret, refresh_token, access_token, expires_at, email
  • File permissions: 0o600
  • Before each API call: check expiry, refresh if within 5 min of expiration
  • Refresh: POST to token URL with grant_type=refresh_token
  • File locking for concurrent access (multiple agent sessions)

API Integration

  • Base URL: https://generativelanguage.googleapis.com/v1beta/openai/
  • Auth: Authorization: Bearer <access_token> (passed as api_key to OpenAI SDK)
  • api_mode: chat_completions (standard)
  • Models: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, etc.

Files to Create/Modify

New files

  1. agent/google_oauth.py — OAuth flow (PKCE, localhost server, token exchange, refresh)
    • start_oauth_flow() — opens browser, starts callback server
    • exchange_code() — code → tokens
    • refresh_access_token() — refresh flow
    • load_credentials() / save_credentials() — file I/O with locking
    • get_valid_access_token() — check expiry, refresh if needed
    • ~200 lines

Existing files to modify

  1. hermes_cli/auth.py — Add ProviderConfig for "gemini" with auth_type="oauth_google"
  2. hermes_cli/models.py — Add Gemini model catalog
  3. hermes_cli/runtime_provider.py — Add gemini branch (read OAuth token, build OpenAI client)
  4. hermes_cli/main.py — Add _model_flow_gemini(), add to provider choices
  5. hermes_cli/setup.py — Add gemini auth flow (trigger browser OAuth)
  6. run_agent.py — Token refresh before API calls (like Copilot pattern)
  7. agent/auxiliary_client.py — Add gemini to aux resolution chain
  8. agent/model_metadata.py — Add Gemini model context lengths

Tests

  1. tests/agent/test_google_oauth.py — OAuth flow unit tests
  2. tests/test_api_key_providers.py — Add gemini provider test

Docs

  1. website/docs/getting-started/quickstart.md — Add gemini to provider table
  2. website/docs/user-guide/configuration.md — Gemini setup section
  3. website/docs/reference/environment-variables.md — New env vars

Estimated scope

~400 lines new code, ~150 lines modifications, ~100 lines tests, ~50 lines docs = ~700 lines total

Prerequisites

  • Nous Research GCP project with Desktop OAuth client registered
  • OR: accept user-provided client_id via HERMES_GEMINI_CLIENT_ID env var

Reference implementations

  • clawdbot: extensions/google/oauth.flow.ts (PKCE + localhost server)
  • pi-mono: packages/ai/src/utils/oauth/google-gemini-cli.ts (same flow)
  • hermes-agent Copilot OAuth: hermes_cli/main.py _copilot_device_flow() (different flow type but same lifecycle pattern)