* docs: add Gemini OAuth provider implementation plan Planning doc for a standard-route Gemini provider using Google OAuth (Authorization Code + PKCE) with the OpenAI-compatible endpoint at generativelanguage.googleapis.com. Covers OAuth flow, token lifecycle, file list, and estimated scope (~700 lines). Replaces the Node.js bridge approach from PR #2042. * chore: update OpenRouter model list - Add xiaomi/mimo-v2-pro - Add nvidia/nemotron-3-super-120b-a12b (paid, higher rate limits) - Remove openrouter/hunter-alpha and openrouter/healer-alpha (discontinued)
4.2 KiB
4.2 KiB
Gemini OAuth Provider — Implementation Plan
Goal
Add a first-class gemini provider that authenticates via Google OAuth, using the standard Gemini API (not Cloud Code Assist). Users who have a Google AI subscription or Gemini API access can authenticate through the browser without needing to manually copy API keys.
Architecture Decision
- Path A (chosen): Standard Gemini API at
generativelanguage.googleapis.com/v1beta/openai/ - NOT Path B: Cloud Code Assist (
cloudcode-pa.googleapis.com) — rate-limited free tier, internal API, account ban risk - Standard
chat_completionsapi_mode via OpenAI SDK — no new api_mode needed - Our own OAuth credentials — NOT sharing tokens with Gemini CLI
OAuth Flow
- Type: Authorization Code + PKCE (S256) — same pattern as clawdbot/pi-mono
- Auth URL:
https://accounts.google.com/o/oauth2/v2/auth - Token URL:
https://oauth2.googleapis.com/token - Redirect:
http://localhost:8085/oauth2callback(localhost callback server) - Fallback: Manual URL paste for remote/WSL/headless environments
- Scopes:
https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/userinfo.email - PKCE: S256 code challenge, 32-byte random verifier
Client ID
- Need to register a "Desktop app" OAuth client on a Nous Research GCP project
- Ship client_id + client_secret in code (Google considers installed app secrets non-confidential)
- Alternatively: accept user-provided client_id via env vars as override
Token Lifecycle
- Store at
~/.hermes/gemini_oauth.json(NOT sharing with~/.gemini/oauth_creds.json) - Fields:
client_id,client_secret,refresh_token,access_token,expires_at,email - File permissions: 0o600
- Before each API call: check expiry, refresh if within 5 min of expiration
- Refresh: POST to token URL with
grant_type=refresh_token - File locking for concurrent access (multiple agent sessions)
API Integration
- Base URL:
https://generativelanguage.googleapis.com/v1beta/openai/ - Auth:
Authorization: Bearer <access_token>(passed asapi_keyto OpenAI SDK) - api_mode:
chat_completions(standard) - Models: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, etc.
Files to Create/Modify
New files
agent/google_oauth.py— OAuth flow (PKCE, localhost server, token exchange, refresh)start_oauth_flow()— opens browser, starts callback serverexchange_code()— code → tokensrefresh_access_token()— refresh flowload_credentials()/save_credentials()— file I/O with lockingget_valid_access_token()— check expiry, refresh if needed- ~200 lines
Existing files to modify
hermes_cli/auth.py— Add ProviderConfig for "gemini" with auth_type="oauth_google"hermes_cli/models.py— Add Gemini model cataloghermes_cli/runtime_provider.py— Add gemini branch (read OAuth token, build OpenAI client)hermes_cli/main.py— Add_model_flow_gemini(), add to provider choiceshermes_cli/setup.py— Add gemini auth flow (trigger browser OAuth)run_agent.py— Token refresh before API calls (like Copilot pattern)agent/auxiliary_client.py— Add gemini to aux resolution chainagent/model_metadata.py— Add Gemini model context lengths
Tests
tests/agent/test_google_oauth.py— OAuth flow unit teststests/test_api_key_providers.py— Add gemini provider test
Docs
website/docs/getting-started/quickstart.md— Add gemini to provider tablewebsite/docs/user-guide/configuration.md— Gemini setup sectionwebsite/docs/reference/environment-variables.md— New env vars
Estimated scope
~400 lines new code, ~150 lines modifications, ~100 lines tests, ~50 lines docs = ~700 lines total
Prerequisites
- Nous Research GCP project with Desktop OAuth client registered
- OR: accept user-provided client_id via HERMES_GEMINI_CLIENT_ID env var
Reference implementations
- clawdbot:
extensions/google/oauth.flow.ts(PKCE + localhost server) - pi-mono:
packages/ai/src/utils/oauth/google-gemini-cli.ts(same flow) - hermes-agent Copilot OAuth:
hermes_cli/main.py_copilot_device_flow()(different flow type but same lifecycle pattern)