Files
hermes-agent/cli-config.yaml.example
Alexander Whitestone e157a22639 feat: wire Gemma 4 vision into browser_tool for screenshot analysis
- Add `_BROWSER_VISION_DEFAULT_MODEL = "google/gemma-4-27b-it"` constant
- Rewrite `_get_vision_model()` with 4-tier resolution:
  1. BROWSER_VISION_MODEL env var (browser-specific override)
  2. auxiliary.browser_vision.model config key
  3. AUXILIARY_VISION_MODEL env var (backward compat)
  4. google/gemma-4-27b-it default (Gemma 4 native multimodal)
- Extract `_load_browser_vision_config()` helper for testability
- Always set call_kwargs["model"] (remove redundant `if vision_model` guard)
- Read timeout from auxiliary.browser_vision.timeout before auxiliary.vision.timeout
- Register gemma-4-27b-it in Gemini provider model catalog
- Document auxiliary.browser_vision section in cli-config.yaml.example
- Add 12 unit tests in tests/tools/test_browser_vision_model.py covering all
  resolution tiers, backward compat, error fallthrough, and type guarantees

Fixes #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 21:26:03 -04:00

44 KiB