hermes-agent/cli-config.yaml.example
Alexander Whitestone b6398b8b0d
feat: wire Gemma 4 vision into browser_tool for screenshot analysis
Browser screenshot analysis now defaults to Gemma 4 27B
(google/gemma-4-27b-it) instead of deferring to the auxiliary router's
auto-detection. Gemma 4 is natively multimodal and belongs to the same
model family already in use for text tasks, which avoids cold-start
model-switching overhead and keeps text and vision work on one model.

Resolution order for _get_vision_model():
  1. BROWSER_VISION_MODEL env var (browser-specific override)
  2. auxiliary.browser_vision.model in config.yaml
  3. AUXILIARY_VISION_MODEL env var (shared/legacy override)
  4. google/gemma-4-27b-it (new default)

- Add _BROWSER_VISION_DEFAULT_MODEL constant to browser_tool.py
- Document auxiliary.browser_vision config key in cli-config.yaml.example
- Add 10 unit tests covering all resolution steps

Fixes #816

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 12:49:46 -04:00
