[gemma-4-multimodal] Vision-Based State Verification for Morrowind Agent #1487

Open
Rockachopa wants to merge 2 commits from gemma4-worker-20260409-105205-1482 into main

Visual State Verification Module for Game Agents

Closes: #1482

Summary

Adds a multimodal visual state verification module that game agents (Morrowind, Minecraft, etc.) can use to confirm environmental state via screenshot analysis.

What's Added

scripts/visual_state_verifier.py

  • VisualStateVerifier class — takes screenshots and expected state conditions
  • verify_state() — analyzes a screenshot against expected conditions using vision AI
  • morrowind_state() — convenience builder for Morrowind-specific state dicts
  • Structured prompt generation for consistent vision model analysis
  • JSON result parsing with confidence scoring
  • Batch verification examples
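The prompt generation and JSON parsing steps listed above might look roughly like the following. This is a minimal sketch, not the module's actual code: `build_prompt`, `parse_analysis`, the `details` field, and the JSON reply schema are all assumptions made for illustration. Only `verified` and `confidence` appear in this PR's usage example.

```python
import json
import re
from dataclasses import dataclass, field


@dataclass
class VerificationResult:
    """Hypothetical result container; only verified/confidence come from the PR."""
    verified: bool
    confidence: float
    details: dict = field(default_factory=dict)


def build_prompt(expected_state: dict, context: str = "", game: str = "") -> str:
    """Render expected conditions into a prompt that asks for a JSON-only reply."""
    lines = [f"You are inspecting a screenshot from the game '{game}'."]
    if context:
        lines.append(f"Context: {context}")
    lines.append("Check each expected condition against the screenshot:")
    # Sorted keys keep the prompt stable across runs for the same state dict.
    for key, value in sorted(expected_state.items()):
        lines.append(f"- {key}: {json.dumps(value)}")
    lines.append(
        'Reply with JSON only: {"verified": true|false, '
        '"confidence": 0.0-1.0, "details": {}}'
    )
    return "\n".join(lines)


def parse_analysis(raw: str) -> VerificationResult:
    """Parse the outermost brace-delimited span of a reply; fail closed on errors."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        return VerificationResult(False, 0.0, {"error": "no JSON found"})
    try:
        data = json.loads(match.group(0))
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError) as exc:  # JSONDecodeError subclasses ValueError
        return VerificationResult(False, 0.0, {"error": str(exc)})
    # Clamp so a misbehaving model cannot report confidence outside [0, 1].
    confidence = min(1.0, max(0.0, confidence))
    return VerificationResult(
        bool(data.get("verified", False)), confidence, data.get("details", {})
    )
```

Failing closed (unverified, zero confidence) on any malformed reply is one possible design choice: it keeps an agent from acting on a state that was never actually confirmed.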

tests/test_visual_state_verifier.py

  • 9 tests covering error handling, state building, prompt generation, and analysis parsing
  • All tests passing

Usage Example

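# Import path assumed from the file location (scripts/visual_state_verifier.py);
# adjust it if the scripts directory is not importable as a package.
from scripts.visual_state_verifier import VisualStateVerifier

verifier = VisualStateVerifier()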
result = verifier.verify_state(
    screenshot_path="/tmp/morrowind_screenshot.png",
    expected_state=VisualStateVerifier.morrowind_state(
        location="Balmora",
        health_min=50,
        has_weapon=True,
        nearby_npcs=["Caius Cosades"]
    ),
    context="After completing the first Caius Cosades quest",
    game="morrowind"
)
print(f"Verified: {result.verified}, Confidence: {result.confidence:.0%}")
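An agent would typically gate its next action on the result rather than just print it. The sketch below shows one way to do that for a single check and for a batch. It assumes only the `verified` and `confidence` attributes used in the example above; the `Result` stand-in, the helper names, and the 0.8 threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Result:
    """Stand-in exposing only the two attributes used in the usage example."""
    verified: bool
    confidence: float


def should_proceed(result: Result, min_confidence: float = 0.8) -> bool:
    """Accept a check only if it passed and the model was confident enough."""
    return result.verified and result.confidence >= min_confidence


def all_checks_pass(results: Iterable[Result], min_confidence: float = 0.8) -> bool:
    """Batch gate: every check must pass; an empty batch counts as a failure."""
    checked = list(results)
    return bool(checked) and all(should_proceed(r, min_confidence) for r in checked)
```

Treating an empty batch as a failure avoids the case where no screenshots were captured and `all()` would otherwise return `True`.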

Integration

The module is designed to work with MCP screenshot tools and vision analysis capabilities. It generates structured prompts that can be passed to any vision backend, then parses the results into actionable verification outcomes.
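One way to keep the verifier independent of any particular vision backend is to treat the backend as a plain callable. The following is a sketch under that assumption: the `VisionBackend` signature, `run_verification`, and `fake_backend` are illustrative names that do not come from this PR, and the canned reply stands in for a real model call.

```python
import json
from typing import Any, Callable

# Assumed backend shape: (prompt, screenshot_path) -> raw text reply.
VisionBackend = Callable[[str, str], str]


def run_verification(
    prompt: str,
    screenshot_path: str,
    backend: VisionBackend,
    parse: Callable[[str], Any] = json.loads,
) -> Any:
    """Send the prompt and screenshot to a backend, then parse its raw reply."""
    raw_reply = backend(prompt, screenshot_path)
    return parse(raw_reply)


def fake_backend(prompt: str, screenshot_path: str) -> str:
    """Test double returning a canned reply instead of calling a model."""
    return '{"verified": true, "confidence": 0.9}'


outcome = run_verification("Is the player in Balmora?", "/tmp/shot.png", fake_backend)
print(outcome["verified"], outcome["confidence"])
```

Swapping `fake_backend` for a function that calls an MCP tool or a hosted model would leave the rest of the flow unchanged, which also makes the parsing path easy to unit test.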

Rockachopa added 2 commits 2026-04-09 14:52:29 +00:00
Multimodal screenshot-based state verification:
- Generic verifier for any game with screenshots
- Morrowind-specific state builder
- Structured prompt generation for vision models
- JSON result parsing with confidence scoring
- Batch verification examples

Relates to #1482
test: add tests for visual state verifier module
Some checks are pending
Tests / lint (pull_request) Waiting to run
Tests / test (pull_request) Blocked by required conditions
e69c6bd90a
Relates to #1482
This pull request can be merged automatically.
This branch is out-of-date with the base branch

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin gemma4-worker-20260409-105205-1482:gemma4-worker-20260409-105205-1482
git checkout gemma4-worker-20260409-105205-1482

Reference: Rockachopa/Timmy-time-dashboard#1487