the-nexus/GAMEPORTAL_PROTOCOL.md
Alexander Whitestone 9f636d9677
feat: expand portal registry schema
2026-03-28 13:00:58 -04:00


GamePortal Protocol

A thin interface contract for how Timmy perceives and acts in game worlds. No adapter code. The implementation IS the MCP servers.

The Contract

Every game portal implements two operations:

capture_state() → GameState
execute_action(action) → ActionResult

That's it. Everything else is game-specific configuration.
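As a reading aid only, the two operations can be written down as Python type annotations. This is a sketch, not adapter code: the field names follow the GameState and ActionResult schemas in this document, while the class names `GamePortal` and `StubPortal` and the shape of the `action` argument are assumptions made for illustration.

```python
from typing import Any, Protocol, TypedDict


class GameState(TypedDict):
    portal_id: str
    timestamp: str
    visual: dict[str, Any]
    game_context: dict[str, Any]


class ActionResult(TypedDict):
    success: bool
    action: str
    params: dict[str, Any]
    timestamp: str


class GamePortal(Protocol):
    """The whole contract: one call to look, one call to act."""

    def capture_state(self) -> GameState: ...

    def execute_action(self, action: dict[str, Any]) -> ActionResult: ...


class StubPortal:
    """Hypothetical in-memory stand-in that only shows the shape of the calls."""

    def capture_state(self) -> GameState:
        return {
            "portal_id": "stub",
            "timestamp": "2026-03-25T19:30:00Z",
            "visual": {},
            "game_context": {},
        }

    def execute_action(self, action: dict[str, Any]) -> ActionResult:
        # Assumed action shape: {"action": <tool name>, "params": {...}}.
        return {
            "success": True,
            "action": action["action"],
            "params": action["params"],
            "timestamp": "2026-03-25T19:30:01Z",
        }
```

In practice nothing needs to subclass or register against this type; it only documents what the heartbeat loop can expect back.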

capture_state()

Returns a snapshot of what Timmy can see and know right now.

Composed from MCP tool calls:

| Data | MCP Server | Tool Call |
| --- | --- | --- |
| Screenshot of game window | desktop-control | take_screenshot("game_window.png") |
| Screen dimensions | desktop-control | get_screen_size() |
| Mouse position | desktop-control | get_mouse_position() |
| Pixel at coordinate | desktop-control | pixel_color(x, y) |
| Current OS | desktop-control | get_os() |
| Recently played games | steam-info | steam-recently-played(user_id) |
| Game achievements | steam-info | steam-player-achievements(user_id, app_id) |
| Game stats | steam-info | steam-user-stats(user_id, app_id) |
| Live player count | steam-info | steam-current-players(app_id) |
| Game news | steam-info | steam-news(app_id) |

GameState schema:

{
  "portal_id": "bannerlord",
  "timestamp": "2026-03-25T19:30:00Z",
  "visual": {
    "screenshot_path": "/tmp/capture_001.png",
    "screen_size": [2560, 1440],
    "mouse_position": [800, 600]
  },
  "game_context": {
    "app_id": 261550,
    "playtime_hours": 142,
    "achievements_unlocked": 23,
    "achievements_total": 96,
    "current_players_online": 8421
  }
}

The heartbeat loop constructs GameState by calling the relevant MCP tools and assembling the results. No intermediate format or adapter is needed — the MCP responses ARE the state.
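A minimal sketch of that assembly step might look like the following. The `call_tool` callable is hypothetical: it stands in for whatever MCP client the heartbeat loop uses. The argument key names (`path`, `app_id`) and the return shapes (a two-element list for sizes and positions, an integer for player count) are assumptions, since the tool schemas are not given here.

```python
from datetime import datetime, timezone
from typing import Any, Callable

# Hypothetical client hook: call_tool(server, tool, arguments) -> decoded result.
ToolCaller = Callable[[str, str, dict[str, Any]], Any]


def capture_state(
    call_tool: ToolCaller, portal_id: str, app_id: int, screenshot_path: str
) -> dict[str, Any]:
    """Assemble a GameState dict from individual MCP tool results."""
    # Argument names below are assumed, not taken from the servers' schemas.
    call_tool("desktop-control", "take_screenshot", {"path": screenshot_path})
    screen_size = call_tool("desktop-control", "get_screen_size", {})
    mouse_position = call_tool("desktop-control", "get_mouse_position", {})
    players = call_tool("steam-info", "steam-current-players", {"app_id": app_id})

    return {
        "portal_id": portal_id,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "visual": {
            "screenshot_path": screenshot_path,
            "screen_size": screen_size,
            "mouse_position": mouse_position,
        },
        "game_context": {
            "app_id": app_id,
            "current_players_online": players,
        },
    }
```

The function adds no format of its own: each value is passed through as the tool returned it, and only the grouping into `visual` and `game_context` comes from the schema.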

execute_action(action)

Sends an input to the game through the desktop.

Composed from MCP tool calls:

| Action | MCP Server | Tool Call |
| --- | --- | --- |
| Click at position | desktop-control | click(x, y) |
| Right-click | desktop-control | right_click(x, y) |
| Double-click | desktop-control | double_click(x, y) |
| Move mouse | desktop-control | move_to(x, y) |
| Drag | desktop-control | drag_to(x, y, duration) |
| Type text | desktop-control | type_text("text") |
| Press key | desktop-control | press_key("space") |
| Key combo | desktop-control | hotkey("ctrl shift s") |
| Scroll | desktop-control | scroll(amount) |

ActionResult schema:

{
  "success": true,
  "action": "press_key",
  "params": {"key": "space"},
  "timestamp": "2026-03-25T19:30:01Z"
}

Actions are direct MCP calls. The model decides what to do; the heartbeat loop translates tool_calls into MCP tools/call requests.
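The translation step can be sketched as follows. MCP requests are JSON-RPC 2.0 messages, and a tool invocation uses the `tools/call` method with `name` and `arguments` parameters; a tool result may carry an `isError` flag. The helper names `build_tools_call` and `to_action_result` are hypothetical, and how the request is actually transported to the server is left out.

```python
import itertools
from datetime import datetime, timezone
from typing import Any

_request_ids = itertools.count(1)  # JSON-RPC ids only need to be unique per session


def build_tools_call(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Wrap a model tool call as an MCP tools/call JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def to_action_result(
    name: str, arguments: dict[str, Any], response: dict[str, Any]
) -> dict[str, Any]:
    """Fold an MCP response into the ActionResult schema."""
    # A JSON-RPC "error" member or a tool-level isError flag both count as failure.
    failed = "error" in response or response.get("result", {}).get("isError", False)
    return {
        "success": not failed,
        "action": name,
        "params": arguments,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

For example, `build_tools_call("press_key", {"key": "space"})` yields a request whose `params` are `{"name": "press_key", "arguments": {"key": "space"}}`, matching the ActionResult example above.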

Adding a New Portal

A portal is a game configuration. To add one:

  1. Add an entry to portals.json:
{
  "id": "new-game",
  "name": "New Game",
  "description": "What this portal is.",
  "status": "offline",
  "portal_type": "game-world",
  "world_category": "rpg",
  "environment": "staging",
  "access_mode": "operator",
  "readiness_state": "prototype",
  "telemetry_source": "hermes-harness:new-game-bridge",
  "owner": "Timmy",
  "app_id": 12345,
  "window_title": "New Game Window Title",
  "destination": {
    "type": "harness",
    "action_label": "Enter New Game",
    "params": { "world": "new-world" }
  }
}

Required metadata fields:

  • portal_type: high-level kind (game-world, operator-room, research-space, experiment)
  • world_category: subtype for navigation and grouping (rpg, workspace, sim, etc.)
  • environment: production, staging, or local
  • access_mode: public, operator, or local-only
  • readiness_state: playable, active, prototype, rebuilding, blocked, or offline
  • telemetry_source: where truth/status comes from
  • owner: who currently owns the world or integration lane
  • destination.action_label: human-facing action text for UI cards/directories

  2. No mandatory game-specific code changes. The heartbeat loop reads portals.json, uses metadata for grouping/status/visibility, and can still use fields like app_id and window_title for screenshot targeting where relevant. The MCP tools remain game-agnostic.

  3. Game-specific prompts go in training/data/prompts_*.yaml to teach the model what the game looks like and how to play it.

  4. Migration from legacy portal definitions:

  • old portal entries with only id, name, description, status, and destination should be upgraded in place
  • preserve visual fields like color, position, and rotation
  • add the new metadata fields so the same registry can drive future atlas, status wall, preview cards, and many-portal navigation without inventing parallel registries
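The required fields and the in-place migration could be checked with a small script along these lines. It is a sketch under stated assumptions: the function names `validate_portal` and `upgrade_legacy` are hypothetical, the allowed values are copied from the field list above, and the migration takes its new metadata from the caller rather than guessing defaults.

```python
from typing import Any

REQUIRED_FIELDS = (
    "portal_type", "world_category", "environment", "access_mode",
    "readiness_state", "telemetry_source", "owner",
)

# Enumerated values as listed in this document; world_category is open-ended.
ALLOWED_VALUES = {
    "portal_type": {"game-world", "operator-room", "research-space", "experiment"},
    "environment": {"production", "staging", "local"},
    "access_mode": {"public", "operator", "local-only"},
    "readiness_state": {
        "playable", "active", "prototype", "rebuilding", "blocked", "offline",
    },
}


def validate_portal(entry: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the entry passes."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if name not in entry]
    for name, allowed in ALLOWED_VALUES.items():
        if name in entry and entry[name] not in allowed:
            problems.append(f"invalid {name}: {entry[name]!r}")
    if "action_label" not in entry.get("destination", {}):
        problems.append("missing destination.action_label")
    return problems


def upgrade_legacy(
    entry: dict[str, Any], metadata: dict[str, Any], action_label: str
) -> dict[str, Any]:
    """Add new metadata to a legacy entry without touching existing fields."""
    upgraded = dict(entry)  # keeps color, position, rotation, and anything else
    for name, value in metadata.items():
        upgraded.setdefault(name, value)
    destination = dict(entry.get("destination", {}))
    destination.setdefault("action_label", action_label)
    upgraded["destination"] = destination
    return upgraded
```

Using `setdefault` means a value already present in the legacy entry always wins, so running the upgrade twice leaves the entry unchanged.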

Portal: Bannerlord (Primary)

Steam App ID: 261550
Window title: Mount & Blade II: Bannerlord
Mod required: BannerlordTogether (multiplayer, ticket #549)

capture_state additions:

  • Screenshot shows campaign map or battle view
  • Steam stats include: battles won, settlements owned, troops recruited
  • Achievement data shows campaign progress

Key actions:

  • Campaign map: click settlements, right-click to move army
  • Battle: click units to select, right-click to command
  • Menus: press keys for inventory (I), character (C), party (P)
  • Save/load: hotkey("ctrl s"), hotkey("ctrl l")

Training data needed:

  • Screenshots of campaign map with annotations
  • Screenshots of battle view with unit positions
  • Decision examples: "I see my army near Vlandia. I should move toward the objective."

Portal: Morrowind (Secondary)

Steam App ID: 22320 (The Elder Scrolls III: Morrowind GOTY)
Window title: OpenMW (if using OpenMW) or Morrowind
Multiplayer: TES3MP (OpenMW fork with multiplayer)

capture_state additions:

  • Screenshot shows first-person exploration or dialogue
  • Stats include: playtime, achievements (limited on Steam for old games)
  • OpenMW may expose additional data through log files

Key actions:

  • Movement: WASD + mouse look
  • Interact: click / press space on objects and NPCs
  • Combat: click to attack, right-click to block
  • Inventory: press Tab
  • Journal: press J
  • Rest: press T

Training data needed:

  • Screenshots of Vvardenfell landscapes, towns, interiors
  • Dialogue trees with NPC responses
  • Navigation examples: "I see Balmora ahead. I should follow the road north."

What This Protocol Does NOT Do

  • No game memory extraction. We read what's on screen, not in RAM.
  • No mod APIs. We click and type, like a human at a keyboard.
  • No custom adapters per game. Same MCP tools for every game.
  • No network protocol. Local desktop control only.

The model learns to play by looking at screenshots and pressing keys. The same way a human learns. The protocol is just "look" and "act."

Mapping to the Three Pillars

| Pillar | How GamePortal serves it |
| --- | --- |
| Heartbeat | capture_state feeds the perception step. execute_action IS the action step. |
| Harness | The DPO model is trained on (screenshot, decision, action) trajectories from portal play. |
| Portal Interface | This protocol IS the portal interface. |
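One way to picture a (screenshot, decision, action) trajectory step is as a small record written one per line to a JSONL file. The record layout, the `TrajectoryStep` name, and the choice of JSONL are all assumptions for illustration; the actual training data format is not specified in this document.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class TrajectoryStep:
    """Hypothetical record for one perceive-decide-act step."""

    screenshot_path: str      # what the model saw (from capture_state)
    decision: str             # the model's stated reasoning
    action: dict[str, Any]    # what it did (same shape as ActionResult's action/params)


def to_jsonl(steps: list[TrajectoryStep]) -> str:
    """Serialize steps as one JSON object per line."""
    return "\n".join(json.dumps(asdict(step), sort_keys=True) for step in steps)


def from_jsonl(text: str) -> list[TrajectoryStep]:
    """Parse the output of to_jsonl back into records."""
    return [TrajectoryStep(**json.loads(line)) for line in text.splitlines() if line]
```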