# GamePortal Protocol

A thin interface contract for how Timmy perceives and acts in game worlds.
No adapter code. The implementation IS the MCP servers.

## The Contract

Every game portal implements two operations:

```
capture_state() → GameState
execute_action(action) → ActionResult
```

That's it. Everything else is game-specific configuration.

## capture_state()

Returns a snapshot of what Timmy can see and know right now.

**Composed from MCP tool calls:**

| Data | MCP Server | Tool Call |
|------|------------|-----------|
| Screenshot of game window | desktop-control | `take_screenshot("game_window.png")` |
| Screen dimensions | desktop-control | `get_screen_size()` |
| Mouse position | desktop-control | `get_mouse_position()` |
| Pixel at coordinate | desktop-control | `pixel_color(x, y)` |
| Current OS | desktop-control | `get_os()` |
| Recently played games | steam-info | `steam-recently-played(user_id)` |
| Game achievements | steam-info | `steam-player-achievements(user_id, app_id)` |
| Game stats | steam-info | `steam-user-stats(user_id, app_id)` |
| Live player count | steam-info | `steam-current-players(app_id)` |
| Game news | steam-info | `steam-news(app_id)` |

**GameState schema:**

```json
{
  "portal_id": "bannerlord",
  "timestamp": "2026-03-25T19:30:00Z",
  "visual": {
    "screenshot_path": "/tmp/capture_001.png",
    "screen_size": [2560, 1440],
    "mouse_position": [800, 600]
  },
  "game_context": {
    "app_id": 261550,
    "playtime_hours": 142,
    "achievements_unlocked": 23,
    "achievements_total": 96,
    "current_players_online": 8421
  }
}
```

The heartbeat loop constructs `GameState` by calling the relevant MCP tools and assembling the results. No intermediate format or adapter is needed: the MCP responses ARE the state.

## execute_action(action)

Sends an input to the game through the desktop.

**Composed from MCP tool calls:**

| Action | MCP Server | Tool Call |
|--------|------------|-----------|
| Click at position | desktop-control | `click(x, y)` |
| Right-click | desktop-control | `right_click(x, y)` |
| Double-click | desktop-control | `double_click(x, y)` |
| Move mouse | desktop-control | `move_to(x, y)` |
| Drag | desktop-control | `drag_to(x, y, duration)` |
| Type text | desktop-control | `type_text("text")` |
| Press key | desktop-control | `press_key("space")` |
| Key combo | desktop-control | `hotkey("ctrl shift s")` |
| Scroll | desktop-control | `scroll(amount)` |

**ActionResult schema:**

```json
{
  "success": true,
  "action": "press_key",
  "params": {"key": "space"},
  "timestamp": "2026-03-25T19:30:01Z"
}
```

Actions are direct MCP calls. The model decides what to do; the heartbeat loop translates tool_calls into MCP `tools/call` requests.

## Adding a New Portal

A portal is a game configuration. To add one:

1. **Add entry to `portals.json`:**

   ```json
   {
     "id": "new-game",
     "name": "New Game",
     "description": "What this portal is.",
     "status": "offline",
     "app_id": 12345,
     "window_title": "New Game Window Title",
     "destination": {
       "type": "harness",
       "params": { "world": "new-world" }
     }
   }
   ```

2. **No code changes.** The heartbeat loop reads `portals.json`, uses `app_id` for Steam API calls and `window_title` for screenshot targeting. The MCP tools are game-agnostic.

3. **Game-specific prompts** go in `training/data/prompts_*.yaml` to teach the model what the game looks like and how to play it.

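As a rough illustration of step 2, the lookup the heartbeat loop performs on `portals.json` might look like the sketch below. It assumes the file holds a top-level JSON array of entries shaped like the example above; that layout is not confirmed by this document. `PortalConfig`, `parse_portals`, and `load_portal` are hypothetical names introduced only for this sketch.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PortalConfig:
    """The fields the heartbeat loop needs from one portals.json entry."""
    portal_id: str
    app_id: int          # used for Steam API calls
    window_title: str    # used for screenshot targeting
    status: str


def parse_portals(raw: str) -> dict[str, PortalConfig]:
    """Index portal entries by id. ASSUMPTION: top-level JSON array."""
    portals: dict[str, PortalConfig] = {}
    for entry in json.loads(raw):
        portals[entry["id"]] = PortalConfig(
            portal_id=entry["id"],
            app_id=int(entry["app_id"]),
            window_title=entry["window_title"],
            status=entry.get("status", "offline"),
        )
    return portals


def load_portal(path: Path, portal_id: str) -> PortalConfig:
    """Read portals.json and return one entry; raises KeyError if absent."""
    return parse_portals(path.read_text(encoding="utf-8"))[portal_id]
```

Because the entry carries only data, adding a game means adding a record; nothing in the lookup is game-specific.
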
## Portal: Bannerlord (Primary)

- **Steam App ID:** `261550`
- **Window title:** `Mount & Blade II: Bannerlord`
- **Mod required:** BannerlordTogether (multiplayer, ticket #549)

**capture_state additions:**

- Screenshot shows campaign map or battle view
- Steam stats include: battles won, settlements owned, troops recruited
- Achievement data shows campaign progress

**Key actions:**

- Campaign map: click settlements, right-click to move army
- Battle: click units to select, right-click to command
- Menus: press keys for inventory (I), character (C), party (P)
- Save/load: `hotkey("ctrl s")`, `hotkey("ctrl l")`

**Training data needed:**

- Screenshots of campaign map with annotations
- Screenshots of battle view with unit positions
- Decision examples: "I see my army near Vlandia. I should move toward the objective."

## Portal: Morrowind (Secondary)

- **Steam App ID:** `22320` (The Elder Scrolls III: Morrowind GOTY)
- **Window title:** `OpenMW` (if using OpenMW) or `Morrowind`
- **Multiplayer:** TES3MP (OpenMW fork with multiplayer)

**capture_state additions:**

- Screenshot shows first-person exploration or dialogue
- Stats include: playtime, achievements (limited on Steam for old games)
- OpenMW may expose additional data through log files

**Key actions:**

- Movement: WASD + mouse look
- Interact: click / press space on objects and NPCs
- Combat: click to attack, right-click to block
- Inventory: press Tab
- Journal: press J
- Rest: press T

**Training data needed:**

- Screenshots of Vvardenfell landscapes, towns, interiors
- Dialogue trees with NPC responses
- Navigation examples: "I see Balmora ahead. I should follow the road north."

## What This Protocol Does NOT Do

- **No game memory extraction.** We read what's on screen, not in RAM.
- **No mod APIs.** We click and type, like a human at a keyboard.
- **No custom adapters per game.** Same MCP tools for every game.
- **No network protocol.** Local desktop control only.

The model learns to play by looking at screenshots and pressing keys. The same way a human learns. The protocol is just "look" and "act."

## Mapping to the Three Pillars

| Pillar | How GamePortal serves it |
|--------|--------------------------|
| **Heartbeat** | capture_state feeds the perception step. execute_action IS the action step. |
| **Harness** | The DPO model is trained on (screenshot, decision, action) trajectories from portal play. |
| **Portal Interface** | This protocol IS the portal interface. |
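
The action step in the Heartbeat row can be sketched as a single function that wraps one model tool call in an MCP `tools/call` request and records an `ActionResult`. This is a minimal sketch, not the project's implementation: `send` is a hypothetical transport callable standing in for whatever MCP client the heartbeat loop uses, and the argument names passed to each tool (for example `{"key": "space"}`) are taken from the `ActionResult` example rather than from the desktop-control server's actual schema.

```python
from datetime import datetime, timezone
from typing import Any, Callable

# HYPOTHETICAL transport: takes a JSON-RPC request, returns the decoded response.
Transport = Callable[[dict[str, Any]], dict[str, Any]]


def tools_call_request(request_id: int, name: str,
                       arguments: dict[str, Any]) -> dict[str, Any]:
    """Build an MCP `tools/call` request in JSON-RPC 2.0 form."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def execute_action(send: Transport, action: str, params: dict[str, Any],
                   request_id: int = 1) -> dict[str, Any]:
    """Send one action through MCP and return an ActionResult-shaped dict."""
    response = send(tools_call_request(request_id, action, params))
    # Protocol failures arrive as a top-level "error"; tool failures set
    # `isError` inside the result.
    ok = ("error" not in response
          and not response.get("result", {}).get("isError", False))
    return {
        "success": ok,
        "action": action,
        "params": params,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Keeping the transport as a parameter means the same function can be exercised with a stub in tests and with a real MCP client at runtime.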