From c3bdc54161b0a9f8c5e8c2c00c34fa227629b3f9 Mon Sep 17 00:00:00 2001
From: Perplexity Computer
Date: Wed, 25 Mar 2026 23:38:06 +0000
Subject: [PATCH] Add GamePortal Protocol spec (#553)

---
 GAMEPORTAL_PROTOCOL.md | 183 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 183 insertions(+)
 create mode 100644 GAMEPORTAL_PROTOCOL.md

diff --git a/GAMEPORTAL_PROTOCOL.md b/GAMEPORTAL_PROTOCOL.md
new file mode 100644
index 0000000..87f2589
--- /dev/null
+++ b/GAMEPORTAL_PROTOCOL.md
@@ -0,0 +1,183 @@
+# GamePortal Protocol
+
+A thin interface contract for how Timmy perceives and acts in game worlds.
+No adapter code. The implementation IS the MCP servers.
+
+## The Contract
+
+Every game portal implements two operations:
+
+```
+capture_state() → GameState
+execute_action(action) → ActionResult
+```
+
+That's it. Everything else is game-specific configuration.
+
+## capture_state()
+
+Returns a snapshot of what Timmy can see and know right now.
+
+**Composed from MCP tool calls:**
+
+| Data | MCP Server | Tool Call |
+|------|------------|-----------|
+| Screenshot of game window | desktop-control | `take_screenshot("game_window.png")` |
+| Screen dimensions | desktop-control | `get_screen_size()` |
+| Mouse position | desktop-control | `get_mouse_position()` |
+| Pixel at coordinate | desktop-control | `pixel_color(x, y)` |
+| Current OS | desktop-control | `get_os()` |
+| Recently played games | steam-info | `steam-recently-played(user_id)` |
+| Game achievements | steam-info | `steam-player-achievements(user_id, app_id)` |
+| Game stats | steam-info | `steam-user-stats(user_id, app_id)` |
+| Live player count | steam-info | `steam-current-players(app_id)` |
+| Game news | steam-info | `steam-news(app_id)` |
+
+**GameState schema:**
+
+```json
+{
+  "portal_id": "bannerlord",
+  "timestamp": "2026-03-25T19:30:00Z",
+  "visual": {
+    "screenshot_path": "/tmp/capture_001.png",
+    "screen_size": [2560, 1440],
+    "mouse_position": [800, 600]
+  },
+  "game_context": {
+    "app_id": 261550,
+    "playtime_hours": 142,
+    "achievements_unlocked": 23,
+    "achievements_total": 96,
+    "current_players_online": 8421
+  }
+}
+```
+
+The heartbeat loop constructs `GameState` by calling the relevant MCP tools
+and assembling the results. No intermediate format or adapter is needed —
+the MCP responses ARE the state.
+
+## execute_action(action)
+
+Sends an input to the game through the desktop.
+
+**Composed from MCP tool calls:**
+
+| Action | MCP Server | Tool Call |
+|--------|------------|-----------|
+| Click at position | desktop-control | `click(x, y)` |
+| Right-click | desktop-control | `right_click(x, y)` |
+| Double-click | desktop-control | `double_click(x, y)` |
+| Move mouse | desktop-control | `move_to(x, y)` |
+| Drag | desktop-control | `drag_to(x, y, duration)` |
+| Type text | desktop-control | `type_text("text")` |
+| Press key | desktop-control | `press_key("space")` |
+| Key combo | desktop-control | `hotkey("ctrl shift s")` |
+| Scroll | desktop-control | `scroll(amount)` |
+
+**ActionResult schema:**
+
+```json
+{
+  "success": true,
+  "action": "press_key",
+  "params": {"key": "space"},
+  "timestamp": "2026-03-25T19:30:01Z"
+}
+```
+
+Actions are direct MCP calls. The model decides what to do;
+the heartbeat loop translates tool_calls into MCP `tools/call` requests.
+
+## Adding a New Portal
+
+A portal is a game configuration. To add one:
+
+1. **Add entry to `portals.json`:**
+
+```json
+{
+  "id": "new-game",
+  "name": "New Game",
+  "description": "What this portal is.",
+  "status": "offline",
+  "app_id": 12345,
+  "window_title": "New Game Window Title",
+  "destination": {
+    "type": "harness",
+    "params": { "world": "new-world" }
+  }
+}
+```
+
+2. **No code changes.** The heartbeat loop reads `portals.json`,
+   uses `app_id` for Steam API calls and `window_title` for
+   screenshot targeting. The MCP tools are game-agnostic.
+
+3. **Game-specific prompts** go in `training/data/prompts_*.yaml`
+   to teach the model what the game looks like and how to play it.
+
+## Portal: Bannerlord (Primary)
+
+**Steam App ID:** `261550`
+**Window title:** `Mount & Blade II: Bannerlord`
+**Mod required:** BannerlordTogether (multiplayer, ticket #549)
+
+**capture_state additions:**
+- Screenshot shows campaign map or battle view
+- Steam stats include: battles won, settlements owned, troops recruited
+- Achievement data shows campaign progress
+
+**Key actions:**
+- Campaign map: click settlements, right-click to move army
+- Battle: click units to select, right-click to command
+- Menus: press keys for inventory (I), character (C), party (P)
+- Save/load: hotkey("ctrl s"), hotkey("ctrl l")
+
+**Training data needed:**
+- Screenshots of campaign map with annotations
+- Screenshots of battle view with unit positions
+- Decision examples: "I see my army near Vlandia. I should move toward the objective."
+
+## Portal: Morrowind (Secondary)
+
+**Steam App ID:** `22320` (The Elder Scrolls III: Morrowind GOTY)
+**Window title:** `OpenMW` (if using OpenMW) or `Morrowind`
+**Multiplayer:** TES3MP (OpenMW fork with multiplayer)
+
+**capture_state additions:**
+- Screenshot shows first-person exploration or dialogue
+- Stats include: playtime, achievements (limited on Steam for old games)
+- OpenMW may expose additional data through log files
+
+**Key actions:**
+- Movement: WASD + mouse look
+- Interact: click / press space on objects and NPCs
+- Combat: click to attack, right-click to block
+- Inventory: press Tab
+- Journal: press J
+- Rest: press T
+
+**Training data needed:**
+- Screenshots of Vvardenfell landscapes, towns, interiors
+- Dialogue trees with NPC responses
+- Navigation examples: "I see Balmora ahead. I should follow the road north."
+
+## What This Protocol Does NOT Do
+
+- **No game memory extraction.** We read what's on screen, not in RAM.
+- **No mod APIs.** We click and type, like a human at a keyboard.
+- **No custom adapters per game.** Same MCP tools for every game.
+- **No network protocol.** Local desktop control only.
+
+The model learns to play by looking at screenshots and pressing keys.
+The same way a human learns. The protocol is just "look" and "act."
+
+## Mapping to the Three Pillars
+
+| Pillar | How GamePortal serves it |
+|--------|--------------------------|
+| **Heartbeat** | capture_state feeds the perception step. execute_action IS the action step. |
+| **Harness** | The DPO model is trained on (screenshot, decision, action) trajectories from portal play. |
+| **Portal Interface** | This protocol IS the portal interface. |
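
Reviewer sketch (not part of the patch): the two operations in this spec could be composed roughly as below. This is a minimal illustration, assuming a hypothetical `call_tool(server, tool, arguments)` callable standing in for a real MCP client. The argument names (`filename`, `app_id`), the `/tmp` screenshot path, and the `fake_call_tool` stub are assumptions made for the example; only the server names, tool names, and result fields come from the tables and schemas in the spec.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict

# Hypothetical stand-in for an MCP client: (server, tool, arguments) -> result.
ToolCaller = Callable[[str, str, Dict[str, Any]], Any]


def _now() -> str:
    """UTC timestamp in the same shape as the schema examples."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def capture_state(call_tool: ToolCaller, portal: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble a GameState dict from individual MCP tool results."""
    shot = f"/tmp/{portal['id']}_capture.png"  # assumed path, for illustration only
    call_tool("desktop-control", "take_screenshot", {"filename": shot})
    return {
        "portal_id": portal["id"],
        "timestamp": _now(),
        "visual": {
            "screenshot_path": shot,
            "screen_size": call_tool("desktop-control", "get_screen_size", {}),
            "mouse_position": call_tool("desktop-control", "get_mouse_position", {}),
        },
        "game_context": {
            "app_id": portal["app_id"],
            "current_players_online": call_tool(
                "steam-info", "steam-current-players", {"app_id": portal["app_id"]}
            ),
        },
    }


def execute_action(
    call_tool: ToolCaller, action: str, params: Dict[str, Any]
) -> Dict[str, Any]:
    """Forward one action to desktop-control and wrap it as an ActionResult."""
    try:
        call_tool("desktop-control", action, params)
        success = True
    except Exception:
        # Any failure raised by the tool call is reported, not propagated.
        success = False
    return {"success": success, "action": action, "params": params, "timestamp": _now()}


def fake_call_tool(server: str, tool: str, arguments: Dict[str, Any]) -> Any:
    """Canned responses so the sketch runs without any MCP server."""
    canned = {
        "get_screen_size": [2560, 1440],
        "get_mouse_position": [800, 600],
        "steam-current-players": 8421,
    }
    return canned.get(tool)


if __name__ == "__main__":
    portal = {"id": "bannerlord", "app_id": 261550}
    print(capture_state(fake_call_tool, portal))
    print(execute_action(fake_call_tool, "press_key", {"key": "space"}))
```

Passing the tool caller in as a parameter keeps both functions game-agnostic, in line with the spec's "no custom adapters per game" rule: swapping the stub for a real client should not require touching either function.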