fix: clipboard image paste on WSL2, Wayland, and VSCode terminal

The original implementation only supported xclip (X11), which silently
fails on WSL2 (can't access Windows clipboard for images), Wayland
desktops (xclip is X11-only), and VSCode terminal on WSL2.

Clipboard backend changes (hermes_cli/clipboard.py):
- WSL2: detect via /proc/version, use powershell.exe with .NET
  System.Windows.Forms.Clipboard to extract images as base64 PNG
- Wayland: use wl-paste with MIME type detection, auto-convert BMP
  to PNG for WSLg environments (via Pillow or ImageMagick)
- Dispatch order: WSL → Wayland → X11 (xclip), with fallthrough
- New has_clipboard_image() for lightweight clipboard checks
- Cache WSL detection result per-process
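
The detection and dispatch steps listed above can be sketched roughly as follows. This is an illustrative sketch, not the code in hermes_cli/clipboard.py: the names looks_like_wsl, is_wsl, and first_image are hypothetical, and the "microsoft" marker check is an assumption based on the kernel string that WSL reports in /proc/version.

```python
from functools import lru_cache
from typing import Callable, Iterable, Optional


def looks_like_wsl(proc_version: str) -> bool:
    """Return True when /proc/version text carries the WSL kernel marker.

    Assumption: WSL kernels include "microsoft" (any case) in this string.
    """
    return "microsoft" in proc_version.lower()


@lru_cache(maxsize=1)
def is_wsl() -> bool:
    """Read /proc/version once per process and cache the answer."""
    try:
        with open("/proc/version", "r", encoding="utf-8", errors="replace") as fh:
            return looks_like_wsl(fh.read())
    except OSError:
        # Not Linux, or /proc is unavailable.
        return False


# A backend returns PNG bytes, or None when it cannot provide an image.
Backend = Callable[[], Optional[bytes]]


def first_image(backends: Iterable[Backend]) -> Optional[bytes]:
    """Try each backend in order and fall through when one yields nothing."""
    for backend in backends:
        data = backend()
        if data:
            return data
    return None
```

In this sketch the caller would pass the backends in the order WSL, Wayland, X11, and include the WSL backend only when is_wsl() returns True.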

CLI changes (cli.py):
- /paste command: explicit clipboard image check for terminals where
  BracketedPaste doesn't fire (image-only clipboard in VSCode/WinTerm)
- Ctrl+V keybinding: fallback for Linux terminals where Ctrl+V sends
  a raw byte (0x16) instead of triggering bracketed paste
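
To make the two input paths concrete, here is a minimal sketch of how the bytes differ at the terminal level. The function classify_input and the constant names are hypothetical and not part of cli.py. The escape sequences are the standard bracketed-paste markers, and 0x16 is the control code produced by Ctrl+V.

```python
# Standard bracketed-paste markers that a terminal wraps around pasted text.
BRACKETED_PASTE_START = b"\x1b[200~"
BRACKETED_PASTE_END = b"\x1b[201~"

# Ctrl+<letter> keeps only the low five bits of the letter: 0x56 & 0x1F == 0x16.
CTRL_V = bytes([ord("V") & 0x1F])


def classify_input(data: bytes) -> str:
    """Label raw terminal input as a bracketed paste, a bare Ctrl+V, or other."""
    if data.startswith(BRACKETED_PASTE_START):
        return "bracketed_paste"
    if data == CTRL_V:
        return "ctrl_v"
    return "other"
```

A terminal that handles Ctrl+V itself delivers the first form, which carries only text. A terminal that passes the key through delivers the single byte, so the application has to query the clipboard on its own.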

Tests: 80 tests (up from 37) covering WSL, Wayland, X11 dispatch,
BMP conversion, has_clipboard_image, and /paste command.
teknium1
2026-03-05 20:22:44 -08:00
parent 8253b54be9
commit 2317d115cd
3 changed files with 703 additions and 24 deletions

cli.py

@@ -704,6 +704,7 @@ COMMANDS = {
    "/cron": "Manage scheduled tasks (list, add, remove)",
    "/skills": "Search, install, inspect, or manage skills from online registries",
    "/platforms": "Show gateway/messaging platform status",
    "/paste": "Check clipboard for an image and attach it",
    "/reload-mcp": "Reload MCP servers from config.yaml",
    "/quit": "Exit the CLI (also: /exit, /q)",
}
@@ -1132,6 +1133,23 @@ class HermesCLI:
        self._image_counter -= 1
        return False

    def _handle_paste_command(self):
        """Handle /paste — explicitly check clipboard for an image.

        This is the reliable fallback for terminals where BracketedPaste
        doesn't fire for image-only clipboard content (e.g., VSCode terminal,
        Windows Terminal with WSL2).
        """
        from hermes_cli.clipboard import has_clipboard_image

        if has_clipboard_image():
            if self._try_attach_clipboard_image():
                n = len(self._attached_images)
                _cprint(f" 📎 Image #{n} attached from clipboard")
            else:
                _cprint(f" {_DIM}(>_<) Clipboard has an image but extraction failed{_RST}")
        else:
            _cprint(f" {_DIM}(._.) No image found in clipboard{_RST}")

    def _build_multimodal_content(self, text: str, images: list) -> list:
        """Convert text + image paths into OpenAI vision multimodal content.
@@ -1837,6 +1855,8 @@ class HermesCLI:
            self._manual_compress()
        elif cmd_lower == "/usage":
            self._show_usage()
        elif cmd_lower == "/paste":
            self._handle_paste_command()
        elif cmd_lower == "/reload-mcp":
            self._reload_mcp()
        else:
@@ -2598,13 +2618,32 @@ class HermesCLI:
        @kb.add(Keys.BracketedPaste, eager=True)
        def handle_paste(event):
            """Handle Cmd+V / Ctrl+V paste — detect clipboard images."""
            """Handle terminal paste — detect clipboard images.

            When the terminal supports bracketed paste, Ctrl+V / Cmd+V
            triggers this with the pasted text. We also check the
            clipboard for an image on every paste event.
            """
            pasted_text = event.data or ""
            if self._try_attach_clipboard_image():
                event.app.invalidate()
            if pasted_text:
                event.current_buffer.insert_text(pasted_text)

        @kb.add('c-v')
        def handle_ctrl_v(event):
            """Fallback image paste for terminals without bracketed paste.

            On Linux terminals (GNOME Terminal, Konsole, etc.), Ctrl+V
            sends raw byte 0x16 instead of triggering a paste. This
            binding catches that and checks the clipboard for images.
            On terminals that DO intercept Ctrl+V for paste (macOS
            Terminal, iTerm2, VSCode, Windows Terminal), the bracketed
            paste handler fires instead and this binding never triggers.
            """
            if self._try_attach_clipboard_image():
                event.app.invalidate()

        # Dynamic prompt: shows Hermes symbol when agent is working,
        # or answer prompt when clarify freetext mode is active.
        cli_ref = self