hermes-agent/TODO.md
teknium1 60effcfc44 fix(mcp): parallel discovery, user-visible logging, config validation
- Discovery is now parallel (asyncio.gather) instead of sequential,
  fixing the 60s shared timeout issue with multiple servers
- Startup messages use print() so users see connection status even
  with default log levels (the 'tools' logger is set to ERROR)
- Summary line shows total tools and failed servers count
- Validate conflicting config: warn if both 'url' and 'command' are
  present (HTTP takes precedence)
- Update TODO.md: mark MCP as implemented, list remaining work
- Add test for conflicting config detection (51 tests total)

All 1163 tests pass.
2026-03-02 19:02:28 -08:00


Hermes Agent - Future Improvements


3. Local Browser Control via CDP 🌐

Status: Not started (currently Browserbase cloud only)
Priority: Medium

Support local Chrome/Chromium via the Chrome DevTools Protocol (CDP) alongside the existing Browserbase cloud backend.

What other agents do:

  • OpenClaw: Full CDP-based Chrome control with snapshots, actions, uploads, profiles, file chooser, PDF save, console messages, tab management. Uses local Chrome for persistent login sessions.
  • Cline: Headless browser with Computer Use (click, type, scroll, screenshot, console logs)

Our approach:

  • Add a local backend option to browser_tool.py using Playwright or raw CDP
  • Config toggle: browser.backend: local | browserbase | auto
  • auto mode: try local first, fall back to Browserbase
  • Local advantages: free, persistent login sessions, no API key needed
  • Local disadvantages: no CAPTCHA solving, no stealth mode, requires Chrome installed
  • Reuse the same 10-tool interface -- just swap the backend
  • Later: Chrome profile management for persistent sessions across restarts
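A minimal sketch of how the `browser.backend` toggle could be resolved, assuming "try local first" means "use local Chrome if a binary is found on PATH". The function names, the list of executable names, and the `has_browserbase_key` flag are hypothetical and do not come from browser_tool.py.

```python
import shutil
from typing import Callable, Optional

# Hypothetical executable names to probe; adjust for the target platform.
CHROME_CANDIDATES = ("google-chrome", "chromium", "chromium-browser", "chrome")


def find_local_chrome(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first Chrome/Chromium binary found, if any."""
    for name in CHROME_CANDIDATES:
        path = which(name)
        if path:
            return path
    return None


def resolve_backend(
    setting: str, chrome_path: Optional[str], has_browserbase_key: bool
) -> str:
    """Map the browser.backend setting to a concrete backend name."""
    if setting == "local":
        if not chrome_path:
            raise RuntimeError("browser.backend is 'local' but Chrome was not found")
        return "local"
    if setting == "browserbase":
        if not has_browserbase_key:
            raise RuntimeError("browser.backend is 'browserbase' but no API key is set")
        return "browserbase"
    if setting == "auto":
        # auto: prefer the free local backend, fall back to the cloud one.
        if chrome_path:
            return "local"
        if has_browserbase_key:
            return "browserbase"
        raise RuntimeError("no browser backend available")
    raise ValueError(f"unknown browser.backend value: {setting!r}")
```

A fuller version would probably also fall back when the local browser is installed but fails to launch; this sketch only checks for the binary.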

4. Signal Integration 📡

Status: Not started
Priority: Low

A new platform adapter built on the signal-cli daemon (JSON-RPC over HTTP plus SSE). Requires a Java runtime and phone number registration.

Reference: OpenClaw has Signal support via signal-cli.


5. Plugin/Extension System 🔌

Status: Partially implemented (event hooks exist in gateway/hooks.py)
Priority: Medium

Full Python plugin interface that goes beyond the current hook system.

What other agents do:

  • OpenClaw: Plugin SDK with tool-send capabilities, lifecycle phase hooks (before-agent-start, after-tool-call, model-override), plugin registry with install/uninstall.
  • Pi: Extensions are TypeScript modules that can register tools, commands, keyboard shortcuts, custom UI widgets, overlays, status lines, dialogs, compaction hooks, raw terminal input listeners. Extremely comprehensive.
  • OpenCode: MCP client support (stdio, SSE, StreamableHTTP), OAuth auth for MCP servers. Also has Copilot/Codex plugins.
  • Codex: Full MCP integration with skill dependencies.
  • Cline: MCP integration + lifecycle hooks with cancellation support.

Our approach (phased):

Phase 1: Enhanced hooks

  • Expand the existing gateway/hooks.py to support more events: before-tool-call, after-tool-call, before-response, context-compress, session-end
  • Allow hooks to modify tool results (e.g., filter sensitive output)
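One possible shape for Phase 1, sketched under assumptions: hooks are plain callables keyed by event name, and a hook may return a replacement payload to modify a tool result. `HookRegistry`, `redact_secrets`, and the payload keys are hypothetical names and are not taken from gateway/hooks.py.

```python
import re
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional

Payload = Dict[str, Any]
Hook = Callable[[Payload], Optional[Payload]]


class HookRegistry:
    """Hypothetical registry: hooks run in registration order per event."""

    def __init__(self) -> None:
        self._hooks: Dict[str, List[Hook]] = defaultdict(list)

    def on(self, event: str, hook: Hook) -> None:
        self._hooks[event].append(hook)

    def emit(self, event: str, payload: Payload) -> Payload:
        # A hook that returns a dict replaces the payload for later hooks;
        # a hook that returns None leaves it unchanged.
        for hook in self._hooks.get(event, []):
            result = hook(payload)
            if result is not None:
                payload = result
        return payload


def redact_secrets(payload: Payload) -> Payload:
    """Example after-tool-call hook that masks API_KEY=... values."""
    cleaned = dict(payload)
    cleaned["result"] = re.sub(r"(API_KEY=)\S+", r"\1[REDACTED]", payload["result"])
    return cleaned


registry = HookRegistry()
registry.on("after-tool-call", redact_secrets)
filtered = registry.emit(
    "after-tool-call", {"tool": "terminal", "result": "API_KEY=abc123 ok"}
)
print(filtered["result"])  # API_KEY=[REDACTED] ok
```

Returning a new payload instead of mutating in place keeps each hook's effect easy to trace, at the cost of a copy per hook.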

Phase 2: Plugin interface

  • ~/.hermes/plugins/<name>/plugin.yaml + handler.py
  • Plugins can: register new tools, add CLI commands, subscribe to events, inject system prompt sections
  • hermes plugin list|install|uninstall|create CLI commands
  • Plugin discovery and validation on startup
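Startup discovery for the layout above could look roughly like this. It is a sketch that only checks that each plugin directory contains both required files; parsing plugin.yaml and loading handler.py are left out, and `discover_plugins` and `PluginCandidate` are hypothetical names.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Tuple

REQUIRED_FILES = ("plugin.yaml", "handler.py")


@dataclass
class PluginCandidate:
    name: str
    path: Path


def discover_plugins(plugins_dir: Path) -> Tuple[List[PluginCandidate], List[str]]:
    """Return (valid plugins, human-readable problems) for a plugins directory."""
    valid: List[PluginCandidate] = []
    problems: List[str] = []
    if not plugins_dir.is_dir():
        return valid, problems  # no plugins directory yet: nothing to load
    for entry in sorted(plugins_dir.iterdir()):
        if not entry.is_dir():
            continue
        missing = [f for f in REQUIRED_FILES if not (entry / f).is_file()]
        if missing:
            problems.append(f"{entry.name}: missing {', '.join(missing)}")
        else:
            valid.append(PluginCandidate(name=entry.name, path=entry))
    return valid, problems


if __name__ == "__main__":
    plugins, issues = discover_plugins(Path.home() / ".hermes" / "plugins")
    for plugin in plugins:
        print(f"found plugin: {plugin.name}")
    for issue in issues:
        print(f"skipped {issue}")
```

Reporting problems as a list rather than raising lets one malformed plugin be skipped without blocking the others.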

Phase 3: MCP support (industry standard) -- DONE

  • MCP client that connects to external MCP servers (stdio + HTTP/StreamableHTTP)
  • Config: mcp_servers in config.yaml with connection details
  • Each MCP server's tools auto-registered as a dynamic toolset
  • Future: Resources, Prompts, Progress notifications, hermes mcp CLI command
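The commit note for this file says a server entry that sets both `url` and `command` triggers a warning and that HTTP takes precedence. A minimal sketch of that rule follows, assuming each `mcp_servers` entry is a plain dict; `pick_transport` and the example entries are hypothetical and not the actual implementation.

```python
import warnings
from typing import Any, Dict


def pick_transport(name: str, server: Dict[str, Any]) -> str:
    """Choose 'http' or 'stdio' for one mcp_servers entry."""
    has_url = bool(server.get("url"))
    has_command = bool(server.get("command"))
    if has_url and has_command:
        # Conflicting config: warn, then prefer HTTP.
        warnings.warn(
            f"MCP server {name!r} sets both 'url' and 'command'; using HTTP"
        )
        return "http"
    if has_url:
        return "http"
    if has_command:
        return "stdio"
    raise ValueError(f"MCP server {name!r} needs either 'url' or 'command'")


# Hypothetical entries, as they might look after loading config.yaml.
mcp_servers = {
    "files": {"command": "example-mcp-server", "args": ["--root", "."]},
    "remote": {"url": "https://example.com/mcp"},
}
transports = {name: pick_transport(name, cfg) for name, cfg in mcp_servers.items()}
print(transports)  # {'files': 'stdio', 'remote': 'http'}
```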

6. MCP (Model Context Protocol) Support 🔗 DONE

Status: Implemented (PR #301)
Priority: Complete

Native MCP client support with stdio and HTTP/StreamableHTTP transports, auto-discovery, reconnection with exponential backoff, env var filtering, and credential stripping. See docs/mcp.md for full documentation.
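For readers unfamiliar with the reconnection strategy, here is a generic sketch of exponential backoff. The base delay, growth factor, cap, and attempt count are illustrative assumptions, not the values used by the actual client, and `connect_with_retry` is a hypothetical helper.

```python
import asyncio
from typing import Awaitable, Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(
    base: float = 1.0, factor: float = 2.0, cap: float = 60.0, attempts: int = 5
) -> Iterator[float]:
    """Yield one delay per attempt, growing by `factor` and clamped to `cap`."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


async def connect_with_retry(
    connect: Callable[[], Awaitable[T]],
    attempts: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Call `connect` until it succeeds or `attempts` is exhausted."""
    last_error: Exception = ConnectionError("no attempts made")
    delays = backoff_delays(base=base, cap=cap, attempts=attempts)
    for attempt, delay in enumerate(delays, start=1):
        try:
            return await connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts:  # no point sleeping after the final failure
                await sleep(delay)
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

With `base=1.0`, `factor=2.0`, and `cap=10.0`, six attempts would wait 1, 2, 4, 8, 10, and 10 seconds. Production clients often add random jitter so that several servers do not retry in lockstep.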

Still TODO:

  • hermes mcp CLI subcommand (list/test/status)
  • hermes tools UI integration for MCP toolsets
  • MCP Resources and Prompts support
  • OAuth authentication for remote servers
  • Progress notifications for long-running tools

8. Filesystem Checkpointing / Rollback 🔄

Status: Not started
Priority: Low-Medium

Automatic filesystem snapshots after each agent loop iteration so the user can roll back destructive changes to their project.

What other agents do:

  • Cline: Workspace checkpoints at each step with Compare/Restore UI
  • OpenCode: Git-backed workspace snapshots per step, with weekly gc
  • Codex: Sandboxed execution with commit-per-step, rollback on failure

Our approach:

  • After each tool call (or batch of tool calls in a single turn) that modifies files, create a lightweight checkpoint of the affected files
  • Git-based when the project is a repo: auto-commit to a detached/temporary branch (hermes/checkpoints/<session>) after each agent turn, squash or discard on session end
  • Non-git fallback: tar snapshots of changed files in ~/.hermes/checkpoints/<session_id>/
  • hermes rollback CLI command to restore to a previous checkpoint
  • Agent-accessible via a checkpoint tool: list (show available restore points), restore (roll back to a named point), diff (show what changed since a checkpoint)
  • Configurable: off by default (opt-in via config.yaml), since auto-committing can be surprising
  • Cleanup: checkpoints expire after session ends (or configurable retention period)
  • Integration with the terminal backend: works with local, SSH, and Docker backends (snapshots happen on the execution host)
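The non-git fallback could be sketched as follows, assuming the caller already knows which files a tool call changed. `create_checkpoint` and `restore_checkpoint` are hypothetical helpers; the archive naming and gzip compression are illustrative choices.

```python
import tarfile
from pathlib import Path
from typing import Iterable


def create_checkpoint(
    project_root: Path, changed: Iterable[Path], checkpoint_dir: Path, label: str
) -> Path:
    """Archive the given files (stored relative to project_root)."""
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    archive = checkpoint_dir / f"{label}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for path in changed:
            if path.is_file():
                tar.add(path, arcname=str(path.relative_to(project_root)))
    return archive


def restore_checkpoint(archive: Path, project_root: Path) -> None:
    """Write archived files back under project_root, refusing unsafe paths."""
    root = project_root.resolve()
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            target = (root / member.name).resolve()
            if root not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
            target.parent.mkdir(parents=True, exist_ok=True)
            source = tar.extractfile(member)
            target.write_bytes(source.read())
```

To capture the pre-change state, the snapshot would need to be taken before the tool modifies the files. This sketch also does not delete files that were created after the checkpoint, which a real rollback would need to handle.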

Implementation Priority Order

Tier 1: Next Up

  1. MCP Support -- #6 -- Done (PR #301)

Tier 2: Quality of Life

  1. Local Browser Control via CDP -- #3
  2. Plugin/Extension System -- #5

Tier 3: Nice to Have

  1. Session Branching / Checkpoints -- #7
  2. Filesystem Checkpointing / Rollback -- #8
  3. Signal Integration -- #4