- Discovery is now parallel (asyncio.gather) instead of sequential, fixing the 60s shared timeout issue with multiple servers
- Startup messages use print() so users see connection status even with default log levels (the 'tools' logger is set to ERROR)
- Summary line shows total tools and failed servers count
- Validate conflicting config: warn if both 'url' and 'command' are present (HTTP takes precedence)
- Update TODO.md: mark MCP as implemented, list remaining work
- Add test for conflicting config detection (51 tests total)
All 1163 tests pass.
Hermes Agent - Future Improvements
3. Local Browser Control via CDP 🌐
Status: Not started (currently Browserbase cloud only)
Priority: Medium
Support local Chrome/Chromium via Chrome DevTools Protocol alongside existing Browserbase cloud backend.
What other agents do:
- OpenClaw: Full CDP-based Chrome control with snapshots, actions, uploads, profiles, file chooser, PDF save, console messages, tab management. Uses local Chrome for persistent login sessions.
- Cline: Headless browser with Computer Use (click, type, scroll, screenshot, console logs)
Our approach:
- Add a `local` backend option to `browser_tool.py` using Playwright or raw CDP
- Config toggle: `browser.backend: local | browserbase | auto`
- `auto` mode: try local first, fall back to Browserbase
- Local advantages: free, persistent login sessions, no API key needed
- Local disadvantages: no CAPTCHA solving, no stealth mode, requires Chrome installed
- Reuse the same 10-tool interface -- just swap the backend
- Later: Chrome profile management for persistent sessions across restarts
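The `auto` fallback described above could be resolved along these lines. This is a minimal sketch, not the contents of `browser_tool.py`: the function names, the list of candidate executable names, and the choice to detect Chrome via `PATH` lookup are all assumptions made for illustration.

```python
import shutil
from typing import Callable, Optional

# Hypothetical candidate executable names; not taken from browser_tool.py.
CHROME_CANDIDATES = ("google-chrome", "chromium", "chromium-browser", "chrome")

Which = Callable[[str], Optional[str]]


def find_local_chrome(which: Which = shutil.which) -> Optional[str]:
    """Return the path of the first Chrome/Chromium binary found on PATH, if any."""
    for name in CHROME_CANDIDATES:
        path = which(name)
        if path:
            return path
    return None


def resolve_backend(setting: str, which: Which = shutil.which) -> str:
    """Map the browser.backend setting to a concrete backend name."""
    if setting not in ("local", "browserbase", "auto"):
        raise ValueError(f"unknown browser.backend value: {setting!r}")
    if setting == "browserbase":
        return "browserbase"
    chrome = find_local_chrome(which)
    if setting == "local":
        if chrome is None:
            raise RuntimeError("browser.backend is 'local' but no Chrome/Chromium was found")
        return "local"
    # 'auto': prefer the free local browser, fall back to the cloud backend.
    return "local" if chrome else "browserbase"
```

Passing `which` as a parameter keeps the lookup testable without a real browser installed. A real implementation would probably also need to confirm that the browser launches, not only that a binary exists.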
4. Signal Integration 📡
Status: Not started
Priority: Low
New platform adapter using signal-cli daemon (JSON-RPC HTTP + SSE). Requires Java runtime and phone number registration.
Reference: OpenClaw has Signal support via signal-cli.
5. Plugin/Extension System 🔌
Status: Partially implemented (event hooks exist in gateway/hooks.py)
Priority: Medium
Full Python plugin interface that goes beyond the current hook system.
What other agents do:
- OpenClaw: Plugin SDK with tool-send capabilities, lifecycle phase hooks (before-agent-start, after-tool-call, model-override), plugin registry with install/uninstall.
- Pi: Extensions are TypeScript modules that can register tools, commands, keyboard shortcuts, custom UI widgets, overlays, status lines, dialogs, compaction hooks, raw terminal input listeners. Extremely comprehensive.
- OpenCode: MCP client support (stdio, SSE, StreamableHTTP), OAuth auth for MCP servers. Also has Copilot/Codex plugins.
- Codex: Full MCP integration with skill dependencies.
- Cline: MCP integration + lifecycle hooks with cancellation support.
Our approach (phased):
Phase 1: Enhanced hooks
- Expand the existing `gateway/hooks.py` to support more events: `before-tool-call`, `after-tool-call`, `before-response`, `context-compress`, `session-end`
- Allow hooks to modify tool results (e.g., filter sensitive output)
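One way hooks could modify tool results is to let each hook return a replacement payload. The sketch below is illustrative only: `HookRegistry`, its `on`/`emit` methods, the payload shape, and the `sk-` secret pattern are hypothetical and do not describe the existing `gateway/hooks.py` API.

```python
import re
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional

Payload = Dict[str, Any]
Hook = Callable[[Payload], Optional[Payload]]


class HookRegistry:
    """Hypothetical event registry; hooks run in registration order."""

    def __init__(self) -> None:
        self._hooks: Dict[str, List[Hook]] = defaultdict(list)

    def on(self, event: str, hook: Hook) -> None:
        self._hooks[event].append(hook)

    def emit(self, event: str, payload: Payload) -> Payload:
        # A hook may return a new payload to replace the current one,
        # or None to leave it unchanged.
        for hook in self._hooks.get(event, []):
            result = hook(payload)
            if result is not None:
                payload = result
        return payload


def redact_secrets(payload: Payload) -> Payload:
    """Example after-tool-call hook: mask strings that look like API keys."""
    cleaned = re.sub(r"sk-[A-Za-z0-9]+", "[REDACTED]", payload["result"])
    return {**payload, "result": cleaned}


registry = HookRegistry()
registry.on("after-tool-call", redact_secrets)
filtered = registry.emit(
    "after-tool-call", {"tool": "terminal", "result": "key=sk-abc123 done"}
)
```

Returning a new dict instead of mutating in place means a failing hook cannot leave the payload half-modified for the hooks that follow it.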
Phase 2: Plugin interface
- `~/.hermes/plugins/<name>/plugin.yaml` + `handler.py`
- Plugins can: register new tools, add CLI commands, subscribe to events, inject system prompt sections
- `hermes plugin list|install|uninstall|create` CLI commands
- Plugin discovery and validation on startup
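Startup discovery for this layout could begin with a structural check before any plugin code is loaded. The sketch below assumes only the directory layout named above. The `discover_plugins` function and its return shape are hypothetical, and it deliberately does not parse `plugin.yaml`, since the manifest schema is not defined here.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# Files every plugin directory must contain, per the layout above.
REQUIRED_FILES = ("plugin.yaml", "handler.py")


def discover_plugins(root: Path) -> Tuple[Dict[str, Path], Dict[str, List[str]]]:
    """Scan root for plugin directories.

    Returns (valid, invalid): valid maps plugin name to its directory,
    invalid maps plugin name to the list of required files it is missing.
    """
    valid: Dict[str, Path] = {}
    invalid: Dict[str, List[str]] = {}
    if not root.is_dir():
        return valid, invalid
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            continue  # stray files in the plugins directory are ignored
        missing = [name for name in REQUIRED_FILES if not (entry / name).is_file()]
        if missing:
            invalid[entry.name] = missing
        else:
            valid[entry.name] = entry
    return valid, invalid
```

Reporting invalid plugins separately, instead of raising on the first one, lets startup continue with the plugins that are well-formed.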
Phase 3: MCP support (industry standard) ✅ DONE
- ✅ MCP client that connects to external MCP servers (stdio + HTTP/StreamableHTTP)
- ✅ Config: `mcp_servers` in config.yaml with connection details
- ✅ Each MCP server's tools auto-registered as a dynamic toolset
- Future: Resources, Prompts, Progress notifications, `hermes mcp` CLI command
6. MCP (Model Context Protocol) Support 🔗 ✅ DONE
Status: Implemented (PR #301)
Priority: Complete
Native MCP client support with stdio and HTTP/StreamableHTTP transports, auto-discovery, reconnection with exponential backoff, env var filtering, and credential stripping. See docs/mcp.md for full documentation.
Still TODO:
- `hermes mcp` CLI subcommand (list/test/status)
- `hermes tools` UI integration for MCP toolsets
- MCP Resources and Prompts support
- OAuth authentication for remote servers
- Progress notifications for long-running tools
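For orientation, here is a rough sketch of how parallel discovery and the conflicting-config warning in the implemented client might fit together. It is not the code from PR #301: `pick_transport`, `discover_all`, the injected `connect` coroutine, and the per-server timeout are assumptions. The only parts taken from the project notes are the use of `asyncio.gather`, the `url`/`command` keys, and the rule that HTTP takes precedence.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Tuple

ServerConfig = Dict[str, Any]
Connect = Callable[[str, ServerConfig], Awaitable[List[str]]]


def pick_transport(name: str, cfg: ServerConfig) -> Tuple[str, List[str]]:
    """Choose a transport for one server entry and collect config warnings."""
    warnings: List[str] = []
    has_url, has_command = "url" in cfg, "command" in cfg
    if has_url and has_command:
        warnings.append(f"{name}: both 'url' and 'command' are set; using HTTP ('url')")
    if has_url:
        return "http", warnings
    if has_command:
        return "stdio", warnings
    raise ValueError(f"{name}: config needs either 'url' or 'command'")


async def discover_all(
    servers: Dict[str, ServerConfig], connect: Connect, timeout: float = 60.0
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Connect to every server concurrently; return (tools_by_server, failed_names)."""

    async def one(name: str, cfg: ServerConfig) -> List[str]:
        # Assumption: each server gets its own timeout, so one slow server
        # cannot use up the time budget of the others.
        return await asyncio.wait_for(connect(name, cfg), timeout)

    names = list(servers)
    results = await asyncio.gather(
        *(one(n, servers[n]) for n in names), return_exceptions=True
    )
    tools: Dict[str, List[str]] = {}
    failed: List[str] = []
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            failed.append(name)
        else:
            tools[name] = result
    return tools, failed
```

With `return_exceptions=True`, a server that fails or times out shows up as an exception object in the results instead of cancelling the whole gather, which is what makes a summary of total tools and failed servers straightforward to compute.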
8. Filesystem Checkpointing / Rollback 🔄
Status: Not started
Priority: Low-Medium
Automatic filesystem snapshots after each agent loop iteration so the user can roll back destructive changes to their project.
What other agents do:
- Cline: Workspace checkpoints at each step with Compare/Restore UI
- OpenCode: Git-backed workspace snapshots per step, with weekly gc
- Codex: Sandboxed execution with commit-per-step, rollback on failure
Our approach:
- After each tool call (or batch of tool calls in a single turn) that modifies files, create a lightweight checkpoint of the affected files
- Git-based when the project is a repo: auto-commit to a detached/temporary branch (`hermes/checkpoints/<session>`) after each agent turn, squash or discard on session end
- Non-git fallback: tar snapshots of changed files in `~/.hermes/checkpoints/<session_id>/`
- `hermes rollback` CLI command to restore to a previous checkpoint
- Agent-accessible via a `checkpoint` tool: `list` (show available restore points), `restore` (roll back to a named point), `diff` (show what changed since a checkpoint)
- Configurable: off by default (opt-in via `config.yaml`), since auto-committing can be surprising
- Cleanup: checkpoints expire after session ends (or configurable retention period)
- Integration with the terminal backend: works with local, SSH, and Docker backends (snapshots happen on the execution host)
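The non-git fallback could look roughly like the following. This is a sketch under stated assumptions: the `create_checkpoint` and `restore_checkpoint` names, the one-archive-per-checkpoint layout, and the gzip compression are illustrative choices, not a settled design.

```python
import tarfile
from pathlib import Path
from typing import Iterable


def create_checkpoint(project: Path, changed: Iterable[str], store: Path, label: str) -> Path:
    """Archive the given project-relative files into <store>/<label>.tar.gz."""
    store.mkdir(parents=True, exist_ok=True)
    archive = store / f"{label}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for rel in changed:
            src = project / rel
            if src.is_file():
                # Store paths relative to the project root so the archive
                # can be restored into any checkout location.
                tar.add(src, arcname=rel)
    return archive


def restore_checkpoint(project: Path, archive: Path) -> None:
    """Write every file in the archive back into the project directory."""
    root = project.resolve()
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            target = (root / member.name).resolve()
            # Refuse archive entries that would escape the project directory.
            if root not in target.parents:
                raise ValueError(f"unsafe path in checkpoint: {member.name}")
            target.parent.mkdir(parents=True, exist_ok=True)
            with tar.extractfile(member) as src:
                target.write_bytes(src.read())
```

One limitation of this sketch is that restoring does not delete files created after the checkpoint. A fuller version would also need to record which paths did not exist at snapshot time.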
Implementation Priority Order
Tier 1: Next Up
- MCP Support -- #6 ✅ Done (PR #301)
Tier 2: Quality of Life
- Local Browser Control via CDP -- #3
- Plugin/Extension System -- #5
Tier 3: Nice to Have
- Session Branching / Checkpoints -- #7
- Filesystem Checkpointing / Rollback -- #8
- Signal Integration -- #4