description: "Connect Hermes Agent to external tool servers via MCP — databases, APIs, filesystems, and more"
---
# MCP (Model Context Protocol)
MCP lets Hermes Agent connect to external tool servers — giving the agent access to databases, APIs, filesystems, and more without any code changes.
## Overview
The [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) is an open standard for connecting AI agents to external tools and data sources. MCP servers expose tools over a lightweight RPC protocol, and Hermes Agent can connect to any compliant server automatically.
What this means for you:
- **Thousands of ready-made tools** — browse the [MCP server directory](https://github.com/modelcontextprotocol/servers) for servers covering GitHub, Slack, databases, file systems, web scraping, and more
- **No code changes needed** — add a few lines to `~/.hermes/config.yaml` and the tools appear alongside built-in ones
- **Mix and match** — run multiple MCP servers simultaneously, combining stdio-based and HTTP-based servers
- **Secure by default** — environment variables are filtered and credentials are stripped from error messages
## Prerequisites
```bash
pip install hermes-agent[mcp]
```
| Server Type | Runtime Needed | Example |
|-------------|---------------|---------|
| HTTP/remote | Nothing extra | `url: "https://mcp.example.com"` |
| `command` | Yes | Executable to run (`npx`, `uvx`, `python`) |
| `args` | No | Command-line arguments |
| `env` | No | Environment variables for the subprocess |
:::info Security
Only explicitly listed `env` variables plus a safe baseline (`PATH`, `HOME`, `USER`, `LANG`, `SHELL`, `TMPDIR`, `XDG_*`) are passed to the subprocess. Your API keys and secrets are **not** leaked.
In addition to the server's own tools, each MCP server also gets 4 utility tools auto-registered: `list_resources`, `read_resource`, `list_prompts`, and `get_prompt`. These allow the agent to discover and use MCP resources and prompts exposed by the server.
Each configured server also creates a **runtime toolset** named `mcp-<server>`. This means you can filter or reason about MCP servers at the toolset level in the same way you do with built-in toolsets.
If an MCP server disconnects, Hermes automatically reconnects with exponential backoff (1s, 2s, 4s, 8s, 16s — max 5 attempts). Initial connection failures are reported immediately.
### Shutdown
On agent exit, all MCP server connections are cleanly shut down.
## Popular MCP Servers
| Server | Package | Description |
|--------|---------|-------------|
| Filesystem | `@modelcontextprotocol/server-filesystem` | Read/write/search local files |
MCP's `sampling/createMessage` capability allows MCP servers to request LLM completions through the Hermes agent. This enables agent-in-the-loop workflows where servers can leverage the LLM during tool execution — for example, a database server asking the LLM to interpret query results, or a code analysis server requesting the LLM to review findings.
### How It Works
When an MCP server sends a `sampling/createMessage` request:
1. The sampling callback validates against rate limits and model whitelist
2. Resolves which model to use (config override > server hint > default)
3. Converts MCP messages to OpenAI-compatible format
4. Offloads the LLM call to a thread via `asyncio.to_thread()` (non-blocking)
5. Returns the response (text or tool use) back to the server
### Configuration
Sampling is **enabled by default** for all MCP servers. No extra setup needed — if you have an auxiliary LLM client configured, sampling works automatically.
```yaml
mcp_servers:
analysis_server:
command: "npx"
args: ["-y", "my-analysis-server"]
sampling:
enabled: true # default: true
model: "gemini-3-flash" # override model (optional)
max_tokens_cap: 4096 # max tokens per request (default: 4096)
timeout: 30 # LLM call timeout in seconds (default: 30)
max_rpm: 10 # max requests per minute (default: 10)
allowed_models: [] # model whitelist (empty = allow all)
max_tool_rounds: 5 # max consecutive tool use rounds (0 = disable)
Servers can include `tools` and `toolChoice` in sampling requests, enabling multi-turn tool-augmented workflows within a single sampling session. The callback forwards tool definitions to the LLM, handles tool use responses with proper `ToolUseContent` types, and enforces `max_tool_rounds` to prevent infinite loops.