# Tools

Tools are functions that extend the agent's capabilities. Each tool is defined by an OpenAI-compatible JSON schema and a handler function, which may be synchronous or async.

## Tool Structure

Each tool module in `tools/` exports:

1. **Schema definitions** - OpenAI function-calling format
2. **Handler functions** - Functions (sync or async) that execute the tool
```python
# Example: tools/web_tools.py
import json

# Schema definition
WEB_SEARCH_SCHEMA = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for information",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"]
        }
    }
}

# Handler function
async def web_search(query: str) -> str:
    """Execute web search and return results as a JSON string."""
    # Implementation...
    results = []  # filled in by the search backend
    return json.dumps({"results": results})
```
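To show how a schema and a handler fit together at call time, here is a minimal sketch of executing an OpenAI-style tool call. The `HANDLERS` map, the `run_tool_call` helper, and the stubbed `web_search` body are illustrative assumptions, not Hermes code; the only fixed point is that function-calling responses carry `arguments` as a JSON-encoded string.

```python
import asyncio
import json

async def web_search(query: str) -> str:
    """Stub handler: echoes the query instead of calling a real search API."""
    return json.dumps({"results": [], "query": query})

# Hypothetical name -> handler map; the real project uses tools/registry.py.
HANDLERS = {"web_search": web_search}

async def run_tool_call(name: str, arguments: str) -> str:
    """Decode the model's JSON-encoded arguments and await the matching handler."""
    handler = HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"Unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments or "{}")
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"Invalid arguments: {exc}"})
    return await handler(**kwargs)

result = asyncio.run(run_tool_call("web_search", '{"query": "hermes agent"}'))
unknown = asyncio.run(run_tool_call("no_such_tool", "{}"))
```

Both the unknown-tool and bad-arguments paths come back as `{"error": ...}` payloads rather than exceptions, which keeps the caller's handling uniform.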
## Tool Categories

| Category | Module | Tools |
|----------|--------|-------|
| **Web** | `web_tools.py` | `web_search`, `web_extract`, `web_crawl` |
| **Terminal** | `terminal_tool.py` | `terminal` (local/docker/singularity/modal/ssh backends) |
| **File** | `file_tools.py` | `read_file`, `write_file`, `patch`, `search` |
| **Browser** | `browser_tool.py` | `browser_navigate`, `browser_click`, `browser_type`, etc. |
| **Vision** | `vision_tools.py` | `vision_analyze` |
| **Image Gen** | `image_generation_tool.py` | `image_generate` |
| **TTS** | `tts_tool.py` | `text_to_speech` (Edge TTS free / ElevenLabs / OpenAI) |
| **Reasoning** | `mixture_of_agents_tool.py` | `mixture_of_agents` |
| **Skills** | `skills_tool.py`, `skill_manager_tool.py` | `skills_list`, `skill_view`, `skill_manage` |
| **Todo** | `todo_tool.py` | `todo` (read/write task list for multi-step planning) |
| **Memory** | `memory_tool.py` | `memory` (persistent notes + user profile across sessions) |
| **Session Search** | `session_search_tool.py` | `session_search` (search + summarize past conversations) |
| **Cronjob** | `cronjob_tools.py` | `schedule_cronjob`, `list_cronjobs`, `remove_cronjob` |
| **RL Training** | `rl_training_tool.py` | `rl_list_environments`, `rl_start_training`, `rl_check_status`, etc. |
| **Clarify** | `clarify_tool.py` | `clarify` (interactive multiple-choice / open-ended questions, CLI-only) |
| **Code Execution** | `code_execution_tool.py` | `execute_code` (run Python scripts that call tools via RPC sandbox) |
| **Delegation** | `delegate_tool.py` | `delegate_task` (spawn subagents with isolated context, single + parallel batch) |
## Tool Registration

Each tool file self-registers via `tools/registry.py`:

```python
# tools/example_tool.py
from tools.registry import registry

EXAMPLE_SCHEMA = {
    "name": "example_tool",
    "description": "Does something useful.",
    "parameters": { ... }
}

registry.register(
    name="example_tool",
    toolset="example",
    schema=EXAMPLE_SCHEMA,
    handler=lambda args, **kw: example_tool(args.get("param", "")),
    check_fn=check_example_requirements,
    requires_env=["EXAMPLE_API_KEY"],
)
```

`model_tools.py` is a thin orchestration layer that imports all tool modules (triggering registration), then delegates to the registry for schema collection and dispatch.
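The register-then-collect flow can be pictured with a small stand-in registry. This is a sketch under assumptions: `MiniRegistry`, `get_definitions`, and the `ping` and `hidden` tools are hypothetical names, and wrapping each schema in `{"type": "function", ...}` at collection time is a guess at how the bare schema above becomes the OpenAI format. The real `tools/registry.py` may work differently.

```python
import json
from typing import Callable, Optional

class MiniRegistry:
    """Illustrative stand-in for tools/registry.py (not the real implementation)."""

    def __init__(self) -> None:
        self._tools: dict = {}

    def register(self, name: str, toolset: str, schema: dict, handler: Callable,
                 check_fn: Optional[Callable[[], bool]] = None,
                 requires_env: Optional[list] = None) -> None:
        # Importing a tool module runs this call, which records the tool.
        self._tools[name] = {
            "toolset": toolset, "schema": schema, "handler": handler,
            "check_fn": check_fn, "requires_env": requires_env or [],
        }

    def get_definitions(self) -> list:
        """Collect schemas, skipping tools whose check_fn reports unavailable."""
        return [
            {"type": "function", "function": entry["schema"]}
            for entry in self._tools.values()
            if entry["check_fn"] is None or entry["check_fn"]()
        ]

    def dispatch(self, name: str, args: dict, **kwargs) -> str:
        entry = self._tools.get(name)
        if entry is None:
            return json.dumps({"error": f"Unknown tool: {name}"})
        return entry["handler"](args, **kwargs)

EMPTY_PARAMS = {"type": "object", "properties": {}}

registry = MiniRegistry()
registry.register(
    name="ping", toolset="example",
    schema={"name": "ping", "description": "Echo a message.", "parameters": EMPTY_PARAMS},
    handler=lambda args, **kw: json.dumps({"echo": args.get("message", "")}),
)
registry.register(
    name="hidden", toolset="example",
    schema={"name": "hidden", "description": "Never available.", "parameters": EMPTY_PARAMS},
    handler=lambda args, **kw: json.dumps({}),
    check_fn=lambda: False,  # simulates a missing API key
)

names = [d["function"]["name"] for d in registry.get_definitions()]
reply = registry.dispatch("ping", {"message": "hi"})
```

Here `hidden` stays registered but never reaches the model, because its availability check fails when definitions are collected.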
## Toolsets

Tools are grouped into **toolsets** for logical organization (see `toolsets.py`). All platforms share a `_HERMES_CORE_TOOLS` list; messaging platforms add `send_message`.
## Adding a New Tool

### Overview

Adding a tool touches 3 files:

1. **`tools/your_tool.py`** -- handler, schema, check function, `registry.register()` call
2. **`toolsets.py`** -- add tool name to `_HERMES_CORE_TOOLS` (or a specific toolset)
3. **`model_tools.py`** -- add `"tools.your_tool"` to the `_discover_tools()` list

### Step 1: Create the tool file

Every tool file follows the same structure: availability check, handler function, schema constant, and registry registration.
```python
# tools/weather_tool.py
"""Weather Tool -- look up current weather for a location."""

import json
import os
import logging

logger = logging.getLogger(__name__)


# --- Availability check ---

def check_weather_requirements() -> bool:
    """Return True if the tool's dependencies are available."""
    return bool(os.getenv("WEATHER_API_KEY"))


# --- Handler ---

def weather_tool(location: str, units: str = "metric") -> str:
    """Fetch weather for a location. Returns JSON string."""
    api_key = os.getenv("WEATHER_API_KEY")
    if not api_key:
        return json.dumps({"error": "WEATHER_API_KEY not configured"})
    try:
        # ... call weather API ...
        return json.dumps({"location": location, "temp": 22, "units": units})
    except Exception as e:
        return json.dumps({"error": str(e)})


# --- Schema ---

WEATHER_SCHEMA = {
    "name": "weather",
    "description": "Get current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City name or coordinates (e.g. 'London' or '51.5,-0.1')"
            },
            "units": {
                "type": "string",
                "enum": ["metric", "imperial"],
                "description": "Temperature units (default: metric)",
                "default": "metric"
            }
        },
        "required": ["location"]
    }
}


# --- Registration ---

from tools.registry import registry

registry.register(
    name="weather",
    toolset="weather",
    schema=WEATHER_SCHEMA,
    handler=lambda args, **kw: weather_tool(
        location=args.get("location", ""),
        units=args.get("units", "metric")),
    check_fn=check_weather_requirements,
    requires_env=["WEATHER_API_KEY"],
)
```
**Key rules:**

- Handlers MUST return a JSON string (via `json.dumps()`), never raw dicts.
- Errors MUST be returned as `{"error": "message"}`, never raised as exceptions. As a safety net, the registry's `dispatch()` also wraps unexpected exceptions automatically.
- The `check_fn` is called when building tool definitions -- if it returns `False`, the tool is silently excluded from the schema sent to the LLM.
- The `handler` receives `(args: dict, **kwargs)`, where `args` holds the LLM's tool call arguments and `kwargs` may include `task_id`, `user_task`, `store`, etc., depending on what the caller passes.
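The error-handling rules above could be enforced by a wrapper along these lines. This is a sketch, not the registry's actual `dispatch()`: `safe_dispatch` and the two sample handlers are hypothetical, and the exact error message format is an assumption.

```python
import json

def safe_dispatch(handler, args: dict, **kwargs) -> str:
    """Call a handler; convert any unexpected exception into an error payload."""
    try:
        return handler(args, **kwargs)
    except Exception as exc:
        # The handler broke the "never raise" rule, so report instead of crashing.
        return json.dumps({"error": f"{type(exc).__name__}: {exc}"})

def good_handler(args, **kw):
    """Follows the rules: returns a JSON string."""
    return json.dumps({"temp": 22})

def failing_handler(args, **kw):
    """Breaks the rules: lets an exception escape."""
    raise ValueError("upstream API unreachable")

ok = safe_dispatch(good_handler, {})
failed = safe_dispatch(failing_handler, {}, task_id="abc")
```

Either way the caller receives a string it can pass straight back to the model, so one misbehaving tool does not abort the conversation.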
### Step 2: Add to a toolset

In `toolsets.py`, add the tool name to the appropriate place:

```python
# If it should be available on all platforms (CLI + messaging):
_HERMES_CORE_TOOLS = [
    ...
    "weather",  # <-- add here
]

# Or create a new standalone toolset:
"weather": {
    "description": "Weather lookup tools",
    "tools": ["weather"],
    "includes": []
},
```
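The `includes` key suggests that one toolset can pull in others. Assuming that reading is right, expansion might look like the sketch below. The `TOOLSETS` dict, the `web` and `research` entries, and `resolve_toolset` are hypothetical and do not reflect the actual contents of `toolsets.py`.

```python
from typing import Optional

# Hypothetical toolset table using the same three keys as the entry above.
TOOLSETS = {
    "weather": {"description": "Weather lookup tools",
                "tools": ["weather"], "includes": []},
    "web": {"description": "Web tools",
            "tools": ["web_search", "web_extract"], "includes": []},
    "research": {"description": "Composite toolset (illustrative)",
                 "tools": [], "includes": ["web", "weather"]},
}

def resolve_toolset(name: str, toolsets: dict, _seen: Optional[set] = None) -> list:
    """Expand a toolset into an ordered, de-duplicated list of tool names."""
    seen = _seen if _seen is not None else set()
    if name in seen:  # guard against include cycles
        return []
    seen.add(name)
    entry = toolsets[name]
    tools = list(entry["tools"])
    for included in entry["includes"]:
        for tool in resolve_toolset(included, toolsets, seen):
            if tool not in tools:
                tools.append(tool)
    return tools

resolved = resolve_toolset("research", TOOLSETS)
```

The cycle guard is there so that two toolsets including each other terminate instead of recursing forever.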
### Step 3: Add discovery import

In `model_tools.py`, add the module to the `_discover_tools()` list:

```python
def _discover_tools():
    _modules = [
        ...
        "tools.weather_tool",  # <-- add here
    ]
```

This import triggers the `registry.register()` call at the bottom of the tool file.
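Import-driven discovery of this kind is typically a loop over `importlib.import_module`. The sketch below assumes that shape; `discover_tools` is a hypothetical helper, and skipping modules that fail to import is an assumption rather than documented behavior of `_discover_tools()`.

```python
import importlib
import logging

logger = logging.getLogger(__name__)

def discover_tools(module_names: list) -> list:
    """Import each module for its registration side effect; return those that loaded."""
    loaded = []
    for module_name in module_names:
        try:
            importlib.import_module(module_name)
        except ImportError as exc:
            # One missing optional dependency should not disable every other tool.
            logger.warning("Skipping %s: %s", module_name, exc)
            continue
        loaded.append(module_name)
    return loaded

# Standard-library modules stand in for entries such as "tools.weather_tool".
loaded = discover_tools(["json", "module_that_does_not_exist_xyz"])
```

Because registration happens as an import side effect, a tool module that is never listed here is never registered, which is why this step is easy to forget.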
### Async handlers

If your handler needs to call async code (e.g., `aiohttp`, an async SDK), register it with `is_async=True`:

```python
import json

import aiohttp

async def weather_tool_async(location: str) -> str:
    async with aiohttp.ClientSession() as session:
        ...
    return json.dumps(result)

registry.register(
    name="weather",
    toolset="weather",
    schema=WEATHER_SCHEMA,
    handler=lambda args, **kw: weather_tool_async(args.get("location", "")),
    check_fn=check_weather_requirements,
    is_async=True,  # <-- registry calls _run_async() automatically
)
```

The registry handles async bridging transparently via `_run_async()` -- you never call `asyncio.run()` yourself. This works correctly in CLI mode (no event loop), the gateway (running async loop), and RL environments (Atropos event loop + thread pool wrapping).
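One common way to bridge sync and async code in both situations is sketched below. It is not the registry's `_run_async()`: `run_async` and `fetch_weather` are hypothetical names, and the worker-thread fallback is one possible strategy among several.

```python
import asyncio
import concurrent.futures

def run_async(coro):
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread (e.g. a plain CLI call): asyncio.run is safe.
        return asyncio.run(coro)
    # A loop is already running here, so asyncio.run() would raise.
    # Run the coroutine on a fresh loop in a worker thread and wait for it.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()

async def fetch_weather(location: str) -> str:
    await asyncio.sleep(0)  # stand-in for real network I/O
    return f"weather for {location}"

# Case 1: called from ordinary synchronous code.
from_sync = run_async(fetch_weather("London"))

# Case 2: called from synchronous code that itself runs inside an event loop.
async def caller():
    return run_async(fetch_weather("Paris"))

from_loop = asyncio.run(caller())
```

In the second case the outer loop is blocked until the worker thread finishes. That is acceptable for a short tool call, but it is a trade-off worth knowing about.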
### Handlers that need task_id

Tools that manage per-session state (terminal, browser, file ops) receive `task_id` via `**kwargs`:

```python
def _handle_weather(args, **kw):
    task_id = kw.get("task_id")  # may be None in CLI mode
    # Assumes weather_tool() accepts a task_id keyword argument
    return weather_tool(args.get("location", ""), task_id=task_id)

registry.register(
    name="weather",
    ...
    handler=_handle_weather,
)
```

Use a named function instead of a lambda when the argument unpacking is complex.
### Agent-loop intercepted tools

Some tools (`todo`, `memory`, `session_search`, `delegate_task`) need access to per-session agent state (TodoStore, MemoryStore, etc.) that doesn't flow through `handle_function_call`. These are intercepted by `run_agent.py` before reaching the registry. The registry still holds their schemas (so they appear in the tool list), but `dispatch()` returns a fallback error if the intercept is bypassed. See `todo_tool.py` for the pattern.
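The interception pattern can be illustrated with a toy agent loop. Everything in this sketch is hypothetical, including the `Agent` class, the `add` argument, and the error wording; the authoritative pattern is the one in `todo_tool.py` and `run_agent.py`.

```python
import json

class TodoStore:
    """Hypothetical per-session state that the registry has no access to."""
    def __init__(self):
        self.items = []

# Tools the agent loop handles itself; the doc also lists memory,
# session_search, and delegate_task.
INTERCEPTED = {"todo"}

def registry_dispatch(name: str, args: dict) -> str:
    """Stand-in for the registry: intercepted tools only get a fallback error."""
    if name in INTERCEPTED:
        return json.dumps({"error": f"{name} must be handled by the agent loop"})
    return json.dumps({"ok": True, "tool": name})

class Agent:
    def __init__(self):
        self.todo_store = TodoStore()

    def handle_tool_call(self, name: str, args: dict) -> str:
        if name == "todo":
            # Intercept: use session state owned by this agent instance.
            self.todo_store.items.extend(args.get("add", []))
            return json.dumps({"todos": self.todo_store.items})
        return registry_dispatch(name, args)

agent = Agent()
intercepted = agent.handle_tool_call("todo", {"add": ["write docs"]})
passthrough = agent.handle_tool_call("web_search", {"query": "x"})
bypassed = registry_dispatch("todo", {"add": ["lost"]})
```

The fallback error makes a bypassed intercept visible instead of silently operating on state that does not exist.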
### Optional: setup wizard integration

If your tool requires an API key, add it to `hermes_cli/config.py`'s `OPTIONAL_ENV_VARS` dict so the setup wizard can prompt for it:

```python
OPTIONAL_ENV_VARS = {
    ...
    "WEATHER_API_KEY": {
        "description": "Weather API key for weather lookup",
        "prompt": "Weather API key",
        "url": "https://weatherapi.com/",
        "tools": ["weather"],
        "password": True,
    },
}
```

### Optional: batch processing

Add to `toolset_distributions.py` if the tool should be available in specific batch processing distributions.
## Stateful Tools

Some tools maintain state across calls within a session:

- **Terminal**: Keeps container/sandbox running between commands
- **Browser**: Maintains browser session for multi-step navigation

State is managed per `task_id` and cleaned up automatically.
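Per-`task_id` state is usually a dictionary keyed by task, created lazily and dropped on cleanup. The sketch below assumes that design; `SessionStore`, the `"default"` fallback key, and the dict-based session object are all hypothetical, and the real tools decide for themselves when cleanup runs.

```python
class SessionStore:
    """Hypothetical per-task_id state holder with explicit cleanup."""

    def __init__(self, factory):
        self._factory = factory  # builds a fresh session on first use
        self._sessions = {}

    def get(self, task_id):
        key = task_id or "default"  # task_id may be None in CLI mode
        if key not in self._sessions:
            self._sessions[key] = self._factory()
        return self._sessions[key]

    def cleanup(self, task_id):
        session = self._sessions.pop(task_id or "default", None)
        if session is not None:
            session["open"] = False  # stand-in for closing a browser or container

browsers = SessionStore(lambda: {"open": True, "history": []})
browsers.get("task-a")["history"].append("https://example.com")
same = browsers.get("task-a")    # same session object, history preserved
other = browsers.get("task-b")   # isolated from task-a
browsers.cleanup("task-a")
fresh = browsers.get("task-a")   # new session after cleanup
```

Keying by `task_id` is what lets parallel tasks each keep their own terminal or browser without interfering with one another.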
## Terminal Backends

The terminal tool supports multiple execution backends:

| Backend | Description | Use Case |
|---------|-------------|----------|
| `local` | Direct execution on host | Development, simple tasks |
| `ssh` | Remote execution via SSH | Sandboxing (agent can't modify its own code) |
| `docker` | Docker container | Isolation, reproducibility |
| `singularity` | Singularity/Apptainer | HPC clusters, rootless containers |
| `modal` | Modal cloud | Scalable cloud compute, GPUs |

Configure via environment variables or `cli-config.yaml`:

```yaml
# SSH backend example (in cli-config.yaml)
terminal:
  env_type: "ssh"
  ssh_host: "my-server.example.com"
  ssh_user: "myuser"
  ssh_key: "~/.ssh/id_rsa"
  cwd: "/home/myuser/project"
```

The SSH backend uses ControlMaster for connection persistence, making subsequent commands fast.
## Skills Tools (Progressive Disclosure)

Skills are on-demand knowledge documents. They use **progressive disclosure** to minimize tokens:

```
Level 0: skills_categories()    → ["mlops", "devops"]         (~50 tokens)
Level 1: skills_list(category)  → [{name, description}, ...]  (~3k tokens)
Level 2: skill_view(name)       → Full content + metadata     (varies)
Level 3: skill_view(name, path) → Specific reference file     (varies)
```

All skills live in `~/.hermes/skills/`, a single directory that serves as the source of truth. On a fresh install, bundled skills are seeded from the repo's `skills/` directory. Hub-installed and agent-created skills also go here. The agent can modify or delete any skill.

Skill directory structure:

```
~/.hermes/skills/
├── mlops/
│   └── axolotl/
│       ├── SKILL.md            # Main instructions (required)
│       ├── references/         # Additional docs
│       ├── templates/          # Output formats, configs
│       └── assets/             # Supplementary files (agentskills.io)
├── devops/
│   └── deploy-k8s/
│       └── SKILL.md
├── .hub/                       # Skills Hub state
└── .bundled_manifest           # Tracks seeded bundled skills
```

SKILL.md uses YAML frontmatter (agentskills.io compatible):

```yaml
---
name: axolotl
description: Fine-tuning LLMs with Axolotl
metadata:
  hermes:
    tags: [Fine-Tuning, LoRA, DPO]
    category: mlops
---
```
## Skill Management (skill_manage)

The `skill_manage` tool lets the agent create, update, and delete its own skills -- turning successful approaches into reusable procedural knowledge.

**Module:** `tools/skill_manager_tool.py`

**Actions:**

| Action | Description | Parameters |
|--------|-------------|------------|
| `create` | Create new skill (SKILL.md + directory) | `name`, `content`, optional `category` |
| `patch` | Targeted find-and-replace in SKILL.md or supporting file | `name`, `old_string`, `new_string`, optional `file_path`, `replace_all` |
| `edit` | Full replacement of SKILL.md (major rewrites only) | `name`, `content` |
| `delete` | Remove a user skill entirely | `name` |
| `write_file` | Add/overwrite a supporting file | `name`, `file_path`, `file_content` |
| `remove_file` | Remove a supporting file | `name`, `file_path` |
### Patch vs Edit

`patch` and `edit` both modify skill files, but serve different purposes:

**`patch`** (preferred for most updates):
- Targeted `old_string` → `new_string` replacement, same interface as the `patch` file tool
- Token-efficient: only the changed text appears in the tool call, not the full file
- Requires a unique match by default; set `replace_all=true` for global replacements
- Returns the match count on ambiguous matches so the model can add more context
- When targeting SKILL.md, validates that frontmatter remains intact after the patch
- Also works on supporting files via the `file_path` parameter (e.g., `references/api.md`)
- Returns a file preview on not-found errors for self-correction without extra reads

**`edit`** (for major rewrites):
- Full replacement of SKILL.md content
- Use when the skill's structure needs to change (reorganizing sections, rewriting from scratch)
- The model should call `skill_view()` first, then provide the complete updated text

**Constraints:**
- All skills live in `~/.hermes/skills/` and can be modified or deleted
- Skill names must be lowercase, filesystem-safe (`[a-z0-9._-]+`), max 64 chars
- SKILL.md must have valid YAML frontmatter with `name` and `description` fields
- Supporting files must be under `references/`, `templates/`, `scripts/`, or `assets/`
- Path traversal (`..`) in file paths is blocked

**Availability:** Enabled by default in CLI, Telegram, Discord, WhatsApp, and Slack. Not included in batch_runner or RL training environments.

**Behavioral guidance:** The tool description teaches the model when to create skills (after difficult tasks), when to update them (stale or broken instructions), to prefer `patch` over `edit` for targeted fixes, and the feedback loop pattern (ask the user after difficult tasks, offer to save the approach as a skill).
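The `patch` semantics and the naming and path constraints described in this section can be sketched as three small functions. This is an illustration only: the function names, return shapes, and the 200-character preview length are assumptions, not the behavior of `tools/skill_manager_tool.py`, and frontmatter validation is left out.

```python
import re
from pathlib import PurePosixPath

NAME_RE = re.compile(r"[a-z0-9._-]+")
ALLOWED_DIRS = {"references", "templates", "scripts", "assets"}

def valid_skill_name(name: str) -> bool:
    """Lowercase, filesystem-safe, at most 64 characters."""
    return len(name) <= 64 and NAME_RE.fullmatch(name) is not None

def valid_support_path(file_path: str) -> bool:
    """Must sit under an allowed directory and contain no '..' segment."""
    parts = PurePosixPath(file_path).parts
    return len(parts) >= 2 and parts[0] in ALLOWED_DIRS and ".." not in parts

def apply_patch(text: str, old: str, new: str, replace_all: bool = False) -> dict:
    """Replace old with new, insisting on a unique match unless replace_all is set."""
    count = text.count(old) if old else 0
    if count == 0:
        # A preview lets the caller correct old_string without another read.
        return {"error": "old_string not found", "preview": text[:200]}
    if count > 1 and not replace_all:
        return {"error": "ambiguous match; add more context", "match_count": count}
    return {"content": text.replace(old, new), "replaced": count}

doc = "pip install axolotl\nRun axolotl train config.yml\n"
unique = apply_patch(doc, "config.yml", "lora.yml")
ambiguous = apply_patch(doc, "axolotl", "axolotl[flash-attn]")
everywhere = apply_patch(doc, "axolotl", "AXOLOTL", replace_all=True)
missing = apply_patch(doc, "deepspeed", "fsdp")
```

Returning the match count on ambiguity, rather than silently replacing the first hit, is what lets the model retry with a longer `old_string`.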
## Skills Hub

The Skills Hub enables searching, installing, and managing skills from online registries. It is **user-driven only**: the model cannot search for or install skills.

**Sources:** GitHub repos (openai/skills, anthropics/skills, custom taps), ClawHub, Claude Code marketplaces, LobeHub.

**Security:** Every downloaded skill is scanned by `tools/skills_guard.py` (regex patterns + optional LLM audit) before installation. Trust levels: `builtin` (ships with Hermes), `trusted` (openai/skills, anthropics/skills), `community` (everything else; any finding blocks installation unless `--force` is passed).

**Architecture:**
- `tools/skills_guard.py` -- Static scanner + LLM audit, trust-aware install policy
- `tools/skills_hub.py` -- SkillSource ABC, GitHubAuth (PAT + App), 4 source adapters, lock file, hub state
- `tools/skill_manager_tool.py` -- Agent-managed skill CRUD (`skill_manage` tool)
- `hermes_cli/skills_hub.py` -- Shared `do_*` functions, CLI subcommands, `/skills` slash command handler

**CLI:** `hermes skills search|install|inspect|list|audit|uninstall|publish|snapshot|tap`

**Slash:** `/skills search|install|inspect|list|audit|uninstall|publish|snapshot|tap`