hermes-agent/docs/tools.md

# Tools

Tools are functions that extend the agent's capabilities. Each tool is defined with an OpenAI-compatible JSON schema and an async handler function.

## Tool Structure

Each tool module in `tools/` exports:
1. **Schema definitions** - OpenAI function-calling format
2. **Handler functions** - Async functions that execute the tool

```python
# Example: tools/web_tools.py

# Schema definition
WEB_SEARCH_SCHEMA = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for information",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"]
        }
    }
}

# Handler function
async def web_search(query: str) -> dict:
    """Execute web search and return results."""
    # Implementation...
    return {"results": [...]}
```

## Tool Categories

| Category | Module | Tools |
|----------|--------|-------|
| **Web** | `web_tools.py` | `web_search`, `web_extract`, `web_crawl` |
| **Terminal** | `terminal_tool.py` | `terminal` (local/docker/singularity/modal/ssh backends) |
| **File** | `file_tools.py` | `read_file`, `write_file`, `patch`, `search` |
| **Browser** | `browser_tool.py` | `browser_navigate`, `browser_click`, `browser_type`, etc. |
| **Vision** | `vision_tools.py` | `vision_analyze` |
| **Image Gen** | `image_generation_tool.py` | `image_generate` |
| **TTS** | `tts_tool.py` | `text_to_speech` (Edge TTS free / ElevenLabs / OpenAI) |
| **Reasoning** | `mixture_of_agents_tool.py` | `mixture_of_agents` |
| **Skills** | `skills_tool.py`, `skill_manager_tool.py` | `skills_list`, `skill_view`, `skill_manage` |
| **Todo** | `todo_tool.py` | `todo` (read/write task list for multi-step planning) |
| **Memory** | `memory_tool.py` | `memory` (persistent notes + user profile across sessions) |
| **Session Search** | `session_search_tool.py` | `session_search` (search + summarize past conversations) |
| **Cronjob** | `cronjob_tools.py` | `schedule_cronjob`, `list_cronjobs`, `remove_cronjob` |
| **RL Training** | `rl_training_tool.py` | `rl_list_environments`, `rl_start_training`, `rl_check_status`, etc. |
| **Clarify** | `clarify_tool.py` | `clarify` (interactive multiple-choice / open-ended questions, CLI-only) |
| **Code Execution** | `code_execution_tool.py` | `execute_code` (run Python scripts that call tools via RPC sandbox) |
| **Delegation** | `delegate_tool.py` | `delegate_task` (spawn subagents with isolated context, single + parallel batch) |

## Tool Registration

Each tool file self-registers via `tools/registry.py`:

```python
# tools/example_tool.py
from tools.registry import registry

EXAMPLE_SCHEMA = {
    "name": "example_tool",
    "description": "Does something useful.",
    "parameters": { ... }
}

registry.register(
    name="example_tool",
    toolset="example",
    schema=EXAMPLE_SCHEMA,
    handler=lambda args, **kw: example_tool(args.get("param", "")),
    check_fn=check_example_requirements,
    requires_env=["EXAMPLE_API_KEY"],
)
```

`model_tools.py` is a thin orchestration layer that imports all tool modules (triggering registration), then delegates to the registry for schema collection and dispatch.

## Toolsets

Tools are grouped into **toolsets** for logical organization (see `toolsets.py`). All platforms share a `_HERMES_CORE_TOOLS` list; messaging platforms add `send_message`.

## Adding a New Tool

### Overview

Adding a tool touches 3 files:

1. **`tools/your_tool.py`** -- handler, schema, check function, `registry.register()` call
2. **`toolsets.py`** -- add tool name to `_HERMES_CORE_TOOLS` (or a specific toolset)
3. **`model_tools.py`** -- add `"tools.your_tool"` to the `_discover_tools()` list

### Step 1: Create the tool file

Every tool file follows the same structure: handler function, availability check, schema constant, and registry registration.

```python
# tools/weather_tool.py
"""Weather Tool -- look up current weather for a location."""

import json
import os
import logging

logger = logging.getLogger(__name__)


# --- Availability check ---

def check_weather_requirements() -> bool:
    """Return True if the tool's dependencies are available."""
    return bool(os.getenv("WEATHER_API_KEY"))


# --- Handler ---

def weather_tool(location: str, units: str = "metric") -> str:
    """Fetch weather for a location. Returns JSON string."""
    api_key = os.getenv("WEATHER_API_KEY")
    if not api_key:
        return json.dumps({"error": "WEATHER_API_KEY not configured"})
    try:
        # ... call weather API ...
        return json.dumps({"location": location, "temp": 22, "units": units})
    except Exception as e:
        return json.dumps({"error": str(e)})


# --- Schema ---

WEATHER_SCHEMA = {
    "name": "weather",
    "description": "Get current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City name or coordinates (e.g. 'London' or '51.5,-0.1')"
            },
            "units": {
                "type": "string",
                "enum": ["metric", "imperial"],
                "description": "Temperature units (default: metric)",
                "default": "metric"
            }
        },
        "required": ["location"]
    }
}


# --- Registration ---

from tools.registry import registry

registry.register(
    name="weather",
    toolset="weather",
    schema=WEATHER_SCHEMA,
    handler=lambda args, **kw: weather_tool(
        location=args.get("location", ""),
        units=args.get("units", "metric")),
    check_fn=check_weather_requirements,
    requires_env=["WEATHER_API_KEY"],
)
```

**Key rules:**

- Handlers MUST return a JSON string (via `json.dumps()`), never raw dicts.
- Errors MUST be returned as `{"error": "message"}`, never raised as exceptions. The registry's `dispatch()` also wraps unexpected exceptions automatically.
- The `check_fn` is called when building tool definitions -- if it returns `False`, the tool is silently excluded from the schema sent to the LLM.
- The `handler` receives `(args: dict, **kwargs)` where `args` is the LLM's tool call arguments and `kwargs` may include `task_id`, `user_task`, `store`, etc. depending on what the caller passes.

### Step 2: Add to a toolset

In `toolsets.py`, add the tool name to the appropriate place:

```python
# If it should be available on all platforms (CLI + messaging):
_HERMES_CORE_TOOLS = [
    ...
    "weather",  # <-- add here
]

# Or create a new standalone toolset:
"weather": {
    "description": "Weather lookup tools",
    "tools": ["weather"],
    "includes": []
},
```

### Step 3: Add discovery import

In `model_tools.py`, add the module to the `_discover_tools()` list:

```python
def _discover_tools():
    _modules = [
        ...
        "tools.weather_tool",  # <-- add here
    ]
```

This import triggers the `registry.register()` call at the bottom of the tool file.

### Async handlers

If your handler needs to call async code (e.g., `aiohttp`, async SDK), mark it with `is_async=True`:

```python
async def weather_tool_async(location: str) -> str:
    async with aiohttp.ClientSession() as session:
        ...
    return json.dumps(result)

registry.register(
    name="weather",
    toolset="weather",
    schema=WEATHER_SCHEMA,
    handler=lambda args, **kw: weather_tool_async(args.get("location", "")),
    check_fn=check_weather_requirements,
    is_async=True,  # <-- registry calls _run_async() automatically
)
```

The registry handles async bridging transparently via `_run_async()` -- you never call `asyncio.run()` yourself. This works correctly in CLI mode (no event loop), the gateway (running async loop), and RL environments (Atropos event loop + thread pool wrapping).

### Handlers that need task_id

Tools that manage per-session state (terminal, browser, file ops) receive `task_id` via `**kwargs`:

```python
def _handle_weather(args, **kw):
    task_id = kw.get("task_id")  # may be None in CLI mode
    return weather_tool(args.get("location", ""), task_id=task_id)

registry.register(
    name="weather",
    ...
    handler=_handle_weather,
)
```

Use a named function instead of a lambda when the arg unpacking is complex.

### Agent-loop intercepted tools

Some tools (todo, memory, session_search, delegate_task) need access to per-session agent state (TodoStore, MemoryStore, etc.) that doesn't flow through `handle_function_call`. These are intercepted by `run_agent.py` before reaching the registry. The registry still holds their schemas (so they appear in the tool list), but `dispatch()` returns a fallback error if the intercept is bypassed. See `todo_tool.py` for the pattern.

### Optional: setup wizard integration

If your tool requires an API key, add it to `hermes_cli/config.py`'s `OPTIONAL_ENV_VARS` dict so the setup wizard can prompt for it:

```python
OPTIONAL_ENV_VARS = {
    ...
    "WEATHER_API_KEY": {
        "description": "Weather API key for weather lookup",
        "prompt": "Weather API key",
        "url": "https://weatherapi.com/",
        "tools": ["weather"],
        "password": True,
    },
}
```

### Optional: batch processing

Add to `toolset_distributions.py` if the tool should be available in specific batch processing distributions.

## Stateful Tools

Some tools maintain state across calls within a session:

- **Terminal**: Keeps container/sandbox running between commands
- **Browser**: Maintains browser session for multi-step navigation

State is managed per `task_id` and cleaned up automatically.

## Terminal Backends

The terminal tool supports multiple execution backends:

| Backend | Description | Use Case |
|---------|-------------|----------|
| `local` | Direct execution on host | Development, simple tasks |
| `ssh` | Remote execution via SSH | Sandboxing (agent can't modify its own code) |
| `docker` | Docker container | Isolation, reproducibility |
| `singularity` | Singularity/Apptainer | HPC clusters, rootless containers |
| `modal` | Modal cloud | Scalable cloud compute, GPUs |

Configure via environment variables or `cli-config.yaml`:

```yaml
# SSH backend example (in cli-config.yaml)
terminal:
  env_type: "ssh"
  ssh_host: "my-server.example.com"
  ssh_user: "myuser"
  ssh_key: "~/.ssh/id_rsa"
  cwd: "/home/myuser/project"
```

The SSH backend uses ControlMaster for connection persistence, making subsequent commands fast.

## Skills Tools (Progressive Disclosure)

Skills are on-demand knowledge documents. They use **progressive disclosure** to minimize tokens:

```
Level 0: skills_categories()     → ["mlops", "devops"]           (~50 tokens)
Level 1: skills_list(category)   → [{name, description}, ...]   (~3k tokens)
Level 2: skill_view(name)        → Full content + metadata       (varies)
Level 3: skill_view(name, path)  → Specific reference file       (varies)
```

All skills live in `~/.hermes/skills/` — a single directory that serves as the source of truth. On fresh install, bundled skills are seeded from the repo's `skills/` directory. Hub-installed and agent-created skills also go here. The agent can modify or delete any skill.

Skill directory structure:
```
~/.hermes/skills/
├── mlops/
│   └── axolotl/
│       ├── SKILL.md             # Main instructions (required)
│       ├── references/          # Additional docs
│       ├── templates/           # Output formats, configs
│       └── assets/              # Supplementary files (agentskills.io)
├── devops/
│   └── deploy-k8s/
│       └── SKILL.md
├── .hub/                        # Skills Hub state
└── .bundled_manifest            # Tracks seeded bundled skills
```

SKILL.md uses YAML frontmatter (agentskills.io compatible):
```yaml
---
name: axolotl
description: Fine-tuning LLMs with Axolotl
metadata:
  hermes:
    tags: [Fine-Tuning, LoRA, DPO]
    category: mlops
---
```

## Skill Management (skill_manage)

The `skill_manage` tool lets the agent create, update, and delete its own skills -- turning successful approaches into reusable procedural knowledge.

**Module:** `tools/skill_manager_tool.py`

**Actions:**
| Action | Description | Required params |
|--------|-------------|-----------------|
| `create` | Create new skill (SKILL.md + directory) | `name`, `content`, optional `category` |
| `patch` | Targeted find-and-replace in SKILL.md or supporting file | `name`, `old_string`, `new_string`, optional `file_path`, `replace_all` |
| `edit` | Full replacement of SKILL.md (major rewrites only) | `name`, `content` |
| `delete` | Remove a user skill entirely | `name` |
| `write_file` | Add/overwrite a supporting file | `name`, `file_path`, `file_content` |
| `remove_file` | Remove a supporting file | `name`, `file_path` |

### Patch vs Edit

`patch` and `edit` both modify skill files, but serve different purposes:

**`patch`** (preferred for most updates):
- Targeted `old_string` → `new_string` replacement, same interface as the `patch` file tool
- Token-efficient: only the changed text appears in the tool call, not the full file
- Requires unique match by default; set `replace_all=true` for global replacements
- Returns match count on ambiguous matches so the model can add more context
- When targeting SKILL.md, validates that frontmatter remains intact after the patch
- Also works on supporting files via `file_path` parameter (e.g., `references/api.md`)
- Returns a file preview on not-found errors for self-correction without extra reads

**`edit`** (for major rewrites):
- Full replacement of SKILL.md content
- Use when the skill's structure needs to change (reorganizing sections, rewriting from scratch)
- The model should `skill_view()` first, then provide the complete updated text

**Constraints:**
- All skills live in `~/.hermes/skills/` and can be modified or deleted
- Skill names must be lowercase, filesystem-safe (`[a-z0-9._-]+`), max 64 chars
- SKILL.md must have valid YAML frontmatter with `name` and `description` fields
- Supporting files must be under `references/`, `templates/`, `scripts/`, or `assets/`
- Path traversal (`..`) in file paths is blocked

**Availability:** Enabled by default in CLI, Telegram, Discord, WhatsApp, and Slack. Not included in batch_runner or RL training environments.

**Behavioral guidance:** The tool description teaches the model when to create skills (after difficult tasks), when to update them (stale/broken instructions), to prefer `patch` over `edit` for targeted fixes, and the feedback loop pattern (ask user after difficult tasks, offer to save as a skill).

## Skills Hub

The Skills Hub enables searching, installing, and managing skills from online registries. It is **user-driven only** — the model cannot search for or install skills.

**Sources:** GitHub repos (openai/skills, anthropics/skills, custom taps), ClawHub, Claude Code marketplaces, LobeHub.

**Security:** Every downloaded skill is scanned by `tools/skills_guard.py` (regex patterns + optional LLM audit) before installation. Trust levels: `builtin` (ships with Hermes), `trusted` (openai/skills, anthropics/skills), `community` (everything else — any findings = blocked unless `--force`).

**Architecture:**
- `tools/skills_guard.py` — Static scanner + LLM audit, trust-aware install policy
- `tools/skills_hub.py` — SkillSource ABC, GitHubAuth (PAT + App), 4 source adapters, lock file, hub state
- `tools/skill_manager_tool.py` — Agent-managed skill CRUD (`skill_manage` tool)
- `hermes_cli/skills_hub.py` — Shared `do_*` functions, CLI subcommands, `/skills` slash command handler

**CLI:** `hermes skills search|install|inspect|list|audit|uninstall|publish|snapshot|tap`
**Slash:** `/skills search|install|inspect|list|audit|uninstall|publish|snapshot|tap`