- Updated `.cursorrules` to provide a comprehensive overview of the interactive CLI, including its architecture, key components, and command handling. - Expanded `README.md` to introduce the CLI features, quick start instructions, and detailed command descriptions for user guidance. - Added `docs/cli.md` to document CLI usage, configuration, and animated feedback, ensuring clarity for users and developers. - Revised `docs/tools.md` to include support for SSH backend in terminal tools, enhancing the documentation for terminal execution options.
4.4 KiB
Tools
Tools are functions that extend the agent's capabilities. Each tool is defined with an OpenAI-compatible JSON schema and an async handler function.
Tool Structure
Each tool module in tools/ exports:
- Schema definitions - OpenAI function-calling format
- Handler functions - Async functions that execute the tool
# Example: tools/web_tools.py
# Schema definition
WEB_SEARCH_SCHEMA = {
"type": "function",
"function": {
"name": "web_search",
"description": "Search the web for information",
"parameters": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "Search query"}
},
"required": ["query"]
}
}
}
# Handler function
async def web_search(query: str) -> dict:
"""Execute web search and return results."""
# Implementation...
return {"results": [...]}
Tool Categories
| Category | Module | Tools |
|---|---|---|
| Web | web_tools.py |
web_search, web_extract, web_crawl |
| Terminal | terminal_tool.py |
terminal (local/docker/singularity/modal/ssh backends) |
| Browser | browser_tool.py |
browser_navigate, browser_click, browser_type, etc. |
| Vision | vision_tools.py |
vision_analyze |
| Image Gen | image_generation_tool.py |
image_generate |
| Reasoning | mixture_of_agents_tool.py |
mixture_of_agents |
| Skills | skills_tool.py |
skills_categories, skills_list, skill_view |
Tool Registration
Tools are registered in model_tools.py:
# model_tools.py
TOOL_SCHEMAS = [
*WEB_TOOL_SCHEMAS,
*TERMINAL_TOOL_SCHEMAS,
*BROWSER_TOOL_SCHEMAS,
# ...
]
TOOL_HANDLERS = {
"web_search": web_search,
"terminal": terminal_tool,
"browser_navigate": browser_navigate,
# ...
}
Toolsets
Tools are grouped into toolsets for logical organization (see toolsets.py):
TOOLSETS = {
"web": {
"description": "Web search and content extraction",
"tools": ["web_search", "web_extract", "web_crawl"]
},
"terminal": {
"description": "Command execution",
"tools": ["terminal"]
},
# ...
}
Adding a New Tool
- Create handler function in
tools/your_tool.py - Define JSON schema following OpenAI format
- Register in
model_tools.py(schemas and handlers) - Add to appropriate toolset in
toolsets.py - Update
tools/__init__.pyexports
Stateful Tools
Some tools maintain state across calls within a session:
- Terminal: Keeps container/sandbox running between commands
- Browser: Maintains browser session for multi-step navigation
State is managed per task_id and cleaned up automatically.
Terminal Backends
The terminal tool supports multiple execution backends:
| Backend | Description | Use Case |
|---|---|---|
local |
Direct execution on host | Development, simple tasks |
ssh |
Remote execution via SSH | Sandboxing (agent can't modify its own code) |
docker |
Docker container | Isolation, reproducibility |
singularity |
Singularity/Apptainer | HPC clusters, rootless containers |
modal |
Modal cloud | Scalable cloud compute, GPUs |
Configure via environment variables or cli-config.yaml:
# SSH backend example (in cli-config.yaml)
terminal:
env_type: "ssh"
ssh_host: "my-server.example.com"
ssh_user: "myuser"
ssh_key: "~/.ssh/id_rsa"
cwd: "/home/myuser/project"
The SSH backend uses ControlMaster for connection persistence, making subsequent commands fast.
Skills Tools (Progressive Disclosure)
Skills are on-demand knowledge documents. They use progressive disclosure to minimize tokens:
Level 0: skills_categories() → ["mlops", "devops"] (~50 tokens)
Level 1: skills_list(category) → [{name, description}, ...] (~3k tokens)
Level 2: skill_view(name) → Full content + metadata (varies)
Level 3: skill_view(name, path) → Specific reference file (varies)
Skill directory structure:
skills/
└── mlops/
└── axolotl/
├── SKILL.md # Main instructions (required)
├── references/ # Additional docs
└── templates/ # Output formats, configs
SKILL.md uses YAML frontmatter:
---
name: axolotl
description: Fine-tuning LLMs with Axolotl
tags: [Fine-Tuning, LoRA, DPO]
---