teknium c360da4f35 Enhance documentation for CLI and tool integration

- Updated `.cursorrules` to provide a comprehensive overview of the interactive CLI, including its architecture, key components, and command handling.
- Expanded `README.md` to introduce the CLI features, quick start instructions, and detailed command descriptions for user guidance.
- Added `docs/cli.md` to document CLI usage, configuration, and animated feedback, ensuring clarity for users and developers.
- Revised `docs/tools.md` to include support for SSH backend in terminal tools, enhancing the documentation for terminal execution options.

2026-01-31 06:33:43 +00:00

4.4 KiB


Tools

Tools are functions that extend the agent's capabilities. Each tool is defined with an OpenAI-compatible JSON schema and an async handler function.

Tool Structure

Each tool module in tools/ exports:

- Schema definitions: OpenAI function-calling format
- Handler functions: async functions that execute the tool

# Example: tools/web_tools.py

# Schema definition
WEB_SEARCH_SCHEMA = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for information",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"]
        }
    }
}

# Handler function
async def web_search(query: str) -> dict:
    """Execute web search and return results."""
    # Implementation...
    return {"results": [...]}

Tool Categories

Category	Module	Tools
Web	`web_tools.py`	`web_search`, `web_extract`, `web_crawl`
Terminal	`terminal_tool.py`	`terminal` (local/docker/singularity/modal/ssh backends)
Browser	`browser_tool.py`	`browser_navigate`, `browser_click`, `browser_type`, etc.
Vision	`vision_tools.py`	`vision_analyze`
Image Gen	`image_generation_tool.py`	`image_generate`
Reasoning	`mixture_of_agents_tool.py`	`mixture_of_agents`
Skills	`skills_tool.py`	`skills_categories`, `skills_list`, `skill_view`

Tool Registration

Tools are registered in model_tools.py:

# model_tools.py
TOOL_SCHEMAS = [
    *WEB_TOOL_SCHEMAS,
    *TERMINAL_TOOL_SCHEMAS,
    *BROWSER_TOOL_SCHEMAS,
    # ...
]

TOOL_HANDLERS = {
    "web_search": web_search,
    "terminal": terminal_tool,
    "browser_navigate": browser_navigate,
    # ...
}
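
A minimal sketch of how a handler table like this could be used to route a model's tool call. The `dispatch_tool_call` helper and the stub `web_search` body are assumptions for illustration; they are not taken from model_tools.py.

```python
import asyncio
import json


# Stand-in handler; the real one lives in tools/web_tools.py (this body is hypothetical).
async def web_search(query: str) -> dict:
    return {"results": [f"result for {query}"]}


TOOL_HANDLERS = {"web_search": web_search}


async def dispatch_tool_call(name: str, arguments_json: str) -> dict:
    """Look up the handler by tool name and call it with the decoded JSON arguments."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        kwargs = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return {"error": f"invalid arguments: {exc}"}
    return await handler(**kwargs)


result = asyncio.run(dispatch_tool_call("web_search", '{"query": "python asyncio"}'))
```

Returning an error dict instead of raising keeps the failure visible to the model as an ordinary tool result; whether the project does this is not stated in the source.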

Toolsets

Tools are grouped into toolsets for logical organization (see toolsets.py):

TOOLSETS = {
    "web": {
        "description": "Web search and content extraction",
        "tools": ["web_search", "web_extract", "web_crawl"]
    },
    "terminal": {
        "description": "Command execution",
        "tools": ["terminal"]
    },
    # ...
}
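
One way to turn enabled toolset names into the list of schemas sent to the model is sketched below. The `schemas_for_toolsets` function and the placeholder `make_schema` helper are hypothetical; only the `TOOLSETS` shape follows the excerpt above.

```python
TOOLSETS = {
    "web": {
        "description": "Web search and content extraction",
        "tools": ["web_search", "web_extract", "web_crawl"],
    },
    "terminal": {"description": "Command execution", "tools": ["terminal"]},
}


def make_schema(name: str) -> dict:
    """Build a placeholder schema; real schemas carry descriptions and parameters."""
    return {"type": "function", "function": {"name": name}}


TOOL_SCHEMAS = [
    make_schema(n) for n in ("web_search", "web_extract", "web_crawl", "terminal")
]


def schemas_for_toolsets(enabled: list[str]) -> list[dict]:
    """Return the schemas for every tool in the enabled toolsets, without duplicates."""
    wanted: list[str] = []
    for toolset in enabled:
        if toolset not in TOOLSETS:
            raise KeyError(f"unknown toolset: {toolset}")
        for tool in TOOLSETS[toolset]["tools"]:
            if tool not in wanted:
                wanted.append(tool)
    by_name = {schema["function"]["name"]: schema for schema in TOOL_SCHEMAS}
    return [by_name[name] for name in wanted if name in by_name]


selected = schemas_for_toolsets(["terminal", "web"])
selected_names = [schema["function"]["name"] for schema in selected]
```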

Adding a New Tool

1. Create an async handler function in tools/your_tool.py.
2. Define its JSON schema in the OpenAI function-calling format.
3. Register both the schema and the handler in model_tools.py.
4. Add the tool name to the appropriate toolset in toolsets.py.
5. Export the new names from tools/__init__.py.
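
The steps above can be sketched end to end with a made-up `word_count` tool. The tool, the `text` toolset, and the `check_registration` helper are hypothetical; in the real project the three registries live in separate modules rather than one file.

```python
import asyncio

# Schema in the OpenAI function-calling format (hypothetical tool).
WORD_COUNT_SCHEMA = {
    "type": "function",
    "function": {
        "name": "word_count",
        "description": "Count whitespace-separated words in a text",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "Text to count"}
            },
            "required": ["text"],
        },
    },
}


# Async handler, as it might appear in tools/your_tool.py.
async def word_count(text: str) -> dict:
    return {"words": len(text.split())}


# Registration, mirroring model_tools.py and toolsets.py.
TOOL_SCHEMAS = [WORD_COUNT_SCHEMA]
TOOL_HANDLERS = {"word_count": word_count}
TOOLSETS = {"text": {"description": "Text utilities", "tools": ["word_count"]}}


def check_registration() -> list[str]:
    """Return the names of tools that are missing a schema or a handler."""
    schema_names = {s["function"]["name"] for s in TOOL_SCHEMAS}
    problems = []
    for toolset in TOOLSETS.values():
        for name in toolset["tools"]:
            if name not in schema_names or name not in TOOL_HANDLERS:
                problems.append(name)
    return problems


count = asyncio.run(word_count("tools extend the agent"))
```

A check like `check_registration()` is cheap to run in a test and catches the common mistake of adding a tool to a toolset without registering its schema or handler.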

Stateful Tools

Some tools maintain state across calls within a session:

- Terminal: keeps the container or sandbox running between commands
- Browser: maintains the browser session for multi-step navigation

State is managed per task_id and cleaned up automatically.
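
A minimal sketch of per-`task_id` state, assuming a simple in-memory registry. `SessionRegistry` and `FakeSandbox` are hypothetical names, and cleanup is an explicit call here; how the project triggers automatic cleanup is not described in this document.

```python
class FakeSandbox:
    """Hypothetical stand-in for a container or browser session."""

    def __init__(self, task_id: str) -> None:
        self.task_id = task_id
        self.history: list[str] = []
        self.closed = False

    def run(self, command: str) -> int:
        """Record the command and return how many have run in this session."""
        self.history.append(command)
        return len(self.history)

    def close(self) -> None:
        self.closed = True


class SessionRegistry:
    """Keeps one session per task_id so state survives between tool calls."""

    def __init__(self) -> None:
        self._sessions: dict[str, FakeSandbox] = {}

    def get(self, task_id: str) -> FakeSandbox:
        # Create the session lazily on first use, then reuse it.
        if task_id not in self._sessions:
            self._sessions[task_id] = FakeSandbox(task_id)
        return self._sessions[task_id]

    def cleanup(self, task_id: str) -> None:
        # Remove and close the session; safe to call for unknown task ids.
        session = self._sessions.pop(task_id, None)
        if session is not None:
            session.close()


registry = SessionRegistry()
registry.get("task-1").run("cd /tmp")
registry.get("task-1").run("ls")
```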

Terminal Backends

The terminal tool supports multiple execution backends:

Backend	Description	Use Case
`local`	Direct execution on host	Development, simple tasks
`ssh`	Remote execution via SSH	Sandboxing (agent can't modify its own code)
`docker`	Docker container	Isolation, reproducibility
`singularity`	Singularity/Apptainer	HPC clusters, rootless containers
`modal`	Modal cloud	Scalable cloud compute, GPUs

Configure via environment variables or cli-config.yaml:

# SSH backend example (in cli-config.yaml)
terminal:
  env_type: "ssh"
  ssh_host: "my-server.example.com"
  ssh_user: "myuser"
  ssh_key: "~/.ssh/id_rsa"
  cwd: "/home/myuser/project"

The SSH backend uses OpenSSH's ControlMaster connection sharing, so commands after the first reuse the existing connection instead of reconnecting and re-authenticating each time.
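
For illustration, the sketch below builds an `ssh` command line with connection sharing enabled. `ControlMaster`, `ControlPath`, and `ControlPersist` are standard OpenSSH client options; the `build_ssh_command` helper, the socket location, and the 600-second persist time are assumptions, not the backend's actual code.

```python
import os
import shlex


def build_ssh_command(
    host: str, user: str, key: str, cwd: str, command: str
) -> list[str]:
    """Return an argv list that runs `command` in `cwd` on the remote host."""
    # %r, %h and %p expand to remote user, host and port inside ssh itself.
    control_path = os.path.join(os.path.expanduser("~/.ssh"), "cm-%r@%h:%p")
    # `command` is passed through as shell text; only the directory is quoted.
    remote = f"cd {shlex.quote(cwd)} && {command}"
    return [
        "ssh",
        "-i", os.path.expanduser(key),
        "-o", "ControlMaster=auto",            # reuse a master connection if present
        "-o", f"ControlPath={control_path}",   # socket shared by later invocations
        "-o", "ControlPersist=600",            # keep the master alive for 10 minutes
        f"{user}@{host}",
        remote,
    ]


argv = build_ssh_command(
    host="my-server.example.com",
    user="myuser",
    key="~/.ssh/id_rsa",
    cwd="/home/myuser/project",
    command="ls -la",
)
```

The list can be passed to `subprocess.run(argv)`; the first call opens the master connection and later calls with the same `ControlPath` attach to it.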

Skills Tools (Progressive Disclosure)

Skills are on-demand knowledge documents. They use progressive disclosure to minimize token usage: the agent starts with a short category list and loads more detail only when it needs it:

Level 0: skills_categories()     → ["mlops", "devops"]           (~50 tokens)
Level 1: skills_list(category)   → [{name, description}, ...]   (~3k tokens)
Level 2: skill_view(name)        → Full content + metadata       (varies)
Level 3: skill_view(name, path)  → Specific reference file       (varies)
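
The four levels can be sketched over an in-memory catalog as follows. The function names match the tools listed above, but the bodies, the `SKILLS` data, and the `references/lora.md` file are hypothetical; the real tools read from the skills directory.

```python
from typing import Optional

# Hypothetical catalog: category -> skill name -> skill data.
SKILLS = {
    "mlops": {
        "axolotl": {
            "description": "Fine-tuning LLMs with Axolotl",
            "content": "# Axolotl\nMain instructions.",
            "files": {"references/lora.md": "LoRA notes"},
        }
    },
    "devops": {},
}


def skills_categories() -> list[str]:
    """Level 0: category names only."""
    return sorted(SKILLS)


def skills_list(category: str) -> list[dict]:
    """Level 1: name and description of each skill in one category."""
    return [
        {"name": name, "description": skill["description"]}
        for name, skill in SKILLS.get(category, {}).items()
    ]


def skill_view(name: str, path: Optional[str] = None) -> dict:
    """Level 2 (no path): full content. Level 3 (with path): one reference file."""
    for skills in SKILLS.values():
        if name in skills:
            skill = skills[name]
            if path is None:
                return {"name": name, "content": skill["content"]}
            if path in skill["files"]:
                return {"name": name, "path": path, "content": skill["files"][path]}
            return {"error": f"file not found: {path}"}
    return {"error": f"skill not found: {name}"}


categories = skills_categories()
listing = skills_list("mlops")
reference = skill_view("axolotl", "references/lora.md")
```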

Skill directory structure:

skills/
└── mlops/
    └── axolotl/
        ├── SKILL.md           # Main instructions (required)
        ├── references/        # Additional docs
        └── templates/         # Output formats, configs

SKILL.md uses YAML frontmatter:

---
name: axolotl
description: Fine-tuning LLMs with Axolotl
tags: [Fine-Tuning, LoRA, DPO]
---
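
As a rough illustration of how this frontmatter could be split from the body, the sketch below uses only the standard library. It handles just flat `key: value` pairs and inline `[a, b]` lists, so it is not a YAML parser; `parse_frontmatter` is a hypothetical helper, and a real implementation would more likely use a YAML library.

```python
def parse_frontmatter(text: str) -> tuple[dict, str]:
    """Split a SKILL.md string into (metadata, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text

    meta: dict = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        # Treat "[a, b, c]" as a list of strings.
        if value.startswith("[") and value.endswith("]"):
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        meta[key.strip()] = value

    body = "\n".join(lines[end + 1:])
    return meta, body


SKILL_MD = (
    "---\n"
    "name: axolotl\n"
    "description: Fine-tuning LLMs with Axolotl\n"
    "tags: [Fine-Tuning, LoRA, DPO]\n"
    "---\n"
    "\n"
    "# Axolotl\n"
)

meta, body = parse_frontmatter(SKILL_MD)
```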
