Enhance documentation for CLI and tool integration

- Updated `.cursorrules` to provide a comprehensive overview of the interactive CLI, including its architecture, key components, and command handling.
- Expanded `README.md` to introduce the CLI features, quick start instructions, and detailed command descriptions for user guidance.
- Added `docs/cli.md` to document CLI usage, configuration, and animated feedback, ensuring clarity for users and developers.
- Revised `docs/tools.md` to include support for SSH backend in terminal tools, enhancing the documentation for terminal execution options.
This commit is contained in:
teknium
2026-01-31 06:33:43 +00:00
parent bc76a032ba
commit c360da4f35
4 changed files with 382 additions and 7 deletions

View File

@@ -1,4 +1,4 @@
Hermes-Agent is an agent harness for LLMs. Hermes-Agent is an agent harness for LLMs with an interactive CLI.
## Development Environment ## Development Environment
@@ -9,12 +9,15 @@ source venv/bin/activate # Before running any Python commands
## Project Structure ## Project Structure
- `hermes` - CLI launcher script (run with `./hermes`)
- `cli.py` - Interactive CLI with Rich UI, prompt_toolkit, animated spinners
- `cli-config.yaml` - CLI configuration (model, terminal, toolsets, personalities)
- `tools/` - Individual tool implementations (web, terminal, browser, vision, etc.) - `tools/` - Individual tool implementations (web, terminal, browser, vision, etc.)
- `tools/__init__.py` - Exports all tools for importing - `tools/__init__.py` - Exports all tools for importing
- `model_tools.py` - Consolidates tool schemas and handlers for the agent - `model_tools.py` - Consolidates tool schemas and handlers for the agent
- `toolsets.py` - Groups tools into logical toolsets (web, terminal, browser, etc.) - `toolsets.py` - Groups tools into logical toolsets (web, terminal, browser, etc.)
- `toolset_distributions.py` - Probability-based tool selection for data generation - `toolset_distributions.py` - Probability-based tool selection for data generation
- `run_agent.py` - Primary agent runner with AIAgent class - `run_agent.py` - Primary agent runner with AIAgent class and KawaiiSpinner
- `batch_runner.py` - Parallel batch processing with checkpointing - `batch_runner.py` - Parallel batch processing with checkpointing
- `tests/` - Test scripts - `tests/` - Test scripts
@@ -24,11 +27,33 @@ source venv/bin/activate # Before running any Python commands
tools/*.py → tools/__init__.py → model_tools.py → toolsets.py → toolset_distributions.py tools/*.py → tools/__init__.py → model_tools.py → toolsets.py → toolset_distributions.py
run_agent.py ──────────────────────────┘ run_agent.py ──────────────────────────┘
cli.py → run_agent.py (uses AIAgent with quiet_mode=True)
batch_runner.py → run_agent.py + toolset_distributions.py batch_runner.py → run_agent.py + toolset_distributions.py
``` ```
Always ensure consistency between tools, model_tools.py, and toolsets.py when changing any of them. Always ensure consistency between tools, model_tools.py, and toolsets.py when changing any of them.
## CLI Architecture (cli.py)
The interactive CLI uses:
- **Rich** - For the welcome banner and styled panels
- **prompt_toolkit** - For fixed input area with history and `patch_stdout`
- **KawaiiSpinner** (in run_agent.py) - Animated feedback during API calls and tool execution
Key components:
- `HermesCLI` class - Main CLI controller with commands and conversation loop
- `load_cli_config()` - Loads `cli-config.yaml`, sets environment variables for terminal
- `build_welcome_banner()` - Displays ASCII art logo, tools, and skills summary
- `/commands` - Process user commands like `/help`, `/clear`, `/personality`, etc.
CLI uses `quiet_mode=True` when creating AIAgent to suppress verbose logging and enable kawaii-style feedback instead.
### Adding CLI Commands
1. Add to `COMMANDS` dict with description
2. Add handler in `process_command()` method
3. For persistent settings, use `save_config_value()` to update `cli-config.yaml`
## Adding a New Tool ## Adding a New Tool
Follow this strict order to maintain consistency: Follow this strict order to maintain consistency:
@@ -92,6 +117,11 @@ API keys are loaded from `.env` file in repo root:
- `FAL_KEY` - Image generation (FLUX model) - `FAL_KEY` - Image generation (FLUX model)
- `NOUS_API_KEY` - Vision and Mixture-of-Agents tools - `NOUS_API_KEY` - Vision and Mixture-of-Agents tools
Terminal tool configuration (can also be set in `cli-config.yaml`):
- `TERMINAL_ENV` - Backend: local, docker, singularity, modal, or ssh
- `TERMINAL_CWD` - Working directory
- `TERMINAL_SSH_HOST`, `TERMINAL_SSH_USER`, `TERMINAL_SSH_KEY` - For SSH backend
## Agent Loop (run_agent.py) ## Agent Loop (run_agent.py)
The AIAgent class handles: The AIAgent class handles:

110
README.md
View File

@@ -4,8 +4,9 @@ An AI agent with advanced tool-calling capabilities, featuring a flexible toolse
## Features ## Features
- **Interactive CLI**: Beautiful terminal interface with animated feedback, personalities, and session management
- **Web Tools**: Search, extract content, and crawl websites - **Web Tools**: Search, extract content, and crawl websites
- **Terminal Tools**: Execute commands via mini-swe-agent (local, Docker, or Modal backends) - **Terminal Tools**: Execute commands via local, Docker, Singularity, Modal, or SSH backends
- **Browser Tools**: Automate web browsers to navigate, click, type, and extract content - **Browser Tools**: Automate web browsers to navigate, click, type, and extract content
- **Vision Tools**: Analyze images from URLs - **Vision Tools**: Analyze images from URLs
- **Reasoning Tools**: Advanced multi-model reasoning (Mixture of Agents) - **Reasoning Tools**: Advanced multi-model reasoning (Mixture of Agents)
@@ -15,6 +16,23 @@ An AI agent with advanced tool-calling capabilities, featuring a flexible toolse
- **Batch Processing**: Process datasets in parallel with checkpointing and statistics tracking - **Batch Processing**: Process datasets in parallel with checkpointing and statistics tracking
- **Ephemeral System Prompts**: Guide model behavior without polluting training datasets - **Ephemeral System Prompts**: Guide model behavior without polluting training datasets
## Quick Start (CLI)
```bash
# After setup (see below), just run:
./hermes
# Or with options:
./hermes --model "anthropic/claude-sonnet-4" --toolsets "web,terminal"
```
The CLI provides:
- Animated spinners during thinking and tool execution
- Kawaii-style feedback messages
- `/commands` for configuration, history, and session management
- Customizable personalities (`/personality kawaii`, `/personality pirate`, etc.)
- Persistent configuration via `cli-config.yaml`
## Setup ## Setup
### 1. Clone the Repository ### 1. Clone the Repository
@@ -65,11 +83,12 @@ nano .env # or use your preferred editor
### 4. Configure Terminal Backend ### 4. Configure Terminal Backend
The terminal tool uses **mini-swe-agent** environments. Configure in `.env`: The terminal tool uses **mini-swe-agent** environments. Configure in `.env` or `cli-config.yaml`:
```bash ```bash
# Backend: "local", "docker", "singularity", or "modal" # Backend: "local", "docker", "singularity", "modal", or "ssh"
TERMINAL_ENV=local # Default: runs on host machine (no isolation) TERMINAL_ENV=local # Default: runs on host machine (no isolation)
TERMINAL_ENV=ssh # Remote execution via SSH (agent code stays local)
TERMINAL_ENV=singularity # Recommended for HPC: Apptainer/Singularity containers TERMINAL_ENV=singularity # Recommended for HPC: Apptainer/Singularity containers
TERMINAL_ENV=docker # Isolated Docker containers TERMINAL_ENV=docker # Isolated Docker containers
TERMINAL_ENV=modal # Cloud execution via Modal TERMINAL_ENV=modal # Cloud execution via Modal
@@ -78,10 +97,16 @@ TERMINAL_ENV=modal # Cloud execution via Modal
TERMINAL_DOCKER_IMAGE=python:3.11-slim TERMINAL_DOCKER_IMAGE=python:3.11-slim
TERMINAL_SINGULARITY_IMAGE=docker://python:3.11-slim TERMINAL_SINGULARITY_IMAGE=docker://python:3.11-slim
TERMINAL_TIMEOUT=60 TERMINAL_TIMEOUT=60
# SSH backend (for ssh)
TERMINAL_SSH_HOST=my-server.example.com
TERMINAL_SSH_USER=myuser
TERMINAL_SSH_KEY=~/.ssh/id_rsa # Optional, uses ssh-agent if not set
``` ```
**Backend Requirements:** **Backend Requirements:**
- **local**: No extra setup (runs directly on your machine, no isolation) - **local**: No extra setup (runs directly on your machine, no isolation)
- **ssh**: SSH access to remote machine (great for sandboxing - agent can't touch its own code)
- **singularity**: Requires Apptainer or Singularity installed (common on HPC clusters, no root needed) - **singularity**: Requires Apptainer or Singularity installed (common on HPC clusters, no root needed)
- **docker**: Requires Docker installed and user in `docker` group - **docker**: Requires Docker installed and user in `docker` group
- **modal**: Requires Modal account (see setup below) - **modal**: Requires Modal account (see setup below)
@@ -232,6 +257,80 @@ Skills can include:
- `templates/` - Output formats, config files, boilerplate code - `templates/` - Output formats, config files, boilerplate code
- `scripts/` - Executable helpers (Python, shell scripts) - `scripts/` - Executable helpers (Python, shell scripts)
## Interactive CLI
The CLI provides a rich interactive experience for working with the agent.
### Running the CLI
```bash
# Basic usage
./hermes
# With specific model
./hermes --model "anthropic/claude-sonnet-4"
# With specific toolsets
./hermes --toolsets "web,terminal,skills"
```
### CLI Commands
| Command | Description |
|---------|-------------|
| `/help` | Show available commands |
| `/tools` | List available tools by toolset |
| `/toolsets` | List available toolsets |
| `/model [name]` | Show or change the current model |
| `/prompt [text]` | View/set custom system prompt |
| `/personality [name]` | Set a predefined personality |
| `/clear` | Clear screen and reset conversation |
| `/reset` | Reset conversation only |
| `/history` | Show conversation history |
| `/save` | Save current conversation to file |
| `/config` | Show current configuration |
| `/quit` | Exit the CLI |
### Configuration
Copy `cli-config.yaml.example` to `cli-config.yaml` and customize:
```yaml
# Model settings
model:
default: "anthropic/claude-sonnet-4"
# Terminal backend (local, docker, singularity, modal, or ssh)
terminal:
env_type: "local"
cwd: "." # Use current directory
# Or use SSH for remote execution (keeps agent code isolated)
# terminal:
# env_type: "ssh"
# ssh_host: "my-server.example.com"
# ssh_user: "myuser"
# ssh_key: "~/.ssh/id_rsa"
# cwd: "/home/myuser/project"
# Enable specific toolsets
toolsets:
- all # or: web, terminal, browser, vision, etc.
# Custom personalities (use with /personality command)
agent:
personalities:
helpful: "You are a helpful assistant."
kawaii: "You are a kawaii assistant! Use cute expressions..."
```
### Personalities
Built-in personalities available via `/personality`:
- `helpful`, `concise`, `technical`, `creative`, `teacher`
- `kawaii`, `catgirl`, `pirate`, `shakespeare`, `surfer`
- `noir`, `uwu`, `philosopher`, `hype`
## Toolsets System ## Toolsets System
The agent uses a toolsets system for organizing and managing tools. All tools must be part of a toolset to be accessible - individual tool selection is not supported. This ensures consistent and logical grouping of capabilities. The agent uses a toolsets system for organizing and managing tools. All tools must be part of a toolset to be accessible - individual tool selection is not supported. This ensures consistent and logical grouping of capabilities.
@@ -456,6 +555,9 @@ All environment variables can be configured in the `.env` file (copy from `.env.
| File | Purpose | | File | Purpose |
|------|---------| |------|---------|
| `hermes` | CLI launcher script (run with `./hermes`) |
| `cli.py` | Interactive CLI implementation |
| `cli-config.yaml` | CLI configuration (copy from `.example`) |
| `run_agent.py` | Main agent runner - single query execution | | `run_agent.py` | Main agent runner - single query execution |
| `batch_runner.py` | Parallel batch processing with checkpointing | | `batch_runner.py` | Parallel batch processing with checkpointing |
| `model_tools.py` | Core tool definitions and handlers | | `model_tools.py` | Core tool definitions and handlers |
@@ -465,5 +567,5 @@ All environment variables can be configured in the `.env` file (copy from `.env.
| `tools/` | Individual tool implementations | | `tools/` | Individual tool implementations |
| `tools/skills_tool.py` | Skills system with progressive disclosure | | `tools/skills_tool.py` | Skills system with progressive disclosure |
| `skills/` | On-demand knowledge documents | | `skills/` | On-demand knowledge documents |
| `architecture/` | Design documentation | | `docs/` | Documentation |
| `configs/` | Example batch run scripts | | `configs/` | Example batch run scripts |

217
docs/cli.md Normal file
View File

@@ -0,0 +1,217 @@
# CLI
The Hermes Agent CLI provides an interactive terminal interface for working with the agent.
## Running the CLI
```bash
# Basic usage
./hermes
# With specific model
./hermes --model "anthropic/claude-sonnet-4"
# With specific toolsets
./hermes --toolsets "web,terminal,skills"
# Verbose mode
./hermes --verbose
```
## Architecture
The CLI is implemented in `cli.py` and uses:
- **Rich** - Welcome banner with ASCII art and styled panels
- **prompt_toolkit** - Fixed input area with command history
- **KawaiiSpinner** - Animated feedback during operations
```
┌─────────────────────────────────────────────────┐
│ HERMES-AGENT ASCII Logo │
│ ┌─────────────┐ ┌────────────────────────────┐ │
│ │ Caduceus │ │ Model: claude-opus-4.5 │ │
│ │ ASCII Art │ │ Terminal: local │ │
│ │ │ │ Working Dir: /home/user │ │
│ │ │ │ Available Tools: 19 │ │
│ │ │ │ Available Skills: 12 │ │
│ └─────────────┘ └────────────────────────────┘ │
└─────────────────────────────────────────────────┘
│ Conversation output scrolls here... │
│ │
│ User: Hello! │
│ ────────────────────────────────────────────── │
│ (◕‿◕✿) 🧠 pondering... (2.3s) │
│ ✧٩(ˊᗜˋ*)و✧ got it! (2.3s) │
│ │
│ Assistant: Hello! How can I help you today? │
├─────────────────────────────────────────────────┤
[Fixed input area at bottom] │
└─────────────────────────────────────────────────┘
```
## Commands
| Command | Description |
|---------|-------------|
| `/help` | Show available commands |
| `/tools` | List available tools grouped by toolset |
| `/toolsets` | List available toolsets with descriptions |
| `/model [name]` | Show or change the current model |
| `/prompt [text]` | View/set/clear custom system prompt |
| `/personality [name]` | Set a predefined personality |
| `/clear` | Clear screen and reset conversation |
| `/reset` | Reset conversation only (keep screen) |
| `/history` | Show conversation history |
| `/save` | Save current conversation to file |
| `/config` | Show current configuration |
| `/quit` | Exit the CLI (also: `/exit`, `/q`) |
## Configuration
The CLI is configured via `cli-config.yaml`. Copy from `cli-config.yaml.example`:
```bash
cp cli-config.yaml.example cli-config.yaml
```
### Model Configuration
```yaml
model:
default: "anthropic/claude-opus-4.5"
base_url: "https://openrouter.ai/api/v1"
```
### Terminal Configuration
The CLI supports multiple terminal backends:
```yaml
# Local execution (default)
terminal:
env_type: "local"
cwd: "." # Current directory
# SSH remote execution (sandboxed - agent can't touch its own code)
terminal:
env_type: "ssh"
cwd: "/home/myuser/project"
ssh_host: "my-server.example.com"
ssh_user: "myuser"
ssh_key: "~/.ssh/id_rsa"
# Docker container
terminal:
env_type: "docker"
docker_image: "python:3.11"
# Singularity/Apptainer (HPC)
terminal:
env_type: "singularity"
singularity_image: "docker://python:3.11"
# Modal cloud
terminal:
env_type: "modal"
modal_image: "python:3.11"
```
### Toolsets
Control which tools are available:
```yaml
# Enable all tools
toolsets:
- all
# Or enable specific toolsets
toolsets:
- web
- terminal
- skills
```
Available toolsets: `web`, `search`, `terminal`, `browser`, `vision`, `image_gen`, `skills`, `moa`, `debugging`, `safe`
### Personalities
Predefined personalities for the `/personality` command:
```yaml
agent:
personalities:
helpful: "You are a helpful, friendly AI assistant."
kawaii: "You are a kawaii assistant! Use cute expressions..."
pirate: "Arrr! Ye be talkin' to Captain Hermes..."
# Add your own!
```
Built-in personalities:
- `helpful`, `concise`, `technical`, `creative`, `teacher`
- `kawaii`, `catgirl`, `pirate`, `shakespeare`, `surfer`
- `noir`, `uwu`, `philosopher`, `hype`
## Animated Feedback
The CLI provides animated feedback during operations:
### Thinking Animation
During API calls, shows animated spinner with thinking verbs:
```
◜ (。•́︿•̀。) pondering... (1.2s)
◠ (⊙_⊙) contemplating... (2.4s)
✧٩(ˊᗜˋ*)و✧ got it! (3.1s)
```
### Tool Execution Animation
Each tool type has unique animations:
```
⠋ (◕‿◕✿) 🔍 web_search... (0.8s)
▅ (≧◡≦) 💻 terminal... (1.2s)
🌓 (★ω★) 🌐 browser_navigate... (2.1s)
✧ (✿◠‿◠) 🎨 image_generate... (4.5s)
```
## Multi-line Input
For multi-line input, end a line with `\` to continue:
```
Write a function that:\
1. Takes a list of numbers\
2. Returns the sum
```
## Environment Variable Priority
For terminal settings, `cli-config.yaml` takes precedence over `.env`:
1. `cli-config.yaml` (highest priority in CLI)
2. `.env` file
3. System environment variables
4. Default values
This allows you to have different terminal configs for CLI vs batch processing.
## Session Management
- **History**: Command history is saved to `~/.hermes_history`
- **Conversations**: Use `/save` to export conversations
- **Reset**: Use `/clear` for full reset, `/reset` to just clear history
## Quiet Mode
The CLI runs in "quiet mode" (`HERMES_QUIET=1`), which:
- Suppresses verbose logging from tools
- Enables kawaii-style animated feedback
- Hides terminal environment warnings
- Keeps output clean and user-friendly
For verbose output (debugging), use:
```bash
./hermes --verbose
```

View File

@@ -39,7 +39,7 @@ async def web_search(query: str) -> dict:
| Category | Module | Tools | | Category | Module | Tools |
|----------|--------|-------| |----------|--------|-------|
| **Web** | `web_tools.py` | `web_search`, `web_extract`, `web_crawl` | | **Web** | `web_tools.py` | `web_search`, `web_extract`, `web_crawl` |
| **Terminal** | `terminal_tool.py` | `terminal` (local/docker/singularity/modal backends) | | **Terminal** | `terminal_tool.py` | `terminal` (local/docker/singularity/modal/ssh backends) |
| **Browser** | `browser_tool.py` | `browser_navigate`, `browser_click`, `browser_type`, etc. | | **Browser** | `browser_tool.py` | `browser_navigate`, `browser_click`, `browser_type`, etc. |
| **Vision** | `vision_tools.py` | `vision_analyze` | | **Vision** | `vision_tools.py` | `vision_analyze` |
| **Image Gen** | `image_generation_tool.py` | `image_generate` | | **Image Gen** | `image_generation_tool.py` | `image_generate` |
@@ -102,6 +102,32 @@ Some tools maintain state across calls within a session:
State is managed per `task_id` and cleaned up automatically. State is managed per `task_id` and cleaned up automatically.
## Terminal Backends
The terminal tool supports multiple execution backends:
| Backend | Description | Use Case |
|---------|-------------|----------|
| `local` | Direct execution on host | Development, simple tasks |
| `ssh` | Remote execution via SSH | Sandboxing (agent can't modify its own code) |
| `docker` | Docker container | Isolation, reproducibility |
| `singularity` | Singularity/Apptainer | HPC clusters, rootless containers |
| `modal` | Modal cloud | Scalable cloud compute, GPUs |
Configure via environment variables or `cli-config.yaml`:
```yaml
# SSH backend example (in cli-config.yaml)
terminal:
env_type: "ssh"
ssh_host: "my-server.example.com"
ssh_user: "myuser"
ssh_key: "~/.ssh/id_rsa"
cwd: "/home/myuser/project"
```
The SSH backend uses ControlMaster for connection persistence, making subsequent commands fast.
## Skills Tools (Progressive Disclosure) ## Skills Tools (Progressive Disclosure)
Skills are on-demand knowledge documents. They use **progressive disclosure** to minimize tokens: Skills are on-demand knowledge documents. They use **progressive disclosure** to minimize tokens: