Enhance documentation for CLI and tool integration

- Updated `.cursorrules` to provide a comprehensive overview of the interactive CLI, including its architecture, key components, and command handling.
- Expanded `README.md` to introduce the CLI features, quick start instructions, and detailed command descriptions for user guidance.
- Added `docs/cli.md` to document CLI usage, configuration, and animated feedback, ensuring clarity for users and developers.
- Revised `docs/tools.md` to include support for SSH backend in terminal tools, enhancing the documentation for terminal execution options.
2026-01-31 06:33:43 +00:00
parent bc76a032ba
commit c360da4f35
4 changed files with 382 additions and 7 deletions
--- a/.cursorrules
+++ b/.cursorrules
@@ -1,4 +1,4 @@
-Hermes-Agent is an agent harness for LLMs.
+Hermes-Agent is an agent harness for LLMs with an interactive CLI.
 ## Development Environment
@@ -9,12 +9,15 @@ source venv/bin/activate  # Before running any Python commands
 ## Project Structure
 - `hermes` - CLI launcher script (run with `./hermes`)
 - `cli.py` - Interactive CLI with Rich UI, prompt_toolkit, animated spinners
 - `cli-config.yaml` - CLI configuration (model, terminal, toolsets, personalities)
 - `tools/` - Individual tool implementations (web, terminal, browser, vision, etc.)
 - `tools/__init__.py` - Exports all tools for importing
 - `model_tools.py` - Consolidates tool schemas and handlers for the agent
 - `toolsets.py` - Groups tools into logical toolsets (web, terminal, browser, etc.)
 - `toolset_distributions.py` - Probability-based tool selection for data generation
- `run_agent.py` - Primary agent runner with AIAgent class
+- `run_agent.py` - Primary agent runner with AIAgent class and KawaiiSpinner
 - `batch_runner.py` - Parallel batch processing with checkpointing
 - `tests/` - Test scripts
@@ -24,11 +27,33 @@ source venv/bin/activate  # Before running any Python commands
 tools/*.py → tools/__init__.py → model_tools.py → toolsets.py → toolset_distributions.py
                                       ↑
 run_agent.py ──────────────────────────┘
 cli.py → run_agent.py (uses AIAgent with quiet_mode=True)
 batch_runner.py → run_agent.py + toolset_distributions.py
 ```
 Always ensure consistency between tools, model_tools.py, and toolsets.py when changing any of them.
 ## CLI Architecture (cli.py)
 The interactive CLI uses:
 - **Rich** - For the welcome banner and styled panels
 - **prompt_toolkit** - For fixed input area with history and `patch_stdout`
 - **KawaiiSpinner** (in run_agent.py) - Animated feedback during API calls and tool execution
 Key components:
 - `HermesCLI` class - Main CLI controller with commands and conversation loop
 - `load_cli_config()` - Loads `cli-config.yaml`, sets environment variables for terminal
 - `build_welcome_banner()` - Displays ASCII art logo, tools, and skills summary
 - `/commands` - Process user commands like `/help`, `/clear`, `/personality`, etc.
 The CLI passes `quiet_mode=True` when creating `AIAgent`, which suppresses verbose logging and enables kawaii-style feedback instead.
 ### Adding CLI Commands
 1. Add to `COMMANDS` dict with description
 2. Add handler in `process_command()` method
 3. For persistent settings, use `save_config_value()` to update `cli-config.yaml`
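The three steps above might look roughly like the following. This is a minimal sketch rather than the actual `cli.py` code: only `COMMANDS` and `process_command()` are named in the text, while the `MiniCLI` class, the `/ping` command, and the string return values are hypothetical. Step 3 is not shown because the signature of `save_config_value()` is not given here.

```python
# Hypothetical sketch of command registration; not the real HermesCLI.
COMMANDS = {
    "/help": "Show available commands",
    "/ping": "Reply with 'pong' (illustrative command, not part of Hermes)",
}


class MiniCLI:
    """Stand-in for HermesCLI, reduced to command dispatch."""

    def process_command(self, line: str) -> str:
        # Split "/cmd args" into the command name and its argument string.
        name, _, args = line.strip().partition(" ")
        if name not in COMMANDS:
            return f"Unknown command: {name}"
        if name == "/help":
            return "\n".join(f"{cmd} - {desc}" for cmd, desc in COMMANDS.items())
        if name == "/ping":
            return f"pong {args}".strip()
        return f"No handler for {name}"
```

Keeping the description in `COMMANDS` and the behavior in `process_command()` means a `/help` listing can be generated from the dict without touching the handlers.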
 ## Adding a New Tool
 Follow this strict order to maintain consistency:
@@ -92,6 +117,11 @@ API keys are loaded from `.env` file in repo root:
 - `FAL_KEY` - Image generation (FLUX model)
 - `NOUS_API_KEY` - Vision and Mixture-of-Agents tools
 Terminal tool configuration (can also be set in `cli-config.yaml`):
 - `TERMINAL_ENV` - Backend: local, docker, singularity, modal, or ssh
 - `TERMINAL_CWD` - Working directory
 - `TERMINAL_SSH_HOST`, `TERMINAL_SSH_USER`, `TERMINAL_SSH_KEY` - For SSH backend
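As an illustration of how `cli-config.yaml` terminal settings could end up in these variables, here is a small sketch. The key names (`env_type`, `cwd`, `ssh_host`, and so on) and the variable names come from the documentation in this commit, but the pairing between them and the `export_terminal_settings` helper are assumptions, not the real `load_cli_config()` logic.

```python
import os

# Assumed mapping from cli-config.yaml `terminal` keys to environment variables.
TERMINAL_KEY_TO_ENV = {
    "env_type": "TERMINAL_ENV",
    "cwd": "TERMINAL_CWD",
    "ssh_host": "TERMINAL_SSH_HOST",
    "ssh_user": "TERMINAL_SSH_USER",
    "ssh_key": "TERMINAL_SSH_KEY",
}


def export_terminal_settings(terminal_cfg: dict, environ=None) -> dict:
    """Copy known terminal settings into an environment mapping.

    `environ` defaults to os.environ; pass a plain dict to avoid side effects.
    """
    env = os.environ if environ is None else environ
    for key, var in TERMINAL_KEY_TO_ENV.items():
        value = terminal_cfg.get(key)
        if value is not None:
            env[var] = str(value)  # environment values must be strings
    return env
```

Unknown keys are ignored in this sketch, so a typo in the config would silently have no effect; a real loader may prefer to warn.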
 ## Agent Loop (run_agent.py)
 The AIAgent class handles:
--- a/README.md
+++ b/README.md
@@ -4,8 +4,9 @@ An AI agent with advanced tool-calling capabilities, featuring a flexible toolse
 ## Features
 - **Interactive CLI**: Beautiful terminal interface with animated feedback, personalities, and session management
 - **Web Tools**: Search, extract content, and crawl websites
- **Terminal Tools**: Execute commands via mini-swe-agent (local, Docker, or Modal backends)
+- **Terminal Tools**: Execute commands via local, Docker, Singularity, Modal, or SSH backends
 - **Browser Tools**: Automate web browsers to navigate, click, type, and extract content
 - **Vision Tools**: Analyze images from URLs
 - **Reasoning Tools**: Advanced multi-model reasoning (Mixture of Agents)
@@ -15,6 +16,23 @@ An AI agent with advanced tool-calling capabilities, featuring a flexible toolse
 - **Batch Processing**: Process datasets in parallel with checkpointing and statistics tracking
 - **Ephemeral System Prompts**: Guide model behavior without polluting training datasets
 ## Quick Start (CLI)
 ```bash
 # After setup (see below), just run:
 ./hermes
 # Or with options:
 ./hermes --model "anthropic/claude-sonnet-4" --toolsets "web,terminal"
 ```
 The CLI provides:
 - Animated spinners during thinking and tool execution
 - Kawaii-style feedback messages
 - `/commands` for configuration, history, and session management
 - Customizable personalities (`/personality kawaii`, `/personality pirate`, etc.)
 - Persistent configuration via `cli-config.yaml`
 ## Setup
 ### 1. Clone the Repository
@@ -65,11 +83,12 @@ nano .env  # or use your preferred editor
 ### 4. Configure Terminal Backend
-The terminal tool uses **mini-swe-agent** environments. Configure in `.env`:
+The terminal tool uses **mini-swe-agent** environments. Configure in `.env` or `cli-config.yaml`:
 ```bash
-# Backend: "local", "docker", "singularity", or "modal"
+# Backend: "local", "docker", "singularity", "modal", or "ssh"
 TERMINAL_ENV=local          # Default: runs on host machine (no isolation)
 TERMINAL_ENV=ssh            # Remote execution via SSH (agent code stays local)
 TERMINAL_ENV=singularity    # Recommended for HPC: Apptainer/Singularity containers
 TERMINAL_ENV=docker         # Isolated Docker containers
 TERMINAL_ENV=modal          # Cloud execution via Modal
@@ -78,10 +97,16 @@ TERMINAL_ENV=modal          # Cloud execution via Modal
 TERMINAL_DOCKER_IMAGE=python:3.11-slim
 TERMINAL_SINGULARITY_IMAGE=docker://python:3.11-slim
 TERMINAL_TIMEOUT=60
 # SSH backend (for ssh)
 TERMINAL_SSH_HOST=my-server.example.com
 TERMINAL_SSH_USER=myuser
 TERMINAL_SSH_KEY=~/.ssh/id_rsa  # Optional, uses ssh-agent if not set
 ```
 **Backend Requirements:**
 - **local**: No extra setup (runs directly on your machine, no isolation)
 - **ssh**: SSH access to remote machine (great for sandboxing - agent can't touch its own code)
 - **singularity**: Requires Apptainer or Singularity installed (common on HPC clusters, no root needed)
 - **docker**: Requires Docker installed and user in `docker` group
 - **modal**: Requires Modal account (see setup below)
@@ -232,6 +257,80 @@ Skills can include:
 - `templates/` - Output formats, config files, boilerplate code
 - `scripts/` - Executable helpers (Python, shell scripts)
 ## Interactive CLI
 The CLI provides a rich interactive experience for working with the agent.
 ### Running the CLI
 ```bash
 # Basic usage
 ./hermes
 # With specific model
 ./hermes --model "anthropic/claude-sonnet-4"
 # With specific toolsets
 ./hermes --toolsets "web,terminal,skills"
 ```
 ### CLI Commands
 | Command | Description |
 |---------|-------------|
 | `/help` | Show available commands |
 | `/tools` | List available tools by toolset |
 | `/toolsets` | List available toolsets |
 | `/model [name]` | Show or change the current model |
 | `/prompt [text]` | View/set custom system prompt |
 | `/personality [name]` | Set a predefined personality |
 | `/clear` | Clear screen and reset conversation |
 | `/reset` | Reset conversation only |
 | `/history` | Show conversation history |
 | `/save` | Save current conversation to file |
 | `/config` | Show current configuration |
 | `/quit` | Exit the CLI |
 ### Configuration
 Copy `cli-config.yaml.example` to `cli-config.yaml` and customize:
 ```yaml
 # Model settings
 model:
  default: "anthropic/claude-sonnet-4"
 # Terminal backend (local, docker, singularity, modal, or ssh)
 terminal:
  env_type: "local"
  cwd: "."  # Use current directory
 # Or use SSH for remote execution (keeps agent code isolated)
 # terminal:
 #   env_type: "ssh"
 #   ssh_host: "my-server.example.com"
 #   ssh_user: "myuser"
 #   ssh_key: "~/.ssh/id_rsa"
 #   cwd: "/home/myuser/project"
 # Enable specific toolsets
 toolsets:
  - all  # or: web, terminal, browser, vision, etc.
 # Custom personalities (use with /personality command)
 agent:
  personalities:
    helpful: "You are a helpful assistant."
    kawaii: "You are a kawaii assistant! Use cute expressions..."
 ```
 ### Personalities
 Built-in personalities available via `/personality`:
 - `helpful`, `concise`, `technical`, `creative`, `teacher`
 - `kawaii`, `catgirl`, `pirate`, `shakespeare`, `surfer`
 - `noir`, `uwu`, `philosopher`, `hype`
 ## Toolsets System
 The agent uses a toolsets system for organizing and managing tools. All tools must be part of a toolset to be accessible - individual tool selection is not supported. This ensures consistent and logical grouping of capabilities.
@@ -456,6 +555,9 @@ All environment variables can be configured in the `.env` file (copy from `.env.
 | File | Purpose |
 |------|---------|
 | `hermes` | CLI launcher script (run with `./hermes`) |
 | `cli.py` | Interactive CLI implementation |
 | `cli-config.yaml` | CLI configuration (copy from `.example`) |
 | `run_agent.py` | Main agent runner - single query execution |
 | `batch_runner.py` | Parallel batch processing with checkpointing |
 | `model_tools.py` | Core tool definitions and handlers |
@@ -465,5 +567,5 @@ All environment variables can be configured in the `.env` file (copy from `.env.
 | `tools/` | Individual tool implementations |
 | `tools/skills_tool.py` | Skills system with progressive disclosure |
 | `skills/` | On-demand knowledge documents |
-| `architecture/` | Design documentation |
+| `docs/` | Documentation |
 | `configs/` | Example batch run scripts |
--- a/docs/cli.md
+++ b/docs/cli.md
@@ -0,0 +1,217 @@
 # CLI
 The Hermes Agent CLI provides an interactive terminal interface for working with the agent.
 ## Running the CLI
 ```bash
 # Basic usage
 ./hermes
 # With specific model
 ./hermes --model "anthropic/claude-sonnet-4"
 # With specific toolsets
 ./hermes --toolsets "web,terminal,skills"
 # Verbose mode
 ./hermes --verbose
 ```
 ## Architecture
 The CLI is implemented in `cli.py` and uses:
 - **Rich** - Welcome banner with ASCII art and styled panels
 - **prompt_toolkit** - Fixed input area with command history
 - **KawaiiSpinner** - Animated feedback during operations
 ```
 ┌─────────────────────────────────────────────────┐
 │  HERMES-AGENT ASCII Logo                        │
 │  ┌─────────────┐ ┌────────────────────────────┐ │
 │  │  Caduceus   │ │ Model: claude-opus-4.5     │ │
 │  │  ASCII Art  │ │ Terminal: local            │ │
 │  │             │ │ Working Dir: /home/user    │ │
 │  │             │ │ Available Tools: 19        │ │
 │  │             │ │ Available Skills: 12       │ │
 │  └─────────────┘ └────────────────────────────┘ │
 └─────────────────────────────────────────────────┘
 │ Conversation output scrolls here...             │
 │                                                 │
 │ User: Hello!                                    │
 │ ────────────────────────────────────────────── │
 │   (◕‿◕✿) 🧠 pondering... (2.3s)                │
 │   ✧٩(ˊᗜˋ*)و✧ got it! (2.3s)                    │
 │                                                 │
 │ Assistant: Hello! How can I help you today?    │
 ├─────────────────────────────────────────────────┤
 │ ❯ [Fixed input area at bottom]                  │
 └─────────────────────────────────────────────────┘
 ```
 ## Commands
 | Command | Description |
 |---------|-------------|
 | `/help` | Show available commands |
 | `/tools` | List available tools grouped by toolset |
 | `/toolsets` | List available toolsets with descriptions |
 | `/model [name]` | Show or change the current model |
 | `/prompt [text]` | View/set/clear custom system prompt |
 | `/personality [name]` | Set a predefined personality |
 | `/clear` | Clear screen and reset conversation |
 | `/reset` | Reset conversation only (keep screen) |
 | `/history` | Show conversation history |
 | `/save` | Save current conversation to file |
 | `/config` | Show current configuration |
 | `/quit` | Exit the CLI (also: `/exit`, `/q`) |
 ## Configuration
 The CLI is configured via `cli-config.yaml`. Copy from `cli-config.yaml.example`:
 ```bash
 cp cli-config.yaml.example cli-config.yaml
 ```
 ### Model Configuration
 ```yaml
 model:
  default: "anthropic/claude-opus-4.5"
  base_url: "https://openrouter.ai/api/v1"
 ```
 ### Terminal Configuration
 The CLI supports multiple terminal backends:
 ```yaml
 # Local execution (default)
 terminal:
  env_type: "local"
  cwd: "."  # Current directory
 # SSH remote execution (sandboxed - agent can't touch its own code)
 terminal:
  env_type: "ssh"
  cwd: "/home/myuser/project"
  ssh_host: "my-server.example.com"
  ssh_user: "myuser"
  ssh_key: "~/.ssh/id_rsa"
 # Docker container
 terminal:
  env_type: "docker"
  docker_image: "python:3.11"
 # Singularity/Apptainer (HPC)
 terminal:
  env_type: "singularity"
  singularity_image: "docker://python:3.11"
 # Modal cloud
 terminal:
  env_type: "modal"
  modal_image: "python:3.11"
 ```
 ### Toolsets
 Control which tools are available:
 ```yaml
 # Enable all tools
 toolsets:
  - all
 # Or enable specific toolsets
 toolsets:
  - web
  - terminal
  - skills
 ```
 Available toolsets: `web`, `search`, `terminal`, `browser`, `vision`, `image_gen`, `skills`, `moa`, `debugging`, `safe`
 ### Personalities
 Predefined personalities for the `/personality` command:
 ```yaml
 agent:
  personalities:
    helpful: "You are a helpful, friendly AI assistant."
    kawaii: "You are a kawaii assistant! Use cute expressions..."
    pirate: "Arrr! Ye be talkin' to Captain Hermes..."
    # Add your own!
 ```
 Built-in personalities:
 - `helpful`, `concise`, `technical`, `creative`, `teacher`
 - `kawaii`, `catgirl`, `pirate`, `shakespeare`, `surfer`
 - `noir`, `uwu`, `philosopher`, `hype`
 ## Animated Feedback
 The CLI provides animated feedback during operations:
 ### Thinking Animation
 During API calls, the CLI shows an animated spinner with thinking verbs:
 ```
  ◜ (｡•́︿•̀｡) pondering... (1.2s)
  ◠ (⊙_⊙) contemplating... (2.4s)
  ✧٩(ˊᗜˋ*)و✧ got it! (3.1s)
 ```
 ### Tool Execution Animation
 Each tool type has unique animations:
 ```
  ⠋ (◕‿◕✿) 🔍 web_search... (0.8s)
  ▅ (≧◡≦) 💻 terminal... (1.2s)
  🌓 (★ω★) 🌐 browser_navigate... (2.1s)
  ✧ (✿◠‿◠) 🎨 image_generate... (4.5s)
 ```
 ## Multi-line Input
 For multi-line input, end a line with `\` to continue:
 ```
 ❯ Write a function that:\
  1. Takes a list of numbers\
  2. Returns the sum
 ```
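The continuation rule can be sketched as follows. This is one plausible reading, not the CLI's actual input handling: the `join_continued_lines` helper is hypothetical, and joining the pieces with newlines (rather than spaces) is an assumption.

```python
def join_continued_lines(lines):
    """Merge lines ending in a backslash with the lines that follow.

    Reading stops at the first line that does not end with a backslash.
    """
    buffer = []
    for line in lines:
        if line.endswith("\\"):
            buffer.append(line[:-1])  # drop the trailing backslash, keep reading
        else:
            buffer.append(line)
            break
    return "\n".join(buffer)
```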
 ## Environment Variable Priority
 For terminal settings, `cli-config.yaml` takes precedence over `.env`:
 1. `cli-config.yaml` (highest priority in CLI)
 2. `.env` file
 3. System environment variables
 4. Default values
 This allows you to have different terminal configs for CLI vs batch processing.
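The priority order above amounts to a first-match lookup across four sources. A minimal sketch, assuming each source has already been loaded into a plain dict (the `resolve_setting` function and its parameters are hypothetical, not part of Hermes):

```python
def resolve_setting(name, cli_config, dotenv, system_env, defaults):
    """Return the first value found, checking sources in priority order."""
    for source in (cli_config, dotenv, system_env, defaults):
        if name in source:
            return source[name]
    raise KeyError(name)
```

For example, with `TERMINAL_ENV` set to `ssh` in the CLI config and `docker` in `.env`, the lookup returns `ssh`; removing the CLI config entry makes it fall back to `docker`.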
 ## Session Management
 - **History**: Command history is saved to `~/.hermes_history`
 - **Conversations**: Use `/save` to export conversations
 - **Reset**: Use `/clear` to clear the screen and reset the conversation, or `/reset` to reset the conversation only
 ## Quiet Mode
 The CLI runs in "quiet mode" (`HERMES_QUIET=1`), which:
 - Suppresses verbose logging from tools
 - Enables kawaii-style animated feedback
 - Hides terminal environment warnings
 - Keeps output clean and user-friendly
 For verbose output (debugging), use:
 ```bash
 ./hermes --verbose
 ```
--- a/docs/tools.md
+++ b/docs/tools.md
@@ -39,7 +39,7 @@ async def web_search(query: str) -> dict:
 | Category | Module | Tools |
 |----------|--------|-------|
 | **Web** | `web_tools.py` | `web_search`, `web_extract`, `web_crawl` |
-| **Terminal** | `terminal_tool.py` | `terminal` (local/docker/singularity/modal backends) |
+| **Terminal** | `terminal_tool.py` | `terminal` (local/docker/singularity/modal/ssh backends) |
 | **Browser** | `browser_tool.py` | `browser_navigate`, `browser_click`, `browser_type`, etc. |
 | **Vision** | `vision_tools.py` | `vision_analyze` |
 | **Image Gen** | `image_generation_tool.py` | `image_generate` |
@@ -102,6 +102,32 @@ Some tools maintain state across calls within a session:
 State is managed per `task_id` and cleaned up automatically.
 ## Terminal Backends
 The terminal tool supports multiple execution backends:
 | Backend | Description | Use Case |
 |---------|-------------|----------|
 | `local` | Direct execution on host | Development, simple tasks |
 | `ssh` | Remote execution via SSH | Sandboxing (agent can't modify its own code) |
 | `docker` | Docker container | Isolation, reproducibility |
 | `singularity` | Singularity/Apptainer | HPC clusters, rootless containers |
 | `modal` | Modal cloud | Scalable cloud compute, GPUs |
 Configure via environment variables or `cli-config.yaml`:
 ```yaml
 # SSH backend example (in cli-config.yaml)
 terminal:
  env_type: "ssh"
  ssh_host: "my-server.example.com"
  ssh_user: "myuser"
  ssh_key: "~/.ssh/id_rsa"
  cwd: "/home/myuser/project"
 ```
 The SSH backend uses ControlMaster for connection persistence, making subsequent commands fast.
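To make the ControlMaster remark concrete, the sketch below builds an `ssh` argument list with OpenSSH connection-sharing options. The `ControlMaster`, `ControlPath`, and `ControlPersist` options are standard OpenSSH settings, but the `build_ssh_command` helper, the socket path, and the 10-minute persistence are assumptions; the backend's real values are not shown in this commit.

```python
def build_ssh_command(host, user, command, key=None,
                      control_path="~/.ssh/cm-%r@%h:%p", persist="10m"):
    """Build an ssh argv that reuses one master connection across calls."""
    argv = [
        "ssh",
        "-o", "ControlMaster=auto",            # create or reuse a master connection
        "-o", f"ControlPath={control_path}",   # socket shared by later invocations
        "-o", f"ControlPersist={persist}",     # keep the master alive when idle
    ]
    if key:
        argv += ["-i", key]  # optional identity file; ssh-agent is used otherwise
    argv += [f"{user}@{host}", command]
    return argv
```

With these options, the first command pays for the SSH handshake and later commands reuse the existing socket, which is why subsequent calls are fast.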
 ## Skills Tools (Progressive Disclosure)
 Skills are on-demand knowledge documents. They use **progressive disclosure** to minimize tokens: