diff --git a/.cursorrules b/.cursorrules index 033353e71..4a4641c4f 100644 --- a/.cursorrules +++ b/.cursorrules @@ -1,4 +1,4 @@ -Hermes-Agent is an agent harness for LLMs. +Hermes-Agent is an agent harness for LLMs with an interactive CLI. ## Development Environment @@ -9,12 +9,15 @@ source venv/bin/activate # Before running any Python commands ## Project Structure +- `hermes` - CLI launcher script (run with `./hermes`) +- `cli.py` - Interactive CLI with Rich UI, prompt_toolkit, animated spinners +- `cli-config.yaml` - CLI configuration (model, terminal, toolsets, personalities) - `tools/` - Individual tool implementations (web, terminal, browser, vision, etc.) - `tools/__init__.py` - Exports all tools for importing - `model_tools.py` - Consolidates tool schemas and handlers for the agent - `toolsets.py` - Groups tools into logical toolsets (web, terminal, browser, etc.) - `toolset_distributions.py` - Probability-based tool selection for data generation -- `run_agent.py` - Primary agent runner with AIAgent class +- `run_agent.py` - Primary agent runner with AIAgent class and KawaiiSpinner - `batch_runner.py` - Parallel batch processing with checkpointing - `tests/` - Test scripts @@ -24,11 +27,33 @@ source venv/bin/activate # Before running any Python commands tools/*.py → tools/__init__.py → model_tools.py → toolsets.py → toolset_distributions.py ↑ run_agent.py ──────────────────────────┘ +cli.py → run_agent.py (uses AIAgent with quiet_mode=True) batch_runner.py → run_agent.py + toolset_distributions.py ``` Always ensure consistency between tools, model_tools.py, and toolsets.py when changing any of them. 
+## CLI Architecture (cli.py) + +The interactive CLI uses: +- **Rich** - For the welcome banner and styled panels +- **prompt_toolkit** - For fixed input area with history and `patch_stdout` +- **KawaiiSpinner** (in run_agent.py) - Animated feedback during API calls and tool execution + +Key components: +- `HermesCLI` class - Main CLI controller with commands and conversation loop +- `load_cli_config()` - Loads `cli-config.yaml`, sets environment variables for terminal +- `build_welcome_banner()` - Displays ASCII art logo, tools, and skills summary +- `/commands` - Process user commands like `/help`, `/clear`, `/personality`, etc. + +CLI uses `quiet_mode=True` when creating AIAgent to suppress verbose logging and enable kawaii-style feedback instead. + +### Adding CLI Commands + +1. Add to `COMMANDS` dict with description +2. Add handler in `process_command()` method +3. For persistent settings, use `save_config_value()` to update `cli-config.yaml` + ## Adding a New Tool Follow this strict order to maintain consistency: @@ -92,6 +117,11 @@ API keys are loaded from `.env` file in repo root: - `FAL_KEY` - Image generation (FLUX model) - `NOUS_API_KEY` - Vision and Mixture-of-Agents tools +Terminal tool configuration (can also be set in `cli-config.yaml`): +- `TERMINAL_ENV` - Backend: local, docker, singularity, modal, or ssh +- `TERMINAL_CWD` - Working directory +- `TERMINAL_SSH_HOST`, `TERMINAL_SSH_USER`, `TERMINAL_SSH_KEY` - For SSH backend + ## Agent Loop (run_agent.py) The AIAgent class handles: diff --git a/README.md b/README.md index 5ceece4dd..ca80bc75a 100644 --- a/README.md +++ b/README.md @@ -4,8 +4,9 @@ An AI agent with advanced tool-calling capabilities, featuring a flexible toolse ## Features +- **Interactive CLI**: Beautiful terminal interface with animated feedback, personalities, and session management - **Web Tools**: Search, extract content, and crawl websites -- **Terminal Tools**: Execute commands via mini-swe-agent (local, Docker, or Modal 
backends) +- **Terminal Tools**: Execute commands via local, Docker, Singularity, Modal, or SSH backends - **Browser Tools**: Automate web browsers to navigate, click, type, and extract content - **Vision Tools**: Analyze images from URLs - **Reasoning Tools**: Advanced multi-model reasoning (Mixture of Agents) @@ -15,6 +16,23 @@ An AI agent with advanced tool-calling capabilities, featuring a flexible toolse - **Batch Processing**: Process datasets in parallel with checkpointing and statistics tracking - **Ephemeral System Prompts**: Guide model behavior without polluting training datasets +## Quick Start (CLI) + +```bash +# After setup (see below), just run: +./hermes + +# Or with options: +./hermes --model "anthropic/claude-sonnet-4" --toolsets "web,terminal" +``` + +The CLI provides: +- Animated spinners during thinking and tool execution +- Kawaii-style feedback messages +- `/commands` for configuration, history, and session management +- Customizable personalities (`/personality kawaii`, `/personality pirate`, etc.) +- Persistent configuration via `cli-config.yaml` + ## Setup ### 1. Clone the Repository @@ -65,11 +83,12 @@ nano .env # or use your preferred editor ### 4. Configure Terminal Backend -The terminal tool uses **mini-swe-agent** environments. Configure in `.env`: +The terminal tool uses **mini-swe-agent** environments. 
Configure in `.env` or `cli-config.yaml`: ```bash -# Backend: "local", "docker", "singularity", or "modal" +# Backend: "local", "docker", "singularity", "modal", or "ssh" TERMINAL_ENV=local # Default: runs on host machine (no isolation) +TERMINAL_ENV=ssh # Remote execution via SSH (agent code stays local) TERMINAL_ENV=singularity # Recommended for HPC: Apptainer/Singularity containers TERMINAL_ENV=docker # Isolated Docker containers TERMINAL_ENV=modal # Cloud execution via Modal @@ -78,10 +97,16 @@ TERMINAL_ENV=modal # Cloud execution via Modal TERMINAL_DOCKER_IMAGE=python:3.11-slim TERMINAL_SINGULARITY_IMAGE=docker://python:3.11-slim TERMINAL_TIMEOUT=60 + +# SSH backend (for ssh) +TERMINAL_SSH_HOST=my-server.example.com +TERMINAL_SSH_USER=myuser +TERMINAL_SSH_KEY=~/.ssh/id_rsa # Optional, uses ssh-agent if not set ``` **Backend Requirements:** - **local**: No extra setup (runs directly on your machine, no isolation) +- **ssh**: SSH access to remote machine (great for sandboxing - agent can't touch its own code) - **singularity**: Requires Apptainer or Singularity installed (common on HPC clusters, no root needed) - **docker**: Requires Docker installed and user in `docker` group - **modal**: Requires Modal account (see setup below) @@ -232,6 +257,80 @@ Skills can include: - `templates/` - Output formats, config files, boilerplate code - `scripts/` - Executable helpers (Python, shell scripts) +## Interactive CLI + +The CLI provides a rich interactive experience for working with the agent. 
+ +### Running the CLI + +```bash +# Basic usage +./hermes + +# With specific model +./hermes --model "anthropic/claude-sonnet-4" + +# With specific toolsets +./hermes --toolsets "web,terminal,skills" +``` + +### CLI Commands + +| Command | Description | +|---------|-------------| +| `/help` | Show available commands | +| `/tools` | List available tools by toolset | +| `/toolsets` | List available toolsets | +| `/model [name]` | Show or change the current model | +| `/prompt [text]` | View/set custom system prompt | +| `/personality [name]` | Set a predefined personality | +| `/clear` | Clear screen and reset conversation | +| `/reset` | Reset conversation only | +| `/history` | Show conversation history | +| `/save` | Save current conversation to file | +| `/config` | Show current configuration | +| `/quit` | Exit the CLI | + +### Configuration + +Copy `cli-config.yaml.example` to `cli-config.yaml` and customize: + +```yaml +# Model settings +model: + default: "anthropic/claude-sonnet-4" + +# Terminal backend (local, docker, singularity, modal, or ssh) +terminal: + env_type: "local" + cwd: "." # Use current directory + +# Or use SSH for remote execution (keeps agent code isolated) +# terminal: +# env_type: "ssh" +# ssh_host: "my-server.example.com" +# ssh_user: "myuser" +# ssh_key: "~/.ssh/id_rsa" +# cwd: "/home/myuser/project" + +# Enable specific toolsets +toolsets: + - all # or: web, terminal, browser, vision, etc. + +# Custom personalities (use with /personality command) +agent: + personalities: + helpful: "You are a helpful assistant." + kawaii: "You are a kawaii assistant! Use cute expressions..." +``` + +### Personalities + +Built-in personalities available via `/personality`: +- `helpful`, `concise`, `technical`, `creative`, `teacher` +- `kawaii`, `catgirl`, `pirate`, `shakespeare`, `surfer` +- `noir`, `uwu`, `philosopher`, `hype` + ## Toolsets System The agent uses a toolsets system for organizing and managing tools. 
All tools must be part of a toolset to be accessible - individual tool selection is not supported. This ensures consistent and logical grouping of capabilities. @@ -456,6 +555,9 @@ All environment variables can be configured in the `.env` file (copy from `.env. | File | Purpose | |------|---------| +| `hermes` | CLI launcher script (run with `./hermes`) | +| `cli.py` | Interactive CLI implementation | +| `cli-config.yaml` | CLI configuration (copy from `.example`) | | `run_agent.py` | Main agent runner - single query execution | | `batch_runner.py` | Parallel batch processing with checkpointing | | `model_tools.py` | Core tool definitions and handlers | @@ -465,5 +567,5 @@ All environment variables can be configured in the `.env` file (copy from `.env. | `tools/` | Individual tool implementations | | `tools/skills_tool.py` | Skills system with progressive disclosure | | `skills/` | On-demand knowledge documents | -| `architecture/` | Design documentation | +| `docs/` | Documentation | | `configs/` | Example batch run scripts | diff --git a/docs/cli.md b/docs/cli.md new file mode 100644 index 000000000..6e42475eb --- /dev/null +++ b/docs/cli.md @@ -0,0 +1,217 @@ +# CLI + +The Hermes Agent CLI provides an interactive terminal interface for working with the agent. 
+ +## Running the CLI + +```bash +# Basic usage +./hermes + +# With specific model +./hermes --model "anthropic/claude-sonnet-4" + +# With specific toolsets +./hermes --toolsets "web,terminal,skills" + +# Verbose mode +./hermes --verbose +``` + +## Architecture + +The CLI is implemented in `cli.py` and uses: + +- **Rich** - Welcome banner with ASCII art and styled panels +- **prompt_toolkit** - Fixed input area with command history +- **KawaiiSpinner** - Animated feedback during operations + +``` +┌─────────────────────────────────────────────────┐ +│ HERMES-AGENT ASCII Logo │ +│ ┌─────────────┐ ┌────────────────────────────┐ │ +│ │ Caduceus │ │ Model: claude-opus-4.5 │ │ +│ │ ASCII Art │ │ Terminal: local │ │ +│ │ │ │ Working Dir: /home/user │ │ +│ │ │ │ Available Tools: 19 │ │ +│ │ │ │ Available Skills: 12 │ │ +│ └─────────────┘ └────────────────────────────┘ │ +└─────────────────────────────────────────────────┘ +│ Conversation output scrolls here... │ +│ │ +│ User: Hello! │ +│ ────────────────────────────────────────────── │ +│ (◕‿◕✿) 🧠 pondering... (2.3s) │ +│ ✧٩(ˊᗜˋ*)و✧ got it! (2.3s) │ +│ │ +│ Assistant: Hello! How can I help you today? 
│ +├─────────────────────────────────────────────────┤ +│ ❯ [Fixed input area at bottom] │ +└─────────────────────────────────────────────────┘ +``` + +## Commands + +| Command | Description | +|---------|-------------| +| `/help` | Show available commands | +| `/tools` | List available tools grouped by toolset | +| `/toolsets` | List available toolsets with descriptions | +| `/model [name]` | Show or change the current model | +| `/prompt [text]` | View/set/clear custom system prompt | +| `/personality [name]` | Set a predefined personality | +| `/clear` | Clear screen and reset conversation | +| `/reset` | Reset conversation only (keep screen) | +| `/history` | Show conversation history | +| `/save` | Save current conversation to file | +| `/config` | Show current configuration | +| `/quit` | Exit the CLI (also: `/exit`, `/q`) | + +## Configuration + +The CLI is configured via `cli-config.yaml`. Copy from `cli-config.yaml.example`: + +```bash +cp cli-config.yaml.example cli-config.yaml +``` + +### Model Configuration + +```yaml +model: + default: "anthropic/claude-opus-4.5" + base_url: "https://openrouter.ai/api/v1" +``` + +### Terminal Configuration + +The CLI supports multiple terminal backends: + +```yaml +# Local execution (default) +terminal: + env_type: "local" + cwd: "." 
# Current directory + +# SSH remote execution (sandboxed - agent can't touch its own code) +terminal: + env_type: "ssh" + cwd: "/home/myuser/project" + ssh_host: "my-server.example.com" + ssh_user: "myuser" + ssh_key: "~/.ssh/id_rsa" + +# Docker container +terminal: + env_type: "docker" + docker_image: "python:3.11" + +# Singularity/Apptainer (HPC) +terminal: + env_type: "singularity" + singularity_image: "docker://python:3.11" + +# Modal cloud +terminal: + env_type: "modal" + modal_image: "python:3.11" +``` + +### Toolsets + +Control which tools are available: + +```yaml +# Enable all tools +toolsets: + - all + +# Or enable specific toolsets +toolsets: + - web + - terminal + - skills +``` + +Available toolsets: `web`, `search`, `terminal`, `browser`, `vision`, `image_gen`, `skills`, `moa`, `debugging`, `safe` + +### Personalities + +Predefined personalities for the `/personality` command: + +```yaml +agent: + personalities: + helpful: "You are a helpful, friendly AI assistant." + kawaii: "You are a kawaii assistant! Use cute expressions..." + pirate: "Arrr! Ye be talkin' to Captain Hermes..." + # Add your own! +``` + +Built-in personalities: +- `helpful`, `concise`, `technical`, `creative`, `teacher` +- `kawaii`, `catgirl`, `pirate`, `shakespeare`, `surfer` +- `noir`, `uwu`, `philosopher`, `hype` + +## Animated Feedback + +The CLI provides animated feedback during operations: + +### Thinking Animation + +During API calls, shows animated spinner with thinking verbs: +``` + ◜ (。•́︿•̀。) pondering... (1.2s) + ◠ (⊙_⊙) contemplating... (2.4s) + ✧٩(ˊᗜˋ*)و✧ got it! (3.1s) +``` + +### Tool Execution Animation + +Each tool type has unique animations: +``` + ⠋ (◕‿◕✿) 🔍 web_search... (0.8s) + ▅ (≧◡≦) 💻 terminal... (1.2s) + 🌓 (★ω★) 🌐 browser_navigate... (2.1s) + ✧ (✿◠‿◠) 🎨 image_generate... (4.5s) +``` + +## Multi-line Input + +For multi-line input, end a line with `\` to continue: + +``` +❯ Write a function that:\ + 1. Takes a list of numbers\ + 2. 
Returns the sum +``` + +## Environment Variable Priority + +For terminal settings, `cli-config.yaml` takes precedence over `.env`: + +1. `cli-config.yaml` (highest priority in CLI) +2. `.env` file +3. System environment variables +4. Default values + +This allows you to have different terminal configs for CLI vs batch processing. + +## Session Management + +- **History**: Command history is saved to `~/.hermes_history` +- **Conversations**: Use `/save` to export conversations +- **Reset**: Use `/clear` to clear the screen and reset the conversation, or `/reset` to reset the conversation only + +## Quiet Mode + +The CLI runs in "quiet mode" (`HERMES_QUIET=1`), which: +- Suppresses verbose logging from tools +- Enables kawaii-style animated feedback +- Hides terminal environment warnings +- Keeps output clean and user-friendly + +For verbose output (debugging), use: +```bash +./hermes --verbose +``` diff --git a/docs/tools.md b/docs/tools.md index edad6d6c8..4c84c83c1 100644 --- a/docs/tools.md +++ b/docs/tools.md @@ -39,7 +39,7 @@ async def web_search(query: str) -> dict: | Category | Module | Tools | |----------|--------|-------| | **Web** | `web_tools.py` | `web_search`, `web_extract`, `web_crawl` | -| **Terminal** | `terminal_tool.py` | `terminal` (local/docker/singularity/modal backends) | +| **Terminal** | `terminal_tool.py` | `terminal` (local/docker/singularity/modal/ssh backends) | | **Browser** | `browser_tool.py` | `browser_navigate`, `browser_click`, `browser_type`, etc. | | **Vision** | `vision_tools.py` | `vision_analyze` | | **Image Gen** | `image_generation_tool.py` | `image_generate` | @@ -102,6 +102,32 @@ Some tools maintain state across calls within a session: State is managed per `task_id` and cleaned up automatically. 
+## Terminal Backends + +The terminal tool supports multiple execution backends: + +| Backend | Description | Use Case | +|---------|-------------|----------| +| `local` | Direct execution on host | Development, simple tasks | +| `ssh` | Remote execution via SSH | Sandboxing (agent can't modify its own code) | +| `docker` | Docker container | Isolation, reproducibility | +| `singularity` | Singularity/Apptainer | HPC clusters, rootless containers | +| `modal` | Modal cloud | Scalable cloud compute, GPUs | + +Configure via environment variables or `cli-config.yaml`: + +```yaml +# SSH backend example (in cli-config.yaml) +terminal: + env_type: "ssh" + ssh_host: "my-server.example.com" + ssh_user: "myuser" + ssh_key: "~/.ssh/id_rsa" + cwd: "/home/myuser/project" +``` + +The SSH backend uses ControlMaster for connection persistence, making subsequent commands fast. + ## Skills Tools (Progressive Disclosure) Skills are on-demand knowledge documents. They use **progressive disclosure** to minimize tokens: