- Updated `.env.example` to include new API keys and configuration options for the mini-swe-agent backend, including support for local, Docker, and Modal environments. - Added `.gitmodules` to include mini-swe-agent as a submodule for easier integration. - Refactored `mini_swe_runner.py` to use the updated model format and default to OpenRouter for API calls. - Enhanced `model_tools.py` to support the new terminal tool definitions and ensure compatibility with the mini-swe-agent backend. - Updated `README.md` to reflect changes in setup instructions and environment variable configurations. - Improved `terminal_tool.py` to manage execution environments and lifecycle, ensuring proper cleanup and error handling. - Introduced `terminal_hecate.py` for executing commands on MorphCloud VMs, providing an alternative backend for terminal operations.
304 lines
11 KiB
Markdown
304 lines
11 KiB
Markdown
# Hermes Agent
|
|
|
|
An AI agent with advanced tool-calling capabilities, featuring a flexible toolsets system for organizing and managing tools.
|
|
|
|
## Features
|
|
|
|
- **Web Tools**: Search, extract content, and crawl websites
|
|
- **Terminal Tools**: Execute commands via mini-swe-agent (local, Docker, or Modal backends)
|
|
- **Vision Tools**: Analyze images from URLs
|
|
- **Reasoning Tools**: Advanced multi-model reasoning (Mixture of Agents)
|
|
- **Creative Tools**: Generate images from text prompts
|
|
- **Toolsets System**: Organize tools into logical groups for different scenarios
|
|
- **Batch Processing**: Process datasets in parallel with checkpointing and statistics tracking
|
|
- **Ephemeral System Prompts**: Guide model behavior without polluting training datasets
|
|
|
|
## Setup
|
|
|
|
### 1. Clone the Repository
|
|
```bash
|
|
# Clone with submodules (recommended)
|
|
git clone --recurse-submodules https://github.com/NousResearch/Hermes-Agent.git
|
|
cd Hermes-Agent
|
|
|
|
# Or if already cloned without submodules:
|
|
git submodule update --init --recursive
|
|
```
|
|
|
|
### 2. Install Dependencies
|
|
```bash
|
|
# Create and activate virtual environment (recommended)
|
|
python3 -m venv venv
|
|
source venv/bin/activate # On Windows: venv\Scripts\activate
|
|
|
|
# Install required packages
|
|
pip install -r requirements.txt
|
|
|
|
# Install mini-swe-agent for terminal tools
|
|
pip install -e ./mini-swe-agent
|
|
```
|
|
|
|
### 3. Configure Environment Variables
|
|
```bash
|
|
# Copy the example environment file
|
|
cp .env.example .env
|
|
|
|
# Edit .env and add your API keys
|
|
nano .env # or use your preferred editor
|
|
```
|
|
|
|
**Required API Keys:**
|
|
- `OPENROUTER_API_KEY` - LLM access via OpenRouter (get at: https://openrouter.ai/keys)
|
|
- `FIRECRAWL_API_KEY` - Web tools (get at: https://firecrawl.dev/)
|
|
- `NOUS_API_KEY` - Vision & reasoning tools (get at: https://inference-api.nousresearch.com/)
|
|
- `FAL_KEY` - Image generation (get at: https://fal.ai/)
|
|
|
|
**Optional API Keys:**
|
|
- `ANTHROPIC_API_KEY` - Direct Anthropic access (if not using OpenRouter)
|
|
- `OPENAI_API_KEY` - Direct OpenAI access (if not using OpenRouter)
|
|
- `MORPH_API_KEY` - For legacy Hecate terminal backend (get at: https://morph.so/)
|
|
|
|
### 4. Configure Terminal Backend
|
|
|
|
The terminal tool uses **mini-swe-agent** environments. Configure in `.env`:
|
|
|
|
```bash
|
|
# Backend: "local" (host machine), "docker" (containers), or "modal" (cloud)
|
|
TERMINAL_ENV=local # Default: runs on host machine
|
|
TERMINAL_ENV=docker # Recommended: isolated Docker containers
|
|
TERMINAL_ENV=modal # Cloud execution via Modal
|
|
|
|
# Docker settings (for docker/modal backends)
|
|
TERMINAL_DOCKER_IMAGE=python:3.11-slim
|
|
TERMINAL_TIMEOUT=60
|
|
```
|
|
|
|
**Backend Requirements:**
|
|
- **local**: No extra setup (runs directly on your machine)
|
|
- **docker**: Requires Docker installed and running. User must be in `docker` group.
|
|
- **modal**: Requires Modal account (see setup below)
|
|
|
|
### Modal Cloud Backend Setup
|
|
|
|
[Modal](https://modal.com) provides serverless cloud compute for running sandboxed environments at scale.
|
|
|
|
```bash
|
|
# 1. Install Modal and dependencies
|
|
pip install modal boto3
|
|
|
|
# 2. Authenticate with Modal (opens browser)
|
|
modal setup
|
|
|
|
# 3. Set terminal backend to modal in .env
|
|
TERMINAL_ENV=modal
|
|
```
|
|
|
|
Modal uses CLI-based authentication (stored in `~/.modal/`), so no API key is needed in `.env`. After running `modal setup`, commands will automatically execute in Modal's cloud sandboxes.
|
|
|
|
See `.env.example` for all available configuration options including debug settings.
|
|
|
|
## Toolsets System
|
|
|
|
The agent uses a toolsets system for organizing and managing tools. All tools must be part of a toolset to be accessible - individual tool selection is not supported. This ensures consistent and logical grouping of capabilities.
|
|
|
|
### Key Concepts
|
|
|
|
- **Toolsets**: Logical groups of tools for specific use cases (e.g., "research", "development", "debugging")
|
|
- **Composition**: Toolsets can include other toolsets for powerful combinations
|
|
- **Custom Toolsets**: Create your own toolsets at runtime or by editing `toolsets.py`
|
|
- **Toolset-Only Access**: Tools are only accessible through toolsets, not individually
|
|
|
|
### Available Toolsets
|
|
|
|
See `toolsets.py` for the complete list of predefined toolsets including:
|
|
- Basic toolsets (web, terminal, vision, creative, reasoning)
|
|
- Composite toolsets (research, development, analysis, etc.)
|
|
- Scenario-specific toolsets (debugging, documentation, API testing, etc.)
|
|
- Special toolsets (safe mode without terminal, minimal, offline)
|
|
|
|
### Using Toolsets
|
|
|
|
```bash
|
|
# Use a predefined toolset
|
|
python run_agent.py --enabled_toolsets=research --query "Find latest AI papers"
|
|
|
|
# Combine multiple toolsets
|
|
python run_agent.py --enabled_toolsets=web,vision --query "Analyze this website"
|
|
|
|
# Enable all toolsets explicitly (same as omitting the flag)
|
|
python run_agent.py --enabled_toolsets=all --query "Do web research and run commands if helpful"
|
|
|
|
# Safe mode (no terminal access)
|
|
python run_agent.py --enabled_toolsets=safe --query "Help without running commands"
|
|
|
|
# List all available toolsets and tools
|
|
python run_agent.py --list_tools
|
|
```
|
|
|
|
For detailed documentation on toolsets, see `TOOLSETS_README.md`.
|
|
|
|
## Basic Usage
|
|
|
|
### Default (all tools enabled)
|
|
```bash
|
|
# Uses OpenRouter by default - just set OPENROUTER_API_KEY in .env
|
|
python run_agent.py \
|
|
--query "search up the latest docs on jit in python 3.13 and write me basic example that's not in their docs. profile its perf" \
|
|
--max_turns 20 \
|
|
--model anthropic/claude-sonnet-4-20250514
|
|
```
|
|
|
|
### With specific toolset
|
|
```bash
|
|
python run_agent.py \
|
|
--query "Debug this Python error" \
|
|
--enabled_toolsets=debugging \
|
|
--model anthropic/claude-sonnet-4-20250514
|
|
```
|
|
|
|
### Python API
|
|
```python
|
|
from run_agent import AIAgent
|
|
|
|
# Uses OpenRouter by default (reads OPENROUTER_API_KEY from .env)
|
|
agent = AIAgent(
|
|
model="anthropic/claude-sonnet-4-20250514",
|
|
enabled_toolsets=["research"]
|
|
)
|
|
response = agent.chat("Find information about quantum computing")
|
|
|
|
# Create custom toolset at runtime
|
|
from toolsets import create_custom_toolset
|
|
|
|
create_custom_toolset(
|
|
name="my_tools",
|
|
description="My custom toolkit",
|
|
tools=["web_search"],
|
|
includes=["terminal", "vision"]
|
|
)
|
|
|
|
agent = AIAgent(enabled_toolsets=["my_tools"])
|
|
```
|
|
|
|
## Batch Processing
|
|
|
|
Process multiple prompts from a dataset in parallel with automatic checkpointing and statistics tracking:
|
|
|
|
```bash
|
|
# Basic batch processing
|
|
python batch_runner.py \
|
|
--dataset_file=prompts.jsonl \
|
|
--batch_size=20 \
|
|
--run_name=my_run
|
|
|
|
# With specific distribution
|
|
python batch_runner.py \
|
|
--dataset_file=prompts.jsonl \
|
|
--batch_size=20 \
|
|
--run_name=image_run \
|
|
--distribution=image_gen \
|
|
--num_workers=4
|
|
```
|
|
|
|
**Key Features:**
|
|
- Parallel processing with configurable workers
|
|
- Toolset distributions for varied data generation
|
|
- Automatic checkpointing and resume capability
|
|
- Combined output in `data/<run_name>/trajectories.jsonl`
|
|
- Tool usage statistics and success rates
|
|
|
|
**Quick Start:** See [QUICKSTART_BATCH.md](QUICKSTART_BATCH.md) for a 5-minute getting started guide.
|
|
**Full Documentation:** See [BATCH_PROCESSING.md](BATCH_PROCESSING.md) for comprehensive documentation.
|
|
|
|
### Ephemeral System Prompts
|
|
|
|
The ephemeral system prompt feature allows you to guide the model's behavior during batch processing **without** saving that prompt to the training dataset trajectories. This is useful for:
|
|
|
|
- Guiding model behavior during data collection
|
|
- Adding task-specific instructions
|
|
- Keeping saved trajectories clean and focused on tool-calling format
|
|
|
|
**Example:**
|
|
```bash
|
|
python batch_runner.py \
|
|
--dataset_file=prompts.jsonl \
|
|
--batch_size=10 \
|
|
--run_name=my_run \
|
|
--ephemeral_system_prompt="You are a helpful assistant focused on image generation."
|
|
```
|
|
|
|
The ephemeral prompt will influence the model's behavior during execution, but **only the standard tool-calling system prompt** will be saved in the trajectory files.
|
|
|
|
**Documentation:** See [docs/ephemeral_system_prompt.md](docs/ephemeral_system_prompt.md) for complete details.
|
|
|
|
## Command Line Arguments
|
|
|
|
**Single Agent (`run_agent.py`):**
|
|
- `--query`: The question or task for the agent
|
|
- `--model`: Model to use (default: claude-opus-4-20250514)
|
|
- `--api_key`: API key for authentication
|
|
- `--base_url`: API endpoint URL
|
|
- `--max_turns`: Maximum number of tool-calling iterations
|
|
- `--enabled_toolsets`: Comma-separated list of toolsets to enable. Use `all` (or `*`) to enable everything. If omitted, all toolsets are enabled by default.
|
|
- `--disabled_toolsets`: Comma-separated list of toolsets to disable
|
|
- `--list_tools`: List all available toolsets and tools
|
|
- `--save_trajectories`: Save conversation trajectories to JSONL files
|
|
|
|
**Batch Processing (`batch_runner.py`):**
|
|
- `--dataset_file`: Path to JSONL file with prompts
|
|
- `--batch_size`: Number of prompts per batch
|
|
- `--run_name`: Name for this run (for output/checkpointing)
|
|
- `--distribution`: Toolset distribution to use (default: "default")
|
|
- `--num_workers`: Number of parallel workers (default: 4)
|
|
- `--resume`: Resume from checkpoint if interrupted
|
|
- `--ephemeral_system_prompt`: System prompt used during execution but NOT saved to trajectories
|
|
- `--list_distributions`: List available toolset distributions
|
|
|
|
## Environment Variables
|
|
|
|
All environment variables can be configured in the `.env` file (copy from `.env.example`).
|
|
|
|
**LLM Provider (OpenRouter):**
|
|
- `OPENROUTER_API_KEY`: Primary LLM access via OpenRouter (supports Claude, GPT-4, Gemini, etc.)
|
|
- `LLM_MODEL`: Default model (e.g., `anthropic/claude-sonnet-4`, `openai/gpt-4o`)
|
|
|
|
**Tool API Keys:**
|
|
- `FIRECRAWL_API_KEY`: Web tools (search, extract, crawl)
|
|
- `NOUS_API_KEY`: Vision and reasoning tools
|
|
- `FAL_KEY`: Image generation tools
|
|
|
|
**Optional Direct Provider Keys:**
|
|
- `ANTHROPIC_API_KEY`: Direct Anthropic access (fallback if OpenRouter not set)
|
|
- `OPENAI_API_KEY`: Direct OpenAI access (fallback if OpenRouter not set)
|
|
|
|
**Terminal Tool Configuration (mini-swe-agent backend):**
|
|
- `TERMINAL_ENV`: Backend type - `local`, `docker`, or `modal` (default: `local`)
|
|
- `TERMINAL_DOCKER_IMAGE`: Docker image to use (default: `python:3.11-slim`)
|
|
- `TERMINAL_TIMEOUT`: Command timeout in seconds (default: `60`)
|
|
- `TERMINAL_LIFETIME_SECONDS`: Cleanup inactive environments after this time (default: `300`)
|
|
- `TERMINAL_CWD`: Working directory inside containers (default: `/tmp`)
|
|
|
|
**Legacy Hecate Terminal Backend (optional):**
|
|
- `MORPH_API_KEY`: For Hecate/MorphCloud terminal backend
|
|
- `HECATE_VM_LIFETIME_SECONDS`: VM lifetime (default: 300)
|
|
- `HECATE_DEFAULT_SNAPSHOT_ID`: Default snapshot (default: snapshot_p5294qxt)
|
|
|
|
**Debug Options:**
|
|
- `WEB_TOOLS_DEBUG`, `VISION_TOOLS_DEBUG`, `MOA_TOOLS_DEBUG`, `IMAGE_TOOLS_DEBUG`: Enable debug logging
|
|
|
|
## Documentation
|
|
|
|
**Single Agent Usage:**
|
|
- `TOOLSETS_README.md`: Comprehensive guide to the toolsets system
|
|
- `toolsets.py`: View and modify available toolsets
|
|
- `model_tools.py`: Core tool definitions and handlers
|
|
|
|
**Batch Processing:**
|
|
- `QUICKSTART_BATCH.md`: 5-minute quick start guide
|
|
- `BATCH_PROCESSING.md`: Complete batch processing documentation
|
|
- `toolset_distributions.py`: Toolset distributions for data generation
|
|
|
|
## Examples
|
|
|
|
See `TOOLSETS_README.md` for extensive examples of using different toolsets for various scenarios.
|