Go to file

teknium b78076cac7 Enhance trajectory_compressor.py with new input options and sampling functionality

- Updated the main function to accept both single JSONL files and directories for compression.
- Added support for sampling a percentage of trajectories before compression.
- Improved usage documentation with detailed examples for various compression scenarios.
- Enhanced error handling for input validation and dry run mode.
- Streamlined output handling to manage temporary files during processing.

2026-01-29 06:04:13 +00:00

__pycache__

initital commit

2025-07-22 18:32:44 -07:00

architecture

Made to be more descriptive from comments

2025-09-12 18:10:36 -05:00

configs

Add mini-swe-agent runner and trajectory compressor

2026-01-23 00:52:46 +00:00

mini-swe-agent @ 07aa6a7385

Update environment configuration and enhance terminal tool integration

2026-01-23 12:26:53 +00:00

scripts

Add mini-swe-agent runner and trajectory compressor

2026-01-23 00:52:46 +00:00

tests

some cleanups

2025-11-05 03:47:17 +00:00

tools

Update environment configuration and enhance terminal tool integration

2026-01-23 12:26:53 +00:00

.cursorrules

Add batch processing capabilities with checkpointing and statistics tracking, along with toolset distribution management. Update README and add test scripts for validation.

2025-10-06 03:17:58 +00:00

.env.example

Update environment configuration and enhance terminal tool integration

2026-01-23 12:26:53 +00:00

.gitignore

Enhance batch processing and image generation tools

2026-01-18 10:11:59 +00:00

.gitmodules

Update environment configuration and enhance terminal tool integration

2026-01-23 12:26:53 +00:00

batch_runner.py

Enhance batch processing and image generation tools

2026-01-18 10:11:59 +00:00

mini_swe_runner.py

Update environment configuration and enhance terminal tool integration

2026-01-23 12:26:53 +00:00

model_tools.py

Update environment configuration and enhance terminal tool integration

2026-01-23 12:26:53 +00:00

pyproject.toml

A bit of restructuring for simplicity and organization

2025-10-01 23:29:25 +00:00

README.md

Update environment configuration and enhance terminal tool integration

2026-01-23 12:26:53 +00:00

requirements.txt

Update environment configuration and enhance terminal tool integration

2026-01-23 12:26:53 +00:00

run_agent.py

Update environment configuration and enhance terminal tool integration

2026-01-23 12:26:53 +00:00

toolset_distributions.py

update distribution and gitignore

2025-11-16 01:03:23 +00:00

toolsets.py

Add support for enabling all toolsets with 'all' or '*' alias in README and toolset resolution logic

2025-10-03 13:45:29 +00:00

trajectory_compressor.py

Enhance trajectory_compressor.py with new input options and sampling functionality

2026-01-29 06:04:13 +00:00

README.md

Hermes Agent

An AI agent with advanced tool-calling capabilities, featuring a flexible toolsets system for organizing and managing tools.

Features

Web Tools: Search, extract content, and crawl websites
Terminal Tools: Execute commands via mini-swe-agent (local, Docker, or Modal backends)
Vision Tools: Analyze images from URLs
Reasoning Tools: Advanced multi-model reasoning (Mixture of Agents)
Creative Tools: Generate images from text prompts
Toolsets System: Organize tools into logical groups for different scenarios
Batch Processing: Process datasets in parallel with checkpointing and statistics tracking
Ephemeral System Prompts: Guide model behavior without polluting training datasets

Setup

1. Clone the Repository

# Clone with submodules (recommended)
git clone --recurse-submodules https://github.com/NousResearch/Hermes-Agent.git
cd Hermes-Agent

# Or if already cloned without submodules:
git submodule update --init --recursive

2. Install Dependencies

# Create and activate virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install required packages
pip install -r requirements.txt

# Install mini-swe-agent for terminal tools
pip install -e ./mini-swe-agent

3. Configure Environment Variables

# Copy the example environment file
cp .env.example .env

# Edit .env and add your API keys
nano .env  # or use your preferred editor

Required API Keys:

OPENROUTER_API_KEY - LLM access via OpenRouter (get at: https://openrouter.ai/keys)
FIRECRAWL_API_KEY - Web tools (get at: https://firecrawl.dev/)
NOUS_API_KEY - Vision & reasoning tools (get at: https://inference-api.nousresearch.com/)
FAL_KEY - Image generation (get at: https://fal.ai/)

Optional API Keys:

ANTHROPIC_API_KEY - Direct Anthropic access (if not using OpenRouter)
OPENAI_API_KEY - Direct OpenAI access (if not using OpenRouter)
MORPH_API_KEY - For legacy Hecate terminal backend (get at: https://morph.so/)

4. Configure Terminal Backend

The terminal tool uses mini-swe-agent environments. Configure in .env:

# Backend: "local" (host machine), "docker" (containers), or "modal" (cloud)
TERMINAL_ENV=local          # Default: runs on host machine
TERMINAL_ENV=docker         # Recommended: isolated Docker containers
TERMINAL_ENV=modal          # Cloud execution via Modal

# Docker settings (for docker/modal backends)
TERMINAL_DOCKER_IMAGE=python:3.11-slim
TERMINAL_TIMEOUT=60

Backend Requirements:

local: No extra setup (runs directly on your machine)
docker: Requires Docker installed and running. User must be in docker group.
modal: Requires Modal account (see setup below)

Modal provides serverless cloud compute for running sandboxed environments at scale.

# 1. Install Modal and dependencies
pip install modal boto3

# 2. Authenticate with Modal (opens browser)
modal setup

# 3. Set terminal backend to modal in .env
TERMINAL_ENV=modal

Modal uses CLI-based authentication (stored in ~/.modal/), so no API key is needed in .env. After running modal setup, commands will automatically execute in Modal's cloud sandboxes.

See .env.example for all available configuration options including debug settings.

Toolsets System

The agent uses a toolsets system for organizing and managing tools. All tools must be part of a toolset to be accessible - individual tool selection is not supported. This ensures consistent and logical grouping of capabilities.

Key Concepts

Toolsets: Logical groups of tools for specific use cases (e.g., "research", "development", "debugging")
Composition: Toolsets can include other toolsets for powerful combinations
Custom Toolsets: Create your own toolsets at runtime or by editing toolsets.py
Toolset-Only Access: Tools are only accessible through toolsets, not individually

Available Toolsets

See toolsets.py for the complete list of predefined toolsets including:

Basic toolsets (web, terminal, vision, creative, reasoning)
Composite toolsets (research, development, analysis, etc.)
Scenario-specific toolsets (debugging, documentation, API testing, etc.)
Special toolsets (safe mode without terminal, minimal, offline)

Using Toolsets

# Use a predefined toolset
python run_agent.py --enabled_toolsets=research --query "Find latest AI papers"

# Combine multiple toolsets
python run_agent.py --enabled_toolsets=web,vision --query "Analyze this website"

# Enable all toolsets explicitly (same as omitting the flag)
python run_agent.py --enabled_toolsets=all --query "Do web research and run commands if helpful"

# Safe mode (no terminal access)
python run_agent.py --enabled_toolsets=safe --query "Help without running commands"

# List all available toolsets and tools
python run_agent.py --list_tools

For detailed documentation on toolsets, see TOOLSETS_README.md.

Basic Usage

Default (all tools enabled)

# Uses OpenRouter by default - just set OPENROUTER_API_KEY in .env
python run_agent.py \
  --query "search up the latest docs on jit in python 3.13 and write me basic example that's not in their docs. profile its perf" \
  --max_turns 20 \
  --model anthropic/claude-sonnet-4-20250514

With specific toolset

python run_agent.py \
  --query "Debug this Python error" \
  --enabled_toolsets=debugging \
  --model anthropic/claude-sonnet-4-20250514

Python API

from run_agent import AIAgent

# Uses OpenRouter by default (reads OPENROUTER_API_KEY from .env)
agent = AIAgent(
    model="anthropic/claude-sonnet-4-20250514",
    enabled_toolsets=["research"]
)
response = agent.chat("Find information about quantum computing")

# Create custom toolset at runtime
from toolsets import create_custom_toolset

create_custom_toolset(
    name="my_tools",
    description="My custom toolkit",
    tools=["web_search"],
    includes=["terminal", "vision"]
)

agent = AIAgent(enabled_toolsets=["my_tools"])

Batch Processing

Process multiple prompts from a dataset in parallel with automatic checkpointing and statistics tracking:

# Basic batch processing
python batch_runner.py \
  --dataset_file=prompts.jsonl \
  --batch_size=20 \
  --run_name=my_run

# With specific distribution
python batch_runner.py \
  --dataset_file=prompts.jsonl \
  --batch_size=20 \
  --run_name=image_run \
  --distribution=image_gen \
  --num_workers=4

Key Features:

Parallel processing with configurable workers
Toolset distributions for varied data generation
Automatic checkpointing and resume capability
Combined output in data/<run_name>/trajectories.jsonl
Tool usage statistics and success rates

Quick Start: See QUICKSTART_BATCH.md for a 5-minute getting started guide.
Full Documentation: See BATCH_PROCESSING.md for comprehensive documentation.

Ephemeral System Prompts

The ephemeral system prompt feature allows you to guide the model's behavior during batch processing without saving that prompt to the training dataset trajectories. This is useful for:

Guiding model behavior during data collection
Adding task-specific instructions
Keeping saved trajectories clean and focused on tool-calling format

Example:

python batch_runner.py \
  --dataset_file=prompts.jsonl \
  --batch_size=10 \
  --run_name=my_run \
  --ephemeral_system_prompt="You are a helpful assistant focused on image generation."

The ephemeral prompt will influence the model's behavior during execution, but only the standard tool-calling system prompt will be saved in the trajectory files.

Documentation: See docs/ephemeral_system_prompt.md for complete details.

Command Line Arguments

Single Agent (run_agent.py):

--query: The question or task for the agent
--model: Model to use (default: claude-opus-4-20250514)
--api_key: API key for authentication
--base_url: API endpoint URL
--max_turns: Maximum number of tool-calling iterations
--enabled_toolsets: Comma-separated list of toolsets to enable. Use all (or *) to enable everything. If omitted, all toolsets are enabled by default.
--disabled_toolsets: Comma-separated list of toolsets to disable
--list_tools: List all available toolsets and tools
--save_trajectories: Save conversation trajectories to JSONL files

Batch Processing (batch_runner.py):

--dataset_file: Path to JSONL file with prompts
--batch_size: Number of prompts per batch
--run_name: Name for this run (for output/checkpointing)
--distribution: Toolset distribution to use (default: "default")
--num_workers: Number of parallel workers (default: 4)
--resume: Resume from checkpoint if interrupted
--ephemeral_system_prompt: System prompt used during execution but NOT saved to trajectories
--list_distributions: List available toolset distributions

Environment Variables

All environment variables can be configured in the .env file (copy from .env.example).

LLM Provider (OpenRouter):

OPENROUTER_API_KEY: Primary LLM access via OpenRouter (supports Claude, GPT-4, Gemini, etc.)
LLM_MODEL: Default model (e.g., anthropic/claude-sonnet-4, openai/gpt-4o)

Tool API Keys:

FIRECRAWL_API_KEY: Web tools (search, extract, crawl)
NOUS_API_KEY: Vision and reasoning tools
FAL_KEY: Image generation tools

Optional Direct Provider Keys:

ANTHROPIC_API_KEY: Direct Anthropic access (fallback if OpenRouter not set)
OPENAI_API_KEY: Direct OpenAI access (fallback if OpenRouter not set)

Terminal Tool Configuration (mini-swe-agent backend):

TERMINAL_ENV: Backend type - local, docker, or modal (default: local)
TERMINAL_DOCKER_IMAGE: Docker image to use (default: python:3.11-slim)
TERMINAL_TIMEOUT: Command timeout in seconds (default: 60)
TERMINAL_LIFETIME_SECONDS: Cleanup inactive environments after this time (default: 300)
TERMINAL_CWD: Working directory inside containers (default: /tmp)

Legacy Hecate Terminal Backend (optional):

MORPH_API_KEY: For Hecate/MorphCloud terminal backend
HECATE_VM_LIFETIME_SECONDS: VM lifetime (default: 300)
HECATE_DEFAULT_SNAPSHOT_ID: Default snapshot (default: snapshot_p5294qxt)

Debug Options:

WEB_TOOLS_DEBUG, VISION_TOOLS_DEBUG, MOA_TOOLS_DEBUG, IMAGE_TOOLS_DEBUG: Enable debug logging

Documentation

Single Agent Usage:

TOOLSETS_README.md: Comprehensive guide to the toolsets system
toolsets.py: View and modify available toolsets
model_tools.py: Core tool definitions and handlers

Batch Processing:

QUICKSTART_BATCH.md: 5-minute quick start guide
BATCH_PROCESSING.md: Complete batch processing documentation
toolset_distributions.py: Toolset distributions for data generation

Examples

See TOOLSETS_README.md for extensive examples of using different toolsets for various scenarios.

Languages

Python 94.6%

TeX 3%

Shell 0.5%

JavaScript 0.4%

Nix 0.4%

Other 1%

README.md

Hermes Agent

Features

Setup

1. Clone the Repository

2. Install Dependencies

3. Configure Environment Variables

4. Configure Terminal Backend

Modal Cloud Backend Setup

Toolsets System

Key Concepts

Available Toolsets

Using Toolsets

Basic Usage

Default (all tools enabled)

With specific toolset

Python API

Batch Processing

Ephemeral System Prompts

Command Line Arguments

Environment Variables

Documentation

Examples