Hermes-Agent is an agent harness for LLMs with an interactive CLI.
## Development Environment

**IMPORTANT**: Always use the virtual environment if it exists:

```bash
source venv/bin/activate  # Before running any Python commands
```
## Project Structure

- `hermes` - CLI launcher script (run with `./hermes`)
- `cli.py` - Interactive CLI with Rich UI, prompt_toolkit, animated spinners
- `cli-config.yaml` - CLI configuration (model, terminal, toolsets, personalities)
- `tools/` - Individual tool implementations (web, terminal, browser, vision, etc.)
- `tools/__init__.py` - Exports all tools for importing
- `model_tools.py` - Consolidates tool schemas and handlers for the agent
- `toolsets.py` - Groups tools into logical toolsets (web, terminal, browser, etc.)
- `toolset_distributions.py` - Probability-based tool selection for data generation
- `run_agent.py` - Primary agent runner with AIAgent class and KawaiiSpinner
- `batch_runner.py` - Parallel batch processing with checkpointing
- `tests/` - Test scripts
## File Dependency Chain

```
tools/*.py → tools/__init__.py → model_tools.py → toolsets.py → toolset_distributions.py
                                        ↑
run_agent.py ───────────────────────────┘

cli.py → run_agent.py (uses AIAgent with quiet_mode=True)
batch_runner.py → run_agent.py + toolset_distributions.py
```

Always ensure consistency between `tools/`, `model_tools.py`, and `toolsets.py` when changing any of them.
## CLI Architecture (cli.py)

The interactive CLI uses:

- **Rich** - For the welcome banner and styled panels
- **prompt_toolkit** - For a fixed input area with history and `patch_stdout`
- **KawaiiSpinner** (in run_agent.py) - Animated feedback during API calls and tool execution

Key components:

- `HermesCLI` class - Main CLI controller with commands and conversation loop
- `load_cli_config()` - Loads `cli-config.yaml`, sets environment variables for the terminal tool
- `build_welcome_banner()` - Displays ASCII art logo, tools, and skills summary
- `/commands` - Process user commands like `/help`, `/clear`, `/personality`, etc.

The CLI creates AIAgent with `quiet_mode=True` to suppress verbose logging and enable kawaii-style feedback instead.
### Adding CLI Commands

1. Add to the `COMMANDS` dict with a description
2. Add a handler in the `process_command()` method
3. For persistent settings, use `save_config_value()` to update `cli-config.yaml`
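Steps 1-2 can be sketched as follows. This is a minimal illustration, not the real `HermesCLI` class: `/ping` is a hypothetical command, and the actual controller threads these handlers through its conversation loop.

```python
# Sketch of steps 1-2 for a hypothetical /ping command.
COMMANDS = {
    "/help": "Show available commands",
    "/ping": "Hypothetical example command",
}


class MiniCLI:
    def process_command(self, line: str) -> str:
        """Dispatch one slash-command line to its handler."""
        if not line.strip():
            return ""
        cmd = line.split()[0]
        if cmd not in COMMANDS:
            return f"Unknown command: {cmd}"
        if cmd == "/help":
            return "\n".join(f"{c} - {d}" for c, d in COMMANDS.items())
        if cmd == "/ping":
            return "pong"
        return ""
```

For a persistent setting (step 3), the handler would additionally call `save_config_value()` so the choice survives restarts.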
## Adding a New Tool

Follow this strict order to maintain consistency:

1. Create `tools/your_tool.py` with:
   - Handler function (sync or async) returning a JSON string via `json.dumps()`
   - `check_*_requirements()` function to verify dependencies (e.g., API keys)
   - Schema definition following OpenAI function-calling format

2. Export in `tools/__init__.py`:
   - Import the handler and check function
   - Add to the `__all__` list

3. Register in `model_tools.py`:
   - Create a `get_*_tool_definitions()` function or add to an existing one
   - Add routing in the `handle_function_call()` dispatcher
   - Update `get_all_tool_names()` with the tool name
   - Update the `get_toolset_for_tool()` mapping
   - Update `get_available_toolsets()` and `check_toolset_requirements()`

4. Add to a toolset in `toolsets.py`:
   - Add to an existing toolset or create a new one in the `TOOLSETS` dict

5. Optionally add to `toolset_distributions.py` for batch processing
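Steps 2-3 can be sketched as below. This is a simplified sketch: the real `handle_function_call()` dispatcher also threads `task_id` through and supports async handlers, and `example_tool` here is an inline stand-in for the import from `tools/`.

```python
# tools/__init__.py (excerpt) - step 2: re-export the new tool
#   from .example_tool import example_tool, check_example_requirements
#   __all__ = [..., "example_tool", "check_example_requirements"]

# model_tools.py (excerpt) - step 3: route calls in the dispatcher
import json


def example_tool(param: str) -> str:
    """Inline stand-in for the handler imported from tools/."""
    return json.dumps({"success": True, "data": param})


def handle_function_call(name: str, arguments: dict) -> str:
    """Simplified dispatcher: map a tool name to its handler."""
    if name == "example_tool":
        return example_tool(**arguments)
    return json.dumps({"error": f"Unknown tool: {name}"})
```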
## Tool Implementation Pattern

```python
# tools/example_tool.py
import json
import os


def check_example_requirements() -> bool:
    """Check if required API keys/dependencies are available."""
    return bool(os.getenv("EXAMPLE_API_KEY"))


def example_tool(param: str, task_id: str | None = None) -> str:
    """Execute the tool and return a JSON string result."""
    try:
        result = {"success": True, "data": "..."}
        return json.dumps(result, ensure_ascii=False)
    except Exception as e:
        return json.dumps({"error": str(e)}, ensure_ascii=False)
```

All tool handlers MUST return a JSON string. Never return raw dicts.
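The schema definition from step 1 follows the OpenAI function-calling format. A matching schema for `example_tool` might look like this (descriptions are illustrative):

```python
# Schema for example_tool in OpenAI function-calling format.
EXAMPLE_TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "example_tool",
        "description": "Illustrative tool that processes a string parameter.",
        "parameters": {
            "type": "object",
            "properties": {
                "param": {
                    "type": "string",
                    "description": "Input value for the tool.",
                },
            },
            "required": ["param"],
        },
    },
}
```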
## Stateful Tools

Tools that maintain state (terminal, browser) require:

- A `task_id` parameter for session isolation between concurrent tasks
- A `cleanup_*()` function to release resources
- Cleanup is called automatically in run_agent.py after the conversation completes
## Environment Variables

API keys are loaded from the `.env` file in the repo root:

- `OPENROUTER_API_KEY` - Main LLM API access (primary provider)
- `FIRECRAWL_API_KEY` - Web search/extract tools
- `BROWSERBASE_API_KEY` / `BROWSERBASE_PROJECT_ID` - Browser automation
- `FAL_KEY` - Image generation (FLUX model)
- `NOUS_API_KEY` - Vision and Mixture-of-Agents tools

Terminal tool configuration (can also be set in `cli-config.yaml`):

- `TERMINAL_ENV` - Backend: local, docker, singularity, modal, or ssh
- `TERMINAL_CWD` - Working directory
- `TERMINAL_SSH_HOST`, `TERMINAL_SSH_USER`, `TERMINAL_SSH_KEY` - For the SSH backend
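A minimal `.env` might look like this (placeholder values only; never commit real keys):

```bash
# .env (repo root) - placeholder values
OPENROUTER_API_KEY=your-openrouter-key
FIRECRAWL_API_KEY=your-firecrawl-key
TERMINAL_ENV=local
TERMINAL_CWD=/tmp/agent-workdir
```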
## Agent Loop (run_agent.py)

The AIAgent class handles:

- Processing enabled toolsets to provide to the model
- Feeding user prompts into the conversation
- Looping LLM calls while tools are invoked, until the model returns a natural-language response
- Returning the final response

Uses an OpenAI-compatible API (primarily OpenRouter) with the OpenAI Python SDK.
## Reasoning Model Support

For models that support chain-of-thought reasoning:

- Extract `reasoning_content` from API responses
- Store it in `assistant_msg["reasoning"]` for trajectory export
- Pass it back via the `reasoning_content` field on subsequent turns
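The extraction step can be sketched as follows. `build_assistant_msg` is a hypothetical helper name; `reasoning_content` is only present on responses from reasoning-capable models/providers, so it is read defensively with `getattr`.

```python
def build_assistant_msg(message) -> dict:
    """Capture optional chain-of-thought alongside the normal content.

    `message` is an SDK message object; reasoning_content may be absent.
    """
    msg = {"role": "assistant", "content": message.content}
    reasoning = getattr(message, "reasoning_content", None)
    if reasoning:
        msg["reasoning"] = reasoning  # kept for trajectory export
    return msg
```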
## Trajectory Format

Conversations are saved in ShareGPT format for training:

```json
{"from": "system", "value": "System prompt with <tools>...</tools>"}
{"from": "human", "value": "User message"}
{"from": "gpt", "value": "<think>reasoning</think>\n<tool_call>{...}</tool_call>"}
{"from": "tool", "value": "<tool_response>{...}</tool_response>"}
{"from": "gpt", "value": "Final response"}
```

Tool calls use `<tool_call>` XML tags, responses use `<tool_response>` tags, and reasoning uses `<think>` tags.
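The role mapping from OpenAI-style messages to ShareGPT can be sketched as below. This is a simplified sketch: the real exporter also wraps tool calls, tool responses, and reasoning in the XML tags described above.

```python
# Map OpenAI-style chat roles onto ShareGPT "from" labels (sketch).
ROLE_MAP = {"system": "system", "user": "human",
            "assistant": "gpt", "tool": "tool"}


def to_sharegpt(messages: list[dict]) -> list[dict]:
    """Convert OpenAI-style chat messages to ShareGPT from/value pairs."""
    return [{"from": ROLE_MAP[m["role"]], "value": m["content"]}
            for m in messages]
```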
## Batch Processing (batch_runner.py)

For processing multiple prompts:

- Parallel execution with multiprocessing
- Content-based resume for fault tolerance (matches on prompt text, not indices)
- Toolset distributions control probabilistic tool availability per prompt
- Output: `data/<run_name>/trajectories.jsonl` (combined) + individual batch files
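Content-based resume can be sketched as below. The `"prompt"` record field is a hypothetical name for illustration; the real runner also merges the per-batch files.

```python
import json
from pathlib import Path


def pending_prompts(prompts: list[str], trajectories_path: Path) -> list[str]:
    """Skip prompts whose text already appears in the output file.

    Matching on prompt text (not list indices) keeps resume correct even
    if the input prompt list is reordered between runs.
    """
    done = set()
    if trajectories_path.exists():
        for line in trajectories_path.read_text().splitlines():
            record = json.loads(line)
            done.add(record["prompt"])  # hypothetical field name
    return [p for p in prompts if p not in done]
```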
## Logging

When trajectories are saved, tool definitions are folded into the system prompt so the stored format is suitable for later training use.
## Skills System

Skills are on-demand knowledge documents the agent can load. Located in the `skills/` directory:

```
skills/
├── mlops/                  # Category folder
│   ├── axolotl/            # Skill folder
│   │   ├── SKILL.md        # Main instructions (required)
│   │   ├── references/     # Additional docs, API specs
│   │   └── templates/      # Output formats, configs
│   └── vllm/
│       └── SKILL.md
└── example-skill/
    └── SKILL.md
```
**Progressive disclosure** (token-efficient):

1. `skills_categories()` - List category names (~50 tokens)
2. `skills_list(category)` - Name + description per skill (~3k tokens)
3. `skill_view(name)` - Full content + tags + linked files
SKILL.md files use YAML frontmatter:

```yaml
---
name: skill-name
description: Brief description for listing
tags: [tag1, tag2]
related_skills: [other-skill]
version: 1.0.0
---
# Skill Content...
```

Tool files: `tools/skills_tool.py` → `model_tools.py` → `toolsets.py`
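Parsing that frontmatter can be sketched without a YAML dependency as below. This is a simplified sketch (`parse_frontmatter` is a hypothetical helper) that handles only flat `key: value` lines, not nested YAML.

```python
def parse_frontmatter(text: str) -> tuple[dict, str]:
    """Split a SKILL.md into (frontmatter dict, body).

    Simplified: handles flat `key: value` pairs only.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":  # closing delimiter: rest is the body
            return meta, "\n".join(lines[i + 1:]).lstrip("\n")
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return {}, text  # unterminated frontmatter: treat as plain body
```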