Hermes Agent - Development Guide
Instructions for AI coding assistants (GitHub Copilot, Cursor, etc.) and human developers.
Hermes Agent is an AI agent harness with tool-calling capabilities, interactive CLI, messaging integrations, and scheduled tasks.
Development Environment
IMPORTANT: Always use the virtual environment if it exists:
source venv/bin/activate # Before running any Python commands
Project Structure
hermes-agent/
├── agent/ # Agent internals (extracted from run_agent.py)
│ ├── model_metadata.py # Model context lengths, token estimation
│ ├── context_compressor.py # Auto context compression
│ ├── prompt_caching.py # Anthropic prompt caching
│ ├── prompt_builder.py # System prompt assembly (identity, skills index, context files)
│ ├── display.py # KawaiiSpinner, tool preview formatting
│ └── trajectory.py # Trajectory saving helpers
├── hermes_cli/ # CLI implementation
│ ├── main.py # Entry point, command dispatcher
│ ├── banner.py # Welcome banner, ASCII art, skills summary
│ ├── commands.py # Slash command definitions + autocomplete
│ ├── callbacks.py # Interactive prompt callbacks (clarify, sudo, approval)
│ ├── setup.py # Interactive setup wizard
│ ├── config.py # Config management & migration
│ ├── status.py # Status display
│ ├── doctor.py # Diagnostics
│ ├── gateway.py # Gateway management
│ ├── uninstall.py # Uninstaller
│ ├── cron.py # Cron job management
│ └── skills_hub.py # Skills Hub CLI + /skills slash command
├── tools/ # Tool implementations
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
│ ├── approval.py # Dangerous command detection + per-session approval
│ ├── environments/ # Terminal execution backends
│ │ ├── base.py # BaseEnvironment ABC
│ │ ├── local.py # Local execution with interrupt support
│ │ ├── docker.py # Docker container execution
│ │ ├── ssh.py # SSH remote execution
│ │ ├── singularity.py # Singularity/Apptainer + SIF management
│ │ └── modal.py # Modal cloud execution
│ ├── terminal_tool.py # Terminal orchestration (sudo, lifecycle, factory)
│ ├── todo_tool.py # Planning & task management
│ ├── process_registry.py # Background process management
│ └── ... # Other tool files
├── gateway/ # Messaging platform adapters
│ ├── platforms/ # Platform-specific adapters (telegram, discord, slack, whatsapp)
│ └── ...
├── cron/ # Scheduler implementation
├── environments/ # RL training environments (Atropos integration)
├── skills/ # Bundled skill sources
├── cli.py # Interactive CLI orchestrator (HermesCLI class)
├── run_agent.py # AIAgent class (core conversation loop)
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
├── toolsets.py # Tool groupings
├── toolset_distributions.py # Probability-based tool selection
└── batch_runner.py # Parallel batch processing
User Configuration (stored in ~/.hermes/):
- ~/.hermes/config.yaml - Settings (model, terminal, toolsets, etc.)
- ~/.hermes/.env - API keys and secrets
- ~/.hermes/pairing/ - DM pairing data
- ~/.hermes/hooks/ - Custom event hooks
- ~/.hermes/image_cache/ - Cached user images
- ~/.hermes/audio_cache/ - Cached user voice messages
- ~/.hermes/sticker_cache.json - Telegram sticker descriptions
File Dependency Chain
tools/registry.py (no deps — imported by all tool files)
↑
tools/*.py (each calls registry.register() at import time)
↑
model_tools.py (imports tools/registry + triggers tool discovery)
↑
run_agent.py, cli.py, batch_runner.py, environments/
Each tool file co-locates its schema, handler, and registration. model_tools.py is a thin orchestration layer.
AIAgent Class
The main agent is implemented in run_agent.py:
class AIAgent:
    def __init__(
        self,
        model: str = "anthropic/claude-sonnet-4",
        api_key: str = None,
        base_url: str = "https://openrouter.ai/api/v1",
        max_iterations: int = 60,  # Max tool-calling loops
        enabled_toolsets: list = None,
        disabled_toolsets: list = None,
        verbose_logging: bool = False,
        quiet_mode: bool = False,  # Suppress progress output
        tool_progress_callback: callable = None,  # Called on each tool use
    ):
        # Initialize OpenAI client, load tools based on toolsets
        ...

    def chat(self, user_message: str, task_id: str = None) -> str:
        # Main entry point - runs the agent loop
        ...
Agent Loop
The core loop in _run_agent_loop():
1. Add user message to conversation
2. Call LLM with tools
3. If LLM returns tool calls:
- Execute each tool
- Add tool results to conversation
- Go to step 2
4. If LLM returns text response:
- Return response to user
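iterations = 0
while iterations < max_iterations:
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tool_schemas,
    )
    message = response.choices[0].message
    if not message.tool_calls:
        return message.content  # A final text response ends the loop
    messages.append(message)  # Keep the assistant turn that requested the tools
    for tool_call in message.tool_calls:
        # execute_tool and tool_result_message are illustrative helpers
        result = execute_tool(tool_call)
        messages.append(tool_result_message(tool_call.id, result))
    iterations += 1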
Conversation Management
Messages are stored as a list of dicts following OpenAI format:
messages = [
{"role": "system", "content": "You are a helpful assistant..."},
{"role": "user", "content": "Search for Python tutorials"},
{"role": "assistant", "content": None, "tool_calls": [...]},
{"role": "tool", "tool_call_id": "...", "content": "..."},
{"role": "assistant", "content": "Here's what I found..."},
]
Reasoning Model Support
For models that support chain-of-thought reasoning:
- Extract reasoning_content from API responses
- Store in assistant_msg["reasoning"] for trajectory export
- Pass back via reasoning_content field on subsequent turns
CLI Architecture (cli.py)
The interactive CLI uses:
- Rich - For the welcome banner and styled panels
- prompt_toolkit - For fixed input area with history, patch_stdout, slash command autocomplete, and floating completion menus
- KawaiiSpinner (in agent/display.py) - Animated kawaii faces during API calls; clean ┊ activity feed for tool execution results
Key components:
- HermesCLI class - Main CLI controller with commands and conversation loop
- SlashCommandCompleter - Autocomplete dropdown for /commands (type / to see all)
- agent/skill_commands.py - Scans skills and builds invocation messages (shared with gateway)
- load_cli_config() - Loads config, sets environment variables for terminal
- build_welcome_banner() - Displays ASCII art logo, tools, and skills summary
CLI UX notes:
- Thinking spinner (during LLM API call) shows animated kawaii face + verb ((⌐■_■) deliberating...)
- When LLM returns tool calls, the spinner clears silently (no "got it!" noise)
- Tool execution results appear as a clean activity feed: ┊ {emoji} {verb} {detail} {duration}
- "got it!" only appears when the LLM returns a final text response (⚕ ready)
- The prompt shows ⚕ ❯ when the agent is working, ❯ when idle
- Pasting 5+ lines auto-saves to ~/.hermes/pastes/ and collapses to a reference
- Multi-line input via Alt+Enter or Ctrl+J
- /commands - Process user commands like /help, /clear, /personality, etc.
- /skill-name - Invoke installed skills directly (e.g., /axolotl, /gif-search)
CLI uses quiet_mode=True when creating AIAgent to suppress verbose logging.
Skill Slash Commands
Every installed skill in ~/.hermes/skills/ is automatically registered as a slash command.
The skill name (from frontmatter or folder name) becomes the command: axolotl → /axolotl.
Implementation (agent/skill_commands.py, shared between CLI and gateway):
- scan_skill_commands() scans all SKILL.md files at startup
- build_skill_invocation_message() loads the SKILL.md content and builds a user-turn message
- The message includes the full skill content, a list of supporting files (not loaded), and the user's instruction
- Supporting files can be loaded on demand via the skill_view tool
- Injected as a user message (not system prompt) to preserve prompt caching
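The name resolution described above (frontmatter name first, folder name as fallback) could look roughly like the sketch below. The helper names `frontmatter_name` and `scan_skill_commands_sketch` are hypothetical and the frontmatter parsing is deliberately naive; the real logic lives in agent/skill_commands.py and may differ.

```python
from pathlib import Path
from typing import Dict, Optional


def frontmatter_name(skill_md_text: str) -> Optional[str]:
    """Return the `name:` value from a leading frontmatter block, if present."""
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":  # end of the frontmatter block
            break
        if line.startswith("name:"):
            return line.split(":", 1)[1].strip() or None
    return None


def scan_skill_commands_sketch(skills_dir: Path) -> Dict[str, Path]:
    """Map "/<skill-name>" to its SKILL.md, falling back to the folder name."""
    commands: Dict[str, Path] = {}
    for skill_md in sorted(skills_dir.rglob("SKILL.md")):
        name = frontmatter_name(skill_md.read_text(encoding="utf-8"))
        commands["/" + (name or skill_md.parent.name)] = skill_md
    return commands
```

Calling `scan_skill_commands_sketch(Path.home() / ".hermes" / "skills")` would then yield entries such as `/axolotl` pointing at the matching SKILL.md.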
Adding CLI Commands
- Add to COMMANDS dict with description
- Add handler in process_command() method
- For persistent settings, use save_config_value() to update config
Hermes CLI Commands
The unified hermes command provides all functionality:
| Command | Description |
|---|---|
| hermes | Interactive chat (default) |
| hermes chat -q "..." | Single query mode |
| hermes setup | Configure API keys and settings |
| hermes config | View current configuration |
| hermes config edit | Open config in editor |
| hermes config set KEY VAL | Set a specific value |
| hermes config check | Check for missing config |
| hermes config migrate | Prompt for missing config interactively |
| hermes status | Show configuration status |
| hermes doctor | Diagnose issues |
| hermes update | Update to latest (checks for new config) |
| hermes uninstall | Uninstall (can keep configs for reinstall) |
| hermes gateway | Start gateway (messaging + cron scheduler) |
| hermes gateway setup | Configure messaging platforms interactively |
| hermes gateway install | Install gateway as system service |
| hermes cron list | View scheduled jobs |
| hermes cron status | Check if cron scheduler is running |
| hermes version | Show version info |
| hermes pairing list/approve/revoke | Manage DM pairing codes |
Messaging Gateway
The gateway connects Hermes to Telegram, Discord, Slack, and WhatsApp.
Setup
The interactive setup wizard handles platform configuration:
hermes gateway setup # Arrow-key menu of all platforms, configure tokens/allowlists/home channels
This is the recommended way to configure messaging. It shows which platforms are already set up, walks through each one interactively, and offers to start/restart the gateway service at the end.
Platforms can also be configured manually in ~/.hermes/.env:
# Telegram
TELEGRAM_BOT_TOKEN=123456:ABC-DEF... # From @BotFather
TELEGRAM_ALLOWED_USERS=123456789,987654 # Comma-separated user IDs (from @userinfobot)
# Discord
DISCORD_BOT_TOKEN=MTIz... # From Developer Portal
DISCORD_ALLOWED_USERS=123456789012345678 # Comma-separated user IDs
# Agent Behavior
HERMES_MAX_ITERATIONS=60 # Max tool-calling iterations
MESSAGING_CWD=/home/myuser # Terminal working directory for messaging
# Tool progress is configured in config.yaml (display.tool_progress: off|new|all|verbose)
Working Directory Behavior
- CLI (hermes command): Uses current directory (. → os.getcwd())
- Messaging (Telegram/Discord): Uses MESSAGING_CWD (default: home directory)
This is intentional: CLI users are in a terminal and expect the agent to work in their current directory, while messaging users need a consistent starting location.
Security (User Allowlists):
IMPORTANT: By default, the gateway denies all users who are not in an allowlist or paired via DM.
The gateway checks {PLATFORM}_ALLOWED_USERS environment variables:
- If set: Only listed user IDs can interact with the bot
- If unset: All users are denied unless GATEWAY_ALLOW_ALL_USERS=true is set
Users can find their IDs:
- Telegram: Message @userinfobot
- Discord: Enable Developer Mode, right-click name → Copy ID
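The allowlist rule above can be sketched as a small predicate. This is an illustration only: `is_user_allowed` is a hypothetical name, DM pairing is ignored, and the sketch assumes a configured allowlist takes precedence over GATEWAY_ALLOW_ALL_USERS, which is not stated here.

```python
import os
from typing import Mapping, Optional


def is_user_allowed(platform: str, user_id: str,
                    env: Optional[Mapping[str, str]] = None) -> bool:
    """Listed IDs pass; with no allowlist, deny unless allow-all is enabled."""
    env = os.environ if env is None else env
    raw = env.get(f"{platform.upper()}_ALLOWED_USERS", "")
    allowed = {uid.strip() for uid in raw.split(",") if uid.strip()}
    if allowed:
        # Assumption: a configured allowlist wins over the allow-all flag.
        return str(user_id) in allowed
    return env.get("GATEWAY_ALLOW_ALL_USERS", "").lower() == "true"
```

Passing `env` explicitly keeps the check easy to test without touching the real environment.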
DM Pairing System
Instead of static allowlists, users can pair via one-time codes:
1. Unknown user DMs the bot → receives pairing code
2. Owner runs hermes pairing approve <platform> <code>
3. User is permanently authorized
Security: 8-char codes, 1-hour expiry, rate-limited (1/10min/user), max 3 pending per platform, lockout after 5 failed attempts, chmod 0600 on data files.
Files: gateway/pairing.py, hermes_cli/pairing.py
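As a rough illustration of the one-time code lifecycle (8 characters, 1-hour expiry), consider the sketch below. The function names, the in-memory `pending` dict and the code alphabet are assumptions; rate limiting, lockout and file permissions are left out, and gateway/pairing.py may be structured differently.

```python
import secrets
import string
import time
from typing import Dict, Optional, Tuple

CODE_LENGTH = 8
CODE_TTL_SECONDS = 3600  # 1-hour expiry
ALPHABET = string.ascii_uppercase + string.digits  # assumption: real alphabet may differ


def issue_code(pending: Dict[str, Tuple[str, float]], user_id: str,
               now: Optional[float] = None) -> str:
    """Create a one-time code for user_id and record when it expires."""
    now = time.time() if now is None else now
    code = "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))
    pending[code] = (user_id, now + CODE_TTL_SECONDS)
    return code


def approve_code(pending: Dict[str, Tuple[str, float]], code: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Consume a code; return the user_id if it exists and has not expired."""
    now = time.time() if now is None else now
    entry = pending.pop(code, None)
    if entry is None or now > entry[1]:
        return None
    return entry[0]
```

Popping the entry on every approval attempt makes each code single-use, even when the attempt fails because the code has expired.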
Event Hooks
Hooks fire at lifecycle points. Place hook directories in ~/.hermes/hooks/:
~/.hermes/hooks/my-hook/
├── HOOK.yaml # name, description, events list
└── handler.py # async def handle(event_type, context): ...
Events: gateway:startup, session:start, session:reset, agent:start, agent:step, agent:end, command:*
The agent:step event fires each iteration of the tool-calling loop with tool names and results.
Files: gateway/hooks.py
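A handler.py for such a hook might look like the sketch below. Only the `handle(event_type, context)` signature comes from the layout above; the keys inside `context` are not specified here, so the sketch serializes the whole dict, and the in-memory `SEEN` list is purely illustrative.

```python
# ~/.hermes/hooks/my-hook/handler.py (illustrative sketch)
import json
from typing import Any, Dict, List

SEEN: List[str] = []  # in-memory record, for illustration only


async def handle(event_type: str, context: Dict[str, Any]) -> None:
    """Record one compact line per agent:step or command:* event; ignore the rest."""
    if event_type != "agent:step" and not event_type.startswith("command:"):
        return
    # The context schema is an unknown here, so it is dumped as-is.
    SEEN.append(f"{event_type} {json.dumps(context, default=str, sort_keys=True)}")
```

Filtering inside the handler keeps the hook cheap for events it does not care about, even if HOOK.yaml subscribes it broadly.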
Tool Progress Notifications
When tool_progress is enabled in config.yaml, the bot sends status messages as it works:
- 💻 ls -la... (terminal commands show the actual command)
- 🔍 web_search...
- 📄 web_extract...
- 🐍 execute_code... (programmatic tool calling sandbox)
- 🔀 delegate_task... (subagent delegation)
- ❓ clarify... (user question, CLI-only)
Modes:
- new: Only when switching to a different tool (less spam)
- all: Every single tool call
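The difference between the two modes can be expressed as a small decision function. `should_notify` is a hypothetical helper, and treating `verbose` like `all` for the purpose of when to notify is an assumption.

```python
from typing import Optional


def should_notify(mode: str, tool_name: str, last_tool: Optional[str]) -> bool:
    """Decide whether a progress message is sent for this tool call."""
    if mode in ("all", "verbose"):  # assumption: verbose also notifies on every call
        return True
    if mode == "new":
        return tool_name != last_tool  # only when the tool changes
    return False  # "off" and unknown modes stay silent in this sketch
```

For the call sequence web_search, web_search, terminal, mode `new` would notify on the first and third calls only.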
Typing Indicator
The gateway keeps the "typing..." indicator active throughout processing, refreshing every 4 seconds. This lets users know the bot is working even during long tool-calling sequences.
Platform Toolsets:
Each platform has a dedicated toolset in toolsets.py:
- hermes-telegram: Full tools including terminal (with safety checks)
- hermes-discord: Full tools including terminal
- hermes-whatsapp: Full tools including terminal
Configuration System
Configuration files are stored in ~/.hermes/ for easy user access:
- ~/.hermes/config.yaml - All settings (model, terminal, compression, etc.)
- ~/.hermes/.env - API keys and secrets
Adding New Configuration Options
When adding new configuration variables, you MUST follow this process:
For config.yaml options:
- Add to DEFAULT_CONFIG in hermes_cli/config.py
- CRITICAL: Bump _config_version in DEFAULT_CONFIG when adding required fields
- This triggers migration prompts for existing users on next hermes update or hermes setup
Example:
DEFAULT_CONFIG = {
    # ... existing config ...
    "new_feature": {
        "enabled": True,
        "option": "default_value",
    },
    # BUMP THIS when adding required fields
    "_config_version": 2,  # Was 1, now 2
}
For .env variables (API keys/secrets):
- Add to REQUIRED_ENV_VARS or OPTIONAL_ENV_VARS in hermes_cli/config.py
- Include metadata for the migration system:
OPTIONAL_ENV_VARS = {
    # ... existing vars ...
    "NEW_API_KEY": {
        "description": "What this key is for",
        "prompt": "Display name in prompts",
        "url": "https://where-to-get-it.com/",
        "tools": ["tools_it_enables"],  # What tools need this
        "password": True,  # Mask input
    },
}
Update related files:
- hermes_cli/setup.py - Add prompts in the setup wizard
- cli-config.yaml.example - Add example with comments
- Update README.md if user-facing
Config Version Migration
The system uses _config_version to detect outdated configs:
- check_for_missing_config() compares user config to DEFAULT_CONFIG
- migrate_config() interactively prompts for missing values
- Called automatically by hermes update and optionally by hermes setup
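A minimal sketch of the comparison step, assuming nested dict configs and a plain integer version. `missing_keys` and `needs_migration` are hypothetical names; the actual check_for_missing_config() in hermes_cli/config.py may work differently.

```python
from typing import Any, Dict, List


def missing_keys(user: Dict[str, Any], defaults: Dict[str, Any],
                 prefix: str = "") -> List[str]:
    """Return dotted paths that exist in defaults but not in the user config."""
    missing: List[str] = []
    for key, default in defaults.items():
        path = f"{prefix}{key}"
        if key not in user:
            missing.append(path)
        elif isinstance(default, dict) and isinstance(user[key], dict):
            # Recurse so newly added nested options are reported too.
            missing.extend(missing_keys(user[key], default, path + "."))
    return missing


def needs_migration(user: Dict[str, Any], defaults: Dict[str, Any]) -> bool:
    """A config counts as outdated when its version is below the default's."""
    return user.get("_config_version", 0) < defaults.get("_config_version", 0)
```

With the DEFAULT_CONFIG example above, a user config at version 1 that lacks `new_feature.option` would be flagged as outdated, with that one path reported as missing.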
Environment Variables
API keys are loaded from ~/.hermes/.env:
- OPENROUTER_API_KEY - Main LLM API access (primary provider)
- FIRECRAWL_API_KEY - Web search/extract tools
- BROWSERBASE_API_KEY / BROWSERBASE_PROJECT_ID - Browser automation
- FAL_KEY - Image generation (FLUX model)
- NOUS_API_KEY - Vision and Mixture-of-Agents tools
Terminal tool configuration (in ~/.hermes/config.yaml):
- terminal.backend - Backend: local, docker, singularity, modal, or ssh
- terminal.cwd - Working directory ("." = host CWD for local only; for remote backends set an absolute path inside the target, or omit to use the backend's default)
- terminal.docker_image - Image for Docker backend
- terminal.singularity_image - Image for Singularity backend
- terminal.modal_image - Image for Modal backend
- SSH: TERMINAL_SSH_HOST, TERMINAL_SSH_USER, TERMINAL_SSH_KEY in .env
Agent behavior (in ~/.hermes/.env):
- HERMES_MAX_ITERATIONS - Max tool-calling iterations (default: 60)
- MESSAGING_CWD - Working directory for messaging platforms (default: ~)
- display.tool_progress in config.yaml - Tool progress: off, new, all, verbose
- OPENAI_API_KEY - Voice transcription (Whisper STT)
- SLACK_BOT_TOKEN / SLACK_APP_TOKEN - Slack integration (Socket Mode)
- SLACK_ALLOWED_USERS - Comma-separated Slack user IDs
- HERMES_HUMAN_DELAY_MODE - Response pacing: off/natural/custom
- HERMES_HUMAN_DELAY_MIN_MS / HERMES_HUMAN_DELAY_MAX_MS - Custom delay range
Dangerous Command Approval
The terminal tool includes safety checks for potentially destructive commands (e.g., rm -rf, DROP TABLE, chmod 777, etc.):
Behavior by Backend:
- Docker/Singularity/Modal: Commands run unrestricted (isolated containers)
- Local/SSH: Dangerous commands trigger approval flow
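The backend rule above, combined with pattern-based detection, might be sketched as follows. The three patterns and their labels are illustrative stand-ins covering only the examples named in this section; the real detection lives in tools/approval.py and is likely more thorough.

```python
import re
from typing import Optional

# Illustrative patterns only; not the list used by tools/approval.py.
DANGEROUS_PATTERNS = [
    (re.compile(r"\brm\s+-[a-z]*r[a-z]*\b", re.IGNORECASE), "recursive delete"),
    (re.compile(r"\bdrop\s+table\b", re.IGNORECASE), "SQL table drop"),
    (re.compile(r"\bchmod\s+(?:-\w+\s+)*777\b"), "world-writable permissions"),
]


def detect_dangerous(command: str) -> Optional[str]:
    """Return a short label for the first matching pattern, or None."""
    for pattern, label in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return label
    return None


def needs_approval(command: str, backend: str) -> bool:
    """Only local and SSH backends gate dangerous commands; containers do not."""
    return backend in ("local", "ssh") and detect_dangerous(command) is not None
```

Returning a label rather than a boolean lets the approval prompt say why the command was flagged, as in the "recursive delete" example in the CLI flow.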
Approval Flow (CLI):
⚠️ Potentially dangerous command detected: recursive delete
rm -rf /tmp/test
[o]nce | [s]ession | [a]lways | [d]eny
Choice [o/s/a/D]:
Approval Flow (Messaging):
- Command is blocked with explanation
- Agent explains the command was blocked for safety
- User must add the pattern to their allowlist via hermes config edit or run the command directly on their machine
Configuration:
- command_allowlist in ~/.hermes/config.yaml stores permanently allowed patterns
- Add patterns via "always" approval or edit directly
Sudo Handling (Messaging):
- If sudo fails over messaging, output includes tip to add SUDO_PASSWORD to ~/.hermes/.env
Background Process Management
The process tool works alongside terminal for managing long-running background processes:
Starting a background process:
terminal(command="pytest -v tests/", background=true)
# Returns: {"session_id": "proc_abc123", "pid": 12345, ...}
Managing it with the process tool:
- process(action="list") -- show all running/recent processes
- process(action="poll", session_id="proc_abc123") -- check status + new output
- process(action="log", session_id="proc_abc123") -- full output with pagination
- process(action="wait", session_id="proc_abc123", timeout=600) -- block until done
- process(action="kill", session_id="proc_abc123") -- terminate
- process(action="write", session_id="proc_abc123", data="y") -- send stdin
- process(action="submit", session_id="proc_abc123", data="yes") -- send + Enter
Key behaviors:
- Background processes execute through the configured terminal backend (local/Docker/Modal/SSH/Singularity) -- never directly on the host unless TERMINAL_ENV=local
- The wait action blocks the tool call until the process finishes, times out, or is interrupted by a new user message
- PTY mode (pty=true on terminal) enables interactive CLI tools (Codex, Claude Code)
- In RL training, background processes are auto-killed when the episode ends (tool_context.cleanup())
- In the gateway, sessions with active background processes are exempt from idle reset
- The process registry checkpoints to ~/.hermes/processes.json for crash recovery
Files: tools/process_registry.py (registry + handler), tools/terminal_tool.py (spawn integration)
Adding New Tools
Adding a tool requires changes in 3 files (the tool file, toolsets.py, and model_tools.py):
- Create tools/your_tool.py with handler, schema, check function, and registry call:
# tools/example_tool.py
import json
import os
from tools.registry import registry


def check_example_requirements() -> bool:
    """Check if required API keys/dependencies are available."""
    return bool(os.getenv("EXAMPLE_API_KEY"))


def example_tool(param: str, task_id: str = None) -> str:
    """Execute the tool and return JSON string result."""
    try:
        result = {"success": True, "data": "..."}
        return json.dumps(result, ensure_ascii=False)
    except Exception as e:
        return json.dumps({"error": str(e)}, ensure_ascii=False)


EXAMPLE_SCHEMA = {
    "name": "example_tool",
    "description": "Does something useful.",
    "parameters": {
        "type": "object",
        "properties": {
            "param": {"type": "string", "description": "The parameter"}
        },
        "required": ["param"]
    }
}

registry.register(
    name="example_tool",
    toolset="example",
    schema=EXAMPLE_SCHEMA,
    handler=lambda args, **kw: example_tool(
        param=args.get("param", ""), task_id=kw.get("task_id")),
    check_fn=check_example_requirements,
    requires_env=["EXAMPLE_API_KEY"],
)
- Add to toolsets.py: Add "example_tool" to _HERMES_CORE_TOOLS if it should be in all platform toolsets, or create a new toolset entry.
- Add discovery import in model_tools.py's _discover_tools() list: "tools.example_tool".
That's it. The registry handles schema collection, dispatch, availability checking, and error wrapping automatically. No edits to TOOLSET_REQUIREMENTS, handle_function_call(), get_all_tool_names(), or any other data structure.
Optional: Add to OPTIONAL_ENV_VARS in hermes_cli/config.py for the setup wizard, and to toolset_distributions.py for batch processing.
Special case: tools that need agent-level state (like todo, memory):
These are intercepted by run_agent.py's tool dispatch loop before handle_function_call(). The registry still holds their schemas, but dispatch returns a stub error as a safety fallback. See todo_tool.py for the pattern.
All tool handlers MUST return a JSON string. The registry's dispatch() wraps all exceptions in {"error": "..."} automatically.
Dynamic Tool Availability
Tools declare their requirements at registration time via check_fn and requires_env. The registry checks check_fn() when building tool definitions -- tools whose check fails are silently excluded.
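To make the registry behavior concrete, here is a stripped-down sketch of registration, availability filtering, and exception wrapping. `ToolRegistrySketch` and `get_definitions` are hypothetical names, and the sketch omits toolset and requires_env handling; the real tools/registry.py may differ.

```python
import json
from typing import Any, Callable, Dict, List, Optional


class ToolRegistrySketch:
    """Toy registry: stores tools, filters by check_fn, wraps handler errors."""

    def __init__(self) -> None:
        self._tools: Dict[str, Dict[str, Any]] = {}

    def register(self, name: str, schema: Dict[str, Any],
                 handler: Callable[..., str],
                 check_fn: Optional[Callable[[], bool]] = None) -> None:
        self._tools[name] = {"schema": schema, "handler": handler, "check_fn": check_fn}

    def get_definitions(self) -> List[Dict[str, Any]]:
        """Return schemas only for tools whose check passes (or that have none)."""
        return [t["schema"] for t in self._tools.values()
                if t["check_fn"] is None or t["check_fn"]()]

    def dispatch(self, name: str, args: Dict[str, Any], **kwargs: Any) -> str:
        """Run a handler and turn any exception into a JSON error string."""
        try:
            return self._tools[name]["handler"](args, **kwargs)
        except Exception as exc:
            return json.dumps({"error": str(exc)}, ensure_ascii=False)
```

Because `dispatch` always returns a JSON string, callers can append the result to the conversation without a separate error path.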
Stateful Tools
Tools that maintain state (terminal, browser) require:
- task_id parameter for session isolation between concurrent tasks
- cleanup_*() function to release resources
- Cleanup is called automatically in run_agent.py after conversation completes
Trajectory Format
Conversations are saved in ShareGPT format for training:
{"from": "system", "value": "System prompt with <tools>...</tools>"}
{"from": "human", "value": "User message"}
{"from": "gpt", "value": "<think>reasoning</think>\n<tool_call>{...}</tool_call>"}
{"from": "tool", "value": "<tool_response>{...}</tool_response>"}
{"from": "gpt", "value": "Final response"}
Tool calls use <tool_call> XML tags, responses use <tool_response> tags, reasoning uses <think> tags.
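The mapping from OpenAI-style messages to this format could be sketched as below. `to_sharegpt` is a hypothetical helper; the JSON payload placed inside `<tool_call>` (a name plus parsed arguments) is an assumption, and the sketch does not add the `<tools>` listing to the system prompt.

```python
import json
from typing import Any, Dict, List

ROLE_MAP = {"system": "system", "user": "human", "assistant": "gpt", "tool": "tool"}


def to_sharegpt(messages: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Convert OpenAI-format messages into ShareGPT-style turns."""
    turns: List[Dict[str, str]] = []
    for msg in messages:
        role = msg["role"]
        parts: List[str] = []
        if role == "assistant":
            if msg.get("reasoning"):
                parts.append(f"<think>{msg['reasoning']}</think>")
            if msg.get("content"):
                parts.append(msg["content"])
            for call in msg.get("tool_calls") or []:
                fn = call["function"]
                # Assumed payload shape: tool name plus decoded arguments.
                payload = {"name": fn["name"], "arguments": json.loads(fn["arguments"])}
                parts.append(f"<tool_call>{json.dumps(payload)}</tool_call>")
        elif role == "tool":
            parts.append(f"<tool_response>{msg['content']}</tool_response>")
        else:
            parts.append(msg.get("content") or "")
        turns.append({"from": ROLE_MAP[role], "value": "\n".join(parts)})
    return turns
```

Each returned dict can be written as one line of the trajectory, matching the from/value pairs shown above.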
Trajectory Export
agent = AIAgent(save_trajectories=True)
agent.chat("Do something")
# Saves to trajectories/*.jsonl in ShareGPT format
Batch Processing (batch_runner.py)
For processing multiple prompts:
- Parallel execution with multiprocessing
- Content-based resume for fault tolerance (matches on prompt text, not indices)
- Toolset distributions control probabilistic tool availability per prompt
- Output: data/<run_name>/trajectories.jsonl (combined) + individual batch files
python batch_runner.py \
--dataset_file=prompts.jsonl \
--batch_size=20 \
--num_workers=4 \
--run_name=my_run
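Content-based resume, as described in the list above, amounts to skipping dataset rows whose prompt text already appears in the output file. The sketch below assumes each JSONL record stores the prompt under a `prompt` key, which is a guess; the function names are hypothetical and batch_runner.py may track completion differently.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List, Set


def completed_prompts(trajectories_file: Path, key: str = "prompt") -> Set[str]:
    """Collect prompt texts already present in an output JSONL file."""
    done: Set[str] = set()
    if not trajectories_file.exists():
        return done
    with trajectories_file.open(encoding="utf-8") as fh:
        for line in fh:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip blank or partially written lines from an interrupted run
            if isinstance(record, dict) and key in record:
                done.add(record[key])
    return done


def remaining(dataset: Iterable[Dict[str, str]], done: Set[str],
              key: str = "prompt") -> List[Dict[str, str]]:
    """Keep only dataset rows whose prompt text has not been processed yet."""
    return [row for row in dataset if row[key] not in done]
```

Matching on text rather than row index means the dataset can be reordered or extended between runs without redoing finished prompts.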
Skills System
Skills are on-demand knowledge documents the agent can load. Compatible with the agentskills.io open standard.
skills/
├── mlops/ # Category folder
│ ├── axolotl/ # Skill folder
│ │ ├── SKILL.md # Main instructions (required)
│ │ ├── references/ # Additional docs, API specs
│ │ ├── templates/ # Output formats, configs
│ │ └── assets/ # Supplementary files (agentskills.io)
│ └── vllm/
│ └── SKILL.md
├── .hub/ # Skills Hub state (gitignored)
│ ├── lock.json # Installed skill provenance
│ ├── quarantine/ # Pending security review
│ ├── audit.log # Security scan history
│ ├── taps.json # Custom source repos
│ └── index-cache/ # Cached remote indexes
Progressive disclosure (token-efficient):
- skills_categories() - List category names (~50 tokens)
- skills_list(category) - Name + description per skill (~3k tokens)
- skill_view(name) - Full content + tags + linked files
SKILL.md files use YAML frontmatter (agentskills.io format):
---
name: skill-name
description: Brief description for listing
version: 1.0.0
metadata:
  hermes:
    tags: [tag1, tag2]
    related_skills: [other-skill]
---
# Skill Content...
Skills Hub — user-driven skill search/install from online registries (GitHub, ClawHub, Claude marketplaces, LobeHub). Not exposed as an agent tool — the model cannot search for or install skills. Users manage skills via hermes skills ... CLI commands or the /skills slash command in chat.
Key files:
- tools/skills_tool.py - Agent-facing skill list/view (progressive disclosure)
- tools/skills_guard.py - Security scanner (regex + LLM audit, trust-aware install policy)
- tools/skills_hub.py - Source adapters (GitHub, ClawHub, Claude marketplace, LobeHub), lock file, auth
- hermes_cli/skills_hub.py - CLI subcommands + /skills slash command handler
Testing Changes
After making changes:
- Run hermes doctor to check setup
- Run hermes config check to verify config
- Test with hermes chat -q "test message"
- For new config options, test fresh install: rm -rf ~/.hermes && hermes setup