Introduced a new `provider_routing` section in the CLI configuration to control how requests are routed across providers when using OpenRouter. This includes options for sorting providers by throughput, latency, or price, as well as allowing or ignoring specific providers, setting the order of provider attempts, and managing data collection policies. Updated relevant classes and documentation to support these features, enhancing flexibility in provider selection.
505 lines
19 KiB
Markdown
505 lines
19 KiB
Markdown
# Contributing to Hermes Agent
|
|
|
|
Thank you for contributing to Hermes Agent! This guide covers everything you need: setting up your dev environment, understanding the architecture, deciding what to build, and getting your PR merged.
|
|
|
|
---
|
|
|
|
## Contribution Priorities
|
|
|
|
We value contributions in this order:
|
|
|
|
1. **Bug fixes** — crashes, incorrect behavior, data loss. Always top priority.
|
|
2. **Cross-platform compatibility** — Windows, macOS, different Linux distros, different terminal emulators. We want Hermes to work everywhere.
|
|
3. **Security hardening** — shell injection, prompt injection, path traversal, privilege escalation. See [Security](#security-considerations).
|
|
4. **Performance and robustness** — retry logic, error handling, graceful degradation.
|
|
5. **New skills** — but only broadly useful ones. See [Should it be a Skill or a Tool?](#should-it-be-a-skill-or-a-tool)
|
|
6. **New tools** — rarely needed. Most capabilities should be skills. See below.
|
|
7. **Documentation** — fixes, clarifications, new examples.
|
|
|
|
---
|
|
|
|
## Should it be a Skill or a Tool?
|
|
|
|
This is the most common question for new contributors. The answer is almost always **skill**.
|
|
|
|
### Make it a Skill when:
|
|
|
|
- The capability can be expressed as instructions + shell commands + existing tools
|
|
- It wraps an external CLI or API that the agent can call via `terminal` or `web_extract`
|
|
- It doesn't need custom Python integration or API key management baked into the agent
|
|
- Examples: arXiv search, git workflows, Docker management, PDF processing, email via CLI tools
|
|
|
|
### Make it a Tool when:
|
|
|
|
- It requires end-to-end integration with API keys, auth flows, or multi-component configuration managed by the agent harness
|
|
- It needs custom processing logic that must execute precisely every time (not "best effort" from LLM interpretation)
|
|
- It handles binary data, streaming, or real-time events that can't go through the terminal
|
|
- Examples: browser automation (Browserbase session management), TTS (audio encoding + platform delivery), vision analysis (base64 image handling)
|
|
|
|
### Should the Skill be bundled?
|
|
|
|
Bundled skills (in `skills/`) ship with every Hermes install. They should be **broadly useful to most users**:
|
|
|
|
- Document handling, web research, common dev workflows, system administration
|
|
- Used regularly by a wide range of people
|
|
|
|
If your skill is specialized (a niche engineering tool, a specific SaaS integration, a game), it's better suited for a **Skills Hub** — upload it to a skills registry and share it in the [Nous Research Discord](https://discord.gg/NousResearch). Users can install it with `hermes skills install`.
|
|
|
|
---
|
|
|
|
## Development Setup
|
|
|
|
### Prerequisites
|
|
|
|
| Requirement | Notes |
|
|
|-------------|-------|
|
|
| **Git** | With `--recurse-submodules` support |
|
|
| **Python 3.11+** | uv will install it if missing |
|
|
| **uv** | Fast Python package manager ([install](https://docs.astral.sh/uv/)) |
|
|
| **Node.js 18+** | Optional — needed for browser tools and WhatsApp bridge |
|
|
|
|
### Clone and install
|
|
|
|
```bash
|
|
git clone --recurse-submodules https://github.com/NousResearch/hermes-agent.git
|
|
cd hermes-agent
|
|
|
|
# Create venv with Python 3.11
|
|
uv venv venv --python 3.11
|
|
export VIRTUAL_ENV="$(pwd)/venv"
|
|
|
|
# Install with all extras (messaging, cron, CLI menus, dev tools)
|
|
uv pip install -e ".[all,dev]"
|
|
uv pip install -e "./mini-swe-agent"
|
|
uv pip install -e "./tinker-atropos"
|
|
|
|
# Optional: browser tools
|
|
npm install
|
|
```
|
|
|
|
### Configure for development
|
|
|
|
```bash
|
|
mkdir -p ~/.hermes/{cron,sessions,logs,memories,skills}
|
|
cp cli-config.yaml.example ~/.hermes/config.yaml
|
|
touch ~/.hermes/.env
|
|
|
|
# Add at minimum an LLM provider key:
|
|
echo 'OPENROUTER_API_KEY=sk-or-v1-your-key' >> ~/.hermes/.env
|
|
```
|
|
|
|
### Run
|
|
|
|
```bash
|
|
# Symlink for global access
|
|
mkdir -p ~/.local/bin
|
|
ln -sf "$(pwd)/venv/bin/hermes" ~/.local/bin/hermes
|
|
|
|
# Verify
|
|
hermes doctor
|
|
hermes chat -q "Hello"
|
|
```
|
|
|
|
### Run tests
|
|
|
|
```bash
|
|
pytest tests/ -v
|
|
```
|
|
|
|
---
|
|
|
|
## Project Structure
|
|
|
|
```
|
|
hermes-agent/
|
|
├── run_agent.py # AIAgent class — core conversation loop, tool dispatch, session persistence
|
|
├── cli.py # HermesCLI class — interactive TUI, prompt_toolkit integration
|
|
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
|
|
├── toolsets.py # Tool groupings and presets (hermes-cli, hermes-telegram, etc.)
|
|
├── hermes_state.py # SQLite session database with FTS5 full-text search
|
|
├── batch_runner.py # Parallel batch processing for trajectory generation
|
|
│
|
|
├── agent/ # Agent internals (extracted modules)
|
|
│ ├── prompt_builder.py # System prompt assembly (identity, skills, context files, memory)
|
|
│ ├── context_compressor.py # Auto-summarization when approaching context limits
|
|
│ ├── auxiliary_client.py # Resolves auxiliary OpenAI clients (summarization, vision)
|
|
│ ├── display.py # KawaiiSpinner, tool progress formatting
|
|
│ ├── model_metadata.py # Model context lengths, token estimation
|
|
│ └── trajectory.py # Trajectory saving helpers
|
|
│
|
|
├── hermes_cli/ # CLI command implementations
|
|
│ ├── main.py # Entry point, argument parsing, command dispatch
|
|
│ ├── config.py # Config management, migration, env var definitions
|
|
│ ├── setup.py # Interactive setup wizard
|
|
│ ├── auth.py # Provider resolution, OAuth, Nous Portal
|
|
│ ├── models.py # OpenRouter model selection lists
|
|
│ ├── banner.py # Welcome banner, ASCII art
|
|
│ ├── commands.py # Slash command definitions + autocomplete
|
|
│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval)
|
|
│ ├── doctor.py # Diagnostics
|
|
│ └── skills_hub.py # Skills Hub CLI + /skills slash command
|
|
│
|
|
├── tools/ # Tool implementations (self-registering)
|
|
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
|
|
│ ├── approval.py # Dangerous command detection + per-session approval
|
|
│ ├── terminal_tool.py # Terminal orchestration (sudo, env lifecycle, backends)
|
|
│ ├── file_operations.py # read_file, write_file, search, patch, etc.
|
|
│ ├── web_tools.py # web_search, web_extract (Firecrawl + Gemini summarization)
|
|
│ ├── vision_tools.py # Image analysis via multimodal models
|
|
│ ├── delegate_tool.py # Subagent spawning and parallel task execution
|
|
│ ├── code_execution_tool.py # Sandboxed Python with RPC tool access
|
|
│ ├── session_search_tool.py # Search past conversations with FTS5 + summarization
|
|
│ ├── cronjob_tools.py # Scheduled task management
|
|
│ ├── skill_tools.py # Skill search, load, manage
|
|
│ └── environments/ # Terminal execution backends
|
|
│ ├── base.py # BaseEnvironment ABC
|
|
│ ├── local.py, docker.py, ssh.py, singularity.py, modal.py
|
|
│
|
|
├── gateway/ # Messaging gateway
|
|
│ ├── run.py # GatewayRunner — platform lifecycle, message routing, cron
|
|
│ ├── config.py # Platform configuration resolution
|
|
│ ├── session.py # Session store, context prompts, reset policies
|
|
│ └── platforms/ # Platform adapters
|
|
│ ├── telegram.py, discord_adapter.py, slack.py, whatsapp.py
|
|
│
|
|
├── scripts/ # Installer and bridge scripts
|
|
│ ├── install.sh # Linux/macOS installer
|
|
│ ├── install.ps1 # Windows PowerShell installer
|
|
│ └── whatsapp-bridge/ # Node.js WhatsApp bridge (Baileys)
|
|
│
|
|
├── skills/ # Bundled skills (copied to ~/.hermes/skills/ on install)
|
|
├── environments/ # RL training environments (Atropos integration)
|
|
├── tests/ # Test suite
|
|
├── docs/ # Additional documentation
|
|
│
|
|
├── cli-config.yaml.example # Example configuration (copied to ~/.hermes/config.yaml)
|
|
└── AGENTS.md # Development guide for AI coding assistants
|
|
```
|
|
|
|
### User configuration (stored in `~/.hermes/`)
|
|
|
|
| Path | Purpose |
|
|
|------|---------|
|
|
| `~/.hermes/config.yaml` | Settings (model, terminal, toolsets, compression, etc.) |
|
|
| `~/.hermes/.env` | API keys and secrets |
|
|
| `~/.hermes/auth.json` | OAuth credentials (Nous Portal) |
|
|
| `~/.hermes/skills/` | All active skills (bundled + hub-installed + agent-created) |
|
|
| `~/.hermes/memories/` | Persistent memory (MEMORY.md, USER.md) |
|
|
| `~/.hermes/state.db` | SQLite session database |
|
|
| `~/.hermes/sessions/` | JSON session logs |
|
|
| `~/.hermes/cron/` | Scheduled job data |
|
|
| `~/.hermes/whatsapp/session/` | WhatsApp bridge credentials |
|
|
|
|
---
|
|
|
|
## Architecture Overview
|
|
|
|
### Core Loop
|
|
|
|
```
|
|
User message → AIAgent._run_agent_loop()
|
|
├── Build system prompt (prompt_builder.py)
|
|
├── Build API kwargs (model, messages, tools, reasoning config)
|
|
├── Call LLM (OpenAI-compatible API)
|
|
├── If tool_calls in response:
|
|
│ ├── Execute each tool via registry dispatch
|
|
│ ├── Add tool results to conversation
|
|
│ └── Loop back to LLM call
|
|
├── If text response:
|
|
│ ├── Persist session to DB
|
|
│ └── Return final_response
|
|
└── Context compression if approaching token limit
|
|
```
|
|
|
|
### Key Design Patterns
|
|
|
|
- **Self-registering tools**: Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules.
|
|
- **Toolset grouping**: Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform.
|
|
- **Session persistence**: All conversations are stored in SQLite (`hermes_state.py`) with full-text search. JSON logs go to `~/.hermes/sessions/`.
|
|
- **Ephemeral injection**: System prompts and prefill messages are injected at API call time, never persisted to the database or logs.
|
|
- **Provider abstraction**: The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint).
|
|
- **Provider routing**: When using OpenRouter, `provider_routing` in config.yaml controls provider selection (sort by throughput/latency/price, allow/ignore specific providers, data retention policies). These are injected as `extra_body.provider` in API requests.
|
|
|
|
---
|
|
|
|
## Code Style
|
|
|
|
- **PEP 8** with practical exceptions (we don't enforce strict line length)
|
|
- **Comments**: Only when explaining non-obvious intent, trade-offs, or API quirks. Don't narrate what the code does — `# increment counter` adds nothing
|
|
- **Error handling**: Catch specific exceptions. Log with `logger.warning()`/`logger.error()` — use `exc_info=True` for unexpected errors so stack traces appear in logs
|
|
- **Cross-platform**: Never assume Unix. See [Cross-Platform Compatibility](#cross-platform-compatibility)
|
|
|
|
---
|
|
|
|
## Adding a New Tool
|
|
|
|
Before writing a tool, ask: [should this be a skill instead?](#should-it-be-a-skill-or-a-tool)
|
|
|
|
Tools self-register with the central registry. Each tool file co-locates its schema, handler, and registration:
|
|
|
|
```python
|
|
"""my_tool — Brief description of what this tool does."""
|
|
|
|
import json
|
|
from tools.registry import registry
|
|
|
|
|
|
def my_tool(param1: str, param2: int = 10, **kwargs) -> str:
|
|
"""Handler. Returns a string result (often JSON)."""
|
|
result = do_work(param1, param2)
|
|
return json.dumps(result)
|
|
|
|
|
|
MY_TOOL_SCHEMA = {
|
|
"type": "function",
|
|
"function": {
|
|
"name": "my_tool",
|
|
"description": "What this tool does and when the agent should use it.",
|
|
"parameters": {
|
|
"type": "object",
|
|
"properties": {
|
|
"param1": {"type": "string", "description": "What param1 is"},
|
|
"param2": {"type": "integer", "description": "What param2 is", "default": 10},
|
|
},
|
|
"required": ["param1"],
|
|
},
|
|
},
|
|
}
|
|
|
|
|
|
def _check_requirements() -> bool:
|
|
"""Return True if this tool's dependencies are available."""
|
|
return True
|
|
|
|
|
|
registry.register(
|
|
name="my_tool",
|
|
toolset="my_toolset",
|
|
schema=MY_TOOL_SCHEMA,
|
|
handler=lambda args, **kw: my_tool(**args, **kw),
|
|
check_fn=_check_requirements,
|
|
)
|
|
```
|
|
|
|
Then add the import to `model_tools.py` in the `_modules` list:
|
|
|
|
```python
|
|
_modules = [
|
|
# ... existing modules ...
|
|
"tools.my_tool",
|
|
]
|
|
```
|
|
|
|
If it's a new toolset, add it to `toolsets.py` and to the relevant platform presets.
|
|
|
|
---
|
|
|
|
## Adding a Bundled Skill
|
|
|
|
Bundled skills live in `skills/` organized by category:
|
|
|
|
```
|
|
skills/
|
|
├── research/
|
|
│ └── arxiv/
|
|
│ ├── SKILL.md # Required: main instructions
|
|
│ └── scripts/ # Optional: helper scripts
|
|
│ └── search_arxiv.py
|
|
├── productivity/
|
|
│ └── ocr-and-documents/
|
|
│ ├── SKILL.md
|
|
│ ├── scripts/
|
|
│ └── references/
|
|
└── ...
|
|
```
|
|
|
|
### SKILL.md format
|
|
|
|
```markdown
|
|
---
|
|
name: my-skill
|
|
description: Brief description (shown in skill search results)
|
|
version: 1.0.0
|
|
author: Your Name
|
|
license: MIT
|
|
metadata:
|
|
hermes:
|
|
tags: [Category, Subcategory, Keywords]
|
|
related_skills: [other-skill-name]
|
|
---
|
|
|
|
# Skill Title
|
|
|
|
Brief intro.
|
|
|
|
## When to Use
|
|
Trigger conditions — when should the agent load this skill?
|
|
|
|
## Quick Reference
|
|
Table of common commands or API calls.
|
|
|
|
## Procedure
|
|
Step-by-step instructions the agent follows.
|
|
|
|
## Pitfalls
|
|
Known failure modes and how to handle them.
|
|
|
|
## Verification
|
|
How the agent confirms it worked.
|
|
```
|
|
|
|
### Skill guidelines
|
|
|
|
- **No external dependencies unless absolutely necessary.** Prefer stdlib Python, curl, and existing Hermes tools (`web_extract`, `terminal`, `read_file`).
|
|
- **Progressive disclosure.** Put the most common workflow first. Edge cases and advanced usage go at the bottom.
|
|
- **Include helper scripts** for XML/JSON parsing or complex logic — don't expect the LLM to write parsers inline every time.
|
|
- **Test it.** Run `hermes --toolsets skills -q "Use the X skill to do Y"` and verify the agent follows the instructions correctly.
|
|
|
|
---
|
|
|
|
## Cross-Platform Compatibility
|
|
|
|
Hermes runs on Linux, macOS, and Windows. When writing code that touches the OS:
|
|
|
|
### Critical rules
|
|
|
|
1. **`termios` and `fcntl` are Unix-only.** Always catch both `ImportError` and `NotImplementedError`:
|
|
```python
|
|
try:
|
|
from simple_term_menu import TerminalMenu
|
|
menu = TerminalMenu(options)
|
|
idx = menu.show()
|
|
except (ImportError, NotImplementedError):
|
|
# Fallback: numbered menu for Windows
|
|
for i, opt in enumerate(options):
|
|
print(f" {i+1}. {opt}")
|
|
idx = int(input("Choice: ")) - 1
|
|
```
|
|
|
|
2. **File encoding.** Windows may save `.env` files in `cp1252`. Always handle encoding errors:
|
|
```python
|
|
try:
|
|
load_dotenv(env_path)
|
|
except UnicodeDecodeError:
|
|
load_dotenv(env_path, encoding="latin-1")
|
|
```
|
|
|
|
3. **Process management.** `os.setsid()`, `os.killpg()`, and signal handling differ on Windows. Use platform checks:
|
|
```python
|
|
import platform
|
|
if platform.system() != "Windows":
|
|
kwargs["preexec_fn"] = os.setsid
|
|
```
|
|
|
|
4. **Path separators.** Use `pathlib.Path` instead of string concatenation with `/`.
|
|
|
|
5. **Shell commands in installers.** If you change `scripts/install.sh`, check if the equivalent change is needed in `scripts/install.ps1`.
|
|
|
|
---
|
|
|
|
## Security Considerations
|
|
|
|
Hermes has terminal access. Security matters.
|
|
|
|
### Existing protections
|
|
|
|
| Layer | Implementation |
|
|
|-------|---------------|
|
|
| **Sudo password piping** | Uses `shlex.quote()` to prevent shell injection |
|
|
| **Dangerous command detection** | Regex patterns in `tools/approval.py` with user approval flow |
|
|
| **Cron prompt injection** | Scanner in `tools/cronjob_tools.py` blocks instruction-override patterns |
|
|
| **Write deny list** | Protected paths (`~/.ssh/authorized_keys`, `/etc/shadow`) resolved via `os.path.realpath()` to prevent symlink bypass |
|
|
| **Skills guard** | Security scanner for hub-installed skills (`tools/skills_guard.py`) |
|
|
| **Code execution sandbox** | `execute_code` child process runs with API keys stripped from environment |
|
|
| **Container hardening** | Docker: read-only root, all capabilities dropped, no privilege escalation, PID limits |
|
|
|
|
### When contributing security-sensitive code
|
|
|
|
- **Always use `shlex.quote()`** when interpolating user input into shell commands
|
|
- **Resolve symlinks** with `os.path.realpath()` before path-based access control checks
|
|
- **Don't log secrets.** API keys, tokens, and passwords should never appear in log output
|
|
- **Catch broad exceptions** around tool execution so a single failure doesn't crash the agent loop
|
|
- **Test on all platforms** if your change touches file paths, process management, or shell commands
|
|
|
|
If your PR affects security, note it explicitly in the description.
|
|
|
|
---
|
|
|
|
## Pull Request Process
|
|
|
|
### Branch naming
|
|
|
|
```
|
|
fix/description # Bug fixes
|
|
feat/description # New features
|
|
docs/description # Documentation
|
|
test/description # Tests
|
|
refactor/description # Code restructuring
|
|
```
|
|
|
|
### Before submitting
|
|
|
|
1. **Run tests**: `pytest tests/ -v`
|
|
2. **Test manually**: Run `hermes` and exercise the code path you changed
|
|
3. **Check cross-platform impact**: If you touch file I/O, process management, or terminal handling, consider Windows and macOS
|
|
4. **Keep PRs focused**: One logical change per PR. Don't mix a bug fix with a refactor with a new feature.
|
|
|
|
### PR description
|
|
|
|
Include:
|
|
- **What** changed and **why**
|
|
- **How to test** it (reproduction steps for bugs, usage examples for features)
|
|
- **What platforms** you tested on
|
|
- Reference any related issues
|
|
|
|
### Commit messages
|
|
|
|
We use [Conventional Commits](https://www.conventionalcommits.org/):
|
|
|
|
```
|
|
<type>(<scope>): <description>
|
|
```
|
|
|
|
| Type | Use for |
|
|
|------|---------|
|
|
| `fix` | Bug fixes |
|
|
| `feat` | New features |
|
|
| `docs` | Documentation |
|
|
| `test` | Tests |
|
|
| `refactor` | Code restructuring (no behavior change) |
|
|
| `chore` | Build, CI, dependency updates |
|
|
|
|
Scopes: `cli`, `gateway`, `tools`, `skills`, `agent`, `install`, `whatsapp`, `security`, etc.
|
|
|
|
Examples:
|
|
```
|
|
fix(cli): prevent crash in save_config_value when model is a string
|
|
feat(gateway): add WhatsApp multi-user session isolation
|
|
fix(security): prevent shell injection in sudo password piping
|
|
test(tools): add unit tests for file_operations
|
|
```
|
|
|
|
---
|
|
|
|
## Reporting Issues
|
|
|
|
- Use [GitHub Issues](https://github.com/NousResearch/hermes-agent/issues)
|
|
- Include: OS, Python version, Hermes version (`hermes version`), full error traceback
|
|
- Include steps to reproduce
|
|
- Check existing issues before creating duplicates
|
|
- For security vulnerabilities, please report privately
|
|
|
|
---
|
|
|
|
## Community
|
|
|
|
- **Discord**: [discord.gg/NousResearch](https://discord.gg/NousResearch) — for questions, showcasing projects, and sharing skills
|
|
- **GitHub Discussions**: For design proposals and architecture discussions
|
|
- **Skills Hub**: Upload specialized skills to a registry and share them with the community
|
|
|
|
---
|
|
|
|
## License
|
|
|
|
By contributing, you agree that your contributions will be licensed under the [MIT License](LICENSE).
|