Add a 'platforms' field to SKILL.md frontmatter that restricts skills to specific operating systems. Skills with platforms: [macos] only appear in the system prompt, skills_list(), and slash commands on macOS. Skills without the field load everywhere (backward compatible). Implementation: - skill_matches_platform() in tools/skills_tool.py — core filter - Wired into all 3 discovery paths: prompt_builder.py, skills_tool.py, skill_commands.py - 28 new tests across 3 test files New bundled Apple/macOS skills (all platforms: [macos]): - imessage — Send/receive iMessages via imsg CLI - apple-reminders — Manage Reminders via remindctl CLI - apple-notes — Manage Notes via memo CLI - findmy — Track devices/AirTags via AppleScript + screen capture Docs updated: CONTRIBUTING.md, AGENTS.md, creating-skills.md, skills.md (user guide)
523 lines
21 KiB
Markdown
523 lines
21 KiB
Markdown
# Contributing to Hermes Agent
|
|
|
|
Thank you for contributing to Hermes Agent! This guide covers everything you need: setting up your dev environment, understanding the architecture, deciding what to build, and getting your PR merged.
|
|
|
|
---
|
|
|
|
## Contribution Priorities
|
|
|
|
We value contributions in this order:
|
|
|
|
1. **Bug fixes** — crashes, incorrect behavior, data loss. Always top priority.
|
|
2. **Cross-platform compatibility** — Windows, macOS, different Linux distros, different terminal emulators. We want Hermes to work everywhere.
|
|
3. **Security hardening** — shell injection, prompt injection, path traversal, privilege escalation. See [Security](#security-considerations).
|
|
4. **Performance and robustness** — retry logic, error handling, graceful degradation.
|
|
5. **New skills** — but only broadly useful ones. See [Should it be a Skill or a Tool?](#should-it-be-a-skill-or-a-tool)
|
|
6. **New tools** — rarely needed. Most capabilities should be skills. See below.
|
|
7. **Documentation** — fixes, clarifications, new examples.
|
|
|
|
---
|
|
|
|
## Should it be a Skill or a Tool?
|
|
|
|
This is the most common question for new contributors. The answer is almost always **skill**.
|
|
|
|
### Make it a Skill when:
|
|
|
|
- The capability can be expressed as instructions + shell commands + existing tools
|
|
- It wraps an external CLI or API that the agent can call via `terminal` or `web_extract`
|
|
- It doesn't need custom Python integration or API key management baked into the agent
|
|
- Examples: arXiv search, git workflows, Docker management, PDF processing, email via CLI tools
|
|
|
|
### Make it a Tool when:
|
|
|
|
- It requires end-to-end integration with API keys, auth flows, or multi-component configuration managed by the agent harness
|
|
- It needs custom processing logic that must execute precisely every time (not "best effort" from LLM interpretation)
|
|
- It handles binary data, streaming, or real-time events that can't go through the terminal
|
|
- Examples: browser automation (Browserbase session management), TTS (audio encoding + platform delivery), vision analysis (base64 image handling)
|
|
|
|
### Should the Skill be bundled?
|
|
|
|
Bundled skills (in `skills/`) ship with every Hermes install. They should be **broadly useful to most users**:
|
|
|
|
- Document handling, web research, common dev workflows, system administration
|
|
- Used regularly by a wide range of people
|
|
|
|
If your skill is official and useful but not universally needed (e.g., a paid service integration, a heavyweight dependency), put it in **`optional-skills/`** — it ships with the repo but isn't activated by default. Users can discover it via `hermes skills browse` (labeled "official") and install it with `hermes skills install` (no third-party warning, builtin trust).
|
|
|
|
If your skill is specialized, community-contributed, or niche, it's better suited for a **Skills Hub** — upload it to a skills registry and share it in the [Nous Research Discord](https://discord.gg/NousResearch). Users can install it with `hermes skills install`.
|
|
|
|
---
|
|
|
|
## Development Setup
|
|
|
|
### Prerequisites
|
|
|
|
| Requirement | Notes |
|
|
|-------------|-------|
|
|
| **Git** | With `--recurse-submodules` support |
|
|
| **Python 3.11+** | uv will install it if missing |
|
|
| **uv** | Fast Python package manager ([install](https://docs.astral.sh/uv/)) |
|
|
| **Node.js 18+** | Optional — needed for browser tools and WhatsApp bridge |
|
|
|
|
### Clone and install
|
|
|
|
```bash
|
|
git clone --recurse-submodules https://github.com/NousResearch/hermes-agent.git
|
|
cd hermes-agent
|
|
|
|
# Create venv with Python 3.11
|
|
uv venv venv --python 3.11
|
|
export VIRTUAL_ENV="$(pwd)/venv"
|
|
|
|
# Install with all extras (messaging, cron, CLI menus, dev tools)
|
|
uv pip install -e ".[all,dev]"
|
|
uv pip install -e "./mini-swe-agent"
|
|
uv pip install -e "./tinker-atropos"
|
|
|
|
# Optional: browser tools
|
|
npm install
|
|
```
|
|
|
|
### Configure for development
|
|
|
|
```bash
|
|
mkdir -p ~/.hermes/{cron,sessions,logs,memories,skills}
|
|
cp cli-config.yaml.example ~/.hermes/config.yaml
|
|
touch ~/.hermes/.env
|
|
|
|
# Add at minimum an LLM provider key:
|
|
echo 'OPENROUTER_API_KEY=sk-or-v1-your-key' >> ~/.hermes/.env
|
|
```
|
|
|
|
### Run
|
|
|
|
```bash
|
|
# Symlink for global access
|
|
mkdir -p ~/.local/bin
|
|
ln -sf "$(pwd)/venv/bin/hermes" ~/.local/bin/hermes
|
|
|
|
# Verify
|
|
hermes doctor
|
|
hermes chat -q "Hello"
|
|
```
|
|
|
|
### Run tests
|
|
|
|
```bash
|
|
pytest tests/ -v
|
|
```
|
|
|
|
---
|
|
|
|
## Project Structure
|
|
|
|
```
|
|
hermes-agent/
|
|
├── run_agent.py # AIAgent class — core conversation loop, tool dispatch, session persistence
|
|
├── cli.py # HermesCLI class — interactive TUI, prompt_toolkit integration
|
|
├── model_tools.py # Tool orchestration (thin layer over tools/registry.py)
|
|
├── toolsets.py # Tool groupings and presets (hermes-cli, hermes-telegram, etc.)
|
|
├── hermes_state.py # SQLite session database with FTS5 full-text search
|
|
├── batch_runner.py # Parallel batch processing for trajectory generation
|
|
│
|
|
├── agent/ # Agent internals (extracted modules)
|
|
│ ├── prompt_builder.py # System prompt assembly (identity, skills, context files, memory)
|
|
│ ├── context_compressor.py # Auto-summarization when approaching context limits
|
|
│ ├── auxiliary_client.py # Resolves auxiliary OpenAI clients (summarization, vision)
|
|
│ ├── display.py # KawaiiSpinner, tool progress formatting
|
|
│ ├── model_metadata.py # Model context lengths, token estimation
|
|
│ └── trajectory.py # Trajectory saving helpers
|
|
│
|
|
├── hermes_cli/ # CLI command implementations
|
|
│ ├── main.py # Entry point, argument parsing, command dispatch
|
|
│ ├── config.py # Config management, migration, env var definitions
|
|
│ ├── setup.py # Interactive setup wizard
|
|
│ ├── auth.py # Provider resolution, OAuth, Nous Portal
|
|
│ ├── models.py # OpenRouter model selection lists
|
|
│ ├── banner.py # Welcome banner, ASCII art
|
|
│ ├── commands.py # Slash command definitions + autocomplete
|
|
│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval)
|
|
│ ├── doctor.py # Diagnostics
|
|
│ └── skills_hub.py # Skills Hub CLI + /skills slash command
|
|
│
|
|
├── tools/ # Tool implementations (self-registering)
|
|
│ ├── registry.py # Central tool registry (schemas, handlers, dispatch)
|
|
│ ├── approval.py # Dangerous command detection + per-session approval
|
|
│ ├── terminal_tool.py # Terminal orchestration (sudo, env lifecycle, backends)
|
|
│ ├── file_operations.py # read_file, write_file, search, patch, etc.
|
|
│ ├── web_tools.py # web_search, web_extract (Firecrawl + Gemini summarization)
|
|
│ ├── vision_tools.py # Image analysis via multimodal models
|
|
│ ├── delegate_tool.py # Subagent spawning and parallel task execution
|
|
│ ├── code_execution_tool.py # Sandboxed Python with RPC tool access
|
|
│ ├── session_search_tool.py # Search past conversations with FTS5 + summarization
|
|
│ ├── cronjob_tools.py # Scheduled task management
|
|
│ ├── skill_tools.py # Skill search, load, manage
|
|
│ └── environments/ # Terminal execution backends
|
|
│ ├── base.py # BaseEnvironment ABC
|
|
│ ├── local.py, docker.py, ssh.py, singularity.py, modal.py, daytona.py
|
|
│
|
|
├── gateway/ # Messaging gateway
|
|
│ ├── run.py # GatewayRunner — platform lifecycle, message routing, cron
|
|
│ ├── config.py # Platform configuration resolution
|
|
│ ├── session.py # Session store, context prompts, reset policies
|
|
│ └── platforms/ # Platform adapters
|
|
│ ├── telegram.py, discord_adapter.py, slack.py, whatsapp.py
|
|
│
|
|
├── scripts/ # Installer and bridge scripts
|
|
│ ├── install.sh # Linux/macOS installer
|
|
│ ├── install.ps1 # Windows PowerShell installer
|
|
│ └── whatsapp-bridge/ # Node.js WhatsApp bridge (Baileys)
|
|
│
|
|
├── skills/ # Bundled skills (copied to ~/.hermes/skills/ on install)
|
|
├── optional-skills/ # Official optional skills (discoverable via hub, not activated by default)
|
|
├── environments/ # RL training environments (Atropos integration)
|
|
├── tests/ # Test suite
|
|
├── website/ # Documentation site (hermes-agent.nousresearch.com)
|
|
│
|
|
├── cli-config.yaml.example # Example configuration (copied to ~/.hermes/config.yaml)
|
|
└── AGENTS.md # Development guide for AI coding assistants
|
|
```
|
|
|
|
### User configuration (stored in `~/.hermes/`)
|
|
|
|
| Path | Purpose |
|
|
|------|---------|
|
|
| `~/.hermes/config.yaml` | Settings (model, terminal, toolsets, compression, etc.) |
|
|
| `~/.hermes/.env` | API keys and secrets |
|
|
| `~/.hermes/auth.json` | OAuth credentials (Nous Portal) |
|
|
| `~/.hermes/skills/` | All active skills (bundled + hub-installed + agent-created) |
|
|
| `~/.hermes/memories/` | Persistent memory (MEMORY.md, USER.md) |
|
|
| `~/.hermes/state.db` | SQLite session database |
|
|
| `~/.hermes/sessions/` | JSON session logs |
|
|
| `~/.hermes/cron/` | Scheduled job data |
|
|
| `~/.hermes/whatsapp/session/` | WhatsApp bridge credentials |
|
|
|
|
---
|
|
|
|
## Architecture Overview
|
|
|
|
### Core Loop
|
|
|
|
```
|
|
User message → AIAgent._run_agent_loop()
|
|
├── Build system prompt (prompt_builder.py)
|
|
├── Build API kwargs (model, messages, tools, reasoning config)
|
|
├── Call LLM (OpenAI-compatible API)
|
|
├── If tool_calls in response:
|
|
│ ├── Execute each tool via registry dispatch
|
|
│ ├── Add tool results to conversation
|
|
│ └── Loop back to LLM call
|
|
├── If text response:
|
|
│ ├── Persist session to DB
|
|
│ └── Return final_response
|
|
└── Context compression if approaching token limit
|
|
```
|
|
|
|
### Key Design Patterns
|
|
|
|
- **Self-registering tools**: Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules.
|
|
- **Toolset grouping**: Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform.
|
|
- **Session persistence**: All conversations are stored in SQLite (`hermes_state.py`) with full-text search. JSON logs go to `~/.hermes/sessions/`.
|
|
- **Ephemeral injection**: System prompts and prefill messages are injected at API call time, never persisted to the database or logs.
|
|
- **Provider abstraction**: The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint).
|
|
- **Provider routing**: When using OpenRouter, `provider_routing` in config.yaml controls provider selection (sort by throughput/latency/price, allow/ignore specific providers, data retention policies). These are injected as `extra_body.provider` in API requests.
|
|
|
|
---
|
|
|
|
## Code Style
|
|
|
|
- **PEP 8** with practical exceptions (we don't enforce strict line length)
|
|
- **Comments**: Only when explaining non-obvious intent, trade-offs, or API quirks. Don't narrate what the code does — `# increment counter` adds nothing
|
|
- **Error handling**: Catch specific exceptions. Log with `logger.warning()`/`logger.error()` — use `exc_info=True` for unexpected errors so stack traces appear in logs
|
|
- **Cross-platform**: Never assume Unix. See [Cross-Platform Compatibility](#cross-platform-compatibility)
|
|
|
|
---
|
|
|
|
## Adding a New Tool
|
|
|
|
Before writing a tool, ask: [should this be a skill instead?](#should-it-be-a-skill-or-a-tool)
|
|
|
|
Tools self-register with the central registry. Each tool file co-locates its schema, handler, and registration:
|
|
|
|
```python
|
|
"""my_tool — Brief description of what this tool does."""
|
|
|
|
import json
|
|
from tools.registry import registry
|
|
|
|
|
|
def my_tool(param1: str, param2: int = 10, **kwargs) -> str:
|
|
"""Handler. Returns a string result (often JSON)."""
|
|
result = do_work(param1, param2)
|
|
return json.dumps(result)
|
|
|
|
|
|
MY_TOOL_SCHEMA = {
|
|
"type": "function",
|
|
"function": {
|
|
"name": "my_tool",
|
|
"description": "What this tool does and when the agent should use it.",
|
|
"parameters": {
|
|
"type": "object",
|
|
"properties": {
|
|
"param1": {"type": "string", "description": "What param1 is"},
|
|
"param2": {"type": "integer", "description": "What param2 is", "default": 10},
|
|
},
|
|
"required": ["param1"],
|
|
},
|
|
},
|
|
}
|
|
|
|
|
|
def _check_requirements() -> bool:
|
|
"""Return True if this tool's dependencies are available."""
|
|
return True
|
|
|
|
|
|
registry.register(
|
|
name="my_tool",
|
|
toolset="my_toolset",
|
|
schema=MY_TOOL_SCHEMA,
|
|
handler=lambda args, **kw: my_tool(**args, **kw),
|
|
check_fn=_check_requirements,
|
|
)
|
|
```
|
|
|
|
Then add the import to `model_tools.py` in the `_modules` list:
|
|
|
|
```python
|
|
_modules = [
|
|
# ... existing modules ...
|
|
"tools.my_tool",
|
|
]
|
|
```
|
|
|
|
If it's a new toolset, add it to `toolsets.py` and to the relevant platform presets.
|
|
|
|
---
|
|
|
|
## Adding a Skill
|
|
|
|
Bundled skills live in `skills/` organized by category. Official optional skills use the same structure in `optional-skills/`:
|
|
|
|
```
|
|
skills/
|
|
├── research/
|
|
│ └── arxiv/
|
|
│ ├── SKILL.md # Required: main instructions
|
|
│ └── scripts/ # Optional: helper scripts
|
|
│ └── search_arxiv.py
|
|
├── productivity/
|
|
│ └── ocr-and-documents/
|
|
│ ├── SKILL.md
|
|
│ ├── scripts/
|
|
│ └── references/
|
|
└── ...
|
|
```
|
|
|
|
### SKILL.md format
|
|
|
|
```markdown
|
|
---
|
|
name: my-skill
|
|
description: Brief description (shown in skill search results)
|
|
version: 1.0.0
|
|
author: Your Name
|
|
license: MIT
|
|
platforms: [macos, linux] # Optional — restrict to specific OS platforms
|
|
# Valid: macos, linux, windows
|
|
# Omit to load on all platforms (default)
|
|
metadata:
|
|
hermes:
|
|
tags: [Category, Subcategory, Keywords]
|
|
related_skills: [other-skill-name]
|
|
---
|
|
|
|
# Skill Title
|
|
|
|
Brief intro.
|
|
|
|
## When to Use
|
|
Trigger conditions — when should the agent load this skill?
|
|
|
|
## Quick Reference
|
|
Table of common commands or API calls.
|
|
|
|
## Procedure
|
|
Step-by-step instructions the agent follows.
|
|
|
|
## Pitfalls
|
|
Known failure modes and how to handle them.
|
|
|
|
## Verification
|
|
How the agent confirms it worked.
|
|
```
|
|
|
|
### Platform-specific skills
|
|
|
|
Skills can declare which OS platforms they support via the `platforms` frontmatter field. Skills with this field are automatically hidden from the system prompt, `skills_list()`, and slash commands on incompatible platforms.
|
|
|
|
```yaml
|
|
platforms: [macos] # macOS only (e.g., iMessage, Apple Reminders)
|
|
platforms: [macos, linux] # macOS and Linux
|
|
platforms: [windows] # Windows only
|
|
```
|
|
|
|
If the field is omitted or empty, the skill loads on all platforms (backward compatible). See `skills/apple/` for examples of macOS-only skills.
|
|
|
|
### Skill guidelines
|
|
|
|
- **No external dependencies unless absolutely necessary.** Prefer stdlib Python, curl, and existing Hermes tools (`web_extract`, `terminal`, `read_file`).
|
|
- **Progressive disclosure.** Put the most common workflow first. Edge cases and advanced usage go at the bottom.
|
|
- **Include helper scripts** for XML/JSON parsing or complex logic — don't expect the LLM to write parsers inline every time.
|
|
- **Test it.** Run `hermes --toolsets skills -q "Use the X skill to do Y"` and verify the agent follows the instructions correctly.
|
|
|
|
---
|
|
|
|
## Cross-Platform Compatibility
|
|
|
|
Hermes runs on Linux, macOS, and Windows. When writing code that touches the OS:
|
|
|
|
### Critical rules
|
|
|
|
1. **`termios` and `fcntl` are Unix-only.** Always catch both `ImportError` and `NotImplementedError`:
|
|
```python
|
|
try:
|
|
from simple_term_menu import TerminalMenu
|
|
menu = TerminalMenu(options)
|
|
idx = menu.show()
|
|
except (ImportError, NotImplementedError):
|
|
# Fallback: numbered menu for Windows
|
|
for i, opt in enumerate(options):
|
|
print(f" {i+1}. {opt}")
|
|
idx = int(input("Choice: ")) - 1
|
|
```
|
|
|
|
2. **File encoding.** Windows may save `.env` files in `cp1252`. Always handle encoding errors:
|
|
```python
|
|
try:
|
|
load_dotenv(env_path)
|
|
except UnicodeDecodeError:
|
|
load_dotenv(env_path, encoding="latin-1")
|
|
```
|
|
|
|
3. **Process management.** `os.setsid()`, `os.killpg()`, and signal handling differ on Windows. Use platform checks:
|
|
```python
|
|
import platform
|
|
if platform.system() != "Windows":
|
|
kwargs["preexec_fn"] = os.setsid
|
|
```
|
|
|
|
4. **Path separators.** Use `pathlib.Path` instead of string concatenation with `/`.
|
|
|
|
5. **Shell commands in installers.** If you change `scripts/install.sh`, check if the equivalent change is needed in `scripts/install.ps1`.
|
|
|
|
---
|
|
|
|
## Security Considerations
|
|
|
|
Hermes has terminal access. Security matters.
|
|
|
|
### Existing protections
|
|
|
|
| Layer | Implementation |
|
|
|-------|---------------|
|
|
| **Sudo password piping** | Uses `shlex.quote()` to prevent shell injection |
|
|
| **Dangerous command detection** | Regex patterns in `tools/approval.py` with user approval flow |
|
|
| **Cron prompt injection** | Scanner in `tools/cronjob_tools.py` blocks instruction-override patterns |
|
|
| **Write deny list** | Protected paths (`~/.ssh/authorized_keys`, `/etc/shadow`) resolved via `os.path.realpath()` to prevent symlink bypass |
|
|
| **Skills guard** | Security scanner for hub-installed skills (`tools/skills_guard.py`) |
|
|
| **Code execution sandbox** | `execute_code` child process runs with API keys stripped from environment |
|
|
| **Container hardening** | Docker: all capabilities dropped, no privilege escalation, PID limits, size-limited tmpfs |
|
|
|
|
### When contributing security-sensitive code
|
|
|
|
- **Always use `shlex.quote()`** when interpolating user input into shell commands
|
|
- **Resolve symlinks** with `os.path.realpath()` before path-based access control checks
|
|
- **Don't log secrets.** API keys, tokens, and passwords should never appear in log output
|
|
- **Catch broad exceptions** around tool execution so a single failure doesn't crash the agent loop
|
|
- **Test on all platforms** if your change touches file paths, process management, or shell commands
|
|
|
|
If your PR affects security, note it explicitly in the description.
|
|
|
|
---
|
|
|
|
## Pull Request Process
|
|
|
|
### Branch naming
|
|
|
|
```
|
|
fix/description # Bug fixes
|
|
feat/description # New features
|
|
docs/description # Documentation
|
|
test/description # Tests
|
|
refactor/description # Code restructuring
|
|
```
|
|
|
|
### Before submitting
|
|
|
|
1. **Run tests**: `pytest tests/ -v`
|
|
2. **Test manually**: Run `hermes` and exercise the code path you changed
|
|
3. **Check cross-platform impact**: If you touch file I/O, process management, or terminal handling, consider Windows and macOS
|
|
4. **Keep PRs focused**: One logical change per PR. Don't mix a bug fix with a refactor with a new feature.
|
|
|
|
### PR description
|
|
|
|
Include:
|
|
- **What** changed and **why**
|
|
- **How to test** it (reproduction steps for bugs, usage examples for features)
|
|
- **What platforms** you tested on
|
|
- Reference any related issues
|
|
|
|
### Commit messages
|
|
|
|
We use [Conventional Commits](https://www.conventionalcommits.org/):
|
|
|
|
```
|
|
<type>(<scope>): <description>
|
|
```
|
|
|
|
| Type | Use for |
|
|
|------|---------|
|
|
| `fix` | Bug fixes |
|
|
| `feat` | New features |
|
|
| `docs` | Documentation |
|
|
| `test` | Tests |
|
|
| `refactor` | Code restructuring (no behavior change) |
|
|
| `chore` | Build, CI, dependency updates |
|
|
|
|
Scopes: `cli`, `gateway`, `tools`, `skills`, `agent`, `install`, `whatsapp`, `security`, etc.
|
|
|
|
Examples:
|
|
```
|
|
fix(cli): prevent crash in save_config_value when model is a string
|
|
feat(gateway): add WhatsApp multi-user session isolation
|
|
fix(security): prevent shell injection in sudo password piping
|
|
test(tools): add unit tests for file_operations
|
|
```
|
|
|
|
---
|
|
|
|
## Reporting Issues
|
|
|
|
- Use [GitHub Issues](https://github.com/NousResearch/hermes-agent/issues)
|
|
- Include: OS, Python version, Hermes version (`hermes version`), full error traceback
|
|
- Include steps to reproduce
|
|
- Check existing issues before creating duplicates
|
|
- For security vulnerabilities, please report privately
|
|
|
|
---
|
|
|
|
## Community
|
|
|
|
- **Discord**: [discord.gg/NousResearch](https://discord.gg/NousResearch) — for questions, showcasing projects, and sharing skills
|
|
- **GitHub Discussions**: For design proposals and architecture discussions
|
|
- **Skills Hub**: Upload specialized skills to a registry and share them with the community
|
|
|
|
---
|
|
|
|
## License
|
|
|
|
By contributing, you agree that your contributions will be licensed under the [MIT License](LICENSE).
|