diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 97cf4bfe5..289605319 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,240 +1,503 @@ # Contributing to Hermes Agent -Thank you for your interest in contributing to Hermes Agent! This document provides guidelines and information for contributors. +Thank you for contributing to Hermes Agent! This guide covers everything you need: setting up your dev environment, understanding the architecture, deciding what to build, and getting your PR merged. -## Getting Started +--- + +## Contribution Priorities + +We value contributions in this order: + +1. **Bug fixes** — crashes, incorrect behavior, data loss. Always top priority. +2. **Cross-platform compatibility** — Windows, macOS, different Linux distros, different terminal emulators. We want Hermes to work everywhere. +3. **Security hardening** — shell injection, prompt injection, path traversal, privilege escalation. See [Security](#security-considerations). +4. **Performance and robustness** — retry logic, error handling, graceful degradation. +5. **New skills** — but only broadly useful ones. See [Should it be a Skill or a Tool?](#should-it-be-a-skill-or-a-tool) +6. **New tools** — rarely needed. Most capabilities should be skills. See below. +7. **Documentation** — fixes, clarifications, new examples. + +--- + +## Should it be a Skill or a Tool? + +This is the most common question for new contributors. The answer is almost always **skill**. 
+ +### Make it a Skill when: + +- The capability can be expressed as instructions + shell commands + existing tools +- It wraps an external CLI or API that the agent can call via `terminal` or `web_extract` +- It doesn't need custom Python integration or API key management baked into the agent +- Examples: arXiv search, git workflows, Docker management, PDF processing, email via CLI tools + +### Make it a Tool when: + +- It requires end-to-end integration with API keys, auth flows, or multi-component configuration managed by the agent harness +- It needs custom processing logic that must execute precisely every time (not "best effort" from LLM interpretation) +- It handles binary data, streaming, or real-time events that can't go through the terminal +- Examples: browser automation (Browserbase session management), TTS (audio encoding + platform delivery), vision analysis (base64 image handling) + +### Should the Skill be bundled? + +Bundled skills (in `skills/`) ship with every Hermes install. They should be **broadly useful to most users**: + +- Document handling, web research, common dev workflows, system administration +- Used regularly by a wide range of people + +If your skill is specialized (a niche engineering tool, a specific SaaS integration, a game), it's better suited for a **Skills Hub** — upload it to a skills registry and share it in the [Nous Research Discord](https://discord.gg/NousResearch). Users can install it with `hermes skills install`. + +--- + +## Development Setup ### Prerequisites -- Python 3.11+ -- An OpenRouter API key (for running the agent) -- Git +| Requirement | Notes | +|-------------|-------| +| **Git** | With `--recurse-submodules` support | +| **Python 3.11+** | uv will install it if missing | +| **uv** | Fast Python package manager ([install](https://docs.astral.sh/uv/)) | +| **Node.js 18+** | Optional — needed for browser tools and WhatsApp bridge | -### Development Setup +### Clone and install -1. 
Clone the repository: - ```bash - git clone https://github.com/NousResearch/hermes-agent.git - cd hermes-agent - ``` +```bash +git clone --recurse-submodules https://github.com/NousResearch/hermes-agent.git +cd hermes-agent -2. Install dependencies: - ```bash - pip install -e . - # Or using uv - uv pip install -e . - ``` +# Create venv with Python 3.11 +uv venv venv --python 3.11 +export VIRTUAL_ENV="$(pwd)/venv" -3. Copy the example environment file and configure: - ```bash - cp .env.example .env - # Edit .env with your API keys - ``` +# Install with all extras (messaging, cron, CLI menus, dev tools) +uv pip install -e ".[all,dev]" +uv pip install -e "./mini-swe-agent" +uv pip install -e "./tinker-atropos" -4. Run the setup script (optional, for shell autocompletion): - ```bash - ./setup-hermes.sh - ``` +# Optional: browser tools +npm install +``` + +### Configure for development + +```bash +mkdir -p ~/.hermes/{cron,sessions,logs,memories,skills} +cp cli-config.yaml.example ~/.hermes/config.yaml +touch ~/.hermes/.env + +# Add at minimum an LLM provider key: +echo 'OPENROUTER_API_KEY=sk-or-v1-your-key' >> ~/.hermes/.env +``` + +### Run + +```bash +# Symlink for global access +mkdir -p ~/.local/bin +ln -sf "$(pwd)/venv/bin/hermes" ~/.local/bin/hermes + +# Verify +hermes doctor +hermes chat -q "Hello" +``` + +### Run tests + +```bash +pytest tests/ -v +``` + +--- ## Project Structure ``` hermes-agent/ -├── run_agent.py # Main AIAgent class -├── cli.py # Interactive CLI -├── model_tools.py # Tool registry orchestration -├── toolsets.py # Toolset definitions -├── agent/ # Agent internals (extracted modules) -│ ├── prompt_builder.py # System prompt assembly -│ ├── context_compressor.py -│ ├── auxiliary_client.py -│ └── ... -├── tools/ # Individual tool implementations -│ ├── registry.py # Central tool registry -│ ├── terminal_tool.py -│ ├── web_tools.py -│ ├── file_tools.py -│ └── ... 
-├── gateway/ # Multi-platform messaging gateway -│ ├── run.py -│ ├── platforms/ # Platform adapters (Telegram, Discord, etc.) -│ └── ... -├── skills/ # Built-in skills -├── docs/ # Documentation -└── tests/ # Test suite +├── run_agent.py # AIAgent class — core conversation loop, tool dispatch, session persistence +├── cli.py # HermesCLI class — interactive TUI, prompt_toolkit integration +├── model_tools.py # Tool orchestration (thin layer over tools/registry.py) +├── toolsets.py # Tool groupings and presets (hermes-cli, hermes-telegram, etc.) +├── hermes_state.py # SQLite session database with FTS5 full-text search +├── batch_runner.py # Parallel batch processing for trajectory generation +│ +├── agent/ # Agent internals (extracted modules) +│ ├── prompt_builder.py # System prompt assembly (identity, skills, context files, memory) +│ ├── context_compressor.py # Auto-summarization when approaching context limits +│ ├── auxiliary_client.py # Resolves auxiliary OpenAI clients (summarization, vision) +│ ├── display.py # KawaiiSpinner, tool progress formatting +│ ├── model_metadata.py # Model context lengths, token estimation +│ └── trajectory.py # Trajectory saving helpers +│ +├── hermes_cli/ # CLI command implementations +│ ├── main.py # Entry point, argument parsing, command dispatch +│ ├── config.py # Config management, migration, env var definitions +│ ├── setup.py # Interactive setup wizard +│ ├── auth.py # Provider resolution, OAuth, Nous Portal +│ ├── models.py # OpenRouter model selection lists +│ ├── banner.py # Welcome banner, ASCII art +│ ├── commands.py # Slash command definitions + autocomplete +│ ├── callbacks.py # Interactive callbacks (clarify, sudo, approval) +│ ├── doctor.py # Diagnostics +│ └── skills_hub.py # Skills Hub CLI + /skills slash command +│ +├── tools/ # Tool implementations (self-registering) +│ ├── registry.py # Central tool registry (schemas, handlers, dispatch) +│ ├── approval.py # Dangerous command detection + per-session approval 
+│ ├── terminal_tool.py # Terminal orchestration (sudo, env lifecycle, backends) +│ ├── file_operations.py # read_file, write_file, search, patch, etc. +│ ├── web_tools.py # web_search, web_extract (Firecrawl + Gemini summarization) +│ ├── vision_tools.py # Image analysis via multimodal models +│ ├── delegate_tool.py # Subagent spawning and parallel task execution +│ ├── code_execution_tool.py # Sandboxed Python with RPC tool access +│ ├── session_search_tool.py # Search past conversations with FTS5 + summarization +│ ├── cronjob_tools.py # Scheduled task management +│ ├── skill_tools.py # Skill search, load, manage +│ └── environments/ # Terminal execution backends +│ ├── base.py # BaseEnvironment ABC +│ ├── local.py, docker.py, ssh.py, singularity.py, modal.py +│ +├── gateway/ # Messaging gateway +│ ├── run.py # GatewayRunner — platform lifecycle, message routing, cron +│ ├── config.py # Platform configuration resolution +│ ├── session.py # Session store, context prompts, reset policies +│ └── platforms/ # Platform adapters +│ ├── telegram.py, discord_adapter.py, slack.py, whatsapp.py +│ +├── scripts/ # Installer and bridge scripts +│ ├── install.sh # Linux/macOS installer +│ ├── install.ps1 # Windows PowerShell installer +│ └── whatsapp-bridge/ # Node.js WhatsApp bridge (Baileys) +│ +├── skills/ # Bundled skills (copied to ~/.hermes/skills/ on install) +├── environments/ # RL training environments (Atropos integration) +├── tests/ # Test suite +├── docs/ # Additional documentation +│ +├── cli-config.yaml.example # Example configuration (copied to ~/.hermes/config.yaml) +└── AGENTS.md # Development guide for AI coding assistants ``` -## Contributing Guidelines +### User configuration (stored in `~/.hermes/`) -### Code Style +| Path | Purpose | +|------|---------| +| `~/.hermes/config.yaml` | Settings (model, terminal, toolsets, compression, etc.) 
| +| `~/.hermes/.env` | API keys and secrets | +| `~/.hermes/auth.json` | OAuth credentials (Nous Portal) | +| `~/.hermes/skills/` | All active skills (bundled + hub-installed + agent-created) | +| `~/.hermes/memories/` | Persistent memory (MEMORY.md, USER.md) | +| `~/.hermes/state.db` | SQLite session database | +| `~/.hermes/sessions/` | JSON session logs | +| `~/.hermes/cron/` | Scheduled job data | +| `~/.hermes/whatsapp/session/` | WhatsApp bridge credentials | -- Follow PEP 8 for Python code -- Use type hints where practical -- Add docstrings to functions and classes (Google-style docstrings preferred) -- Keep lines under 100 characters when reasonable +--- -### Adding a New Tool +## Architecture Overview -Tools self-register with the central registry. To add a new tool: +### Core Loop -1. Create a new file in `tools/` (e.g., `tools/my_tool.py`) +``` +User message → AIAgent._run_agent_loop() + ├── Build system prompt (prompt_builder.py) + ├── Build API kwargs (model, messages, tools, reasoning config) + ├── Call LLM (OpenAI-compatible API) + ├── If tool_calls in response: + │ ├── Execute each tool via registry dispatch + │ ├── Add tool results to conversation + │ └── Loop back to LLM call + ├── If text response: + │ ├── Persist session to DB + │ └── Return final_response + └── Context compression if approaching token limit +``` -2. Define your tool handler and schema: +### Key Design Patterns + +- **Self-registering tools**: Each tool file calls `registry.register()` at import time. `model_tools.py` triggers discovery by importing all tool modules. +- **Toolset grouping**: Tools are grouped into toolsets (`web`, `terminal`, `file`, `browser`, etc.) that can be enabled/disabled per platform. +- **Session persistence**: All conversations are stored in SQLite (`hermes_state.py`) with full-text search. JSON logs go to `~/.hermes/sessions/`. 
+- **Ephemeral injection**: System prompts and prefill messages are injected at API call time, never persisted to the database or logs. +- **Provider abstraction**: The agent works with any OpenAI-compatible API. Provider resolution happens at init time (Nous Portal OAuth, OpenRouter API key, or custom endpoint). + +--- + +## Code Style + +- **PEP 8** with practical exceptions (we don't enforce strict line length) +- **Comments**: Only when explaining non-obvious intent, trade-offs, or API quirks. Don't narrate what the code does — `# increment counter` adds nothing +- **Error handling**: Catch specific exceptions. Log with `logger.warning()`/`logger.error()` — use `exc_info=True` for unexpected errors so stack traces appear in logs +- **Cross-platform**: Never assume Unix. See [Cross-Platform Compatibility](#cross-platform-compatibility) + +--- + +## Adding a New Tool + +Before writing a tool, ask: [should this be a skill instead?](#should-it-be-a-skill-or-a-tool) + +Tools self-register with the central registry. Each tool file co-locates its schema, handler, and registration: + +```python +"""my_tool — Brief description of what this tool does.""" + +import json +from tools.registry import registry + + +def my_tool(param1: str, param2: int = 10, **kwargs) -> str: + """Handler. 
Returns a string result (often JSON).""" + result = do_work(param1, param2) + return json.dumps(result) + + +MY_TOOL_SCHEMA = { + "type": "function", + "function": { + "name": "my_tool", + "description": "What this tool does and when the agent should use it.", + "parameters": { + "type": "object", + "properties": { + "param1": {"type": "string", "description": "What param1 is"}, + "param2": {"type": "integer", "description": "What param2 is", "default": 10}, + }, + "required": ["param1"], + }, + }, +} + + +def _check_requirements() -> bool: + """Return True if this tool's dependencies are available.""" + return True + + +registry.register( + name="my_tool", + toolset="my_toolset", + schema=MY_TOOL_SCHEMA, + handler=lambda args, **kw: my_tool(**args, **kw), + check_fn=_check_requirements, +) +``` + +Then add the import to `model_tools.py` in the `_modules` list: + +```python +_modules = [ + # ... existing modules ... + "tools.my_tool", +] +``` + +If it's a new toolset, add it to `toolsets.py` and to the relevant platform presets. + +--- + +## Adding a Bundled Skill + +Bundled skills live in `skills/` organized by category: + +``` +skills/ +├── research/ +│ └── arxiv/ +│ ├── SKILL.md # Required: main instructions +│ └── scripts/ # Optional: helper scripts +│ └── search_arxiv.py +├── productivity/ +│ └── ocr-and-documents/ +│ ├── SKILL.md +│ ├── scripts/ +│ └── references/ +└── ... +``` + +### SKILL.md format + +```markdown +--- +name: my-skill +description: Brief description (shown in skill search results) +version: 1.0.0 +author: Your Name +license: MIT +metadata: + hermes: + tags: [Category, Subcategory, Keywords] + related_skills: [other-skill-name] +--- + +# Skill Title + +Brief intro. + +## When to Use +Trigger conditions — when should the agent load this skill? + +## Quick Reference +Table of common commands or API calls. + +## Procedure +Step-by-step instructions the agent follows. + +## Pitfalls +Known failure modes and how to handle them. 
+ +## Verification +How the agent confirms it worked. +``` + +### Skill guidelines + +- **No external dependencies unless absolutely necessary.** Prefer stdlib Python, curl, and existing Hermes tools (`web_extract`, `terminal`, `read_file`). +- **Progressive disclosure.** Put the most common workflow first. Edge cases and advanced usage go at the bottom. +- **Include helper scripts** for XML/JSON parsing or complex logic — don't expect the LLM to write parsers inline every time. +- **Test it.** Run `hermes --toolsets skills -q "Use the X skill to do Y"` and verify the agent follows the instructions correctly. + +--- + +## Cross-Platform Compatibility + +Hermes runs on Linux, macOS, and Windows. When writing code that touches the OS: + +### Critical rules + +1. **`termios` and `fcntl` are Unix-only.** Always catch both `ImportError` and `NotImplementedError`: ```python - #!/usr/bin/env python3 - """ - My Tool Module - Brief description - - Longer description of what the tool does. - """ - - import json - from tools.registry import registry - - - def my_tool_handler(args: dict, **kwargs) -> str: - """Execute the tool and return JSON result.""" - # Your implementation here - return json.dumps({"result": "success"}) - - - def check_my_tool_requirements() -> bool: - """Check if tool dependencies are available.""" - return True # Or actual availability check - - - MY_TOOL_SCHEMA = { - "name": "my_tool", - "description": "What this tool does...", - "parameters": { - "type": "object", - "properties": { - "param1": { - "type": "string", - "description": "Description of param1" - } - }, - "required": ["param1"] - } - } - - # Register with the central registry - registry.register( - name="my_tool", - toolset="my_toolset", - schema=MY_TOOL_SCHEMA, - handler=lambda args, **kw: my_tool_handler(args, **kw), - check_fn=check_my_tool_requirements, - ) + try: + from simple_term_menu import TerminalMenu + menu = TerminalMenu(options) + idx = menu.show() + except (ImportError, 
NotImplementedError): + # Fallback: numbered menu for Windows + for i, opt in enumerate(options): + print(f" {i+1}. {opt}") + idx = int(input("Choice: ")) - 1 ``` -3. Add the import to `model_tools.py` in `_discover_tools()`: +2. **File encoding.** Windows may save `.env` files in `cp1252`. Always handle encoding errors: ```python - _modules = [ - # ... existing modules ... - "tools.my_tool", - ] + try: + load_dotenv(env_path) + except UnicodeDecodeError: + load_dotenv(env_path, encoding="latin-1") ``` -4. Add your toolset to `toolsets.py` if it's a new category - -### Adding a Skill - -Skills are markdown documents with YAML frontmatter. Create a new skill: - -1. Create a directory in `skills/`: - ``` - skills/my-skill/ - └── SKILL.md +3. **Process management.** `os.setsid()`, `os.killpg()`, and signal handling differ on Windows. Use platform checks: + ```python + import platform + if platform.system() != "Windows": + kwargs["preexec_fn"] = os.setsid ``` -2. Write the skill file with proper frontmatter: - ```markdown - --- - name: my-skill - description: Brief description of what this skill does - version: 1.0.0 - author: Your Name - tags: [category, subcategory] - --- - - # My Skill - - Instructions for the agent when using this skill... - ``` +4. **Path separators.** Use `pathlib.Path` instead of string concatenation with `/`. -### Pull Request Process +5. **Shell commands in installers.** If you change `scripts/install.sh`, check if the equivalent change is needed in `scripts/install.ps1`. -1. **Fork the repository** and create a feature branch: - ```bash - git checkout -b feat/my-feature - # or - git checkout -b fix/issue-description - ``` +--- -2. **Make your changes** with clear, focused commits +## Security Considerations -3. **Test your changes**: - ```bash - # Run the test suite - pytest tests/ - - # Test manually with the CLI - python cli.py - ``` +Hermes has terminal access. Security matters. -4. 
**Update documentation** if needed +### Existing protections -5. **Submit a pull request** with: - - Clear title following conventional commits (e.g., `feat(tools):`, `fix(cli):`, `docs:`) - - Description of what changed and why - - Reference to any related issues +| Layer | Implementation | +|-------|---------------| +| **Sudo password piping** | Uses `shlex.quote()` to prevent shell injection | +| **Dangerous command detection** | Regex patterns in `tools/approval.py` with user approval flow | +| **Cron prompt injection** | Scanner in `tools/cronjob_tools.py` blocks instruction-override patterns | +| **Write deny list** | Protected paths (`~/.ssh/authorized_keys`, `/etc/shadow`) resolved via `os.path.realpath()` to prevent symlink bypass | +| **Skills guard** | Security scanner for hub-installed skills (`tools/skills_guard.py`) | +| **Code execution sandbox** | `execute_code` child process runs with API keys stripped from environment | +| **Container hardening** | Docker: read-only root, all capabilities dropped, no privilege escalation, PID limits | -### Commit Message Format +### When contributing security-sensitive code -We follow [Conventional Commits](https://www.conventionalcommits.org/): +- **Always use `shlex.quote()`** when interpolating user input into shell commands +- **Resolve symlinks** with `os.path.realpath()` before path-based access control checks +- **Don't log secrets.** API keys, tokens, and passwords should never appear in log output +- **Catch broad exceptions** around tool execution so a single failure doesn't crash the agent loop +- **Test on all platforms** if your change touches file paths, process management, or shell commands + +If your PR affects security, note it explicitly in the description. 
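The first two guidelines above (quote user input, resolve symlinks before access checks) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `tools/approval.py` or the project's actual write deny list: `DENIED_PATHS`, `build_grep_command`, and `is_write_allowed` are hypothetical names introduced only for this example.

```python
import os
import shlex

# Hypothetical deny list for illustration only; not the project's actual list.
DENIED_PATHS = ("/etc/shadow",)


def build_grep_command(pattern: str, path: str) -> str:
    """Quote untrusted values before interpolating them into a shell command."""
    return f"grep -n -- {shlex.quote(pattern)} {shlex.quote(path)}"


def is_write_allowed(path: str, denied=DENIED_PATHS) -> bool:
    """Resolve symlinks on both sides so a link to a denied file is refused too."""
    resolved = os.path.realpath(os.path.expanduser(path))
    denied_resolved = {os.path.realpath(os.path.expanduser(p)) for p in denied}
    return resolved not in denied_resolved
```

With quoting in place, a pattern such as `x; rm -rf /` reaches the shell as a single literal argument rather than as a second command. Comparing resolved paths on both sides means a symlink that points at a protected file is rejected the same way as the file itself.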
+ +--- + +## Pull Request Process + +### Branch naming + +``` +fix/description # Bug fixes +feat/description # New features +docs/description # Documentation +test/description # Tests +refactor/description # Code restructuring +``` + +### Before submitting + +1. **Run tests**: `pytest tests/ -v` +2. **Test manually**: Run `hermes` and exercise the code path you changed +3. **Check cross-platform impact**: If you touch file I/O, process management, or terminal handling, consider Windows and macOS +4. **Keep PRs focused**: One logical change per PR. Don't mix a bug fix with a refactor with a new feature. + +### PR description + +Include: +- **What** changed and **why** +- **How to test** it (reproduction steps for bugs, usage examples for features) +- **What platforms** you tested on +- Reference any related issues + +### Commit messages + +We use [Conventional Commits](https://www.conventionalcommits.org/): ``` <type>(<scope>): <description> - -[optional body] - -[optional footer] ``` -Types: -- `feat`: New feature -- `fix`: Bug fix -- `docs`: Documentation only -- `refactor`: Code change that neither fixes a bug nor adds a feature -- `test`: Adding or correcting tests -- `chore`: Changes to build process or auxiliary tools +| Type | Use for | +|------|---------| +| `fix` | Bug fixes | +| `feat` | New features | +| `docs` | Documentation | +| `test` | Tests | +| `refactor` | Code restructuring (no behavior change) | +| `chore` | Build, CI, dependency updates | -Scopes: `cli`, `gateway`, `tools`, `skills`, `agent`, etc. +Scopes: `cli`, `gateway`, `tools`, `skills`, `agent`, `install`, `whatsapp`, `security`, etc.
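A commit header of the form `type(scope): description` can be checked mechanically before pushing. The sketch below is one possible local check and is not part of the repository. The types come from the table above, while `HEADER_RE`, `parse_commit_header`, and the assumption that scopes are lowercase letters, digits, `_`, or `-` are illustrative choices.

```python
import re

# Types from the table above; the pattern itself is an illustrative assumption.
COMMIT_TYPES = ("fix", "feat", "docs", "test", "refactor", "chore")

HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[a-z0-9_-]+)\))?"  # scope is optional
    r": (?P<description>\S.*)$"
)


def parse_commit_header(header: str):
    """Return (type, scope, description), or None if the header does not match."""
    match = HEADER_RE.match(header)
    if match is None:
        return None
    return match.group("type"), match.group("scope"), match.group("description")
```

A header without a scope, such as `docs: fix typo`, parses with `scope` set to `None`. A free-form message such as `Fixed stuff` returns `None`.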
-### Security Considerations +Examples: +``` +fix(cli): prevent crash in save_config_value when model is a string +feat(gateway): add WhatsApp multi-user session isolation +fix(security): prevent shell injection in sudo password piping +test(tools): add unit tests for file_operations +``` -When contributing tools that interact with external resources: - -- **Skills Guard**: External skills pass through security scanning (`tools/skills_guard.py`) -- **Dangerous Commands**: Terminal commands are checked against patterns (`tools/approval.py`) -- **Memory Scanning**: Memory entries are scanned for injection attempts -- **Context Scanning**: AGENTS.md and similar files are scanned before prompt injection - -If your change affects security, please note this in your PR. +--- ## Reporting Issues -- Use GitHub Issues for bug reports and feature requests -- Include steps to reproduce for bugs -- Include system information (OS, Python version) +- Use [GitHub Issues](https://github.com/NousResearch/hermes-agent/issues) +- Include: OS, Python version, Hermes version (`hermes version`), full error traceback +- Include steps to reproduce - Check existing issues before creating duplicates +- For security vulnerabilities, please report privately -## Questions? +--- -- Open a GitHub Discussion for general questions -- Join the Nous Research community for real-time chat +## Community + +- **Discord**: [discord.gg/NousResearch](https://discord.gg/NousResearch) — for questions, showcasing projects, and sharing skills +- **GitHub Discussions**: For design proposals and architecture discussions +- **Skills Hub**: Upload specialized skills to a registry and share them with the community + +--- ## License -By contributing, you agree that your contributions will be licensed under the same license as the project. +By contributing, you agree that your contributions will be licensed under the [MIT License](LICENSE).