Commit Graph

15 Commits

Author SHA1 Message Date
Test
0fab46f65c fix: allow agent-created skills with caution-level findings
Agent-created skills were using the same policy as community hub
installs, blocking any skill with medium/high severity findings
(e.g. docker pull, pip install, git clone). This meant the agent
couldn't create skills that reference Docker or other common tools.

Changed agent-created policy from (allow, block, block) to
(allow, allow, block) — matching the trusted policy. Caution-level
findings (medium/high severity) are now allowed through, while
dangerous findings (critical severity like exfiltration, prompt
injection, reverse shells) remain blocked.

Added 4 tests covering the agent-created policy: safe allowed,
caution allowed, dangerous blocked, force override.
2026-03-17 16:32:25 -07:00
Stable Genius
3325e51e53 fix(skills): honor policy table for dangerous verdicts
Salvaged from PR #1007 by stablegenius49.

- let INSTALL_POLICY decide dangerous verdict handling for builtin skills
- allow --force to override blocked dangerous decisions for trusted and community sources
- accept --yes / -y as aliases for --force in /skills install
- update regression tests to match the intended policy precedence
2026-03-14 11:27:02 -07:00
teknium1
0aa31cd3cb feat: call_llm/async_call_llm + config slots + migrate all consumers
Add centralized call_llm() and async_call_llm() functions that own the
full LLM request lifecycle:
  1. Resolve provider + model from task config or explicit args
  2. Get or create a cached client for that provider
  3. Format request args (max_tokens handling, provider extra_body)
  4. Make the API call with max_tokens/max_completion_tokens retry
  5. Return the response

Config: expanded auxiliary section with provider:model slots for all
tasks (compression, vision, web_extract, session_search, skills_hub,
mcp, flush_memories). Config version bumped to 7.

Migrated all auxiliary consumers:
- context_compressor.py: uses call_llm(task='compression')
- vision_tools.py: uses async_call_llm(task='vision')
- web_tools.py: uses async_call_llm(task='web_extract')
- session_search_tool.py: uses async_call_llm(task='session_search')
- browser_tool.py: uses call_llm(task='vision'/'web_extract')
- mcp_tool.py: uses call_llm(task='mcp')
- skills_guard.py: uses call_llm(provider='openrouter')
- run_agent.py flush_memories: uses call_llm(task='flush_memories')

Tests updated for context_compressor and MCP tool. Some test mocks
still need updating (15 remaining failures from mock pattern changes,
2 pre-existing).
2026-03-11 20:52:19 -07:00
teknium1
07f09ecd83 refactor: route ad-hoc LLM consumers through centralized provider router
Route all remaining ad-hoc auxiliary LLM call sites through
resolve_provider_client() so auth, headers, and API format (Chat
Completions vs Responses API) are handled consistently in one place.

Files changed:

- tools/openrouter_client.py: Replace manual AsyncOpenAI construction
  with resolve_provider_client('openrouter', async_mode=True). The
  shared client module now delegates entirely to the router.

- tools/skills_guard.py: Replace inline OpenAI client construction
  (hardcoded OpenRouter base_url, manual api_key lookup, manual
  headers) with resolve_provider_client('openrouter'). Remove unused
  OPENROUTER_BASE_URL import.

- trajectory_compressor.py: Add _detect_provider() to map config
  base_url to a provider name, then route through
  resolve_provider_client. Falls back to raw construction for
  unrecognized custom endpoints.

- mini_swe_runner.py: Route default case (no explicit api_key/base_url)
  through resolve_provider_client('openrouter') with auto-detection
  fallback. Preserves direct construction when explicit creds are
  passed via CLI args.

- agent/auxiliary_client.py: Fix stale module docstring — vision auto
  mode now correctly documents that Codex and custom endpoints are
  tried (not skipped).
2026-03-11 20:02:36 -07:00
teknium1
4d53b7ccaa Add OpenRouter app attribution headers to skills_guard and trajectory_compressor
These two files were creating bare OpenAI clients pointing at OpenRouter
without the HTTP-Referer / X-OpenRouter-Title / X-OpenRouter-Categories
headers that the rest of the codebase sends for app attribution.

- skills_guard.py: LLM audit client (always OpenRouter)
- trajectory_compressor.py: sync + async summarization clients
  (guarded with 'openrouter' in base_url check since the endpoint
  is user-configurable)
2026-03-08 14:23:18 -07:00
teknium1
f2e24faaca feat: optional skills — official skills shipped but not activated by default
Add 'optional-skills/' directory for official skills that ship with the repo
but are not copied to ~/.hermes/skills/ during setup. They are:
- NOT shown to the model in the system prompt
- NOT copied during hermes setup/update
- Discoverable via 'hermes skills search' labeled as 'official'
- Installable via 'hermes skills install' with builtin trust (no third-party warning)
- Auto-categorized on install based on directory structure

Implementation:
- OptionalSkillSource adapter in tools/skills_hub.py (search/fetch/inspect)
- Added to create_source_router() as first source (highest priority)
- Trust level 'builtin' for official skills in skills_guard.py
- Friendly install message for official skills (no third-party warning)
- 'official' label in cyan in search results and skill list

First optional skill: Blackbox CLI (autonomous-ai-agents/blackbox)
- Multi-model coding agent with built-in judge/Chairman pattern
- Delegates to Claude, Codex, Gemini, and Blackbox models
- Open-source CLI (GPL-3.0, TypeScript, forked from Gemini CLI)
- Requires paid Blackbox AI API key

Refs: #475
2026-03-06 01:24:11 -08:00
teknium1
ffc6d767ec Merge PR #388: fix --force bypassing dangerous verdict in should_allow_install
Authored by Farukest. Fixes #387.

Removes 'and not force' from the dangerous verdict check so --force
can never install skills with critical security findings (reverse shells,
data exfiltration, etc). The docstring already documented this behavior
but the code didn't enforce it.
2026-03-04 19:19:57 -08:00
Farukest
4805be0119 fix: prevent --force from overriding dangerous verdict in should_allow_install
The docstring states --force should never override dangerous verdicts,
but the condition `if result.verdict == "dangerous" and not force`
allowed force=True to skip the early return. Execution then fell
through to `if force: return True`, bypassing the policy block.

Removed `and not force` so dangerous skills are always blocked
regardless of the --force flag.
2026-03-04 18:10:18 +03:00
Farukest
a3ca71fe26 fix: use is_relative_to() for symlink boundary check in skills_guard
The symlink escape check in _check_structure() used startswith()
without a trailing separator. A symlink resolving to a sibling
directory with a shared prefix (e.g. 'axolotl-backdoor') would pass
the check for 'axolotl' since the string prefix matched.

Replaced with Path.is_relative_to() which correctly handles directory
boundaries and is consistent with the skill_view path check.
2026-03-04 17:23:23 +03:00
teknium1
021f62cb0c fix(security): patch multi-word bypass in 8 more injection patterns
Systematic audit of all prompt injection regexes in skills_guard.py
found 8 more patterns with the same single-word gap vulnerability
fixed in PR #192. Multi-word variants like 'pretend that you are',
'output the full system prompt', 'respond without your safety
filters', etc. all bypassed the scanner.

Fixed patterns:
- you are [now] → you are [... now]
- do not [tell] the user → do not [... tell ... the] user
- pretend [you are|to be] → pretend [... you are|to be]
- output the [system|initial] prompt → output [... system|initial] prompt
- act as if you [have no] [restrictions] → act as if [... you ... have no ... restrictions]
- respond without [restrictions] → respond without [... restrictions]
- you have been [updated] to → you have been [... updated] to
- share [the] [entire] [conversation] → share [... conversation]

All use (?:\w+\s+)* to allow arbitrary intermediate words.
2026-03-04 06:00:41 -08:00
teknium1
ba214e43c8 fix(security): apply same multi-word bypass fix to disregard pattern
The 'disregard ... instructions/rules/guidelines' regex had the
same single-word gap vulnerability as the 'ignore' pattern fixed
in PR #192. 'disregard all your instructions' bypassed the scanner.

Added (?:\w+\s+)* between both keyword groups to allow arbitrary
intermediate words.
2026-03-04 05:55:38 -08:00
0xbyt4
4ea29978fc fix(security): catch multi-word prompt injection in skills_guard
The regex `ignore\s+(previous|all|...)\s+instructions` only matched
a single keyword between 'ignore' and 'instructions'. Phrases like
'ignore all prior instructions' bypassed the scanner entirely.

Changed to `ignore\s+(?:\w+\s+)*(previous|all|...)\s+instructions`
to allow arbitrary words before the keyword.
2026-02-28 20:16:48 +03:00
Raeli Savitt
95b6bd5df6 Harden agent attack surface: scan writes to memory, skills, cron, and context files
The security scanner (skills_guard.py) was only wired into the hub install path.
All other write paths to persistent state — skills created by the agent, memory
entries, cron prompts, and context files — bypassed it entirely. This closes
those gaps:

- file_operations: deny-list blocks writes to ~/.ssh, ~/.aws, ~/.hermes/.env, etc.
- code_execution_tool: filter secret env vars from sandbox child process
- skill_manager_tool: wire scan_skill() into create/edit/patch/write_file with rollback
- skills_guard: add "agent-created" trust level (same policy as community)
- memory_tool: scan content for injection/exfil before system prompt injection
- prompt_builder: scan AGENTS.md, .cursorrules, SOUL.md for prompt injection
- cronjob_tools: scan cron prompts for critical threats before scheduling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 23:43:15 -05:00
teknium1
70dd3a16dc Cleanup time! 2026-02-20 23:23:32 -08:00
teknium1
14e59706b7 Add Skills Hub — universal skill search, install, and management from online registries
Implements the Hermes Skills Hub with agentskills.io spec compliance,
multi-registry skill discovery, security scanning, and user-driven
management via CLI and /skills slash command.

Core features:
- Security scanner (tools/skills_guard.py): 120 threat patterns across
  12 categories, trust-aware install policy (builtin/trusted/community),
  structural checks, unicode injection detection, LLM audit pass
- Hub client (tools/skills_hub.py): GitHub, ClawHub, Claude Code
  marketplace, and LobeHub source adapters with shared GitHubAuth
  (PAT + gh CLI + GitHub App), lock file provenance tracking, quarantine
  flow, and unified search across all sources
- CLI interface (hermes_cli/skills_hub.py): search, install, inspect,
  list, audit, uninstall, publish (GitHub PR), snapshot export/import,
  and tap management — powers both `hermes skills` and `/skills`

Spec conformance (Phase 0):
- Upgraded frontmatter parser to yaml.safe_load with fallback
- Migrated 39 SKILL.md files: tags/related_skills to metadata.hermes.*
- Added assets/ directory support and compatibility/metadata fields
- Excluded .hub/ from skill discovery in skills_tool.py

Updated 13 config/doc files including README, AGENTS.md, .env.example,
setup wizard, doctor, status, pyproject.toml, and docs.
2026-02-18 16:09:05 -08:00