Salvaged from PR #1007 by stablegenius49.
- let INSTALL_POLICY decide dangerous verdict handling for builtin skills
- allow --force to override blocked dangerous decisions for trusted and community sources
- accept --yes / -y as aliases for --force in /skills install
- update regression tests to match the intended policy precedence
Add centralized call_llm() and async_call_llm() functions that own the
full LLM request lifecycle:
1. Resolve provider + model from task config or explicit args
2. Get or create a cached client for that provider
3. Format request args (max_tokens handling, provider extra_body)
4. Make the API call with max_tokens/max_completion_tokens retry
5. Return the response
Config: expanded auxiliary section with provider:model slots for all
tasks (compression, vision, web_extract, session_search, skills_hub,
mcp, flush_memories). Config version bumped to 7.
Migrated all auxiliary consumers:
- context_compressor.py: uses call_llm(task='compression')
- vision_tools.py: uses async_call_llm(task='vision')
- web_tools.py: uses async_call_llm(task='web_extract')
- session_search_tool.py: uses async_call_llm(task='session_search')
- browser_tool.py: uses call_llm(task='vision'/'web_extract')
- mcp_tool.py: uses call_llm(task='mcp')
- skills_guard.py: uses call_llm(provider='openrouter')
- run_agent.py flush_memories: uses call_llm(task='flush_memories')
Tests updated for context_compressor and MCP tool. Some test mocks
still need updating (15 remaining failures from mock pattern changes,
2 pre-existing).
Route all remaining ad-hoc auxiliary LLM call sites through
resolve_provider_client() so auth, headers, and API format (Chat
Completions vs Responses API) are handled consistently in one place.
Files changed:
- tools/openrouter_client.py: Replace manual AsyncOpenAI construction
with resolve_provider_client('openrouter', async_mode=True). The
shared client module now delegates entirely to the router.
- tools/skills_guard.py: Replace inline OpenAI client construction
(hardcoded OpenRouter base_url, manual api_key lookup, manual
headers) with resolve_provider_client('openrouter'). Remove unused
OPENROUTER_BASE_URL import.
- trajectory_compressor.py: Add _detect_provider() to map config
base_url to a provider name, then route through
resolve_provider_client. Falls back to raw construction for
unrecognized custom endpoints.
- mini_swe_runner.py: Route default case (no explicit api_key/base_url)
through resolve_provider_client('openrouter') with auto-detection
fallback. Preserves direct construction when explicit creds are
passed via CLI args.
- agent/auxiliary_client.py: Fix stale module docstring — vision auto
mode now correctly documents that Codex and custom endpoints are
tried (not skipped).
These two files were creating bare OpenAI clients pointing at OpenRouter
without the HTTP-Referer / X-OpenRouter-Title / X-OpenRouter-Categories
headers that the rest of the codebase sends for app attribution.
- skills_guard.py: LLM audit client (always OpenRouter)
- trajectory_compressor.py: sync + async summarization clients
(guarded with 'openrouter' in base_url check since the endpoint
is user-configurable)
Add 'optional-skills/' directory for official skills that ship with the repo
but are not copied to ~/.hermes/skills/ during setup. They are:
- NOT shown to the model in the system prompt
- NOT copied during hermes setup/update
- Discoverable via 'hermes skills search' labeled as 'official'
- Installable via 'hermes skills install' with builtin trust (no third-party warning)
- Auto-categorized on install based on directory structure
Implementation:
- OptionalSkillSource adapter in tools/skills_hub.py (search/fetch/inspect)
- Added to create_source_router() as first source (highest priority)
- Trust level 'builtin' for official skills in skills_guard.py
- Friendly install message for official skills (no third-party warning)
- 'official' label in cyan in search results and skill list
First optional skill: Blackbox CLI (autonomous-ai-agents/blackbox)
- Multi-model coding agent with built-in judge/Chairman pattern
- Delegates to Claude, Codex, Gemini, and Blackbox models
- Open-source CLI (GPL-3.0, TypeScript, forked from Gemini CLI)
- Requires paid Blackbox AI API key
Refs: #475
Authored by Farukest. Fixes#387.
Removes 'and not force' from the dangerous verdict check so --force
can never install skills with critical security findings (reverse shells,
data exfiltration, etc). The docstring already documented this behavior
but the code didn't enforce it.
The docstring states --force should never override dangerous verdicts,
but the condition `if result.verdict == "dangerous" and not force`
allowed force=True to skip the early return. Execution then fell
through to `if force: return True`, bypassing the policy block.
Removed `and not force` so dangerous skills are always blocked
regardless of the --force flag.
The symlink escape check in _check_structure() used startswith()
without a trailing separator. A symlink resolving to a sibling
directory with a shared prefix (e.g. 'axolotl-backdoor') would pass
the check for 'axolotl' since the string prefix matched.
Replaced with Path.is_relative_to() which correctly handles directory
boundaries and is consistent with the skill_view path check.
Systematic audit of all prompt injection regexes in skills_guard.py
found 8 more patterns with the same single-word gap vulnerability
fixed in PR #192. Multi-word variants like 'pretend that you are',
'output the full system prompt', 'respond without your safety
filters', etc. all bypassed the scanner.
Fixed patterns:
- you are [now] → you are [... now]
- do not [tell] the user → do not [... tell ... the] user
- pretend [you are|to be] → pretend [... you are|to be]
- output the [system|initial] prompt → output [... system|initial] prompt
- act as if you [have no] [restrictions] → act as if [... you ... have no ... restrictions]
- respond without [restrictions] → respond without [... restrictions]
- you have been [updated] to → you have been [... updated] to
- share [the] [entire] [conversation] → share [... conversation]
All use (?:\w+\s+)* to allow arbitrary intermediate words.
The 'disregard ... instructions/rules/guidelines' regex had the
same single-word gap vulnerability as the 'ignore' pattern fixed
in PR #192. 'disregard all your instructions' bypassed the scanner.
Added (?:\w+\s+)* between both keyword groups to allow arbitrary
intermediate words.
The regex `ignore\s+(previous|all|...)\s+instructions` only matched
a single keyword between 'ignore' and 'instructions'. Phrases like
'ignore all prior instructions' bypassed the scanner entirely.
Changed to `ignore\s+(?:\w+\s+)*(previous|all|...)\s+instructions`
to allow arbitrary words before the keyword.
The security scanner (skills_guard.py) was only wired into the hub install path.
All other write paths to persistent state — skills created by the agent, memory
entries, cron prompts, and context files — bypassed it entirely. This closes
those gaps:
- file_operations: deny-list blocks writes to ~/.ssh, ~/.aws, ~/.hermes/.env, etc.
- code_execution_tool: filter secret env vars from sandbox child process
- skill_manager_tool: wire scan_skill() into create/edit/patch/write_file with rollback
- skills_guard: add "agent-created" trust level (same policy as community)
- memory_tool: scan content for injection/exfil before system prompt injection
- prompt_builder: scan AGENTS.md, .cursorrules, SOUL.md for prompt injection
- cronjob_tools: scan cron prompts for critical threats before scheduling
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>