Salvage of #3389 by @binhnt92 with reasoning fallback and retry logic added on top.
All 7 auxiliary LLM call sites now use extract_content_or_reasoning() which mirrors the main agent loop's behavior: extract content, strip think blocks, fall back to structured reasoning fields, retry on empty.
Closes#3389.
* fix(skills): reduce skills.sh resolution churn and preserve trust for wrapped identifiers
- Accept common skills.sh prefix typos (skils-sh/, skils.sh/)
- Strip skills-sh/ prefix in _resolve_trust_level() so trusted repos
stay trusted when installed through skills.sh
- Use resolved identifier (from bundle/meta) for scan_skill source
- Prefer tree search before root scan in _discover_identifier()
- Add _resolve_github_meta() consolidation for inspect flow
Cherry-picked from PR #3001 by kshitijk4poor.
* fix: restore candidate loop in SkillsShSource.fetch() for consistency
The cherry-picked PR only tried the first candidate identifier in
fetch() while inspect() (via _resolve_github_meta) tried all four.
This meant skills at repo/skills/path would be found by inspect but
missed by fetch, forcing it through the heavier _discover_identifier
flow. Restore the candidate loop so both paths behave identically.
Updated the test assertion to match.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
_resolve_trust_level() didn't handle 'agent-created' source, so it
fell through to 'community' trust level. Community policy blocks on
any caution or dangerous findings, which meant common patterns like
curl with env vars, systemctl, crontab, cloudflared references etc.
would block skill creation/patching.
The agent-created policy row already existed in INSTALL_POLICY with
permissive settings (allow caution, ask on dangerous) but was never
reached. Now it is.
Fixes reports of skill_manage being blocked by security scanner.
Changes the policy for agent-created skills with critical security
findings from 'block' (silently rejected) to 'ask' (allowed with
warning logged). The agent created the skill, so blocking it entirely
is too aggressive — let it through but log the findings.
- Policy: agent-created dangerous changed from block to ask
- should_allow_install returns None for 'ask' (vs True/False)
- format_scan_report shows 'NEEDS CONFIRMATION' for ask
- skill_manager_tool.py caller handles None (allows with warning)
- force=True still overrides as before
Based on PR #2271 by redhelix (closed — 3200 lines of unrelated
Mission Control code excluded).
Agent-created skills were using the same policy as community hub
installs, blocking any skill with medium/high severity findings
(e.g. docker pull, pip install, git clone). This meant the agent
couldn't create skills that reference Docker or other common tools.
Changed agent-created policy from (allow, block, block) to
(allow, allow, block) — matching the trusted policy. Caution-level
findings (medium/high severity) are now allowed through, while
dangerous findings (critical severity like exfiltration, prompt
injection, reverse shells) remain blocked.
Added 4 tests covering the agent-created policy: safe allowed,
caution allowed, dangerous blocked, force override.
Salvaged from PR #1007 by stablegenius49.
- let INSTALL_POLICY decide dangerous verdict handling for builtin skills
- allow --force to override blocked dangerous decisions for trusted and community sources
- accept --yes / -y as aliases for --force in /skills install
- update regression tests to match the intended policy precedence
Add centralized call_llm() and async_call_llm() functions that own the
full LLM request lifecycle:
1. Resolve provider + model from task config or explicit args
2. Get or create a cached client for that provider
3. Format request args (max_tokens handling, provider extra_body)
4. Make the API call with max_tokens/max_completion_tokens retry
5. Return the response
Config: expanded auxiliary section with provider:model slots for all
tasks (compression, vision, web_extract, session_search, skills_hub,
mcp, flush_memories). Config version bumped to 7.
Migrated all auxiliary consumers:
- context_compressor.py: uses call_llm(task='compression')
- vision_tools.py: uses async_call_llm(task='vision')
- web_tools.py: uses async_call_llm(task='web_extract')
- session_search_tool.py: uses async_call_llm(task='session_search')
- browser_tool.py: uses call_llm(task='vision'/'web_extract')
- mcp_tool.py: uses call_llm(task='mcp')
- skills_guard.py: uses call_llm(provider='openrouter')
- run_agent.py flush_memories: uses call_llm(task='flush_memories')
Tests updated for context_compressor and MCP tool. Some test mocks
still need updating (15 remaining failures from mock pattern changes,
2 pre-existing).
Route all remaining ad-hoc auxiliary LLM call sites through
resolve_provider_client() so auth, headers, and API format (Chat
Completions vs Responses API) are handled consistently in one place.
Files changed:
- tools/openrouter_client.py: Replace manual AsyncOpenAI construction
with resolve_provider_client('openrouter', async_mode=True). The
shared client module now delegates entirely to the router.
- tools/skills_guard.py: Replace inline OpenAI client construction
(hardcoded OpenRouter base_url, manual api_key lookup, manual
headers) with resolve_provider_client('openrouter'). Remove unused
OPENROUTER_BASE_URL import.
- trajectory_compressor.py: Add _detect_provider() to map config
base_url to a provider name, then route through
resolve_provider_client. Falls back to raw construction for
unrecognized custom endpoints.
- mini_swe_runner.py: Route default case (no explicit api_key/base_url)
through resolve_provider_client('openrouter') with auto-detection
fallback. Preserves direct construction when explicit creds are
passed via CLI args.
- agent/auxiliary_client.py: Fix stale module docstring — vision auto
mode now correctly documents that Codex and custom endpoints are
tried (not skipped).
These two files were creating bare OpenAI clients pointing at OpenRouter
without the HTTP-Referer / X-OpenRouter-Title / X-OpenRouter-Categories
headers that the rest of the codebase sends for app attribution.
- skills_guard.py: LLM audit client (always OpenRouter)
- trajectory_compressor.py: sync + async summarization clients
(guarded with 'openrouter' in base_url check since the endpoint
is user-configurable)
Add 'optional-skills/' directory for official skills that ship with the repo
but are not copied to ~/.hermes/skills/ during setup. They are:
- NOT shown to the model in the system prompt
- NOT copied during hermes setup/update
- Discoverable via 'hermes skills search' labeled as 'official'
- Installable via 'hermes skills install' with builtin trust (no third-party warning)
- Auto-categorized on install based on directory structure
Implementation:
- OptionalSkillSource adapter in tools/skills_hub.py (search/fetch/inspect)
- Added to create_source_router() as first source (highest priority)
- Trust level 'builtin' for official skills in skills_guard.py
- Friendly install message for official skills (no third-party warning)
- 'official' label in cyan in search results and skill list
First optional skill: Blackbox CLI (autonomous-ai-agents/blackbox)
- Multi-model coding agent with built-in judge/Chairman pattern
- Delegates to Claude, Codex, Gemini, and Blackbox models
- Open-source CLI (GPL-3.0, TypeScript, forked from Gemini CLI)
- Requires paid Blackbox AI API key
Refs: #475
Authored by Farukest. Fixes#387.
Removes 'and not force' from the dangerous verdict check so --force
can never install skills with critical security findings (reverse shells,
data exfiltration, etc). The docstring already documented this behavior
but the code didn't enforce it.
The docstring states --force should never override dangerous verdicts,
but the condition `if result.verdict == "dangerous" and not force`
allowed force=True to skip the early return. Execution then fell
through to `if force: return True`, bypassing the policy block.
Removed `and not force` so dangerous skills are always blocked
regardless of the --force flag.
The symlink escape check in _check_structure() used startswith()
without a trailing separator. A symlink resolving to a sibling
directory with a shared prefix (e.g. 'axolotl-backdoor') would pass
the check for 'axolotl' since the string prefix matched.
Replaced with Path.is_relative_to() which correctly handles directory
boundaries and is consistent with the skill_view path check.
Systematic audit of all prompt injection regexes in skills_guard.py
found 8 more patterns with the same single-word gap vulnerability
fixed in PR #192. Multi-word variants like 'pretend that you are',
'output the full system prompt', 'respond without your safety
filters', etc. all bypassed the scanner.
Fixed patterns:
- you are [now] → you are [... now]
- do not [tell] the user → do not [... tell ... the] user
- pretend [you are|to be] → pretend [... you are|to be]
- output the [system|initial] prompt → output [... system|initial] prompt
- act as if you [have no] [restrictions] → act as if [... you ... have no ... restrictions]
- respond without [restrictions] → respond without [... restrictions]
- you have been [updated] to → you have been [... updated] to
- share [the] [entire] [conversation] → share [... conversation]
All use (?:\w+\s+)* to allow arbitrary intermediate words.
The 'disregard ... instructions/rules/guidelines' regex had the
same single-word gap vulnerability as the 'ignore' pattern fixed
in PR #192. 'disregard all your instructions' bypassed the scanner.
Added (?:\w+\s+)* between both keyword groups to allow arbitrary
intermediate words.
The regex `ignore\s+(previous|all|...)\s+instructions` only matched
a single keyword between 'ignore' and 'instructions'. Phrases like
'ignore all prior instructions' bypassed the scanner entirely.
Changed to `ignore\s+(?:\w+\s+)*(previous|all|...)\s+instructions`
to allow arbitrary words before the keyword.
The security scanner (skills_guard.py) was only wired into the hub install path.
All other write paths to persistent state — skills created by the agent, memory
entries, cron prompts, and context files — bypassed it entirely. This closes
those gaps:
- file_operations: deny-list blocks writes to ~/.ssh, ~/.aws, ~/.hermes/.env, etc.
- code_execution_tool: filter secret env vars from sandbox child process
- skill_manager_tool: wire scan_skill() into create/edit/patch/write_file with rollback
- skills_guard: add "agent-created" trust level (same policy as community)
- memory_tool: scan content for injection/exfil before system prompt injection
- prompt_builder: scan AGENTS.md, .cursorrules, SOUL.md for prompt injection
- cronjob_tools: scan cron prompts for critical threats before scheduling
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>