Adds a /update command to Telegram, Discord, and other gateway platforms
that runs `hermes update` to pull the latest code, update dependencies,
sync skills, and restart the gateway.
Implementation:
- Spawns `hermes update` in a separate systemd scope (systemd-run --user
--scope) so the process survives the gateway restart that hermes update
triggers at the end. Falls back to nohup if systemd-run is unavailable.
- Writes a marker file (.update_pending.json) with the originating
platform and chat_id before spawning the update.
- On gateway startup, _send_update_notification() checks for the marker,
reads the captured update output, sends the results back to the user,
and cleans up.
Also:
- Registers /update as a Discord slash command
- Updates README.md, docs/messaging.md, docs/slash-commands.md
- Adds 18 tests covering handler, notification, and edge cases
Authored by FarukEst. Fixes#392.
1. Initialize data={} before health-check loop to prevent NameError when
resp.json() raises after http_ready is set to True.
2. Extract _close_bridge_log() helper and call on all return False paths
to prevent file descriptor leaks on failed connection attempts.
Refactors disconnect() to reuse the same helper.
Authored by 0xbyt4.
The italic regex [^*]+ matched across newlines, corrupting bullet lists
using * markers (e.g. '* Item one\n* Item two' became italic garbage).
Fixed by adding \n to the negated character class: [^*\n]+.
Authored by 0xbyt4.
The dedup logic in GitHubSource.search() and unified_search() used
'r.trust_level == "trusted"' which let trusted results overwrite builtin
ones. Now uses ranked comparison: builtin (2) > trusted (1) > community (0).
Authored by 0xbyt4.
Two fixes:
- extract_images(): only remove extracted image tags, not all markdown image
tags. Previously  was silently dropped when real images
were also present.
- truncate_message(): walk chunk_body not full_chunk when tracking code block
state, so the reopened fence prefix doesn't toggle in_code off and leave
continuation chunks with unclosed code blocks.
Authored by 0xbyt4.
144 new tests covering gateway/pairing.py, tools/skill_manager_tool.py,
tools/skills_tool.py, honcho_integration/session.py, and
agent/auxiliary_client.py.
Authored by Farukest. Fixes#389.
Replaces hardcoded forward-slash string checks ('/.git/', '/.hub/') with
Path.parts membership test in _find_all_skills() and scan_skill_commands().
On Windows, str(Path) uses backslashes so the old filter never matched,
causing quarantined skills to appear as installed.
Authored by Farukest. Fixes#387.
Removes 'and not force' from the dangerous verdict check so --force
can never install skills with critical security findings (reverse shells,
data exfiltration, etc). The docstring already documented this behavior
but the code didn't enforce it.
Authored by Farukest. Fixes#385.
Replaces startswith() with Path.is_relative_to() in _check_structure()
symlink escape check — same fix pattern as skill_view() (PR #352).
Prevents symlinks escaping to sibling directories with shared name prefixes.
Fixes a bug where the refresh token was not persisted when the API key
mint failed (e.g., 402 insufficient credits, timeout). The rotated
refresh token was lost, causing subsequent auth attempts to fail with
a stale token.
Changes:
- Persist auth state immediately after each successful token refresh,
before attempting the mint
- Use latest in-memory refresh token on mint-retry paths (was using
the stale original)
- Atomic durable writes for auth.json (temp file + fsync + replace)
- Opt-in OAuth trace logging (HERMES_OAUTH_TRACE=1, fingerprint-only)
- 3 regression tests covering refresh+402, refresh+timeout, and
invalid-token retry behavior
Author: Robin Fernandes <rewbs>
Authored by PercyDikec. Fixes#394.
The transcript extraction used len(history) to find new messages, but
history includes session_meta entries stripped before reaching the agent.
This caused 1 message lost per turn from turn 2 onwards. Fix returns
history_offset (filtered length) from _run_agent and uses it for the slice.
Two fixes for the case where a user switches to a model with a smaller
context window while having a large existing session:
1. Preflight compression in run_conversation(): Before the main loop,
estimate tokens of loaded history + system prompt. If it exceeds the
model's compression threshold (85% of context), compress proactively
with up to 3 passes. This naturally handles model switches because
the gateway creates a fresh AIAgent per message with the current
model's context length.
2. Error handler reordering: Context-length errors (400 with 'maximum
context length' etc.) are now checked BEFORE the generic 4xx handler.
Previously, OpenRouter's 400-status context-length errors were caught
as non-retryable client errors and aborted immediately, never reaching
the compression+retry logic.
Reported by Sonicrida on Discord: 840-message session (2MB+) crashed
after switching from a large-context model to minimax via OpenRouter.
get_definitions() already wrapped check_fn() calls in try/except,
but is_toolset_available() did not. A failing check (network error,
missing import, bad config) would propagate uncaught and crash the
CLI banner, agent startup, and tools-info display.
Now is_toolset_available() catches all exceptions and returns False,
matching the existing pattern in get_definitions().
Added 4 tests covering exception handling in is_toolset_available(),
check_toolset_requirements(), get_definitions(), and
check_tool_availability().
Closes#402
The transcript extraction used len(history) to find new messages, but
history includes session_meta entries that are stripped before passing
to the agent. This mismatch caused 1 message to be lost from the
transcript on every turn after the first, because the slice offset
was too high. Use the filtered history length (history_offset) returned
by _run_agent instead.
Also changed the else branch from returning all agent_messages to
returning an empty list, so compressed/shorter agent output does not
duplicate the entire history into the transcript.
The hidden directory filter used hardcoded forward-slash strings like
'/.git/' and '/.hub/' to exclude internal directories. On Windows,
Path returns backslash-separated strings, so the filter never matched.
This caused quarantined skills in .hub/quarantine/ to appear as
installed skills and available slash commands on Windows.
Replaced string-based checks with Path.parts membership test which
works on both Windows and Unix.
The docstring states --force should never override dangerous verdicts,
but the condition `if result.verdict == "dangerous" and not force`
allowed force=True to skip the early return. Execution then fell
through to `if force: return True`, bypassing the policy block.
Removed `and not force` so dangerous skills are always blocked
regardless of the --force flag.
The symlink escape check in _check_structure() used startswith()
without a trailing separator. A symlink resolving to a sibling
directory with a shared prefix (e.g. 'axolotl-backdoor') would pass
the check for 'axolotl' since the string prefix matched.
Replaced with Path.is_relative_to() which correctly handles directory
boundaries and is consistent with the skill_view path check.
session_search was returning the current session if it matched the
query, which is redundant — the agent already has the current
conversation context. This wasted an LLM summarization call and a
result slot.
Added current_session_id parameter to session_search(). The agent
passes self.session_id and the search filters out any results where
either the raw or parent-resolved session ID matches. Both the raw
match and the parent-resolved match are checked to handle child
sessions from delegation.
Two tests added verifying the exclusion works and that other
sessions are still returned.
Replace the string-based startswith + os.sep approach with
Path.is_relative_to() (Python 3.9+, we require 3.10+). This is
the idiomatic pathlib way to check path containment — it handles
separators, case sensitivity, and the equal-path case natively
without string manipulation.
Simplified tests to match: removed the now-unnecessary
test_separator_is_os_native test since is_relative_to doesn't
depend on separator choice.
The session key construction logic was duplicated in 4 places
(session.py + 3 inline copies in run.py), which is exactly the
kind of drift that caused issue #349 in the first place.
Extracted build_session_key() as a public function in session.py.
SessionStore._generate_session_key() now delegates to it, and all
inline key construction in run.py has been replaced with calls to
the shared function. Tests updated to test the function directly.
The previous implementation used `len(self._entries) > 1` to check if any
sessions had ever been created. This failed for single-platform users because
when sessions reset (via /reset, auto-reset, or gateway restart), the entry
for the same session_key is replaced in _entries, not added. So len(_entries)
stays at 1 for users who only use one platform.
Fix: Query the SQLite database's session count instead. The database preserves
historical session records (marked as ended), so session_count() correctly
returns > 1 for returning users even after resets.
This prevents the agent from reintroducing itself to returning users after
every session reset.
Fixes#351
Improvements to the HA integration merged from PR #184:
- Add ha_list_services tool: discovers available services (actions) per
domain with descriptions and parameter fields. Tells the model what
it can do with each device type (e.g. light.turn_on accepts brightness,
color_name, transition). Closes the gap where the model had to guess
available actions.
- Add HA to hermes tools config: users can enable/disable the homeassistant
toolset and configure HASS_TOKEN + HASS_URL through 'hermes tools' setup
flow instead of manually editing .env.
- Fix should-fix items from code review:
- Remove sys.path.insert hack from gateway adapter
- Replace all print() calls with proper logger (info/warning/error)
- Move env var reads from import-time to handler-time via _get_config()
- Add dedicated REST session reuse in gateway send()
- Update ha_call_service description to reference ha_list_services for
action discovery.
- Update tests for new ha_list_services tool in toolset resolution.
Banner integration:
- MCP Servers section in CLI startup banner between Tools and Skills
- Shows each server with transport type, tool count, connection status
- Failed servers shown in red; section hidden when no MCP configured
- Summary line includes MCP server count
- Removed raw print() calls from discovery (banner handles display)
/reload-mcp command:
- New slash command in both CLI and gateway
- Disconnects all MCP servers, re-reads config.yaml, reconnects
- Reports what changed (added/removed/reconnected servers)
- Allows adding/removing MCP servers without restarting
Resources & Prompts support:
- 4 utility tools registered per server: list_resources, read_resource,
list_prompts, get_prompt
- Exposes MCP Resources (data sources) and Prompts (templates) as tools
- Proper parameter schemas (uri for read_resource, name for get_prompt)
- Handles text and binary resource content
- 23 new tests covering schemas, handlers, and registration
Test coverage: 74 MCP tests total, 1186 tests pass overall.
- Discovery is now parallel (asyncio.gather) instead of sequential,
fixing the 60s shared timeout issue with multiple servers
- Startup messages use print() so users see connection status even
with default log levels (the 'tools' logger is set to ERROR)
- Summary line shows total tools and failed servers count
- Validate conflicting config: warn if both 'url' and 'command' are
present (HTTP takes precedence)
- Update TODO.md: mark MCP as implemented, list remaining work
- Add test for conflicting config detection (51 tests total)
All 1163 tests pass.
When both OPENROUTER_API_KEY and OPENAI_API_KEY are set (e.g. OPENAI_API_KEY
in .bashrc), the wrong key was sent to OpenRouter causing auth failures.
Fixed key resolution order in cli.py and runtime_provider.py.
Fixes#289
- Add threading.Lock protecting all shared state (_servers, _mcp_loop, _mcp_thread)
- Fix deadlock in shutdown_mcp_servers: _stop_mcp_loop was called inside
a _lock block but also acquires _lock (non-reentrant)
- Fix race condition in _ensure_mcp_loop with concurrent callers
- Change idempotency to per-server (retry failed servers, skip connected)
- Dynamic toolset injection via startswith("hermes-") instead of hardcoded list
- Parallel shutdown via asyncio.gather instead of sequential loop
- Add tests for partial failure retry, parallel shutdown, dynamic injection
Patch _servers to empty dict in tests that call discover_mcp_tools()
with mocked config, preventing interference from real MCP connections
that may exist when running within the full test suite.
Refactor MCP connections from AsyncExitStack to task-per-server
architecture. Each server now runs as a long-lived asyncio Task
with `async with stdio_client(...)`, ensuring anyio cancel-scope
cleanup happens in the same Task that opened the connection.
Connect to external MCP servers via stdio transport, discover their tools
at startup, and register them into the hermes-agent tool registry.
- New tools/mcp_tool.py: config loading, server connection via background
event loop, tool handler factories, discovery, and graceful shutdown
- model_tools.py: trigger MCP discovery after built-in tool imports
- cli.py: call shutdown_mcp_servers in _run_cleanup
- pyproject.toml: add mcp>=1.2.0 as optional dependency
- 27 unit tests covering config, schema conversion, handlers, registration,
SDK interaction, toolset injection, graceful fallback, and shutdown
Config format (in ~/.hermes/config.yaml):
mcp_servers:
filesystem:
command: "npx"
args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
Added a fixture to redirect HERMES_HOME to a temporary directory during tests, preventing writes to the user's home directory. Updated the test for DebugSession to create a dedicated log directory for saving logs, ensuring test isolation and accuracy in assertions.