Commit Graph

1369 Commits

Author SHA1 Message Date
aydnOktay
41fa4fbaa5 fix: add exc_info=True to image generation error logging
Adds full stack traces to error logs in _upscale_image() and
image_generate_tool() for better debugging. Matches the pattern
used across the rest of the codebase.

Cherry-picked from PR #868 by aydnOktay.

Co-authored-by: aydnOktay <aydnOktay@users.noreply.github.com>
2026-03-11 09:15:45 -07:00
teknium1
91101065bb fix: improve git error logging in checkpoint manager
- Log command, return code, and stderr on non-zero exit
- Add exc_info=True to timeout, FileNotFoundError, and catch-all handlers
- Add debug field to restore() error responses with raw git output
- Keeps user-facing error messages clean while preserving detail for debugging

Inspired by PR #843 (aydnOktay).
2026-03-11 09:00:09 -07:00
teknium1
01bec40724 refactor(gateway): consolidate model resolution via _resolve_gateway_model()
Replace two inline copies of the env/config model resolution pattern
(in _run_agent_sync and _run_agent) with the _resolve_gateway_model()
helper introduced in PR #830.

Left untouched:
- Session hygiene block: different default (sonnet vs opus) + reads
  compression config from the same YAML load
- /model command: also reads provider from same config block
2026-03-11 08:59:17 -07:00
Teknium
9b58b9bced Merge pull request #955 from NousResearch/hermes/hermes-cf9f7d54
fix(vision): log error when vision client is unavailable + doctor MiniMax fix
2026-03-11 08:59:11 -07:00
teknium1
b66c8b409c fix(vision): log error when vision client is unavailable
Previously the early return for unconfigured vision model was silent.
Now logs an error so the failure is visible in logs for debugging.

Inspired by PR #839 by aydnOktay.

Co-authored-by: aydnOktay <aydnOktay@users.noreply.github.com>
2026-03-11 08:58:56 -07:00
Teknium
09b1de5f71 Merge pull request #954 from NousResearch/hermes/hermes-20ea56c0
fix(config): atomic write for .env to prevent API key loss on crash
2026-03-11 08:58:52 -07:00
alireza78a
3667138d05 fix(config): atomic write for .env to prevent API key loss on crash
save_env_value() used bare open('w') which truncates .env immediately.
A crash or OOM kill between truncation and completed write silently
wipes every credential in the file.

Write now goes to a temp file first, then os.replace() swaps it
atomically. Either the old .env exists or the new one does — never
a truncated half-write. Same pattern used in cron/jobs.py.

Cherry-picked from PR #842 by alireza78a, rebased onto current main
with conflict resolution (_secure_file refactor).

Co-authored-by: alireza78a <alireza78a@users.noreply.github.com>
2026-03-11 08:58:33 -07:00
Dev User
66c0b719de fix(gateway): pass model to temporary AIAgent instances
Memory flush, /compress, and session hygiene create AIAgent without
model=, falling back to the hardcoded default "anthropic/claude-opus-4.6".
This fails with a 400 error when the active provider is openai-codex
(Codex only accepts its own model names like gpt-5.1-codex-mini).

Add _resolve_gateway_model() that mirrors the env/config resolution
already used by _run_agent_sync, and wire it into all three temporary
agent creation sites.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 08:56:19 -07:00
Teknium
d905e612aa Merge pull request #950 from NousResearch/hermes/hermes-20ea56c0
docs: conditional skill activation — duckduckgo-search fallback + documentation
2026-03-11 08:48:40 -07:00
Teknium
fa7a18f42a Merge pull request #949 from NousResearch/hermes/hermes-b86fddbe
fix(cron): handle naive legacy timestamps in due-job checks
2026-03-11 08:47:10 -07:00
teknium1
82113f1f1e docs: conditional skill activation — tag duckduckgo-search as web fallback and add documentation
- Tag duckduckgo-search skill with fallback_for_toolsets: [web] so it
  auto-hides when Firecrawl is available and auto-shows when it isn't
- Add 'Conditional Activation' section to CONTRIBUTING.md with full
  spec, semantics, and examples for all 4 frontmatter fields
- Add 'Conditional Activation (Fallback Skills)' section to the user-
  facing skills docs with field reference table and practical example
- Update SKILL.md format examples in both docs to show the new fields

Follow-up to PR #785 (conditional skill activation feature).
2026-03-11 08:47:01 -07:00
Teknium
01d3b31479 Merge PR #785: feat: conditional skill activation based on tool availability
Authored by teyrebaz33. Closes #539.

feat: conditional skill activation based on tool availability
2026-03-11 08:43:30 -07:00
teknium1
a5ffa1278c test(cron): add regression tests for _ensure_aware timezone conversion
Three new tests for the naive timestamp fix (PR #807):
- test_ensure_aware_naive_preserves_absolute_time: verifies UTC equivalent
  is preserved when interpreting naive datetimes as system-local time
- test_ensure_aware_normalizes_aware_to_hermes_tz: verifies already-aware
  datetimes are normalized to Hermes tz without shifting the instant
- test_ensure_aware_due_job_not_skipped_when_system_ahead: end-to-end
  regression test for the original bug scenario
2026-03-11 08:42:04 -07:00
Teknium
b7d58320a8 Merge pull request #947 from NousResearch/hermes/hermes-cf9f7d54
fix(doctor): skip /models health check for MiniMax providers
2026-03-11 08:41:29 -07:00
0xNyk
605ba4adea fix(cron): interpret naive timestamps as local time in due-job checks
Legacy cron job rows may store next_run_at without timezone info.
_ensure_aware() previously stamped the Hermes-configured tz directly
via replace(tzinfo=...), which shifts absolute time when system-local
tz differs from Hermes tz — causing overdue jobs to appear not due.

Now: naive datetimes are interpreted as system-local wall time first,
then converted to Hermes tz. Aware datetimes are normalized to Hermes
tz for consistency.

Cherry-picked from PR #807, rebased onto current main.
Fixes #806

Co-authored-by: 0xNyk <0xNyk@users.noreply.github.com>
2026-03-11 08:38:24 -07:00
Teknium
24a0c08d58 Merge pull request #796 from 0xbyt4/fix/discovery-failed-count
Clean bug fix — failed MCP server connections were silently swallowed, making failed_count dead code. Well-tested.
2026-03-11 08:32:32 -07:00
Bartok9
b4a100dfc0 fix(doctor): skip /models health check for MiniMax providers
MiniMax APIs (global and China) don't support /v1/models, causing
hermes doctor to always show HTTP 404 even with valid API keys.
Skip the HTTP check for these providers and show '(key configured)'
when the API key is present.

Cherry-picked from PR #822 by Bartok9, rebased onto current main.

Fixes #811

Co-authored-by: Bartok9 <259807879+Bartok9@users.noreply.github.com>
2026-03-11 08:29:35 -07:00
0xbyt4
4a8f23eddf fix: correctly track failed MCP server connections in discovery
_discover_one() caught all exceptions and returned [], making
asyncio.gather(return_exceptions=True) redundant. The
isinstance(result, Exception) branch in _discover_all() was dead
code, so failed_count was always 0. This caused:
- No summary printed when all servers fail (silent failure)
- ok_servers always equaling total_servers (misleading count)
- Unused variables transport_desc and transport_type

Fix: let exceptions propagate to gather() so failed_count increments
correctly. Move per-server failure logging to _discover_all(). Remove
dead variables.
2026-03-11 18:24:45 +03:00
teknium1
a54405e339 fix: proactive compression after large tool results + Anthropic error detection
Two fixes for context overflow handling:

1. Proactive compression after tool execution: The compression check now
   estimates the next prompt size using real token counts from the last API
   response (prompt_tokens + completion_tokens) plus a conservative estimate
   of newly appended tool results (chars // 3 for JSON-heavy content).
   Previously, should_compress() only checked last_prompt_tokens which
   didn't account for tool results — so a 130k prompt + 100k chars of tool
   output would pass the 140k threshold check but fail the 200k API limit.

2. Safety net: Added 'prompt is too long' to context-length error detection
   phrases. Anthropic returns 'prompt is too long: N tokens > M maximum'
   on HTTP 400, which wasn't matched by existing phrases. This ensures
   compression fires even if the proactive check underestimates.

Fixes #813
2026-03-11 08:04:52 -07:00
teknium1
efb780c754 Revert "fix: smart vision setup that respects the user's chosen provider"
This reverts commit c64efa9260.
2026-03-11 07:59:00 -07:00
teknium1
c64efa9260 fix: smart vision setup that respects the user's chosen provider
The old flow blindly asked for an OpenRouter API key after ANY non-OR
provider selection, even for Nous Portal and Codex which already
support vision natively. This was confusing and annoying.

New behavior:
- OpenRouter: skip — vision uses Gemini via their OR key
- Nous Portal OAuth: skip — vision uses Gemini via Nous
- OpenAI Codex: skip — gpt-5.3-codex supports vision
- Custom endpoint (api.openai.com): show OpenAI vision model picker
  (gpt-4o, gpt-4o-mini, gpt-4.1, etc.), saves AUXILIARY_VISION_MODEL
- Custom (other) / z.ai / kimi / minimax / nous-api:
  - First checks if existing OR/Nous creds already cover vision
  - If not, offers friendly choice: OpenRouter / OpenAI / Skip
  - No more 'enter OpenRouter key' thrown in your face

Also fixes the setup summary to check actual vision availability
across all providers instead of hardcoding 'requires OPENROUTER_API_KEY'.
MoA still correctly requires OpenRouter (calls multiple frontier models).
2026-03-11 07:48:44 -07:00
teknium1
43cb35cb21 docs: list individual config commands first, then hermes setup as all-in-one
Show users the specific commands for each config area (hermes model,
hermes tools, hermes config set, hermes gateway setup) and then
present 'hermes setup' as the option to configure everything at once.
2026-03-11 07:30:28 -07:00
teknium1
db496180db docs: remove hermes setup from install flow, point to hermes model/tools instead
The installer already handles full setup (provider config, etc.), so
telling users to run 'hermes setup' post-install is redundant and
confusing. Updated all docs to reflect the correct flow:

1. Run the installer (handles everything including provider setup)
2. Use 'hermes model', 'hermes tools', 'hermes gateway setup' to
   reconfigure individual settings later

Files updated:
- README.md: removed setup from quick install & getting started
- installation.md: updated post-install, manual step 9, troubleshooting
- quickstart.md: updated provider section & quick reference table
- cli-commands.md: updated hermes setup description
- faq.md: replaced hermes setup references with specific commands
2026-03-11 07:28:05 -07:00
Teknium
c69adfbb17 Merge pull request #825 from JackTheGit/fix/docs-typos-batch2
Fix several documentation typos
2026-03-11 07:13:24 -07:00
teknium1
683c8b24d4 fix: reduce max_retries to 3 and make ValueError/TypeError non-retryable
- max_retries reduced from 6 to 3 — 6 retries with exponential backoff
  could stall for ~275s total on persistent errors
- ValueError and TypeError now detected as non-retryable client errors
  and abort immediately instead of being retried with backoff (these are
  local validation/programming errors that will never succeed on retry)
2026-03-11 07:04:46 -07:00
teknium1
d2dee43825 fix: allow tool_choice, parallel_tool_calls, prompt_cache_key in codex preflight
_preflight_codex_api_kwargs rejected these three fields as unsupported,
but _build_api_kwargs adds them to every codex request. This caused a
ValueError before _interruptible_api_call was reached, which was caught
by the retry loop and retried with exponential backoff — appearing as
an infinite hang in tests (275s total backoff across 6 retries).

The fix adds these keys to allowed_keys and passes them through to the
normalized request dict.

This fixes the hanging test_cron_run_job_codex_path_handles_internal_401_refresh
test (now passes in 2.6s instead of timing out).
2026-03-11 07:00:14 -07:00
dmahan93
59b53f0a23 fix: skip tests when atroposlib/minisweagent unavailable in CI
- test_agent_loop_tool_calling.py: import atroposlib at module level
  to trigger skip (environments.agent_loop is now importable without
  atroposlib due to __init__.py graceful fallback)
- test_modal_sandbox_fixes.py: skip TestToolResolution tests when
  minisweagent not installed
2026-03-11 06:52:55 -07:00
dmahan93
d198a647e2 fix: guard all atroposlib imports for CI without atropos installed
- environments/__init__.py: try/except on atroposlib imports so
  submodules like tool_call_parsers remain importable standalone
- test_agent_loop.py, test_tool_call_parsers.py,
  test_managed_server_tool_support.py: skip at module level when
  atroposlib is missing
2026-03-11 06:52:55 -07:00
dmahan93
0f53275169 test: skip atropos-dependent tests when atroposlib not installed
Guard all test files that import from environments/ or atroposlib
with try/except + pytest.skip(allow_module_level=True) so they
gracefully skip instead of crashing when deps aren't available.
2026-03-11 06:52:55 -07:00
dmahan93
366de72a38 add a local vllm instance 2026-03-11 06:52:55 -07:00
dmahan93
13f5459670 fix: use ManagedServer for vLLM in TBLite eval + local_vllm config
TBLite eval was bypassing ManagedServer and calling ServerManager
directly, which uses /v1/chat/completions — not available on the
atropos vllm_api_server (/generate only).

Now uses _use_managed_server() to detect vLLM/SGLang backends and
route through ManagedServer (Phase 2) with proper tool_parser and
/generate endpoint. Falls back to Phase 1 for OpenAI endpoints.

Also adds local_vllm.yaml config for running against a local vLLM
server with Docker sandboxes.
2026-03-11 06:52:55 -07:00
dmahan93
93333387d6 fix: handle dict and object tool_calls in agent loop
vLLM's ToolCallTranslator returns tool_calls as dicts, while
OpenAI API returns them as objects with .id, .function.name etc.
Normalize both formats in the agent loop.
2026-03-11 06:52:26 -07:00
dmahan93
1f9e7cd659 test: 5 vLLM integration tests + fallback tool call parser
Tests hit a real vLLM server (Qwen/Qwen3-4B-Thinking-2507) via
ManagedServer Phase 2. Auto-skip if server isn't running.

Tests verify:
- Single tool call through full agent loop
- Multi-tool calls across turns
- ManagedServer produces SequenceNodes with tokens/logprobs
- Direct response without tools
- Thinking model produces <think> blocks

Also adds fallback parser in agent_loop.py: when ManagedServer's
ToolCallTranslator can't parse (vLLM not installed), hermes-agent's
standalone parsers extract <tool_call> tags from raw content.
2026-03-11 06:52:26 -07:00
dmahan93
09fc64c6b6 add eval output to gitignore 2026-03-11 06:52:26 -07:00
dmahan93
84147f4d81 refactor: update to new atropos tool-calling API
Migrate from old tool_call_parser (instance) to new ToolCallTranslator
pattern from atropos add-openai-endpoint-for-managed-server branch:

- Set tool_parser on ServerManager (string name, e.g. 'hermes')
- Use managed_server(tokenizer=..., preserve_think_blocks=...)
  instead of managed_server(tokenizer=..., tool_call_parser=instance)
- ManagedServer now handles tool call translation internally via
  ToolCallTranslator (bidirectional raw text <-> OpenAI tool_calls)
- Remove old parser loading code (get_parser/KeyError fallback)

The hermes-agent tool_call_parsers/ directory is preserved as a
standalone fallback for environments that don't use vLLM's parsers.
2026-03-11 06:52:26 -07:00
dmahan93
ee4b20b55b test: 9 agent loop tool-calling integration tests
Real LLM calls via OpenRouter using stepfun/step-3.5-flash:free (zero cost).
Falls back to paid models if free model is unavailable.

Tests: single tool call, multi-tool single turn, multi-turn chains,
unknown tool rejection, max_turns limit, direct response (no tools),
tool error handling, AgentResult structure, conversation history.
2026-03-11 06:52:26 -07:00
dmahan93
ed27b826c5 feat: add eval_concurrency limit + Docker local config for TBLite
- Add eval_concurrency config field with asyncio.Semaphore
- Add local.yaml config using Docker backend (sandboxed, no cloud costs)
- Register docker_image alongside modal_image for backend flexibility
- Default: 8 parallel tasks for local runs
2026-03-11 06:52:26 -07:00
dmahan93
b03aefaf20 test: 13 tests for Modal sandbox infra fixes 2026-03-11 06:51:42 -07:00
dmahan93
d7f4db53f5 fix: Modal sandbox eval infra (9 fixes for TBLite baseline)
Fixes discovered while running TBLite baseline evaluation:

1. ephemeral_disk param not supported in modal 1.3.5 - check before passing
2. Modal legacy image builder requires working pip - add ensurepip fix via
   setup_dockerfile_commands to handle task images with broken pip
3. Host cwd leaked into Modal sandbox - add /home/ to host prefix check
4. Tilde ~ not expanded by subprocess.run(cwd=) in sandboxes - use /root
5. install_pipx must stay True for swerex-remote to be available

Dependencies also needed (not in this commit):
- git submodule update --init mini-swe-agent
- uv pip install swe-rex boto3
2026-03-11 06:51:42 -07:00
dmahan93
2c97bf3936 Add tests for atropos tool calling integration
- test_tool_call_parsers.py: 16 tests for parser registry, hermes parser
  (single/multiple/truncated/malformed), and ParseResult contract validation
- test_agent_loop.py: 21 tests for HermesAgentLoop with mock servers
  (text responses, tool calls, max turns, unknown tools, API errors,
  extra_body forwarding, managed state, blocked tools, reasoning extraction)
- test_managed_server_tool_support.py: 9 tests validating API compatibility
  between hermes-agent and atroposlib's ManagedServer tool_call_parser support
  (gracefully skips on baseline atroposlib, passes on tool_call_support branch)
2026-03-11 06:51:26 -07:00
teknium1
1dfa544250 Merge PR #802: test: parallelize test suite with pytest-xdist
Adds pytest-xdist to dev dependencies and -n auto to default pytest addopts
for parallel test execution across CPU cores.

Authored by OutThisLife.

Co-authored-by: OutThisLife <OutThisLife@users.noreply.github.com>
2026-03-11 06:43:00 -07:00
teknium1
eac5f8f40f fix: wire email platform into toolset mappings + add documentation
Post-merge fixes for the email gateway (PR #797):

1. Add Platform.EMAIL to all 4 platform-to-toolset/config mapping
   dicts in gateway/run.py. Without this, email sessions silently
   fell back to the Telegram toolset because these dicts were added
   after the PR branched off main.

2. Add email (and signal) to hermes_cli/tools_config.py and
   hermes_cli/skills_config.py PLATFORMS dicts so they appear in
   'hermes tools' and 'hermes skills' CLI commands.

3. Add full email setup documentation:
   - website/docs/user-guide/messaging/email.md — setup guide with
     Gmail/Outlook instructions, configuration, troubleshooting,
     security advice, and env var reference
   - Update messaging/index.md — add email to architecture diagram,
     platform toolset table, security examples, and next steps
2026-03-11 06:34:32 -07:00
teknium1
184aa5b2b3 fix: tighten exc_info assertion in vision test (from PR #803)
The weaker assertion (r.exc_info is not None) passes even when
exc_info is (None, None, None). Check r.exc_info[0] is not None
to verify actual exception info is present.

The _aux_async_client mock was already applied on main.

Co-authored-by: OutThisLife <nickolasgustafsson@gmail.com>
2026-03-11 06:32:01 -07:00
0xbyt4
bdcf247efe feat: add email gateway platform (IMAP/SMTP)
Allow users to interact with Hermes by sending and receiving emails.
Uses IMAP polling for incoming messages and SMTP for replies with
proper threading (In-Reply-To, References headers).

Integrates with all 14 gateway extension points: config, adapter
factory, authorization, send_message tool, cron delivery, toolsets,
prompt hints, channel directory, setup wizard, status display, and
env example.

65 tests covering config, parsing, dispatch, threading, IMAP fetch,
SMTP send, attachments, and all integration points.
2026-03-11 06:32:01 -07:00
Teknium
b16d7f2da6 Merge pull request #921 from NousResearch/hermes/hermes-ece5a45c
feat(cli): add /reasoning command for effort level and display toggle
2026-03-11 06:30:20 -07:00
teknium1
9423fda5cb feat: configurable subagent provider:model with full credential resolution
Adds delegation.model and delegation.provider config fields so subagents
can run on a completely different provider:model pair than the parent agent.

When delegation.provider is set, the system resolves the full credential
bundle (base_url, api_key, api_mode) via resolve_runtime_provider() —
the same path used by CLI/gateway startup. This means all configured
providers work out of the box: openrouter, nous, zai, kimi-coding,
minimax, minimax-cn.

Key design decisions:
- Provider resolution uses hermes_cli.runtime_provider (single source of
  truth for credential resolution across CLI, gateway, cron, and now
  delegation)
- When only delegation.model is set (no provider), the model name changes
  but parent credentials are inherited (for switching models within the
  same provider like OpenRouter)
- When delegation.provider is set, full credentials are resolved
  independently — enabling cross-provider delegation (e.g. parent on
  Nous Portal, subagents on OpenRouter)
- Clear error messages if provider resolution fails (missing API key,
  unknown provider name)
- _load_config() now falls back to hermes_cli.config.load_config() for
  gateway/cron contexts where CLI_CONFIG is unavailable

Based on PR #791 by 0xbyt4 (closes #609), reworked to use proper
provider credential resolution instead of passing provider as metadata.

Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>
2026-03-11 06:12:21 -07:00
teknium1
4d873f77c1 feat(cli): add /reasoning command for effort level and display toggle
Combined implementation of reasoning management:
- /reasoning              Show current effort level and display state
- /reasoning <level>      Set reasoning effort (none, low, medium, high, xhigh)
- /reasoning show|on      Show model thinking/reasoning in output
- /reasoning hide|off     Hide model thinking/reasoning from output

Effort level changes persist to config and force agent re-init.
Display toggle updates the agent callback dynamically without re-init.

When display is enabled:
- Intermediate reasoning shown as dim [thinking] lines during tool loops
- Final reasoning shown in a bordered box above the response
- Long reasoning collapsed (5 lines intermediate, 10 lines final)

Also adds:
- reasoning_callback parameter to AIAgent
- last_reasoning in run_conversation result dict
- show_reasoning config option (display section, default: false)
- Display section in /config output
- 34 tests covering both features

Combines functionality from PR #789 and PR #790.

Co-authored-by: Aum Desai <Aum08Desai@users.noreply.github.com>
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
2026-03-11 06:02:18 -07:00
teknium1
09336a6710 Merge PR #795: fix: handle empty choices in MCP sampling callback
Adds defensive guard against empty/None/missing choices in SamplingHandler.__call__
before accessing response.choices[0]. Returns proper ErrorData instead of crashing
with IndexError/TypeError on content filtering, provider errors, or rate limits.

Authored by 0xbyt4.

Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>
2026-03-11 05:47:51 -07:00
aydnOktay
9149c34a26 refactor(slack): replace print statements with structured logging
Replaces all ad-hoc print() calls in the Slack gateway adapter with
proper logging.getLogger(__name__) calls, matching the pattern already
used by every other platform adapter (telegram, discord, whatsapp,
signal, homeassistant).

Changes:
- Add import logging + module-level logger
- Use logger.error for failures, logger.warning for non-critical
  fallbacks, logger.info for status, logger.debug for routine ops
- Add exc_info=True for full stack traces on all error/warning paths
- Use %s format strings (lazy evaluation) instead of f-strings
- Wrap disconnect() in try/except for safety
- Add structured context (file paths, channel IDs, URLs) to log messages
- Convert document handling prints added after the original PR

Cherry-picked from PR #778 by aydnOktay, rebased onto current main
with conflict resolution and extended to cover document/video methods
added since the PR was created.

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-11 05:34:43 -07:00
teknium1
c837ef949d fix: replace debug print() with logger.error() in file_tools
Stray print() in write_file_tool exception handler leaked debug output
to stdout. Replaced with logger.error() which is already set up in
the file.

Authored by memosr.

Co-authored-by: memosr <memosr@users.noreply.github.com>
2026-03-11 04:38:07 -07:00