Commit Graph

52 Commits

Author SHA1 Message Date
Teknium
3576f44a57 feat: add Vercel AI Gateway provider (#1628)
* feat: add Vercel AI Gateway as a first-class provider

Adds AI Gateway (ai-gateway.vercel.sh) as a new inference provider
with AI_GATEWAY_API_KEY authentication, live model discovery, and
reasoning support via extra_body.reasoning.

Based on PR #1492 by jerilynzheng.

* feat: add AI Gateway to setup wizard, doctor, and fallback providers

* test: add AI Gateway to api_key_providers test suite

* feat: add AI Gateway to hermes model CLI and model metadata

Wire AI Gateway into the interactive model selection menu and add
context lengths for AI Gateway model IDs in model_metadata.py.

* feat: use claude-haiku-4.5 as AI Gateway auxiliary model

* revert: use gemini-3-flash as AI Gateway auxiliary model

* fix: move AI Gateway below established providers in selection order

---------

Co-authored-by: jerilynzheng <jerilynzheng@users.noreply.github.com>
Co-authored-by: jerilynzheng <zheng.jerilyn@gmail.com>
2026-03-17 00:12:16 -07:00
teknium1
f4d61c168b merge: resolve conflicts with main (show_cost, turn routing, docker docs) 2026-03-16 14:22:38 -07:00
teknium1
8feb9e4656 docs: add streaming section to configuration guide 2026-03-16 12:53:49 -07:00
Teknium
5e5c92663d fix: hermes update causes dual gateways on macOS (launchd) (#1567)
* feat: add optional smart model routing

Add a conservative cheap-vs-strong routing option that can send very short/simple turns to a cheaper model across providers while keeping the primary model for complex work. Wire it through CLI, gateway, and cron, and document the config.yaml workflow.

* fix(gateway): remove recursive ExecStop from systemd units, extend TimeoutStopSec to 60s

* fix(gateway): avoid recursive ExecStop in user systemd unit

* fix: extend ExecStop removal and TimeoutStopSec=60 to system unit

The cherry-picked PR #1448 fix only covered the user systemd unit.
The system unit had the same TimeoutStopSec=15 and could benefit
from the same 60s timeout for clean shutdown. Also adds a regression
test for the system unit.

---------

Co-authored-by: Ninja <ninja@local>

* feat(skills): add blender-mcp optional skill for 3D modeling

Control a running Blender instance from Hermes via socket connection
to the blender-mcp addon (port 9876). Supports creating 3D objects,
materials, animations, and running arbitrary bpy code.

Placed in optional-skills/ since it requires Blender 4.3+ desktop
with a third-party addon manually started each session.

* feat(acp): support slash commands in ACP adapter (#1532)

Adds /help, /model, /tools, /context, /reset, /compact, /version
to the ACP adapter (VS Code, Zed, JetBrains). Commands are handled
directly in the server without instantiating the TUI — each command
queries agent/session state and returns plain text.

Unrecognized /commands fall through to the LLM as normal messages.

/model uses detect_provider_for_model() for auto-detection when
switching models, matching the CLI and gateway behavior.

Fixes #1402

* fix(logging): improve error logging in session search tool (#1533)

* fix(gateway): restart on retryable startup failures (#1517)

* feat(email): add skip_attachments option via config.yaml

* feat(email): add skip_attachments option via config.yaml

Adds a config.yaml-driven option to skip email attachments in the
gateway email adapter. Useful for malware protection and bandwidth
savings.

Configure in config.yaml:
  platforms:
    email:
      skip_attachments: true

Based on PR #1521 by @an420eth, changed from env var to config.yaml
(via PlatformConfig.extra) to match the project's config-first pattern.

* docs: document skip_attachments option for email adapter

* fix(telegram): retry on transient TLS failures during connect and send

Add exponential-backoff retry (3 attempts) around initialize() to
handle transient TLS resets during gateway startup. Also catches
TimedOut and OSError in addition to NetworkError.

Add exponential-backoff retry (3 attempts) around send_message() for
NetworkError during message delivery, wrapping the existing Markdown
fallback logic.

Both imports are guarded with try/except ImportError for test
environments where telegram is mocked.

Based on PR #1527 by cmd8. Closes #1526.

* feat: permissive block_anchor thresholds and unicode normalization (#1539)

Salvaged from PR #1528 by an420eth. Closes #517.

Improves _strategy_block_anchor in fuzzy_match.py:
- Add unicode normalization (smart quotes, em/en-dashes, ellipsis,
  non-breaking spaces → ASCII) so LLM-produced unicode artifacts
  don't break anchor line matching
- Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for
  multiple candidates — if first/last lines match exactly, the
  block is almost certainly correct
- Use original (non-normalized) content for offset calculation to
  preserve correct character positions

Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space
anchors, very-low-similarity unique matches), zero regressions on
all 9 existing fuzzy match tests.

Co-authored-by: an420eth <an420eth@users.noreply.github.com>

* feat(cli): add file path autocomplete in the input prompt (#1545)

When typing a path-like token (./  ../  ~/  /  or containing /),
the CLI now shows filesystem completions in the dropdown menu.
Directories show a trailing slash and 'dir' label; files show
their size. Completions are case-insensitive and capped at 30
entries.

Triggered by tokens like:
  edit ./src/ma     → shows ./src/main.py, ./src/manifest.json, ...
  check ~/doc       → shows ~/docs/, ~/documents/, ...
  read /etc/hos     → shows /etc/hosts, /etc/hostname, ...
  open tools/reg    → shows tools/registry.py

Slash command autocomplete (/help, /model, etc.) is unaffected —
it still triggers when the input starts with /.

Inspired by OpenCode PR #145 (file path completion menu).

Implementation:
- hermes_cli/commands.py: _extract_path_word() detects path-like
  tokens, _path_completions() yields filesystem Completions with
  size labels, get_completions() routes to paths vs slash commands
- tests/hermes_cli/test_path_completion.py: 26 tests covering
  path extraction, prefix filtering, directory markers, home
  expansion, case-insensitivity, integration with slash commands

* feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled

Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:

- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)

Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.

Inspired by OpenClaw PR #47959.

* fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs)

Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.

Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.

* feat: smart approvals + /stop command (inspired by OpenAI Codex)

* feat: smart approvals — LLM-based risk assessment for dangerous commands

Adds a 'smart' approval mode that uses the auxiliary LLM to assess
whether a flagged command is genuinely dangerous or a false positive,
auto-approving low-risk commands without prompting the user.

Inspired by OpenAI Codex's Smart Approvals guardian subagent
(openai/codex#13860).

Config (config.yaml):
  approvals:
    mode: manual   # manual (default), smart, off

Modes:
- manual — current behavior, always prompt the user
- smart  — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block),
           or ESCALATE (fall through to manual prompt)
- off    — skip all approval prompts (equivalent to --yolo)

When smart mode auto-approves, the pattern gets session-level approval
so subsequent uses of the same pattern don't trigger another LLM call.
When it denies, the command is blocked without user prompt. When
uncertain, it escalates to the normal manual approval flow.

The LLM prompt is carefully scoped: it sees only the command text and
the flagged reason, assesses actual risk vs false positive, and returns
a single-word verdict.

* feat: make smart approval model configurable via config.yaml

Adds auxiliary.approval section to config.yaml with the same
provider/model/base_url/api_key pattern as other aux tasks (vision,
web_extract, compression, etc.).

Config:
  auxiliary:
    approval:
      provider: auto
      model: ''        # fast/cheap model recommended
      base_url: ''
      api_key: ''

Bridged to env vars in both CLI and gateway paths so the aux client
picks them up automatically.

* feat: add /stop command to kill all background processes

Adds a /stop slash command that kills all running background processes
at once. Currently users have to process(list) then process(kill) for
each one individually.

Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current
turn) from /stop (cleans up background processes). See openai/codex#14602.

Ctrl+C continues to only interrupt the active agent turn — background
dev servers, watchers, etc. are preserved. /stop is the explicit way
to clean them all up.

* feat: first-class plugin architecture + hide status bar cost by default (#1544)

The persistent status bar now shows context %, token counts, and
duration but NOT $ cost by default. Cost display is opt-in via:

  display:
    show_cost: true

in config.yaml, or: hermes config set display.show_cost true

The /usage command still shows full cost breakdown since the user
explicitly asked for it — this only affects the always-visible bar.

Status bar without cost:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m

Status bar with show_cost: true:
  ⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m

* feat: improve memory prioritization + aggressive skill updates (inspired by OpenAI Codex)

* feat: improve memory prioritization — user preferences over procedural knowledge

Inspired by OpenAI Codex's memory prompt improvements (openai/codex#14493)
which focus memory writes on user preferences and recurring patterns
rather than procedural task details.

Key insight: 'Optimize for reducing future user steering — the most
valuable memory prevents the user from having to repeat themselves.'

Changes:
- MEMORY_GUIDANCE (prompt_builder.py): added prioritization hierarchy
  and the core principle about reducing user steering
- MEMORY_SCHEMA (memory_tool.py): reordered WHEN TO SAVE list to put
  corrections first, added explicit PRIORITY guidance
- Memory nudge (run_agent.py): now asks specifically about preferences,
  corrections, and workflow patterns instead of generic 'anything'
- Memory flush (run_agent.py): now instructs to prioritize user
  preferences and corrections over task-specific details

* feat: more aggressive skill creation and update prompting

Press harder on skill updates — the agent should proactively patch
skills when it encounters issues during use, not wait to be asked.

Changes:
- SKILLS_GUIDANCE: 'consider saving' → 'save'; added explicit instruction
  to patch skills immediately when found outdated/wrong
- Skills header: added instruction to update loaded skills before finishing
  if they had missing steps or wrong commands
- Skill nudge: more assertive ('save the approach' not 'consider saving'),
  now also prompts for updating existing skills used in the task
- Skill nudge interval: lowered default from 15 to 10 iterations
- skill_manage schema: added 'patch it immediately' to update triggers

* feat: first-class plugin architecture (#1555)

Plugin system for extending Hermes with custom tools, hooks, and
integrations — no source code changes required.

Core system (hermes_cli/plugins.py):
  - Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and
    pip entry_points (hermes_agent.plugins group)
  - PluginContext with register_tool() and register_hook()
  - 6 lifecycle hooks: pre/post tool_call, pre/post llm_call,
    on_session_start/end
  - Namespace package handling for relative imports in plugins
  - Graceful error isolation — broken plugins never crash the agent

Integration (model_tools.py):
  - Plugin discovery runs after built-in + MCP tools
  - Plugin tools bypass toolset filter via get_plugin_tool_names()
  - Pre/post tool call hooks fire in handle_function_call()

CLI:
  - /plugins command shows loaded plugins, tool counts, status
  - Added to COMMANDS dict for autocomplete

Docs:
  - Getting started guide (build-a-hermes-plugin.md) — full tutorial
    building a calculator plugin step by step
  - Reference page (features/plugins.md) — quick overview + tables
  - Covers: file structure, schemas, handlers, hooks, data files,
    bundled skills, env var gating, pip distribution, common mistakes

Tests: 16 tests covering discovery, loading, hooks, tool visibility.

* fix: hermes update causes dual gateways on macOS (launchd)

Three bugs worked together to create the dual-gateway problem:

1. cmd_update only checked systemd for gateway restart, completely
   ignoring launchd on macOS. After killing the PID it would print
   'Restart it with: hermes gateway run' even when launchd was about
   to auto-respawn the process.

2. launchd's KeepAlive.SuccessfulExit=false respawns the gateway
   after SIGTERM (non-zero exit), so the user's manual restart
   created a second instance.

3. The launchd plist lacked --replace (systemd had it), so the
   respawned gateway didn't kill stale instances on startup.

Fixes:
- Add --replace to launchd ProgramArguments (matches systemd)
- Add launchd detection to cmd_update's auto-restart logic
- Print 'auto-restart via launchd' instead of manual restart hint

* fix: add launchd plist auto-refresh + explicit restart in cmd_update

Two integration issues with the initial fix:

1. Existing macOS users with old plist (no --replace) would never
   get the fix until manual uninstall/reinstall. Added
   refresh_launchd_plist_if_needed() — mirrors the existing
   refresh_systemd_unit_if_needed(). Called from launchd_start(),
   launchd_restart(), and cmd_update.

2. cmd_update relied on KeepAlive respawn after SIGTERM rather than
   explicit launchctl stop/start. This caused races: launchd would
   respawn the old process before the PID file was cleaned up.
   Now does explicit stop+start (matching how systemd gets an
   explicit systemctl restart), with plist refresh first so the
   new --replace flag is picked up.

---------

Co-authored-by: Ninja <ninja@local>
Co-authored-by: alireza78a <alireza78a@users.noreply.github.com>
Co-authored-by: Oktay Aydin <113846926+aydnOktay@users.noreply.github.com>
Co-authored-by: JP Lew <polydegen@protonmail.com>
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
2026-03-16 12:36:29 -07:00
teknium1
9a423c3487 fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs)
Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.

Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.
2026-03-16 05:58:34 -07:00
teknium1
c51e7b4af7 feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled
Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:

- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)

Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.

Inspired by OpenClaw PR #47959.
2026-03-16 05:48:45 -07:00
teknium1
780ddd102b fix(docker): gate cwd workspace mount behind config
Keep Docker sandboxes isolated by default. Add an explicit terminal.docker_mount_cwd_to_workspace opt-in, thread it through terminal/file environment creation, and document the security tradeoff and config.yaml workflow clearly.
2026-03-16 05:20:56 -07:00
Bartok9
8cdbbcaaa2 fix(docker): auto-mount host CWD to /workspace
Fixes #1445 — When using Docker backend, the user's current working
directory is now automatically bind-mounted to /workspace inside the
container. This allows users to run `cd my-project && hermes` and have
their project files accessible to the agent without manual volume config.

Changes:
- Add host_cwd and auto_mount_cwd parameters to DockerEnvironment
- Capture original host CWD in _get_env_config() before container fallback
- Pass host_cwd through _create_environment() to Docker backend
- Add TERMINAL_DOCKER_NO_AUTO_MOUNT env var to disable if needed
- Skip auto-mount when /workspace is already explicitly mounted
- Add tests for auto-mount behavior
- Add documentation for the new feature

The auto-mount is skipped when:
1. TERMINAL_DOCKER_NO_AUTO_MOUNT=true is set
2. User configured docker_volumes with :/workspace
3. persistent_filesystem=true (persistent sandbox mode)

This makes the Docker backend behave more intuitively — the agent
operates on the user's actual project directory by default.
2026-03-16 05:20:21 -07:00
Bartok9
3543b755af fix(docker): auto-mount host CWD to /workspace
Fixes #1445 — When using Docker backend, the user's current working
directory is now automatically bind-mounted to /workspace inside the
container. This allows users to run `cd my-project && hermes` and have
their project files accessible to the agent without manual volume config.

Changes:
- Add host_cwd and auto_mount_cwd parameters to DockerEnvironment
- Capture original host CWD in _get_env_config() before container fallback
- Pass host_cwd through _create_environment() to Docker backend
- Add TERMINAL_DOCKER_NO_AUTO_MOUNT env var to disable if needed
- Skip auto-mount when /workspace is already explicitly mounted
- Add tests for auto-mount behavior
- Add documentation for the new feature

The auto-mount is skipped when:
1. TERMINAL_DOCKER_NO_AUTO_MOUNT=true is set
2. User configured docker_volumes with :/workspace
3. persistent_filesystem=true (persistent sandbox mode)

This makes the Docker backend behave more intuitively — the agent
operates on the user's actual project directory by default.
2026-03-16 04:53:24 -07:00
teknium1
38b4fd3737 fix(gateway): make group session isolation configurable
default group and channel sessions to per-user isolation, allow opting back into shared room sessions via config.yaml, and document Discord gateway routing and session behavior.
2026-03-16 00:22:23 -07:00
teknium1
6894358fe1 docs: add persistent shell section to configuration and env-vars reference
Documents terminal.persistent_shell config option, per-backend env var
overrides, precedence table, and what state persists across commands.
2026-03-15 21:01:50 -07:00
Teknium
463239ed85 docs: fallback providers + /background command documentation
* docs: comprehensive fallback providers documentation

- New dedicated page: user-guide/features/fallback-providers.md covering
  both primary model fallback and auxiliary task fallback systems
- Updated configuration.md with fallback_model config section
- Updated environment-variables.md noting fallback is config-only
- Fleshed out developer-guide/provider-runtime.md fallback section with
  internal architecture details (trigger points, activation flow, config flow)
- Added cross-reference from provider-routing.md distinguishing OpenRouter
  sub-provider routing from Hermes-level model fallback
- Added new page to sidebar under Integrations

* docs: comprehensive /background command documentation

- Added Background Sessions section to cli.md covering how it works
  (daemon threads, isolated sessions, config inheritance, Rich panel
  output, bell notification, concurrent tasks)
- Added Background Sessions section to messaging/index.md covering
  messaging-specific behavior (async execution, result delivery back
  to same chat, fire-and-forget pattern)
- Documented background_process_notifications config
  (all/result/error/off) in messaging docs and configuration.md
- Added HERMES_BACKGROUND_NOTIFICATIONS env var to reference page
- Fixed inconsistency in slash-commands.md: /background was listed as
  messaging-only but works in both CLI and messaging. Moved it to the
  'both surfaces' note.
- Expanded one-liner table descriptions with detail and cross-references
2026-03-15 06:24:28 -07:00
teknium1
85ef09e520 Merge origin/main into hermes/hermes-dd253d81 2026-03-14 21:16:29 -07:00
Teknium
6b1adb7eb1 Merge pull request #1376 from NousResearch/hermes/hermes-781f9235-docs
docs: clarify saved custom endpoint routing
2026-03-14 21:15:24 -07:00
teknium1
282df107a5 docs: clarify saved custom endpoint routing 2026-03-14 21:12:42 -07:00
teknium1
9f6bccd76a feat: add direct endpoint overrides for auxiliary and delegation
Add base_url/api_key overrides for auxiliary tasks and delegation so users can
route those flows straight to a custom OpenAI-compatible endpoint without
having to rely on provider=main or named custom providers.

Also clear gateway session env vars in test isolation so the full suite stays
deterministic when run from a messaging-backed agent session.
2026-03-14 21:11:37 -07:00
Teknium
fc5443d854 Merge pull request #1360 from NousResearch/hermes/hermes-aa701810
fix: refresh Anthropic OAuth before stale env tokens
2026-03-14 19:53:40 -07:00
teknium1
799114ac8b docs: clarify Anthropic Claude auth flow 2026-03-14 19:49:38 -07:00
teknium1
e099117a3b docs: complete voice mode docs 2026-03-14 19:29:01 -07:00
teknium1
f8b30d1035 docs(soul): add comprehensive SOUL.md guide
Document the new global-only SOUL behavior, add a dedicated use guide, update personality/context/config docs, and fix docs language that still described cwd-local SOUL loading.
2026-03-14 09:37:26 -07:00
teknium1
dd6a5732e7 docs: fix salvaged PR #980 troubleshooting details
Correct the PowerShell UTF-8 snippet in the new Windows encoding tip
and soften the Docker CLI wording to match Hermes' actual lookup
behavior.
2026-03-14 06:02:57 -07:00
aydnOktay
767b5463f9 docs: add terminal backend and windows troubleshooting 2026-03-14 06:01:22 -07:00
Teknium
984f00e0b0 docs: expand Docusaurus coverage across CLI, tools, skills, and skins (#1232)
- add code-derived reference pages for slash commands, tools, toolsets,
  bundled skills, and official optional skills
- document the skin system and link visual theming separately from
  conversational personality
- refresh quickstart, configuration, environment variable, and messaging
  docs to match current provider, gateway, and browser behavior
- fix stale command, session, and Home Assistant configuration guidance
2026-03-13 21:34:41 -07:00
Teknium
475dd58a8e Merge PR #736: feat(honcho): async writes, memory modes, session title integration, setup CLI
Authored by erosika. Builds on #38 and #243.

Adds async write support, configurable memory modes, context prefetch pipeline,
4 new Honcho tools (honcho_context, honcho_profile, honcho_search, honcho_conclude),
full 'hermes honcho' CLI, session strategies, AI peer identity, recallMode A/B,
gateway lifecycle management, and comprehensive docs.

Cherry-picks fixes from PRs #831/#832 (adavyas).

Co-authored-by: erosika <erosika@users.noreply.github.com>
Co-authored-by: adavyas <adavyas@users.noreply.github.com>
2026-03-12 19:05:11 -07:00
teknium1
809abd60bf docs: add Anthropic provider to all documentation pages
- quickstart.md: Add Anthropic to the provider comparison table
- configuration.md: Add Anthropic to provider list table, add full
  'Anthropic (Native)' section with three auth methods (API key,
  setup-token, Claude Code auto-detect), config.yaml example,
  and provider alias tip
- environment-variables.md: Add ANTHROPIC_API_KEY, ANTHROPIC_TOKEN,
  CLAUDE_CODE_OAUTH_TOKEN to LLM Providers table; add 'anthropic'
  to HERMES_INFERENCE_PROVIDER values list
2026-03-12 17:28:36 -07:00
Erosika
a0b0dbe6b2 Merge remote-tracking branch 'origin/main' into feat/honcho-async-memory
Made-with: Cursor

# Conflicts:
#	cli.py
#	tests/test_run_agent.py
2026-03-11 12:22:56 -04:00
Teknium
b16d7f2da6 Merge pull request #921 from NousResearch/hermes/hermes-ece5a45c
feat(cli): add /reasoning command for effort level and display toggle
2026-03-11 06:30:20 -07:00
teknium1
9423fda5cb feat: configurable subagent provider:model with full credential resolution
Adds delegation.model and delegation.provider config fields so subagents
can run on a completely different provider:model pair than the parent agent.

When delegation.provider is set, the system resolves the full credential
bundle (base_url, api_key, api_mode) via resolve_runtime_provider() —
the same path used by CLI/gateway startup. This means all configured
providers work out of the box: openrouter, nous, zai, kimi-coding,
minimax, minimax-cn.

Key design decisions:
- Provider resolution uses hermes_cli.runtime_provider (single source of
  truth for credential resolution across CLI, gateway, cron, and now
  delegation)
- When only delegation.model is set (no provider), the model name changes
  but parent credentials are inherited (for switching models within the
  same provider like OpenRouter)
- When delegation.provider is set, full credentials are resolved
  independently — enabling cross-provider delegation (e.g. parent on
  Nous Portal, subagents on OpenRouter)
- Clear error messages if provider resolution fails (missing API key,
  unknown provider name)
- _load_config() now falls back to hermes_cli.config.load_config() for
  gateway/cron contexts where CLI_CONFIG is unavailable

Based on PR #791 by 0xbyt4 (closes #609), reworked to use proper
provider credential resolution instead of passing provider as metadata.

Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>
2026-03-11 06:12:21 -07:00
teknium1
4d873f77c1 feat(cli): add /reasoning command for effort level and display toggle
Combined implementation of reasoning management:
- /reasoning              Show current effort level and display state
- /reasoning <level>      Set reasoning effort (none, low, medium, high, xhigh)
- /reasoning show|on      Show model thinking/reasoning in output
- /reasoning hide|off     Hide model thinking/reasoning from output

Effort level changes persist to config and force agent re-init.
Display toggle updates the agent callback dynamically without re-init.

When display is enabled:
- Intermediate reasoning shown as dim [thinking] lines during tool loops
- Final reasoning shown in a bordered box above the response
- Long reasoning collapsed (5 lines intermediate, 10 lines final)

Also adds:
- reasoning_callback parameter to AIAgent
- last_reasoning in run_conversation result dict
- show_reasoning config option (display section, default: false)
- Display section in /config output
- 34 tests covering both features

Combines functionality from PR #789 and PR #790.

Co-authored-by: Aum Desai <Aum08Desai@users.noreply.github.com>
Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>
2026-03-11 06:02:18 -07:00
teknium1
2e1aa1b424 docs: add iteration budget pressure section to configuration guide
Documents the two-tier budget warning system from PR #762:
- Explains caution (70%) and warning (90%) thresholds
- Table showing what the model sees at each tier
- Notes on how injection preserves prompt caching
- Links to max_turns config
2026-03-11 00:40:44 -07:00
teknium1
c5321298ce docs: add quick commands documentation
Documents the quick_commands config feature from PR #746:
- configuration.md: full section with examples (server status, disk,
  gpu, update), behavior notes (timeout, priority, works everywhere)
- cli.md: brief section with config example + link to config guide
2026-03-11 00:28:52 -07:00
Erosika
74c214e957 feat(honcho): async memory integration with prefetch pipeline and recallMode
Adds full Honcho memory integration to Hermes:

- Session manager with async background writes, memory modes (honcho/hybrid/local),
  and dialectic prefetch for first-turn context warming
- Agent integration: prefetch pipeline, tool surface gated by recallMode,
  system prompt context injection, SIGTERM/SIGINT flush handlers
- CLI commands: setup, status, mode, tokens, peer, identity, migrate
- recallMode setting (auto | context | tools) for A/B testing retrieval strategies
- Session strategies: per-session, per-repo (git tree root), per-directory, global
- Polymorphic memoryMode config: string shorthand or per-peer object overrides
- 97 tests covering async writes, client config, session resolution, and memory modes
2026-03-10 16:21:07 -04:00
teknium1
c1775de56f feat: filesystem checkpoints and /rollback command
Automatic filesystem snapshots before destructive file operations,
with user-facing rollback.  Inspired by PR #559 (by @alireza78a).

Architecture:
- Shadow git repos at ~/.hermes/checkpoints/{hash}/ via GIT_DIR
- CheckpointManager: take/list/restore, turn-scoped dedup, pruning
- Transparent — the LLM never sees it, no tool schema, no tokens
- Once per turn — only first write_file/patch triggers a snapshot

Integration:
- Config: checkpoints.enabled + checkpoints.max_snapshots
- CLI flag: hermes --checkpoints
- Trigger: run_agent.py _execute_tool_calls() before write_file/patch
- /rollback slash command in CLI + gateway (list, restore by number)
- Pre-rollback snapshot auto-created on restore (undo the undo)

Safety:
- Never blocks file operations — all errors silently logged
- Skips root dir, home dir, dirs >50K files
- Disables gracefully when git not installed
- Shadow repo completely isolated from project git

Tests: 35 new tests, all passing (2798 total suite)
Docs: feature page, config reference, CLI commands reference
2026-03-10 00:49:15 -07:00
teknium1
fa2e72ae9c docs: document docker_volumes config for shared host directories
The Docker backend already supports user-configured volume mounts via
docker_volumes, but it was undocumented — missing from DEFAULT_CONFIG,
cli.py defaults, and configuration docs.

Changes:
- hermes_cli/config.py: Add docker_volumes to DEFAULT_CONFIG with
  inline documentation and examples
- cli.py: Add docker_volumes to load_cli_config defaults
- configuration.md: Full Docker Volume Mounts section with YAML
  examples, use cases (providing files, receiving outputs, shared
  workspaces), and env var alternative
2026-03-09 15:29:34 -07:00
teknium1
3ffaac00dd feat: bell_on_complete — terminal bell when agent finishes
Adds a simple config option to play the terminal bell (\a) when the
agent finishes a response. Useful for long-running tasks — switch to
another window and your terminal will ding when done.

Works over SSH since the bell character propagates through the
connection. Most terminal emulators can be configured to flash the
taskbar, play a sound, or show a visual indicator on bell.

Config (default: off):
  display:
    bell_on_complete: true

Closes #318
2026-03-08 21:30:48 -07:00
teknium1
a8bf414f4a feat: browser console/errors tool, annotated screenshots, auto-recording, and dogfood QA skill
New browser capabilities and a built-in skill for agent-driven web QA.

## New tool: browser_console

Returns console messages (log/warn/error/info) AND uncaught JavaScript
exceptions in a single call. Uses agent-browser's 'console' and 'errors'
commands through the existing session plumbing. Supports --clear to reset
buffers. Verified working in both local and Browserbase cloud modes.

## Enhanced tool: browser_vision(annotate=True)

New boolean parameter on browser_vision. When true, agent-browser overlays
numbered [N] labels on interactive elements — each [N] maps to ref @eN.
Annotation data (element name, role, bounding box) returned alongside the
vision analysis. Useful for QA reports and spatial reasoning.

## Config: browser.record_sessions

Auto-record browser sessions as WebM video files when enabled:
- Starts recording on first browser_navigate
- Stops and saves on browser_close
- Saves to ~/.hermes/browser_recordings/
- Works in both local and cloud modes (verified)
- Disabled by default

## Built-in skill: dogfood

Systematic exploratory QA testing for web applications. Teaches the agent
a 5-phase workflow:
1. Plan — accept URL, create output dirs, set scope
2. Explore — systematic crawl with annotated screenshots
3. Collect Evidence — screenshots, console errors, JS exceptions
4. Categorize — severity (Critical/High/Medium/Low) and category
   (Functional/Visual/Accessibility/Console/UX/Content)
5. Report — structured markdown with per-issue evidence

Includes:
- skills/dogfood/SKILL.md — full workflow instructions
- skills/dogfood/references/issue-taxonomy.md — severity/category defs
- skills/dogfood/templates/dogfood-report-template.md — report template

## Tests

21 new tests covering:
- browser_console message/error parsing, clear flag, empty/failed states
- browser_console schema registration
- browser_vision annotate schema and flag passing
- record_sessions config defaults and recording lifecycle
- Dogfood skill file existence and content validation

Addresses #315.
2026-03-08 21:28:12 -07:00
teknium1
2d1a1c1c47 refactor: remove redundant 'openai' auxiliary provider, clean up docs
The 'openai' provider was redundant — using OPENAI_BASE_URL +
OPENAI_API_KEY with provider: 'main' already covers direct OpenAI API.

Provider options are now: auto, openrouter, nous, codex, main.

- Removed _try_openai(), _OPENAI_AUX_MODEL, _OPENAI_BASE_URL
- Replaced openai tests with codex provider tests
- Updated all docs to remove 'openai' option and clarify 'main'
- 'main' description now explicitly mentions it works with OpenAI API,
  local models, and any OpenAI-compatible endpoint

Tests: 2467 passed.
2026-03-08 18:50:26 -07:00
teknium1
71e81728ac feat: Codex OAuth vision support + multimodal content adapter
The Codex Responses API (chatgpt.com/backend-api/codex) supports
vision via gpt-5.3-codex. This was verified with real API calls
using image analysis.

Changes to _CodexCompletionsAdapter:
- Added _convert_content_for_responses() to translate chat.completions
  multimodal format to Responses API format:
  - {type: 'text'} → {type: 'input_text'}
  - {type: 'image_url', image_url: {url: '...'}} → {type: 'input_image', image_url: '...'}
- Fixed: removed 'stream' from resp_kwargs (responses.stream() handles it)
- Fixed: removed max_output_tokens and temperature (Codex endpoint rejects them)

Provider changes:
- Added 'codex' as explicit auxiliary provider option
- Vision auto-fallback now includes Codex (OpenRouter → Nous → Codex)
  since gpt-5.3-codex supports multimodal input
- Updated docs with Codex OAuth examples

Tested with real Codex OAuth token + ~/.hermes/image2.png — confirmed
working end-to-end through the full adapter pipeline.

Tests: 2459 passed.
2026-03-08 18:44:33 -07:00
teknium1
ae4a674c84 feat: add 'openai' as auxiliary provider option
Users can now set provider: "openai" for auxiliary tasks (vision, web
extract, compression) to use OpenAI's API directly with their
OPENAI_API_KEY. This hits api.openai.com/v1 with gpt-4o-mini as the
default model — supports vision since GPT-4o handles image input.

Provider options are now: auto, openrouter, nous, openai, main.

Changes:
- agent/auxiliary_client.py: added _try_openai(), "openai" case in
  _resolve_forced_provider(), updated auxiliary_max_tokens_param()
  to use max_completion_tokens for OpenAI
- Updated docs: cli-config.yaml.example, AGENTS.md, and user-facing
  configuration.md with Common Setups section showing OpenAI,
  OpenRouter, and local model examples
- 3 new tests for OpenAI provider resolution

Tests: 2459 passed (was 2429).
2026-03-08 18:25:30 -07:00
teknium1
169615abc8 docs: add Auxiliary Models section to user-facing configuration docs
Adds clear how-to documentation for changing the vision model, web
extraction model, and compression model to the user-facing docs site
(website/docs/user-guide/configuration.md).

Includes:
- Full auxiliary config.yaml example
- 'Changing the Vision Model' walkthrough with config + env var options
- Provider options table (auto/openrouter/nous/main)
- Multimodal safety warning for vision
- Environment variable reference table
- Updated the warning about OpenRouter-dependent tools to mention
  auxiliary model configuration
2026-03-08 18:10:55 -07:00
teknium1
cf63b2471f docs: add resume history display to sessions, CLI, config, and AGENTS docs
- sessions.md: New 'Conversation Recap on Resume' subsection with visual
  example, feature bullet points, and config snippet
- cli.md: New 'Session Resume Display' subsection with cross-reference
- configuration.md: Add resume_display to display settings YAML block
- AGENTS.md: Add _preload_resumed_session() and _display_resumed_history()
  to key components, add UX note about resume panel
2026-03-08 17:55:14 -07:00
teknium1
4be783446a fix: wire worktree flag into hermes CLI entry point + docs + tests
Critical fixes:
- Add --worktree/-w to hermes_cli/main.py argparse (both chat
  subcommand and top-level parser) so 'hermes -w' works via the
  actual CLI entry point, not just 'python cli.py -w'
- Pass worktree flag through cmd_chat() kwargs to cli_main()
- Handle worktree attr in bare 'hermes' and --resume/--continue paths

Bug fixes in cli.py:
- Skip worktree creation for --list-tools/--list-toolsets (wasteful)
- Wrap git worktree subprocess.run in try/except (crash on timeout)
- Add stale worktree pruning on startup (_prune_stale_worktrees):
  removes clean worktrees older than 24h left by crashed/killed sessions

Documentation updates:
- AGENTS.md: add --worktree to CLI commands table
- cli-config.yaml.example: add worktree config section
- website/docs/reference/cli-commands.md: add to core commands
- website/docs/user-guide/cli.md: add usage examples
- website/docs/user-guide/configuration.md: add config docs

Test improvements (17 → 31 tests):
- Stale worktree pruning (prune old clean, keep recent, keep dirty)
- Directory symlink via .worktreeinclude
- Edge cases (no commits, not a repo, pre-existing .worktrees/)
- CLI flag/config OR logic
- TERMINAL_CWD integration
- System prompt injection format
2026-03-07 21:05:40 -08:00
teknium1
b84f9e410c feat: default reasoning effort from xhigh to medium
Reduces token usage and latency for most tasks by defaulting to
medium reasoning effort instead of xhigh. Users can still override
via config or CLI flag. Updates code, tests, example config, and docs.
2026-03-07 10:14:19 -08:00
teknium1
388dd4789c feat: add z.ai/GLM, Kimi/Moonshot, MiniMax as first-class providers
Adds 4 new direct API-key providers (zai, kimi-coding, minimax, minimax-cn)
to the inference provider system. All use standard OpenAI-compatible
chat/completions endpoints with Bearer token auth.

Core changes:
- auth.py: Extended ProviderConfig with api_key_env_vars and base_url_env_var
  fields. Added providers to PROVIDER_REGISTRY. Added provider aliases
  (glm, z-ai, zhipu, kimi, moonshot). Added auto-detection of API-key
  providers in resolve_provider(). Added resolve_api_key_provider_credentials()
  and get_api_key_provider_status() helpers.
- runtime_provider.py: Added generic API-key provider branch in
  resolve_runtime_provider() — any provider with auth_type='api_key'
  is automatically handled.
- main.py: Added providers to hermes model menu with generic
  _model_flow_api_key_provider() flow. Updated _has_any_provider_configured()
  to check all provider env vars. Updated argparse --provider choices.
- setup.py: Added providers to setup wizard with API key prompts and
  curated model lists.
- config.py: Added env vars (GLM_API_KEY, KIMI_API_KEY, MINIMAX_API_KEY,
  etc.) to OPTIONAL_ENV_VARS.
- status.py: Added API key display and provider status section.
- doctor.py: Added connectivity checks for each provider endpoint.
- cli.py: Updated provider docstrings.

Docs: Updated README.md, .env.example, cli-config.yaml.example,
cli-commands.md, environment-variables.md, configuration.md.

Tests: 50 new tests covering registry, aliases, resolution, auto-detection,
credential resolution, and runtime provider dispatch.

Inspired by PR #33 (numman-ali) which proposed a provider registry approach.
Credit to tars90percent (PR #473) and manuelschipper (PR #420) for related
provider improvements merged earlier in this changeset.
2026-03-06 18:55:18 -08:00
teknium1
68fbae5692 docs: add Custom & Self-Hosted LLM Providers guide
Comprehensive guide for using Hermes Agent with alternative LLM backends:
- Ollama (local models, zero config)
- vLLM (high-performance GPU inference)
- SGLang (RadixAttention, prefix caching)
- llama.cpp / llama-server (CPU & Metal inference)
- LiteLLM Proxy (multi-provider gateway)
- ClawRouter (cost-optimized routing with complexity scoring)
- 10+ other compatible providers table (Together, Groq, DeepSeek, etc.)
- Choosing the Right Setup decision table
- General custom endpoint setup instructions

All of these work via the existing OPENAI_BASE_URL + OPENAI_API_KEY
custom endpoint support — no code changes needed.
2026-03-06 14:16:06 -08:00
teknium1
39299e2de4 Merge PR #451: feat: Add Daytona environment backend
Authored by rovle. Adds Daytona as the sixth terminal execution backend
with cloud sandboxes, persistent workspaces, and full CLI/gateway integration.
Includes 24 unit tests and 8 integration tests.
2026-03-06 03:32:40 -08:00
teknium1
363633e2ba fix: allow self-hosted Firecrawl without API key + add self-hosting docs
On top of PR #460: self-hosted Firecrawl instances don't require an API
key (USE_DB_AUTHENTICATION=false), so don't force users to set a dummy
FIRECRAWL_API_KEY when FIRECRAWL_API_URL is set. Also adds a proper
self-hosting section to the configuration docs explaining what you get,
what you lose, and how to set it up (Docker stack, tradeoffs vs cloud).

Added 2 more tests (URL-only without key, neither-set raises).
2026-03-05 16:44:21 -08:00
caentzminger
d7d10b14cd feat(tools): add support for self-hosted firecrawl
Adds optional FIRECRAWL_API_URL environment variable to support
self-hosted Firecrawl deployments alongside the cloud service.

- Add FIRECRAWL_API_URL to optional env vars in hermes_cli/config.py
- Update _get_firecrawl_client() in tools/web_tools.py to accept custom API URL
- Add tests for client initialization with/without URL
- Document new env var in installation and config guides
2026-03-05 16:16:18 -06:00
rovle
74a36b0729 docs: add Daytona to backend lists in docs
Signed-off-by: rovle <lovre.pesut@gmail.com>
2026-03-05 11:55:41 -08:00
teknium1
19016497ef docs: fix all remaining minor accuracy issues
- updating.md: Note that 'hermes update' auto-handles config migration
- cli.md: Add summary_model to compression config, fix display config
  (add personality/compact), remove unverified pastes/ claim
- configuration.md: Add 5 missing config sections (stt, human_delay,
  code_execution, delegation, clarify), fix display defaults,
  fix reasoning_effort default to empty/unset
- messaging/index.md: Add GATEWAY_ALLOWED_USERS to security section
- skills.md: Add category field to skills_list return value
- mcp.md: Document auto-registered utility tools (resources/prompts)
- architecture.md: Fix file_tools.py reference, base_url default to None,
  synchronous agent loop pseudocode
- cli-commands.md: Fix hermes logout description
- environment-variables.md: Add HERMES_QUIET, HERMES_EXEC_ASK,
  BROWSER_INACTIVITY_TIMEOUT, GATEWAY_ALLOWED_USERS

Verification scan: 27/27 checks passed, zero issues remaining.
2026-03-05 07:00:51 -08:00