Commit Graph

693 Commits

Author SHA1 Message Date
Teknium
143b74ec00 fix: first-run guard stuck in loop when provider configured via config.yaml (#4298)
The _has_any_provider_configured() guard only checked env vars, .env file,
and auth.json — missing config.yaml model.provider/base_url/api_key entirely.
Users who configured a provider through setup (saving to config.yaml) but had
empty API key placeholders in .env from the install template were permanently
blocked by the 'not configured' message.

Changes:
- _has_any_provider_configured() now checks config.yaml model section for
  explicit provider, base_url, or api_key — covers custom endpoints and
  providers that store credentials in config rather than env vars
- .env.example: comment out all empty API key placeholders so they don't
  pollute the environment when copied to .env by the installer
- .env.example: mark LLM_MODEL as deprecated (config.yaml is source of truth)
- 4 new tests for the config.yaml detection path

Reported by OkadoOP on Discord.
2026-03-31 11:42:52 -07:00
Dakota Secula-Rosell
c1606aed69 fix(cli): allow empty strings and falsy values in config set
`hermes config set KEY ""` and `hermes config set KEY 0` were rejected
because the guard used `not value` which is truthy for empty strings,
zero, and False. Changed to `value is None` so only truly missing
arguments are rejected.

Closes #4277

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 11:41:12 -07:00
Teknium
344239c2db feat: auto-detect models from server probe in custom endpoint setup (#4218)
Custom endpoint setup (_model_flow_custom) now probes the server first
and presents detected models instead of asking users to type blind:

- Single model: auto-confirms with Y/n prompt
- Multiple models: numbered list picker, or type a name
- No models / probe failed: falls back to manual input

Context length prompt also moved after model selection so the user sees
the verified endpoint before being asked for details.

All recent fixes preserved: config dict sync (#4172), api_key
persistence (#4182), no save_env_value for URLs (#4165).

Inspired by PR #4194 by sudoingX — re-implemented against current main.

Co-authored-by: Xpress AI (Dip KD) <200180104+sudoingX@users.noreply.github.com>
2026-03-31 03:29:00 -07:00
Teknium
8d59881a62 feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647)
* feat(auth): add same-provider credential pools and rotation UX

Add same-provider credential pooling so Hermes can rotate across
multiple credentials for a single provider, recover from exhausted
credentials without jumping providers immediately, and configure
that behavior directly in hermes setup.

- agent/credential_pool.py: persisted per-provider credential pools
- hermes auth add/list/remove/reset CLI commands
- 429/402/401 recovery with pool rotation in run_agent.py
- Setup wizard integration for pool strategy configuration
- Auto-seeding from env vars and existing OAuth state

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Salvaged from PR #2647

* fix(tests): prevent pool auto-seeding from host env in credential pool tests

Tests for non-pool Anthropic paths and auth remove were failing when
host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials
were present. The pool auto-seeding picked these up, causing unexpected
pool entries in tests.

- Mock _select_pool_entry in auxiliary_client OAuth flag tests
- Clear Anthropic env vars and mock _seed_from_singletons in auth remove test

* feat(auth): add thread safety, least_used strategy, and request counting

- Add threading.Lock to CredentialPool for gateway thread safety
  (concurrent requests from multiple gateway sessions could race on
  pool state mutations without this)
- Add 'least_used' rotation strategy that selects the credential
  with the lowest request_count, distributing load more evenly
- Add request_count field to PooledCredential for usage tracking
- Add mark_used() method to increment per-credential request counts
- Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current()
  with lock acquisition
- Add tests: least_used selection, mark_used counting, concurrent
  thread safety (4 threads × 20 selects with no corruption)

* feat(auth): add interactive mode for bare 'hermes auth' command

When 'hermes auth' is called without a subcommand, it now launches an
interactive wizard that:

1. Shows full credential pool status across all providers
2. Offers a menu: add, remove, reset cooldowns, set strategy
3. For OAuth-capable providers (anthropic, nous, openai-codex), the
   add flow explicitly asks 'API key or OAuth login?' — making it
   clear that both auth types are supported for the same provider
4. Strategy picker shows all 4 options (fill_first, round_robin,
   least_used, random) with the current selection marked
5. Remove flow shows entries with indices for easy selection

The subcommand paths (hermes auth add/list/remove/reset) still work
exactly as before for scripted/non-interactive use.

* fix(tests): update runtime_provider tests for config.yaml source of truth (#4165)

Tests were using OPENAI_BASE_URL env var which is no longer consulted
after #4165. Updated to use model config (provider, base_url, api_key)
which is the new single source of truth for custom endpoint URLs.

* feat(auth): support custom endpoint credential pools keyed by provider name

Custom OpenAI-compatible endpoints all share provider='custom', making
the provider-keyed pool useless. Now pools for custom endpoints are
keyed by 'custom:<normalized_name>' where the name comes from the
custom_providers config list (auto-generated from URL hostname).

- Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)'
- load_pool('custom:name') seeds from custom_providers api_key AND
  model.api_key when base_url matches
- hermes auth add/list now shows custom endpoints alongside registry
  providers
- _resolve_openrouter_runtime and _resolve_named_custom_runtime check
  pool before falling back to single config key
- 6 new tests covering custom pool keying, seeding, and listing

* docs: add Excalidraw diagram of full credential pool flow

Comprehensive architecture diagram showing:
- Credential sources (env vars, auth.json OAuth, config.yaml, CLI)
- Pool storage and auto-seeding
- Runtime resolution paths (registry, custom, OpenRouter)
- Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh)
- CLI management commands and strategy configuration

Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g

* fix(tests): update setup wizard pool tests for unified select_provider_and_model flow

The setup wizard now delegates to select_provider_and_model() instead
of using its own prompt_choice-based provider picker. Tests needed:
- Mock select_provider_and_model as no-op (provider pre-written to config)
- Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it)
- Pre-write model.provider to config so the pool step is reached

* docs: add comprehensive credential pool documentation

- New page: website/docs/user-guide/features/credential-pools.md
  Full guide covering quick start, CLI commands, rotation strategies,
  error recovery, custom endpoint pools, auto-discovery, thread safety,
  architecture, and storage format.
- Updated fallback-providers.md to reference credential pools as the
  first layer of resilience (same-provider rotation before cross-provider)
- Added hermes auth to CLI commands reference with usage examples
- Added credential_pool_strategies to configuration guide

* chore: remove excalidraw diagram from repo (external link only)

* refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns

- _load_config_safe(): replace 4 identical try/except/import blocks
- _iter_custom_providers(): shared generator for custom provider iteration
- PooledCredential.extra dict: collapse 11 round-trip-only fields
  (token_type, scope, client_id, portal_base_url, obtained_at,
  expires_in, agent_key_id, agent_key_expires_in, agent_key_reused,
  agent_key_obtained_at, tls) into a single extra dict with
  __getattr__ for backward-compatible access
- _available_entries(): shared exhaustion-check between select and peek
- Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical)
- SimpleNamespace replaces class _Args boilerplate in auth_commands
- _try_resolve_from_custom_pool(): shared pool-check in runtime_provider

Net -17 lines. All 383 targeted tests pass.

---------

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-03-31 03:10:01 -07:00
Teknium
2ae50bdddd fix(telegram): enforce 32-char limit on command names with collision avoidance (#4211)
Telegram Bot API requires command names to be 1-32 characters. Plugin
and skill names that exceed this limit now get truncated. If truncation
creates a collision (with core commands, other plugins, or other skills),
the name is shortened to 31 chars and a digit 0-9 is appended.

Adds _clamp_telegram_names() helper used for both plugin and skill
entries in telegram_menu_commands(). Core CommandDef commands are tracked
as reserved names so truncated plugin/skill names never shadow them.

Addresses the fix from PR #4191 (sroecker) with collision-safe truncation.

Tests: 9 new tests covering truncation, digit suffixes, exhaustion, dedup.
2026-03-31 02:41:50 -07:00
Nils
50302ed70a fix(tools): make browser SSRF check configurable via browser.allow_private_urls (#4198)
* fix(tools): skip SSRF check in local browser mode

The SSRF protection added in #3041 blocks all private/internal
addresses unconditionally in browser_navigate(). This prevents
legitimate local development use cases (localhost testing, LAN
device access) when using the local Chromium backend.

The SSRF check is only meaningful for cloud browsers (Browserbase,
BrowserUse) where the agent could reach internal resources on a
remote machine. In local mode, the user already has full terminal
and network access, so the check adds no security value.

This change makes the SSRF check conditional on _get_cloud_provider(),
keeping full protection in cloud mode while allowing private addresses
in local mode.

* fix(tools): make SSRF check configurable via browser.allow_private_urls

Replace unconditional SSRF check with a configurable setting.
Default (False) keeps existing security behavior. Setting to True
allows navigating to private/internal IPs for local dev and LAN use cases.

---------

Co-authored-by: Nils (Norya) <nils@begou.dev>
2026-03-31 02:11:55 -07:00
Teknium
086ec5590d fix: gate Claude Code credentials behind explicit Hermes config in wizard trigger (#4210)
If a user has Claude Code installed but never configured Hermes, the
first-run guard found those external credentials and skipped the setup
wizard. Users got silently routed to someone else's inference without
being asked.

Now _has_any_provider_configured() checks whether Hermes itself has been
explicitly configured (model in config differs from hardcoded default)
before counting Claude Code credentials. Fresh installs trigger the
wizard regardless of what external tools are on the machine.

Salvaged from PR #4194 by sudoingX — wizard trigger fix only.
Model auto-detect change under separate review.

Co-authored-by: Xpress AI (Dip KD) <200180104+sudoingX@users.noreply.github.com>
2026-03-31 02:01:15 -07:00
Teknium
c53a296df1 feat: add MiniMax M2.7 to hermes model picker and opencode-go (#4208)
Add MiniMax-M2.7 and M2.7-highspeed to _PROVIDER_MODELS for minimax
and minimax-cn providers in main.py so hermes model shows them.
Update opencode-go bare ID from m2.5 to m2.7 in models.py.

Salvaged from PR #4197 by octo-patch.
2026-03-31 01:54:13 -07:00
Teknium
1bca6f3930 fix: save API key to model config for custom endpoints (#4182)
Custom cloud endpoints (Together.ai, RunPod, Groq, etc.) lost their
API key after #4165 removed OPENAI_API_KEY .env saves.  The key was
only saved to the custom_providers list which is unreachable at
runtime for plain 'custom' provider resolution.

Save model.api_key to config.yaml alongside model.provider and
model.base_url in all three custom endpoint code paths:
- _model_flow_custom (new endpoint with model name)
- _model_flow_custom (new endpoint without model name)
- _model_flow_named_custom (switching to a saved endpoint)

The runtime resolver already reads model.api_key (runtime_provider.py
line 224-228), so the key is picked up automatically.  Each custom
endpoint carries its own key in config — no shared OPENAI_API_KEY
env var needed.
2026-03-31 01:36:15 -07:00
Teknium
ff78ad4c81 feat: add discord.reactions config option to disable message reactions (#4199)
Adds a 'reactions' key under the discord config section (default: true).
When set to false, the bot no longer adds 👀// reactions to messages
during processing. The config maps to DISCORD_REACTIONS env var following
the same pattern as require_mention and auto_thread.

Files changed:
- hermes_cli/config.py: Add reactions default to DEFAULT_CONFIG
- gateway/config.py: Map discord.reactions to DISCORD_REACTIONS env var
- gateway/platforms/discord.py: Gate on_processing_start/complete hooks
- tests/gateway/test_discord_reactions.py: 3 new tests for config gate
2026-03-31 01:24:48 -07:00
Teknium
491e79bca9 refactor: unify setup wizard provider selection with hermes model
setup_model_provider() had 800+ lines of duplicated provider handling
that reimplemented the same credential prompting, OAuth flows, and model
selection that hermes model already provides via the _model_flow_*
functions.  Every new provider had to be added in both places, and the
two implementations diverged in config persistence (setup.py did raw
YAML writes, _set_model_provider, and _update_config_for_provider
depending on the provider — main.py used its own load/save cycle).

This caused the #4172 bug: _model_flow_custom saved config to disk but
the wizard's final save_config(config) overwrote it with stale values.

Fix: extract the core of cmd_model() into select_provider_and_model()
and have setup_model_provider() call it.  After the call, re-sync the
wizard's config dict from disk.  Deletes ~800 lines of duplicated
provider handling from setup.py.

Also fixes cmd_model() double-AuthError crash on fresh installs with
no API keys configured.
2026-03-31 01:04:07 -07:00
Teknium
89d8127772 fix: setup wizard overwrites custom endpoint config (#4172)
_model_flow_custom() saved model.provider and model.base_url to disk
via its own load_config/save_config cycle, but never updated the
setup wizard's in-memory config dict.  The wizard's final
save_config(config) then overwrote the custom settings with the
stale default string model value.

Fix: after saving to disk, also mutate the caller's config dict so
the wizard's final save preserves model.provider='custom' and the
base_url.  Both the model_name and no-model_name branches are
covered.

Added regression tests that simulate the full wizard flow including
the final save_config(config) call — the step that was previously
untested.
2026-03-30 23:17:26 -07:00
Teknium
f890a94c12 refactor: make config.yaml the single source of truth for endpoint URLs (#4165)
OPENAI_BASE_URL was written to .env AND config.yaml, creating a dual-source
confusion. Users (especially Docker) would see the URL in .env and assume
that's where all config lives, then wonder why LLM_MODEL in .env didn't work.

Changes:
- Remove all 27 save_env_value("OPENAI_BASE_URL", ...) calls across main.py,
  setup.py, and tools_config.py
- Remove OPENAI_BASE_URL env var reading from runtime_provider.py, cli.py,
  models.py, and gateway/run.py
- Remove LLM_MODEL/HERMES_MODEL env var reading from gateway/run.py and
  auxiliary_client.py — config.yaml model.default is authoritative
- Vision base URL now saved to config.yaml auxiliary.vision.base_url
  (both setup wizard and tools_config paths)
- Tests updated to set config values instead of env vars

Convention enforced: .env is for SECRETS only (API keys). All other
configuration (model names, base URLs, provider selection) lives
exclusively in config.yaml.
2026-03-30 22:02:53 -07:00
Teknium
1bd206ea5d feat: add /btw command for ephemeral side questions (#4161)
Adds /btw <question> — ask a quick follow-up using the current
session context without interrupting the main conversation.

- Snapshots conversation history, answers with a no-tools agent
- Response is not persisted to session history or DB
- Runs in a background thread (CLI) / async task (gateway)
- Per-session guard prevents concurrent /btw in gateway

Implementation:
- model_tools.py: enabled_toolsets=[] now correctly means "no tools"
  (was falsy, fell through to default "all tools")
- run_agent.py: persist_session=False gates _persist_session()
- cli.py: _handle_btw_command (background thread, Rich panel output)
- gateway/run.py: _handle_btw_command + _run_btw_task (async task)
- hermes_cli/commands.py: CommandDef for "btw"

Inspired by PR #3504 by areu01or00, reimplemented cleanly on current
main with the enabled_toolsets=[] fix and without the __btw_no_tools__
hack.
2026-03-30 21:10:05 -07:00
Teknium
f8e1ee10aa Fix profile list model display (#4160)
Co-authored-by: txhno <roshwarrier@gmail.com>
2026-03-30 20:40:13 -07:00
Teknium
fb4b87f4af chore: add claude-sonnet-4.6 to OpenRouter and Nous model lists (#4157) 2026-03-30 20:33:21 -07:00
Teknium
83e5249be6 fix(gateway): use setsid instead of systemd-run --user for /update (salvage #4024) (#4104)
Salvaged from PR #4024 by @Sertug17. Fixes #4017.

- Replace systemd-run --user --scope with setsid for portable session detach
- Add system-level service detection to cmd_update gateway restart
- Falls back to start_new_session=True on systems without setsid (macOS, minimal containers)
2026-03-30 20:22:09 -07:00
Teknium
45396aaa92 fix(alibaba): use standard DashScope international endpoint (#4133)
* fix(alibaba): use standard DashScope international endpoint

The Alibaba Cloud provider was hardcoded to the coding-intl endpoint
(https://coding-intl.dashscope.aliyuncs.com/v1) which only accepts
Alibaba Coding Plan API keys.

Standard DashScope API keys fail with invalid_api_key error against
this endpoint. Changed to the international compatible-mode endpoint
(https://dashscope-intl.aliyuncs.com/compatible-mode/v1) which works
with standard DashScope keys.

Users with Coding Plan keys or China-region keys can still override
via DASHSCOPE_BASE_URL or config.yaml base_url.

Fixes #3912

* fix: update test to match new DashScope default endpoint

---------

Co-authored-by: kagura-agent <kagura.chen28@gmail.com>
2026-03-30 19:06:30 -07:00
Teknium
04367e2fac fix(cron): stop truncating job IDs in list view (#4132)
Remove [:8] truncation from hermes cron list output. Job IDs are 12
hex chars — truncating to 8 makes them unusable for cron run/pause/remove
which require the full ID.

Co-authored-by: vitobotta <vitobotta@users.noreply.github.com>
2026-03-30 19:05:34 -07:00
Teknium
720507efac feat: add post-migration cleanup for OpenClaw directories (#4100)
After migrating from OpenClaw, leftover workspace directories contain
state files (todo.json, sessions, logs) that confuse the agent — it
discovers them and reads/writes to stale locations instead of the
Hermes state directory, causing issues like cron jobs reading a
different todo list than interactive sessions.

Changes:
- hermes claw migrate now offers to archive the source directory after
  successful migration (rename to .pre-migration, not delete)
- New `hermes claw cleanup` subcommand for users who already migrated
  and need to archive leftover OpenClaw directories
- Migration notes updated with explicit cleanup guidance
- 42 tests covering all new functionality

Reported by SteveSkedasticity — multiple todo.json files across
~/.hermes/, ~/.openclaw/workspace/, and ~/.openclaw/workspace-assistant/
caused cron jobs to read from wrong locations.
2026-03-30 17:39:08 -07:00
Teknium
e64b047663 chore: prepare Hermes for Homebrew packaging (#4099)
Co-authored-by: Yabuku-xD <78594762+Yabuku-xD@users.noreply.github.com>
2026-03-30 17:34:43 -07:00
SHL0MS
3c8f910973 feat: respect NO_COLOR env var and TERM=dumb (#4079)
Add should_use_color() function to hermes_cli/colors.py that checks
NO_COLOR (https://no-color.org/) and TERM=dumb before emitting ANSI
escapes. The existing color() helper now uses this function instead
of a bare isatty() check.

This is the foundation — cli.py and banner.py still have inline ANSI
constants that bypass this module (tracked in #4071).

Closes #4066

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
2026-03-30 17:07:21 -07:00
Teknium
de368cac54 fix(tools): show browser and TTS in reconfigure menu (#4041)
* fix(gateway): honor default for invalid bool-like config values

* refactor: simplify web backend priority detection

Replace cascading boolean conditions with a priority-ordered loop.
Same behavior (verified against all 16 env var combinations),
half the lines, trivially extensible for new backends.

* fix(tools): show browser and TTS in reconfigure menu

_toolset_has_keys() returned False for toolsets with no-key providers
(Local Browser, Edge TTS) because it only checked providers with
env_vars. Users couldn't find these tools in the reconfigure list
and had no obvious way to switch browser/TTS backends.

Now treats providers with empty env_vars as always-configured, so
toolsets with free/local options always appear in the reconfigure menu.

---------

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-30 14:11:39 -07:00
Teknium
f93637b3a1 feat: add /profile slash command to show active profile (#4027)
Adds /profile to COMMAND_REGISTRY (Info category) with handlers in
both CLI and gateway. Shows the active profile name and home directory.

Works on all platforms — CLI, Telegram, Discord, Slack, etc.
Detects profile by checking if HERMES_HOME is under ~/.hermes/profiles/.
Shows 'default' when running without a profile.
2026-03-30 13:20:06 -07:00
Teknium
950f69475f feat(browser): add Camofox local anti-detection browser backend (#4008)
Camofox-browser is a self-hosted Node.js server wrapping Camoufox
(Firefox fork with C++ fingerprint spoofing). When CAMOFOX_URL is set,
all 11 browser tools route through the Camofox REST API instead of
the agent-browser CLI.

Maps 1:1 to the existing browser tool interface:
- Navigate, snapshot, click, type, scroll, back, press, close
- Get images, vision (screenshot + LLM analysis)
- Console (returns empty with note — camofox limitation)

Setup: npm start in camofox-browser dir, or docker run -p 9377:9377
Then: CAMOFOX_URL=http://localhost:9377 in ~/.hermes/.env

Advantages over Browserbase (cloud):
- Free (no per-session API costs)
- Local (zero network latency for browser ops)
- Anti-detection at C++ level (bypasses Cloudflare/Google bot detection)
- Works offline, Docker-ready

Files:
- tools/browser_camofox.py: Full REST backend (~400 lines)
- tools/browser_tool.py: Routing at each tool function
- hermes_cli/config.py: CAMOFOX_URL env var entry
- tests/tools/test_browser_camofox.py: 20 tests
2026-03-30 13:18:42 -07:00
Teknium
158f49f19a fix: enforce priority order in Telegram menu — core > plugins > skills (#4023)
The menu now has explicit priority tiers:
1. Core CommandDef commands (always included, never bumped)
2. Plugin slash commands (take precedence over skills)
3. Built-in skill commands (fill remaining slots alphabetically)

Only skills get trimmed when the 100-command cap is hit. Adding new
core commands or plugin commands automatically pushes skills out,
not the other way around.
2026-03-30 13:04:06 -07:00
Teknium
60ecde8ac7 fix: fit all 100 commands in Telegram menu with 40-char descriptions (#4010)
* fix: truncate skill descriptions to 100 chars in Telegram menu

* fix: 40-char desc cap + 100 command limit for Telegram menu

setMyCommands has an undocumented total payload size limit.
50 commands with 256-char descriptions failed, 50 with 100-char
worked, and 100 with 40-char descriptions also works (~5300 total
chars). Truncate skill descriptions to 40 chars in the menu picker
and set cap back to 100. Full descriptions available via /commands.
2026-03-30 11:21:13 -07:00
Teknium
f3069c649c fix(cli): add missing subprocess.run() timeouts in doctor and status (#4009)
Add timeout parameters to 4 subprocess.run() calls that could hang
indefinitely if the child process blocks (e.g., unresponsive docker
daemon, systemctl waiting for D-Bus):

- doctor.py: docker info (timeout=10), ssh check (timeout=15)
- status.py: systemctl is-active (timeout=5), launchctl list (timeout=5)

Each call site now catches subprocess.TimeoutExpired and treats it as
a failure, consistent with how non-zero return codes are already handled.

Add AST-based regression test that verifies every subprocess.run() call
in CLI modules specifies a timeout keyword argument.

Co-authored-by: dieutx <dangtc94@gmail.com>
2026-03-30 11:17:15 -07:00
Teknium
0976bf6cd0 feat: add /yolo slash command to toggle dangerous command approvals (#3990)
Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping
all dangerous command approval prompts for the current session. Works in
both CLI and gateway (Telegram, Discord, etc.).

- /yolo -> ON: all commands auto-approved, no confirmation prompts
- /yolo -> OFF: normal approval flow restored

The --yolo CLI flag already existed for launch-time opt-in. This adds
the ability to toggle mid-session without restarting.

Session-scoped — resets when the process ends. Uses the existing
HERMES_YOLO_MODE env var that check_all_command_guards() already
respects.
2026-03-30 11:17:09 -07:00
Teknium
9fd78c7a8e fix: use SKILLS_DIR not repo path for Telegram menu skill filter (#4005)
Skills are synced to ~/.hermes/skills/ (SKILLS_DIR), not the repo's
skills/ directory. The previous filter compared against the repo path
so no skills matched. Now checks SKILLS_DIR and excludes .hub/
subdirectory (user-installed hub skills).
2026-03-30 11:01:13 -07:00
Teknium
5ceed021dc feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap (#3934)
* feat(gateway): skill-aware slash commands, paginated /commands, Telegram 100-cap

Map active skills to Telegram's slash command menu so users can
discover and invoke skills directly. Three changes:

1. Telegram menu now includes active skill commands alongside built-in
   commands, capped at 100 entries (Telegram Bot API limit). Overflow
   commands remain callable but hidden from the picker. Logged at
   startup when cap is hit.

2. New /commands [page] gateway command for paginated browsing of all
   commands + skills. /help now shows first 10 skill commands and
   points to /commands for the full list.

3. When a user types a slash command that matches a disabled or
   uninstalled skill, they get actionable guidance:
   - Disabled: 'Enable it with: hermes skills config'
   - Optional (not installed): 'Install with: hermes skills install official/<path>'

Built on ideas from PR #3921 by @kshitijk4poor.

* chore: move 21 niche skills to optional-skills

Move specialized/niche skills from built-in (skills/) to optional
(optional-skills/) to reduce the default skill count. Users can
install them with: hermes skills install official/<category>/<name>

Moved skills (21):
- mlops: accelerate, chroma, faiss, flash-attention,
  hermes-atropos-environments, huggingface-tokenizers, instructor,
  lambda-labs, llava, nemo-curator, pinecone, pytorch-lightning,
  qdrant, saelens, simpo, slime, tensorrt-llm, torchtitan
- research: domain-intel, duckduckgo-search
- devops: inference-sh cli

Built-in skills: 96 → 75
Optional skills: 22 → 43

* fix: only include repo built-in skills in Telegram menu, not user-installed

User-installed skills (from hub or manually added) stay accessible via
/skills and by typing the command directly, but don't get registered
in the Telegram slash command picker. Only skills whose SKILL.md is
under the repo's skills/ directory are included in the menu.

This keeps the Telegram menu focused on the curated built-in set while
user-installed skills remain discoverable through /skills and /commands.
2026-03-30 10:57:30 -07:00
Teknium
37825189dd fix(skills): validate hub bundle paths before install (#3986)
Co-authored-by: Gutslabs <gutslabsxyz@gmail.com>
2026-03-30 08:37:19 -07:00
Teknium
e08778fa1e chore: release v0.6.0 (2026.3.30) (#3985) 2026-03-30 08:29:38 -07:00
Teknium
74181fe726 fix: add TTY guard to interactive CLI commands to prevent CPU spin (#3933)
When interactive TUI commands are invoked non-interactively (e.g. via
the agent's terminal() tool through a subprocess pipe), curses loops
spin at 100% CPU and input() calls hang indefinitely.

Defense in depth — two layers:

1. Source-level guard in curses_checklist() (curses_ui.py + checklist.py):
   Returns cancel_returns immediately when stdin is not a TTY. This
   catches ALL callers automatically, including future code.

2. Command-level guards with clear error messages:
   - hermes tools (interactive checklist, not list/disable/enable)
   - hermes setup (interactive wizard)
   - hermes model (provider/model picker)
   - hermes whatsapp (pairing setup)
   - hermes skills config (skill toggle)
   - hermes mcp configure (tool selection)
   - hermes uninstall (confirmation prompt)

Non-interactive subcommands (hermes tools list, hermes tools enable,
hermes mcp add/remove/list/test, hermes skills search/install/browse)
remain unaffected.
2026-03-30 08:10:23 -07:00
Teknium
b4496b33b5 fix: background task media delivery + vision download timeout (#3919)
* feat(telegram): add webhook mode as alternative to polling

When TELEGRAM_WEBHOOK_URL is set, the adapter starts an HTTP webhook
server (via python-telegram-bot's start_webhook()) instead of long
polling. This enables cloud platforms like Fly.io and Railway to
auto-wake suspended machines on inbound HTTP traffic.

Polling remains the default — no behavior change unless the env var
is set.

Env vars:
  TELEGRAM_WEBHOOK_URL    Public HTTPS URL for Telegram to push to
  TELEGRAM_WEBHOOK_PORT   Local listen port (default 8443)
  TELEGRAM_WEBHOOK_SECRET Secret token for update verification

Cherry-picked and adapted from PR #2022 by SHL0MS. Preserved all
current main enhancements (network error recovery, polling conflict
detection, DM topics setup).

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>

* fix: send_document call in background task delivery + vision download timeout

Two fixes salvaged from PR #2269 by amethystani:

1. gateway/run.py: adapter.send_file() → adapter.send_document()
   send_file() doesn't exist on BasePlatformAdapter. Background task
   media files were silently never delivered (AttributeError swallowed
   by except Exception: pass).

2. tools/vision_tools.py: configurable image download timeout via
   HERMES_VISION_DOWNLOAD_TIMEOUT env var (default 30s), plus guard
   against raise None when max_retries=0.

The third fix in #2269 (opencode-go auth config) was already resolved
on main.

Co-authored-by: amethystani <amethystani@users.noreply.github.com>

---------

Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>
Co-authored-by: amethystani <amethystani@users.noreply.github.com>
2026-03-30 02:59:39 -07:00
Wing Lian
efae525dc5 feat(plugins): add inject_message interface for remote message injection (#3778) 2026-03-30 02:48:06 -07:00
Teknium
947faed3bc feat(approvals): make dangerous command approval timeout configurable (#3886)
* feat(approvals): make dangerous command approval timeout configurable

Read `approvals.timeout` from config.yaml (default 60s) instead of
hardcoding 60 seconds in both the fallback CLI prompt and the TUI
prompt_toolkit callback.

Follows the same pattern as `clarify.timeout` which is already
configurable via CLI_CONFIG.

Closes #3765

* fix: add timeout default to approvals section in DEFAULT_CONFIG

---------

Co-authored-by: acsezen <asezen@icloud.com>
2026-03-30 00:02:02 -07:00
Teknium
09def65eff fix(migration): expand OpenClaw migration to cover full data footprint (#3869)
Cross-referenced the OpenClaw Zod schema and TypeScript source against
our migration script. Found and fixed:

Expanded data sources:
- Legacy config fallback: clawdbot.json, moldbot.json
- Legacy dir fallback: ~/.clawdbot/, ~/.moldbot/
- API keys from ~/.openclaw/.env and auth-profiles.json
- Personal skills from ~/.agents/skills/
- Project skills from workspace/.agents/skills/
- BOOTSTRAP.md archived (was silently skipped)
- Expanded env key allowlist: DEEPSEEK, GEMINI, ZAI, MINIMAX

Fixed wrong config paths (verified against Zod schema):
- humanDelay.enabled → humanDelay.mode (field doesn't exist as .enabled)
- agents.defaults.exec.timeout → tools.exec.timeoutSec (wrong path + name)
- messages.tts.elevenlabs.voiceId → messages.tts.providers.elevenlabs.voiceId
- session.resetTriggers (string[]) → session.reset (structured object)
- approvals.mode → approvals.exec.mode (no top-level mode)
- browser.inactivityTimeoutMs → doesn't exist; map cdpUrl+headless instead
- tools.webSearch.braveApiKey → tools.web.search.brave.apiKey
- tools.exec.timeout → tools.exec.timeoutSec

Added SecretRef resolution:
- All token/apiKey fields in OpenClaw can be strings, env templates
  (${VAR}), or SecretRef objects ({source:'env',id:'VAR'}). Added
  resolve_secret_input() to handle all three forms.

Fixed auth-profiles.json:
- Canonical field is 'key' not 'apiKey' (though alias accepted)
- File wraps entries in a 'profiles' key — now handled

Fixed TTS config:
- Provider settings at messages.tts.providers.{name} (not flat)
- Also checks top-level 'talk' config as fallback source

Docs updated with new sources and key list.
2026-03-29 22:49:34 -07:00
Teknium
366bfc3c76 fix(setup): auto-install matrix-nio during hermes setup (#3873)
Setup previously only printed a manual install hint for matrix-nio,
causing the gateway to crash with 'matrix-nio not installed' after
configuring Matrix. Now auto-installs matrix-nio (or matrix-nio[e2e]
when E2EE is enabled) using the same uv-first/pip-fallback pattern
as Daytona and Modal backends.

Also adds hermes-agent[matrix] to the [all] extra in pyproject.toml
and a regression test to keep it there.

Co-authored-by: Gutslabs <Gutslabs@users.noreply.github.com>
Co-authored-by: cutepawss <cutepawss@users.noreply.github.com>
2026-03-29 21:53:28 -07:00
Teknium
ccf7bb1102 fix(nous): use curated model list instead of full API dump for Nous Portal (#3867)
All three Nous Portal model selection paths (hermes model, first-time
login, setup wizard) were hitting the live /models endpoint and showing
every model available — potentially hundreds. Now uses the curated
_PROVIDER_MODELS['nous'] list (25 agentic models matching OpenRouter
defaults) with 'Enter custom model name' for anything else.

Fixed in:
- hermes_cli/main.py: _model_flow_nous()
- hermes_cli/auth.py: _login_nous() model selection
- hermes_cli/setup.py: post-login model selection
2026-03-29 21:38:10 -07:00
Teknium
ce2841f3c9 feat(gateway): add WeCom (Enterprise WeChat) platform support (#3847)
Adds WeCom as a gateway platform adapter using the AI Bot WebSocket
gateway for real-time bidirectional communication. No public endpoint
or new pip dependencies needed (uses existing aiohttp + httpx).

Features:
- WebSocket persistent connection with auto-reconnect (exponential backoff)
- DM and group messaging with configurable access policies
- Media upload/download with AES decryption for encrypted attachments
- Markdown rendering, quote context preservation
- Proactive + passive reply message modes
- Chunked media upload pipeline (512KB chunks)

Cherry-picked from PR #1898 by EvilRan with:
- Moved to current main (PR was 300 commits behind)
- Skipped base.py regressions (reply_to additions are good but belong
  in a separate PR since they affect all platforms)
- Fixed test assertions to match current base class send() signature
  (reply_to=None kwarg now explicit)
- All 16 integration points added surgically to current main
- No new pip dependencies (aiohttp + httpx already installed)

Fixes #1898

Co-authored-by: EvilRan <EvilRan@users.noreply.github.com>
2026-03-29 21:29:13 -07:00
Teknium
86ac23c8da fix(auth): stop silently falling back to OpenRouter when no provider is configured (#3862)
Previously, when no API keys or provider credentials were found, Hermes
silently defaulted to OpenRouter + Claude Opus. This caused confusion
when users configured local servers (LM Studio, Ollama, etc.) with a
typo or unrecognized provider name — the system would silently route to
OpenRouter instead of telling them something was wrong.

Changes:
- resolve_provider() now raises AuthError when no credentials are found
  instead of returning 'openrouter' as a silent fallback
- Added local server aliases: lmstudio, ollama, vllm, llamacpp → custom
- Removed hardcoded 'anthropic/claude-opus-4.6' fallback from gateway
  and cron scheduler (they read from config.yaml instead)
- Updated cli-config.yaml.example with complete provider documentation
  including all supported providers, aliases, and local server setup
2026-03-29 21:06:35 -07:00
Teknium
aa389924ad fix: prefer curated model list when live probe returns fewer models (#3856)
The model picker for API-key providers (MiniMax, z.ai, etc.) probes
the live /models endpoint when the curated list has fewer than 8
models. When the live endpoint returns fewer models than the curated
list (e.g. MiniMax's Anthropic-compatible endpoint doesn't list M2.7),
the incomplete live list was used instead.

Now falls back to the curated list when live returns fewer models,
ensuring new models like MiniMax-M2.7 always appear in the picker.
2026-03-29 20:55:15 -07:00
Teknium
981e14001c fix: clear api_mode on provider switch instead of hardcoding chat_completions (#3857)
PR #3726 fixed stale codex_responses persisting when switching providers
by hardcoding api_mode=chat_completions in 5 model flows. This broke
MiniMax, MiniMax-CN, and Alibaba which use /anthropic endpoints that
need anthropic_messages — the hardcoded value overrides the URL-based
auto-detection in runtime_provider.py.

Fix: pop api_mode from config in the 3 URL-dependent flows (custom
endpoint, Kimi, api_key_provider) instead of hardcoding. The runtime
resolver already correctly auto-detects api_mode from the base_url
suffix (/anthropic -> anthropic_messages, else chat_completions).

OpenRouter and Copilot ACP flows keep the explicit value since their
api_mode is always known.

Reported by stefan171.
2026-03-29 20:44:39 -07:00
Teknium
9d28f4aba3 fix: add gpt-5.4-mini to Codex fallback catalog (#3855)
Co-authored-by: Clippy <clippy@grads.flow>
2026-03-29 20:10:00 -07:00
Teknium
ca4907dfbc feat(gateway): add Feishu/Lark platform support (#3817)
Adds Feishu (ByteDance's enterprise messaging platform) as a gateway
platform adapter with full feature parity: WebSocket + webhook transports,
message batching, dedup, rate limiting, rich post/card content parsing,
media handling (images/audio/files/video), group @mention gating,
reaction routing, and interactive card button support.

Cherry-picked from PR #1793 by penwyp with:
- Moved to current main (PR was 458 commits behind)
- Fixed _send_with_retry shadowing BasePlatformAdapter method (renamed to
  _feishu_send_with_retry to avoid signature mismatch crash)
- Fixed import structure: aiohttp/websockets imported independently of
  lark_oapi so they remain available when SDK is missing
- Fixed get_hermes_home import (hermes_constants, not hermes_cli.config)
- Added skip decorators for tests requiring lark_oapi SDK
- All 16 integration points added surgically to current main

New dependency: lark-oapi>=1.5.3,<2 (optional, pip install hermes-agent[feishu])

Fixes #1788

Co-authored-by: penwyp <penwyp@users.noreply.github.com>
2026-03-29 18:17:42 -07:00
Teknium
e314833c9d feat(display): configurable tool preview length -- show full paths by default (#3841)
Tool call previews (paths, commands, queries) were hardcoded to truncate
at 35-40 chars across CLI spinners, completion lines, and gateway progress
messages. Users could not see full file paths in tool output.

New config option: display.tool_preview_length (default 0 = no limit).
Set a positive number to truncate at that length.

Changes:
- display.py: module-level _tool_preview_max_len with getter/setter;
  build_tool_preview() and get_cute_tool_message() _trunc/_path respect it
- cli.py: reads config at startup, spinner widget respects config
- gateway/run.py: reads config per-message, progress callback respects config
- run_agent.py: removed redundant 30-char quiet-mode spinner truncation
- config.py: added display.tool_preview_length to DEFAULT_CONFIG

Reported by kriskaminski
2026-03-29 18:02:42 -07:00
Teknium
45c8d3da96 fix(banner): show lazy-initialized tools in yellow instead of red (salvage #1854) (#3822)
Tools from check_fn-gated toolsets (honcho, homeassistant) showed as
red (disabled) in the startup banner even when properly configured.
This happened because check_fn runs lazily after session context is
set, but the banner renders before agent init.

Now distinguishes three states:
  - red:    truly unavailable (missing env var, no API key)
  - yellow: lazy-initialized (check_fn pending, will activate on use)
  - normal: available and ready

Only the banner fix was salvaged from the original PR; unrelated
bundled changes (context_compressor, STT config, auth default_model,
SessionResetPolicy) were discarded.

Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>
2026-03-29 16:53:29 -07:00
Teknium
df806bdbaf feat(cron): add cron.wrap_response config to disable delivery wrapping (#3807)
Adds a config option to suppress the header/footer text that wraps
cron job responses when delivered to messaging platforms.

Set cron.wrap_response: false in config.yaml for clean output without
the 'Cronjob Response: <name>' header and 'The agent cannot see this
message' footer.  Default is true (preserves current behavior).
2026-03-29 16:31:01 -07:00
Teknium
c4cf20f564 fix: clear __pycache__ during update to prevent stale bytecode ImportError (#3819)
Third report of gateway crashing with:
  ImportError: cannot import name 'get_hermes_home' from 'hermes_constants'

Root cause: stale .pyc bytecode files survive code updates. When Python
loads a cached .pyc that references names from the old source, the import
fails and the gateway won't start.

Two bugs fixed:
1. Git update path: no cache clearing at all after git pull
2. ZIP update path: __pycache__ was explicitly in the preserve set

Added _clear_bytecode_cache() helper that removes all __pycache__ dirs
under PROJECT_ROOT (skipping venv/node_modules/.git/.worktrees). Called
in both git and ZIP update paths, before pip install.
2026-03-29 16:23:36 -07:00