Commit Graph

9 Commits

Author SHA1 Message Date
Teknium
c1da1fdcd5 feat: auto-detect provider when switching models via /model (#1506)
When typing /model deepseek-chat while on a different provider, the
model name now auto-resolves to the correct provider instead of
silently staying on the wrong one and causing API errors.

Detection priority:
1. Direct provider with credentials (e.g. DEEPSEEK_API_KEY set)
2. OpenRouter catalog match with proper slug remapping
3. Direct provider without creds (clear error beats silent failure)

Also adds DeepSeek as a first-class API-key provider — just set
DEEPSEEK_API_KEY and /model deepseek-chat routes directly.

Bare model names get remapped to proper OpenRouter slugs:
  /model gpt-5.4 → openai/gpt-5.4
  /model claude-opus-4.6 → anthropic/claude-opus-4.6

Salvages the concept from PR #1177 by @virtaava with credential
awareness and OpenRouter slug mapping added.

Co-authored-by: virtaava <virtaava@users.noreply.github.com>
2026-03-16 04:34:45 -07:00
Teknium
4a8cd6f856 fix: stop rejecting unlisted models, accept with warning instead
* fix: use session_key instead of chat_id for adapter interrupt lookups

monitor_for_interrupt() in _run_agent was using source.chat_id to query
the adapter's has_pending_interrupt() and get_pending_message() methods.
But the adapter stores interrupt events under build_session_key(source),
which produces a different string (e.g. 'agent:main:telegram:dm' vs '123456').

This key mismatch meant the interrupt was never detected through the
adapter path, which is the only active interrupt path for all adapter-based
platforms (Telegram, Discord, Slack, etc.). The gateway-level interrupt
path (in dispatch_message) is unreachable because the adapter intercepts
the 2nd message in handle_message() before it reaches dispatch_message().

Result: sending a new message while subagents were running had no effect —
the interrupt was silently lost.

Fix: replace all source.chat_id references in the interrupt-related code
within _run_agent() with the session_key parameter, which matches the
adapter's storage keys.

Also adds regression tests verifying session_key vs chat_id consistency.

* debug: add file-based logging to CLI interrupt path

Temporary instrumentation to diagnose why message-based interrupts
don't seem to work during subagent execution. Logs to
~/.hermes/interrupt_debug.log (immune to redirect_stdout).

Two log points:
1. When Enter handler puts message into _interrupt_queue
2. When chat() reads it and calls agent.interrupt()

This will reveal whether the message reaches the queue and
whether the interrupt is actually fired.

* fix: accept unlisted models with warning instead of rejecting

validate_requested_model() previously hard-rejected any model not found
in the provider's API listing. This was too aggressive — users on higher
plan tiers (e.g. Z.AI Pro/Max) may have access to models not shown in
the public listing (like glm-5 on coding endpoints).

Changes:
- validate_requested_model: accept unlisted models with a warning note
  instead of blocking. The model is saved to config and used immediately.
- Z.AI setup: always offer glm-5 in the model list regardless of whether
  a coding endpoint was detected. Pro/Max plans support it.
- Z.AI setup detection message: softened from 'GLM-5 is not available'
  to 'GLM-5 may still be available depending on your plan tier'
2026-03-12 16:02:35 -07:00
teknium1
ec2c6dff70 feat: unified /model and /provider into single view
Both /model and /provider now show the same unified display:

  Current: anthropic/claude-opus-4.6 via OpenRouter

  Authenticated providers & models:
    [openrouter] ← active
      anthropic/claude-opus-4.6 ← current
      anthropic/claude-sonnet-4.5
      ...
    [nous]
      claude-opus-4-6
      gemini-3-flash
      ...
    [openai-codex]
      gpt-5.2-codex
      gpt-5.1-codex-mini
      ...

  Not configured: Z.AI / GLM, Kimi / Moonshot, ...

  Switch model:    /model <model-name>
  Switch provider: /model <provider>:<model-name>
  Example: /model nous:claude-opus-4-6

Users can see all authenticated providers and their models at a glance,
making it easy to switch mid-conversation.

Also added curated model lists for Nous Portal and OpenAI Codex to
hermes_cli/models.py.
2026-03-11 23:06:06 -07:00
teknium1
a23bcb81ce fix: improve /model user feedback + update docs
User messaging improvements:
- Rejection: '(>_<) Error: not a valid model' instead of '(^_^) Warning: Error:'
- Rejection: shows 'Model unchanged' + tip about /model and /provider
- Session-only: explains 'this session only' with reason and 'will revert on restart'
- Saved: clear '(saved to config)' confirmation

Docs updated:
- cli-commands.md, cli.md, messaging/index.md: /model now shows
  provider:model syntax, /provider command added to tables

Test fixes: deduplicated test names, assertions match new messages.
2026-03-08 06:13:12 -07:00
teknium1
66d3e6a0c2 feat: provider switching via /model + enhanced model display
Add provider:model syntax to /model command for runtime provider switching:
  /model zai:glm-5           → switch to Z.AI provider with glm-5
  /model nous:hermes-3       → switch to Nous Portal with hermes-3
  /model openrouter:anthropic/claude-sonnet-4.5  → explicit OpenRouter

When switching providers, credentials are resolved via resolve_runtime_provider
and validated before committing. Both model and provider are saved to config.
Provider aliases work (glm: → zai, kimi: → kimi-coding, etc.).

Enhanced /model (no args) display now shows:
  - Current model and provider
  - Curated model list for the current provider with ← marker
  - Usage examples including provider:model syntax

39 tests covering parse_model_input, curated_models_for_provider,
provider switching (success + credential failure), and display output.
2026-03-08 05:45:59 -07:00
teknium1
8c734f2f27 fix: remove OpenRouter '/' format enforcement — let API probe be the authority
Not all providers require 'provider/model' format. Removing the rigid
format check lets the live API probe handle all validation uniformly.
If someone types 'gpt-5.4' on OpenRouter, the probe won't find it and
will suggest 'openai/gpt-5.4' — better UX than a format rejection.
2026-03-08 05:31:41 -07:00
teknium1
245d174359 feat: validate /model against live API instead of hardcoded lists
Replace the static catalog-based model validation with a live API probe.
The /model command now hits the provider's /models endpoint to check if
the requested model actually exists:

- Model found in API → accepted + saved to config
- Model NOT found in API → rejected with 'Error: not a valid model'
  and fuzzy-match suggestions from the live model list
- API unreachable → graceful fallback to hardcoded catalog (session-only
  for unrecognized models)
- Format errors (empty, spaces, missing '/') still caught instantly
  without a network call

The API probe takes ~0.2s for OpenRouter (346 models) and works with any
OpenAI-compatible endpoint (Ollama, vLLM, custom, etc.).

32 tests covering all paths: format checks, API found, API not found,
API unreachable fallback, CLI integration.
2026-03-08 05:22:20 -07:00
teknium1
90fa9e54ca fix: guard validate_requested_model + expand test coverage (PR #649 follow-up)
- Wrap validate_requested_model in try/except so /model doesn't crash
  if validation itself fails (falls back to old accept+save behavior)
- Remove unnecessary sys.path.insert from both test files
- Expand test_model_validation.py: 4 → 23 tests covering normalize_provider,
  provider_model_ids, empty/whitespace/spaces rejection, OpenRouter format
  validation, custom endpoints, nous provider, provider aliases, unknown
  providers, fuzzy suggestions
- Expand test_cli_model_command.py: 2 → 5 tests adding known-model save,
  validation crash fallback, and /model with no argument
2026-03-08 04:47:35 -07:00
stablegenius49
9d3a44e0e8 fix: validate /model values before saving 2026-03-08 04:47:35 -07:00