- Add _fetch_anthropic_models() to hermes_cli/models.py — hits the
Anthropic /v1/models endpoint to get the live model catalog. Handles
both API key and OAuth token auth headers.
- Wire it into provider_model_ids() so both 'hermes model' and
'hermes setup model' show the live list instead of a stale static one.
- Update static _PROVIDER_MODELS fallback with full current catalog:
opus-4-6, sonnet-4-6, opus-4-5, sonnet-4-5, opus-4, sonnet-4, haiku-4-5
- Update model_metadata.py with context lengths for all current models.
- Fix thinking parameter for 4.5+ models: use type='adaptive' instead
of type='enabled' (Anthropic deprecated 'enabled' for newer models,
warns at runtime). Detects model version from the model name string.
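A minimal sketch of that version detection; the helper name and regex are illustrative, not the actual hermes_cli code:

```python
import re

def _thinking_param(model: str, budget_tokens: int = 4096) -> dict:
    """Pick the thinking payload based on the version embedded in the name."""
    # Match '-<major>-<minor>' followed by another '-' or end of string, so a
    # date suffix like 'claude-sonnet-4-20250514' is not misread as a version.
    m = re.search(r"-(\d)-(\d)(?:-|$)", model)
    version = (int(m.group(1)), int(m.group(2))) if m else (0, 0)
    if version >= (4, 5):
        # 4.5+ models: 'enabled' is deprecated and warns at runtime
        return {"type": "adaptive"}
    return {"type": "enabled", "budget_tokens": budget_tokens}
```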
Verified live:
hermes model → Anthropic → auto-detected creds → shows 7 live models
hermes chat --provider anthropic --model claude-opus-4-6 → works
The critical bug: read_claude_code_credentials() only looked at
~/.claude/.credentials.json, but Claude Code's native binary (v2.x,
Bun-compiled) stores credentials in ~/.claude.json at the top level
as 'primaryApiKey'. The .credentials.json file is only written by
older npm-based installs.
Now checks both locations in priority order:
1. ~/.claude.json → primaryApiKey (native binary, v2.x)
2. ~/.claude/.credentials.json → claudeAiOauth.accessToken (legacy)
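A minimal sketch of the two-location lookup, assuming the file layouts described above (error handling elided):

```python
import json
from pathlib import Path

def read_claude_code_credentials() -> str | None:
    # 1. Native binary (v2.x): top-level 'primaryApiKey' in ~/.claude.json
    native = Path.home() / ".claude.json"
    if native.exists():
        key = json.loads(native.read_text()).get("primaryApiKey")
        if key:
            return key
    # 2. Legacy npm install: OAuth token in ~/.claude/.credentials.json
    legacy = Path.home() / ".claude" / ".credentials.json"
    if legacy.exists():
        data = json.loads(legacy.read_text())
        return data.get("claudeAiOauth", {}).get("accessToken")
    return None
```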
Verified live: hermes model → Anthropic → auto-detected credentials →
claude-sonnet-4-20250514 → 'Hello there, how are you?' (5 words)
Both 'hermes model' and 'hermes setup model' now present a clear
two-option auth flow when no credentials are found:
1. Claude Pro/Max subscription (setup-token)
- Step-by-step instructions to run 'claude setup-token'
- User pastes the resulting sk-ant-oat01-... token
2. Anthropic API key (pay-per-token)
- Link to console.anthropic.com/settings/keys
- User pastes sk-ant-api03-... key
Also handles:
- Auto-detection of existing Claude Code creds (~/.claude/.credentials.json)
- Existing credentials shown with option to update
- Consistent UX between 'hermes model' and 'hermes setup model'
Feedback fixes:
1. Revert _convert_vision_content — vision is handled by the vision_analyze
tool, not by converting image blocks inline in conversation messages.
Removed the function and its tests.
2. Add Anthropic to 'hermes model' (cmd_model in main.py):
- Added to provider_labels dict
- Added to providers selection list
- Added _model_flow_anthropic() with Claude Code credential auto-detection,
API key prompting, and model selection from catalog.
3. Wire up Anthropic as a vision-capable auxiliary provider:
- Added _try_anthropic() to auxiliary_client.py using claude-sonnet-4
as the vision model (Claude natively supports multimodal)
- Added to the get_vision_auxiliary_client() auto-detection chain
(after OpenRouter/Nous, before Codex/custom)
Cache tracking note: the Anthropic cache metrics branch in run_agent.py
(cache_read_input_tokens / cache_creation_input_tokens) is in the correct
place — it's response-level parsing, same location as the existing
OpenRouter cache tracking. auxiliary_client.py has no cache tracking.
After studying clawdbot (OpenClaw) and OpenCode implementations:
## Beta headers
- Add interleaved-thinking-2025-05-14 and fine-grained-tool-streaming-2025-05-14
as common betas (sent with ALL auth types, not just OAuth)
- OAuth tokens additionally get oauth-2025-04-20
- API keys now also get the common betas (previously got none)
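A sketch of the header assembly; the constant and function names are illustrative:

```python
COMMON_BETAS = [
    "interleaved-thinking-2025-05-14",
    "fine-grained-tool-streaming-2025-05-14",
]
OAUTH_BETA = "oauth-2025-04-20"

def anthropic_beta_header(is_oauth: bool) -> str:
    # Every auth type gets the common betas; OAuth tokens add one more.
    betas = list(COMMON_BETAS)
    if is_oauth:
        betas.append(OAUTH_BETA)
    return ",".join(betas)  # value of the 'anthropic-beta' request header
```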
## Vision/image support
- Add _convert_vision_content() to convert OpenAI multimodal format
(image_url blocks) to Anthropic format (image blocks with base64/url source)
- Handles both data: URIs (base64) and regular URLs
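An illustrative sketch of the mapping; the real _convert_vision_content may differ in shape and edge-case handling:

```python
def _convert_vision_content(blocks: list) -> list:
    out = []
    for block in blocks:
        if block.get("type") != "image_url":
            out.append(block)  # text blocks pass through unchanged
            continue
        url = block["image_url"]["url"]
        if url.startswith("data:"):
            # data URI: 'data:image/png;base64,<payload>'
            header, payload = url.split(",", 1)
            media_type = header[len("data:"):].split(";", 1)[0]
            source = {"type": "base64", "media_type": media_type, "data": payload}
        else:
            source = {"type": "url", "url": url}
        out.append({"type": "image", "source": source})
    return out
```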
## Role alternation enforcement
- Anthropic strictly rejects consecutive same-role messages (400 error)
- Add post-processing step that merges consecutive user/assistant messages
- Handles string, list, and mixed content types during merge
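A sketch of the merge pass, assuming OpenAI-style message dicts; string content is normalized to text blocks before concatenation:

```python
def _merge_consecutive_roles(messages: list) -> list:
    def as_blocks(content):
        # Normalize plain strings to Anthropic-style text blocks
        if isinstance(content, str):
            return [{"type": "text", "text": content}]
        return list(content)

    merged = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            # Same role twice in a row: fold into the previous message
            merged[-1]["content"] = as_blocks(merged[-1]["content"]) + as_blocks(msg["content"])
        else:
            merged.append(dict(msg))
    return merged
```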
## Tool choice support
- Add tool_choice parameter to build_anthropic_kwargs()
- Maps OpenAI values: auto→auto, required→any, none→omit, name→tool
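The mapping above as a sketch; a bare tool name forces that tool, and 'none' omits the parameter entirely:

```python
def map_tool_choice(openai_value: str | None):
    if openai_value in (None, "none"):
        return None  # omit tool_choice from the Anthropic request
    if openai_value == "auto":
        return {"type": "auto"}
    if openai_value == "required":
        return {"type": "any"}
    return {"type": "tool", "name": openai_value}  # force a specific tool
```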
## Cache metrics tracking
- Anthropic uses cache_read_input_tokens / cache_creation_input_tokens
(different from OpenRouter's prompt_tokens_details.cached_tokens)
- Add api_mode-aware branch in run_agent.py cache stats logging
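An illustrative api_mode-aware branch; the field names follow the two APIs, the function itself is hypothetical:

```python
def cache_stats(usage: dict, api_mode: str) -> tuple:
    if api_mode == "anthropic_messages":
        return (usage.get("cache_read_input_tokens", 0),
                usage.get("cache_creation_input_tokens", 0))
    # OpenRouter / OpenAI-compatible shape
    details = usage.get("prompt_tokens_details") or {}
    return (details.get("cached_tokens", 0), 0)
```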
## Credential refresh on 401
- On 401 error during anthropic_messages mode, re-read credentials
via resolve_anthropic_token() (picks up refreshed Claude Code tokens)
- Rebuild client if new token differs from current one
- Follows same pattern as Codex/Nous 401 refresh handlers
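A sketch of the retry pattern; resolve_anthropic_token() is stubbed here and stands in for the real credential re-read:

```python
def resolve_anthropic_token():
    return None  # stub: re-reads Claude Code / API-key credentials

def refresh_on_401(current_token: str, build_client):
    """Return (client, token) rebuilt with a fresh token, or (None, None)."""
    new_token = resolve_anthropic_token()
    if new_token and new_token != current_token:
        return build_client(new_token), new_token  # caller retries once
    return None, None
```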
## Tests
- 44 adapter tests (8 new: vision conversion, role alternation, tool choice)
- Updated beta header tests to verify new structure
- Full suite: 3198 passed, 0 regressions
Add @easyops-cn/docusaurus-search-local (v0.55.1) for offline/local
full-text search across all documentation pages.
- Search bar appears in the navbar (Ctrl/Cmd+K shortcut)
- Builds a search index at build time — no external service needed
- Highlights matched terms on target page after clicking a result
- Dedicated /search page for expanded results
- Blog indexing disabled (blog is off)
- docsRouteBasePath set to '/' to match existing docs routing
Follow-up to PR #862 (local skills classification by arceus77-7):
- Remove unnecessary isinstance guard on _read_manifest() return value —
it always returns Dict[str, str], so set() on it suffices.
- Extract repeated hub-dir monkeypatching into a shared pytest fixture (hub_env), sketched below.
- Add three_source_env fixture for source-classification tests.
- Add _read_manifest monkeypatch to test_do_list_initializes_hub_dir
(was fragile — relied on empty skills list masking the real manifest).
- Add test coverage for --source hub and --source builtin filters.
- Extract _capture() helper to reduce console/StringIO boilerplate.
5 tests, all green.
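A minimal sketch of the shared fixture; the setting name HERMES_SKILLS_HUB_DIR is hypothetical, chosen only to illustrate the monkeypatching:

```python
import pytest

@pytest.fixture
def hub_env(tmp_path, monkeypatch):
    hub_dir = tmp_path / "hub"
    hub_dir.mkdir()
    # Point the skills code at a temporary hub directory for the test
    monkeypatch.setenv("HERMES_SKILLS_HUB_DIR", str(hub_dir))
    return hub_dir
```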
When a dangerous command is detected and the user is prompted for
approval, long commands are truncated (80 chars in fallback, 70 chars
in the TUI). Users had no way to see the full command before deciding.
This adds a 'View full command' option across all approval interfaces (sketched below):
- CLI fallback (tools/approval.py): [v]iew option in the prompt menu.
Shows the full command and re-prompts for approval decision.
- CLI TUI (cli.py): 'Show full command' choice in the arrow-key
selection panel. Expands the command display in-place and removes
the view option after use.
- CLI callbacks (callbacks.py): 'view' choice added to the list when
the command exceeds 70 characters.
- Gateway (gateway/run.py): 'full', 'show', 'view' responses reveal
the complete command while keeping the approval pending.
Includes 7 new tests covering view-then-approve, view-then-deny,
short command fallthrough, and double-view behavior.
Closes community feedback about the 80-char cap on dangerous commands.
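A sketch of the fallback prompt loop; the prompt text and 80-char width are illustrative:

```python
def prompt_approval(command: str, width: int = 80) -> bool:
    shown = command if len(command) <= width else command[:width] + "..."
    while True:
        choice = input(f"Run dangerous command? {shown}\n[y]es / [n]o / [v]iew: ")
        if choice.lower().startswith("v"):
            print(command)  # reveal the full command, then re-prompt
            continue
        return choice.lower().startswith("y")
```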
* fix: /reasoning command output ordering, display, and inline think extraction
Three issues with the /reasoning command:
1. Output interleaving: The command echo used print() while feedback
used _cprint(), causing them to render out-of-order under
prompt_toolkit's patch_stdout. Changed echo to use _cprint() so
all output renders through the same path in correct order.
2. Reasoning display not working: /reasoning show toggled a flag
but reasoning never appeared for models that embed thinking in
inline <think> blocks rather than structured API fields. Added
fallback extraction in _build_assistant_message to capture
<think> block content as reasoning when no structured reasoning
fields (reasoning, reasoning_content, reasoning_details) are
present. This feeds into both the reasoning callback (during
tool loops) and the post-response reasoning box display (see the sketch after this list).
3. Feedback clarity: Added checkmarks to confirm actions, persisted
show/hide to config (was session-only before), and aligned the
status display for readability.
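A sketch of the fallback extraction, assuming a single well-formed <think> block; the real code checks the structured reasoning fields first:

```python
import re

_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def extract_inline_reasoning(content: str):
    """Return (visible_content, reasoning) with the <think> block stripped."""
    match = _THINK_RE.search(content)
    if not match:
        return content, None
    reasoning = match.group(1).strip()
    visible = _THINK_RE.sub("", content).strip()
    return visible, reasoning
```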
Tests: 7 new tests for inline think block extraction (41 total).
* feat: add /reasoning command to gateway (Telegram/Discord/etc)
The /reasoning command only existed in the CLI — messaging platforms
had no way to view or change reasoning settings. This adds:
1. /reasoning command handler in the gateway:
- No args: shows current effort level and display state
- /reasoning <level>: sets reasoning effort (none/low/medium/high/xhigh)
- /reasoning show|hide: toggles reasoning display in responses
- All changes saved to config.yaml immediately
2. Reasoning display in gateway responses:
- When show_reasoning is enabled, prepends a 'Reasoning' block
with the model's last_reasoning content before the response
- Collapses long reasoning (>15 lines) to keep messages readable
- Uses last_reasoning from run_conversation result dict (see the sketch below)
3. Plumbing:
- Added _show_reasoning attribute loaded from config at startup
- Propagated last_reasoning through _run_agent return dict
- Added /reasoning to help text and known_commands set
- Uses getattr for _show_reasoning to handle test stubs
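A sketch of the prepend-and-collapse step; the 15-line cutoff comes from the description above, the formatting is illustrative:

```python
def prepend_reasoning(response: str, last_reasoning, show_reasoning: bool,
                      max_lines: int = 15) -> str:
    if not (show_reasoning and last_reasoning):
        return response
    lines = last_reasoning.splitlines()
    if len(lines) > max_lines:
        # Collapse long reasoning to keep the message readable
        lines = lines[:max_lines] + [f"... ({len(lines) - max_lines} more lines)"]
    return "Reasoning:\n" + "\n".join(lines) + "\n\n" + response
```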
* fix: improve Kimi model selection — auto-detect endpoint, add missing models
Kimi Coding Plan setup:
- New dedicated _model_flow_kimi() replaces the generic API-key flow
for kimi-coding. Removes the confusing 'Base URL' prompt entirely —
the endpoint is auto-detected from the API key prefix (sketched below):
sk-kimi-* → api.kimi.com/coding/v1 (Kimi Coding Plan)
other → api.moonshot.ai/v1 (legacy Moonshot)
- Shows appropriate models for each endpoint:
Coding Plan: kimi-for-coding, kimi-k2.5, kimi-k2-thinking, kimi-k2-thinking-turbo
Moonshot: full model catalog
- Clears any stale KIMI_BASE_URL override so runtime auto-detection
via _resolve_kimi_base_url() works correctly.
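A minimal sketch of the prefix detection; the URLs come from the mapping above and the helper name mirrors the one in this commit:

```python
def _resolve_kimi_base_url(api_key: str) -> str:
    if api_key.startswith("sk-kimi-"):
        return "https://api.kimi.com/coding/v1"  # Kimi Coding Plan
    return "https://api.moonshot.ai/v1"          # legacy Moonshot
```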
Model catalog updates:
- Added kimi-for-coding (primary Coding Plan model) and kimi-k2-thinking-turbo
to models.py, main.py _PROVIDER_MODELS, and model_metadata.py context windows.
- Updated User-Agent from KimiCLI/1.0 to KimiCLI/1.3 (Kimi's coding
endpoint whitelists known coding agents via User-Agent sniffing).
Adds --pass-session-id CLI flag. When set, the agent's system prompt
includes the session ID:
Conversation started: Sunday, March 08, 2026 06:32 PM
Session ID: 20260308_183200_abc123
Usage:
hermes --pass-session-id
hermes chat --pass-session-id
Implementation threads the flag as a proper parameter through the full
chain (main.py → cli.py → run_agent.py) rather than using an env var,
avoiding collisions in multi-agent/multi-tenant setups.
Based on PR #726 by dmahan93, reworked to use instance parameter
instead of HERMES_PASS_SESSION_ID environment variable.
Co-authored-by: dmahan93 <dmahan93@users.noreply.github.com>
curated_models_for_provider() now tries the live API first (via
provider_model_ids) before falling back to static _PROVIDER_MODELS.
This means /model and /provider slash commands show the actual
available models, not a stale hardcoded list.
Also added live Nous Portal model fetching via fetch_nous_models()
in provider_model_ids(), alongside the existing Codex live fetch.
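A minimal sketch of the live-first lookup; the stubs stand in for the real helpers in hermes_cli/models.py:

```python
def provider_model_ids(provider: str):
    return None  # stub: the real version queries the provider's live API

_PROVIDER_MODELS = {"anthropic": ["claude-opus-4-6", "claude-sonnet-4-6"]}

def curated_models_for_provider(provider: str) -> list:
    live = provider_model_ids(provider)  # try the live API first
    if live:
        return list(live)
    return _PROVIDER_MODELS.get(provider, [])  # static fallback
```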
- Update __version__ to 0.2.0 (was 0.1.0)
- Update pyproject.toml to match
- Add RELEASE_v0.2.0.md with comprehensive changelog covering:
- All 231 merged PRs
- 120 resolved issues
- 74+ contributors credited
- Organized by feature area with PR links
- Fix version mismatch: __init__.py had 'v1.0.0', pyproject.toml had '0.1.0'
Now both use '0.1.0' (no v prefix — added in display code only)
- Add __release_date__ for CalVer date tracking alongside SemVer version
- Fix double-v bug in cmd_version (was printing 'vv1.0.0')
- Update banner title to show 'Hermes Agent v0.1.0 (2026.3.12)' format
- Update cli.py banner to match new format
- Add scripts/release.py: full release automation tool
- Generates categorized changelogs from git history
- Maps git authors to GitHub @mentions (70+ contributors)
- Supports dry-run preview and --publish mode
- Creates annotated CalVer git tags + GitHub Releases
- Bumps semver in source files automatically
- Usage: python scripts/release.py --bump minor --publish
- Add .release_notes.md to .gitignore
Versioning scheme: CalVer tags (v2026.3.12) + SemVer display (v0.1.0)
4 test files spawn real processes or make live API calls that hang
indefinitely in batch/CI runs. Skip them with pytestmark:
- tests/tools/test_code_execution.py (subprocess spawns)
- tests/tools/test_file_tools_live.py (live LocalEnvironment)
- tests/test_413_compression.py (blocks on process)
- tests/test_agent_loop_tool_calling.py (live OpenRouter API calls)
Also added a global 30s signal.alarm timeout in conftest.py as a safety
net, and removed a stale nous-api test that hung on OAuth browser login.
Suite now runs in ~55s with no hangs.
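A sketch of the two mechanisms, assuming the alarm is installed as an autouse fixture (SIGALRM is POSIX-only):

```python
import signal
import pytest

# In each affected test module:
pytestmark = pytest.mark.skip(reason="spawns real processes / live API calls")

# In conftest.py — hypothetical shape of the safety net:
@pytest.fixture(autouse=True)
def _global_timeout():
    def _on_alarm(signum, frame):
        raise TimeoutError("test exceeded 30s safety timeout")
    old = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(30)
    yield
    signal.alarm(0)  # cancel the alarm
    signal.signal(signal.SIGALRM, old)
```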
- Remove test_nous_api_setup_preserves_model_provider_metadata (nous-api
provider no longer exists, test selected Nous OAuth which hangs waiting
for browser login)
- Fix test_nous_oauth_setup prompt_choice index: 1→0 (Nous Portal is
now first option after nous-api removal)
- gateway/run.py: Take main's _resolve_gateway_model() helper
- hermes_cli/setup.py: Re-apply nous-api removal after merge brought
it back. Fix provider_idx offset (Custom is now index 3, not 4).
- tests/hermes_cli/test_setup.py: Fix custom setup test index (3→4)
- Custom endpoints can serve any model, so skip validation for
provider='custom' in validate_requested_model(). Previously it
would reject any model name since there's no static catalog or
live API to check against.
- Show clear setup instructions when switching to custom endpoint
without OPENAI_BASE_URL/OPENAI_API_KEY configured.
- Added curated model lists for Nous Portal and OpenAI Codex to
_PROVIDER_MODELS so /model shows their available models.
Both /model and /provider now show the same unified display:
Current: anthropic/claude-opus-4.6 via OpenRouter
Authenticated providers & models:
[openrouter] ← active
anthropic/claude-opus-4.6 ← current
anthropic/claude-sonnet-4.5
...
[nous]
claude-opus-4-6
gemini-3-flash
...
[openai-codex]
gpt-5.2-codex
gpt-5.1-codex-mini
...
Not configured: Z.AI / GLM, Kimi / Moonshot, ...
Switch model: /model <model-name>
Switch provider: /model <provider>:<model-name>
Example: /model nous:claude-opus-4-6
Users can see all authenticated providers and their models at a glance,
making it easy to switch mid-conversation.
Also added curated model lists for Nous Portal and OpenAI Codex to
hermes_cli/models.py.
Nous Portal backend will become a transparent proxy for OpenRouter-
specific parameters (provider preferences, etc.), so keep sending them
to all providers. The reasoning disabled fix is kept (that's a real
constraint of the Nous endpoint).
Two bugs in _build_api_kwargs that broke Nous Portal:
1. Provider preferences (only, ignore, order, sort) are OpenRouter-
specific routing features. They were being sent in extra_body to ALL
providers, including Nous Portal. When the config had
providers_only=['google-vertex'], Nous Portal returned 404 'Inference
host not found' because it doesn't have a google-vertex backend.
Fix: Only include provider preferences when _is_openrouter is True (sketched below).
2. Reasoning config with enabled=false was being sent to Nous Portal,
which requires reasoning and returns 400 'Reasoning is mandatory for
this endpoint and cannot be disabled.'
Fix: Omit the reasoning parameter for Nous when enabled=false.
Root cause found via HERMES_DUMP_REQUESTS=1 which showed the exact
request payload being sent to Nous Portal's inference API.
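A sketch of both gates; the flag and field names are illustrative:

```python
def _filter_extra_body(extra_body: dict, is_openrouter: bool, is_nous: bool) -> dict:
    out = dict(extra_body)
    if not is_openrouter:
        # Provider preferences are OpenRouter-only routing features
        out.pop("provider", None)
    reasoning = out.get("reasoning")
    if is_nous and reasoning and not reasoning.get("enabled", True):
        # Nous Portal rejects disabled reasoning: omit the parameter instead
        out.pop("reasoning", None)
    return out
```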
Model selection now comes exclusively from config.yaml (set via
'hermes model' or 'hermes setup'). The LLM_MODEL env var is no longer
read or written anywhere in production code.
Why: env vars are per-process/per-user and would conflict in
multi-agent or multi-tenant setups. Config.yaml is file-based and
can be scoped per-user or eventually per-session.
Changes:
- cli.py: Read model from CLI_CONFIG only, not LLM_MODEL/OPENAI_MODEL
- hermes_cli/auth.py: _save_model_choice() no longer writes LLM_MODEL
to .env
- hermes_cli/setup.py: Remove 12 save_env_value('LLM_MODEL', ...)
calls from all provider setup flows
- gateway/run.py: Remove LLM_MODEL fallback (HERMES_MODEL still works
for gateway process runtime)
- cron/scheduler.py: Same
- agent/auxiliary_client.py: Remove LLM_MODEL from custom endpoint
model detection
Phase 2 of the provider router migration — route the main agent's
client construction and fallback activation through
resolve_provider_client() instead of duplicated ad-hoc logic.
run_agent.py:
- __init__: When no explicit api_key/base_url, use
resolve_provider_client(provider, raw_codex=True) for client
construction. Explicit creds (from CLI/gateway runtime provider)
still construct directly.
- _try_activate_fallback: Replace _resolve_fallback_credentials and
its duplicated _FALLBACK_API_KEY_PROVIDERS / _FALLBACK_OAUTH_PROVIDERS
dicts with a single resolve_provider_client() call. The router
handles all provider types (API-key, OAuth, Codex) centrally.
- Remove _resolve_fallback_credentials method and both fallback dicts.
agent/auxiliary_client.py:
- Add raw_codex parameter to resolve_provider_client(). When True,
returns the raw OpenAI client for Codex providers instead of wrapping
in CodexAuxiliaryClient. The main agent needs this for direct
responses.stream() access.
3251 passed, 2 pre-existing unrelated failures.
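Hypothetical shape of the __init__ routing described above; resolve_provider_client is the real entry point, the surrounding function is a sketch:

```python
from openai import OpenAI

def resolve_provider_client(provider: str, raw_codex: bool = False):
    ...  # stub: the real router lives in agent/auxiliary_client.py

def build_main_client(provider: str, api_key=None, base_url=None):
    if api_key or base_url:
        # Explicit creds from CLI/gateway runtime provider: construct directly
        return OpenAI(api_key=api_key, base_url=base_url)
    # Otherwise route through the central resolver; raw_codex=True returns the
    # raw client for Codex so responses.stream() is available to the main agent
    return resolve_provider_client(provider, raw_codex=True)
```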
Update 14 test files to use the new call_llm/async_call_llm mock
patterns instead of the old get_text_auxiliary_client/
get_vision_auxiliary_client tuple returns.
- vision_tools tests: mock async_call_llm instead of _aux_async_client
- browser tests: mock call_llm instead of _aux_vision_client
- flush_memories tests: mock call_llm instead of get_text_auxiliary_client
- session_search tests: mock async_call_llm with RuntimeError
- mcp_tool tests: fix whitelist model config, use side_effect for
multi-response tests
- auxiliary_config_bridge: update for model=None (resolved in router)
3251 passed, 2 pre-existing unrelated failures.
Add centralized call_llm() and async_call_llm() functions that own the
full LLM request lifecycle (sketched after this list):
1. Resolve provider + model from task config or explicit args
2. Get or create a cached client for that provider
3. Format request args (max_tokens handling, provider extra_body)
4. Make the API call with max_tokens/max_completion_tokens retry
5. Return the response
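A skeleton of the five steps; the helpers are stubbed and their names are hypothetical — only the step order comes from this commit:

```python
_CLIENT_CACHE = {}

def _resolve_task_slot(task, overrides):  # step 1: task config -> provider, model
    return overrides.get("provider", "openrouter"), overrides.get("model")

def _build_client(provider):  # step 2: real code routes through the resolver
    ...

def _format_request(provider, model, messages):  # step 3
    return {"model": model, "messages": messages, "max_tokens": 1024}

def call_llm(task, messages, **overrides):
    provider, model = _resolve_task_slot(task, overrides)
    client = _CLIENT_CACHE.setdefault(provider, _build_client(provider))
    kwargs = _format_request(provider, model, messages)
    try:
        return client.chat.completions.create(**kwargs)  # steps 4-5
    except Exception:
        # crude sketch of the max_tokens -> max_completion_tokens retry
        kwargs["max_completion_tokens"] = kwargs.pop("max_tokens")
        return client.chat.completions.create(**kwargs)
```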
Config: expanded auxiliary section with provider:model slots for all
tasks (compression, vision, web_extract, session_search, skills_hub,
mcp, flush_memories). Config version bumped to 7.
Migrated all auxiliary consumers:
- context_compressor.py: uses call_llm(task='compression')
- vision_tools.py: uses async_call_llm(task='vision')
- web_tools.py: uses async_call_llm(task='web_extract')
- session_search_tool.py: uses async_call_llm(task='session_search')
- browser_tool.py: uses call_llm(task='vision'/'web_extract')
- mcp_tool.py: uses call_llm(task='mcp')
- skills_guard.py: uses call_llm(provider='openrouter')
- run_agent.py flush_memories: uses call_llm(task='flush_memories')
Tests updated for context_compressor and MCP tool. Some test mocks
still need updating (15 remaining failures from mock pattern changes,
2 pre-existing).
Route all remaining ad-hoc auxiliary LLM call sites through
resolve_provider_client() so auth, headers, and API format (Chat
Completions vs Responses API) are handled consistently in one place.
Files changed:
- tools/openrouter_client.py: Replace manual AsyncOpenAI construction
with resolve_provider_client('openrouter', async_mode=True). The
shared client module now delegates entirely to the router.
- tools/skills_guard.py: Replace inline OpenAI client construction
(hardcoded OpenRouter base_url, manual api_key lookup, manual
headers) with resolve_provider_client('openrouter'). Remove unused
OPENROUTER_BASE_URL import.
- trajectory_compressor.py: Add _detect_provider() to map config
base_url to a provider name, then route through
resolve_provider_client. Falls back to raw construction for
unrecognized custom endpoints.
- mini_swe_runner.py: Route default case (no explicit api_key/base_url)
through resolve_provider_client('openrouter') with auto-detection
fallback. Preserves direct construction when explicit creds are
passed via CLI args.
- agent/auxiliary_client.py: Fix stale module docstring — vision auto
mode now correctly documents that Codex and custom endpoints are
tried (not skipped).
Three interconnected fixes for auxiliary client infrastructure:
1. CENTRALIZED PROVIDER ROUTER (auxiliary_client.py)
Add resolve_provider_client(provider, model, async_mode) — a single
entry point for creating properly configured clients. Given a provider
name and optional model, it handles auth lookup (env vars, OAuth
tokens, auth.json), base URL resolution, provider-specific headers,
and API format differences (Chat Completions vs Responses API for
Codex). All auxiliary consumers should route through this instead of
ad-hoc env var lookups.
Refactored get_text_auxiliary_client, get_async_text_auxiliary_client,
and get_vision_auxiliary_client to use the router internally.
2. FIX CODEX VISION BYPASS (vision_tools.py)
vision_tools.py was constructing a raw AsyncOpenAI client from the
sync vision client's api_key/base_url, completely bypassing the Codex
Responses API adapter. When the vision provider resolved to Codex,
the raw client would call chat.completions.create() against
chatgpt.com/backend-api/codex, an endpoint that only supports the
Responses API.
Fix: Added get_async_vision_auxiliary_client() which properly wraps
Codex into AsyncCodexAuxiliaryClient. vision_tools.py now uses this
instead of manual client construction.
3. FIX COMPRESSION FALLBACK + VISION ERROR HANDLING
- context_compressor.py: Removed _get_fallback_client() which blindly
looked for OPENAI_API_KEY + OPENAI_BASE_URL (fails for Codex OAuth,
API-key providers, users without OPENAI_BASE_URL set). Replaced
with fallback loop through resolve_provider_client() for each
known provider, with same-provider dedup.
- vision_tools.py: Added error detection for vision capability
failures. Returns clear message to the model when the configured
model doesn't support vision, instead of a generic error.
Addresses #886
When hermes-agent runs as a systemd service, Docker container, or
headless daemon, the stdout pipe can become unavailable (idle timeout,
buffer exhaustion, socket reset). Any print() call then raises
OSError: [Errno 5] Input/output error, crashing run_conversation()
and causing cron jobs to fail.
Rather than wrapping individual print() calls (68 in run_conversation
alone), this adds a transparent _SafeWriter wrapper installed once at
the start of run_conversation(). It delegates all writes to the real
stdout and silently catches OSError. Zero overhead on the happy path,
comprehensive coverage of all print calls including future ones.
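A sketch of the wrapper, assuming only write/flush need guarding; the real class may delegate more of the stream interface:

```python
import sys

class _SafeWriter:
    def __init__(self, stream):
        self._stream = stream

    def write(self, data):
        try:
            return self._stream.write(data)
        except OSError:
            return len(data)  # pipe gone: swallow the error, report success

    def flush(self):
        try:
            self._stream.flush()
        except OSError:
            pass

    def __getattr__(self, name):
        return getattr(self._stream, name)  # delegate everything else

# Installed once at the start of run_conversation():
if not isinstance(sys.stdout, _SafeWriter):
    sys.stdout = _SafeWriter(sys.stdout)
```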
Fixes #845
Co-authored-by: J0hnLawMississippi <J0hnLawMississippi@users.noreply.github.com>
Adds full stack traces to error logs in _upscale_image() and
image_generate_tool() for better debugging. Matches the pattern
used across the rest of the codebase.
Cherry-picked from PR #868 by aydnOktay.
Co-authored-by: aydnOktay <aydnOktay@users.noreply.github.com>
- Forum parent channel IDs now match free-response list (add a forum
channel ID and all its threads respond without mention)
- Better thread chat names: 'Guild / forum / thread' for forum threads
- Add discord.require_mention and discord.free_response_channels to
config.yaml (bridged to env vars, env vars still override)
- Keep require_mention defaulting to true (safe for shared servers)
Cherry-picked from PR #867 by insecurejezza with default fix and
config.yaml integration.
Co-authored-by: insecurejezza <insecurejezza@users.noreply.github.com>
- Log command, return code, and stderr on non-zero exit
- Add exc_info=True to timeout, FileNotFoundError, and catch-all handlers
- Add debug field to restore() error responses with raw git output
- Keeps user-facing error messages clean while preserving detail for debugging
Inspired by PR #843 (aydnOktay).
Replace two inline copies of the env/config model resolution pattern
(in _run_agent_sync and _run_agent) with the _resolve_gateway_model()
helper introduced in PR #830.
Left untouched:
- Session hygiene block: different default (sonnet vs opus) + reads
compression config from the same YAML load
- /model command: also reads provider from same config block
Previously the early return for an unconfigured vision model was silent.
Now it logs an error so the failure is visible when debugging.
Inspired by PR #839 by aydnOktay.
Co-authored-by: aydnOktay <aydnOktay@users.noreply.github.com>
save_env_value() used bare open('w') which truncates .env immediately.
A crash or OOM kill between truncation and completed write silently
wipes every credential in the file.
Write now goes to a temp file first, then os.replace() swaps it
atomically. Either the old .env exists or the new one does — never
a truncated half-write. Same pattern used in cron/jobs.py.
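A sketch of the atomic swap; creating the temp file in the same directory keeps os.replace() atomic:

```python
import os
import tempfile

def write_env_atomically(path: str, content: str) -> None:
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, prefix=".env.")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
        os.replace(tmp_path, path)  # atomic: old file or new, never half-written
    except BaseException:
        os.unlink(tmp_path)
        raise
```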
Cherry-picked from PR #842 by alireza78a, rebased onto current main
with conflict resolution (_secure_file refactor).
Co-authored-by: alireza78a <alireza78a@users.noreply.github.com>
Memory flush, /compress, and session hygiene create AIAgent without
model=, falling back to the hardcoded default "anthropic/claude-opus-4.6".
This fails with a 400 error when the active provider is openai-codex
(Codex only accepts its own model names like gpt-5.1-codex-mini).
Add _resolve_gateway_model() that mirrors the env/config resolution
already used by _run_agent_sync, and wire it into all three temporary
agent creation sites.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Tag duckduckgo-search skill with fallback_for_toolsets: [web] so it
auto-hides when Firecrawl is available and auto-shows when it isn't
- Add 'Conditional Activation' section to CONTRIBUTING.md with full
spec, semantics, and examples for all 4 frontmatter fields
- Add 'Conditional Activation (Fallback Skills)' section to the user-
facing skills docs with field reference table and practical example
- Update SKILL.md format examples in both docs to show the new fields
Follow-up to PR #785 (conditional skill activation feature).