setup_model_provider() had 800+ lines of duplicated provider handling
that reimplemented the same credential prompting, OAuth flows, and model
selection that hermes model already provides via the _model_flow_*
functions. Every new provider had to be added in both places, and the
two implementations diverged in config persistence (setup.py did raw
YAML writes, _set_model_provider, and _update_config_for_provider
depending on the provider — main.py used its own load/save cycle).
This caused the #4172 bug: _model_flow_custom saved config to disk but
the wizard's final save_config(config) overwrote it with stale values.
Fix: extract the core of cmd_model() into select_provider_and_model()
and have setup_model_provider() call it. After the call, re-sync the
wizard's config dict from disk. Deletes ~800 lines of duplicated
provider handling from setup.py.
Also fixes cmd_model() double-AuthError crash on fresh installs with
no API keys configured.
OPENAI_BASE_URL was written to .env AND config.yaml, creating a dual-source
confusion. Users (especially Docker) would see the URL in .env and assume
that's where all config lives, then wonder why LLM_MODEL in .env didn't work.
Changes:
- Remove all 27 save_env_value("OPENAI_BASE_URL", ...) calls across main.py,
setup.py, and tools_config.py
- Remove OPENAI_BASE_URL env var reading from runtime_provider.py, cli.py,
models.py, and gateway/run.py
- Remove LLM_MODEL/HERMES_MODEL env var reading from gateway/run.py and
auxiliary_client.py — config.yaml model.default is authoritative
- Vision base URL now saved to config.yaml auxiliary.vision.base_url
(both setup wizard and tools_config paths)
- Tests updated to set config values instead of env vars
Convention enforced: .env is for SECRETS only (API keys). All other
configuration (model names, base URLs, provider selection) lives
exclusively in config.yaml.
Move OpenRouter to position 1 in the setup wizard's provider list
to match hermes model ordering. Update default selection index and
fix test expectations for the new ordering.
Setup order: OpenRouter → Nous Portal → Codex → Custom → ...
Replace the fragile hardcoded context length system with a multi-source
resolution chain that correctly identifies context windows per provider.
Key changes:
- New agent/models_dev.py: Fetches and caches the models.dev registry
(3800+ models across 100+ providers with per-provider context windows).
In-memory cache (1hr TTL) + disk cache for cold starts.
- Rewritten get_model_context_length() resolution chain:
0. Config override (model.context_length)
1. Custom providers per-model context_length
2. Persistent disk cache
3. Endpoint /models (local servers)
4. Anthropic /v1/models API (max_input_tokens, API-key only)
5. OpenRouter live API (existing, unchanged)
6. Nous suffix-match via OpenRouter (dot/dash normalization)
7. models.dev registry lookup (provider-aware)
8. Thin hardcoded defaults (broad family patterns)
9. 128K fallback (was 2M)
- Provider-aware context: same model now correctly resolves to different
context windows per provider (e.g. claude-opus-4.6: 1M on Anthropic,
128K on GitHub Copilot). Provider name flows through ContextCompressor.
- DEFAULT_CONTEXT_LENGTHS shrunk from 80+ entries to ~16 broad patterns.
models.dev replaces the per-model hardcoding.
- CONTEXT_PROBE_TIERS changed from [2M, 1M, 512K, 200K, 128K, 64K, 32K]
to [128K, 64K, 32K, 16K, 8K]. Unknown models no longer start at 2M.
- hermes model: prompts for context_length when configuring custom
endpoints. Supports shorthand (32k, 128K). Saved to custom_providers
per-model config.
- custom_providers schema extended with optional models dict for
per-model context_length (backward compatible).
- Nous Portal: suffix-matches bare IDs (claude-opus-4-6) against
OpenRouter's prefixed IDs (anthropic/claude-opus-4.6) with dot/dash
normalization. Handles all 15 current Nous models.
- Anthropic direct: queries /v1/models for max_input_tokens. Only works
with regular API keys (sk-ant-api*), not OAuth tokens. Falls through
to models.dev for OAuth users.
Tests: 5574 passed (18 new tests for models_dev + updated probe tiers)
Docs: Updated configuration.md context length section, AGENTS.md
Co-authored-by: Test <test@test.com>
Add first-class GitHub Copilot and Copilot ACP provider support across
model selection, runtime provider resolution, CLI sessions, delegated
subagents, cron jobs, and the Telegram gateway.
This also normalizes Copilot model catalogs and API modes, introduces a
Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by
resolving Homebrew-installed gh binaries under launchd.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Salvaged from PR #1708 by @kartikkabadi. Cherry-picked with authorship preserved.
Fixes pre-existing test failures from setup TTS prompt flow changes and environment-sensitive assumptions.
Co-authored-by: Kartik <user2@RentKars-MacBook-Air.local>
Add Kilo Gateway (kilo.ai) as an API-key provider with OpenAI-compatible
endpoint at https://api.kilo.ai/api/gateway. Supports 500+ models from
Anthropic, OpenAI, Google, xAI, Mistral, MiniMax via a single API key.
- Register kilocode in PROVIDER_REGISTRY with aliases (kilo, kilo-code,
kilo-gateway) and KILOCODE_API_KEY / KILOCODE_BASE_URL env vars
- Add to model catalog, CLI provider menu, setup wizard, doctor checks
- Add google/gemini-3-flash-preview as default aux model
- 12 new tests covering registration, aliases, credential resolution,
runtime config
- Documentation updates (env vars, config, fallback providers)
- Fix setup test index shift from provider insertion
Inspired by PR #1473 by @amanning3390.
Co-authored-by: amanning3390 <amanning3390@users.noreply.github.com>
Add regression coverage for the new provider-aware vision setup flow and make the default OpenAI choice write AUXILIARY_VISION_MODEL so auxiliary vision requests don't fall back to the main model slug.