feat: allow custom endpoints to use responses API via api_mode override (#1651)

Add HERMES_API_MODE env var and model.api_mode config field to let
custom OpenAI-compatible endpoints opt into codex_responses mode
without requiring the OpenAI Codex OAuth provider path.

- _get_configured_api_mode() reads the HERMES_API_MODE env var first
  (takes precedence), then model.api_mode from config.yaml; values are
  validated against a whitelist
- Applied in both _resolve_openrouter_runtime() and
  _resolve_named_custom_runtime() (original PR only covered openrouter)
- Fix _dump_api_request_debug() to show /responses URL when in
  codex_responses mode instead of always showing /chat/completions
- Tests for config override, env override, invalid values, named
  custom providers, and debug dump URL for both API modes

Inspired by PR #1041 by @mxyhi.

Co-authored-by: mxyhi <mxyhi@users.noreply.github.com>
Author: Teknium
Date: 2026-03-17 02:04:36 -07:00
Committed by: GitHub
Parent: 68fbcdaa06
Commit: f2414bfd45
4 changed files with 131 additions and 4 deletions

@@ -1351,7 +1351,7 @@ class AIAgent:
         error: Optional[Exception] = None,
     ) -> Optional[Path]:
         """
-        Dump a debug-friendly HTTP request record for chat.completions.create().
+        Dump a debug-friendly HTTP request record for the active inference API.
         Captures the request body from api_kwargs (excluding transport-only keys
         like timeout). Intended for debugging provider-side 4xx failures where
@@ -1374,7 +1374,7 @@ class AIAgent:
             "reason": reason,
             "request": {
                 "method": "POST",
-                "url": f"{self.base_url.rstrip('/')}/chat/completions",
+                "url": f"{self.base_url.rstrip('/')}{'/responses' if self.api_mode == 'codex_responses' else '/chat/completions'}",
                 "headers": {
                     "Authorization": f"Bearer {self._mask_api_key_for_logs(api_key)}",
                     "Content-Type": "application/json",