feat: allow custom endpoints to use responses API via api_mode override (#1651)

Add HERMES_API_MODE env var and model.api_mode config field to let
custom OpenAI-compatible endpoints opt into codex_responses mode
without requiring the OpenAI Codex OAuth provider path.

- _get_configured_api_mode() reads the HERMES_API_MODE env var first
  (takes precedence), then model.api_mode from config.yaml; values are
  validated against a whitelist
- Applied in both _resolve_openrouter_runtime() and
  _resolve_named_custom_runtime() (original PR only covered openrouter)
- Fix _dump_api_request_debug() to show /responses URL when in
  codex_responses mode instead of always showing /chat/completions
- Tests for config override, env override, invalid values, named
  custom providers, and debug dump URL for both API modes

Inspired by PR #1041 by @mxyhi.

Co-authored-by: mxyhi <mxyhi@users.noreply.github.com>
Author: Teknium
Date: 2026-03-17 02:04:36 -07:00
Committed by: GitHub
Parent: 68fbcdaa06
Commit: f2414bfd45
4 changed files with 131 additions and 4 deletions

@@ -1351,7 +1351,7 @@ class AIAgent:
         error: Optional[Exception] = None,
     ) -> Optional[Path]:
         """
-        Dump a debug-friendly HTTP request record for chat.completions.create().
+        Dump a debug-friendly HTTP request record for the active inference API.
         Captures the request body from api_kwargs (excluding transport-only keys
         like timeout). Intended for debugging provider-side 4xx failures where
@@ -1374,7 +1374,7 @@ class AIAgent:
             "reason": reason,
             "request": {
                 "method": "POST",
-                "url": f"{self.base_url.rstrip('/')}/chat/completions",
+                "url": f"{self.base_url.rstrip('/')}{'/responses' if self.api_mode == 'codex_responses' else '/chat/completions'}",
                 "headers": {
                     "Authorization": f"Bearer {self._mask_api_key_for_logs(api_key)}",
                     "Content-Type": "application/json",