hermes-agent


teknium1 71e81728ac feat: Codex OAuth vision support + multimodal content adapter

The Codex Responses API (chatgpt.com/backend-api/codex) supports
vision via gpt-5.3-codex. This was verified with real API calls
using image analysis.

Changes to _CodexCompletionsAdapter:
- Added _convert_content_for_responses() to translate chat.completions
  multimodal format to Responses API format:
  - {type: 'text'} → {type: 'input_text'}
  - {type: 'image_url', image_url: {url: '...'}} → {type: 'input_image', image_url: '...'}
- Fixed: removed 'stream' from resp_kwargs (responses.stream() handles it)
- Fixed: removed max_output_tokens and temperature (Codex endpoint rejects them)
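
The content translation and kwarg cleanup listed above might look roughly like the sketch below. It is not the actual code from this commit: the function names `convert_content_for_responses` and `strip_unsupported_kwargs` are hypothetical, and the pass-through handling of plain strings and unknown part types is an assumption.

```python
from typing import Any, Dict, List, Union

# Keys the commit message says must not be sent to the Codex endpoint.
UNSUPPORTED_KWARGS = ("stream", "max_output_tokens", "temperature")


def convert_content_for_responses(
    content: Union[str, List[Dict[str, Any]], None],
) -> Union[str, List[Dict[str, Any]], None]:
    """Translate chat.completions content parts into Responses-style parts."""
    # Assumption: plain strings (and None) need no translation.
    if content is None or isinstance(content, str):
        return content

    converted: List[Dict[str, Any]] = []
    for part in content:
        kind = part.get("type")
        if kind == "text":
            # {type: 'text'} -> {type: 'input_text'}
            converted.append({"type": "input_text", "text": part.get("text", "")})
        elif kind == "image_url":
            # {type: 'image_url', image_url: {url: ...}} -> flat image_url string
            image = part.get("image_url")
            url = image.get("url", "") if isinstance(image, dict) else (image or "")
            converted.append({"type": "input_image", "image_url": url})
        else:
            # Assumption: unknown part types are forwarded unchanged.
            converted.append(part)
    return converted


def strip_unsupported_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs without the keys the Codex endpoint rejects."""
    return {k: v for k, v in kwargs.items() if k not in UNSUPPORTED_KWARGS}
```

A message whose content mixes a text part and an image part would come out as one `input_text` part followed by one `input_image` part, with the image URL (or data URL) moved from the nested object to a plain string.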

Provider changes:
- Added 'codex' as explicit auxiliary provider option
- Vision auto-fallback now includes Codex (OpenRouter → Nous → Codex)
  since gpt-5.3-codex supports multimodal input
- Updated docs with Codex OAuth examples
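
As an illustration only, the fallback order described above could be resolved along these lines. The helper `pick_vision_provider`, the `is_available` callback, and the lowercase provider keys `openrouter` and `nous` are assumptions made for this sketch; only the ordering and the `codex` option come from the commit message.

```python
from typing import Callable, Optional, Sequence

# Order taken from the commit message: OpenRouter -> Nous -> Codex.
VISION_FALLBACK_ORDER = ("openrouter", "nous", "codex")


def pick_vision_provider(
    is_available: Callable[[str], bool],
    explicit: Optional[str] = None,
    order: Sequence[str] = VISION_FALLBACK_ORDER,
) -> Optional[str]:
    """Pick a provider for vision calls, or None if none is usable."""
    # An explicitly configured provider (e.g. 'codex') is used as-is
    # and is not silently replaced by another one.
    if explicit is not None:
        return explicit if is_available(explicit) else None
    # Otherwise walk the auto-fallback chain in order.
    for name in order:
        if is_available(name):
            return name
    return None
```

With only Codex credentials present, `pick_vision_provider(lambda name: name == "codex")` returns `"codex"`, which is the case this commit adds to the chain.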

Tested with a real Codex OAuth token and ~/.hermes/image2.png; confirmed
working end-to-end through the full adapter pipeline.

Tests: 2459 passed.

2026-03-08 18:44:33 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| developer-guide | feat: platform-conditional skill loading + Apple/macOS skills | 2026-03-07 00:47:54 -08:00 |
| getting-started | fix: allow self-hosted Firecrawl without API key + add self-hosting docs | 2026-03-05 16:44:21 -08:00 |
| reference | docs: add dedicated /compress command documentation | 2026-03-08 17:21:15 -07:00 |
| user-guide | feat: Codex OAuth vision support + multimodal content adapter | 2026-03-08 18:44:33 -07:00 |
| index.md | docs: rebrand messaging — 'the self-improving AI agent' | 2026-03-06 04:34:06 -08:00 |