Commit Graph

4 Commits

Author SHA1 Message Date
vincent
86eed141af fix: rebuild compressed payload before retry 2026-03-07 18:55:01 -05:00
teknium1
8311e8984b fix: preflight context compression + error handler ordering for model switches
Two fixes for the case where a user switches to a model with a smaller
context window while having a large existing session:

1. Preflight compression in run_conversation(): Before the main loop,
   estimate tokens of loaded history + system prompt. If it exceeds the
   model's compression threshold (85% of context), compress proactively
   with up to 3 passes. This naturally handles model switches because
   the gateway creates a fresh AIAgent per message with the current
   model's context length.

2. Error handler reordering: Context-length errors (400 with 'maximum
   context length' etc.) are now checked BEFORE the generic 4xx handler.
   Previously, OpenRouter's 400-status context-length errors were caught
   as non-retryable client errors and aborted immediately, never reaching
   the compression+retry logic.

Reported by Sonicrida on Discord: 840-message session (2MB+) crashed
after switching from a large-context model to minimax via OpenRouter.
2026-03-04 14:42:41 -08:00
teknium1
19f28a633a fix(agent): enhance 413 error handling and improve conversation history management in tests 2026-02-27 23:04:32 -08:00
tekelala
79bd65034c fix(agent): handle 413 payload-too-large via compression instead of aborting
The 413 "Request Entity Too Large" error from the LLM API was caught by the
generic 4xx handler which aborts immediately. This is wrong for 413 — it's a
payload-size issue that can be resolved by compressing conversation history.

- Intercept 413 before the generic 4xx block and route to _compress_context
- Exclude 413 from generic is_client_error detection
- Add 'request entity too large' to context-length phrases as safety net
- Add tests for 413 compression behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 12:21:27 -05:00