hermes-agent/tests/run_agent at c83674dd772ed84c65d6934682995eae6ed9dbe7 - hermes-agent - Hermes Gitea

Timmy_Foundation/hermes-agent

Teknium d6785dc4d4 fix: empty response recovery for reasoning models (mimo, qwen, GLM) (#8609)

Three fixes for the (empty) response bug affecting open reasoning models:

1. Allow retries after prefill exhaustion — models like mimo-v2-pro always
   populate reasoning fields via OpenRouter, so the old 'not _has_structured'
   guard on the retry path blocked retries for EVERY reasoning model after
   the 2 prefill attempts.  Now: 1 initial attempt + 2 prefills + 3 retries
   = 6 total attempts before (empty).

2. Reset prefill/retry counters on tool-call recovery — the counters
   accumulated across the entire conversation, never resetting during
   tool-calling turns.  A model cycling empty→prefill→tools→empty burned
   both prefill attempts and the third empty got zero recovery.  Now
   counters reset when prefill succeeds with tool calls.

3. Strip think blocks before _truly_empty check — inline <think> content
   made the string non-empty, skipping both retry paths.
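
The three fixes above can be sketched as a small recovery policy. This is a minimal illustration, not the actual run_agent code: the names `is_truly_empty`, `EmptyRecovery`, `MAX_PREFILLS`, and `MAX_RETRIES` are hypothetical, and the reset is simplified to fire on any response that carries tool calls.

```python
import re
from dataclasses import dataclass

# Hypothetical names throughout; the real run_agent internals may differ.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)

MAX_PREFILLS = 2  # prefill attempts, per the commit message
MAX_RETRIES = 3   # plain retries allowed once prefills are exhausted


def is_truly_empty(content):
    """Return True when nothing remains after removing inline <think> blocks."""
    if content is None:
        return True
    # Fix 3: strip think blocks first so they cannot mask an empty answer.
    return _THINK_BLOCK.sub("", content).strip() == ""


@dataclass
class EmptyRecovery:
    prefills: int = 0
    retries: int = 0

    def next_action(self, content, has_tool_calls=False):
        """Decide how to handle one model response."""
        if has_tool_calls:
            # Fix 2: a tool-calling turn resets the recovery budget.
            self.prefills = 0
            self.retries = 0
            return "run_tools"
        if not is_truly_empty(content):
            return "accept"
        if self.prefills < MAX_PREFILLS:
            self.prefills += 1
            return "prefill"
        # Fix 1: retries are no longer gated on whether the response
        # carried structured reasoning fields.
        if self.retries < MAX_RETRIES:
            self.retries += 1
            return "retry"
        return "give_up"
```

In this sketch a session that keeps returning empty responses gets two prefills and three retries before giving up, and any successful tool-calling turn restores the full budget. Unclosed `<think>` tags are not handled here.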

Reported by users on Telegram with xiaomi/mimo-v2-pro and qwen3.5 models.
Reproduced: qwen3.5-9b emits tool calls as XML in reasoning field instead
of proper function calls, causing content=None + tool_calls=None + reasoning
with embedded <tool_call> XML.  Prefill recovery works but counter
accumulation caused permanent (empty) in long sessions.
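
The reproduced failure shape (no content, no tool calls, tool-call XML inside the reasoning text) could be detected roughly as follows. This is a sketch under assumptions: the message is treated as a plain dict, the `reasoning` key name is assumed, and `has_embedded_tool_call` is a hypothetical helper rather than a function from the repository.

```python
def has_embedded_tool_call(message):
    """Detect a response whose only tool call is XML text in the reasoning field.

    Assumes a dict-shaped message with optional "content", "tool_calls",
    and "reasoning" keys; the real response object may differ.
    """
    content = message.get("content")
    tool_calls = message.get("tool_calls")
    reasoning = message.get("reasoning") or ""
    # content=None + tool_calls=None + reasoning containing <tool_call> XML
    return not content and not tool_calls and "<tool_call>" in reasoning
```

A response matching this shape is a candidate for prefill recovery rather than a final (empty) answer.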

2026-04-12 15:38:11 -07:00

__init__.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_413_compression.py

fix: clear conversation_history after mid-loop compression to prevent empty sessions (#7001)

2026-04-10 00:14:59 -07:00

test_860_dedup.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_1630_context_overflow_loop.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_agent_guardrails.py

fix(delegate): make max_concurrent_children configurable + error on excess

2026-04-10 13:38:14 -07:00

test_agent_loop_tool_calling.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_agent_loop_vllm.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_agent_loop.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_anthropic_error_handling.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_async_httpx_del_neuter.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_compression_boundary.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_compression_feasibility.py

fix(agent): route compression aux through live session runtime

2026-04-12 01:34:52 -07:00

test_compression_persistence.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_compressor_fallback_update.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_context_pressure.py

fix(agent): tiered context pressure warnings + gateway dedup (#6411)

2026-04-08 21:31:44 -07:00

test_context_token_tracking.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_dict_tool_call_args.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_exit_cleanup_interrupt.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_fallback_model.py

fix(model): normalize direct provider ids in auxiliary routing

2026-04-10 05:52:45 -07:00

test_flush_memories_codex.py

fix(agent): respect config timeout for flush_memories instead of hardcoded 30s

2026-04-08 18:55:33 -07:00

test_interactive_interrupt.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_interrupt_propagation.py

fix: scope tool interrupt signal per-thread to prevent cross-session leaks (#7930)

2026-04-11 14:02:58 -07:00

test_long_context_tier_429.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_openai_client_lifecycle.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_percentage_clamp.py

fix: update 6 test files broken by dead code removal

2026-04-10 03:44:43 -07:00

test_primary_runtime_restore.py

fix(run_agent): recover primary client on openai transport errors

2026-04-10 03:21:24 -07:00

test_provider_fallback.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_provider_parity.py

feat: expand /fast to all OpenAI Priority Processing models (#6960)

2026-04-09 22:06:30 -07:00

test_real_interrupt_subagent.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_redirect_stdout_issue.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_run_agent_codex_responses.py

feat(gateway): surface natural mid-turn assistant messages in chat platforms

2026-04-11 16:21:39 -07:00

test_run_agent.py

fix: empty response recovery for reasoning models (mimo, qwen, GLM) (#8609)

2026-04-12 15:38:11 -07:00

test_session_meta_filtering.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_session_reset_fix.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_streaming.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_strict_api_validation.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_switch_model_context.py

fix: pass config_context_length to switch_model context compressor

2026-04-10 05:52:45 -07:00

test_token_persistence_non_cli.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_tool_arg_coercion.py

refactor(tests): re-architect tests + fix CI failures (#5946)

2026-04-07 17:19:07 -07:00

test_unicode_ascii_codec.py

fix(unicode): sanitize surrogate metadata and allow two-pass retry

2026-04-10 13:05:01 -07:00