fix(cli): prevent multiple reasoning boxes from rendering

Suppress further reasoning rendering once the response box is open. Without this check, a late thinking block (e.g. after an interrupt) could draw a reasoning box inside the response box. The check keeps the CLI output clean.
Teknium
2026-03-21 06:28:47 -07:00
parent 2da79b13df
commit eb537b5db4

6
cli.py

@@ -1473,9 +1473,15 @@ class HermesCLI:
Opens a dim reasoning box on first token, streams line-by-line.
The box is closed automatically when content tokens start arriving
(via _stream_delta → _emit_stream_text).
Once the response box is open, suppress any further reasoning
rendering — a late thinking block (e.g. after an interrupt) would
otherwise draw a reasoning box inside the response box.
"""
        if not text:
            return
        if getattr(self, "_stream_box_opened", False):
            return
        # Open reasoning box on first reasoning token
        if not getattr(self, "_reasoning_box_opened", False):
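
The effect of the new guard can be illustrated with a minimal, self-contained sketch. Only the flag names `_stream_box_opened` and `_reasoning_box_opened` come from the diff. The class `ReasoningStreamSketch`, its method names, the bracketed box markers, and the way the reasoning box is closed are hypothetical stand-ins, not the actual `HermesCLI` implementation.

```python
import io


class ReasoningStreamSketch:
    """Hypothetical stand-in for the CLI's streaming state (not the real HermesCLI)."""

    def __init__(self, out):
        self.out = out
        self._reasoning_box_opened = False
        self._stream_box_opened = False

    def stream_reasoning(self, text):
        if not text:
            return
        # Guard added by this commit: once the response box is open,
        # drop any further reasoning text instead of rendering it.
        if getattr(self, "_stream_box_opened", False):
            return
        # Open the reasoning box on the first reasoning token.
        if not getattr(self, "_reasoning_box_opened", False):
            self._reasoning_box_opened = True
            self.out.write("[reasoning]\n")
        self.out.write(text)

    def stream_content(self, text):
        # Assumed behavior: the first content token closes the reasoning
        # box (if one is open) and opens the response box.
        if not self._stream_box_opened:
            if self._reasoning_box_opened:
                self.out.write("\n[/reasoning]\n")
                self._reasoning_box_opened = False
            self._stream_box_opened = True
            self.out.write("[response]\n")
        self.out.write(text)


def demo():
    buf = io.StringIO()
    cli = ReasoningStreamSketch(buf)
    cli.stream_reasoning("thinking")
    cli.stream_content("answer")
    cli.stream_reasoning("late thought")  # suppressed by the guard
    return buf.getvalue()
```

In this sketch, removing the guard would let the late call reopen a reasoning box after `[response]`, because the reasoning flag was reset when the box closed. That nesting is what the commit message describes.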