feat: default reasoning effort from xhigh to medium

Reduces token usage and latency for most tasks by defaulting to
medium reasoning effort instead of xhigh. Users can still override
via config or CLI flag. Updates code, tests, example config, and docs.
teknium1
2026-03-07 10:14:19 -08:00
parent 23e84de830
commit b84f9e410c
9 changed files with 25 additions and 24 deletions


@@ -421,10 +421,10 @@ Control how much "thinking" the model does before responding:
```yaml
agent:
-  reasoning_effort: ""  # empty = use model default. Options: xhigh (max), high, medium, low, minimal, none
+  reasoning_effort: ""  # empty = medium (default). Options: xhigh (max), high, medium, low, minimal, none
```
-When unset (default), the model's own default reasoning level is used. Setting a value overrides it — higher reasoning effort gives better results on complex tasks at the cost of more tokens and latency.
+When unset (default), reasoning effort defaults to "medium" — a balanced level that works well for most tasks. Setting a value overrides it — higher reasoning effort gives better results on complex tasks at the cost of more tokens and latency.
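
The fallback described in this hunk can be sketched as follows. This is a minimal illustration, not the code changed by this commit: `resolve_reasoning_effort`, `VALID_EFFORTS`, and `DEFAULT_EFFORT` are hypothetical names, and the precedence (CLI flag over config value over default) is an assumption, since the commit message only says both overrides exist.

```python
from typing import Optional

# Option names taken from the config comment in the diff above.
VALID_EFFORTS = ("xhigh", "high", "medium", "low", "minimal", "none")
# New default introduced by this commit.
DEFAULT_EFFORT = "medium"


def resolve_reasoning_effort(
    config_value: Optional[str] = None,
    cli_value: Optional[str] = None,
) -> str:
    """Return the effective reasoning effort (hypothetical helper).

    Assumed precedence: CLI flag, then config value, then the default.
    Empty or missing values are skipped, so `reasoning_effort: ""`
    falls through to "medium".
    """
    for candidate in (cli_value, config_value):
        if candidate is None:
            continue
        normalized = candidate.strip().lower()
        if not normalized:
            continue  # empty string means "not set"
        if normalized not in VALID_EFFORTS:
            raise ValueError(
                f"unknown reasoning effort {candidate!r}; "
                f"expected one of {', '.join(VALID_EFFORTS)}"
            )
        return normalized
    return DEFAULT_EFFORT
```

Under these assumptions, `resolve_reasoning_effort("")` yields `"medium"`, while `resolve_reasoning_effort("low", cli_value="xhigh")` yields `"xhigh"` because the CLI value is checked first.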
## TTS Configuration