feat: add OLLAMA_NUM_CTX config to cap context window (#83)
- Add ollama_num_ctx setting (default 4096) to config.py
- Pass num_ctx option to Ollama in agent.py and agents/base.py
- Add OLLAMA_NUM_CTX to .env.example with usage docs
- Add context_window note in providers.yaml
- Fix mock_settings in test_agent.py for new attribute
- qwen3:30b with 4096 ctx uses ~19GB vs 45GB default
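The flow described above can be sketched as follows. This is a hypothetical minimal version, not the actual code from config.py or agent.py: the helper name `ollama_options` and its signature are illustrative, but the env var name, the default of 4096, and the `num_ctx` option key come from the commit message (`num_ctx` is a real Ollama option, passed via the `options` dict of calls like `ollama.chat`).

```python
import os

# Default context cap from the commit message; overridable via OLLAMA_NUM_CTX.
DEFAULT_NUM_CTX = 4096

def ollama_options(env=None):
    """Build an Ollama options dict, capping the context window.

    `env` defaults to os.environ; a dict can be injected for testing.
    The returned dict is meant to be passed as the `options` argument,
    e.g. ollama.chat(model=..., messages=..., options=ollama_options()).
    """
    env = os.environ if env is None else env
    num_ctx = int(env.get("OLLAMA_NUM_CTX", DEFAULT_NUM_CTX))
    return {"num_ctx": num_ctx}
```

Capping `num_ctx` this way trades maximum prompt length for memory: the model still advertises a 128k context, but Ollama only allocates KV cache for the capped size.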
This commit is contained in:
@@ -28,6 +28,7 @@ providers:
   - name: qwen3.5:latest
     default: true
     context_window: 128000
+    # Note: actual context is capped by OLLAMA_NUM_CTX (default 4096) to save RAM
     capabilities: [text, tools, json, streaming]
   - name: llama3.1:8b-instruct
     context_window: 128000