feat: add OLLAMA_NUM_CTX config to cap context window (#83)
- Add ollama_num_ctx setting (default 4096) to config.py
- Pass num_ctx option to Ollama in agent.py and agents/base.py
- Add OLLAMA_NUM_CTX to .env.example with usage docs
- Add context_window note in providers.yaml
- Fix mock_settings in test_agent.py for new attribute
- qwen3:30b with 4096 ctx uses ~19GB vs 45GB default
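The flow described above can be sketched as follows. This is a hypothetical minimal version, not the actual code from config.py or agent.py: the helper name `ollama_options` and its signature are illustrative, but the env var name, the default of 4096, and the `num_ctx` option key come from the commit message (`num_ctx` is a real Ollama option, passed via the `options` dict of calls like `ollama.chat`).

```python
import os

# Default context cap from the commit message; overridable via OLLAMA_NUM_CTX.
DEFAULT_NUM_CTX = 4096

def ollama_options(env=None):
    """Build an Ollama options dict, capping the context window.

    `env` defaults to os.environ; a dict can be injected for testing.
    The returned dict is meant to be passed as the `options` argument,
    e.g. ollama.chat(model=..., messages=..., options=ollama_options()).
    """
    env = os.environ if env is None else env
    num_ctx = int(env.get("OLLAMA_NUM_CTX", DEFAULT_NUM_CTX))
    return {"num_ctx": num_ctx}
```

Capping `num_ctx` this way trades maximum prompt length for memory: the model still advertises a 128k context, but Ollama only allocates KV cache for the capped size.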
This commit is contained in:
@@ -28,6 +28,7 @@ providers:
   - name: qwen3.5:latest
     default: true
     context_window: 128000
+    # Note: actual context is capped by OLLAMA_NUM_CTX (default 4096) to save RAM
     capabilities: [text, tools, json, streaming]
   - name: llama3.1:8b-instruct
     context_window: 128000