[eval] qwen3:30b needs num_ctx cap to avoid OOM on 36GB Mac #83

Closed
opened 2026-03-14 21:34:56 +00:00 by hermes · 0 comments
Collaborator

Observed: qwen3:30b with the default (large) context window consumed 45.1GB in total, forcing 10.6GB of swap on a 36GB Mac. All requests timed out.

Fixed: Added an ollama_num_ctx setting (default 4096) to config.py and passed it through to the Agno Ollama options in agent.py and agents/base.py. With a 4096-token context, the model uses 19GB.
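
A minimal sketch of how such a setting could be resolved, assuming it is read from an OLLAMA_NUM_CTX environment variable and falls back to 4096. The helper name `ollama_options` is hypothetical and not taken from config.py; only the `num_ctx` option key and the 4096 default come from this issue.

```python
import os

DEFAULT_NUM_CTX = 4096  # default named in this issue


def ollama_options(env=None):
    """Build the options dict for Ollama (hypothetical helper, not the repo's code).

    Reads OLLAMA_NUM_CTX from `env` (defaults to os.environ) and falls back to
    DEFAULT_NUM_CTX when the variable is missing, non-numeric, or not positive.
    """
    env = os.environ if env is None else env
    raw = env.get("OLLAMA_NUM_CTX", "")
    try:
        num_ctx = int(raw)
    except ValueError:
        num_ctx = DEFAULT_NUM_CTX
    if num_ctx <= 0:
        num_ctx = DEFAULT_NUM_CTX
    return {"num_ctx": num_ctx}
```

The resulting dict would then be handed to whatever options argument the Agno Ollama model accepts in agent.py and agents/base.py; that wiring is assumed here, not shown.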

Remaining:

  • Add OLLAMA_NUM_CTX to .env.example
  • Startup health check that warns if model exceeds 70% system RAM
  • Consider auto-detecting available memory
  • Update providers.yaml context_window field (currently says 128000)
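
The startup health check from the list above could look roughly like the sketch below. All function names are hypothetical, the model's memory footprint is assumed to be supplied by the caller, and RAM detection via `os.sysconf` is assumed to be available on the host (it is POSIX-only).

```python
import os
import warnings

RAM_WARN_FRACTION = 0.70  # the 70% threshold proposed in this issue
GB = 10**9


def total_ram_bytes():
    """Physical RAM in bytes, via POSIX sysconf (assumed available on the host)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def exceeds_ram_budget(model_bytes, ram_bytes, fraction=RAM_WARN_FRACTION):
    """True if the model's footprint is above `fraction` of system RAM."""
    return model_bytes > fraction * ram_bytes


def check_model_fits(model_bytes, ram_bytes=None):
    """Warn and return False if the model exceeds the RAM budget, else return True."""
    ram_bytes = total_ram_bytes() if ram_bytes is None else ram_bytes
    if exceeds_ram_budget(model_bytes, ram_bytes):
        warnings.warn(
            f"Model needs {model_bytes / GB:.1f}GB, more than "
            f"{RAM_WARN_FRACTION:.0%} of {ram_bytes / GB:.1f}GB system RAM; "
            "consider lowering OLLAMA_NUM_CTX."
        )
        return False
    return True
```

With the figures from this issue, the 19GB footprint at 4096 context passes on a 36GB machine, while the 45.1GB footprint at the default context would trigger the warning.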

Reference: Rockachopa/Timmy-time-dashboard#83