[claude] Configure Qwen3-14B Q5_K_M as Timmy primary brain (#1064) #1145

Closed

claude wants to merge 1 commit from claude/issue-1064 into main

Author: Alexander Whitestone
SHA1: 9c916e1c5d
Message:

feat: configure Qwen3-14B Q5_K_M as Timmy primary brain

Tests / lint (pull_request) Failing after 16s

Tests / test (pull_request) Has been skipped

Fixes #1064

- Modelfile.timmy: rebase from ~/timmy-fused-model.gguf (Hermes4 LoRA)
  to qwen3:14b; add min_p 0.02, num_predict 4096, explicit stop tokens
  (<|im_end|>, <|im_start|>), and a full sovereign-AI system prompt.
  Memory budget: ~10.5 GB model + ~7 GB KV cache = ~17.5 GB at 32K ctx.

- config.py: change default ollama_model to "timmy", bump ollama_num_ctx
  to 32768 to match the Modelfile; add qwen3:14b as first text fallback.

- config/providers.yaml: promote "timmy" to default model (Qwen3-14B
  Q5_K_M); add qwen3:14b entry; refresh fallback_chains (tools + text)
  to lead with timmy → qwen3:14b; note Hermes4 LoRA path superseded.

- multimodal.py: add qwen3, qwen3:14b, qwen3:30b, timmy, hermes4-14b to
  KNOWN_MODEL_CAPABILITIES; add timmy + qwen3:14b to TOOLS fallback chain.

- prompts.py: correct "small 4096 token context" limitation to 32K.
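
The fallback ordering described above (timmy first, then qwen3:14b) can be illustrated with a minimal sketch. This is not the code in config.py, config/providers.yaml, or multimodal.py; the names TEXT_FALLBACK_CHAIN and pick_model are hypothetical and exist only for illustration. It assumes the caller already has a list of locally installed model names.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical chain mirroring the order named in the commit message.
# The real entries live in config/providers.yaml and multimodal.py.
TEXT_FALLBACK_CHAIN = ["timmy", "qwen3:14b"]


def pick_model(chain: Sequence[str], available: Iterable[str]) -> Optional[str]:
    """Return the first model in `chain` that appears in `available`.

    Returns None when no model in the chain is installed, so the caller
    can decide how to fail.
    """
    installed = set(available)
    for name in chain:
        if name in installed:
            return name
    return None


if __name__ == "__main__":
    # "timmy" has not been created yet, so the chain falls through to the base model.
    print(pick_model(TEXT_FALLBACK_CHAIN, ["qwen3:14b", "llama3"]))
```

Putting "timmy" ahead of "qwen3:14b" means a machine where `ollama create` has not yet been run still resolves to the base model rather than failing outright.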

Build commands (manual, run on the M3 Max):
  ollama pull qwen3:14b
  ollama create timmy -f Modelfile.timmy
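
Before building on the target machine, the memory budget quoted in the Modelfile.timmy item can be sanity-checked with a rough sketch like the one below. It uses only the commit's approximate figures (~10.5 GB weights, ~7 GB KV cache at 32K context) and assumes the KV cache scales linearly with num_ctx. The function names, the available_gb argument, and the 2 GB headroom default are hypothetical; actual usage also depends on runtime overhead that this estimate ignores.

```python
# Approximate figures quoted in the commit message (at 32K context).
MODEL_WEIGHTS_GB = 10.5
KV_CACHE_GB_AT_32K = 7.0
BASELINE_CTX = 32768


def estimate_memory_gb(num_ctx: int) -> float:
    """Estimate total memory: constant weights plus a KV cache
    assumed to scale linearly with the context length."""
    kv_cache_gb = KV_CACHE_GB_AT_32K * num_ctx / BASELINE_CTX
    return MODEL_WEIGHTS_GB + kv_cache_gb


def fits(num_ctx: int, available_gb: float, headroom_gb: float = 2.0) -> bool:
    """Return True if the estimate plus a safety margin fits in available_gb.

    headroom_gb is an arbitrary illustrative margin, not a measured value.
    """
    return estimate_memory_gb(num_ctx) + headroom_gb <= available_gb


if __name__ == "__main__":
    for ctx in (8192, 16384, 32768):
        print(ctx, round(estimate_memory_gb(ctx), 2))
```

At num_ctx 32768 the estimate reproduces the commit's ~17.5 GB total; halving the context halves only the KV cache share, not the weights.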

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Date: 2026-03-23 14:36:22 -04:00
