[Study] Best Local Uncensored Agent Model for M3 Max 36GB #1063
Closed
opened 2026-03-23 12:51:52 +00:00 by perplexity
·
1 comment
No Branch/Tag Specified
main
gemini/issue-892
claude/issue-1342
claude/issue-1346
claude/issue-1351
claude/issue-1340
fix/test-llm-triage-syntax
gemini/issue-1014
gemini/issue-932
claude/issue-1277
claude/issue-1139
claude/issue-870
claude/issue-1285
claude/issue-1292
claude/issue-1281
claude/issue-917
claude/issue-1275
claude/issue-925
claude/issue-1019
claude/issue-1094
claude/issue-1019-v3
fix/flaky-vassal-xdist-tests
fix/test-config-env-isolation
claude/issue-1019-v2
claude/issue-957-v2
claude/issue-1218
claude/issue-1217
test/chat-store-unit-tests
claude/issue-1191
claude/issue-1186
claude/issue-957
gemini/issue-936
claude/issue-1065
gemini/issue-976
gemini/issue-1149
claude/issue-1135
claude/issue-1064
gemini/issue-1012
claude/issue-1095
claude/issue-1102
claude/issue-1114
gemini/issue-978
gemini/issue-971
claude/issue-1074
claude/issue-987
claude/issue-1011
feature/internal-monologue
feature/issue-1006
feature/issue-1007
feature/issue-1008
feature/issue-1009
feature/issue-1010
feature/issue-1011
feature/issue-1012
feature/issue-1013
feature/issue-1014
feature/issue-981
feature/issue-982
feature/issue-983
feature/issue-984
feature/issue-985
feature/issue-986
feature/issue-987
feature/issue-993
claude/issue-943
claude/issue-975
claude/issue-989
claude/issue-988
fix/loop-guard-gitea-api-and-queue-validation
feature/lhf-tech-debt-fixes
kimi/issue-753
kimi/issue-714
kimi/issue-716
fix/csrf-check-before-execute
chore/migrate-gitea-to-vps
kimi/issue-640
fix/utcnow-calm-py
kimi/issue-635
kimi/issue-625
fix/router-api-truncated-param
kimi/issue-604
kimi/issue-594
review-fixes
kimi/issue-570
kimi/issue-554
kimi/issue-539
kimi/issue-540
feature/ipad-v1-api
kimi/issue-506
kimi/issue-512
refactor/airllm-doc-cleanup
kimi/issue-513
kimi/issue-514
kimi/issue-500
kimi/issue-492
kimi/issue-490
kimi/issue-459
kimi/issue-472
kimi/issue-473
kimi/issue-462
kimi/issue-463
kimi/issue-454
kimi/issue-445
kimi/issue-446
kimi/issue-431
GoldenRockachopa
hermes/v0.1
Labels
Clear labels
222-epic
actionable
assigned-claude
assigned-gemini
assigned-groq
assigned-kimi
assigned-manus
claude-ready
consolidation
deprioritized
deprioritized
duplicate
gemini-review
groq-ready
harness
heartbeat
inference
infrastructure
kimi-ready
memory-session
morrowind
needs-design
needs-extraction
p0-critical
p1-important
p2-backlog
philosophy
rejected-direction
seed:know-purpose
seed:serve-real
seed:tell-truth
sovereignty
Workshop: Timmy as Presence (Epic #222)
Has a concrete code/config task extracted
Issue currently assigned to Claude agent — do not assign to another agent
Issue currently assigned to Gemini agent — do not assign to another agent
Issue currently assigned to Kimi agent — do not assign to another agent
Issue currently assigned to Manus agent — do not assign to another agent
Part of a consolidation epic
Keep open but not blocking P0 work
Keep open but not blocking P0 work
Duplicate of another issue
Auto-generated by Gemini, needs relevance review
Core product: agent framework, heartbeat, inference, memory
Harness: Agent heartbeat loop
Harness: Inference and model routing
Supporting stage: dashboard, CI/CD, deployment, DNS
Scoped and ready for Kimi to pick up
Harness: Memory and session crystallization
Harness: Morrowind embodiment
Needs architectural design before implementation
Philosophy with unextracted engineering work
Priority 0: Must fix now
Priority 1: Important, next sprint
Priority 2: Backlog, do when time permits
Philosophical foundation — informs architecture decisions
Closed: rejected or superseded direction
Three Seeds: KNOW YOUR PURPOSE
Three Seeds: SERVE THE REAL
Three Seeds: TELL THE TRUTH
Harness: Sovereignty stack
No Label
Milestone
No items
No Milestone
Projects
Clear projects
No project
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: Rockachopa/Timmy-time-dashboard#1063
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Source
PDF:
The-Best-Local-Uncensored-Agent-Model-for-M3-Max-36GB.pdfSubmitted by: rockachopa
Summary
Definitive model selection research for Timmy's local brain on Apple Silicon M3 Max with 36GB unified memory. The document evaluates quantized open-source models for agent orchestration — tool calling, code generation, shell execution, issue triage, and creative writing — under the hard constraint of 28GB usable VRAM (after ~8GB macOS/app overhead).
Key Findings
Primary Recommendation: Qwen3-14B Q5_K_M
qwen3:14b| GGUF source:bartowski/Qwen3-14B-GGUFRunner-up: Dolphin 3.0-R1-Mistral-24B Q4_K_M
Fast Mode: Qwen3-8B Q6_K
Two-Model Strategy (Recommended)
OLLAMA_MAX_LOADED_MODELS=2Critical Insight: "Uncensored" is a Red Herring
Hermes 3 8B — Notable Mention
Ollama vs MLX Performance
MLX is 25–50% faster than Ollama, but Ollama has the superior ecosystem for agent orchestration (built-in tool calling API, JSON mode, model management, OpenAI-compatible endpoint).
Includes: Production-Ready Artifacts
MCP Integration Path
pip install qwen-agent[mcp])Cross-References
Work Suggestions
See child issues for actionable implementation tasks.
PR created: #1143
Artifacts delivered from the study:
Modelfile.qwen3-14b— Primary agent model (Q5_K_M, 32K ctx, temp 0.3). Tool calling F1 0.971, ~17.5 GB on M3 Max 36 GB.Modelfile.qwen3-8b— Fast routing model (Q6_K, 32K ctx, temp 0.2). F1 0.933 at ~45–55 tok/s, ~11.6 GB. Both models combined: ~17 GB — stay loaded simultaneously withOLLAMA_MAX_LOADED_MODELS=2.scripts/benchmark_local_model.sh— 5-test evaluation suite (tool call compliance, code gen, shell gen, multi-turn coherence, issue triage quality).src/config.py— Updated defaults:ollama_model → qwen3:14b,ollama_num_ctx → 32768, addedollama_fast_model = qwen3:8bandollama_max_loaded_models = 2.All 20 unit tests pass.