News · Reddit r/LocalLLaMA · Trust 58 · Community · Published 7d ago

Help optimizing llama.cpp + Qwen 27B on RTX PRO 6000 Blackwell for coding agents

Our company recently acquired a workstation with an RTX PRO 6000 Blackwell, and we're experimenting with local LLMs to reduce part of our Claude token usage. Right now we're running Qwen3.6 27B MTP Q8_K_XL with llama.cpp on Windows 11. I've been using both Claude Opus and Sonnet for a while, and my impression is that this model feels somewhat comparable to Sonnet, but a bit weaker
