News · Reddit r/LocalLLaMA · Trust 58 · Community · Published yesterday

Best coding model for 3x Spark setup?

Hi, our company has dedicated 3x Asus Ascent GX10 (GB10) machines to running a coding model for our dev teams (max 30), but we expect a concurrency of 5-10. We'd prefer something stable and reliable. I'm trying to figure out the best model and overall setup.

The current setup that seems to be the most effective: vLLM + llama-swap (the classic).

Models:
- something Qwen-like, such as Qwen 3.5 122B or Qwen 3-coder....

Covers

Model: Helix-7B

Covers (incoming)

Repo: jjang-ai/vmlx
