Reddit r/LocalLLaMA · Trust 58 · Community · Published yesterday
Best coding model for 3x Spark setup?
Hi, our company has dedicated 3x Asus Ascent GX10 (GB10) machines to running a coding model for our dev teams. That's max 30 users, but we expect a concurrency of 5-10. We'd prefer something stable and reliable. I'm trying to figure out what the best model and overall setup would be.

The current setup that seems to be the most effective: vLLM + llama-swap (the classic)

Models:
- something from Qwen, like Qwen 3.5 122B or Qwen 3-coder....
