News · Reddit r/LocalLLaMA · Trust 58 · Community · Published 6d ago · Live · 5d ago

We built a calibration-aware Q4_K_M quant of Qwen3.5 0.8B that recovers 96.5% of the gap between stock llama.cpp Q4_K_M and BF16 (SpectralQuant)
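
The headline figure is easiest to read as a gap-recovery ratio: how much of the distance between the stock Q4_K_M quant and the BF16 reference the new quant closes. The post title does not say which metric was used or how the ratio was computed, so the sketch below is an assumption: it takes a lower-is-better metric such as perplexity, and the function name `gap_recovered` and all numbers are hypothetical, chosen only to show the arithmetic.

```python
def gap_recovered(metric_bf16: float, metric_baseline_q4: float, metric_new_q4: float) -> float:
    """Fraction of the baseline-to-BF16 gap closed by the new quant.

    Assumes a lower-is-better metric (for example perplexity or KL divergence).
    This definition is an assumption; the post title does not spell it out.
    """
    gap = metric_baseline_q4 - metric_bf16
    if gap == 0:
        raise ValueError("baseline already matches BF16, so the gap is undefined")
    return (metric_baseline_q4 - metric_new_q4) / gap


# Hypothetical perplexities, not results from the post.
example = gap_recovered(metric_bf16=10.0, metric_baseline_q4=12.0, metric_new_q4=10.5)
print(f"{example:.1%}")
```

Under this reading, 100% would mean the calibrated quant matches BF16 on the chosen metric and 0% would mean no improvement over stock Q4_K_M.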