news · Reddit r/LocalLLaMA · Trust 58 · Community · Published 3d ago · Live · 3d ago

HIP: use hipBLAS for dense prefill on gfx900, keep MMQ for MoE by DEV-DUFORD · Pull Request #24588 · ggml-org/llama.cpp