Open Source · Reddit r/LocalLLaMA · Community · Published 6/29/2026

Slow performance Unsloth Gemma 12B Q8

I recently replaced GPT-OSS 20B Q4 with Gemma 4 12B Q8, but I went from roughly 70 t/s to 10 t/s. Am I doing something wrong? In the current session I am trying a Q5 model, with no change in performance measured against the Q8. [Service] Type=simple User=root WorkingDirectory=/ro

Why it matters

This story from Reddit r/LocalLLaMA is relevant to the Open Source branch of the AI ecosystem and may affect models, products, or research direction.

Technical breakdown
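
One possible reading of the reported drop from roughly 70 t/s to 10 t/s is that token generation is limited by how many weight bytes must be read per token. The sketch below is a back-of-envelope estimate under stated assumptions, not a diagnosis of the poster's setup: it assumes decoding is memory-bandwidth bound, that GPT-OSS 20B is a mixture-of-experts model with about 3.6B active parameters per token, that the Gemma model is dense with 12B parameters, and that Q4, Q5, and Q8 cost roughly 4.5, 5.5, and 8.5 bits per weight. The function name `weight_gb_per_token` is hypothetical, and the post does not state the hardware, so only bandwidth-independent ratios are computed.

```python
def weight_gb_per_token(active_params_billion: float, bits_per_weight: float) -> float:
    """Approximate gigabytes of weights read to generate one token.

    Assumes every active weight is read exactly once per token and ignores
    the KV cache, activations, and any CPU/GPU transfer overhead.
    """
    return active_params_billion * bits_per_weight / 8.0


# Assumed figures (not taken from the post): active parameters in billions
# and approximate bits per weight for each quantization level.
gpt_oss_q4 = weight_gb_per_token(3.6, 4.5)   # MoE: only active experts are read
gemma_q8 = weight_gb_per_token(12.0, 8.5)    # assumed dense: all weights are read
gemma_q5 = weight_gb_per_token(12.0, 5.5)

# If bandwidth is the bottleneck, throughput scales inversely with bytes read,
# so these ratios do not depend on the (unknown) hardware bandwidth.
slowdown_q8_vs_gpt_oss = gemma_q8 / gpt_oss_q4
expected_speedup_q5_vs_q8 = gemma_q8 / gemma_q5

print(f"GPT-OSS 20B Q4: {gpt_oss_q4:.2f} GB per token")
print(f"Gemma 12B Q8:   {gemma_q8:.2f} GB per token")
print(f"Estimated slowdown, Q8 Gemma vs Q4 GPT-OSS: {slowdown_q8_vs_gpt_oss:.1f}x")
print(f"Expected speedup, Q5 vs Q8 Gemma: {expected_speedup_q5_vs_q8:.2f}x")
```

Under these assumptions the dense Q8 model reads about six times more weight data per token, which is in the same range as the reported 7x slowdown. The same model predicts that Q5 should be noticeably faster than Q8, so the report of no change between the two may point to a bottleneck other than weight reads. The truncated post does not show enough to say which.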

Business impact

Watch for product launches, funding moves, or policy shifts tied to this headline.