Paper · arXiv · Trust 82 · Primary · Published 2d ago · Live · 22h ago

$\text{Log}_\text{b}$Quant: Quantizing Language Models in Logarithmic Space

Quantization has become an invaluable tool for reducing the memory requirements and inference latency of modern language models, in particular to make them available on consumer setups and edge devices. While previous work has primarily focused on uniform quantization codebooks, such approaches are prone to suboptimal representations due to low-frequency, high-magnitude weights. We introduce Log$_\text{b}$Quant, a novel logarithmic quantization approach with adjustable bases, to adapt to common parameter distributions. We show that our method exhibits superior performance at 4-bit precision on several
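The abstract does not spell out the codebook, so the following is only a minimal sketch of a generic base-b logarithmic quantizer, not the authors' method. It assumes a per-tensor scale, one sign bit, and magnitude levels of the form scale · b^(-k); the function name `log_quantize` and all parameters are hypothetical.

```python
import numpy as np

def log_quantize(w, base=2.0, bits=4):
    """Hypothetical sign-magnitude quantizer with levels scale * base**(-k).

    Assumptions (not taken from the paper): per-tensor scaling by the
    largest absolute weight, one bit for the sign, the remaining bits
    for the exponent index k, and rounding to the nearest level in
    log space.
    """
    w = np.asarray(w, dtype=np.float64)
    levels = 2 ** (bits - 1)              # number of distinct magnitudes
    scale = np.max(np.abs(w))
    if scale == 0:
        return np.zeros_like(w), np.zeros(w.shape, dtype=np.int64), scale

    # Normalise magnitudes to (0, 1]; guard against log(0).
    mag = np.maximum(np.abs(w) / scale, np.finfo(np.float64).tiny)

    # Nearest exponent index in base-`base` log space, clipped to the codebook.
    k = np.round(-np.log(mag) / np.log(base))
    k = np.clip(k, 0, levels - 1).astype(np.int64)

    # Dequantise. np.sign maps exact zeros to 0; a strict 4-bit encoding
    # would need to reserve a code for zero instead.
    w_hat = np.sign(w) * scale * base ** (-k.astype(np.float64))
    return w_hat, k, scale


weights = np.array([1.0, 0.5, -0.25, 0.0, 0.3])
w_hat, k, scale = log_quantize(weights, base=2.0, bits=4)
```

In this sketch, a base closer to 1 packs the levels more densely near the largest magnitude, while a larger base spreads them toward zero; this is one way an adjustable base could be matched to a weight distribution.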

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

  • Linked via arXiv author: Jeremias Bohn

  • Linked via arXiv author: Tizian Dippold

  • Linked via arXiv author: Mahdi Koubaa

  • Linked via arXiv author: Elias R. Wahl

  • Linked via arXiv author: Georg Groh
