paper · arXiv

Quantization at 1.58 bits

Ternary-weight models (each weight in {-1, 0, +1}, about log2(3) ≈ 1.58 bits) that retain most of full-precision quality.
