1 item across the graph, tagged with Efficient Inference.
On-device LLM Inference Powered by X-Bit Quantization