2 items across the graph, tagged with Language Models.
On-device LLM Inference Powered by X-Bit Quantization
Efficient multi-token attribution for reasoning language models — Python package, CLI, and HTML token traces