News · Reddit r/LocalLLaMA · Trust 58 · Community · Published 8d ago · Live · 7d ago

[Research] JetSpec: Speculative Decoding with Parallel Tree Drafting Enables up to 9.64x Lossless LLM Inference Speedup with more than 1000TPS
