news · Hacker News · Trust 72 · Community · Published 6d ago · Live · 6d ago

DSpark: Speculative decoding accelerates LLM inference [pdf]

717 points · 294 comments

Open Source · Hacker News

Covers

paper: Speculative decoding with draft models
paper: When are likely answers right? On Sequence Probability and Correctness in LLMs

Covers (incoming)

paper: Depth Exploration for LLM Decoding
paper: BlockPilot: Instance-Adaptive Policy Learning for Diffusion-based Speculative Decoding
repo: Kaden-Schutt/hipfire
repo: sgl-project/SpecForge
repo: alibaba/rtp-llm
repo: dphnAI/aphrodite-engine
repo: guoqingbao/xinfer

Related across the graph

paper: When are likely answers right? On Sequence Probability and Correctness in LLMs
paper: BlockPilot: Instance-Adaptive Policy Learning for Diffusion-based Speculative Decoding
repo: alibaba/rtp-llm
repo: sgl-project/SpecForge
paper: Depth Exploration for LLM Decoding
repo: guoqingbao/xinfer
repo: dphnAI/aphrodite-engine
repo: Kaden-Schutt/hipfire
paper: Speculative decoding with draft models