Speculative decoding with draft models
paper · arXiv
Accelerating generation by drafting tokens with a small model.
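The idea can be illustrated with a minimal sketch of the greedy-verification variant: a cheap draft model proposes `k` tokens, the larger target model checks them, and the longest agreeing prefix is kept along with one token from the target. This is an illustration under stated assumptions, not the paper's method. The names `speculative_decode`, `greedy_decode`, `toy_target`, and `toy_draft` are hypothetical, the models are toy functions over integer tokens, and published schemes typically use a probabilistic accept/reject rule instead of exact-match checking.

```python
from typing import Callable, List

# Assumption: a "model" is any function mapping a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def greedy_decode(target: Model, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: one target call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(target(tokens))
    return tokens


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       max_new_tokens: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: output matches greedy_decode(target, ...)."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. Draft k tokens with the small model.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))

        # 2. Verify with the target. A real system would score all k positions
        #    in one batched forward pass; here they are checked one by one.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(tokens + accepted)
            if expected == tok:
                accepted.append(tok)        # draft token confirmed
            else:
                accepted.append(expected)   # first mismatch: take the target's token
                break
        else:
            # Every drafted token was accepted, so the target adds one more.
            accepted.append(target(tokens + accepted))

        tokens.extend(accepted)
    return tokens[:goal]  # trim any overshoot from the last round


# Toy stand-ins (hypothetical): the draft disagrees with the target on some prefixes.
def toy_target(prefix: List[int]) -> int:
    return (sum(prefix) + len(prefix)) % 7


def toy_draft(prefix: List[int]) -> int:
    t = toy_target(prefix)
    return t if len(prefix) % 3 else (t + 1) % 7
```

Because every kept token is one the target would have chosen itself, the result is identical to plain greedy decoding with the target. Any speedup comes from confirming several drafted tokens per verification round, which only pays off when the draft model agrees with the target often enough.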
Related glossary term: Token
Topics: efficiency, inference