Paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 19h ago

Spec-AUF: Accept-Until-Fail Training under Train-Inference Misalignment for Masked Block Drafters

Speculative decoding accelerates autoregressive generation by drafting a block of tokens that the target model verifies left-to-right, committing only the longest accepted prefix. Block (DLM-style) drafters predict the whole block in parallel, which is fast but trained with a full-block cross-entropy that supervises every position against the gold continuation -- even though inference discards every token after the first rejection. Recent acceptance-aware objectives patch this by reweighting the full-block loss; we instead use teacher-forced learning as a motivation for how supervision should
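The mismatch described above can be made concrete with a small sketch. It assumes greedy verification (a draft token is accepted only if it equals the target model's token at that position) and uses hypothetical helper names; the loss mask shown is only one illustrative reading of "accept until fail" and is not taken from the paper, whose actual objective is not stated in the text above.

```python
from typing import List


def longest_accepted_prefix(draft: List[int], target: List[int]) -> int:
    """Count leading draft tokens that match the target's tokens.

    Assumption: greedy verification, i.e. a draft token is accepted only
    when it equals the target model's token at the same position.
    """
    accepted = 0
    for d, t in zip(draft, target):
        if d != t:
            break  # first rejection: everything after this is discarded
        accepted += 1
    return accepted


def accept_until_fail_mask(draft: List[int], target: List[int]) -> List[float]:
    """Hypothetical loss mask: 1.0 up to and including the first rejected
    position, 0.0 afterwards. This is an illustrative assumption, not the
    paper's objective.
    """
    accepted = longest_accepted_prefix(draft, target)
    cutoff = min(accepted + 1, len(draft))
    return [1.0 if i < cutoff else 0.0 for i in range(len(draft))]


def masked_mean(losses: List[float], mask: List[float]) -> float:
    """Average per-position losses over the positions the mask keeps."""
    kept = sum(mask)
    if kept == 0:
        return 0.0
    return sum(l * m for l, m in zip(losses, mask)) / kept


if __name__ == "__main__":
    draft = [5, 9, 2, 7]    # tokens proposed by the block drafter
    target = [5, 9, 4, 7]   # tokens the target model would emit
    per_token_loss = [0.1, 0.2, 0.9, 0.4]  # made-up cross-entropy values

    print(longest_accepted_prefix(draft, target))  # 2 tokens are committed
    mask = accept_until_fail_mask(draft, target)
    print(mask)
    print(masked_mean(per_token_loss, mask))           # supervises positions 0..2
    print(masked_mean(per_token_loss, [1.0] * 4))      # full-block baseline
```

In this toy block the fourth draft token happens to match the target, yet inference never uses it because the third token was rejected. A full-block loss still spends supervision on that position, while the masked variant does not.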

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

  • Linked via arXiv author: Tianjian Yang

  • Linked via arXiv author: Meng Li
