Paper · arXiv · Trust 82 · Primary · Published 5d ago · Live · 3d ago
Adaptive Block Diffusion: Resolving Training-Inference Mismatch in Diffusion Language Models
Diffusion Language Models (DLMs) are typically trained under fixed context structures, restricting denoising to predetermined token subsets. This creates a mismatch between training and inference, where models must operate over arbitrary configurations, leading to degradation off the training grid. We propose Adaptive Block Diffusion (ABD), which resolves this mismatch by optimizing denoising risk over a distribution of prefix-window configurations. By treating the configuration as a stochastic variable, ABD trains a single model over the full configuration space without architectural changes.
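As a rough illustration of what "treating the configuration as a stochastic variable" could look like in a data pipeline, the sketch below draws a fresh prefix-window configuration per training example and contrasts it with a fixed block grid. This is not the paper's implementation. The names `sample_config`, `fixed_config`, `corrupt`, and `MASK` are hypothetical, as are the two-stage uniform draw and the Bernoulli masking; the abstract does not specify the actual configuration distribution or noise process.

```python
import random

MASK = "<mask>"  # hypothetical mask token, not taken from the paper


def fixed_config(seq_len, block_size, rng):
    """Baseline for contrast: the prefix boundary is restricted to a fixed
    grid (multiples of block_size) and the window is always one block."""
    n_blocks = seq_len // block_size
    prefix_len = rng.randrange(n_blocks) * block_size
    return prefix_len, block_size


def sample_config(seq_len, rng):
    """Assumed stochastic configuration: any prefix length, any window length.
    A simple two-stage uniform draw, chosen only for illustration."""
    prefix_len = rng.randrange(0, seq_len)              # clean context tokens
    window_len = rng.randint(1, seq_len - prefix_len)   # tokens to denoise
    return prefix_len, window_len


def corrupt(tokens, prefix_len, window_len, mask_rate, rng):
    """Keep the prefix clean, mask each window token with probability
    mask_rate, and drop everything after the window.
    Returns the noisy input and the masked positions (the loss targets)."""
    end = prefix_len + window_len
    noisy = list(tokens[:end])
    targets = []
    for i in range(prefix_len, end):
        if rng.random() < mask_rate:
            noisy[i] = MASK
            targets.append(i)
    return noisy, targets


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = ["the", "cat", "sat", "on", "the", "mat", "today", "."]
    prefix_len, window_len = sample_config(len(tokens), rng)
    noisy, targets = corrupt(tokens, prefix_len, window_len, 0.5, rng)
    print(prefix_len, window_len, noisy, targets)
```

In a training loop, one would presumably draw a new configuration for every example and average the denoising loss over the positions in `targets`, so that a single model sees the whole configuration space rather than only the grid produced by `fixed_config`.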
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
Related across the graph
news: DiffusionGemma: 4x faster text generation
glossary term: Transformer
repo: minimal-diffusion-lm
news: Learning Unmasking Policies for Diffusion Language Models - Apple Machine Learning Research
news: What if context compression is a diffusion noise function? Proposal + honest results from untrained-model experiments [R]
