Paper · arXiv · Trust 82 · Primary · Published 5d ago · Live · 3d ago

Multi-Block Diffusion Language Models

Block Diffusion Language Models (BD-LMs) improve diffusion-based text generation with KV caching and flexible-length generation. A natural next step is to extend them from Single-Block Diffusion (SingleBD) to Multi-Block Diffusion (MultiBD), where a running-set of consecutive blocks is decoded concurrently for inter-block parallelism. However, existing BD-LMs are mostly trained under teacher forcing, where the model observes only one noisy block conditioned on a clean prefix. While the recent diffusion forcing strategy introduces visibility among multiple noisy blocks, its training st
