paper · arXiv · Trust 82 · Primary · Published 5d ago · Live · 3d ago

Understanding Evaluation Illusion in Diffusion Large Language Models

Despite the capability of parallel decoding, diffusion large language models (dLLMs) require many denoising steps to maintain generation quality, motivating recent research on efficient decoding strategies. However, existing studies have reported inconsistent evaluation results even under seemingly identical evaluation settings, risking biased conclusions about dLLM decoding methods. To understand this evaluation concern, we conduct a rigorous evaluation of current decoding methods for dLLMs across diverse evaluation settings. Surprisingly, our analysis reveals that the ranking of decoding met

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Implements

repo · minimal-diffusion-lm

Covers

news · DiffusionGemma: 4x faster text generation
news · What if context compression is a diffusion noise function? Proposal + honest results from untrained-model experiments [R]

Related to

glossary term · Transformer

Topics

cs.CL