paper · arXiv · Trust 82 · Primary · Published 3d ago · Live · 2d ago
When LLMs Read Tables Carelessly: Measuring and Reducing Data Referencing Errors
While large language models (LLMs) perform well on table tasks, they still make data referencing errors (DREs), i.e., incorrectly citing or omitting table values, despite understanding the table structure. Beyond final-answer accuracy, DREs directly compromise the correctness and reliability of intermediate reasoning steps. Yet prior studies have only offered limited, small-scale analyses. In this work, we present the first systematic evaluation of tabular data referencing errors across different models and tasks. Our results show that DREs occur across all tested models (1.7B to 20B parameter
