paper · arXiv · Trust 82 · Primary · Published 2d ago · Live · 22h ago

Reading Order Inference for Complex Document Layouts

Reading order inference remains a critical bottleneck in the digitization of complex historical manuscripts, where pages contain multiple spatially interleaved reading streams. The canonical example is the Glossa Ordinaria layout, in which a central text is surrounded by commentaries that wrap around it in non-rectangular, non-convex regions. We present a training-free, graph-based framework: each OCR text line becomes a node in a directed candidate-transition graph, and edges are scored by a weighted additive ensemble of two lightweight language-model signals (causal language model conditional
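The graph construction described in the abstract can be illustrated with a minimal sketch. It assumes only what the text states: one node per OCR line, a directed edge for every candidate transition, and an edge score formed as a weighted sum of two signals. The abstract is cut off before the signals are fully named, so `toy_continuation_signal` and `toy_overlap_signal` below are hypothetical stand-ins rather than the paper's language-model scores, and the weights `(0.7, 0.3)`, the function names, and the sample lines are likewise illustrative assumptions.

```python
from itertools import permutations
from typing import Callable, Dict, List, Sequence, Tuple

# A signal maps (previous line, candidate next line) to a score; higher is better.
Signal = Callable[[str, str], float]


def toy_continuation_signal(prev: str, nxt: str) -> float:
    """Hypothetical stand-in for a language-model signal (not the paper's).

    Returns 1.0 when `prev` does not end a sentence and `nxt` starts lowercase.
    """
    if not prev or not nxt:
        return 0.0
    open_sentence = prev.rstrip()[-1] not in ".!?"
    return 1.0 if open_sentence and nxt.lstrip()[0].islower() else 0.0


def toy_overlap_signal(prev: str, nxt: str) -> float:
    """Hypothetical second signal: Jaccard overlap of lowercased tokens."""
    a, b = set(prev.lower().split()), set(nxt.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def build_transition_graph(
    lines: List[str],
    signals: Sequence[Signal],
    weights: Sequence[float],
) -> Dict[Tuple[int, int], float]:
    """Score every directed candidate transition i -> j (i != j).

    The edge score is the weighted additive ensemble sum_k w_k * s_k(i, j).
    """
    if len(signals) != len(weights):
        raise ValueError("need exactly one weight per signal")
    graph: Dict[Tuple[int, int], float] = {}
    for i, j in permutations(range(len(lines)), 2):
        graph[(i, j)] = sum(
            w * s(lines[i], lines[j]) for s, w in zip(signals, weights)
        )
    return graph


def best_successor(graph: Dict[Tuple[int, int], float], i: int) -> int:
    """Return the highest-scoring candidate successor of node i."""
    candidates = {j: score for (src, j), score in graph.items() if src == i}
    return max(candidates, key=candidates.get)


# Illustrative lines: two from a main text and one from a surrounding gloss.
lines = [
    "In the beginning God",
    "created the heaven and the earth.",
    "Gloss: the beginning here means",
]
graph = build_transition_graph(
    lines, [toy_continuation_signal, toy_overlap_signal], [0.7, 0.3]
)
print(best_successor(graph, 0))
```

This sketch scores all ordered pairs, which is quadratic in the number of lines; how the candidate set is pruned and how a final reading order is decoded from the graph are not covered by the visible text and are left out here.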
