Reading Order Inference for Complex Document Layouts
Reading order inference remains a critical bottleneck in the digitization of complex historical manuscripts, whose pages contain multiple spatially interleaved reading streams. The canonical example is the Glossa Ordinaria layout, in which a central text is surrounded by commentaries that wrap around it in non-rectangular, non-convex regions. We present a training-free, graph-based framework: each OCR text line becomes a node in a directed candidate-transition graph; edges are scored by a weighted additive ensemble of two lightweight language-model signals (causal language model conditional
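To make the edge-scoring idea concrete, here is a minimal sketch assuming each candidate transition between two OCR lines is scored as a weighted sum of two signal functions. The names `score_transitions` and `best_successors`, the two toy signals, and the weights are all hypothetical. The toy signals are plain string heuristics standing in for the language-model signals, which the text above does not fully specify. The greedy successor pick is only a way to inspect the scores and is not claimed to be the framework's decoding step.

```python
from typing import Callable, Dict, List, Tuple

# A signal maps (previous line text, next line text) to a real-valued score.
Signal = Callable[[str, str], float]
Edge = Tuple[int, int]


def score_transitions(
    lines: List[str],
    candidates: List[Edge],
    signal_a: Signal,
    signal_b: Signal,
    weight_a: float = 0.5,
    weight_b: float = 0.5,
) -> Dict[Edge, float]:
    """Score each directed candidate edge (i -> j) as a weighted sum of two signals."""
    return {
        (i, j): weight_a * signal_a(lines[i], lines[j])
        + weight_b * signal_b(lines[i], lines[j])
        for (i, j) in candidates
        if i != j  # a line cannot follow itself
    }


def best_successors(scores: Dict[Edge, float]) -> Dict[int, int]:
    """For every source node, keep the highest-scoring outgoing edge (illustrative only)."""
    best: Dict[int, int] = {}
    for (i, j), s in scores.items():
        if i not in best or s > scores[(i, best[i])]:
            best[i] = j
    return best


# Toy stand-ins (assumptions), NOT the language-model signals from the abstract.
def toy_continuation(prev: str, nxt: str) -> float:
    """1.0 if prev lacks sentence-final punctuation and nxt starts lowercase."""
    open_ended = not prev.rstrip().endswith((".", "!", "?"))
    return 1.0 if open_ended and nxt[:1].islower() else 0.0


def toy_overlap(prev: str, nxt: str) -> float:
    """Jaccard overlap between the word sets of the two lines."""
    a, b = set(prev.lower().split()), set(nxt.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


if __name__ == "__main__":
    page = ["In the beginning God", "created the heaven", "Gloss on the beginning."]
    edges = [(0, 1), (0, 2), (1, 2), (2, 1)]
    print(best_successors(score_transitions(page, edges, toy_continuation, toy_overlap)))
```

In this sketch the candidate edge list is supplied by the caller; how candidates are generated from the page geometry is left open here.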
Authors (arXiv): Iddo Hakim, Sharva Gogawale, Omer Ventura, Gal Grudka, Daria Vasyutinsky-Shapira, Berat Kurar-Barakat, Nachum Dershowitz
