paper · arXiv · Trust 82 · Primary · Published 5d ago · Live · 3d ago
Can OCR-VLMs Read Devanagari? A Stress-Test Benchmark and Post-Correction Study
OCR systems, ranging from classical engines to specialised OCR vision-language models (OCR-VLMs) and frontier multimodal LLMs, report strong results on English and Chinese document benchmarks, yet their behaviour on Indic scripts is largely uncharacterised. We benchmark ten systems on Devanagari (Hindi): classical EasyOCR; open VLMs (Qwen2.5-VL-3B, Qwen3-VL-8B, olmOCR-7B); specialised OCR-VLMs (DeepSeek-OCR, Unlimited-OCR); and frontier closed models (Gemini 2.5 Flash, Claude Opus 4.7, GPT-5.5, Mistral OCR), across four synthetic degradation conditions and 300 real printed scans. We report fou
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
Related across the graph
repo: Ufonik88/invoice-ocr-app
news: Is Qwen3-VL-2B the only viable VLM for JSON extraction on a "potato"?
news: PP-OCRv6 on Hugging Face: 50-Language OCR from 1.5M to 34.5M Parameters
news: Find the best open-source OCR models in one place at Papers with Code [P]
news: TurboOCR v3: high-speed document OCR server (C++/CUDA), ~520 img/s on RTX 5090
