Paper · arXiv · Trust 82 · Primary · Published 5d ago · Live · 3d ago
KrishokChat: A Citation-Grounded Dataset and Benchmark for Bengali Agricultural Advisory
We present KrishokChat, the first citation-grounded Bengali agricultural instruction-tuning dataset for crop advisory in low-resource settings. We establish a foundation of 290 hierarchical Knowledge Nodes, extracting disease symptoms, management practices, chemical dosages, and verbatim citations from 129 domain-filtered agricultural manuals. Every training instance inherits a verified citation header, guaranteeing 100% citation provenance. Using a Partitioned Seed Generation Matrix, these nodes are expanded into 139,200 supervised fine-tuning pairs, and augmented with 5,300 chemical safety a
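The citation-inheritance idea described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `KnowledgeNode` and `TrainingPair` schemas, the `expand_node` and `citation_coverage` helpers, the seed templates, and the placeholder data are hypothetical and are not taken from the paper. The sketch shows only the general pattern in which each generated pair copies its citation header from the node it was expanded from, so provenance can be checked mechanically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeNode:
    # Hypothetical schema; the abstract does not specify field names.
    node_id: str
    topic: str
    citation: str  # verbatim citation taken from the source manual
    facts: tuple


@dataclass(frozen=True)
class TrainingPair:
    citation_header: str  # inherited from the parent node, never generated
    instruction: str
    response: str


def expand_node(node, seed_templates):
    """Create one pair per seed template, copying the node's citation into each."""
    return [
        TrainingPair(
            citation_header=node.citation,
            instruction=template.format(topic=node.topic),
            response=" ".join(node.facts),
        )
        for template in seed_templates
    ]


def citation_coverage(pairs, nodes):
    """Return the fraction of pairs whose header matches a known node citation."""
    known = {n.citation for n in nodes}
    if not pairs:
        return 0.0
    return sum(p.citation_header in known for p in pairs) / len(pairs)


# Placeholder data, invented for illustration only.
node = KnowledgeNode(
    node_id="node-001",
    topic="example crop disease",
    citation="Example Manual, p. 1",
    facts=("Placeholder symptom.", "Placeholder management step."),
)
seeds = (
    "What are the symptoms of {topic}?",
    "How should {topic} be managed?",
)
pairs = expand_node(node, seeds)

# Figures from the abstract: 139,200 pairs from 290 nodes.
# The average is arithmetic only; a uniform per-node count is an assumption.
AVERAGE_PAIRS_PER_NODE = 139_200 / 290
```

Under these assumptions, `citation_coverage(pairs, [node])` evaluates to 1.0 because every header is copied rather than generated. The figures in the abstract work out to an average of 480 pairs per node, although the abstract does not say whether the expansion is uniform across nodes.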
