paper · arXiv · Trust 82 · Primary · Published 7d ago · Live · 4d ago
Cognitive Episodes in LLM Reasoning Traces Enable Interpretable Human Item Difficulty Prediction
Predicting human item difficulty is central to educational assessment, where reliable estimates support fairness and effective test construction. Existing methods often depend on costly human calibration or item-level textual representations, providing limited evidence about the cognitive processes that make items difficult. We argue that difficulty should be viewed not only as a property of item text, but also as an observable consequence of the problem-solving burden an item induces. Large Reasoning Models (LRMs) offer scalable process evidence through reasoning traces, but such evidence mus
