Measuring the Gap Between Human and LLM Research Ideas
LLMs are increasingly used to brainstorm research ideas, but existing evaluations mostly judge individual ideas by novelty, feasibility, or expert preference. We instead ask: how far are current LLM-generated ideas from those of human researchers? To characterize this gap, we build a large-scale evaluation framework for ideation from high-quality human research papers. For each paper, we reverse-engineer a small set of closely related prior works that likely inspired its core idea. LLMs are then prompted to generate a new idea from the titles and summaries of these prior works. We introduce a two-axis research
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
Why these links exist
- Linked via arXiv author Ziyu Chen → Measuring the Gap Between Human and LLM Research Ideas
- Linked via arXiv author Yilun Zhao → Measuring the Gap Between Human and LLM Research Ideas
- Linked via arXiv author Arman Cohan → Measuring the Gap Between Human and LLM Research Ideas
