New benchmark exposes reasoning gaps in top models
A harder evaluation suite shows even leading models struggle on multi-hop tasks.
Why it matters
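Reported by AI News, this story belongs to the benchmarks area of the AI ecosystem. Multi-hop tasks require a model to chain several reasoning steps, so if leading models struggle on them, scores on easier suites may overstate reasoning ability. That could affect how models are evaluated, which products are built on them, and where research effort goes.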
Technical breakdown
Source: AI News. See the original article for technical details.
Business impact
Watch for product launches, funding moves, or policy shifts tied to this headline.
