paper · arXiv · Trust 82 · Primary · Published 4d ago · Live · 3d ago

The Human Creativity Benchmark

Modern AI evaluation frameworks treat evaluator disagreement as noise to be resolved. In creative domains, professional disagreement reflects genuine differences in taste, not measurement error. We argue that evaluating creative AI requires preserving two distinct signals: convergence, where professionals align around shared best practices, and divergence, where individual taste legitimately varies. We present the Human Creativity Benchmark (HCB), a benchmark that operationalizes this separation by collecting pairwise preferences, scalar ratings on prompt adherence, usability, and visual appea

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Covers

news · Why Aren’t We Measuring How AI Affects Humans?

Implements

repo · Fabiojvv/ai-cortex-hub

Covers (incoming)

news · By modeling visual saliency, AI improves ratings of artistic product designs - Tech Xplore

Related across the graph

news · By modeling visual saliency, AI improves ratings of artistic product designs - Tech Xplore

repo · Fabiojvv/ai-cortex-hub

news · Why Aren’t We Measuring How AI Affects Humans?

Topics

cs.AI