paper · arXiv · Trust 82 · Primary · Published 7d ago · Live · 4d ago

Democratic ICAI: Debating Our Way to Steering Principles from Preferences

Preference-based alignment often struggles to capture the reasoning that underlies human judgments. Many evaluations rely on multiple interacting criteria, yet pairwise labels reveal only the final choice rather than the considerations that shape preferences. Inverse Constitutional AI (ICAI) improves interpretability in decision making by summarizing preferences into natural-language principles, but its single-pass explanations miss much of the nuance involved in complex decisions. We introduce Democratic ICAI, a novel approach that gathers multiple competing rationales through structured pers

Topics

cs.LG