paper · arXiv · Trust 82 · Primary · Published 7d ago · Live · 4d ago
Democratic ICAI: Debating Our Way to Steering Principles from Preferences
Preference-based alignment often struggles to capture the reasoning that underlies human judgments. Many evaluations rely on multiple interacting criteria, yet pairwise labels reveal only the final choice rather than the considerations that shape preferences. Inverse Constitutional AI (ICAI) improves interpretability in decision making by summarizing preferences into natural-language principles, but its single-pass explanations miss much of the nuance involved in complex decisions. We introduce Democratic ICAI, a novel approach that gathers multiple competing rationales through structured pers
