paper · arXiv · Trust 82 · Primary · Published 3d ago · Live · 2d ago
On the Convergence of Self-Improving Online LLM Alignment
The Self-Improving Alignment (SAIL) algorithm addresses distribution shift by reducing a bilevel formulation of the problem to an efficient, single-level method. Empirically, SAIL has demonstrated strong performance on this task. However, a formal analysis of its convergence properties has been lacking. We identify a key theoretical challenge: the standard SAIL objective function is not guaranteed to be strongly concave due to unfavorable properties of its Hessian. To address this limitation, we propose a regularized objective, SAIL-RevKL, which incorporates a reverse Kullback-Leibler (KL) div
