Paper · arXiv · Trust 82 · Primary · Published 3d ago · Live 2d ago

Signed-Permutation Coordinate Transport for RMSNorm Transformers

Modern LLM workflows move coordinate-indexed objects across checkpoints: steering vectors, sparse autoencoders, top-$k$ neuron sets, attribution lists, and merge alignments. This is only well posed after fixing the model's residual-stream gauge, which we show is architecture-dependent: LayerNorm residual charts have permutation gauge $S_d$ (up to a global sign flip), while RMSNorm charts with generic per-channel gain have signed-permutation gauge $B_d = S_d \ltimes \{\pm 1\}^d$. Permutation-only alignment is therefore symmetry-incomplete for RMSNorm models. We introduce sign-marginalized Hunga
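The symmetry claim in the abstract can be checked numerically. The following is a minimal sketch, not the paper's method: it assumes the standard RMSNorm form `gain * x / rms(x)` and a LayerNorm core without gain or bias, and the helper names (`rmsnorm`, `layernorm_core`, `act`) are hypothetical. It shows that a signed permutation commutes with RMSNorm once the gain is permuted to match, whereas LayerNorm's mean subtraction tolerates only permutations and a global sign flip.

```python
import numpy as np

def rmsnorm(x, gain):
    # Standard RMSNorm: divide by the root-mean-square, then apply per-channel gain.
    return gain * x / np.sqrt(np.mean(x ** 2))

def layernorm_core(x):
    # LayerNorm without gain/bias: center, then divide by the standard deviation.
    c = x - x.mean()
    return c / np.sqrt(np.mean(c ** 2))

def act(v, perm, signs):
    # Signed permutation: (Q v)_i = signs[i] * v[perm[i]].
    return signs * v[perm]

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=d)
gain = rng.uniform(0.5, 2.0, size=d)   # generic (all-distinct) per-channel gain
perm = rng.permutation(d)
signs = rng.choice([-1.0, 1.0], size=d)
signs[0], signs[1] = 1.0, -1.0         # force a non-global sign pattern

# RMSNorm: acting on the input and permuting the gain equals acting on the output.
rms_equivariant = np.allclose(
    rmsnorm(act(x, perm, signs), gain[perm]),
    act(rmsnorm(x, gain), perm, signs),
)

# LayerNorm core: permutations and a global flip commute with it ...
ones = np.ones(d)
ln_perm_ok = np.allclose(layernorm_core(act(x, perm, ones)),
                         act(layernorm_core(x), perm, ones))
ln_global_flip_ok = np.allclose(layernorm_core(-x), -layernorm_core(x))

# ... but a per-channel sign pattern changes the mean, so it does not.
ln_signed_ok = np.allclose(layernorm_core(signs * x), signs * layernorm_core(x))

print(rms_equivariant, ln_perm_ok, ln_global_flip_ok, ln_signed_ok)
```

Under these assumptions, a coordinate-indexed object such as a steering vector would need both the permutation and the sign pattern applied to remain meaningful across two RMSNorm checkpoints; a permutation alone leaves the per-channel signs undetermined.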
