Paper · arXiv · Trust 82 · Primary · Published 2d ago · Live · 21h ago

Is One Layer Enough? Training A Single Transformer Layer Can Match Full-Parameter RL Training

Reinforcement learning (RL) has become a central component of post-training large language models (LLMs), yet little is understood about how RL adaptation is distributed across transformer layers. Existing approaches typically update all model parameters uniformly, implicitly assuming that every layer contributes similarly to the gains obtained during RL post-training. In this work, we challenge this assumption through a systematic layer-wise study of RL training. Surprisingly, we find that training a single transformer layer can recover most of the gains achieved by full-parameter RL training
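To make the idea of single-layer training concrete, here is a minimal sketch of how one might mark only one transformer layer as trainable and freeze everything else. It is an illustration, not the authors' implementation: the parameter naming scheme (`model.layers.<index>. ...`), the helper names `layer_index` and `trainable_mask`, and the example parameter list are all assumptions introduced here. The abstract does not say which layer is trained or how it is chosen.

```python
import re

# Assumption: parameter names encode the layer as "layers.<index>.",
# e.g. "model.layers.12.self_attn.q_proj.weight". Adjust for other models.
LAYER_PATTERN = re.compile(r"(?:^|\.)layers\.(\d+)\.")


def layer_index(param_name):
    """Return the layer index encoded in a parameter name, or None."""
    match = LAYER_PATTERN.search(param_name)
    return int(match.group(1)) if match else None


def trainable_mask(param_names, target_layer):
    """Map each name to True only if it belongs to target_layer.

    Embeddings, the output head, and all other layers map to False,
    meaning they would stay frozen during RL training.
    """
    return {name: layer_index(name) == target_layer for name in param_names}


# Hypothetical parameter names for a tiny two-layer model.
names = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.1.self_attn.q_proj.weight",
    "model.layers.1.mlp.up_proj.weight",
    "lm_head.weight",
]

mask = trainable_mask(names, target_layer=1)
trainable = [name for name, flag in mask.items() if flag]
print(trainable)
```

In a PyTorch-style setup, a mask like this could be applied by setting `requires_grad` on each named parameter before building the optimizer, so that only the selected layer receives gradient updates.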

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

  • Linked via arXiv author: Zijian Zhang
  • Linked via arXiv author: Rizhen Hu
  • Linked via arXiv author: Athanasios Glentis
  • Linked via arXiv author: Dawei Li
  • Linked via arXiv author: Chung-Yiu Yau
  • Linked via arXiv author: Hongzhou Lin
  • Linked via arXiv author: Mingyi Hong
