paper · arXiv · Trust 82 · Primary · Published 3d ago · Live · 2d ago

Random Reshuffling Dominates Stochastic Gradient Descent

Stochastic Gradient Descent ($\textsf{SGD}$) is one of the most classical optimization algorithms with favorable theoretical guarantees, yet the practical implementation of $\textsf{SGD}$ differs subtly from its well-known form and is often referred to as Shuffling Stochastic Gradient Descent ($\textsf{Shuffling SGD}$). A particularly popular strategy in $\textsf{Shuffling SGD}$ is Random Reshuffling ($\textsf{RR}$), which has achieved great empirical success across numerous experiments. Despite its strong performance, $\textsf{RR}$ has long been considered a heuristic due to a lack of theoret
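The distinction the abstract draws between classical $\textsf{SGD}$ and $\textsf{RR}$ comes down to how sample indices are chosen. The following is a minimal sketch of that difference, not the paper's method or experiments: the function names, the toy one-dimensional least-squares objective, the step size, and the epoch count are all illustrative assumptions.

```python
import random


def sgd_with_replacement_order(n, num_steps, rng):
    """Classical SGD sampling: every step draws an index uniformly, independently."""
    return [rng.randrange(n) for _ in range(num_steps)]


def random_reshuffling_order(n, num_epochs, rng):
    """RR sampling: each epoch visits every index exactly once, in a fresh random permutation."""
    order = []
    for _ in range(num_epochs):
        perm = list(range(n))
        rng.shuffle(perm)  # new permutation at the start of every epoch
        order.extend(perm)
    return order


def run(order, data, x0, lr):
    """Apply incremental gradient steps on f(x) = (1/n) * sum_i 0.5 * (x - a_i)^2."""
    x = x0
    for i in order:
        x -= lr * (x - data[i])  # gradient of the i-th component 0.5 * (x - a_i)^2
    return x


# Toy problem (hypothetical data); the minimizer is the mean of `data`, i.e. 2.5.
rng = random.Random(0)
data = [1.0, 2.0, 3.0, 4.0]
n, epochs = len(data), 50

x_sgd = run(sgd_with_replacement_order(n, n * epochs, rng), data, 0.0, 0.05)
x_rr = run(random_reshuffling_order(n, epochs, rng), data, 0.0, 0.05)
print(f"with-replacement SGD: {x_sgd:.4f}, random reshuffling: {x_rr:.4f}")
```

Both runs take the same number of gradient steps; the only difference is the index order. A single run on a toy problem like this illustrates the two sampling schemes and should not be read as evidence for either one.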
