Further training a pretrained model on a smaller, specific dataset.
When and why tiny models suddenly generalize long after overfitting.