paper · arXiv
Grokking in small transformers
When and why tiny models suddenly generalize long after overfitting.
Related to (article): Why small models are having a moment
Explains (tutorial): Build your first transformer from scratch
Topics: llm, training