Reddit r/MachineLearning · Published 5d ago
I shrank a transformer until every number fitted on the screen and made the weights editable [R]
I've been teaching myself how LLMs actually work: not at the API level, but down to the matrix multiplications. To force myself to really understand the forward pass, I first built a complete transformer by hand in a spreadsheet, from embeddings through to the loss. Then I turned the forward pass into a web page so it's easier to share. It's a full transformer (single attention head, single block) shrunk to the smallest size where every single number still
