Tagged: pytorch
2 articles
Attention Is All You Need: Building the Original Transformer that Started the LLM Revolution
Attention Is All You Need replaced RNNs with self-attention and changed everything. I built the original encoder-decoder transformer from scratch and trained it to translate English to French.

Building a Tiny LLM From Scratch, Trained on Poe
A decoder-only transformer, two tokenizers, and 1.9 million characters of Edgar Allan Poe. What building an LLM from zero teaches you.