Learning Transformer Through Implementation
From Multi-Head Attention to the Original Transformer model, BERT, the Encoder-Decoder based MarianMT translation model, and even Vision Transformer, you'll learn Transformer inside and out by implementing them directly in code.
Intermediate
Deep Learning(DL), PyTorch, encoder-decoder











