Vaswani Ashish
(2 articles)
2017
Transformer
Deep learning model architecture that processes sequential data with self-attention instead of recurrence; especially effective in natural language processing tasks.
Generality: 862
2017
Transformer Block
The repeating unit of a Transformer, typically a self-attention layer followed by a position-wise feed-forward network, which captures long-range dependencies in sequential data efficiently.
Generality: 500
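
The attention mechanism mentioned in both entries can be illustrated with a minimal sketch of scaled dot-product attention in plain Python. This is only an illustration under simplifying assumptions: a single head, no learned projections, no masking, and none of the residual or normalization layers of a full Transformer block. The function names `softmax` and `attention` are chosen here for the example and do not come from any library.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Each query is compared with every key, so any position can draw on
    any other position regardless of how far apart they are.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs
```

Because every query attends to every key in one step, the distance between two positions in the sequence does not affect how directly they can interact, which is what the entries above mean by capturing long-range dependencies.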