Alexander Kolesnikov

(1 article)

ViTs
Vision Transformers

Class of deep learning models that apply the transformer architecture, originally designed for natural language processing, to computer vision tasks.
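
To make the link to language models concrete: in the usual formulation, an image is cut into fixed-size patches, and each patch is flattened into a vector that plays the role a word token plays in text. The following is a minimal sketch of that first step in plain Python, not a full model. The function name `patchify`, the nested-list image layout, and the toy 4x4 image are assumptions made for illustration and do not come from any library.

```python
def patchify(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened patch vectors.

    Illustrative helper (hypothetical name): returns one flat list per
    non-overlapping patch, ordered left to right, then top to bottom.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch: row by row, pixel by pixel, channel by channel.
            patch = [
                channel
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for channel in pixel
            ]
            patches.append(patch)
    return patches


# Toy 4x4 image with 3 channels; each pixel stores its (row, column, 0).
image = [[[r, c, 0] for c in range(4)] for r in range(4)]

# 2x2 patches give a sequence of 4 "tokens", each of length 2 * 2 * 3 = 12.
tokens = patchify(image, patch_size=2)
print(len(tokens), len(tokens[0]))
```

In a full model, each flattened patch would typically be passed through a learned linear projection, combined with a position embedding, and fed to a standard transformer encoder; those parts are omitted here.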

Generality: 705