Self-Supervised Pretraining

An ML approach in which a model learns to predict parts of its input from other parts without requiring labeled data; the pretrained model is then fine-tuned on downstream tasks.

Self-supervised pretraining leverages the inherent structure of the data to generate pseudo-labels, allowing the model to learn useful representations from vast amounts of unlabeled data. Typical pretext tasks include predicting the next word in a sentence, filling in masked words or image regions, and reconstructing corrupted inputs. Once pretrained, the model has captured rich, generalizable features and can be fine-tuned with a smaller set of labeled examples for specific tasks such as classification, detection, or translation. This technique has been particularly impactful in natural language processing (NLP) with models like BERT and GPT, and in computer vision with models like SimCLR and BYOL.
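
As a concrete illustration, the sketch below shows BERT-style masked-token prediction, one of the pretext tasks mentioned above, using a toy Transformer encoder in PyTorch. The vocabulary size, model dimensions, 15% masking rate, and random token data are illustrative assumptions, not details of any particular published model.

```python
# Minimal sketch of masked-token self-supervised pretraining (BERT-style).
import torch
import torch.nn as nn

VOCAB_SIZE = 1000   # assumed toy vocabulary
MASK_ID = 0         # assumed id reserved for the [MASK] token
D_MODEL = 64

class TinyMaskedLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
        layer = nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(D_MODEL, VOCAB_SIZE)  # predicts the original token at each position

    def forward(self, token_ids):
        hidden = self.encoder(self.embed(token_ids))
        return self.lm_head(hidden)  # (batch, seq_len, vocab)

def mask_tokens(token_ids, mask_prob=0.15):
    """Generate pseudo-labels: hide a random subset of tokens and ask the
    model to reconstruct them. Unmasked positions are ignored in the loss."""
    labels = token_ids.clone()
    mask = torch.rand(token_ids.shape) < mask_prob
    labels[~mask] = -100                  # -100 = ignore_index for CrossEntropyLoss
    corrupted = token_ids.clone()
    corrupted[mask] = MASK_ID
    return corrupted, labels

model = TinyMaskedLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(ignore_index=-100)

# "Unlabeled" data: random token sequences stand in for a real text corpus.
batch = torch.randint(1, VOCAB_SIZE, (8, 32))
corrupted, labels = mask_tokens(batch)
logits = model(corrupted)
loss = loss_fn(logits.view(-1, VOCAB_SIZE), labels.view(-1))
loss.backward()
optimizer.step()
```

After pretraining on a real unlabeled corpus, the language-modeling head would typically be replaced by a task-specific head (for example a classifier) and the whole model fine-tuned on a much smaller labeled dataset.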

Historical Overview: The concept of self-supervised learning began gaining traction in the mid-2010s, with significant advancements around 2018-2019, particularly in the field of NLP with the introduction of models like BERT (2018) and GPT-2 (2019).

Key Contributors: Notable contributors include the Google AI researchers behind BERT (Jacob Devlin et al.) and the OpenAI team behind the GPT series (Alec Radford et al.). In computer vision, Yann LeCun at Facebook AI Research (FAIR) and researchers such as Ting Chen (SimCLR) and Jean-Baptiste Grill (BYOL) have made significant contributions.