Andrej Karpathy
(12 articles)
Next Word Prediction
Task in which a language model predicts the most probable next word in a text sequence.
Generality: 780
Generative
Subset of AI technologies capable of generating new content, ideas, or data that resemble human-produced output.
Generality: 840
Similarity Learning
A technique in AI focusing on training models to measure task-related similarity between data points.
Generality: 675
Image-to-Text Model
AI systems that convert visual information from images into descriptive textual representations, enabling machines to understand and communicate the content of images.
Generality: 755
Generative AI
Subset of AI technologies that can generate new content, ranging from text and images to music and code, based on learned patterns and data.
Generality: 830
VQA
Visual Question Answering
Field of AI where systems are designed to answer questions about visual content, such as images or videos.
Generality: 625
Few Shot
ML technique designed to recognize patterns and make predictions based on a very limited amount of training data.
Generality: 675
Zero-shot Capability
The ability of AI models to perform tasks or make predictions on types of data they did not encounter during training, without task-specific examples or fine-tuning.
Generality: 775
Next Token Prediction
Technique used in language modeling where the model predicts the following token based on the previous ones.
Generality: 735
Scaling Hypothesis
Hypothesis that enlarging model size, data, and computational resources consistently improves task performance up to very large scales.
Generality: 765
ITM
Image-Text Matching
AI technique that involves automatically identifying correspondences between textual descriptions and visual elements within images.
Generality: 480
VLM
Visual Language Model
AI models designed to interpret and generate content by integrating visual and textual information, enabling them to perform tasks like image captioning, visual question answering, and more.
Generality: 621
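
As a rough illustration of the Next Token Prediction entry above, here is a minimal Python sketch. It assumes a toy bigram count model rather than a neural language model, and the corpus, `train_bigram`, and `predict_next` are hypothetical names invented for this example, not taken from any of the listed articles.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Return the most frequent follower of `prev`, or None if it was never seen."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus split on whitespace (an assumption; real models use learned tokenizers).
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
print(predict_next(model, "the"))  # prints "cat"
```

A real language model conditions on the whole preceding sequence and outputs a probability distribution over its vocabulary; this sketch conditions only on the single previous token and picks the most frequent follower.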