Niki Parmar (9 articles)

Attention Masking (2017)

Technique used in transformer-based models that controls how sequence order and irrelevant elements are handled in ML tasks, by excluding selected positions from the attention computation.

Generality: 645
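
A minimal NumPy-only sketch of the idea, assuming a single head and illustrative names such as pad_mask (not from the source): irrelevant positions are pushed to a very large negative score before the softmax, so they receive near-zero attention weight.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    seq_len, d_k = 5, 8
    scores = np.random.randn(seq_len, seq_len) / np.sqrt(d_k)  # raw query-key scores

    # Suppose the last two positions are padding: mask attention *to* them by
    # pushing their scores toward -inf before normalizing.
    pad_mask = np.array([True, True, True, False, False])      # True = real token
    weights = softmax(np.where(pad_mask[None, :], scores, -1e9))

    print(weights.round(3))   # padded columns get ~0 weight in every row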

Attention Projection Matrix (2017)

Matrix used in attention mechanisms within neural networks, particularly in transformer models, to project input vectors into query, key, and value vectors.

Generality: 625
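
A minimal sketch, assuming illustrative shapes (d_model=16, d_k=8) and randomly initialized weights rather than trained ones: three learned projection matrices map each input vector to its query, key, and value vectors.

    import numpy as np

    seq_len, d_model, d_k = 4, 16, 8
    X = np.random.randn(seq_len, d_model)        # one row per input token

    W_q = np.random.randn(d_model, d_k) * 0.1    # query projection matrix
    W_k = np.random.randn(d_model, d_k) * 0.1    # key projection matrix
    W_v = np.random.randn(d_model, d_k) * 0.1    # value projection matrix

    Q, K, V = X @ W_q, X @ W_k, X @ W_v          # each (seq_len, d_k)
    print(Q.shape, K.shape, V.shape)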

Attention Block (2017)

Core component in neural networks, particularly in transformers, designed to selectively focus on the most relevant parts of an input sequence when making predictions.

Generality: 835
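
A minimal sketch of one such block, assuming a single head, NumPy only, and omitting layer normalization and the feed-forward sub-layer: pairwise scores between positions decide how much each position's value contributes to each output, and a residual connection adds the result back to the input.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def attention_block(X, W_q, W_k, W_v):
        Q, K, V = X @ W_q, X @ W_k, X @ W_v
        weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # focus over positions
        return X + weights @ V                             # residual connection

    d = 8
    X = np.random.randn(4, d)
    W_q, W_k, W_v = (np.random.randn(d, d) * 0.1 for _ in range(3))
    print(attention_block(X, W_q, W_k, W_v).shape)         # (4, 8)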

Positional Encoding (2017)

Technique used in neural network models, especially in transformers, to inject information about the order of tokens in the input sequence.

Generality: 762
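
A sketch of the sinusoidal variant described in the 2017 Transformer paper, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)), assuming an even d_model:

    import numpy as np

    def positional_encoding(seq_len, d_model):
        pos = np.arange(seq_len)[:, None]                 # (seq_len, 1)
        i = np.arange(0, d_model, 2)[None, :]             # even dimension indices
        angles = pos / np.power(10000.0, i / d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angles)                      # even dimensions
        pe[:, 1::2] = np.cos(angles)                      # odd dimensions
        return pe

    # Added to the token embeddings so the model can make use of token order.
    print(positional_encoding(seq_len=10, d_model=16).shape)   # (10, 16)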

Encoder-Decoder Transformer (2017)

Architecture used in NLP for understanding and generating language, in which an encoder maps the input sequence to internal representations and a decoder generates the output sequence from them.

Generality: 775
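
A heavily compressed sketch of the data flow, assuming single-head attention and omitting projections, feed-forward layers, and normalization: the encoder builds representations of the source, and the decoder attends to them while producing the target.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def attend(Q, K, V):
        return softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V

    d = 8
    src = np.random.randn(6, d)            # source token embeddings
    tgt = np.random.randn(4, d)            # target-so-far embeddings

    memory = attend(src, src, src)         # encoder: self-attention over the input
    dec = attend(tgt, tgt, tgt)            # decoder: self-attention over output so far
    out = attend(dec, memory, memory)      # decoder: cross-attention over encoder output
    print(out.shape)                       # (4, 8): one vector per target position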

Cross-Attention (2017)

Mechanism in neural networks that allows the model to weigh and integrate information from different input sources dynamically.

Generality: 675
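
A minimal NumPy sketch with illustrative shapes: the queries come from one sequence (e.g. the decoder state) while keys and values come from another (e.g. the encoder output), so the attention weights decide how much information from the second source flows into each position of the first.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    d = 8
    decoder_state = np.random.randn(3, d)        # queries come from this sequence
    encoder_output = np.random.randn(7, d)       # keys and values come from this one

    Q = decoder_state
    K = V = encoder_output
    weights = softmax(Q @ K.T / np.sqrt(d))      # (3, 7): one row per query position
    print((weights @ V).shape)                   # (3, 8): encoder info mixed per query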

Masking (2017)

Technique used in NLP models to prevent future input tokens from influencing the prediction of current tokens.

Generality: 639
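
A minimal sketch of the causal (look-ahead) variant, assuming NumPy: a lower-triangular mask keeps only the positions at or before the current one, so no attention weight is placed on future tokens.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    seq_len = 4
    scores = np.random.randn(seq_len, seq_len)

    # causal[i, j] is True only when j <= i, i.e. position i may look at j.
    causal = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    weights = softmax(np.where(causal, scores, -1e9))

    print(weights.round(3))   # zero weight above the diagonal: no future leakage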

Multi-headed Attention (2017)

Mechanism in neural networks that allows the model to jointly attend to information from different representation subspaces at different positions.

Generality: 801
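
A minimal sketch, assuming NumPy, 2 heads, and that the query/key/value projections have already been applied (the final output projection is also omitted): the model dimension is split into per-head subspaces, attention runs independently in each, and the results are concatenated.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def multi_head_attention(Q, K, V, n_heads):
        seq_len, d_model = Q.shape
        d_head = d_model // n_heads
        outputs = []
        for h in range(n_heads):                       # one attention per subspace
            sl = slice(h * d_head, (h + 1) * d_head)
            w = softmax(Q[:, sl] @ K[:, sl].T / np.sqrt(d_head))
            outputs.append(w @ V[:, sl])
        return np.concatenate(outputs, axis=-1)        # concatenate the heads

    X = np.random.randn(5, 16)
    print(multi_head_attention(X, X, X, n_heads=2).shape)   # (5, 16)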

Self-Attention (2017)

Mechanism in neural networks that lets a model weigh the importance of different parts of the same input sequence against each other when representing each element.

Generality: 800
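
A minimal sketch, assuming NumPy, a single head, and omitting the learned query/key/value projections covered above: the sequence attends to itself, and each row of the weight matrix says how strongly one position draws on every other position of the same input.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X):
        d_k = X.shape[-1]
        weights = softmax(X @ X.T / np.sqrt(d_k))   # each position scores every other
        return weights @ X                          # weighted mix of the same sequence

    X = np.random.randn(6, 8)                       # 6 tokens, 8-dim representations
    print(self_attention(X).shape)                  # (6, 8)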