Phenomenon in neural networks where the gradients of the loss with respect to the network's parameters become vanishingly small during backpropagation, effectively preventing the weights (especially in early layers) from updating during training. It typically arises in deep networks when many small derivatives, such as those of saturating activations like the sigmoid, are multiplied together through the chain rule.
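A minimal sketch of the effect, assuming a hypothetical stack of sigmoid layers with hand-written backpropagation (none of these names come from the source); the printed gradient norm shrinks roughly geometrically with depth because each layer multiplies the backward signal by small sigmoid derivatives:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_layers = 20  # depth chosen for illustration
width = 32

# Random weights; sigmoid'(x) <= 0.25, so each layer tends to
# scale the backpropagated signal down.
weights = [rng.normal(0, 0.5, (width, width)) for _ in range(n_layers)]

# Forward pass, caching activations for the backward pass.
h = rng.normal(0, 1, (width,))
activations = [h]
for W in weights:
    h = sigmoid(W @ h)
    activations.append(h)

# Backward pass from a dummy loss gradient of ones, applying the
# chain rule: dL/dh_prev = W^T (dL/dh * sigmoid'(pre)).
grad = np.ones(width)
for i in reversed(range(n_layers)):
    a = activations[i + 1]          # post-activation of layer i
    grad = weights[i].T @ (grad * a * (1 - a))
    print(f"layer {i:2d}: ||dL/dh|| = {np.linalg.norm(grad):.3e}")
```

Running this shows the norm dropping by orders of magnitude toward the input layers, which is why remedies such as ReLU activations, careful initialization, or residual connections are commonly used.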