Parametric Memory

A memory paradigm in which knowledge and facts are stored implicitly in the learned parameters (weights) of a model rather than in an explicit external store, allowing machine learning systems to retrieve information directly through inference.

Detailed Explanation: Parametric memory is a form of memory used in artificial intelligence and machine learning models in which knowledge is embedded within the parameters of the model itself. This contrasts with non-parametric memory, where information is stored explicitly and retrieved from an external database or memory bank at inference time, as in retrieval-augmented systems. Parametric memory is central to neural networks, particularly Transformers, whose weights capture and encode knowledge from the training data. Because that knowledge is already encoded in the weights, the model can generalize from its training data and respond to unseen inputs without consulting a separate memory store. This is especially useful in natural language processing, where large amounts of factual and contextual information are distilled into the network's parameters during pretraining.
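
To make the idea concrete, the short sketch below queries a pretrained masked language model and shows it recalling a simple fact purely from its learned weights, with no external database or retrieval step. This is a minimal illustration, assuming the Hugging Face transformers library is installed and the bert-base-uncased checkpoint is available; the model choice and prompt are illustrative, not taken from the text above.

```python
# Minimal sketch of parametric memory: a pretrained masked language model
# recalls a fact directly from its weights, with no external lookup.
# Assumes the Hugging Face `transformers` library and the `bert-base-uncased`
# checkpoint; both are illustrative choices.
from transformers import pipeline

# Load a pretrained masked language model; its "memory" is its learned weights.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Pose a factual query by masking the answer token. The model fills the blank
# using only what it absorbed during pretraining (parametric memory).
predictions = fill_mask("The capital of France is [MASK].")

for p in predictions[:3]:
    # Each prediction carries the candidate token and the model's confidence.
    print(f"{p['token_str']:>10}  (score: {p['score']:.3f})")
```

A non-parametric (retrieval-based) system would instead fetch the answer from an external index at query time; here the fact is recoverable only because it was encoded into the model's parameters during training.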

Historical Overview: The concept of parametric memory gained prominence with the rise of deep learning in the 2010s, and especially with the introduction of attention mechanisms and the Transformer architecture in 2017, which demonstrated how effectively large amounts of knowledge can be stored within model parameters.

Key Contributors: Significant figures in the development of parametric memory include the researchers behind the Transformer architecture, such as Ashish Vaswani and colleagues at Google, whose 2017 paper "Attention Is All You Need" was pivotal in showcasing the capacity of parameterized models to store and process knowledge.