David Silver

(25 articles)

1952

RL
Reinforcement Learning

Type of ML where an agent learns to make decisions by performing actions in an environment to achieve a goal, guided by rewards.

Generality: 890

1956

Motor Learning

Process by which robots or AI systems acquire, refine, and optimize motor skills through experience and practice.

Generality: 675

1962

Function Approximation

Method used in AI to estimate complex functions using simpler, computationally efficient models.

Generality: 810

1976

Overfitting

When an ML model learns the detail and noise in its training data so closely that its performance on new data suffers.

Generality: 890

1980

Universal Learning Algorithms

Theoretical frameworks aimed at creating systems capable of learning any task to human-level competency, leveraging principles that could allow for generalization across diverse domains.

Generality: 840

1986

State Representation

The method by which an AI system formulates a concise and informative description of the environment's current situation or context.

Generality: 682

1988

Temporal Difference Learning

A method in reinforcement learning that updates predictions based on the difference between successive predictions, rather than solely relying on final outcome errors.

Generality: 775
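
As a rough illustration of the definition above, here is a minimal TD(0) sketch. The five-state random walk, the function name `td0_random_walk`, and the step size are all assumptions chosen for the example, not something this entry specifies. Each update moves a state's value toward the reward plus the next state's current prediction, instead of waiting for the final outcome.

```python
import random


def td0_random_walk(episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values with TD(0) on a hypothetical 5-state random walk.

    The walk starts in the middle state and steps left or right at random.
    Leaving on the right pays reward 1, leaving on the left pays 0.
    """
    rng = random.Random(seed)
    n_states = 5
    values = [0.5] * n_states  # initial predictions for non-terminal states

    for _ in range(episodes):
        state = n_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:              # terminated on the left
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:    # terminated on the right
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # TD error: difference between successive predictions.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error

            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    print([round(v, 2) for v in td0_random_walk()])
```

In this toy task the estimates should rise from the leftmost state to the rightmost, since states closer to the right exit are more likely to end with reward 1.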

1991

Meta-Learning

Techniques for "learning to learn" that enable AI models to adapt quickly to new tasks with minimal data.

Generality: 858

1991

Catastrophic Forgetting

Phenomenon where a neural network forgets previously learned information upon learning new data.

Generality: 686

1992

Policy Learning

Branch of reinforcement learning where the objective is to find an optimal policy that dictates the best action to take in various states to maximize cumulative reward.

Generality: 790

1992

Policy Gradient Algorithm

Type of RL algorithm that optimizes the policy directly by computing gradients of expected rewards with respect to policy parameters.

Generality: 805
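
To make the idea of following the gradient of expected reward concrete, here is a small REINFORCE-style sketch on a two-armed bandit. The bandit, its reward probabilities, the softmax parameterization, and the name `reinforce_bandit` are assumptions made for illustration. It is one simple instance of a policy gradient method, without a baseline or any other variance reduction.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(steps=5000, lr=0.1, seed=0):
    """Train a softmax policy on a hypothetical two-armed bandit."""
    rng = random.Random(seed)
    win_prob = [0.2, 0.8]  # assumed reward probability of each arm
    prefs = [0.0, 0.0]     # policy parameters (one preference per arm)

    for _ in range(steps):
        probs = softmax(prefs)
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if rng.random() < win_prob[action] else 0.0

        # For a softmax policy, d log pi(action) / d prefs[k]
        # equals 1[k == action] - probs[k].
        for k in range(len(prefs)):
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            prefs[k] += lr * reward * grad_log  # gradient ascent step

    return softmax(prefs)


if __name__ == "__main__":
    print(reinforce_bandit())
```

Under these assumptions the returned distribution should put most of its probability on the second arm, which pays out more often.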

1992

Policy Gradient

Class of algorithms in RL that optimizes the parameters of a policy directly through gradient ascent on expected future rewards.

Generality: 675

2013

DRL
Deep Reinforcement Learning

Combines neural networks with a reinforcement learning framework, enabling AI systems to learn optimal actions through trial and error to maximize a cumulative reward.

Generality: 855

2014

Sequence Prediction

Forecasting the next item(s) in a sequence based on the pattern observed in the preceding items.

Generality: 825

2014

Autoregressive Sequence Generator

A predictive model, used particularly for time series, that feeds its own prior outputs back in as inputs to subsequent predictions.

Generality: 650
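
The feedback loop in the definition above can be sketched with a fixed linear model. The function name `autoregressive_generate` and the hand-picked coefficients are assumptions for illustration. A real generator would learn its predictor from data, but the loop of appending each output and predicting from it would look much the same.

```python
def autoregressive_generate(history, coeffs, steps):
    """Extend `history` by `steps` values, each predicted from earlier ones.

    `coeffs` weights the most recent len(coeffs) values, oldest first.
    Every prediction is appended to the sequence and becomes an input
    to the predictions that follow.
    """
    order = len(coeffs)
    if len(history) < order:
        raise ValueError("history must contain at least len(coeffs) values")

    sequence = list(history)
    for _ in range(steps):
        window = sequence[-order:]  # includes earlier generated outputs
        next_value = sum(c * x for c, x in zip(coeffs, window))
        sequence.append(next_value)
    return sequence


if __name__ == "__main__":
    # With both coefficients set to 1, each value is the sum of the last two.
    print(autoregressive_generate([1, 1], [1, 1], 5))
```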

2015

Robustness

Ability of an algorithm or model to deliver consistent and accurate results under varying operating conditions and input perturbations.

Generality: 885

2015

DQN
Deep Q-Networks

RL technique that combines Q-learning with deep neural networks to enable agents to learn how to make optimal decisions from high-dimensional sensory inputs.

Generality: 853

2016

Move 37

Pivotal move made by AlphaGo in its second game against Go champion Lee Sedol, which showcased the superior strategic capabilities of AI in the game of Go.

Generality: 140

2016

Sample Efficiency

Ability of an ML model to achieve high performance with a relatively small number of training samples.

Generality: 815

2017

Expressive Hidden States

Internal representations within a neural network that effectively capture and encode complex patterns and dependencies in the input data.

Generality: 695

2017

Ablation

Method where components of a neural network are systematically removed or altered to study their impact on the model's performance.

Generality: 650

2018

Precomputed Policy

A strategy computed in advance for decision-making processes in AI systems, particularly within reinforcement learning, to optimize future actions.

Generality: 550

2019

Post-Training

Techniques and adjustments applied to neural networks after their initial training phase to enhance performance, efficiency, or adaptability to new data or tasks.

Generality: 650

2020

1-N Systems

Architectures where one input or controller manages multiple outputs or agents, applicable in fields like neural networks and robotics.

Generality: 790

2021

Instruction Following Model

AI system designed to execute tasks based on specific commands or instructions provided by users.

Generality: 640
