Sergey Levine

(9 articles)

1992

Policy Gradient Algorithm

Type of RL algorithm that optimizes the policy directly by computing gradients of expected rewards with respect to policy parameters.

Generality: 805

1992

Policy Gradient

Class of algorithms in RL that optimizes the parameters of a policy directly through gradient ascent on expected future rewards.

Generality: 675
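
As a minimal sketch of gradient ascent on expected reward, the following uses a REINFORCE-style update on a toy multi-armed bandit. The bandit setting, the softmax policy, the running-mean baseline, and all names (`reinforce_bandit`, `arm_means`, `lr`) are assumptions chosen for illustration, not something the entry itself specifies.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(arm_means, steps=2000, lr=0.1, seed=0):
    """REINFORCE-style policy gradient on a toy bandit (illustrative only)."""
    rng = random.Random(seed)
    theta = [0.0] * len(arm_means)  # policy parameters: one preference per arm
    baseline = 0.0                  # running mean reward, used to reduce variance
    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = arm_means[action] + rng.gauss(0.0, 0.1)
        advantage = reward - baseline
        # For a softmax policy, d log pi(a) / d theta_k = 1[k == a] - probs[k].
        for k in range(len(theta)):
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * advantage * grad_log  # ascent step
        baseline += (reward - baseline) / t
    return softmax(theta)


probs = reinforce_bandit([0.2, 0.8])
```

Under these assumptions the returned probabilities should favor the arm with the higher mean reward, since the update raises the log-probability of actions that beat the baseline.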

2012

Data Efficient Learning

ML approach that requires less data to train a functional model.

Generality: 791

2015

TRPO
Trust Region Policy Optimization

RL algorithm that keeps policy updates stable and reliable by optimizing within a trust region, which prevents drastic changes to the policy from one update to the next.

Generality: 635
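
The trust-region idea is commonly written as a constrained optimization problem. The form below is the standard textbook statement; the symbols ($\pi_\theta$ for the policy, $A$ for the advantage estimate, $\delta$ for the trust-region size) are notation assumed here, since the entry itself gives no formula.

```latex
\max_{\theta} \;
  \mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}
  \left[ \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)} \, A(s, a) \right]
\quad \text{subject to} \quad
  \mathbb{E}_{s \sim \pi_{\theta_{\text{old}}}}
  \left[ D_{\mathrm{KL}}\!\left( \pi_{\theta_{\text{old}}}(\cdot \mid s) \,\|\, \pi_{\theta}(\cdot \mid s) \right) \right]
  \le \delta
```

The KL constraint is what bounds how far the new policy may move from the old one in a single update.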

2016

Imitation Learning

AI technique where models learn to perform tasks by mimicking human behavior or strategies demonstrated in training data.

Generality: 850
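
One of the simplest forms of imitation learning is behavior cloning, which treats demonstrations as supervised (state, action) pairs. The sketch below is a tabular toy version that copies the most frequent demonstrated action per state; the function name `fit_behavior_cloning` and the traffic-light data are hypothetical and serve only as an illustration.

```python
from collections import Counter, defaultdict


def fit_behavior_cloning(demonstrations):
    """Return a state -> action table copying the most common expert action."""
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical demonstrations; one "red_light" example is deliberately noisy.
demos = [
    ("red_light", "stop"),
    ("green_light", "go"),
    ("red_light", "stop"),
    ("red_light", "go"),
]
policy = fit_behavior_cloning(demos)
```

Practical systems replace the lookup table with a learned model that generalizes to unseen states, but the supervised framing is the same.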

2016

Sample Efficiency

Ability of an ML model to achieve high performance with a relatively small number of training samples.

Generality: 815

2016

FSL
Few-Shot Learning

ML approach that enables models to learn and make accurate predictions from a very small dataset.

Generality: 575
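
A minimal sketch of one way to predict from very few labeled examples: average each class's handful of examples into a prototype, then assign a query to the nearest prototype. This nearest-centroid approach is just one illustrative technique, not the definition of few-shot learning, and the names `class_prototypes` and `classify` and the toy data are assumptions.

```python
import math


def class_prototypes(support):
    """Average the few labeled feature vectors available for each class."""
    sums, counts = {}, {}
    for features, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(query, prototypes):
    """Assign the query to the class with the nearest prototype (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Hypothetical two-shot support set: two examples per class.
support = [([0.0, 0.0], "a"), ([0.0, 1.0], "a"), ([5.0, 5.0], "b"), ([6.0, 5.0], "b")]
prototypes = class_prototypes(support)
```

In practice the feature vectors would come from a pretrained encoder, which is where most of the few-shot capability resides.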

2016

Few Shot

ML technique designed to recognize patterns and make predictions based on a very limited amount of training data.

Generality: 675

2017

PPO
Proximal Policy Optimization

RL algorithm that aims to balance ease of implementation, sample efficiency, and reliable performance by using a simpler but effective update method for policy optimization.

Generality: 670
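
The "simpler update method" is most often realized as a clipped surrogate objective. The sketch below computes that objective for a batch of samples. The entry does not name the clipped variant, so treating it as the update is an assumption, as are the function name `ppo_clip_objective` and the default `clip_eps=0.2`.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized)."""
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum removes any incentive to push the ratio
        # outside [1 - clip_eps, 1 + clip_eps].
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# A ratio of 2 with a positive advantage is capped at 1 + clip_eps.
capped = ppo_clip_objective([math.log(2.0)], [0.0], [1.0])
```

Clipping plays a role similar to an explicit trust-region constraint while needing only first-order gradient steps.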
