Andrew G. Barto

(5 articles)

1977

Actor-Critic Models

A reinforcement learning architecture with two components: an actor that selects the actions to take and a critic that evaluates those actions to improve the policy.

Generality: 705

1988

Temporal Difference Learning

A method in reinforcement learning that updates predictions based on the difference between successive predictions, rather than solely relying on final outcome errors.

Generality: 775
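
As a rough illustration of the definition above, the following sketch applies the tabular TD(0) update, which nudges each state's prediction toward the reward plus the next state's prediction. The chain environment, the function name `td0_chain`, and the parameter values are assumptions made for this example only and do not come from the entry itself.

```python
def td0_chain(num_states=3, episodes=500, alpha=0.1, gamma=1.0):
    """TD(0) value prediction on a hypothetical deterministic chain.

    States 0..num_states-1 are visited in order. Leaving the last state
    ends the episode with reward 1; every other transition gives reward 0.
    """
    values = [0.0] * num_states
    for _ in range(episodes):
        for state in range(num_states):
            is_last = state == num_states - 1
            reward = 1.0 if is_last else 0.0
            # The terminal state is assigned a value of zero.
            next_value = 0.0 if is_last else values[state + 1]
            # TD error: difference between successive predictions.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error
    return values


if __name__ == "__main__":
    print(td0_chain())
```

With these assumed settings every state's estimate approaches 1, the true undiscounted return, even though only the final transition is rewarded: earlier states learn from the predictions of later states rather than waiting for the final outcome.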

1989

Q-Value

A measure used in reinforcement learning to represent the expected future reward an agent can obtain by starting from a given state and choosing a particular action.

Generality: 820
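
A minimal sketch of how such a value could be computed for one state-action pair, assuming the transition probabilities, the immediate rewards, and the values of the successor states are already known. The function name `q_value`, the state labels, and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def q_value(transitions, state_values, gamma):
    """Expected return of one state-action pair via a one-step lookahead.

    transitions: list of (probability, reward, next_state) tuples that
                 describe what can happen after taking the action.
    state_values: dict mapping each next_state to its estimated value.
    gamma: discount factor applied to future value.
    """
    return sum(
        prob * (reward + gamma * state_values[next_state])
        for prob, reward, next_state in transitions
    )


if __name__ == "__main__":
    # Hypothetical action: equal chance of reaching "s1" (reward 1) or "s2" (reward 0).
    outcomes = [(0.5, 1.0, "s1"), (0.5, 0.0, "s2")]
    values = {"s1": 2.0, "s2": 4.0}
    print(q_value(outcomes, values, gamma=0.5))
```

Here each outcome contributes its probability times the immediate reward plus the discounted value of where it leads: 0.5 × (1 + 0.5 × 2) + 0.5 × (0 + 0.5 × 4) = 2.0.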

1990

Incremental Learning

A method where AI systems continuously acquire new data and knowledge while retaining previously learned information without retraining from scratch.

Generality: 750

2018

Precomputed Policy

A strategy computed in advance for decision-making processes in AI systems, particularly within reinforcement learning, to optimize future actions.

Generality: 550