Andrew G. Barto
(5 articles)
Actor-Critic Models
A reinforcement learning architecture with two components: an actor that selects the actions to take and a critic that evaluates those actions so that the policy can be improved.
Generality: 705
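As an illustration of the actor-critic entry above, here is a minimal sketch of a single one-step update in a tabular setting. It assumes a softmax actor over per-state action preferences and a critic that stores one value per state; the function names, step sizes, and the two-state example are hypothetical and chosen only for illustration.

```python
import math


def softmax(prefs):
    """Turn a list of action preferences into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(prefs, values, state, action, reward, next_state,
                      alpha_actor=0.1, alpha_critic=0.1, gamma=0.9,
                      terminal=False):
    """Apply one actor-critic update in place and return the TD error."""
    # Critic: evaluate the action through the one-step TD error.
    bootstrap = 0.0 if terminal else values[next_state]
    td_error = reward + gamma * bootstrap - values[state]
    values[state] += alpha_critic * td_error

    # Actor: move preferences along the softmax log-probability gradient,
    # scaled by the critic's TD error.
    probs = softmax(prefs[state])
    for a in range(len(prefs[state])):
        grad = (1.0 if a == action else 0.0) - probs[a]
        prefs[state][a] += alpha_actor * td_error * grad
    return td_error


# Hypothetical example: two states, two actions, all estimates start at zero.
prefs = {"s0": [0.0, 0.0]}
values = {"s0": 0.0, "s1": 0.0}
delta = actor_critic_step(prefs, values, "s0", action=1, reward=1.0,
                          next_state="s1")
```

A positive TD error raises both the critic's value for the state and the actor's preference for the action that was taken.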
Temporal Difference Learning
A method in reinforcement learning that updates predictions based on the difference between successive predictions, rather than solely relying on final outcome errors.
Generality: 775
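The temporal difference idea above can be sketched with the tabular TD(0) update, the simplest member of the family. The dictionary-based value table, the state names, and the parameter defaults are assumptions made for this example.

```python
def td0_update(values, state, reward, next_state,
               alpha=0.1, gamma=0.9, terminal=False):
    """Apply one TD(0) update to `values` in place and return the TD error."""
    # The target uses the current prediction for the next state,
    # not the final outcome of the episode.
    bootstrap = 0.0 if terminal else values[next_state]
    td_error = reward + gamma * bootstrap - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical two-state example.
values = {"A": 0.0, "B": 1.0}
delta = td0_update(values, "A", reward=0.5, next_state="B")
```

The update moves the prediction for state A a fraction `alpha` of the way toward `reward + gamma * values["B"]`, so learning can happen after every step rather than only at the end of an episode.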
Q-Value
A measure used in reinforcement learning (RL) for the expected cumulative future reward an agent can obtain by taking a particular action in a given state and following its policy thereafter.
Generality: 820
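One common way to estimate Q-values is the Q-learning update, sketched below for a tabular case. Q-learning is only one of several estimation methods; the table layout (a dictionary keyed by state and action), the state and action names, and the step size are assumptions for this example.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Update Q(state, action) in place and return the new estimate."""
    # Bootstrap from the best action value available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: unseen (state, action) pairs default to 0.0.
q = defaultdict(float)
q[("s1", "right")] = 2.0
new_q = q_learning_update(q, "s0", "right", reward=1.0, next_state="s1",
                          actions=["left", "right"])
```

Acting greedily with respect to such a table means picking, in each state, the action with the highest Q-value.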
Incremental Learning
A method where AI systems continuously acquire new data and knowledge while retaining previously learned information without retraining from scratch.
Generality: 750
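A very small instance of the incremental idea above is a running mean that absorbs each new observation without revisiting earlier ones. This sketch covers only the "update without retraining from scratch" aspect; it does not address forgetting in larger models, and the class name is hypothetical.

```python
class RunningMean:
    """Keeps a mean estimate that is updated one observation at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x):
        """Fold a new observation into the estimate and return the new mean."""
        self.count += 1
        # new_mean = old_mean + (x - old_mean) / n, so old data is never stored.
        self.mean += (x - self.mean) / self.count
        return self.mean


estimator = RunningMean()
for observation in [2.0, 4.0, 6.0]:
    estimator.update(observation)
```

The estimate after each update equals the mean of all observations seen so far, yet memory use stays constant.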
Precomputed Policy
A policy, that is, a mapping from situations to actions, computed in advance rather than at decision time, used in AI systems and particularly in reinforcement learning so that good actions can be looked up when they are needed.
Generality: 550
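To make the precomputed policy entry concrete, the sketch below computes a policy offline with value iteration on a tiny deterministic problem and then only performs table lookups at decision time. Value iteration is one possible way to do the precomputation; the four-cell corridor, its reward, and all function names are hypothetical.

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, sweeps=100):
    """Return a state -> action table for a deterministic problem."""
    values = {s: 0.0 for s in states}
    for _ in range(sweeps):
        # Synchronous sweep: every state is backed up from the old values.
        values = {
            s: max(reward(s, a) + gamma * values[transition(s, a)]
                   for a in actions)
            for s in states
        }
    # Extract the greedy action for each state once, ahead of time.
    return {
        s: max(actions,
               key=lambda a: reward(s, a) + gamma * values[transition(s, a)])
        for s in states
    }


# Hypothetical corridor with cells 0..3; cell 3 is an absorbing goal.
states = [0, 1, 2, 3]
actions = ["left", "right"]


def transition(s, a):
    if s == 3:
        return 3
    return min(s + 1, 3) if a == "right" else max(s - 1, 0)


def reward(s, a):
    # Reward 1.0 for the step that enters the goal, 0.0 otherwise.
    return 1.0 if s != 3 and transition(s, a) == 3 else 0.0


POLICY = value_iteration(states, actions, transition, reward)  # offline step


def act(state):
    """Decision-time lookup: no planning happens here."""
    return POLICY[state]
```

The trade-off is that the table is only valid for the model it was computed from; if the environment changes, the policy has to be recomputed.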