Christopher Watkins
(2 articles)
1989
Q-Learning
Model-free reinforcement learning algorithm that learns the value of each action in each state, enabling an agent to choose actions that maximize cumulative reward over time.
Generality: 870
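
As an illustration of the definition above, here is a minimal sketch of tabular Q-learning in Python. The corridor environment, the function name `q_learning`, and every hyperparameter value (`alpha`, `gamma`, `epsilon`, the episode count) are assumptions chosen for this example and are not part of the entry. The agent starts at the left end of a short corridor and receives a reward of 1 only on reaching the right end.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1, actions are 0 (left) and 1 (right).
    Reaching the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    # One row per state, one column per action; all estimates start at 0.
    q = [[0.0, 0.0] for _ in range(n_states)]
    terminal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != terminal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            # Deterministic transition; moving left from state 0 stays put.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == terminal else 0.0

            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
greedy_policy = [row.index(max(row)) for row in q_table[:-1]]
```

In this sketch the agent never uses a model of the transitions; it only observes the sampled next state and reward, which is what "model-free" refers to. The greedy policy read off the learned table should prefer moving right in every non-terminal state.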

1989
Q-Value
Measure used in reinforcement learning (RL) to represent the expected cumulative future reward that an agent can obtain by starting in a given state, taking a particular action, and following its policy thereafter.
Generality: 820
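
One common way to write this quantity down, given here as a sketch rather than as the entry's own formulation, is the action-value function of a policy. The discount factor $\gamma$, the policy symbol $\pi$, and the reward indexing $r_{t+1}$ are notational assumptions that do not appear in the entry.

```latex
Q^{\pi}(s, a) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \, r_{t+1} \;\middle|\; s_0 = s,\; a_0 = a \right],
\qquad 0 \le \gamma < 1 .
```

Here $s$ is the starting state, $a$ is the first action taken, and the expectation is over the rewards that follow when the agent acts according to $\pi$ afterwards.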