John Langford
(3 articles)
1971
Best-of-N
A strategy in AI that involves generating multiple outputs and selecting the best one based on a predefined criterion or scoring function.
Generality: 575
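As a minimal sketch of the Best-of-N idea described above: draw N candidates, score each one, and keep the highest-scoring candidate. The names `best_of_n`, `generate`, and `score` are hypothetical and chosen for illustration only. The toy generator and scoring function stand in for whatever model and criterion a real system would use.

```python
import random
from typing import Callable, List, TypeVar

T = TypeVar("T")


def best_of_n(generate: Callable[[], T], score: Callable[[T], float], n: int) -> T:
    """Draw n candidates from `generate` and return the one with the highest score."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates: List[T] = [generate() for _ in range(n)]
    # max() keeps the first candidate among any that tie for the top score.
    return max(candidates, key=score)


# Toy usage (assumed setup): sample random numbers in [0, 1) and
# prefer the one closest to a target value.
rng = random.Random(0)
target = 0.5
sample = best_of_n(lambda: rng.random(), lambda x: -abs(x - target), n=16)
```

Raising N generally improves the selected output only to the extent that the scoring function reflects the quality actually wanted, and it multiplies the generation cost by N.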

1992
Policy Learning
A branch of reinforcement learning in which the objective is to find an optimal policy, that is, a mapping from states to actions that maximizes cumulative reward.
Generality: 790

2018
Precomputed Policy
A decision-making strategy computed in advance, particularly in reinforcement learning, so that an AI system can select its actions from the stored result rather than working them out at decision time.
Generality: 550
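One way to picture a precomputed policy, as a sketch under stated assumptions: solve a small decision problem offline, store the resulting state-to-action table, and reduce run-time decisions to a lookup. The five-state chain, the reward of 1 for reaching the last state, the discount factor, and the use of value iteration are all illustrative assumptions, not something the entry above specifies.

```python
from typing import Dict, Tuple

N_STATES = 5                  # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = ("left", "right")
GAMMA = 0.9                   # assumed discount factor


def step(state: int, action: str) -> Tuple[int, float]:
    """Deterministic transition: return (next_state, reward)."""
    if action == "left":
        nxt = max(state - 1, 0)
    else:
        nxt = min(state + 1, N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def action_value(state: int, action: str, values: list) -> float:
    """One-step lookahead: immediate reward plus discounted value of the next state."""
    nxt, reward = step(state, action)
    return reward + GAMMA * values[nxt]


def precompute_policy(sweeps: int = 100) -> Dict[int, str]:
    """Run value iteration offline and return a state -> action table."""
    values = [0.0] * N_STATES  # the terminal state's value stays 0
    for _ in range(sweeps):
        for s in range(N_STATES - 1):
            values[s] = max(action_value(s, a, values) for a in ACTIONS)
    return {
        s: max(ACTIONS, key=lambda a: action_value(s, a, values))
        for s in range(N_STATES - 1)
    }


# Computed once, ahead of time.
POLICY = precompute_policy()


def act(state: int) -> str:
    """Run-time decision: a table lookup, with no planning or learning."""
    return POLICY[state]
```

The trade-off in this sketch is that all planning cost is paid up front, and the table stays valid only while the environment matches the model used to compute it.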