John Langford

(3 articles)

1971

Best-of-N

A strategy in AI that involves generating multiple outputs and selecting the best one based on a predefined criterion or scoring function.

Generality: 575
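
As a rough illustration of the Best-of-N definition above, the sketch below generates N candidates and keeps the highest-scoring one. The names `best_of_n`, `generate`, and `score` are hypothetical and chosen only for this example; in practice the generator might be a sampled model and the scorer a reward model or heuristic.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def best_of_n(generate: Callable[[int], T], score: Callable[[T], float], n: int) -> T:
    """Produce n candidates and return the one with the highest score.

    `generate` and `score` are assumed, caller-supplied functions:
    generate(i) returns the i-th candidate, score(c) rates a candidate.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate(i) for i in range(n)]
    # max() returns the first candidate with the top score if there is a tie.
    return max(candidates, key=score)


# Toy usage: fixed candidates scored by length (a stand-in scoring function).
drafts = ["a", "abc", "ab"]
best = best_of_n(lambda i: drafts[i], len, n=3)
```

The quality of the result depends entirely on the scoring function: a larger N only helps if the scorer ranks candidates in a way that matches the real objective.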

1992

Policy Learning

A branch of reinforcement learning in which the objective is to find an optimal policy: a mapping from states to actions that maximizes cumulative reward.

Generality: 790
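
A minimal sketch of policy learning, simplified to a single-state problem (a two-action bandit) so that it stays short. The reward values, the softmax parameterization, and the function names are assumptions made for this illustration; the update is plain gradient ascent on the exact expected reward, not any specific published algorithm.

```python
import math

# Hypothetical expected reward for each of two actions.
REWARDS = [1.0, 0.0]


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def learn_policy(steps=200, lr=1.0):
    """Gradient ascent on expected reward for a softmax policy."""
    prefs = [0.0] * len(REWARDS)
    for _ in range(steps):
        probs = softmax(prefs)
        expected = sum(p * r for p, r in zip(probs, REWARDS))
        # Exact gradient of expected reward with respect to each preference.
        grads = [p * (r - expected) for p, r in zip(probs, REWARDS)]
        prefs = [th + lr * g for th, g in zip(prefs, grads)]
    return softmax(prefs)


probs = learn_policy()  # probability mass shifts toward the better action
```

With multiple states the same idea applies per state, and the gradient is usually estimated from sampled trajectories rather than computed exactly.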

2018

Precomputed Policy

A decision-making strategy computed in advance, particularly in reinforcement learning, so that an AI system can choose its future actions without recomputing them at decision time.

Generality: 550
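
One way to picture a precomputed policy is sketched below: value iteration runs once, ahead of time, on a small hypothetical corridor environment, and the resulting state-to-action table is then consulted by simple lookup. The environment, the discount factor, and all names here are assumptions for illustration only.

```python
# Hypothetical 1-D corridor: states 0..4, with the goal at state 4.
N_STATES = 5
GOAL = 4
ACTIONS = {"left": -1, "right": +1}
GAMMA = 0.9  # assumed discount factor


def step(state, action):
    """Deterministic transition; reward 1.0 for arriving at the goal."""
    nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


def q_value(state, action, values):
    """One-step lookahead value of taking `action` in `state`."""
    nxt, reward = step(state, action)
    return reward + GAMMA * values[nxt]


def precompute_policy(sweeps=50):
    """Run value iteration offline and return a state -> action table."""
    values = [0.0] * N_STATES  # the goal is terminal, so its value stays 0
    for _ in range(sweeps):
        for s in range(N_STATES):
            if s != GOAL:
                values[s] = max(q_value(s, a, values) for a in ACTIONS)
    return {
        s: max(ACTIONS, key=lambda a: q_value(s, a, values))
        for s in range(N_STATES)
        if s != GOAL
    }


# Computed once in advance; acting later is a dictionary lookup.
POLICY = precompute_policy()


def act(state):
    return POLICY[state]
```

The trade-off in this sketch is that the table is only valid for the environment it was computed on; if the dynamics or rewards change, it has to be recomputed.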