Paul Christiano

(6 articles)

Control Problem

Challenge of ensuring that highly advanced AI systems act in accordance with human values and intentions.

Generality: 845

Field of research aimed at ensuring that AI technologies are beneficial and do not harm humanity.

Generality: 870

Diverse scenarios where AI systems do not perform as expected or generate unintended consequences.

Generality: 714

Technique that combines reinforcement learning (RL) with human feedback to guide the learning process towards desired outcomes.

Generality: 625

Process of ensuring that an AI system's goals and behaviors are consistent with human values and ethics.

Generality: 790

Probability of an existential catastrophe, often discussed within the context of AI safety and risk assessment.

Generality: 550