Trevor Hastie

(24 articles)
Regression
1805

Statistical method used in ML to predict a continuous outcome variable based on one or more predictor variables.

Generality: 860
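
As a minimal sketch of the definition above, the following fits a straight line to one predictor by ordinary least squares. The function name fit_line and the sample data are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)

# Predict the continuous outcome for a new predictor value.
prediction = slope * 4.0 + intercept
```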

PCA (Principal Component Analysis)
1901

A statistical procedure that linearly transforms a dataset into a set of orthogonal components ordered by the variance they capture, so that keeping only the leading components reduces dimensionality while preserving as much variability as possible.

Generality: 500
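
One way to illustrate the idea is to find the first principal component with power iteration on the sample covariance matrix. This is a sketch under assumptions: the function name, the all-ones starting vector, and the toy data are hypothetical, and production code would normally use an eigendecomposition or SVD routine instead.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the leading principal component."""
    n = len(rows)
    d = len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration; assumes the start vector is not orthogonal
    # to the leading eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project the centered data onto the component (one number per row).
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical 2-D points scattered around the line y = x.
points = [(2.0, 1.9), (0.0, 0.1), (-1.0, -1.2), (3.0, 3.1), (1.0, 1.1)]
direction, scores = first_principal_component(points)
```

Here the two-dimensional points are reduced to a single score each, measured along the direction of greatest variance.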

Cross Validation
1931

Statistical method used to estimate the skill of ML models on unseen data by repeatedly partitioning the original dataset into a training set to fit the model and a held-out test set to evaluate it, then averaging the results across partitions.

Generality: 852
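
The partitioning step can be sketched as a k-fold index splitter. The function name k_fold_indices is hypothetical, and the folds here are contiguous and unshuffled for simplicity; in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    # A model would be fit on `train` and scored on `test` here;
    # the k scores are then averaged.
    pass
```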

Statistical Classification
1956

The problem of identifying which category or class an object belongs to based on its features or characteristics.

Generality: 500

Unsupervised Learning
1958

Type of ML where algorithms learn patterns from unlabeled data, without any guidance on what outcomes to predict.

Generality: 905

Supervised Classifier
1959

Algorithm that, given a set of labeled training data, learns to predict the labels of new, unseen data.

Generality: 870

Regularization
1970

Technique used in machine learning to reduce model overfitting by adding a penalty to the loss function based on the complexity of the model.

Generality: 845
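
A small worked case of a penalized loss, assuming a squared-error loss, a single weight with no intercept, and an L2 (ridge) penalty. The function name ridge_slope and the data are illustrative only.

```python
def ridge_slope(xs, ys, penalty):
    """Minimize sum((y - w*x)**2) + penalty * w**2 over a single weight w.

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + penalty).
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)


# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

unpenalized = ridge_slope(xs, ys, 0.0)   # plain least squares
penalized = ridge_slope(xs, ys, 14.0)    # penalty shrinks the weight toward zero
```

A larger penalty pulls the weight further toward zero, trading a worse fit on the training data for a simpler model.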

Bias-Variance Trade-off
1970

In ML, achieving optimal model performance involves balancing bias and variance to minimize overall error.

Generality: 818

Curse of Dimensionality
1970

Phenomenon where the volume of the feature space, and with it the amount of data and computation needed to analyze it, grows exponentially with the number of dimensions or features.

Generality: 827
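
The exponential growth can be seen by counting the cells of a regular grid laid over the feature space, assuming the same number of bins on every axis. The helper name grid_cells is hypothetical.

```python
def grid_cells(bins_per_axis, n_dims):
    """Number of cells in a regular grid with the same number of bins on each axis."""
    return bins_per_axis ** n_dims


# With 10 bins per feature, covering the space needs 10**d cells.
counts = {d: grid_cells(10, d) for d in (1, 2, 3, 10)}
```

With a fixed number of samples, most cells are empty once the dimension grows beyond a handful of features.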

Probabilistic Programming
1974

Programming paradigm designed to handle uncertainty and probabilistic models, allowing for the creation of programs that can make inferences about data by incorporating statistical methods directly into the code.

Generality: 820

Empirical Risk Minimization
1974

A foundational principle in statistics and ML that selects the model minimizing the average of the loss function over a sample dataset.

Generality: 814
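
A minimal sketch of the principle, assuming a squared loss and a small hand-picked set of candidate models (lines through the origin). All names and data below are made up for illustration.

```python
def empirical_risk(predict, data, loss):
    """Average loss of `predict` over a sample of (x, y) pairs."""
    return sum(loss(predict(x), y) for x, y in data) / len(data)


def squared_loss(prediction, target):
    return (prediction - target) ** 2


# Hypothetical sample, roughly following y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

# Hypothesis class: y = w * x for a few candidate slopes w.
candidate_slopes = [1.0, 1.5, 2.0, 2.5]
risks = {
    w: empirical_risk(lambda x, w=w: w * x, data, squared_loss)
    for w in candidate_slopes
}

# Empirical risk minimization picks the candidate with the lowest average loss.
best_slope = min(risks, key=risks.get)
```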

Overfitting
1976

When an ML model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data.

Generality: 890

Feature Importance
1986

Techniques used to identify and rank the significance of input variables (features) in contributing to the predictive power of an ML model.

Generality: 800

Feature Extraction
1986

Process of transforming raw data into a set of features that are more meaningful and informative for a specific task, such as classification or prediction.

Generality: 880

Boosting
1989

ML ensemble technique that combines multiple weak learners to form a strong learner, aiming to improve the accuracy of predictions.

Generality: 800
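
As one concrete instance of the idea, the sketch below follows the AdaBoost recipe with one-dimensional decision stumps as the weak learners. It is an illustration under assumptions, not a reference implementation: the function names, the toy data, and the choice of three rounds are hypothetical, and labels are assumed to be +1 or -1.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak learner: predicts `polarity` at or above the threshold, else the opposite."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Pick the (error, threshold, polarity) stump with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for w, x, y in zip(weights, xs, ys)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # vote of this weak learner
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified points, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Strong learner: sign of the weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical data that no single stump can classify correctly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
train_predictions = [predict(ensemble, x) for x in xs]
```

No single stump separates these labels, but the weighted vote of three stumps does on this toy set.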

Similarity Computation
1990

A mathematical process to quantify the likeness between data objects, often used in AI to enhance pattern recognition and data clustering.

Generality: 675
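
Cosine similarity is one common way to quantify likeness between two feature vectors; it is used here only as an example measure, and the vectors are made up. The sketch assumes both vectors are non-zero and of equal length.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])       # unrelated vectors
```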

Ensemble Algorithm
1992

Combines multiple machine learning models to improve overall performance by reducing bias, variance, or noise.

Generality: 860

Bias-Variance Dilemma
1992

Fundamental problem in supervised ML that involves a trade-off between a model’s ability to minimize error due to bias and error due to variance.

Generality: 893

Ensemble Methods
1996

ML technique where multiple models are trained and used collectively to solve a problem.

Generality: 860

Ensemble Learning
1996

ML paradigm where multiple models (often called weak learners) are trained to solve the same problem and combined to improve the accuracy of predictions.

Generality: 795

Meta-Classifier
1996

Algorithm that combines multiple ML models to improve prediction accuracy over individual models.

Generality: 811

Early Stopping
1996

A regularization technique used to prevent overfitting in ML models by halting training when performance on a validation set begins to degrade.

Generality: 675
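
The halting rule can be sketched with a patience counter over a sequence of validation losses. The function name, the patience value, and the loss values below are hypothetical; a real training loop would compute each loss after an epoch of training.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training halts once the loss has failed to improve for `patience`
    consecutive epochs; the model from `best_epoch` would be kept.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    # Never triggered: training ran through every epoch.
    return best_epoch, len(val_losses) - 1


# Hypothetical validation curve: improves, then starts to degrade.
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]
best_epoch, stop_epoch = early_stopping_epoch(val_losses, patience=2)
```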

Discriminative AI
2014

Algorithms that learn the boundary between classes of data, focusing on distinguishing between different outputs given an input.

Generality: 840

Model-Based Classifier
2015

ML algorithm that uses a pre-defined statistical model to make predictions based on input data.

Generality: 835