Hyperparameter Tuning

Hyperparameter tuning involves selecting the best combination of hyperparameters that affect the learning and structure of a machine learning model. These hyperparameters, which include settings like learning rate, number of hidden layers, batch size, or regularization parameters, are not directly learned through the training process. Instead, they must be set prior to training and can greatly influence model outcomes. Effective tuning typically employs methods such as grid search, random search, or more advanced techniques like Bayesian optimization to systematically test different hyperparameter configurations and identify the most effective ones, often measured by a model's performance on a validation dataset.

The concept of hyperparameter tuning has been integral to machine learning since its early days but became more formalized and critical with the rise of more complex models in the 1990s and 2000s. Techniques like grid search and random search have been commonly used since the early 2000s.

While hyperparameter tuning is a broad field with many contributors, the development of automated methods such as Bayesian optimization has been significantly advanced by researchers like James Bergstra and Yoshua Bengio. Their work, particularly in the early 2010s, helped streamline and improve the efficiency of the hyperparameter tuning process.

Hyperparameter Tuning

Newsletter

Academic Papers

Hyperparameter optimization

Hyperparameter optimization for machine learning models based on Bayesian optimization

Hyper-parameter optimization: A review of algorithms and applications

Bayesian optimization is superior to random search for machine learning hyperparameter tuning: Analysis of the black-box optimization challenge 2020

Hyperparameter tuning for machine learning algorithms used for arabic sentiment analysis