Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges

TLDR

Hyperparameters critically influence machine learning performance and must be carefully selected, yet manual tuning is time-consuming and often unreliable. The paper reviews key hyperparameter optimization (HPO) methods, from grid and random search to Bayesian optimization, Hyperband, and racing, and offers practical guidance on selecting an HPO algorithm, evaluating performance, integrating HPO into machine learning pipelines, improving runtime, and parallelizing the search. The article is categorized under Algorithmic Development, Statistics Technologies, Machine Learning Technologies, and Prediction.

Abstract

Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time-consuming and irreproducible manual process of trial-and-error to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods (for example, based on resampling error estimation for supervised machine learning) can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization. This article is categorized under: Algorithmic Development > Statistics Technologies > Machine Learning Technologies > Prediction
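
As a rough illustration of the simplest setting the abstract mentions, random search driven by a resampling error estimate, the following sketch tunes a single regularization hyperparameter with 5-fold cross-validation. It is not taken from the paper: the synthetic data, the one-dimensional ridge model, the search range, and all function names (`make_data`, `fit_ridge`, `cv_error`, `random_search`) are assumptions chosen only to keep the example self-contained.

```python
import random


def make_data(n, rng):
    # Synthetic 1-D regression data, y = 2x + noise (illustrative assumption).
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


def fit_ridge(xs, ys, lam):
    # Closed-form 1-D ridge regression without intercept:
    # w = sum(x * y) / (sum(x^2) + lam)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def cv_error(xs, ys, lam, k=5):
    # k-fold cross-validation estimate of the mean squared error.
    n = len(xs)
    total = 0.0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        w = fit_ridge(train_x, train_y, lam)
        total += sum((ys[i] - w * xs[i]) ** 2 for i in test_idx)
    return total / n


def random_search(xs, ys, n_trials, rng):
    # Sample lam log-uniformly from [1e-4, 1e2] and keep the configuration
    # with the lowest cross-validated error.
    best_lam, best_err = None, float("inf")
    for _ in range(n_trials):
        lam = 10.0 ** rng.uniform(-4.0, 2.0)
        err = cv_error(xs, ys, lam)
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam, best_err


rng = random.Random(0)
xs, ys = make_data(100, rng)
best_lam, best_err = random_search(xs, ys, n_trials=30, rng=rng)
print(f"best lambda: {best_lam:.4g}, CV mean squared error: {best_err:.4g}")
```

Sampling the hyperparameter on a log scale is a common choice for regularization strengths, since plausible values span several orders of magnitude. The more advanced methods listed in the abstract differ mainly in how they propose the next configuration or how they allocate evaluation budget, not in this basic propose-and-evaluate loop.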
