Publication | Closed Access
Hyperband: a novel bandit-based approach to hyperparameter optimization
Citations: 1K
References: 41
Year: 2017
Mathematical Programming, Artificial Intelligence, Hyperparameter Optimization, Hyperparameter Estimation, Engineering, Machine Learning, Bayesian Optimization, Data Science, Model Tuning, Parameter Tuning, Adaptive Resource Allocation, Computer Science, Deep Learning, Novel Bandit-based Approach, Exploration V Exploitation, Adaptive Optimization
Hyperparameter optimization is critical to machine learning performance. Whereas recent work relies on Bayesian optimization, this study accelerates random search through adaptive resource allocation and early stopping. It introduces Hyperband, a bandit-based algorithm that treats hyperparameter search as a pure-exploration, nonstochastic, infinite-armed bandit problem: a predefined resource (iterations, data samples, or features) is allocated to randomly sampled configurations, and poorly performing configurations are stopped early. In experiments against Bayesian optimization methods, Hyperband achieves more than an order-of-magnitude speedup on a variety of deep-learning and kernel-based learning problems.
Performance of machine learning algorithms depends critically on identifying a good set of hyperparameters. While recent approaches use Bayesian optimization to adaptively select configurations, we focus on speeding up random search through adaptive resource allocation and early-stopping. We formulate hyperparameter optimization as a pure-exploration nonstochastic infinite-armed bandit problem where a predefined resource like iterations, data samples, or features is allocated to randomly sampled configurations. We introduce a novel algorithm, Hyperband, for this framework and analyze its theoretical properties, providing several desirable guarantees. Furthermore, we compare Hyperband with popular Bayesian optimization methods on a suite of hyperparameter optimization problems. We observe that Hyperband can provide over an order-of-magnitude speedup over our competitor set on a variety of deep-learning and kernel-based learning problems.
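The abstract describes the idea only at a high level, so the following is a minimal sketch of a Hyperband-style loop rather than the authors' reference implementation. It assumes the commonly cited structure of the algorithm: several "brackets", each of which runs successive halving on randomly sampled configurations. The names `hyperband`, `sample_config`, `toy_loss`, `max_resource`, and `eta` are illustrative choices, `eta` is assumed to be an integer, and the bracket-size formula is one common variant. The toy objective is hypothetical and stands in for a real training run.

```python
import random


def hyperband(sample_config, evaluate, max_resource=81, eta=3):
    """Hyperband-style search; returns (best_loss, best_config).

    sample_config() draws one random configuration.
    evaluate(config, resource) returns a loss (lower is better).
    """
    # Largest s with eta**s <= max_resource, computed with integers only.
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    best_loss, best_config = float("inf"), None

    # Large s: many configurations, little resource each.
    # s = 0: few configurations, full resource each (plain random search).
    for s in range(s_max, -1, -1):
        # Initial number of configurations in this bracket (ceiling division).
        total = (s_max + 1) * eta ** s
        n = (total + s) // (s + 1)
        configs = [sample_config() for _ in range(n)]

        # Successive halving: evaluate, keep roughly the best 1/eta,
        # then give the survivors eta times more resource.
        for i in range(s + 1):
            resource = max_resource / eta ** (s - i)
            scored = sorted(
                ((evaluate(c, resource), c) for c in configs),
                key=lambda pair: pair[0],
            )
            if scored[0][0] < best_loss:
                best_loss, best_config = scored[0]
            keep = max(1, len(configs) // eta)
            configs = [c for _, c in scored[:keep]]

    return best_loss, best_config


def sample_config():
    # Hypothetical one-dimensional search space: a single value in [0, 1].
    return random.uniform(0.0, 1.0)


def toy_loss(x, resource):
    # Hypothetical objective: best at x = 0.3, and the loss shrinks
    # as more resource is spent on the configuration.
    return (x - 0.3) ** 2 + 1.0 / resource


random.seed(0)
best_loss, best_config = hyperband(sample_config, toy_loss)
print(best_loss, best_config)
```

In this sketch the early-stopping behaviour comes from the inner loop: most configurations are discarded after a cheap evaluation, and only the survivors are trained with the full resource. Running several brackets hedges against the case where cheap evaluations are a poor predictor of final performance.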