Publication | Closed Access
An Efficient Approach for Assessing Hyperparameter Importance
Citations: 312
References: 34
Year: 2014
Unknown Venue
Machine‑learning performance hinges on hyperparameter settings, and while Bayesian optimization has achieved notable successes, it offers little insight into the relative importance of hyperparameters and their interactions. This paper proposes efficient techniques that use random‑forest models trained on Bayesian‑optimization data to reveal hyperparameter importance. We present a linear‑time algorithm for computing random‑forest prediction marginals and apply functional ANOVA to quantify the importance of individual hyperparameters and their interactions, validated on popular ML frameworks and combinatorial solvers. The approach uncovers that, even in very high‑dimensional settings, most performance variation is driven by only a few hyperparameters, providing actionable insight into hyperparameter‑performance relationships.
The performance of many machine learning methods depends critically on hyperparameter settings. Sophisticated Bayesian optimization methods have recently achieved considerable successes in optimizing these hyperparameters, in several cases surpassing the performance of human experts. However, blind reliance on such methods can leave end users without insight into the relative importance of different hyperparameters and their interactions. This paper describes efficient methods that can be used to gain such insight, leveraging random forest models fit on the data already gathered by Bayesian optimization. We first introduce a novel, linear-time algorithm for computing marginals of random forest predictions and then show how to leverage these predictions within a functional ANOVA framework to quantify the importance of both single hyperparameters and of interactions between hyperparameters. We conducted experiments with prominent machine learning frameworks and state-of-the-art solvers for combinatorial problems. We show that our methods provide insight into the relationship between hyperparameter settings and performance, and demonstrate that, even in very high-dimensional cases, most performance variation is attributable to just a few hyperparameters.
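To make the functional ANOVA idea concrete, the following is a minimal sketch, not the paper's method. It decomposes the variance of a model's predictions over a small, fully enumerated grid into main effects and pairwise interactions. The function `predict`, its three hyperparameter names, and the grid sizes are hypothetical stand-ins for the predictions of a fitted random forest. The enumeration is brute force, with cost growing exponentially in the number of hyperparameters, which is exactly the cost that a linear-time marginal computation is meant to avoid.

```python
from itertools import product
from statistics import fmean


def predict(lr_idx, depth_idx, seed_idx):
    """Hypothetical surrogate prediction on a discretized grid (an assumption).

    The third hyperparameter deliberately has no effect on the output.
    """
    return 4.0 * lr_idx + 2.0 * lr_idx * depth_idx + 0.0 * seed_idx


def fanova_grid(f, sizes):
    """Return variance fractions for main effects and pairwise interactions.

    Assumes a full factorial grid with uniform weight on every grid point.
    """
    grid = list(product(*(range(n) for n in sizes)))
    values = {x: f(*x) for x in grid}
    mean = fmean(values.values())
    total_var = fmean((v - mean) ** 2 for v in values.values())

    def marginal(dims):
        # Average the prediction over every dimension not listed in `dims`.
        sums, counts = {}, {}
        for x, v in values.items():
            key = tuple(x[d] for d in dims)
            sums[key] = sums.get(key, 0.0) + v
            counts[key] = counts.get(key, 0) + 1
        return {k: sums[k] / counts[k] for k in sums}

    n_dims = len(sizes)
    effect, main = {}, {}
    for i in range(n_dims):
        # Main effect: single-dimension marginal minus the global mean.
        effect[i] = {k[0]: a - mean for k, a in marginal((i,)).items()}
        main[i] = fmean(e ** 2 for e in effect[i].values()) / total_var

    pair = {}
    for i in range(n_dims):
        for j in range(i + 1, n_dims):
            # Interaction: pairwise marginal minus mean and both main effects.
            inter = [
                a - mean - effect[i][k[0]] - effect[j][k[1]]
                for k, a in marginal((i, j)).items()
            ]
            pair[(i, j)] = fmean(e ** 2 for e in inter) / total_var
    return main, pair


main, pair = fanova_grid(predict, (2, 2, 2))
for i, frac in main.items():
    print(f"hyperparameter {i}: {frac:.3f} of variance")
for (i, j), frac in pair.items():
    print(f"interaction {i}x{j}: {frac:.3f} of variance")
```

In this toy setup a single hyperparameter accounts for most of the variance and the unused one accounts for none, mirroring in miniature the abstract's observation that a few hyperparameters dominate.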