Publication | Open Access
BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization
Citations: 202 | References: 0 | Year: 2019
Artificial Intelligence, Model Optimization, Hyperparameter Estimation, Engineering, Machine Learning, Bayesian Optimization, Data Science, Uncertainty Quantification, Model Tuning, Parameter Tuning, Computer Engineering, Knowledge Gradient, Bayesian Methods, Computer Science, Statistical Inference, Markov Chain Monte Carlo, Monte Carlo Sampling, Automatic Machine Learning
Bayesian optimization offers sample-efficient global optimization for diverse fields such as machine learning, engineering, physics, and experimental design. This work introduces BoTorch, a modern framework that integrates Monte-Carlo acquisition functions, sample-average-approximation optimization, auto-differentiation, variance reduction, and a novel one-shot Knowledge Gradient formulation. BoTorch's modular PyTorch design enables flexible specification of probabilistic models and acquisition functions. It leverages fast predictive distributions, hardware acceleration, and deterministic optimization, and it is supported by new theoretical convergence guarantees. Experiments show that BoTorch achieves improved sample efficiency relative to other popular Bayesian optimization libraries.
Bayesian optimization provides sample-efficient global optimization for a broad range of applications, including automatic machine learning, engineering, physics, and experimental design. We introduce BoTorch, a modern programming framework for Bayesian optimization that combines Monte-Carlo (MC) acquisition functions, a novel sample average approximation optimization approach, auto-differentiation, and variance reduction techniques. BoTorch's modular design facilitates flexible specification and optimization of probabilistic models written in PyTorch, simplifying implementation of new acquisition functions. Our approach is backed by novel theoretical convergence results and made practical by a distinctive algorithmic foundation that leverages fast predictive distributions, hardware acceleration, and deterministic optimization. We also propose a novel "one-shot" formulation of the Knowledge Gradient, enabled by a combination of our theoretical and software contributions. In experiments, we demonstrate the improved sample efficiency of BoTorch relative to other popular libraries.
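To make the sample average approximation idea concrete, here is a minimal sketch in plain Python. It does not use BoTorch's API. The names `toy_posterior`, `draw_base_samples`, `mc_expected_improvement`, and `grid_argmax` are hypothetical, the closed-form "posterior" is an invented stand-in for a fitted probabilistic model, and a grid search stands in for the gradient-based optimizer that auto-differentiation would make possible. The point it illustrates is that drawing the base samples once and reusing them turns the Monte-Carlo acquisition value into a deterministic function of the candidate point, which a deterministic optimizer can then maximize.

```python
import math
import random


def toy_posterior(x):
    """Hypothetical stand-in for a model posterior at scalar x: (mean, std)."""
    mean = math.sin(3.0 * x)
    std = 0.2 + 0.5 * abs(x - 0.5)
    return mean, std


def draw_base_samples(n, seed=0):
    """Draw standard-normal base samples once; reusing them is the SAA step."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def mc_expected_improvement(x, best_f, base_samples):
    """MC estimate of expected improvement via y = mean + std * z."""
    mean, std = toy_posterior(x)
    total = 0.0
    for z in base_samples:
        total += max(mean + std * z - best_f, 0.0)
    return total / len(base_samples)


def analytic_expected_improvement(x, best_f):
    """Closed-form expected improvement for a Gaussian, used as a sanity check."""
    mean, std = toy_posterior(x)
    u = (mean - best_f) / std
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return std * (u * cdf + pdf)


def grid_argmax(fn, lo, hi, steps):
    """Deterministic grid search; a placeholder for a gradient-based optimizer."""
    best_x, best_val = lo, fn(lo)
    for i in range(1, steps + 1):
        x = lo + (hi - lo) * i / steps
        val = fn(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


base_samples = draw_base_samples(4096, seed=0)  # fixed once, reused everywhere
best_f = 0.8  # assumed best observed value so far


def saa_objective(x):
    """Deterministic acquisition surface induced by the fixed base samples."""
    return mc_expected_improvement(x, best_f, base_samples)


x_star, value = grid_argmax(saa_objective, 0.0, 1.0, 200)
```

Because `base_samples` is fixed, calling `saa_objective` twice at the same point returns the same value, and the estimate should stay close to `analytic_expected_improvement` for a sample size of this order. With fresh samples on every call, the objective would be noisy and a deterministic optimizer would not apply directly.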