Publication | Closed Access
On the Use of Stochastic Hessian Information in Optimization Methods for Machine Learning
Citations: 249
References: 14
Year: 2011
Keywords: Engineering, Machine Learning, Newton-CG Method, Curvature Information, Sampled Curvature Information, Optimization Methods, Data Science, Uncertainty Quantification, Pattern Recognition, Derivative-free Optimization, Supervised Learning, Stochastic Hessian Information, Continuous Optimization, Computational Learning Theory, Large Scale Optimization, Computer Science, Statistical Learning Theory, Model Optimization, Convex Optimization, Statistical Inference
This paper describes how to incorporate sampled curvature information in a Newton-CG method and in a limited memory quasi-Newton method for statistical learning. The motivation for this work stems from supervised machine learning applications involving a very large number of training points. We follow a batch approach, also known in the stochastic optimization literature as a sample average approximation approach. Curvature information is incorporated in two subsampled Hessian algorithms, one based on a matrix-free inexact Newton iteration and one on a preconditioned limited memory BFGS iteration. A crucial feature of our technique is that Hessian-vector multiplications are carried out with a significantly smaller sample size than is used for the function and gradient. The efficiency of the proposed methods is illustrated using a machine learning application involving speech recognition.
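The abstract's key computational idea, applying the Hessian only through matrix-free Hessian-vector products on a subsample much smaller than the one used for the function and gradient, can be sketched for the Newton-CG variant. The following is a minimal illustration, not the authors' implementation: it assumes an L2-regularized logistic regression objective with labels in {-1, +1}, and the function names (`loss_and_grad`, `hessian_vector`, `subsampled_newton_cg_step`) and parameter choices are hypothetical.

```python
import numpy as np

def loss_and_grad(w, X, y, lam):
    """Full-batch loss and gradient for L2-regularized logistic
    regression with labels y in {-1, +1}. (Illustrative objective.)"""
    z = y * (X @ w)
    loss = np.mean(np.logaddexp(0.0, -z)) + 0.5 * lam * (w @ w)
    s = -y / (1.0 + np.exp(z))            # derivative of log(1 + e^{-z})
    grad = X.T @ s / X.shape[0] + lam * w
    return loss, grad

def hessian_vector(w, v, Xs, ys, lam):
    """Matrix-free Hessian-vector product H v on a fixed subsample
    (Xs, ys): H = (1/m) Xs' D Xs + lam I, D_ii = sig(z_i)(1 - sig(z_i))."""
    z = ys * (Xs @ w)
    sig = 1.0 / (1.0 + np.exp(-z))
    d = sig * (1.0 - sig)                 # per-example curvature weights
    return Xs.T @ (d * (Xs @ v)) / Xs.shape[0] + lam * v

def subsampled_newton_cg_step(w, X, y, lam, hess_sample, rng,
                              cg_tol=1e-2, max_cg=50):
    """One inexact Newton step: the gradient uses the full batch, while
    the Hessian enters only through products on a much smaller subsample,
    drawn once so CG solves a single, consistent linear system."""
    _, g = loss_and_grad(w, X, y, lam)
    idx = rng.choice(X.shape[0], size=hess_sample, replace=False)
    Xs, ys = X[idx], y[idx]
    # Conjugate gradient on H p = -g, terminated loosely (inexact Newton).
    p = np.zeros_like(w)
    r = -g.copy()
    d = r.copy()
    rs = r @ r
    for _ in range(max_cg):
        Hd = hessian_vector(w, d, Xs, ys, lam)
        alpha = rs / (d @ Hd)
        p += alpha * d
        r -= alpha * Hd
        rs_new = r @ r
        if np.sqrt(rs_new) <= cg_tol * np.linalg.norm(g):
            break
        d = r + (rs_new / rs) * d
        rs = rs_new
    return p
```

A complete solver would wrap this step in a line search; the abstract's second algorithm instead incorporates the sampled curvature information within a preconditioned limited memory BFGS iteration.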