Publication | Closed Access
Mini-batch gradient descent: Faster convergence under data sparsity
Citations: 91
References: 18
Year: 2017
Unknown Venue
Engineering, Machine Learning, Mini-batch Gradient Descent, Data Science, Data Mining, Pattern Recognition, Sparse Neural Network, Supervised Learning, Data Sparsity, Computational Learning Theory, Knowledge Discovery, Large Scale Optimization, Computer Science, Deep Learning, Adaptive Optimization, Minibatch Gradient Descent, Sparse Representation, Practical Performance, Parallel Learning
The practical performance of stochastic gradient descent on large-scale machine learning tasks is often much better than what current theoretical tools can guarantee. This gap suggests that these problems have inherent structure that could be exploited to strengthen the analysis. In this paper, we argue that data sparsity is such a property. We derive explicit expressions for how data sparsity affects the range of admissible step-sizes and the convergence factors of mini-batch gradient descent. Our theoretical results are validated by solving least-squares support vector machine problems on both synthetic and real-life data sets. The experimental results demonstrate improved performance of our update rules compared to the traditional mini-batch gradient descent algorithm.
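For orientation, the traditional mini-batch gradient descent baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's sparsity-aware update rules, which the abstract does not spell out. The objective is a plain least-squares loss standing in for the least-squares SVM problems, and the function names (`minibatch_gd`, `make_sparse_problem`, `mean_squared_error`), the dict-based sparse row format, the step size, and the synthetic data are all hypothetical choices made for this example.

```python
import random


def predict(row, w):
    """Dot product of a sparse row {feature_index: value} with dense weights w."""
    return sum(v * w[j] for j, v in row.items())


def mean_squared_error(rows, targets, w):
    """Average squared residual over the whole data set."""
    return sum((predict(r, w) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)


def minibatch_gd(rows, targets, dim, step, batch_size, epochs, seed=0):
    """Plain mini-batch gradient descent on 0.5 * mean((x.w - y)^2).

    Assumption: a fixed, hand-picked step size. The paper derives admissible
    step-size ranges from data sparsity; that rule is NOT reproduced here.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    n = len(rows)
    for _ in range(epochs):
        order = list(range(n))
        rng.shuffle(order)
        for start in range(0, n, batch_size):
            batch = order[start:start + batch_size]
            grad = {}  # sparse gradient: only coordinates touched by the batch
            for i in batch:
                residual = predict(rows[i], w) - targets[i]
                for j, v in rows[i].items():
                    grad[j] = grad.get(j, 0.0) + residual * v
            for j, g in grad.items():
                w[j] -= step * g / len(batch)
    return w


def make_sparse_problem(n, dim, nnz, seed=0):
    """Synthetic noiseless problem where each row has exactly nnz nonzeros."""
    rng = random.Random(seed)
    w_true = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    rows, targets = [], []
    for _ in range(n):
        row = {j: rng.uniform(-1.0, 1.0) for j in rng.sample(range(dim), nnz)}
        rows.append(row)
        targets.append(predict(row, w_true))
    return rows, targets, w_true


if __name__ == "__main__":
    rows, targets, _ = make_sparse_problem(n=200, dim=20, nnz=3, seed=1)
    w = minibatch_gd(rows, targets, dim=20, step=0.5, batch_size=10, epochs=50)
    print("initial loss:", mean_squared_error(rows, targets, [0.0] * 20))
    print("final loss:  ", mean_squared_error(rows, targets, w))
```

Because each row has only a few nonzero features, every mini-batch updates only a small subset of coordinates. That is the kind of structure the abstract argues can be exploited to widen the range of admissible step-sizes.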