Non-convex Optimization for Machine Learning

Abstract

A vast majority of machine learning algorithms train their models and perform\ninference by solving optimization problems. In order to capture the learning\nand prediction problems accurately, structural constraints such as sparsity or\nlow rank are frequently imposed or else the objective itself is designed to be\na non-convex function. This is especially true of algorithms that operate in\nhigh-dimensional spaces or that train non-linear models such as tensor models\nand deep networks.\n The freedom to express the learning problem as a non-convex optimization\nproblem gives immense modeling power to the algorithm designer, but often such\nproblems are NP-hard to solve. A popular workaround to this has been to relax\nnon-convex problems to convex ones and use traditional methods to solve the\n(convex) relaxed optimization problems. However this approach may be lossy and\nnevertheless presents significant challenges for large scale optimization.\n On the other hand, direct approaches to non-convex optimization have met with\nresounding success in several domains and remain the methods of choice for the\npractitioner, as they frequently outperform relaxation-based techniques -\npopular heuristics include projected gradient descent and alternating\nminimization. However, these are often poorly understood in terms of their\nconvergence and other properties.\n This monograph presents a selection of recent advances that bridge a\nlong-standing gap in our understanding of these heuristics. The monograph will\nlead the reader through several widely used non-convex optimization techniques,\nas well as applications thereof. The goal of this monograph is to both,\nintroduce the rich literature in this area, as well as equip the reader with\nthe tools and techniques needed to analyze these simple procedures for\nnon-convex problems.\n

References

Page 1

	Year	Citations

Page 1