Publication | Open Access
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
Citations: 161
References: 20
Year: 2018
Neural networks have many successful applications, but theoretical understanding of them lags far behind. Toward bridging this gap, we study the problem of learning a two-layer overparameterized ReLU neural network for multi-class classification via stochastic gradient descent (SGD) from random initialization. In the overparameterized setting, when the data comes from mixtures of well-separated distributions, we prove that SGD learns a network with small generalization error, even though the network has enough capacity to fit arbitrary labels. Furthermore, the analysis provides insights into several aspects of learning neural networks, and these can be verified by empirical studies on synthetic data and on the MNIST dataset.
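The setting described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code or experimental setup. Everything specific in it is an assumption made for illustration: the Gaussian clusters around orthogonal centers as a stand-in for "well-separated distributions", the cross-entropy loss, the output layer kept fixed at its random initialization, and all function names and hyperparameters (`make_mixture`, `train_sgd`, `num_hidden`, `lr`, and so on).

```python
import numpy as np


def make_mixture(rng, n_per_class, centers, noise=0.1):
    """Sample one tight Gaussian cluster per class (assumed data model)."""
    dim = centers.shape[1]
    X = np.concatenate(
        [c + noise * rng.standard_normal((n_per_class, dim)) for c in centers]
    )
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y


def forward(W, A, X):
    """Two-layer ReLU network: logits = ReLU(X W^T) A^T."""
    H = np.maximum(X @ W.T, 0.0)  # hidden activations, shape (n, num_hidden)
    return H, H @ A.T             # logits, shape (n, num_classes)


def predict(W, A, X):
    return forward(W, A, X)[1].argmax(axis=1)


def train_sgd(X, y, num_hidden=512, lr=0.1, steps=1000, batch=32, seed=0):
    """Mini-batch SGD on the hidden layer only, from random initialization."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    num_classes = int(y.max()) + 1
    W = rng.standard_normal((num_hidden, dim)) / np.sqrt(dim)
    # Assumption: the output layer stays fixed at its random initialization.
    A = rng.standard_normal((num_classes, num_hidden)) / np.sqrt(num_hidden)
    for _ in range(steps):
        idx = rng.integers(0, len(X), size=batch)
        xb, yb = X[idx], y[idx]
        H, logits = forward(W, A, xb)
        logits -= logits.max(axis=1, keepdims=True)  # numerically stable softmax
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        P[np.arange(batch), yb] -= 1.0               # d(cross-entropy)/d(logits)
        dH = (P @ A) * (H > 0)                       # backprop through the ReLU
        W -= lr * (dH.T @ xb) / batch                # SGD step on hidden weights
    return W, A


rng = np.random.default_rng(0)
centers = 2.0 * np.eye(3, 5)  # three orthogonal class centers in 5 dimensions
X_train, y_train = make_mixture(rng, 200, centers)
X_test, y_test = make_mixture(rng, 200, centers)

W, A = train_sgd(X_train, y_train)
accuracy = float((predict(W, A, X_test) == y_test).mean())
print(f"held-out accuracy: {accuracy:.3f}")
```

The hidden layer here has far more units (512) than the 5 input dimensions or 3 classes require, which is the sense of "overparameterized" this sketch aims at. The held-out accuracy is only a rough empirical proxy for the generalization error that the abstract's result concerns.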