Publication | Open Access
Nearly-tight VC-dimension bounds for piecewise linear neural networks
Citations: 107
References: 0
Year: 2017
ReLU Activation Function, Convolutional Neural Network, Deep Neural Networks, Engineering, Machine Learning, Neural Scaling Law, Sparse Neural Network, Large Scale Optimization, Computer Science, Deep Learning, Neural Architecture Search, Approximation Theory, Lower Bounds, Model Compression, Nearly-tight VC-dimension Bounds
We prove new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting $W$ be the number of weights and $L$ be the number of layers, we prove that the VC-dimension is $O(W L \log(W))$ and $\Omega( W L \log(W/L) )$. This improves both the previously known upper bounds and lower bounds. In terms of the number $U$ of non-linear units, we prove a tight bound $\Theta(W U)$ on the VC-dimension. All of these results generalize to arbitrary piecewise linear activation functions.
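
To get a feel for how close the two bounds are, the sketch below evaluates the growth terms $W L \log(W)$ and $W L \log(W/L)$ for a small fully connected network. It is an illustration only, under several assumptions not stated in the abstract: the layer widths are hypothetical, $W$ is taken to count biases as well as weights, $L$ is taken to count every layer with parameters (including the output layer), and all constants hidden by the $O(\cdot)$ and $\Omega(\cdot)$ notation are dropped, so the printed values are not actual VC-dimension bounds.

```python
import math


def count_weights(layer_widths):
    """Count parameters of a fully connected network.

    layer_widths lists the input width followed by the output width of
    each layer. Biases are included (an assumption of this sketch).
    """
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(layer_widths, layer_widths[1:]))


def growth_terms(W, L):
    """Return (W*L*log(W), W*L*log(W/L)) with hidden constants dropped."""
    upper = W * L * math.log(W)
    lower = W * L * math.log(W / L)
    return upper, lower


# Hypothetical architecture: 64 inputs, three hidden layers of 128 units, 1 output.
widths = [64, 128, 128, 128, 1]
W = count_weights(widths)
L = len(widths) - 1  # layers with parameters (assumed convention)

upper, lower = growth_terms(W, L)
print(f"W = {W}, L = {L}")
print(f"W*L*log(W)   = {upper:.3e}")
print(f"W*L*log(W/L) = {lower:.3e}")
print(f"ratio lower/upper = {lower / upper:.3f}")
```

The ratio of the two terms equals $\log(W/L)/\log(W) = 1 - \log(L)/\log(W)$, so the growth rates differ by more than a constant factor only when $L$ is nearly as large as $W$.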