On the geometry of feedforward neural network weight spaces

Abstract

As is well known, many feedforward neural network architectures have the property that their overall input-output function is unchanged by certain weight permutations and sign flips. These symmetries imply that if a global minimum of the network performance surface exists at some finite weight vector position, then many copies of that global minimum can be generated by geometric weight transformations which leave the input-output function of the network unchanged. The geometric structure of these equierror weight space transformations is explored for the case of multilayer perceptron architectures with tanh squashing functions. It is shown that these transformations form a subgroup of the reflection group. The authors also show that there exists a cone in weight space which forms a minimal sufficient search set for learning, and they establish the size of this cone. Finally, they show that the average distance between global minimum copies on a finite sphere known to contain a global minimum goes to zero as the number of weights in the neural network increases without bound.
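The invariance described above can be checked numerically. The following is a minimal sketch, not the authors' construction: it assumes a single hidden layer of tanh units with linear outputs, and the names `mlp`, `transform`, and `max_output_gap` are illustrative only. It relies on tanh being an odd function, so negating all weights into and out of a hidden unit leaves the output unchanged, as does reordering the hidden units. For H hidden units the sketch enumerates all 2^H · H! combinations of sign flips and permutations.

```python
import itertools
import math
import random


def mlp(x, W1, b1, W2, b2):
    """One-hidden-layer tanh network with linear output units (an assumption)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]


def transform(W1, b1, W2, perm, signs):
    """Reorder hidden units by `perm`; flip unit j when signs[j] == -1.

    Flipping a unit negates its incoming weights, its bias, and its
    outgoing weights. Since tanh(-z) = -tanh(z), the two sign changes cancel.
    """
    new_W1 = [[signs[j] * w for w in W1[j]] for j in perm]
    new_b1 = [signs[j] * b1[j] for j in perm]
    new_W2 = [[signs[j] * row[j] for j in perm] for row in W2]
    return new_W1, new_b1, new_W2


def max_output_gap(n_in=3, n_hidden=4, n_out=2, seed=0):
    """Return (number of transformations tried, largest output difference)."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    W2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [rng.uniform(-1, 1) for _ in range(n_out)]
    x = [rng.uniform(-1, 1) for _ in range(n_in)]

    reference = mlp(x, W1, b1, W2, b2)
    count, worst = 0, 0.0
    for perm in itertools.permutations(range(n_hidden)):
        for signs in itertools.product((1, -1), repeat=n_hidden):
            tW1, tb1, tW2 = transform(W1, b1, W2, perm, signs)
            out = mlp(x, tW1, tb1, tW2, b2)
            worst = max(worst, max(abs(a - b) for a, b in zip(out, reference)))
            count += 1
    return count, worst


if __name__ == "__main__":
    count, worst = max_output_gap()
    print(f"{count} transformed weight vectors, max output difference {worst:.2e}")
```

With four hidden units the loop visits 2^4 · 4! = 384 weight vectors, each of which computes the same function up to floating-point rounding. Each is therefore an equierror copy of the original point in weight space.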