
Abstract

Principal curves and surfaces play an important role in dimensionality reduction applications of machine learning and signal processing. Loosely defined, principal curves are smooth curves that pass through the middle of the data distribution. This intuitive definition is ill-posed, and researchers still struggle with its practical implications. Two main causes of these difficulties are (i) the desire to build a self-consistent definition using global statistics (for instance, conditional expectations), and (ii) the failure to decouple the definition of the principal curve from the data samples. In this paper, we introduce the concept of principal sets, which are the union of all principal surfaces with a particular dimensionality. The proposed definition of principal surfaces provides rigorous conditions that a point must satisfy, which can be evaluated using only the gradient and Hessian of the probability density at the point of interest. Since the definition is decoupled from the data samples, any density estimator can be employed to obtain an expression for the probability density and to identify the principal surfaces of the data under that model.
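
To make the idea of a pointwise test based on the gradient and Hessian concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the density model is an isotropic Gaussian kernel density estimate with a hand-picked bandwidth `h`, and the test is a simplified ridge-type criterion (the gradient has no component along the Hessian eigenvectors with the smallest eigenvalues, and those eigenvalues are negative), which stands in for, and may differ from, the exact conditions of the paper. The function names `kde_grad_hess` and `ridge_residual` are hypothetical.

```python
import numpy as np


def kde_grad_hess(x, samples, h):
    """Density, gradient, and Hessian of an isotropic Gaussian KDE at x.

    Assumption: every kernel shares the same scalar bandwidth h.
    """
    n, dim = samples.shape
    diff = x - samples                                   # shape (n, dim)
    norm = (2.0 * np.pi * h**2) ** (-dim / 2.0)
    w = norm * np.exp(-np.sum(diff**2, axis=1) / (2.0 * h**2))
    p = w.mean()
    grad = -(w[:, None] * diff).mean(axis=0) / h**2
    outer = np.einsum("n,ni,nj->ij", w, diff, diff) / n
    hess = outer / h**4 - p * np.eye(dim) / h**2
    return p, grad, hess


def ridge_residual(x, samples, h, d=1):
    """Simplified, assumed criterion for a d-dimensional principal surface.

    Returns the norm of the gradient component lying in the span of the
    (dim - d) Hessian eigenvectors with the smallest eigenvalues, and a
    flag telling whether all of those eigenvalues are negative.
    """
    _, grad, hess = kde_grad_hess(x, samples, h)
    evals, evecs = np.linalg.eigh(hess)                  # ascending order
    k = hess.shape[0] - d
    normal = evecs[:, :k]                                # assumed normal directions
    residual = float(np.linalg.norm(normal.T @ grad))
    return residual, bool(np.all(evals[:k] < 0))


# Toy data: a thin horizontal band of points, symmetric about y = 0.
xs = np.linspace(-3.0, 3.0, 61)
ys = np.array([-0.1, 0.0, 0.1])
samples = np.array([(a, b) for a in xs for b in ys])

on_curve = ridge_residual(np.array([0.5, 0.0]), samples, h=0.5)
off_curve = ridge_residual(np.array([0.5, 0.3]), samples, h=0.5)
```

On this toy band, a point on the line y = 0 gives a residual close to zero, while a point displaced off the line gives a clearly nonzero residual. Because the test touches the data only through `kde_grad_hess`, that function could be swapped for the gradient and Hessian of any other differentiable density model.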
