Concepedia

Publication | Closed Access

Automatic Choice of Dimensionality for PCA

Citations: 471

References: 11

Year: 2000

Thomas P. Minka

Unknown Venue

Abstract

A central issue in principal component analysis (PCA) is choosing the number of principal components to be retained. By interpreting PCA as density estimation, this paper shows how to use Bayesian model selection to determine the true dimensionality of the data. The resulting estimate is simple to compute yet guaranteed to pick the correct dimensionality, given enough data. In simulations, it is more accurate than cross-validation and other proposed algorithms, plus it runs much faster.

1 Introduction

Principal component analysis (PCA) decomposes high-dimensional data into a low-dimensional subspace component and a noise component. This decomposition is useful for data compression as well as de-noising, making it a common first step for many data processing tasks. Tipping and Bishop (1997b) have shown that PCA can be interpreted as maximum-likelihood density estimation. This paper extends their work by applying Bayesian model selection to the probabilistic PCA model, providing a simple...
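
For orientation, the probabilistic PCA model and the model selection criterion mentioned in the abstract can be sketched in standard notation. The symbols here (x for a d-dimensional observation, z for a k-dimensional latent vector, W, \mu, \sigma^2, \theta for the parameters, D for the data set) are assumptions chosen for illustration and do not appear in the text above. The last line is only the generic Bayesian evidence criterion, not the specific approximation the paper derives.

```latex
\begin{align}
  x &= W z + \mu + \epsilon,
      \qquad z \sim \mathcal{N}(0, I_k),
      \qquad \epsilon \sim \mathcal{N}(0, \sigma^2 I_d) \\
  x &\sim \mathcal{N}\!\left(\mu,\; W W^{\top} + \sigma^2 I_d\right) \\
  \hat{k} &= \arg\max_{k}\; p(D \mid k),
      \qquad p(D \mid k) = \int p(D \mid \theta, k)\, p(\theta \mid k)\, d\theta
\end{align}
```

Under these assumptions, the second line is the density whose maximum-likelihood fit corresponds to PCA, and the third line selects the dimensionality k with the highest marginal likelihood.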
