Publication | Closed Access
Cross-Validatory Estimation of the Number of Components in Factor and Principal Components Models
Citations: 2.5K
References: 28
Year: 1978
Factor and principal component analyses approximate a data matrix by separating systematic signal from random noise, but determining the rank of the matrix (that is, how much of the data is signal and how much is noise) remains a key challenge. The study proposes using cross-validation to estimate this rank. The data matrix is partitioned, the model parameters are estimated on one part, and the predictions are tested on a held-out part; the rank chosen is the one that gives the best predictive performance.
By means of factor analysis (FA) or principal components analysis (PCA), a matrix Y with the elements $y_{ik}$ is approximated by the model

$$y_{ik} = \alpha_i + \sum_{a=1}^{A} \beta_{ia}\,\theta_{ak} + \varepsilon_{ik} \qquad \text{(I)}$$

Here the parameters α, β and θ express the systematic part of the data $y_{ik}$, "signal," and the residuals $\varepsilon_{ik}$ express the "random" part, "noise." When applying FA or PCA to a matrix of real data obtained, for example, by characterizing N chemical mixtures by M measured variables, one major problem is the estimation of the rank A of the matrix Y, i.e. the estimation of how much of the data $y_{ik}$ is "signal" and how much is "noise." Cross validation can be used to approach this problem. The matrix Y is partitioned and the rank A is determined so as to maximize the predictive properties of model (I) when the parameters are estimated on one part of the matrix Y and the prediction tested on another part of the matrix Y.
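The partition, fit and predict idea in the abstract can be illustrated with a short sketch. It is not the paper's algorithm: the diagonal hold-out pattern `(row + col) % n_groups`, the NIPALS-style fit that skips held-out cells, the plain "smallest PRESS wins" rule, and all function names (`fit_predictions`, `press_by_rank`, `choose_rank`, `make_demo`) are assumptions made for illustration. Rows are taken as objects and columns as variables, with column means standing in for α. PRESS here means the sum of squared prediction errors over held-out cells.

```python
import random

def fit_predictions(X, held_out, max_rank, n_iter=100, tol=1e-12):
    """Reconstruct X from column means plus 0..max_rank components, fitted
    only on cells NOT in held_out (a set of (row, col) pairs).
    Returns a list of max_rank + 1 reconstructed matrices."""
    n, m = len(X), len(X[0])
    obs = [[(k, j) not in held_out for j in range(m)] for k in range(n)]
    means = []
    for j in range(m):
        vals = [X[k][j] for k in range(n) if obs[k][j]]
        means.append(sum(vals) / len(vals) if vals else 0.0)
    pred = [means[:] for _ in range(n)]
    # Residuals; held-out cells are stored as 0 and skipped in every regression.
    E = [[X[k][j] - means[j] if obs[k][j] else 0.0 for j in range(m)]
         for k in range(n)]
    preds = [[row[:] for row in pred]]
    for _comp in range(max_rank):
        # Start the scores from the residual column with the largest sum of squares.
        j0 = max(range(m), key=lambda j: sum(E[k][j] ** 2 for k in range(n)))
        t = [E[k][j0] for k in range(n)]
        p = [0.0] * m
        for _it in range(n_iter):
            # Loadings: regress each column on t over its observed rows.
            for j in range(m):
                den = sum(t[k] ** 2 for k in range(n) if obs[k][j])
                num = sum(E[k][j] * t[k] for k in range(n) if obs[k][j])
                p[j] = num / den if den > 0 else 0.0
            norm = sum(v * v for v in p) ** 0.5
            if norm == 0:
                break
            p = [v / norm for v in p]
            # Scores: regress each row on p over its observed columns.
            t_new = []
            for k in range(n):
                den = sum(p[j] ** 2 for j in range(m) if obs[k][j])
                num = sum(E[k][j] * p[j] for j in range(m) if obs[k][j])
                t_new.append(num / den if den > 0 else 0.0)
            change = sum((a - b) ** 2 for a, b in zip(t, t_new))
            t = t_new
            if change < tol:
                break
        # Add the component to the prediction and deflate the observed residuals.
        for k in range(n):
            for j in range(m):
                pred[k][j] += t[k] * p[j]
                if obs[k][j]:
                    E[k][j] -= t[k] * p[j]
        preds.append([row[:] for row in pred])
    return preds

def press_by_rank(X, max_rank, n_groups=7):
    """PRESS for ranks 0..max_rank; every cell is held out exactly once."""
    n, m = len(X), len(X[0])
    press = [0.0] * (max_rank + 1)
    for g in range(n_groups):
        held_out = {(k, j) for k in range(n) for j in range(m)
                    if (k + j) % n_groups == g}
        preds = fit_predictions(X, held_out, max_rank)
        for a in range(max_rank + 1):
            press[a] += sum((X[k][j] - preds[a][k][j]) ** 2 for k, j in held_out)
    return press

def choose_rank(X, max_rank, n_groups=7):
    """Return (rank with the smallest PRESS, list of PRESS values)."""
    press = press_by_rank(X, max_rank, n_groups)
    return min(range(len(press)), key=press.__getitem__), press

def make_demo(n=30, noise=0.05, seed=0):
    """Hypothetical data: a rank-2 signal in 6 columns plus Gaussian noise."""
    rng = random.Random(seed)
    load1 = [1, 1, 1, 1, 1, 1]
    load2 = [1, -1, 1, -1, 1, -1]
    rows = []
    for _ in range(n):
        s1, s2 = rng.gauss(0, 3), rng.gauss(0, 1)
        rows.append([s1 * a + s2 * b + rng.gauss(0, noise)
                     for a, b in zip(load1, load2)])
    return rows

if __name__ == "__main__":
    rank, press = choose_rank(make_demo(), max_rank=4)
    print("PRESS by rank:", [round(v, 3) for v in press])
    print("chosen rank:", rank)
```

Choosing `n_groups` larger than the number of columns means no row loses more than one cell in any round, so every score can still be estimated from the remaining cells. The stopping rule is deliberately the simplest one available; a comparison of PRESS against the residual sum of squares of the previous rank is a common alternative.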