Concepedia

Publication | Open Access

On the distribution of the largest eigenvalue in principal components analysis

Citations: 2K

References: 36

Year: 2001

TLDR

Let x(1) denote the square of the largest singular value of an n × p Gaussian matrix, equivalently the largest eigenvalue of a p‑variate Wishart distribution with identity covariance. The authors analyze the asymptotic regime where p and n grow large with ratio γ = n/p ≥ 1. After centering by μ_p = (√(n‑1)+√p)^2 and scaling by σ_p = (√(n‑1)+√p)(1/√(n‑1)+1/√p)^{1/3}, the distribution of x(1) converges to the Tracy–Widom law of order 1, which can be evaluated numerically. The limit is derived from a corresponding result for complex Wishart matrices using random‑matrix‑theory methods. Simulations show the approximation is informative for n and p as small as 5, suggesting that some aspects of large‑p multivariate distribution theory may be easier to apply in practice than their fixed‑p counterparts.

Abstract

Let x(1) denote the square of the largest singular value of an n × p matrix X, all of whose entries are independent standard Gaussian variates. Equivalently, x(1) is the largest principal component variance of the covariance matrix $X'X$, or the largest eigenvalue of a p-variate Wishart distribution on n degrees of freedom with identity covariance. Consider the limit of large p and n with $n/p = \gamma \ge 1$. When centered by $\mu_p = (\sqrt{n-1} + \sqrt{p})^2$ and scaled by $\sigma_p = (\sqrt{n-1} + \sqrt{p})(1/\sqrt{n-1} + 1/\sqrt{p})^{1/3}$, the distribution of x(1) approaches the Tracy-Widom law of order 1, which is defined in terms of the Painlevé II differential equation and can be numerically evaluated and tabulated in software. Simulations show the approximation to be informative for n and p as small as 5. The limit is derived via a corresponding result for complex Wishart matrices using methods from random matrix theory. The result suggests that some aspects of large p multivariate distribution theory may be easier to apply in practice than their fixed p counterparts.
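The centering and scaling in the abstract can be tried out numerically. The following is a minimal sketch, assuming NumPy is available; the function names `tw_centering_scaling` and `simulate_standardized_largest_eigenvalue` are hypothetical and are not taken from the paper. It computes $\mu_p$ and $\sigma_p$ as defined above and draws Monte Carlo samples of $(x(1) - \mu_p)/\sigma_p$ for Gaussian matrices. It does not evaluate the Tracy-Widom distribution itself.

```python
import numpy as np


def tw_centering_scaling(n, p):
    """Return (mu_p, sigma_p) as defined in the abstract.

    mu_p    = (sqrt(n-1) + sqrt(p))^2
    sigma_p = (sqrt(n-1) + sqrt(p)) * (1/sqrt(n-1) + 1/sqrt(p))^(1/3)
    """
    a = np.sqrt(n - 1.0)
    b = np.sqrt(float(p))
    mu = (a + b) ** 2
    sigma = (a + b) * (1.0 / a + 1.0 / b) ** (1.0 / 3.0)
    return mu, sigma


def simulate_standardized_largest_eigenvalue(n, p, reps=2000, seed=0):
    """Monte Carlo draws of (x(1) - mu_p) / sigma_p.

    x(1) is the squared largest singular value of an n x p matrix with
    independent standard Gaussian entries (hypothetical helper, for
    illustration only).
    """
    rng = np.random.default_rng(seed)
    mu, sigma = tw_centering_scaling(n, p)
    draws = np.empty(reps)
    for i in range(reps):
        X = rng.standard_normal((n, p))
        # Singular values are returned in descending order.
        s_max = np.linalg.svd(X, compute_uv=False)[0]
        draws[i] = (s_max ** 2 - mu) / sigma
    return draws


if __name__ == "__main__":
    z = simulate_standardized_largest_eigenvalue(n=5, p=5)
    print("sample mean:", z.mean())
    print("sample std: ", z.std())
```

If the approximation is working, the sample mean and standard deviation of the standardized draws should land in the neighborhood of the Tracy-Widom order 1 values (roughly -1.21 and 1.27), which gives a quick sanity check for small n and p.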
