Concepedia

TLDR

Principal component analysis (PCA) is a multivariate technique for analyzing data tables whose quantitative variables are inter-correlated. Its goal is to extract the important information and represent it as a set of orthogonal principal components, displaying the patterns of similarity among observations and among variables as points in maps. Mathematically, PCA rests on the eigen-decomposition of positive semi-definite matrices, such as correlation or covariance matrices, and on the singular value decomposition of rectangular matrices; the eigenvalues and eigenvectors reveal the structure of the matrix.

Abstract

Principal component analysis (PCA) is a multivariate technique that analyzes a data table in which observations are described by several inter-correlated quantitative dependent variables. Its goal is to extract the important information from the data table, to represent it as a set of new orthogonal variables called principal components, and to display the pattern of similarity of the observations and of the variables as points in maps. Mathematically, PCA depends upon the eigen-decomposition of positive semi-definite matrices and upon the singular value decomposition (SVD) of rectangular matrices. It is determined by eigenvectors and eigenvalues. Eigenvalues and eigenvectors are, respectively, numbers and vectors associated with square matrices. Together they provide the eigen-decomposition of a matrix, which analyzes the structure of this matrix. Even though the eigen-decomposition does not exist for all square matrices, it has a particularly simple expression for matrices such as correlation, covariance, or cross-product matrices.
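
As a rough illustration of the two mathematical routes named above, the following sketch computes principal components once through the eigen-decomposition of a covariance matrix and once through the SVD of the centered data table. It is a minimal example under stated assumptions, not the method of any particular package: the function names `pca_eig` and `pca_svd`, the synthetic data, and the choice of the covariance matrix (rather than the correlation matrix) are all illustrative.

```python
import numpy as np


def pca_eig(X):
    """PCA via eigen-decomposition of the covariance matrix (illustrative helper)."""
    Xc = X - X.mean(axis=0)                    # center each variable
    cov = Xc.T @ Xc / (X.shape[0] - 1)         # positive semi-definite matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                      # observations on the components
    return eigvals, eigvecs, scores


def pca_svd(X):
    """PCA via SVD of the centered rectangular data table (illustrative helper)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s**2 / (X.shape[0] - 1)          # squared singular values -> variances
    scores = U * s                             # same scores, up to column signs
    return eigvals, Vt.T, scores


# Synthetic data (an assumption for the demo): two correlated variables plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([
    latent + 0.1 * rng.normal(size=(200, 1)),
    2.0 * latent + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

eigvals_eig, vecs_eig, scores_eig = pca_eig(X)
eigvals_svd, vecs_svd, scores_svd = pca_svd(X)

# Proportion of total variance carried by each component.
explained = eigvals_eig / eigvals_eig.sum()
print(explained)
```

Both routes should give the same eigenvalues, and the same components up to a sign flip per column, since an eigenvector is only defined up to its sign. The scores are uncorrelated, which is what "orthogonal variables" means in practice. Plotting the first two columns of `scores_eig` would give the kind of map of observations the abstract describes.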