A PLS kernel algorithm for data sets with many variables and fewer objects. Part 1: Theory and algorithm

Abstract

A fast PLS regression algorithm is presented for large data matrices with many variables (K) and fewer objects (N). For such data matrices the classical algorithm is computer-intensive and memory-demanding. Recently, Lindgren et al. (J. Chemometrics, 7, 45–49 (1993)) developed a quick and efficient kernel algorithm for the case with many objects and few variables. The present paper focuses on the opposite case, i.e. many variables and fewer objects. A kernel algorithm is presented based on eigenvectors of the ‘kernel’ matrix XXᵀYYᵀ, which is a square, non-symmetric matrix of size N × N, where N is the number of objects. Using the kernel matrix and the association matrices XXᵀ (N × N) and YYᵀ (N × N), it is possible to calculate all score and loading vectors and hence conduct a complete PLS regression, including diagnostics such as R². This is done without returning to the original data matrices X and Y. The algorithm is presented in equation form, with proofs of some new properties, and as MATLAB code.
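
The idea of working only with the N × N association matrices can be illustrated with a minimal sketch. It is not the paper's MATLAB code: it is written in Python with NumPy, the function name `kernel_pls_scores` is hypothetical, and the power-iteration solver and the projection-based deflation step are assumptions taken from standard PLS practice rather than from the abstract. The data are assumed to be column-centred before XXᵀ and YYᵀ are formed. The sketch extracts the X-scores as dominant eigenvectors of XXᵀYYᵀ and tracks the cumulative R² for Y from matrix traces alone.

```python
import numpy as np


def kernel_pls_scores(XXt, YYt, n_components, n_iter=500, tol=1e-12):
    """Sketch: PLS scores and cumulative R2(Y) from N x N association matrices.

    Assumptions (not taken from the source): X and Y were column-centred
    before forming XXt = X @ X.T and YYt = Y @ Y.T, the eigenvector is found
    by power iteration, and deflation projects out each unit-length score.
    """
    XXt = XXt.copy()
    YYt = YYt.copy()
    n = XXt.shape[0]
    total_ss_y = np.trace(YYt)  # total sum of squares of centred Y
    T = np.zeros((n, n_components))  # X-scores, one column per component
    U = np.zeros((n, n_components))  # Y-scores
    r2y = np.zeros(n_components)

    for a in range(n_components):
        kernel = XXt @ YYt  # N x N, generally non-symmetric

        # Start from the kernel column with the largest absolute sum.
        t = kernel[:, np.argmax(np.abs(kernel).sum(axis=0))].copy()
        t /= np.linalg.norm(t)

        # Power iteration towards the dominant eigenvector of the kernel.
        for _ in range(n_iter):
            t_new = kernel @ t
            t_new /= np.linalg.norm(t_new)
            converged = np.linalg.norm(t_new - t) < tol
            t = t_new
            if converged:
                break

        u = YYt @ t  # Y-score associated with t

        # Deflate both association matrices; t has unit length.
        D = np.eye(n) - np.outer(t, t)
        XXt = D @ XXt @ D
        YYt = D @ YYt @ D

        T[:, a] = t
        U[:, a] = u
        r2y[a] = 1.0 - np.trace(YYt) / total_ss_y  # cumulative explained Y

    return T, U, r2y


# Usage on synthetic data: 10 objects, 200 variables, 3 responses.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 200))
Y = rng.standard_normal((10, 3)) * np.array([10.0, 3.0, 1.0])
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

T, U, r2y = kernel_pls_scores(X @ X.T, Y @ Y.T, n_components=3)
print(r2y)
```

After XXᵀ and YYᵀ have been formed once, every step in the loop operates on N × N matrices, so the per-component cost does not depend on K. Loadings and regression coefficients are not computed in this sketch.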
