PCA Based on Graph Laplacian Regularization and P-Norm for Gene Selection and Clustering

Abstract

Identifying characteristic genes from gene expression data is both a major focus and a major difficulty in modern molecular biology. Traditional principal component analysis (PCA), a matrix decomposition method formulated as reconstruction-error minimization, uses a quadratic error function, which is known to be sensitive to outliers and noise. A PCA method that remains reliable in the presence of outliers and noise is therefore needed. In this paper, we develop a novel PCA method, called PgLPCA, that applies the P-norm to the error function and adds a graph-Laplacian regularization term to the matrix decomposition problem. At the heart of the method is a new error function based on the non-convex proximal P-norm, designed to reduce the influence of outliers and noise. In addition, the Laplacian regularization term is used to capture the internal geometric structure of the data in its representation. To solve the resulting minimization problem, we develop an efficient optimization algorithm based on the augmented Lagrange multiplier method. We apply the method to select characteristic genes and cluster samples from rapidly growing biological data, where it achieves higher accuracy than the compared methods.
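
The two ingredients named above, a P-norm error term and a graph-Laplacian regularizer, can be illustrated with a minimal sketch. The abstract does not give the exact objective, so everything here is an assumption made for illustration: the element-wise form sum |X - U V^T|^p for the error term, the trace form tr(V^T L V) for the regularizer, the 0/1 k-nearest-neighbor graph, and the names `knn_graph_laplacian`, `pglpca_style_objective`, `p`, and `alpha`. It only evaluates such an objective; it is not the paper's augmented Lagrange multiplier solver.

```python
import numpy as np


def knn_graph_laplacian(X, k=3):
    """Unnormalized Laplacian L = D - W of a symmetric 0/1 k-NN graph.

    X is assumed to be genes x samples, so graph nodes are the columns of X.
    The 0/1 weighting is an illustrative choice, not taken from the paper.
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between samples (columns).
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbor
    # Indices of the k nearest neighbors of each sample.
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix
    D = np.diag(W.sum(axis=1))
    return D - W


def pglpca_style_objective(X, U, V, L, p=0.5, alpha=1.0):
    """Hypothetical objective: sum |X - U V^T|^p + alpha * tr(V^T L V).

    U (genes x r) and V (samples x r) are the two factors; p and alpha are
    assumed hyperparameters. For p < 1 the error term is non-convex and
    grows slowly with large residuals, which damps the effect of outliers.
    """
    E = X - U @ V.T
    fit = np.sum(np.abs(E) ** p)
    smooth = np.trace(V.T @ L @ V)
    return fit + alpha * smooth
```

Under these assumptions, a single residual of magnitude 100 contributes 10 to the error term when p = 0.5 but 10000 under a quadratic error, which is the sense in which a small p reduces sensitivity to outliers.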
