Concepedia

Abstract

This paper studies the impact of dimension reduction techniques (DRTs) on the quality of partitions produced by cluster analysis of microarray datasets. We applied seven DRTs to four microarray cancer datasets and ran four clustering algorithms on both the original and the reduced datasets. Overall, the results showed that using DRTs improves the performance of all algorithms tested, especially those in the hierarchical class. Although Principal Component Analysis (PCA) is the most widely used DRT, it was outperformed by nonlinear methods and did not provide a substantial performance increase in the clustering algorithms. In contrast, t-distributed Stochastic Neighbor Embedding (t-SNE) and Laplacian Eigenmaps (LE) achieved good results for all datasets.
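The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, uses synthetic data from `make_blobs` as a stand-in for a microarray dataset, covers only three of the DRTs (PCA, t-SNE, and LE via `SpectralEmbedding`) and two clustering algorithms (k-means and average-linkage hierarchical clustering), and scores partitions with the adjusted Rand index. The abstract does not state the quality measure, the full list of algorithms, or any parameter values, so all of those choices, along with the helper name `compare_reductions`, are assumptions.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, SpectralEmbedding
from sklearn.metrics import adjusted_rand_score


def compare_reductions(X, y_true, n_clusters, n_components=2, seed=0):
    """Cluster X before and after each reduction.

    Returns a dict mapping (reducer name, algorithm name) to the adjusted
    Rand index against y_true. Hypothetical helper; parameters are
    illustrative assumptions, not values from the paper.
    """
    reducers = {
        "original": None,  # baseline: no dimension reduction
        "PCA": PCA(n_components=n_components, random_state=seed),
        "t-SNE": TSNE(n_components=n_components, perplexity=10,
                      random_state=seed),
        # Laplacian Eigenmaps is available in scikit-learn as SpectralEmbedding
        "LE": SpectralEmbedding(n_components=n_components, n_neighbors=10,
                                random_state=seed),
    }
    scores = {}
    for red_name, reducer in reducers.items():
        Z = X if reducer is None else reducer.fit_transform(X)
        clusterers = {
            "k-means": KMeans(n_clusters=n_clusters, n_init=10,
                              random_state=seed),
            "hierarchical": AgglomerativeClustering(n_clusters=n_clusters,
                                                    linkage="average"),
        }
        for alg_name, clusterer in clusterers.items():
            labels = clusterer.fit_predict(Z)
            scores[(red_name, alg_name)] = adjusted_rand_score(y_true, labels)
    return scores


# Synthetic stand-in for a microarray dataset: few samples, many features.
X, y = make_blobs(n_samples=60, n_features=500, centers=3,
                  cluster_std=5.0, random_state=0)
scores = compare_reductions(X, y, n_clusters=3)
for key, ari in sorted(scores.items()):
    print(key, round(ari, 3))
```

Comparing each reduced-data score with the corresponding `"original"` entry gives the kind of per-algorithm improvement the abstract refers to; on real microarray data the known class labels would take the place of `y`.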
