Publication | Open Access

Principal component analysis for clustering gene expression data

Year: 2001 | Citations: 1.3K | References: 21

Abstract

Our empirical study showed that clustering with the PCs instead of the original variables does not necessarily improve, and often degrades, cluster quality. In particular, the first few PCs (which contain most of the variation in the data) do not necessarily capture most of the cluster structure. We also showed that clustering with PCs has different impact on different algorithms and different similarity metrics. Overall, we would not recommend PCA before clustering except in special circumstances.
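
The claim that the leading PCs need not carry the cluster structure can be illustrated with a small synthetic sketch. This is not the study's data or evaluation procedure: the two-feature toy dataset, the `separation` score, and all parameter values below are assumptions chosen only to make the effect visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # points per cluster (hypothetical value)

# Hypothetical toy data: feature 0 is high-variance noise shared by both
# clusters; feature 1 has low variance but separates the two clusters.
noise = rng.normal(0.0, 5.0, size=2 * n)
signal = np.concatenate([rng.normal(-1.0, 0.2, n), rng.normal(1.0, 0.2, n)])
X = np.column_stack([noise, signal])
labels = np.repeat([0, 1], n)

# PCA via SVD of the centered data; rows of Vt are the principal axes,
# ordered by decreasing explained variance.
Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T
explained = s**2 / np.sum(s**2)


def separation(z, labels):
    """Gap between the two cluster means, in units of pooled within-cluster std."""
    a, b = z[labels == 0], z[labels == 1]
    pooled = np.sqrt((a.var() + b.var()) / 2.0)
    return abs(a.mean() - b.mean()) / pooled


sep = [separation(scores[:, k], labels) for k in range(scores.shape[1])]
for k in range(2):
    print(f"PC{k + 1}: explained variance {explained[k]:.2f}, "
          f"separation {sep[k]:.2f}")
```

In this toy setup the first PC accounts for nearly all of the variance yet barely separates the two groups, while the second PC carries the cluster signal. Keeping only the first PC before clustering would therefore discard the structure of interest.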
