Clustering by fast search and find of density peaks

TLDR

Cluster analysis classifies elements into categories based on similarity, with applications ranging from astronomy to bioinformatics, bibliometrics, and pattern recognition. The authors propose a clustering method that identifies cluster centers as points with a higher density than their neighbors and a relatively large distance from any point of higher density. The resulting procedure lets the number of clusters emerge from the data, detects and excludes outliers automatically, and recognizes clusters regardless of their shape or the dimensionality of the embedding space. The algorithm's effectiveness is demonstrated on several test cases.

Abstract

Cluster analysis is aimed at classifying elements into categories on the basis of their similarity. Its applications range from astronomy to bioinformatics, bibliometrics, and pattern recognition. We propose an approach based on the idea that cluster centers are characterized by a higher density than their neighbors and by a relatively large distance from points with higher densities. This idea forms the basis of a clustering procedure in which the number of clusters arises intuitively, outliers are automatically spotted and excluded from the analysis, and clusters are recognized regardless of their shape and of the dimensionality of the space in which they are embedded. We demonstrate the power of the algorithm on several test cases.
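The abstract describes the idea only in words, so the following is a minimal Python sketch of one way to realize it, not the authors' reference implementation. It assumes a simple cutoff count for local density (the number of points closer than a hypothetical parameter `d_c`), takes each point's "distance from points with higher densities" as the distance to its nearest denser point, and, as a further simplifying assumption, picks the `n_clusters` points with the largest product of density and distance as centers. The function name `density_peaks` and both parameters are illustrative.

```python
import numpy as np


def density_peaks(points, d_c, n_clusters):
    """Illustrative density-peak clustering sketch.

    d_c (density cutoff) and n_clusters are assumed parameters,
    not values taken from the abstract.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    # Local density: number of other points closer than d_c.
    rho = (dist < d_c).sum(axis=1) - 1

    # Indices sorted by decreasing density (ties broken by index).
    order = np.argsort(-rho, kind="stable")

    # delta[i]: distance to the nearest point that precedes i in `order`.
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use its largest distance.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j

    # Assumption: centers are the points with the largest rho * delta.
    centers = np.argsort(-(rho * delta), kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(len(centers))

    # Each remaining point inherits the label of its nearest denser point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, rho, delta


# Two small, well-separated groups of five points each.
data = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
        [10, 10], [10, 11], [11, 10], [11, 11], [10.5, 10.5]]
labels, rho, delta = density_peaks(data, d_c=1.0, n_clusters=2)
print(labels)
```

This sketch assigns every point to a cluster; it does not implement the automatic outlier exclusion mentioned in the abstract, and in practice the centers would be chosen by inspecting the density and distance values rather than by fixing `n_clusters` in advance.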
