Concepedia

Publication | Closed Access

Clustering by Passing Messages Between Data Points

Citations: 6.8K

References: 14

Year: 2007

TLDR

Clustering data around representative "exemplar" points is important for processing sensory signals and detecting patterns, yet methods that refine a randomly chosen initial set of exemplars work well only when that initial choice is close to a good solution. The authors introduce affinity propagation, a method that takes pairwise similarity measures as input. It iteratively exchanges real‑valued messages between data points until a high‑quality set of exemplars and corresponding clusters emerges. The authors applied it to clustering images of faces, detecting genes in microarray data, identifying representative sentences in the manuscript, and identifying cities that are efficiently accessed by airline travel. Affinity propagation found clusters with much lower error than the other methods tested, in less than one‑hundredth of the time.

Abstract

Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such “exemplars” can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initial choice is close to a good solution. We devised a method called “affinity propagation,” which takes as input measures of similarity between pairs of data points. Real-valued messages are exchanged between data points until a high-quality set of exemplars and corresponding clusters gradually emerges. We used affinity propagation to cluster images of faces, detect genes in microarray data, identify representative sentences in this manuscript, and identify cities that are efficiently accessed by airline travel. Affinity propagation found clusters with much lower error than other methods, and it did so in less than one-hundredth the amount of time.
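
The message exchange described in the abstract can be sketched in a few lines of NumPy. The page does not state the update rules, so the responsibility and availability updates below follow the commonly used formulation of affinity propagation. The function name `affinity_propagation_sketch`, the damping factor, the iteration count, the toy data, and the median-based preference are all illustrative assumptions, not the authors' settings.

```python
import numpy as np

def affinity_propagation_sketch(S, damping=0.5, max_iter=200):
    """Illustrative sketch: S is an (n, n) similarity matrix whose diagonal
    holds each point's 'preference' for being chosen as an exemplar."""
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r(i, k): how well k suits i as exemplar
    A = np.zeros((n, n))  # availabilities a(i, k): how appropriate it is for i to pick k
    rows = np.arange(n)
    for _ in range(max_iter):
        # Responsibility: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new

        # Availability: a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # Self-availability: a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = A_new.diagonal().copy()
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, self_avail)
        A = damping * A + (1 - damping) * A_new

    # Points whose combined self-message is positive are taken as exemplars.
    exemplars = np.flatnonzero((A + R).diagonal() > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    # Every other point joins the exemplar it is most similar to.
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels

# Toy data (assumed for illustration): two well-separated groups on a line.
x = np.array([0.0, 0.1, 0.25, 10.0, 10.1, 10.3])
S = -(x[:, None] - x[None, :]) ** 2            # negative squared distance
off_diagonal = S[~np.eye(len(x), dtype=bool)]
np.fill_diagonal(S, np.median(off_diagonal))   # shared preference for all points
exemplars, labels = affinity_propagation_sketch(S)
```

Here `labels[i]` holds the index of the exemplar assigned to point `i`. The preference on the diagonal of `S` controls how many exemplars emerge: lower values tend to produce fewer clusters. Damping the updates is a common way to reduce oscillation between iterations.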
