Publication | Closed Access
A divide-and-merge methodology for clustering
59
Citations
29
References
2005
Year
Unknown Venue
Cluster ComputingEngineeringSemantic WebUnsupervised Machine LearningText MiningNatural Language ProcessingDivide PhaseInformation RetrievalData ScienceData MiningData IntegrationStatisticsDocument ClusteringDivide-and-merge MethodologyKnowledge DiscoveryComputer ScienceBig Data SearchMerge PhaseCluster DevelopmentCorrelation ClusteringSearch TechniqueFuzzy ClusteringSimilarity Search
We present a divide-and-merge methodology for clustering a set of objects that combines a top-down "divide" phase with a bottom-up "merge" phase. In contrast, previous algorithms either use top-down or bottom-up methods to construct a hierarchical clustering or produce a flat clustering using local search (e.g., k-means). Our divide phase produces a tree whose leaves are the elements of the set. For this phase, we use an efficient spectral algorithm. The merge phase quickly finds an optimal tree-respecting partition for many natural objective functions, e.g., k-means, min-diameter, min-sum, correlation clustering, etc., We present a meta-search engine that uses this methodology to cluster results from web searches. We also give empirical results on text-based data where the algorithm performs better than or competitively with existing clustering algorithms.
| Year | Citations | |
|---|---|---|
Page 1
Page 1