Publication | Closed Access
Parallel hierarchical clustering on shared memory platforms
Citations: 22
References: 17
Year: 2012
Venue: Unknown
Keywords: Cluster Computing, Engineering, Machine Learning, Computer Architecture, Parallel Algorithms, Unsupervised Machine Learning, Data Science, Data Mining, Shared Memory, Hierarchical Clustering, Memory Platforms, Parallel Computing, Massively-parallel Computing, Knowledge Discovery, Computer Engineering, Computer Science, Single-linkage Hierarchical Clustering, Computational Science, Parallel Programming, Clustering (Data Mining), Data-level Parallelism, Massive Data Processing, Big Data
Hierarchical clustering has many advantages over traditional clustering algorithms like k-means, but it suffers from higher computational costs and a less obvious parallel structure. Thus, in order to scale this technique up to larger datasets, we present SHRINK, a novel shared-memory algorithm for single-linkage hierarchical clustering based on merging the solutions from overlapping sub-problems. In our experiments, we find that SHRINK provides a speedup of 18–20× on 36 cores on both real and synthetic datasets of up to 250,000 points. Source code for SHRINK is available for download on our website, http://cucis.ece.northwestern.edu.
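The single-linkage clustering that SHRINK parallelizes is equivalent to computing a minimum spanning tree of the points and processing its edges in order of increasing weight. The sketch below illustrates only that serial single-linkage step via Prim's algorithm; it is not SHRINK's parallel merge scheme, and the function name `single_linkage_merges` is an illustrative assumption, not from the paper.

```python
# Hedged sketch: single-linkage hierarchical clustering reduces to a
# minimum spanning tree (MST) of the point set. SHRINK parallelizes
# this by solving overlapping sub-problems and merging their results;
# this serial Prim's-algorithm version shows only the underlying idea.
import math

def single_linkage_merges(points):
    """Return the MST edges as (distance, i, j) tuples, sorted by
    distance; processing them in this order reproduces the merge
    sequence of a single-linkage dendrogram."""
    n = len(points)
    dist = lambda a, b: math.dist(points[a], points[b])
    in_tree = {0}
    # For each vertex not yet in the tree: (distance to tree, tree vertex).
    best = {i: (dist(0, i), 0) for i in range(1, n)}
    edges = []
    while len(in_tree) < n:
        j = min(best, key=lambda v: best[v][0])  # closest outside vertex
        d, i = best.pop(j)
        edges.append((d, i, j))
        in_tree.add(j)
        for v in best:  # relax distances against the new tree vertex
            dv = dist(j, v)
            if dv < best[v][0]:
                best[v] = (dv, j)
    return sorted(edges)

# Two tight pairs far apart: the two short edges merge each pair first,
# and the long edge joins the two clusters last.
pts = [(0, 0), (0, 1), (5, 0), (5, 1)]
merges = single_linkage_merges(pts)
```

On this toy input the sorted merge list contains two unit-length edges followed by one length-5 edge, matching the intuition that single linkage joins the nearest clusters first.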