Parallel K-means Clustering Algorithm on NOWs

Abstract

ABSTRACT – Despite its simplicity and its linear time, a serial K-means algorithm&apos;s time complexity remains expensive when it is applied to a problem of large size of multidimensional vectors. In this paper we show an improvement by a factor of O(K/2), where K is the number of desired clusters, by applying theories of parallel computing to the algorithm. In addition to time improvement, the parallel version of K-means algorithm also enables the algorithm to run on larger collective memory of multiple machines when the memory of a single machine is insufficient to solve a problem. We show that a problem size can be scaled up to O(K) times a problem size on a single machine.

References

Page 1

	Year	Citations

Page 1