Improvement of K-means Cluster Quality by Post Processing Resulted Clusters

Abstract

A large volume of new data is generated at every moment of the day by a wide range of devices and domains, such as social networks, mobile and desktop devices, financial transactions, websites, search engines, and smart home devices. The generated data is diverse and can be structured or unstructured. Clustering is the process of partitioning a dataset into groups of similar records, called clusters, according to a specific criterion. The K-means clustering algorithm remains popular many years after its introduction. Different versions of the K-means algorithm have emerged over time, most of them aiming to reduce processing time, either by adding preprocessing steps or by reducing the number of iterations. This paper presents a way of improving the clusters generated by the K-means algorithm by post-processing them with a supervised learning algorithm. The proposed approach focuses on improving the quality of the resulting clusters, not on reducing the processing time.
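The general idea of post-processing K-means output with a supervised learner can be illustrated with a minimal sketch. The abstract does not name the supervised algorithm or the exact procedure, so everything specific here is an assumption made for illustration: the classifier is a leave-one-out k-nearest-neighbors vote, and the function names `kmeans` and `knn_relabel`, the parameters `n_neighbors`, `iters`, and `seed`, and the toy data are all hypothetical. It should not be read as the method proposed in the paper.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return labels


def knn_relabel(points, labels, n_neighbors=3):
    """Assumed post-processing step (not taken from the paper).

    Treats the K-means labels as training labels and re-predicts each
    point's label by majority vote among its nearest other points.
    """
    new_labels = []
    for i, p in enumerate(points):
        # Rank every other point by distance to p (leave-one-out).
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        votes = Counter(labels[j] for j in others[:n_neighbors])
        new_labels.append(votes.most_common(1)[0][0])
    return new_labels


# Toy data: two well-separated groups of four 2-D points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
initial = kmeans(points, k=2)
refined = knn_relabel(points, initial)
```

In this sketch the supervised step can only change labels that disagree with a point's local neighborhood, so it leaves a clean partition untouched and reassigns isolated, inconsistent labels. A different classifier, or one trained only on points close to their centroid, would be an equally plausible reading of the abstract.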
