Concepedia

Publication | Closed Access

Real time clustering of tweets using adaptive PSO technique and MapReduce

Citations: 11

References: 12

Year: 2015

Abstract

Social media platforms such as Twitter, Facebook, and YouTube now generate large amounts of data. Such data has a complicated structure, which makes it difficult to capture, store, analyze, cluster, and visualize. Clustering of such data has recently attracted the attention of researchers, and various algorithms, such as K-Means, have been suggested for the task. For data streams, an algorithm is needed that can cluster the data in less time, which motivates the use of a parallel and distributed environment based on the MapReduce framework. Particle swarm optimization (PSO) techniques are likewise well suited to the clustering problem, since they scale well as the data volume and the number of dimensions increase. The paper implements a PSO algorithm for clustering Twitter data using Hadoop's MapReduce framework. The results show that parallel PSO performs very well compared with the K-Means algorithm and that the F-Measure increases with the number of particles. The optimum number of nodes required is also illustrated with experimental results.
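The abstract does not give implementation details, so the following is only a minimal, single-process sketch of how PSO-based clustering with a map/reduce-style split might look. It assumes each tweet has already been converted to a numeric vector, that each particle encodes k candidate centroids, and that fitness is the mean squared distance from each point to its nearest centroid. The function names (`map_fitness`, `reduce_best`, `pso_cluster`) and the fixed coefficients `w`, `c1`, `c2` are hypothetical choices for illustration; the paper's adaptive variant and its Hadoop jobs are not reproduced here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fitness(centroids, points):
    """Mean squared distance from each point to its nearest centroid (lower is better)."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points) / len(points)


def map_fitness(particles, points):
    """'Map' step: score every particle independently; this part could run in parallel."""
    return [(i, fitness(p["pos"], points)) for i, p in enumerate(particles)]


def reduce_best(scored):
    """'Reduce' step: select the (index, fitness) pair with the lowest fitness."""
    return min(scored, key=lambda kv: kv[1])


def pso_cluster(points, k, n_particles=10, iters=30, w=0.72, c1=1.49, c2=1.49, seed=0):
    """Return (centroids, fitness) of the best particle found. Coefficients are assumed values."""
    rng = random.Random(seed)
    dim = len(points[0])
    particles = []
    for _ in range(n_particles):
        # Initialise each particle's k centroids from randomly chosen data points.
        pos = [list(rng.choice(points)) for _ in range(k)]
        particles.append({
            "pos": pos,
            "vel": [[0.0] * dim for _ in range(k)],
            "best_pos": [c[:] for c in pos],
            "best_fit": float("inf"),
        })
    gbest_pos, gbest_fit = None, float("inf")
    for _ in range(iters):
        scored = map_fitness(particles, points)
        # Update each particle's personal best.
        for i, fit in scored:
            p = particles[i]
            if fit < p["best_fit"]:
                p["best_fit"] = fit
                p["best_pos"] = [c[:] for c in p["pos"]]
        # Update the swarm-wide best.
        i_best, fit_best = reduce_best(scored)
        if fit_best < gbest_fit:
            gbest_fit = fit_best
            gbest_pos = [c[:] for c in particles[i_best]["pos"]]
        # Standard PSO velocity and position update, applied per centroid coordinate.
        for p in particles:
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    p["vel"][j][d] = (
                        w * p["vel"][j][d]
                        + c1 * r1 * (p["best_pos"][j][d] - p["pos"][j][d])
                        + c2 * r2 * (gbest_pos[j][d] - p["pos"][j][d])
                    )
                    p["pos"][j][d] += p["vel"][j][d]
    return gbest_pos, gbest_fit
```

In a distributed setting, the per-particle scoring in `map_fitness` is the natural candidate for mapper tasks, with the best-particle selection handled by a reducer; how the paper actually partitions the work across nodes is not stated in the abstract.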
