Concepedia

Abstract

Micro-blogging sites such as Twitter are a huge repository of data, and much research addresses sentiment analysis, pattern discovery, and prediction on that data. One challenging task is dividing this large volume of unstructured data into clusters, for which researchers have proposed several topic modeling methods. This paper presents a brief summary of three such methods, Latent Dirichlet Allocation (LDA), Latent Semantic Indexing (LSI), and Non-negative Matrix Factorization (NMF), and their applications. Experiments are conducted on Twitter-based datasets created from tweets on the keywords Cauvery river, Lokpal bill, and Rahul Gandhi. The paper briefly discusses evaluating the accuracy of the topics formed using perplexity, log-likelihood, and topic coherence measures. The best topics are then fed to a logistic regression model, which shows better accuracy with LDA.
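As an illustration of two of the evaluation measures named above, the following minimal sketch computes perplexity from a log-likelihood and a UMass-style coherence score for one topic. The abstract does not say which coherence variant the paper uses, so UMass is an assumption here; the function names and the toy tokenized tweets are hypothetical and are not the paper's data.

```python
import math

def perplexity(total_log_likelihood, total_tokens):
    """Perplexity = exp(-log-likelihood per token); lower is better."""
    return math.exp(-total_log_likelihood / total_tokens)

def umass_coherence(top_words, documents):
    """UMass-style coherence of one topic (assumed variant, see lead-in).

    top_words: the topic's words, most probable first.
    documents: list of tokenized documents (lists of strings).
    Assumes every word in top_words occurs in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            w_m, w_l = top_words[i], top_words[j]
            # +1 smoothing avoids log(0) when the pair never co-occurs.
            score += math.log((doc_freq(w_m, w_l) + 1) / doc_freq(w_l))
    return score

# Hypothetical toy corpus, for illustration only.
docs = [
    ["cauvery", "river", "water", "dispute"],
    ["cauvery", "water", "verdict"],
    ["lokpal", "bill", "parliament"],
    ["lokpal", "bill", "corruption"],
]

coherent_topic = umass_coherence(["cauvery", "water"], docs)
mixed_topic = umass_coherence(["cauvery", "bill"], docs)
uniform_ppl = perplexity(-10 * math.log(4), 10)
```

In this toy corpus the topic whose words co-occur in the same tweets scores higher than the topic that mixes two subjects, and a model that assigns each token a probability of 1/4 has a perplexity of about 4. Higher coherence and lower perplexity are generally read as signs of better topics.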
