Concepedia

Publication | Closed Access

Sentiment analysis for Arabizi text

Citations: 42

References: 26

Year: 2016

Abstract

This paper uses supervised learning to assign sentiment (polarity) labels to tweets written in Arabizi. Arabizi is a way of writing Arabic text using Latin letters rather than Arabic letters, and it is common among Arab youth. A rule-based converter was designed and applied to the tweets to convert them from Arabizi to Arabic. The resulting tweets were then annotated with their sentiment labels through crowdsourcing. This dataset, ArabiziDataset, consists of 3206 tweets. The results reveal three findings. First, SVM accuracies are higher than Naive Bayes accuracies. Second, removing stopwords and mapping emoticons to their corresponding words did not greatly improve the accuracies for Arabizi data. Third, eliminating neutral tweets at an early stage of the classification improves Precision for both Naive Bayes and SVM. Recall values, however, fluctuated: they improved in some cases and not in others.
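To make the idea of a rule-based Arabizi-to-Arabic converter concrete, here is a minimal sketch in Python. It is not the converter from the paper, whose rules are not given in the abstract. The `RULES` table and the `arabizi_to_arabic` function are hypothetical names, and the mappings are a small illustrative subset of common Arabizi conventions (for example, `3` for ع and `7` for ح). The sketch maps every Latin letter one-to-one and ignores the context-dependent handling of vowels that a real converter would need.

```python
# Hypothetical, heavily simplified rule table (an assumption for illustration,
# not the paper's rule set). Digraphs are included so that "sh" is matched
# before "s" followed by "h".
RULES = {
    "sh": "ش", "kh": "خ", "gh": "غ", "th": "ث",
    "2": "ء", "3": "ع", "5": "خ", "7": "ح",
    "a": "ا", "b": "ب", "t": "ت", "j": "ج", "d": "د",
    "r": "ر", "z": "ز", "s": "س", "f": "ف", "q": "ق",
    "k": "ك", "l": "ل", "m": "م", "n": "ن", "h": "ه",
    "w": "و", "y": "ي",
}

# Length of the longest rule key, used for longest-match-first scanning.
MAX_LEN = max(len(key) for key in RULES)


def arabizi_to_arabic(text: str) -> str:
    """Convert Arabizi text to Arabic script with greedy longest-match rules.

    Characters with no matching rule (spaces, punctuation, emoticons)
    are copied through unchanged.
    """
    text = text.lower()
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate chunk first, then shorter ones.
        for size in range(MAX_LEN, 0, -1):
            chunk = text[i:i + size]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            # No rule applies: keep the original character.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(arabizi_to_arabic("mr7ba"))
```

In a pipeline like the one the abstract describes, a step of this kind would run before annotation and classification, so that the classifier sees Arabic-script tokens rather than Latin-script ones.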
