Concepedia

Publication | Open Access

FastText.zip: Compressing text classification models

905

Citations

24

References

2016

Year

Abstract

Online communities can be used to promote destructive behaviours, as in pro-Eating Disorder (ED) communities. Research needs annotated data to study these phenomena. Even though many platforms have already moderated this type of content, Twitter has not, and it can still be used for research purposes. In this paper, we unveiled emojis, words, and uncommon linguistic patterns within the ED Twitter community by using the Correlation Explanation (CorEx) algorithm on unstructured and non-annotated data to retrieve the topics. Then we annotated the dataset following these topics. We analysed then the use of CorEx and Word Mover’s Distance to retrieve automatically similar new sentences and augment the annotated dataset.

References

YearCitations

Page 1